Digital Micrograph is an image processing program produced commercially by Gatan.
Gatan does not publish the file format for Digital Micrograph. This information has been obtained by examining the structure of files, thus it may be inaccurate or wrong and is definitely incomplete.
See also Greg Jefferis' Digital Micrograph 3 file format page, on which some of the information here is based.
DM3 info updated March 2006
The files examined were written by Digital Micrograph 2.1.5. For DM 2.5 the main difference is the version tag.
A ? marks a guess, something I have not bothered to check, or something I am not sure about.
Mac file type: GSHN
Mac creator: GCCD
The 2 byte tag identifies the type of data contained in the field. These are, approximately in the order they appear in the file (is the order significant?)
tag value
(hex)
3d DM version multiplied by 100 and stored as i4 (ie length 4). For DM 2.1.5
the version is 200, for DM 2.5 it is 250.
ffff The image itself. First 8 bytes specify size and type as 4i2.
This is the same as the "small header format", except that only data
types 1 to 7 are listed in the manual for this format.
1 width
2 height
3 bytes/pixel (eg float=4, complex=8)
4 data type. 1 2 byte integer signed ("short")
2 4 byte real (IEEE 754)
3 8 byte complex (real, imaginary)
4 ?
5 4 byte packed complex (see below)
6 1 byte integer unsigned ("byte")
7 4 byte integer signed ("long")
8 rgb view, 4 bytes/pixel, unused, red, green, blue?
9 1 byte integer signed
10 2 byte integer unsigned
11 4 byte integer unsigned
12 8 byte real
13 16 byte complex
14 1 byte binary (ie 0 or 1)
The first 3 multiplied together should give the total number of bytes in
the picture.
The rest (ie all but first 8 bytes) is the image.
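The layout of the ffff field described above can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and big endian byte order is an assumption (the files examined came from a Mac), not something confirmed by Gatan documentation.

```python
import struct

def parse_dm2_image_field(field, byte_order=">"):
    """Split the contents of an ffff field into its 8 byte header and pixels.

    byte_order is a struct prefix; ">" (big endian) is an assumption.
    """
    if len(field) < 8:
        raise ValueError("field too short to hold the 4i2 header")
    # 4i2: width, height, bytes/pixel, data type
    width, height, depth, data_type = struct.unpack(byte_order + "4h", field[:8])
    pixels = field[8:]
    return {
        "width": width,
        "height": height,
        "bytes_per_pixel": depth,
        "data_type": data_type,
        "pixels": pixels,
        # width * height * bytes/pixel should equal the number of image bytes
        "consistent": width * height * depth == len(pixels),
    }

# A made-up 3 x 2 image of 2 byte signed integers (data type 1), all zero.
example = bytes.fromhex("0003 0002 0002 0001") + bytes(12)
info = parse_dm2_image_field(example)
print(info["width"], info["height"], info["consistent"])
```

The "consistent" flag is returned rather than enforced because the size rule above is an observation, not a documented guarantee.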
Packed complex (data type 5)
This is used for the Fourier transform of real images, which have
symmetric real parts and antisymmetric imaginary parts and thus can
be stored in half the number of bytes that the equivalent complex
picture would take. The format is somewhat strange.
I have confused things further by using semper's coordinate system.
If the equivalent full complex picture of size n by n would look like
x1 = -n/2, x2 = int((n-1)/2)
y1 = -int((n-1)/2), y2 = n/2
real part
x1 ... -1 0 1 ... x2
y1 rx1,y1 r0,y1 rx2,y1
...
-1 r-1,-1 r0,-1 r1,-1
0 rx1,0 r-1,0 r0,0 r1,0 rx2,0
1 r-1,1 r0,1 r1,1
...
y2 rx1,y2 r0,y2 rx2,y2
imaginary part likewise but i-1,-1 etc
packed complex
x1 x1+1 ... -2 -1 0 1 ... x2-1 x2
y1 rx1,0 *rx1,y1 r1,y1 i1,y1 r2,y1 i2,y1 rx2,y1 ix2,y1
...
1 rx1,y2 ix1,y2 r1,1 i1,1 r2,1 i2,1 rx2,1 ix2,1
0 r0,0 *r0,y1 r1,0 i1,0 r2,0 i2,0 rx2,0 ix2,0
-1 r0,-1 i0,-1 r1,-1 i1,-1 r2,-1 i2,-1 rx2,-1 ix2,-1
...
y2 r0,y2 i0,y2 r1,y2 i1,y2 r2,y2 i2,y2 rx2,y2 ix2,y2
The top of the x1 and x1+1 columns contain what would be in the bottom
of the x1 column, with two imaginary parts containing real parts (marked
with *)
3b Contains the local info saved with the picture eg microscope Cs.
First 4 bytes - number of tags (i4)
Each tag has the format
4i2, string, 8i2, string, 10i2
The integer before each string is the string length
The integer before this is the string length + 2
Integer 8 seems to be the type of the tag
2 string
3 number
4 keyword
5 unknown
All the rest of the integers were the same in all tags examined.
3c Contents of notes box. First 4 bytes are number of characters. Rest is
text of notes box. There is no trailing null.
2d Display type = raster image if present? Length=0
Also has 16 and 3e set
2e Display type = surface plot if present? Length=0
Also has 2f, 30, 31, 32, 33, 34 set
1f4 Display type = line plot if present? Length=0
Also has 1f5, 1f6, 1f7, 1f8, 1f9 and others set
16 Display magnification (screen pixels/pixel) (real)
3e Position of top left of picture with respect to top left of window
(2i2)
1b Picture maximum value (real)
1c Picture minimum value (real)
35 Units for pixel size, null terminated (eg 1/um for fft), plus other
stuff to a total of 16 bytes (or is everything after the null junk?)
1f Pixel size in um? (real)
20 Pixel size in um? (real)
23 0 = normal contrast, 1 = inverted contrast (i1) set in display info
d Colour mode (i2) set in display info
1 Gray-scale
2 Linear
3 Rainbow?
4 Temperature?
5 Custom?
c Contrast mode (i2) set in display info
1 Linear
2 Equalized
3 Pseudo-contour?
4 Custom?
27 0 = survey off, 1 = survey on (i1) set in display info
28 0 = survey cross-wires, 1 = survey entire image (i2) set in display info
11 Value to display as black (contrast limits) (real)
12 Value to display as white (contrast limits) (real)
26 Minimum contrast (real) set in display info
25 Annotation, eg text or lines on screen. First 4 bytes is probably number
of annotations (i4)
19 position & size of window on screen, top left = 0,0. (4i2)
top, left, bot, right
0 End of file (length 0)
Mac file type: GTGI
Mac creator: GCCD
Files examined were written from DM 3.3.1 on a PC and a Mac and later versions.
The notation is loosely based on Fortran.
i1 char 1 byte integer
i2 short 2 byte integer
i4 long 4 byte integer
f4 float 4 byte floating point
f8 double 8 byte floating point
a char string
Byte order
i4be big endian, Motorola, Mac, eg 00 00 01 02 for 258
i4le little endian, Intel, PC, eg 02 01 00 00 for 258
i4* order depends on byte order flag (3rd i4 integer in file)
Hex values are written eg 14h, ie 14h = 20
File consists of a header, a tag directory and a group of nulls marking the end of the file. The tag directory contains both tags and more tag directories in a hierarchical structure.
The image itself is in a tag directory called "ImageList". More than one image can be stored in ImageList.
All numbers relating to the header and tag structure are in big endian byte order. Tag values are in the native order of the machine writing the file.
Example, Mac DM3 file
00 00 00 03 00 22 59 b9 00 00 00 00 01 00 00 00 00 12 15 00 11 41 70 70 6c 69 63 61 74 69 6f 6e ...... 00 00 00 00 00 00 00 00
00 00 00 03 i4be DM version = 3
00 22 59 b9 i4be file length - 16 = size of root tag directory
00 00 00 00 i4be byte order, 0 = big endian (Mac) order,
1 = little endian (PC) order
01 i1 1 = sorted (normally = 1)
00 i1 0 = closed, 1 = open (normally = 0)
00 00 00 12 i4be number of tags in root directory (12h = 18)
......
The root tag directory contains both tags and more tag directories (see below).
00 00 00 00 00 00 00 00 End of file is marked with 8 nulls
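The header fields listed above can be read with a few lines of Python. This is a minimal sketch, assuming the 18 byte layout shown in the example; the function and key names are made up for illustration.

```python
import struct

def parse_dm3_header(data):
    """Decode the first 18 bytes of a DM3 file (header numbers are big endian)."""
    version, root_size, byte_order = struct.unpack(">3i", data[:12])
    is_sorted, is_open, n_tags = struct.unpack(">BBi", data[12:18])
    return {
        "version": version,
        "root_size": root_size,            # file length - 16
        "little_endian": byte_order == 1,  # applies to tag values only
        "sorted": bool(is_sorted),
        "open": bool(is_open),
        "n_tags": n_tags,
    }

# The first 18 bytes of the Mac example above.
header = parse_dm3_header(bytes.fromhex("00000003 002259b9 00000000 01 00 00000012"))
print(header["version"], header["n_tags"], header["root_size"] + 16)
```

The last value printed is the expected file length, which gives a quick sanity check against the actual size of the file on disk.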
Tag directories and tags are identified by their first byte
14h = 20 tag directory
15h = 21 tag
00 end of file
Example
14 00 12 44 6f 63 75 6d 65 6e 74 4f 62 6a 65 63 74 4c 69 73 74
00 00 00 00 00 01
......
14 i1 identifies tag directory (14h = 20)
00 12 i2be bytes in tag name (ltname), may be 0
44 6f 63 75 6d 65 6e 74 4f 62 6a 65 63 74 4c 69 73 74
a tag name, length ltname "DocumentObjectList"
00 i1 1 = sorted? (can be 0 or 1)
00 i1 0 = closed?, 1 = open (normally = 0)
00 00 00 01 i4be number of tags in tag directory (01h = 1). Can be 0
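As an illustration of the tag directory layout above, the following Python sketch decodes one directory entry. The function name is hypothetical, and treating the name length as unsigned and the name as Latin-1 text are assumptions.

```python
import struct

def parse_tag_directory_header(data, offset=0):
    """Return (name, sorted, open, n_tags, next_offset) for a 14h entry."""
    if data[offset] != 0x14:
        raise ValueError("not a tag directory (first byte is not 14h)")
    (name_len,) = struct.unpack_from(">H", data, offset + 1)  # ltname, may be 0
    start = offset + 3
    name = data[start:start + name_len].decode("latin-1")     # encoding assumed
    is_sorted, is_open, n_tags = struct.unpack_from(">BBi", data, start + name_len)
    # next_offset points at the first tag or directory inside this one
    return name, bool(is_sorted), bool(is_open), n_tags, start + name_len + 6

# The DocumentObjectList example above.
raw = bytes.fromhex("14 0012 446f63756d656e744f626a6563744c697374 00 00 00000001")
print(parse_tag_directory_header(raw))
```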
i1 identifies tag (15h = 21)
i2be ltname, bytes in tag name, may be 0
a tag name, length ltname
a4 string "%%%%"
i4be nnum, size of info array following (=1)
i4be * nnum info(nnum), array of nnum integers
contains number type(s) for tag values
xx* * nnum tag values (byte order set by byte order flag)
byte lengths specified in info(nnum)
Example
15 00 0e 41 6e 6e 6f 74 61 74 69 6f 6e 54 79 70 65
25 25 25 25 00 00 00 01 00 00 00 03 00 00 00 14
15 i1 identifies tag (15h = 21)
00 0e i2be bytes in tag name (ltname), may be 0
41 6e 6e 6f 74 61 74 69 6f 6e 54 79 70 65
a tag name, length ltname "AnnotationType"
25 25 25 25 a4 "%%%%"
00 00 00 01 i4be nnum, size of info array following (=1)
00 00 00 03 i4be info(nnum), array of nnum i4 integers, in this case just 1
contains number type (3 = signed i4*)
00 00 00 14 i4* tag value, 14h = 20
For single entry tags:
nnum = 1
info(1) = number type
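A single entry tag such as the AnnotationType example can be decoded roughly as follows. This is a sketch under stated assumptions: the function name and the SIMPLE_TYPES table are illustrative, the struct codes are my mapping of the number types listed in this document, and the caller supplies the byte order flag from the file header.

```python
import struct

# struct codes for the simple number types (02h to 0ah) listed in this document
SIMPLE_TYPES = {2: "h", 3: "i", 4: "H", 5: "I", 6: "f", 7: "d", 8: "B", 9: "c", 10: "b"}

def parse_simple_tag(data, offset=0, little_endian=False):
    """Return (name, value, next_offset) for a 15h tag with nnum = 1."""
    if data[offset] != 0x15:
        raise ValueError("not a tag (first byte is not 15h)")
    (name_len,) = struct.unpack_from(">H", data, offset + 1)
    pos = offset + 3
    name = data[pos:pos + name_len].decode("latin-1")  # encoding assumed
    pos += name_len
    if data[pos:pos + 4] != b"%%%%":
        raise ValueError("missing %%%% marker")
    nnum, number_type = struct.unpack_from(">2i", data, pos + 4)
    if nnum != 1:
        raise ValueError("not a single entry tag")
    pos += 12
    # Only the value itself follows the byte order flag.
    fmt = ("<" if little_endian else ">") + SIMPLE_TYPES[number_type]
    (value,) = struct.unpack_from(fmt, data, pos)
    return name, value, pos + struct.calcsize(fmt)

raw = bytes.fromhex(
    "15 000e 416e6e6f746174696f6e54797065 25252525 00000001 00000003 00000014"
)
print(parse_simple_tag(raw))  # the example comes from a Mac file, so big endian
```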
Example
15 0006 4f6666736574 25252525 00000007 0000000f 00000000 00000002
00000000 00000006 00000000 00000006
00000000 00000000
15 i1 identifies tag (15h = 21)
00 06 i2be bytes in tag name (ltname), may be 0
4f 66 66 73 65 74
a tag name, length ltname "Offset"
25 25 25 25 a4 "%%%%"
00 00 00 07 i4be nnum, size of info array following (=7)
info(nnum)
00 00 00 0f i4be info(1) number type (0fh = group of data)
00 00 00 00 i4be info(2) length of groupname? (always = 0)
00 00 00 02 i4be info(3) number of entries in group (=2)
00 00 00 00 i4be info(4) length of fieldname? (always = 0)
00 00 00 06 i4be info(5) number type for value 1 (06h = f4)
00 00 00 00 i4be info(6) length of fieldname? (always = 0)
00 00 00 06 i4be info(7) number type for value 2 (06h = f4)
00 00 00 00 f4* value(1)
00 00 00 00 f4* value(2)
For group tags
nnum = size of info array info(1) = 0fh info(3) = number of values in group info(2*i+3) = number type for value i Other info entries are always zero
Example, an image tag
15 0004 44617461 25252525 00000003 00000014 00000002 00000024
fdff feff ffff 0000 0100 0200 0300 0400 0500
fdff feff ffff 0000 0100 0200 0300 0400 0500
fdff feff ffff 0000 0100 0200 0300 0400 0500
fdff feff ffff 0000 0100 0200 0300 0400 0500
15 i1 identifies tag (15h = 21)
00 04 i2be bytes in tag name (ltname)
44 61 74 61 a tag name, length ltname "Data"
25 25 25 25 a4 "%%%%"
00 00 00 03 i4be nnum, size of info array following (=3)
info(nnum)
00 00 00 14 i4be info(1), number type (14h = array)
00 00 00 02 i4be info(2), number type (02h = i2 signed)
00 00 00 24 i4be info(3) = info(nnum), size of array (=36)
fd ff i2* value(1)
fe ff i2* value(2)
.... etc to value(36)
For array tags
nnum = 3 info(1) = 14h info(2) = number type for all values info(3) = info(nnum) = size of array
Example
15 0004 434c5554 25252525 0000000b 00000014 0000000f 00000000 00000003
00000000 00000002 00000000 00000002 00000000 00000002
00000100
0000 0000 0000
0101 0101 0101
0202 0202 0202
0303 0303 0303
.....
15 i1 identifies tag (15h = 21)
00 04 i2be bytes in tag name (ltname)
43 4c 55 54 a tag name, length ltname "CLUT"
25 25 25 25 a4 "%%%%"
00 00 00 0b i4be nnum, size of info array following (=11)
info(nnum)
00 00 00 14 i4be info(1), number type (14h = array)
00 00 00 0f i4be info(2), number type (0fh = group)
00 00 00 00 i4be info(3), length of groupname? (always = 0)
00 00 00 03 i4be info(4), number of entries in group (=3)
00 00 00 00 i4be info(5), length of fieldname? (always = 0)
00 00 00 02 i4be info(6), number type for value 1 (02h = i2)
00 00 00 00 i4be info(7), length of fieldname? (always = 0)
00 00 00 02 i4be info(8), number type for value 2 (02h = i2)
00 00 00 00 i4be info(9), length of fieldname? (always = 0)
00 00 00 02 i4be info(10), number type for value 3 (02h = i2)
00 00 01 00 i4be info(11) = info(nnum), size of array (=256)
0000 0000 0000 3i2* 3 values for first element of array
0101 0101 0101 3i2* 3 values for second element of array
....
For arrays of groups
nnum = size of array info(1) = 14h info(2) = 0fh info(4) = number of values in group info(2*i+4) = number type for value i info(nnum) = size of info array
02h = 2 i2* signed (short)
03h = 3 i4* signed (long)
04h = 4 i2* unsigned (ushort) or unicode string
05h = 5 i4* unsigned (ulong)
06h = 6 f4* (float)
07h = 7 f8* (double)
08h = 8 i1 (boolean)
09h = 9 a1 (char)
0ah = 10 i1
0fh = 15 group of data (struct)
info(2) = 0
info(3) = number in group
info(2*n+4) = 0 (n = 0, 1, ... counts the values in the group)
info(2*n+5) data type for each value in group
12h = 18 a (string)
14h = 20 array of numbers or groups
info(nnum) = number = ngroup
info(2) is then treated as info(1) above
There is no simple way of finding the length of a tag (type 15h) without completely decoding it and working out the number of bytes in each data type.
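The decoding needed to find the length of a tag's data can be sketched as a small recursive walk over the info array. This is only an illustration built from the number types listed above: the function names are hypothetical, groups are assumed to be stored without padding, and the string type 12h is deliberately not handled because its layout is not described here.

```python
# byte sizes of the simple number types (02h to 0ah) listed in this document
SIMPLE_SIZES = {2: 2, 3: 4, 4: 2, 5: 4, 6: 4, 7: 8, 8: 1, 9: 1, 10: 1}

def _element_size(info, pos):
    """Return (byte size, next position) for the type starting at info[pos]."""
    number_type = info[pos]
    if number_type in SIMPLE_SIZES:
        return SIMPLE_SIZES[number_type], pos + 1
    if number_type == 0x0F:                  # group of data
        n_values = info[pos + 2]
        pos += 3                             # skip 0fh, group name length, count
        total = 0
        for _ in range(n_values):
            size, pos = _element_size(info, pos + 1)  # skip field name length
            total += size
        return total, pos
    if number_type == 0x14:                  # array of numbers or groups
        size, pos = _element_size(info, pos + 1)
        return size * info[pos], pos + 1     # info[pos] is the array size
    raise ValueError("number type %d not handled in this sketch" % number_type)

def tag_data_size(info):
    """Number of data bytes that follow the info array of a tag."""
    size, pos = _element_size(info, 0)
    if pos != len(info):
        raise ValueError("info array has unexpected trailing entries")
    return size

print(tag_data_size([3]))                                       # AnnotationType
print(tag_data_size([0x0F, 0, 2, 0, 6, 0, 6]))                  # Offset
print(tag_data_size([0x14, 2, 0x24]))                           # Data
print(tag_data_size([0x14, 0x0F, 0, 3, 0, 2, 0, 2, 0, 2, 256])) # CLUT
```

With such a function a reader can skip over tags it does not care about instead of decoding every value.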
The image itself is in a tag (type 15h) named "Data", about halfway through the tags. It is therefore difficult to find, because the length and number of the preceding tags can change between images. One possible lazy way is to search for the byte sequence 15h 00h 04h followed by the string "Data%%%%"; for a simple array tag (nnum = 3) the image will start 16 bytes beyond the end of this sequence.
There may be more than one image in the file. Each image will have its own Data tag. Images are numbered from 0.
There may be a "thumbnail" image which can be either before or after the main images in the file. The image number of the thumbnail Data tag is given in the tag with name Thumbnails::ImageIndex.
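The lazy search for the Data tag mentioned above might look like this in Python. The function name is made up, and the fixed skip of 16 bytes assumes that every Data tag is a simple array (nnum = 3); the sketch returns one offset per Data tag found, so a thumbnail shows up as an extra entry.

```python
def find_image_data_offsets(data):
    """Return the offsets at which image data starts, one per "Data" tag found."""
    marker = b"\x15\x00\x04Data%%%%"   # 15h, ltname = 4, "Data", "%%%%"
    offsets = []
    pos = data.find(marker)
    while pos != -1:
        # skip the marker, then nnum and the three info entries (4 * 4 bytes)
        offsets.append(pos + len(marker) + 16)
        pos = data.find(marker, pos + 1)
    return offsets

# A made-up fragment: two junk bytes, a Data tag holding two i2 values.
blob = (b"xx" + b"\x15\x00\x04Data%%%%"
        + bytes.fromhex("00000003 00000014 00000002 00000002")
        + b"\x01\x00\x02\x00")
start = find_image_data_offsets(blob)[0]
print(start, blob[start:])
```

A byte pattern inside pixel data could in principle match the marker by accident, so the result is best cross-checked against the Dimensions and PixelDepth tags.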
Useful tags in order of appearance:
Description info in the notes box (not always present)
Data the image itself
DataType as in DM2. Note this is different from the number type above.
These values are only for the image data and must be
consistent with the number type for the Data tag.
There are a number of other DataTypes defined that I've
never seen in images
0 null
1 i2 2 byte integer signed ("short")
2 f4 4 byte real (IEEE 754)
3 c8 8 byte complex (real, imaginary)
4 obsolete
5 c4 4 byte packed complex (see DM2)
6 ui1 1 byte integer unsigned ("byte")
7 i4 4 byte integer signed ("long")
8 4*ui1 rgb, 4 bytes/pixel, unused, red, green, blue
9 i1 1 byte integer signed
10 ui2 2 byte integer unsigned
11 ui4 4 byte integer unsigned
12 f8 8 byte real
13 c16 16 byte complex
14 i1 1 byte binary (ie 0 or 1)
23 4*ui1 rgba, 4 bytes/pixel, 0, red, green, blue. Used for thumbnail images
Dimensions a tag directory (14h) containing 2 tags (15h) with no names
(irritatingly), which are the image width and height
PixelDepth bytes/pixel
For CCD images (these follow the tags above)
Acquisition Date image acquisition date and time, unfortunately both
Acquisition Time as strings. Worse still, the date string can be
in either UK/international or US order depending on the
date settings on the Mac or PC. It is thus impossible
to unambiguously determine the date from the date string.
ImageIndex Image number of thumbnail image
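The ambiguity of the Acquisition Date string noted above can be made concrete with a short Python sketch that lists every date a string could mean. The slash separated layout with a four digit year is purely an assumption for illustration; the actual strings depend on the date settings of the machine that wrote the file.

```python
from datetime import datetime

def candidate_dates(text):
    """Return the dates a string could mean in UK/international or US order."""
    results = set()
    for fmt in ("%d/%m/%Y", "%m/%d/%Y"):  # assumed layouts, see above
        try:
            results.add(datetime.strptime(text, fmt).date())
        except ValueError:
            pass  # the string is not a valid date in this order
    return sorted(results)

print(candidate_dates("03/04/2006"))  # two possible dates
print(candidate_dates("25/12/2005"))  # only one order gives a valid date
```

Only strings with a day number above 12 can be resolved this way; for the rest some outside information, such as the file modification time, is needed.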
Unfortunately the tags describing the image come after the image itself, which is particularly annoying for the image dimensions.