Understanding the Imaris 5.5 File Format

by Blair Rossetti
Apr 16, 2020
data-and-analysis
file-format

Introduction #

In my previous post, I mentioned that the AIC is exploring alternative file formats for use in our processing and analysis pipelines. Our pre-commercial microscopes present a unique challenge in that they all generate different formats and structures of image data. The exact format of the output images depends on the acquisition mode of the microscope and the developer of the microscope’s control software. We are currently converting most of our images into sequences of KLB files. Compared to the TIFF and binary images produced by the microscopes, KLB offers better compression and faster reads/writes. Unfortunately, support for KLB is diminishing, and there are better formats on the horizon. Our survey of image file formats suggested several alternatives, namely Imaris IMS, BigDataViewer HDF5, Not HDF5 (N5), TileDB, and Zarr. Surprisingly, the Imaris IMS (version 5.5) format is the most versatile for our use cases: it is the native format for the Imaris software, it can be read by Bio-Formats and BigDataViewer, and it is compatible with the standard HDF5 library. In this post, we are going to take a closer look at the IMS format and its specifications, and identify any undocumented changes.

Imaris 5.5 File Format #

Back in 2015, Bitplane released documentation for their Imaris 5.5 image format. Unfortunately, the documentation is fairly light, and a quick cross-reference with the Internet Archive shows that there have been no updates since it was originally posted. I suspect that the lack of updates is more a product of there being no major changes in the format specifications. Nevertheless, I will be walking through the documentation and making notes of where I see discrepancies with recently generated IMS files. The files that I will be using are the Imaris Demo Images that come with Imaris 9.5. I will also be using sample images collected by the OME/Bio-Formats team for developing their Imaris reader (see https://downloads.openmicroscopy.org/images/Imaris-IMS/). Be sure to check out the script I wrote for downloading these OME data sets in my other blog post.

A Flavor of HDF5 #

You can think of the Imaris 5.5 image format as a flavor of HDF5. IMS files are generated using the HDF5 library, but they follow a specific format that the Imaris software expects to see. As the documentation points out in section 2. Tools and Resources, we can use standard HDF5 tooling to explore IMS files. Specifically, we will use the HDFView software to compare the internal structure of new IMS files with the structure described in the documentation.

IMS Structure #

Internal IMS hierarchy as depicted in the documentation

The Imaris documentation conveniently includes a screenshot of HDFView’s hierarchy for the retina.ims image (note: this file is still included as an example image in Imaris 9.5). The screenshot shows that IMS files consist of three main groups: DataSet, DataSetInfo, and Thumbnail. The documentation also indicates that a fourth group, called Scene or Scene8, may be present if the file contains surpass objects or annotations. We can also see that the image data is stored in the DataSet group by ResolutionLevel, TimePoint, and Channel. Each channel contains a one- to three-dimensional image. Although not explicitly stated in the documentation, this structure appears to limit us to multiresolution 5D images (R-T-C-XYZ). Since we still have access to the retina.ims file in the Imaris Demo Images directory that comes with Imaris 9.5, we can compare the structure of that file with other example images from the same directory and from the OME example image repository.
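To make the R-T-C nesting concrete, here is a minimal sketch that builds the HDF5 path to one image array. The helper name `imsDataPath` is hypothetical (it is not part of Imaris or HDF5), and the group naming ("ResolutionLevel 0", "TimePoint 0", "Channel 0", "Data") is assumed from the HDFView hierarchy and the documentation's read example.

```cpp
#include <string>

// Hypothetical helper (not an Imaris or HDF5 API): builds the HDF5 path of the
// image array stored for one resolution level, time point, and channel.
// Assumption: group names follow the pattern seen in HDFView, i.e. the group
// kind, a space, and a zero-based index.
std::string imsDataPath(unsigned resolutionLevel, unsigned timePoint, unsigned channel)
{
    return "/DataSet/ResolutionLevel " + std::to_string(resolutionLevel) +
           "/TimePoint " + std::to_string(timePoint) +
           "/Channel " + std::to_string(channel) +
           "/Data";
}
```

For example, `imsDataPath(0, 0, 0)` gives the path of the full-resolution image for the first time point and channel, which could then be passed to any HDF5 tool or binding.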

IMS Structure by Version #

In the window on the right, I have included a series of screenshots from the HDFView program. Each screenshot shows the internal hierarchy for a different IMS file. The v5.7 tab has the same retina.ims that we saw in the documentation. However, I’ve expanded the DataSetInfo and Thumbnail groups to show their content. As you switch to the v6.5 tab, you will see that not much has changed. The only difference between retina.ims and celldemo.ims is the number of resolution levels and channels. Similarly, the PyramidalCell.ims in the v9.3 tab has roughly the same structure. Here we see for the first time the Scene and Scene8 groups that are created when surpass objects are added to the file in Imaris. Interestingly, the PyramidalCell.ims file also contains a DataSetTimes group. This particular group is not defined in the Imaris documentation. I suspect that the DataSetTimes group is similar to the Scene and Scene8 groups in that they are all created when the file is edited in Imaris.

Things start to get interesting when we look at the two files in the v9.5 tabs. First, the DrosophilaEggChamber_with_objects.ims has two notable differences compared to the other files that we have examined: (1) there appear to be duplicate groups and (2) there exists a higher resolution histogram object called Histogram1024. A closer look shows that the DataSet and DataSet1 groups contain slightly different objects (the same applies to /DataSetInfo/DataSetInfo1 and /DataSetTimes/DataSetTimes1). It is unclear how and when these new groups and objects are created. I am also unclear on which /DataSet/…/Data object is displayed when Imaris opens this file. The story behind the Histogram1024 object seems to be a little more obvious. The Imaris documentation says, “the histogram is a HDF5 1D Dataset of type 64bit unsigned integer.” Although it is not stated in the documentation, the Histogram object appears to have 256 bins by default. The Histogram1024 object has 1024 bins. I expect that this higher resolution Histogram1024 was added in recent versions of Imaris for visualization purposes.

We can see in the last v9.5 tab, with the Microglia_vasculature_with_objects.ims hierarchy, that not all Imaris 9.5 images contain the Histogram1024 object. The Microglia_vasculature_with_objects.ims file does contain some interesting metadata groups within the main DataSetInfo group. The ND2_Attributes, ND2_Capturing, ND2_Description, and ND2_MetaData groups were likely created by the ImarisFileConverter when the file was being converted from the Nikon ND2 format to the IMS format. Unfortunately, the Imaris documentation does not specify if this extra metadata is used in any way. After exploring some other example images, I suspect that this vendor-specific metadata is not used.
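To illustrate the difference between a 256-bin and a 1024-bin histogram, here is a rough sketch that counts 16-bit voxels into a configurable number of bins, stored as 64-bit unsigned integers as the documentation describes. The binning rule (equal-width bins between a minimum and a maximum value) is my assumption, since the documentation does not state how Imaris assigns values to bins. The function and parameter names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: counts voxels into `bins` equal-width bins spanning
// [minValue, maxValue]. Assumption: this linear binning is NOT confirmed to be
// what Imaris does; it only illustrates the 256 vs. 1024 bin resolution.
std::vector<std::uint64_t> buildHistogram(const std::vector<std::uint16_t>& voxels,
                                          std::size_t bins,
                                          double minValue,
                                          double maxValue)
{
    std::vector<std::uint64_t> histogram(bins, 0);
    if (bins == 0 || maxValue <= minValue)
        return histogram; // nothing sensible to count

    const double scale = static_cast<double>(bins) / (maxValue - minValue);
    for (std::uint16_t v : voxels) {
        if (v < minValue || v > maxValue)
            continue; // ignore values outside the histogram bounds
        std::size_t bin = static_cast<std::size_t>((v - minValue) * scale);
        if (bin >= bins)
            bin = bins - 1; // values at (or rounding up to) maxValue go in the last bin
        ++histogram[bin];
    }
    return histogram;
}
```

Calling this with `bins = 256` and again with `bins = 1024` on the same voxels gives two histograms with the same total count but different bin widths, which is presumably the relationship between Histogram and Histogram1024.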

retina.ims v5.7

celldemo.ims v6.5

PyramidalCell.ims v9.3

DrosophilaEggChamber_with_objects.ims v9.5

Microglia_vasculature_with_objects.ims v9.5

IMS Groups and Their Attributes #

As I’ve already mentioned, there are three main groups within an IMS file: DataSet, DataSetInfo, and Thumbnail. In section 3. Structure of the Imaris documentation, there is a table that briefly describes these groups. Unfortunately, the table is not accurate. In the window below, I’ve included the original table from the documentation and my edited version.

Original table (from the documentation):

| Path | Attribute | Value | Description |
| --- | --- | --- | --- |
| / | ImarisDataSet | ImarisDataSet | Specific to format |
| | Format Version | 5.5.0 | Format version |
| /DataSetInfo | | | Contains DataSet information |
| /Thumbnail | | | Contains the Thumbnail data |

Edited table:

| Path | Attribute | Value | Description |
| --- | --- | --- | --- |
| / | | | Root group |
| | DataSetDirectoryName | DataSet | Name of DataSet group |
| | DataSetInfoDirectoryName | DataSetInfo | Name of DataSetInfo group |
| | ImarisDataSet | ImarisDataSet | Specific to format |
| | ImarisVersion | 5.5.0 | File format version (not Imaris version) |
| | NumberOfDataSets | N+1 | Number of datasets, where groups are named as DataSet, DataSet1, … , DataSetN (only set in newer IMS files) |
| | ThumbnailDirectoryName | Thumbnail | Name of Thumbnail group |
| /DataSet | | | Contains image and histogram data |
| /DataSetInfo | | | Contains metadata |
| /Thumbnail | | | Contains the thumbnail image |
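The naming scheme implied by NumberOfDataSets can be sketched as follows. This is only my reading of the files I examined (DataSet, DataSet1, … , DataSetN), and the helper name `dataSetGroupNames` is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: lists the expected DataSet group names for a file whose
// root attribute NumberOfDataSets equals `numberOfDataSets`.
// Assumption: the first group has no suffix and the rest are numbered from 1.
std::vector<std::string> dataSetGroupNames(std::size_t numberOfDataSets)
{
    std::vector<std::string> names;
    for (std::size_t i = 0; i < numberOfDataSets; ++i)
        names.push_back(i == 0 ? std::string("DataSet")
                               : "DataSet" + std::to_string(i));
    return names;
}
```

With `numberOfDataSets = 2`, this yields DataSet and DataSet1, which matches the duplicate groups seen in DrosophilaEggChamber_with_objects.ims.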

DataSet #

We saw in the HDFView screenshots above that the DataSet group contains a series of child groups that hold the image and histogram data for different resolutions, time points, and channels. The Channel groups also have attributes describing the image size and histogram bounds. The Imaris documentation states that the

Dataset should be chunked, storage allocation time incremental, fill value none, compression none or GZIP. If GZIP compression is activated, any valid GZIP parameter may be used (i.e. compression 0 to 9 is acceptable, 3 is preferred). –IMARIS 5.5 File Format Description

Oddly, I did not come across any IMS files that were uncompressed. I suspect that the ImarisFileConverter uses GZIP compression by default (there do not appear to be any options for changing this in the software). Contrary to the documentation, I noticed that the GZIP compression level is set to 2 (not 3) in all of the example files.

Section 3.1 DataSet of the documentation also includes a short code example showing how to read a 3D image from a file using the HDF5 library. I noticed a couple of small errors in the code snippet, so I am including an edited version below.

Original:

mFileId = H5Fopen(mFileName.c_str(), H5F_ACC_RDONLY, H5P_DEFAULT);
hid_t vDataSetId = H5Gopen(mFileId, "DataSet");
hid_t vLevelId = H5Gopen(vDataSetId, "Resolution Level 0");
hid_t vTimePointId = H5Gopen(vLevelId, "TimePoint 0");
hid_t vChannelId = H5Gopen(vTimePointId, "Channel 0");
hid_t vDataId = H5Dopen(vChannelId, "Data");
// read the attributes ImageSizeX,Y,Z hsize_t vFileDim[3] = {ImageSizeZ, ImageSizeY, ImagesizeX};
hid_t vFileSpaceId = H5Screate_simple(3, vFileDim, NULL); char* vBuffer = new vBuffer[ImageSizeZ*ImageSizeY*ImageSizeX];
H5Dread(vDataId, H5T_NATIVE_CHAR, H5S_ALL, vFileSpaceId, H5P_DEFAULT, vBuffer);

Edited:

mFileId = H5Fopen(mFileName.c_str(), H5F_ACC_RDONLY, H5P_DEFAULT);
// HDF5 1.8 and later: H5Gopen2/H5Dopen2 take an access property list as a third argument
hid_t vDataSetId = H5Gopen2(mFileId, "DataSet", H5P_DEFAULT);
hid_t vLevelId = H5Gopen2(vDataSetId, "ResolutionLevel 0", H5P_DEFAULT);
hid_t vTimePointId = H5Gopen2(vLevelId, "TimePoint 0", H5P_DEFAULT);
hid_t vChannelId = H5Gopen2(vTimePointId, "Channel 0", H5P_DEFAULT);
hid_t vDataId = H5Dopen2(vChannelId, "Data", H5P_DEFAULT);
// read the attributes ImageSizeX,Y,Z
hsize_t vFileDim[3] = {ImageSizeZ, ImageSizeY, ImageSizeX};
hid_t vFileSpaceId = H5Screate_simple(3, vFileDim, NULL);
// one byte per voxel (8-bit image data)
char* vBuffer = new char[ImageSizeZ*ImageSizeY*ImageSizeX];
H5Dread(vDataId, H5T_NATIVE_CHAR, H5S_ALL, vFileSpaceId, H5P_DEFAULT, vBuffer);

DataSetInfo #

Rather than saving metadata as attributes of the /DataSet/…/Data object, metadata is saved as attributes of child groups of DataSetInfo. The Imaris documentation includes a long table describing the metadata attributes; unfortunately, it does not indicate which fields are required/expected by Imaris. For example, the table lists the Filename attribute in the /DataSetInfo/Imaris group, but this was not present in all of the example files that I examined. Similarly, there appear to be required attributes that are not included in the documentation. All of the example files contain a ThumbnailSize attribute in the /DataSetInfo/Imaris group. I suspect that the ImarisFileConverter automatically sets this attribute for all files because the value is always set to 256 (regardless of the actual thumbnail size).

Thumbnail #

Arena, the file explorer within Imaris, shows a thumbnail for all IMS files so that the user can more easily find their desired data. Therefore, Imaris requires that a 2D, square RGBA image be saved within the IMS file. The thumbnail is saved as a W x 4W dataset, where W is the size of the thumbnail width in pixels. Although thumbnails do not have a defined size, the vast majority of the thumbnails are 256 x 1024. Interestingly, I noticed that the PyramidalCell.ims file has a 255 x 1020 thumbnail dataset. I suspect this odd size is the result of a bug in the ImarisFileConverter writer for Zeiss files. This file was generated with Imaris v9.3, so this bug may be fixed at this point.
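As a small sketch of the W x 4W layout, the helpers below check whether a dataset shape is consistent with a square RGBA thumbnail and compute the offset of one color component. The helper names are hypothetical, and the interleaved R, G, B, A ordering within each row is my assumption; the documentation only gives the overall shape.

```cpp
#include <cstddef>

// Hypothetical helper: true when a rows x cols dataset matches the W x 4W
// shape of a square RGBA thumbnail (W rows, four values per pixel).
bool isThumbnailShape(std::size_t rows, std::size_t cols)
{
    return rows > 0 && cols == 4 * rows;
}

// Hypothetical helper: flat offset of one component of pixel (row, col) in a
// thumbnail of width `width`.
// Assumption: the four components of a pixel are stored next to each other
// (component 0..3 = R, G, B, A); this ordering is not confirmed by the docs.
std::size_t thumbnailOffset(std::size_t width, std::size_t row,
                            std::size_t col, std::size_t component)
{
    return row * (4 * width) + 4 * col + component;
}
```

Both the common 256 x 1024 shape and the odd 255 x 1020 shape from PyramidalCell.ims pass the shape check, so the latter is unusual in size but still consistent with the W x 4W rule.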

Data Storage #

Perhaps the most critical yet poorly documented aspect of the IMS format pertains to how the image data is stored. As we mentioned above, the image data is stored in chunks. The Imaris documentation states,

Typical chunk sizes are 128x128x64 or 256x256x16. The optimal chunk size is determined by the geometry of the image and it is not easy to specify rules for reproducing exactly the chunk size that Imaris will write into the hdf-file… If you want to write image files for Imaris to read, try to create 3D chunks that are roughly 1 Megabyte size. –IMARIS 5.5 File Format Description

I find it a bit unsatisfying that rules for chunking are not described in this open format. Nevertheless, the guideline that chunks should be about 1MB appears suitable for most circumstances. I will keep my fingers crossed that the documentation receives an update with more information about the current chunking scheme.
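The 1 MB guideline is easy to check against the two chunk shapes quoted above. The sketch below computes the uncompressed size of a chunk; the helper name `chunkSizeBytes` is hypothetical, and the bytes-per-voxel value is an input because the quoted chunk sizes do not say which data type they refer to.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: uncompressed size in bytes of one 3D chunk.
std::size_t chunkSizeBytes(const std::array<std::size_t, 3>& chunkDims,
                           std::size_t bytesPerVoxel)
{
    return chunkDims[0] * chunkDims[1] * chunkDims[2] * bytesPerVoxel;
}
```

For 8-bit data, both 128x128x64 and 256x256x16 come out to exactly 1,048,576 bytes (1 MiB), so the "typical" chunk sizes agree with the guideline. For 16-bit data the same shapes would be 2 MiB per chunk.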

Aside from data chunking, Imaris also generates a multiresolution pyramid of image data. Unfortunately, the only information about how this pyramid is generated is a short code snippet. Again, I noticed that the example code had some bugs. I’ve included both the original and edited code below for reference.

Original:

void getMultiResolutionPyramidalSizes( const size_t[3] aDataSize,
std::vector& aResolutionSizes)
{
const float mMinVolumeSizeMB = 1.f;
aResolutionSizes.clear();
size_t[3] vNewResolution = aDataSize;

float vVolumeMB;
do {
vResolutionSizes.push_back(vNewResolution);
size_t[3] vLastResolution = vNewResolution;
size_t vLastVolume = vLastResolution[0] * vLastResolution[1] * vLastResolution[2];
for (int d = 0; d < N; ++d) {
if ((10*vLastResolution[d]) * (10*vLastResolution[d]) > vLastVolume / vLastResolution[d])
vNewResolution[d] = vLastResolution[d] / 2;
else
vNewResolution[d] = vLastResolution[d];
// make sure we don't have zero-size dimension
vNewResolution[d] = std::max((size_t)1, vNewResolution[d]);
}
vVolumeMB = vNewResolution[0] * vNewResolution[1] * vNewResolution[2]) / (1024.f * 1024.f);
} while (vVolumeMB > mMinVolumeSizeMB);
}

Edited:

#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// one size per spatial dimension
using Size3 = std::array<std::size_t, 3>;

void getMultiResolutionPyramidalSizes(const Size3& aDataSize, std::vector<Size3>& vResolutionSizes)
{
    const float mMinVolumeSizeMB = 1.f;
    // N is not defined in the documentation; assumed to be the number of spatial dimensions
    const int N = 3;
    vResolutionSizes.clear();
    Size3 vNewResolution = aDataSize;

    float vVolumeMB;
    do {
        vResolutionSizes.push_back(vNewResolution);
        Size3 vLastResolution = vNewResolution;
        std::size_t vLastVolume = vLastResolution[0] * vLastResolution[1] * vLastResolution[2];

        for (int d = 0; d < N; ++d)
        {
            // halve dimension d only if it is large relative to the other two
            if ((10*vLastResolution[d]) * (10*vLastResolution[d]) > vLastVolume / vLastResolution[d])
                vNewResolution[d] = vLastResolution[d] / 2;
            else
                vNewResolution[d] = vLastResolution[d];

            // make sure we don't have zero-size dimension
            vNewResolution[d] = std::max((std::size_t)1, vNewResolution[d]);
        }
        vVolumeMB = (vNewResolution[0] * vNewResolution[1] * vNewResolution[2]) / (1024.f * 1024.f);
    } while (vVolumeMB > mMinVolumeSizeMB);
}

It is pretty difficult to figure out exactly what is going on in this multiresolution example code because the contents of the variables are not described and the value for N is not defined in the documentation (presumably it is the number of spatial dimensions, 3). I am slightly concerned that the individual dimensions appear to be downsampled independently. In other words, each dimension d is halved only if 100 times the square of its size is greater than the product of the sizes of the remaining dimensions. I suspect this code would produce anisotropic downsampling for images with one dimension being much greater than the others (e.g., tall or wide images).
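To see the per-dimension rule in action, here is a compact re-expression of the documentation's snippet that separates the halving rule from the loop. It is a sketch under my own assumptions: three dimensions, every dimension at least 1, and the hypothetical names `Size3`, `nextResolution`, and `pyramidSizes`. It is not a confirmed description of what Imaris actually writes.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

using Size3 = std::array<std::size_t, 3>;

// Applies the documentation's rule once: dimension d is halved only if
// (10 * size_d)^2 exceeds the product of the other two sizes.
// Assumption: every entry of `last` is at least 1.
Size3 nextResolution(const Size3& last)
{
    const std::size_t volume = last[0] * last[1] * last[2];
    Size3 next = last;
    for (std::size_t d = 0; d < 3; ++d) {
        const std::size_t others = volume / last[d]; // product of the other two sizes
        if ((10 * last[d]) * (10 * last[d]) > others)
            next[d] = std::max<std::size_t>(1, last[d] / 2); // never drop to zero
    }
    return next;
}

// Collects resolution levels until the next level would be 1 MB or smaller
// (counting one byte per voxel, as the documentation's snippet does).
std::vector<Size3> pyramidSizes(const Size3& fullSize)
{
    std::vector<Size3> levels;
    Size3 current = fullSize;
    float volumeMB;
    do {
        levels.push_back(current);
        current = nextResolution(current);
        volumeMB = (current[0] * current[1] * current[2]) / (1024.f * 1024.f);
    } while (volumeMB > 1.f);
    return levels;
}
```

For a 2048 x 2048 x 64 volume, this sketch produces four levels: 2048 x 2048 x 64, 1024 x 1024 x 64, 512 x 512 x 64, and 256 x 256 x 32. The short third dimension is left untouched for the first two steps while the other two are halved, which is the independent, per-dimension downsampling described above.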

Conclusion #

I am quite interested in testing out the IMS file format with AIC instruments. However, my review of the IMS format specifications has raised more questions than answers. We will be reaching out to the Imaris developers to ask for clarification on some of the items I’ve outlined above. Although the IMS format seems stable, the IMS documentation could certainly use a refresh. For anyone else interested in working with the IMS format, we will be sure to update this post as we learn more.


Last modified Jul 15, 2020