Image File Formats

We deal with a lot of different image file formats at the AIC. Keeping track of their documentation, source code, and APIs is quite a challenge. We realized this when we did a survey of image file formats (see the blog post here). Since we went through the trouble of cataloging information on a variety of image formats, we decided to post it as a resource page for others. Below you will find two main tables: one for image formats and the other for abstract data formats. These tables contain a collection of useful links and notes. In addition, each main table is followed by a Feature Comparison table that serves as an overview. These tables are not exhaustive, but we tried our best to ensure their accuracy. Please let us know if we missed your favorite file format or if we got something wrong.

Image Formats

| Format | Ext | Repo/Docs | Lang | Bindings | Bio-Formats | Publication | Notes |
|---|---|---|---|---|---|---|---|
| Tagged Image File Format (TIFF) | .tiff, .tif, .tf2, .tf8, .btf | gitlab, docs | C | | reader/writer | | the extensions .tf2, .tf8, and .btf are used to identify BigTIFF images |
| OME-TIFF | .ome.tiff, .ome.tif, .ome.tf2, .ome.tf8, .ome.btf | github, docs | Java | C++ | reader/writer | Goldberg et al. 2005 | variant of TIFF which includes the OME-XML metadata block |
| Keller Lab Block (KLB) | .klb | bitbucket | C++ | ImageJ, Java, julia, MATLAB, python | reader only | Amat et al. 2015 | unsupported (developer is recommending N5 instead of KLB) |
| CellH5 | .ch5 | github | python | Perl, R | reader/writer | Sommer et al. 2013 | CellH5 is an HDF5 variant defined in python using h5py; designed for high-content screening data |
| BigDataViewer (BDV) | .h5 & .xml | github, docs | Java | | reader only | Pietzsch et al. 2015 | BDV is an HDF5 variant with metadata stored as XML; support for this appears to be dropping in favor of N5 |
| Imaris | .ims | docs | C++ | | reader only | | IMS 5.5 is an HDF5 variant; older IMS versions are not HDF5-based, and their format is described here |
| H5J | .h5J | docs | | | | | H5J is an HDF5 variant; data is stored lossy using the H.265 codec |

Feature Comparison

TIFF OME-TIFF KLB CellH5 BDV Imaris H5J
Parallel Read X X[1] X X X X X
Parallel Write X[2]
Single File X X X X X X
Multiresolution X[3] X X
ImageJ/Fiji Reader X X X X X X[4] X[5]
ImageJ/Fiji Writer X X X X X[6]
Imaris Reader X X X
Imaris Writer X X X
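
The footnote on parallel reads with Bio-Formats says that each parallel session needs its own fully initialized reader. Below is a minimal sketch of that pattern in Python. `PlaneReader` is a hypothetical stand-in for a reader that is not thread-safe (it is not the Bio-Formats API), and `get_reader`, `read_plane`, and `read_all` are names invented for this illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class PlaneReader:
    """Hypothetical stand-in for a reader that is not thread-safe."""

    def __init__(self, path):
        # "Full initialization": every instance keeps its own private state.
        self.path = path
        self.owner = threading.get_ident()

    def read_plane(self, index):
        # Fail loudly if an instance is ever shared between threads.
        if threading.get_ident() != self.owner:
            raise RuntimeError("reader used outside the thread that created it")
        return (self.path, index)  # placeholder for real pixel data


_local = threading.local()  # one independent namespace per thread


def get_reader(path):
    """Return the calling thread's reader for path, creating it on first use."""
    readers = getattr(_local, "readers", None)
    if readers is None:
        readers = _local.readers = {}
    if path not in readers:
        readers[path] = PlaneReader(path)
    return readers[path]


def read_plane(path, index):
    return get_reader(path).read_plane(index)


def read_all(path, n_planes, workers=4):
    """Read planes in parallel; no reader instance crosses a thread boundary."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: read_plane(path, i), range(n_planes)))
```

Calling `read_all("stack.tif", 8)` returns the eight placeholder planes in order. The design choice here is to pay the initialization cost once per thread rather than once per read, which matters when opening a reader is expensive.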

Abstract Data Formats

Unlike the image formats above, abstract data formats are capable of holding many different types of data (i.e., they are not restricted to storing images). In fact, it is more appropriate to think of these formats as data containers. Although they can store image data, they do not specify how that data should be organized within the container. As such, there are no standards for reading images of these types. However, abstract data formats can be used to define an image format. For example, an Imaris .ims file is nothing more than an HDF5 file with a standardized organization.
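
To make the container-versus-format distinction concrete, here is a small sketch that uses a ZIP archive as a stand-in for a generic container (chosen only so the example needs nothing beyond the Python standard library; it is not HDF5). The "toy image" convention, the entry names `meta.json` and `pixels.raw`, and both function names are assumptions made up for this illustration.

```python
import io
import json
import zipfile


def write_toy_image(pixels, width, height):
    """Pack 8-bit grayscale pixels into a ZIP that follows the toy convention."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The convention: metadata in meta.json, row-major pixels in pixels.raw.
        archive.writestr("meta.json", json.dumps({"width": width, "height": height}))
        archive.writestr("pixels.raw", bytes(pixels))
    return buffer.getvalue()


def read_toy_image(blob):
    """Read rows of pixels, but only from containers that follow the convention."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        missing = {"meta.json", "pixels.raw"} - set(archive.namelist())
        if missing:
            # A valid container, but not organized as a toy image.
            raise ValueError(f"not a toy image, missing: {sorted(missing)}")
        meta = json.loads(archive.read("meta.json"))
        pixels = archive.read("pixels.raw")
    width, height = meta["width"], meta["height"]
    return [list(pixels[r * width:(r + 1) * width]) for r in range(height)]
```

Any ZIP file is a valid container, but `read_toy_image` can only interpret the ones laid out according to the agreed convention. That is the same relationship the paragraph above describes between HDF5 and an HDF5-based image format.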

| Format | Ext | Repo/Docs | Lang | Bindings | Publication | Notes |
|---|---|---|---|---|---|---|
| HDF5 | .h5 | bitbucket, docs | C | C++, Java, julia, Mathematica, MATLAB, python, R, more | Folk et al. 2011 | parallel HDF5 allows parallel read/write on parallel filesystems using MPI-IO; multithreaded writing of multiple files is not straightforward since hdf5.dll cannot be loaded multiple times (forum post) |
| NetCDF | .nc, .cdl | github, docs | C | C++, Fortran, Java, julia, MATLAB, python, more | Rew et al. 1990, Li et al. 2003, Lee et al. 2008 | NetCDF has four different file formats: classic (CDF-1), 64-bit offset (CDF-2), 64-bit (CDF-5), and HDF5 (netCDF-4/HDF5); NetCDF-Java is a dependency in Bio-Formats |
| tileDB | .tdb | github, docs | C++ | C, Go, Java, python, R | Papadopoulos et al. 2016 | Directory-of-files format |
| Zarr | .zarr | github, docs | python | C++, Java | in preparation | Directory-of-files format; still early days |
| Not HDF5 (N5) | .n5 | github | Java | C++, Kotlin, Rust | in preparation | Directory-of-files format; still early days |
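
As a rough illustration of what "directory-of-files" means, the sketch below stores a small 2D image as one file per block of rows plus a JSON metadata file. This is a toy layout invented for this page, not the on-disk layout of tileDB, Zarr, or N5; the file names `meta.json` and `chunk_<i>.bin` and the functions `write_store` and `read_row` are all assumptions.

```python
import json
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_store(root, image, chunk_rows):
    """Write a 2D list of 8-bit values as one file per block of rows."""
    os.makedirs(root, exist_ok=True)
    height, width = len(image), len(image[0])
    n_chunks = -(-height // chunk_rows)  # ceiling division
    with open(os.path.join(root, "meta.json"), "w") as f:
        json.dump({"shape": [height, width], "chunk_rows": chunk_rows}, f)

    def write_chunk(i):
        # Each chunk goes to its own file, so writers never share a file handle.
        rows = image[i * chunk_rows:(i + 1) * chunk_rows]
        with open(os.path.join(root, f"chunk_{i}.bin"), "wb") as f:
            f.write(bytes(v for row in rows for v in row))

    with ThreadPoolExecutor() as pool:
        list(pool.map(write_chunk, range(n_chunks)))  # list() surfaces errors
    return n_chunks


def read_row(root, row):
    """Read one image row by opening only the chunk file that contains it."""
    with open(os.path.join(root, "meta.json")) as f:
        meta = json.load(f)
    width = meta["shape"][1]
    chunk, offset = divmod(row, meta["chunk_rows"])
    with open(os.path.join(root, f"chunk_{chunk}.bin"), "rb") as f:
        f.seek(offset * width)  # one byte per pixel
        return list(f.read(width))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = os.path.join(tmp, "toy_store")
        image = [[r * 4 + c for c in range(4)] for r in range(5)]
        print(write_store(store, image, chunk_rows=2), "chunk files written")
        print(read_row(store, 4))
```

In this sketch the number of independent write tasks equals the number of chunk files, so choosing a larger `chunk_rows` leaves fewer tasks to run in parallel.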

Feature Comparison

HDF5 pHDF5 NetCDF tileDB Zarr N5
Parallel Read X X X X X X
Parallel Write X[7] X[7] X X[8] X
Single File X X X
Sparse Data X X X

  1. according to the Bio-Formats docs, “the Bio-Formats file readers are not thread-safe. If files are read within a parallelized environment, a new reader must be fully initialized in each parallel session.”

  2. parallelization capacity decreases as block size increases

  3. possible with the newer OME-TIFF pyramid format (see docs and blog post)

  4. can be read by Bio-Formats or BDV’s experimental reader

  5. using the Fiji reader plugin

  6. using the Fiji writer plugin

  7. only on parallel file systems

  8. not available in the Zarr library, but possible when used with Dask and another parallel library


Last modified May 25, 2020