Image File Formats
We deal with a lot of different image file formats at the AIC. Keeping track of their documentation, source code, and APIs is quite a challenge. We realized this when we did a survey of image file formats (see the blog post here). Since we went through the trouble of cataloging information on a variety of image formats, we decided to post it as a resource page for others. Below you will find two main tables—one for image formats and the other for abstract data formats. These tables contain a collection of useful links and notes. In addition, we’ve included a Feature Comparison table to serve as an overview. These tables are not exhaustive, but we tried our best to ensure their accuracy. Please let us know if we missed your favorite file format or if we got something wrong.
Image Formats
Format | Ext | Repo/Docs | Lang | Bindings | Bio-Formats | Publication | Notes |
---|---|---|---|---|---|---|---|
Tagged Image File Format (TIFF) | .tiff, .tif, .tf2, .tf8, .btf | gitlab, docs | C | | reader/writer | | the extensions .tf2, .tf8, and .btf are used to identify BigTIFF images |
OME-TIFF | .ome.tiff, .ome.tif, .ome.tf2, .ome.tf8, .ome.btf | github, docs | Java | C++ | reader/writer | Goldberg et al. 2005 | variant of TIFF which includes the OME-XML metadata block |
Keller Lab Block (KLB) | .klb | bitbucket | C++ | ImageJ, Java, julia, MATLAB, python | reader only | Amat et al. 2015 | unsupported (developer is recommending N5 instead of KLB) |
CellH5 | .ch5 | github | python | Perl, R | reader/writer | Sommer et al. 2013 | CellH5 is an HDF5 variant defined in python using h5py; designed for high-content screening data |
BigDataViewer (BDV) | .h5 & .xml | github, docs | Java | | reader only | Pietzsch et al. 2015 | BDV is an HDF5 variant with metadata stored as XML; support for this appears to be dropping in favor of N5 |
Imaris | .ims | docs | C++ | | reader only | | IMS 5.5 is an HDF5 variant; older IMS versions are not HDF5-based, and their format is described here |
H5J | .h5J | docs | | | | | H5J is an HDF5 variant; data is stored with lossy compression using the H.265 codec |
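The extensions in the table can be used for a quick first guess at a file's format. The following is a minimal sketch, not a replacement for inspecting the file contents: the function name `guess_format` and its return shape are assumptions made for illustration, and the extension lists are copied from the table above. The `.h5` extension is left out on purpose because it is shared by BDV and generic HDF5 files.

```python
from pathlib import PurePath

# Extensions copied from the Image Formats table. The compound OME-TIFF
# suffixes are checked first so ".ome.tif" is not mistaken for plain ".tif".
OME_TIFF_EXTS = (".ome.tiff", ".ome.tif", ".ome.tf2", ".ome.tf8", ".ome.btf")
TIFF_EXTS = (".tiff", ".tif", ".tf2", ".tf8", ".btf")
BIGTIFF_EXTS = (".tf2", ".tf8", ".btf")
OTHER_EXTS = {".klb": "KLB", ".ch5": "CellH5", ".ims": "Imaris", ".h5j": "H5J"}


def guess_format(filename):
    """Return (format name or None, True if the extension suggests BigTIFF)."""
    name = PurePath(filename).name.lower()
    if name.endswith(OME_TIFF_EXTS):
        return "OME-TIFF", name.endswith(BIGTIFF_EXTS)
    if name.endswith(TIFF_EXTS):
        return "TIFF", name.endswith(BIGTIFF_EXTS)
    # Anything else is looked up by its final suffix only.
    return OTHER_EXTS.get(PurePath(name).suffix), False


if __name__ == "__main__":
    for example in ("stack.ome.btf", "scan.tif", "embryo.klb", "photo.png"):
        print(example, guess_format(example))
```

An extension is only a hint: a `.tif` file may still be a BigTIFF, so a real reader should confirm the format from the file header.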
Feature Comparison
Feature | TIFF | OME-TIFF | KLB | CellH5 | BDV | Imaris | H5J |
---|---|---|---|---|---|---|---|
Parallel Read | X | X¹ | X | X | X | X | X |
Parallel Write | | | X² | | | | |
Single File | X | X | X | X | | X | X |
Multiresolution | | X³ | | | X | X | |
ImageJ/Fiji Reader | X | X | X | X | X | X⁴ | X⁵ |
ImageJ/Fiji Writer | X | X | X | X | X⁶ | | |
Imaris Reader | X | X | | | | X | |
Imaris Writer | X | X | | | | X | |
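The note on OME-TIFF parallel reads quotes the Bio-Formats docs: the readers are not thread-safe, so each parallel session needs its own fully initialized reader. One common way to follow that advice is to keep one reader per worker thread. The sketch below shows the pattern in Python using only the standard library; `PlaneReader` is a hypothetical stand-in, not the Bio-Formats API, and the function names are assumptions for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class PlaneReader:
    """Hypothetical stand-in for a reader that is not thread-safe."""

    def __init__(self, path):
        self.path = path
        self.owner = threading.get_ident()  # thread that initialized the reader

    def read_plane(self, index):
        # Guard that makes accidental sharing across threads visible.
        if threading.get_ident() != self.owner:
            raise RuntimeError("reader was shared across threads")
        return (self.path, index)  # placeholder for real pixel data


_local = threading.local()  # holds one reader per thread


def read_plane(path, index):
    # Each worker thread lazily initializes its own reader for this path.
    reader = getattr(_local, "reader", None)
    if reader is None or reader.path != path:
        reader = _local.reader = PlaneReader(path)
    return reader.read_plane(index)


def read_planes(path, indices, workers=4):
    # pool.map returns results in the same order as the input indices.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: read_plane(path, i), indices))


if __name__ == "__main__":
    print(read_planes("stack.ome.tif", range(4)))
```

The cost of this design is that every worker pays the reader initialization once, which is why the benefit of parallel reading depends on how expensive initialization is relative to the read itself.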
Abstract Data Formats
Unlike the image formats above, abstract data formats are capable of holding many different types of data (i.e., they are not restricted to storing images). In fact, it is more appropriate to think of these formats as data containers. Although they can store image data, they do not specify how that data should be organized within the container. As a result, there is no standard way to read an image from one of these containers. However, abstract data formats can be used to define an image format. For example, an Imaris .ims file is nothing more than an HDF5 file with a standardized organization.
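The difference between a container and an image format can be sketched without any HDF5 library. In the toy example below, a nested dict stands in for the container, and the "format" is simply a convention listing where the image data must live. All names and paths (`REQUIRED_PATHS`, `DataSet/Level0/Channel0/Data`) are hypothetical placeholders and are not taken from the Imaris specification.

```python
# A generic container: nested groups (dicts) holding arbitrary content.
container = {
    "DataSet": {"Level0": {"Channel0": {"Data": [[0, 1], [2, 3]]}}},
    "Notes": "free-form content that the convention does not care about",
}

# A hypothetical image format: a convention that says where the data lives.
REQUIRED_PATHS = ("DataSet/Level0/Channel0/Data",)


def lookup(node, path):
    """Walk a slash-separated path; return None if any step is missing."""
    for key in path.split("/"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def conforms(node, required=REQUIRED_PATHS):
    """True if the container follows the convention."""
    return all(lookup(node, path) is not None for path in required)


def read_image(node):
    # A reader can only be written against the convention, not the container.
    if not conforms(node):
        raise ValueError("container does not follow the expected layout")
    return lookup(node, REQUIRED_PATHS[0])


if __name__ == "__main__":
    print(read_image(container))
    print(conforms({"pixels": [1, 2, 3]}))  # valid container, unknown layout
```

Both dicts are valid containers, but only the first can be read as an image, because only the first follows the agreed layout.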
Format | Ext | Repo/Docs | Lang | Bindings | Publication | Notes |
---|---|---|---|---|---|---|
HDF5 | .h5 | bitbucket, docs | C | C++, Java, julia, Mathematica, MATLAB, python, R, more | Folk et al. 2011 | parallel HDF5 allows parallel read/write on parallel filesystems using MPI-IO; multithreaded writing of multiple files is not straightforward since hdf5.dll cannot be loaded multiple times (forum post) |
NetCDF | .nc, .cdl | github, docs | C | C++, Fortran, Java, julia, MATLAB, python, more | Rew et al. 1990, Li et al. 2003, Lee et al. 2008 | NetCDF has four different file formats: classic (CDF-1), 64-bit offset (CDF-2), 64-bit data (CDF-5), and HDF5 (netCDF-4/HDF5); NetCDF-Java is a dependency in Bio-Formats |
TileDB | .tdb | github, docs | C++ | C, Go, Java, python, R | Papadopoulos et al. 2016 | directory-of-files format |
Zarr | .zarr | github, docs | python | C++, Java | in preparation | directory-of-files format; still early days |
Not HDF5 (N5) | .n5 | github | Java | C++, Kotlin, Rust | in preparation | directory-of-files format; still early days |
Feature Comparison
Feature | HDF5 | pHDF5 | NetCDF | TileDB | Zarr | N5 |
---|---|---|---|---|---|---|
Parallel Read | X | X | X | X | X | X |
Parallel Write | | X⁷ | X⁷ | X | X⁸ | X |
Single File | X | X | X | | | |
Sparse Data | | | | X | X | X |
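One way to see why directory-of-files formats lend themselves to parallel writing is that each chunk can live in its own file, so writers never touch the same file. The sketch below is a toy layout written with the Python standard library only; it is not the on-disk specification of TileDB, Zarr, or N5, and the function names and the dot-separated chunk file names are assumptions chosen for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def chunk_path(root, index):
    # One file per chunk, named by its grid index, e.g. (0, 1) -> "0.1".
    return os.path.join(root, ".".join(str(i) for i in index))


def write_chunk(root, index, payload):
    path = chunk_path(root, index)
    with open(path, "wb") as fh:
        fh.write(payload)
    return path


def write_chunks_parallel(root, chunks, workers=4):
    """Write every chunk from its own task; no file is shared between tasks."""
    os.makedirs(root, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(write_chunk, root, idx, data)
                   for idx, data in chunks.items()]
        return sorted(f.result() for f in futures)


def read_chunk(root, index):
    # A missing file means the chunk was never written (a sparse region).
    path = chunk_path(root, index)
    if not os.path.exists(path):
        return None
    with open(path, "rb") as fh:
        return fh.read()


if __name__ == "__main__":
    root = os.path.join(tempfile.mkdtemp(), "toy_array")
    write_chunks_parallel(root, {(0, 0): b"aa", (0, 1): b"bb", (1, 1): b"cc"})
    print(sorted(os.listdir(root)))
    print(read_chunk(root, (1, 0)))
```

The same layout also hints at how sparse data can be handled: chunks that were never written simply have no file on disk.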
1. According to the Bio-Formats docs, “the Bio-Formats file readers are not thread-safe. If files are read within a parallelized environment, a new reader must be fully initialized in each parallel session.”
2. Parallelization capacity decreases as block size increases.
3. Possible with the newer OME-TIFF pyramid format (see docs and blog post).
4. Can be read by Bio-Formats or BDV’s experimental reader.
5. Using the Fiji reader plugin.
6. Using the Fiji writer plugin.
7. Only on parallel file systems.
8. Not available in the Zarr library, but possible when used with Dask and another parallel library.
Last modified May 25, 2020