Talk:ITK/HDF5

From KitwarePublic
Revision as of 08:04, 27 April 2011 by Glehmann

Typing

With HDF5, everything is either a group or a dataset.

ITK must be able to save many different types -- how do we store the actual ITK type in the HDF5 file?

Kent Williams This is handled in TransformIO by saving the ITK type name in the HDF5 file, which parallels the other Transform readers. There's a lookup mechanism in the itkTransformIOFactory to handle instantiation by class name.
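
(A minimal sketch of the lookup-by-name idea, not the actual itkTransformIOFactory code. The registry, the `register` decorator, the `create_by_name` helper and the stored type name string are all hypothetical, and a plain dictionary stands in for the contents of the HDF5 file.)

```python
# Hypothetical registry mapping a stored type-name string to a class.
_REGISTRY = {}


def register(type_name):
    """Class decorator that records a class under the name saved in the file."""
    def decorator(cls):
        _REGISTRY[type_name] = cls
        return cls
    return decorator


@register("AffineTransform_double_3_3")  # illustrative name only
class AffineTransform3D:
    def __init__(self):
        self.parameters = []


def create_by_name(type_name):
    """Instantiate the class registered for the type name read from the file."""
    try:
        return _REGISTRY[type_name]()
    except KeyError:
        raise ValueError(
            f"no class registered for stored type name {type_name!r}"
        ) from None


# A dict standing in for what a reader would pull out of the file.
stored = {
    "TransformType": "AffineTransform_double_3_3",
    "TransformParameters": [1.0, 0.0],
}
transform = create_by_name(stored["TransformType"])
transform.parameters = stored["TransformParameters"]
```

The reader never needs to know the concrete class in advance; an unknown name fails with an explicit error rather than a silent default.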

(Attributes may be an option for that.)

Kent Williams That is true, but once I figured out DataSets, that became my hammer for every nail. There may be some efficiency issues with using datasets when Attributes would do, but in the case of ITK, it shouldn't be an issue.

How do we store the template parameters -- do we even need to store them? Glehmann 16:06, 18 April 2011 (EDT)

Kent Williams In general, no. We can recover the native scalar type of datasets, and for most things that's enough to decide, based on context, what ITK object to instantiate. Where that isn't the case, we could store attributes to disambiguate what class of object should be created.

Composite objects

Composite objects are stored as groups in the HDF5 file and are made of one or more atomic or composite objects. Each object is named in the same way it is named in the ITK classes, without the leading "m_".
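
(A small sketch of the naming rule above, assuming nested dictionaries as a stand-in for HDF5 groups. The helper names `hdf5_name` and `to_group` and the example member names are illustrative, not taken from any ITK code.)

```python
def hdf5_name(member_name):
    """Drop a leading 'm_' so that 'm_Spacing' is stored as 'Spacing'."""
    prefix = "m_"
    if member_name.startswith(prefix):
        return member_name[len(prefix):]
    return member_name


def to_group(members):
    """Convert {member name: value} into a dict standing in for an HDF5 group.

    A nested dict is treated as a composite object and becomes a nested group;
    anything else is treated as an atomic object (a dataset).
    """
    group = {}
    for name, value in members.items():
        if isinstance(value, dict):
            group[hdf5_name(name)] = to_group(value)
        else:
            group[hdf5_name(name)] = value
    return group


# Example composite object with one atomic member and one composite member.
image = {
    "m_Spacing": [1.0, 1.0, 2.5],
    "m_LargestPossibleRegion": {"m_Index": [0, 0, 0], "m_Size": [64, 64, 32]},
}
group = to_group(image)
```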

Kent Williams Actually, there is a compound object (the HDF5 compound datatype), which is good in limited cases for fixed-size structs of POD, especially if there are many of them, i.e. an array of fixed-size structs.

Version

We may need something simpler to store the version as an attribute. Glehmann 16:06, 18 April 2011 (EDT)

Kent Williams I store it in the Image file as a string; at this point it isn't being checked. The rationale is that the version could be checked and that could influence how the data is read out of the file.

The way HDF5 files are organized and accessed, simply augmenting a file with new data is backwards compatible -- there will just be some groups/datasets/attributes whose paths are unknown to the old reader.
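
(To illustrate why augmenting a file is backwards compatible, here is a sketch in which a nested dictionary stands in for the file. The paths, the `read_v1` reader and the placeholder values are hypothetical.)

```python
# Paths that a hypothetical "old" reader knows about.
KNOWN_PATHS_V1 = ("ITKVersion", "Image/Spacing")


def lookup(root, path):
    """Walk a '/'-separated path through nested dicts (stand-ins for groups)."""
    node = root
    for part in path.split("/"):
        node = node[part]
    return node


def read_v1(root):
    """Read only the paths the old reader knows; anything else is never visited."""
    return {path: lookup(root, path) for path in KNOWN_PATHS_V1}


# A file written by a newer writer that added an extra dataset.
new_file = {
    "ITKVersion": "x.y",  # placeholder value
    "Image": {
        "Spacing": [1.0, 1.0],
        "Provenance": "added by a newer writer",
    },
}
result = read_v1(new_file)
```

The old reader addresses data by path, so the extra `Image/Provenance` entry is simply never touched.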

ImageRegion

TODO: where do we store the Dimension of the ImageRegion?

Kent Williams Right now, the WIP image file reader deduces the dimension from the size of the Direction Cosines matrix. It could be stored explicitly and accessed first if need be.

TODO: Is it good enough to assume that it can be deduced from the dimension of the Index and of the Size?

Kent Williams If we write the reader and writer, we can depend on the preconditions we impose. In this specific case, yes, the size of the Index & Spacing are equally valid markers of the Dimension of the object being read in.

TODO: What to do when the Index and the Size dimensions mismatch?

Kent Williams Fail early and fail loud!
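
(The answers above, deducing the dimension from the Index and the Size and failing loudly on a mismatch, could be sketched as follows. The function name `region_dimension` is illustrative, and plain lists stand in for the datasets read from the file.)

```python
def region_dimension(index, size):
    """Deduce the ImageRegion dimension from its Index and Size.

    Raises immediately if the two disagree, instead of guessing.
    """
    if len(index) != len(size):
        raise ValueError(
            "ImageRegion is inconsistent: Index has "
            f"{len(index)} components but Size has {len(size)}"
        )
    return len(index)


dimension = region_dimension([0, 0, 0], [64, 64, 32])
```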


Image

This is not a strict requirement, but images should be saved in chunks to allow them to be efficiently streamed (both read and write) and compressed.

Kent Williams I haven't messed with it yet, but if a file is stored with its voxel data as one contiguous array, you can still read subregions. Writing in chunks, I'm not sure about.

I think the chunk size should be one on all the dimensions but x and y. Which chunk size to choose on x and y is tricky, and may depend on the use case -- should we choose a size?
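
(A sketch of that chunking rule: one slice per chunk, i.e. a chunk size of one on every dimension except x and y. The `chunk_shape` function and its optional `tile` cap are assumptions for illustration; the question of which x/y size to choose is left open, as above.)

```python
def chunk_shape(image_size, tile=None):
    """Return a chunk shape for an image whose size is given in ITK order (x, y, z, ...).

    The chunk covers the full x/y extent, or at most `tile` voxels along x and y
    when `tile` is given, and is one voxel thick on every other dimension.
    """
    if len(image_size) < 2:
        raise ValueError("need at least an x and a y dimension")
    x, y = image_size[0], image_size[1]
    if tile is not None:
        x, y = min(x, tile), min(y, tile)
    return (x, y) + (1,) * (len(image_size) - 2)


whole_slices = chunk_shape((512, 512, 100))
tiled = chunk_shape((512, 512, 100, 4), tile=128)
```

Since HDF5 lists dimensions slowest-varying first while ITK lists x first, a writer would probably need to reverse this tuple before handing it to the HDF5 library.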

Managing versions

How to do that? The version should be stored somewhere for sure - should it be:

  • at the base of the file? in an /ITKVersion group for example?
  • in each object, as an attribute? This would make it easy to copy an object from one file to another. I quite like this method. Glehmann 16:06, 18 April 2011 (EDT)