# Proposal

itkImage should contain information regarding the orienttaion of the image volume.

# Coordinate Systems

Coordinate systems are an important part of any medical application. Medical scanners create regular, "rectangular" arrays of points and cells. The topology of the arrays is implicit in the representation. The geometric location of each point is also implicit.

# vtk and itk Image

vtk and itk store meta-information that can be used to convert between image coordinates (i-j-k) and world coordinates (x-y-z). Each image has an origin that is a 3-D (n-D for itk) vector. The origin (x-y-z) specifies the world coordinate of the first point in memory. The spacing specifies the distance between points along each axis. Using the spacing and origin, the transformation between i-j-k and x-y-z is a fast, simple computation. x(yz) = origin + i(jk) * spacing; i(jk) = (x(yz) - origin) / spacing;

# Limitations of the current Image

Many image processing and segmentation algrothms do not need additional spatial information. But, registration and modeling techniques need to respect the orientation of the image arrays.

# Proposed Extension to itk::Image

By adding an itk::Matrix containing direction cosines, the i-j-k -> x-y-z transformation could include the orientation in the computation. Adding the matrix will not change the existing API. All index to point calculations are confined to itk::Image. The transformations in matrix form are: XYZ = To * Rc * Ss * IJK where To is a translate to the origin, Rc is the matrix of direction cosines and Ss is a scale matrix of spacings. There are performance considerations, so the implementation may cache some internal matrices and state.

# Questions Raised by Proposal

• The coordinate frame in which the direction cosines are measured needs to be either:
• identified explicitly in the Image, so that the image itself knows if it is, for example, RAS (from NIFTY-1 data) versus LPS (from DICOM data)
• not explicitly identified in the Image, but convertible from whatever is native to the input file, to whatever the application prefers to use.
• Consensus on the 4 March 2005 tcon seemed to favor the second choice, using an (optional) orientation filter to impose one chosen orientation. This works most cleanly for axis aligned orientations, this is most commonly done for 3-D (4-D is also quite likely in fMRI, cardiac and other time based imaging methods).

# External References

• How NRRD handles orientation information:
• Some info on Analyze format:

http://wideman-one.com/gw/brain/analyze/formatdoc.htm The ITK Analyze filter places itk::SpatialOrientation tags in the MetaDataDictionary. A utility class itkOrientImageFilter takes inputs of one 3D image, it's orientation, and the desired orientation and internally uses the flip and permute filters to acheive the desired behavior.

``` /**
* \enum ValidAnalyzeOrientationFlags
* Valid Orientation values for objects
* - Key  Description           Origin   dims[1]  dims[2]  dims[3]
* - =================================================================
* - 0    transverse-unflipped   IRP       R->L     P->A    I->S
* - 1    coronal-unflipped      IRP       R->L     I->S    P->A
* - 2    sagittal-unflipped     IRP       P->A     I->S    R->L
* - 3    transverse-flipped     IRA       R->L     A->P    I->S
* - 4    coronal-flipped        SRP       R->L     S->I    P->A
* - 5    sagittal-flipped       ILP       P->A     I->S    L->R
* - Where the Origin disignators are with respect to the patient
* - [(I)nferior|(S)uperior] [(L}eft|(R)ight] [(A)nterior|(P)osterior]
* \note Key's 0-5 correspond to the Analyze v7.5 orientations, and should not be changed.
*/
switch (temporient)
{
case itk::AnalyzeImageIO::ITK_ANALYZE_ORIENTATION_RPI_TRANSVERSE:
coord_orient = itk::SpatialOrientation::ITK_COORDINATE_ORIENTATION_RPI;
}

```
```   orienter->SetDesiredCoordinateOrientation( DesiredImageOrientation );
orienter->SetGivenCoordinateOrientation( fixedOrientation );
orienter->SetInput( inputFixedImage );
```
• DICOM can be deciphered in a way similar to the Analyze method as described by David Clunie in this mail message
• Info from Tosa Yasunari about how Freesurfer handles coordinates:
• A lot of good documentation is included in the nifti1.h header file:
• The DICOM Standard in PDF:

http://medical.nema.org/dicom/2004.html Look in particular at http://medical.nema.org/dicom/2004/04_04PU.PDF section C.7.6.2.1.1, page 275 for a description of the coordinate transformation details.

# Some notes on the DICOM convention and current ITK usage

### Tags

Image Orientation (Patient) (0020,0037) The direction cosines of the first row and the first column with respect to the patient.

Image Position (Patient) (0020,0032) The x, y, and z coordinates of the upper left hand corner (center of the first voxel transmitted) of the image, in mm.

DICOM Plane Attribute Descriptions C.7.6.2.1.1 Image Position And Image Orientation The Image Position (0020,0032) specifies the x, y, and z coordinates of the upper left hand corner of the image; it is the center of the first voxel transmitted.

### C.7.6.2.1.1 Image Position And Image Orientation

Image Orientation (0020,0037) specifies the direction cosines of the first row and the first column with respect to the patient. These Attributes shall be provide as a pair. Row value for the x, y, and z axes respectively followed by the Column value for the x, y, and z axes respectively.

The direction of the axes is defined fully by the patient's orientation. The x-axis is increasing to the left hand side of the patient. The y-axis is increasing to the posterior side of the patient. The z-axis is increasing toward the head of the patient. The patient based coordinate system is a right handed system, i.e. the vector cross product of a unit vector along the positive x-axis and a unit vector along the positive y-axis is equal to a unit vector along the positive z-axis.

Note: If a patient lies parallel to the ground, face-up on the table, with his feet-to-head direction same as the front-to-back direction of the
imaging equipment, the direction of the axes of this patient based coordinate
system and the equipment based coordinate system in previous versions of this
Standard will coincide. The Image Plane Attributes, in conjunction with the
Pixel Spacing Attribute, describe the position and orientation of the image
slices relative to the patient-based coordinate system.
In each image frame the Image Position (Patient) (0020,0032) specifies the
origin of the image with respect to the patient-based coordinate system.

Reference Coordinate System (RCS) and the Image Orientation (Patient) (0020,0037) attribute values specify the orientation of the image frame rows and columns. The mapping of pixel location (i,j) to the RCS is calculated as follows:

``` $\displaystyle P_x$
= [ $\displaystyle X_x$
$\displaystyle \Delta_i$
Y_x $\displaystyle \Delta_j$
0    $\displaystyle S_x$
] [i ]
$\displaystyle P_y$
[ $\displaystyle X_y$
$\displaystyle \Delta_i$
Y_y $\displaystyle \Delta_j$
0    $\displaystyle S_y$
] [j ]
$\displaystyle P_z$
[ $\displaystyle X_z$
$\displaystyle \Delta_i$
Y_z $\displaystyle \Delta_j$
0    $\displaystyle S_z$
] [0 ]
$\displaystyle 1$
[ $\displaystyle 0$
0        0    1   ] [1 ]

```

where :

$\displaystyle P_{xyz}$ - The coordinates of the voxel (i,j) in the frame's image plane in units of mm.

$\displaystyle S_{xyz}$ - The three values of the Image Position (Patient) (0020,0032) attributes. It is the location in mm from the origin of the RCS.

$\displaystyle X_{xyz}$ - The values from the row (X) direction cosine of the Image Orientation (Patient) (0020,0037) attribute.

$\displaystyle Y_{xyz}$ - The values from the column (Y) direction cosine of the Image Orientation (Patient) (0020,0037) attribute.

$\displaystyle i$ - Column index to the image plane. The first column is index zero.

$\displaystyle \Delta_i$ - Column pixel resolution of the Pixel Spacing (0028,0030) attribute in units of mm.

$\displaystyle j$ - Row index to the image plane. The first row index is zero.

$\displaystyle \Delta_j$ - Row pixel resolution of the Pixel Spacing (0028,0030) attribute in units of mm.

1. The row and column direction cosine vectors shall be orthogonal, i.e. their dot product shall be zero.
2. The row and column direction cosine vectors shall be normal, i.e. the dot product of each direction cosine vector with itself shall be unity.

The direction cosine for slices (k) is determined by the cross product of the two provided direction cosines. The coordinate system is right handed and the direction cosine must be unit magnitude.

Determination of whether consecutive slices were acquired in a +Z or a -Z direction requires comparison of the Image Position (Patient) tag values of at least two slices.

### Some other useful DICOM tags include:

Pixel Spacing (0028,0030) Physical distance in the patient between the center of each pixel, specified by a numeric pair - adjacent row spacing (delimiter) adjacent column spacing in mm.

Slice Thickness (0018,0050) Nominal slice thickness, in mm. Note that the value of Slice Thickness can be equal to, larger than, or smaller than, the distance between adjacent slices.

### DICOM on Diffusion MRI

Diffusion Gradient Orientation (0018, 9089) - The direction cosines of the diffusion gradient vector with respect to the patient. Required if Frame Type (0008,9007) Value 1 of this frame is ORIGINAL. May be present otherwise.

### Current ITK Usage and sources of confusion

It seems unfortunate to have a letter code selection for axes that is opposite to that used by DICOM and hence the scanner manufacturers.

itkSpatialOrientation.h -

```   //  Coordinate orientation codes have a place-value organization such that
//  an ImageDimension-al sequence of subcodes says both which varies
//  fastest
//  through which varies slowest, but also which end of the frame of
//  reference
//  is considered zero for each of the coordinates.  For example, 'RIP'
//  means
//  Right to Left varies fastest, then Inferior to Superior, and Posterior
//  to
//  Anterior varies the slowest.

```

I take this to mean R implies the fastest varying axis runs from Right to Left. DICOM would give this the code L, as described here:

C.7.6.1.1.1 Patient Orientation The Patient Orientation (0020,0020) relative to the image plane shall be specified by two values that designate the anatomical direction of the positive row axis (left to right) and the positive column axis (top to bottom). The first entry is the direction of the rows, given by the direction of the last pixel in the first row from the first pixel in that row. The second entry is the direction of the columns, given by the direction of the last pixel in the first column from the first pixel in that column. Anatomical direction shall be designated by the capital letters: A (anterior), P (posterior), R (right), L (left), H (head), F (foot). Each value of the orientation attribute shall contain at least one of these characters. If refinements in the orientation descriptions are to be specified, then they shall be designated by one or two additional letters in each value. Within each value, the letters shall be ordered with the principal orientation designated in the first character.

Example code for this is available here : http://www.dclunie.com/medical-image-faq/html/part2.html

Consider a DICOM file with direction cosines: 1.000000 0.000000 0.000000 0.000000 0.000000 -1.000000

Interpretation in DICOM terms: orientation is CORONAL, it turns out slices running from A to P. unit vector along row is L i.e. a unit vector along the row points from patient Right to Patient Left. unit vector along column is I (i.e. columns from the Superior to the Inferior)

row : L columns : I slices : P

Three letter code for (column index, row index, slice index is) ILP. This indicates the origin is at the superior, right, anterior corner of the volume, and that the axes run from superior to inferior, from right to left, from anterior to posterior.

METAIMAGE format offers a complete set of tags for storing a patient's orientation within an image, the image's origin, the orientation of the image with respect to the patient, and the spacing of the voxels in the image.

```  See http://caddlab.rad.unc.edu/software/MetaIO/MetaIO-Introduction.pdf
```

The relevant MetaObject tags are AnatomicalOrientation, ElementSpacing, Orientation, and Origin.

• In AnatomicOrientation each axis is given a label [R|L] | [A|P] | [S|I].
• The ElementSpacing tracks inter-voxel spacing for each axis (called Scaling above). MetaImage can also track slice thickness as ElementSize, but as noted above, it is ElementSpacing that is used to form an image in space.
• Orientation is determined by the two vectors stored in, for example, CTN's DCM_RELIMAGEORIENTATIONPATIENT tag (0020 0032) and their cross product.
• Origin is determined by CTN's DCM_RELIMAGEPOSITIONPATIENT tag (0020 0037).

The MetaIO package in ITK stores these data in the appropriate/existing ITK Image fields. AnatomicOrientation is stored in an itk Image's metaDataDictionary. Spacing and origin are stored directly in the image. If the image is read as a spatialObject, the orientation matrix is used with spacing and origin to position the object in a SpatialObjectScene. See the chapter on SpatialObjects in itk's SoftwareGuide.

The InsightSNAP tool in InsightApplications (which has a great IO loader by the way), uses the following: When it says LPS it means the x axis is RUNNING FROM left to right, the y axis is running from Posterior to Anterior, the z axis is running from Superior to Inferior.

This is exactly the same as what DICOM means when it says RAI.

# The third coordinate system: measurement or laboratory frame

Whenever non-scalar pixel values involve vectors (or tensors) with coefficients that are expressed with respect to some coordinate frame, the orientation of that coordinate frame should also be known. This frame, which can be called the measurement frame, is conceptually distinct from both the "world" space around the array (such as right-anterior-superior), and the image-aligned basis determined by the direction cosines.

With medical data, the measurement frame becomes an issue for expressing the diffusion-sensitizing gradient directions used in diffusion-weighted (DW) MRI: the diffusion tensors estimated from DW-MRI will likely be expressed in whatever frame the gradients were expressed in. This frame could be, for example, the same as the one used for world space, or the image space frame. The DICOM spec includes a field (see notes above) for identifying the space of the gradients, but apparently scanners are not consistently responsible in their setting of this field, so it may have to be manually recorded outside the DICOM file.

But outside the specific context of DW-MRI and DT-MRI, the measurement frame is an issues for any kind of vector and tensor data, so perhaps it should have representation/handling in ITK.

• Are there already established conventions (from ITK users/code) for the frame in which ITK vectors are expressed?
• Supposing another matrix of direction cosines is added to the itk::Image to represent the orientation of the measurement frmae, one still has to chose which frame is being used as the basis for the cosines- it is image frame or the world frame? Or should this be left as an application-specific choice?
• When an image is rotated, who is responsible for updating vector- or tensor-valued pixels? Should the pixel values actually be changed, or is it enough to update a matrix representing the measurement frame?

# Recursive Templates for fast computation of Index-Point transform

Conversions betweeen index and physical point coordinates are critical for the performance of many ITK algorithms. It is desirable to speed up those operations even at the cost of some code complexity. The following is a proposal for a helper class that will compute the transformations between indices and physical points by taking advantage of the capabilities of C++ templates for performing recursion.

The basic concept of template recursion is illustrated in the following example

``` /** Compute the sum of elements of a vector */
template <unsigned int N>
class Series
{
typedef int Vector[10];
public:
static double Compute(Vector & vector)
{
return vector[N-1] + Series<N-1>::Compute( vector ); // recursion
}
private:
};
double Series<1>::Compute(Vector & vector) // Specialization for terminating the recursion
{
return vector[0];
}
```

and it would be used as

``` int main()
{
double sum;
int values[10];
sum = Series<10>::Compute(values);
return 0;
}
```

The current proposal would involve to

• Move the TranformIndexToPhysicalPoint from itk::Image to itk::ImageBase
• Move the TranformIndexToContinuousIndex from itk::Image to itk::ImageBase
• Move the TranformPhysicalPointToIndex from itk::Image to itk::ImageBase
• Add an ImageHelper class (classname to be discussed)
• Invoke the Helper from the three methods above

A tentative version of the helper is in the following code:

``` template <unsigned int NImageDimension, class TPoint, unsigned int N>
class ImageHelper
{
public:
typedef ImageBase<NImageDimension>         ImageType;
typedef typename ImageType::IndexType      IndexType;
typedef typename ImageType::SpacingType    SpacingType;
typedef typename ImageType::PointType      OriginType;
typedef typename TPoint::ValueType         TCoordRep;
itkStaticConstMacro(ImageDimension, unsigned int,
NImageDimension );
itkStaticConstMacro(DN, unsigned int, N-1);
static void TransformIndexToPhysicalPoint(
const SpacingType & spacing,
const OriginType  & origin,
const IndexType   & index, TPoint   & point )
{
point[DN] = static_cast<TCoordRep>( spacing[DN] *
static_cast<double>( index[DN] ) + origin[DN] );
// Do next dimension, in order to follow the recursion
ImageHelper<NImageDimension,TPoint,DN>
::TransformIndexToPhysicalPoint(spacing,origin,index,point);
}
};
```

## Comments on recursive templates (Miller)

If we move the TransformIndexToPhysicalPoint(), TransformPhysicalPointToIndex() methods to ImageBase, we'll need to study the implact on itk::SpecialCoordinatesImage and itk::PhasedArray3DSpecialCoordinatesImage.

## Comments on recursive templates (Lorensen)

I just did an experiment with my OrientedImage implementation. I unrolled the IndexToPhysicalPoint computation for a 3D OrientedImage (to simulate using recursive templates). Using my Borland compile, the computation for 10,000,000 calls went from 11.5 to .75 seconds! For PhysicalPointToIndex, the speed went from 14.37 to 4.88. Here's a table summarizing the timings.

Method Image OrientedImage OrientedImage(unrolled)
IndexToPhysical .611 11.487 .75
PhysicalToIndex 3.725 14.371 4.88

The same tests on the same machine with the Visual Studio 7 compiler:

Method Image OrientedImage OrientedImage(unrolled)
IndexToPhysical 0.091 4.757 .02
PhysicalToIndex 1.622 4.085 1.722

# Using an embedded transform for registration

If the image origin and orientation are stored as a matrix-offset transform within the image base class, then ITK's registration framework can be adapated to directly optimize this "embedded transform" during registration. This may reduce image registration time by up to 33%.

The underlying concept is that an image has a pose in a scene/space. This pose captures the acquisition orientation of an image as well as how that image has been moved in space (e.g., as captured by intra-operative data). This concept is essentially the motivation behind spatialObjects. An image's embedded transform would be equal to the index-to-world transform maintained as a helper member variable in spatialObjects.

Image-image registration currently involves three transforms: (1) the fixed image's indexToPhysicalSpace transform, (2) the transform being optimized, and (3) the moving images's physicalSpaceToindex transform. The embedded transform within an image can combine transforms (2) and (3).

For registration it is common for the transform being optimized to directly or indirectly involve a matrix-offset transform, i.e., the proposed form of an image's embedded transform. Directly involvement occurs when the transform to be optimized is affine, similarity, rigid, rotation, or translation. Indirect involvement occurs when affine, similarity, ... is applied prior to, or iteratively with, deformable registration - so that the deformation parameterization can focus on capturing local residuals.

When a transform is computed on one image and is needed to be applied to another image, the image acquisition component of orientation and offset can be decomposed from the embedded transform. Specifically, an image's acquisition orientation and offset should be an invertible transform. That inverted tranform can be composed with the embedded transform to resolve only the portion of pose due to movement (e.g., intra-operative data).

The embedded transform can be used for registration by extending the existing APIs for the registration framework and the image base class. It should be backward compatible. In particular, 1) The changes proposed by Bill's/Luis' proposal would be unchanged. 2) The registration framework would be extended to include ImageTransforms: RigidImageTransform, AffineImageTransform, etc. These classes (derived from itkTransform) accept an image as a template/member variable. They provide different parameterizations of the image's embedded transform. 3) The image base class will neeed get/setTransformParameters methods to provide direct access to the embedded transform. 4) Optionally we could install interpolators into the image base...in fact, I think itkImage once had interpolators in it...

The result is that many registration optimization tasks will be reduced to two transforms: (1) fixed image's indextophysicalspace transform and (2) moving image's embedded orientation transform. Image resmpling to account for the affine component of a deformable registration would be eliminated. An image would be a complete representation of an object in space.

## Comments on embedded transform (Miller)

I like the idea of combining the registration transform with the physical coordinate to index transformation to save computations. We'd have to run some timing tests to determine whether it save 33% of the computation time (the physical to index transformation is much simpler than the current registration transformation, so it probably requires much less time).

However, I am not sure about "embedding" this transformation in the image. The same effect can be gained by having the registration framework simply use a transformation that takes into account the registration alignment and transformation from the physical space to index space. (Can this be taken even further where the transformation is the full transformation needed to go from the fixed image index space to the moving image index space?)

The matrix offset transform for an image currently represents the "original" relationship between the index coordinate frame and the acquisition or physical frame. If we alter this transformation via registration, then we loose information on the original coordinate frame. All that we are left with is the composite of the original coordinate frame and the transformation relating the original coordinate frame to the new coordinate frame.

# Cosine Direction Vectors, and itk::SpatialOrientation (Kent Williams)

I am currently working through the use of orientation in reading and writing NIFTI files. I think I've got making direction vectors from the nifti header info worked out, or at least I can only be wrong in a couple of well defined ways, if you know what I mean.

But there is also the issue of itk::SpatialOrientation, and how it relates to the direction cosine vectors. The both are descriptors of almost the same thing, from different points of view.

In itk::AnalyzeImageIO, the Metadata Object named "ITK_CoordinateOrientation" is set to the itk::SpatialOrientationValidAnalyzeOrientationsFlags value derived from the Analyze Header, but SetDirection is never called.

In itk:GDCMImageIO, the Direction Vectors are set but the ITK_CoordinateOrientation flag is not set.

In itk:OrientImageFilter, you can use either CoordinateOrientation or Direction Vectors to specify the given and desired orientation.

There is a defined mapping of CoordinateOrientation codes and Direction Vectors:

For each CoordinateOrientation code, there is a set of direction vectors, which is along unit axes, and describes unambiguously the patient's orientation in the volume.

For each possible set of perpendicular, normalized, direction vectors, there is a closest CoordinateOrientation code. The CoordinateOrientation code are determined by the indices of the largest component in each direction vector.

The problem is that as it stands, is there are the following use cases for the direction/orientation concepts:

1. Ignore them entirely.
2. Use Direction Vectors
3. Use CoordinateOrientation codes
4. Use Both, and the application explicitly keep them consistent.
5. Use Both, but use one part of the time and the other the rest of the time.

This is forced to be an application design decision. Case 1 is useful for some applications, but at some point image data must represent some object in the real world, so it's hard to envision an application that is completely agnostic on orientation. Cases 2 or 3 means that the application enforces consistent usage of one or the other end-to-end. Cases 4 and 5 are maintenance and debugging nightmares.

So... here is a modest proposal for dealing with this issue:

2. Add methods to itk::ImageBase that accept an itk::SpatialOrientationValidAnalyzeOrientationsFlags value and sets the Direction Cosine Vectors.
3. Add a method to itk::ImageBase that returns an itk:SpatialOrientationValidAnalyzeOrientationsFlags value 'closest to' the current Direction Cosine Vectors.
4. In the case of ImageIO objects, set _both_ the Flags and the Directions when an image is read in. When writing a file, always use the Direction Vectors to determine orientation.

This actually is a very small change, which would break minimal code out in the wild, except where there is a dependency on the Orientation Flags being in the Metadata. The transformation between Directions and CoordinateOrientation flags is well defined and code already exists in various places. But I wanted feedback on whether this would be the best way to go about addressing the problem, or if others have better ideas.

# Proposed Helper class to remove dependency on itk::SpatialOrientation from itk::ImageBase(Kent Williams)

I went ahead and implemented the proposal above, which was a step towards improved consistency with regard to SpatialOrientation.

After review and discussion we decided that for a number of reasons, the SpatialOrientation stuff did not belong in itk::ImageBase. This is my proposal to rectify the situation by way of a helper class:

```//
template <typename OrientationRepresentation, unsigned int Dimension>
{
public:
typedef typename itk::ImageBase<Dimension> ImageType;
typedef typename ImageType::DirectionType DirectionType;
// define the conversion function signatures, but don't provide implementations,
// so it's a compile time error if the no conversions are provided
OrientationRepresentation FromDirections(const DirectionType &Directions);
DirectionType ToDirections(const OrientationRepresentation &Representation);
};
//
// define specializations for the specific orientation representations
template <>
itk::SpatialOrientation::ValidCoordinateOrientationFlags
::FromDirections(const itk::ImageBase<3>::DirectionType &dir)
{
}
template <>
itk::ImageBase<3>::DirectionType
::ToDirections(const itk::SpatialOrientation::ValidCoordinateOrientationFlags &Rep)
{
// convert representation to direction cosines
}
```

The idea here is to allow for different sorts of Spatial Orientation representations, without having any code generated unless it's used. In the particular case of itk::SpatialOrientation, only 3D images make sense, so you explicitly implement that case, and leave the rest to cause compile-time errors.

Other representations of orientation may be more general; unfortunately the most elegant way to represent the general case for a specific orientation type uses partial specialization i.e.

```typedef std::string SomeOrientation; // for example

template <unsigned int Dimension>
itk::ImageBase<Dimension>::DirectionType