[Insight-users] Speed optimizations in mutual information registration

Luis Ibanez luis.ibanez@kitware.com
Thu, 23 May 2002 11:45:02 -0400


Hi Hannu,

Thanks a lot for your feedback !

We are interested the localized changes
that you proposed in order to increase
performance.  Metrics are certainly the
bottleneck of registration. Any improvement
on the metrics will reflect directly on the
overall performance of the registration method.

To give you some historical background (though
"history" here means a couple of months :-)

The ITK registration framework was heavely templated.

The original implementation had a "RegistrationMethod"
class templated over:

1) the type of the images to register,
2) the optimization method to use
3) the metric to compare the images
4) the interpolator to estimate intensity
      in non-grid positions
5) the transform to map one image into the other.

which among other things involved (as you propose now)
to have the number of parameters of the transform as
one of the template parameters.

This Generic Programming approach lead to some
maintenance and technical difficulties. The most
significant of them being the extreme difficulty
for wrapping these classes so they can be used
from Tcl, Python and Java. ITK tries to balance
the needs of a large variety of users, which
includes Tcl, Python and Java developers.

Since wrapping requires to explicitly instantiate
the particular combination of the types, it was
quite difficult to produce wrappings in an automated
way.


We de-templated this implementation and move toward
a more traditional C++ approach using virtual
functions, polymorphism and hierarchies. Which is
the current implementation.

There are some nice advantages in the current approach.
For example: You can change at run time the type of
optimizer or the metric your are using.   A road that
has not been explored in all its potential in current
registration methods.

You can imagine for example a Smart registration technique
in which components can be switched according to their
performance. The typical case being: start a registration
using a rigid transform, switch later to Affine and finaly
apply a deformable transform. All this guided by a Fuzzy
Logic Controler with rules like:

IF metric value is high THEN use affine transform
IF step size is high THEN use Gradient descent optimizer
IF image resolution is low THEN use linear interpolation
IF image resolution is high THEN use nearest neighbor

and so on...

which are after all the heuristic rules that we all
use when we think about the registration process but
we do not always find an easy way to implement. ITK, by
offering all these components in an easy plug-in
presentation opens the way for these new approaches.

Such smarter methods could not be implemented in a
templated approach since all the decisions about
components are made at compile time and cannot be
changed later.


As you correctly mention, the price for this
flexibility is the extra time required for managing
dynamic arrays. Here is where the trade-off between
the itk::FixedArray<> and the itk::Array<> comes into
play.

As far as I can see, your analysis seems to indicate
that we may be doing unnecesary constructions of
itkArray<>'s (which produces the hit in performance).

Could you please post the localized changes that
you propose ?

We can probably identify the places where we could
avoid itk::Array<> constructions, by for example passing
Arrays as const references or by sharing a Parameters
array at the level of the registration method.
We could also identify itkArrays that are constructed
as local auxiliary variables and can be promoted to
ivars in order to avoid to construct them every time.

These changes will probably help us to get around
the performance hit.

We look forward to implement these changes !!



     Thanks

        Luis


==============================================

Hannu Helminen wrote:
  >
  > It is great to hear that speed optimizations are among your priorities
  > as well.
  >
  > I also did some testing with using FixedArray in the internal
  > calculations of MutualInformationImageToImageMetric class and got very
  > comparable results (about 7-fold speed increase). Since these changes
  > are very well localized, I can post the diffs here in case someone is
  > interested in attempting to reproduce the results.
  >
  > However this idea requires templatizing
  > MutualInformationImageToImageMetric over the size of the transform
  > parameter space. So it is certainly not in the direction of reducing
  > templates!
  >
  > I agree that too many templates, especially when combined with deep
  > inheritance hierarchies, may lead to convoluted code. However, I do not
  > fully agree with the idea that use of templates would automatically lead
  > to larger library size (in terms of object code). It is a question of
  > how many times the templates are instantiated.
  >
  > Hannu
  >
  >