[Insight-users] Speed optimizations in mutual information registration
Hannu Helminen
Hannu.Helminen.0001@dosetek.varian.com
Wed, 22 May 2002 13:08:58 +0300
Hi!
I did some profiling of the mutual information multi-resolution matching
routines. It seems that a considerable amount of the total optimization
time is spent in the dynamic allocation of vectors. Since the vector type
itk::Array has a variable size, it has to do all of its memory allocations
from the heap, and heap allocations and deallocations tend to be quite
expensive in terms of speed.
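To illustrate the difference, here is a minimal sketch (the function name
and the six-parameter count are made-up examples, e.g. a 3-D rigid
transform):

#include "itkArray.h"
#include "itkFixedArray.h"

void ComputeDerivativeForOneSample()
{
  // One heap allocation here, and a matching deallocation when the
  // object goes out of scope -- this happens on every call.
  itk::Array< double >          dynamicDerivative( 6 );

  // No heap traffic at all: the six doubles live on the stack.
  itk::FixedArray< double, 6 >  fixedDerivative;

  for ( unsigned int i = 0; i < 6; ++i )
    {
    dynamicDerivative[ i ] = 0.0;
    fixedDerivative[ i ]   = 0.0;
    }
}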
I then did a quick and dirty hack that replaces the use of itk::Array with
itk::FixedArray. This resulted in quite a large speedup: the total
registration time (which includes the read & write routines etc.) was cut
in half. I also measured the time spent in the core optimization routine
m_Optimizer->StartOptimization() by accumulating the run times of each
round:
Time before the change: 270 seconds
Time after the change: 38 seconds
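(The accumulation can be done along these lines -- a sketch only, not the
exact code I used; the optimizer type below is just an example, any ITK
optimizer exposing StartOptimization() works the same way.)

#include <ctime>
#include "itkGradientDescentOptimizer.h"

typedef itk::GradientDescentOptimizer OptimizerType;

static double totalOptimizationSeconds = 0.0;

void TimeOneResolutionLevel( OptimizerType * optimizer )
{
  const std::clock_t start = std::clock();
  optimizer->StartOptimization();  // the core optimization loop of one level
  totalOptimizationSeconds +=
    double( std::clock() - start ) / CLOCKS_PER_SEC;
}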
My test setup: 2 x Xeon 1.7 GHz, Windows 2000, Visual Studio 6.0,
RelWithDebInfo configuration, CT image 512x512x76, MR image 256x256x82,
latest CVS version of ITK, 5 levels of multi-resolution with 2500
iterations per level.
Now my question is: is there any interest among the ITK developers or the
ITK community in these speed-ups? If there is, I could spend part of my
work time on this project if needed.
Here are some of my thoughts about speed-up:
1) Changing the type of the parameter and derivative of
MutualInformationImageToImageMetric to itk::FixedArray. This is what I did
in my quick hack. Unfortunately this change is not small: currently all
the optimization routines in the Numerics subdirectory are hard-coded to
use itk::Array as the parameter and derivative type.
However, I have a feeling that templatizing the optimization routines over
the vector type would be in the spirit of ITK: it would make them more
generic, much as STL's algorithms are. A rough sketch of what such an
interface might look like is shown after this list.
2) A smaller change would be to use a fixed-size array inside
itk::MutualInformationImageToImageMetric for all of its internal
calculations. I have not benchmarked how this would affect the run times.
(A small illustration is shown after this list.)
3) Yet another possibility would be to use a specialized fixed-size
allocator in itk::Array, such as the small-object allocator in the Loki
library [1]. Again, I do not have any data on the speed-ups attainable
with this method. (A toy sketch of such a pool is shown after this list.)
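To make 1) more concrete, here is a rough, hypothetical sketch of an
optimizer templated over its parameter type. None of these class or member
names exist in ITK; they only show the shape such an interface could take:

template < typename TCostFunction, typename TParameters >
class TemplatedGradientDescent
{
public:
  void StartOptimization( TCostFunction & costFunction,
                          TParameters &   position,
                          double          learningRate,
                          unsigned int    numberOfIterations )
  {
    // When TParameters is an itk::FixedArray, this temporary lives on
    // the stack and the loops below do no heap allocations at all.
    TParameters gradient;

    for ( unsigned int it = 0; it < numberOfIterations; ++it )
      {
      costFunction.GetDerivative( position, gradient );
      for ( unsigned int i = 0; i < position.Size(); ++i )
        {
        position[ i ] -= learningRate * gradient[ i ];
        }
      }
  }
};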
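For 2), the idea would be to copy the incoming itk::Array parameters once
into a stack-based FixedArray and do all of the per-sample work with that,
so the public interface stays as it is. A purely illustrative sketch (the
parameter count and the function are made up):

#include "itkArray.h"
#include "itkFixedArray.h"

const unsigned int NumberOfParameters = 6;   // example: 3-D rigid transform

void EvaluateWithFixedArrays( const itk::Array< double > & parameters )
{
  itk::FixedArray< double, NumberOfParameters > localParameters;

  // One copy per evaluation instead of heap allocations in the inner loops.
  for ( unsigned int i = 0; i < NumberOfParameters; ++i )
    {
    localParameters[ i ] = parameters[ i ];
    }

  // ... the per-sample loops would then use localParameters and other
  // FixedArray temporaries only ...
}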
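And for 3), something in the spirit of Loki's small-object allocator could
look like the toy pool below (an illustration only, not Loki's actual
interface): blocks of one fixed size are recycled through a free list, so
repeated allocate/deallocate cycles stop hitting the general-purpose heap.

#include <cstddef>
#include <vector>

class FixedSizePool
{
public:
  explicit FixedSizePool( std::size_t blockSize )
    : m_BlockSize( blockSize < sizeof( void * ) ? sizeof( void * )
                                                : blockSize ),
      m_FreeList( 0 ) {}

  ~FixedSizePool()
  {
    for ( std::size_t i = 0; i < m_Blocks.size(); ++i )
      {
      delete [] m_Blocks[ i ];
      }
  }

  void * Allocate()
  {
    if ( m_FreeList )
      {
      void * block = m_FreeList;                    // reuse a freed block
      m_FreeList = *static_cast< void ** >( m_FreeList );
      return block;
      }
    char * block = new char[ m_BlockSize ];         // grow the pool
    m_Blocks.push_back( block );
    return block;
  }

  void Deallocate( void * block )
  {
    *static_cast< void ** >( block ) = m_FreeList;  // push onto free list
    m_FreeList = block;
  }

private:
  std::size_t            m_BlockSize;
  void *                 m_FreeList;
  std::vector< char * >  m_Blocks;
};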
Any thoughts?
Hannu Helminen
Varian Medical Systems Finland Oy
[1] <http://sourceforge.net/projects/loki-lib/>