[Insight-users] Release 2.0.0 slower than 1.6.0 and 1.8.1 -Cygwin1.5.13, GCC 3.3.3-3

Frederic Perez fredericpcx at gmail.com
Mon Mar 7 14:38:57 EST 2005


Hello everyone,

thank you very much for your responses. The posts by James V. Miller
and Karthik
Krishnan convinced me that something strange is happening for the Cygwin 1.5.13,
GCC 3.3.3-3 combination. (By the way, my own output of "gcc -v" matches 
Karthik's.)

I've got the suspicion that the itkGetConstReferenceMacro is not as fast as 
expected---the sequence of actions driving me to that conclusion is given below.
The macro is defined in Common/itkMacro.h as follows:

#define itkGetConstReferenceMacro(name,type) \
  virtual const type & Get##name () const \
  { \
    itkDebugMacro("returning " << #name " of " << this->m_##name ); \
    return this->m_##name; \
  }

I would say that since it defines a virtual method is up to the
compiler to inline
it or not, and in my particular case it is not inlined. Since this
method is called
a lot of times, the executions slow down significatively.

Perhaps you know a method to force the compiler to inline the macro---if 
I'm right about that.

Thanks again for your consideration,

Frederic Perez

--

Reaching the macro (and proposed workaround):

First I built profiled versions of ITK 1.8.1 and 2.0.0, to later link
the testing
program against them both. Unfortunately I got linking errors. Since I didn't 
feel like fixing that, I tried profiling the 
itk::ImageMomentsCalculator<ImageType>::Update method directly. Thus,
I edited the Code/Algorithms/itkImageMomentsCalculator.txx adding some
itk::TimeProbe objects to narrow down the problem. (I haven't seen 
anything in that code related to random number seeding. Maybe I 
misunderstood Stephen R. Aylward post.) The test program was linked to 
**debug** relases for versions 1.8.1 and 2.0.0 of the Insight toolkit. Their 
corresponding executions are summarized as follows:

ITK 1.8.1:
Time region iterator loop: 39.969 s
 Time region iterator inner loop 1:    6.903 s
 Time region iterator transform calls: 5.942 s
 Time region iterator inner loop 2:    7.355 s
GetTotalMass = 4.33882e+07 [ Time: 39.969 seconds ]

ITK 2.0.0:
Time region iterator loop: 75.25 s
 Time region iterator inner loop 1:    7.381 s
 Time region iterator transform calls: 40.031 s
 Time region iterator inner loop 2:    7.716 s
GetTotalMass = 4.33882e+07 [ Time: 75.25 seconds ]

This shows that in debug mode there is still a decrease in efficiency
(recall: Cygwin 1.5.13, GCC 3.3.3-3), and that it is due to the line 
exhibited below:

template<class TImage>
void ImageMomentsCalculator<TImage>::Compute()
{
 <<snipped>>
 m_Image->TransformIndexToPhysicalPoint(indexPosition, physicalPosition);  
 <<snipped>>
}

The core of the Image's TransformIndexToPhysicalPoint inspector is slightly
different in the two versions:

ITK 1.8.1 (Version:   $Revision: 1.119 $):
 point[i] = static_cast<TCoordRep>( m_Spacing[i] *
   static_cast<double>( index[i] ) + m_Origin[i] );

ITK 2.0.0 (Version:   $Revision: 1.122 $):
 point[i] = static_cast<TCoordRep>( this->GetSpacing()[i] *
   static_cast<double>( index[i] ) + this->GetOrigin()[i] );

This particular change occurred between revisions 1.121 and 1.122, due 
to GCC 3.4 considerations. Thus, we have edited the file itkImageBase.h to 
change the spacing and origin attributes to be protected instead of private, 
and reverted the point[i] computation to the 1.119 version. With those 
changes, which I admit are ugly, after rebuilding we obtain:

ITK 2.0.0 MODIFIED:
Time region iterator loop: 41.25 s
 Time region iterator inner loop 1:    7.275 s
 Time region iterator transform calls: 6.28 s
 Time region iterator inner loop 2:    7.581 s
GetTotalMass = 4.33882e+07 [ Time: 41.25 seconds ]

These values match the results for the 1.8.1 build.


More information about the Insight-users mailing list