[Insight-developers] Threading image metrics

Karthik Krishnan Karthik.Krishnan at kitware.com
Mon Mar 14 10:49:42 EST 2005


Dear Jim,

Thank you for your quick reply.
I cannot offer a reasonable explanation for the behaviour, but if you 
try the version checked into CVS now and compare it to rev 1.5, it does 
offer a 15-20% speedup on linuxgcc335. My recollection is that it did on 
windows too (vs7).

The revision right now does pretty much the same as what you've 
described, since the ivars m_ThreadMatches[threadId] and 
m_ThreadCounts[threadId] are accessed only once per call to 
ThreadedGetValue, which is called GetNumberOfThreads() times per metric 
evaluation.
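
A minimal sketch of that pattern, for illustration only: each worker accumulates into locals, writes its per-thread slot once, and the caller rolls up the slots afterwards. The names ThreadedCountMatches and MatchFraction are hypothetical, and std::thread is an assumed stand-in for the ITK threader; this is not the code in CVS.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-thread part of a metric: counts the
// positions in [begin, end) where the two buffers hold equal values.
void ThreadedCountMatches(const std::vector<int>& fixedValues,
                          const std::vector<int>& movingValues,
                          std::size_t begin, std::size_t end,
                          unsigned long& threadMatches,
                          unsigned long& threadCount)
{
  unsigned long matches = 0;  // locals: the loop writes no shared memory
  unsigned long count = 0;
  for (std::size_t i = begin; i < end; ++i)
  {
    ++count;
    matches += (fixedValues[i] == movingValues[i]) ? 1u : 0u;
  }
  threadMatches = matches;  // single write to the per-thread slot
  threadCount = count;
}

// Splits the index range across numThreads workers, then rolls up the
// per-thread slots into one value (fraction of matching positions).
double MatchFraction(const std::vector<int>& fixedValues,
                     const std::vector<int>& movingValues,
                     unsigned int numThreads)
{
  assert(numThreads > 0 && fixedValues.size() == movingValues.size());
  std::vector<unsigned long> threadMatches(numThreads, 0);
  std::vector<unsigned long> threadCounts(numThreads, 0);
  std::vector<std::thread> workers;
  const std::size_t n = fixedValues.size();
  for (unsigned int t = 0; t < numThreads; ++t)
  {
    const std::size_t begin = n * t / numThreads;
    const std::size_t end = n * (t + 1) / numThreads;
    workers.emplace_back(ThreadedCountMatches, std::cref(fixedValues),
                         std::cref(movingValues), begin, end,
                         std::ref(threadMatches[t]),
                         std::ref(threadCounts[t]));
  }
  for (std::thread& w : workers)
  {
    w.join();
  }
  const unsigned long matches =
    std::accumulate(threadMatches.begin(), threadMatches.end(), 0ul);
  const unsigned long count =
    std::accumulate(threadCounts.begin(), threadCounts.end(), 0ul);
  return count == 0 ? 0.0
                    : static_cast<double>(matches) / static_cast<double>(count);
}
```

Calling MatchFraction(a, b, 2) on two equal-length buffers gives the same result as a serial loop; only the roll-up after the join touches every per-thread slot.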

It's still puzzling because, although the ivars m_ThreadCounts are shared 
data, they aren't mutable, so I don't know why it would slow the thread 
down. But that was the only conclusion I could derive after setting 
several time probes.
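
One possible explanation, offered as an assumption rather than something the time probes confirmed, is false sharing: adjacent elements of a per-thread array can sit on the same cache line, so writes from different threads inside the loop contend for that line even though each thread only touches its own element. The sketch below pads each slot to an assumed 64-byte cache line; PaddedCounter, kAssumedCacheLine and CountWithPaddedSlots are hypothetical names, and no timing result is claimed for it.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Assumed cache-line size; 64 bytes is common but not universal.
constexpr std::size_t kAssumedCacheLine = 64;

// Hypothetical per-thread slot, padded so that neighbouring slots in an
// array are at least one assumed cache line apart.
struct alignas(kAssumedCacheLine) PaddedCounter
{
  unsigned long value = 0;
};
static_assert(sizeof(PaddedCounter) >= kAssumedCacheLine,
              "each slot should span a full assumed cache line");

// Each worker increments only its own slot inside the loop, in the style
// of m_ThreadCounts[threadId]++; the padding keeps the slots apart.
unsigned long CountWithPaddedSlots(unsigned int numThreads,
                                   unsigned long itersPerThread)
{
  std::vector<PaddedCounter> slots(numThreads);
  std::vector<std::thread> workers;
  for (unsigned int t = 0; t < numThreads; ++t)
  {
    workers.emplace_back([&slots, t, itersPerThread]() {
      for (unsigned long i = 0; i < itersPerThread; ++i)
      {
        ++slots[t].value;
      }
    });
  }
  for (std::thread& w : workers)
  {
    w.join();
  }
  unsigned long total = 0;  // roll up after all workers have finished
  for (const PaddedCounter& s : slots)
  {
    total += s.value;
  }
  return total;
}
```

Accumulating in a local variable and writing the slot once per call, as the current revision does, avoids the repeated shared-line writes without needing any padding.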

Thanks
regards
karthik



Miller, James V (Research) wrote:

> Karthik,
>  
> This problem still befuddles me.  The MatchCardinality metric did run 
> slower on some systems when threaded.  On other systems, multithreading 
> had its expected benefit, near linear speedup in the number of processors.
>  
> Is your m_NumberOfPixelsCounted a local variable? If not, you could 
> have multiple threads attempting to write to the same memory.
>  
> Another option to try is to change the function declaration for 
> ThreadedGetValue to be
>  
> ::ThreadedGetValue( const FixedImageRegionType &regionForThread,
>                     int threadId, unsigned long &count, double &metric )
> where count and metric would take the place of 
> m_ThreadCounts[threadId] and m_ThreadMatches[threadId].  This just 
> moves the problem a bit.  You still need to roll up
> the counts and metric from across the threads.
>  
> Jim
>
>  
>  
>
>     -----Original Message-----
>     *From:* insight-developers-bounces at itk.org
>     [mailto:insight-developers-bounces at itk.org]*On Behalf Of *Karthik
>     Krishnan
>     *Sent:* Monday, March 14, 2005 2:29 AM
>     *To:* Insight-developers (E-mail)
>     *Subject:* [Insight-developers] Threading image metrics
>
>     Hi,
>
>     I have been trying to thread the MeanSquares metric in an attempt
>     to get registration to run faster on my dual processor machine.
>     After following an architecture similar to the one in
>     itkMatchCardinalityImageToImageMetric.txx, I was able to thread it
>     and happily both processors were used. However this did not speed
>     up execution times. In fact threading slowed it down by 2%.
>
>     Surprised, I decided to go back to the MatchCardinality metric,
>     which also ran 2% slower when threaded, on both linuxgcc335
>     (-O2 -g build) and Debug builds in VS7. A Google search turned up
>     the following post:
>     http://www.itk.org/pipermail/insight-users/2004-May/008553.html
>
>     It turns out that using the m_ThreadMatches[threadId] and
>     m_ThreadCounts[threadId] variables within the iterator loop in the
>     ThreadedGetValue function introduces something akin to a race
>     condition. Here is the hypothesis: m_ThreadMatches/Counts are
>     ivars occupying contiguous locations in memory. Moving the writes
>     to them outside the loop and using local vars within the loop
>     seems to help.
>
>     After the changes, the threaded version runs faster on VS7 and
>     linuxgcc335. Still, it's just a 17% reduction in time on
>     linuxgcc335. (There don't seem to be any additional mutexes.) I was
>     wondering if there were any follow-ups on the mail thread above.
>
>     Thanks
>     Regards
>     Karthik
>
>     ::ThreadedGetValue( const FixedImageRegionType &regionForThread,
>                         int threadId )
>     {
>       itk::TimeProbe MultiThreadClk1;
>       MultiThreadClk1.Start();
>
>       //m_ThreadMatches[threadId] = NumericTraits< MeasureType >::Zero;
>       //m_ThreadCounts[threadId] = 0;
>       MeasureType measure = NumericTraits< MeasureType >::Zero;
>       m_NumberOfPixelsCounted = 0;
>
>       while(!ti.IsAtEnd())
>         {
>         index = ti.GetIndex();
>        
>         typename Superclass::InputPointType inputPoint;
>         fixedImage->TransformIndexToPhysicalPoint( index, inputPoint );
>
>         if( this->GetFixedImageMask() &&
>             !this->GetFixedImageMask()->IsInside( inputPoint ) )
>           {
>           ++ti;
>           continue;
>           }
>
>         typename Superclass::OutputPointType transformedPoint =
>           this->GetTransform()->TransformPoint( inputPoint );
>
>         if( this->GetMovingImageMask() &&
>             !this->GetMovingImageMask()->IsInside( transformedPoint ) )
>           {
>           ++ti;
>           continue;
>           }
>
>         if( this->GetInterpolator()->IsInsideBuffer( transformedPoint ) )
>           {
>           const RealType movingValue =
>             this->GetInterpolator()->Evaluate( transformedPoint );
>           const RealType fixedValue = ti.Get();
>           RealType diff;
>          
>           //m_ThreadCounts[threadId]++;
>           m_NumberOfPixelsCounted++;
>
>           if (m_MeasureMatches)
>             {
>             diff = (movingValue == fixedValue); // count matches
>             }
>           else
>             {
>             diff = (movingValue != fixedValue); // count mismatches
>             }
>           //m_ThreadMatches[threadId] += diff;
>           measure += diff;
>           }
>
>         ++ti;
>         }
>
>       m_ThreadMatches[threadId] = measure;
>       m_ThreadCounts[threadId] = m_NumberOfPixelsCounted;
>
>      
>       MultiThreadClk1.Stop();
>       std::cout << " [ Time taken by Function ThreadedGetValue() thread("
>                 << threadId << ")" << MultiThreadClk1.GetMeanTime()
>                 << " seconds ]" << std::endl;
>     }
>
>
>
>
>


