[Insight-developers] Threading image metrics

Miller, James V (Research) millerjv at crd.ge.com
Mon Mar 14 09:57:46 EST 2005


Karthik, 
 
This problem still befuddles me.  The MatchCardinality metric did run slower on some systems when threaded.  On other systems, multithreading had its expected benefit: near-linear speedup in the number of processors.
 
Is your m_NumberOfPixelsCounted a local variable? If not, you could have multiple threads 
attempting to write to the same memory.
 
Another option to try is to change the function declaration for ThreadedGetValue to be
 
::ThreadedGetValue( const FixedImageRegionType &regionForThread,
                    int threadId, unsigned long &count, double &metric ) 

where count and metric would take the place of m_ThreadCounts[threadId] and m_ThreadMatches[threadId], respectively.  This just moves the problem a bit.  You still need to roll up
the counts and metric from across the threads.
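
A minimal sketch of the signature change and roll-up described above, not the ITK implementation. It assumes C++11 std::thread in place of ITK's own threader, plain std::vector<int> buffers in place of images, and the hypothetical names ThreadedCountMatches and MatchFraction. Each worker accumulates into locals and writes its two out-parameters once; the caller sums them after all threads have joined.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for ThreadedGetValue: compares fixed[i] and moving[i]
// over [begin, end). Accumulation happens in locals; the out-parameters are
// written exactly once, at the end.
void ThreadedCountMatches(const std::vector<int>& fixed,
                          const std::vector<int>& moving,
                          std::size_t begin, std::size_t end,
                          unsigned long& count, double& metric)
{
  unsigned long localCount = 0;
  double localMetric = 0.0;
  for (std::size_t i = begin; i < end; ++i)
  {
    ++localCount;
    localMetric += (fixed[i] == moving[i]) ? 1.0 : 0.0;
  }
  count = localCount;
  metric = localMetric;
}

// Splits the range across numThreads workers, then rolls up the per-thread
// counts and metrics into one fraction of matching samples.
double MatchFraction(const std::vector<int>& fixed,
                     const std::vector<int>& moving,
                     unsigned int numThreads)
{
  assert(numThreads > 0 && fixed.size() == moving.size());
  std::vector<unsigned long> counts(numThreads, 0);
  std::vector<double> metrics(numThreads, 0.0);
  std::vector<std::thread> workers;
  const std::size_t n = fixed.size();
  for (unsigned int t = 0; t < numThreads; ++t)
  {
    const std::size_t begin = n * t / numThreads;
    const std::size_t end = n * (t + 1) / numThreads;
    workers.emplace_back(ThreadedCountMatches,
                         std::cref(fixed), std::cref(moving), begin, end,
                         std::ref(counts[t]), std::ref(metrics[t]));
  }
  for (std::thread& w : workers)
  {
    w.join();
  }

  // Roll-up: runs on the calling thread after every worker has finished.
  unsigned long totalCount = 0;
  double totalMetric = 0.0;
  for (unsigned int t = 0; t < numThreads; ++t)
  {
    totalCount += counts[t];
    totalMetric += metrics[t];
  }
  return totalCount ? totalMetric / totalCount : 0.0;
}
```

Calling MatchFraction(a, b, 2) on two equal-length vectors returns the fraction of positions where they agree. The result slots still live side by side in counts and metrics, but each one is touched only once per thread, which is the sense in which the change "moves the problem a bit".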
 
Jim


 
 

-----Original Message-----
From: insight-developers-bounces at itk.org [mailto:insight-developers-bounces at itk.org] On Behalf Of Karthik Krishnan
Sent: Monday, March 14, 2005 2:29 AM
To: Insight-developers (E-mail)
Subject: [Insight-developers] Threading image metrics


Hi,

I have been trying to thread the MeanSquares metric in an attempt to get registration to run faster on my dual-processor machine. After following an architecture similar to the one in itkMatchCardinalityImageToImageMetric.txx, I was able to thread it and, happily, both processors were used. However, this did not speed up execution. In fact, threading slowed it down by 2%. 

Surprised, I decided to go back to the MatchCardinality metric. A google search turned up the following post:
http://www.itk.org/pipermail/insight-users/2004-May/008553.html
That metric also ran 2% slower when threaded, both on linuxgcc335 (-O2 -g build) and on Debug builds in VS7.

It turns out that using the m_ThreadMatches[threadId] and m_ThreadCounts[threadId] variables inside the iterator loop of the ThreadedGetValue function introduces something akin to a race condition. Here is the hypothesis: m_ThreadMatches and m_ThreadCounts are ivars whose per-thread entries sit at contiguous indices in memory, so the threads keep writing to neighboring locations. Moving those writes outside the loop and accumulating into local variables within the loop seems to help. 
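
The effect hypothesized here, threads repeatedly writing to adjacent array entries, is commonly called false sharing. Besides accumulating in locals, one widely used alternative is to pad each per-thread entry so that it occupies its own cache line. The sketch below illustrates that alternative only; it is not taken from ITK. It assumes C++17 (so that std::vector honors the over-alignment), a 64-byte cache line, std::thread in place of ITK's threader, and the hypothetical names PaddedSlot and CountMatchesPadded.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Assumption: 64-byte cache lines (common on x86, but not guaranteed).
constexpr std::size_t kAssumedCacheLine = 64;

// Hypothetical replacement for one entry of m_ThreadMatches / m_ThreadCounts.
// alignas rounds both the alignment and the size up to a full cache line.
struct alignas(kAssumedCacheLine) PaddedSlot
{
  double matches = 0.0;
  unsigned long count = 0;
};

static_assert(sizeof(PaddedSlot) % kAssumedCacheLine == 0,
              "each slot should fill whole cache lines");

// Every thread updates only its own padded slot inside the loop, so no two
// threads write to the same cache line. Returns the total number of matches.
unsigned long CountMatchesPadded(const std::vector<int>& fixed,
                                 const std::vector<int>& moving,
                                 unsigned int numThreads)
{
  assert(numThreads > 0 && fixed.size() == moving.size());
  std::vector<PaddedSlot> slots(numThreads);
  std::vector<std::thread> workers;
  const std::size_t n = fixed.size();
  for (unsigned int t = 0; t < numThreads; ++t)
  {
    workers.emplace_back([&fixed, &moving, &slots, n, numThreads, t]() {
      PaddedSlot& slot = slots[t];
      const std::size_t begin = n * t / numThreads;
      const std::size_t end = n * (t + 1) / numThreads;
      for (std::size_t i = begin; i < end; ++i)
      {
        ++slot.count;
        if (fixed[i] == moving[i])
        {
          slot.matches += 1.0;
        }
      }
    });
  }
  for (std::thread& w : workers)
  {
    w.join();
  }

  // Sum the per-thread slots on the calling thread.
  unsigned long total = 0;
  for (const PaddedSlot& s : slots)
  {
    total += static_cast<unsigned long>(s.matches);
  }
  return total;
}
```

Padding trades memory for isolation: each slot grows to a full cache line, which is negligible for a handful of threads. Whether it would help in this particular metric is untested here; local accumulators avoid the repeated shared writes altogether and need no assumption about the line size.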

After the changes, the threaded version runs faster on VS7 and linuxgcc335. Still, it's just a 17% reduction in time on linuxgcc335. (There don't seem to be any additional mutexes.) I was wondering if there were any follow-ups on the mail thread above.

Thanks
Regards
Karthik

::ThreadedGetValue( const FixedImageRegionType &regionForThread,
                    int threadId ) 
{
  itk::TimeProbe MultiThreadClk1;
  MultiThreadClk1.Start();

  //m_ThreadMatches[threadId] = NumericTraits< MeasureType >::Zero;
  //m_ThreadCounts[threadId] = 0;
  MeasureType measure = NumericTraits< MeasureType >::Zero;
  m_NumberOfPixelsCounted = 0;

  while(!ti.IsAtEnd())
    {
    index = ti.GetIndex();
    
    typename Superclass::InputPointType inputPoint;
    fixedImage->TransformIndexToPhysicalPoint( index, inputPoint );

    if( this->GetFixedImageMask() && !this->GetFixedImageMask()->IsInside( inputPoint ) )
      {
      ++ti;
      continue;
      }

    typename Superclass::OutputPointType
      transformedPoint = this->GetTransform()->TransformPoint( inputPoint );

    if( this->GetMovingImageMask() && !this->GetMovingImageMask()->IsInside( transformedPoint ) )
      {
      ++ti;
      continue;
      }

    if( this->GetInterpolator()->IsInsideBuffer( transformedPoint ) )
      {
      const RealType movingValue= this->GetInterpolator()->Evaluate( transformedPoint );
      const RealType fixedValue = ti.Get();
      RealType diff;
      
      //m_ThreadCounts[threadId]++;
      m_NumberOfPixelsCounted++;

      if (m_MeasureMatches)
        {
        diff = (movingValue == fixedValue); // count matches
        }
      else
        {
        diff = (movingValue != fixedValue); // count mismatches
        }
      //m_ThreadMatches[threadId] += diff; 
      measure += diff;
      }

    ++ti;
    }

  m_ThreadMatches[threadId] = measure;
  m_ThreadCounts[threadId] = m_NumberOfPixelsCounted;

  
  MultiThreadClk1.Stop();
  std::cout  << " [ Time taken by Function ThreadedGetValue() thread(" << threadId << ")" << MultiThreadClk1.GetMeanTime() << " seconds ]" << std::endl;
}






