[Insight-developers] memcpy VS iterators copy!!
Bradley Lowekamp
blowekamp at mail.nih.gov
Mon Mar 28 11:00:10 EDT 2011
Hello,
This is another performance improvement that I think should be a MUST for v4! We need to replace the for-loop image iterator copies with an abstraction that can use memcpy when possible!
I have been wanting to run the performance comparison for a while and this was the opportunity to do so! I replaced the for loop in question here with a memcpy (it still has bugs in it, but it's doing the needed work extremely fast!)
# memcpy loop
Executed 10 times with mean 16.8704s
I just replaced the for loop with a memcpy:
{
  // Number of pixels in one slice and number of components stored per pixel.
  const IdentifierType numberOfPixelsInSlice = sliceRegionToRequest.GetNumberOfPixels();
  const size_t numberOfComponents = output->GetNumberOfComponentsPerPixel();
  // Offset, in components, of slice i from the start of the output buffer.
  const IdentifierType numberOfPixelsUpToSlice = numberOfPixelsInSlice * i * numberOfComponents;
  typename TOutputImage::InternalPixelType * outputSliceBuffer = outputBuffer + numberOfPixelsUpToSlice;
  const typename TOutputImage::InternalPixelType * inputBuffer = reader->GetOutput()->GetBufferPointer();
  // Copy the whole slice in a single call instead of iterating pixel by pixel.
  memcpy( outputSliceBuffer, inputBuffer, sizeof( typename TOutputImage::InternalPixelType ) * numberOfPixelsInSlice * numberOfComponents );
}
Still, for this case, no copy at all is better than memcpy.
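To make the "abstraction that can use memcpy when possible" idea concrete, here is a minimal sketch in plain C++ with no ITK dependency. It is only an illustration under stated assumptions, not what ITK does or will do: the names CopySliceIntoVolume and CopyElements are hypothetical, and it assumes both buffers are contiguous and that each slice is stored back to back in the volume. It dispatches on std::is_trivially_copyable so that memcpy is used only for element types where a raw byte copy is valid, and falls back to element-wise assignment otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helpers for illustration only; these are not ITK APIs.

// Fast path: a raw byte copy is valid for trivially copyable element types.
template <typename T>
void CopyElements(const T * source, T * destination, std::size_t count, std::true_type)
{
  std::memcpy(destination, source, sizeof(T) * count);
}

// Fallback: element-wise assignment for every other element type.
template <typename T>
void CopyElements(const T * source, T * destination, std::size_t count, std::false_type)
{
  for (std::size_t k = 0; k < count; ++k)
  {
    destination[k] = source[k];
  }
}

// Copies one slice into position sliceIndex of a contiguous volume buffer.
// Assumes the volume holds numberOfSlices slices stored back to back.
template <typename T>
void CopySliceIntoVolume(const T *   sliceBuffer,
                         T *         volumeBuffer,
                         std::size_t pixelsPerSlice,
                         std::size_t componentsPerPixel,
                         std::size_t sliceIndex,
                         std::size_t numberOfSlices)
{
  assert(sliceIndex < numberOfSlices);
  const std::size_t elementsPerSlice = pixelsPerSlice * componentsPerPixel;
  T * destination = volumeBuffer + elementsPerSlice * sliceIndex;
  // Select the memcpy path at compile time when the element type allows it.
  CopyElements(sliceBuffer, destination, elementsPerSlice, std::is_trivially_copyable<T>());
}
```

The tag dispatch keeps the memcpy call out of the instantiation for element types that are not trivially copyable, so the same call site works for both kinds of pixel type.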
On Mar 28, 2011, at 10:32 AM, Lowekamp, Bradley (NIH/NLM/LHC) [C] wrote:
> Hello Roger,
>
> Your benchmark program had a few more dependencies than just ITK, so I wrote my own and attached it. I used a series of TIFF files I have, so I hope it is comparable. I have also arrived at a similar conclusion: the copy loop is expensive and should be avoided. However, my benchmark indicates that progress reporting accounts for 50% of the additional execution time, which is rather different from your experiment.
>
>
> Testing series reader with 349 files.
> Image Size: [2048, 1536, 349]
>
> # current ITK
> Executed 10 times with mean 24.4403s
>
> # progress commented out
> Executed 10 times with mean 20.7206s
>
> # copy loop commented out
> Executed 10 times with mean 16.5306s
>
> # gerrit patch version
> Executed 10 times with mean 16.9262s
>
> <itkImageSeriesReaderPerformance.cxx><ATT00001..htm>
========================================================
Bradley Lowekamp
Lockheed Martin Contractor for
Office of High Performance Computing and Communications
National Library of Medicine
blowekamp at mail.nih.gov