[Insight-users] Performance regression ImageSeriesReader? (with test)

Wed Mar 24 08:24:31 EDT 2010

Hi,

I changed the line in method GenerateOutputInformation and now I show you
the results. I used the same data of the previous tests.

We can observe the UpdateOutputInformation of 3.16 patched is much faster
than v3.16 released. However, if we compare 2.8 and 3.16patched, in some
cases 2.8 is quite faster. This performance difference can be appreciated in
data which its TransferSintax is Little Endian Implicit (test 2, 4, 7) and
it exists in both update methods (UpdateOutputInformation and
UpdateLargestPossibleRegion) even though the most critical is the
UpdateLargestPossibleRegion because it needs seconds instead of
milliseconds.

Nowadays, we aren't using streaming. Therefore, we collaborate with you in
order to solve the problem, but we are really interested to fix the usual
use case because we need the performance improvement quite soon.

The results of the experiment:

1. Head (http://mri.radiology.uiowa.edu/VHDicom/VHFCT1mm/VHF-Head.tar.gz)
TransferSyntax: Little Endian Explicit
117MB
234 Files

          Read dir  UpdateOutputInformation  UpdateLargestPossibleRegion
 ITK 2.8       385                        4                         1435
ITK 3.16       400                        4                         1485

__________________________________

2. CT example study we work with
(http://dl.dropbox.com/u/3613789/ct_anonymized.zip)

TransferSyntax: Little Endian Implicit
124MB
243 Files

          Read dir  UpdateOutputInformation  UpdateLargestPossibleRegion
 ITK 2.8      1013                        9                         2186
ITK 3.16      1003                       14                         2999

___________________________________

3. CALIX/CT1 abdomen/D30MN BILISCOPIN from OSIRIX Data
(http://pubimage.hcuge.ch:8080/DATA/CALIX.zip)

TransferSyntax: JPEG2000
130MB
243 files

          Read dir  UpdateOutputInformation  UpdateLargestPossibleRegion
 ITK 2.8       642                        6                        20156
ITK 3.16       653                        7                        20714

_____________________________________

4. Mr Perfusion data
http://dl.dropbox.com/u/3613789/perfusion_implicit_anonymized.zip
(The test crashs when the v2.8 is used. I did the test without anonymize it)

TransferSyntax: Little Endian Implicit
41,9MB
1080 Files

          Read dir  UpdateOutputInformation  UpdateLargestPossibleRegion
 ITK 2.8      4032                        9                         4809
ITK 3.16      4051                       25                        13376

_____________________________________________

5. Mr Perfusion data (4) recodified
http://dl.dropbox.com/u/3613789/perfusion_explicit_anonymized.zip

TransferSyntax: Little Endian Explicit
41,9MB
1080 Files

          Read dir  UpdateOutputInformation  UpdateLargestPossibleRegion
 ITK 2.8     15480                       31                        17148
ITK 3.16     15522                       31                        16821

_____________________________________________

6. Mr Perfusion data (4) anonymized by Osirix. All private tags were
removed.
http://dl.dropbox.com/u/3613789/perfusion_anonymized_osirix.zip

TransferSyntax: Little Endian Explicit
36,9MB
1080 Files

          Read dir  UpdateOutputInformation  UpdateLargestPossibleRegion
 ITK 2.8      3898                        9                         4669
ITK 3.16      3869                        9                         5330

_______________________________________________

7. CT Abdomen

TransferSintax: Little Endian Implicit
271MB
531 Files

          Read dir  UpdateOutputInformation  UpdateLargestPossibleRegion
 ITK 2.8      1798                        8                         4343
ITK 3.16      1791                       13                         6233

________________________________________________

Roger

On Tue, Mar 23, 2010 at 11:05 PM, Bill Lorensen <bill.lorensen at gmail.com>wrote:

> Roger,
>
> Thanks for all of your help. We're getting close to a solution.
>
> Bill
>
> On Tue, Mar 23, 2010 at 2:57 PM, Roger Bramon Feixas
> <rogerbramon at gmail.com> wrote:
> > Bill,
> > Now I'm at home. You will have the results of the experiment tomorrow
> > morning.
> > Thanks,
> > Roger
> >
> > On Tue, Mar 23, 2010 at 8:44 PM, Bill Lorensen <bill.lorensen at gmail.com>
> > wrote:
> >>
> >> Sounds good. So you are saying that in the
> >> ImageSeriesReader<TOutputImage>
> >> ::GenerateOutputInformation
> >>
> >> that the number of files that are processed will be only 2.
> >>
> >> Bill
> >>
> >> On Tue, Mar 23, 2010 at 12:39 PM, Bradley Lowekamp
> >> <blowekamp at mail.nih.gov> wrote:
> >> > Bill:
> >> > My proposed solution is to use the old behavior but use a time stamp
> to
> >> > avoid the extra updates of the MDDA when streaming.
> >> > The requested region is set after UpdateOutputInformation is executed,
> >> > so it
> >> > can't be toggled during the this phase of execution.
> >> >
> >> > On Mar 23, 2010, at 3:28 PM, Bill Lorensen wrote:
> >> >
> >> > Can we check if streaming is on and revert to the old behavior if it
> is
> >> > off?
> >> >
> >> > This is a huge performance penalty to support streaming which is
> >> > important but no the usual use case.
> >> >
> >> >
> >> > On Tue, Mar 23, 2010 at 12:12 PM, Bradley Lowekamp
> >> > <blowekamp at mail.nih.gov> wrote:
> >> >
> >> > Bill,
> >> >
> >> > Going back would have horrible effects for streaming. It would make
> >> > slice by
> >> >
> >> > slice streaming an n^2 algorithm, which is far worse then the current
> >> > order
> >> >
> >> > of N hindrance for normals Updates. We must make some improvements
> from
> >> > 2.8.
> >> >
> >> > If we declare the the MetaDataDictionary is suppose to be updated in
> the
> >> >
> >> > update data phase. ( the ImageFileReader does it in the
> >> >
> >> > UpdateOutputInformation phase ) Then the prior stated point 1 design
> >> >
> >> > requirement is gone. And the following solution come to mind:
> >> >
> >> > 1) Modify the GetMMDA methods to produce a warning if the update
> output
> >> > data
> >> >
> >> > has not been called. This is to be nice if some users now expect
> >> >
> >> > UpdateOutputInformation to produce the MDDA.
> >> >
> >> > 2) Add a time stamp for the MMDA, so that when streaming the MMDA is
> >> > only
> >> >
> >> > updated once and not every time a region is requested.
> >> >
> >> > Additionally I believe that we need better DICOM test data which
> include
> >> >
> >> > more tags similar to real world data.
> >> >
> >> > Brad
> >> >
> >> > On Mar 23, 2010, at 2:54 PM, Bill Lorensen wrote:
> >> >
> >> > The UpdateInformation is supposed to update origin, spacing,
> >> >
> >> > direction, pixel type, etc. I don't think it is supposed to completely
> >> >
> >> > populate the meta data dictionary. At least until itk 2.8 it did not.
> >> >
> >> > Why not revert back to the old behavior as a sort term fix.
> >> >
> >> > I think this performance hit needs to be repaired before we release
> >> >
> >> > 3.16. This has been causing major pain for Slicer3 users who
> >> >
> >> > frequently use dicom. Fortunately for us, Roger brought it to light.
> >> >
> >> > We missed it because our performance testing is weak.
> >> >
> >> > There are other issues for sure.
> >> >
> >> > Bill
> >> >
> >> > On Tue, Mar 23, 2010 at 11:45 AM, Bradley Lowekamp
> >> >
> >> > <blowekamp at mail.nih.gov> wrote:
> >> >
> >> > Bill,
> >> >
> >> > After my tests I agree that reading the headers in DICOM files is a
> >> >
> >> > surprisingly expensive operation as such it should be minimized. The
> >> > coping
> >> >
> >> > of the MDAs is insignificant performance wise.  I believe that the
> best
> >> >
> >> > solution would be to have a dedicated DICOM series readers, which also
> >> >
> >> > removes the extra header reads needed for the name generation as well
> as
> >> > the
> >> >
> >> > extra one in the UpdateOutputInformation.
> >> >
> >> > If we assume that the usually way to utilize the reader is to just
> >> > Update,
> >> >
> >> > or stream Update, then the additional read of the headers appears
> >> >
> >> > unnecessary.
> >> >
> >> > I believe a solution would be to make the GetMDDA method smarter, and
> by
> >> >
> >> > default update this MDDA in the UpdateData. A time stamp would need to
> >> > be
> >> >
> >> > used for the MDDA to check when it needs to be updated in the
> UpdateData
> >> >
> >> > methods. For streaming, the first time through would require reading
> all
> >> > of
> >> >
> >> > the headers for the MDDA, this should bring the time stamp up to date.
> >> > The
> >> >
> >> > GetMDDA methods could also check this timestamp and perform the
> reading
> >> > of
> >> >
> >> > the headers if it's out of date. This is my best current idea on how
> to
> >> >
> >> > maintain the 1) and 2) I previously mentioned.
> >> >
> >> > Brad
> >> >
> >> > On Mar 23, 2010, at 12:33 PM, Bill Lorensen wrote:
> >> >
> >> > Brad,
> >> >
> >> > I have an itk 2.8 checkout. The difference is due to the processing of
> >> >
> >> > all files in the GenerateOutputInformation method. In the past, only
> >> >
> >> > two files were processed. If I restrict the number of files to 2
> >> >
> >> > rather that number of files, I get pretty reasonable speeds.
> >> >
> >> > Roger,
> >> >
> >> > As an experiment (and definitely not a fix!), can you in the method
> >> >
> >> > void ImageSeriesReader<TOutputImage>
> >> >
> >> > ::GenerateOutputInformation(void)
> >> >
> >> > change the line:
> >> >
> >> > for ( int i = 0; i != numberOfFiles; ++i )
> >> >
> >> > to
> >> >
> >> > for ( int i = 0; i != 2; ++i )
> >> >
> >> > and rerun your tests.
> >> >
> >> > Bill
> >> >
> >> >
> >> > On Tue, Mar 23, 2010 at 8:59 AM, Bradley Lowekamp
> >> >
> >> > <blowekamp at mail.nih.gov> wrote:
> >> >
> >> > Bill,
> >> >
> >> > That is only the half of it. Every time an ImageFileReader is used 3
> >> > MDDs
> >> >
> >> > (meta data dictionaries) are created, one in the ImageIO, one in the
> >> >
> >> > ImageFileReader, and one in the output Image. This is in addition to
> the
> >> > two
> >> >
> >> > copies, you pointed out in ImageSeriesReader. Clearly reading with an
> >> >
> >> > ImageFileReader the MDD scales very poorly as the it's size increases.
> I
> >> >
> >> > still have the remaining performance questions:
> >> >
> >> > How much time is spent coping the MDD vs reading? (leaning towards
> >> > reading
> >> >
> >> > as very expensive)
> >> >
> >> > As pointed out in Roger's most recent performance tests, there appears
> >> > to be
> >> >
> >> > some additional performance problems in the UpdateData, part. This is
> >> >
> >> > independent of the additional MDD read in the UpdateOutputInformation.
> >> > This
> >> >
> >> > is definitely another problem, perhaps inside the DICOM library.
> >> >
> >> > The change of moving (apparently duplicating) the copying to MDDs to
> the
> >> > MDD
> >> >
> >> > array was added over a year ago, when streaming support was added. If
> I
> >> >
> >> > recall correctly the two motivating factors were 1) the MDD array is
> >> > output
> >> >
> >> > information and logically should be updating during the
> >> >
> >> > UpdateOutputInformation part of the pipeline 2) when streaming each
> file
> >> >
> >> > should not need to be read to create the MMD array. I don't recall
> where
> >> >
> >> > this discussion took place right now.
> >> >
> >> > I will run some performance test to try to figure out where the time
> is
> >> >
> >> > being spent. Without changing 1 from above, I am not sure how much
> could
> >> > be
> >> >
> >> > gained.
> >> >
> >> > Looking at the performance numbers of the Read Directory part, I would
> >> > guess
> >> >
> >> > that the meta data is also read there. I believe that an idea solution
> >> > would
> >> >
> >> > only read this information once. But that is beyond this scope.
> >> >
> >> > Brad
> >> >
> >> > On Mar 22, 2010, at 11:20 PM, Bill Lorensen wrote:
> >> >
> >> > Brad,
> >> >
> >> > It looks like the meta data array is populated in both the
> >> >
> >> > GenerateOutputInformation and GenerateData. Also all slices are
> >> >
> >> > processed in GenerateOutputInformation. In 2.8, only 2 slices were
> >> >
> >> > processed.
> >> >
> >> > Why were these changes made? We are also seeing bad dicom performance
> >> >
> >> > in Slicer3.
> >> >
> >> > Bill
> >> >
> >> > On Mon, Mar 22, 2010 at 6:24 AM, Bradley Lowekamp
> >> >
> >> > <blowekamp at mail.nih.gov> wrote:
> >> >
> >> > Hello,
> >> >
> >> > Can you please tell us a little more about your test data and
> computer.
> >> > What
> >> >
> >> > kind of file system is the data on ( locale or network)? How much
> memory
> >> >
> >> > does the computer have? What is the size of the data? What is the
> native
> >> >
> >> > pixel type of the data? What are the actual timings? Does the
> execution
> >> > seem
> >> >
> >> > to be CPU or IO bound?
> >> >
> >> > One of the changes that was made to the class was to populate the
> >> >
> >> > MetaDataArray in the UpdataOutputInformation phase of the instead of
> the
> >> >
> >> > UpdateOutputData part. This should be just reading the headers of the
> >> > files
> >> >
> >> > in the series. There were several reasons this change was made. To
> help
> >> >
> >> > determine the cause of your slowness, lets break up the timing a
> little
> >> >
> >> > further.
> >> >
> >> > Could you please call:
> >> >
> >> > start timer
> >> >
> >> > reader->UpdateOutputInformation();
> >> >
> >> > lap timer
> >> >
> >> > reader->UpdateLargestPossibleRegion();
> >> >
> >> > stop timer
> >> >
> >> > And post the timing results.
> >> >
> >> > Thanks,
> >> >
> >> > Brad
> >> >
> >> > On Mar 21, 2010, at 2:52 PM, Roger Bramon Feixas wrote:
> >> >
> >> > This week we updated our ITK version from 2.8 to 3.16  and we noticed
> >> > the
> >> >
> >> > medical models are loading 2x slower using the 3.16 ITK version. We
> use
> >> >
> >> > itk::ImageSeriesReader and the problem is focused in its Update()
> >> > method.
> >> >
> >> > I attached a simple test program which reproduces the problem and
> where
> >> > we
> >> >
> >> > can see that the Update() method is 2 times slower using ITK 3.16 vs.
> >> > ITK
> >> >
> >> > 2.8.
> >> >
> >> > We compiled both versions using Visual Studio 2008 on Windows XP
> 32bits
> >> > and
> >> >
> >> >  we don't known if this problem also occurs in other platforms.
> >> >
> >> > I wonder if other itk users have this same performance problem and if
> >> > there
> >> >
> >> > is anybody can help us in order to solve it.
> >> >
> >> > Thanks!
> >> >
> >> > Roger
> >> >
> >> >
> >> > ========================================================
> >> >
> >> > Bradley Lowekamp
> >> >
> >> > Lockheed Martin Contractor for
> >> >
> >> > Office of High Performance Computing and Communications
> >> >
> >> > National Library of Medicine
> >> >
> >> > blowekamp at mail.nih.gov
> >> >
> >> >
> >> >
> >> > ========================================================
> >> >
> >> > Bradley Lowekamp
> >> >
> >> > Lockheed Martin Contractor for
> >> >
> >> > Office of High Performance Computing and Communications
> >> >
> >> > National Library of Medicine
> >> >
> >> > blowekamp at mail.nih.gov
> >> >
> >> >
> >
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.itk.org/pipermail/insight-users/attachments/20100324/923a28d3/attachment-0001.htm>