[Insight-developers] OtsuThresholdCalculator versus OtsuMultipleThresholdsCalculator

Johnson, Hans J hans-johnson at uiowa.edu
Tue Jul 9 11:35:30 EDT 2013


Awesome work Dirk.  Thank you for looking into this.

Hans


From: Bill Lorensen <bill.lorensen at gmail.com<mailto:bill.lorensen at gmail.com>>
Date: Tuesday, July 9, 2013 10:30 AM
To: "Padfield, Dirk R (GE Global Research)" <padfield at research.ge.com<mailto:padfield at research.ge.com>>
Cc: "Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org>" <Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org>>, Hans Johnson <hans-johnson at uiowa.edu<mailto:hans-johnson at uiowa.edu>>, ITK <insight-developers at itk.org<mailto:insight-developers at itk.org>>
Subject: Re: [Insight-developers] OtsuThresholdCalculator versus OtsuMultipleThresholdsCalculator

Nice job Dirk.

If you add GetMeasurement, leave GetBinMax for backward compatibility.

Bill


On Tue, Jul 9, 2013 at 11:22 AM, Padfield, Dirk R (GE Global Research) <padfield at research.ge.com<mailto:padfield at research.ge.com>> wrote:
Hi All,

Sorry for my delayed reply; I wanted to go through the code so that I could give an accurate answer about the differences.  I have now had the chance to carefully compare the Otsu and OtsuMultiple code, and I found a number of discrepancies that I have fixed.  Here is a summary:

First of all, the equations used for the two filters are different because the Otsu uses a simplified mathematical representation that only holds when there is one threshold.  I derived the equations to check that they lead to the same mathematical representation, and they do.

For all of the tests, I used the 8-bit image ExternalData/Testing/Data/Input/cthead1.png.  For each code line below, the original line is shown with a "-" first, and the new line is shown with a "+" first.  All of the comments are about the Otsu because the MultipleOtsu is implemented more accurately.

In Otsu, the global mean was being computed incorrectly.  We can measure this because we can compare the value computed from the histogram to the true mean of the image pixels, which is 77.76.  The original and corrected code are as follows:
-    totalMean += ( j + 1 ) * relativeFrequency[j];
+    totalMean += histogram->GetMeasurementVector(j)[0] * relativeFrequency[j];

Here is a comparison of the mean calculated for different bin values.  Clearly, it should be close to 77.76 for all of them.  Before it was a monotonically increasing value, but now it is more consistent (it is never exactly 77.76 because of how the histogram bins store the image pixels).
Bins: 4,8,16,32,64,128,256,512,1024
Old Mean: 1.95, 3.07, 5.42, 10.36, 20.12, 39.66, 78.76, 156.92, 312.99
New Mean: 92.95, 81.90, 78.38, 78.56, 78.18, 78.03, 77.96, 77.91, 77.82

The initial left mean was being computed incorrectly.  By mathematical definition, when the leftMean is computed from only the first bin of the histogram, it is equal to the bin value itself.
-  double meanLeft = 1.0;
+  double meanLeft = histogram->GetMeasurementVector(0)[0];

The left mean was computed incorrectly in the loop.  It only made sense when the number of bins was the same as the number of gray levels of the image itself.
-                   + ( j + 1 ) * relativeFrequency[j] ) / freqLeft;
+                   + ( histogram->GetMeasurementVector(j)[0] ) * relativeFrequency[j] ) / freqLeft;

The offset of 1 was incorrect in the GetMeasurement call.
-  this->GetOutput()->Set( static_cast<OutputType>( histogram->GetMeasurement( maxBinNumber + 1, 0 ) ) );
+  this->GetOutput()->Set( static_cast<OutputType>( histogram->GetMeasurement( maxBinNumber, 0 ) ) );

Finally, there is a discrepancy in that Otsu uses GetMeasurement (as shown above), but OtsuMultiple uses GetBinMax to get the final threshold.  The difference is that GetMeasurement is the average of GetBinMax and GetBinMin.  When the number of bins is large, this makes almost no difference because the GetBinMin and GetBinMax are almost the same.  But for small numbers of bins, the GetMeasurement is more stable.  Here are some example results:

Bins: 4,8,16,32,64,128,256,512,1024
GetBinMin:            63.91, 63.83, 63.79, 79.71, 79.70, 81.69, 83.68, 83.67, 83.92
GetBinMax:         127.82, 95.74, 79.74, 87.68, 83.68, 83.68, 84.67, 84.17, 84.17
GetMeasurement: 95.86, 79.79, 71.76, 83.70, 81.69, 82.68, 84.17, 83.92, 84.05

Finally, the thresholds should be rounded at the end of the calculator so that they are not truncated by the corresponding ImageFilters.

I recommend making all of the changes to the Otsu listed above and changing the GetBinMax to GetMeasurement in the OtsuMultiple.  This will make the two completely consistent.  The results are as follows for different number of bins:

Bins: 4,8,16,32,64,128,256,512,1024
Old Otsu threshold: 159.77, 111.70, 87.71, 91.67, 85.68, 84.67, 85.17, 84.42, 84.30
New threshold:          95.86, 79.79, 71.76, 83.70, 81.69, 82.68, 84.17, 83.92, 84.05

The ImageJ Otsu plugin (http://rsbweb.nih.gov/ij/plugins/otsu-thresholding.html) outputs a threshold of 79.
The ImageJ MultipleOtsu plugin (http://rsbweb.nih.gov/ij/plugins/multi-otsu-threshold.html) with 1 threshold outputs a threshold of 87.
Using Matlab, I get 84.
So they are not consistent, but they are all close.

Thanks,
Dirk

________________________________
From: Richard Beare [richard.beare at gmail.com<mailto:richard.beare at gmail.com>]
Sent: Wednesday, July 03, 2013 6:01 PM
To: Padfield, Dirk R (GE Global Research)
Cc: Johnson, Hans J; Matt McCormick; Bradley Lowekamp; <insight-developers at itk.org<mailto:insight-developers at itk.org>> Developers; Gaëtan Lehmann
Subject: Re: [Insight-developers] OtsuThresholdCalculator versus OtsuMultipleThresholdsCalculator

Is the difference in threshold when there are a small number of bins explainable via the one bin difference? If it isn't then that is cause for concern. Otherwise it is just another side effect of the original difference. Small numbers of bins can easily break the assumptions in many of the histogram based methods, as can data with features like large numbers of zeros or maximum values.


On Thu, Jul 4, 2013 at 6:17 AM, Padfield, Dirk R (GE Global Research) <padfield at research.ge.com<mailto:padfield at research.ge.com><mailto:padfield at research.ge.com<mailto:padfield at research.ge.com>>> wrote:
Exactly.  It seems crazy to use more than 256 bins in that case.  But the goal of that test was simply timing, and the best way to increase the computation time is to increase the number of bins.

The real takeaway message is: either implementation is equally fast for reasonably-sized inputs, so there is no timing-based concern for replacing one for the other.

Dirk


________________________________________
From: Johnson, Hans J [hans-johnson at uiowa.edu<mailto:hans-johnson at uiowa.edu><mailto:hans-johnson at uiowa.edu<mailto:hans-johnson at uiowa.edu>>]
Sent: Wednesday, July 03, 2013 3:04 PM
To: Padfield, Dirk R (GE Global Research); Matt McCormick
Cc: Bradley Lowekamp; <insight-developers at itk.org<mailto:insight-developers at itk.org><mailto:insight-developers at itk.org<mailto:insight-developers at itk.org>>> Developers; Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org><mailto:Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org>>; Gaëtan Lehmann
Subject: Re: [Insight-developers] OtsuThresholdCalculator versus OtsuMultipleThresholdsCalculator

Dirk,

With 10,000,000 bins, and an unsigned char data set this seems suspicious.
 Are there real needs for such a large number of bins?

I'd hate to spend a lot of time doing optimizations for 10,000,000 bins if
there is no real-world use case for it.

Hans


-----Original Message-----
From: <Padfield>, "Dirk R   (GE Global Research)"
<padfield at research.ge.com<mailto:padfield at research.ge.com><mailto:padfield at research.ge.com<mailto:padfield at research.ge.com>>>
Date: Wednesday, July 3, 2013 1:27 PM
To: Matt McCormick <matt.mccormick at kitware.com<mailto:matt.mccormick at kitware.com><mailto:matt.mccormick at kitware.com<mailto:matt.mccormick at kitware.com>>>, Hans Johnson
<hans-johnson at uiowa.edu<mailto:hans-johnson at uiowa.edu><mailto:hans-johnson at uiowa.edu<mailto:hans-johnson at uiowa.edu>>>
Cc: Bradley Lowekamp <blowekamp at mail.nih.gov<mailto:blowekamp at mail.nih.gov><mailto:blowekamp at mail.nih.gov<mailto:blowekamp at mail.nih.gov>>>, ITK
<insight-developers at itk.org<mailto:insight-developers at itk.org><mailto:insight-developers at itk.org<mailto:insight-developers at itk.org>>>, "Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org><mailto:Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org>>"
<Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org><mailto:Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org>>>, Gaëtan Lehmann <gaetan.lehmann at gmail.com<mailto:gaetan.lehmann at gmail.com><mailto:gaetan.lehmann at gmail.com<mailto:gaetan.lehmann at gmail.com>>>
Subject: RE: [Insight-developers] OtsuThresholdCalculator versus
OtsuMultipleThresholdsCalculator

Hi Folks,

I agree as well that we should merge the two implementations and provide
an update on the migration guide.

I looked in a little more detail at the two implementations.  If I run
both algorithms for a large range of numberOfHistogramBins, they produce
very different results when the number is small, but very similar results
when the number of bins is large.  This is disturbing in itself, so I am
looking into the implementations to see why this happens.

In terms of the timing, they are both very fast for normal bin sizes.  But
when I push the numberOfHistogramBins up to 1,000,000 and 10,000,000, I
get some differences:

numberOfHistogramBins = 1,000,000:
Otsu = 5 sec
OtsuMultiple = 0.5 sec

numberOfHistogramBins = 10,000,000:
Otsu = 45.80 sec
OtsuMultiple = 4.58 sec

Thus, when the numberOfHistogramBins is very large, the difference is
significant: the OtsuMultiple is about 10 times faster.  This is
encouraging since the OtsuMultiple is the more general implementation.

All of these tests are on the cthead1.png image that was used for the
ctest.

My plan is to carefully compare the implementations to see how to make
them report the same results.  And then I will merge them so that the Otsu
is just a shell that inherits from OtsuMultiple.

Thanks,
Dirk


________________________________________
From: Matt McCormick [matt.mccormick at kitware.com<mailto:matt.mccormick at kitware.com><mailto:matt.mccormick at kitware.com<mailto:matt.mccormick at kitware.com>>]
Sent: Tuesday, July 02, 2013 3:44 PM
To: Johnson, Hans J
Cc: Bradley Lowekamp; Padfield, Dirk R (GE Global Research);
<insight-developers at itk.org<mailto:insight-developers at itk.org><mailto:insight-developers at itk.org<mailto:insight-developers at itk.org>>> Developers; Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org><mailto:Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org>>; Gaëtan
Lehmann
Subject: Re: [Insight-developers] OtsuThresholdCalculator versus
OtsuMultipleThresholdsCalculator

I concur with Brad and Hans.

Adding Gaëtan in CC.

Thanks,
Matt

On Tue, Jul 2, 2013 at 5:08 PM, Johnson, Hans J <hans-johnson at uiowa.edu<mailto:hans-johnson at uiowa.edu><mailto:hans-johnson at uiowa.edu<mailto:hans-johnson at uiowa.edu>>>
wrote:
> I agree with brad.  The two should produce the same results, even tough
>it
> may introduce a different numerical result in the
> OtsuMultipleThresholdCalculator.
>
> -----Original Message-----
> From: Bradley Lowekamp <blowekamp at mail.nih.gov<mailto:blowekamp at mail.nih.gov><mailto:blowekamp at mail.nih.gov<mailto:blowekamp at mail.nih.gov>>>
> Date: Tuesday, July 2, 2013 12:01 PM
> To: "Padfield, Dirk R (GE Global Research)" <padfield at research.ge.com<mailto:padfield at research.ge.com><mailto:padfield at research.ge.com<mailto:padfield at research.ge.com>>>
> Cc: ITK <insight-developers at itk.org<mailto:insight-developers at itk.org><mailto:insight-developers at itk.org<mailto:insight-developers at itk.org>>>, "Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org><mailto:Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org>>"
> <Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org><mailto:Richard.Beare at ieee.org<mailto:Richard.Beare at ieee.org>>>
> Subject: Re: [Insight-developers] OtsuThresholdCalculator
> versus  OtsuMultipleThresholdsCalculator
>
> Dirk,
>
> I would vote to have the two produce the same results, and create a
>little
> migration guide which notes the change.
>
> Regarding, the refactoring, is there any difference in the algorithm
> complexity of the two? Any measurements of the performance difference
> between the two?
>
> Brad
>
> On Jul 2, 2013, at 11:04 AM, "Padfield, Dirk R (GE Global Research)"
> <padfield at research.ge.com<mailto:padfield at research.ge.com><mailto:padfield at research.ge.com<mailto:padfield at research.ge.com>>> wrote:
>
>> Hi Richard,
>>
>> Thank you for your response and insight.  If anything would need to be
>>changed, I would also lean towards changing the OtsuMultiple because I
>>think we shouldn't change Otsu since I am sure many more people are using
>>the Otsu because it is a standard thresholding algorithm.  I will do some
>>comparisons of the two against other implementations to see what I find.
>>
>> But the question still remains: is it okay to change the OtsuMultiple to
>>give the same output as Otsu for one threshold?  I am thinking in terms
>>of those people who use OtsuMultiple whose results will then be slightly
>>different.  Here are the advantages and disadvantages for keeping things
>>the same:
>>
>> Advantages: everyone's code still works as it did before
>> Disadvantages: the two implementations are inconsistent with each other
>>even though they are the same algorithm.  The overlapping code cannot be
>>merged (inheritance).  And future enhancements will need to be made in
>>both places.
>>
>> My vote is: change OtsuMultiple to be consistent with Otsu.
>>
>> What do others think?
>>
>> Dirk
>>
>>
>> ________________________________
>> From: Richard Beare [richard.beare at gmail.com<mailto:richard.beare at gmail.com><mailto:richard.beare at gmail.com<mailto:richard.beare at gmail.com>>]
>> Sent: Monday, July 01, 2013 5:40 PM
>> To: Bradley Lowekamp
>> Cc: Padfield, Dirk R (GE Global Research); <insight-developers at itk.org<mailto:insight-developers at itk.org><mailto:insight-developers at itk.org<mailto:insight-developers at itk.org>>>
>>Developers
>> Subject: Re: [Insight-developers] OtsuThresholdCalculator versus
>>OtsuMultipleThresholdsCalculator
>>
>> Hi,
>> I think the order was - I introduced new filters copied from ImageJ,
>>then Gaetan started refactoring to use the histogram framework. We both
>>did some work to make that correspond to old versions. I don't remember
>>working on the MultipleThreshold  version, but the code does look
>>similar, so perhaps it was done somewhere along the way - will need to
>>check the logs.
>>
>> I'm pretty sure that Otsu was producing the same results that it used to
>>- I didn't compare to other implementations. Thus, if the original Otsu
>>was correct then the current one should be too, which would suggest that
>>the MultipleThresholds version should probably change.
>>
>> Not sure when I'll get a chance to look at this in detail.
>>
>> I don't have a current email for Gaetan to CC for confirmation.
>>
>>
>> On Mon, Jul 1, 2013 at 11:02 PM, Bradley Lowekamp
>><blowekamp at mail.nih.gov<mailto:blowekamp at mail.nih.gov><mailto:blowekamp at mail.nih.gov<mailto:blowekamp at mail.nih.gov>><mailto:blowekamp at mail.nih.gov<mailto:blowekamp at mail.nih.gov><mailto:blowekamp at mail.nih.gov<mailto:blowekamp at mail.nih.gov>>>> wrote:
>> Dirk,
>>
>> I believe Richard Beare did the refactoring of the thresholding
>>framework an Insight Journal Article. He will likely know why it is this
>>way better than anyone else.
>>
>> You also didn't say which implementation is correct.
>>
>> Brad
>>
>>
>> On Jun 30, 2013, at 10:03 PM, "Padfield, Dirk R (GE Global Research)"
>><padfield at research.ge.com<mailto:padfield at research.ge.com><mailto:padfield at research.ge.com<mailto:padfield at research.ge.com>><mailto:padfield at research.ge.com<mailto:padfield at research.ge.com><mailto:padfield at research.ge.com<mailto:padfield at research.ge.com>>>> wrote:
>>
>>> Hi ITK Developers,
>>>
>>> I was just looking through the OtsuThresholdCalculator and
>>>OtsuMultipleThresholdsCalculator to see whether I could refactor them so
>>>that the Otsu inherits from the OtsuMultiple since the latter is a more
>>>general case of the former.  Currently, the code for these two filters
>>>is totally different resulting in significant code duplication and a
>>>need to keep both filters in sync.
>>>
>>> As a first step, I wrote a CMake test to check that the output of the
>>>OtsuMultiple with 1 threshold is the same as the output of the Otsu.
>>>Unfortunately, they are not!  The two filters output thresholds that are
>>>different by 1 histogram bin!  This can be a quite extreme difference
>>>when the numberOfHistogramBins is low, and it leads to different
>>>thresholds even when the numberOfHistogramBins is reasonably high (say
>>>256).  I tracked it down to this code in the Calculators:
>>>
>>> The relevant code from Otsu:
>>>  const double tolerance = 0.00001;
>>>  if ( (varBetween - tolerance) > maxVarBetween )
>>>    {
>>>    maxVarBetween = varBetween;
>>>    maxBinNumber = j;
>>>    }
>>>  }
>>> this->GetOutput()->Set( static_cast<OutputType>(
>>>histogram->GetMeasurement( maxBinNumber + 1, 0 ) ) );
>>>
>>> The relevant code from MultipleOtsu:
>>>  if ( varBetween > maxVarBetween )
>>>    {
>>>    maxVarBetween = varBetween;
>>>    maxVarThresholdIndexes = thresholdIndexes;
>>>    }
>>>  }
>>> for ( j = 0; j < m_NumberOfThresholds; j++ )
>>>  {
>>>  m_Output[j] = histogram->GetBinMax(0, maxVarThresholdIndexes[j]);
>>>  }
>>>
>>> The difference is that the Otsu adds one to the computed threshold
>>>whereas the MultipleOtsu does not.  This is problematic because users
>>>would expect them to give the same result.
>>>
>>> My question is: how should we proceed?  If we change one or the other,
>>>people's code that use the changed one will give slightly different
>>>answers.  If we don't change them, the two filters will give different
>>>outputs for the same input, and it will not be possible to refactor them
>>>to share code.
>>>
>>> What are your thoughts?
>>>
>>> Thanks,
>>> Dirk
>>> _______________________________________________
>>> Powered by www.kitware.com<http://www.kitware.com><http://www.kitware.com><http://www.kitware.com>
>>>
>>> Visit other Kitware open-source projects at
>>> http://www.kitware.com/opensource/opensource.html
>>>
>>> Kitware offers ITK Training Courses, for more information visit:
>>> http://kitware.com/products/protraining.php
>>>
>>> Please keep messages on-topic and check the ITK FAQ at:
>>> http://www.itk.org/Wiki/ITK_FAQ
>>>
>>> Follow this link to subscribe/unsubscribe:
>>> http://www.itk.org/mailman/listinfo/insight-developers
>>
>>
>
> _______________________________________________
> Powered by www.kitware.com<http://www.kitware.com><http://www.kitware.com>
>
> Visit other Kitware open-source projects at
> http://www.kitware.com/opensource/opensource.html
>
> Kitware offers ITK Training Courses, for more information visit:
> http://kitware.com/products/protraining.php
>
> Please keep messages on-topic and check the ITK FAQ at:
> http://www.itk.org/Wiki/ITK_FAQ
>
> Follow this link to subscribe/unsubscribe:
> http://www.itk.org/mailman/listinfo/insight-developers
>
>
>
> ________________________________
> Notice: This UI Health Care e-mail (including attachments) is covered by
>the Electronic Communications Privacy Act, 18 U.S.C. 2510-2521, is
>confidential and may be legally privileged.  If you are not the intended
>recipient, you are hereby notified that any retention, dissemination,
>distribution, or copying of this communication is strictly prohibited.
>Please reply to the sender that you have received the message in error,
>then delete it.  Thank you.
> ________________________________
> _______________________________________________
> Powered by www.kitware.com<http://www.kitware.com><http://www.kitware.com>
>
> Visit other Kitware open-source projects at
> http://www.kitware.com/opensource/opensource.html
>
> Kitware offers ITK Training Courses, for more information visit:
> http://kitware.com/products/protraining.php
>
> Please keep messages on-topic and check the ITK FAQ at:
> http://www.itk.org/Wiki/ITK_FAQ
>
> Follow this link to subscribe/unsubscribe:
> http://www.itk.org/mailman/listinfo/insight-developers



________________________________
Notice: This UI Health Care e-mail (including attachments) is covered by the Electronic Communications Privacy Act, 18 U.S.C. 2510-2521, is confidential and may be legally privileged.  If you are not the intended recipient, you are hereby notified that any retention, dissemination, distribution, or copying of this communication is strictly prohibited.  Please reply to the sender that you have received the message in error, then delete it.  Thank you.
________________________________

_______________________________________________
Powered by www.kitware.com<http://www.kitware.com>

Visit other Kitware open-source projects at
http://www.kitware.com/opensource/opensource.html

Kitware offers ITK Training Courses, for more information visit:
http://kitware.com/products/protraining.php

Please keep messages on-topic and check the ITK FAQ at:
http://www.itk.org/Wiki/ITK_FAQ

Follow this link to subscribe/unsubscribe:
http://www.itk.org/mailman/listinfo/insight-developers



--
Unpaid intern in BillsBasement at noware dot com


________________________________
Notice: This UI Health Care e-mail (including attachments) is covered by the Electronic Communications Privacy Act, 18 U.S.C. 2510-2521, is confidential and may be legally privileged.  If you are not the intended recipient, you are hereby notified that any retention, dissemination, distribution, or copying of this communication is strictly prohibited.  Please reply to the sender that you have received the message in error, then delete it.  Thank you.
________________________________
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.itk.org/pipermail/insight-developers/attachments/20130709/9f76b67c/attachment-0001.htm>


More information about the Insight-developers mailing list