[ITK] [ITK-dev] [ITK Community] [Insight-developers] non-deterministic v4 registrations in 4.5.x

Wed Mar 19 12:45:43 EDT 2014

it's brian - and, yes, we all have "copious free time" of course.

brian

On Wed, Mar 19, 2014 at 12:43 PM, Simon Alexander <skalexander at gmail.com> wrote:

> Thanks for the summary Brain.
>
> A lot of partitioning issues fundamentally come down to the lack of
> associativity & distributivity of fp operations.  Not sure I can do
> anything practical to improve it, but I will have a look if I can find a
> bit of my "copious free time".
>
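A minimal sketch of the effect described above, not taken from the ITK sources. The same four values are summed by a varying number of hypothetical workers, each handling one contiguous slice, and the partial sums are then added in worker order. `SumWithWorkers` and `ExampleData` are names made up for this illustration, and the workers are simulated with a plain loop rather than real threads, since only the grouping of the additions matters.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: split `data` into `numWorkers` contiguous slices,
// sum each slice separately, then add the partial sums in worker order.
// Workers are simulated sequentially; no real threads are started.
double SumWithWorkers(const std::vector<double>& data, std::size_t numWorkers)
{
  if (numWorkers == 0)
  {
    numWorkers = 1;
  }
  const std::size_t n = data.size();
  double total = 0.0;
  for (std::size_t w = 0; w < numWorkers; ++w)
  {
    const std::size_t begin = w * n / numWorkers;
    const std::size_t end = (w + 1) * n / numWorkers;
    double partial = 0.0;
    for (std::size_t i = begin; i < end; ++i)
    {
      partial += data[i];
    }
    total += partial;
  }
  return total;
}

// Example input whose exact sum is 2.0.
std::vector<double> ExampleData()
{
  return {1e16, 1.0, -1e16, 1.0};
}
```

Assuming IEEE 754 double arithmetic with round-to-nearest and no value-changing compiler options (for example, no fast-math), `SumWithWorkers(ExampleData(), 1)` gives 1.0 and `SumWithWorkers(ExampleData(), 2)` gives 0.0. Neither matches the exact sum, and the two differ only because the additions were grouped differently.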
>
> On Wed, Mar 19, 2014 at 12:29 PM, brian avants <stnava at gmail.com> wrote:
>
>> yes - i understand.
>>
>> * matt mccormick implemented compensated summation to address this - it
>> helps but is not a full fix
>>
>> * truncating floating point precision greatly reduces the effect you are
>> talking about but is unsatisfactory to most people ... not sure if the
>> functionality for that truncation was taken out of the v4 metrics but it
>> was in there at one point.
>>
>> * there may be a small and undiscovered bug that contributes to this in
>> mattes specifically but i don't think that's the issue.  we saw this effect
>> even in mean squares.  if there is a bug it may be beyond just mattes.   we
>> cannot disprove that there is a bug.  if anyone knows of a way to do that,
>> let me know.
>>
>> * any help is appreciated
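For readers unfamiliar with compensated summation, here is a minimal sketch of the classic Kahan algorithm. It is a generic textbook version and is not claimed to match the ITK implementation mentioned above; `NaiveSum`, `KahanSum` and `SmallTermsExample` are names invented for this illustration.

```cpp
#include <vector>

// Plain left-to-right summation, for comparison.
double NaiveSum(const std::vector<double>& values)
{
  double sum = 0.0;
  for (double v : values)
  {
    sum += v;
  }
  return sum;
}

// Kahan compensated summation: carries a running estimate of the
// low-order bits lost by each addition and feeds it back into the next one.
// Note: value-changing optimizations (e.g. fast-math) may remove the
// compensation term, so this assumes strict floating point semantics.
double KahanSum(const std::vector<double>& values)
{
  double sum = 0.0;
  double compensation = 0.0;
  for (double v : values)
  {
    const double y = v - compensation;  // apply the stored correction
    const double t = sum + y;           // low-order bits of y may be lost here
    compensation = (t - sum) - y;       // recover what was lost (negated)
    sum = t;
  }
  return sum;
}

// 1.0 followed by ten terms of 1e-16, each too small to change 1.0 on its own.
std::vector<double> SmallTermsExample()
{
  std::vector<double> values(11, 1e-16);
  values[0] = 1.0;
  return values;
}
```

On the example data the naive sum stays at exactly 1.0, while the compensated sum lands close to 1.0 + 1e-15. Compensation reduces the rounding error but does not make the result independent of the order of additions, which is consistent with "it helps but is not a full fix".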
>>
>>
>> brian
>>
>>
>>
>>
>> On Wed, Mar 19, 2014 at 12:24 PM, Simon Alexander <skalexander at gmail.com> wrote:
>>
>>> Brain,
>>>
>>> I could have sworn I had initially added a follow up email clarifying
>>> this but since I can't find it in the current quoted exchange, let me
>>> reiterate:
>>>
>>> This is not a case of different results on different systems.  This
>>> is a case of different results on the same system if you use a different
>>> number of threads.
>>>
>>> So while that possibly could be some odd intrinsics issue, for example,
>>> the far more likely thing is that data partitioning is not being handled in
>>> a way that ensures consistency.
>>>
>>> Originally I was also seeing inter-system differences due to internal
>>> precision, but that was a separate issue and has been solved.
>>>
>>> Hope that is more clear!
>>>
>>>
>>>
>>> On Wed, Mar 19, 2014 at 12:13 PM, Simon Alexander <skalexander at gmail.com> wrote:
>>>
>>>> Brian,
>>>>
>>>> Do you mean the generality of my AVX internal precision problem?
>>>>
>>>> I agree that is a very common issue.  The surprising thing there was
>>>> that we were already constraining the code generation in a way that worked
>>>> across the different processor generations and types we used, up until we hit
>>>> the first Haswell cpus with AVX2 support (even though no AVX2 instructions
>>>> were generated).  Perhaps it shouldn't have surprised me, but it took me a
>>>> few tests to work that out because the problem was confounded with the
>>>> problem I discuss in this thread (which is unrelated).  Once I separated
>>>> them it was easy to spot.
>>>>
>>>> So that is a solved issue for now, but I am still interested in the
>>>> partitioning issue in the image metric, as I only have a workaround for
>>>> now.
>>>>
>>>>
>>>>
>>>> On Wed, Mar 19, 2014 at 11:24 AM, brian avants <stnava at gmail.com> wrote:
>>>>
>>>>>
>>>>> http://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler
>>>>>
>>>>> just as an example of the generality of this problem
>>>>>
>>>>>
>>>>> brian
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Mar 19, 2014 at 11:22 AM, Simon Alexander <
>>>>> skalexander at gmail.com> wrote:
>>>>>
>>>>>> Brian, Luis,
>>>>>>
>>>>>> Thanks.  I have been using Mattes as you suspect.
>>>>>>
>>>>>> I don't quite understand how precision is specifically the issue with
>>>>>> # of cores.  There are all kinds of issues with precision and order of
>>>>>> operations in numerical analysis, but often data partitioning (i.e. for
>>>>>> concurrency) schemes can be set up so that the actual sums are done the
>>>>>> same way regardless of number of workers, which keeps your final results
>>>>>> identical.  Is there some reason this can't be done for the Mattes metric?
>>>>>> I really should look at the implementation to answer that, of course.
>>>>>>
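One way to read the partitioning idea above, sketched under stated assumptions rather than drawn from the ITK metric code: cut the data into chunks of a fixed size that does not depend on the worker count, let each chunk write its partial sum into its own slot, and do the final reduction in chunk order. `FixedChunkSum` is a hypothetical name, and the workers are simulated by an ordinary loop to keep the sketch free of threading boilerplate.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: sum `data` so that the grouping of additions depends
// only on `chunkSize`, never on `numWorkers`.
double FixedChunkSum(const std::vector<double>& data,
                     std::size_t chunkSize,
                     std::size_t numWorkers)
{
  if (chunkSize == 0)
  {
    chunkSize = 1;
  }
  if (numWorkers == 0)
  {
    numWorkers = 1;
  }
  const std::size_t numChunks = (data.size() + chunkSize - 1) / chunkSize;
  std::vector<double> partial(numChunks, 0.0);

  // Each pass of the outer loop stands in for one worker thread. Chunks are
  // dealt out round-robin, but every chunk writes only to its own slot, so
  // which worker handles a chunk (and when) cannot affect the stored value.
  for (std::size_t w = 0; w < numWorkers; ++w)
  {
    for (std::size_t c = w; c < numChunks; c += numWorkers)
    {
      const std::size_t begin = c * chunkSize;
      const std::size_t end = std::min(begin + chunkSize, data.size());
      double s = 0.0;
      for (std::size_t i = begin; i < end; ++i)
      {
        s += data[i];
      }
      partial[c] = s;
    }
  }

  // The final reduction always runs in chunk order.
  double total = 0.0;
  for (double p : partial)
  {
    total += p;
  }
  return total;
}
```

With this layout the result is bit-identical for any worker count, at the cost of an extra buffer and a serial final pass. It is still tied to the chosen chunk size: changing `chunkSize` regroups the additions and can change the result.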
>>>>>> Do you have a pointer to earlier discussions?  If I can find the time
>>>>>> I'd like to dig into this a bit, but I'm not sure when I'll have the
>>>>>> bandwidth.  I've "solved" this currently by constraining the core count.
>>>>>>
>>>>>> Perhaps interestingly, my earlier experiments were confounded a bit
>>>>>> by a precision issue, but that had to do with intrinsics generation on my
>>>>>> compiler behaving differently on systems with AVX2 (even though only AVX
>>>>>> intrinsics were being generated).  So that made things confusing at first
>>>>>> until I separated the issues.
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 19, 2014 at 9:49 AM, brian avants <stnava at gmail.com> wrote:
>>>>>>
>>>>>>> yes - we had several discussions about this during v4 development.
>>>>>>>
>>>>>>> experiments showed that differences are due to precision.
>>>>>>>
>>>>>>> one solution was to truncate precision to the point that is
>>>>>>> reliable.
>>>>>>>
>>>>>>> but there are problems with that too.   last i checked, this was an
>>>>>>> open problem, in general, in computer science.
>>>>>>>
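To make the precision-limiting idea concrete, here is a small sketch. It is an assumption about the general approach, not the code that was in the v4 metrics: a value is snapped to the nearest multiple of `1 / resolution`, so that low-order differences disappear. `RoundToResolution` is a hypothetical name, and rounding is used here in place of strict truncation.

```cpp
#include <cmath>

// Hypothetical helper: snap `value` to the nearest multiple of 1/resolution,
// discarding digits below that resolution.
double RoundToResolution(double value, double resolution)
{
  return std::round(value * resolution) / resolution;
}
```

For example, `0.1 + 0.2` and `0.3` differ in the last bits as doubles, but both map to the same value at a resolution of `1e6`. The weakness is at the boundaries: `0.49999999` and `0.50000001` are almost equal, yet at a resolution of `1.0` they round to 0 and 1, so a tiny difference can still be amplified rather than removed.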
>>>>>>>
>>>>>>> brian
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 19, 2014 at 9:16 AM, Luis Ibanez <
>>>>>>> luis.ibanez at kitware.com> wrote:
>>>>>>>
>>>>>>>> Hi Simon,
>>>>>>>>
>>>>>>>> We are aware of some multi-threading related issues in
>>>>>>>> the registration process that result in metric values changing
>>>>>>>> depending on the number of cores used.
>>>>>>>>
>>>>>>>> Are you using the MattesMutualInformationMetric ?
>>>>>>>>
>>>>>>>> At some point it was suspected that the problem was the
>>>>>>>> result of accumulated rounding error in the contributions that
>>>>>>>> each pixel makes to the metric value... this may or may
>>>>>>>> not be related to what you are observing.
>>>>>>>>
>>>>>>>>
>>>>>>>>    Thanks
>>>>>>>>
>>>>>>>>        Luis
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Feb 20, 2014 at 3:27 PM, Simon Alexander <
>>>>>>>> skalexander at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I've been finding some regressions in registration results when
>>>>>>>>> using systems with different numbers of cores (so the thread count is
>>>>>>>>> different).  This is resolved by fixing the global maximum number of
>>>>>>>>> threads.
>>>>>>>>>
>>>>>>>>> It's difficult for me to run the identical code against 4.4.2,
>>>>>>>>> but similar experiments were run in that timeframe without these
>>>>>>>>> regressions.
>>>>>>>>>
>>>>>>>>> I recall that there were changes affecting multithreading in the v4
>>>>>>>>> registration in the 4.5.0 release, so I thought this might be a side effect.
>>>>>>>>>
>>>>>>>>> So a few questions:
>>>>>>>>>
>>>>>>>>> Is this behaviour expected?
>>>>>>>>>
>>>>>>>>> Am I correct that this was not the behaviour in 4.4.x ?
>>>>>>>>>
>>>>>>>>> Does anyone who has a feel for the recent changes 4.4.2 ->
>>>>>>>>> 4.5.[0,1] have a good idea where to start looking?  I haven't yet dug into
>>>>>>>>> the multithreading architecture, but this "smells" like a data partitioning
>>>>>>>>> issue to me.
>>>>>>>>>
>>>>>>>>> Any other thoughts?
>>>>>>>>>
>>>>>>>>> cheers,
>>>>>>>>> Simon
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Powered by www.kitware.com
>>>>>>>>>
>>>>>>>>> Visit other Kitware open-source projects at
>>>>>>>>> http://www.kitware.com/opensource/opensource.html
>>>>>>>>>
>>>>>>>>> Kitware offers ITK Training Courses, for more information visit:
>>>>>>>>> http://kitware.com/products/protraining.php
>>>>>>>>>
>>>>>>>>> Please keep messages on-topic and check the ITK FAQ at:
>>>>>>>>> http://www.itk.org/Wiki/ITK_FAQ
>>>>>>>>>
>>>>>>>>> Follow this link to subscribe/unsubscribe:
>>>>>>>>> http://www.itk.org/mailman/listinfo/insight-developers
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Community mailing list
>>>>>>>>> Community at itk.org
>>>>>>>>> http://public.kitware.com/cgi-bin/mailman/listinfo/community
>>>>>>>>>