[Insight-developers] Very suspicious behavior.

Bradley Lowekamp blowekamp at mail.nih.gov
Fri Feb 25 09:53:14 EST 2011


Hans,

Just a random thought: have you checked whether the failure depends on the number of bins?  Have you tried a prime number of bins? I feel like some integer ring math could be done to check for a pattern.
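
A rough way to start on the integer-math idea is to tabulate the bin count modulo each thread count next to the pass/fail result reported in the quoted message. This is only a sketch: the bin count of 50 in the usage line is an assumed placeholder, not a value taken from the failing test, and the function name remainder_table is made up for illustration.

```shell
#!/bin/sh
# Print "<threads> <bins mod threads> <status>" for thread counts 1..16.
# The FAIL set {4,7,14,15} comes from the quoted report; the bin count is
# supplied by the caller and is NOT known from the report.
remainder_table() {
    bins=$1
    for t in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16; do
        case " 4 7 14 15 " in
            *" $t "*) status=FAIL ;;
            *)        status=pass ;;
        esac
        echo "$t $((bins % t)) $status"
    done
}

# Example call with an assumed bin count of 50.
remainder_table 50
```

Rerunning it with the real bin count, and with a prime one, would show whether the failing thread counts share a remainder that the passing ones do not.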

Brad

On Feb 24, 2011, at 9:29 PM, Johnson, Hans J wrote:

> Hello ITK Experts:
> 
> I have a really nasty problem that is almost certainly some sort of algorithmic anomaly that occurs during the creation of multiple threads under the right data conditions.  The failure is far more than numerical precision: the result is off by about 15 degrees and several mm.
> 
> Mark Scully has been chasing this down for several days now, and we have been able to narrow down the environments for which this occurs.  
> 
> The problem being solved is a straightforward Versor3DTransformOptimizer, MattesMutualInformation, LinearInterpolator ITK registration between two T1-weighted images.
> 
> ---To cause the failure, run the process with exactly 4 threads (it also fails with 7, 14, or 15 threads, but not with 1, 2, 3, 5, 6, 8, 9, 10, 11, 12, 13, or 16 threads).
> 
> ---We have replicated the problem on Windows 32-bit, Linux 32/64-bit, and Mac 64-bit.  Additionally, on Mac we've tested in both Debug and Release mode, and we've built against both ITKv3 and ITKv4.  The results are consistent: the test fails only when the thread count is one of {4,7,14,15} and passes with every other thread count tested (see listing at end of message).
> 
> ---Changing the metric from MattesMutualInformation to MeanSquareError allows this to pass with any number of threads.
> 
> ---ITK is built with OptimizedRegistration in ITKv3 (in ITKv4, OptimizedRegistration is the default).
> 
> Create a shell script with the following contents and run it to see the behavior.  The first instance of the test will pass, the second will fail.
> 
> \/\//\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
> #!/bin/sh
> svn checkout --username slicerbot --password slicer https://www.nitrc.org/svn/brains/StandAloneApps/StandAloneBRAINSFit StandAloneBRAINSFit
> mkdir StandAloneBRAINSFit-build
> cd StandAloneBRAINSFit-build
> cmake ../StandAloneBRAINSFit
> make
> cd BRAINSFit-build
> 
> echo export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=1
> export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=1
> ctest -R RigidRotGeom
> 
> echo export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=4
> export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=4
> ctest -R RigidRotGeom
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> 
> Any hints on a strategy to help identify and correct the failure would be GREATLY appreciated.  
> 
> Thanks,
> Hans
> 
> 
> [hjohnson at hjhomebuildbox BRAINSFit-build]$ for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16;do echo "=============== Running with ${i} Threads ========= " ; export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=${i} ; ctest -R BRAINSFitTest_RigidRotGeomNoMasks$ ; done
> =============== Running with 1 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...   Passed  184.79 sec
> 
> 100% tests passed, 0 tests failed out of 1
> 
> Total Test time (real) = 184.81 sec
> =============== Running with 2 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...   Passed  140.10 sec
> 
> 100% tests passed, 0 tests failed out of 1
> 
> Total Test time (real) = 140.13 sec
> =============== Running with 3 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...   Passed   54.98 sec
> 
> 100% tests passed, 0 tests failed out of 1
> 
> Total Test time (real) =  55.01 sec
> =============== Running with 4 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...***Failed   22.80 sec
> 
> 0% tests passed, 1 tests failed out of 1
> 
> Total Test time (real) =  22.83 sec
> 
> The following tests FAILED:
> 28 - BRAINSFitTest_RigidRotGeomNoMasks (Failed)
> Errors while running CTest
> =============== Running with 5 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...   Passed   35.23 sec
> 
> 100% tests passed, 0 tests failed out of 1
> 
> Total Test time (real) =  35.29 sec
> =============== Running with 6 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...   Passed   29.61 sec
> 
> 100% tests passed, 0 tests failed out of 1
> 
> Total Test time (real) =  29.63 sec
> =============== Running with 7 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...***Failed   55.38 sec
> 
> 0% tests passed, 1 tests failed out of 1
> 
> Total Test time (real) =  55.41 sec
> 
> The following tests FAILED:
> 28 - BRAINSFitTest_RigidRotGeomNoMasks (Failed)
> Errors while running CTest
> =============== Running with 8 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...   Passed   17.25 sec
> 
> 100% tests passed, 0 tests failed out of 1
> 
> Total Test time (real) =  17.26 sec
> =============== Running with 9 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...   Passed   27.49 sec
> 
> 100% tests passed, 0 tests failed out of 1
> 
> Total Test time (real) =  27.51 sec
> =============== Running with 10 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...   Passed   36.89 sec
> 
> 100% tests passed, 0 tests failed out of 1
> 
> Total Test time (real) =  36.92 sec
> =============== Running with 11 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...   Passed   19.71 sec
> 
> 100% tests passed, 0 tests failed out of 1
> 
> Total Test time (real) =  19.72 sec
> =============== Running with 12 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...   Passed   17.97 sec
> 
> 100% tests passed, 0 tests failed out of 1
> 
> Total Test time (real) =  17.98 sec
> =============== Running with 13 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...   Passed   34.96 sec
> 
> 100% tests passed, 0 tests failed out of 1
> 
> Total Test time (real) =  34.98 sec
> =============== Running with 14 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...***Failed   18.18 sec
> 
> 0% tests passed, 1 tests failed out of 1
> 
> Total Test time (real) =  18.20 sec
> 
> The following tests FAILED:
> 28 - BRAINSFitTest_RigidRotGeomNoMasks (Failed)
> Errors while running CTest
> =============== Running with 15 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...***Failed   17.45 sec
> 
> 0% tests passed, 1 tests failed out of 1
> 
> Total Test time (real) =  17.48 sec
> 
> The following tests FAILED:
> 28 - BRAINSFitTest_RigidRotGeomNoMasks (Failed)
> Errors while running CTest
> =============== Running with 16 Threads ========= 
> Test project /Users/hjohnson/src/StandAloneApps/BF-ITKv4-Release/BRAINSFit-build
>     Start 28: BRAINSFitTest_RigidRotGeomNoMasks
> 1/1 Test #28: BRAINSFitTest_RigidRotGeomNoMasks ...   Passed   23.14 sec
> 
> 100% tests passed, 0 tests failed out of 1
> 
> Total Test time (real) =  23.15 sec
> 
> 
> 

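When sweeping many thread counts it can help to reduce a saved copy of a log like the one quoted above to just the failing counts. A minimal sketch, assuming the log keeps the "Running with N Threads" banner from the loop and ctest's "***Failed" marker; the function name failing_threads and the file name threads.log are hypothetical.

```shell
#!/bin/sh
# Read a thread-sweep log on stdin and print the thread counts whose test failed.
# Works on both the raw log and a "> "-quoted copy, because the thread count is
# located relative to the word "with" rather than by field position.
failing_threads() {
    awk '
        /Running with [0-9]+ Threads/ {
            # Remember the thread count announced by the most recent banner.
            for (i = 1; i < NF; i++) if ($i == "with") n = $(i + 1)
        }
        /\*\*\*Failed/ {
            # Append the current thread count to a space-separated list.
            out = out (out == "" ? "" : " ") n
        }
        END { print out }
    '
}
```

Usage, with the loop output saved to a file first: `failing_threads < threads.log`.
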
========================================================
Bradley Lowekamp  
Lockheed Martin Contractor for
Office of High Performance Computing and Communications
National Library of Medicine 
blowekamp at mail.nih.gov

