<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
</head>
<body style="word-wrap:break-word; color:rgb(0,0,0); font-size:14px; font-family:Calibri,sans-serif">
<div>All,</div>
<div><br>
</div>
<div>
<div>Based on a discussion with Nick Tustison on the train from Nagoya airport to the MICCAI conference, I started some profiling to determine what is actually causing registration to be so slow. Some fixes have already been pushed to gerrit (<a href="http://review.source.kitware.com/#/c/12747/">http://review.source.kitware.com/#/c/12747/</a>),
and they have shown about a 15% speed improvement. This, however, appears to be only the tip of the iceberg. </div>
</div>
<div><br>
</div>
<div>In addition, I have been greatly disappointed that converting to floating point precision did not result in a performance improvement (even though all my past experience indicates that it should!). If these multithreading issues
turn out to be the problem, that would explain why improving floating point performance does not improve overall performance. </div>
<div><br>
</div>
<div>
<div>=================</div>
</div>
<div><br>
</div>
<div>So far, everything I've profiled with regard to ANTs registration indicates that there is a serious flaw in the multi-threaded implementation.</div>
<div><br>
</div>
<div>20 of the 52 seconds are spent waiting for condition variables to clear (i.e., variables are shared and require synchronization to complete). The thread concurrency histogram is particularly troubling: only 1 or 2 threads are actually doing productive work
at the same time. NOTE: THIS IS A REAL program that is actually in use for affine registration. I use it every day and have been terribly disappointed in its speed. Every ANTs registration that you do likely has this behavior.</div>
<div><br>
</div>
<div>=================</div>
<div><br>
</div>
<div>I'll continue to track down where the issues are, but they appear to be in places where a transform is referenced from multiple threads and each reference requires updating the internal reference count of the smart pointer. Each smart pointer reference count update
requires a global lock on that object to do the increment/decrement.</div>
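<div><br>
</div>
<div>To make the suspected pattern concrete, here is a minimal sketch. It is not ITK or ANTs code: it uses Rust's Arc (an atomically reference-counted pointer) as a stand-in for the smart pointer, and the Transform type and run_demo function are hypothetical names made up for illustration. Arc updates its count with atomic operations rather than a lock, but the contention pattern should be similar: in the first loop every iteration touches the one shared count, while in the second each thread takes a single reference and then works through a plain borrow.</div>
<pre>
```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for a transform object shared by worker threads.
struct Transform {
    scale: f64,
}

impl Transform {
    fn apply(&self, x: f64) -> f64 {
        self.scale * x
    }
}

// Runs the same toy workload two ways and returns
// (sum with a clone per iteration, sum with one clone per thread,
//  reference count left once all workers have finished).
fn run_demo(num_threads: usize, iters: usize) -> (f64, f64, usize) {
    let shared = Arc::new(Transform { scale: 2.0 });

    // Pattern 1: each iteration copies the smart pointer, so every thread
    // keeps incrementing and decrementing the same shared count.
    let mut handles = Vec::new();
    for _ in 0..num_threads {
        let per_thread = Arc::clone(&shared);
        handles.push(thread::spawn(move || {
            let mut sum = 0.0;
            for i in 0..iters {
                let local = Arc::clone(&per_thread); // atomic increment
                sum += local.apply(i as f64);
                // `local` is dropped here: atomic decrement
            }
            sum
        }));
    }
    let mut contended = 0.0;
    for h in handles {
        contended += h.join().unwrap();
    }

    // Pattern 2: one copy per thread, then a plain borrow inside the loop,
    // so the hot loop never touches the reference count.
    let mut handles = Vec::new();
    for _ in 0..num_threads {
        let per_thread = Arc::clone(&shared);
        handles.push(thread::spawn(move || {
            let transform: &Transform = &per_thread; // no count update
            let mut sum = 0.0;
            for i in 0..iters {
                sum += transform.apply(i as f64);
            }
            sum
        }));
    }
    let mut borrowed = 0.0;
    for h in handles {
        borrowed += h.join().unwrap();
    }

    (contended, borrowed, Arc::strong_count(&shared))
}
```
</pre>
<div>Both patterns compute the same sum, and the count is back to 1 once the workers are joined; the only difference is how often the shared count is touched inside the loop. Whether this is what actually dominates the registration profile still needs to be confirmed.</div>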
<div><br>
</div>
<div>More testing to follow.</div>
<div><br>
</div>
<div>Hans</div>
<div><br>
</div>
<div><img src="cid:BB93F41A-C611-4E82-8897-59D419BC5E08" type="image/png"></div>
</body>
</html>