Brad,<div><br></div><div>So the Mattes MI metric, single-threaded, was 4 times slower in 4.2 than in 3.20, and after the patch became only 2 times slower. Egad, that&#39;s a big difference. I don&#39;t know about you, but I am quite surprised. I wasn&#39;t using this metric, so I hadn&#39;t noticed. On Friday I briefly compared the files from 3.20 -&gt; 4.2, but I didn&#39;t see where the slowdown would be.</div>
<div><br></div><div>It does not seem to be in the interpolators. I made an interpolator benchmark and it seems to run as well or better in 4.2. I split itkbench into multiple files, one per benchmark, to try to stay organized. I&#39;ll try to reproduce your result from your fork and pull it in.</div>
<div><br></div><div>In any case, I think your approach to handling the Jacobian is good. Since the Jacobian is used by nearly all metrics, I wonder if it should be a member of the base class and allocated there?</div><div>
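<div><br></div><div>To make the question concrete, here is a minimal sketch of what &quot;a member of the base class and allocated there&quot; could look like. It is written in Python only for brevity, not in ITK&#39;s C++, and every class and method name in it is hypothetical; it is not the Gerrit patch and not ITK&#39;s API. It assumes one scratch Jacobian per thread, allocated once by the base class and reused by derived metrics.</div>
<pre>
```python
class MetricBase:
    """Hypothetical base class that owns one scratch Jacobian per thread."""

    def __init__(self, num_threads, num_dims, num_params):
        # Allocate once, up front: one num_dims x num_params buffer per thread.
        self.jacobians = [
            [[0.0] * num_params for _ in range(num_dims)]
            for _ in range(num_threads)
        ]

    def jacobian_for(self, thread_id):
        # Each thread reuses its own buffer, so nothing is allocated per sample.
        return self.jacobians[thread_id]


class TranslationMetric(MetricBase):
    """Hypothetical derived metric that fills the shared buffer in place."""

    def fill_jacobian(self, thread_id, point):
        jac = self.jacobian_for(thread_id)
        # For a pure translation the Jacobian is the identity at every point.
        for r, row in enumerate(jac):
            for c in range(len(row)):
                row[c] = 1.0 if r == c else 0.0
        return jac


metric = TranslationMetric(num_threads=2, num_dims=2, num_params=2)
first = metric.fill_jacobian(1, (0.0, 0.0))
second = metric.fill_jacobian(1, (3.0, 4.0))
print(first is second)  # the same buffer is reused across calls
```
</pre>
<div>Whether such a layout helps or hurts in ITK itself would have to be measured with the benchmark.</div>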
<br></div><div>Cheers,</div><div>Rupert</div><div><br></div><div>--------------------------------------------------------------<br>Rupert Brooks<br><a href="mailto:rupert.brooks@gmail.com">rupert.brooks@gmail.com</a><br><br>

<br><br><div class="gmail_quote">On Fri, Jul 27, 2012 at 9:59 AM, Bradley Lowekamp <span dir="ltr">&lt;<a href="mailto:blowekamp@mail.nih.gov" target="_blank">blowekamp@mail.nih.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word"><div class="im">OK, here are some more numbers for the latest patch in Gerrit. I will follow Rupert&#39;s format, as it is the clearest.<div><br></div><div>MeanSquares:</div><div><div>Threads<span style="white-space:pre-wrap">        </span>3.2<span style="white-space:pre-wrap">                </span>4.2<span style="white-space:pre-wrap">                </span>4.2+patch<span style="white-space:pre-wrap">        </span>patch percentage of 3.20</div>
<div>1<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>0.3615<span style="white-space:pre-wrap">        </span>0.8214<span style="white-space:pre-wrap">        </span>0.4071<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>113%</div>
<div>2<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>0.3222<span style="white-space:pre-wrap">        </span>0.6055<span style="white-space:pre-wrap">        </span>0.3365<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>104%</div>
<div>4<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>0.3249<span style="white-space:pre-wrap">        </span>0.4448<span style="white-space:pre-wrap">        </span>0.3293<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>101%</div>
<div>8<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>0.1703<span style="white-space:pre-wrap">        </span>0.3093<span style="white-space:pre-wrap">        </span>0.1943<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>114%</div>
<div>12<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>0.1457<span style="white-space:pre-wrap">        </span>0.2031<span style="white-space:pre-wrap">        </span>0.1322<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>91%</div>
<div>24*<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>0.1062<span style="white-space:pre-wrap">        </span>0.1332<span style="white-space:pre-wrap">        </span>0.0949<span style="white-space:pre-wrap">                </span>89%</div>
</div><div><br></div><div>MutualInformation:</div><div><div><div>Threads<span style="white-space:pre-wrap">        </span>3.2<span style="white-space:pre-wrap">                </span>4.2<span style="white-space:pre-wrap">                </span>4.2+patch<span style="white-space:pre-wrap">        </span>patch percentage of 3.20</div>
<div>1<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>0.1467<span style="white-space:pre-wrap">        </span>0.6103<span style="white-space:pre-wrap">        </span>0.3353<span style="white-space:pre-wrap">                </span>228%</div>
<div>2<span style="white-space:pre-wrap">                </span>0.1036<span style="white-space:pre-wrap">        </span>0.3747<span style="white-space:pre-wrap">        </span>0.1774<span style="white-space:pre-wrap">                </span>171%</div><div>4<span style="white-space:pre-wrap">                </span>0.0847<span style="white-space:pre-wrap">        </span>0.2175<span style="white-space:pre-wrap">        </span>0.1262<span style="white-space:pre-wrap">                </span>149%</div>
<div>8<span style="white-space:pre-wrap">                </span>0.0655<span style="white-space:pre-wrap">        </span>0.1291<span style="white-space:pre-wrap">        </span>0.0681<span style="white-space:pre-wrap">                </span>104%</div><div>12<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>0.0551<span style="white-space:pre-wrap">        </span>0.1035<span style="white-space:pre-wrap">        </span>0.0486<span style="white-space:pre-wrap">                </span>88%</div>
<div>24*<span style="white-space:pre-wrap">        </span><span style="white-space:pre-wrap">        </span>0.0460<span style="white-space:pre-wrap">        </span>0.0829<span style="white-space:pre-wrap">        </span>0.0526<span style="white-space:pre-wrap">                </span>114%</div>
</div></div><div><br></div><div>*Hyperthreading</div></div><div><div class="im"><div><br></div><div>The observation to be made about MutualInformation is that, while 4.2 is still slower with one thread, there is now a significant increase in speed-up due to threads.</div>
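<div><br></div><div>As a quick check of this observation, the speed-up relative to one thread can be computed directly from the MutualInformation table. The sketch below uses Python purely for the arithmetic; the variable and function names are made up for illustration, and the timings are copied from the table in the units reported there.</div>
<pre>
```python
# Timings copied from the MutualInformation table (units as reported there).
threads = [1, 2, 4, 8, 12, 24]
t_320 = [0.1467, 0.1036, 0.0847, 0.0655, 0.0551, 0.0460]
t_42_patch = [0.3353, 0.1774, 0.1262, 0.0681, 0.0486, 0.0526]


def speedups(times):
    """Speed-up of each run relative to the single-threaded run."""
    return [times[0] / t for t in times]


def percent_of(times, baseline):
    """Each time as a percentage of the matching baseline time."""
    return [100.0 * t / b for t, b in zip(times, baseline)]


s_320 = speedups(t_320)
s_patch = speedups(t_42_patch)
pct = percent_of(t_42_patch, t_320)

for n, a, b, p in zip(threads, s_320, s_patch, pct):
    print("{:2d} threads: 3.20 speed-up {:.2f}x, 4.2+patch speed-up {:.2f}x, "
          "patch time {:.0f}% of 3.20".format(n, a, b, p))
```
</pre>
<div>At 12 threads, for example, 3.20 runs about 2.7 times faster than its own single-threaded time, while 4.2+patch runs about 6.9 times faster than its own single-threaded time.</div>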
<div><br></div></div><div>Brad</div></div><div><br></div><div><div class="im"><div>On Jul 26, 2012, at 2:02 PM, Rupert Brooks wrote:</div><br></div><div class="im"><blockquote type="cite">OK, that makes way more sense. Sorry I didn&#39;t understand the first time around.<div>
<br></div><div>Just so I&#39;ve got it right:<div>Threads                    3.20                              4.2+patch            Time of 4.2+patch as percent of 3.20</div>
<div><div>1                         0.347567                            0.383342                110.293%</div><div>2                         0.300869                            0.335328                111.453%</div><div>4                         0.348677                            0.315688                 90.5388%</div>

<div>8                         0.182681                            0.192132                105.173%</div></div><div><br></div><div>So there&#39;s about 10% more time with ITK 4.2 plus the patch than with 3.20 in the 1- and 2-thread cases. That is definitely better than what we were getting. Cool.</div>

<div><br></div><div>Rupert</div><div><br></div><div>--------------------------------------------------------------<br>Rupert Brooks<br><a href="mailto:rupert.brooks@gmail.com" target="_blank">rupert.brooks@gmail.com</a><br>
<br>
<br><br><div class="gmail_quote">On Thu, Jul 26, 2012 at 1:13 PM, Bradley Lowekamp <span dir="ltr">&lt;<a href="mailto:blowekamp@mail.nih.gov" target="_blank">blowekamp@mail.nih.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div style="word-wrap:break-word">Sorry for not being clear! I got too excited by finding the solution to the performance issue with ITKv3 registration in ITKv4.<div><br></div><div>The first is vanilla 3.20, the second is 4.2 + the Gerrit patch. The third is the Gerrit patch with the pre-malloc of the Jacobian outside the threaded section! Vanilla 4.2 is ~2x 3.20 for this test on my system too.<div>

<br></div><div>Summary for the MeanSquares metric in your test:</div><div><br></div><div>3.20:  1X</div><div>4.2: 2+X</div><div>4.2+gerrit patch: 1X</div><div>4.2+gerrit patch + single-threaded preallocation of Jacobian: 1.5X<br>

</div></div></div></blockquote></div></div></div></blockquote></div></div><br><div class="im"><div>
<span style="text-indent:0px;letter-spacing:normal;font-variant:normal;text-align:-webkit-auto;font-style:normal;font-weight:normal;line-height:normal;border-collapse:separate;text-transform:none;font-size:medium;white-space:normal;font-family:Helvetica;word-spacing:0px"><span style="text-indent:0px;letter-spacing:normal;font-variant:normal;font-style:normal;font-weight:normal;line-height:normal;border-collapse:separate;text-transform:none;font-size:12px;white-space:normal;font-family:Helvetica;word-spacing:0px"><div style="word-wrap:break-word">
<span style="text-indent:0px;letter-spacing:normal;font-variant:normal;font-style:normal;font-weight:normal;line-height:normal;border-collapse:separate;text-transform:none;font-size:12px;white-space:normal;font-family:Helvetica;word-spacing:0px"><p style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
<font face="Helvetica" size="3" style="font:normal normal normal 12px/normal Helvetica">========================================================</font></p><p style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
<font face="Helvetica" size="3" style="font:normal normal normal 12px/normal Helvetica">Bradley Lowekamp<span> </span><span> </span></font></p><p style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px"><font face="Helvetica" size="3" style="font:normal normal normal 12px/normal Helvetica">Medical Science and Computing for</font></p>
<p style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px"><font face="Helvetica" size="3" style="font:normal normal normal 12px/normal Helvetica">Office of High Performance Computing and Communications</font></p>
<p style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px"><font face="Helvetica" size="3" style="font:normal normal normal 12px/normal Helvetica">National Library of Medicine<span> </span></font></p><p style="margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px">
<font face="Helvetica" size="3" style="font:normal normal normal 12px/normal Helvetica"><a href="mailto:blowekamp@mail.nih.gov" target="_blank">blowekamp@mail.nih.gov</a></font></p><br></span></div></span></span><br>
</div>
<br></div></div><br>
<br></blockquote></div><br></div>