<html><head><meta http-equiv="Content-Type" content="text/html charset=iso-8859-1"><base href="x-msg://23/"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Hello Vikash,<div><br></div><div><div>Just to clarify some terminology, since some of it may have been mixed up and I am a little confused (I'd like a laptop with 4 CPUs):</div><div><br></div><div>CPU or socket</div><div>core or physical processor</div><div>logical processor or virtual core</div><div><br></div><div><br></div><div>So ITK by default tries to use all the logical processors, or virtual cores. In my experience with large data sets the virtual cores generally don't hurt performance, and some of the RecursiveGaussian-related filters have gotten up to a 50% speed-up.</div><div><br></div><div>The operating system does the task of assigning processes and threads to logical processors. The scheduling even includes things like CPU temperature, to aid in CPU features like turbo boost.</div><div><br></div><div>Most threading libraries have a thread affinity attribute which can be set, but I have never messed with it.</div><div><br></div><div>I generally multi-thread on the whole data set in the standard ITK way, or have a bunch of objects I process with multiple processes or threads. For example, if I have to process 1000 objects, I may write a program to perform the processing and then use shell scripting to batch it out. I'd try to run as many processes as there are physical processors, and tell ITK to only run 2 threads. I have found this to be very efficient with hyper-threading.</div><div><br></div><div>This is really a problem-specific and system-specific issue of how to get the best performance with modern heterogeneous systems. 
And it's important to keep NUMA in mind with regard to the memory layout, as well as the shared and unshared layers of CPU cache, when considering how to break up your problem.</div><div><br></div><div>Also, I generally try to avoid doing two layers of ITK multi-threading. As the pipeline objects are not thread-safe under concurrent access with regard to the pipeline, it seems more problematic than beneficial.</div><div><br></div><div>I hope my ramblings on this issue are helpful.</div><div><br></div><div>Brad</div><div><br><div><div>On Mar 21, 2013, at 4:47 PM, Vikash Gupta <<a href="mailto:vikash.gupta@inria.fr">vikash.gupta@inria.fr</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt; ">Hi Matt and Willi,<span class="Apple-converted-space"> </span><br>Thanks for your insights into the Insight Toolkit :). 
Yes, I agree that the number of threads will be divided among the CPUs. My only worry was what happens in multithreading when I assign one particular job to one particular CPU. From your answers I feel that on the second level of multi-threading I should be a little bit careful: if the CPU has 4 cores, I shouldn't assign it 8 threads using SetNumberOfThreads(), and the computing will be restricted to the CPU itself.<br><br>Thanks<span class="Apple-converted-space"> </span><br>Vikash<span class="Apple-converted-space"> </span><br><br><hr id="zwchr"><blockquote style="border-left-width: 2px; border-left-style: solid; border-left-color: rgb(16, 16, 255); margin-left: 5px; padding-left: 5px; font-weight: normal; font-style: normal; text-decoration: none; font-family: Helvetica, Arial, sans-serif; font-size: 12pt; "><b>From:<span class="Apple-converted-space"> </span></b>"Willi Huber" <<a href="mailto:surfersparadise85-itk@yahoo.com">surfersparadise85-itk@yahoo.com</a>><br><b>To:<span class="Apple-converted-space"> </span></b>"insight-users@itk org" <<a href="mailto:insight-users@itk.org">insight-users@itk.org</a>>, "vikash gupta" <<a href="mailto:vikash.gupta@inria.fr">vikash.gupta@inria.fr</a>><br><b>Sent:<span class="Apple-converted-space"> </span></b>Friday, March 22, 2013 2:07:18 AM<br><b>Subject:<span class="Apple-converted-space"> </span></b>AW: [Insight-users] ThreadedGenerateData()<br><br><table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td valign="top" style="font: inherit; "><div style="margin: 0px; ">Hello Vikash,</div><div style="margin: 0px; ">this might not be correct, since I have no deep insight into ITK and whether or not it has its own threading library, but usually your OS, i.e. 
Windows or Linux, takes care of the distribution of threads.<br>I don't know of any threading library that pins threads to a certain core or CPU, since that is the idea behind threads: "load balancing".<span class="Apple-converted-space"> </span><br>Therefore it (the OS or some hardware) assigns threads to a core which is currently out of work.</div><div style="margin: 0px; ">Cheers,<br>Willi</div></td></tr></tbody></table><div id="_origMsg_"><br><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt; "><font face="Tahoma" size="2"><hr size="1"><b><span style="font-weight: bold; ">From:</span><span class="Apple-converted-space"> </span></b>Vikash Gupta <<a href="mailto:vikash.gupta@inria.fr">vikash.gupta@inria.fr</a>>;<span class="Apple-converted-space"> </span><br><b><span>To:</span><span class="Apple-converted-space"> </span></b>itk <<a href="mailto:insight-users@itk.org">insight-users@itk.org</a>>;<span class="Apple-converted-space"> </span><br><b><span>Subject:</span><span class="Apple-converted-space"> </span></b>[Insight-users] ThreadedGenerateData()<span class="Apple-converted-space"> </span><br><b><span style="font-weight: bold; ">Sent:</span><span class="Apple-converted-space"> </span></b>Thu, Mar 21, 2013 8:02:22 PM<span class="Apple-converted-space"> </span><br></font><br><table border="0" cellpadding="0" cellspacing="0" style="position: static; z-index: auto; "><tbody><tr><td valign="top" style="font: inherit; "><div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt; ">Dear ITK users,<span class="Apple-converted-space"> </span><br>This might be a naive question, but I have been pondering it for some time. 
When the ThreadedGenerateData() function is called in an ITK filter, do the threads refer to the number of CPUs on a computer or the number of cores on each CPU?<span class="Apple-converted-space"> </span><br><br>For example,<span class="Apple-converted-space"> </span>I have 8 CPUs on my laptop with 4 cores on each, so if I call SetNumberOfThreads(4), will these 4 threads run on cores of the same CPU?<span class="Apple-converted-space"> </span><br><br>On the next level, if I divide the work across the CPUs and then also do multithreading within each CPU, will the number of threads correspond to the number of cores on each CPU?<br>Thanks a lot for any insight.<br><br>Vikash<br><br></div></td></tr></tbody></table></div></div></blockquote><br></div></div></blockquote></div><br></div></div></body></html>