<html><head><style type="text/css"><!-- DIV {margin:0px;} --></style></head><body><div style="font-family:times new roman,new york,times,serif;font-size:14pt"><div>Thank you Andriy for your reply.<br><br>Actually my problem is way simpler than that. I want to multiply matrices (ordinary matrix-matrix multiplication and element-wise multiplication) but my algorithm involves iterative and large matrix multiplication. I tried to exploit multi-threading by openmp (not openmpi). I googled it and find this simple code for multi-threaded matrix-matrix multiplication:<br><br>from: https://computing.llnl.gov/tutorials/openMP/samples/C/omp_mm.c <br>/******************************************************************************<br>* FILE: omp_mm.c<br>* DESCRIPTION:&nbsp; <br>*&nbsp;&nbsp; OpenMp Example - Matrix Multiply - C Version<br>*&nbsp;&nbsp; Demonstrates a matrix multiply using OpenMP. Threads share row iterations<br>*&nbsp;&nbsp; according to a predefined

 chunk size.<br>* AUTHOR: Blaise Barney<br>* LAST REVISED: 06/28/05<br>******************************************************************************/<br>#include &lt;omp.h&gt;<br>#include &lt;stdio.h&gt;<br>#include &lt;stdlib.h&gt;<br><br>#define NRA 62&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* number of rows in matrix A */<br>#define NCA 15&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* number of columns in matrix A */<br>#define NCB 7&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* number of columns in matrix B */<br><br>int main (int argc, char *argv[]) <br>{<br>int&nbsp;&nbsp;&nbsp; tid, nthreads, i, j, k, chunk;<br>double&nbsp;&nbsp;&nbsp; a[NRA][NCA],&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* matrix A to be multiplied */<br>&nbsp;&nbsp;&nbsp;

 b[NCA][NCB],&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* matrix B to be multiplied */<br>&nbsp;&nbsp;&nbsp; c[NRA][NCB];&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* result matrix C */<br><br>chunk = 10;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* set loop iteration chunk size */<br><br>/*** Spawn a parallel region explicitly scoping all variables ***/<br>#pragma omp parallel shared(a,b,c,nthreads,chunk) private(tid,i,j,k)<br>&nbsp; {<br>&nbsp; tid = omp_get_thread_num();<br>&nbsp; if (tid == 0)<br>&nbsp;&nbsp;&nbsp; {<br>&nbsp;&nbsp;&nbsp; nthreads = omp_get_num_threads();<br>&nbsp;&nbsp;&nbsp; printf("Starting matrix multiple example with %d threads\n",nthreads);<br>&nbsp;&nbsp;&nbsp; printf("Initializing matrices...\n");<br>&nbsp;&nbsp;&nbsp; }<br>&nbsp; /*** Initialize matrices ***/<br>&nbsp; #pragma omp for schedule (static, chunk)

 <br>&nbsp; for (i=0; i&lt;NRA; i++)<br>&nbsp;&nbsp;&nbsp; for (j=0; j&lt;NCA; j++)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; a[i][j]= i+j;<br>&nbsp; #pragma omp for schedule (static, chunk)<br>&nbsp; for (i=0; i&lt;NCA; i++)<br>&nbsp;&nbsp;&nbsp; for (j=0; j&lt;NCB; j++)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; b[i][j]= i*j;<br>&nbsp; #pragma omp for schedule (static, chunk)<br>&nbsp; for (i=0; i&lt;NRA; i++)<br>&nbsp;&nbsp;&nbsp; for (j=0; j&lt;NCB; j++)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; c[i][j]= 0;<br><br>&nbsp; /*** Do matrix multiply sharing iterations on outer loop ***/<br>&nbsp; /*** Display who does which iterations for demonstration purposes ***/<br>&nbsp; printf("Thread %d starting matrix multiply...\n",tid);<br>&nbsp; #pragma omp for schedule (static, chunk)<br>&nbsp; for (i=0; i&lt;NRA; i++)&nbsp;&nbsp;&nbsp; <br>&nbsp;&nbsp;&nbsp; {<br>&nbsp;&nbsp;&nbsp; printf("Thread=%d did row=%d\n",tid,i);<br>&nbsp;&nbsp;&nbsp; for(j=0; j&lt;NCB;

 j++)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for (k=0; k&lt;NCA; k++)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; c[i][j] += a[i][k] * b[k][j];<br>&nbsp;&nbsp;&nbsp; }<br>&nbsp; }&nbsp;&nbsp; /*** End of parallel region ***/<br><br>/*** Print results ***/<br>printf("******************************************************\n");<br>printf("Result Matrix:\n");<br>for (i=0; i&lt;NRA; i++)<br>&nbsp; {<br>&nbsp; for (j=0; j&lt;NCB; j++) <br>&nbsp;&nbsp;&nbsp; printf("%6.2f&nbsp;&nbsp; ", c[i][j]);<br>&nbsp; printf("\n"); <br>&nbsp; }<br>printf("******************************************************\n");<br>printf ("Done.\n");<br><br>}<br><br><br>it compiles and works perfectly when I type: <br>&gt;&gt; g++&nbsp; omp_mm.c&nbsp; -o omp_mm -fopenmp<br><br>but when I add following lines to my CMakefile:<br>SET(CurrentExe "omp_mm")<br>ADD_EXECUTABLE(${CurrentExe} omp_mm.c ${SRCS} ${HDRS}) <br>TARGET_LINK_LIBRARIES(${CurrentExe}

 ${Libraries})<br><br>and change :<br>CMAKE_EXE_LINKER_FLAGS : -fopenmp<br><br>here is what I get in compilation time:<br>/home/kaveh/Src/ITKNMF/Source/omp_mm.c: In function Ąmainĸ:<br>/home/kaveh/Src/ITKNMF/Source/omp_mm..c:28: warning: ignoring #pragma omp parallel<br>/home/kaveh/Src/ITKNMF/Source/omp_mm.c:38: warning: ignoring #pragma omp for<br>/home/kaveh/Src/ITKNMF/Source/omp_mm.c:42: warning: ignoring #pragma omp for<br>/home/kaveh/Src/ITKNMF/Source/omp_mm.c:46: warning: ignoring #pragma omp for<br>/home/kaveh/Src/ITKNMF/Source/omp_mm.c:54: warning: ignoring #pragma omp for<br>/home/kaveh/Src/ITKNMF/Source/omp_mm.c:76: warning: control reaches end of non-void function<br><br>and no matter how I change the enviroment variable of $OMP_NUM_THREADS, it always lunches one thread. Unlike when I compile it with command line (without cmake), in which it works perfectly!!!! <br><br>Now, I am confused!! There should be something wrong with cmake but how

 can I fix it?<br><br>Second question is, ITK uses LAPACK and LAPACK is multithreaded (corret me if I am wrong). If yes how can I set number of threads for matrix multiplication?<br><br>Regards,<br>Kaveh<br> </div><div style="font-family: times new roman,new york,times,serif; font-size: 14pt;"><div style="font-family: arial,helvetica,sans-serif; font-size: 13px;"><br>Kaveh,<br><br>You can look at PETSc for parallel linear algebra routines:<br><a href="http://www.mcs.anl.gov/petsc/petsc-as/" target="_blank">http://www.mcs.anl.gov/petsc/petsc-as/</a>.<br><br>In the past I used PETSc in conjunction with ITK code. You have to be<br>careful to make sure that both PETSc and ITK are compiled with the<br>same mpicc/mpicxx/mpiCC compiler.<br><br>Hope this helps<br><br>Andriy Fedorov<br><br><br><br>&gt; Date: Wed, 21 Jan 2009 09:01:08 -0800 (PST)<br>&gt; From: Kaveh Kohan &lt;<a ymailto="mailto:kaveh.kohan@yahoo.com"

 href="mailto:kaveh.kohan@yahoo.com">kaveh.kohan@yahoo.com</a>&gt;<br>&gt; Subject: [Insight-users] about multithreading in ITK<br>&gt; To: <a ymailto="mailto:insight-users@itk.org" href="mailto:insight-users@itk.org">insight-users@itk.org</a><br>&gt; Message-ID: &lt;<a ymailto="mailto:352748.10256.qm@web59915.mail.ac4.yahoo.com" href="mailto:352748.10256.qm@web59915.mail.ac4.yahoo.com">352748.10256.qm@web59915.mail.ac4.yahoo.com</a>&gt;<br>&gt; Content-Type: text/plain; charset="us-ascii"<br>&gt;<br>&gt; Hello All,<br>&gt;<br>&gt; I have a general question about multi-threading in ITK and I would be thankful if anybody<br>&gt; answer:<br>&gt;<br>&gt; As far as I know, ITK uses vnl for linear albera operation but it is not<br>&gt; multithreaded (Please correct me if I am wrong). While many complicated<br>&gt; algorithm are already multithreaded in ITK, was there any technical reason that<br>&gt; linear algebra operation (like matrix-matrix

 multiplication, solving linear<br>&gt; system, LU, ....) is not multithreaded? Perhaps it didn't have priority.<br>&gt;<br>&gt; I would be appreciated if you tell what is the best way to implement efficient,<br>&gt; multithread linear algebra operation. In case it is not a good idea to use ITK<br>&gt; API for multithreading of such operation, is there any third party multithreaded<br>&gt; libray for linear algebra operation that I can link to ITK.<br>&gt;<br>&gt;<br>&gt; Regards,<br>&gt; Kaveh<br>&gt;<br>&gt;<br>&gt;<br>&gt;<br><br></div></div></div><br>


      </body></html>