<font size="2"><font face="verdana,sans-serif">Instead of writing the loop &quot;by hand&quot;, you might use an optimized linear algebra library. ITK includes VNL, but you may also use something else (I use <a href="http://eigen.tuxfamily.org/index.php?title=Main_Page">Eigen</a>). It should be faster, because such libraries are written to take advantage of SIMD instruction sets such as MMX, SSE2, and AVX, among other performance optimizations.<br>
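<br>As a rough illustration of what SIMD buys you here, below is a minimal SSE2 sketch of the per-element operation from the question (c = a * 3000 + b). The function name scaled_add and its parameters are my own and not part of ITK, VNL, or Eigen. It assumes 16-bit wraparound on overflow is acceptable, as in the original loop, and it falls back to a plain scalar loop when SSE2 is not available.<br>

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define HAVE_SSE2 1
#endif

// Hypothetical helper: c[i] = a[i] * k + b[i], keeping only the low 16 bits
// of the result, which matches the truncation to short in the original loop.
void scaled_add(const std::int16_t* a, const std::int16_t* b,
                std::int16_t* c, std::size_t n, std::int16_t k) {
    std::size_t i = 0;
#ifdef HAVE_SSE2
    const __m128i vk = _mm_set1_epi16(k);  // broadcast k to all 8 lanes
    for (; i + 8 <= n; i += 8) {
        // Unaligned loads, so the arrays need no special alignment.
        __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        // Low 16 bits of each product, then a wrapping 16-bit add.
        __m128i vc = _mm_add_epi16(_mm_mullo_epi16(va, vk), vb);
        _mm_storeu_si128(reinterpret_cast<__m128i*>(c + i), vc);
    }
#endif
    // Scalar tail (or the whole array when SSE2 is unavailable).
    for (; i < n; ++i) {
        c[i] = static_cast<std::int16_t>(a[i] * k + b[i]);
    }
}
```

This handles 8 elements per instruction; an optimizing compiler may generate similar code from the plain loop on its own, so it is worth measuring both.<br>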


<br>But before that, make sure you are compiling your code in release mode, with optimizations enabled. 12 million multiplications and additions should take much closer to 2 ms than 20 ms.<br><br>About transferring the data to the GPU and back: you are right, it makes no sense if you only need to execute 2 operations on each element.<br>
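<br>To check how far a given build is from the target, a small timing harness helps. The sketch below is only one way to measure it: time_scaled_add is a hypothetical helper, it uses std::chrono::steady_clock for wall-clock time, and it must be compiled with optimizations enabled (e.g. /O2 or -O2) for the number to mean anything.<br>

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the average wall-clock milliseconds per pass
// of c = a * 3000 + b over n elements.
double time_scaled_add(std::size_t n, int passes) {
    // Initialized inputs, so pages are touched before timing starts.
    std::vector<std::int16_t> a(n, 1), b(n, 2), c(n, 0);

    const auto start = std::chrono::steady_clock::now();
    for (int p = 0; p < passes; ++p) {
        for (std::size_t i = 0; i < n; ++i) {
            c[i] = static_cast<std::int16_t>(a[i] * 3000 + b[i]);
        }
    }
    const auto stop = std::chrono::steady_clock::now();

    // Read one result so the optimizer is less likely to drop the loop.
    volatile std::int16_t sink = c[n / 2];
    (void)sink;

    return std::chrono::duration<double, std::milli>(stop - start).count() / passes;
}
```

For the sizes in the question, call time_scaled_add(4000 * 3000, 100). Each pass reads two 24 MB arrays and writes one, about 72 MB of memory traffic, so finishing in 2 ms would need roughly 36 GB/s of sustained bandwidth; whether that is reachable depends on the machine.<br>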


<br>HTH<br></font></font><br><div class="gmail_quote">On Mon, Oct 24, 2011 at 17:36, zlf <span dir="ltr">&lt;<a href="mailto:jxdw_zlf@yahoo.com.cn">jxdw_zlf@yahoo.com.cn</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


Hi<br>

I have two 4000x3000 matrices, say matrix A and matrix C. I want to add the<br>

two matrices. It takes me 20 ms now.<br>

<br>

But in my application, I need it to finish in 2 ms!<br>

<br>

Moreover, I cannot move the calculation to the GPU because the data transfer<br>

between main memory and the GPU is too time-consuming.<br>

<br>

Is there any way to accelerate the calculation on the CPU?<br>

<br>

#include &quot;stdafx.h&quot;<br>

#include &lt;stdio.h&gt;<br>

#include &lt;time.h&gt;<br>

#include &lt;conio.h&gt;<br>

<br>

int _tmain(int argc, _TCHAR* argv[])<br>

{<br>

        // Value-initialize so the benchmark never reads indeterminate data.<br>

        short* a = new short[4000*3000]();<br>

        short* b = new short[4000*3000]();<br>

        short* c = new short[4000*3000]();<br>

<br>

        clock_t cstart, cend;<br>

        double spend;<br>

        cstart = clock();<br>

<br>

        for(int i = 0 ; i &lt; 1000;++i){<br>

                #pragma omp parallel for<br>

                for(int x = 0 ; x &lt; 4000*3000;++x){<br>

                        //for(int y = 0 ; y &lt; 3000;++y){<br>

                        short value = a[x] * 3000 + b[x];<br>

                        c[x] = value;<br>

                        //}<br>

                }<br>

        }<br>

<br>

        cend = clock();<br>

        // Total seconds for 1000 iterations equals the average milliseconds per iteration.<br>

        spend = ((double)(cend-cstart)) / (double)CLOCKS_PER_SEC;<br>

        printf(&quot;spend: %f ms per iteration\n&quot;, spend);<br>

<br>

        return 0;<br>

}<br>

<br>

Thanks<br>

<br>

Jerry<br>

<br>

--<br>

View this message in context: <a href="http://itk-insight-users.2283740.n2.nabble.com/Fast-large-matrix-add-tp6925448p6925448.html" target="_blank">http://itk-insight-users.2283740.n2.nabble.com/Fast-large-matrix-add-tp6925448p6925448.html</a><br>


Sent from the ITK Insight Users mailing list archive at Nabble.com.<br>


</blockquote></div><br>