[Insight-users] ITK+CUDA libraries

Wed May 26 14:35:23 EDT 2010

Badri Roysam <roysam at ...> writes:

> 
> I am interested to hear people's opinion of OpenCL. Do you have to write
> 2 versions of code (for nVIDIA & ATI), or does a single piece of code
> cross-compile for the 2 platforms?
> 

You can write one piece of code and it will work on both platforms, but you
have to have the OpenCL driver installed for whatever hardware you have.  The
hardware-specific code, called 'kernels', is compiled at run time by the
driver by default, but you can also pre-compile it (the resulting binary is
then no longer hardware independent, though).
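
To make the run-time compilation step concrete, here is a minimal sketch.
It assumes the PyOpenCL and NumPy packages plus a working OpenCL driver; that
is my choice for brevity, not what the ITK experiment below uses.  The kernel
name 'scale' and the helper 'run_scale' are made up for illustration.

```python
# Kernel source is plain OpenCL C held in a string; the same text is
# handed to whichever vendor driver is installed.
KERNEL_SRC = """
__kernel void scale(__global float *data, const float factor)
{
    size_t i = get_global_id(0);
    data[i] = data[i] * factor;
}
"""


def run_scale(values, factor):
    """Multiply every element of `values` by `factor` on an OpenCL device."""
    # Imports are deferred so this file can be loaded on a machine
    # without OpenCL; calling the function still requires it.
    import numpy as np
    import pyopencl as cl

    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(ctx)

    # The driver compiles the kernel for the selected device here, at run time.
    program = cl.Program(ctx, KERNEL_SRC).build()

    host = np.asarray(values, dtype=np.float32).copy()
    mf = cl.mem_flags
    buf = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=host)

    # One work-item per array element.
    program.scale(queue, host.shape, None, buf, np.float32(factor))

    # Copy the result back from the device into the host array.
    cl.enqueue_copy(queue, host, buf)
    return host
```

On a machine with a working driver, run_scale([1, 2, 3], 2.0) should return
the values doubled, whether the device is from nVIDIA or ATI.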

I did an experiment with ITK + OpenCL by writing an alternate FFT
implementation.  The result was not so good: slower than the pure CPU
version :P  I think the problem lies in transferring image buffers to and
from the card; if you have to do that for every filter, transfer time becomes
the bottleneck.  For OpenCL to be effective, the buffer probably has to stay
on the GPU for the entire pipeline.
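
A back-of-the-envelope model of that argument, as a sketch only.  The
function name and all the timings are hypothetical, not measurements from
the experiment; the point is just how the transfer count scales.

```python
def pipeline_time_ms(n_filters, compute_ms, transfer_ms, resident):
    """Estimated total time for a chain of GPU filters, in milliseconds.

    resident=False: the image is uploaded before and downloaded after
                    every filter (2 transfers per filter).
    resident=True:  the image is uploaded once, stays on the GPU for all
                    filters, and is downloaded once (2 transfers total).
    """
    transfers = 2 if resident else 2 * n_filters
    return n_filters * compute_ms + transfers * transfer_ms


# Hypothetical numbers: 5 filters, 2 ms of GPU compute each,
# 10 ms per one-way host<->device transfer.
print(pipeline_time_ms(5, 2, 10, resident=False))  # 110
print(pipeline_time_ms(5, 2, 10, resident=True))   # 30
```

With these made-up numbers the per-filter transfers account for 100 of the
110 ms, which is the kind of overhead that can make the GPU version lose to
the CPU.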

The article is here:
http://hdl.handle.net/10380/3159
The code is here:
http://gitorious.org/ultrasound-b-mode/itk-fft-extensions
http://gitorious.org/ultrasound-b-mode/opencl_fft