VTK/Parallel

From KitwarePublic
Revision as of 05:41, 26 November 2008 by Edice (talk | contribs)

This is a spot to start documenting the ins and outs of making VTK work in parallel at the various levels. The page has been started by someone who is new to VTK, so there are probably errors in here. Other people will hopefully assist with fleshing out this page. Formatting will be sparse for the moment; I'll just put down my thoughts to begin with.

There doesn't seem to be much solid documentation for VTK in parallel. The mailing list gets regular questions, and I personally haven't come away from reading replies knowing everything I wanted to know.


There are several levels and meanings of "Parallel":

Loop-Level (compiler extensions, very fine)

This can be achieved via the new OpenMP features of GCC, Microsoft's Visual C++ (Pro, not Express) and Intel's icc. icc can also be used with Intel's Threading Building Blocks (TBB). TBB is a C++ library rather than a compiler extension, and as far as I can tell it can be used with gcc as well.

A loop can be run in parallel via compiler extensions. The only concern is whether the data it is accessing is thread-safe.
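To make the thread-safety concern concrete, here is a minimal OpenMP sketch in plain C++ (not VTK code; the function names scale_in_place and sum_elements are made up for illustration). The first loop is safe as it stands because each iteration writes only its own element. The second loop updates a shared variable, so it needs a reduction clause to avoid a data race.

```cpp
#include <vector>

// Scale every element in place. Each iteration touches only data[i],
// so the iterations are independent and can safely run in parallel.
void scale_in_place(std::vector<double>& data, double factor)
{
  const long n = static_cast<long>(data.size());
#pragma omp parallel for
  for (long i = 0; i < n; ++i)
  {
    data[i] *= factor;
  }
}

// Sum all elements. Every iteration updates the shared variable `total`,
// so a plain "parallel for" would be a data race. The reduction clause
// gives each thread a private copy and combines the copies at the end.
double sum_elements(const std::vector<double>& data)
{
  double total = 0.0;
  const long n = static_cast<long>(data.size());
#pragma omp parallel for reduction(+ : total)
  for (long i = 0; i < n; ++i)
  {
    total += data[i];
  }
  return total;
}
```

If the compiler is not told to enable OpenMP (for example -fopenmp for GCC or /openmp for Visual C++), the pragmas are ignored and the loops simply run serially with the same result.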

http://www.openmp.org

http://en.wikipedia.org/wiki/OpenMP


Algorithm-Level (threads, MPI, fine)

The algorithm has a parallel mode of operation built-in. Well-written quote from the mailing list:

Many of VTK's imaging filters support thread level parallelism. When VTK is compiled with threading support, those filters in the Imaging kit will automatically spawn threads to take advantage of the available parallelism near to the innermost loop. When processing this type of data, threading is fairly easy to do and scales, in terms of execution time, very well.

I assume that it's up to the filter writer how this is implemented; it could also be done with MPI, sockets and processes, SOAP over web services, phone-a-friend and ask-the-audience. VTK does not provide any support for these alternate methods, but it does provide support for threading image filters via vtkThreadedImageAlgorithm.

Note that this SEEMS to be where the vtkMultiThreader slots in. It provides support for an algorithm to spawn threads and distribute its workload. It's a low-level class.
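The general pattern of an algorithm spawning threads and handing each one a share of the work can be sketched with the C++ standard library. This is deliberately NOT the vtkMultiThreader API; it uses std::thread (C++11) as a stand-in, and the names threshold, process_piece and threaded_threshold are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-pixel operation standing in for a filter's inner loop.
inline unsigned char threshold(unsigned char v)
{
  return v >= 128 ? 255 : 0;
}

// The work given to one thread: process pixels[begin, end).
void process_piece(std::vector<unsigned char>& pixels, std::size_t begin, std::size_t end)
{
  for (std::size_t i = begin; i < end; ++i)
  {
    pixels[i] = threshold(pixels[i]);
  }
}

// Split the image into contiguous pieces, run one thread per piece,
// then wait for all of them before returning.
void threaded_threshold(std::vector<unsigned char>& pixels, unsigned num_threads)
{
  num_threads = std::max(1u, num_threads);
  const std::size_t n = pixels.size();
  const std::size_t chunk = (n + num_threads - 1) / num_threads; // round up

  std::vector<std::thread> workers;
  for (unsigned t = 0; t < num_threads; ++t)
  {
    const std::size_t begin = std::min(n, t * chunk);
    const std::size_t end = std::min(n, begin + chunk);
    if (begin == end)
    {
      break; // nothing left to hand out
    }
    workers.emplace_back(process_piece, std::ref(pixels), begin, end);
  }
  for (std::thread& w : workers)
  {
    w.join();
  }
}
```

The pieces do not overlap, so no locking is needed. That is what makes image data comparatively easy to thread.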

This is also where vtkMultiProcessController and vtkMPIController slot in. These classes help to distribute and control the processes that an algorithm wants to spread its work across. The algorithm registers callbacks, remote methods, etc. and triggers remote executions.

Note that all this parallelism still follows the branch-execute-join (fork-join) pattern: the filters in the pipeline execute one after another, but the execution of each individual filter can be done in parallel.

The Streaming pipeline system complicates this a little bit, because one filter may request more information (pieces) from a source filter. Note that during the request the calling filter waits for the response, so again it's just one filter executing at once.
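A toy model of that request-and-wait behaviour, in plain C++ rather than VTK's pipeline classes (PieceSource, request_piece and streamed_sum are hypothetical names): the consumer asks the source for one piece at a time, and each request is an ordinary blocking call, so the two never run at the same moment.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical source that can produce any one of num_pieces slices
// of the sequence 0, 1, ..., total - 1 on request.
struct PieceSource
{
  std::size_t total;

  std::vector<int> request_piece(std::size_t piece, std::size_t num_pieces) const
  {
    const std::size_t begin = total * piece / num_pieces;
    const std::size_t end = total * (piece + 1) / num_pieces;
    std::vector<int> out;
    for (std::size_t i = begin; i < end; ++i)
    {
      out.push_back(static_cast<int>(i));
    }
    return out;
  }
};

// Hypothetical consumer that pulls the data through piece by piece,
// so only one piece is held in memory at a time.
long streamed_sum(const PieceSource& source, std::size_t num_pieces)
{
  long total = 0;
  for (std::size_t p = 0; p < num_pieces; ++p)
  {
    // This call does not return until the source has produced the piece:
    // the consumer is idle while the source works, and vice versa.
    const std::vector<int> piece = source.request_piece(p, num_pieces);
    for (int v : piece)
    {
      total += v;
    }
  }
  return total;
}
```

Streaming in this form reduces peak memory use; it does not by itself make two filters run concurrently.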


Filter/Task-Level (processes, MPI, coarse)

If you want to execute more than one filter at a time, you basically have to create two independent pipelines and merge the results together at the end via a vtkParallelRenderManager (e.g. vtkCompositeRenderManager) or similar.

Each pipeline has to operate in a separate process (a thread is not enough), with the processes communicating via sockets or MPI.

The data must be partitioned between the two pipelines (doing this efficiently is complicated).

Render-compositing is not straightforward either, as you have to consider object ordering, or where the data will be displayed, and volume rendering in parallel may not be possible at all.

Note that the rendering is no longer done directly to the display; it is often rendered to memory (i.e. images), then composited and only then displayed. How much hardware acceleration is possible, I don't know. I see a lot of mentions of using the Mesa library (OpenGL software rendering).
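One common way to merge images rendered in memory by separate processes is depth compositing: each process keeps a depth value alongside each pixel, and the merge keeps whichever fragment is closer to the viewer. The sketch below is plain C++ under that assumption, not the code of any VTK class; Framebuffer and depth_composite are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// An off-screen render result: one color and one depth value per pixel.
struct Framebuffer
{
  std::vector<std::uint32_t> color; // packed RGBA
  std::vector<float> depth;         // smaller value = closer to the viewer
};

// Merge `other` into `target` pixel by pixel, keeping the closer fragment.
// Assumes both buffers have the same number of pixels.
void depth_composite(Framebuffer& target, const Framebuffer& other)
{
  const std::size_t n = target.color.size();
  for (std::size_t i = 0; i < n; ++i)
  {
    if (other.depth[i] < target.depth[i])
    {
      target.color[i] = other.color[i];
      target.depth[i] = other.depth[i];
    }
  }
}
```

This only gives the right picture for opaque geometry. Translucent surfaces and volumes have to be blended in a consistent front-to-back or back-to-front order, which is where the ordering concerns above come from.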


Why can't it be threaded?