Proposals:GridComputing

From KitwarePublic
Jump to navigationJump to search

Development and extension of ITK to be applied to grid computing

For more information contact: Simon Warfield, Steve Pieper, Ron Kikinis

There is an increasing trend to deploy grid computing infrastructures to support computations on extremely large datasets like those found in the Visible Human Project. We therefore believe that it is important for critical aspects of the architecture of ITK revisited and refined to support the emerging standards in the grid computing community and to develop example applications to demonstrate the power of the ITK/grid combination in real-world research computing scenarios.

Task 1: Survey of existing grid infrastructure software including the Globus Toolkit and associated middleware. Needs analysis of the bottlenecks in the current ITK as applied to collaborator projects within the SPL. Creation of initial ITK on grid computing examples using task-level parallelization of large population data sets (n > 1000).

Other examples of task level parallelism include parametric search in which an algorithm is applied to a single or small number of data sets many times (n > 1000) with different parameter settings.

Task 2: Focus on optimizing the current ITK inner-loops and threading model for CPU architectures and compilers deployed as compute grid nodes. Work with computer science collaborators to identify bottlenecks that can benefit from machine-level code optimizations to accommodate issues such as cache size, special instruction sets, and memory bandwidth hierarchies in the context of a heterogeneous compute grid environment.

Cache size varies from CPU model to CPU model and can have a tremendous impact upon performance. Also, utilization of vector instruction sets such as SSE, and scheduling and branch prediction must be carefully optimized for the specific CPU model, instruction cache size, data cache size and scheduling pipeline for best performance. Compilers, such as GCC or the Intel compiler, can generate optimized code that outperforms hand-tuned code. Our intention is to investigate the potential for performance gains on certain platforms through the generation of cache-optimized, instruction-set optimized, scheduler optimized binaries. We will also test the accuracy and precision of highly optimized code to ensure correct results are obtained.

Task 3: Integrate message passing parallelism using the globus compatible MPI implementations (mpich-g2 or its successors) first to distribute ITK processing pipelines to among grid processing nodes and second to break individual algorithms across nodes.

The grid protocols have been undergoing tremendous change recently. There is no clear preferred protocol for distributing ITK processing pipelines among grid processing nodes at this time. Scheduling and access across Teragrid nodes has also been evolving. Some Teragrid sites are using tools such as PBS for managing jobs. We will focus our efforts on utilizing the protocols which are most stable and provide for the most significant benefits for the effort invested.

Task 4: Ongoing support of ITK grid tools at the task, threading, and message passing layers. Review of any newly developed grid middleware emerging during the project period and rework of ITK architecture to accommodate same.

In our initial effort we used Image Guided Neurosurgery as the clinical application driver, and optimized algorithms and performance especially suited for this application. This application requirements range from the need for highly optimized special instruction set inner loops and thread parallel execution for specific algorithms, such as intrasubject registration, to task level parallelism inherent in the statistical atlas construction that underlies robust segmentation methods. We intend to maintain these application specific optimizations as ITK evolves, while adding new application drivers that ensure the changes we make are relevant and will have significant clinical impact.

Compute platforms

We would like to get feedback about the compute platforms used in the ITK community. In particular, we would like to know what effort on which hardware platforms would have the largest impact on the community.

CPU model family, user count:

 Intel  :  1
 AMD    :  0
 SPARC  :  0