Difference between revisions of "ParaView/Line Integral Convolution"
Revision as of 18:25, 16 December 2013
- 1 Introduction
- 2 ParaView Surface LIC Plugin
- 3 OpenGL requirements
- 4 Internal Pipeline
- 5 Noise generator
- 6 Two pass image LIC
- 7 Scalar color shaders
- 8 Contrast enhancement
- 9 Masking invalid data
- 10 Integrator Normalization
- 11 Optimizations for interactivity
- 12 Parallelization
- 13 Acknowledgment
Introduction
The line integral convolution (LIC) vector field visualization technique convolves noise with a vector field, producing streaking patterns that follow vector field tangents. The technique was originally developed for 2D image-based data but has since been extended to work on arbitrary surfaces and volumes. ParaView supports LIC on arbitrary surfaces via the Surface LIC plugin. Some examples of ParaView's surface LIC in action are shown in Figure 1. ParaView's implementation has been designed to work with composite data in parallel and includes a number of customizations that facilitate interactive data exploration. This document describes the parallelization, features, and optimizations for interactive data exploration that were released with ParaView 4.1 and VTK 6.1.
ParaView Surface LIC Plugin
Surface LIC may be used in ParaView by loading the Surface LIC plugin. To load the plugin, open the Tools menu, select Manage Plugins, select the Surface LIC plugin, and click Load on both client and server. For more information on ParaView plugins see the plugin how-to. Once the plugin is loaded, surface LIC is activated by selecting it from the drop-down list of Representations. Once active, the algorithm's run-time tunable parameters are accessed from the Properties panel in the ParaView client GUI. Clicking the Properties panel's Advanced button displays all of the available parameters. The following table and figure give a basic description of the parameters with links to more information.
|Select Input Vectors||This is used to select the vector field.|
|Number Of Steps||Number of integration steps used in the first LIC pass. When the two pass LIC is enabled, the number of steps used in the second pass is set automatically.|
|Step Size||Integration step size given in the original vector field's units.|
|Normalize Vectors||Enable vector field normalization during integration.|
|Two pass image LIC||Enable the two-pass image LIC algorithm.|
|Color Mode||Select the shader that is used to combine lit and pseudocolored surface geometry with the LIC. (Blend, Multiply)|
|LIC Intensity||Sets the intensity for LIC pattern when using Blend shader.|
|Map Mode Bias||Sets the additive term used to brighten or darken the final colors when using the Multiply shader.|
|Enhance Contrast||Enable contrast enhancement shaders (Off, LIC Only, LIC and Color, Color Only)|
|Low LIC Contrast Enhancement Factor||Adjusts the minimum intensity value realized in the output of the image LIC contrast enhancement. This can be used to make the dark portions of LIC streaks darker when more contrast is needed.|
|High LIC Contrast Enhancement Factor||Adjusts the maximum intensity value realized in the output of the image LIC contrast enhancement. This can be used to make light portions of LIC streaks lighter when more contrast is needed or when pseudocoloring is dark.|
|Low Color Contrast Enhancement Factor||Adjusts the minimum lightness value realized in the output of the surface LIC painter color contrast enhancement. Can be used to darken low intensity colors when more contrast is needed.|
|High Color Contrast Enhancement Factor||Adjusts the maximum lightness value realized in the output of the surface LIC painter color contrast enhancement. Can be used to brighten high intensity colors when pseudocoloring is dark.|
|AntiAlias||Sets the number of times the image LIC antialiasing stage is applied. 0 disables the antialiasing stage.|
|Mask On Surface||Use the magnitude of the surface projected vectors for the masking test. If the vector field has an out-of-surface component disabling Mask On Surface will result in masked fragments matching the pseudocoloring.|
|Mask Threshold||Vector magnitude below which fragments are masked. If set less than zero masking is disabled.|
|Mask Intensity||The fraction of Mask Color to blend with lit and scalar colored surface geometry in place of the LIC where the threshold criterion is satisfied.|
|Mask Color||An RGB tuple defining the color to use when masking fragments.|
|Noise Type||Select the noise distribution or type. (uniform, Gaussian, Perlin).|
|Noise Texture Size||Set the dimension of one side of the square noise texture. For example, a setting of 128 results in a 128 × 128 pixel noise texture. Large values may negatively impact performance.|
|Noise Grain Size||Set the dimension of one side of a texture noise element. For example, if set to 2 then each noise element takes up 2 × 2 pixels in the noise texture.|
|Min Noise Value||Set the minimum gray scale intensity value in the generated noise.|
|Max Noise Value||Set the maximum gray scale intensity value in the generated noise.|
|Number Of Noise Levels||Set the number of discrete gray scale intensity values in the generated noise.|
|Impulse Noise Probability||Set the probability that any given noise element in the generated texture will be filled in by the noise generator. A setting of 1 results in all elements filled while settings less than 1 produce impulse noise where elements not filled take on a background color.|
|Impulse Noise Background Value||Set the gray scale intensity to be used when a noise element is not filled when generating impulse noise.|
|Noise Generator Seed||A seed value for the random number generator.|
|Composite Strategy||Select the compositing strategy used for parallel operations (INPLACE, DISJOINT, BALANCED, AUTO). The default, AUTO, will select a strategy based on the input data and view parameters.|
|Use LIC For LOD||When enabled LIC is computed on LOD during interaction. This can negatively impact performance.|
OpenGL requirements
ParaView implements surface LIC in OpenGL using a framebuffer ping-pong technique from early GPGPU computing. The required OpenGL extensions are listed in the following table. These requirements are satisfied by most OpenGL 2.1 implementations and by all OpenGL 3.0 and newer implementations. The surface LIC painter may be used without graphics hardware by using the llvmpipe OSMesa state tracker found in Mesa3D OpenGL v9.2 and newer. See ParaView and Mesa3D for information on configuring ParaView for use with OSMesa. The OpenGL support on a given system may be checked using either ParaView's or VTK's regression tests. ParaView's LIC tests can be executed by issuing the command "ctest -R SurfaceLIC -L PARAVIEW --verbose", while VTK's LIC tests can be executed by issuing the command "ctest -R LIC --verbose", each from its respective build directory. The output of these tests indicates whether or not the system supports the required extensions.
|GL_ARB_vertex_buffer_object, GL_ARB_pixel_buffer_object, GL_ARB_depth_buffer_float, GL_ARB_multitexture, GL_EXT_texture3D, GL_ARB_texture_non_power_of_two, GL_ARB_texture_float, GL_ARB_depth_texture, GL_ARB_draw_buffers, GL_EXT_framebuffer_object, GL_EXT_framebuffer_blit, GL_ARB_shading_language_100, GL_ARB_shader_objects, GL_ARB_vertex_shader, GL_ARB_fragment_shader|
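As a rough illustration of checking the table above by hand, the following sketch compares a list of available extensions against the required ones. It assumes the available extensions have already been obtained as a space-separated string (for example from `glGetString(GL_EXTENSIONS)` in a compatibility context); the function name `missing_extensions` and the sample string are hypothetical. It does not account for functionality that newer OpenGL versions provide in core without advertising the extension name, so the regression tests remain the authoritative check.

```python
# Required extensions, copied from the table above.
REQUIRED_EXTENSIONS = [
    "GL_ARB_vertex_buffer_object", "GL_ARB_pixel_buffer_object",
    "GL_ARB_depth_buffer_float", "GL_ARB_multitexture", "GL_EXT_texture3D",
    "GL_ARB_texture_non_power_of_two", "GL_ARB_texture_float",
    "GL_ARB_depth_texture", "GL_ARB_draw_buffers", "GL_EXT_framebuffer_object",
    "GL_EXT_framebuffer_blit", "GL_ARB_shading_language_100",
    "GL_ARB_shader_objects", "GL_ARB_vertex_shader", "GL_ARB_fragment_shader",
]


def missing_extensions(available):
    """Return the required extensions absent from `available`, in table order.

    `available` may be a space-separated string or any iterable of names.
    """
    have = set(available.split()) if isinstance(available, str) else set(available)
    return [ext for ext in REQUIRED_EXTENSIONS if ext not in have]


# Hypothetical extension string that lacks the last required extension.
sample = " ".join(REQUIRED_EXTENSIONS[:-1])
missing = missing_extensions(sample)
print(missing)
```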
Internal Pipeline
The surface LIC algorithm projects vectors defined on an arbitrary surface onto the surface and then from physical space into screen space, where an image LIC is computed. During the projection phase, lit pseudocolored surface geometry is rendered into a texture for later use. When running in parallel, moving the vector field into screen space necessitates a compositing step that makes the screen space vector field consistent in regions of inter-process screen space overlap, and the addition of guard pixel halos that ensure consistent results at process boundaries. Once the image space LIC has been computed, it is combined with the previously rendered lit pseudocolored surface geometry and copied into the back buffer with a depth test. A schematic of the algorithm is presented in figure 3a. Optional processing stages are shaded gray, cached textures are represented by red parallelograms, and green double arrows indicate inter-process communication that occurs only during parallel operation. On the right half of the figure, a break-out diagram details the processing stages used in our image LIC algorithm.
Noise generator
For a given vector dataset, the streaking patterns produced by the LIC can vary widely based on the properties of the noise convolved, the screen resolution, and scene or view parameters. The properties of the noise play an important role in determining the characteristics of the streaking patterns and can be varied easily, giving us a simple means of controlling the patterns realized. For example, contrast in the streaking patterns is strongly influenced by the selection of noise distribution and by its minimum, maximum, and number of noise levels, while the width of the streaks produced is strongly tied to the noise grain size. Additionally, the number of light and dark pixels in the result can be controlled by varying the impulse probability along with the choice of background intensity. Figure 4 shows an example of how varying noise texture parameters can result in markedly different streaking patterns. ParaView's noise texture generator exposes 9 run-time tunable degrees of freedom (noise type, texture size, grain size, minimum and maximum noise value, number of noise levels, impulse noise probability, impulse noise background value, and generator seed), which together can be used to modify the streaking pattern, dynamic range, and contrast in the resulting LIC.
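To make the roles of these parameters concrete, here is a minimal sketch of an impulse noise texture generator in pure Python. It is not ParaView's generator: it covers only the uniform distribution, the function name `make_noise_texture` and its argument names are hypothetical, and the way levels are spaced between the minimum and maximum is an assumption.

```python
import random


def make_noise_texture(size=8, grain=2, lo=0.0, hi=1.0, levels=4,
                       impulse_prob=1.0, background=0.0, seed=1):
    """Return a size x size list of gray values built from grain x grain elements."""
    rng = random.Random(seed)
    n_elems = (size + grain - 1) // grain  # noise elements along one side
    elems = []
    for _ in range(n_elems):
        row = []
        for _ in range(n_elems):
            if rng.random() < impulse_prob:
                # Pick one of `levels` evenly spaced values in [lo, hi] (assumed spacing).
                k = rng.randrange(levels)
                frac = k / (levels - 1) if levels > 1 else 0.5
                row.append(lo + frac * (hi - lo))
            else:
                # Element not filled: it takes the impulse background value.
                row.append(background)
        elems.append(row)
    # Expand each element to a grain x grain block of pixels.
    return [[elems[y // grain][x // grain] for x in range(size)]
            for y in range(size)]


tex = make_noise_texture(size=8, grain=2, levels=4, seed=7)
sparse = make_noise_texture(impulse_prob=0.0, background=0.25)
```

Larger `grain` values give wider blocks and hence wider streaks after convolution, while lowering `impulse_prob` increases the share of pixels that take the background value.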
Two pass image LIC
ParaView implements a two-pass LIC computation. Enabling the two-pass computation activates an algorithm sub-pipeline that applies a number of image processing filters which can greatly improve the visibility of the LIC's streaking patterns. In the first pass a traditional LIC is computed. Image processing filters, consisting of an optional contrast enhancement (CE) filter and a Laplace edge-enhance (EE) filter, are then applied, strengthening the streaking patterns. In the second pass the LIC is recomputed using the output of the image processing filters in place of a noise texture. Integration in the second pass uses a fraction of the number of integration steps, to account for the fact that the filtered first pass output is relatively smooth. The image processing and LIC stages that comprise the two-pass algorithm are shown in Figure 3b. Figure 6 shows the progression as the streaking patterns form and are strengthened by each successive stage in the two-pass image LIC computation.
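The two-pass structure can be sketched on the CPU as follows. This is only an illustration under several assumptions: a box-filter LIC with Euler integration and nearest-pixel sampling stands in for the GPU implementation, the edge enhancement is a plain 4-neighbour Laplacian sharpen, and the second-pass step count is passed in by the caller as the hypothetical argument `second_pass_steps` because ParaView sets it automatically. All function names are hypothetical.

```python
import math


def lic(vx, vy, noise, steps, step_size, normalize=True):
    """Average `noise` along the streamline through each pixel, both directions."""
    h, w = len(noise), len(noise[0])
    out = [[0.0] * w for _ in range(h)]
    for j in range(h):
        for i in range(w):
            total, count = noise[j][i], 1
            for direction in (1.0, -1.0):
                x, y = i + 0.5, j + 0.5  # start at the pixel center
                for _ in range(steps):
                    u, v = vx[int(y)][int(x)], vy[int(y)][int(x)]
                    mag = math.hypot(u, v)
                    if mag == 0.0:
                        break
                    if normalize:
                        u, v = u / mag, v / mag
                    x += direction * step_size * u
                    y += direction * step_size * v
                    if not (0.0 <= x < w and 0.0 <= y < h):
                        break
                    total += noise[int(y)][int(x)]
                    count += 1
            out[j][i] = total / count
    return out


def stretch(img):
    """Contrast enhancement: map the current min and max onto 0 and 1."""
    lo = min(min(r) for r in img)
    hi = max(max(r) for r in img)
    if hi == lo:
        return [row[:] for row in img]
    return [[(p - lo) / (hi - lo) for p in row] for row in img]


def edge_enhance(img):
    """Sharpen by subtracting the 4-neighbour Laplacian, clamped to [0, 1]."""
    h, w = len(img), len(img[0])

    def at(y, x):
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            lap = at(y - 1, x) + at(y + 1, x) + at(y, x - 1) + at(y, x + 1) - 4.0 * img[y][x]
            out[y][x] = min(1.0, max(0.0, img[y][x] - lap))
    return out


def two_pass_lic(vx, vy, noise, steps, step_size, second_pass_steps):
    first = lic(vx, vy, noise, steps, step_size)
    filtered = edge_enhance(stretch(first))  # CE then EE
    return lic(vx, vy, filtered, second_pass_steps, step_size)


w, h = 8, 4
vx = [[1.0] * w for _ in range(h)]  # uniform flow in +x
vy = [[0.0] * w for _ in range(h)]
noise = [[float((i + j) % 2) for i in range(w)] for j in range(h)]
first_pass = lic(vx, vy, noise, steps=3, step_size=1.0)
result = two_pass_lic(vx, vy, noise, steps=3, step_size=1.0, second_pass_steps=2)
```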
Scalar color shaders
ParaView provides two shaders for combining LIC with lit pseudocolored surface geometry: a multiplicative, or mapping, shader and an additive, or blending, shader. By default the blending shader is used. However, the determination of which shader is right for a given situation is best made interactively at run time, as the results depend on a number of factors such as the surface curvature, lighting and view parameters, vector field, and choice of pseudocolor lookup table. When ParaView's contrast enhancement stages and noise texture generator are used to produce a LIC with high-contrast streak patterns and the right balance of light intensity values, these shaders are highly effective and produce compelling visualizations. For comparison of the two shaders, figure 7 shows a matrix of renderings in which the top row was rendered without pseudocoloring, the bottom row with pseudocoloring, the left column with the blending shader, and the right column with the mapping shader. These samples show some of the differences between the two shaders.
Mapping colors onto the LIC
The mapping fragment shader is described by the following equation:
$$c_{ij} = (L_{ij} + f)\, S_{ij}$$
where the indices $i, j$ identify a specific fragment, $c_{ij}$ is the final RGB color, $L_{ij}$ is the LIC gray scale intensity, $S_{ij}$ is the scalar RGB color, and $f$ is a biasing parameter, typically 0, that may be used for fine tuning. When $f = 0$, the typical case, scalar colors are transferred unchanged to the final image where the LIC is 1 and are scaled linearly where the LIC gray scale intensity is less than 1, down to 0, where the final color is black. The bias parameter may be set to small positive or negative values between -1 and 1 to increase or decrease LIC values uniformly, resulting in brighter or darker images. When $f \neq 0$, final fragment colors $c_{ij}$ are clamped such that $0 \le c_{ij} \le 1$.
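A per-fragment sketch of the mapping shader, written from the description above: the scalar color is scaled by the LIC intensity plus the bias and the result is clamped to the range 0 to 1. The formula as coded is inferred from that description, and the names `map_color`, `lic`, `scalar_rgb`, and `bias` are hypothetical.

```python
def clamp01(x):
    """Clamp a value to the range [0, 1]."""
    return min(1.0, max(0.0, x))


def map_color(lic, scalar_rgb, bias=0.0):
    """Scale the scalar RGB color by the biased LIC gray scale intensity."""
    return tuple(clamp01((lic + bias) * s) for s in scalar_rgb)


full = map_color(1.0, (0.2, 0.4, 0.6))               # LIC is 1: color passes through
black = map_color(0.0, (0.2, 0.4, 0.6))              # LIC is 0: black
bright = map_color(0.9, (1.0, 1.0, 1.0), bias=0.5)   # clamped at 1
```

Because the output can never exceed the scalar color times the largest LIC intensity, a LIC with few bright pixels yields a dark image, which is the motivation for the contrast enhancement stages.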
With the mapping approach, the distribution of intensity values in the LIC directly affects the accuracy and intensity with which scalar colors and lighting effects are transferred to the final rendering. With this shader the RGB values produced will be less than or equal to the maximum intensity values in the image LIC. The greater the number of pixels with an intensity close to 1, the more accurately and brightly the lit pseudocolored surface geometry is rendered. Of course, the need for many high intensity pixels must be balanced against the need for a sufficient number of highly contrasting pixels, where the value is closer to 0, in order to accurately represent the LIC pattern itself. ParaView's noise generator and contrast enhancement stages provide a number of features that aid in interactively achieving this balance.
Blending colors with the LIC
The blending fragment shader is described by the following equation:
$$c_{ij} = L_{ij}\, I + S_{ij}\,(1 - I)$$
where the indices $i, j$ identify a specific fragment, $c_{ij}$ is the final RGB color, $L_{ij}$ is the LIC gray scale value, $S_{ij}$ is the scalar RGB color, and $I$ is a constant ranging from 0 to 1, with a default of 0.8. Decreasing $I$ to obtain brighter colors has the effect of diminishing the intensity of the LIC, while increasing $I$ to obtain a stronger LIC has the effect of washing out colors.
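The blending shader can be sketched per fragment as a linear interpolation between the LIC gray value and the scalar color. The formula as coded is inferred from the description above, and the names `blend_color`, `lic`, `scalar_rgb`, and `intensity` are hypothetical.

```python
def blend_color(lic, scalar_rgb, intensity=0.8):
    """Blend the LIC gray value with the scalar RGB color.

    intensity = 0 returns the scalar color unchanged;
    intensity = 1 returns the LIC gray value in every channel.
    """
    return tuple(lic * intensity + s * (1.0 - intensity) for s in scalar_rgb)


color_only = blend_color(0.3, (0.25, 0.5, 0.75), intensity=0.0)
lic_only = blend_color(0.3, (0.25, 0.5, 0.75), intensity=1.0)
default = blend_color(1.0, (0.0, 0.0, 0.0))
```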
In some cases, when colors are bright, the LIC is difficult to see and attaining a usable result requires sacrificing both visibility of the LIC and brightness of the pseudocoloring. The blending shader, like the mapping shader, benefits from high-contrast streaking patterns and the right balance of high intensity pixels in the LIC in order to preserve streaking patterns and pseudocoloring in the final rendering. Note that although it inherently decreases the visibility of features in both the scalar coloring and the image LIC, the blending approach can be especially useful with curved surfaces and pronounced lighting effects, and also when the scalar color map is very intense.
Contrast enhancement
The convolution process inherently tends to decrease both the contrast and the dynamic range in the LIC, narrowing and concentrating the distribution of resulting intensity values around a mid tone. The use of Gaussian noise during LIC computation produces relatively smooth and pixelation-free streaking but tends to worsen this narrowing effect, since the input intensities are already highly concentrated about a central intensity value. An example of the narrowing and concentration resulting from the convolution can be seen in the top row of figure 8a, where the output of the first LIC pass is shown with its intensity distribution on the right. This narrowing and concentration of intensities in the LIC can result in an overall dark and dull image, making the combination with lit pseudocolored surface geometry difficult. An example of a dark and dull result can be seen in the left panel of figure 9. In order to counteract the narrowing and darkening effects of the convolution, three optional contrast enhancement (CE) stages have been added, one after each LIC stage and one after the combination of scalar colors and LIC. The new stages increase both dynamic range and contrast, improve the streaking patterns, and facilitate the combination of LIC with pre-rendered lit pseudocolored surface geometry.
Figure 8 shows the input and output images of each of the contrast enhancement stages alongside their respective intensity distributions. The intensity distributions for input images include vertical lines indicating the min and max values. The contrast enhancement algorithm works by stretching the input distribution so that in the output distribution the min is 0 and the max is 1, increasing the contrast and dynamic range in the final rendered image. An example of how the CE stages can be used to improve the efficacy of the scalar color shaders is shown in the right panel of figure 9.
Figure 8 CE stage examples. Vertical lines plotted in the input distribution show min and max values. The CE stages map the input distributions onto the range 0 to 1. Use of the CE adjustment factors results in an accumulation of output values that are exactly 0 or 1. Figure 8a Input and output of first image LIC CE stage with histogram. Figure 8b Input and output of the second image LIC CE stage. Figure 8c Input and output of the surface LIC painter color CE stage.
Image LIC CE stages
The image LIC CE stages are implemented by histogram stretching of the gray scale intensities as follows:
$$L_{ij} \leftarrow \frac{L_{ij} - m}{M - m}$$
where the indices $i, j$ identify a specific fragment, $L_{ij}$ is the fragment's gray scale intensity, $m$ is the intensity to map to 0, and $M$ is the intensity to map to 1. In the first CE stage, which is applied to the input of the EE stage, $m$ and $M$ are always set to the current minimum and maximum gray scale intensity over all fragments. However, in the final CE stage $m$ and $M$ may be individually adjusted using the following set of equations:
$$m = \min(L) + F_m\,(\max(L) - \min(L))$$
$$M = \max(L) - F_M\,(\max(L) - \min(L))$$
where $L$ is the set of gray scale intensities in the input image and $F_m$ and $F_M$ are adjustment factors that take on values between 0 and 1. Setting these factors to 0 maps the current minimum and maximum intensity onto 0 and 1 respectively, stretching the values in between to fill the entire range and so increasing the dynamic range. Increasing $F_m$ raises $m$, so the lower portion of the input's intensity distribution maps onto values less than 0; this increases the number of low intensity values in the output, which darkens the darker parts of the LIC streaks. Increasing $F_M$ lowers $M$, so the upper portion of the input intensity distribution maps onto values greater than 1; this increases the number of high intensity values in the output, which lightens the lighter parts of the LIC streaks.
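A small sketch of histogram stretching with the two adjustment factors, operating on a flat list of gray scale intensities. The formulas for the two mapped intensities are inferred from the description above (factors of 0 select the current minimum and maximum, and larger factors push more of the distribution past 0 and 1, where it is clamped). The function and argument names are hypothetical.

```python
def enhance_contrast(values, low_factor=0.0, high_factor=0.0):
    """Stretch gray scale intensities so that `lo_map` -> 0 and `hi_map` -> 1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    lo_map = lo + low_factor * span    # intensity mapped to 0
    hi_map = hi - high_factor * span   # intensity mapped to 1
    if hi_map <= lo_map:
        raise ValueError("adjustment factors leave no range to stretch")
    scale = hi_map - lo_map
    # Values mapped below 0 or above 1 are clamped, so they pile up at 0 and 1.
    return [min(1.0, max(0.0, (v - lo_map) / scale)) for v in values]


samples = [0.25, 0.375, 0.5, 0.625, 0.75]
plain = enhance_contrast(samples)
adjusted = enhance_contrast(samples, low_factor=0.25, high_factor=0.25)
print(plain)     # [0.0, 0.25, 0.5, 0.75, 1.0]
print(adjusted)  # [0.0, 0.0, 0.5, 1.0, 1.0]
```

The second result shows the accumulation of values at exactly 0 and 1 that the adjustment factors produce.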
The adjustment factors provide a means to increase the contrast of LIC streaks and improve the efficacy of the scalar color shaders, which rely on a good balance between light and dark LIC intensities in order to successfully represent both lit scalar pseudocolored surface geometry and LIC in the same rendering. They become especially useful when the minimum and/or maximum intensity values in the CE stage input are unrepresentative of the LIC as a whole. For example, this can occur when, near the dataset boundary or in regions of stagnant flow, input noise values are convolved less than in the majority of the LIC. These relatively unconvolved fragments have unrepresentative low and high intensity values, reducing the efficacy of the CE stage. The adjustment factors can be used to correct this so that more representative values are mapped to 0 and 1. Figure 8b shows an example where the adjustment factors have been used. The effects of the adjustment factors can be seen in the output distribution's accumulation of values that are exactly 0 and 1, which is the result of clamping values mapped above 1 and below 0.
Occasionally the use of the image LIC CE stages results in jagged or pixelated streaking patterns in the output of the second LIC pass. This pixelation is a result of oversaturation occurring during the processing of sharp intensity transitions by the EE stage. The level of pixelation introduced depends on a number of factors such as the vector data, the properties of the noise texture, the number of integration steps taken, and the min and max CE factors used. Pixelation can be reduced by increasing the number of integration steps or by enabling the optional anti-aliasing (AA) stage. The AA stage, when enabled, is applied to the input of the final LIC CE stage. Applying the AA stage before the final CE stage removes pixelation while still allowing the final CE stage to boost contrast in preparation for the combination of the LIC and lit pseudocolored surface geometry by the scalar color shaders. This helps to ensure bright scalar colors in the final image. An example showing the use of the AA stage is given in figure 6.
Painter CE stage
After the combination of lit pseudocolored surface geometry and LIC, an optional color contrast enhancement (CCE) stage may be applied. This may be used to increase the contrast in the LIC's streaking patterns and brighten pseudocolors. The CCE stage is implemented using histogram stretching on each fragment's lightness in the HSL color space:
$$\ell_{ij} \leftarrow \frac{\ell_{ij} - m}{M - m}$$
where the indices $i, j$ identify a specific fragment, $\ell_{ij}$ is the fragment's lightness in HSL space, $m$ is the lightness to map to 0, and $M$ is the lightness to map to 1. By default $m$ and $M$ take on the minimum and maximum lightness over all fragments, but they may be individually adjusted by the following set of equations:
$$m = \min(\ell) + F_m\,(\max(\ell) - \min(\ell))$$
$$M = \max(\ell) - F_M\,(\max(\ell) - \min(\ell))$$
where $\ell$ is the set of fragment lightness values and $F_m$ and $F_M$ are adjustment factors that take on values between 0 and 1. When $F_m$ and $F_M$ are 0, the current minimum and maximum lightness values found in the stage's input are used. Increasing $F_m$ raises $m$, stretching the lower tail of the distribution across values less than 0; this increases the number of dark colors in the output, darkening the darker colors in the image. Increasing $F_M$ lowers $M$, mapping the distribution's upper tail onto values greater than 1; this increases the number of highlight colors in the output, intensifying the brighter colors. Because the lightness channel is clamped to the range 0 to 1, increasing the adjustment factors too much leads to oversaturation in the resulting image. Figure 9, which compares the result with (right) and without (left) the CE stage, shows an example of the improvement that may be attained using the CCE stage.
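The same stretching applied to HSL lightness can be sketched with Python's standard `colorsys` module, which calls the color space HLS and orders the channels hue, lightness, saturation. This is an illustration only: the function name `enhance_color_contrast` and its arguments are hypothetical, and the formulas for the mapped lightness values are inferred from the description above.

```python
import colorsys


def enhance_color_contrast(pixels, low_factor=0.0, high_factor=0.0):
    """Stretch the lightness of RGB pixels, leaving hue and saturation alone."""
    hls = [colorsys.rgb_to_hls(r, g, b) for r, g, b in pixels]
    lightness = [l for _, l, _ in hls]
    lo, hi = min(lightness), max(lightness)
    span = hi - lo
    lo_map = lo + low_factor * span    # lightness mapped to 0
    hi_map = hi - high_factor * span   # lightness mapped to 1
    if hi_map <= lo_map:
        return list(pixels)            # nothing to stretch
    out = []
    for h, l, s in hls:
        stretched = min(1.0, max(0.0, (l - lo_map) / (hi_map - lo_map)))
        out.append(colorsys.hls_to_rgb(h, stretched, s))
    return out


grays = [(0.25, 0.25, 0.25), (0.5, 0.5, 0.5), (0.75, 0.75, 0.75)]
enhanced = enhance_color_contrast(grays)
print(enhanced)  # [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (1.0, 1.0, 1.0)]
```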
Masking invalid data
Masking is a technique whereby a specialized shader handles combining LIC fragments with lit pseudocolored surface geometry where the vector magnitude is below a user-provided threshold. This provides control over how fragments near regions of stagnant flow are handled. When integrating without normalization, the convolution does not smooth out noise in these regions as much as it does where the flow is strong, and these relatively unconvolved noise values can become outliers in the LIC intensity distribution, reducing the efficacy of the contrast enhancement stage and potentially disrupting the visualization. When masking is enabled these unconvolved noise values are discarded. In their place, lit pseudocolored geometry is blended with a masking color, resulting in a visually harmonious match with the pseudocolored LIC intensity across the entire surface.
Fragments are masked according to the following equation:
$$c_{ij} = C_{\text{mask}}\, I_{\text{mask}} + S_{ij}\,(1 - I_{\text{mask}})$$
where the indices $i, j$ identify a specific fragment, $c_{ij}$ is the final RGB color, $C_{\text{mask}}$ is the RGB mask color, $S_{ij}$ is the scalar RGB color, and $I_{\text{mask}}$ is the mask color intensity. This gives control over the masking process so that:
- by setting the mask threshold less than the smallest vector magnitude, unconvolved noise is rendered directly.
- by setting a unique mask color and a mask intensity greater than 0, masked fragments are highlighted.
- by setting the mask intensity to 0, masked fragments are replaced by lit pseudocolored surface geometry at full intensity.
- by setting the mask intensity greater than 0, masked fragments are blended with a masking color that harmoniously matches the intensity of the nearby LIC.
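The masking behaviour listed above can be sketched per fragment as follows. The blend used for masked fragments is inferred from the description (the mask intensity is the fraction of mask color mixed with the scalar color), and all names are hypothetical; `shaded_rgb` stands for the fragment color already produced by the mapping or blending shader.

```python
def shade_fragment(vec_mag, shaded_rgb, scalar_rgb,
                   mask_threshold, mask_rgb, mask_intensity):
    """Return the mask blend below the threshold, the shaded LIC color otherwise."""
    if vec_mag < mask_threshold:
        return tuple(m * mask_intensity + s * (1.0 - mask_intensity)
                     for m, s in zip(mask_rgb, scalar_rgb))
    return shaded_rgb


scalar = (0.25, 0.5, 0.75)
shaded = (0.1, 0.2, 0.3)
gray = (0.5, 0.5, 0.5)

# Mask intensity 0: the masked fragment shows the scalar color at full intensity.
geometry_only = shade_fragment(0.0, shaded, scalar, 0.01, gray, 0.0)
# Mask intensity 1: the masked fragment shows the mask color alone.
mask_only = shade_fragment(0.0, shaded, scalar, 0.01, gray, 1.0)
# Negative threshold: no magnitude is below it, so masking is disabled.
unmasked = shade_fragment(0.0, shaded, scalar, -1.0, gray, 1.0)
```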
Figure 4 shows an example of fragment masking in which fragments with a vector magnitude below the mask threshold are blended harmoniously with the surrounding LIC. Without fragment masking, unconvolved noise in stagnant regions of the flow disrupts the visualization. Figure 10a shows an example where masking was not necessary because the flow around stagnant regions varied smoothly. In figures 1a and 1b masking is used with the mask intensity set to 0 in order to display the lit surface of the launch vehicle at full intensity where vector data does not exist.
Integrator Normalization
Normalizing vectors during integration is a trick that can be used to simplify integrator configuration and give the LIC a uniformly smooth look. With normalized vector field values, the convolution occurs over the same integrated arc length for all pixels in the image. This gives the result a smooth and uniform look and makes it possible to provide reasonable default values for step size and number of steps that are independent of the input vector field. The resulting visualization accurately shows the tangent field; however, perceptual cues indicating variation in the relative strength of the flow are lost, which can make weak, insignificant features prominent and strong, dominant features less so. For example, figure 10 shows a flow where integrator normalization results in the visual emphasis of insignificant features in a stagnant part of the flow. In this case visualizing the tangent field led to much confusion and debate during the analysis of the dataset. Disabling integrator normalization resulted in an accurate visualization of flow features and resolved the confusion. Because visualization of the tangent field using integrator normalization generally produces good results and significantly simplifies algorithm configuration, it is the default in ParaView. However, when visualizing flows with large variations in flow speed it can be useful to disable integrator normalization in order to get a representative visualization of the flow.
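The effect of normalization on the integrated arc length can be illustrated with a few lines of Python. The sketch assumes simple Euler steps along a list of vector samples; the function name `integrated_arclength` and the sample flows are hypothetical.

```python
import math


def integrated_arclength(vectors, step_size, normalize):
    """Arc length covered by one Euler step per vector sample."""
    total = 0.0
    for u, v in vectors:
        mag = math.hypot(u, v)
        if mag == 0.0:
            break  # stagnant point: integration stops
        total += step_size * (1.0 if normalize else mag)
    return total


fast = [(3.0, 4.0)] * 10      # |v| = 5
slow = [(0.03, 0.04)] * 10    # |v| = 0.05

# With normalization both flows are convolved over the same length.
same_fast = integrated_arclength(fast, 0.5, normalize=True)
same_slow = integrated_arclength(slow, 0.5, normalize=True)
# Without it, the slow flow is convolved over a far shorter length,
# which is why noise in stagnant regions stays relatively unconvolved.
raw_fast = integrated_arclength(fast, 0.5, normalize=False)
raw_slow = integrated_arclength(slow, 0.5, normalize=False)
```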
Optimizations for interactivity
Given the complexity of the surface LIC pipeline, the computational expense of the LIC itself, and the large number of shader and noise generator parameters available, rendering must be fast and efficient for the visualization to remain interactive as parameters are adjusted. Large differences in the run time of the various stages provide the potential for large speedups during interaction when the more expensive stages can be skipped. For instance, the vector projection and image LIC stages typically split the majority of the rendering time about equally, with the remaining stages running orders of magnitude faster. When either of these more expensive stages can be skipped during interaction, rendering performance improves dramatically. To make this possible, the output of each shader stage is cached and parameters are grouped according to the shader stage that they affect. See figure 3 for the details. As a user interacts with the visualization, cached results are reused whenever possible so that only the stage affected by the interaction, and the stages downstream from it, are re-executed, drastically speeding up interactive exploration.
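The caching scheme can be sketched as a linear pipeline in which each parameter is tagged with the first stage it affects. This is a simplified illustration, not the painter's implementation: the class `CachedPipeline`, the three stage names, and the parameter grouping are all hypothetical.

```python
class CachedPipeline:
    """Linear pipeline that re-runs only the first dirty stage and those after it."""

    def __init__(self, stages, param_stage):
        self.stages = stages              # list of (name, function(params, upstream))
        self.param_stage = param_stage    # parameter name -> index of first stage affected
        self.params = {}
        self.cache = [None] * len(stages) # cached output of every stage
        self.first_dirty = 0              # everything is dirty before the first render
        self.executed = []

    def set_param(self, name, value):
        if self.params.get(name) != value:
            self.params[name] = value
            self.first_dirty = min(self.first_dirty, self.param_stage[name])

    def render(self):
        self.executed = []
        upstream = self.cache[self.first_dirty - 1] if self.first_dirty > 0 else None
        for i in range(self.first_dirty, len(self.stages)):
            name, func = self.stages[i]
            upstream = func(self.params, upstream)
            self.cache[i] = upstream
            self.executed.append(name)
        self.first_dirty = len(self.stages)
        return self.cache[-1]


# Stand-in stages that just record which parameters they consumed.
def project(params, _):
    return ("proj", params.get("vectors"))

def image_lic(params, upstream):
    return upstream + ("lic", params.get("steps"))

def shade(params, upstream):
    return upstream + ("shade", params.get("intensity"))

pipe = CachedPipeline(
    stages=[("project", project), ("image_lic", image_lic), ("shade", shade)],
    param_stage={"vectors": 0, "steps": 1, "intensity": 2},
)
pipe.set_param("vectors", "V")
pipe.set_param("steps", 40)
pipe.set_param("intensity", 0.8)
pipe.render()
first_run = list(pipe.executed)    # all three stages run

pipe.set_param("intensity", 0.5)   # affects only the final shader stage
image = pipe.render()
second_run = list(pipe.executed)   # only "shade" runs
```

Changing a shader parameter therefore skips both expensive stages, while changing the number of integration steps skips only the vector projection.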
Parallelization
ParaView's data parallel pipeline allows us to handle datasets larger than can fit on a single compute node and provides a means of achieving faster rendering times on very large datasets. However, load balancing decisions are based on the data that the reader reports as available, and filters may move data among processes and add or remove data on any given process. In addition, the distribution of rendering work depends on view parameters such as camera position, view angle, and the positions of the near and far clipping planes relative to the dataset being rendered. All of this makes load balancing a computationally costly rendering algorithm such as the surface LIC nontrivial.
The surface LIC algorithm differs from other parallel rendering algorithms in two ways. First, in the data parallel setting the integration step requires access to off-process vector data in order to produce consistent results at process boundaries. This is dealt with by adding a guard pixel halo to each screen space pixel extent over which the LIC will be computed. Second, after vectors have been locally projected into screen space, they must be composited wherever there is inter-process screen space overlap to ensure global vector field correctness. The generation of guard pixels and the vector field compositing occur within ParaView's normal image compositing pass, which further complicates the situation. ParaView contains a number of compositing algorithms that help balance the LIC computation and ensure that it is carried out efficiently in the data parallel setting.
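Adding a guard pixel halo to a screen space extent amounts to growing the extent on every side and clipping it to the screen. The sketch below assumes inclusive extents stored as `(x0, x1, y0, y1)` and a caller-supplied halo width; how ParaView chooses the halo width is not covered here, and the function names are hypothetical.

```python
def add_guard_pixels(extent, n_guard, screen):
    """Grow an inclusive (x0, x1, y0, y1) extent by n_guard pixels, clipped to screen."""
    x0, x1, y0, y1 = extent
    sx0, sx1, sy0, sy1 = screen
    return (max(x0 - n_guard, sx0), min(x1 + n_guard, sx1),
            max(y0 - n_guard, sy0), min(y1 + n_guard, sy1))


def num_pixels(extent):
    """Number of pixels in an inclusive extent."""
    x0, x1, y0, y1 = extent
    return (x1 - x0 + 1) * (y1 - y0 + 1)


screen = (0, 99, 0, 99)
interior = add_guard_pixels((10, 19, 10, 19), 4, screen)  # (6, 23, 6, 23)
corner = add_guard_pixels((0, 9, 0, 9), 4, screen)        # (0, 13, 0, 13)

# A 10 x 10 extent with a 4 pixel halo carries more guard pixels than valid ones.
guard_to_valid = (num_pixels(interior) - 100) / 100
```

For small extents the halo dominates the pixel count, which is why strategies that create many small extents pay a growing guard pixel cost.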
Vector field compositing stage
Given the relatively high computational cost of computing the surface LIC it's important to have a good parallel distribution of rendering work. It can be beneficial to redistribute screen space data to achieve a more balanced work distribution. The vector field compositing stage provides an opportunity to load balance the LIC computation. However, working within ParaView's normal compositing pass makes any attempt at load balancing screen space computations challenging. ParaView's image compositor expects screen space data distribution to remain fixed which places some restrictions on what load balancing schemes will be practical since any reorganization of screen space data, by moving data from the initial decomposition to a more favorable one, requires moving it back after the computation has been made adding communication overhead. When attempting to load balance rendering computations in this environment the goal of achieving equal distribution of work must be balanced by the compositing and communication costs.
ParaView provides 3 vector field compositing strategies for composite data, in-place, in-place disjoint, and balanced. To reduce communication overhead prior to compositing, screen space bounds of the input data are minimized using the cached depth buffer values. Each compositing strategy implies a specific target domain decomposition and once the target domain decomposition is determined each of it's extents are minimized and guard pixels are added.
The in-place strategy composites the vector field onto the minimized screen space extents of the input dataset without additional screen space load balancing. This strategy is optimal when there is no off-process screen space data overlap, as in the case of computing LIC on a slice. However, the strategy can be inefficient when there is substantial off-process screen space data overlap, because each overlapping pixel extent becomes a target for compositing and the LIC is redundantly computed on each. For example, the screen space decomposition shown in figure 10b is what would be used for in-place compositing in that situation. Large portions of the screen overlap on as many as 4 processes, resulting in a 4 times duplication of compositing and integration for those regions. In this case the in-place strategy is relatively inefficient because of the large amount of overlapping data across processes.
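The cost of overlap under the in-place strategy can be estimated by comparing the number of pixels processed across all ranks with the number of distinct pixels. The sketch below is illustrative only; the function name and the inclusive `(x0, x1, y0, y1)` extent layout are assumptions, and it counts pixels by brute force rather than using any ParaView data structure.

```python
from collections import Counter


def in_place_redundancy(extents):
    """Given one inclusive extent (x0, x1, y0, y1) per rank, return
    (total, unique): the number of pixels processed when every rank
    computes the LIC over its own extent, and the number of distinct
    screen pixels those extents cover."""
    coverage = Counter()
    for x0, x1, y0, y1 in extents:
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                coverage[(x, y)] += 1
    total = sum(coverage.values())
    unique = len(coverage)
    return total, unique
```

When four ranks share the same 4 by 4 extent the function returns `(64, 16)`, the 4 times duplication described above. For non-overlapping extents, as on a slice, the two numbers are equal.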
In-place disjoint compositing
The in-place disjoint strategy adds load balancing to the in-place strategy. The target domain decomposition, onto which the vector field is composited and on which the LIC is computed, is constructed by making the minimized input screen space domain decomposition disjoint with respect to itself. The disjointification assigns each pixel where the LIC will be computed to a single process so that redundant compositing and computation are eliminated. Data is left in place, which minimizes the compositing costs. This strategy can be more efficient than the in-place strategy when there is a high degree of off-process screen space overlap. However, the disjointification process increases the number of screen space extents, which tends to increase the number of guard pixels required. This becomes a serious performance issue as the number of guard pixels approaches the number of valid pixels. Scaling studies show that this becomes a concern for very large parallel runs, for example 512 processes or more. In order to work with ParaView's image compositing pass, the in-place disjoint strategy includes a scatter stage that moves the computed LIC back onto the original domain decomposition. This scatter stage doesn't require compositing because the disjointification process ensures that the source and destination extents are unique. In the worst case, where only two processes overlap in screen space, the additional cost of the scatter stage makes the total communication costs of the in-place disjoint strategy equal to those of the in-place strategy. Thus, when there are many overlapping off-process pixels and the ratio of guard pixels to valid pixels is much less than 1, the disjoint strategy will be much more efficient than the in-place strategy.
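One simple way to make a set of rectangular extents disjoint is to let each rank keep only the pixels not already claimed by an earlier rank. The sketch below takes that approach; it is not necessarily how ParaView orders or splits extents. The function names, the inclusive `(x0, x1, y0, y1)` layout, and the first-come priority rule are assumptions made for illustration.

```python
def intersect(a, b):
    """Return the overlap of two inclusive extents, or None if they are disjoint."""
    x0, x1 = max(a[0], b[0]), min(a[1], b[1])
    y0, y1 = max(a[2], b[2]), min(a[3], b[3])
    return (x0, x1, y0, y1) if x0 <= x1 and y0 <= y1 else None


def subtract(a, b):
    """Return the part of extent a not covered by b as up to four disjoint extents."""
    c = intersect(a, b)
    if c is None:
        return [a]
    ax0, ax1, ay0, ay1 = a
    cx0, cx1, cy0, cy1 = c
    pieces = []
    if ay0 < cy0:  # full-width strip below the overlap
        pieces.append((ax0, ax1, ay0, cy0 - 1))
    if cy1 < ay1:  # full-width strip above the overlap
        pieces.append((ax0, ax1, cy1 + 1, ay1))
    if ax0 < cx0:  # strip left of the overlap, limited to the overlap's rows
        pieces.append((ax0, cx0 - 1, cy0, cy1))
    if cx1 < ax1:  # strip right of the overlap, limited to the overlap's rows
        pieces.append((cx1 + 1, ax1, cy0, cy1))
    return pieces


def make_disjoint(extents):
    """Assign each pixel to the first rank whose extent contains it.
    Returns one list of mutually disjoint extents per rank."""
    claimed = []
    result = []
    for ext in extents:
        pieces = [ext]
        for c in claimed:
            pieces = [p for piece in pieces for p in subtract(piece, c)]
        result.append(pieces)
        claimed.extend(pieces)
    return result
```

For the input `[(0, 3, 0, 3), (2, 5, 2, 5)]` the second rank ends up with two smaller extents instead of one. Each of those needs its own guard pixel halo, which illustrates why the ratio of guard pixels to valid pixels grows as extents are fragmented.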
The balanced strategy partitions the global minimized input dataset screen space bounds into equally sized tiles, with one tile assigned to each process. The screen space vector field data is composited onto the new target domain decomposition, where the LIC is computed. Like the in-place disjoint strategy, the balanced strategy assigns pixels to processes uniquely, so the LIC is computed only once for each pixel, and a scatter stage is required to move data back onto ParaView's image compositing domain decomposition. One advantage of the balanced strategy over the in-place disjoint strategy is that, because there is only one target pixel extent per ParaView process, the ratio of guard pixels to valid pixels tends to be much smaller as the number of processes increases. However, the compositing costs can be relatively higher because data is not necessarily left in place, and in situations where valid pixels don't fill the screen space extents some processes may be left with no work.
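The tiling step can be sketched as a regular grid over the global bounds. The example below is a simplified illustration, not ParaView's partitioning code: the function names, the inclusive `(x0, x1, y0, y1)` layout, and the choice of the most nearly square grid are assumptions, and the bounds are assumed to be at least as many pixels wide and tall as the grid has columns and rows.

```python
import math


def split_range(lo, hi, n):
    """Split the inclusive range [lo, hi] into n contiguous inclusive
    ranges whose lengths differ by at most one pixel."""
    base, extra = divmod(hi - lo + 1, n)
    ranges = []
    start = lo
    for i in range(n):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def balanced_tiles(bounds, n_procs):
    """Partition inclusive bounds (x0, x1, y0, y1) into n_procs tiles,
    one per process, using the most nearly square nx * ny grid."""
    x0, x1, y0, y1 = bounds
    ny = math.isqrt(n_procs)
    while n_procs % ny:  # largest divisor of n_procs not exceeding its square root
        ny -= 1
    nx = n_procs // ny
    return [(tx0, tx1, ty0, ty1)
            for ty0, ty1 in split_range(y0, y1, ny)
            for tx0, tx1 in split_range(x0, x1, nx)]
```

Because every process receives exactly one rectangular tile, each process needs only one guard pixel halo, which is the advantage over the in-place disjoint strategy noted above. A tile that happens to contain no valid pixels leaves its process idle.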
In the automatic compositing strategy an estimate of the compositing cost is made and a heuristic is used to select either the in-place or the in-place disjoint strategy. Thus the benefits of both of these strategies are leveraged, while some of their downsides are avoided, on a case by case basis without user intervention. This is the recommended strategy and is used by default.
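The exact heuristic is not described here. As a purely hypothetical illustration of the trade-off it has to weigh, the sketch below prefers the in-place disjoint strategy only when a large share of the work is redundant and the guard pixel overhead is small. The function name, its inputs, and both threshold values are invented for this example and do not come from ParaView.

```python
def choose_strategy(total_pixels, unique_pixels, guard_pixels,
                    min_overlap=0.25, max_guard_ratio=0.5):
    """Hypothetical selection rule, not ParaView's heuristic.

    total_pixels  -- pixels processed by all ranks under in-place compositing
    unique_pixels -- distinct screen pixels covered by the data
    guard_pixels  -- guard pixels the disjoint decomposition would require
    """
    overlap = (total_pixels - unique_pixels) / total_pixels  # redundant share of the work
    guard_ratio = guard_pixels / unique_pixels               # guard to valid pixel ratio
    if overlap >= min_overlap and guard_ratio <= max_guard_ratio:
        return "in-place disjoint"
    return "in-place"
```

With heavy overlap and few guard pixels, for example `choose_strategy(64, 16, 4)`, the rule picks the disjoint strategy. With no overlap, or when guard pixels outnumber valid pixels, it falls back to in-place compositing.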
The work was performed at Lawrence Berkeley National Laboratory (LBNL) under a non-federal agreement with the University of Tennessee supported by the National Science Foundation under Grant number SF OR.13425-001.01, National Institute for Computational Sciences (NICS) NSF Center for Remote Data Analysis and Visualization (RDAV).
This research used resources of the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Some of the datasets used in the work were created under grant number DE-FG02-10ER55076, Kinetic Physics of Homogeneous Turbulence in Collisionless Plasmas.