Re: [Insight-users] Shape prior level sets: question about MAPCostFunction
Karl Fritscher
Karl.Fritscher at umit.at
Wed Feb 9 04:46:46 EST 2005
Zachary,
Thank you very much for your detailed explanation and the source code you have sent to me. I really appreciate that. Currently I am running some test examples with your classes, and the first results are very satisfying, although I am still "re-tuning" and adapting some of the segmentation parameters to your modified cost function.
Thank you very much again,
Karl Fritscher
------------------------------------------------------------------------------------------------------
Dr. Karl D. Fritscher, M.Sc.
Institute for Biomedical Image Analysis (IBIA)
University for Health Sciences, Medical Informatics and Technology (UMIT)
Eduard-Wallnöfer-Zentrum 1
A-6060 Hall in Tirol, Austria
e-mail: karl.fritscher at umit.at
http://imwv.umit.at/
Tel.: +43 (0)50 8648 - 3825
Cell phone: +43 (0) 676 - 3306221
>>> Zachary Pincus <zpincus at stanford.edu> 02/02 9:32 >>>
Karl,
I hope you don't mind, but I'm cc'ing the insight-users list in case
anyone else is interested in my very preliminary findings.
> Among others we are also using the ITK implementation of Michael
> Leventon's work. Therefore I have been very interested in your
> thoughts about the different itk classes, which are used for the shape
> prior level set segmentation. I am sharing your opinion especially
> about the itkShapePriorMAPCostFunction class and would really be very
> interested in your modifications, if you are willing to code a new
> class. Implementing the modified cost function in our framework and
> comparing the results/performance would definitely be very interesting
> for us.
I've been looking at the ShapePriorMAPCostFunction class to try to
determine what I could do to make the shape-model part of the level set
segmentation a bit more robust. (This class computes the term described
in section 3.2 of Leventon's paper.)
I had found that even in my toy examples, there were knife-edge thin
margins for success, depending on the initial level set and parameter
choices, so I wanted to see if I could beef that up. So I looked at
each term in isolation and tried to figure out how to make them
individually robust, so that in combination they would hopefully be
much less initialization-dependent. Note carefully that I have been
optimizing for the type of images I use in my work (microscopy) -- some
or all of these findings may not apply to your setting. However, I
think some of these observations are general enough to be useful.
Firstly, I found that the "Inside Term" as defined by Leventon and
calculated by ITK can be a potent inflationary factor. Basically, this
term tries to keep the shape model *outside* of the evolving curve.
This can lead to a nasty feed-forward loop: if the shape model starts
overlapping the current curve, then the optimization of the MAP
function will drive the shape to expand in that region (if possible).
But, the current curve is then influenced by the MAP shape model, so it
likewise expands in that region, setting off this feed-forward chain.
I decided that for my work, I would prefer to change the definition of
the inside term from Leventon's work. He states that P(current curve |
estimated shape) is inversely proportional to the volume of the current
curve outside the estimated shape. This is just one possible model for
that probability. I noticed that both the evolving curve and the
estimated shape are signed distance functions, so they can be directly
compared. As such, I use the L1-norm of the difference between these
functions (in the narrow band active region) as a similarity metric,
and use that as a proxy for the probability. L2 and L-infinity norms
(RMSD and maximum deviation) seem good too. In this way, I encourage
the shape model to "stay near" the evolving curve. This is bad when the
curve is very small compared to its desired final size, however -- in
these regimes Leventon's initial suggestion is better. I am examining
ways to switch between the two probability definitions based on the
size of the curve.
Also, I noted that the Inside Term was proportional to the number of
pixels in the active set, so when the curve expands, this term grows
proportionally more weighty. This made these feed-forward effects
worse. I took the step of normalizing the final value by the number of
pixels in the active set. This step may not precisely map onto a
probabilistic interpretation, but it seems pragmatically useful.
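To make that concrete, here is a rough sketch of the modified inside
term (the names and types are only illustrative; the attached code is
the real thing). Since both level sets are signed distance functions
sampled on the same grid, the term reduces to a mean absolute
difference over the active pixels:

#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of the modified inside term: mean absolute (L1) difference between
// the evolving level set and the estimated shape, both signed distance
// functions, evaluated over the narrow-band (active) pixels only. Dividing
// by the number of active pixels keeps the term from gaining weight as the
// curve expands.
double InsideTerm(const std::vector<double>& curveSDF,
                  const std::vector<double>& shapeSDF)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < curveSDF.size(); ++i)
  {
    sum += std::fabs(curveSDF[i] - shapeSDF[i]);
  }
  return sum / static_cast<double>(curveSDF.size());
}

Swapping std::fabs for a squared difference (with a square root at the
end) gives the L2/RMSD variant mentioned above.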
Next, I looked at the "Gradient Term", which calculates how well a shape
model fits to an image. This term assumes that there is a gaussian
distribution of edge intensities (i.e. 1-FeatureImage) normal to image
edges, and tries to fit the shape model to this (i.e. make the
zero-level-set of the shape model line up with the tallest part of the
gaussian). Basically, this is supposed to encourage the shape model to
sit squarely on image edges.
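To sketch what that fit looks like (this is my paraphrase of the idea,
not the actual ITK implementation), think of comparing the observed
edge strength at each active pixel against a gaussian of the shape
model's signed distance:

#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative gradient-term cost: the expected edge strength at a pixel is
// a gaussian of the shape model's signed distance there (peaked at the zero
// level set); the cost is the squared mismatch against the observed
// (1 - FeatureImage) values, summed over the narrow band.
double GradientFitCost(const std::vector<double>& shapeSDF,      // shape model signed distances
                       const std::vector<double>& edgeStrength,  // (1 - FeatureImage) values
                       double sigma)                             // assumed gaussian width
{
  const double peak = 1.0 / (sigma * std::sqrt(2.0 * 3.14159265358979));
  double cost = 0.0;
  for (std::size_t i = 0; i < shapeSDF.size(); ++i)
  {
    const double z = shapeSDF[i] / sigma;
    const double expected = peak * std::exp(-0.5 * z * z);
    const double diff = edgeStrength[i] - expected;
    cost += diff * diff;
  }
  return cost;
}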
The most useful modification I did make was to rescale the
(1-FeatureImage) values so that they were in the range of heights of a
gaussian curve. Note that (1-FeatureImage) is in the range [0,1], but a
unit gaussian only spans [0, ~0.4]. Practically speaking, this means
there was an asymmetry in error when the shape model was misaligned
with the center of the gaussian: too little edge intensity (i.e.
1-FeatureImage is zero where the gaussian would expect it to be peaked)
would be penalized by at most 0.4 per pixel, while too much edge
intensity would be penalized by up to 0.6 per pixel. This asymmetry
effectively drove the shape model toward edge-poor regions of the
image, because the penalty for misalignment to an edge was higher than
the penalty for misalignment to a not-edge. Rescaling alleviated this.
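Concretely, the rescaling is just a multiplication by the peak height
of a unit gaussian, 1/sqrt(2*pi) ~= 0.3989 (again an illustrative
sketch rather than the attached code):

#include <cmath>

// Rescale (1 - FeatureImage), which lies in [0, 1], into the height range of
// a unit gaussian, [0, ~0.3989]. Without this, missing edge intensity costs
// at most ~0.4 per pixel while excess edge intensity costs up to
// |1 - 0.4| = 0.6, which biases the fit toward edge-poor regions.
double RescaleEdgeStrength(double oneMinusFeature)
{
  const double unitGaussianPeak = 1.0 / std::sqrt(2.0 * 3.14159265358979);
  return oneMinusFeature * unitGaussianPeak;
}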
Another problem was the assumption that the 1-FeatureImage values had
unit variance. With a GradientMagnitudeRecursiveGaussianImageFilter
sigma of 1 and using a bounded reciprocal to calculate the feature
image, the 1-FeatureImage values have a true variance of about 2.7. By
assuming that the variance was 1, the MAP cost function basically put a
floor on how well the gaussian could fit. This made the "proper fit"
much more unstable and sometimes unattainable: if the 1-FeatureImage
values had too wide a spatial spread (too large a sigma in the
derivative-of-gaussian filter), then it would be lower-error overall to
push the shape model to an edge-poor region where the tails of the
gaussian fit well, instead of having the shape model centered in the
middle of the edge, where the too-narrow gaussian peak fit well, but
the tails fit very poorly.
I thus modified the MAP cost function to properly estimate the
variance, and then correct for that. The variance estimate is a bit
funny, because you aren't really estimating variance from some sampled
population. Instead, you have input values (the values of the shape
model), and output values (the 1-FeatureImage pixel intensities at
those points). Plotted on a cartesian plane, these look a bit like a
non-normalized gaussian. (Cf. Leventon's Figure 7.) So the way to find
the variance is to treat the input values as samples *weighted by the
output values*, and take the variance of those weighted samples.
Think of it like this: Say the 1-FeatureImage had value 10 at shape
model value 1: then we would pretend that there were ten "1" samples
recorded. If the 1-FeatureImage had value 2 at shape model value -4, we
would record two "-4" samples. Then we would take the variance of these
samples. Now, there's a formula for weighted variance that does this
directly and in one pass, so that's what I really do.
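For reference, the one-pass weighted variance I mean is along these
lines (illustrative, not a copy of the attached code); the shape-model
values are the samples and the (1 - FeatureImage) intensities are the
weights:

#include <cstddef>
#include <vector>

// One-pass weighted variance: treat each shape-model value x[i] as a sample
// weighted by the edge strength w[i] = (1 - FeatureImage) at that pixel.
// Uses the identity var = E_w[x^2] - (E_w[x])^2 with weighted expectations.
double WeightedVariance(const std::vector<double>& x,
                        const std::vector<double>& w)
{
  double sumW = 0.0, sumWX = 0.0, sumWXX = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i)
  {
    sumW   += w[i];
    sumWX  += w[i] * x[i];
    sumWXX += w[i] * x[i] * x[i];
  }
  const double mean = sumWX / sumW;
  return sumWXX / sumW - mean * mean;
}

For the toy numbers above, x = {1, -4} with w = {10, 2} gives exactly
the same result as writing out ten 1s and two -4s and taking their
(population) variance.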
I also noted that with too few "layers" in the active set (too narrow a
narrow band), the shape fit was rather unstable: by only seeing a
"truncated" gaussian under the narrow band, the optimizer has a much
harder time finding and staying at the true gaussian peak, and can
slide more easily to local optima. This is the case regardless of
whether you estimate true variance or assume a unit gaussian. Basically
the problem is that the metric is "fit-to-gaussian" but the image
provides a truncated gaussian. You could modify the fit term to try to
fit to truncated gaussians, or to be more nonparametric. I haven't tried
any of this -- I just widened the narrow band with a larger number of
layers, so that the optimizer is looking at the full gaussian.
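In ITK terms, widening the band is just a matter of setting a larger
number of sparse-field layers on the level set filter, something like
the snippet below (the filter type and the value are only an example;
the right number of layers will depend on your images):

#include "itkGeodesicActiveContourShapePriorLevelSetImageFilter.h"
#include "itkImage.h"

int main()
{
  typedef itk::Image<float, 2> InternalImageType;
  typedef itk::GeodesicActiveContourShapePriorLevelSetImageFilter<
    InternalImageType, InternalImageType> FilterType;

  FilterType::Pointer filter = FilterType::New();

  // More layers on each side of the zero level set means the gradient term
  // sees more of the gaussian instead of a tightly truncated one.
  filter->SetNumberOfLayers(6);  // example value only

  return 0;
}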
Finally, I also scale the gradient term by the size of the active
region, to keep it from blowing up as the region gets large.
I've attached my modifications, so you can look at them directly. The
code isn't perfectly clean, but it's a start.
Zach Pincus
Department of Biochemistry and Program in Biomedical Informatics
Stanford University School of Medicine