[Insight-users] Understanding OptimizerScales in its entirety
Kris Zygmunt
krismz at sci.utah.edu
Fri Aug 26 15:52:53 EDT 2011
Hi Rupert,
I am still working my way through your thesis, and I have also
downloaded the trust region optimizer you provided for the Insight
Journal. I am still having trouble getting the results I desire using
your optimizer for registering multimodal (T1, T2, DWI) brain data
using a Rigid 3D Versor transform and the mutual information metric
(this is done as an initialization to then feed deformable
registrations). Based on what I've read and experimented with so far,
I believe this trouble can be attributed to some or all of the
following issues:
1. I don't think my scaling is right yet. Do you have an
implementation of your scaling calculation available?
2. I think the optimizer is not using enough samples. According to a
comment I found in one of the ITK examples, "Regulating the number of
samples in the Metric is equivalent to performing multi-resolution
registration because it is indeed a sub-sampling of the image."
However, based on my own experiments and also based on your
evaluation, just arbitrarily setting a fixed percentage of pixels to
use does not perform well, especially in the mutual information case.
I feel like I should be doing an actual multi-resolution registration
or at least using all of the pixels in the image for a single-level
registration.
3. I am not sure that I can guarantee (even if I fix 1-2 above) that
I will be starting within the capture radius of the optimizer. Do you
have any thoughts / recommendations / suggestions on how to better
initialize the transform? Or do you have an analysis of what the ITK
optimizers' capture radii are like? I still seem to have significant
translation and rotation error if I use the
CenteredVersorTransformInitializer with and without moments on.
4. I modified VersorRigid3DTransformOptimizer to derive from your
trust region optimizer, but found that the API for StepAlongGradient
no longer provides the factor as an argument. The Versor optimizer
was scaling the rotation and the transformed gradient by that factor.
Is this no longer necessary?
5. Can you provide access to your modified Hessian approximation for
mutual information?
Thanks!
Kris Zygmunt
krismz at sci.utah.edu
>
>
> Hi Rupert,
>
> thank you for your very helpful explanations and the link to your
> thesis!!
> Your thesis incidentally answered some other questions I had.
>
> regards
> Levin
> ________________________________________
> From: Rupert Brooks [rupert.brooks at gmail.com]
> Sent: Tuesday, 16 August 2011 04:01
> To: Wolf, Levin
> Cc: insight-users at itk.org
> Subject: Re: [Insight-users] Understanding OptimizerScales in its
> entirety
>
> Hi Levin,
>
> Perhaps no one responded because understanding the optimizer scales in
> their _entirety_ is a very tall order. :-) I'll take a shot at
> answering the immediate question anyway.
>
> The optimizer scales are, unfortunately, not consistently applied
> across all the ITK optimizers. However,
> VersorRigid3DTransformOptimizer and RegularStepGradientDescentOptimizer
> both derive from RegularStepGradientDescentBaseOptimizer, and they
> work the same way.
>
> In these optimizers, each gradient component is divided by the
> corresponding scale. The step is then taken along this scaled
> direction, normalized to the current step length.
>
> This sounds simple, but the effect is a bit counterintuitive. It is
> like scaling the transform parameters by the square roots of the
> optimizer scale factors, and then limiting the step to a circle in the
> original parameter space, which would be an ellipse in the new one.
> Why the square root? Because changing the scales changes the
> derivative, which is then treated as a direction in the original
> parameter space.
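
A minimal sketch of the step rule described in the quoted paragraphs
above, in plain Python rather than ITK code. The function name
scaled_step and its arguments are hypothetical, and the minus sign
assumes the metric is being minimized. It only illustrates "divide the
gradient by the scales, then normalize to the step length" and is not
the ITK implementation.

```python
import math


def scaled_step(gradient, scales, step_length):
    """Return a parameter update of Euclidean length step_length.

    Each gradient component is divided by its scale, and the resulting
    vector is rescaled to the requested length (hypothetical helper,
    minimization assumed).
    """
    scaled = [g / s for g, s in zip(gradient, scales)]
    norm = math.sqrt(sum(v * v for v in scaled))
    if norm == 0.0:
        # Zero gradient: no direction to move in.
        return [0.0] * len(scaled)
    # Move against the scaled gradient because we are minimizing.
    return [-step_length * v / norm for v in scaled]


print(scaled_step([3.0, 4.0], [1.0, 1.0], 1.0))
print(scaled_step([3.0, 4.0], [1.0, 1000.0], 1.0))
```

With unit scales the step simply follows the normalized gradient.
Raising the second scale to 1000 puts almost the entire step on the
first parameter, while the length of the step in the original
parameter space stays equal to step_length.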
>
> I apologize in advance for self-promotion, but I just put an optimizer
> on the Insight-Journal that may interest you, if you are digging into
> this subject: http://www.insight-journal.org/browse/publication/834
> Different people have different theories about what the scales
> accomplish. If you are up to some rather dry reading, I'll refer you
> to Section 4.5 of my thesis:
> http://www.rupertbrooks.ca/downloads/Brooks_PhDThesis.pdf
>
> And yes, different people have different heuristics for how to set
> these scales. In the thesis I argued that they should be chosen to
> precondition the Hessian matrix of the cost function. Others will
> tell you they should roughly equalize the average pixel motion in the
> image due to a unit shift of the parameters. It turns out that these
> are roughly the same thing. It's also important to consider both how
> the scales affect the optimizer path through parameter space, and how
> they affect the stopping criteria.
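
A rough sketch of the "equalize the average pixel motion" heuristic
mentioned in the quoted paragraph above, in plain Python. It is not
the scaling calculation from the thesis. The function name and the
choice of sample points are hypothetical. It assumes the rotation
parameters are small angles in radians about axes through the given
center (not the versor components themselves), and that the scales
divide the gradient, so each scale is taken as the mean squared
displacement per unit change of its parameter.

```python
def rigid_scales_from_pixel_motion(points, center):
    """Estimate scales [rot_x, rot_y, rot_z, t_x, t_y, t_z].

    Each scale is the mean squared displacement of the sample points
    per unit change of that parameter (hypothetical helper).
    """
    if not points:
        raise ValueError("need at least one sample point")
    cx, cy, cz = center
    n = float(len(points))
    sx = sy = sz = 0.0
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        # A small rotation by angle a about an axis moves a point by
        # roughly a times its distance from that axis.
        sx += dy * dy + dz * dz
        sy += dx * dx + dz * dz
        sz += dx * dx + dy * dy
    # A unit translation moves every point by exactly one unit.
    return [sx / n, sy / n, sz / n, 1.0, 1.0, 1.0]


# Eight corners of a 200 mm cube centered on the rotation center.
corners = [(x, y, z)
           for x in (-100.0, 100.0)
           for y in (-100.0, 100.0)
           for z in (-100.0, 100.0)]
print(rigid_scales_from_pixel_motion(corners, (0.0, 0.0, 0.0)))
```

For the step direction only the ratios between the scales matter, so
rotation scales of 20000 with translation scales of 1 behave like
rotation scales of 1 with translation scales of 5e-5. The absolute
values can still matter for stopping criteria.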
>
> Hope that helps a little,
> Rupert
>
>
> --------------------------------------------------------------
> Rupert Brooks
> rupert.brooks at gmail.com
>