[Insight-users] Understanding OptimizerScales in its entirety

Rupert Brooks rupert.brooks at gmail.com
Sat Aug 27 09:59:19 EDT 2011


Hi Kris,

Re getting access to my code: I've been wanting to put it out there
for a long time.  However, three years have gone by since I finished
the thesis and, as you can see, only one tiny little Insight Journal
article ever got done.  This is because I have to look at it in my
spare time, of which there is very little.  So if you wait for me to
organize myself and get it out there, you will sadly wait a long time.

You may want to look at elastix - it's a parametric registration tool
that covers much of the same ground as I covered in my thesis, and
other, different ground as well.  It's very good work, and most
importantly, the authors have the time and resources to support it.
:-)  They have a slightly different way of estimating scaling factors,
but I believe the end results are similar.
http://elastix.isi.uu.nl/about.php

In any case, I'll try to help with some tips.

Subsampling is NOT the same as using a multiresolution approach,
because - briefly - smoothing the image changes the derivative by
removing high-frequency components.  You should almost certainly use
a multilevel process.  However, using all the samples may not be
necessary at each resolution.  At some point, the estimate of the MI
doesn't get any better by using more samples.  How many?  It depends on
the characteristics of your data, but I would guess 20000-30000 is
probably enough.
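
To make the difference concrete, here is a minimal 1-D sketch.  The
signal, the box filter (a crude stand-in for the Gaussian smoothing of
an image pyramid) and all names are made up for illustration; this is
not ITK code.  Picking fewer sample locations leaves the derivative at
each location untouched, while smoothing flattens the high-frequency
part of the derivative.

```python
import math

N = 256

def make_signal(n=N):
    # Slow component plus a fast ripple: a hypothetical 1-D intensity profile.
    return [math.sin(2 * math.pi * i / n) + 0.5 * math.sin(2 * math.pi * 40 * i / n)
            for i in range(n)]

def box_smooth(signal, radius):
    # Moving average with windows truncated at the edges.
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def central_diff(signal, i):
    # Finite-difference estimate of the derivative at sample i.
    return (signal[i + 1] - signal[i - 1]) / 2.0

signal = make_signal()
smoothed = box_smooth(signal, radius=3)

interior = range(8, N - 8)        # stay away from the truncated edge windows
sparse_locations = interior[::7]  # "subsampling": fewer locations, same image

full_grad = [central_diff(signal, i) for i in interior]
sparse_grad = [central_diff(signal, i) for i in sparse_locations]
smooth_grad = [central_diff(smoothed, i) for i in interior]

print("max |derivative|, all samples   :", max(map(abs, full_grad)))
print("max |derivative|, sparse samples:", max(map(abs, sparse_grad)))
print("max |derivative|, smoothed      :", max(map(abs, smooth_grad)))
```

The sparse derivatives are simply a subset of the full-resolution ones,
so the optimizer still sees the same rough cost surface, only estimated
from fewer points.  The smoothed signal has a genuinely different,
gentler derivative.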

Capture radius for MI, on heads, seems from what I've seen to be at
least 10 degrees, if not more.  However, you haven't mentioned some key
details about your problem.  Is the data intrasubject or intersubject?
Intersubject is obviously trickier.  Is it all from the same source or
from different sources?  More importantly, are there other objects in
the scan (head supports, etc.) that differ and might be interfering
with the registration?  If the data comes from entirely different
sources, it could even be totally flipped around - x could be
superior-inferior in one and anterior-posterior in the other.  In that
case, it's best to get your anatomical axes consistent before starting.

If you really have problems with capture radius, you could try a
pre-registration with NCC (normalized cross-correlation) - I saw this
technique in a paper once.  NCC has a wider capture radius than MI.
You could also try a pattern search on a coarse grid (watch out,
itkExhaustiveOptimizer does not report the best visited position) to
find a good starting point, or try restarting the optimization after
it stops.
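
As a rough illustration of the coarse grid idea, the sketch below walks
a grid and keeps track of the best position itself, which is the
bookkeeping you would otherwise have to add in an observer.  The cost
function, the two parameters and the grid spacing are all hypothetical
stand-ins; in practice the cost would be your metric evaluated at a
set of transform parameters.

```python
import itertools
import math

def coarse_grid_search(cost, axes):
    """Evaluate cost on the Cartesian product of axes.

    Returns (best_value, best_position), remembering the best visited
    position explicitly instead of relying on the optimizer to report it.
    """
    best_value, best_position = math.inf, None
    for position in itertools.product(*axes):
        value = cost(position)
        if value < best_value:
            best_value, best_position = value, position
    return best_value, best_position

def toy_cost(p):
    # Hypothetical stand-in for a registration metric (lower is better):
    # a broad bowl centred near (20, -5) with ripples that add local minima.
    angle, shift = p
    bowl = (angle - 20) ** 2 / 400 + (shift + 5) ** 2 / 100
    ripple = 0.3 * (1 - math.cos(angle / 4.0))
    return bowl + ripple

angles = range(-40, 41, 10)  # coarse steps in a rotation parameter (degrees)
shifts = range(-20, 21, 5)   # coarse steps in a translation parameter (mm)

value, start = coarse_grid_search(toy_cost, [angles, shifts])
print("best grid point:", start, "cost:", value)
```

The returned position would then be used as the initial transform
parameters for the real optimizer.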

How big is a 'significant error' and do you mean after the entire
registration process, or just after the initializer?

Re the versor optimizer: I'm a little fuzzy on what the
VersorTransformOptimizer was accomplishing with the factor, but what I
think you should do to subclass the
GradientDescentTrustRegionOptimizer is to take the step computed as it
is now, and update the parameters as follows:
1. Split the step into its first three (angular) components a b c, and
   its second three (translation) components x y z.
2. Treat the first three as the vector part of a unit quaternion,
   (a, b, c, sqrt(1 - norm(a,b,c)^2)).
3. If norm(a,b,c) is anywhere near 1, the step is way too large.  For
   practical purposes you could cap the length of a,b,c at 0.5.  This
   would be a 60 degree turn - a huge optimization step.
4. Combine this part of the step with the current versor parameters by
   quaternion multiplication.
5. Add x y z to the translation components.
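
The update described above could be sketched roughly as follows.  This
is plain Python, not ITK code.  The function names, the (x, y, z, w)
storage order and the choice to multiply as current * step are
assumptions made for illustration; check the composition order against
whatever your transform expects.

```python
import math

MAX_VECTOR_NORM = 0.5  # cap on the length of (a, b, c) suggested above

def quat_multiply(q, r):
    """Hamilton product of two quaternions stored as (x, y, z, w)."""
    x1, y1, z1, w1 = q
    x2, y2, z2, w2 = r
    return (
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
    )

def to_unit_quat(vx, vy, vz):
    # Recover the scalar part from the vector part of a unit quaternion.
    w = math.sqrt(max(0.0, 1.0 - (vx * vx + vy * vy + vz * vz)))
    return (vx, vy, vz, w)

def apply_step(params, step):
    """params and step are 6-tuples: versor vector part, then translation."""
    a, b, c = step[:3]
    norm = math.sqrt(a * a + b * b + c * c)
    if norm > MAX_VECTOR_NORM:
        # Shrink an oversized rotation step, keeping its axis.
        scale = MAX_VECTOR_NORM / norm
        a, b, c = a * scale, b * scale, c * scale

    # Assumed composition order: current rotation times step rotation.
    x, y, z, w = quat_multiply(to_unit_quat(*params[:3]), to_unit_quat(a, b, c))
    if w < 0.0:
        # q and -q are the same rotation; keep w >= 0 so that the vector
        # part alone identifies it.
        x, y, z = -x, -y, -z

    translation = tuple(p + s for p, s in zip(params[3:], step[3:]))
    return (x, y, z) + translation

identity = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
step = (math.sin(math.radians(15) / 2), 0.0, 0.0, 1.0, 2.0, 3.0)  # 15 deg about x
once = apply_step(identity, step)
twice = apply_step(once, step)
print(twice)
```

Two 15 degree steps about the same axis compose into a 30 degree
rotation, which plain addition of the versor components would only
approximate.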

If you get this working, consider submitting it to the Insight
Journal; it's interesting.

Good luck
Rupert

--------------------------------------------------------------
Rupert Brooks
rupert.brooks at gmail.com




On Fri, Aug 26, 2011 at 15:52, Kris Zygmunt <krismz at sci.utah.edu> wrote:
> Hi Rupert,
>     I am still working my way through your thesis, and I have also
> downloaded the trust region optimizer you provided for the Insight Journal.
>  I am still having trouble getting the results I desire using your optimizer
> for registering multimodal (T1, T2, DWI) brain data using a Rigid 3D Versor
> transform and the mutual information metric (this is done as an
> initialization to then feed deformable registrations).  Based on what I've
> read and experimented with so far, I believe this trouble can be attributed
> to some or all of the following issues:
> 1.  I don't think my scaling is right yet, do you have an implementation
> available of your scaling calculation?
> 2.  I think the optimizer is not using enough samples.  According to a
> comment I found in one of the ITK examples, "Regulating the number of
> samples in the Metric is equivalent to performing multi-resolution
> registration because it is indeed a sub-sampling of the image."  However,
> based on my own experiments and also based on your evaluation, just
> arbitrarily setting a fixed percentage of pixels to use does not perform
> well especially in the mutual information case.  I feel like I should be
> doing an actual multi-resolution registration or at least using all of the
> pixels in the image for a single-level registration.
> 3.  I am not sure that I can guarantee (even if I fix 1-2 above) that I will
> be starting within the capture radius of the optimizer.  Do you have any
> thoughts / recommendations / suggestions on how to better initialize the
> transform?  Or do you have an analysis of what the ITK optimizers' capture
> radii are like?  I still seem to have significant translation and rotation
> error if I use the CenteredVersorTransformInitializer with and without
> moments on.
> 4. I modified VersorRigid3DTransformOptimizer to derive from your trust
> region optimizer, but found that the API for StepAlongGradient no longer
> provides the factor as an argument.  The Versor optimizer was scaling the
> rotation and the transformed gradient by that factor, is this no longer
> necessary?
> 5.  Can you provide access to your modified Hessian approximation for mutual
> information?
> Thanks!
> Kris Zygmunt
> krismz at sci.utah.edu
>
>
>
> Hi Rupert,
>
> thank you for your very helpful explanations and the link to your thesis!!
> Your thesis incidentally answered some other questions I had.
>
> regards
> Levin
> ________________________________________
> From: Rupert Brooks [rupert.brooks at gmail.com]
> Sent: Tuesday, 16 August 2011 04:01
> To: Wolf, Levin
> Cc: insight-users at itk.org
> Subject: Re: [Insight-users] Understanding OptimizerScales in its entirety
>
> Hi Levin,
>
> Perhaps no one responded because understanding the optimizer scales in
> their _entirety_ is a very tall order. :-) I'll take a shot at
> answering the immediate question anyway.
>
> The optimizer scales are, unfortunately, not consistently applied
> across all the ITK optimizers.  However, VersorRigid3DTransformOptimizer
> and RegularStepGradientDescentOptimizer share a common base class, and
> they both work the same way.
>
> In these optimizers, the gradient is divided element-wise by the
> scales.  The step is then taken along this scaled direction,
> normalized to the current step length.
>
> This sounds simple but the effect is a bit counterintuitive.  It is
> like scaling the transform parameters by the square roots of the
> optimizer scale factors, and then limiting the step to a circle in the
> original parameter space, which would be an ellipse in the new one.
> Why the square root?  Because you change the derivative by changing
> the scales, and then treat it as a direction in the original parameter
> space.
>
> I apologize in advance for self-promotion, but I just put an optimizer
> on the Insight Journal that may interest you, if you are digging into
> this subject.  http://www.insight-journal.org/browse/publication/834
> Different people have different theories about what the scales
> accomplish - if you are up to some rather dry reading, I'll refer you
> to Section 4.5 of my thesis:
> http://www.rupertbrooks.ca/downloads/Brooks_PhDThesis.pdf
>
> And yes, different people have different heuristics for how to set
> these scales.  In the thesis I argued that they should be chosen to
> precondition the Hessian matrix of the cost function.  Others will
> tell you they should roughly equalize the average pixel motion in the
> image due to a unit shift of the parameters.  It turns out that these
> are roughly the same thing.  It's also important to consider both how
> the scales affect the optimizer's path through parameter space, and
> how they affect the stopping criteria.
>
> Hope that helps a little,
> Rupert
>
>
> --------------------------------------------------------------
> Rupert Brooks
> rupert.brooks at gmail.com
>
>
>

