[Insight-users] Gradient computation bug in SingleValuedVnlCostFunctionAdaptor when scales are used (patch included)

Tom Vercauteren tom.vercauteren at m4x.org
Tue May 20 04:13:29 EDT 2008


Hi Luis,

I just realized that the same problem should also appear in the multiple
valued adaptor. I have reopened the bug and am currently running an
experimental build with a patched multiple valued adaptor.

Tom

On Thu, May 1, 2008 at 12:17 AM, Luis Ibanez <luis.ibanez at kitware.com> wrote:
>
>
> Hi Tom,
>
>
> Your argument sounds quite convincing.
>
> I'm running an Experimental build with your patch, and
> if all tests pass it will be committed later today or
> early tomorrow.
>
>
>  Thanks
>
>
>     Luis
>
>
> ------------------------
> Tom Vercauteren wrote:
>>
>> Hi Luis,
>>
>> Thanks for your fast answer. Below are some comments.
>>
>>
>>> However, that was not the intent of introducing the scaling
>>> in these classes.
>>
>> As far as I understand it, there really is no other easy option than
>> the patch I propose. If vnl works on a scaled function, it needs a
>> correctly scaled derivative. I understand that different optimizers handle
>> scales differently, but for vnl the problem is different. Assuming that
>> we don't modify the entire vnl optimization framework, I think that
>> the only way for the vnl optimizers to access a correctly scaled
>> derivative is through the cost function wrapper.
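>>
>> Roughly, the chain-rule argument can be checked with a toy example
>> (plain C++, no ITK involved; the function and numbers are made up
>> purely for illustration):
>>
>>   #include <cstdio>
>>
>>   // Original cost function in the *unscaled* parameter space:
>>   //   f(p) = (p - 3)^2,   df/dp = 2 (p - 3)
>>   double f( double p )    { return ( p - 3.0 ) * ( p - 3.0 ); }
>>   double dfdp( double p ) { return 2.0 * ( p - 3.0 ); }
>>
>>   int main()
>>   {
>>     const double scale = 100.0; // arbitrary scale factor
>>     const double q     = 250.0; // parameter as seen by vnl: q = scale * p
>>     const double p     = q / scale;
>>
>>     // Gradient with respect to q (what vnl needs), via the chain rule:
>>     //   dF/dq = df/dp * dp/dq = dfdp(p) / scale
>>     const double analytic = dfdp( p ) / scale;
>>
>>     // Finite-difference check performed directly in the scaled space:
>>     const double h = 1e-3;
>>     const double numeric =
>>       ( f( ( q + h ) / scale ) - f( ( q - h ) / scale ) ) / ( 2.0 * h );
>>
>>     std::printf( "analytic = %g, finite difference = %g\n", analytic, numeric );
>>     return 0;
>>   }
>>
>> The finite difference computed in the scaled space only matches the
>> analytic gradient after the latter has been divided by the scale,
>> which is exactly what the wrapper has to do before handing the
>> derivative back to vnl.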
>>
>>
>>> The motivation for this scaling operation was to extend the
>>> functionality of VNL optimizers to be usable with domains outside
>>> of the cannonical [-1:+1] domain.
>>>
>>> Since these functions are used for the purpose of optimization,
>>> we intended simply to shrink (or expand) the domain of the original
>>> problem in order to map to the [-1:+1] that the algorithmic
>>> implementation of vnl optimizers expect.
>>
>> I couldn't agree more.
>>
>>
>>> Take for example,
>>> the itkAmoebaOptimizer.cxx:
>>
>> I would consider the AmoebaOptimizer a bad example here, as it
>> is not meant to use any derivative information.
>>
>>
>>>  In its call to GetValue() in line 84, it uses the scales
>>>  to multiply the parameters *before* they are passed to
>>>  the cost function.
>>>
>>>    84:      parameters[i] *= scales[i];
>>>
>>>  In its call to StartOptimization(), before invoking the
>>>  VNL optimizer, we scale the parameters of the transform
>>>
>>>   218:       parameters[i] *= scales[i];
>>>
>>>  then we invoke the VNL optimizer
>>>
>>>   227:      m_VnlOptimizer->minimize( parameters );
>>>
>>>  and then we restore the scale of the transform parameters
>>>
>>>  242:       parameters[i] /= scales[i];
>>>
>>>
>>> The goal is to create a situation where the VNL optimizer
>>> "thinks" that it is optimizing a function whose domain is close
>>> to [-1:+1], as VNL algorithms expect or assume.
>>
>> Sure, but it should be exactly the same for the derivatives: since vnl
>> sees a function of the scaled parameters, the derivative handed back
>> to it must also be taken with respect to those scaled parameters.
>>
>>
>>>    At this point, we could make this correction,
>>>    but this may require recalibrating the parameters
>>>    of most of the image registration examples and tests
>>>    that use VNL optimizers...
>>
>> You can actually try the vnl_lbfgs optimizer and let it check the
>> derivatives by calling:
>> optimizer->set_check_derivatives(1);
>>
>> You'll see it complaining...
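>>
>> For reference, the kind of setup that triggers this check looks roughly
>> like the following (sketch from memory; costFunctionAdaptor stands for
>> the itk::SingleValuedVnlCostFunctionAdaptor instance, a vnl_cost_function
>> wrapping the ITK cost function with the scales already set on it, and
>> initialPosition for the starting parameters already multiplied by the
>> scales):
>>
>>   #include <vnl/algo/vnl_lbfgs.h>
>>   #include <vnl/vnl_vector.h>
>>
>>   vnl_lbfgs optimizer( costFunctionAdaptor );
>>   optimizer.set_check_derivatives( 1 ); // compare analytic and numeric gradients
>>
>>   vnl_vector<double> x = initialPosition;
>>   optimizer.minimize( x );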
>>
>>
>>> Have you run an experimental build with this change
>>> to evaluate the impact of the modification ?
>>
>> I am currently evaluating it. With the patch, Testing/Code/Numerics
>> still compiles and runs without any failures.
>>
>> Furthermore my code now works.
>>
>>
>>> The operation is not done in the same way either across optimizers.
>>>
>>> For example,
>>> if you look at the GradientDescent and RegularStepGradientDescent
>>> optimizers, the scales are applied only to the *resulting* gradient
>>> vector returned from the cost function, and then, the resulting vector
>>> is used during the computation of the step to be taken. In these two
>>> optimizers, the scale factor is not used for feeding the values
>>> inside the cost function.
>>
>> Yes but these are not vnl optimizers.
>>
>> I would actually vote for using a scaling wrapper for all the
>> optimizers. Then the scale would be used in a consistent manner.
>> Something like that is already done in elastix which is built on top
>> of ITK:
>> http://elastix.isi.uu.nl/
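>>
>> To make the wrapper idea concrete, it could look roughly like the
>> outline below (untested sketch; the class name and members are made up
>> for illustration, only the scaling logic matters):
>>
>>   #include "itkSingleValuedCostFunction.h"
>>   #include "itkArray.h"
>>
>>   // Decorates an existing cost function so that it is evaluated in the
>>   // scaled parameter space and returns a gradient that is consistent
>>   // with that space (chain rule).
>>   class ScaledCostFunctionWrapper : public itk::SingleValuedCostFunction
>>   {
>>   public:
>>     typedef ScaledCostFunctionWrapper     Self;
>>     typedef itk::SingleValuedCostFunction Superclass;
>>     typedef itk::SmartPointer<Self>       Pointer;
>>     itkNewMacro( Self );
>>
>>     void SetCostFunction( Superclass * cf )        { m_CostFunction = cf; }
>>     void SetScales( const itk::Array<double> & s ) { m_Scales = s; }
>>
>>     unsigned int GetNumberOfParameters() const
>>       { return m_CostFunction->GetNumberOfParameters(); }
>>
>>     MeasureType GetValue( const ParametersType & scaledParameters ) const
>>       {
>>       ParametersType p( scaledParameters );
>>       for ( unsigned int i = 0; i < p.Size(); ++i ) { p[i] /= m_Scales[i]; }
>>       return m_CostFunction->GetValue( p );
>>       }
>>
>>     void GetDerivative( const ParametersType & scaledParameters,
>>                         DerivativeType & derivative ) const
>>       {
>>       ParametersType p( scaledParameters );
>>       for ( unsigned int i = 0; i < p.Size(); ++i ) { p[i] /= m_Scales[i]; }
>>       m_CostFunction->GetDerivative( p, derivative );
>>       // Same chain-rule correction as in the vnl adaptor patch.
>>       for ( unsigned int i = 0; i < derivative.Size(); ++i )
>>         { derivative[i] /= m_Scales[i]; }
>>       }
>>
>>   protected:
>>     Superclass::Pointer m_CostFunction;
>>     itk::Array<double>  m_Scales;
>>   };
>>
>> The optimizer would then only ever see the wrapper, so the scales would
>> be applied in a single place regardless of which optimizer is used.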
>>
>> As a side note, I definitely think that an extensive evaluation of the
>> optimizers available in ITK would be a great contribution. I have been
>> struggling with most of them and haven't yet found one that suits my
>> needs. The only one that came close was the FRPROptimizer, which
>> included the Numerical Recipes code (ITK 3.0, I think). Alas, it was
>> not usable because of copyright issues.
>>
>> This explains why I am now moving to using the vnl optimizers directly...
>>
>> Hope this helps,
>> Tom
>>
>

