[Insight-users] Termination condition for the optimizer
Luis Ibanez
luis.ibanez at kitware.com
Tue Jul 7 18:37:28 EDT 2009
Hi Sharath,
The assumption that an Optimizer will stop when the value of the Metric
peaks is incorrectly based on the idea that you are performing
optimization in only one dimension.
Unfortunately, when you are optimizing the parameters of an Affine
transform, you are walking in a space of 12 dimensions, and the logic
of 1D doesn't apply there.
This is because the variety of paths that lead to (and away from)
that peak is so much greater in a 12-dimensional space than in 1D.
It is normal (although admittedly annoying) to see the Metric
values oscillate up and down when you get close to a local extremum
of the metric. The behavior, of course, depends on the optimizer
type and the parameters that you have set on it.
0) Did you call the MaximizeOn() method on the optimizer?
1) The GradientDescentOptimizer has only two stopping conditions:
a) Maximum number of iterations, and
b) The metric throwing an exception
In other words, it doesn't have a notion of convergence.
It is normal for it to run until it exhausts all the iterations
that you allotted.
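To make the point concrete, here is a minimal sketch (plain Python, not ITK code) of a plain gradient descent loop: note that there is no convergence test anywhere, so it always runs its full iteration budget unless the cost function throws.

```python
def gradient_descent(grad, x, learning_rate=0.1, max_iterations=100):
    """Plain gradient descent: the ONLY stopping condition is the
    iteration budget (an exception raised by grad would be the other)."""
    for _ in range(max_iterations):
        g = grad(x)
        x = x - learning_rate * g   # no convergence check anywhere
    return x

# Example: minimize f(x) = (x - 3)^2, whose gradient is 2*(x - 3).
result = gradient_descent(lambda x: 2 * (x - 3), x=0.0)
```

The loop walks all 100 iterations even though it is numerically at the minimum long before that, which is exactly the behavior described above.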
2) The RegularStepGradientDescent has the following stopping
conditions:
GradientMagnitudeTolerance = 1,
StepTooSmall = 2,
ImageNotAvailable = 3,
CostFunctionError = 4,
MaximumNumberOfIterations = 5,
Unknown = 6
Ideally, you want the Optimizer to stop due to
a) GradientMagnitudeTolerance or
b) StepTooSmall
(a) means that the cost function has reached a plateau
(or that you are at an extremum), but
it is YOUR responsibility to provide the Tolerance
value that you consider to be the slope of a
plateau.
(b) means that the optimizer is performing steps that
are too small to make a difference for the precision
that you are looking for.
Unfortunately, you can stop prematurely due to (b)
if you have a noisy cost function in which you have
bounced back and forth a lot. Using a high value
for the relaxation factor (something close to 1.0) will
help prevent the optimizer from terminating prematurely.
Claiming that you get incorrect results based solely on
the name of the optimizer is very misleading.
The parameters that you set in an optimizer are an
essential part of the problem. Any good optimizer can
be made to fail by simply making a poor choice of
parameters.
So, if you want to report problems with the optimizers
and expect realistic help, you should include
the parameters that you are using with them.
Simply saying that you should use optimizer "B" instead
of optimizer "A" is not going to help you at all if
you don't take the time to learn to tune the parameters
of the optimizer.
My strong suggestion is that you start by using one
of the Examples in ITK along with some of the images
that are provided with the examples.
Then play with the parameters of the optimizer in
*THESE* same examples, so that you get familiar with
the effects of modifying any of the parameters.
In this way, you will be starting from conditions
that have been tuned correctly, and you will be
able to recognize the behavior of the cost function
as you change the parameters.
Regards,
Luis
-------------------------
Sharath Venkatesha wrote:
> Hi,
>
> I am experimenting with the GradientDescentStepOptimizer with the Mutual Information Histogram metric, using an affine transformation. I observe that the optimizer does not stop when the value of the metric peaks. I have a couple of questions.
>
> (a) The metric I am using is to be maximized, and hence I expect the Optimizer to terminate at an iteration where the metric value is maximum for the given range. Am I correct? I do get a good bell-shaped curve for the metric values. Is it correct to put my own stopping condition at this place? I observe that the change in the metric value between iterations reduces greatly.
>
> (b) The Gradient Magnitude value (the sum of the squared gradient components over all dimensions) does not have a defined behavior. I expect this value to be minimum when the metric peaks, but observe that it gradually decreases over the iterations. Can you please explain it?
>
> (c) Can you please suggest any other optimizer that I can try? (I have tried GradientStep, GradientDescent and Amoeba and got different, but incorrect, results.)
>
> (d) I am surely having a problem with the step size (for GradientDescentStep). At low values, I reach the max number of iterations and fail to reach the optimum, and at high values, I go way off in the wrong direction. Any suggestions?
>
> (e) Relaxation Factor: I see that when the sign of the gradient changes (measured by the value of the scalarProduct in itkRegularStepGradientDescentBaseOptimizer.cxx), the step size is multiplied by the relaxation factor; but at this step we have already crossed the peak and are going downhill, yet we do not change the direction. Am I missing something?
>
> Below is the log of my metric values
>
>
> MaximumStepLength : 0.1
> MinimumStepLength : 0.0001
> .....
>
> Format:
> Gradient_Magnitude Tolerance_value Current_step_Length
> Iter_Number Metric_Value [Affine Parameters] Angle_in_Degrees Change_in_Metric_value Change_in_angle
>
> .......
> ....
>
> GradMag = 7.494 Tol =0.0001 StepLen = 0.1
> 31 0.1749 [0.5827, -0.06824, 0.06624, 0.6027, -856.8, -931.7] Angle:6.473 Metr:0.0005967 ADiff:-0.02686
> GradMag = 8.051 Tol =0.0001 StepLen = 0.1
> 32 0.1757 [0.5827, -0.06827, 0.06578, 0.6035, -856.7, -931.7] Angle:6.448 Metr:0.0007982 ADiff:-0.02441
> GradMag = 7.914 Tol =0.0001 StepLen = 0.1
> 33 0.1766 [0.5826, -0.06834, 0.06529, 0.6043, -856.6, -931.7] Angle:6.423 Metr:0.0009113 ADiff:-0.02468
> GradMag = 7.315 Tol =0.0001 StepLen = 0.1
> 34 0.1773 [0.5826, -0.06841, 0.06477, 0.6052, -856.5, -931.7] Angle:6.398 Metr:0.0006155 ADiff:-0.02567
> GradMag = 7.99 Tol =0.0001 StepLen = 0.1
> 35 0.1778 [0.5827, -0.06849, 0.06428, 0.606, -856.4, -931.6] Angle:6.374 Metr:0.0004911 ADiff:-0.02425
> GradMag = 7.949 Tol =0.0001 StepLen = 0.1
> 36 0.1783 [0.5827, -0.06857, 0.06378, 0.6068, -856.4, -931.6] Angle:6.349 Metr:0.000521 ADiff:-0.02433
> GradMag = 7.78 Tol =0.0001 StepLen = 0.1
> 37 0.1787 [0.5827, -0.06868, 0.06327, 0.6076, -856.3, -931.5] Angle:6.326 Metr:0.0004448 ADiff:-0.02362
> GradMag = 7.599 Tol =0.0001 StepLen = 0.1
> 38 0.179 [0.5828, -0.06881, 0.06276, 0.6085, -856.2, -931.5] Angle:6.302 Metr:0.0003036 ADiff:-0.02324
> GradMag = 7.277 Tol =0.0001 StepLen = 0.1
> 39 0.1795 [0.5829, -0.06896, 0.0622, 0.6093, -856.1, -931.5] Angle:6.278 Metr:0.0004738 ADiff:-0.02439
> GradMag = 8.089 Tol =0.0001 StepLen = 0.1
> 40 0.1797 [0.5829, -0.0691, 0.06171, 0.6101, -856, -931.4] Angle:6.257 Metr:0.0002499 ADiff:-0.02088
> GradMag = 8.046 Tol =0.0001 StepLen = 0.1
> 41 0.18 [0.583, -0.06926, 0.06122, 0.6108, -855.9, -931.4] Angle:6.237 Metr:0.0002045 ADiff:-0.01961
> GradMag = 8.101 Tol =0.0001 StepLen = 0.1
> 42 0.1802 [0.5831, -0.06943, 0.06075, 0.6116, -855.8, -931.4] Angle:6.219 Metr:0.0002648 ADiff:-0.01865
> GradMag = 7.21 Tol =0.0001 StepLen = 0.1
> 43 0.1803 [0.5831, -0.06964, 0.06021, 0.6124, -855.7, -931.3] Angle:6.199 Metr:6.091e-005 ADiff:-0.02007
> GradMag = 8.228 Tol =0.0001 StepLen = 0.1
> 44 0.1805 [0.5832, -0.06986, 0.05974, 0.6131, -855.6, -931.3] Angle:6.183 Metr:0.0002261 ADiff:-0.01615
> GradMag = 7.485 Tol =0.0001 StepLen = 0.1
> 45 0.1808 [0.5833, -0.07012, 0.05922, 0.6139, -855.5, -931.2] Angle:6.166 Metr:0.0002553 ADiff:-0.01641
> GradMag = 5.909 Tol =0.0001 StepLen = 0.1
> 46 0.1806 [0.5834, -0.07048, 0.05857, 0.6148, -855.4, -931.2] Angle:6.147 Metr:-0.0002007 ADiff:-0.01916
> GradMag = 6.526 Tol =0.0001 StepLen = 0.1
> 47 0.1803 [0.5835, -0.07083, 0.058, 0.6156, -855.3, -931.1] Angle:6.132 Metr:-0.000293 ADiff:-0.01502
> GradMag = 7.555 Tol =0.0001 StepLen = 0.1
> 48 0.1803 [0.5836, -0.07113, 0.05752, 0.6163, -855.3, -931.1] Angle:6.12 Metr:3.284e-006 ADiff:-0.01187
> GradMag = 6.259 Tol =0.0001 StepLen = 0.1
> 49 0.1803 [0.5837, -0.0715, 0.05695, 0.617, -855.2, -931.1] Angle:6.106 Metr:-3.978e-006 ADiff:-0.01394
> GradMag = 6.12 Tol =0.0001 StepLen = 0.1
> 50 0.1803 [0.5837, -0.07192, 0.05638, 0.6177, -855.1, -931] Angle:6.095 Metr:6.792e-005 ADiff:-0.01128
> GradMag = 5.317 Tol =0.0001 StepLen = 0.1
> 51 0.1803 [0.5839, -0.07242, 0.05577, 0.6185, -855, -931] Angle:6.086 Metr:-2.758e-005 ADiff:-0.009191
> GradMag = 5.665 Tol =0.0001 StepLen = 0.1
> 52 0.1802 [0.584, -0.07289, 0.05525, 0.6191, -854.9, -930.9] Angle:6.079 Metr:-0.0001544 ADiff:-0.006347
> GradMag = 5.083 Tol =0.0001 StepLen = 0.1
> 53 0.1802 [0.5842, -0.07338, 0.05469, 0.6198, -854.8, -930.9] Angle:6.072 Metr:3.066e-006 ADiff:-0.007603
> GradMag = 4.135 Tol =0.0001 StepLen = 0.1
> 54 0.1802 [0.5844, -0.07398, 0.05404, 0.6206, -854.7, -930.8] Angle:6.064 Metr:2.267e-005 ADiff:-0.007948
> GradMag = 3.486 Tol =0.0001 StepLen = 0.1
> 55 0.1797 [0.5848, -0.07466, 0.05324, 0.6215, -854.7, -930.8] Angle:6.052 Metr:-0.0005252 ADiff:-0.01135
> GradMag = 4.202 Tol =0.0001 StepLen = 0.1
> 56 0.1792 [0.585, -0.07522, 0.05258, 0.6222, -854.6, -930.7] Angle:6.043 Metr:-0.0004813 ADiff:-0.009771
> GradMag = 3.866 Tol =0.0001 StepLen = 0.1
> 57 0.1789 [0.5853, -0.07577, 0.05185, 0.6229, -854.5, -930.7] Angle:6.029 Metr:-0.0002625 ADiff:-0.01337
> GradMag = 3.972 Tol =0.0001 StepLen = 0.1
> 58 0.1786 [0.5856, -0.07628, 0.05115, 0.6235, -854.4, -930.6] Angle:6.017 Metr:-0.0003571 ADiff:-0.01262
> GradMag = 4.536 Tol =0.0001 StepLen = 0.1
> 59 0.1781 [0.5858, -0.07669, 0.05057, 0.6239, -854.3, -930.5] Angle:6.005 Metr:-0.0004517 ADiff:-0.0113
> GradMag = 3.345 Tol =0.0001 StepLen = 0.1
> 60 0.1777 [0.5861, -0.07723, 0.04978, 0.6244, -854.3, -930.4] Angle:5.99 Metr:-0.000372 ADiff:-0.0155
> -....
>
>
>
>
> Thanks for any clues,
>
> Sharath Venkatesha
>
>
>