Normalized Correlation Using FFT With Mask Images For Input Images


Normalized correlation using the FFT with optional mask images for both input images.











Maximum location: [45, 44]
Maximum location fixed: [5, 6]
Maximum value: 1



#include "itkImage.h"
#include "itkImageFileReader.h"
#include "itkFFTNormalizedCorrelationImageFilter.h"
#include "itkRegionOfInterestImageFilter.h"
#include "itkImageKernelOperator.h"
#include "itkRescaleIntensityImageFilter.h"
#include "itkImageFileWriter.h"
#include "itkMinimumMaximumImageCalculator.h"

#include <iostream>
#include <string>

using ImageType = itk::Image<unsigned char, 2>;
using FloatImageType = itk::Image<float, 2>;
using MaskType = itk::Image<unsigned char, 2>;
} // namespace

static void
CreateMask(MaskType * const mask);
static void
CreateImage(ImageType::Pointer image, const itk::Index<2> & cornerOfSquare);

template <typename TImage>
WriteImage(TImage * const image, const std::string & filename);

main(int, char *[])
  itk::Index<2> offset;
  offset[0] = 5;
  offset[1] = 6;

  // Setup mask
  MaskType::Pointer mask = MaskType::New();

  ImageType::Pointer fixedImage = ImageType::New();
  itk::Index<2>      cornerOfFixedSquare;
  cornerOfFixedSquare[0] = 3;
  cornerOfFixedSquare[1] = 8;
  CreateImage(fixedImage, cornerOfFixedSquare);
  WriteImage(fixedImage.GetPointer(), "fixedImage.png");

  ImageType::Pointer movingImage = ImageType::New();
  itk::Index<2>      cornerOfMovingSquare;
  cornerOfMovingSquare[0] = cornerOfFixedSquare[0] + offset[0];
  cornerOfMovingSquare[1] = cornerOfFixedSquare[1] + offset[1];
  CreateImage(movingImage, cornerOfMovingSquare);
  WriteImage(movingImage.GetPointer(), "movingImage.png");

  // Perform normalized correlation
  using CorrelationFilterType = itk::FFTNormalizedCorrelationImageFilter<ImageType, FloatImageType>;
  CorrelationFilterType::Pointer correlationFilter = CorrelationFilterType::New();
  // correlationFilter->SetFixedImageMask(mask);

  WriteImage(correlationFilter->GetOutput(), "correlation.mha");

  using RescaleFilterType = itk::RescaleIntensityImageFilter<FloatImageType, ImageType>;
  RescaleFilterType::Pointer rescaleFilter = RescaleFilterType::New();
  WriteImage(rescaleFilter->GetOutput(), "correlation.png");

  using MinimumMaximumImageCalculatorType = itk::MinimumMaximumImageCalculator<FloatImageType>;
  MinimumMaximumImageCalculatorType::Pointer minimumMaximumImageCalculatorFilter =

  itk::Index<2> maximumCorrelationPatchCenter = minimumMaximumImageCalculatorFilter->GetIndexOfMaximum();

  itk::Size<2> outputSize = correlationFilter->GetOutput()->GetLargestPossibleRegion().GetSize();

  itk::Index<2> maximumCorrelationPatchCenterFixed;
  maximumCorrelationPatchCenterFixed[0] = outputSize[0] / 2 - maximumCorrelationPatchCenter[0];
  maximumCorrelationPatchCenterFixed[1] = outputSize[1] / 2 - maximumCorrelationPatchCenter[1];

  std::cout << "Maximum location: " << maximumCorrelationPatchCenter << std::endl;
  std::cout << "Maximum location fixed: " << maximumCorrelationPatchCenterFixed
            << std::endl; // This is the value we expect!
  std::cout << "Maximum value: " << minimumMaximumImageCalculatorFilter->GetMaximum()
            << std::endl; // If the images can be perfectly aligned, the value is 1

  return EXIT_SUCCESS;

CreateImage(ImageType::Pointer image, const itk::Index<2> & cornerOfSquare)
  ImageType::IndexType start;

  ImageType::SizeType size;

  ImageType::RegionType region(start, size);


  itk::ImageRegionIterator<ImageType> imageIterator(image, region);

  ImageType::IndexValueType squareSize = 8;

  while (!imageIterator.IsAtEnd())
    if (imageIterator.GetIndex()[0] > cornerOfSquare[0] &&
        imageIterator.GetIndex()[0] < cornerOfSquare[0] + squareSize &&
        imageIterator.GetIndex()[1] > cornerOfSquare[1] && imageIterator.GetIndex()[1] < cornerOfSquare[1] + squareSize)


template <typename TImage>
WriteImage(TImage * const image, const std::string & filename)
  using WriterType = itk::ImageFileWriter<TImage>;
  typename WriterType::Pointer writer = WriterType::New();

CreateMask(MaskType * const mask)
  ImageType::IndexType start;

  ImageType::SizeType size;

  ImageType::RegionType region(start, size);

  mask->FillBuffer(255); // Make the whole mask "valid"

  // unsigned int squareSize = 8;
  ImageType::IndexValueType squareSize = 3;

  itk::Index<2> cornerOfSquare = { { 3, 8 } };

  // Remove pixels from the mask in a small square. The correlationw will not be computed at these pixels.
  itk::ImageRegionIterator<MaskType> maskIterator(mask, region);

  while (!maskIterator.IsAtEnd())
    if (maskIterator.GetIndex()[0] > cornerOfSquare[0] && maskIterator.GetIndex()[0] < cornerOfSquare[0] + squareSize &&
        maskIterator.GetIndex()[1] > cornerOfSquare[1] && maskIterator.GetIndex()[1] < cornerOfSquare[1] + squareSize)


Classes demonstrated

template<typename TInputImage, typename TOutputImage, typename TMaskImage = TInputImage>
class MaskedFFTNormalizedCorrelationImageFilter : public itk::ImageToImageFilter<TInputImage, TOutputImage>

Calculate masked normalized cross correlation using FFTs.

This filter calculates the masked normalized cross correlation (NCC) of two images under masks using FFTs instead of spatial correlation. It is much faster than spatial correlation for reasonably large structuring elements. This filter is not equivalent to simply masking the images first and then correlating them; the latter approach yields incorrect results because the zeros in the images still affect the metric in the correlation process. This filter implements the masked NCC correctly so that the masked-out regions are completely ignored. The fundamental difference is described in detail in the references below. If the masks are set to images of all ones, the result of this filter is the same as standard NCC.

Inputs: Two images are required as inputs, fixedImage and movingImage, and two are optional, fixedMask and movingMask. In the context of correlation, inputs are often defined as: “image” and “template”. In this filter, the fixedImage plays the role of the image, and the movingImage plays the role of the template. However, this filter is capable of correlating any two images and is not restricted to small movingImages (templates). In the fixedMask and movingMask, non-zero positive values indicate locations of useful information in the corresponding image, whereas zero and negative values indicate locations that should be masked out (ignored). Internally, the masks are converted to have values of only 0 and 1. For each optional mask that is not set, the filter internally creates an image of ones, which is equivalent to not masking the image. Thus, if both masks are not set, the result will be equivalent to unmasked NCC. For example, if only a mask for the fixed image is needed, the movingMask can either not be set or can be set to an image of ones.

Optional parameters: The RequiredNumberOfOverlappingPixels enables the user to specify the minimum number of voxels of the two masks that must overlap; any location in the correlation map that results from fewer than this number of voxels will be set to zero. Larger values zero-out pixels on a larger border around the correlation image. Thus, larger values remove less stable computations but also limit the capture range. If RequiredNumberOfOverlappingPixels is set to 0, the default, no zeroing will take place.

The RequiredFractionOfOverlappingPixels enables the user to specify a fraction of the maximum number of overlapping pixels that need to overlap; any location in the correlation map that results from fewer than the product of this fraction and the internally computed maximum number of overlapping pixels will be set to zero. The value ranges between 0.0 and 1.0. This is very useful when the user does does not know beforehand the maximum number of pixels of the masks that will overlap. For example, when the masks have strange shapes, it is difficult to predict how the correlation of the masks will interact and what the maximum overlap will be. It is also useful when the mask shapes or sizes change because it is relative to the internally computed maximum of the overlap. Larger values zero-out pixels on a larger border around the correlation image. Thus, larger values remove less stable computations but also limit the capture range. Experiments have shown that a value between 0.1 and 0.6 works well for images with significant overlap and between 0.05 and 0.1 for images with little overlap (such as in stitching applications). If RequiredFractionOfOverlappingPixels is set to 0, the default, no zeroing will take place.

The user can either specify RequiredNumberOfOverlappingPixels or RequiredFractionOfOverlappingPixels (or both or none). Internally, the number of required pixels resulting from both of these methods is calculated and the one that gives the largest number of pixels is chosen. Since these both default to 0, if a user only sets one, the other is ignored.

Image size: fixedImage and movingImage need not be the same size, but fixedMask must be the same size as fixedImage, and movingMask must be the same size as movingImage. Furthermore, whereas some algorithms require that the “template” be smaller than the “image” because of errors in the regions where the two are not fully overlapping, this filter has no such restriction.

Image spacing: Since the computations are done in the pixel domain, all input images must have the same spacing.

Outputs; The output is an image of RealPixelType that is the masked NCC of the two images and its values range from -1.0 to 1.0. The size of this NCC image is, by definition, size(fixedImage) + size(movingImage) - 1.

Example filter usage:

using FilterType = itk::MaskedFFTNormalizedCorrelationImageFilter< ShortImageType, DoubleImageType >;
FilterType::Pointer filter = FilterType::New();
filter->SetFixedImage( fixedImage );
filter->SetMovingImage( movingImage );
filter->SetFixedImageMask( fixedMask );
filter->SetMovingImageMask( movingMask );

References: 1) D. Padfield. “Masked object registration in the Fourier domain.” Transactions on

Image Processing. 2) D. Padfield. “Masked FFT registration”. In Proc. Computer Vision and Pattern Recognition, 2010.

The pixel type of the output image must be of real type (float or double). ConceptChecking is used to enforce the output pixel type. You will get a compilation error if the pixel type of the output image is not float or double.


: Dirk Padfield, GE Global Research,

ITK Sphinx Examples:

See itk::MaskedFFTNormalizedCorrelationImageFilter for additional documentation.