Proposals:Explicit Instantiation: Difference between revisions

This document derives the motivation and implementation of explicit
template instantiation support in ITK.
==Overview==
Most of ITK is implemented using class templates in order to
provide users with tremendous flexibility on the types of data that
can be processed.  Here we investigate the mechanism used by the build
system to transform these templates into executable code.
Consider an example ITK class template defined in itkFoo.h:
<pre>
// itkFoo.h
namespace itk
{
  template <class T>
  class Foo
  {
  public:
    void MethodA();
  };
}
</pre>
The template member MethodA is defined in itkFoo.txx:
<pre>
// itkFoo.txx
namespace itk
{
  template <class T>
  void Foo<T>::MethodA()
  {
  }
}
</pre>
When the user writes code such as
<pre>
#include "itkFoo.h"
int main()
{
  itk::Foo<int> foo;
  foo.MethodA();
  return 0;
}
</pre>
a reference to the symbol "void itk::Foo<int>::MethodA()" is created.
The basic problem we face is how to create the template instantiation
providing this symbol.
==Implicit Template Instantiation==
One way to provide the symbol is called "implicit template
instantiation".  If the user were to use
<pre>
#include "itkFoo.txx"
</pre>
then the compiler will have a definition of the template member
MethodA and create a copy of the symbol in the same object file that
contains main.  In this case the method implementation is implicitly
instantiated by the compiler because the definition is available and
the method is called.
Implicit instantiation has the advantage that user code may create any
instantiation of the template Foo that it needs without worrying about
where the symbol definitions will reside.  The drawback is that every
source file that references the method will create its own copy of the
symbol.  Code for MethodA will be compiled and stored in every object
file that needs it, but the linker will choose only one copy and throw
out the rest.  This means that the work the compiler did in most of
the object files was wasted.
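The behaviour described above can be reproduced in a single translation unit. The following is a minimal sketch, assuming a toy class template `sketch::Foo` (a hypothetical stand-in, not ITK's own class) whose member definition is visible at the point of use. The helper functions `UseInt`, `UseFloat`, and `SelfCheck` are illustrative names only.

```cpp
#include <cassert>

namespace sketch
{
  // Hypothetical stand-in for itk::Foo, as it would appear in "itkFoo.h".
  template <class T>
  class Foo
  {
  public:
    T MethodA();
  };

  // Member definition, as it would appear in "itkFoo.txx".  Because it is
  // visible here, any use of Foo<T>::MethodA is implicitly instantiated.
  template <class T>
  T Foo<T>::MethodA()
  {
    return T(42);
  }

  // Each call below makes the compiler instantiate MethodA for that type
  // in this translation unit, without any explicit request.
  inline int UseInt() { Foo<int> foo; return foo.MethodA(); }
  inline float UseFloat() { Foo<float> foo; return foo.MethodA(); }

  // Small self-check exercising both implicit instantiations.
  inline void SelfCheck()
  {
    assert(UseInt() == 42);
    assert(UseFloat() == 42.0f);
  }
}
```

Every source file that includes the definition and calls `MethodA` would repeat this instantiation work, which is the cost the section describes.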
==Explicit Template Instantiation==
An alternative way to provide the symbol "void itk::Foo<int>::MethodA()"
is through "explicit template instantiation".  Assume the user did not
include itkFoo.txx.  When the compiler sees a reference to MethodA
it does not know how to create the symbol so it instead places in the object
file an unresolved symbol reference.  In order to provide the symbol to
the linker we need to explicitly create a copy of it somewhere.  We can
do this by creating an explicit instantiation in a separate source file.
Consider a source file called "itkFoo+int-.cxx" that contains the
following code:
<pre>
#include "itkFoo.h"
#include "itkFoo.txx"
namespace itk
{
  template class Foo<int>; // explicit template instantiation
}
</pre>
When the compiler builds this source file it will have the definitions
of all the members of Foo<int> because itkFoo.txx was included.  The
explicit template instantiation syntax tells the compiler to
instantiate a copy of every symbol in the template.  Now when the user
builds a program referencing "void itk::Foo<int>::MethodA()" the
linker will find it in the object file providing the explicit
instantiation.
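As a compact illustration, the header, the .txx file, and the instantiation file can be collapsed into one translation unit. This is a sketch only: `sketch::Foo`, `MethodB`, and `SelfCheck` are hypothetical names, and `MethodB` is added to show that the explicit instantiation covers members that are never called.

```cpp
#include <cassert>

namespace sketch
{
  // Stand-in for the declaration in "itkFoo.h".
  template <class T>
  class Foo
  {
  public:
    T MethodA();
    T MethodB(); // not called by SelfCheck below
  };

  // Stand-in for the definitions in "itkFoo.txx".
  template <class T> T Foo<T>::MethodA() { return T(1); }
  template <class T> T Foo<T>::MethodB() { return T(2); }

  // Explicit template instantiation: every member of Foo<int> defined
  // above is instantiated here, whether or not this file uses it.
  template class Foo<int>;

  inline void SelfCheck()
  {
    Foo<int> foo;
    assert(foo.MethodA() == 1);
  }
}
```

In a real project the last part would live in its own source file, so that other files could include only the header and rely on the linker to find the symbols.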
==Extern Template Instantiation==
Now that we have provided the instantiation itk::Foo<int> explicitly
users may use this instantiation by including only itkFoo.h.  However,
say the user now wants to use itk::Foo<float> as well:
<pre>
#include "itkFoo.h"
int main()
{
  itk::Foo<int> fooI;
  fooI.MethodA();
  itk::Foo<float> fooF;
  fooF.MethodA();
  return 0;
}
</pre>
The compiler will not see the definition of MethodA and will produce
an unresolved symbol reference.  The linker will now fail to resolve
the symbol because it is not provided explicitly.  A user might fix
this by including itkFoo.txx to get the definition.  The problem is
now the compiler will see the definition for the template MethodA and
instantiate both itk::Foo<int>::MethodA and itk::Foo<float>::MethodA!
The linker will now resolve itk::Foo<float>::MethodA but there will be
two copies of itk::Foo<int>::MethodA and one will be thrown out.
Again, the compiler has done work that is wasted.
How can this problem be avoided?  One solution is to not allow the
user to include itkFoo.txx directly but instead require that his or
her project manually create an itkFoo+float-.cxx file containing the
instantiation for itk::Foo<float>.  While this may work it will be
confusing for many users and require a lot of extra work.  Another
solution is to take advantage of a compiler-specific extension known
as "extern template instantiation" (since standardized in C++11 as an
explicit instantiation declaration).  This extension is provided on at
least the MSVC, GCC, and Intel compilers.  Consider the following code
added to the bottom of itkFoo.h:
<pre>
namespace itk
{
  extern template class Foo<int>; // extern template instantiation
}
</pre>
This instructs the compiler to NOT instantiate any members of
itk::Foo<int> even if the definition is available.  Now, if the user
includes itkFoo.txx and uses both itk::Foo<int> and itk::Foo<float>
the compiler will create symbols for itk::Foo<float> but leave those
for itk::Foo<int> undefined.  The linker will then come along and find
one copy of each, and no work will have been wasted.
A similar extension is provided by the SGI MIPSpro compiler.  The above
extern instantiation may be written on this compiler as
<pre>
namespace itk
{
#pragma do_not_instantiate class Foo<int>
}
</pre>
which tells it not to instantiate this template.
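The interaction between the extern declaration and an explicit instantiation can be sketched in one file. This assumes a compiler that accepts `extern template` (standard since C++11) and uses hypothetical names (`sketch::Foo`, `UseInt`, `UseFloat`). The explicit instantiation at the end would normally sit in a separate source file; it is placed here only so the sketch links on its own.

```cpp
#include <cassert>

namespace sketch
{
  template <class T>
  class Foo
  {
  public:
    T MethodA();
  };

  // The definition is visible, as if the .txx file had been included.
  template <class T> T Foo<T>::MethodA() { return T(7); }

  // Extern template instantiation: do not implicitly instantiate the
  // members of Foo<int> here; the symbols are provided elsewhere.
  extern template class Foo<int>;

  // Foo<int>::MethodA refers to the externally provided instantiation.
  inline int UseInt() { Foo<int> foo; return foo.MethodA(); }

  // Foo<float> has no extern declaration, so it is implicitly instantiated.
  inline float UseFloat() { Foo<float> foo; return foo.MethodA(); }

  // Stand-in for the separate instantiation file.  An explicit
  // instantiation may follow the extern declaration in the same file.
  template class Foo<int>;

  inline void SelfCheck()
  {
    assert(UseInt() == 7);
    assert(UseFloat() == 7.0f);
  }
}
```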
==DLL Symbol Resolution on Windows==
When an explicit template instantiation is provided by a shared
library we must ensure that the symbols are available for use outside
the library.  On UNIX systems this is automatic.  On Windows systems
we need to explicitly tell the compiler that the symbols provided by
an explicit template instantiation are to be exported from the DLL.
This can be achieved by adding a dllexport decoration to the explicit
template instantiation line:
<pre>
#include "itkFoo.h"
#include "itkFoo.txx"
namespace itk
{
  template class __declspec(dllexport) Foo<int>;
}
</pre>
Similarly, when using an explicit instantiation from another source
file we must tell the compiler that we wish to import the symbols from
a DLL.  This can be achieved by adding a dllimport decoration to the
extern template instantiation line:
<pre>
namespace itk
{
  extern template class __declspec(dllimport) Foo<int>;
}
</pre>
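Libraries usually hide the two decorations behind a single macro so that the same declaration can serve both sides. The following sketch shows one common arrangement; `SKETCH_EXPORT` and `SKETCH_BUILDING_LIBRARY` are hypothetical names chosen for illustration and are not ITK's actual definitions.

```cpp
#include <cassert>

// Assumption: the build system defines this only while compiling the
// library itself.  It is defined here so the sketch is self-contained.
#define SKETCH_BUILDING_LIBRARY

#if defined(_WIN32)
# if defined(SKETCH_BUILDING_LIBRARY)
#  define SKETCH_EXPORT __declspec(dllexport)
# else
#  define SKETCH_EXPORT __declspec(dllimport)
# endif
#else
# define SKETCH_EXPORT // nothing needed where symbols are visible by default
#endif

namespace sketch
{
  template <class T>
  class Foo
  {
  public:
    T MethodA();
  };

  template <class T> T Foo<T>::MethodA() { return T(3); }

  // Expands to "template class __declspec(dllexport) Foo<int>;" on
  // Windows and to "template class Foo<int>;" elsewhere.
  template class SKETCH_EXPORT Foo<int>;

  inline void SelfCheck()
  {
    Foo<int> foo;
    assert(foo.MethodA() == 3);
  }
}
```

Code that only consumes the library would compile without `SKETCH_BUILDING_LIBRARY`, so the same macro would expand to the dllimport decoration on the extern template line.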
==Scalability of Instantiation Syntax==
There are three ways that a class template instantiation may appear in
order to export and import it to/from a library.  We must make sure
these each appear at the proper time, and that each uses the
appropriate DLL import/export macro for the library providing the
instantiation.  For example, if itk::Foo<int> were provided by
ITKCommon, we would need the following layout.
<pre>
// Bottom of itkFoo+int-.cxx
namespace itk
{
template class ITKCommon_EXPORT Foo<int>;
}
// Bottom of itkFoo.h
namespace itk
{
#if (...compiler supports extern instantiation...)
  extern template class ITKCommon_EXPORT Foo<int>;
#elif (...compiler supports do_not_instantiate...)
# pragma do_not_instantiate class ITKCommon_EXPORT Foo<int>
#endif
}
</pre>
To make matters worse, some class templates have function templates
that also need to be instantiated.  For example, instantiating
itk::Vector<double, 3> might look like this:
<pre>
namespace itk
{
  template class Vector<double, 3>;
  template std::ostream& operator<<(std::ostream&, const Vector<double, 3>&);
  template std::istream& operator>>(std::istream&, Vector<double, 3>&);
}
</pre>
Using the above layout, these three lines would have to be duplicated
three times with slight variations.  The resulting nine lines would
have to be duplicated again for each instantiation provided.  This
duplication is tedious and error-prone, but can be avoided by using a
macro to specify the declarations to instantiate for a given template.
==Developing Instantiation Macros==
We now incrementally develop the macro-based instantiation design to
be used in ITK.  In order to specify declarations to instantiate only
once we must be able to both export and import using the same
declaration.  Consider the following macro definitions:
<pre>
#define ITK_TEMPLATE_EXPORT(X) template X;
#define ITK_TEMPLATE_IMPORT(X) extern template X;
</pre>
We can now export itk::Foo<int> by writing
<pre>
namespace itk
{
  ITK_TEMPLATE_EXPORT(class ITKCommon_EXPORT Foo<int>)
}
</pre>
and import it by writing
<pre>
namespace itk
{
  ITK_TEMPLATE_IMPORT(class ITKCommon_EXPORT Foo<int>)
}
</pre>
Note that the argument to the macro is the same in both cases.  Now we
can provide a macro for the template Foo that specifies this argument:
<pre>
#define ITK_TEMPLATE_Foo(_, T) namespace itk { \
  _(class ITKCommon_EXPORT Foo< T >) \
  }
</pre>
Exporting and importing itk::Foo<int> can now be done by the following
two lines respectively.
<pre>
// Bottom of itkFoo+int-.cxx
ITK_TEMPLATE_Foo(ITK_TEMPLATE_EXPORT, int)
// Bottom of itkFoo.h
ITK_TEMPLATE_Foo(ITK_TEMPLATE_IMPORT, int)
</pre>
These will be expanded to
<pre>
namespace itk
{
  ITK_TEMPLATE_EXPORT(class ITKCommon_EXPORT Foo< int >)
}
</pre>
and
<pre>
namespace itk
{
  ITK_TEMPLATE_IMPORT(class ITKCommon_EXPORT Foo< int >)
}
</pre>
and then to
<pre>
namespace itk
{
  template class ITKCommon_EXPORT Foo< int >;
}
</pre>
and
<pre>
namespace itk
{
  extern template class ITKCommon_EXPORT Foo< int >;
}
</pre>
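The expansion steps above can be checked with a compilable sketch. It assumes a toy `Foo` in namespace `itk` (a stand-in, not the real ITK class), defines `ITKCommon_EXPORT` as empty as it would be on a platform that needs no decoration, and places the import and export in one file only so that the example links by itself.

```cpp
#include <cassert>

#define ITKCommon_EXPORT // assumption: no decoration needed on this platform

#define ITK_TEMPLATE_EXPORT(X) template X;
#define ITK_TEMPLATE_IMPORT(X) extern template X;

namespace itk
{
  // Toy stand-in for the class template and its .txx definition.
  template <class T>
  class Foo
  {
  public:
    int MethodA();
  };

  template <class T> int Foo<T>::MethodA() { return 5; }
}

// Per-template macro: "_" names the low-level macro to apply.
#define ITK_TEMPLATE_Foo(_, T) namespace itk { \
  _(class ITKCommon_EXPORT Foo< T >) \
  }

// As at the bottom of itkFoo.h: extern template class Foo< int >;
ITK_TEMPLATE_Foo(ITK_TEMPLATE_IMPORT, int)

// As at the bottom of itkFoo+int-.cxx: template class Foo< int >;
ITK_TEMPLATE_Foo(ITK_TEMPLATE_EXPORT, int)

inline void SelfCheck()
{
  itk::Foo<int> foo;
  assert(foo.MethodA() == 5);
}
```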
This design avoids duplicating the declaration "class Foo<int>" and
all supporting declarations (such as in the Vector case).  However, it
allows instantiations to be provided only by ITKCommon.  Instead we
should allow the export macro to be specified as an argument:
<pre>
#define ITK_TEMPLATE_Foo(_, EXPORT, T) namespace itk { \
  _(class EXPORT Foo< T >) \
  }
</pre>
This leads to
<pre>
// Bottom of itkFoo+int-.cxx
ITK_TEMPLATE_Foo(ITK_TEMPLATE_EXPORT, ITKCommon_EXPORT, int)
// Bottom of itkFoo.h
ITK_TEMPLATE_Foo(ITK_TEMPLATE_IMPORT, ITKCommon_EXPORT, int)
</pre>
which is more flexible.  We can shorten this by providing some helper
macros:
<pre>
#define ITK_EXPORT_TEMPLATE(EXPORT, c, T) \
        ITK_TEMPLATE_##c(ITK_TEMPLATE_EXPORT, EXPORT, T)
#define ITK_IMPORT_TEMPLATE(EXPORT, c, T) \
        ITK_TEMPLATE_##c(ITK_TEMPLATE_IMPORT, EXPORT, T)
</pre>
which leads to
<pre>
// Bottom of itkFoo+int-.cxx
ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, Foo, int)
// Bottom of itkFoo.h
ITK_IMPORT_TEMPLATE(ITKCommon_EXPORT, Foo, int)
</pre>
We can shorten this further by providing library-specific
import/export macros:
<pre>
#define ITK_EXPORT_ITKCommon(c, T) ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, c, T)
#define ITK_IMPORT_ITKCommon(c, T) ITK_IMPORT_TEMPLATE(ITKCommon_EXPORT, c, T)
</pre>
which leads to
<pre>
// Bottom of itkFoo+int-.cxx
ITK_EXPORT_ITKCommon(Foo, int)
// Bottom of itkFoo.h
ITK_IMPORT_ITKCommon(Foo, int)
</pre>
This is pretty short, and is relatively nice looking.  Unfortunately
it works only for templates with one argument!  The compilers we wish
to support do not support variable-length macro argument lists.
Consider what happens when we try to pass multiple template arguments
to the inner-most instantiation macro:
<pre>
namespace itk
{
  ITK_TEMPLATE_EXPORT(class ITKCommon_EXPORT Vector<double, 3>)
}
</pre>
The C preprocessor does not understand C++ template arguments, and
will treat this as a call to ITK_TEMPLATE_EXPORT with two arguments.
The first argument will contain "class ITKCommon_EXPORT Vector<double"
and the second argument will contain " 3>".  Since ITK_TEMPLATE_EXPORT
was defined with only one argument this will produce a preprocessing
error.
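The way the preprocessor splits the argument can be observed by stringifying each piece. This sketch uses two hypothetical two-parameter macros, `SKETCH_FIRST` and `SKETCH_SECOND`, so that the two-argument call is accepted and each half can be inspected.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helpers: each takes two arguments and stringifies one.
#define SKETCH_FIRST(a, b) #a
#define SKETCH_SECOND(a, b) #b

namespace sketch
{
  // The preprocessor knows nothing about template angle brackets, so the
  // comma in "Vector<double, 3>" separates two macro arguments.
  static const char* const kFirst = SKETCH_FIRST(Vector<double, 3>);
  static const char* const kSecond = SKETCH_SECOND(Vector<double, 3>);

  inline void SelfCheck()
  {
    assert(std::strcmp(kFirst, "Vector<double") == 0);
    assert(std::strcmp(kSecond, "3>") == 0);
  }
}
```

Stringification drops the leading space of the second argument, which is why the second piece compares equal to "3>".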
We need a way to pass multiple template arguments through only one
macro argument.  One way to shield a comma-separated list from being
expanded as multiple macro arguments is to put it inside a nested
level of parentheses.  For example, invoking
<pre>
ITK_TEMPLATE_EXPORT((class ITKCommon_EXPORT Vector<double, 3>))
</pre>
will pass just one argument and will not be a preprocessing error.
Unfortunately the resulting expansion will be
<pre>
template (class ITKCommon_EXPORT Vector<double, 3>);
</pre>
which is not valid.  In order to remove the extra level of parentheses
after the argument is passed, we use some helper macros:
<pre>
#define ITK_TEMPLATE_1(x1)      x1
#define ITK_TEMPLATE_2(x1,x2)    x1,x2
#define ITK_TEMPLATE_3(x1,x2,x3) x1,x2,x3
</pre>
Consider the definition
<pre>
#define ITK_TEMPLATE_EXPORT(x) template ITK_TEMPLATE_##x;
</pre>
Now we can use the same macro for both Foo and Vector instantiations:
<pre>
namespace itk
{
  ITK_TEMPLATE_EXPORT(1(class ITKCommon_EXPORT Foo<int>))
  ITK_TEMPLATE_EXPORT(2(class ITKCommon_EXPORT Vector<double, 3>))
}
</pre>
The preprocessor will expand the Foo line first to
<pre>
template ITK_TEMPLATE_1(class ITKCommon_EXPORT Foo<int>);
</pre>
which will be recursively expanded to
<pre>
template class ITKCommon_EXPORT Foo<int>;
</pre>
Then the preprocessor will expand the Vector line first to
<pre>
template ITK_TEMPLATE_2(class ITKCommon_EXPORT Vector<double, 3>);
</pre>
which will be recursively expanded to
<pre>
template class ITKCommon_EXPORT Vector<double, 3>;
</pre>
These are exactly the lines we need.  The ITK_TEMPLATE_IMPORT macro
can be modified similarly.
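The numbered helper macros can be exercised with a small compilable sketch. The class templates below are toy stand-ins in a hypothetical `sketch` namespace, and `ITKCommon_EXPORT` is assumed to expand to nothing.

```cpp
#include <cassert>

#define ITKCommon_EXPORT // assumption: no decoration needed on this platform

// Helpers that re-emit their arguments without the enclosing parentheses.
#define ITK_TEMPLATE_1(x1)    x1
#define ITK_TEMPLATE_2(x1,x2) x1,x2

// Pastes the leading count onto ITK_TEMPLATE_ to select the helper.
#define ITK_TEMPLATE_EXPORT(x) template ITK_TEMPLATE_##x;

namespace sketch
{
  template <class T>
  class Foo
  {
  public:
    int MethodA() { return 1; }
  };

  template <class T, unsigned int N>
  class Vector
  {
  public:
    unsigned int Size() const { return N; }
  };

  // Expands to: template class Foo<int>;
  ITK_TEMPLATE_EXPORT(1(class ITKCommon_EXPORT Foo<int>))

  // Expands to: template class Vector<double,3>;
  ITK_TEMPLATE_EXPORT(2(class ITKCommon_EXPORT Vector<double, 3>))

  inline void SelfCheck()
  {
    Foo<int> foo;
    Vector<double, 3> vec;
    assert(foo.MethodA() == 1);
    assert(vec.Size() == 3);
  }
}
```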
In order to use this new macro properly we need to change our
definition of ITK_TEMPLATE_Foo.  Consider the new definition
<pre>
#define ITK_TEMPLATE_Foo(_, EXPORT, T) namespace itk { \
  _(1(class EXPORT Foo< T >)) \
  }
</pre>
This will produce a proper invocation of ITK_TEMPLATE_EXPORT and
ITK_TEMPLATE_IMPORT for any template argument of Foo.  However, this
method cannot be used for templates like Vector that take multiple
arguments.  Consider an attempt to define the instantiation macro for
Vector:
<pre>
#define ITK_TEMPLATE_Vector(_, EXPORT, T, D) namespace itk { \
  _(2(class EXPORT Vector< T,D >)) \
  _(1(EXPORT std::ostream& operator<<(std::ostream&, \
                                      const Vector< T,D >&))) \
  _(1(EXPORT std::istream& operator>>(std::istream&, \
                                      Vector< T,D >&))) \
  }
</pre>
Note that the second two lines have a comma only nested inside a
second level of parentheses, so the length of the argument list is
only one.  This is why the first line is preceded by a "2" while the
other lines have only "1".
This instantiation macro will work if invoked directly, but we will
not be able to use the short-hand interface provided by
ITK_EXPORT_ITKCommon and ITK_IMPORT_ITKCommon.  If we try to invoke
one of them for Vector we will have too many arguments:
<pre>
ITK_EXPORT_ITKCommon(Vector, double, 3) // too many arguments
</pre>
The preprocessor will again give an error.  The solution is to use the
numbered helper macros to pass the template argument list:
<pre>
#define ITK_TEMPLATE_Foo(_, EXPORT, x) namespace itk { \
  _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) \
  }
#define ITK_TEMPLATE_Vector(_, EXPORT, x) namespace itk { \
  _(2(class EXPORT Vector< ITK_TEMPLATE_2 x >)) \
  _(1(EXPORT std::ostream& operator<<(std::ostream&, \
                                      const Vector< ITK_TEMPLATE_2 x >&))) \
  _(1(EXPORT std::istream& operator>>(std::istream&, \
                                      Vector< ITK_TEMPLATE_2 x >&))) \
  }
</pre>
In this case the instantiation macro knows how many template arguments
are required, so it can directly invoke the proper helper macro to
expand the arguments passed through "x".  The user need not pass the
leading argument count.
Finally, we can use ITK_EXPORT_ITKCommon or its related macros to
export templates with any number of arguments:
<pre>
// Bottom of itkFoo+int-.cxx
ITK_EXPORT_ITKCommon(Foo, (int))
// Bottom of itkVector+double.3-.cxx
ITK_EXPORT_ITKCommon(Vector, (double, 3))
</pre>
This will work well as long as the template arguments are simple
types.  In ITK however there are many class templates that are
instantiated using other template instantiations as arguments.
In our example, we might want to instantiate the type
<pre>
itk::Foo< itk::Vector<double, 3> >
</pre>
In order to do this we should first import our instantiation of
Vector
<pre>
ITK_IMPORT_ITKCommon(Vector, (double, 3))
</pre>
and then export the instantiation of Foo
<pre>
ITK_EXPORT_ITKCommon(Foo, (itk::Vector<double, 3>))
</pre>
Unfortunately this passes an argument list of length two to the
instantiation macro for Foo, which will be evaluated as
<pre>
ITK_TEMPLATE_1(itk::Vector<double, 3>)
</pre>
and will give a preprocessing error.  Instead we need to use a typedef
to avoid having a comma in the name of the Vector type:
<pre>
typedef itk::Vector<double, 3> VectorD3;
ITK_EXPORT_ITKCommon(Foo, (VectorD3))
</pre>
This will work, but it is tedious.  Consider what it would take to
import this instantiation.
<pre>
ITK_IMPORT_ITKCommon(Vector, (double, 3))
typedef itk::Vector<double, 3> VectorD3;
ITK_IMPORT_ITKCommon(Foo, (VectorD3))
</pre>
This requires that the author of the code importing the instantiation
knows exactly how the Vector instantiation appears inside
ITK_IMPORT_ITKCommon.  Instead we can arrange things to have the
Vector instantiation define the typedef automatically.
In order to provide this typedef automatically we need a way to
produce a name for it.  This can be done by adding another argument to
the instantiation macros that provides a single preprocessing token
corresponding to the template argument list which may be used to
construct typedef names.  For example, we might write
<pre>
ITK_IMPORT_ITKCommon(Foo, (int), I)
ITK_IMPORT_ITKCommon(Vector, (double, 3), D3)
</pre>
where the "I" corresponds to the argument list "(int)" and the "D3"
corresponds to the argument list "(double, 3)".  These tokens may then
be used by the instantiation macros to produce a unique name for each
instantiation, such as "FooI" and "VectorD3".  In order to avoid
conflict with other names in the itk namespace these names can be
placed in a sub-namespace called "itk::Templates".  Importing the
above instantiation of Foo is now simpler:
<pre>
ITK_IMPORT_ITKCommon(Vector, (double, 3), D3)
ITK_IMPORT_ITKCommon(Foo, (Templates::VectorD3), VD3)
</pre>
We are now ready to define instantiation macros to achieve this
syntax.  The final version of the macros is given in the next section.
There is one last detail required to support the do_not_instantiate
pragma on the SGI MIPSpro compiler.  We must find a way to define
ITK_TEMPLATE_IMPORT(X) to produce the code
<pre>
#pragma do_not_instantiate X
</pre>
There is no portable way to put a preprocessor directive inside a
macro, but fortunately the MIPSpro compiler provides a compiler-specific
way to do this.  It will transform the expression
<pre>
_Pragma("some string literal")
</pre>
into
<pre>
#pragma some string literal
</pre>
We can take advantage of this to define ITK_TEMPLATE_IMPORT as
follows.
<pre>
#define ITK_TEMPLATE_IMPORT(x) ITK_TEMPLATE_IMPORT_DELAY(x)
#define ITK_TEMPLATE_IMPORT_DELAY(x) \
        ITK_TEMPLATE_IMPORT_IMPL(do_not_instantiate ITK_TEMPLATE_##x)
#define ITK_TEMPLATE_IMPORT_IMPL(x) _Pragma(#x)
</pre>
This will transform the code
<pre>
ITK_TEMPLATE_IMPORT(2(class Vector<double, 3>))
</pre>
into
<pre>
_Pragma("do_not_instantiate ITK_TEMPLATE_2(class Vector<double, 3>)")
</pre>
and will be interpreted by the compiler as
<pre>
#pragma do_not_instantiate class Vector<double, 3>
</pre>
which is exactly what we need.
==Final Instantiation Macros==
We now give the final instantiation macro design used in ITK.  These
are defined and documented in the header itkMacro.h, except for the
per-template instantiation macros.
The low-level instantiation macros are
<pre>
#define ITK_TEMPLATE_EXPORT(x) ITK_TEMPLATE_EXPORT_DELAY(x)
#define ITK_TEMPLATE_IMPORT(x) ITK_TEMPLATE_IMPORT_DELAY(x)
#define ITK_TEMPLATE_EXPORT_DELAY(x) template ITK_TEMPLATE_##x;
#define ITK_TEMPLATE_IMPORT_DELAY(x) extern template ITK_TEMPLATE_##x;
</pre>
One level of substitution delay is provided to allow the argument "x"
to contain a macro computing the number of arguments in the
paren-enclosed expression passed through the argument.  Currently this
is not needed but it is provided for flexibility and does not hurt
anything.
These macros are passed by name to per-template instantiation macros
which in turn invoke them to produce specific instantiations.  Here we
give the final example per-template instantiation macros for Foo and
Vector based on the above derivation.
<pre>
#define ITK_TEMPLATE_Foo(_, EXPORT, x, y) namespace itk { \
  _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) \
  namespace Templates { typedef Foo< ITK_TEMPLATE_1 x > Foo##y; }\
  }
#define ITK_TEMPLATE_Vector(_, EXPORT, x, y) namespace itk { \
  _(2(class EXPORT Vector< ITK_TEMPLATE_2 x >)) \
  _(1(EXPORT std::ostream& operator<<(std::ostream&, \
                                      const Vector< ITK_TEMPLATE_2 x >&))) \
  _(1(EXPORT std::istream& operator>>(std::istream&, \
                                      Vector< ITK_TEMPLATE_2 x >&))) \
  namespace Templates { typedef Vector< ITK_TEMPLATE_2 x > Vector##y; } \
  }
</pre>
These macros are typically defined in the header file of their
corresponding class template just after the class template is defined.
They should contain at least one invocation of the "_" macro passed as
an argument plus one typedef in the Templates sub-namespace providing
a name constructed from the class template name and the preprocessor
token passed in the "y" argument.
The per-template instantiation macros are invoked by
ITK_EXPORT_TEMPLATE and ITK_IMPORT_TEMPLATE, passing
ITK_TEMPLATE_EXPORT and ITK_TEMPLATE_IMPORT respectively:
<pre>
#define ITK_EXPORT_TEMPLATE(EXPORT, c, x, y) \
        ITK_TEMPLATE_##c(ITK_TEMPLATE_EXPORT, EXPORT, x, y)
#define ITK_IMPORT_TEMPLATE(EXPORT, c, x, y) \
        ITK_TEMPLATE_##c(ITK_TEMPLATE_IMPORT, EXPORT, x, y)
</pre>
For these macros the "c" argument provides the class template name
whose per-template instantiation macro is to be invoked.  Finally, the
library-specific export/import macros are
<pre>
#define ITK_EXPORT_ITKCommon(c, x, y) \
        ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, c, x, y)
#define ITK_IMPORT_ITKCommon(c, x, y) \
        ITK_IMPORT_TEMPLATE(ITKCommon_EXPORT, c, x, y)
</pre>
and provide a short-hand way to export and import instantiations from
each library.
These macros can be used to export some example instantiations:
<pre>
// itkFoo+int-.cxx
ITK_EXPORT_ITKCommon(Foo, (int), I)
// itkVector+double.3-.cxx
ITK_EXPORT_ITKCommon(Vector, (double, 3), D3)
// itkFoo+itkVector+double.3--.cxx
ITK_IMPORT_ITKCommon(Vector, (double, 3), D3)
ITK_EXPORT_ITKCommon(Foo, (Templates::VectorD3), VD3)
</pre>
and also to import them
<pre>
ITK_IMPORT_ITKCommon(Foo, (int), I)
ITK_IMPORT_ITKCommon(Vector, (double, 3), D3)
ITK_IMPORT_ITKCommon(Foo, (Templates::VectorD3), VD3)
</pre>
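The final macros can be tried end to end in a single file. The sketch below copies the macro definitions given in this section and applies them to toy versions of `Foo` and `Vector` (stand-ins, not the real ITK classes). It assumes `ITKCommon_EXPORT` expands to nothing, omits the stream operators of `Vector` for brevity, and combines import and export in one translation unit only so that the example links by itself.

```cpp
#include <cassert>

#define ITKCommon_EXPORT // assumption: no decoration needed on this platform

#define ITK_TEMPLATE_1(x1)    x1
#define ITK_TEMPLATE_2(x1,x2) x1,x2

#define ITK_TEMPLATE_EXPORT(x) ITK_TEMPLATE_EXPORT_DELAY(x)
#define ITK_TEMPLATE_IMPORT(x) ITK_TEMPLATE_IMPORT_DELAY(x)
#define ITK_TEMPLATE_EXPORT_DELAY(x) template ITK_TEMPLATE_##x;
#define ITK_TEMPLATE_IMPORT_DELAY(x) extern template ITK_TEMPLATE_##x;

#define ITK_EXPORT_TEMPLATE(EXPORT, c, x, y) \
        ITK_TEMPLATE_##c(ITK_TEMPLATE_EXPORT, EXPORT, x, y)
#define ITK_IMPORT_TEMPLATE(EXPORT, c, x, y) \
        ITK_TEMPLATE_##c(ITK_TEMPLATE_IMPORT, EXPORT, x, y)
#define ITK_EXPORT_ITKCommon(c, x, y) \
        ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, c, x, y)
#define ITK_IMPORT_ITKCommon(c, x, y) \
        ITK_IMPORT_TEMPLATE(ITKCommon_EXPORT, c, x, y)

namespace itk
{
  // Toy class templates with out-of-class member definitions.
  template <class T>
  class Foo
  {
  public:
    int MethodA();
  };
  template <class T> int Foo<T>::MethodA() { return 1; }

  template <class T, unsigned int N>
  class Vector
  {
  public:
    unsigned int Size() const;
  };
  template <class T, unsigned int N>
  unsigned int Vector<T, N>::Size() const { return N; }
}

// Per-template instantiation macros (Vector's stream operators omitted).
#define ITK_TEMPLATE_Foo(_, EXPORT, x, y) namespace itk { \
  _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) \
  namespace Templates { typedef Foo< ITK_TEMPLATE_1 x > Foo##y; } \
  }
#define ITK_TEMPLATE_Vector(_, EXPORT, x, y) namespace itk { \
  _(2(class EXPORT Vector< ITK_TEMPLATE_2 x >)) \
  namespace Templates { typedef Vector< ITK_TEMPLATE_2 x > Vector##y; } \
  }

// Import, as a client would: extern template plus Templates::VectorD3.
ITK_IMPORT_ITKCommon(Vector, (double, 3), D3)

// Export, as the instantiation source files would.
ITK_EXPORT_ITKCommon(Vector, (double, 3), D3)
ITK_EXPORT_ITKCommon(Foo, (Templates::VectorD3), VD3)

inline void SelfCheck()
{
  itk::Templates::VectorD3 vec; // itk::Vector<double, 3>
  itk::Templates::FooVD3 foo;   // itk::Foo< itk::Vector<double, 3> >
  assert(vec.Size() == 3);
  assert(foo.MethodA() == 1);
}
```

The repeated `VectorD3` typedef produced by the import and the export names the same type both times, which C++ permits.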
==Organizing Source Files==
The instantiation macros defined above provide a concise syntax to
export and import template instantiations for any template in ITK.  We
need to carefully organize our source files to make sure the macros
are invoked at the proper time.
TODO...
==Writing Instantiation Macros==
All knowledge about how to instantiate a class template and its
supporting function templates is encoded in its instantiation macro.
TODO...

Revision as of 22:07, 25 April 2006

ITK Explicit Template Instantiation Support

This document derives the motivation and implementation of explicit template instantiation support in ITK.

Overview

Most of ITK is implemented using class class templates in order to provide users with tremendous flexibility on the types of data that can be processed. Here we investigate the mechanism used by the build system to transform these templates into executable code.

Consider an example ITK class template defined in itkFoo.h:

// itkFoo.h
namespace itk
{
  template <class T>
  class Foo
  {
  public:
    void MethodA();
  };
}

The template member MethodA is defined in itkFoo.txx:

// itkFoo.txx
namespace itk
{
  template <class T>
  void Foo<T>::MethodA()
  {
  }
}

When the user writes code such as

#include "itkFoo.h"

int main()
{
  itk::Foo<int> foo;
  foo.MethodA();
  return 0;
}

a reference to the symbol "void itk::Foo<int>::MethodA()" is created. The basic problem we face is how to create the template instantiation providing this symbol.

Implicit Template Instantiation

One way to provide the symbol is called "implicit template instantiation". If the user were to use

#include "itkFoo.txx"

then the compiler will have a definition of the template member MethodA and create a copy of the symbol in the same object file that contains main. In this case the method implementation is implicitly instantiated by the compiler because the definition is available and the method is called.

Implicit instantiation has the advantage that user code may create any instantiation of the template Foo that it needs without worrying about where the symbol definitions will reside. The drawback is that every source file that references the method will create its own copy of the symbol. Code for MethodA will be compiled and stored in every object file that needs it, but the linker will choose only one copy and throw out the rest. This means that the work the compiler did in most of the object files was wasted.

Explicit Template Instantiation

An alternative way to provide the symbol "void itk::Foo<int>::MethodA()" is through "explicit template instantiation". Assume the user did not include itkFoo.txx. When the compiler sees a reference to MethodA it does not know how to create the symbol so it instead places in the object file an unresolved symbol reference. In order to provide the symbol to the linker we need to explicitly create a copy of it somewhere. We can do this by creating an explicit instantiation in a separate source file. Consier a source file called "itkFoo+int-.cxx" that contains the following code:

#include "itkFoo.h"
#include "itkFoo.txx"
namespace itk
{
  template class Foo<int>; // explicit template instantiation
}

When the compiler builds this source file it will have the definitions of all the members of Foo<int> because itkFoo.txx was included. The explicit template instantiation syntax tells the compiler to instantiate a copy of every symbol in the template. Now when the user builds a program referencing "void itk::Foo<int>::MethodA()" the linker will find it in the object file providing the explicit instantiation.

Extern Template Instantiation

Now that we have provided the instantiation itk::Foo<int> explicitly users may use this instantiation by including only itkFoo.h. However, say the user now wants to use itk::Foo<float> as well:

#include "itkFoo.h"

int main()
{
  itk::Foo<int> fooI;
  fooI.MethodA();
  itk::Foo<float> fooF;
  fooF.MethodA();
  return 0;
}

The compiler will not see the definition of MethodA and will produce an unresolved symbol reference. The linker will now fail to resolve the symbol because it is not provided explicitly. A user might fix this by includeing itkFoo.txx to get the definition. The problem is now the compiler will see the definition for the template MethodA and instantiate both itk::Foo<int>::MethodA and itk::Foo<float>::MethodA! The linker will now resolve itk::Foo<float>::MethodA but there will be two copies of itk::Foo<int>::MethodA and one will be thrown out. Again, the compiler has done work that is wasted.

How can this problem be avoided? One solution is to not allow the user to include itkFoo.txx directly but instead require that his or her project manually create an itkFoo+float-.cxx file containing the instantiation for itk::Foo<float>. While this may work it will be confusing for many users and require alot of extra work. Another solution is to take advantage of a compiler-specific extension known as "extern template instantiation". This extension is provided on at least the MSVC, GCC, and Intel compilers. Consider the following code added to the bottom of itkFoo.h:

namespace itk
{
  extern template class Foo<int>; // extern template instantiation
}

This instructs the compiler to NOT instantiate any members of itk::Foo<int> even if the definition is available. Now, if the user includes itkFoo.txx and uses both itk::Foo<int> and itk::Foo<float> the compiler will create symbols for itk::Foo<float> but leave those for itk::Foo<int> undefined. The linker will then come along and find one copy of each, and no work will have been wasted.

A similar extension is provided by the SGI MIPSpro compiler. The above extern instantiation may be written on this compiler as

namespace itk
{
#pragma do_not_instantiate class Foo<int>
}

which tells it not to instantiate this template.

DLL Symbol Resolution on Windows

When an explicit template instantiation is provided by a shared library we must ensure that the symbols are available for use outside the library. On UNIX systems this is automatic. On Windows systems we need to explicitly tell the compiler that the symbols provided by an explicit template instantiation are to be exported from the DLL. This can be achieved by adding a dllexport decoration to the explicit template instantiation line:

#include "itkFoo.h"
#include "itkFoo.txx"
namespace itk
{
  template class __declspec(dllexport) Foo<int>;
}

Similarly, when using an explicit instantiation from another source file we must tell the compiler that we wish to import the symbols from a DLL. This can be achieved by adding a dllimport decoration to the extern template instantiation line:

namespace itk
{
  extern template class __declspec(dllimport) Foo<int>;
}

Scalability of Instantiation Syntax

There are three ways that a class template instantiation may appear in order to export and import it to/from a library. We must make sure these each appear a the proper time, and that each uses the appropriate DLL import/export macro for the library providing the instantiation. For example, if itk::Foo<int> were provided by ITKCommon, we would need the following layout.

// Bottom of itkFoo+int-.cxx
namespace itk
{
template class ITKCommon_EXPORT Foo<int>;
}

// Bottom of itkFoo.h
namespace itk
{
#if (...compiler supports extern instantiation...)
  extern template class ITKCommon_EXPORT Foo<int>;
#elif (...compiler supports do_not_instantiate...)
# pragma do_not_instantiate class ITKCommon_EXPORT Foo<int>
#endif
}

To make matters worse, some class templates have function templates that also need to be instantiated. For example, instantiating itk::Vector<double, 3> might look like this:

namespace itk
{
  template class Vector<double, 3>;
  template std::ostream& operator<<(std::ostream&, const Vector<double, 3>&);
  template std::istream& operator>>(std::ostream&, Vector<double, 3>&);
}

Using the above layout, these three lines would have to be duplicated three times with slight variations. The resulting nine lines would have to be duplicated again for each instantiation provided. This duplication is tedious and error-prone, but can be avoided by using a macro to specify the declarations to instantiate for a given template.

Developing Instantiation Macros

We now incrementally develop the macro-based instantiation design to be used in ITK. In order to specify declarations to instantiate only once we must be able to both export and import using the same declaration. Consider the following macro definitions:

#define ITK_TEMPLATE_EXPORT(X) template X;
#define ITK_TEMPLATE_IMPORT(X) extern template X;

We can now export itk::Foo<int> by writing

namespace itk
{
  ITK_TEMPLATE_EXPORT(class ITKCommon_EXPORT Foo<int>)
}

and import it by writing

namespace itk
{
  ITK_TEMPLATE_IMPORT(class ITKCommon_EXPORT Foo<int>)
}

Note that the argument to the macro is the same in both cases. Now we can provide a macro for the template Foo that specifies this argument:

#define ITK_TEMPLATE_Foo(_, T) namespace itk { \
  _(class ITKCommon_EXPORT Foo< T >) \
  }

Exporting and importing itk::Foo<int> can now be done by the following two lines respectively.

// Bottom of itkFoo+int-.cxx
ITK_TEMPLATE_Foo(ITK_TEMPLATE_EXPORT, int)

// Bottom of itkFoo.h
ITK_TEMPLATE_Foo(ITK_TEMPLATE_IMPORT, int)

These will be expanded to

namespace itk
{
  ITK_TEMPLATE_EXPORT(class ITKCommon_EXPORT Foo< int >)
}

and

namespace itk
{
  ITK_TEMPLATE_IMPORT(class ITKCommon_EXPORT Foo< int >)
}

and then to

namespace itk
{
  template class ITKCommon_EXPORT Foo< int >;
}

and

namespace itk
{
  extern template class ITKCommon_EXPORT Foo< int >;
}
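The export/import pair above can be tried in a single translation unit. The following is a minimal sketch, not ITK code. It assumes ITKCommon_EXPORT is defined empty (no DLL decoration), that the compiler accepts "extern template" (standard since C++11), and that MethodA returns a value so the result can be checked. SketchFooIntSize is a hypothetical helper.

```cpp
#include <cassert>

#define ITKCommon_EXPORT  // assumption: no DLL decoration in this sketch
#define ITK_TEMPLATE_EXPORT(X) template X;
#define ITK_TEMPLATE_IMPORT(X) extern template X;

namespace itk
{
  template <class T>
  class Foo
  {
  public:
    // Hypothetical body; the original MethodA returns void.
    int MethodA() const { return static_cast<int>(sizeof(T)); }
  };
}

// One macro names the declaration; "_" selects export or import.
#define ITK_TEMPLATE_Foo(_, T) namespace itk { \
  _(class ITKCommon_EXPORT Foo< T >) \
  }

// What the bottom of itkFoo.h would do: declare the instantiation.
ITK_TEMPLATE_Foo(ITK_TEMPLATE_IMPORT, int)

// What itkFoo+int-.cxx would do: provide the instantiation.
// Within one translation unit the definition must follow the declaration.
ITK_TEMPLATE_Foo(ITK_TEMPLATE_EXPORT, int)

// Hypothetical helper that exercises the instantiated member.
inline int SketchFooIntSize()
{
  itk::Foo<int> foo;
  assert(foo.MethodA() > 0);
  return foo.MethodA();
}
```

In a real build the import line lives in the header and the export line in exactly one source file of the library; placing both in one file here only serves to show that the two expansions are compatible.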

This design avoids duplicating the declaration "class Foo<int>" and all supporting declarations (such as in the Vector case). However, it allows instantiations to be provided only by ITKCommon. Instead we should allow the export macro to be specified as an argument:

#define ITK_TEMPLATE_Foo(_, EXPORT, T) namespace itk { \
  _(class EXPORT Foo< T >) \
  }

This leads to

// Bottom of itkFoo+int-.cxx
ITK_TEMPLATE_Foo(ITK_TEMPLATE_EXPORT, ITKCommon_EXPORT, int)

// Bottom of itkFoo.h
ITK_TEMPLATE_Foo(ITK_TEMPLATE_IMPORT, ITKCommon_EXPORT, int)

which is more flexible. We can shorten this by providing some helper macros:

#define ITK_EXPORT_TEMPLATE(EXPORT, c, T) \
        ITK_TEMPLATE_##c(ITK_TEMPLATE_EXPORT, EXPORT, T)

#define ITK_IMPORT_TEMPLATE(EXPORT, c, T) \
        ITK_TEMPLATE_##c(ITK_TEMPLATE_IMPORT, EXPORT, T)

which leads to

// Bottom of itkFoo+int-.cxx
ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, Foo, int)

// Bottom of itkFoo.h
ITK_IMPORT_TEMPLATE(ITKCommon_EXPORT, Foo, int)

We can shorten this further by providing library-specific import/export macros:

#define ITK_EXPORT_ITKCommon(c, T) ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, c, T)
#define ITK_IMPORT_ITKCommon(c, T) ITK_IMPORT_TEMPLATE(ITKCommon_EXPORT, c, T)

which leads to

// Bottom of itkFoo+int-.cxx
ITK_EXPORT_ITKCommon(Foo, int)

// Bottom of itkFoo.h
ITK_IMPORT_ITKCommon(Foo, int)

This is short and readable. Unfortunately it works only for templates with one argument! The compilers we wish to support do not provide variable-length macro argument lists. Consider what happens when we try to pass multiple template arguments to the innermost instantiation macro:

namespace itk
{
  ITK_TEMPLATE_EXPORT(class ITKCommon_EXPORT Vector<double, 3>)
}

The C preprocessor does not understand C++ template arguments, and will treat this as a call to ITK_TEMPLATE_EXPORT with two arguments. The first argument will contain "class ITKCommon_EXPORT Vector<double" and the second argument will contain " 3>". Since ITK_TEMPLATE_EXPORT was defined with only one argument this will produce a preprocessing error.

We need a way to pass multiple template arguments through only one macro argument. One way to shield a comma-separated list from being expanded as multiple macro arguments is to put it inside a nested level of parentheses. For example, invoking

ITK_TEMPLATE_EXPORT((class ITKCommon_EXPORT Vector<double, 3>))

will pass just one argument and will not be a preprocessing error. Unfortunately the resulting expansion will be

template (class ITKCommon_EXPORT Vector<double, 3>);

which is not valid. In order to remove the extra level of parentheses after the argument is passed, we use some helper macros:

#define ITK_TEMPLATE_1(x1)       x1
#define ITK_TEMPLATE_2(x1,x2)    x1,x2
#define ITK_TEMPLATE_3(x1,x2,x3) x1,x2,x3

Consider the definition

#define ITK_TEMPLATE_EXPORT(x) template ITK_TEMPLATE_##x;

Now we can use the same macro for both Foo and Vector instantiations:

namespace itk
{
  ITK_TEMPLATE_EXPORT(1(class ITKCommon_EXPORT Foo<int>))
  ITK_TEMPLATE_EXPORT(2(class ITKCommon_EXPORT Vector<double, 3>))
}

The preprocessor will expand the Foo line first to

template ITK_TEMPLATE_1(class ITKCommon_EXPORT Foo<int>);

which will be recursively expanded to

template class ITKCommon_EXPORT Foo<int>;

Similarly, the preprocessor will first expand the Vector line to

template ITK_TEMPLATE_2(class ITKCommon_EXPORT Vector<double, 3>);

which will be recursively expanded to

template class ITKCommon_EXPORT Vector<double, 3>;

These are exactly the lines we need. The ITK_TEMPLATE_IMPORT macro can be modified similarly.
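The numbered-helper technique can be checked with a small compilable sketch. It assumes ITKCommon_EXPORT is defined empty, and the Foo and Vector bodies are hypothetical stand-ins rather than the ITK classes. SketchVectorDimension is a hypothetical helper.

```cpp
#include <cassert>

#define ITKCommon_EXPORT  // assumption: no DLL decoration in this sketch

// Helpers that strip one level of parentheses from 1 or 2 arguments.
#define ITK_TEMPLATE_1(x1)    x1
#define ITK_TEMPLATE_2(x1,x2) x1,x2

// The leading digit of "x" is pasted onto ITK_TEMPLATE_ to pick the helper.
#define ITK_TEMPLATE_EXPORT(x) template ITK_TEMPLATE_##x;

namespace itk
{
  template <class T>
  class Foo
  {
  public:
    int Size() const { return static_cast<int>(sizeof(T)); }
  };

  template <class T, unsigned int D>
  class Vector
  {
  public:
    unsigned int Dimension() const { return D; }
  };

  // One macro argument each, although the second contains a comma.
  ITK_TEMPLATE_EXPORT(1(class ITKCommon_EXPORT Foo<int>))
  ITK_TEMPLATE_EXPORT(2(class ITKCommon_EXPORT Vector<double, 3>))
}

// Hypothetical helper that uses the two-argument instantiation.
inline unsigned int SketchVectorDimension()
{
  itk::Vector<double, 3> v;
  assert(v.Dimension() > 0u);
  return v.Dimension();
}
```

The digit in front of the parentheses must match the number of top-level commas plus one; a mismatch shows up as a preprocessing error rather than as a wrong instantiation.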

In order to use this new macro properly we need to change our definition of ITK_TEMPLATE_Foo. Consider the new definition

#define ITK_TEMPLATE_Foo(_, EXPORT, T) namespace itk { \
  _(1(class EXPORT Foo< T >)) \
  }

This will produce a proper invocation of ITK_TEMPLATE_EXPORT and ITK_TEMPLATE_IMPORT for any template argument of Foo. However, this approach runs into trouble for templates like Vector that take multiple arguments. Consider an attempt to define the instantiation macro for Vector:

#define ITK_TEMPLATE_Vector(_, EXPORT, T, D) namespace itk { \
  _(2(class EXPORT Vector< T,D >)) \
  _(1(EXPORT std::ostream& operator<<(std::ostream&, \
                                      const Vector< T,D >&))) \
  _(1(EXPORT std::istream& operator>>(std::istream&, \
                                      Vector< T,D >&))) \
  }

Note that in the last two invocations every comma is nested inside a second level of parentheses (the function parameter list), so the length of the argument list is only one. This is why the first invocation is preceded by a "2" while the others have only "1".

This instantiation macro will work if invoked directly, but we will not be able to use the short-hand interface provided by ITK_EXPORT_ITKCommon and ITK_IMPORT_ITKCommon. If we try to invoke one of them for Vector we will have too many arguments:

ITK_EXPORT_ITKCommon(Vector, double, 3) // too many arguments

The preprocessor will again give an error. The solution is to use the numbered helper macros to pass the template argument list:

#define ITK_TEMPLATE_Foo(_, EXPORT, x) namespace itk { \
  _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) \
  }

#define ITK_TEMPLATE_Vector(_, EXPORT, x) namespace itk { \
  _(2(class EXPORT Vector< ITK_TEMPLATE_2 x >)) \
  _(1(EXPORT std::ostream& operator<<(std::ostream&, \
                                      const Vector< ITK_TEMPLATE_2 x >&))) \
  _(1(EXPORT std::istream& operator>>(std::istream&, \
                                      Vector< ITK_TEMPLATE_2 x >&))) \
  }

In this case the instantiation macro knows how many template arguments are required, so it can directly invoke the proper helper macro to expand the arguments passed through "x". The user need not pass the leading argument count.
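The unwrapping step can be seen in isolation, outside of any instantiation. This is a minimal sketch using std::pair as a stand-in two-argument template. SKETCH_PAIR, SketchPairID and SketchPairSum are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

#define ITK_TEMPLATE_1(x1)    x1
#define ITK_TEMPLATE_2(x1,x2) x1,x2

// "x" arrives as a single macro argument such as (int, double).
// Writing ITK_TEMPLATE_2 directly in front of it forms the invocation
// ITK_TEMPLATE_2(int, double), which expands to the bare list int, double.
#define SKETCH_PAIR(x) std::pair< ITK_TEMPLATE_2 x >

typedef SKETCH_PAIR((int, double)) SketchPairID;

static_assert(std::is_same<SketchPairID, std::pair<int, double> >::value,
              "the parenthesized list was unwrapped into two arguments");

// Hypothetical helper that uses the resulting type.
inline double SketchPairSum()
{
  SketchPairID p(1, 2.5);
  assert(p.first == 1);
  return p.first + p.second;
}
```

The caller supplies only the parenthesized list; the macro that knows the template's arity chooses the helper.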

Finally, we can use ITK_EXPORT_ITKCommon or its related macros to export templates with any number of arguments:

// Bottom of itkFoo+int-.cxx
ITK_EXPORT_ITKCommon(Foo, (int))

// Bottom of itkVector+double.3-.cxx
ITK_EXPORT_ITKCommon(Vector, (double, 3))

This will work well as long as the template arguments are simple types. In ITK, however, there are many class templates that are instantiated using other template instantiations as arguments. In our example, we might want to instantiate the type

itk::Foo< itk::Vector<double, 3> >

In order to do this we should first import our instantiation of Vector

ITK_IMPORT_ITKCommon(Vector, (double, 3))

and then export the instantiation of Foo

ITK_EXPORT_ITKCommon(Foo, (itk::Vector<double, 3>))

Unfortunately this passes an argument list of length two to the instantiation macro for Foo, which will be evaluated as

ITK_TEMPLATE_1(itk::Vector<double, 3>)

and will give a preprocessing error. Instead we need to use a typedef to avoid having a comma in the name of the Vector type:

typedef itk::Vector<double, 3> VectorD3;
ITK_EXPORT_ITKCommon(Foo, (VectorD3))

This will work, but it is tedious. Consider what it would take to import this instantiation.

ITK_IMPORT_ITKCommon(Vector, (double, 3))
typedef itk::Vector<double, 3> VectorD3;
ITK_IMPORT_ITKCommon(Foo, (VectorD3))

This requires that the author of the code importing the instantiation knows exactly how the Vector instantiation appears inside ITK_IMPORT_ITKCommon. Instead we can arrange things to have the Vector instantiation define the typedef automatically.

In order to provide this typedef automatically we need a way to produce a name for it. This can be done by adding another argument to the instantiation macros that provides a single preprocessing token corresponding to the template argument list which may be used to construct typedef names. For example, we might write

ITK_IMPORT_ITKCommon(Foo, (int), I)
ITK_IMPORT_ITKCommon(Vector, (double, 3), D3)

where the "I" corresponds to the argument list "(int)" and the "D3" corresponds to the argument list "(double, 3)". These tokens may then be used by the instantiation macros to produce a unique name for each instantiation, such as "FooI" and "VectorD3". In order to avoid conflict with other names in the itk namespace these names can be placed in a sub-namespace called "itk::Templates". Importing the above instantiation of Foo is now simpler:

ITK_IMPORT_ITKCommon(Vector, (double, 3), D3)
ITK_IMPORT_ITKCommon(Foo, (Templates::VectorD3), VD3)

We are now ready to define instantiation macros to achieve this syntax. The final version of the macros is given in the next section.

There is one last detail required to support the do_not_instantiate pragma on the SGI MIPSpro compiler. We must find a way to define ITK_TEMPLATE_IMPORT(X) to produce the code

#pragma do_not_instantiate X

There is no portable way to put a preprocessor directive inside a macro, but fortunately the MIPSpro compiler provides a compiler-specific way to do this. It will transform the expression

_Pragma("some string literal")

into

#pragma some string literal

We can take advantage of this to define ITK_TEMPLATE_IMPORT as follows.

#define ITK_TEMPLATE_IMPORT(x) ITK_TEMPLATE_IMPORT_DELAY(x)
#define ITK_TEMPLATE_IMPORT_DELAY(x) \
        ITK_TEMPLATE_IMPORT_IMPL(do_not_instantiate ITK_TEMPLATE_##x)
#define ITK_TEMPLATE_IMPORT_IMPL(x) _Pragma(#x)

This will transform the code

ITK_TEMPLATE_IMPORT(2(class Vector<double, 3>))

into

_Pragma("do_not_instantiate ITK_TEMPLATE_2(class Vector<double, 3>)")

and will be interpreted by the compiler as

#pragma do_not_instantiate class Vector<double, 3>

which is exactly what we need.
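The preprocessor half of this transformation can be checked on any compiler by returning the string instead of handing it to _Pragma. The following is a minimal sketch with hypothetical SKETCH_ names; it shows only that the stringified text still contains the unexpanded helper name, and makes no claim about how a particular compiler then interprets the pragma.

```cpp
#include <cassert>
#include <cstring>

#define ITK_TEMPLATE_2(x1,x2) x1,x2

// Same three-step structure as ITK_TEMPLATE_IMPORT above, except that the
// last step yields the string literal itself rather than _Pragma(string).
#define SKETCH_IMPORT(x) SKETCH_IMPORT_DELAY(x)
#define SKETCH_IMPORT_DELAY(x) \
        SKETCH_IMPORT_IMPL(do_not_instantiate ITK_TEMPLATE_##x)
#define SKETCH_IMPORT_IMPL(x) #x

// Hypothetical helper returning the text that would be passed to _Pragma.
inline const char* SketchPragmaText()
{
  const char* text = SKETCH_IMPORT(2(class Vector<double, 3>));
  // The operand of # is not macro-expanded, so the helper name survives.
  assert(std::strstr(text, "ITK_TEMPLATE_2") != 0);
  return text;
}
```

Because the operand of the # operator is never macro-expanded, removing the helper name from the pragma text is left to the compiler that processes the pragma.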

Final Instantiation Macros

We now give the final instantiation macro design used in ITK. These are defined and documented in the header itkMacro.h, except for the per-template instantiation macros.

The low-level instantiation macros are

#define ITK_TEMPLATE_EXPORT(x) ITK_TEMPLATE_EXPORT_DELAY(x)
#define ITK_TEMPLATE_IMPORT(x) ITK_TEMPLATE_IMPORT_DELAY(x)
#define ITK_TEMPLATE_EXPORT_DELAY(x) template ITK_TEMPLATE_##x;
#define ITK_TEMPLATE_IMPORT_DELAY(x) extern template ITK_TEMPLATE_##x;

One level of substitution delay is provided to allow the argument "x" to contain a macro computing the number of arguments in the paren-enclosed expression passed through the argument. Currently this is not needed but it is provided for flexibility and does not hurt anything.
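A small sketch shows what the extra level makes possible. SKETCH_VECTOR_ARITY is a hypothetical macro standing in for "a macro computing the number of arguments", and the Vector body is a stand-in. Because "x" is fully expanded before it reaches the _DELAY macro, the token pasted onto ITK_TEMPLATE_ is the digit 2 rather than the name SKETCH_VECTOR_ARITY.

```cpp
#include <cassert>

#define ITK_TEMPLATE_1(x1)    x1
#define ITK_TEMPLATE_2(x1,x2) x1,x2

#define ITK_TEMPLATE_EXPORT(x) ITK_TEMPLATE_EXPORT_DELAY(x)
#define ITK_TEMPLATE_EXPORT_DELAY(x) template ITK_TEMPLATE_##x;

// Hypothetical: the argument count supplied through a macro.
#define SKETCH_VECTOR_ARITY 2

namespace itk
{
  template <class T, unsigned int D>
  class Vector
  {
  public:
    unsigned int Dimension() const { return D; }
  };

  // Step 1: x = SKETCH_VECTOR_ARITY(class Vector<double, 3>) is expanded
  //         to 2(class Vector<double, 3>) while being passed along.
  // Step 2: the _DELAY macro pastes ITK_TEMPLATE_ and 2.
  ITK_TEMPLATE_EXPORT(SKETCH_VECTOR_ARITY(class Vector<double, 3>))
}

// Hypothetical helper that uses the instantiation.
inline unsigned int SketchDelayedDimension()
{
  itk::Vector<double, 3> v;
  assert(v.Dimension() > 0u);
  return v.Dimension();
}
```

Without the delay, the ## operator would act on the unexpanded argument and produce the undefined name ITK_TEMPLATE_SKETCH_VECTOR_ARITY.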

These macros are passed by name to per-template instantiation macros, which in turn invoke them to produce specific instantiations. Here we give the final example per-template instantiation macros for Foo and Vector, based on the above derivation.

#define ITK_TEMPLATE_Foo(_, EXPORT, x, y) namespace itk { \
  _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) \
  namespace Templates { typedef Foo< ITK_TEMPLATE_1 x > Foo##y; }\
  }

#define ITK_TEMPLATE_Vector(_, EXPORT, x, y) namespace itk { \
  _(2(class EXPORT Vector< ITK_TEMPLATE_2 x >)) \
  _(1(EXPORT std::ostream& operator<<(std::ostream&, \
                                      const Vector< ITK_TEMPLATE_2 x >&))) \
  _(1(EXPORT std::istream& operator>>(std::istream&, \
                                      Vector< ITK_TEMPLATE_2 x >&))) \
  namespace Templates { typedef Vector< ITK_TEMPLATE_2 x > Vector##y; } \
  }

These macros are typically defined in the header file of their corresponding class template just after the class template is defined. They should contain at least one invocation of the "_" macro passed as an argument plus one typedef in the Templates sub-namespace providing a name constructed from the class template name and the preprocessor token passed in the "y" argument.

The per-template instantiation macros are invoked by ITK_EXPORT_TEMPLATE and ITK_IMPORT_TEMPLATE, passing ITK_TEMPLATE_EXPORT and ITK_TEMPLATE_IMPORT respectively:

#define ITK_EXPORT_TEMPLATE(EXPORT, c, x, y) \
        ITK_TEMPLATE_##c(ITK_TEMPLATE_EXPORT, EXPORT, x, y)
#define ITK_IMPORT_TEMPLATE(EXPORT, c, x, y) \
        ITK_TEMPLATE_##c(ITK_TEMPLATE_IMPORT, EXPORT, x, y)

For these macros the "c" argument provides the class template name whose per-template instantiation macro is to be invoked. Finally, the library-specific export/import macros are

#define ITK_EXPORT_ITKCommon(c, x, y) \
        ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, c, x, y)
#define ITK_IMPORT_ITKCommon(c, x, y) \
        ITK_IMPORT_TEMPLATE(ITKCommon_EXPORT, c, x, y)

and provide a short-hand way to export and import instantiations from each library.

These macros can be used to export some example instantiations:

// itkFoo+int-.cxx
ITK_EXPORT_ITKCommon(Foo, (int), I)

// itkVector+double.3-.cxx
ITK_EXPORT_ITKCommon(Vector, (double, 3), D3)

// itkFoo+itkVector+double.3--.cxx
ITK_IMPORT_ITKCommon(Vector, (double, 3), D3)
ITK_EXPORT_ITKCommon(Foo, (Templates::VectorD3), VD3)

and also to import them

ITK_IMPORT_ITKCommon(Foo, (int), I)
ITK_IMPORT_ITKCommon(Vector, (double, 3), D3)
ITK_IMPORT_ITKCommon(Foo, (Templates::VectorD3), VD3)
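The whole chain can be exercised in one file. The following is a sketch under stated assumptions, not the ITK sources: ITKCommon_EXPORT is defined empty, the Foo and Vector bodies and stream operators are hypothetical stand-ins, the compiler accepts "extern template" and empty macro arguments (both standard since C++11), and SketchRoundTrip is a hypothetical check.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

#define ITKCommon_EXPORT  // assumption: no DLL decoration in this sketch
#define ITK_TEMPLATE_1(x1)    x1
#define ITK_TEMPLATE_2(x1,x2) x1,x2
#define ITK_TEMPLATE_EXPORT(x) ITK_TEMPLATE_EXPORT_DELAY(x)
#define ITK_TEMPLATE_IMPORT(x) ITK_TEMPLATE_IMPORT_DELAY(x)
#define ITK_TEMPLATE_EXPORT_DELAY(x) template ITK_TEMPLATE_##x;
#define ITK_TEMPLATE_IMPORT_DELAY(x) extern template ITK_TEMPLATE_##x;
#define ITK_EXPORT_TEMPLATE(EXPORT, c, x, y) \
        ITK_TEMPLATE_##c(ITK_TEMPLATE_EXPORT, EXPORT, x, y)
#define ITK_IMPORT_TEMPLATE(EXPORT, c, x, y) \
        ITK_TEMPLATE_##c(ITK_TEMPLATE_IMPORT, EXPORT, x, y)
#define ITK_EXPORT_ITKCommon(c, x, y) \
        ITK_EXPORT_TEMPLATE(ITKCommon_EXPORT, c, x, y)
#define ITK_IMPORT_ITKCommon(c, x, y) \
        ITK_IMPORT_TEMPLATE(ITKCommon_EXPORT, c, x, y)

namespace itk
{
  template <class T>
  class Foo
  {
  public:
    int MethodA() const { return static_cast<int>(sizeof(T)); }
  };

  template <class T, unsigned int D>
  class Vector
  {
  public:
    T data[D];
  };

  template <class T, unsigned int D>
  std::ostream& operator<<(std::ostream& os, const Vector<T, D>& v)
  {
    for (unsigned int i = 0; i < D; ++i) { os << v.data[i] << ' '; }
    return os;
  }

  template <class T, unsigned int D>
  std::istream& operator>>(std::istream& is, Vector<T, D>& v)
  {
    for (unsigned int i = 0; i < D; ++i) { is >> v.data[i]; }
    return is;
  }
}

#define ITK_TEMPLATE_Foo(_, EXPORT, x, y) namespace itk { \
  _(1(class EXPORT Foo< ITK_TEMPLATE_1 x >)) \
  namespace Templates { typedef Foo< ITK_TEMPLATE_1 x > Foo##y; } \
  }

#define ITK_TEMPLATE_Vector(_, EXPORT, x, y) namespace itk { \
  _(2(class EXPORT Vector< ITK_TEMPLATE_2 x >)) \
  _(1(EXPORT std::ostream& operator<<(std::ostream&, \
                                      const Vector< ITK_TEMPLATE_2 x >&))) \
  _(1(EXPORT std::istream& operator>>(std::istream&, \
                                      Vector< ITK_TEMPLATE_2 x >&))) \
  namespace Templates { typedef Vector< ITK_TEMPLATE_2 x > Vector##y; } \
  }

// Imports, as a client header would issue them.
ITK_IMPORT_ITKCommon(Vector, (double, 3), D3)
ITK_IMPORT_ITKCommon(Foo, (Templates::VectorD3), VD3)

// Exports, as the library's instantiation sources would issue them.
ITK_EXPORT_ITKCommon(Vector, (double, 3), D3)
ITK_EXPORT_ITKCommon(Foo, (Templates::VectorD3), VD3)

static_assert(std::is_same<itk::Templates::FooVD3,
                           itk::Foo<itk::Vector<double, 3> > >::value,
              "FooVD3 names Foo< Vector<double, 3> >");

// Hypothetical check: write a VectorD3 and read it back through the
// explicitly instantiated stream operators.
inline bool SketchRoundTrip()
{
  itk::Templates::VectorD3 v = {{1.0, 2.0, 3.0}};
  std::stringstream s;
  s << v;
  itk::Templates::VectorD3 w = {{0.0, 0.0, 0.0}};
  s >> w;
  assert(!s.fail());
  return w.data[0] == 1.0 && w.data[1] == 2.0 && w.data[2] == 3.0;
}
```

Each macro invocation repeats the typedef in itk::Templates; this is harmless because C++ permits redeclaring a typedef with the same type at namespace scope.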

Organizing Source Files

The instantiation macros defined above provide a concise syntax to export and import template instantiations for any template in ITK. We need to carefully organize our source files to make sure the macros are invoked at the proper time.

TODO...


Writing Instantiation Macros

All knowledge about how to instantiate a class template and its supporting function templates is encoded in its instantiation macro.

TODO...