VTK/Wrapping Special Types

This project began on April 28, 2010 and arose from a desire to improve the ability of the VTK wrappers to handle special VTK types, i.e. types that are not derived from vtkObjectBase. The primary goals are to wrap vtkVariant, and to fix the wrapping of vtkStdString, since the current code will sometimes push an invalid "const char *" to the wrapper language. The wrapping of vtkVariant will be done via a general-purpose mechanism that can be easily extended to other special classes in VTK.

Background

There are some very useful classes in VTK that are not derived from vtkObjectBase and are therefore not wrapped. Some, like vtkTimeStamp, are trivial classes. Others, like vtkVariant, are more comprehensive. The vtkStdString isn't wrapped, but is coerced to "const char *" by the wrappers in a manner that can leave dangling references to temporary objects.

State of the code

The wrapper generator code is a mess of hexadecimal literals, static variables, and badly-named functions. It should be updated to VTK code standards, in both the front-end and the back-end. This process has been started by the addition of vtkParseType.h, which defines useful macros for use by the wrapper code.

The vtkStdString issue

When vtkStdString was introduced to VTK, support for it was added to the wrappers in the following manner:

If a VTK method returned a vtkStdString or vtkStdString&, it would be stored in a "const char *" variable. Because type conversion from "string" to "char *" is automatic, this required minimal changes to the wrappers. Code that originally handled "const char *" would now handle strings, as well.

This scheme works fine for methods that return "vtkStdString&" because the string is stored somewhere and is guaranteed to be around at least until the contents of the "char *" are copied or otherwise used by the wrapper language. However, methods returning "vtkStdString" are problematic, because they create a temporary vtkStdString object which the wrappers then get a "char *" from. The temporary is destroyed at the end of the full expression that created it, so the "char *" is already dangling by the time the wrappers try to use it.

The fix should be straightforward. When a method returns a vtkStdString, the wrappers should store it in a vtkStdString variable that remains valid until the wrappers have copied the contents. The only complication is that the vtkParse front end must define a new type constant for vtkStdString.
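
The difference between the two patterns can be sketched in plain C++. This is only an illustration: std::string stands in for vtkStdString, and GetName() and CopyNameToBuffer() are hypothetical names, not part of VTK or of the wrapper code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a VTK method that returns a string by value.
std::string GetName() { return "vtkVariant"; }

// Unsafe pattern, shown only as a comment because using the pointer
// afterwards is undefined behaviour:
//   const char *p = GetName().c_str();  // temporary destroyed at the ';'
//   use(p);                             // p now dangles

// Safer pattern: hold the returned string in a named variable until
// its contents have been copied out.
std::size_t CopyNameToBuffer(char *buffer, std::size_t size)
{
  assert(size > 0);
  std::string held = GetName();     // owns the characters
  const char *p = held.c_str();     // valid while 'held' is in scope
  std::strncpy(buffer, p, size - 1);
  buffer[size - 1] = '\0';          // strncpy does not always terminate
  return std::strlen(buffer);
}
```

The wrapper-generated code would presumably do the equivalent with a vtkStdString local, copying into the wrapper language's own string object rather than into a char buffer.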

Changes Needed

New parser types for vtkStdString and vtkUnicodeString

These two types must be handled transparently by the wrapper languages, i.e. we want vtkStdString to be automatically converted to the wrapper language's native string type, as opposed to defining a special "vtkStdString" type in the wrapper language. In other words, we need to add two slots to the list of parse types. Unfortunately, because the parser types are enumerated as single hexadecimal digits, there are only 15 slots available, and 14 of those slots are used.

The parser internally uses 32-bit ints for types. The first 16 bits are used for the array count, e.g. "float arg[count]". The last 16 bits are used to store four hexadecimal digits that describe the following:

  • 1st digit: const (0x1) or static (0x2) or function pointer (0x5)
  • 2nd digit: reference (0x1) or pointer (0x3) and variations thereof up to 0x9
  • 3rd digit: unsigned (0x1)
  • 4th digit: base type (0x1 through 0xE)

This is poor usage of the available bits. If we squash the bitfield to remove bits that are always zero, then a full 8 bits or two hexadecimal digits can be reserved for the base type.
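
One possible squashed layout is sketched below. The mask names, bit positions, and the SquashType() helper are assumptions made for illustration; they are not the definitions in vtkParseType.h. The sketch takes an old four-digit code (qualifier, indirection, unsigned, base type) and repacks it so that the low 8 bits are free for the base type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical squashed layout for the low 16 bits of a parser type.
// Illustrative only; not the contents of vtkParseType.h.
constexpr std::uint32_t BASE_MASK      = 0x00FF; // bits 0-7: base type
constexpr std::uint32_t UNSIGNED_BIT   = 0x0100; // bit 8: unsigned flag
constexpr std::uint32_t INDIRECT_MASK  = 0x1E00; // bits 9-12: ref/pointer
constexpr std::uint32_t QUALIFIER_MASK = 0xE000; // bits 13-15: const/static/function
constexpr int INDIRECT_SHIFT  = 9;
constexpr int QUALIFIER_SHIFT = 13;

// Repack an old-style code, one hex digit per field, into the layout above.
// The upper 16 bits (the array count) are passed through unchanged.
std::uint32_t SquashType(std::uint32_t oldType)
{
  std::uint32_t qualifier  = (oldType >> 12) & 0xF; // const, static, function
  std::uint32_t indirect   = (oldType >> 8) & 0xF;  // reference, pointer, ...
  std::uint32_t isUnsigned = (oldType >> 4) & 0xF;  // 0 or 1
  std::uint32_t base       = oldType & 0xF;         // 0x1 through 0xE
  assert(qualifier <= 0x7 && isUnsigned <= 0x1);
  return (oldType & 0xFFFF0000u) |
         (qualifier << QUALIFIER_SHIFT) |
         (indirect << INDIRECT_SHIFT) |
         (isUnsigned ? UNSIGNED_BIT : 0u) |
         base;
}

// Field accessors for the squashed layout.
std::uint32_t BaseType(std::uint32_t t)    { return t & BASE_MASK; }
std::uint32_t Indirection(std::uint32_t t) { return (t & INDIRECT_MASK) >> INDIRECT_SHIFT; }
std::uint32_t Qualifier(std::uint32_t t)   { return (t & QUALIFIER_MASK) >> QUALIFIER_SHIFT; }
bool IsUnsigned(std::uint32_t t)           { return (t & UNSIGNED_BIT) != 0; }
```

With this packing, the old code 0x1314 (const, pointer, unsigned, base type 0x4) becomes 0x2704, and base type values above 0xE, such as new ones for vtkStdString and vtkUnicodeString, fit without touching the other fields.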

I have added a vtkParseType.h header file that defines the bitfields for the types, and have modified vtkParse so that it uses this header file. So it is now possible to change the type bitfields simply by changing vtkParseType.h.

Once this has been done, vtkStdString and vtkUnicodeString will be visible to vtkWrapPython.c and the other wrapper back-ends. The back-ends will have to be modified to convert back and forth between these types and the native string types of the wrapper languages.

Question: do we also want to wrap vtkstd::string and std::string?

Marking special classes in CMake

Unlike vtkStdString, we want vtkVariant to appear in the wrapper languages as a type called "vtkVariant" that has the same methods as the C++ class. In other words, we want it to be wrapped in a manner similar to vtkObjectBase-derived classes, but with some distinct differences.

The differences between wrapping "special classes" like vtkVariant vs. vtkObjectBase-derived classes are:

  • special classes must not be added to the vtkInstantiators
  • special classes must always be handled with copy semantics in the wrappers, since they don't have reference counts
  • the wrapper languages will probably not handle polymorphism for special classes

Because of these differences, there must be a way of marking special classes within CMake. Right now, their header files are simply marked as "WRAP_EXCLUDE". They must instead be marked as "WRAP_SPECIAL" so that they can be kept out of the instantiators, while being sent to whatever language wrappers are able to handle them. It is likely that support for special classes will only be added to the python wrappers (unless there are volunteers for the other wrapper languages).

It will probably be best if setting WRAP_SPECIAL implicitly sets WRAP_EXCLUDE, and then wrapper generators that know they are smart enough to wrap special types will ignore the WRAP_EXCLUDE if WRAP_SPECIAL is also set. By doing this, we can avoid breaking backwards compatibility with any third-party wrapper generators out there that are unable to wrap special types.

Python Specifics

PyVTKSpecialObjectType

The python wrappers already have a PyVTKSpecialObjectType that was originally developed for this purpose several years ago. However, it was unused because it was decided that it would be more expedient to only wrap vtkObjectBase-derived types.

To properly use this new "special object" type, vtkPythonUtil.cxx needs a new hash that can be used to store information about these special types. The hash will map the class name to a PyVTKSpecialTypeInfo struct, which will contain a pointer to a table of all the methods for the type, along with other important information such as the docstring.

For now, at least, there will be no polymorphism, i.e. the PyVTKSpecialTypeInfo struct will not provide info about the superclass. This is something that could be added in the future.

Each PyVTKSpecialObject will contain a pointer to its own copy of the underlying C++ object. That is, if Python ever encounters a vtkVariant, then it will make a PyVTKSpecialObject for that vtkVariant, and then will use the copy constructor to make its own copy. The use of copy semantics will eliminate the need for a garbage collection scheme.
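
The name-to-info hash and the copy semantics can be sketched in plain C++, without the Python C API. Everything here is a simplified assumption: SpecialTypeInfo, SpecialObject, RegisterSpecialType(), and the TimeStamp stand-in are hypothetical and only loosely mirror PyVTKSpecialTypeInfo and PyVTKSpecialObject; the method table is omitted.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a trivial special class such as vtkTimeStamp.
struct TimeStamp { unsigned long MTime; };

// Simplified analogue of the per-type info struct (no method table here).
struct SpecialTypeInfo
{
  std::string docstring;
  void *(*copy)(const void *);  // wraps the copy constructor
  void (*destroy)(void *);      // wraps the destructor
};

// The hash: class name -> type info.
std::map<std::string, SpecialTypeInfo> &SpecialTypeRegistry()
{
  static std::map<std::string, SpecialTypeInfo> registry;
  return registry;
}

template <class T>
void RegisterSpecialType(const std::string &name, const std::string &doc)
{
  SpecialTypeInfo info;
  info.docstring = doc;
  info.copy = [](const void *p) -> void * { return new T(*static_cast<const T *>(p)); };
  info.destroy = [](void *p) { delete static_cast<T *>(p); };
  SpecialTypeRegistry()[name] = info;
}

// Analogue of the special object: it owns a private copy of the C++
// object, so no reference counting on the C++ side is needed.
class SpecialObject
{
public:
  SpecialObject(const std::string &name, const void *source)
    : Info(&SpecialTypeRegistry().at(name)), Ptr(Info->copy(source))
  {
    assert(Ptr != nullptr);
  }
  SpecialObject(const SpecialObject &other)
    : Info(other.Info), Ptr(Info->copy(other.Ptr)) {}
  SpecialObject &operator=(const SpecialObject &) = delete;
  ~SpecialObject() { Info->destroy(Ptr); }
  void *Get() const { return Ptr; }

private:
  const SpecialTypeInfo *Info;  // declared first, so initialized first
  void *Ptr;
};
```

Because each wrapper object holds its own copy, later changes to the original C++ object are not visible through the wrapper. That is the price of avoiding a garbage collection scheme.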

Resolving arguments and calling the correct signature

Currently, if there are multiple signatures for a particular VTK object method, then when a method is called the wrapper will try each signature in order until one of them is able to process the arguments. So if one method signature takes "float" and another method signature takes "int", then the one that is defined first is called regardless of whether the passed argument is "float" or "int". This behaviour is completely different from C++, where the compiler works very hard to resolve ambiguities between method signatures.

This is a particularly bad situation for vtkVariant, which has a multitude of constructors that take various argument types. After the default constructor, the first constructor defined is "vtkVariant(char c)" which will accept an int or a float with silent conversion. In other words, constructing a float or an int vtkVariant is impossible. To fix this problem, the python wrappers will have to compare the passed arguments against the available method signatures in order to optimally match the latter to the former.
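
One way to do this matching is to give every argument-to-parameter conversion a cost and pick the signature with the lowest total, instead of the first one that accepts the arguments. The sketch below is an assumption about how that could look, not the wrappers' actual algorithm: the Kind enum, the cost values, and the function names are all hypothetical, and the real code would inspect Python objects rather than an enum.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical argument kinds, standing in for Python argument types.
enum class Kind { Bool, Char, Int, Float, String };

// Cost of passing an argument of kind 'from' to a parameter of kind 'to'.
// 0 is an exact match, larger is worse, -1 means "not convertible".
// The specific costs are illustrative.
int ConversionCost(Kind from, Kind to)
{
  if (from == to) { return 0; }
  if (from == Kind::Bool && to == Kind::Int) { return 1; }
  if (from == Kind::Int && to == Kind::Float) { return 1; }
  if (from == Kind::Int && to == Kind::Char) { return 2; }   // narrowing
  if (from == Kind::Float && to == Kind::Int) { return 2; }  // truncating
  if (from == Kind::Float && to == Kind::Char) { return 3; }
  return -1;
}

// Return the index of the cheapest matching signature, or -1 if none match.
// Ties go to the signature that was declared first.
int BestSignature(const std::vector<std::vector<Kind>> &signatures,
                  const std::vector<Kind> &args)
{
  int best = -1;
  int bestCost = 0;
  for (std::size_t i = 0; i < signatures.size(); i++)
  {
    const std::vector<Kind> &sig = signatures[i];
    if (sig.size() != args.size()) { continue; }
    int total = 0;
    bool usable = true;
    for (std::size_t j = 0; j < args.size(); j++)
    {
      int cost = ConversionCost(args[j], sig[j]);
      if (cost < 0) { usable = false; break; }
      total += cost;
    }
    if (usable && (best < 0 || total < bestCost))
    {
      best = static_cast<int>(i);
      bestCost = total;
    }
  }
  return best;
}
```

Given one-argument signatures declared in the order char, int, float, string, an int argument selects the int signature under this scheme, whereas a first-match scheme would have stopped at char.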

Proper "bool" support

Python has had a native "bool" type since python 2.3, but the wrappers do not yet distinguish between bool and int. This unfortunately makes it impossible to construct a bool-valued vtkVariant in python. Because PyArg_ParseTuple doesn't provide a format character for bool, a fair bit of work is needed to check boolean arguments when resolving overloaded methods.

Tricky issues

How will wrappers know vtkObjectBase-derived args from non-vtkObjectBase args?

In order for vtkVariant and its ilk to be wrapped, the BTX/ETX will have to be removed from methods that use vtkVariant. But if only the python wrappers properly support vtkVariant, what will happen if someone calls these methods in Tcl? Well, right now the Tcl wrappers won't even compile, because they will try to call methods like "IsA()" on the vtkVariant. In a way this is good. It's better for code not to compile than to compile but then crash when it's run.

The reason that the Tcl wrappers don't compile is that they see a vtkVariant and assume that it is derived from vtkObjectBase. It should be easy to tell the difference, though:

  • vtkObjectBase-derived objects are always handled via pointers (this is enforced)
  • vtkVariant is rarely handled via pointers

So "vtkObj *" can be assumed to be a vtkObjectBase, while "vtkObj" or "vtkObj&" can be assumed to not be a vtkObjectBase. Sounds simple, right? But how safe are these assumptions?

Well, the assumption that a "vtkObj *" is derived from vtkObjectBase is not safe. There are a few methods that use "vtkVariant *", for example. We have to make sure that, if these methods aren't enclosed in BTX/ETX, the wrapper code will fail to compile.

Status

30 April 2010

For python, vtkVariant and vtkTimeStamp are both automatically wrapped. There is an issue with constructors for vtkVariant that still needs to be fixed. The wrapper code tries out the various constructor signatures in order until one is able to use the constructor arguments. Because the "char" constructor is tried first, any attempt to make a "double" or "int" variant results in the creation of a "char" variant instead. Creating a variant from a string or a vtkObject works fine, though. Storing variants in a vtkVariantArray also works.

A problem was found with the parser: the leading "~" of the destructor signature is thrown away by the parser, so the destructor isn't distinguishable from a constructor.