VTK/Wrapper Update 2010

From KitwarePublic
Revision as of 03:01, 12 July 2010 by Dgobbi (talk | contribs)

The new wrappers for VTK are not really "new"; they are a drastic clean-up of the original wrappers, which were completed circa 1998. The wrapper renewal project is currently open, with four necessary items and a long wish list of desired new features. The four main goals of this project are: 1) clean up the wrapper code by removing hard-coded hexadecimal constants and reducing the voodoo factor, 2) properly wrap vtkStdString, because it is a crucial interface type, 3) wrap vtkVariant in Python, especially for use in ParaView, and 4) eliminate the need for BTX/ETX markers in the code.

Overview

The main design goals for VTK wrappers have not changed since 1998. Specifically, the wrappers must be:

  1. Scalable to a very large number of classes with minimal additional compile-time or run-time overhead.
  2. Able to wrap VTK classes as automatically as possible, with a minimal amount of hinting.
  3. Able to support multiple wrapper back-ends for different wrapper languages.

The core of the wrapper is a lex/yacc parser that reads C++ header files and stores information about the classes in C data structures that can be used by the wrapper-generator back-ends. This parser and its data structures are what have received the most attention during this wrapper update. Some important points about the parser (both the new and the old) are as follows:

  1. It only parses the header file in question; it does not pull in the included header files.
  2. It understands all (or nearly all) of the VTK macros defined in vtkSetGet.h.

These two points are important for the efficiency and simplicity of the parser. The parser does not have a C preprocessor, and it does not read more than one file at a time. Instead, it relies on its built-in knowledge of the VTK macros. The new parser front-end does have the ability to read and parse multiple header files, but this feature is not taken advantage of for VTK. It is primarily just a result of the code cleanup.
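To make the idea of "built-in knowledge of the VTK macros" concrete, here is a minimal sketch in Python. It is not the lex/yacc parser itself: the table MACRO_SIGNATURES, the regular expression, and the function methods_from_macros are hypothetical names invented for this illustration, and the signatures are simplified (the real macros declare virtual methods). The point is only that a macro invocation can be mapped straight to the methods it declares, without running a preprocessor or opening any other file.

```python
import re

# Hypothetical, heavily simplified stand-in for the parser's built-in macro
# knowledge: each recognized macro maps to the method signature(s) it declares.
MACRO_SIGNATURES = {
    "vtkSetMacro": ["void Set{name}({type})"],
    "vtkGetMacro": ["{type} Get{name}()"],
    "vtkBooleanMacro": ["void {name}On()", "void {name}Off()"],
}

# Matches two-argument invocations such as "vtkSetMacro(Radius, double)".
MACRO_CALL = re.compile(r"\b(vtk\w+Macro)\s*\(\s*(\w+)\s*,\s*([^)]+?)\s*\)")


def methods_from_macros(header_text):
    """Return the method signatures implied by known macros in one header."""
    methods = []
    for macro, name, ctype in MACRO_CALL.findall(header_text):
        # Unknown macros are skipped, since no expansion is available for them.
        for template in MACRO_SIGNATURES.get(macro, []):
            methods.append(template.format(name=name, type=ctype))
    return methods


header = """
class vtkExample : public vtkObject
{
public:
  vtkSetMacro(Radius, double);
  vtkGetMacro(Radius, double);
  vtkBooleanMacro(Capping, int);
};
"""

signatures = methods_from_macros(header)
for signature in signatures:
    print(signature)
```

A macro that is missing from the table simply contributes nothing, which mirrors the limitation described above: without a real preprocessor, an unrecognized macro cannot be expanded.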

The four main items

The big cleanup

The first part of the cleanup was to remove all the hexadecimal constants like 0x303 from the files in the Wrapper directory and replace them with named constants defined in a new header file called vtkParseType.h. This was a tedious job, but by itself it was enough to make the code much more readable. For example, VTK_PARSE_CHAR_PTR is obvious, while 0x303 is not. An interesting note: the author (David Gobbi) was also responsible for converting the 1998-era decimal constants into hexadecimal constants in
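The readability gain can be sketched as follows. This is written in Python for brevity, whereas vtkParseType.h is a C header. Only the name VTK_PARSE_CHAR_PTR and the value 0x303 come from the text above; the other constant names and the bit layout are assumptions chosen so that the pieces combine to 0x303, and the real header may be organized differently.

```python
# Illustrative values only: this bit layout is an assumption chosen so that
# "char *" comes out as 0x303, the constant quoted in the text.
VTK_PARSE_BASE_TYPE = 0x0FF  # assumed mask for the fundamental type
VTK_PARSE_CHAR = 0x003       # assumed code for "char"
VTK_PARSE_POINTER = 0x300    # assumed code for one level of pointer
VTK_PARSE_CHAR_PTR = VTK_PARSE_POINTER | VTK_PARSE_CHAR


def is_char_pointer_magic(type_code):
    """Before the cleanup: the reader has to know what 0x303 means."""
    return type_code == 0x303


def is_char_pointer(type_code):
    """After the cleanup: the named constant documents the intent."""
    return type_code == VTK_PARSE_CHAR_PTR


def base_type(type_code):
    """Strip the pointer bits and keep only the fundamental type."""
    return type_code & VTK_PARSE_BASE_TYPE


print(hex(VTK_PARSE_CHAR_PTR))
```

Both predicates behave identically; the difference is purely that the second one can be read without a lookup table at hand.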

Wrapping vtkStdString

The vtkStdString type was introduced in VTK 5.0, as a VTK-standard subclass of std::string. It was wrapped via the expedient of adding a "const char *" typecast operator to it so that the wrappers could simply treat vtkStdString return values as if they were "const char *". This trick unfortunately only works for methods that return "vtkStdString&", i.e. methods that return a reference to a persistent string. In usual C++ programming practice, however, methods should always return strings by value, not by reference. As a result, VTK methods that returned vtkStdString had to be surrounded by BTX/ETX because, if they were wrapped, they would return a temporary vtkStdString object to the wrappers, which would then grab a pointer to the internal "char *", which would immediately become invalid. This issue of having to BTX/ETX methods that return vtkStdString persisted from 2005 to 2010, with only a select few methods that returned "vtkStdString&" being properly wrapped. The original addition of vtkStdString to the wrappers is logged as follows:

ENH: Wrap vtkStringArray by adding vtkStdString as a special token and mapping 
it to "const char *" in the wrappers.  vtkStringArray::GetValue() was changed to 
return a reference because otherwise c_str() is called on a temporary 
vtkStdString object.
dgobbi (author) May 21, 2005

In the new wrappers, vtkStdString (and vtkstd::string and std::string) are recognized by the parser as their own types, not as "const char *", so now all VTK methods that use vtkStdString can be properly and safely wrapped.

The parser also recognizes vtkUnicodeString, but only the Python wrappers handle this type. In the Python wrappers, vtkUnicodeString is synonymous with Python's unicode type, with automatic conversion between the two.

Wrapping vtkVariant

The vtkVariant type is a VTK type that can hold any of the types commonly used in VTK, such as the C++ numeric types, vtkObjects, vtkStdString, and vtkUnicodeString. It is, in other words, an interface to a union of these types. An increasing number of classes in VTK use it as an interface, so there was a strong interest in wrapping it, particularly for use in ParaView's Python scripting engine. The new wrappers make vtkVariant available in Python, but not in Java or Tcl (and not, as of yet, in ParaView's ClientServer wrapper).

Two approaches could have been taken for wrapping vtkVariant. The first approach would have been to make vtkVariant invisible from Python, i.e. methods taking vtkVariant arguments would automatically convert the given Python type into a vtkVariant, and methods returning vtkVariant would automatically convert the vtkVariant to a native Python type (or to a vtkObject). The second approach was to explicitly wrap vtkVariant and make it possible to construct and use vtkVariant objects within Python. This latter approach was taken, because it makes Python VTK code much easier to compare with and convert to C++ VTK code.

One concession was made, however. The VTK/Python wrappers were modified to support automatic argument conversion via the vtkVariant constructors. So if a VTK method accepts a vtkVariant, then you can pass a numeric value, a string, a unicode string, or a vtkObject and the vtkVariant will be constructed automatically. This kind of argument conversion is standard in C++, but not in Python, except for the VTK/Python wrappers.
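The behavior described above can be imitated in pure Python. The sketch below does not use the real VTK module: Variant, VariantList, accepts_variant, and insert_next_value are hypothetical stand-ins invented for this illustration, and the toy Variant accepts only numbers and strings. It shows the general technique, which is to route any argument that is not already a variant through the variant's constructor before the method body runs.

```python
class Variant:
    """Hypothetical stand-in for a wrapped vtkVariant (not the real class)."""

    def __init__(self, value):
        if isinstance(value, Variant):
            value = value.value  # copy construction
        if not isinstance(value, (int, float, str)):
            raise TypeError(
                "cannot construct Variant from %s" % type(value).__name__)
        self.value = value


def accepts_variant(method):
    """Wrap a one-argument method so the argument is converted via Variant()."""
    def wrapper(self, arg):
        if not isinstance(arg, Variant):
            arg = Variant(arg)  # implicit conversion through the constructor
        return method(self, arg)
    return wrapper


class VariantList:
    """Hypothetical container whose method is declared to take a Variant."""

    def __init__(self):
        self.items = []

    @accepts_variant
    def insert_next_value(self, variant):
        self.items.append(variant)
        return len(self.items) - 1


values = VariantList()
values.insert_next_value(Variant(1))  # explicit construction, as in C++
values.insert_next_value(2.5)         # converted automatically
values.insert_next_value("three")     # converted automatically
print([item.value for item in values.items])
```

An argument that no constructor accepts, such as a list in this toy version, still raises a TypeError, so the conversion widens what a method accepts without making it accept everything.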

The method used to wrap vtkVariant is generic, and can be applied to other special VTK types. Currently the special-wrapped types for VTK/Python are vtkVariant, vtkTimeStamp, vtkArrayCoordinates, vtkArrayExtents, and vtkArrayRange.

Also see the following project page: Wrapping special types (start Apr 28, 2010, finish Jun 18, 2010)

Eliminating BTX/ETX from VTK header files

BTX/ETX markers have two main uses in the VTK header files. The first use is to block off code that the VTK wrapper parser cannot parse, since it does not understand all C++ syntax. The second use is to block off methods that, if they were wrapped, would cause the wrappers to either refuse to compile, or compile and then segfault if the method was called.
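For readers who have not seen the markers, they appear in headers as the comment lines //BTX and //ETX, and everything between them is hidden from the wrappers. The following Python sketch shows the effect on a small, made-up header. The function strip_btx_etx is a hypothetical helper written for this illustration, not part of the wrapper tools, and it assumes each marker sits on its own line.

```python
def strip_btx_etx(header_text):
    """Drop every line from a //BTX marker through the next //ETX marker."""
    kept = []
    skipping = False
    for line in header_text.splitlines():
        marker = line.strip()
        if marker == "//BTX":
            skipping = True       # start of a region hidden from the wrappers
        elif marker == "//ETX":
            skipping = False      # end of the hidden region
        elif not skipping:
            kept.append(line)
    return "\n".join(kept)


header = """\
class vtkExample : public vtkObject
{
public:
  void SetValue(int value);
//BTX
  std::vector<int> GetValues();
//ETX
  int GetValue();
};"""

visible = strip_btx_etx(header)
print(visible)
```

Here GetValues() never reaches the wrapper generator, which is how a method with an unsupported return type was kept out of the wrapped API.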

The new wrappers tackle both of these issues in order to make it possible to remove BTX/ETX from the code. For the first issue, the main feature of the new parser is that it is a full C++ parser, and it is likely to be confused only by the use of unrecognized preprocessor macros (since the wrapper's parser lacks a true preprocessor).

The second issue, i.e. the problem of wrapped methods either not compiling or segfaulting when used, was due to the inability of the wrappers to properly recognize anything but basic C types and vtkIdType. When the wrappers saw a vtkSomething as an argument, they would always assume that this was a vtkObjectBase-derived type. There are only two ways for the wrappers to be able to figure out types: the first is to have them go through all included header files and look for class definitions and typedefs, and the second is for them to be given a list of types that they can consult.

This "list of types" is provided by the new vtkWrapHierarchy tool, which has its own project page. The vtkWrapHierarchy tool reads all the VTK header files in one go, and spits out a file that lists all the classes, typedefs, and enums that are defined within the kit. The wrappers then read this file and use it to properly wrap method arguments and return types.
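As a rough sketch of how such a list lets a wrapper stop guessing, the Python code below loads a small hierarchy and answers the question "is this type derived from vtkObjectBase?". The file format shown ("name : superclass" for classes, "name = type" for typedefs), the sample entries, and the function names are all assumptions made for this illustration; the actual output of vtkWrapHierarchy may look different.

```python
# Hypothetical file format: "name : superclass" for classes and
# "name = underlying_type" for typedefs. The real format may differ.
HIERARCHY_TEXT = """\
vtkObjectBase
vtkObject : vtkObjectBase
vtkDataArray : vtkObject
vtkVariant
vtkStdString : std::string
vtkIdType = long long
"""


def load_hierarchy(text):
    """Parse the listing into {class: superclass} and {typedef: type} maps."""
    classes, typedefs = {}, {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if "=" in line:
            name, target = (part.strip() for part in line.split("=", 1))
            typedefs[name] = target
        elif " : " in line:
            name, parent = (part.strip() for part in line.split(" : ", 1))
            classes[name] = parent
        else:
            classes[line] = None  # class with no listed superclass
    return classes, typedefs


def is_vtk_object(name, classes):
    """True if `name` is vtkObjectBase or derives from it."""
    seen = set()
    while name is not None and name not in seen:
        if name == "vtkObjectBase":
            return True
        seen.add(name)
        name = classes.get(name)  # None once the chain leaves the listing
    return False


classes, typedefs = load_hierarchy(HIERARCHY_TEXT)
for type_name in ("vtkDataArray", "vtkVariant", "vtkStdString"):
    print(type_name, is_vtk_object(type_name, classes))
```

With a lookup of this kind, a type such as vtkVariant or vtkStdString is no longer assumed to be a vtkObjectBase-derived class just because its name starts with "vtk".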

Wish list items