VTK/Python Wrapper Enhancement
Revision as of 13:43, 11 May 2011
This project will improve the python wrappers bit-by-bit, with the goal of making the Python interface as close as possible to the original C++ interface. When each piece is finished, it will be marked as "done".
The original author of this document is David Gobbi. He can be reached on the VTK developers mailing list.
Improvements to Python wrappers
- Python 3 support
- Allow hierarchies of special types (i.e. non-vtkObjectBase classes)
- Enums and other constants (partly done)
- Operator support (partly done)
- Wrapping of templated types - will always be limited to selected types, but can still be very useful
- Wrapping of default-value arguments (done)
- Wrapping of istream and ostream
- Wrapping of pointers args - requires a better hinting system
- Wrapping of reference args and returning values in them (done)
- Wrapping of multi-dimensional arrays (done)
- Handle SetTuple/GetTuple methods of vtkDataArray (done)
- Wrap vtkCommand and allow it to be subclassed
- Wrap the CallData for observer methods
Python 3
Python 3 introduced major API changes for strings and ints, and minor API changes elsewhere.
- It will be possible to support Python 2 and Python 3 simultaneously
- PyIntObject and PyStringObject are absent in Python 3
- Python 3 uses unicode for all strings (ramifications for VTK?)
- A new, multi-dimensional buffer interface exists (plus, memoryview)
- Small changes to PyTypeObject
- Language changes: some examples will have to be rewritten
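At the Python level, supporting both versions from a single code base might look like the following minimal sketch. The names `text_type`, `integer_types`, `is_integer` and `to_c_string` are hypothetical helpers, not part of VTK, and the choice of UTF-8 for passing unicode strings to a C++ "const char *" parameter is an assumption.

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    text_type = str
    integer_types = (int,)
else:
    text_type = unicode          # this name exists in Python 2 only
    integer_types = (int, long)  # "long" exists in Python 2 only


def is_integer(value):
    """True for any built-in integer type of the running interpreter."""
    return isinstance(value, integer_types)


def to_c_string(value, encoding="utf-8"):
    """Return a byte string that could be handed to a 'const char *' arg.

    The encoding is an assumption; the wrappers might choose another one.
    """
    if isinstance(value, text_type):
        return value.encode(encoding)
    if isinstance(value, bytes):
        return value
    raise TypeError("expected a string, got %s" % type(value).__name__)
```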
Hierarchies of special types in Python wrappers
If each non-vtkObjectBase special type had its own PyTypeObject struct (generated by vtkWrapPython.c) then:
- These types could have a hierarchy via python's subclass system
- Type-specific protocols (number, sequence, buffer, etc.) could be supported, although this would require proper parsing of operators
Also: see VTK/WrapHierarchy for a tool that provides the entire hierarchy to the wrappers at compile-time. Note that the WrapHierarchy tool is of critical importance... without it, the wrappers cannot tell if a "vtkSomething *" parameter is a pointer to a vtkObjectBase object or to a special VTK type.
Enums and constants (mostly done as of July 31, 2010)
The new vtkParse will parse enums and #define constants.
- The FileInfo struct must be expanded to hold these constants
- Constant values should be stored as strings to simplify typing*
- The wrappers will have to automatically add the class scope to enum constants.
- Type checking for named enums, instead of treating them like "int" (not done)
* The strings can be written literally into the wrapper .cxx files where they will be evaluated as the correct type.
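As a rough illustration of adding the class scope to a constant that is stored as a string, the sketch below qualifies bare enum names inside a value expression. The function `add_class_scope` and the class name `vtkFoo` are hypothetical; the real wrapper generator is written in C and may work differently.

```python
import re


def add_class_scope(expr, class_name, member_names):
    """Prefix every bare enum member in 'expr' with 'class_name::'.

    Identifiers that are already qualified (preceded by '::') or that are
    not members of the class are left untouched.
    """
    def qualify(match):
        name = match.group(0)
        if name in member_names:
            return class_name + "::" + name
        return name

    # The lookbehind skips names that follow ':' or sit inside another token.
    return re.sub(r"(?<![\w:])[A-Za-z_]\w*", qualify, expr)
```

For example, an enum value stored as "A + 1" inside class vtkFoo would be written into the wrapper code as "vtkFoo::A + 1".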
Operator support (partly done)
The new vtkParse provides information about operator methods for the VTK classes. These operator methods can be mirrored in Python by:
- defining the appropriate "protocols" for special type objects
- defining the proper methods e.g. __setitem__() for vtkObjectBase objects
Each VTK special type will have its own python "type" object, and can thus support its own set of protocols that will be inherited by "subtypes". All vtkObjectBase objects have the same python "type" object so protocols cannot be used, but underscore methods can potentially be used.
Operators supported so far:
- the << operator for printing
- the < <= != etc. operators for comparisons
- the "[ ]" operator for indexing
Operators that could be supported in the future:
- the "[ ]" operator for mapping
- the "( )" operator for callable objects
- arithmetic operators (would require a lot of tedious work)
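From the Python side, the operators supported so far correspond to the special methods in the sketch below. `ToyVec3` is a hypothetical pure-Python stand-in for a wrapped special type, shown only to illustrate which Python protocol each C++ operator maps to; the real wrappers fill in the equivalent slots of a PyTypeObject in C.

```python
import functools


@functools.total_ordering
class ToyVec3(object):
    """Hypothetical stand-in for a wrapped VTK special type."""

    def __init__(self, x, y, z):
        self._v = [float(x), float(y), float(z)]

    # C++ operator<< for printing maps to str().
    def __str__(self):
        return "(%s)" % ", ".join(repr(c) for c in self._v)

    # C++ comparison operators map to rich comparison; total_ordering
    # derives <=, > and >= from __eq__ and __lt__.
    def __eq__(self, other):
        return self._v == other._v

    def __lt__(self, other):
        return self._v < other._v

    # C++ operator[] for indexing maps to the sequence protocol.
    def __len__(self):
        return len(self._v)

    def __getitem__(self, index):
        return self._v[index]

    def __setitem__(self, index, value):
        self._v[index] = float(value)
```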
Templated type handling in Python
Templated types should be made to look similar to C++, but with square brackets instead of angle brackets. E.g. vtkValue['f', 3]( ) would create a vtkValue<float, 3>( ). In Python, the specialized types would be stored in a dictionary. There is already template support in vtkParse, so all the information about the templates is available to vtkWrapPython... it is just a matter of specializing and wrapping the templated classes.
Default value arguments (done as of Sept 17, 2010)
Default argument value support would require the following:
- vtkParse must store the default value in the FunctionInfo struct as a string*
- vtkWrapPython.c must use these default values to initialize parameter values
- vtkWrapPython.c must place a bar "|" before the default args in the ParseTuple format string
- some other small changes would be needed
* the default value must be stored as a string to accommodate all types and to accommodate simple mathematical expressions, e.g. the default might be SomeMethod(int param = VTK_CONST1 - VTK_CONST2). The string "VTK_CONST1 - VTK_CONST2" can be dropped directly into the wrapper CXX code where it will be evaluated.
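The steps above can be sketched as a small generator that builds the ParseTuple format string and the C++ declarations. This is a Python illustration of the idea only (vtkWrapPython.c is written in C); the function `build_parse_info` and the variable names `temp0`, `temp1` are hypothetical.

```python
def build_parse_info(params):
    """Build a format string and C++ declarations for one method.

    'params' is a list of (format_code, cxx_type, default_text) tuples,
    where default_text is None for a required parameter.
    """
    fmt = ""
    declarations = []
    optional_started = False
    for index, (code, cxx_type, default) in enumerate(params):
        name = "temp%d" % index
        if default is None:
            if optional_started:
                raise ValueError("required parameter after a default one")
            declarations.append("%s %s;" % (cxx_type, name))
        else:
            if not optional_started:
                fmt += "|"  # everything after the bar is optional
                optional_started = True
            # The default text is dropped literally into the C++ code.
            declarations.append("%s %s = %s;" % (cxx_type, name, default))
        fmt += code
    return fmt, declarations
```

For a method with one required double and an int defaulting to VTK_CONST1 - VTK_CONST2, this yields the format string "d|i" and the declaration "int temp1 = VTK_CONST1 - VTK_CONST2;".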
Wrapping istream and ostream
The wrapper parser already identifies input and output streams as their own types. It would be straightforward to wrap these as Python file objects.
Pointer arg wrapping
Pointer arguments (as opposed to array arguments) are used for the following in VTK:
- passing data arrays, e.g. vtkIntArray::SetArray(int *data, vtkIdType size, int save)
- 'tuples' where the tuple size can change, e.g. vtkDataArray::SetTuple(double *tuple)
- return slots, e.g. vtkVariant::ToInt(bool *valid)
For the first two cases, there is an analogous situation for return values:
- int *vtkIntArray::GetPointer(vtkIdType offset)
The closest thing that Python has to a "pointer" is its buffer objects, such as array, string, and numeric array. The problem is that a python buffer always requires a size argument, but C++ rarely provides any hints about the size of the data object that a pointer is pointing to. Some heuristics would have to be applied:
- the wrappers can be made to look for vtkDataArray and properly handle their pointer methods
- other methods will need some sort of hinting.
The "count" for pointer args should be hinted so that they can be properly wrapped. E.g.
- vtkVariant::ToInt(bool *vtkSingleValue(valid))
- vtkVariant::ToInt(bool *vtkOptionalSingleValue(valid)) - can be safely set to NULL
- vtkDataArray::SetTuple(double *vtkMultiValue(tuple, ->GetNumberOfComponents()))
In the last example, the name of the method to get the count is supplied in the hint. Recognizing these macros in vtkParse would be easy. Unfortunately, they make the C++ code very ugly.
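To give a feel for how such hints could be recognized, here is a minimal sketch that uses a regular expression on the text of a single parameter. It is a Python illustration only; vtkParse is a C parser and would not work this way internally, and the function `parse_hint` and its return format are hypothetical.

```python
import re

HINT_RE = re.compile(
    r"\*\s*(vtk(?:Optional)?SingleValue|vtkMultiValue)"
    r"\(\s*(\w+)\s*(?:,\s*(.+?)\s*)?\)$")


def parse_hint(param_text):
    """Return (name, count_expression, may_be_null) or None if unhinted."""
    match = HINT_RE.search(param_text.strip())
    if match is None:
        return None
    macro, name, count = match.groups()
    if macro == "vtkMultiValue":
        # The count is an expression evaluated on the object at call time.
        return (name, count, False)
    return (name, "1", macro == "vtkOptionalSingleValue")
```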
Reference arg wrapping (done as of Sept 17, 2010)
This is trivial to add: only a few lines would have to be added to vtkWrapPython. For the reference arg, the user would have to pass a container object that supported both the sequence protocol and the number protocol. A new "mutable" object was created for this purpose.
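The idea of a container that receives the value of a reference arg can be sketched in pure Python. `ToyMutable` is a hypothetical stand-in, not the actual "mutable" object in the wrappers, and `parse_int` imitates a wrapped C++ method of the form int Parse(const char *text, bool &ok).

```python
class ToyMutable(object):
    """Hypothetical container that a wrapped method can write into."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value

    def set(self, value):
        self._value = value

    # A little of the number protocol ...
    def __int__(self):
        return int(self._value)

    def __float__(self):
        return float(self._value)

    # ... and a little of the sequence protocol.
    def __len__(self):
        return len(self._value)

    def __getitem__(self, index):
        return self._value[index]

    def __repr__(self):
        return "ToyMutable(%r)" % (self._value,)


def parse_int(text, ok):
    """Imitates a wrapped method that returns a status via a reference."""
    try:
        result = int(text)
    except ValueError:
        ok.set(False)
        return 0
    ok.set(True)
    return result
```

A caller would create the container first, pass it in place of the reference arg, and read the result back with get() after the call.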
Wrapping of multi-dimensional arrays (done as of Sept 17, 2010)
Python can unpack nested sequences, so reading multi-array args is easy. Writing back to them is a bit more complicated.
Sept 17: reading and writing is done. Reading requires that each element is passed as an arg to PyArg_ParseTuple, which is nice for small arrays but not so good for large arrays. Instead of having PyArg_ParseTuple unpack the elements, a subroutine could be added to vtkPythonUtil.
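The kind of subroutine suggested here, one that walks the nested sequence instead of passing every element through the argument parser, might look like this in outline. It is a Python sketch of the logic only; the real subroutine would be C++ code in vtkPythonUtil, and the names `flatten` and `write_back` are hypothetical.

```python
def flatten(nested, shape):
    """Check 'nested' against 'shape' and return its elements as a flat list."""
    if len(nested) != shape[0]:
        raise TypeError("expected a sequence of length %d" % shape[0])
    if len(shape) == 1:
        return list(nested)
    flat = []
    for row in nested:
        flat.extend(flatten(row, shape[1:]))
    return flat


def write_back(nested, flat, shape, offset=0):
    """Copy 'flat' back into the mutable nested sequence, in place.

    Returns the offset just past the last element consumed.
    """
    if len(shape) == 1:
        for i in range(shape[0]):
            nested[i] = flat[offset + i]
        return offset + shape[0]
    for row in nested:
        offset = write_back(row, flat, shape[1:], offset)
    return offset
```

Writing back only works when the innermost sequences are mutable (lists rather than tuples), which is part of why writing is more complicated than reading.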
GetTuple/SetTuple (done as of Aug 6, 2010)
The wrappers should make a "special case" for vtkDataArray and wrap the GetTuple/SetTuple methods using the knowledge that the tuple size is equal to the number of components. The same can be done for the subclasses, with GetTupleValue/SetTupleValue. This is a change that could also be easily done for Tcl and Java. (done as of Aug 6, 2010 for Python wrappers).
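The special case amounts to using the number of components as the tuple size on every call. The sketch below shows that rule with a hypothetical pure-Python class, `ToyDataArray`, which merely imitates the method names of vtkDataArray and is not the wrapped class.

```python
class ToyDataArray(object):
    """Hypothetical stand-in that stores tuples in one flat list."""

    def __init__(self, number_of_components):
        self._ncomp = number_of_components
        self._data = []

    def GetNumberOfComponents(self):
        return self._ncomp

    def _check(self, values):
        # The tuple size is known: it equals the number of components.
        if len(values) != self._ncomp:
            raise TypeError("expected a tuple of size %d" % self._ncomp)

    def InsertNextTuple(self, values):
        self._check(values)
        self._data.extend(float(v) for v in values)

    def SetTuple(self, i, values):
        self._check(values)
        start = i * self._ncomp
        self._data[start:start + self._ncomp] = [float(v) for v in values]

    def GetTuple(self, i):
        start = i * self._ncomp
        return tuple(self._data[start:start + self._ncomp])
```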
Wrap vtkCommand and allow it to be subclassed
Right now, vtkCommand() cannot be used from python because it is an abstract class, and abstract VTK classes cannot be subclassed in Python. Even without vtkCommand, the VTK Command/Observer features can still be used in Python because the vtkObject's AddObserver method can take any python method as an argument, and the wrappers internally convert that python method into a vtkPythonCommand. Unfortunately, though, some flexibility is lost because these vtkCommand methods are unavailable:
- SetAbortFlag()/GetAbortFlag()
- SetPassiveObserver()
I have done some work to remedy this, but the work is not yet complete. So far, I have:
- changed the CMake files so that vtkCommand is wrapped (but it is abstract and cannot be instantiated)
- made it possible to subclass vtkCommand in python (but the subclasses are abstract)
To make everything work, vtkCommand subclasses in python must actually be subclassed from vtkPythonCommand (which is concrete), and vtkPythonCommand must provide virtual function hooks so that Execute can be overridden as a virtual method. This is only possible if vtkPythonCommand is provided with a "PyObject *" slot for its pythonic other half, so that when Execute is called it can search the python dict to see if the method has been overridden.
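The lookup that the C++ side would perform through its "PyObject *" slot can be illustrated in pure Python. `ToyPythonCommand`, `is_overridden` and `dispatch` are hypothetical names; the sketch shows only the check for an overridden Execute, not the actual vtkPythonCommand code.

```python
class ToyPythonCommand(object):
    """Hypothetical concrete base class standing in for vtkPythonCommand."""

    def Execute(self, caller, event):
        pass  # default behaviour: nothing to do


def is_overridden(obj, method_name, base):
    """True if a subclass of 'base' defines 'method_name' in its own dict."""
    for klass in type(obj).__mro__:
        if klass is base:
            return False
        if method_name in klass.__dict__:
            return True
    return False


def dispatch(command, caller, event):
    """Call the Python override of Execute, if there is one."""
    if is_overridden(command, "Execute", ToyPythonCommand):
        command.Execute(caller, event)
        return True
    return False


class MyCommand(ToyPythonCommand):
    def Execute(self, caller, event):
        self.seen = event
```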
Wrap CallData for observer methods
A method that is passed to vtkObject.AddObserver() in python takes two args (O, s), where "O" is the observed object and "s" is a string that gives the event type. The corresponding C++ method takes a third argument: a void pointer called "CallData" that contains extra information about the event. The CallData is usually NULL, but sometimes contains useful information such as an error message, a pointer to a vtkObject, or a pointer to a numeric value. The python observer methods should be made to take an optional third argument, which will be the CallData automatically resolved to the correct type. It will be tricky to achieve this in a backwards-compatible manner because there is a lot of existing code that will break if passed a third argument, but judicious error detection within vtkPythonCommand.cxx can work around this by attempting to call the method with three parameters first, and then retrying with two parameters if a TypeError occurred with the usual parameter-count error text and if the traceback is exactly one level deep.
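The retry strategy described above can be sketched in Python. The function `call_observer` is hypothetical, and the real logic would live in C++ inside vtkPythonCommand.cxx. Testing for the word "argument" in the message is an assumption about the usual parameter-count error text.

```python
import sys
import traceback


def call_observer(callback, caller, event, calldata):
    """Call with three args first; fall back to two for old-style observers."""
    try:
        return callback(caller, event, calldata)
    except TypeError as exc:
        tb = sys.exc_info()[2]
        # A traceback that is exactly one level deep means the error came
        # from the call itself and not from code inside the callback.
        one_level = len(traceback.extract_tb(tb)) == 1
        count_error = "argument" in str(exc)
        if not (one_level and count_error):
            raise
    # Only reached for the parameter-count mismatch: retry with two args.
    return callback(caller, event)
```

A TypeError raised from inside the callback body produces a deeper traceback, so it is re-raised rather than hidden by the retry.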