VTK/Python Wrapper Enhancement: Difference between revisions

Latest revision as of 16:08, 24 October 2017

This project will improve the python wrappers bit-by-bit, with the goal of making the Python interface as close as possible to the original C++ interface. When each piece is finished, it will be marked as "done".

The original author of this document is David Gobbi. He can be reached on the VTK developers mailing list.

Enums and constants (mostly done as of July 31, 2010, finished on Dec 13, 2014)

The new vtkParse will parse enums and #define constants.

  • The FileInfo struct must be expanded to hold these constants
  • Constant values should be stored as strings to simplify typing*
  • The wrappers will have to automatically add the class scope to enum constants.
  • Type checking for named enums, instead of treating them like "int" (done on Nov 21, 2014)

* The strings can be written literally into the wrapper .cxx files where they will be evaluated as the correct type.

GetTuple/SetTuple (done as of Aug 6, 2010)

The wrappers should make a "special case" for vtkDataArray and wrap the GetTuple/SetTuple methods using the knowledge that the tuple size is equal to the number of components. The same can be done for the subclasses, with GetTupleValue/SetTupleValue. This change could also easily be made for Tcl and Java (done as of Aug 6, 2010 for the Python wrappers).
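
The rule "tuple size equals number of components" can be illustrated with a small pure-Python model. This is only a sketch: ComponentArray is a hypothetical stand-in written for this page, not the wrapped vtkDataArray, and it borrows the VTK method names purely for readability.

```python
class ComponentArray:
    """Hypothetical stand-in for a data array with fixed-size tuples."""

    def __init__(self, number_of_components):
        self._ncomp = number_of_components
        self._values = []  # flat storage, one tuple after another

    def GetNumberOfComponents(self):
        return self._ncomp

    def _check(self, tup):
        # The wrapper knows the expected length without any extra hint.
        if len(tup) != self._ncomp:
            raise ValueError(
                "expected a tuple of size %d, got %d" % (self._ncomp, len(tup)))

    def InsertNextTuple(self, tup):
        self._check(tup)
        self._values.extend(float(x) for x in tup)

    def SetTuple(self, i, tup):
        self._check(tup)
        n = self._ncomp
        self._values[i * n:(i + 1) * n] = [float(x) for x in tup]

    def GetTuple(self, i):
        n = self._ncomp
        return tuple(self._values[i * n:(i + 1) * n])
```

The point of the design is that the size check is derived from the array itself, so the Python caller never has to pass a length.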

Default value arguments (done as of Sept 17, 2010)

Default argument value support would require the following:

  1. vtkParse must store the default value in the FunctionInfo struct as a string*
  2. vtkWrapPython.c must use these default values to initialize parameter values
  3. vtkWrapPython.c must place a bar "|" before the default args in the ParseTuple format string
  4. some other small changes would be needed

* the default value must be stored as a string to accommodate all types and to accommodate simple mathematical expressions, e.g. the default might be SomeMethod(int param = VTK_CONST1 - VTK_CONST2). The string "VTK_CONST1 - VTK_CONST2" can be dropped directly into the wrapper CXX code where it will be evaluated.
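
The steps above can be sketched as a tiny code generator. This is an illustrative model only, not the code in vtkWrapPython.c: the function names, the (format character, default string) representation, and the two-entry type table are assumptions made for the example.

```python
def build_format_string(params):
    """Build a PyArg_ParseTuple-style format string.

    params is a list of (format_char, default) pairs; default is None
    for a required parameter, otherwise a C++ expression kept as a string.
    """
    pieces = []
    seen_default = False
    for char, default in params:
        if default is not None and not seen_default:
            pieces.append("|")  # everything after the bar is optional
            seen_default = True
        pieces.append(char)
    return "".join(pieces)


def build_declarations(names, params):
    """Emit C++ declarations, dropping each default string in verbatim."""
    ctypes = {"i": "int", "d": "double"}  # assumed subset of types
    lines = []
    for name, (char, default) in zip(names, params):
        if default is None:
            lines.append("%s %s;" % (ctypes[char], name))
        else:
            lines.append("%s %s = %s;" % (ctypes[char], name, default))
    return lines
```

For a signature like SomeMethod(double x, int param = VTK_CONST1 - VTK_CONST2), the sketch produces the format string "d|i" and leaves the expression for the C++ compiler to evaluate.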

Reference arg wrapping (done as of Sept 17, 2010)

This is trivial to add; only a few lines would have to be added to vtkWrapPython. For the reference arg, the user would have to pass a container object that supports both the sequence protocol and the number protocol. A new "mutable" object was created for this purpose.
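
A rough pure-Python picture of how such a container behaves is shown below. The Mutable class and the to_int function are hypothetical stand-ins written for this page; they only imitate the idea of a C++ method that returns one value and writes a second one through a reference argument.

```python
class Mutable:
    """Hypothetical stand-in for the wrappers' "mutable" container."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value

    def set(self, value):
        self._value = value

    # A small slice of the number protocol, so the container can be
    # used where a plain number is expected.
    def __int__(self):
        return int(self._value)

    def __float__(self):
        return float(self._value)

    def __eq__(self, other):
        return self._value == other


def to_int(text, valid):
    """Imitates a C++ signature such as: int ToInt(bool &valid)."""
    try:
        result = int(text)
    except ValueError:
        valid.set(False)  # the "reference" is updated in place
        return 0
    valid.set(True)
    return result
```

The caller creates the container before the call and reads it afterwards, which is the closest Python equivalent to passing a variable by reference.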

Wrapping of multi-dimensional arrays (done as of Sept 17, 2010)

Python can unpack nested sequences, so reading multi-array args is easy. Writing back to them is a bit more complicated.

Sept 17: reading and writing are done. Reading requires that each element is passed as an arg to PyArg_ParseTuple, which is fine for small arrays but not so good for large arrays. Instead of having PyArg_ParseTuple unpack the elements, a subroutine could be added to vtkPythonUtil.
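
The kind of subroutine suggested above can be modelled in Python as a pair of recursive helpers. This is a sketch under stated assumptions: read_nested and write_nested are hypothetical names, and the real helper would be C++ code working on PyObject pointers rather than Python lists.

```python
def read_nested(seq, dims):
    """Flatten a nested sequence of the given dimensions into a list."""
    if len(seq) != dims[0]:
        raise TypeError(
            "expected a sequence of length %d, got %d" % (dims[0], len(seq)))
    if len(dims) == 1:
        return [float(x) for x in seq]
    flat = []
    for sub in seq:
        flat.extend(read_nested(sub, dims[1:]))
    return flat


def write_nested(seq, flat, dims, offset=0):
    """Copy flat values back into a nested mutable sequence, in place.

    Returns the offset of the next unread value in flat.
    """
    if len(dims) == 1:
        for i in range(dims[0]):
            seq[i] = flat[offset + i]
        return offset + dims[0]
    for sub in seq:
        offset = write_nested(sub, flat, dims[1:], offset)
    return offset
```

Writing back only works when the innermost sequences are mutable (lists, not tuples), which is why the write direction is the more complicated one.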

Operator support (partly done as of May 19, 2011)

The new vtkParse provides information about operator methods for the VTK classes. These operator methods can be mirrored in Python by:

  1. defining the appropriate "protocols" for special type objects
  2. defining the proper methods e.g. __setitem__() for vtkObjectBase objects

Each VTK special type will have its own python "type" object, and can thus support its own set of protocols that will be inherited by "subtypes". All vtkObjectBase objects have the same python "type" object so protocols cannot be used, but underscore methods can potentially be used.

Operators supported so far:

  • the << operator for printing
  • the < <= != etc. operators for comparisons
  • the "[ ]" operator for indexing

Operators that could be supported in the future:

  • the "[ ]" operator for mapping
  • the "( )" operator for callable objects
  • arithmetic operators (would require a lot of tedious work)
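
As a rough illustration of how the supported C++ operators line up with Python protocols, the following pure-Python class imitates a special type. Triple is a hypothetical class invented for this example; the wrappers fill in the corresponding C-level type slots rather than defining Python methods.

```python
import functools


@functools.total_ordering
class Triple:
    """Hypothetical special type holding three values."""

    def __init__(self, a, b, c):
        self._v = (a, b, c)

    def __str__(self):
        # plays the role of operator<< (printing)
        return "(%s, %s, %s)" % self._v

    def __eq__(self, other):
        # plays the role of operator== and, by negation, operator!=
        if not isinstance(other, Triple):
            return NotImplemented
        return self._v == other._v

    def __lt__(self, other):
        # total_ordering derives <=, > and >= from this and __eq__
        if not isinstance(other, Triple):
            return NotImplemented
        return self._v < other._v

    def __getitem__(self, index):
        # plays the role of operator[] used for indexing
        return self._v[index]
```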

Hierarchies of special types (done as of May 20, 2011)

If each non-vtkObjectBase special type had its own PyTypeObject struct (generated by vtkWrapPython.c) then:

  • These types could have a hierarchy via python's subclass system
  • Type-specific protocols (number, sequence, buffer, etc) could be supported; this would require proper parsing of operators

Also: see VTK/WrapHierarchy for a tool that provides the entire hierarchy to the wrappers at compile-time. Note that the WrapHierarchy tool is of critical importance... without it, the wrappers cannot tell if a "vtkSomething *" parameter is a pointer to a vtkObjectBase object or to a special VTK type.

Templated type handling (done as of May 31, 2011)

Should be made to look similar to C++, but with square brackets instead of angle brackets. E.g. vtkValue['float32', 3]( ) would create a vtkValue<float, 3>( ). In Python, the specialized types would be stored in a dictionary. There is already template support in vtkParse, so all the information about the templates is available to vtkWrapPython... it is just a matter of instantiating and wrapping the class templates.
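
The square-bracket lookup can be sketched in plain Python as an object whose item access consults a dictionary of specializations. TemplateType, register, and ValueFloat3 are hypothetical names used only for this illustration; they are not the objects generated by vtkWrapPython.

```python
class TemplateType:
    """Hypothetical template object: maps template args to classes."""

    def __init__(self, name):
        self.name = name
        self._specializations = {}  # key: tuple of template args

    def register(self, args, cls):
        self._specializations[tuple(args)] = cls

    def __getitem__(self, args):
        # obj['float32', 3] arrives as a tuple, obj['float32'] as a str
        if not isinstance(args, tuple):
            args = (args,)
        try:
            return self._specializations[args]
        except KeyError:
            raise KeyError(
                "%s has no specialization for %r" % (self.name, args)) from None


class ValueFloat3:
    """Stands in for the wrapped instantiation vtkValue<float, 3>."""


vtkValue = TemplateType("vtkValue")
vtkValue.register(("float32", 3), ValueFloat3)
```

With this in place, vtkValue['float32', 3]() constructs an instance of the registered class, and asking for an unregistered combination fails with a KeyError. This also shows why only selected instantiations can be offered.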

Wrap Namespaces (partly done as of Nov 21, 2014)

There are two ways that namespaces could be handled:

  • as modules, with all items in the namespace placed within the dict of the module (this option was chosen)
  • as class objects, with all items as attributes of the class (functions would be static methods of the class)

Only constants and enum types within the namespace are wrapped. Functions and classes in the namespace are not yet wrapped.
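
The chosen option (namespace as module) can be pictured with the standard types and enum modules. This is a sketch, assuming a hypothetical helper make_namespace_module; IntEnum is used here as a convenient stand-in for the enum types that the wrappers generate.

```python
import enum
import types


def make_namespace_module(name, constants, enums):
    """Build a module object whose dict holds a namespace's contents.

    constants maps names to values; enums maps an enum type name to a
    mapping of member names to integer values.
    """
    module = types.ModuleType(name)
    module.__dict__.update(constants)
    for enum_name, members in enums.items():
        module.__dict__[enum_name] = enum.IntEnum(enum_name, members)
    return module
```

For example, make_namespace_module("demo", {"MAX_SIZE": 16}, {"Mode": {"FAST": 0, "SAFE": 1}}) yields an object on which demo.MAX_SIZE and demo.Mode.SAFE are ordinary attribute lookups.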

Eliminate WRAP_SPECIAL (partly done as of Sept 8, 2015)

The WRAP_SPECIAL (and WRAP_EXCLUDE) indicators in CMakeLists.txt are a pain to maintain. It would be possible to just let the python wrappers attempt to wrap everything, and if any types turn up as "unwrappable" they could be wrapped as opaque pointers. That way they could still be passed back and forth between fully-wrapped objects. The only caveat is that some classes in VTK are incomplete, i.e. they are missing method definitions in their .cxx files, but this would be a good opportunity to fix such classes, or place them in an "#ifndef __WRAP__" block.

As a step towards this goal, WRAP_SPECIAL has been eliminated. The wrappers now try to wrap all classes unless they are specifically marked with the new WRAP_EXCLUDE_PYTHON flag.

Wrapping istream and ostream

The wrapper parser already identifies input and output streams as their own types. It would be straightforward to wrap these as Python file objects.

Pointer arg wrapping

See Wrapper Hints below.

Wrap vtkCommand and allow it to be subclassed

Right now, vtkCommand cannot be used from python because it is an abstract class, and abstract VTK classes cannot be subclassed in Python. Even without vtkCommand, the VTK Command/Observer features can still be used in Python because vtkObject's AddObserver method can take any python method as an argument, and the wrappers internally convert that python method into a vtkPythonCommand. Unfortunately, some flexibility is lost because these vtkCommand methods are not available:

  • SetAbortFlag()/GetAbortFlag()
  • SetPassiveObserver()

I have done some work to remedy this, but the work is not yet complete. So far, I have:

  1. changed the CMake files so that vtkCommand is wrapped (but it is abstract and cannot be instantiated)
  2. made it possible to subclass vtkCommand in python (but the subclasses are abstract)

To make everything work, vtkCommand subclasses in python must actually be subclassed from vtkPythonCommand (which is concrete), and vtkPythonCommand must provide virtual function hooks so that Execute can be overridden as a virtual method. This is only possible if vtkPythonCommand is provided with a "PyObject *" slot for its pythonic other half, so that when Execute is called it can search the python dict to see if the method has been overridden.

Wrap CallData for observer methods (done as of Feb 18, 2014)

See http://vtk.org/gitweb?p=VTK.git;a=commit;h=50d601bdb8ff10e6df6f0

A method that is passed to vtkObject.AddObserver() in python takes two args (O, s) where "O" is the observed object, and "s" is a string that gives the event type. The corresponding C++ method takes a third argument: a void pointer called "CallData" that contains extra information about the event. The CallData is usually NULL, but sometimes contains useful information such as an error message, a pointer to a vtkObject, or a pointer to a numeric value. The python observer methods should be made to take an optional third argument, which will be the CallData automatically resolved to the correct type. It will be tricky to achieve this in a backwards-compatible manner because there is a lot of existing code that will break if passed a third argument. Judicious error detection within vtkPythonCommand.cxx can work around this by attempting to call the method with three parameters first, and then retrying with two parameters if a TypeError occurred with the usual parameter-count error text and if the traceback is exactly one level deep.
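
The retry strategy can be sketched in Python as follows. This is an illustration of the idea only: invoke_observer is a hypothetical helper, the real logic lives in C++ inside vtkPythonCommand.cxx, and the sketch checks only the traceback depth, not the parameter-count error text.

```python
import sys


def invoke_observer(callback, caller, event, calldata):
    """Call an observer with CallData, falling back to two arguments."""
    try:
        return callback(caller, event, calldata)
    except TypeError:
        tb = sys.exc_info()[2]
        # The traceback starts at this frame. If there is a deeper
        # level, the TypeError came from inside the callback body and
        # is a genuine error, so it must not be swallowed.
        if tb.tb_next is not None:
            raise
    # The call itself was rejected: assume a two-argument observer.
    return callback(caller, event)
```

An old-style observer such as def on_event(obj, event) keeps working, a new-style def on_event(obj, event, calldata) receives the extra argument, and a TypeError raised inside either body still propagates to the caller.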

The using directive (done as of May 15, 2015)

In C++, when a class overrides a superclass method, then all superclass signatures of that method will be shadowed. In order to avoid this, the "using" directive can be used to bring them into the subclass namespace:

 using Superclass::SetColor;  // bring in e.g. SetColor(r,g,b)
 void SetColor(double color[3]);  // override SetColor(color)

Exactly the same shadowing occurs in python, and because the wrappers ignore the "using" directive, the above code does not fix the shadowing problem for the wrappers. In order for the wrappers to apply the "using" directive, the following must be done:

  • When vtkWrapPython parses a header, it must recursively parse superclass headers
  • If "using" is encountered, items should be brought in from the superclass namespace
  • For recursive parsing, see vtkParse_SetRecursive() in vtkParse.y and preprocessor_directive() in vtkParse.l
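
The shadowing problem and the effect of honouring "using" can be reproduced with ordinary Python classes. The classes below are hypothetical and exist only for this illustration; the real wrappers resolve overloads in generated C++ code, not with *args.

```python
class Prop:
    def SetColor(self, r, g, b):
        self.color = (r, g, b)


class Shadowing(Prop):
    # Overriding hides the three-argument superclass signature.
    def SetColor(self, color):
        self.color = tuple(color)


class WithUsing(Prop):
    # Imitates "using Superclass::SetColor": both signatures are
    # reachable through the subclass.
    def SetColor(self, *args):
        if len(args) == 1:
            self.color = tuple(args[0])
        else:
            super().SetColor(*args)
```

Calling Shadowing().SetColor(1, 2, 3) fails with a TypeError, while WithUsing accepts both SetColor(1, 2, 3) and SetColor([1, 2, 3]).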

Python 3 (done as of Aug 31, 2015)

Python 3 introduced major API changes for strings and ints, and minor API changes elsewhere.

  • Supported versions: done, supports 2.5, 2.6, 2.7 and 3.2+ (the original plan, since struck out, was Python 2.3 through Python 3)
  • Older versions of python will have to be dropped (an earlier note, since struck out, called Python 2.2 a grey area)
  • "Classic" classes are gone in Python 3, so PyVTKObject might not work anymore
  • PyIntObject and PyStringObject are absent in Python 3
  • Python 3 uses unicode for all strings (ramifications for VTK?): Python wrappers assume VTK uses utf-8
  • A new, multi-dimensional buffer interface exists (plus, memoryview): memoryview can be used with vtkDataArray
  • Small changes to PyTypeObject: done
  • Language changes, some examples will have to be rewritten: done

Wrapper hints via C++11 attributes (done as of Aug 31, 2017)

C++11 introduced attributes to provide hints for the compiler. These hints can also be used by the wrappers; see "Wrapping Hints" for more information:

  • [[vtk::expects(condition)]] tells the wrappers to check a condition before calling a method.
  • [[vtk::sizehint(parameter,expression)]] tells the wrappers that 'parameter' is a pointer to an array of size 'expression'.
  • [[vtk::zerocopy]] indicates that a pointer is to be handled via the Python buffer interface.
  • [[vtk::newinstance]] indicates that a return value is a new reference that the caller must dispose of.
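
The effect of a precondition hint such as vtk::expects can be imitated in Python with a decorator that tests a condition before forwarding the call. Everything in this sketch is an assumption made for illustration: the expects decorator, the Points class, and the choice of ValueError are not taken from the generated wrapper code.

```python
import functools


def expects(condition, description):
    """Check condition(self, *args) before calling the wrapped method."""
    def decorate(method):
        @functools.wraps(method)
        def wrapper(self, *args):
            if not condition(self, *args):
                raise ValueError("expects(%s) failed" % description)
            return method(self, *args)
        return wrapper
    return decorate


class Points:
    """Hypothetical container used to demonstrate the check."""

    def __init__(self, points):
        self._points = list(points)

    def GetNumberOfPoints(self):
        return len(self._points)

    @expects(lambda self, i: 0 <= i < self.GetNumberOfPoints(),
             "0 <= i && i < GetNumberOfPoints()")
    def GetPoint(self, i):
        return self._points[i]
```

The benefit is that an out-of-range index is reported as a Python exception before the underlying method runs, instead of reaching C++ code that might read invalid memory.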

Iterator support via begin(), end()

The wrappers should recognize C++ classes as containers if they have the following:

  • an ::iterator type member with a preincrement operator
  • begin() and end() methods that return the iterator type

The wrappers should add a tp_iter slot for such classes.
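
What a generated tp_iter slot would need to do can be modelled in Python: walk from begin() to end(), dereferencing and pre-incrementing along the way. Cursor and Container are hypothetical classes for this sketch, with increment() and deref() standing in for the C++ operator++ and operator*.

```python
class Cursor:
    """Hypothetical model of a C++ iterator over a list."""

    def __init__(self, data, pos):
        self._data = data
        self._pos = pos

    def increment(self):  # plays the role of ++it
        self._pos += 1

    def deref(self):  # plays the role of *it
        return self._data[self._pos]

    def __eq__(self, other):
        return self._data is other._data and self._pos == other._pos


class Container:
    """Hypothetical container exposing begin() and end()."""

    def __init__(self, items):
        self._items = list(items)

    def begin(self):
        return Cursor(self._items, 0)

    def end(self):
        return Cursor(self._items, len(self._items))

    def __iter__(self):
        # Python-level equivalent of filling in the tp_iter slot.
        it, end = self.begin(), self.end()
        while it != end:
            yield it.deref()
            it.increment()
```

Once __iter__ exists, the container works with for loops, list(), and any other consumer of the iterator protocol.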

Improved installation

There has been interest in improving the installation of the vtk python modules, including:

  • Install the modules within the existing python path (can vary from system to system, sometimes impossible for user installs, must be overridden for embedding e.g. ParaView)
  • Creating a wheel (whl) binary installer for use with pip

The sysconfig module for Python 2.7 and 3.2+ can be useful here (it contains tons of info, run "python -m sysconfig" for a dump).
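
As a small example of the information that sysconfig exposes, the snippet below looks up the directories where pure-Python and platform-specific modules are normally installed. The helper name module_install_dirs is made up for this page; sysconfig.get_paths() is part of the standard library.

```python
import sysconfig


def module_install_dirs():
    """Return the default install directories for the running Python."""
    paths = sysconfig.get_paths()
    return {
        "purelib": paths["purelib"],  # pure-Python packages
        "platlib": paths["platlib"],  # compiled extension modules
    }
```

The results depend on the interpreter and on whether a virtual environment is active, which is one reason the install location cannot be hard-coded.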