36 SWIG and Python

Caution: This chapter is under repair!

This chapter describes SWIG's support of Python. SWIG is compatible with most recent Python versions including Python 3.0 and Python 2.6, as well as older versions dating back to Python 2.0. For the best results, consider using Python 2.3 or newer.

This chapter covers most SWIG features, but certain low-level details are covered in less depth than in earlier chapters. At the very least, make sure you read the "SWIG Basics" chapter.

36.1 Overview

To build Python extension modules, SWIG uses a layered approach in which parts of the extension module are defined in C and other parts are defined in Python. The C layer contains low-level wrappers whereas Python code is used to define high-level features.

This layered approach recognizes the fact that certain aspects of extension building are better accomplished in each language (instead of trying to do everything in C or C++). Furthermore, by generating code in both languages, you get a lot more flexibility since you can enhance the extension module with support code in either language.

In describing the Python interface, this chapter starts by covering the basics of configuration, compiling, and installing Python modules. Next, the Python interface to common C and C++ programming features is described. Advanced customization features such as typemaps are then described followed by a discussion of low-level implementation details.

36.2 Preliminaries

36.2.1 Running SWIG

Suppose that you defined a SWIG module such as the following:

/* File: example.i */
%module example

%{
#define SWIG_FILE_WITH_INIT
#include "example.h"
%}

int fact(int n);

The #define SWIG_FILE_WITH_INIT line inserts a macro that specifies that the resulting C file should be built as a python extension, inserting the module init code. This .i file wraps the following simple C file:

/* File: example.c */

#include "example.h"

int fact(int n) {
    if (n < 0){ /* This should probably return an error, but this is simpler */
        return 0;
    }
    if (n == 0) {
        return 1;
    }
    else {
        /* testing for overflow would be a good idea here */
        return n * fact(n-1);
    }
}

With the header file:

/* File: example.h */

int fact(int n);

To build a Python module, run SWIG using the -python option:

$ swig -python example.i

If building a C++ extension, add the -c++ option:

$ swig -c++ -python example.i

This creates two different files; a C/C++ source file example_wrap.c or example_wrap.cxx and a Python source file example.py. The generated C source file contains the low-level wrappers that need to be compiled and linked with the rest of your C/C++ application to create an extension module. The Python source file contains high-level support code. This is the file that you will import to use the module.

The name of the wrapper file is derived from the name of the input file. For example, if the input file is example.i, the name of the wrapper file is example_wrap.c. To change this, you can use the -o option. The name of the Python file is derived from the module name specified with %module. If the module name is example, then a file example.py is created.

The following sections have further practical examples and details on how you might go about compiling and using the generated files.

36.2.2 Using distutils

The preferred approach to building an extension module for python is to compile it with distutils, which comes with all recent versions of python (Distutils Docs).

Distutils takes care of making sure that your extension is built with all the correct flags, headers, etc. for the version of Python it is run with. Distutils will compile your extension into a shared object file or DLL (.so on Linux, .pyd on Windows, etc). In addition, distutils can handle installing your package into site-packages, if that is desired. A configuration file (conventionally called: setup.py) describes the extension (and related python modules). The distutils will then generate all the right compiler directives to build it for you.

Here is a sample setup.py file for the above example:

#!/usr/bin/env python

"""
setup.py file for SWIG example
"""

from distutils.core import setup, Extension


example_module = Extension('_example',
                           sources=['example_wrap.c', 'example.c'],
                           )

setup (name = 'example',
       version = '0.1',
       author      = "SWIG Docs",
       description = """Simple swig example from docs""",
       ext_modules = [example_module],
       py_modules = ["example"],
       )

In this example, the line: example_module = Extension(....) creates an Extension module object, defining the name as _example, and using the source code files: example_wrap.c, generated by swig, and example.c, your original c source. The swig (and other python extension modules) tradition is for the compiled extension to have the name of the python portion, prefixed by an underscore. If the name of your python module is "example.py", then the name of the corresponding object file will be"_example.so"

The setup call then sets up distutils to build your package, defining some meta data, and passing in your extension module object. Once this is saved as setup.py, you can build your extension with these commands:

$ swig -python example.i
$ python setup.py build_ext --inplace

And a .so, or .pyd or... will be created for you. It will build a version that matches the python that you run the command with. Taking apart the command line:

The distutils have many other features, consult the python distutils docs for details.

This same approach works on all platforms if the appropriate compiler is installed. (it can even build extensions to the standard Windows Python using MingGW)

36.2.3 Hand compiling a dynamic module

While the preferred approach to building an extension module is to use the distutils, some people like to integrate building extensions with a larger build system, and thus may wish to compile their modules without the distutils. To do this, you need to compile your program using commands like this (shown for Linux):

$ swig -python example.i
$ gcc -O2 -fPIC -c example.c
$ gcc -O2 -fPIC -c example_wrap.c -I/usr/local/include/python2.5
$ gcc -shared example.o example_wrap.o -o _example.so

The exact commands for doing this vary from platform to platform. However, SWIG tries to guess the right options when it is installed. Therefore, you may want to start with one of the examples in the SWIG/Examples/python directory. If that doesn't work, you will need to read the man-pages for your compiler and linker to get the right set of options. You might also check the SWIG Wiki for additional information.

When linking the module, the name of the output file has to match the name of the module prefixed by an underscore. If the name of your module is "example", then the name of the corresponding object file should be "_example.so" or "_examplemodule.so". The name of the module is specified using the %module directive or the -module command line option.

Compatibility Note: In SWIG-1.3.13 and earlier releases, module names did not include the leading underscore. This is because modules were normally created as C-only extensions without the extra Python support file (instead, creating Python code was supported as an optional feature). This has been changed in SWIG-1.3.14 and is consistent with other Python extension modules. For example, the socket module actually consists of two files; socket.py and _socket.so. Many other built-in Python modules follow a similar convention.

36.2.4 Static linking

An alternative approach to dynamic linking is to rebuild the Python interpreter with your extension module added to it. In the past, this approach was sometimes necessary due to limitations in dynamic loading support on certain machines. However, the situation has improved greatly over the last few years and you should not consider this approach unless there is really no other option.

The usual procedure for adding a new module to Python involves finding the Python source, adding an entry to the Modules/Setup file, and rebuilding the interpreter using the Python Makefile. However, newer Python versions have changed the build process. You may need to edit the 'setup.py' file in the Python distribution instead.

In earlier versions of SWIG, the embed.i library file could be used to rebuild the interpreter. For example:

%module example

%inline %{
extern int fact(int);
extern int mod(int, int);
extern double My_variable;
%}

%include "embed.i"       // Include code for a static version of Python

The embed.i library file includes supporting code that contains everything needed to rebuild Python. To rebuild the interpreter, you simply do something like this:

$ swig -python example.i
$ gcc example.c example_wrap.c \
        -Xlinker -export-dynamic \
        -DHAVE_CONFIG_H -I/usr/local/include/python2.1 \
	-I/usr/local/lib/python2.1/config \
	-L/usr/local/lib/python2.1/config -lpython2.1 -lm -ldl \
	-o mypython

You will need to supply the same libraries that were used to build Python the first time. This may include system libraries such as -lsocket, -lnsl, and -lpthread. Assuming this actually works, the new version of Python should be identical to the default version except that your extension module will be a built-in part of the interpreter.

Comment: In practice, you should probably try to avoid static linking if possible. Some programmers may be inclined to use static linking in the interest of getting better performance. However, the performance gained by static linking tends to be rather minimal in most situations (and quite frankly not worth the extra hassle in the opinion of this author).

Compatibility note: The embed.i library file is deprecated and has not been maintained for several years. Even though it appears to "work" with Python 2.1, no future support is guaranteed. If using static linking, you might want to rely on a different approach (perhaps using distutils).

36.2.5 Using your module

To use your module, simply use the Python import statement. If all goes well, you will be able to this:

$ python
>>> import example
>>> example.fact(4)
24
>>>

A common error received by first-time users is the following:

>>> import example
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "example.py", line 2, in ?
    import _example
ImportError: No module named _example

If you get this message, it means that you either forgot to compile the wrapper code into an extension module or you didn't give the extension module the right name. Make sure that you compiled the wrappers into a module called _example.so. And don't forget the leading underscore (_).

Another possible error is the following:

>>> import example
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: dynamic module does not define init function (init_example)
>>>                                                               

This error is almost always caused when a bad name is given to the shared object file. For example, if you created a file example.so instead of _example.so you would get this error. Alternatively, this error could arise if the name of the module is inconsistent with the module name supplied with the %module directive. Double-check the interface to make sure the module name and the shared object filename match. Another possible cause of this error is forgetting to link the SWIG-generated wrapper code with the rest of your application when creating the extension module.

Another common error is something similar to the following:

Traceback (most recent call last):
  File "example.py", line 3, in ?
    import example
ImportError: ./_example.so: undefined symbol: fact

This error usually indicates that you forgot to include some object files or libraries in the linking of the shared library file. Make sure you compile both the SWIG wrapper file and your original program into a shared library file. Make sure you pass all of the required libraries to the linker.

Sometimes unresolved symbols occur because a wrapper has been created for a function that doesn't actually exist in a library. This usually occurs when a header file includes a declaration for a function that was never actually implemented or it was removed from a library without updating the header file. To fix this, you can either edit the SWIG input file to remove the offending declaration or you can use the %ignore directive to ignore the declaration.

Finally, suppose that your extension module is linked with another library like this:

$ gcc -shared example.o example_wrap.o -L/home/beazley/projects/lib -lfoo \
      -o _example.so

If the foo library is compiled as a shared library, you might encounter the following problem when you try to use your module:

>>> import example
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: libfoo.so: cannot open shared object file: No such file or directory
>>>                 

This error is generated because the dynamic linker can't locate the libfoo.so library. When shared libraries are loaded, the system normally only checks a few standard locations such as /usr/lib and /usr/local/lib. To fix this problem, there are several things you can do. First, you can recompile your extension module with extra path information. For example, on Linux you can do this:

$ gcc -shared example.o example_wrap.o -L/home/beazley/projects/lib -lfoo \
      -Xlinker -rpath /home/beazley/projects/lib  \
      -o _example.so

Alternatively, you can set the LD_LIBRARY_PATH environment variable to include the directory with your shared libraries. If setting LD_LIBRARY_PATH, be aware that setting this variable can introduce a noticeable performance impact on all other applications that you run. To set it only for Python, you might want to do this instead:

$ env LD_LIBRARY_PATH=/home/beazley/projects/lib python

Finally, you can use a command such as ldconfig (Linux) or crle (Solaris) to add additional search paths to the default system configuration (this requires root access and you will need to read the man pages).

36.2.6 Compilation of C++ extensions

Compilation of C++ extensions has traditionally been a tricky problem. Since the Python interpreter is written in C, you need to take steps to make sure C++ is properly initialized and that modules are compiled correctly. This should be a non-issue if you're using distutils, as it takes care of all that for you. The following is included for historical reasons, and in case you need to compile on your own.

On most machines, C++ extension modules should be linked using the C++ compiler. For example:

$ swig -c++ -python example.i
$ g++ -O2 -fPIC -c example.cxx
$ g++ -O2 -fPIC -c example_wrap.cxx -I/usr/local/include/python2.5
$ g++ -shared example.o example_wrap.o -o _example.so

The -fPIC option tells GCC to generate position-independent code (PIC) which is required for most architectures (it's not vital on x86, but still a good idea as it allows code pages from the library to be shared between processes). Other compilers may need a different option specified instead of -fPIC.

In addition to this, you may need to include additional library files to make it work. For example, if you are using the Sun C++ compiler on Solaris, you often need to add an extra library -lCrun like this:

$ swig -c++ -python example.i
$ CC -c example.cxx
$ CC -c example_wrap.cxx -I/usr/local/include/python2.5
$ CC -G example.o example_wrap.o -L/opt/SUNWspro/lib -o _example.so -lCrun

Of course, the extra libraries to use are completely non-portable---you will probably need to do some experimentation.

Sometimes people have suggested that it is necessary to relink the Python interpreter using the C++ compiler to make C++ extension modules work. In the experience of this author, this has never actually appeared to be necessary. Relinking the interpreter with C++ really only includes the special run-time libraries described above---as long as you link your extension modules with these libraries, it should not be necessary to rebuild Python.

If you aren't entirely sure about the linking of a C++ extension, you might look at an existing C++ program. On many Unix machines, the ldd command will list library dependencies. This should give you some clues about what you might have to include when you link your extension module. For example:

$ ldd swig
        libstdc++-libc6.1-1.so.2 => /usr/lib/libstdc++-libc6.1-1.so.2 (0x40019000)
        libm.so.6 => /lib/libm.so.6 (0x4005b000)
        libc.so.6 => /lib/libc.so.6 (0x40077000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

As a final complication, a major weakness of C++ is that it does not define any sort of standard for binary linking of libraries. This means that C++ code compiled by different compilers will not link together properly as libraries nor is the memory layout of classes and data structures implemented in any kind of portable manner. In a monolithic C++ program, this problem may be unnoticed. However, in Python, it is possible for different extension modules to be compiled with different C++ compilers. As long as these modules are self-contained, this probably won't matter. However, if these modules start sharing data, you will need to take steps to avoid segmentation faults and other erratic program behavior. If working with lots of software components, you might want to investigate using a more formal standard such as COM.

36.2.7 Compiling for 64-bit platforms

On platforms that support 64-bit applications (Solaris, Irix, etc.), special care is required when building extension modules. On these machines, 64-bit applications are compiled and linked using a different set of compiler/linker options. In addition, it is not generally possible to mix 32-bit and 64-bit code together in the same application.

To utilize 64-bits, the Python executable will need to be recompiled as a 64-bit application. In addition, all libraries, wrapper code, and every other part of your application will need to be compiled for 64-bits. If you plan to use other third-party extension modules, they will also have to be recompiled as 64-bit extensions.

If you are wrapping commercial software for which you have no source code, you will be forced to use the same linking standard as used by that software. This may prevent the use of 64-bit extensions. It may also introduce problems on platforms that support more than one linking standard (e.g., -o32 and -n32 on Irix).

On the Linux x86_64 platform (Opteron or EM64T), besides of the required compiler option -fPIC discussed above, you will need to be careful about the libraries you link with or the library path you use. In general, a Linux distribution will have two set of libraries, one for native x86_64 programs (under /usr/lib64), and another for 32 bits compatibility (under /usr/lib). Also, the compiler options -m32 and -m64 allow you to choose the desired binary format for your python extension.

36.2.8 Building Python Extensions under Windows

Building a SWIG extension to Python under Windows is roughly similar to the process used with Unix. Using the distutils, it is essentially identical. If you have the same version of the MS compiler that Python was built with (the python2.4 and python2.5 distributed by python.org are built with Visual Studio 2003), the standard python setup.py build should just work.

As of python2.5, the distutils support building extensions with MingGW out of the box. Following the instruction here: Building Python extensions for Windows with only free tools should get you started.

If you need to build it on your own, the following notes are provided:

You will need to create a DLL that can be loaded into the interpreter. This section briefly describes the use of SWIG with Microsoft Visual C++. As a starting point, many of SWIG's examples include project files (.dsp files) for Visual C++ 6. These can be opened by more recent versions of Visual Studio. You might want to take a quick look at these examples in addition to reading this section.

In Developer Studio, SWIG should be invoked as a custom build option. This is usually done as follows:

If all went well, SWIG will be automatically invoked whenever you build your project. Any changes made to the interface file will result in SWIG being automatically executed to produce a new version of the wrapper file.

To run your new Python extension, simply run Python and use the import command as normal. For example :

$ python
>>> import example
>>> print example.fact(4)
24
>>>

If you get an ImportError exception when importing the module, you may have forgotten to include additional library files when you built your module. If you get an access violation or some kind of general protection fault immediately upon import, you have a more serious problem. This is often caused by linking your extension module against the wrong set of Win32 debug or thread libraries. You will have to fiddle around with the build options of project to try and track this down.

A 'Debug' build of the wrappers requires a debug build of the Python interpreter. This normally requires building the Python interpreter from source, which is not a job for the feint-hearted. Alternatively you can use the 'Release' build of the Python interpreter with a 'Debug' build of your wrappers by defining the SWIG_PYTHON_INTERPRETER_NO_DEBUG symbol under the preprocessor options. Or you can ensure this macro is defined at the beginning of the wrapper code using the following in your interface file, where _MSC_VER ensures it is only used by the Visual Studio compiler:

%begin %{
#ifdef _MSC_VER
#define SWIG_PYTHON_INTERPRETER_NO_DEBUG
#endif
%}

Some users have reported success in building extension modules using Cygwin and other compilers. However, the problem of building usable DLLs with these compilers tends to be rather problematic. For the latest information, you may want to consult the SWIG Wiki.

36.3 A tour of basic C/C++ wrapping

By default, SWIG tries to build a very natural Python interface to your C/C++ code. Functions are wrapped as functions, classes are wrapped as classes, and so forth. This section briefly covers the essential aspects of this wrapping.

36.3.1 Modules

The SWIG %module directive specifies the name of the Python module. If you specify `%module example', then everything is wrapped into a Python 'example' module. Underneath the covers, this module consists of a Python source file example.py and a low-level extension module _example.so. When choosing a module name, make sure you don't use the same name as a built-in Python command or standard module name.

36.3.2 Functions

Global functions are wrapped as new Python built-in functions. For example,

%module example
int fact(int n);

creates a built-in function example.fact(n) that works exactly like you think it does:

>>> import example
>>> print example.fact(4)
24
>>>

36.3.3 Global variables

C/C++ global variables are fully supported by SWIG. However, the underlying mechanism is somewhat different than you might expect due to the way that Python assignment works. When you type the following in Python

a = 3.4

"a" becomes a name for an object containing the value 3.4. If you later type

b = a

then "a" and "b" are both names for the object containing the value 3.4. Thus, there is only one object containing 3.4 and "a" and "b" are both names that refer to it. This is quite different than C where a variable name refers to a memory location in which a value is stored (and assignment copies data into that location). Because of this, there is no direct way to map variable assignment in C to variable assignment in Python.

To provide access to C global variables, SWIG creates a special object called `cvar' that is added to each SWIG generated module. Global variables are then accessed as attributes of this object. For example, consider this interface

// SWIG interface file with global variables
%module example
...
%inline %{
extern int My_variable;
extern double density;
%}
...

Now look at the Python interface:

>>> import example
>>> # Print out value of a C global variable
>>> print example.cvar.My_variable
4
>>> # Set the value of a C global variable
>>> example.cvar.density = 0.8442
>>> # Use in a math operation
>>> example.cvar.density = example.cvar.density*1.10

If you make an error in variable assignment, you will receive an error message. For example:

>>> example.cvar.density = "Hello"
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: C variable 'density (double )'
>>> 

If a variable is declared as const, it is wrapped as a read-only variable. Attempts to modify its value will result in an error.

To make ordinary variables read-only, you can use the %immutable directive. For example:

%{
extern char *path;
%}
%immutable;
extern char *path;
%mutable;

The %immutable directive stays in effect until it is explicitly disabled or cleared using %mutable. See the Creating read-only variables section for further details.

If you just want to make a specific variable immutable, supply a declaration name. For example:

%{
extern char *path;
%}
%immutable path;
...
extern char *path;      // Read-only (due to %immutable)

If you would like to access variables using a name other than "cvar", it can be changed using the -globals option :

$ swig -python -globals myvar example.i

Some care is in order when importing multiple SWIG modules. If you use the "from <file> import *" style of importing, you will get a name clash on the variable `cvar' and you will only be able to access global variables from the last module loaded. To prevent this, you might consider renaming cvar or making it private to the module by giving it a name that starts with a leading underscore. SWIG does not create cvar if there are no global variables in a module.

36.3.4 Constants and enums

C/C++ constants are installed as Python objects containing the appropriate value. To create a constant, use #define, enum, or the %constant directive. For example:

#define PI 3.14159
#define VERSION "1.0"

enum Beverage { ALE, LAGER, STOUT, PILSNER };

%constant int FOO = 42;
%constant const char *path = "/usr/local";

For enums, make sure that the definition of the enumeration actually appears in a header file or in the wrapper file somehow---if you just stick an enum in a SWIG interface without also telling the C compiler about it, the wrapper code won't compile.

Note: declarations declared as const are wrapped as read-only variables and will be accessed using the cvar object described in the previous section. They are not wrapped as constants. For further discussion about this, see the SWIG Basics chapter.

Constants are not guaranteed to remain constant in Python---the name of the constant could be accidentally reassigned to refer to some other object. Unfortunately, there is no easy way for SWIG to generate code that prevents this. You will just have to be careful.

36.3.5 Pointers

C/C++ pointers are fully supported by SWIG. Furthermore, SWIG has no problem working with incomplete type information. Here is a rather simple interface:

%module example

FILE *fopen(const char *filename, const char *mode);
int fputs(const char *, FILE *);
int fclose(FILE *);

When wrapped, you will be able to use the functions in a natural way from Python. For example:

>>> import example
>>> f = example.fopen("junk","w")
>>> example.fputs("Hello World\n", f)
>>> example.fclose(f)

If this makes you uneasy, rest assured that there is no deep magic involved. Underneath the covers, pointers to C/C++ objects are simply represented as opaque values using an especial python container object:

>>> print f
<Swig Object of type 'FILE *' at 0xb7d6f470>

This pointer value can be freely passed around to different C functions that expect to receive an object of type FILE *. The only thing you can't do is dereference the pointer from Python. Of course, that isn't much of a concern in this example.

In older versions of SWIG (1.3.22 or older), pointers were represented using a plain string object. If you have an old package that still requires that representation, or you just feel nostalgic, you can always retrieve it by casting the pointer object to a string:

>>> print str(f)
_c0671108_p_FILE

Also, if you need to pass the raw pointer value to some external python library, you can do it by casting the pointer object to an integer:

>>> print int(f)
135833352

However, the inverse operation is not possible, i.e., you can't build a SWIG pointer object from a raw integer value.

Note also that the '0' or NULL pointer is always represented by None, no matter what type swig is addressing. In the previous example, you can call:

>>> example.fclose(None)

and that will be equivalent to the following, but not really useful, C code:

FILE *f = NULL;
fclose(f);

As much as you might be inclined to modify a pointer value directly from Python, don't. The hexadecimal encoding is not necessarily the same as the logical memory address of the underlying object. Instead it is the raw byte encoding of the pointer value. The encoding will vary depending on the native byte-ordering of the platform (i.e., big-endian vs. little-endian). Similarly, don't try to manually cast a pointer to a new type by simply replacing the type-string. This may not work like you expect, it is particularly dangerous when casting C++ objects. If you need to cast a pointer or change its value, consider writing some helper functions instead. For example:

%inline %{
/* C-style cast */
Bar *FooToBar(Foo *f) {
   return (Bar *) f;
}

/* C++-style cast */
Foo *BarToFoo(Bar *b) {
   return dynamic_cast<Foo*>(b);
}

Foo *IncrFoo(Foo *f, int i) {
    return f+i;
}
%}

Also, if working with C++, you should always try to use the new C++ style casts. For example, in the above code, the C-style cast may return a bogus result whereas as the C++-style cast will return None if the conversion can't be performed.

36.3.6 Structures

If you wrap a C structure, it is wrapped by a Python class. This provides a very natural interface. For example,

struct Vector {
	double x,y,z;
};

is used as follows:

>>> v = example.Vector()
>>> v.x = 3.5
>>> v.y = 7.2
>>> print v.x, v.y, v.z
7.8 -4.5 0.0
>>> 

Similar access is provided for unions and the data members of C++ classes.

If you print out the value of v in the above example, you will see something like this:

>>> print v
<C Vector instance at _18e31408_p_Vector>

This object is actually a Python instance that has been wrapped around a pointer to the low-level C structure. This instance doesn't actually do anything--it just serves as a proxy. The pointer to the C object can be found in the .this attribute. For example:

>>> print v.this
_18e31408_p_Vector
>>>

Further details about the Python proxy class are covered a little later.

const members of a structure are read-only. Data members can also be forced to be read-only using the %immutable directive. For example:

struct Foo {
   ...
   %immutable;
   int x;        /* Read-only members */
   char *name;
   %mutable;
   ...
};

When char * members of a structure are wrapped, the contents are assumed to be dynamically allocated using malloc or new (depending on whether or not SWIG is run with the -c++ option). When the structure member is set, the old contents will be released and a new value created. If this is not the behavior you want, you will have to use a typemap (described later).

If a structure contains arrays, access to those arrays is managed through pointers. For example, consider this:

struct Bar {
    int  x[16];
};

If accessed in Python, you will see behavior like this:

>>> b = example.Bar()
>>> print b.x
_801861a4_p_int
>>> 

This pointer can be passed around to functions that expect to receive an int * (just like C). You can also set the value of an array member using another pointer. For example:

>>> c = example.Bar()
>>> c.x = b.x             # Copy contents of b.x to c.x

For array assignment, SWIG copies the entire contents of the array starting with the data pointed to by b.x. In this example, 16 integers would be copied. Like C, SWIG makes no assumptions about bounds checking---if you pass a bad pointer, you may get a segmentation fault or access violation.

When a member of a structure is itself a structure, it is handled as a pointer. For example, suppose you have two structures like this:

struct Foo {
   int a;
};

struct Bar {
   Foo f;
};

Now, suppose that you access the f attribute of Bar like this:

>>> b = Bar()
>>> x = b.f

In this case, x is a pointer that points to the Foo that is inside b. This is the same value as generated by this C code:

Bar b;
Foo *x = &b->f;       /* Points inside b */

Because the pointer points inside the structure, you can modify the contents and everything works just like you would expect. For example:

>>> b = Bar()
>>> b.f.a = 3               # Modify attribute of structure member
>>> x = b.f                   
>>> x.a = 3                 # Modifies the same structure

36.3.7 C++ classes

C++ classes are wrapped by Python classes as well. For example, if you have this class,

class List {
public:
  List();
  ~List();
  int  search(char *item);
  void insert(char *item);
  void remove(char *item);
  char *get(int n);
  int  length;
};

you can use it in Python like this:

>>> l = example.List()
>>> l.insert("Ale")
>>> l.insert("Stout")
>>> l.insert("Lager")
>>> l.get(1)
'Stout'
>>> print l.length
3
>>>

Class data members are accessed in the same manner as C structures.

Static class members present a special problem for Python. Prior to Python-2.2, Python classes had no support for static methods and no version of Python supports static member variables in a manner that SWIG can utilize. Therefore, SWIG generates wrappers that try to work around some of these issues. To illustrate, suppose you have a class like this:

class Spam {
public:
   static void foo();
   static int bar;

};

In Python, the static member can be access in three different ways:

>>> example.Spam_foo()    # Spam::foo()
>>> s = example.Spam()
>>> s.foo()               # Spam::foo() via an instance
>>> example.Spam.foo()    # Spam::foo(). Python-2.2 only

The first two methods of access are supported in all versions of Python. The last technique is only available in Python-2.2 and later versions.

Static member variables are currently accessed as global variables. This means, they are accessed through cvar like this:

>>> print example.cvar.Spam_bar
7

36.3.8 C++ inheritance

SWIG is fully aware of issues related to C++ inheritance. Therefore, if you have classes like this

class Foo {
...
};

class Bar : public Foo {
...
};

those classes are wrapped into a hierarchy of Python classes that reflect the same inheritance structure. All of the usual Python utility functions work normally:

>>> b = Bar()
>>> instance(b,Foo)
1
>>> issubclass(Bar,Foo)
1
>>> issubclass(Foo,Bar)
0

Furthermore, if you have functions like this

void spam(Foo *f);

then the function spam() accepts Foo * or a pointer to any class derived from Foo.

It is safe to use multiple inheritance with SWIG.

36.3.9 Pointers, references, values, and arrays

In C++, there are many different ways a function might receive and manipulate objects. For example:

void spam1(Foo *x);      // Pass by pointer
void spam2(Foo &x);      // Pass by reference
void spam3(const Foo &x);// Pass by const reference
void spam4(Foo x);       // Pass by value
void spam5(Foo x[]);     // Array of objects

In Python, there is no detailed distinction like this--specifically, there are only "objects". There are no pointers, references, arrays, and so forth. Because of this, SWIG unifies all of these types together in the wrapper code. For instance, if you actually had the above functions, it is perfectly legal to do this:

>>> f = Foo()           # Create a Foo
>>> spam1(f)            # Ok. Pointer
>>> spam2(f)            # Ok. Reference
>>> spam3(f)            # Ok. Const reference
>>> spam4(f)            # Ok. Value.
>>> spam5(f)            # Ok. Array (1 element)

Similar behavior occurs for return values. For example, if you had functions like this,

Foo *spam6();
Foo &spam7();
Foo  spam8();
const Foo &spam9();

then all three functions will return a pointer to some Foo object. Since the third function (spam8) returns a value, newly allocated memory is used to hold the result and a pointer is returned (Python will release this memory when the return value is garbage collected). The fourth case (spam9) which returns a const reference, in most of the cases will be treated as a returning value, and it will follow the same allocation/deallocation process.

36.3.10 C++ overloaded functions

C++ overloaded functions, methods, and constructors are mostly supported by SWIG. For example, if you have two functions like this:

void foo(int);
void foo(char *c);

You can use them in Python in a straightforward manner:

>>> foo(3)           # foo(int)
>>> foo("Hello")     # foo(char *c)

Similarly, if you have a class like this,

class Foo {
public:
    Foo();
    Foo(const Foo &);
    ...
};

you can write Python code like this:

>>> f = Foo()          # Create a Foo
>>> g = Foo(f)         # Copy f

Overloading support is not quite as flexible as in C++. Sometimes there are methods that SWIG can't disambiguate. For example:

void spam(int);
void spam(short);

or

void foo(Bar *b);
void foo(Bar &b);

If declarations such as these appear, you will get a warning message like this:

example.i:12: Warning 509: Overloaded method spam(short) effectively ignored,
example.i:11: Warning 509: as it is shadowed by spam(int).

To fix this, you either need to ignore or rename one of the methods. For example:

%rename(spam_short) spam(short);
...
void spam(int);    
void spam(short);   // Accessed as spam_short

or

%ignore spam(short);
...
void spam(int);    
void spam(short);   // Ignored

SWIG resolves overloaded functions and methods using a disambiguation scheme that ranks and sorts declarations according to a set of type-precedence rules. The order in which declarations appear in the input does not matter except in situations where ambiguity arises--in this case, the first declaration takes precedence.

Please refer to the "SWIG and C++" chapter for more information about overloading.

36.3.11 C++ operators

Certain C++ overloaded operators can be handled automatically by SWIG. For example, consider a class like this:

class Complex {
private:
  double rpart, ipart;
public:
  Complex(double r = 0, double i = 0) : rpart(r), ipart(i) { }
  Complex(const Complex &c) : rpart(c.rpart), ipart(c.ipart) { }
  Complex &operator=(const Complex &c);

  Complex operator+=(const Complex &c) const;
  Complex operator+(const Complex &c) const;
  Complex operator-(const Complex &c) const;
  Complex operator*(const Complex &c) const;
  Complex operator-() const;
  
  double re() const { return rpart; }
  double im() const { return ipart; }
};

When wrapped, it works like you expect:

>>> c = Complex(3,4)
>>> d = Complex(7,8)
>>> e = c + d
>>> e.re()
10.0
>>> e.im()
12.0
>>> c += d
>>> c.re()
10.0
>>> c.im()
12.0

One restriction with operator overloading support is that SWIG is not able to fully handle operators that aren't defined as part of the class. For example, if you had code like this

class Complex {
...
friend Complex operator+(double, const Complex &c);
...
};

then SWIG ignores it and issues a warning. You can still wrap the operator, but you may have to encapsulate it in a special function. For example:

%rename(Complex_add_dc) operator+(double, const Complex &);

There are ways to make this operator appear as part of the class using the %extend directive. Keep reading.

Also, be aware that certain operators don't map cleanly to Python. For instance, overloaded assignment operators don't map to Python semantics and will be ignored.

36.3.12 C++ namespaces

SWIG is aware of C++ namespaces, but namespace names do not appear in the module nor do namespaces result in a module that is broken up into submodules or packages. For example, if you have a file like this,

%module example

namespace foo {
   int fact(int n);
   struct Vector {
       double x,y,z;
   };
};

it works in Python as follows:

>>> import example
>>> example.fact(3)
6
>>> v = example.Vector()
>>> v.x = 3.4
>>> print v.y
0.0
>>>

If your program has more than one namespace, name conflicts (if any) can be resolved using %rename For example:

%rename(Bar_spam) Bar::spam;

namespace Foo {
    int spam();
}

namespace Bar {
    int spam();
}

If you have more than one namespace and your want to keep their symbols separate, consider wrapping them as separate SWIG modules. For example, make the module name the same as the namespace and create extension modules for each namespace separately. If your program utilizes thousands of small deeply nested namespaces each with identical symbol names, well, then you get what you deserve.

36.3.13 C++ templates

C++ templates don't present a huge problem for SWIG. However, in order to create wrappers, you have to tell SWIG to create wrappers for a particular template instantiation. To do this, you use the %template directive. For example:

%module example
%{
#include "pair.h"
%}

template<class T1, class T2>
struct pair {
   typedef T1 first_type;
   typedef T2 second_type;
   T1 first;
   T2 second;
   pair();
   pair(const T1&, const T2&);
  ~pair();
};

%template(pairii) pair<int,int>;

In Python:

>>> import example
>>> p = example.pairii(3,4)
>>> p.first
3
>>> p.second
4

Obviously, there is more to template wrapping than shown in this example. More details can be found in the SWIG and C++ chapter. Some more complicated examples will appear later.

36.3.14 C++ Smart Pointers

In certain C++ programs, it is common to use classes that have been wrapped by so-called "smart pointers." Generally, this involves the use of a template class that implements operator->() like this:

template<class T> class SmartPtr {
   ...
   T *operator->();
   ...
}

Then, if you have a class like this,

class Foo {
public:
     int x;
     int bar();
};

A smart pointer would be used in C++ as follows:

SmartPtr<Foo> p = CreateFoo();   // Created somehow (not shown)
...
p->x = 3;                        // Foo::x
int y = p->bar();                // Foo::bar

To wrap this in Python, simply tell SWIG about the SmartPtr class and the low-level Foo object. Make sure you instantiate SmartPtr using %template if necessary. For example:

%module example
...
%template(SmartPtrFoo) SmartPtr<Foo>;
...

Now, in Python, everything should just "work":

>>> p = example.CreateFoo()          # Create a smart-pointer somehow
>>> p.x = 3                          # Foo::x
>>> p.bar()                          # Foo::bar

If you ever need to access the underlying pointer returned by operator->() itself, simply use the __deref__() method. For example:

>>> f = p.__deref__()     # Returns underlying Foo *

36.3.15 C++ reference counted objects

The C++ reference counted objects section contains Python examples of memory management using referencing counting.

36.4 Further details on the Python class interface

In the previous section, a high-level view of Python wrapping was presented. A key component of this wrapping is that structures and classes are wrapped by Python proxy classes. This provides a very natural Python interface and allows SWIG to support a number of advanced features such as operator overloading. However, a number of low-level details were omitted. This section provides a brief overview of how the proxy classes work.

New in SWIG version 2.0.4: The use of Python proxy classes has performance implications that may be unacceptable for a high-performance library. The new -builtin option instructs SWIG to forego the use of proxy classes, and instead create wrapped types as new built-in Python types. When this option is used, the following section ("Proxy classes") does not apply. Details on the use of the -builtin option are in the Built-in Types section.

36.4.1 Proxy classes

In the "SWIG basics" and "SWIG and C++" chapters, details of low-level structure and class wrapping are described. To summarize those chapters, if you have a class like this

class Foo {
public:
     int x;
     int spam(int);
     ...

then SWIG transforms it into a set of low-level procedural wrappers. For example:

Foo *new_Foo() {
    return new Foo();
}
void delete_Foo(Foo *f) {
    delete f;
}
int Foo_x_get(Foo *f) {
    return f->x;
}
void Foo_x_set(Foo *f, int value) {
    f->x = value;
}
int Foo_spam(Foo *f, int arg1) {
    return f->spam(arg1);
}

These wrappers can be found in the low-level extension module (e.g., _example).

Using these wrappers, SWIG generates a high-level Python proxy class (also known as a shadow class) like this (shown for Python 2.2):

import _example

class Foo(object):
     def __init__(self):
         self.this = _example.new_Foo()
         self.thisown = 1
     def __del__(self):
         if self.thisown:
               _example.delete_Foo(self.this)
     def spam(self,arg1):
         return _example.Foo_spam(self.this,arg1)
     x = property(_example.Foo_x_get, _example.Foo_x_set)

This class merely holds a pointer to the underlying C++ object (.this) and dispatches methods and member variable access to that object using the low-level accessor functions. From a user's point of view, it makes the class work normally:

>>> f = example.Foo()
>>> f.x = 3
>>> y = f.spam(5)

The fact that the class has been wrapped by a real Python class offers certain advantages. For instance, you can attach new Python methods to the class and you can even inherit from it (something not supported by Python built-in types until Python 2.2).

36.4.2 Built-in Types

The -builtin option provides a significant performance improvement in the wrapped code. To understand the difference between proxy classes and built-in types, let's take a look at what a wrapped object looks like under both circumstances.

When proxy classes are used, each wrapped object in python is an instance of a pure python class. As a reminder, here is what the __init__ method looks like in a proxy class:

class Foo(object):
     def __init__(self):
         self.this = _example.new_Foo()
         self.thisown = 1

When a Foo instance is created, the call to _example.new_Foo() creates a new C++ Foo instance; wraps that C++ instance inside an instance of a python built-in type called SwigPyObject; and stores the SwigPyObject instance in the 'this' field of the python Foo object. Did you get all that? So, the python Foo object is composed of three parts:

When -builtin is used, the pure python layer is stripped off. Each wrapped class is turned into a new python built-in type which inherits from SwigPyObject, and SwigPyObject instances are returned directly from the wrapped methods. For more information about python built-in extensions, please refer to the python documentation:

http://docs.python.org/extending/newtypes.html

36.4.2.1 Limitations

Use of the -builtin option implies a couple of limitations:

36.4.2.2 Operator overloads -- use them!

The entire justification for the -builtin option is improved performance. To that end, the best way to squeeze maximum performance out of your wrappers is to use operator overloads. Named method dispatch is slow in python, even when compared to other scripting languages. However, python built-in types have a large number of "slots", analogous to C++ operator overloads, which allow you to short-circuit named method dispatch for certain common operations.

By default, SWIG will translate most C++ arithmetic operator overloads into python slot entries. For example, suppose you have this class:

class Twit {
public:
    Twit operator+ (const Twit& twit) const;

    // Forward to operator+
    Twit add (const Twit& twit) const
    { return *this + twit; }
};

SWIG will automatically register operator+ as a python slot operator for addition. You may write python code like this:

from MyModule import Twit

nigel = Twit()
emily = Twit()
percival = nigel + emily
percival = nigel.add(emily)

The last two lines of the python code are equivalent, but the line that uses the '+' operator is much faster.

In-place operators (e.g., operator+=) and comparison operators (operator==, operator<, etc.) are also converted to python slot operators. For a complete list of C++ operators that are automatically converted to python slot operators, refer to the file python/pyopers.swig in the SWIG library.

There are other very useful python slots that you may explicitly define using %feature directives. For example, suppose you want to use instances of a wrapped class as keys in a native python dict. That will work as long as you define a hash function for instances of your class, and use it to define the python tp_hash slot:

%feature("python:slot", "tp_hash", functype="hashfunc") Cheese::cheeseHashFunc;

class Cheese {
public:
    Cheese (const char *name);
    long cheeseHashFunc () const;
};

This will allow you to write python code like this:

from my MyPackage import Cheese

inventory = {
    Cheese("cheddar") : 0,
    Cheese("gouda") : 0,
    Cheese("camembert") : 0
}

Because you defined the tp_hash slot, Cheese objects may be used as hash keys; and when the cheeseHashFunc method is invoked by a python dict, it will not go through named method dispatch. A more detailed discussion about %feature("python:slot") can be found in the file python/pyopers.swig in the SWIG library. You can read about all of the available python slots here:

http://docs.python.org/c-api/typeobj.html

You may override (almost) all of the slots defined in the PyTypeObject, PyNumberMethods, PyMappingMethods, PySequenceMethods, and PyBufferProcs structs.

36.4.3 Memory management

NOTE: Although this section refers to proxy objects, everything here also applies when the -builtin option is used.

Associated with proxy object, is an ownership flag .thisown The value of this flag determines who is responsible for deleting the underlying C++ object. If set to 1, the Python interpreter will destroy the C++ object when the proxy class is garbage collected. If set to 0 (or if the attribute is missing), then the destruction of the proxy class has no effect on the C++ object.

When an object is created by a constructor or returned by value, Python automatically takes ownership of the result. For example:

class Foo {
public:
    Foo();
    Foo bar();
};

In Python:

>>> f = Foo()
>>> f.thisown
1
>>> g = f.bar()
>>> g.thisown
1

On the other hand, when pointers are returned to Python, there is often no way to know where they came from. Therefore, the ownership is set to zero. For example:

class Foo {
public:
    ...
    Foo *spam();
    ...
};

>>> f = Foo()
>>> s = f.spam()
>>> print s.thisown
0
>>>

This behavior is especially important for classes that act as containers. For example, if a method returns a pointer to an object that is contained inside another object, you definitely don't want Python to assume ownership and destroy it!

A good way to indicate that ownership should be set for a returned pointer is to use the %newobject directive.

Related to containers, ownership issues can arise whenever an object is assigned to a member or global variable. For example, consider this interface:

%module example

struct Foo {
    int  value;
    Foo  *next;
};

Foo *head = 0;

When wrapped in Python, careful observation will reveal that ownership changes whenever an object is assigned to a global variable. For example:

>>> f = example.Foo()
>>> f.thisown
1
>>> example.cvar.head = f           
>>> f.thisown
0
>>>

In this case, C is now holding a reference to the object---you probably don't want Python to destroy it. Similarly, this occurs for members. For example:

>>> f = example.Foo()
>>> g = example.Foo()
>>> f.thisown
1
>>> g.thisown
1
>>> f.next = g
>>> g.thisown
0
>>>

For the most part, memory management issues remain hidden. However, there are occasionally situations where you might have to manually change the ownership of an object. For instance, consider code like this:

class Node {
   Object *value;
public:
   void set_value(Object *v) { value = v; }
   ...
};

Now, consider the following Python code:

>>> v = Object()           # Create an object
>>> n = Node()             # Create a node
>>> n.set_value(v)         # Set value
>>> v.thisown
1
>>> del v

In this case, the object n is holding a reference to v internally. However, SWIG has no way to know that this has occurred. Therefore, Python still thinks that it has ownership of the object. Should the proxy object be destroyed, then the C++ destructor will be invoked and n will be holding a stale-pointer. If you're lucky, you will only get a segmentation fault.

To work around this, it is always possible to flip the ownership flag. For example,

>>> v.thisown = 0

It is also possible to deal with situations like this using typemaps--an advanced topic discussed later.

36.4.4 Python 2.2 and classic classes

SWIG makes every attempt to preserve backwards compatibility with older versions of Python to the extent that it is possible. However, in Python-2.2, an entirely new type of class system was introduced. This new-style class system offers many enhancements including static member functions, properties (managed attributes), and class methods. Details about all of these changes can be found on www.python.org and is not repeated here.

To address differences between Python versions, SWIG currently emits dual-mode proxy class wrappers. In Python-2.2 and newer releases, these wrappers encapsulate C++ objects in new-style classes that take advantage of new features (static methods and properties). However, if these very same wrappers are imported into an older version of Python, old-style classes are used instead.

This dual-nature of the wrapper code means that you can create extension modules with SWIG and those modules will work with all versions of Python ranging from Python-1.4 to the very latest release. Moreover, the wrappers take advantage of Python-2.2 features when available.

For the most part, the interface presented to users is the same regardless of what version of Python is used. The only incompatibility lies in the handling of static member functions. In Python-2.2, they can be accessed via the class itself. In Python-2.1 and earlier, they have to be accessed as a global function or through an instance (see the earlier section).

36.5 Cross language polymorphism

Proxy classes provide a more natural, object-oriented way to access extension classes. As described above, each proxy instance has an associated C++ instance, and method calls to the proxy are passed to the C++ instance transparently via C wrapper functions.

This arrangement is asymmetric in the sense that no corresponding mechanism exists to pass method calls down the inheritance chain from C++ to Python. In particular, if a C++ class has been extended in Python (by extending the proxy class), these extensions will not be visible from C++ code. Virtual method calls from C++ are thus not able access the lowest implementation in the inheritance chain.

Changes have been made to SWIG 1.3.18 to address this problem and make the relationship between C++ classes and proxy classes more symmetric. To achieve this goal, new classes called directors are introduced at the bottom of the C++ inheritance chain. The job of the directors is to route method calls correctly, either to C++ implementations higher in the inheritance chain or to Python implementations lower in the inheritance chain. The upshot is that C++ classes can be extended in Python and from C++ these extensions look exactly like native C++ classes. Neither C++ code nor Python code needs to know where a particular method is implemented: the combination of proxy classes, director classes, and C wrapper functions takes care of all the cross-language method routing transparently.

36.5.1 Enabling directors

The director feature is disabled by default. To use directors you must make two changes to the interface file. First, add the "directors" option to the %module directive, like this:

%module(directors="1") modulename

Without this option no director code will be generated. Second, you must use the %feature("director") directive to tell SWIG which classes and methods should get directors. The %feature directive can be applied globally, to specific classes, and to specific methods, like this:

// generate directors for all classes that have virtual methods
%feature("director");         

// generate directors for all virtual methods in class Foo
%feature("director") Foo;      

You can use the %feature("nodirector") directive to turn off directors for specific classes or methods. So for example,

%feature("director") Foo;
%feature("nodirector") Foo::bar;

will generate directors for all virtual methods of class Foo except bar().

Directors can also be generated implicitly through inheritance. In the following, class Bar will get a director class that handles the methods one() and two() (but not three()):

%feature("director") Foo;
class Foo {
public:
    Foo(int foo);
    virtual void one();
    virtual void two();
};

class Bar: public Foo {
public:
    virtual void three();
};

then at the python side you can define

import mymodule

class MyFoo(mymodule.Foo):
  def __init__(self, foo):
     mymodule.Foo(self, foo)  

  def one(self):
     print "one from python"

36.5.2 Director classes

For each class that has directors enabled, SWIG generates a new class that derives from both the class in question and a special Swig::Director class. These new classes, referred to as director classes, can be loosely thought of as the C++ equivalent of the Python proxy classes. The director classes store a pointer to their underlying Python object and handle various issues related to object ownership. Indeed, this is quite similar to the "this" and "thisown" members of the Python proxy classes.

For simplicity let's ignore the Swig::Director class and refer to the original C++ class as the director's base class. By default, a director class extends all virtual methods in the inheritance chain of its base class (see the preceding section for how to modify this behavior). Thus all virtual method calls, whether they originate in C++ or in Python via proxy classes, eventually end up in at the implementation in the director class. The job of the director methods is to route these method calls to the appropriate place in the inheritance chain. By "appropriate place" we mean the method that would have been called if the C++ base class and its extensions in Python were seamlessly integrated. That seamless integration is exactly what the director classes provide, transparently skipping over all the messy extension API glue that binds the two languages together.

In reality, the "appropriate place" is one of only two possibilities: C++ or Python. Once this decision is made, the rest is fairly easy. If the correct implementation is in C++, then the lowest implementation of the method in the C++ inheritance chain is called explicitly. If the correct implementation is in Python, the Python API is used to call the method of the underlying Python object (after which the usual virtual method resolution in Python automatically finds the right implementation).

Now how does the director decide which language should handle the method call? The basic rule is to handle the method in Python, unless there's a good reason not to. The reason for this is simple: Python has the most "extended" implementation of the method. This assertion is guaranteed, since at a minimum the Python proxy class implements the method. If the method in question has been extended by a class derived from the proxy class, that extended implementation will execute exactly as it should. If not, the proxy class will route the method call into a C wrapper function, expecting that the method will be resolved in C++. The wrapper will call the virtual method of the C++ instance, and since the director extends this the call will end up right back in the director method. Now comes the "good reason not to" part. If the director method were to blindly call the Python method again, it would get stuck in an infinite loop. We avoid this situation by adding special code to the C wrapper function that tells the director method to not do this. The C wrapper function compares the pointer to the Python object that called the wrapper function to the pointer stored by the director. If these are the same, then the C wrapper function tells the director to resolve the method by calling up the C++ inheritance chain, preventing an infinite loop.

One more point needs to be made about the relationship between director classes and proxy classes. When a proxy class instance is created in Python, SWIG creates an instance of the original C++ class and assigns it to .this. This is exactly what happens without directors and is true even if directors are enabled for the particular class in question. When a class derived from a proxy class is created, however, SWIG then creates an instance of the corresponding C++ director class. The reason for this difference is that user-defined subclasses may override or extend methods of the original class, so the director class is needed to route calls to these methods correctly. For unmodified proxy classes, all methods are ultimately implemented in C++ so there is no need for the extra overhead involved with routing the calls through Python.

36.5.3 Ownership and object destruction

Memory management issues are slightly more complicated with directors than for proxy classes alone. Python instances hold a pointer to the associated C++ director object, and the director in turn holds a pointer back to the Python object. By default, proxy classes own their C++ director object and take care of deleting it when they are garbage collected.

This relationship can be reversed by calling the special __disown__() method of the proxy class. After calling this method, the .thisown flag is set to zero, and the director class increments the reference count of the Python object. When the director class is deleted it decrements the reference count. Assuming no outstanding references to the Python object remain, the Python object will be destroyed at the same time. This is a good thing, since directors and proxies refer to each other and so must be created and destroyed together. Destroying one without destroying the other will likely cause your program to segfault.

To help ensure that no references to the Python object remain after calling __disown__(), this method returns a weak reference to the Python object. Weak references are only available in Python versions 2.1 and higher, so for older versions you must explicitly delete all references. Here is an example:

class Foo {
public:
    ...
};
class FooContainer {
public:
    void addFoo(Foo *);
    ...
};

>>> c = FooContainer()
>>> a = Foo().__disown__()
>>> c.addFoo(a)
>>> b = Foo()
>>> b = b.__disown__()
>>> c.addFoo(b)
>>> c.addFoo(Foo().__disown__())

In this example, we are assuming that FooContainer will take care of deleting all the Foo pointers it contains at some point. Note that no hard references to the Foo objects remain in Python.

36.5.4 Exception unrolling

With directors routing method calls to Python, and proxies routing them to C++, the handling of exceptions is an important concern. By default, the directors ignore exceptions that occur during method calls that are resolved in Python. To handle such exceptions correctly, it is necessary to temporarily translate them into C++ exceptions. This can be done with the %feature("director:except") directive. The following code should suffice in most cases:

%feature("director:except") {
    if ($error != NULL) {
        throw Swig::DirectorMethodException();
    }
}

This code will check the Python error state after each method call from a director into Python, and throw a C++ exception if an error occurred. This exception can be caught in C++ to implement an error handler. Currently no information about the Python error is stored in the Swig::DirectorMethodException object, but this will likely change in the future.

It may be the case that a method call originates in Python, travels up to C++ through a proxy class, and then back into Python via a director method. If an exception occurs in Python at this point, it would be nice for that exception to find its way back to the original caller. This can be done by combining a normal %exception directive with the director:except handler shown above. Here is an example of a suitable exception handler:

%exception {
    try { $action }
    catch (Swig::DirectorException &e) { SWIG_fail; }
}

The class Swig::DirectorException used in this example is actually a base class of Swig::DirectorMethodException, so it will trap this exception. Because the Python error state is still set when Swig::DirectorMethodException is thrown, Python will register the exception as soon as the C wrapper function returns.

36.5.5 Overhead and code bloat

Enabling directors for a class will generate a new director method for every virtual method in the class' inheritance chain. This alone can generate a lot of code bloat for large hierarchies. Method arguments that require complex conversions to and from target language types can result in large director methods. For this reason it is recommended that you selectively enable directors only for specific classes that are likely to be extended in Python and used in C++.

Compared to classes that do not use directors, the call routing in the director methods does add some overhead. In particular, at least one dynamic cast and one extra function call occurs per method call from Python. Relative to the speed of Python execution this is probably completely negligible. For worst case routing, a method call that ultimately resolves in C++ may take one extra detour through Python in order to ensure that the method does not have an extended Python implementation. This could result in a noticeable overhead in some cases.

Although directors make it natural to mix native C++ objects with Python objects (as director objects) via a common base class pointer, one should be aware of the obvious fact that method calls to Python objects will be much slower than calls to C++ objects. This situation can be optimized by selectively enabling director methods (using the %feature directive) for only those methods that are likely to be extended in Python.

36.5.6 Typemaps

Typemaps for input and output of most of the basic types from director classes have been written. These are roughly the reverse of the usual input and output typemaps used by the wrapper code. The typemap operation names are 'directorin', 'directorout', and 'directorargout'. The director code does not currently use any of the other kinds of typemaps. It is not clear at this point which kinds are appropriate and need to be supported.

36.5.7 Miscellaneous

Director typemaps for STL classes are in place, and hence you should be able to use std::vector, std::string, etc., as you would any other type.

Note: The director typemaps for return types based in const references, such as

class Foo {
…
    virtual const int& bar();
…
};

will work only for simple call scenarios. Usually the resulting code is neither thread or reentrant safe. Hence, the user is advised to avoid returning const references in director methods. For example, the user could modify the method interface to use lvalue return types, wherever possible, for example

class Foo {
…
    virtual int bar();
…
};

If that is not possible, the user should avoid enabling the director feature for reentrant, recursive or threaded member methods that return const references.

36.6 Common customization features

The last section presented the absolute basics of C/C++ wrapping. If you do nothing but feed SWIG a header file, you will get an interface that mimics the behavior described. However, sometimes this isn't enough to produce a nice module. Certain types of functionality might be missing or the interface to certain functions might be awkward. This section describes some common SWIG features that are used to improve your the interface to an extension module.

36.6.1 C/C++ helper functions

Sometimes when you create a module, it is missing certain bits of functionality. For example, if you had a function like this

void set_transform(Image *im, double m[4][4]);

it would be accessible from Python, but there may be no easy way to call it. For example, you might get errors like this:

>>> a = [
...   [1,0,0,0],
...   [0,1,0,0],
...   [0,0,1,0],
...   [0,0,0,1]]
>>> set_transform(im,a)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: Type error. Expected _p_a_4__double

The problem here is that there is no easy way to construct and manipulate a suitable double [4][4] value to use. To fix this, you can write some extra C helper functions. Just use the %inline directive. For example:

%inline %{
/* Note: double[4][4] is equivalent to a pointer to an array double (*)[4] */
double (*new_mat44())[4] {
   return (double (*)[4]) malloc(16*sizeof(double));
}
void free_mat44(double (*x)[4]) {
   free(x);
}
void mat44_set(double x[4][4], int i, int j, double v) {
   x[i][j] = v;
}
double mat44_get(double x[4][4], int i, int j) {
   return x[i][j];
}
%}

From Python, you could then write code like this:

>>> a = new_mat44()
>>> mat44_set(a,0,0,1.0)
>>> mat44_set(a,1,1,1.0)
>>> mat44_set(a,2,2,1.0)
...
>>> set_transform(im,a)
>>>

Admittedly, this is not the most elegant looking approach. However, it works and it wasn't too hard to implement. It is possible to clean this up using Python code, typemaps, and other customization features as covered in later sections.

36.6.2 Adding additional Python code

If writing support code in C isn't enough, it is also possible to write code in Python. This code gets inserted in to the .py file created by SWIG. One use of Python code might be to supply a high-level interface to certain functions. For example:

void set_transform(Image *im, double x[4][4]);

...
/* Rewrite the high level interface to set_transform */
%pythoncode %{
def set_transform(im,x):
   a = new_mat44()
   for i in range(4):
       for j in range(4):
           mat44_set(a,i,j,x[i][j])
   _example.set_transform(im,a)
   free_mat44(a)
%}

In this example, set_transform() provides a high-level Python interface built on top of low-level helper functions. For example, this code now seems to work:

>>> a = [
...   [1,0,0,0],
...   [0,1,0,0],
...   [0,0,1,0],
...   [0,0,0,1]]
>>> set_transform(im,a)
>>>

Admittedly, this whole scheme for wrapping the two-dimension array argument is rather ad-hoc. Besides, shouldn't a Python list or a Numeric Python array just work normally? We'll get to those examples soon enough. For now, think of this example as an illustration of what can be done without having to rely on any of the more advanced customization features.

There is also %pythonbegin which is another directive very similar to %pythoncode, but generates the given Python code at the beginning of the .py file. This directive works in the same way as %pythoncode, except the code is copied just after the SWIG banner (comment) at the top of the file, before any real code. This provides an opportunity to add your own description in a comment near the top of the file as well as Python imports that have to appear at the top of the file, such as "from __future__ import" statements.

The following shows example usage for Python 2.6 to use print as it can in Python 3, that is, as a function instead of a statement:

%pythonbegin %{
# This module provides wrappers to the Whizz Bang library
%}

%pythonbegin %{
from __future__ import print_function
print("Loading", "Whizz", "Bang", sep=' ... ')
%}

which can be seen when viewing the first few lines of the generated .py file:

# This file was automatically generated by SWIG (http://www.swig.org).
# Version 2.0.11
#
# Do not make changes to this file unless you know what you are doing--modify
# the SWIG interface file instead.

# This module provides wrappers to the Whizz Bang library

from __future__ import print_function
print("Loading", "Whizz", "Bang", sep=' ... ')

When using %pythoncode and %pythonbegin you generally want to make sure that the block is delimited by %{ and %}. If you delimit it with { and } then any lines with a leading # will be handled by SWIG as preprocessor directives, when you probably meant them as Python comments. Prior to SWIG 3.0.3, invalid preprocessor directives were silently ignored, so generally using the wrong delimiters resulted in such comments not appearing in the generated output (though a comment starting with a valid preprocessor directive could cause problems, for example: # error handling). SWIG 3.0.3 and later report an error for invalid preprocessor directives, so you may have to update existing interface files to delimit blocks of Python code correctly.

Sometimes you may want to replace or modify the wrapper function that SWIG creates in the proxy .py file. The Python module in SWIG provides some features that enable you to do this. First, to entirely replace a proxy function you can use %feature("shadow"). For example:

%module example

// Rewrite bar() python code

%feature("shadow") Foo::bar(int) %{
def bar(*args):
    #do something before
    $action
    #do something after
%}
    
class Foo {
public:
    int bar(int x);
}

where $action will be replaced by the call to the C/C++ proper method.

Often the proxy function created by SWIG is fine, but you simply want to add code to it without touching the rest of the generated function body. For these cases SWIG provides the pythonprepend and pythonappend features which do exactly as their names suggest. The pythonprepend feature will insert its value at the beginning of the proxy function, and pythonappend will insert code at the end of the proxy, just before the return statement.

%module example

// Add python code to bar() 

%feature("pythonprepend") Foo::bar(int) %{
   #do something before C++ call
%}

%feature("pythonappend") Foo::bar(int) %{
   #do something after C++ call
%}

    
class Foo {
public:
    int bar(int x);
}

Notes: Usually the pythonappend and pythonprepend features are safer to use than the shadow feature. Also, from SWIG version 1.3.28 you can use the directive forms %pythonappend and %pythonprepend as follows:

%module example

// Add python code to bar() 

%pythonprepend Foo::bar(int) %{
   #do something before C++ call
%}

%pythonappend Foo::bar(int) %{
   #do something after C++ call
%}

    
class Foo {
public:
    int bar(int x);
}

Note that when the underlying C++ method is overloaded, there is only one proxy Python method for multiple C++ methods. In this case, only one of parsed methods is examined for the feature. You are better off specifying the feature without the argument list to ensure it will get used, as it will then get attached to all the overloaded C++ methods. For example:

%module example

// Add python code to bar()

%pythonprepend Foo::bar %{
   #do something before C++ call
%}

%pythonappend Foo::bar %{
   #do something after C++ call
%}


class Foo {
public:
    int bar(int x);
    int bar();
}

The same applies for overloaded constructors.

36.6.3 Class extension with %extend

One of the more interesting features of SWIG is that it can extend structures and classes with new methods--at least in the Python interface. Here is a simple example:

%module example
%{
#include "someheader.h"
%}

struct Vector {
   double x,y,z;
};

%extend Vector {
   char *__str__() {
       static char tmp[1024];
       sprintf(tmp,"Vector(%g,%g,%g)", $self->x,$self->y,$self->z);
       return tmp;
   }
   Vector(double x, double y, double z) {
       Vector *v = (Vector *) malloc(sizeof(Vector));
       v->x = x;
       v->y = y;
       v->z = z;
       return v;
   }
};

Now, in Python

>>> v = example.Vector(2,3,4)
>>> print v
Vector(2,3,4)
>>>

%extend can be used for many more tasks than this. For example, if you wanted to overload a Python operator, you might do this:

%extend Vector {
    Vector __add__(Vector *other) {
         Vector v;
         v.x = $self->x + other->x;
         v.y = $self->y + other->y;
         v.z = $self->z + other->z;
         return v;
    }
};

Use it like this:

>>> import example
>>> v = example.Vector(2,3,4)
>>> w = example.Vector(10,11,12)
>>> print v+w
Vector(12,14,16)
>>> 

%extend works with both C and C++ code. It does not modify the underlying object in any way---the extensions only show up in the Python interface.

36.6.4 Exception handling with %exception

If a C or C++ function throws an error, you may want to convert that error into a Python exception. To do this, you can use the %exception directive. %exception simply lets you rewrite part of the generated wrapper code to include an error check.

In C, a function often indicates an error by returning a status code (a negative number or a NULL pointer perhaps). Here is a simple example of how you might handle that:

%exception malloc {
  $action
  if (!result) {
     PyErr_SetString(PyExc_MemoryError,"Not enough memory");
     return NULL;
  }
}
void *malloc(size_t nbytes);

In Python,

>>> a = example.malloc(2000000000)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
MemoryError: Not enough memory
>>>

If a library provides some kind of general error handling framework, you can also use that. For example:

%exception {
   $action
   if (err_occurred()) {
      PyErr_SetString(PyExc_RuntimeError, err_message());
      return NULL;
   }
}

No declaration name is given to %exception, it is applied to all wrapper functions.

C++ exceptions are also easy to handle. For example, you can write code like this:

%exception getitem {
   try {
      $action
   } catch (std::out_of_range &e) {
      PyErr_SetString(PyExc_IndexError, const_cast<char*>(e.what()));
      return NULL;
   }
}

class Base {
public:
     Foo *getitem(int index);      // Exception handled added
     ...
};

When raising a Python exception from C, use the PyErr_SetString() function as shown above. The following exception types can be used as the first argument.

PyExc_ArithmeticError
PyExc_AssertionError
PyExc_AttributeError
PyExc_EnvironmentError
PyExc_EOFError
PyExc_Exception
PyExc_FloatingPointError
PyExc_ImportError
PyExc_IndexError
PyExc_IOError
PyExc_KeyError
PyExc_KeyboardInterrupt
PyExc_LookupError
PyExc_MemoryError
PyExc_NameError
PyExc_NotImplementedError
PyExc_OSError
PyExc_OverflowError
PyExc_RuntimeError
PyExc_StandardError
PyExc_SyntaxError
PyExc_SystemError
PyExc_TypeError
PyExc_UnicodeError
PyExc_ValueError
PyExc_ZeroDivisionError

The language-independent exception.i library file can also be used to raise exceptions. See the SWIG Library chapter.

36.7 Tips and techniques

Although SWIG is largely automatic, there are certain types of wrapping problems that require additional user input. Examples include dealing with output parameters, strings, binary data, and arrays. This chapter discusses the common techniques for solving these problems.

36.7.1 Input and output parameters

A common problem in some C programs is handling parameters passed as simple pointers. For example:

void add(int x, int y, int *result) {
   *result = x + y;
}

or perhaps

int sub(int *x, int *y) {
   return *x-*y;
}

The easiest way to handle these situations is to use the typemaps.i file. For example:

%module example
%include "typemaps.i"

void add(int, int, int *OUTPUT);
int  sub(int *INPUT, int *INPUT);

In Python, this allows you to pass simple values. For example:

>>> a = add(3,4)
>>> print a
7
>>> b = sub(7,4)
>>> print b
3
>>>

Notice how the INPUT parameters allow integer values to be passed instead of pointers and how the OUTPUT parameter creates a return result.

If you don't want to use the names INPUT or OUTPUT, use the %apply directive. For example:

%module example
%include "typemaps.i"

%apply int *OUTPUT { int *result };
%apply int *INPUT  { int *x, int *y};

void add(int x, int y, int *result);
int  sub(int *x, int *y);

If a function mutates one of its parameters like this,

void negate(int *x) {
   *x = -(*x);
}

you can use INOUT like this:

%include "typemaps.i"
...
void negate(int *INOUT);

In Python, a mutated parameter shows up as a return value. For example:

>>> a = negate(3)
>>> print a
-3
>>>

Note: Since most primitive Python objects are immutable, it is not possible to perform in-place modification of a Python object passed as a parameter.

The most common use of these special typemap rules is to handle functions that return more than one value. For example, sometimes a function returns a result as well as a special error code:

/* send message, return number of bytes sent, along with success code */
int send_message(char *text, int len, int *success);

To wrap such a function, simply use the OUTPUT rule above. For example:

%module example
%include "typemaps.i"
%apply int *OUTPUT { int *success };
...
int send_message(char *text, int *success);

When used in Python, the function will return multiple values.

bytes, success = send_message("Hello World")
if not success:
    print "Whoa!"
else:
    print "Sent", bytes

Another common use of multiple return values are in query functions. For example:

void get_dimensions(Matrix *m, int *rows, int *columns);

To wrap this, you might use the following:

%module example
%include "typemaps.i"
%apply int *OUTPUT { int *rows, int *columns };
...
void get_dimensions(Matrix *m, int *rows, *columns);

Now, in Python:

>>> r,c = get_dimensions(m)

Be aware that the primary purpose of the typemaps.i file is to support primitive datatypes. Writing a function like this

void foo(Bar *OUTPUT);

may not have the intended effect since typemaps.i does not define an OUTPUT rule for Bar.

36.7.2 Simple pointers

If you must work with simple pointers such as int * or double * and you don't want to use typemaps.i, consider using the cpointer.i library file. For example:

%module example
%include "cpointer.i"

%inline %{
extern void add(int x, int y, int *result);
%}

%pointer_functions(int, intp);

The %pointer_functions(type,name) macro generates five helper functions that can be used to create, destroy, copy, assign, and dereference a pointer. In this case, the functions are as follows:

int  *new_intp();
int  *copy_intp(int *x);
void  delete_intp(int *x);
void  intp_assign(int *x, int value);
int   intp_value(int *x);

In Python, you would use the functions like this:

>>> result = new_intp()
>>> print result
_108fea8_p_int
>>> add(3,4,result)
>>> print intp_value(result)
7
>>>

If you replace %pointer_functions() by %pointer_class(type,name), the interface is more class-like.

>>> result = intp()
>>> add(3,4,result)
>>> print result.value()
7

See the SWIG Library chapter for further details.

36.7.3 Unbounded C Arrays

Sometimes a C function expects an array to be passed as a pointer. For example,

int sumitems(int *first, int nitems) {
    int i, sum = 0;
    for (i = 0; i < nitems; i++) {
        sum += first[i];
    }
    return sum;
}

To wrap this into Python, you need to pass an array pointer as the first argument. A simple way to do this is to use the carrays.i library file. For example:

%include "carrays.i"
%array_class(int, intArray);

The %array_class(type, name) macro creates wrappers for an unbounded array object that can be passed around as a simple pointer like int * or double *. For instance, you will be able to do this in Python:

>>> a = intArray(10000000)         # Array of 10-million integers
>>> for i in xrange(10000):        # Set some values
...     a[i] = i
>>> sumitems(a,10000)
49995000
>>>

The array "object" created by %array_class() does not encapsulate pointers inside a special array object. In fact, there is no bounds checking or safety of any kind (just like in C). Because of this, the arrays created by this library are extremely low-level indeed. You can't iterate over them nor can you even query their length. In fact, any valid memory address can be accessed if you want (negative indices, indices beyond the end of the array, etc.). Needless to say, this approach is not going to suit all applications. On the other hand, this low-level approach is extremely efficient and well suited for applications in which you need to create buffers, package binary data, etc.

36.7.4 String handling

If a C function has an argument of char *, then a Python string can be passed as input. For example:

// C
void foo(char *s);
# Python
>>> foo("Hello")

When a Python string is passed as a parameter, the C function receives a pointer to the raw data contained in the string. Since Python strings are immutable, it is illegal for your program to change the value. In fact, doing so will probably crash the Python interpreter.

If your program modifies the input parameter or uses it to return data, consider using the cstring.i library file described in the SWIG Library chapter.

When functions return a char *, it is assumed to be a NULL-terminated string. Data is copied into a new Python string and returned.

If your program needs to work with binary data, you can use a typemap to expand a Python string into a pointer/length argument pair. As luck would have it, just such a typemap is already defined. Just do this:

%apply (char *STRING, int LENGTH) { (char *data, int size) };
...
int parity(char *data, int size, int initial);

Now in Python:

>>> parity("e\x09ffss\x00\x00\x01\nx", 0)

If you need to return binary data, you might use the cstring.i library file. The cdata.i library can also be used to extra binary data from arbitrary pointers.

36.8 Typemaps

This section describes how you can modify SWIG's default wrapping behavior for various C/C++ datatypes using the %typemap directive. This is an advanced topic that assumes familiarity with the Python C API as well as the material in the "Typemaps" chapter.

Before proceeding, it should be stressed that typemaps are not a required part of using SWIG---the default wrapping behavior is enough in most cases. Typemaps are only used if you want to change some aspect of the primitive C-Python interface or if you want to elevate your guru status.

36.8.1 What is a typemap?

A typemap is nothing more than a code generation rule that is attached to a specific C datatype. For example, to convert integers from Python to C, you might define a typemap like this:

%module example

%typemap(in) int {
	$1 = (int) PyLong_AsLong($input);
	printf("Received an integer : %d\n",$1);
}
%inline %{
extern int fact(int n);
%}

Typemaps are always associated with some specific aspect of code generation. In this case, the "in" method refers to the conversion of input arguments to C/C++. The datatype int is the datatype to which the typemap will be applied. The supplied C code is used to convert values. In this code a number of special variable prefaced by a $ are used. The $1 variable is placeholder for a local variable of type int. The $input variable is the input object of type PyObject *.

When this example is compiled into a Python module, it operates as follows:

>>> from example import *
>>> fact(6)
Received an integer : 6
720

In this example, the typemap is applied to all occurrences of the int datatype. You can refine this by supplying an optional parameter name. For example:

%module example

%typemap(in) int nonnegative {
	$1 = (int) PyLong_AsLong($input);
        if ($1 < 0) {
           PyErr_SetString(PyExc_ValueError,"Expected a nonnegative value.");
           return NULL;
        }
}
%inline %{
extern int fact(int nonnegative);
%}

In this case, the typemap code is only attached to arguments that exactly match int nonnegative.

The application of a typemap to specific datatypes and argument names involves more than simple text-matching--typemaps are fully integrated into the SWIG C++ type-system. When you define a typemap for int, that typemap applies to int and qualified variations such as const int. In addition, the typemap system follows typedef declarations. For example:

%typemap(in) int n {
	$1 = (int) PyLong_AsLong($input);
	printf("n = %d\n",$1);
}
%inline %{
typedef int Integer;
extern int fact(Integer n);    // Above typemap is applied
%}

Typemaps can also be defined for groups of consecutive arguments. For example:

%typemap(in) (char *str, int len) {
    $1 = PyString_AsString($input);
    $2 = PyString_Size($input);
};

int count(char c, char *str, int len);

When a multi-argument typemap is defined, the arguments are always handled as a single Python object. This allows the function to be used like this (notice how the length parameter is omitted):

>>> example.count('e','Hello World')
1
>>>

36.8.2 Python typemaps

The previous section illustrated an "in" typemap for converting Python objects to C. A variety of different typemap methods are defined by the Python module. For example, to convert a C integer back into a Python object, you might define an "out" typemap like this:

%typemap(out) int {
    $result = PyInt_FromLong((long) $1);
}

A detailed list of available methods can be found in the "Typemaps" chapter.

However, the best source of typemap information (and examples) is probably the Python module itself. In fact, all of SWIG's default type handling is defined by typemaps. You can view these typemaps by looking at the files in the SWIG library. Just take into account that in the latest versions of swig (1.3.22+), the library files are not very pristine clear for the casual reader, as they used to be. The extensive use of macros and other ugly techniques in the latest version produce a very powerful and consistent python typemap library, but at the cost of simplicity and pedagogic value.

To learn how to write a simple or your first typemap, you better take a look at the SWIG library version 1.3.20 or so.

36.8.3 Typemap variables

Within typemap code, a number of special variables prefaced with a $ may appear. A full list of variables can be found in the "Typemaps" chapter. This is a list of the most common variables:

$1

A C local variable corresponding to the actual type specified in the %typemap directive. For input values, this is a C local variable that's supposed to hold an argument value. For output values, this is the raw result that's supposed to be returned to Python.

$input

A PyObject * holding a raw Python object with an argument or variable value.

$result

A PyObject * that holds the result to be returned to Python.

$1_name

The parameter name that was matched.

$1_type

The actual C datatype matched by the typemap.

$1_ltype

An assignable version of the datatype matched by the typemap (a type that can appear on the left-hand-side of a C assignment operation). This type is stripped of qualifiers and may be an altered version of $1_type. All arguments and local variables in wrapper functions are declared using this type so that their values can be properly assigned.

$symname

The Python name of the wrapper function being created.

36.8.4 Useful Python Functions

When you write a typemap, you usually have to work directly with Python objects. The following functions may prove to be useful.

Python Integer Functions

PyObject *PyInt_FromLong(long l);
long      PyInt_AsLong(PyObject *);
int       PyInt_Check(PyObject *);

Python Floating Point Functions

PyObject *PyFloat_FromDouble(double);
double    PyFloat_AsDouble(PyObject *);
int       PyFloat_Check(PyObject *);

Python String Functions

PyObject *PyString_FromString(char *);
PyObject *PyString_FromStringAndSize(char *, lint len);
int       PyString_Size(PyObject *);
char     *PyString_AsString(PyObject *);
int       PyString_Check(PyObject *);

Python List Functions

PyObject *PyList_New(int size);
int       PyList_Size(PyObject *list);
PyObject *PyList_GetItem(PyObject *list, int i);
int       PyList_SetItem(PyObject *list, int i, PyObject *item);
int       PyList_Insert(PyObject *list, int i, PyObject *item);
int       PyList_Append(PyObject *list, PyObject *item);
PyObject *PyList_GetSlice(PyObject *list, int i, int j);
int       PyList_SetSlice(PyObject *list, int i, int , PyObject *list2);
int       PyList_Sort(PyObject *list);
int       PyList_Reverse(PyObject *list);
PyObject *PyList_AsTuple(PyObject *list);
int       PyList_Check(PyObject *);

Python Tuple Functions

PyObject *PyTuple_New(int size);
int       PyTuple_Size(PyObject *);
PyObject *PyTuple_GetItem(PyObject *, int i);
int       PyTuple_SetItem(PyObject *, int i, PyObject *item);
PyObject *PyTuple_GetSlice(PyObject *t, int i, int j);
int       PyTuple_Check(PyObject *);

Python Dictionary Functions

PyObject *PyDict_New();
int       PyDict_Check(PyObject *);
int       PyDict_SetItem(PyObject *p, PyObject *key, PyObject *val);
int       PyDict_SetItemString(PyObject *p, const char *key, PyObject *val);
int       PyDict_DelItem(PyObject *p, PyObject *key);
int       PyDict_DelItemString(PyObject *p, char *key);
PyObject* PyDict_Keys(PyObject *p);
PyObject* PyDict_Values(PyObject *p);
PyObject* PyDict_GetItem(PyObject *p, PyObject *key);
PyObject* PyDict_GetItemString(PyObject *p, const char *key);
int       PyDict_Next(PyObject *p, Py_ssize_t *ppos, PyObject **pkey, PyObject **pvalue);
Py_ssize_t PyDict_Size(PyObject *p);
int       PyDict_Update(PyObject *a, PyObject *b);
int       PyDict_Merge(PyObject *a, PyObject *b, int override);
PyObject* PyDict_Items(PyObject *p);

Python File Conversion Functions

PyObject *PyFile_FromFile(FILE *f);
FILE     *PyFile_AsFile(PyObject *);
int       PyFile_Check(PyObject *);

Abstract Object Interface

write me

36.9 Typemap Examples

This section includes a few examples of typemaps. For more examples, you might look at the files "python.swg" and "typemaps.i" in the SWIG library.

36.9.1 Converting Python list to a char **

A common problem in many C programs is the processing of command line arguments, which are usually passed in an array of NULL terminated strings. The following SWIG interface file allows a Python list object to be used as a char ** object.

%module argv

// This tells SWIG to treat char ** as a special case
%typemap(in) char ** {
  /* Check if is a list */
  if (PyList_Check($input)) {
    int size = PyList_Size($input);
    int i = 0;
    $1 = (char **) malloc((size+1)*sizeof(char *));
    for (i = 0; i < size; i++) {
      PyObject *o = PyList_GetItem($input,i);
      if (PyString_Check(o))
	$1[i] = PyString_AsString(PyList_GetItem($input,i));
      else {
	PyErr_SetString(PyExc_TypeError,"list must contain strings");
	free($1);
	return NULL;
      }
    }
    $1[i] = 0;
  } else {
    PyErr_SetString(PyExc_TypeError,"not a list");
    return NULL;
  }
}

// This cleans up the char ** array we malloc'd before the function call
%typemap(freearg) char ** {
  free((char *) $1);
}

// Now a test function
%inline %{
int print_args(char **argv) {
    int i = 0;
    while (argv[i]) {
         printf("argv[%d] = %s\n", i,argv[i]);
         i++;
    }
    return i;
}
%}

When this module is compiled, the wrapped C function now operates as follows :

>>> from argv import *
>>> print_args(["Dave","Mike","Mary","Jane","John"])
argv[0] = Dave
argv[1] = Mike
argv[2] = Mary
argv[3] = Jane
argv[4] = John
5

In the example, two different typemaps are used. The "in" typemap is used to receive an input argument and convert it to a C array. Since dynamic memory allocation is used to allocate memory for the array, the "freearg" typemap is used to later release this memory after the execution of the C function.

36.9.2 Expanding a Python object into multiple arguments

Suppose that you had a collection of C functions with arguments such as the following:

int foo(int argc, char **argv);

In the previous example, a typemap was written to pass a Python list as the char **argv. This allows the function to be used from Python as follows:

>>> foo(4, ["foo","bar","spam","1"])

Although this works, it's a little awkward to specify the argument count. To fix this, a multi-argument typemap can be defined. This is not very difficult--you only have to make slight modifications to the previous example:

%typemap(in) (int argc, char **argv) {
  /* Check if is a list */
  if (PyList_Check($input)) {
    int i;
    $1 = PyList_Size($input);
    $2 = (char **) malloc(($1+1)*sizeof(char *));
    for (i = 0; i < $1; i++) {
      PyObject *o = PyList_GetItem($input,i);
      if (PyString_Check(o))
	$2[i] = PyString_AsString(PyList_GetItem($input,i));
      else {
	PyErr_SetString(PyExc_TypeError,"list must contain strings");
	free($2);
	return NULL;
      }
    }
    $2[i] = 0;
  } else {
    PyErr_SetString(PyExc_TypeError,"not a list");
    return NULL;
  }
}

%typemap(freearg) (int argc, char **argv) {
  free((char *) $2);
}

When writing a multiple-argument typemap, each of the types is referenced by a variable such as $1 or $2. The typemap code simply fills in the appropriate values from the supplied Python object.

With the above typemap in place, you will find it no longer necessary to supply the argument count. This is automatically set by the typemap code. For example:

>>> foo(["foo","bar","spam","1"])

36.9.3 Using typemaps to return arguments

A common problem in some C programs is that values may be returned in arguments rather than in the return value of a function. For example:

/* Returns a status value and two values in out1 and out2 */
int spam(double a, double b, double *out1, double *out2) {
	... Do a bunch of stuff ...
	*out1 = result1;
	*out2 = result2;
	return status;
}

A typemap can be used to handle this case as follows :

%module outarg

// This tells SWIG to treat an double * argument with name 'OutValue' as
// an output value.  We'll append the value to the current result which 
// is guaranteed to be a List object by SWIG.

%typemap(argout) double *OutValue {
    PyObject *o, *o2, *o3;
    o = PyFloat_FromDouble(*$1);
    if ((!$result) || ($result == Py_None)) {
        $result = o;
    } else {
        if (!PyTuple_Check($result)) {
            PyObject *o2 = $result;
            $result = PyTuple_New(1);
            PyTuple_SetItem(target,0,o2);
        }
        o3 = PyTuple_New(1);
        PyTuple_SetItem(o3,0,o);
        o2 = $result;
        $result = PySequence_Concat(o2,o3);
        Py_DECREF(o2);
        Py_DECREF(o3);
    }
}

int spam(double a, double b, double *OutValue, double *OutValue);

The typemap works as follows. First, a check is made to see if any previous result exists. If so, it is turned into a tuple and the new output value is concatenated to it. Otherwise, the result is returned normally. For the sample function spam(), there are three output values--meaning that the function will return a 3-tuple of the results.

As written, the function must accept 4 arguments as input values, last two being pointers to doubles. If these arguments are only used to hold output values (and have no meaningful input value), an additional typemap can be written. For example:

%typemap(in,numinputs=0) double *OutValue(double temp) {
    $1 = &temp;
}

By specifying numinputs=0, the input value is ignored. However, since the argument still has to be set to some meaningful value before calling C, it is set to point to a local variable temp. When the function stores its output value, it will simply be placed in this local variable. As a result, the function can now be used as follows:

>>> a = spam(4,5)
>>> print a
(0, 2.45, 5.0)
>>> x,y,z = spam(4,5)
>>>

36.9.4 Mapping Python tuples into small arrays

In some applications, it is sometimes desirable to pass small arrays of numbers as arguments. For example :

extern void set_direction(double a[4]);       // Set direction vector

This too, can be handled used typemaps as follows :

// Grab a 4 element array as a Python 4-tuple
%typemap(in) double[4](double temp[4]) {   // temp[4] becomes a local variable
  int i;
  if (PyTuple_Check($input)) {
    if (!PyArg_ParseTuple($input,"dddd",temp,temp+1,temp+2,temp+3)) {
      PyErr_SetString(PyExc_TypeError,"tuple must have 4 elements");
      return NULL;
    }
    $1 = &temp[0];
  } else {
    PyErr_SetString(PyExc_TypeError,"expected a tuple.");
    return NULL;
  }
}

This allows our set_direction function to be called from Python as follows :

>>> set_direction((0.5,0.0,1.0,-0.25))

Since our mapping copies the contents of a Python tuple into a C array, such an approach would not be recommended for huge arrays, but for small structures, this approach works fine.

36.9.5 Mapping sequences to C arrays

Suppose that you wanted to generalize the previous example to handle C arrays of different sizes. To do this, you might write a typemap as follows:

// Map a Python sequence into any sized C double array
%typemap(in) double[ANY](double temp[$1_dim0]) {
  int i;
  if (!PySequence_Check($input)) {
      PyErr_SetString(PyExc_TypeError,"Expecting a sequence");
      return NULL;
  }
  if (PyObject_Length($input) != $1_dim0) {
      PyErr_SetString(PyExc_ValueError,"Expecting a sequence with $1_dim0 elements");
      return NULL;
  }
  for (i =0; i < $1_dim0; i++) {
      PyObject *o = PySequence_GetItem($input,i);
      if (!PyFloat_Check(o)) {
         Py_XDECREF(o);
         PyErr_SetString(PyExc_ValueError,"Expecting a sequence of floats");
         return NULL;
      }
      temp[i] = PyFloat_AsDouble(o);
      Py_DECREF(o);
  }
  $1 = &temp[0];
}

In this case, the variable $1_dim0 is expanded to match the array dimensions actually used in the C code. This allows the typemap to be applied to types such as:

void foo(double x[10]);
void bar(double a[4], double b[8]);

Since the above typemap code gets inserted into every wrapper function where used, it might make sense to use a helper function instead. This will greatly reduce the amount of wrapper code. For example:

%{
static int convert_darray(PyObject *input, double *ptr, int size) {
  int i;
  if (!PySequence_Check(input)) {
      PyErr_SetString(PyExc_TypeError,"Expecting a sequence");
      return 0;
  }
  if (PyObject_Length(input) != size) {
      PyErr_SetString(PyExc_ValueError,"Sequence size mismatch");
      return 0;
  }
  for (i =0; i < size; i++) {
      PyObject *o = PySequence_GetItem(input,i);
      if (!PyFloat_Check(o)) {
         Py_XDECREF(o);
         PyErr_SetString(PyExc_ValueError,"Expecting a sequence of floats");
         return 0;
      }
      ptr[i] = PyFloat_AsDouble(o);
      Py_DECREF(o);
  }
  return 1;
}
%}

%typemap(in) double [ANY](double temp[$1_dim0]) {
   if (!convert_darray($input,temp,$1_dim0)) {
      return NULL;
   }
   $1 = &temp[0];
}

36.9.6 Pointer handling

Occasionally, it might be necessary to convert pointer values that have been stored using the SWIG typed-pointer representation. Since there are several ways in which pointers can be represented, the following two functions are used to safely perform this conversion:

int SWIG_ConvertPtr(PyObject *obj, void **ptr, swig_type_info *ty, int flags)

Converts a Python object obj to a C pointer. The result of the conversion is placed into the pointer located at ptr. ty is a SWIG type descriptor structure. flags is used to handle error checking and other aspects of conversion. It is the bitwise-or of several flag values including SWIG_POINTER_EXCEPTION and SWIG_POINTER_DISOWN. The first flag makes the function raise an exception on type error. The second flag additionally steals ownership of an object. Returns 0 on success and -1 on error.

PyObject *SWIG_NewPointerObj(void *ptr, swig_type_info *ty, int own)

Creates a new Python pointer object. ptr is the pointer to convert, ty is the SWIG type descriptor structure that describes the type, and own is a flag that indicates whether or not Python should take ownership of the pointer.

Both of these functions require the use of a special SWIG type-descriptor structure. This structure contains information about the mangled name of the datatype, type-equivalence information, as well as information about converting pointer values under C++ inheritance. For a type of Foo *, the type descriptor structure is usually accessed as follows:

Foo *f;
if (SWIG_ConvertPtr($input, (void **) &f, SWIGTYPE_p_Foo, SWIG_POINTER_EXCEPTION) == -1)
  return NULL;

PyObject *obj;
obj = SWIG_NewPointerObj(f, SWIGTYPE_p_Foo, 0);

In a typemap, the type descriptor should always be accessed using the special typemap variable $1_descriptor. For example:

%typemap(in) Foo * {
if ((SWIG_ConvertPtr($input,(void **) &$1, $1_descriptor,SWIG_POINTER_EXCEPTION)) == -1)
  return NULL;
}

If necessary, the descriptor for any type can be obtained using the $descriptor() macro in a typemap. For example:

%typemap(in) Foo * {
if ((SWIG_ConvertPtr($input,(void **) &$1, $descriptor(Foo *), 
                                               SWIG_POINTER_EXCEPTION)) == -1)
  return NULL;
}

Although the pointer handling functions are primarily intended for manipulating low-level pointers, both functions are fully aware of Python proxy classes. Specifically, SWIG_ConvertPtr() will retrieve a pointer from any object that has a this attribute. In addition, SWIG_NewPointerObj() can automatically generate a proxy class object (if applicable).

36.10 Docstring Features

Using docstrings in Python code is becoming more and more important and more tools are coming on the scene that take advantage of them, everything from full-blown documentation generators to class browsers and popup call-tips in Python-aware IDEs. Given the way that SWIG generates the proxy code by default, your users will normally get something like "function_name(*args)" in the popup calltip of their IDE which is next to useless when the real function prototype might be something like this:

bool function_name(int x, int y, Foo* foo=NULL, Bar* bar=NULL);

The features described in this section make it easy for you to add docstrings to your modules, functions and methods that can then be used by the various tools out there to make the programming experience of your users much simpler.

36.10.1 Module docstring

Python allows a docstring at the beginning of the .py file before any other statements, and it is typically used to give a general description of the entire module. SWIG supports this by setting an option of the %module directive. For example:

%module(docstring="This is the example module's docstring") example

When you have more than just a line or so then you can retain the easy readability of the %module directive by using a macro. For example:

%define DOCSTRING
"The `XmlResource` class allows program resources defining menus, 
layout of controls on a panel, etc. to be loaded from an XML file."
%enddef

%module(docstring=DOCSTRING) xrc

36.10.2 %feature("autodoc")

As alluded to above SWIG will generate all the function and method proxy wrappers with just "*args" (or "*args, **kwargs" if the -keyword option is used) for a parameter list and will then sort out the individual parameters in the C wrapper code. This is nice and simple for the wrapper code, but makes it difficult to be programmer and tool friendly as anyone looking at the .py file will not be able to find out anything about the parameters that the functions accept.

But since SWIG does know everything about the function it is possible to generate a docstring containing the parameter types, names and default values. Since many of the docstring tools are adopting a standard of recognizing if the first thing in the docstring is a function prototype then using that instead of what they found from introspection, then life is good once more.

SWIG's Python module provides support for the "autodoc" feature, which when attached to a node in the parse tree will cause a docstring to be generated that includes the name of the function, parameter names, default values if any, and return type if any. There are also four levels for autodoc controlled by the value given to the feature, %feature("autodoc", "level"). The four values for level are covered in the following sub-sections.

36.10.2.1 %feature("autodoc", "0")

When level "0" is used then the types of the parameters will not be included in the autodoc string. For example, given this function prototype:

%feature("autodoc", "0");
bool function_name(int x, int y, Foo* foo=NULL, Bar* bar=NULL);

Then Python code like this will be generated:

def function_name(*args, **kwargs):
    """function_name(x, y, foo=None, bar=None) -> bool"""
    ...

36.10.2.2 %feature("autodoc", "1")

When level "1" is used then the parameter types will be used in the autodoc string. In addition, an attempt is made to simplify the type name such that it makes more sense to the Python user. Pointer, reference and const info is removed if the associated type is has an associated Python type (%rename's are thus shown correctly). This works most of the time, otherwise a C/C++ type will be used. See the next section for the "docstring" feature for tweaking the docstrings to your liking. Given the example above, then turning on the parameter types with level "1" will result in Python code like this:

def function_name(*args, **kwargs):
    """function_name(int x, int y, Foo foo=None, Bar bar=None) -> bool"""
    ...

36.10.2.3 %feature("autodoc", "2")

Level "2" results in the function prototype as per level "0". In addition, a line of documentation is generated for each parameter. Using the previous example, the generated code will be:

def function_name(*args, **kwargs):
    """
    function_name(x, y, foo=None, bar=None) -> bool

    Parameters:
        x: int
        y: int
        foo: Foo *
        bar: Bar *

    """
    ...

Note that the documentation for each parameter is sourced from the "doc" typemap which by default shows the C/C++ type rather than the simplified Python type name described earlier for level "1". Typemaps can of course change the output for any particular type, for example the int x parameter:

%feature("autodoc", "2");
%typemap("doc") int x "$1_name (C++ type: $1_type) -- Input $1_name dimension"
bool function_name(int x, int y, Foo* foo=NULL, Bar* bar=NULL);

resulting in

def function_name(*args, **kwargs):
  """
    function_name(x, y, foo=None, bar=None) -> bool

    Parameters:
        x (C++ type: int) -- Input x dimension
        y: int
        foo: Foo *
        bar: Bar *

    """

36.10.2.4 %feature("autodoc", "3")

Level "3" results in the function prototype as per level "1" but also contains the same additional line of documentation for each parameter as per level "2". Using our earlier example again, the generated code will be:

def function_name(*args, **kwargs):
    """
    function_name(int x, int y, Foo foo=None, Bar bar=None) -> bool

    Parameters:
        x: int
        y: int
        foo: Foo *
        bar: Bar *

    """
    ...

36.10.2.5 %feature("autodoc", "docstring")

Finally, there are times when the automatically generated autodoc string will make no sense for a Python programmer, particularly when a typemap is involved. So if you give an explicit value for the autodoc feature then that string will be used in place of the automatically generated string. For example:

%feature("autodoc", "GetPosition() -> (x, y)") GetPosition;
void GetPosition(int* OUTPUT, int* OUTPUT);

36.10.3 %feature("docstring")

In addition to the autodoc strings described above, you can also attach any arbitrary descriptive text to a node in the parse tree with the "docstring" feature. When the proxy module is generated then any docstring associated with classes, function or methods are output. If an item already has an autodoc string then it is combined with the docstring and they are output together. If the docstring is all on a single line then it is output like this::

"""This is the docstring"""

Otherwise, to aid readability it is output like this:

"""
This is a multi-line docstring
with more than one line.
"""

36.11 Python Packages

Python has concepts of modules and packages. Modules are separate units of code and may be grouped together to form a package. Packages may be nested, that is they may contain subpackages. This leads to tree-like hierarchy, with packages as intermediate nodes and modules as leaf nodes.

The hierarchy of Python packages/modules follows the hierarchy of *.py files found in a source tree (or, more generally, in the Python path). Normally, the developer creates new module by placing a *.py file somewhere under Python path; the module is then named after that *.py file. A package is created by placing an __init__.py file within a directory; the package is then named after that directory. For example, the following source tree:

mod1.py
pkg1/__init__.py
pkg1/mod2.py
pkg1/pkg2/__init__.py
pkg1/pkg2/mod3.py

defines the following Python packages and modules:

pkg1            # package
pkg1.pkg2       # package
mod1            # module
pkg1.mod2       # module
pkg1.pkg2.mod3  # module

The purpose of an __init__.py file is two-fold. First, the existence of __init__.py in a directory informs the Python interpreter that this directory contains a Python package. Second, the code in __init__.py is loaded/executed automatically when the package is initialized (when it or its submodule/subpackage gets import'ed). By default, SWIG generates proxy Python code – one *.py file for each *.i interface. The __init__.py files, however, are not generated by SWIG. They should be created by other means. Both files (module *.py and __init__.py) should be installed in appropriate destination directories in order to obtain a desirable package/module hierarchy.

The way Python defines its modules and packages impacts SWIG users. Some users may need to use special features such as the package option in the %module directive or import related command line options. These are explained in the following sections.

36.11.1 Setting the Python package

Using the package option in the %module directive allows you to specify a Python package that the module will be in when installed.

%module(package="wx") xrc

This is useful when the .i file is %imported by another .i file. By default SWIG will assume that the importer is able to find the importee with just the module name, but if they live in separate Python packages then this won't work. However if the importee specifies what its package is with the %module option then the Python code generated for the importer will use that package name when importing the other module and in base class declarations, etc..

SWIG assumes that the package option provided to %module together with the module name (that is, wx.xrc in the above example) forms a fully qualified (absolute) name of a module (in Python terms). This is important especially for Python 3, where absolute imports are used by default. It's up to you to place the generated module files (.py, .so) in appropriate subdirectories. For example, if you have an interface file foo.i with:

%module(package="pkg1.pkg2") foo

then the resulting directory layout should be

pkg1/
pkg1/__init__.py
pkg1/pkg2/__init__.py
pkg1/pkg2/foo.py        # (generated by SWIG)
pkg1/pkg2/_foo.so       # (shared library built from C/C++ code generated by SWIG)

36.11.2 Absolute and relative imports

Suppose, we have the following hierarchy of files:

pkg1/
pkg1/__init__.py
pkg1/mod2.py
pkg1/pkg2/__init__.py
pkg1/pkg2/mod3.py

Let the contents of pkg1/pkg2/mod3.py be

class M3: pass

We edit pkg1/mod2.py and want to import module pkg1/pkg2/mod3.py in order to derive from class M3. We can write appropriate Python code in several ways, for example:

  1. Using "import <>" syntax with absolute package name:

    # pkg1/mod2.py
    import pkg1.pkg2.mod3
    class M2(pkg1.pkg2.mod3.M3): pass
    
  2. Using "import <>" syntax with package name relative to pkg1 (only in Python 2.7 and earlier):

    # pkg1/mod2.py
    import pkg2.mod3
    class M2(pkg2.mod3.M3): pass
    
  3. Using "from <> import <>" syntax (relative import syntax, only in Python 2.5 and later):

    # pkg1/mod2.py
    from .pkg2 import mod3
    class M2(mod3.M3): pass
    
  4. Other variants, for example the following construction in order to have the pkg2.mod3.M3 symbol available in mod2 as in point 2 above (but now under Python 3):

    # pkg1/mod2.py
    from . import pkg2
    from .pkg2 import mod3
    class M2(pkg2.mod3.M3): pass
    

Now suppose we have mod2.i with

// mod2.i
%module (package="pkg1") mod2
%import "mod3.i"
// ...

and mod3.i with

// mod3.i
%module (package="pkg1.pkg2") mod3
// ...

By default, SWIG would generate mod2.py proxy file with import directive as in point 1. This can be changed with the -relativeimport command line option. The -relativeimport instructs SWIG to organize imports as in point 2 (for Python 2.x) or as in point 4 (for Python 3, that is when the -py3 command line option is enabled). In short, if you have mod2.i and mod3.i as above, then without -relativeimport SWIG will write

import pkg1.pkg2.mod3

to mod2.py proxy file, and with -relativeimport it will write

import pkg2.mod3

if -py3 is not used, or

from . import pkg2
import pkg1.pkg2.mod3

when -py3 is used.

You should avoid using relative imports and use absolute ones whenever possible. There are some cases, however, when relative imports may be necessary. The first example is, when some (legacy) Python code refers entities imported by proxy files generated by SWIG, and it assumes that the proxy file uses relative imports. Second case is, when one puts import directives in __init__.py to import symbols from submodules or subpackages and the submodule depends on other submodules (discussed later).

36.11.3 Enforcing absolute import semantics

As you may know, there is an incompatibility in import semantics (for the import <> syntax) between Python 2 and 3. In Python 2.4 and earlier it is not clear whether

import foo

refers to a top-level module or to another module inside the current package. In Python 3 it always refers to a top-level module (see PEP 328). To instruct Python 2.5 through 2.7 to use new semantics (that is import foo is interpreted as absolute import), one has to put the following line

from __future__ import absolute_import

at the very beginning of his proxy *.py file. In SWIG, it may be accomplished with %pythonbegin directive as follows:

%pythonbegin %{
from __future__ import absolute_import
%}

36.11.4 Importing from __init__.py

Imports in __init__.py are handy when you want to populate a package's namespace with names imported from other modules. In SWIG based projects this approach may also be used to split large pieces of code into smaller modules, compile them in parallel and then re-assemble everything at runtime by importing submodules' contents in __init__.py, for example.

Unfortunately import directives in __init__.py may cause problems, especially if they refer to a package's submodules. This is caused by the way Python initializes packages. If you spot problems with imports from __init__.py try using -relativeimport option. Below we explain in detail one issue, for which the -relativeimport workaround may be helpful.

Consider the following example (Python 3):

pkg1/__init__.py        # (empty)
pkg1/pkg2/__init__.py   # (imports something from bar.py)
pkg1/pkg2/foo.py
pkg1/pkg2/bar.py        # (imports foo.py)

If the file contents are:

Now if one simply used import pkg1.pkg2, it will usually fail:

>>> import pkg1.pkg2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "./pkg1/pkg2/__init__.py", line 2, in <module>
    from .bar import Bar
  File "./pkg1/pkg2/bar.py", line 3, in <module>
    class Bar(pkg1.pkg2.foo.Foo): pass
AttributeError: 'module' object has no attribute 'pkg2'

Surprisingly, if we execute the import pkg1.pkg2 directive for the second time, it succeeds. The reason seems to be following: when Python spots the from .bar import Bar directive in pkg1/pkg2/__init__.py it starts loading pkg1/pkg2/bar.py. This module imports pkg1.pkg2.foo in turn and tries to use pkg1.pkg2.foo.Foo, but the package pkg1 is not fully initialized yet (the initialization procedure is actually in progress) and it seems like the effect of the already seen import pkg1.pkg2.pkg3.foo is "delayed" or ignored. Exactly the same may happen to a proxy module generated by SWIG.

One workaround for this case is to use a relative import in pkg1/pkg2/bar.py. If we change bar.py to be:

from .pkg3 import foo
class Bar(foo.Foo): pass

or

from . import pkg3
from .pkg3 import foo
class Bar(pkg3.foo.Foo): pass

then the example works again. With SWIG, you need to enable the -relativeimport option in order to have the above workaround in effect (note, that the Python 2 case also needs the -relativeimport workaround).

36.12 Python 3 Support

SWIG is able to support Python 3.0. The wrapper code generated by SWIG can be compiled with both Python 2.x or 3.0. Further more, by passing the -py3 command line option to SWIG, wrapper code with some Python 3 specific features can be generated (see below subsections for details of these features). The -py3 option also disables some incompatible features for Python 3, such as -classic.

There is a list of known-to-be-broken features in Python 3:

The following are Python 3.0 new features that are currently supported by SWIG.

36.12.1 Function annotation

The -py3 option will enable function annotation support. When used SWIG is able to generate proxy method definitions like this:

  def foo(self, bar : "int"=0) -> "void" : ...

Also, even if without passing SWIG the -py3 option, the parameter list still could be generated:

  def foo(self, bar=0): ...

But for overloaded function or method, the parameter list would fallback to *args or self, *args, and **kwargs may be append depend on whether you enabled the keyword argument. This fallback is due to all overloaded functions share the same function in SWIG generated proxy class.

For detailed usage of function annotation, see PEP 3107.

36.12.2 Buffer interface

Buffer protocols were revised in Python 3. SWIG also gains a series of new typemaps to support buffer interfaces. These typemap macros are defined in pybuffer.i, which must be included in order to use them. By using these typemaps, your wrapped function will be able to accept any Python object that exposes a suitable buffer interface.

For example, the get_path() function puts the path string into the memory pointed to by its argument:

void get_path(char *s);

Then you can write a typemap like this: (the following example is applied to both Python 3.0 and 2.6, since the bytearray type is backported to 2.6.

%include <pybuffer.i>
%pybuffer_mutable_string(char *str);
void get_path(char *s);

And then on the Python side the wrapped get_path could be used in this way:

>>> p = bytearray(10)
>>> get_path(p)
>>> print(p)
bytearray(b'/Foo/Bar/\x00')

The macros defined in pybuffer.i are similar to those in cstring.i:

%pybuffer_mutable_binary(parm, size_parm)

The macro can be used to generate a typemap which maps a buffer of an object to a pointer provided by parm and a size argument provided by size_parm. For example:

%pybuffer_mutable_binary(char *str, size_t size);
...
int snprintf(char *str, size_t size, const char *format, ...);

In Python:

>>> buf = bytearray(6)
>>> snprintf(buf, "Hello world!")
>>> print(buf)
bytearray(b'Hello\x00')
>>> 

%pybuffer_mutable_string(parm)

This typemap macro requires the buffer to be a zero terminated string, and maps the pointer of the buffer to parm. For example:

%pybuffer_mutable_string(char *str);
...
size_t make_upper(char *str);

In Python:

>>> buf = bytearray(b'foo\x00')
>>> make_upper(buf)
>>> print(buf)
bytearray(b'FOO\x00')
>>>

Both %pybuffer_mutable_binary and %pybuffer_mutable_string require the provided buffer to be mutable, eg. they can accept a bytearray type but can't accept an immutable byte type.

%pybuffer_binary(parm, size_parm)

This macro maps an object's buffer to a pointer parm and a size size_parm. It is similar to %pybuffer_mutable_binary, except the %pybuffer_binary an accept both mutable and immutable buffers. As a result, the wrapped function should not modify the buffer.

%pybuffer_string(parm)

This macro maps an object's buffer as a string pointer parm. It is similar to %pybuffer_mutable_string but the buffer could be both mutable and immutable. And your function should not modify the buffer.

36.12.3 Abstract base classes

By including pyabc.i and using the -py3 command line option when calling SWIG, the proxy classes of the STL containers will automatically gain an appropriate abstract base class. For example, the following SWIG interface:

%include <pyabc.i>
%include <std_map.i>
%include <std_list.i>

namespace std {
  %template(Mapii) map<int, int>;
  %template(IntList) list<int>;
}

will generate a Python proxy class Mapii inheriting from collections.MutableMap and a proxy class IntList inheriting from collections.MutableSequence.

pyabc.i also provides a macro %pythonabc that could be used to define an abstract base class for your own C++ class:

%pythonabc(MySet, collections.MutableSet);

For details of abstract base class, please see PEP 3119.

36.12.4 Byte string output conversion

By default, any byte string (char* or std::string) returned from C or C++ code is decoded to text as UTF-8. This decoding uses the surrogateescape error handler under Python 3.1 or higher -- this error handler decodes invalid byte sequences to high surrogate characters in the range U+DC80 to U+DCFF. As an example, consider the following SWIG interface, which exposes a byte string that cannot be completely decoded as UTF-8:

%module example

%include <std_string.i>

%inline %{

const char* non_utf8_c_str(void) {
        return "h\xe9llo w\xc3\xb6rld";
}

%}

When this method is called from Python 3, the return value is the following text string:

>>> s = example.non_utf8_c_str()
>>> s
'h\udce9llo wörld'

Since the C string contains bytes that cannot be decoded as UTF-8, those raw bytes are represented as high surrogate characters that can be used to obtain the original byte sequence:

>>> b = s.encode('utf-8', errors='surrogateescape')
>>> b
b'h\xe9llo w\xc3\xb6rld'

One can then attempt a different encoding, if desired (or simply leave the byte string as a raw sequence of bytes for use in binary protocols):

>>> b.decode('latin-1')
'héllo wörld'

Note, however, that text strings containing surrogate characters are rejected with the default strict codec error handler. For example:

>>> with open('test', 'w') as f:
...     print(s, file=f)
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\udce9' in position 1: surrogates not allowed

This requires the user to check most strings returned by SWIG bindings, but the alternative is for a non-UTF8 byte string to be completely inaccessible in Python 3 code.

For more details about the surrogateescape error handler, please see PEP 383.