3 SWIG Basics

This chapter describes the basic operation of SWIG, the structure of its input files, and how it handles various sorts of C/C++ declarations. Specific details about each target language are described in later chapters. However, much of the underlying design philosophy and aspects of SWIG's implementation are contained here.

Running SWIG

To run SWIG, use the swig command with one or more of the following options and a filename like this:

swig [ options ] filename

-tcl                  Generate Tcl wrappers
-perl                 Generate Perl5 wrappers
-python               Generate Python wrappers
-guile                Generate Guile wrappers
-ruby                 Generate Ruby wrappers
-java                 Generate Java wrappers
-mzscheme             Generate mzscheme wrappers
-c++                  Enable C++ parsing
-Idir                 Add a directory to the file include path
-lfile                Include a SWIG library file.
-c                    Generate raw wrapper code (omit supporting code)
-o outfile            Name of output file
-module name          Set the name of the SWIG module
-Dsymbol              Define a preprocessor symbol
-version              Show SWIG version number
-help                 Display all options

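Additional options are often defined for each target language. A full list can be obtained by typing swig -help, or swig -lang -help for the options of one language, where -lang is one of the language options listed above (for example, swig -python -help).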

Input format

As input, SWIG expects a file containing ANSI C/C++ declarations and special SWIG directives. More often than not, this is a SWIG interface file, usually denoted with a .i or .swg suffix. In certain cases, SWIG can be used directly on raw header files or source files. However, this is not the most typical case, and there are several reasons why you might not want to do this (described later).

The most common format of a SWIG interface is as follows:

%module mymodule 
%{
#include "myheader.h"
%}
// Now list ANSI C/C++ declarations
int foo;
int bar(int x);
...
The name of the module is supplied using the special %module directive (or the -module command line option). This directive must appear at the beginning of the file and is used to name the resulting extension module (in addition, this name often defines a namespace in the target language). If the module name is supplied on the command line, it overrides the name specified with the %module directive.

Everything in the %{ ... %} block is simply copied to the resulting output file. The enclosed text is not parsed or interpreted by SWIG. Although the use of a %{ ... %} block is optional, most interface files have one to include header files and other supporting C declarations. The syntax and semantics of %{ ... %} in SWIG are analogous to those of the declarations section used in input files to parser generation tools such as yacc or bison.

SWIG Output

The output of SWIG is a C/C++ file that contains all of the wrapper code needed to build an extension module. By default, an input file with the name file.i is transformed into a file file_wrap.c or file_wrap.cxx (depending on whether or not the -c++ option has been used). The name of the output file can be changed using the -o option. In certain cases, file suffixes are used by the compiler to determine the source language (C, C++, etc.). Therefore, you have to use the -o option to change the suffix of the SWIG-generated wrapper file (if needed). For example:
$ swig -c++ -python -o example_wrap.cpp example.i

It is important to note that the output file created by SWIG normally contains everything that is needed to construct an extension module for the target scripting language. SWIG is not a stub compiler, nor is it usually necessary to edit the output file (and if you look at the output, you probably won't want to). To build the final extension module, the SWIG output file is compiled and linked with the rest of your C/C++ program to create a shared library.

Comments

C and C++ style comments may appear anywhere in interface files. In previous versions of SWIG, comments were used to generate documentation files. However, this feature is currently under repair and will reappear in a later SWIG release.

C Preprocessor

Like C, SWIG preprocesses all input files through an enhanced version of the C preprocessor. All standard preprocessor features are supported including file inclusion, conditional compilation and macros. However, #include statements are ignored unless the -includeall command line option has been supplied. The reason for disabling includes is that SWIG is sometimes used to process raw C header files. In this case, you usually only want the extension module to include functions in the supplied header file rather than everything that might be included by that header file (i.e., system headers, C library functions, etc.).

It should also be noted that the SWIG preprocessor skips all text enclosed inside a %{...%} block. In addition, the preprocessor includes a number of macro handling enhancements that make it more powerful than the normal C preprocessor. These extensions are described in the "Preprocessor" section near the end of this chapter.

SWIG Directives

Most of SWIG's operation is controlled by special directives that are always preceded by a "%" to distinguish them from normal C declarations. These directives are used to give SWIG hints or to alter SWIG's parsing behavior in some manner.

Since SWIG directives are not legal C syntax, it is generally not possible to include them in header files. However, SWIG directives can be included in C header files using conditional compilation like this:

/* header.h  --- Some header file */

/* SWIG directives -- only seen if SWIG is running */ 
#ifdef SWIG
%module foo
#endif
SWIG is a special preprocessing symbol defined by SWIG when it is parsing an input file.

Parser Limitations

Although SWIG can parse most common C/C++ declarations, it does not provide a complete C/C++ parser implementation. Most of these limitations pertain to very complicated type declarations and certain advanced C++ features. Specifically, the following features are not currently supported:

In the event of a parsing error, conditional compilation can be used to skip offending code. For example:

#ifndef SWIG
... some bad declarations ...
#endif
Alternatively, you can just delete the offending code from the interface file.

One of the reasons why SWIG does not provide a full C++ parser implementation is that it has been designed to work with incomplete specifications and to be very permissive in its handling of C/C++ datatypes (e.g., SWIG can generate interfaces even when there are missing class declarations or opaque datatypes). Unfortunately, this approach makes it extremely difficult to implement certain parts of a C/C++ parser as most compilers use type information to assist in the parsing of more complex declarations (for the truly curious, the primary complication in the implementation is that the SWIG parser does not utilize a separate typedef-name terminal symbol as described on p. 234 of K&R).
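
The dependence on type information mentioned above can be seen in plain C. The following sketch is only an illustration and is not taken from SWIG; the names T, declares_pointer, and multiplies are made up for this example. The same token sequence "T * x" is a declaration when T names a type and a multiplication when T names a variable, so a parser that does not track typedef names cannot tell the two apart from syntax alone.

```c
/* Hypothetical illustration; not part of SWIG. */
typedef int T;

/* Here T is a typedef name, so "T *x" declares a pointer to int. */
int declares_pointer(void) {
    T *x = 0;
    return x == 0;
}

/* Here a local variable named T hides the typedef, so "T * x" multiplies. */
int multiplies(void) {
    int T = 6, x = 7;
    return T * x;
}
```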

It should also be noted that the SWIG parser was never really developed with the intent that it would be blindly used on raw C/C++ source code. Although parsing has become a lot more powerful in recent versions, the underlying assumption was that one would usually start with a header and enhance it by adding additional support code, cutting certain features out, supplying special SWIG directives, and so forth.

Wrapping Simple C Declarations

SWIG wraps simple C declarations by creating an interface that closely matches the way in which the declarations would be used in a C program. For example, consider the following interface file:

%module example

extern double sin(double x);
extern int strcmp(const char *, const char *);
extern int Foo;
#define STATUS 50
#define VERSION "1.1"
In this file, there are two functions sin() and strcmp(), a global variable Foo, and two constants STATUS and VERSION. When SWIG creates an extension module, these declarations are accessible as scripting language functions, variables, and constants respectively. For example, in Tcl:

% sin 3
0.14112000806
% strcmp Dave Mike
-1
% puts $Foo
42
% puts $STATUS
50
% puts $VERSION
1.1
Or in Python:

>>> example.sin(3)
0.1411200080598672
>>> example.strcmp('Dave','Mike')
-1
>>> print example.cvar.Foo
42
>>> print example.STATUS
50
>>> print example.VERSION
1.1
Whenever possible, SWIG creates an interface that closely matches the underlying C/C++ code. However, due to subtle differences between languages, run-time environments, and semantics, it is not always possible to do so. The next few sections describe various aspects of this mapping.

Basic Type Handling

In order to build an interface, SWIG has to convert C/C++ datatypes to equivalent types in the target language. Generally, scripting languages provide a more limited set of primitive types than C. Therefore, this conversion process involves a certain amount of type coercion.

Most scripting languages provide a single integer type that is implemented using the int or long datatype in C. The following list shows all of the C datatypes that SWIG will convert to and from integers in the target language:

int
short
long
unsigned
signed
unsigned short
unsigned long
unsigned char
signed char
bool

When an integral value is converted from C, a cast is used to convert it to the representation in the target language. Thus, a 16 bit short in C may be promoted to a 32 bit integer. When integers are converted in the other direction, the value is cast back into the original C type. If the value is too large to fit, it is silently truncated.
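
The effect of the cast can be sketched in a few lines of C. This is an illustration rather than SWIG's generated code, and the function names are made up. It uses an unsigned 16-bit type because C defines that conversion exactly (the value is reduced modulo 65536); narrowing to a signed type is implementation-defined, although most platforms wrap around in the same way.

```c
#include <stdint.h>

/* Convert a wide script-side integer to a 16-bit C value with a plain cast. */
uint16_t to_c_u16(long value) {
    return (uint16_t) value;   /* reduced modulo 65536, no error is raised */
}

/* Convert the C value back to the wide integer used by the script. */
long to_script_int(uint16_t value) {
    return (long) value;       /* widening never loses information */
}
```

With these definitions, to_c_u16(70000) yields 4464, which is the kind of silent truncation described above.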

unsigned char and signed char are special cases that are handled as small 8-bit integers. Normally, the char datatype is mapped as a one-character ASCII string.

The bool datatype is cast to and from an integer value of 0 and 1.

Some care is required when working with large integer values. Most scripting languages use 32-bit integers so mapping a 64-bit long integer may lead to truncation errors. Similar problems may arise with 32 bit unsigned integers (which may appear as large negative numbers). As a rule of thumb, the int datatype and all variations of char and short datatypes are safe to use. For unsigned int and long datatypes, you will need to carefully check the correct operation of your program after it has been wrapped with SWIG.
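
The "large negative number" effect can be reproduced directly in C. The sketch below is an assumption-laden illustration, not SWIG code: it models a target language whose only integer type is a signed 32-bit value and shows what a 32-bit unsigned C value looks like when its bits are stored in such an integer. The function name as_signed32 is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a 32-bit unsigned value as a signed 32-bit value. */
int32_t as_signed32(uint32_t value) {
    int32_t result;
    memcpy(&result, &value, sizeof result);  /* copy the bits unchanged */
    return result;
}
```

Small values survive unchanged, but as_signed32(3000000000u) comes back as -1294967296.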

Although the SWIG parser supports the long long datatype, very few language modules currently support it. This is because long long usually exceeds the precision available in the target language. This limitation may be eliminated in future SWIG releases.

SWIG recognizes the following floating point types :

float
double

Floating point numbers are mapped to and from the natural representation of floats in the target language. This is almost always a C double. The rarely used datatype of long double is not supported by SWIG.

The char datatype is mapped into a NULL terminated ASCII string with a single character. When used in a scripting language it shows up as a tiny string containing the character value. When converting the value back into C, SWIG takes a character string from the scripting language and strips off the first character as the char value. Thus if the value "foo" is assigned to a char datatype, it gets the value `f'.
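
The two directions of this mapping amount to very little code. The following is a minimal sketch with made-up function names, not the code that SWIG emits for any particular language:

```c
/* String-to-char direction: keep only the first character. */
char char_from_script(const char *s) {
    return s[0];
}

/* Char-to-string direction: build a one-character, NULL-terminated
   string in a caller-supplied buffer. */
void char_to_script(char c, char out[2]) {
    out[0] = c;
    out[1] = '\0';
}
```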

The char * datatype is handled as a NULL-terminated ASCII string. SWIG maps this into an 8-bit character string in the target scripting language. SWIG converts character strings in the target language to NULL terminated strings before passing them into C/C++. It is illegal for these strings to have embedded NULL bytes. Therefore, the char * datatype is not generally suitable for passing binary data (although SWIG's behavior can be modified to handle this).
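
The reason binary data does not survive a char * conversion is that every NULL-terminated string function stops at the first zero byte. A short C illustration (the payload and the function names are invented for this example):

```c
#include <string.h>

/* A 5-byte payload with a zero byte in the middle. */
static const char payload[5] = { 'a', 'b', '\0', 'c', 'd' };

/* Number of bytes actually stored in the payload. */
size_t payload_size(void) {
    return sizeof payload;
}

/* Number of bytes a NULL-terminated string API will see. */
size_t payload_strlen(void) {
    return strlen(payload);
}
```

Here payload_size() is 5 while payload_strlen() is 2, so the last three bytes would be lost.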

At this time, SWIG does not provide any special support for Unicode or wide-character strings (the C wchar_t type). This is a delicate topic that is quite complex and poorly understood by many programmers. For those scripting languages that provide Unicode support, Unicode strings are often implicitly converted to an 8-bit representation such as UTF-8 whenever they are mapped to the char * type (in which case the SWIG interface will probably work).

Global Variables

Whenever possible, SWIG maps C/C++ global variables into scripting language variables. For example,

%module example
double foo;

results in a scripting language variable like this:

# Tcl
set foo 3.5                     ;# Set foo to 3.5
puts $foo                       ;# Print the value of foo

# Python
cvar.foo = 3.5                  # Set foo to 3.5
print cvar.foo                  # Print value of foo

# Perl
$foo = 3.5;                     # Set foo to 3.5
print $foo,"\n";                # Print value of foo

# Ruby
Module.foo = 3.5               # Set foo to 3.5
print Module.foo, "\n"         # Print value of foo
Whenever the scripting language variable is used, the underlying C global variable is accessed. Although SWIG makes every attempt to make global variables work like scripting language variables, it is not always possible to do so. For instance, in Python, all global variables must be accessed through a special variable object known as cvar (shown above). In Ruby, variables are accessed as attributes of the module. Other languages may convert variables to a pair of accessor functions. For example, the Java module generates a pair of functions double get_foo() and set_foo(double val) that are used to manipulate the value.

Finally, if a global variable has been declared as const, it only supports read-only access. Note: this behavior is new to SWIG-1.3. Earlier versions of SWIG incorrectly handled const and created constants instead.

Constants

Constants can be created using #define, enumerations, or a special %constant directive. The following interface file shows a few valid constant declarations :

#define I_CONST       5               // An integer constant
#define PI            3.14159         // A Floating point constant
#define S_CONST       "hello world"   // A string constant
#define NEWLINE       '\n'            // Character constant

enum boolean {NO=0, YES=1};
enum months {JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG,
             SEP, OCT, NOV, DEC};
%constant double BLAH = 42.37;
#define F_CONST (double) 5            // A floating point constant with cast
#define PI_4 PI/4
#define FLAGS 0x04 | 0x08 | 0x40

In #define declarations, the type of a constant is inferred by syntax. For example, a number with a decimal point is assumed to be floating point. In addition, SWIG must be able to fully resolve all of the symbols used in a #define in order for a constant to actually be created. This restriction is necessary because #define is also used to define preprocessor macros that are definitely not meant to be part of the scripting language interface. For example:
#define EXTERN extern

EXTERN void foo();
In this case, you probably don't want to create a constant called EXTERN (what would the value be?). In general, SWIG will not create constants for macros unless the value can be completely determined. For instance, in the above example, the declaration
#define PI_4  PI/4
defines a constant because PI was already defined as a constant and the value is known.

The use of constant expressions is allowed, but SWIG does not evaluate them. Rather, it passes them through to the output file and lets the C compiler perform the final evaluation (SWIG does perform a limited form of type-checking however).
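
Since the expressions are passed through verbatim, it is the C compiler that turns them into values. The sketch below shows roughly what that means for two of the macros listed earlier; the accessor functions are hypothetical and exist only to make the values observable.

```c
#define PI     3.14159
#define PI_4   PI/4
#define FLAGS  0x04 | 0x08 | 0x40

/* The C compiler, not SWIG, folds these expressions into values. */
double pi_4_value(void) { return PI_4; }
int    flags_value(void) { return FLAGS; }
```

In this sketch flags_value() returns 0x4C and pi_4_value() returns a value close to 0.7854.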

For enumerations, it is critical that the original enum definition be included somewhere in the interface file (either in a header file or in the %{,%} block). SWIG only translates the enumeration into code needed to add the constants to a scripting language. It needs the original enumeration declaration in order to get the correct enum values as assigned by the C compiler.

The %constant directive is used to more precisely create constants corresponding to different C datatypes. Although it is usually not needed for simple values, it is more useful when working with pointers and other more complex datatypes. Typically, %constant is only used when you want to add constants to the scripting language interface that are not defined in the original header file.

A brief word about const

A common confusion with C programming is the semantic meaning of the const qualifier in declarations--especially when it is mixed with pointers and other type modifiers. In fact, previous versions of SWIG handled const incorrectly--a situation that SWIG-1.3.7 and newer releases have fixed.

Starting with SWIG-1.3, all variable declarations, regardless of any use of const, are wrapped as global variables. If a declaration happens to be declared as const, it is wrapped as a read-only variable. To tell if a variable is const or not, you need to look at the right-most occurrence of the const qualifier (that appears before the variable name). If the right-most const occurs after all other type modifiers (such as pointers), then the variable is const. Otherwise, it is not.

Here are some examples of const declarations.

const char a;           // A constant character
char const b;           // A constant character (the same)
char *const c;          // A constant pointer to a character
const char *const d;    // A constant pointer to a constant character
Here is an example of a declaration that is not const:
const char *e;          // A pointer to a constant character.  The pointer
                        // may be modified.
In this case, the pointer e can change---it's only the value being pointed to that is read-only.
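
The difference between the two pointer forms can be checked with a C compiler. The following sketch reuses the names c and e from the declarations above; the remaining names (first, second, c_target, repoint_e) are made up for the illustration.

```c
static const char first  = 'a';
static const char second = 'b';

/* e is a pointer to a constant character: e itself may be re-pointed. */
const char *e = &first;

/* c is a constant pointer: it always points to c_target. */
static char c_target = 'x';
char *const c = &c_target;

void repoint_e(void) {
    e = &second;      /* allowed: only *e is read-only */
    /* c = &c_target;    would not compile: c itself is const */
    *c = 'y';         /* allowed: the character c points to is writable */
}
```

By the rule above, c would be wrapped as a read-only variable and e as an ordinary read/write variable.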

Compatibility Note: One reason for changing SWIG to handle const declarations as read-only variables is that there are many situations where the value of a const variable might change. For example, a library might export a symbol as const in its public API to discourage modification, but still allow the value to change through some other kind of internal mechanism. In an embedded system, a const declaration might refer to a read-only memory address such as the location of a memory-mapped I/O device port (where the value changes, but writing to the port is not supported by the hardware). Rather than trying to build a bunch of special cases into the const qualifier, the new interpretation of const as "read-only" is simple and exactly matches the actual semantics of const in C/C++. If you really want to create a constant as in older versions of SWIG, use the %constant directive instead. For example:

%constant double PI = 3.14159;
or
#ifdef SWIG
#define const %constant
#endif
const double foo = 3.4;
const double bar = 23.4;
const int    spam = 42;
#ifdef SWIG
#undef const
#endif
...

Pointers and complex objects

Most C programs have many more types than integers, floats, and character strings. Usually, there are pointers, arrays, structures, and other types of objects. This section discusses the handling of these datatypes.

Simple pointers

Pointers to fundamental C datatypes such as

int *
double ***
char **

are fully supported by SWIG. SWIG encodes pointers into a representation that contains the actual value of the pointer and a string representing the datatype. Thus, the SWIG representation of the above pointers (in Tcl) might look like this:

_10081012_p_int
_1008e124_ppp_double
_f8ac_pp_char

A NULL pointer is represented by the string "NULL" or the value 0 encoded with type information.
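
To make the idea concrete, here is a minimal sketch of an encoder that produces strings of the same shape. It is an assumption for illustration only: the function encode_pointer is hypothetical, it prints the value as straightforward hexadecimal, and SWIG's real encoding differs in detail from one language module to another.

```c
#include <stdio.h>

/* Hypothetical encoder: writes "_<hex value>_<type>" or "NULL" into buf. */
void encode_pointer(char *buf, size_t size, unsigned long value, const char *type) {
    if (value == 0)
        snprintf(buf, size, "NULL");
    else
        snprintf(buf, size, "_%lx_%s", value, type);
}
```

Called with the value 0x10081012 and the type string "p_int", this sketch produces "_10081012_p_int".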

All pointers are treated as opaque objects by SWIG. Thus, a pointer may be returned by a function and passed around to other C functions as needed. For all practical purposes, the scripting language interface works in exactly the same way as you would manipulate the pointer in a C program with the exception that the pointer can't be dereferenced (at least not without additional help).

The scripting language representation of a pointer value should never be manipulated directly. Even though the values shown above look like hexadecimal addresses, the numbers used may differ from the actual machine address (e.g., on little-endian machines, the digits may appear in reverse order). Furthermore, SWIG does not normally map pointers into high-level objects such as associative arrays or lists (for example, converting an int * into a list of integers). There are several reasons why SWIG does not do this:

Run time pointer type checking

By allowing pointers to be manipulated from a scripting language, extension modules effectively bypass compile-time type checking in the C/C++ compiler. To prevent errors, a type signature is encoded into all pointer values and is used to perform run-time type checking. This type-checking process is an integral part of SWIG and can not be disabled or modified without using typemaps (described in later chapters).

Like C, void * matches any kind of pointer. Furthermore, NULL pointers can be passed to any function that expects to receive a pointer. Although this has the potential to cause a crash, NULL pointers are also sometimes used as sentinel values or to denote a missing/empty value. Therefore, SWIG leaves NULL pointer checking up to the application.
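
The checking rules described in this section can be summarized in a small C function. This is a sketch under stated assumptions, not SWIG's run-time code: it assumes pointers are encoded as "_<hex value>_<type>" strings like the Tcl examples shown earlier, and the name pointer_type_ok is hypothetical.

```c
#include <string.h>

/* Hypothetical checker for strings of the form "_<hex value>_<type>".
   Returns 1 if the encoded pointer may be passed where `expected` is required. */
int pointer_type_ok(const char *encoded, const char *expected) {
    const char *type;
    if (strcmp(encoded, "NULL") == 0)
        return 1;                          /* NULL is accepted everywhere */
    if (encoded[0] != '_')
        return 0;                          /* not an encoded pointer */
    type = strchr(encoded + 1, '_');       /* skip the hex digits */
    if (type == NULL)
        return 0;
    type++;                                /* step past the separator */
    if (strcmp(expected, "p_void") == 0)
        return 1;                          /* void * matches any pointer */
    return strcmp(type, expected) == 0;
}
```

A real implementation must also account for typedef names and C++ inheritance, which this sketch ignores.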

Derived types, structs, and classes

For everything else (structs, classes, arrays, etc...) SWIG applies a very simple rule :

Everything else is a pointer

In other words, SWIG manipulates everything else by reference. This model makes sense because most C/C++ programs make heavy use of pointers and SWIG can use the type-checked pointer mechanism already present for handling pointers to basic datatypes.

Although this probably sounds complicated, it's really quite simple. Suppose you have an interface file like this :

%module fileio
FILE *fopen(char *, char *);
int fclose(FILE *);
unsigned fread(void *ptr, unsigned size, unsigned nobj, FILE *);
unsigned fwrite(void *ptr, unsigned size, unsigned nobj, FILE *);
void *malloc(int nbytes);
void free(void *);

In this file, SWIG doesn't know what a FILE is, but it's used as a pointer, so it doesn't really matter what it is. If you wrap this module for Python, you can use the functions just as you would expect:

# Copy a file 
def filecopy(source,target):
    f1 = fopen(source,"r")
    f2 = fopen(target,"w")
    buffer = malloc(8192)
    # Read 1-byte objects so that fread() returns the number of bytes read
    nbytes = fread(buffer,1,8192,f1)
    while (nbytes > 0):
        fwrite(buffer,1,nbytes,f2)
        nbytes = fread(buffer,1,8192,f1)
    free(buffer)
    fclose(f1)
    fclose(f2)

In this case f1, f2, and buffer are all opaque objects containing C pointers. It doesn't matter what value they contain--our program works just fine without this knowledge.

Undefined datatypes

When SWIG encounters an undeclared datatype, it automatically assumes that it is a structure or class. For example, suppose the following function appeared in a SWIG input file:

void matrix_multiply(Matrix *a, Matrix *b, Matrix *c);

SWIG has no idea what a "Matrix" is. However, it is obviously a pointer to something so SWIG generates a wrapper using its generic pointer handling code.

Unlike C or C++, SWIG does not actually care whether Matrix has been previously defined in the interface file or not. This allows SWIG to generate interfaces from only partial or limited information. In some cases, you may not care what a Matrix really is as long as you can pass an opaque reference to one around in the scripting language interface.

An important detail to mention is that SWIG will gladly generate wrappers for an interface when there are unspecified type names. However, all unspecified types are internally handled as pointers to structures or classes! For example, consider the following declaration:

void foo(size_t num);
If size_t is undeclared, SWIG generates wrappers that expect to receive a type of size_t * (this mapping is described shortly). As a result, the scripting interface might behave strangely. For example:
foo(40);
TypeError: expected a _p_size_t.
The only way to fix this problem is to make sure you properly declare type names using typedef.

Typedef

Like C, typedef can be used to define new type names in SWIG. For example:

typedef unsigned int size_t;

typedef definitions appearing in a SWIG interface are not propagated to the generated wrapper code. Therefore, if you write a new typedef declaration, you may have to also include a small piece of code in a %{ ... %} block like this:
%{
/* Include in the generated wrapper file */
typedef unsigned int size_t;
%}
/* Tell SWIG about it */
typedef unsigned int size_t;
or
%inline %{
typedef unsigned int size_t;
%}
In certain cases, you might be able to include other header files to collect type information. For example:
%module example
%import "sys/types.h"
In this case, you might run SWIG as follows:
$ swig -I/usr/include -includeall example.i
However, it should be noted that your mileage will vary greatly here. System headers are notoriously complicated and may rely upon a variety of non-standard C coding extensions (e.g., special directives to GCC). Unless you exactly specify the right include directories and preprocessor symbols, this may not work correctly (you will have to experiment).

SWIG tracks typedef declarations and uses this information for run-time type checking. For instance, if you use the above typedef and have the following function declaration:

void foo(unsigned int *ptr);
The corresponding wrapper function accepts arguments of type unsigned int * or size_t *.
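
This matches what a C compiler does at compile time: a typedef introduces a new name for a type, not a new type. The sketch below uses a hypothetical typedef count_t in place of size_t (to avoid clashing with the real size_t from the C library), and the function names are made up.

```c
/* Hypothetical typedef standing in for the size_t example above. */
typedef unsigned int count_t;

/* Declared with the underlying type ... */
void increment(unsigned int *ptr) {
    *ptr += 1;
}

/* ... but callable with a pointer to the typedef name as well. */
count_t bump(count_t start) {
    count_t value = start;
    increment(&value);   /* count_t * and unsigned int * are the same type */
    return value;
}
```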

Other Practicalities

So far, this chapter has presented almost everything you need to know to use SWIG for simple interfaces. However, some C programs use idioms that are somewhat more difficult to map to a scripting language interface. This section describes some of these issues.

Passing complex datatypes by value

Sometimes a C function takes structure parameters that are passed by value. For example, consider the following function:

double dot_product(Vector a, Vector b);

To deal with this, SWIG transforms the function to use pointers by creating a wrapper equivalent to the following:

double wrap_dot_product(Vector *a, Vector *b) {
    return dot_product(*a,*b);
}

In the target language, the dot_product() function now accepts pointers to Vectors instead of Vectors. For the most part, this transformation is transparent so you might not notice.
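
A complete, compilable version of this transformation might look as follows. The Vector layout and the body of dot_product() are assumptions made for the example; only the shape of the wrapper comes from the text above.

```c
typedef struct { double x, y, z; } Vector;   /* assumed layout */

double dot_product(Vector a, Vector b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Wrapper in the style shown above: takes pointers, passes values. */
double wrap_dot_product(Vector *a, Vector *b) {
    return dot_product(*a, *b);
}
```

The wrapper dereferences its arguments, so the original function still receives copies and cannot modify the caller's objects.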

Return by value

C functions that return structure or class datatypes by value are more difficult to handle. Consider the following function:

Vector cross_product(Vector v1, Vector v2);

This function wants to return Vector, but SWIG only really supports pointers. As a result, SWIG creates a wrapper like this:

Vector *wrap_cross_product(Vector *v1, Vector *v2) {
        Vector *result;
        result = (Vector *) malloc(sizeof(Vector));
        *(result) = cross_product(*v1,*v2);
        return result;
}

or if SWIG was run with the -c++ option:

Vector *wrap_cross_product(Vector *v1, Vector *v2) {
        Vector *result = new Vector(cross_product(*v1,*v2)); // Uses default copy constructor
        return result;
}

In both cases, SWIG allocates a new object and returns a reference to it. It is up to the user to delete the returned object when it is no longer in use. Clearly, this will leak memory if you are unaware of the implicit memory allocation and don't take steps to free the result.
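
The ownership issue is easier to see in a self-contained example. In the sketch below, the Vector layout, the body of cross_product(), and the delete_Vector() helper are all assumptions made for illustration; the point is that every call to the wrapper performs one allocation that somebody must eventually release.

```c
#include <stdlib.h>

typedef struct { double x, y, z; } Vector;   /* assumed layout */

Vector cross_product(Vector v1, Vector v2) {
    Vector r;
    r.x = v1.y * v2.z - v1.z * v2.y;
    r.y = v1.z * v2.x - v1.x * v2.z;
    r.z = v1.x * v2.y - v1.y * v2.x;
    return r;
}

/* C-mode wrapper in the style shown above: the result lives on the heap. */
Vector *wrap_cross_product(Vector *v1, Vector *v2) {
    Vector *result = (Vector *) malloc(sizeof(Vector));
    if (result != NULL)
        *result = cross_product(*v1, *v2);
    return result;
}

/* Hypothetical helper that the interface would also need to expose. */
void delete_Vector(Vector *v) {
    free(v);
}
```

One way to avoid the leak is to export a function like delete_Vector() from the interface and call it from the scripting language once the result is no longer needed.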

Linking to complex variables

When global variables or class members involving complex datatypes are encountered, SWIG handles them as pointers. For example, a global variable like this

Vector unit_i;

gets mapped to an underlying pair of set/get functions like this :

Vector *unit_i_get() {
	return &unit_i;
}
void unit_i_set(Vector *value) {
	unit_i = *value;
}

Again some caution is in order. A global variable created in this manner will show up as a pointer in the target scripting language. It would be an extremely bad idea to free or destroy such a pointer. Also, C++ classes must supply a properly defined copy constructor in order for assignment to work correctly.
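
The following sketch fills in the accessor pair with an assumed Vector layout to show why the returned pointer must not be freed: the get function hands out the address of the global itself, while the set function copies into it.

```c
typedef struct { double x, y, z; } Vector;   /* assumed layout */

Vector unit_i = { 1.0, 0.0, 0.0 };

/* Accessors in the style shown above. */
Vector *unit_i_get(void) {
    return &unit_i;            /* address of the global itself, not a copy */
}

void unit_i_set(Vector *value) {
    unit_i = *value;           /* copies the contents into the global */
}
```

Because unit_i_get() returns &unit_i, any change made through that pointer is a change to the global, and the storage was never obtained from malloc() or new.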

Linking to char *

When a global variable of type char * appears, SWIG uses malloc() or new to allocate memory for the new value. Specifically, if you have a variable like this
char *foo;
SWIG generates the following code:
/* C mode */
void foo_set(char *value) {
   if (foo) free(foo);
   foo = (char *) malloc(strlen(value)+1);
   strcpy(foo,value);
}

/* C++ mode.  When -c++ option is used */
void foo_set(char *value) {
   if (foo) delete [] foo;
   foo = new char[strlen(value)+1];
   strcpy(foo,value);
}
If this is not the behavior that you want, consider making the variable read-only using the %readonly directive. Alternatively, you might write a short assist-function to set the value exactly like you want. For example:
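%inline %{
  /* Assumes foo already points to a writable buffer of at least 50 bytes */
  void set_foo(char *value) {
       strncpy(foo,value, 50);
       foo[49] = '\0';      /* strncpy() does not terminate a truncated copy */
   }
%}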
Note: If you write an assist function like this, you will have to call it as a function from the target scripting language (it does not work like a variable). For example, in Python you will have to write:
>>> set_foo("Hello World")
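
Returning to the setter that SWIG generates for a char * variable, the key property is that the variable always owns a private, heap-allocated copy of the string. The sketch below is a standalone C version of that idea. It is not the generated code: the NULL handling and the const qualifier are additions made for this illustration.

```c
#include <stdlib.h>
#include <string.h>

char *foo = NULL;

/* C-mode setter in the style shown above, with NULL checks added
   for illustration. */
void foo_set(const char *value) {
    char *copy = NULL;
    if (value != NULL) {
        copy = (char *) malloc(strlen(value) + 1);
        if (copy != NULL)
            strcpy(copy, value);
    }
    free(foo);        /* free(NULL) is a no-op */
    foo = copy;
}
```

A consequence of this scheme is that the old value is released with free(), so a char * variable that initially points to a string literal or to static storage is not safe to set this way.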

Arrays

Arrays are fully supported by SWIG, but they are always handled as pointers instead of mapping them to a special array object or list in the target language. Thus, the following declarations :

int foobar(int a[40]);
void grok(char *argv[]);
void transpose(double a[20][20]);

are processed as if they were really declared like this:

int foobar(int *a);
void grok(char **argv);
void transpose(double (*a)[20]);
Like C, SWIG does not perform array bounds checking. It is up to the user to make sure the pointer points to a suitably allocated region of memory.
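
That the array form and the pointer form are the same declaration can be confirmed with a C compiler. In this sketch the body of foobar() and the name foobar_ptr are made up; the point is that a function declared with an int a[40] parameter can be assigned to a pointer to "function taking int *" without a cast.

```c
/* Declared with an array parameter ... */
int foobar(int a[40]) {
    return a[0] + a[39];
}

/* ... yet its type is "function taking int *", so this initialization
   is valid without a cast. */
int (*foobar_ptr)(int *) = foobar;
```

The 40 in the parameter declaration is not enforced by the compiler, which is why the caller remains responsible for the size of the memory behind the pointer.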

Multi-dimensional arrays are transformed into a pointer to an array of one less dimension. For example:

int [10];         // Maps to int *
int [10][20];     // Maps to int (*)[20]
int [10][20][30]; // Maps to int (*)[20][30]
It is important to note that in the C type system, a multidimensional array a[][] is NOT equivalent to a single pointer *a or a double pointer such as **a. Instead, a pointer to an array is used (as shown above) where the actual value of the pointer is the starting memory location of the array. The reader is strongly advised to dust off their C book and re-read the section on arrays before using them with SWIG.
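
A short C sketch of the pointer-to-array rule, using a hypothetical 10 by 20 array named grid: the pointer obtained from the array has type int (*)[20], its value is the address of the first element, and incrementing it advances by one whole row.

```c
#include <stddef.h>

int grid[10][20];

/* A 2-D array decays to a pointer to its first row, not to int **. */
int (*first_row(void))[20] {
    return grid;
}

/* Distance in bytes between consecutive rows when stepping that pointer. */
size_t row_stride(void) {
    int (*row)[20] = grid;
    return (size_t) ((char *) (row + 1) - (char *) row);
}
```

Here row_stride() equals 20 * sizeof(int). An int ** would instead step by the size of a single pointer, which is why the two types are not interchangeable.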

Array variables are supported, but are read-only by default. For example:

int   a[100][200];
In this case, reading the variable 'a' returns a pointer of type int (*)[200] that points to the first element of the array &a[0][0]. Trying to modify 'a' results in an error. This is because SWIG does not know how to copy data from the target language into the array. To work around this limitation, you may want to write a few simple assist functions like this:
%inline %{
void a_set(int i, int j, int val) {
   a[i][j] = val;
}
int a_get(int i, int j) {
   return a[i][j];
}
%}
To dynamically create arrays of various sizes and shapes, it may be useful to write some helper functions in your interface. For example:
// Some array helpers
%inline %{
  /* Create any sort of [size] array */
  int *int_array(int size) {
     return (int *) malloc(size*sizeof(int));
  }
  /* Create a two-dimension array [size][10] */
  int (*int_array_10(int size))[10] {
     return (int (*)[10]) malloc(size*10*sizeof(int));
  }
%}
Arrays of char are handled as a special case by SWIG. In this case, strings in the target language can be stored in the array. For example, if you have a declaration like this,
char pathname[256];
SWIG generates functions for both getting and setting the value that are equivalent to the following code:
char *pathname_get() {
   return pathname;
}
void pathname_set(char *value) {
   strncpy(pathname,value,256);
}
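
One detail worth knowing about a setter of this form is that strncpy() does not write a terminating NULL when the source string is as long as, or longer than, the array. The sketch below demonstrates this with a deliberately small 8-byte array. The second function, pathname_set_terminated(), is a hypothetical variant and not something SWIG generates.

```c
#include <string.h>

char pathname[8];   /* deliberately small for the demonstration */

/* Setter in the style shown above: may leave the array unterminated. */
void pathname_set(const char *value) {
    strncpy(pathname, value, sizeof pathname);
}

/* Hypothetical variant that always leaves room for the terminator. */
void pathname_set_terminated(const char *value) {
    strncpy(pathname, value, sizeof pathname - 1);
    pathname[sizeof pathname - 1] = '\0';
}
```

If over-long strings are possible in your application, an assist function along the lines of the second variant can be supplied through %inline.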

Creating read-only variables

A read-only variable can be created by using the %readonly directive as shown :

// File : interface.i

int     a;                      // Can read/write
%readonly
int     b,c,d;                  // Read only variables
%readwrite
double  x,y;                    // read/write

The %readonly directive enables read-only mode until it is explicitly disabled using the %readwrite directive.

Read-only variables are also created when declarations are declared as const. For example:

const int foo;               /* Read only variable */
char * const version="1.0";  /* Read only variable */

Renaming and ignoring declarations

Normally, the name of a C function is used as the name of the command added to the target scripting language. However, this name may conflict with a keyword or already existing function in the scripting language. To resolve a name conflict, use the %name directive as shown :

// interface.i

%name(my_print) extern void print(char *);
%name(foo) extern int a_really_long_and_annoying_name;

SWIG still calls the correct C function, but in this case the function print() will really be called "my_print()" in the target language.

A more powerful renaming operation can be performed with the %rename directive:

%rename(newname) oldname;

%rename applies a renaming operation to all future occurrences of a name. The renaming applies to functions, variables, class and structure names, member functions, and member data. For example, if you had two-dozen C++ classes, all with a member function named `print' (which is a keyword in Python), you could rename them all to `output' by specifying :

%rename(output) print; // Rename all `print' functions to `output'

SWIG does not currently perform any checks to see if the functions it wraps are already defined in the target scripting language. However, if you are careful about namespaces and your use of modules, you can usually avoid these problems.

Closely related to %rename is the %ignore directive. %ignore instructs SWIG to ignore declarations that match a given identifier. For example:

%ignore print;         // Ignore all declarations named print
%ignore _HAVE_FOO_H;   // Ignore an include guard constant
...
%include "foo.h"       // Grab a header file
...
One use of %ignore is to selectively remove certain declarations from a header file without having to add conditional compilation to the header. However, it should be stressed that this only works for simple declarations. If you need to remove a whole section of problematic code, the SWIG preprocessor should be used instead.

More powerful variants of %rename and %ignore directives can be used to help wrap C++ overloaded functions and methods. This is described a little later in the C++ section.

Default/optional arguments

SWIG supports default arguments in both C and C++ code. For example:

int plot(double x, double y, int color=WHITE);

In this case, SWIG generates wrapper code where the default arguments are optional. For example, this function could be used in Tcl as follows :

% plot -3.4 7.5 ;# Use default value
% plot -3.4 7.5 10 ;# set color to 10 instead

Although the ANSI C standard does not allow default arguments, default arguments specified in a SWIG interface work with both C and C++.

Pointers to functions and callbacks

Occasionally, a C library may include functions that expect to receive pointers to functions--possibly to serve as callbacks. SWIG provides full support for function pointers provided that the callback functions are defined in C and not in the target language. For example, consider a function like this:
int binary_op(int a, int b, int (*op)(int,int));

When you first wrap something like this into an extension module, you may find the function to be impossible to use. For instance, in Python:

>>> def add(x,y):
...     return x+y
...
>>> binary_op(3,4,add)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: Type error. Expected _p_f_int_int__int
>>>
The reason for this error is that SWIG doesn't know how to map a scripting language function into a C callback. However, existing C functions can be used as arguments provided you install them as constants. One way to do this is to use the %constant directive like this:
/* Function with a callback */
int binary_op(int a, int b, int (*op)(int,int));

/* Some callback functions */
%constant int add(int,int);
%constant int sub(int,int);
%constant int mul(int,int);
In this case, add, sub, and mul become function pointer constants in the target scripting language. This allows you to use them as follows:
>>> binary_op(3,4,add)
7
>>> binary_op(3,4,mul)
12
>>>
Unfortunately, by declaring the callback functions as constants, they are no longer accessible as functions. For example:
>>> add(3,4)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object is not callable: '_ff020efc_p_f_int_int__int'
>>>
If you want to make a function available as both a callback function and a function, you can use the %callback and %nocallback directives like this:
/* Function with a callback */
int binary_op(int a, int b, int (*op)(int,int));

/* Some callback functions */
%callback("%s_cb")
int add(int,int);
int sub(int,int);
int mul(int,int);
%nocallback
The argument to %callback is a printf-style format string that specifies the naming convention for the callback constants (%s gets replaced by the function name). The callback mode remains in effect until it is explicitly disabled using %nocallback. When you do this, the interface now works as follows:
>>> binary_op(3,4,add_cb)
7
>>> binary_op(3,4,mul_cb)
12
>>> add(3,4)
7
>>> mul(3,4)
12
Notice that when the function is used as a callback, a special name such as add_cb is used instead. To call the function normally, just use the original function name such as add().

SWIG provides a number of extensions to standard C printf formatting that may be useful in this context. For instance, the following variation installs the callbacks as all upper-case constants such as ADD, SUB, and MUL:

/* Some callback functions */
%callback("%(upper)s")
int add(int,int);
int sub(int,int);
int mul(int,int);
%nocallback
A format string of "%(lower)s" converts all characters to lower-case. A string of "%(title)s" capitalizes the first character and converts the rest to lower case.

And now, a final note about function pointer support. Although SWIG does not normally allow callback functions to be written in the target language, this can be accomplished with the use of typemaps and other advanced SWIG features. This is described in a later chapter.
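For reference, the C side of the examples in this section might look like the following minimal sketch. The text only gives the prototypes, so the function bodies shown here are assumptions.

```c
/* Apply the callback 'op' to a and b */
int binary_op(int a, int b, int (*op)(int, int)) {
    return op(a, b);
}

/* Callback functions with the signature int (*)(int,int) */
int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }
int mul(int a, int b) { return a * b; }
```

With these definitions, binary_op(3,4,add) returns 7 and binary_op(3,4,mul) returns 12, which matches the Python sessions shown above.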

Structures, unions, and object oriented C programming

This section describes the behavior of SWIG when processing ANSI C. Extensions to handle C++ are described in the next section.

If SWIG encounters the definition of a structure or union, it will create a set of accessor functions for you. Although SWIG does not need structure definitions to build an interface, providing definitions makes it possible to access structure members. The accessor functions generated by SWIG simply take a pointer to an object and allow access to an individual member. For example, the declaration :

struct Vector {
	double x,y,z;
};

gets transformed into the following set of accessor functions :

double Vector_x_get(struct Vector *obj) {
	return obj->x;
}
double Vector_y_get(struct Vector *obj) { 
	return obj->y;
}
double Vector_z_get(struct Vector *obj) { 
	return obj->z;
}
void Vector_x_set(struct Vector *obj, double value) {
	obj->x = value;
}
void Vector_y_set(struct Vector *obj, double value) {
	obj->y = value;
}
void Vector_z_set(struct Vector *obj, double value) {
	obj->z = value;
}
In addition, SWIG creates a pair of default constructor and destructor functions if none are defined in the interface. For example:
struct Vector *new_Vector() {
    return (struct Vector *) calloc(1,sizeof(struct Vector));
}
void delete_Vector(struct Vector *obj) {
    free(obj);
}
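To make the calling sequence concrete, the following C sketch strings the constructor, one pair of accessors, and the destructor together in the order that a script would trigger them. The function roundtrip_x is a hypothetical driver written for this example. The other functions simply repeat the code shown above.

```c
#include <stdlib.h>

struct Vector {
    double x, y, z;
};

struct Vector *new_Vector(void) {
    return (struct Vector *) calloc(1, sizeof(struct Vector));
}
void delete_Vector(struct Vector *obj) {
    free(obj);
}
double Vector_x_get(struct Vector *obj) {
    return obj->x;
}
void Vector_x_set(struct Vector *obj, double value) {
    obj->x = value;
}

/* Hypothetical driver: create an object, set x, read it back, destroy it */
double roundtrip_x(double value) {
    struct Vector *v = new_Vector();
    double result;
    if (!v) return 0.0;          /* allocation failed */
    Vector_x_set(v, value);
    result = Vector_x_get(v);
    delete_Vector(v);
    return result;
}
```

Everything the target language does with a Vector goes through a pointer in this manner; the structure itself is never copied into the interpreter.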

Typedef and structures

SWIG supports the following construct which is quite common in C programs :

typedef struct {
	double x,y,z;
} Vector;

When encountered, SWIG assumes that the name of the object is `Vector' and creates accessor functions like before. The only difference is that the use of typedef allows SWIG to drop the struct keyword on its generated code. For example:
double Vector_x_get(Vector *obj) {
	return obj->x;
}
If two different names are used like this :

typedef struct vector_struct {
	double x,y,z;
} Vector;

the name Vector is used instead of vector_struct. If declarations defined later in the interface use the type struct vector_struct, SWIG knows that this is the same as Vector and it generates code appropriately.

Character strings and structures

Structures involving character strings require some care. SWIG assumes that all members of type char * have been dynamically allocated using malloc() and that they are NULL-terminated ASCII strings. When such a member is modified, the previous contents will be released, and the new contents allocated. For example :

%module mymodule
...
struct Foo {
	char *name;
	...
};

This results in the following accessor functions :

char *Foo_name_get(Foo *obj) {
	return obj->name;
}

char *Foo_name_set(Foo *obj, char *c) {
	if (obj->name) free(obj->name);
	obj->name = (char *) malloc(strlen(c)+1);
	strcpy(obj->name,c);
	return obj->name;
}

If this behavior differs from what you need in your applications, the SWIG "memberin" typemap can be used to change it. See the typemaps chapter for details.
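The practical consequence of the default behavior is that the structure always owns a private copy of the string. The following C sketch reproduces the setter shown above so that this can be tried outside of SWIG. The typedef for Foo and the const qualifier on the argument are assumptions added to make the example compile as standalone C.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Foo {
    char *name;
} Foo;

/* Release the old string and store a freshly allocated copy of c.
   Returns NULL (leaving name unset) if the allocation fails. */
char *Foo_name_set(Foo *obj, const char *c) {
    if (obj->name) free(obj->name);
    obj->name = (char *) malloc(strlen(c) + 1);
    if (obj->name) strcpy(obj->name, c);
    return obj->name;
}
```

Because the old value is passed to free(), a name member that was initialized to point at a string literal or a static buffer results in undefined behavior the first time it is set. Initialize such members to NULL or to memory obtained from malloc().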

Note: If the -c++ option is used, new and delete are used to perform memory allocation.

Array members

Arrays may appear as the members of structures, but they will be read-only. SWIG will write an accessor function that returns the pointer to the first element of the array, but will not write a function to change the contents of the array itself. When this situation is detected, SWIG generates a warning message such as the following :

interface.i:116. Warning. Array member will be read-only

To eliminate the warning message, typemaps can be used, but this is discussed in a later chapter. In many cases, the warning message is harmless.

C constructors and destructors

When wrapping structures, it is generally useful to have a mechanism for creating and destroying objects. If you don't do anything, SWIG will automatically generate functions for creating and destroying objects using malloc() and free(). Note: the use of malloc() only applies when SWIG is used on C code (i.e., when the -c++ option is not supplied on the command line). C++ is handled differently.

If you don't want SWIG to generate constructors and destructors, you can use the %nodefault pragma or the -no_default command line option. For example:

swig -no_default example.i 

or

%module foo
...
%pragma nodefault        // Don't create default constructors/destructors
... declarations ...
%pragma makedefault      // Reenable default constructors/destructors

Compatibility note: Prior to SWIG-1.3.7, SWIG did not generate default constructors or destructors unless you explicitly turned them on using -make_default. However, it appears that most users want to have constructor and destructor functions so it has now been enabled as the default behavior.

Adding member functions to C structures

Many scripting languages provide a mechanism for creating classes and supporting object oriented programming. From a C standpoint, object oriented programming really just boils down to the process of attaching functions to structures. These functions normally operate on an instance of the structure (or object) in some way or another. Although there is a natural mapping of C++ to such a scheme, there is no direct mechanism for utilizing it with C code. However, SWIG provides a special %addmethods directive that makes it possible to attach methods to C structures for purposes of building an object oriented scripting language interface. Suppose you have a C header file with the following declaration :

/* file : vector.h */
...
typedef struct {
	double x,y,z;
} Vector;

You can make a Vector look a lot like a class by writing a SWIG interface like this:

// file : vector.i
%module mymodule
%{
#include "vector.h"
%}

%include vector.h            // Just grab original C header file
%addmethods Vector {         // Attach these functions to struct Vector
	Vector(double x, double y, double z) {
		Vector *v;
		v = (Vector *) malloc(sizeof(Vector));
		v->x = x;
		v->y = y;
		v->z = z;
		return v;
	}
	~Vector() {
		free(self);
	}
	double magnitude() {
		return sqrt(self->x*self->x+self->y*self->y+self->z*self->z);
	}
	void print() {
		printf("Vector [%g, %g, %g]\n", self->x,self->y,self->z);
	}
};

Now, when used with shadow classes in Python, you can do things like this :

>>> v = Vector(3,4,0)                 # Create a new vector
>>> print v.magnitude()                # Print magnitude
5.0
>>> v.print()                  # Print it out
Vector [3, 4, 0]
>>> del v                      # Destroy it

The %addmethods directive can also be used inside the definition of the Vector structure. For example:

// file : vector.i
%module mymodule
%{
#include "vector.h"
%}

typedef struct {
	double x,y,z;
	%addmethods {
		Vector(double x, double y, double z) { ... }
		~Vector() { ... }
		...
	}
} Vector;

Finally, %addmethods can be used to access externally written functions provided they follow the naming convention used in this example :

/* File : vector.c */
/* Vector methods */
#include "vector.h"
Vector *new_Vector(double x, double y, double z) {
	Vector *v;
	v = (Vector *) malloc(sizeof(Vector));
	v->x = x;
	v->y = y;
	v->z = z;
	return v;
}
void delete_Vector(Vector *v) {
	free(v);
}

double Vector_magnitude(Vector *v) {
	return sqrt(v->x*v->x+v->y*v->y+v->z*v->z);
}

// File : vector.i
// Interface file
%module mymodule
%{
#include "vector.h"
%}

typedef struct {
	double x,y,z;
	%addmethods {
		Vector(double,double,double); // This calls new_Vector()
		~Vector();                    // This calls delete_Vector()
		double magnitude();           // This will call Vector_magnitude()
		...
	}
} Vector;
A little known feature of the %addmethods directive is that it can also be used to add synthesized attributes or to modify the behavior of existing data attributes. For example, suppose you wanted to make magnitude a read-only attribute of Vector instead of a method. To do this, you might write some code like this:
// Add a new attribute to Vector
%addmethods Vector {
    const double magnitude;
}
// Now supply the implementation of the Vector_magnitude_get function
%{
const double Vector_magnitude_get(Vector *v) {
  return (const double) sqrt(v->x*v->x+v->y*v->y+v->z*v->z);
}
%}

Now, for all practical purposes, magnitude will appear like an attribute of the object.

A similar technique can also be used to work with problematic data members. For example, consider this interface:

struct Person {
   char name[50];
   ...
};
By default, the name attribute is read-only because SWIG does not normally know how to modify arrays. However, you can rewrite the interface as follows to change this:
struct Person {
    %addmethods {
       char *name;
    }
...
};

// Specific implementation of set/get functions
%{
char *Person_name_get(Person *p) {
   return p->name;
}
void Person_name_set(Person *p, char *val) {
   strncpy(p->name,val,50);
}
%}
Finally, it should be stressed that even though %addmethods can be used to add new data members, these new members can not require the allocation of additional storage in the object (e.g., their values must be entirely synthesized from existing attributes of the structure).

Nested structures

Occasionally, a C program will involve structures like this :

typedef struct Object {
	int objtype;
	union {
		int 	ivalue;
		double	dvalue;
		char	*strvalue;
		void	*ptrvalue;
	} intRep;
} Object;

When SWIG encounters this, it performs a structure splitting operation that transforms the declaration into the equivalent of the following:

typedef union {
	int 		ivalue;
	double		dvalue;
	char		*strvalue;
	void		*ptrvalue;
} Object_intRep;

typedef struct Object {
	int objtype;
	Object_intRep intRep;
} Object;

SWIG will then create an Object_intRep structure for use inside the interface file. Accessor functions will be created for both structures. In this case, functions like this would be created :

Object_intRep *Object_intRep_get(Object *o) {
	return (Object_intRep *) &o->intRep;
}
int Object_intRep_ivalue_get(Object_intRep *o) {
	return o->ivalue;
}
int Object_intRep_ivalue_set(Object_intRep *o, int value) {
	return (o->ivalue = value);
}
double Object_intRep_dvalue_get(Object_intRep *o) {
	return o->dvalue;
}
... etc ...

Although this process is a little hairy, it works like you would expect in the target scripting language--especially when shadow classes are used. For instance, in Perl:

# Perl5 script for accessing nested member
$o = CreateObject();                    # Create an object somehow
$o->{intRep}->{ivalue} = 7;             # Change value of o.intRep.ivalue

If you have a lot of nested structure declarations, it is advisable to double-check them after running SWIG. Although there is a good chance that they will work, you may have to modify the interface file in certain cases.
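The chained access in the Perl example corresponds to two accessor calls at the C level. The following sketch uses the split form of the declarations together with accessors modeled on the ones listed above. It is a simplified illustration of the idea and not the code SWIG actually emits.

```c
/* The nested union after structure splitting */
typedef union {
    int     ivalue;
    double  dvalue;
    char   *strvalue;
    void   *ptrvalue;
} Object_intRep;

typedef struct Object {
    int objtype;
    Object_intRep intRep;
} Object;

/* Return a pointer to the nested member rather than a copy */
Object_intRep *Object_intRep_get(Object *o) {
    return &o->intRep;
}
int Object_intRep_ivalue_get(Object_intRep *o) {
    return o->ivalue;
}
int Object_intRep_ivalue_set(Object_intRep *o, int value) {
    return (o->ivalue = value);
}
```

Since Object_intRep_get() returns a pointer into the enclosing object, a call such as Object_intRep_ivalue_set(Object_intRep_get(o), 7) modifies o itself.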

Other things to note about structure wrapping

SWIG doesn't care if the declaration of a structure in a .i file exactly matches that used in the underlying C code (except in the case of nested structures). For this reason, there are no problems omitting problematic members or simply omitting the structure definition altogether. If you are happy passing pointers around, this can be done without ever giving SWIG a structure definition.

Starting with SWIG1.3, a number of improvements have been made to SWIG's code generator. Specifically, even though structure access has been described in terms of high-level accessor functions such as this,

double Vector_x_get(Vector *v) {
   return v->x;
}
most of the generated code is actually inlined directly into wrapper functions. Therefore, no function Vector_x_get() actually exists in the generated wrapper file. For example, when creating a Tcl module, the following function is generated instead:
static int
_wrap_Vector_x_get(ClientData clientData, Tcl_Interp *interp, int objc, Tcl_Obj *CONST objv[]) {
    struct Vector *arg0 ;
    double result ;
    
    if (SWIG_GetArgs(interp, objc, objv,"p:Vector_x_get self ",&arg0, SWIGTYPE_p_Vector) == TCL_ERROR) return TCL_ERROR;
    result = (double ) (arg0->x);
    Tcl_SetObjResult(interp,Tcl_NewDoubleObj((double) result));
    return TCL_OK;
}
The only exceptions to this rule are methods defined with %addmethods. In this case, the added code is contained in a separate function.

Finally, it is important to note that most language modules may choose to build a more advanced interface. Although you may never use the low-level interface described here, most of SWIG's language modules use it in some way or another.

C++ support

SWIG's support for C++ is an extension of the support for C functions, variables, and structures. However, because of its sheer complexity and the fact that C++ is nearly impossible to integrate with itself let alone any other language, SWIG only provides support for a subset of C++ features.

This section describes SWIG's low-level access to C++ declarations. In many instances, this low-level interface may be hidden by shadow classes or an alternative calling mechanism (this is usually language dependent and is described in detail in later chapters).

Supported C++ features

SWIG currently supports the following C++ features :

The following C++ features are not currently supported :

SWIG's C++ support has gradually been improved over the years so some of these limitations may be lifted in a future release. However, we make no promises.

C++ example

The following code shows a SWIG interface file for a simple C++ class.

%module list
%{
#include "list.h"
%}

// Very simple C++ example for linked list

class List {
public:
  List();
  ~List();
  int  search(char *value);
  void insert(char *);
  void remove(char *);
  char *get(int n);
  int  length;
  static void print(List *l);
};

When compiling C++ code, it is critical that SWIG be called with the `-c++' option. This changes the way a number of critical features such as memory management are handled. It also enables the recognition of C++ keywords. Without the -c++ flag, SWIG will either issue a warning or a large number of syntax errors if it encounters C++ code in an interface file.

Constructors and destructors

C++ constructors and destructors are translated into accessor functions such as the following :

List * new_List(void) {
	return new List;
}
void delete_List(List *l) {
	delete l;
}

If a C++ class does not define any public constructors or destructors, SWIG will automatically create a default constructor or destructor. However, there are a few rules that define this behavior: SWIG should never generate a constructor or destructor for a class in which it is illegal to do so. However, if it is necessary to disable the default constructor/destructor creation, the %nodefault directive can be used:
%nodefault;   // Disable creation of constructor/destructor
class Foo {
...
};
%makedefault;
Compatibility Note: The generation of default constructors/destructors was made the default behavior in SWIG 1.3.7. This may break certain older modules, but the old behavior can be easily restored using %nodefault or the -nodefault command line option. Furthermore, in order for SWIG to properly generate (or not generate) default constructors, it must be able to gather information from both the private and protected sections (specifically, it needs to know if a private or protected constructor/destructor is defined). In older versions of SWIG, it was fairly common to simply remove or comment out the private and protected sections of a class (due to parsing problems). However, this removal may now cause SWIG to erroneously generate constructors for classes that define a constructor in those sections. Consider restoring those sections in the interface or using %nodefault to fix the problem.

Member functions

All member functions are roughly translated into accessor functions like this :

int List_search(List *obj, char *value) {
	return obj->search(value);
}

This translation is the same even if the member function has been declared as virtual.

It should be noted that SWIG does not actually create a C accessor function in the code it generates. Instead, member access such as obj->search(value) is directly inlined into the generated wrapper functions. However, the name and calling convention of the wrappers match the accessor function prototype described above.

Static members

Static member functions are called directly without making any special transformations. For example, the static member function print(List *l) directly invokes List::print(List *l) in the generated wrapper code.

Member data

Member data is handled in exactly the same manner as for C structures. A pair of accessor functions will be created. For example :

int List_length_get(List *obj) {
	return obj->length;
}
int List_length_set(List *obj, int value) {
	obj->length = value;
	return value;
}

A read-only member can be created using the %readonly and %readwrite directives. For example, we probably wouldn't want the user to change the length of a list so we could do the following to make the value available, but read-only.

class List {
public:
...
%readonly
	int length;
%readwrite
...
};
Similarly, all data attributes declared as const are wrapped as read-only members.

Protection

SWIG can only wrap class members that are declared public. Anything specified in a private or protected section will simply be ignored (although the internal code generator sometimes looks at the contents of the private and protected sections so that it can properly generate code for default constructors and destructors).

By default, members of a class definition are assumed to be private until you explicitly give a `public:' declaration (This is the same convention used by C++).

Enums and constants

Enumerations and constants placed in a class definition are mapped into constants with the classname as a prefix. For example :

class Swig {
public:
	enum {ALE, LAGER, PORTER, STOUT};
};

Generates the following set of constants in the target scripting language :

Swig_ALE = Swig::ALE
Swig_LAGER = Swig::LAGER
Swig_PORTER = Swig::PORTER
Swig_STOUT = Swig::STOUT

Members declared as const are wrapped as read-only members and do not create constants.

References

C++ references are supported, but SWIG handles them internally as pointers. For example, a declaration like this :

class Foo {
public:
	double bar(double &a);
};

is accessed using a function similar to this:

double Foo_bar(Foo *obj, double *a) {
	return obj->bar(*a);
}
Functions that return a reference are remapped to return a pointer instead. For example:
class Bar {
public:
     double &spam();
};
Generates code like this:
double *Bar_spam(Bar *obj) {
   double &result = obj->spam();
   return &result;
}
Don't return references to objects allocated as local variables on the stack. SWIG doesn't make a copy of the objects so this will probably cause your program to crash.

Inheritance

SWIG supports C++ public inheritance of classes and allows both single and multiple inheritance. The SWIG type-checker knows about the relationship between base and derived classes and allows pointers to any object of a derived class to be used in functions of a base class. The type-checker properly casts pointer values and is safe to use with multiple inheritance.

SWIG does not support private or protected inheritance (it is parsed, but it has no effect on the generated code). Note: private and protected inheritance do not define an "isa" relationship between classes so it has no effect on type-checking either.

The following example shows how SWIG handles inheritance. For clarity, the full C++ code has been omitted.

// shapes.i
%module shapes
%{
#include "shapes.h"
%}

class Shape {
public:
        double x,y;
	virtual double area() = 0;
	virtual double perimeter() = 0;
	void    set_location(double x, double y);
};
class Circle : public Shape {
public:
	Circle(double radius);
	~Circle();
	double area();
	double perimeter();
};
class Square : public Shape {
public:
	Square(double size);
	~Square();
	double area();
	double perimeter();
};

When wrapped into Python, we can now perform the following operations :

beazley@slack% python
>>> import shapes
>>> circle = shapes.new_Circle(7)
>>> square = shapes.new_Square(10)
>>> print shapes.Circle_area(circle)
153.93804004599999757
>>> print shapes.Shape_area(circle)
153.93804004599999757
>>> print shapes.Shape_area(square)
100.00000000000000000
>>> shapes.Shape_set_location(square,2,-3)
>>> print shapes.Shape_perimeter(square)
40.00000000000000000
>>>
In this example, Circle and Square objects have been created. Member functions can be invoked on each object by making calls to Circle_area, Square_area, and so on. However, the same results can be accomplished by simply using the Shape_area function on either object.

One important point concerning inheritance is that the low-level accessor functions are only generated for classes in which they are actually declared. For instance, in the above example, the method set_location() is only accessible as Shape_set_location() and not as Circle_set_location() or Square_set_location(). Of course, the Shape_set_location() function will accept any kind of object derived from Shape. Similarly, accessor functions for the attributes x and y are generated as Shape_x_get(), Shape_x_set(), Shape_y_get(), and Shape_y_set(). Functions such as Circle_x_get() are not available--instead you should use Shape_x_get().

Although the low-level C-like interface is functional, most language modules also produce a higher level OO interface using a technique known as shadow classing. This approach is described shortly and can be used to provide a more natural C++ interface.

Compatibility Note: Starting in version 1.3.7, SWIG only generates low-level accessor wrappers for the declarations that are actually defined in each class. This differs from SWIG1.1 which used to inherit all of the declarations defined in base classes and regenerate specialized accessor functions such as Circle_x_get(), Square_x_get(), Circle_set_location(), and Square_set_location(). This old behavior results in huge amounts of replicated code for large class hierarchies and makes it awkward to build applications spread across multiple modules (since accessor functions are duplicated in every single module). It is also unnecessary to have such wrappers when advanced features like shadow-classing are used. Future versions of SWIG may apply further optimizations such as not regenerating wrapper functions for virtual members that are already defined in a base class.

Renaming

C++ member functions and data can be renamed with the %name directive. The %name directive only replaces the member function name. For example :

class List {
public:
  List();
%name(ListSize) List(int maxsize);
  ~List();
  int  search(char *value); 
%name(find)    void insert(char *); 
%name(delete)  void remove(char *); 
  char *get(int n);
  int  length;
static void print(List *l);
};

This will create the functions List_find, List_delete, and a function named new_ListSize for the overloaded constructor.

The %name directive can be applied to all members including constructors, destructors, static functions, data members, and enumeration values.

The class name prefix can also be changed by specifying

%name(newname) class List {
...
};
Although the %name() directive can be used to help deal with overloaded methods, it really doesn't work very well. Keep reading for a better solution.

Wrapping Overloaded Functions and Methods

This section describes the problem of wrapping overloaded C++ functions and methods. This has long been a limitation of SWIG that has only recently been addressed (primarily because we couldn't quite figure out how to do it without causing a head-explosion or serious reliability problems). However, in order to understand the reasoning behind the current solution, it is important to better understand the problem.

In C++, functions and methods can be overloaded by declaring them with different type signatures. For example:

void foo(int);
void foo(double);
void foo(Bar *b, Spam *s, int );
Later, when a call to function foo() is made, the determination of which function to invoke is made by looking at the types of the arguments. For example:
int x;
double y;
Bar *b;
Spam *s;
int z;
...
foo(x);        // Calls foo(int)
foo(y);        // Calls foo(double)
foo(b,s,z);    // Calls foo(Bar *, Spam *, int)
It is important to note that the selection of the overloaded method or function is made by the C++ compiler and occurs at compile time. It does not occur as your program runs.

Internally to the C++ compiler, overloaded functions are mapped to unique identifiers using a name-mangling technique where the arguments are used to create a unique type signature that is appended to the name. This produces three unique function names that might look like this:

void foo__Fi(int);
void foo__Fd(double);
void foo__FP3BarP4Spami(Bar *, Spam *, int);
The implementation of overloaded methods in C++ is difficult to translate directly to a scripting language environment because it relies on static type-checking and compile-time binding of methods--neither of which map to the dynamically typed environment of a scripting language interpreter. For example, in Python, it is simply impossible to define three entirely different versions of a function with exactly the same name in the same scope. The repeated definitions simply overwrite previous definitions.

Therefore, to solve the overloading problem, let's first look at several approaches that have been proposed as solutions, but which are NOT used to solve the overloading problem in SWIG.

Alas, what to do about overloading?

Starting with SWIG-1.3.7, a very simple enhancement has been added to the %rename directive to help disambiguate overloaded functions and methods. Normally, the %rename directive is used to rename a declaration everywhere in an interface file. For example, if you write this,

%rename(foo) bar;
all occurrences of "bar" will be renamed to "foo" (this feature was described a little earlier in this chapter in the section "Renaming Declarations"). By itself, this doesn't do anything to help fix overloaded methods. However, the %rename directive can now be parameterized as shown in this example:
/* Forward renaming declarations */
%rename(foo_i) foo(int); 
%rename(foo_d) foo(double);
...
void foo(int);           // Becomes 'foo_i'
void foo(char *c);       // Stays 'foo' (not renamed)

class Spam {
public:
   void foo(int);      // Becomes 'foo_i'
   void foo(double);   // Becomes 'foo_d'
   ...
};
Since the %rename declaration is used to declare a renaming in advance, it can be placed at the start of an interface file. This makes it possible to apply a consistent name resolution without having to modify header files. For example:
%module foo

/* Rename these overloaded functions */
%rename(foo_i) foo(int); 
%rename(foo_d) foo(double);

%include "header.h"
When used in this simple form, the renaming is applied to all global functions and member functions that match the prototype. If you only want the renaming to apply to a certain scope, the C++ scope resolution operator (::) can be used. For example:
%rename(foo_i) ::foo(int);      // Only rename foo(int) in the global scope.
                                // (will not rename class members)

%rename(foo_i) Spam::foo(int);  // Only rename foo(int) in class Spam
When a renaming operator is applied to a class as in Spam::foo(int), it is applied to that class and all derived classes. This can be used to apply a consistent renaming across an entire class hierarchy with only a few declarations. For example:
%rename(foo_i) Spam::foo(int);
%rename(foo_d) Spam::foo(double);

class Spam {
public:
   virtual void foo(int);      // Renamed to foo_i
   virtual void foo(double);   // Renamed to foo_d
   ...
};

class Bar : public Spam {
public:
   virtual void foo(int);      // Renamed to foo_i
   virtual void foo(double);   // Renamed to foo_d
...
};

class Grok : public Bar {
public:
   virtual void foo(int);      // Renamed to foo_i
   virtual void foo(double);   // Renamed to foo_d
...
};
A final special form of %rename can be used to apply a renaming just to class members:
%rename(foo_i) *::foo(int);   // Only rename foo(int) if it appears in a class.
Note: the *:: syntax is non-standard C++, but the '*' is meant to be a wildcard that matches any class name (we couldn't think of a better alternative so if you have a better idea, send email to swig-dev@cs.uchicago.edu).

Although the %rename approach does not automatically solve the overloading problem for you (you have to supply a name), SWIG's error messages have been improved to help. For example, consider this interface file:

%module foo

class Spam {
public:
   void foo(int);
   void foo(double);
   void foo(Bar *, Spam *, int);
};
If you run SWIG on this file, you will get the following error messages:
foo.i:6. Overloaded declaration ignored.  Spam::foo(double )
foo.i:5. Previous declaration is Spam::foo(int )
foo.i:7. Overloaded declaration ignored.  Spam::foo(Bar *,Spam *,int )
foo.i:5. Previous declaration is Spam::foo(int )
The error messages indicate the problematic functions along with their type signature. In addition, the previous definition is supplied. Therefore, you can just look at these errors and decide how you want to handle the overloaded functions. For example:
%module foo
%rename(foo_d)         Spam::foo(double);               // name foo_d
%rename(foo_barspam)   Spam::foo(Bar *, Spam *, int);   // name foo_barspam
...
class Spam { 
...
};
And again, for a class hierarchy, you may be able to solve all of the problems by just renaming members in the base class--those renamings automatically propagate to all derived classes.
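
In plain C++ terms, giving each overload its own name is roughly comparable to hand-writing one uniquely named forwarding function per signature. The following is only a sketch of that idea: the forwarder names and the string return values are hypothetical, and this is not the code that SWIG generates.

```cpp
#include <cassert>
#include <string>

struct Bar {};

class Spam {
public:
    std::string foo(int)                { return "int"; }
    std::string foo(double)             { return "double"; }
    std::string foo(Bar *, Spam *, int) { return "barspam"; }
};

// Hypothetical forwarders (not SWIG output): one unique name per overload.
// Each call is unambiguous because the forwarder's parameter types
// match exactly one overload of Spam::foo.
std::string Spam_foo(Spam *self, int x)      { return self->foo(x); }
std::string Spam_foo_d(Spam *self, double x) { return self->foo(x); }
std::string Spam_foo_barspam(Spam *self, Bar *b, Spam *s, int n) {
    return self->foo(b, s, n);
}

// Calls every forwarder once and checks which overload it reached.
void demo_renamed_overloads() {
    Spam s;
    Bar b;
    assert(Spam_foo(&s, 1) == "int");
    assert(Spam_foo_d(&s, 1.0) == "double");
    assert(Spam_foo_barspam(&s, &b, &s, 1) == "barspam");
}
```

Once every signature has a distinct name, a dynamically typed caller no longer needs to know anything about C++ overload resolution.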

Another way to resolve overloaded methods is to simply eliminate conflicting definitions. An easy way to do this is to use the %ignore directive. %ignore works exactly like %rename except that it forces a declaration to disappear. For example:

%ignore foo(double);          // Ignore all foo(double)
%ignore Spam::foo;            // Ignore foo in class Spam
%ignore Spam::foo(double);    // Ignore foo(double) in class Spam
%ignore *::foo(double);       // Ignore foo(double) in all classes
When applied to a base class, %ignore forces all definitions in derived classes to disappear. For example, %ignore Spam::foo(double) will eliminate foo(double) in Spam and all classes derived from Spam.

A few final notes about the enhanced %rename directive and %ignore:

Adding new methods

New methods can be added to a class using the %addmethods directive. This directive is primarily used in conjunction with shadow classes to add additional functionality to an existing class. For example :

%module vector
%{
#include "vector.h"
%}

class Vector {
public:
	double x,y,z;
	Vector();
	~Vector();
	... bunch of C++ methods ...
	%addmethods {
		char *__str__() {
			static char temp[256];
			sprintf(temp,"[ %g, %g, %g ]", self->x,self->y,self->z);
			return &temp[0];
		}
	}
};

This code adds a __str__ method to our class for producing a string representation of the object. In Python, such a method would allow us to print the value of an object using the print command.

>>>
>>> v = Vector();
>>> v.x = 3
>>> v.y = 4
>>> v.z = 0
>>> print(v)
[ 3, 4, 0 ]
>>>

The %addmethods directive follows all of the same conventions as its use with C structures.
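
One way to picture an added method is as an ordinary function that receives the object through an explicit pointer argument. The sketch below assumes a stripped-down Vector with only its three data members; the function name Vector_str and the demo helper are hypothetical and do not reflect SWIG's generated names.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Stripped-down stand-in for the Vector class in the text.
struct Vector {
    double x, y, z;
};

// Hypothetical stand-in for the added __str__ method: the object
// arrives as an explicit pointer argument, named self here.
const char *Vector_str(Vector *self) {
    static char temp[256];   // shared buffer, overwritten on every call
    std::snprintf(temp, sizeof(temp), "[ %g, %g, %g ]",
                  self->x, self->y, self->z);
    return temp;
}

// Formats the vector used in the interactive session above.
void demo_vector_str() {
    Vector v = { 3.0, 4.0, 0.0 };
    assert(std::strcmp(Vector_str(&v), "[ 3, 4, 0 ]") == 0);
}
```

Note that %g drops trailing zeros, and that the static buffer means the returned string is only valid until the next call.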

Templates

In all versions of SWIG, template type names may appear anywhere a type is expected in an interface file. For example:
void foo(vector<int> *a, int n);
Starting with SWIG-1.3.7, simple C++ template declarations can also be easily wrapped. For example, consider the following template class declaration:

// File : list.h
template<class T> class List {
private:
    T *data;
    int nitems;
    int maxitems;
public:
    List(int max) {
      data = new T [max];
      nitems = 0;
      maxitems = max;
    }
    ~List() {
      delete [] data;
    };
    void append(T obj) {
      if (nitems < maxitems) {
        data[nitems++] = obj;
      }
    }
    int length() {
      return nitems;
    }
    T get(int n) {
      return data[n];
    }
};
By itself, this template declaration is useless--SWIG simply ignores it because it doesn't know how to generate any code until a definition of T is provided.

To create wrappers for a specific template instantiation, use the %template directive like this:

/* Instantiate a few different versions of the template */
%template(intList) List<int>;
%template(doubleList) List<double>;
The argument to %template() is the name of the instantiation in the target language. Most target languages do not recognize identifiers such as List<int>. Therefore, each instantiation of a template has to be associated with a nicely formatted identifier such as intList or doubleList. Furthermore, due to the details of the underlying implementation, the name you select has to be unused in both C++ and the target scripting language (e.g., the name must not match any existing C++ typename, class name, or declaration name).

Since most C++ compilers are nothing more than glorified preprocessors and C++ purists really hate macros, SWIG internally handles templates by converting them into macros and performing expansions using the preprocessor. Specifically, the %template(intList) List<int> declaration results in a macro expansion that generates the following code (which is then parsed to create the interface):

// Example of how templates are internally expanded by SWIG
%{
// Define a nice name for the instantiation
typedef List<int> intList;
%}
// Provide a simple class definition with types filled in
class intList {
private:
    int *data;
    int nitems;
    int maxitems;
public:
    intList(int max) {
      data = new int [max];
      nitems = 0;
      maxitems = max;
    }
    ~intList() {
      delete [] data;
    };
    void append(int obj) {
      if (nitems < maxitems) {
        data[nitems++] = obj;
      }
    }
    int length() {
      return nitems;
    }
    int get(int n) {
      return data[n];
    }
};
SWIG can also generate wrappers for function templates using a similar technique. For example:
// Function template
template<class T> T max(T a, T b) { return a > b ? a : b; }

// Make some different versions of this function
%template(maxint) max<int>;
%template(maxdouble) max<double>;
In this case, maxint and maxdouble become unique names for specific instantiations of the function.
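
On the C++ side, the closest analogue to naming an instantiation is a typedef for a class template and a thin named function for a function template. The sketch below illustrates this under a few assumptions: the function template is called max_of rather than max to avoid any clash with the standard library, copying of List is disabled for safety, and none of this is the code SWIG itself produces.

```cpp
#include <cassert>

template<class T> class List {
    T *data;
    int nitems;
    int maxitems;
public:
    explicit List(int max) : data(new T[max]), nitems(0), maxitems(max) {}
    ~List() { delete [] data; }
    List(const List &) = delete;             // raw pointer owner: no copies
    List &operator=(const List &) = delete;
    void append(T obj) { if (nitems < maxitems) data[nitems++] = obj; }
    int length() const { return nitems; }
    T get(int n) const { return data[n]; }
};

// Renamed from max (assumption) to stay clear of std::max.
template<class T> T max_of(T a, T b) { return a > b ? a : b; }

// Plain identifiers for specific instantiations.
typedef List<int> intList;
typedef List<double> doubleList;
int maxint(int a, int b)             { return max_of<int>(a, b); }
double maxdouble(double a, double b) { return max_of<double>(a, b); }

// Uses the named instantiations without mentioning any template syntax.
void demo_named_instantiations() {
    intList l(2);
    l.append(5);
    l.append(6);
    l.append(7);                 // silently dropped: the list is full
    assert(l.length() == 2);
    assert(l.get(1) == 6);
    assert(maxint(3, 9) == 9);
}
```

Callers of intList and maxint never see angle brackets, which is the property a target language needs.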

If your goal is to make someone's head explode more than usual, SWIG directives such as %name and %addmethods can be included directly in template definitions. Not only that, since SWIG has the advantage of using the preprocessor for template expansion, standard C preprocessor operators such as # and ## can be applied to template parameters (an obvious oversight of the C++ standard that SWIG now corrects). For example:

// File : list.h
template<class T> class List {
   ...
public:
    List(int max);
    ~List();
    ...
    %name(getitem) T get(int index);
    %addmethods {
        char *__str__() {
            /* Make a string representation */
            ...
        }
        /* Return actual type of template instantiation as a string */
        char *ttype() {
            return #T;
        }
    }
};
In this example, the extra SWIG directives are propagated to every template instantiation.
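
A C++ template cannot apply # to a type parameter, but an ordinary preprocessor macro can, which gives a feel for what macro-based expansion makes possible. The following sketch uses a hypothetical DEFINE_LIST macro (not SWIG's internal one), and the generated class is reduced to a single stored item to keep it short.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical macro (not SWIG's internal one): expands to a class whose
// element type is TYPE and whose ttype() returns TYPE spelled as a string.
#define DEFINE_LIST(NAME, TYPE)                          \
    class NAME {                                         \
        TYPE item;                                       \
    public:                                              \
        explicit NAME(TYPE v) : item(v) {}               \
        TYPE get() const { return item; }                \
        const char *ttype() const { return #TYPE; }      \
    };

DEFINE_LIST(intList, int)
DEFINE_LIST(doubleList, double)

// Checks that the stringized parameter matches the type that was passed.
void demo_stringized_type() {
    intList a(3);
    doubleList d(1.5);
    assert(a.get() == 3);
    assert(std::strcmp(a.ttype(), "int") == 0);
    assert(std::strcmp(d.ttype(), "double") == 0);
}
```

The # operator works here because TYPE is substituted as text before the compiler ever sees a type.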

In addition, the %addmethods directive can be used to add additional methods to a specific instantiation. For example:

%template(intList) List<int>;

%addmethods intList {
    void blah() {
          printf("Hey, I'm an intList!\n");
    }
};
Needless to say, SWIG's template support provides plenty of opportunities to break the universe. That said, an important final point to note is that SWIG performs no extensive error checking of templates! Specifically, SWIG does not perform type checking nor does it check to see if the actual contents of the template declaration make any sense. Since the C++ compiler (or is it a preprocessor?) will definitely check this when it compiles the resulting wrapper file, there is no practical reason for SWIG to duplicate this functionality (besides, none of the SWIG developers are masochistic enough to want to implement this).

Pointers to Members

Starting with SWIG-1.3.7, there is limited parsing support for pointers to C++ class members. For example:
double do_op(Object *o, double (Object::*callback)(double,double));
extern double (Object::*fooptr)(double,double);
%constant double (Object::*FOO)(double,double) = &Object::foo;
Although these kinds of pointers can be parsed and represented by the SWIG type system, few language modules know how to handle them because they are implemented differently from standard C pointers. Readers are strongly advised to consult an advanced text such as "The Annotated C++ Reference Manual" for specific details.

When pointers to members are supported, the pointer value might appear as a special string like this:

>>> print example.FOO
_ff0d54a800000000_m_Object__f_double_double__double
>>>
In this case, the hexadecimal digits represent the entire value of the pointer which is usually the contents of a small C++ structure on most machines.

SWIG's type-checking mechanism is also more limited when working with member pointers. Normally SWIG tries to keep track of inheritance when checking types. However, no such support is currently provided for member pointers.
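
For readers who have not used member pointers before, here is a minimal C++ sketch of the declarations shown above in use. The bodies of Object::foo and do_op, the extra mul method, and the fixed operands 3.0 and 4.0 are assumptions made for illustration only.

```cpp
#include <cassert>

class Object {
public:
    double foo(double a, double b) { return a + b; }
    double mul(double a, double b) { return a * b; }
};

// Same signature as do_op in the text. The body is an assumption:
// it invokes the member through ->* with two fixed operands.
double do_op(Object *o, double (Object::*callback)(double, double)) {
    return (o->*callback)(3.0, 4.0);
}

// A constant pointer to member, analogous to the %constant FOO above.
double (Object::*const FOO)(double, double) = &Object::foo;

// Calls do_op with two different member pointers.
void demo_member_pointers() {
    Object o;
    assert(do_op(&o, FOO) == 7.0);            // 3.0 + 4.0
    assert(do_op(&o, &Object::mul) == 12.0);  // 3.0 * 4.0
}
```

A member pointer is not an address that can be called on its own; it always needs an object, which is one reason it cannot be treated like a standard C function pointer.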

Partial class definitions

Since SWIG is still limited in its support of C++, it may be necessary to use partial class information in an interface file. However, since SWIG does not need the entire class specification to work, conditional compilation can be used to comment out problematic parts. For example, if you had a nested class definition, you might do this:
class Foo {
public:
#ifndef SWIG
   class Bar {
   public:
     ...
   };
#endif
   Foo();
  ~Foo();
   ...
};

Also, as a rule of thumb, SWIG should not be used on raw C++ source files.
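
The #ifndef SWIG guard only hides code from SWIG. A C++ compiler, which does not define the SWIG symbol, still sees the whole class, as the following sketch shows. The value() and ask() methods are hypothetical additions used to make the nested class observable.

```cpp
#include <cassert>

class Foo {
public:
#ifndef SWIG
    // Skipped when SWIG parses this file, compiled normally otherwise.
    class Bar {
    public:
        int value() const { return 42; }
    };
#endif
    Foo() {}
    ~Foo() {}
    int ask() const;
};

// Defined outside the class so the class body stays declaration-only.
int Foo::ask() const { return Bar().value(); }

// Under a C++ compiler the nested class is fully usable.
void demo_partial_class() {
    Foo f;
    Foo::Bar b;
    assert(f.ask() == 42);
    assert(b.value() == 42);
}
```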

Code Insertion

Sometimes it is necessary to insert special code into the resulting wrapper file generated by SWIG. For example, you may want to include additional C code to perform initialization or other operations. There are four common ways to insert code, but it's useful to know how the output of SWIG is structured first.

The output of SWIG

When SWIG creates its output file, it is broken up into four sections corresponding to runtime libraries, headers, wrapper functions, and module initialization code (in that order).

Code insertion blocks

Code is inserted into the appropriate code section by using one of the following code insertion directives:
%runtime %{
   ... code in runtime section ...
%}

%header %{
   ... code in header section ...
%}

%wrapper %{
   ... code in wrapper section ...
%}

%init %{
   ... code in init section ...
%}
The bare %{ ... %} directive is a shortcut that is the same as %header %{ ... %}.

Everything in a code insertion block is copied verbatim into the output file and is not parsed by SWIG. Most SWIG input files have at least one such block to include header files and support C code. Additional code blocks may be placed anywhere in a SWIG file as needed.

%module mymodule
%{
#include "my_header.h"
%}
... Declare functions here
%{

void some_extra_function() {
  ...
}
%}

A common use for code blocks is to write "helper" functions. These are functions that are used specifically for the purpose of building an interface, but which are generally not visible to the normal C program. For example :

%{
/* Create a new vector */
static Vector *new_Vector() {
	return (Vector *) malloc(sizeof(Vector));
}

%}
// Now wrap it 
Vector *new_Vector();

Inlined code blocks

Since the process of writing helper functions is fairly common, there is a special inlined form of code block that is used as follows :

%inline %{
/* Create a new vector */
Vector *new_Vector() {
	return (Vector *) malloc(sizeof(Vector));
}
%}

The %inline directive inserts all of the code that follows verbatim into the header portion of an interface file. The code is then parsed by both the SWIG preprocessor and parser. Thus, the above example creates a new command new_Vector using only one declaration. Since the code inside an %inline %{ ... %} block is given to both the C compiler and SWIG, it is illegal to include any SWIG directives inside the %{ ... %} block.

Initialization blocks

When code is included in the %init section, it is copied directly into the module initialization function. For example, if you needed to perform some extra initialization, you could write this:

%init %{
	init_variables();
%}
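
To make the section layout concrete, the sketch below imitates a heavily simplified output file for the %init example above. It is an illustration only: the function name init_mymodule, the table_size variable, and the body of init_variables() are assumptions, and the real initialization function's name and signature depend on the target language.

```cpp
#include <cassert>

// --- header section: text of %{ ... %} blocks, copied verbatim ---
static int table_size = 0;
static void init_variables() { table_size = 64; }

// --- wrapper section: one wrapper per wrapped declaration (elided) ---
int get_table_size() { return table_size; }

// --- init section: hypothetical module initialization function ---
void init_mymodule() {
    // ... registration of the wrappers would happen here ...
    init_variables();   // the text of the %init %{ ... %} block lands here
}

// Shows that the extra initialization runs when the module is set up.
void demo_init_section() {
    assert(get_table_size() == 0);
    init_mymodule();
    assert(get_table_size() == 64);
}
```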

SWIG Preprocessor

SWIG includes its own enhanced version of the C preprocessor. The preprocessor supports the standard preprocessor directives and macro expansion rules. However, a number of modifications and enhancements have been made. This section describes some of these modifications.

File inclusion

To include another file into a SWIG interface, use the %include directive like this:
%include "pointer.i"
Unlike #include, %include includes each file once (and will not reload the file on subsequent %include declarations). Therefore, it is not necessary to use include-guards in SWIG interfaces.

By default, #include directives are ignored unless you run SWIG with the -includeall option. The reason for ignoring traditional includes is that you often don't want SWIG to try to wrap everything found in system headers and auxiliary files.

File imports

SWIG provides another file inclusion directive with the %import directive. For example:
%import "foo.i"
The purpose of %import is to collect certain information from another SWIG interface file or a header file without actually generating any wrapper code. Such information generally includes type declarations (e.g., typedef) as well as C++ classes that might be used as base-classes for class declarations in the interface. The use of %import is also important when SWIG is used to generate extensions as a collection of related modules. This is an advanced topic and is described in a later chapter.

The -importall option tells SWIG to follow all #include statements as imports. This might be useful if you want to extract type definitions from system header files without generating any wrappers.

Conditional Compilation

SWIG fully supports the use of #if, #ifdef, #ifndef, #else, #endif to conditionally include parts of an interface. The following symbols are predefined by SWIG when it is parsing the interface:

SWIG                            Always defined when SWIG is processing a file
SWIGTCL                         Defined when using Tcl
SWIGTCL8                        Defined when using Tcl8.0
SWIGPERL                        Defined when using Perl
SWIGPERL5                       Defined when using Perl5
SWIGPYTHON                      Defined when using Python
SWIGGUILE                       Defined when using Guile
SWIGRUBY                        Defined when using Ruby
SWIGJAVA                        Defined when using Java
SWIGMZSCHEME                    Defined when using Mzscheme        
SWIGWIN                         Defined when running SWIG under Windows
SWIGMAC                         Defined when running SWIG on the Macintosh
In addition, SWIG defines the following set of standard C/C++ macros:
__LINE__                        Current line number
__FILE__                        Current file name
__STDC__                        Defined to indicate ANSI C
__cplusplus                     Defined when -c++ option used
Interface files can look at these symbols as necessary to change the way in which an interface is generated or to mix SWIG directives with C code. These symbols are also defined within the C code generated by SWIG (except for the symbol `SWIG' which is only defined within the SWIG compiler).
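
As a sketch of mixing SWIG directives with C++ code, the file below could be given both to SWIG and to a C++ compiler. The function bodies and the seen_by_swig() helper are hypothetical. Under a C++ compiler the SWIG symbol is undefined, so the directives are skipped and only the ordinary declarations remain.

```cpp
#include <cassert>

#ifdef SWIG
// Seen only by SWIG: directives that a C++ compiler would reject.
%rename(foo_i) foo(int);
%rename(foo_d) foo(double);
#endif

int foo(int x)       { return x + 1; }
double foo(double x) { return x * 2.0; }

// Reports whether this translation unit was processed with SWIG defined.
// Under a C++ compiler it is not, so this returns false.
bool seen_by_swig() {
#ifdef SWIG
    return true;
#else
    return false;
#endif
}

// Confirms that the compiler skipped the SWIG-only section.
void demo_conditional_compilation() {
    assert(!seen_by_swig());
    assert(foo(1) == 2);
    assert(foo(1.5) == 3.0);
}
```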

Macro Expansion

Traditional preprocessor macros can be used in SWIG interfaces. Be aware that the #define statement is also used to try and detect constants. Therefore, if you have something like this in your file,
#ifndef _FOO_H
#define _FOO_H 1
...
#endif
you may get some extra constants such as _FOO_H showing up in the scripting interface.

More complex macros can be defined in the standard way. For example:

#define EXTERN extern
#ifdef __STDC__
#define _ANSI(args)   (args)
#else
#define _ANSI(args) ()
#endif
The following operators can appear in macro definitions:

SWIG Macros

SWIG provides an enhanced macro capability with the %define and %enddef directives. For example:
%define ARRAYHELPER(type,name)
%inline %{
type *new_ ## name (int nitems) {
   return (type *) malloc(sizeof(type)*nitems);
}
void delete_ ## name(type *t) {
   free(t);
}
type name ## _get(type *t, int index) {
   return t[index];
}
void name ## _set(type *t, int index, type val) {
   t[index] = val;
}
%}
%enddef

ARRAYHELPER(int, IntArray)
ARRAYHELPER(double, DoubleArray)
The primary purpose of %define is to define large macros of code. Unlike normal C preprocessor macros, it is not necessary to terminate each line with a continuation character (\)--the macro definition extends to the first occurrence of %enddef. Furthermore, when such macros are expanded, they are reparsed through the C preprocessor. Thus, SWIG macros can contain all other preprocessor directives except for nested %define statements.

The SWIG macro capability is a very quick and easy way to generate large amounts of code. In fact, many of SWIG's advanced features and libraries are built using this mechanism (such as C++ template support).
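
For comparison, the same helper functions can be written with a standard C preprocessor macro, at the cost of a continuation character on every line. This sketch is plain C++ rather than a SWIG interface, and the demo helper is an illustrative assumption.

```cpp
#include <cassert>
#include <cstdlib>

// Standard-preprocessor counterpart of the %define version above.
// Every line except the last needs a trailing backslash.
#define ARRAYHELPER(type, name)                                       \
    type *new_##name(int nitems) {                                    \
        return (type *) std::malloc(sizeof(type) * nitems);           \
    }                                                                 \
    void delete_##name(type *t) { std::free(t); }                     \
    type name##_get(type *t, int index) { return t[index]; }          \
    void name##_set(type *t, int index, type val) { t[index] = val; }

ARRAYHELPER(int, IntArray)
ARRAYHELPER(double, DoubleArray)

// Allocates a small array, stores a value, reads it back, and frees it.
void demo_array_helpers() {
    int *a = new_IntArray(3);
    IntArray_set(a, 1, 42);
    assert(IntArray_get(a, 1) == 42);
    delete_IntArray(a);
}
```

The ## operator pastes the macro argument into the function names, so each expansion yields four distinct functions such as new_IntArray and IntArray_get.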

Preprocessing and %{ ... %} blocks

The SWIG preprocessor does not process any text enclosed in a code block %{ ... %}. Therefore, if you write code like this,
%{
#ifdef NEED_BLAH
int blah() {
   ...
}
#endif
%}
the contents of the %{ ... %} block are copied without modification to the output (including all preprocessor directives).

Preprocessing and { ... }

SWIG always runs the preprocessor on text appearing inside { ... }. However, sometimes it is desirable to make a preprocessor directive pass through to the output file. For example:
%addmethods Foo {
   void bar() {
      #ifdef DEBUG
       printf("I'm in bar\n");
      #endif
   }
}
By default, SWIG will interpret the #ifdef DEBUG statement. However, if you really wanted that code to actually go into the wrapper file, prefix the preprocessor directives with % like this:
%addmethods Foo {
   void bar() {
      %#ifdef DEBUG
       printf("I'm in bar\n");
      %#endif
   }
}
SWIG will strip the extra % and leave the preprocessor directive in the code.

An Interface Building Strategy

This section describes the general approach for building an interface with SWIG. The specifics related to a particular scripting language are found in later chapters.

Preparing a C program for SWIG

SWIG doesn't require modifications to your C code, but if you feed it a collection of raw C header files or source code, the results might not be what you expect--in fact, they might be awful. Here's a series of steps you can follow to make an interface for a C program :

Although this may sound complicated, the process turns out to be fairly easy once you get the hang of it.

In the process of building an interface, SWIG may encounter syntax errors or other problems. The best way to deal with this is to simply copy the offending code into a separate interface file and edit it. However, the SWIG developers have worked very hard to improve the SWIG parser--you should report parsing errors to swig-dev@cs.uchicago.edu or to the SWIG bug tracker on www.swig.org.

The SWIG interface file

The preferred method of using SWIG is to generate a separate interface file. Suppose you have the following C header file :

/* File : header.h */

#include <stdio.h>
#include <math.h>

extern int foo(double);
extern double bar(int, int);
extern void dump(FILE *f);

A typical SWIG interface file for this header file would look like the following :

/* File : interface.i */
%module mymodule
%{
#include "header.h"
%}
extern int foo(double);
extern double bar(int, int);
extern void dump(FILE *f);

Of course, in this case, our header file is pretty simple so we could have made an interface file like this as well:

/* File : interface.i */
%module mymodule
%include "header.h"

Naturally, your mileage may vary.

Why use separate interface files?

Although SWIG can parse many header files, it is more common to write a special .i file defining the interface to a package. There are several reasons why you might want to do this:

Getting the right header files

Sometimes the code generated by SWIG will not compile unless certain header files are included. Make sure those headers reach the wrapper file by placing them in a %{ ... %} block like this:
%module graphics
%{
#include <GL/gl.h>
#include <GL/glu.h>
%}

// Put rest of declarations here
...

What to do with main()

If your program defines a main() function, you may need to get rid of it or rename it in order to use a scripting language. Most scripting languages define their own main() procedure that is called instead. main() also makes no sense when working with dynamic loading. There are a few approaches to solving the main() conflict :

Getting rid of main() may cause initialization problems for a program. To handle this problem, you may consider writing a special function called program_init() that initializes your program upon startup. This function could then be called either from the scripting language as the first operation, or when the SWIG generated module is loaded.

As a general note, many C programs only use the main() function to parse command line options and to set parameters. However, by using a scripting language, you are probably trying to create a program that is more interactive. In many cases, the old main() program can be completely replaced by a Perl, Python, or Tcl script.
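
A minimal sketch of this refactoring is shown below. The verbosity setting, the accessor functions, and the STANDALONE macro are all hypothetical; the point is only that the startup work moves into program_init() and that main() survives solely for stand-alone builds.

```cpp
#include <cassert>
#include <cstdlib>

static int verbosity = -1;      // -1 means "not initialized yet"

// Startup work that main() used to perform, now callable from a script
// or from the module initialization code.
void program_init() { verbosity = 0; }

// Settings that used to come from command line options.
void set_verbosity(int level) { verbosity = level; }
int get_verbosity() { return verbosity; }

#ifdef STANDALONE
// Compiled only for the stand-alone program (hypothetical macro).
int main(int argc, char **argv) {
    program_init();
    if (argc > 1) set_verbosity(std::atoi(argv[1]));
    return 0;
}
#endif

// Mimics what a script would do right after loading the module.
void demo_program_init() {
    program_init();
    assert(get_verbosity() == 0);
    set_verbosity(2);
    assert(get_verbosity() == 2);
}
```

With this split, option parsing can move into the script while the C code keeps only the initialization it really needs.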

How to avoid creating the interface from hell

SWIG makes it fairly easy to build a big interface really fast. In fact, if you apply it to a large enough package, you'll find yourself with a rather large amount of code being produced in the resulting wrapper file. For instance, wrapping a 1000-line C header file with a large number of structure declarations may result in a wrapper file containing 20,000-30,000 lines of code. Here are a few things you can do to make a smaller interface:


SWIG 1.3 - Last Modified : September 20, 2001