1.7 Extracting Parameters in Extension Functions
The PyArg_ParseTuple() function is declared as follows:
int PyArg_ParseTuple(PyObject *arg, char *format, ...);
The arg argument must be a tuple object containing an argument
list passed from Python to a C function. The format argument
must be a format string, whose syntax is explained below. The
remaining arguments must be addresses of variables whose type is
determined by the format string. For the conversion to succeed, the
arg object must match the format and the format must be
exhausted. On success, PyArg_ParseTuple() returns true (nonzero);
otherwise it returns false (zero) and raises an appropriate exception.
Note that while PyArg_ParseTuple() checks that the Python
arguments have the required types, it cannot check the validity of the
addresses of C variables passed to the call: if you make mistakes
there, your code will probably crash or at least overwrite random bits
in memory. So be careful!
A format string consists of zero or more "format units". A format
unit describes one Python object; it is usually a single character or
a parenthesized sequence of format units. With a few exceptions, a
format unit that is not a parenthesized sequence normally corresponds
to a single address argument to PyArg_ParseTuple(). In the
following description, the quoted form is the format unit; the entry
in (round) parentheses is the Python object type that matches the
format unit; and the entry in [square] brackets is the type of the C
variable(s) whose address should be passed. (Use the "&" operator to
pass a variable's address.)
Note that any Python object references which are provided to the
caller are borrowed references; do not decrement their
reference count!
- "s" (string or Unicode object) [char *]
- Convert a Python string or Unicode object to a C pointer to a
character string. You must not provide storage for the string
itself; a pointer to an existing string is stored into the character
pointer variable whose address you pass. The C string is
null-terminated. The Python string must not contain embedded null
bytes; if it does, a TypeError exception is raised.
Unicode objects are converted to C strings using the default
encoding. If this conversion fails, a UnicodeError is
raised.
- "s#" (string, Unicode or any read buffer compatible object)
[char *, int]
- This variant on "s" stores into two C variables, the first one a
pointer to a character string, the second one its length. In this
case the Python string may contain embedded null bytes. Unicode
objects pass back a pointer to the default encoded string version of the
object if such a conversion is possible. All other read buffer
compatible objects pass back a reference to the raw internal data
representation.
- "z" (string or None) [char *]
- Like "s", but the Python object may also be None, in which
case the C pointer is set to NULL.
- "z#" (string or None or any read buffer compatible object)
[char *, int]
- This is to "s#" as "z" is to "s".
- "u" (Unicode object) [Py_UNICODE *]
- Convert a Python Unicode object to a C pointer to a null-terminated
buffer of 16-bit Unicode (UTF-16) data. As with "s", there is no need
to provide storage for the Unicode data buffer; a pointer to the
existing Unicode data is stored into the Py_UNICODE pointer variable whose
address you pass.
- "u#" (Unicode object) [Py_UNICODE *, int]
- This variant on "u" stores into two C variables, the first one
a pointer to a Unicode data buffer, the second one its length.
- "es" (string, Unicode object or character buffer compatible
object) [const char *encoding, char **buffer]
- This variant on "s" is used for encoding Unicode and objects
convertible to Unicode into a character buffer. It only works for
encoded data without embedded NULL bytes.
The variant reads one C variable and stores into two C variables, the
first one a pointer to an encoding name string (encoding), and the
second a pointer to a pointer to a character buffer (**buffer,
the buffer used for storing the encoded data).
The encoding name must map to a registered codec. If set to NULL,
the default encoding is used.
PyArg_ParseTuple() will allocate a buffer of the needed
size using PyMem_NEW(), copy the encoded data into this
buffer and adjust *buffer to reference the newly allocated
storage. The caller is responsible for calling
PyMem_Free() to free the allocated buffer after usage.
- "et" (string, Unicode object or character buffer compatible
object) [const char *encoding, char **buffer]
- Same as "es" except that string objects are passed through without
recoding them. Instead, the implementation assumes that the string
object uses the encoding passed in as parameter.
- "es#" (string, Unicode object or character buffer compatible
object) [const char *encoding, char **buffer, int *buffer_length]
- This variant on "s#" is used for encoding Unicode and objects
convertible to Unicode into a character buffer. It reads one C
variable and stores into three C variables, the first one a pointer to
an encoding name string (encoding), the second a pointer to a
pointer to a character buffer (**buffer, the buffer used for
storing the encoded data) and the third one a pointer to an integer
(*buffer_length, the buffer length).
The encoding name must map to a registered codec. If set to NULL,
the default encoding is used.
There are two modes of operation:
If *buffer points to a NULL pointer,
PyArg_ParseTuple() will allocate a buffer of the needed
size using PyMem_NEW(), copy the encoded data into this
buffer and adjust *buffer to reference the newly allocated
storage. The caller is responsible for calling
PyMem_Free() to free the allocated buffer after usage.
If *buffer points to a non-NULL pointer (an already allocated
buffer), PyArg_ParseTuple() will use this location as
buffer and interpret *buffer_length as buffer size. It will then
copy the encoded data into the buffer and 0-terminate it. Buffer
overflow is signalled with an exception.
In both cases, *buffer_length is set to the length of the
encoded data without the trailing 0-byte.
- "et#" (string, Unicode object or character buffer compatible
object) [const char *encoding, char **buffer, int *buffer_length]
- Same as "es#" except that string objects are passed through without
recoding them. Instead, the implementation assumes that the string
object uses the encoding passed in as parameter.
- "b" (integer) [char]
- Convert a Python integer to a tiny int, stored in a C char.
- "h" (integer) [short int]
- Convert a Python integer to a C short int.
- "i" (integer) [int]
- Convert a Python integer to a plain C int.
- "l" (integer) [long int]
- Convert a Python integer to a C long int.
- "L" (integer) [LONG_LONG]
- Convert a Python integer to a C long long. This format is only
available on platforms that support long long (or __int64
on Windows).
- "c" (string of length 1) [char]
- Convert a Python character, represented as a string of length 1, to a
C char.
- "f" (float) [float]
- Convert a Python floating point number to a C float.
- "d" (float) [double]
- Convert a Python floating point number to a C double.
- "D" (complex) [Py_complex]
- Convert a Python complex number to a C Py_complex structure.
- "O" (object) [PyObject *]
- Store a Python object (without any conversion) in a C object pointer.
The C program thus receives the actual object that was passed. The
object's reference count is not increased. The pointer stored is not
NULL.
- "O!" (object) [typeobject, PyObject *]
- Store a Python object in a C object pointer. This is similar to
"O", but takes two C arguments: the first is the address of a
Python type object, the second is the address of the C variable (of
type PyObject *) into which the object pointer is stored.
If the Python object does not have the required type,
TypeError is raised.
- "O&" (object) [converter, anything]
- Convert a Python object to a C variable through a converter
function. This takes two arguments: the first is a function, the
second is the address of a C variable (of arbitrary type), converted
to void *. The converter function in turn is called as
follows:
status = converter(object, address);
where object is the Python object to be converted and
address is the void * argument that was passed to
PyArg_ParseTuple(). The returned status should be 1
for a successful conversion and 0 if the conversion
has failed. When the conversion fails, the converter function
should raise an exception.
- "S" (string) [PyStringObject *]
- Like "O" but requires that the Python object is a string object.
Raises TypeError if the object is not a string object.
The C variable may also be declared as PyObject *.
- "U" (Unicode string) [PyUnicodeObject *]
- Like "O" but requires that the Python object is a Unicode object.
Raises TypeError if the object is not a Unicode object.
The C variable may also be declared as PyObject *.
- "t#" (read-only character buffer) [char *, int]
- Like "s#", but accepts any object which implements the read-only
buffer interface. The char * variable is set to point to the
first byte of the buffer, and the int is set to the length of
the buffer. Only single-segment buffer objects are accepted;
TypeError is raised for all others.
- "w" (read-write character buffer) [char *]
- Similar to "s", but accepts any object which implements the
read-write buffer interface. The caller must determine the length of
the buffer by other means, or use "w#" instead. Only
single-segment buffer objects are accepted; TypeError is
raised for all others.
- "w#" (read-write character buffer) [char *, int]
- Like "s#", but accepts any object which implements the
read-write buffer interface. The char * variable is set to
point to the first byte of the buffer, and the int is set to
the length of the buffer. Only single-segment buffer objects are
accepted; TypeError is raised for all others.
- "(items)" (tuple) [matching-items]
- The object must be a Python sequence whose length is the number of
format units in items. The C arguments must correspond to the
individual format units in items. Format units for sequences
may be nested.
Note:
Prior to Python version 1.5.2, this format specifier
only accepted a tuple containing the individual parameters, not an
arbitrary sequence. Code which previously caused
TypeError to be raised here may now proceed without an
exception. This is not expected to be a problem for existing code.
It is possible to pass Python long integers where integers are
requested; however no proper range checking is done -- the most
significant bits are silently truncated when the receiving field is
too small to receive the value (actually, the semantics are inherited
from downcasts in C -- your mileage may vary).
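The truncation described above follows C conversion rules and can be reproduced without the Python API. The sketch below uses plain C and an unsigned target type, for which the result is well defined (the value modulo UCHAR_MAX + 1); for signed targets the result of an out-of-range conversion is implementation-defined. The helper name narrow_to_uchar is hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: narrow a long to an unsigned char the way a C
   downcast does, keeping only the low-order bits. No range check is
   made, mirroring the behaviour described in the text. */
static unsigned char
narrow_to_uchar(long value)
{
    return (unsigned char)value;    /* value modulo (UCHAR_MAX + 1) */
}
```

On a platform with 8-bit chars, narrow_to_uchar(300) yields 44, because 300 = 256 + 44. Code that needs range checking has to parse into a wider type and test the value itself.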
A few other characters have a meaning in a format string. These may
not occur inside nested parentheses. They are:
- "|"
- Indicates that the remaining arguments in the Python argument list are
optional. The C variables corresponding to optional arguments should
be initialized to their default value -- when an optional argument is
not specified, PyArg_ParseTuple() does not touch the contents
of the corresponding C variable(s).
- ":"
- The list of format units ends here; the string after the colon is used
as the function name in error messages (the "associated value" of
the exception that PyArg_ParseTuple() raises).
- ";"
- The list of format units ends here; the string after the semicolon is
used as the error message instead of the default error message.
Clearly, ":" and ";" mutually exclude each other.
Some example calls:
int ok;
int i, j;
long k, l;
char *s;
int size;
ok = PyArg_ParseTuple(args, ""); /* No arguments */
/* Python call: f() */
ok = PyArg_ParseTuple(args, "s", &s); /* A string */
/* Possible Python call: f('whoops!') */
ok = PyArg_ParseTuple(args, "lls", &k, &l, &s); /* Two longs and a string */
/* Possible Python call: f(1, 2, 'three') */
ok = PyArg_ParseTuple(args, "(ii)s#", &i, &j, &s, &size);
/* A pair of ints and a string, whose size is also returned */
/* Possible Python call: f((1, 2), 'three') */
{
    char *file;
    char *mode = "r";
    int bufsize = 0;
    ok = PyArg_ParseTuple(args, "s|si", &file, &mode, &bufsize);
    /* A string, and optionally another string and an integer */
    /* Possible Python calls:
       f('spam')
       f('spam', 'w')
       f('spam', 'wb', 100000) */
}
{
    int left, top, right, bottom, h, v;
    ok = PyArg_ParseTuple(args, "((ii)(ii))(ii)",
                          &left, &top, &right, &bottom, &h, &v);
    /* A rectangle and a point */
    /* Possible Python call:
       f(((0, 0), (400, 300)), (10, 10)) */
}
{
    Py_complex c;
    ok = PyArg_ParseTuple(args, "D:myfunction", &c);
    /* a complex, also providing a function name for errors */
    /* Possible Python call: myfunction(1+2j) */
}