Credit: Luther Blissett
You have an existing C function that takes as an argument a C array of C-level values (e.g., doubles), and you want to wrap it in a Python-callable C extension that takes as an argument a Python sequence (or iterator).
The easiest way to accept an arbitrary Python sequence in the Python C API is with the PySequence_Fast function, which builds and returns a tuple when needed, but returns only its argument (with the reference count incremented) if the argument is already a list or tuple:
#include <Python.h>

/* a preexisting C-level function you want to expose -- e.g: */
static double total(double* data, int len)
{
    double total = 0.0;
    int i;
    for(i=0; i<len; ++i)
        total += data[i];
    return total;
}

/* here is how you expose it to Python code: */
static PyObject *totalDoubles(PyObject *self, PyObject *args)
{
    PyObject* seq;
    double *dbar;
    double result;
    int seqlen;
    int i;

    /* get one argument as a sequence */
    if(!PyArg_ParseTuple(args, "O", &seq))
        return 0;
    seq = PySequence_Fast(seq, "argument must be iterable");
    if(!seq)
        return 0;

    /* prepare data as an array of doubles */
    seqlen = PySequence_Fast_GET_SIZE(seq);
    dbar = malloc(seqlen*sizeof(double));
    if(!dbar) {
        Py_DECREF(seq);
        return PyErr_NoMemory();
    }
    for(i=0; i < seqlen; i++) {
        PyObject *fitem;
        PyObject *item = PySequence_Fast_GET_ITEM(seq, i);
        if(!item) {
            Py_DECREF(seq);
            free(dbar);
            return 0;
        }
        fitem = PyNumber_Float(item);
        if(!fitem) {
            Py_DECREF(seq);
            free(dbar);
            PyErr_SetString(PyExc_TypeError, "all items must be numbers");
            return 0;
        }
        dbar[i] = PyFloat_AS_DOUBLE(fitem);
        Py_DECREF(fitem);
    }

    /* clean up, compute, and return result */
    Py_DECREF(seq);
    result = total(dbar, seqlen);
    free(dbar);
    return Py_BuildValue("d", result);
}

static PyMethodDef totalMethods[] = {
    {"total", totalDoubles, METH_VARARGS, "Sum a sequence of numbers."},
    {0} /* sentinel */
};

void
inittotal(void)
{
    (void) Py_InitModule("total", totalMethods);
}
The two best ways for your C-coded, Python-callable extension functions to accept generic Python sequences as arguments are PySequence_Fast and PyObject_GetIter (available only in Python 2.2 and later). The latter can often save memory, but it is appropriate only when it’s okay for the rest of your C code to get the items one at a time without knowing beforehand how many items there will be in total. Often, you have preexisting C functions from an existing library that you want to expose to Python code, and those most often require that their input sequences are C arrays. Thus, this recipe shows how to build a C array (in this case, an array of double) from a generic Python sequence argument, so you can pass the array (and the integer that gives the array’s length) to your existing C function (represented here, as an example, by the total function at the start of the recipe).
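The trade-off between the two approaches can be illustrated from the Python side. The sketch below is only an analogy, not the C API itself: streaming_total and array_total are hypothetical names. The first consumes items one at a time, as C code built on PyObject_GetIter would; the second first fills a contiguous buffer of C doubles (here with the standard array module), much as the recipe fills its malloc'ed array before calling total.

```python
from array import array


def streaming_total(iterable):
    """Sum items one at a time; the length is never needed up front."""
    result = 0.0
    for item in iterable:
        result += float(item)
    return result


def array_total(iterable):
    """Build a contiguous buffer of C doubles first, then sum it."""
    data = array("d", (float(item) for item in iterable))
    # The length is known only once the whole buffer has been built,
    # which is what a C function taking (double*, int) requires.
    return len(data), sum(data)


print(streaming_total(iter([1, 2, 3.5])))
print(array_total(iter([1, 2, 3.5])))
```

The streaming version never holds more than one item at a time, but only the array version can hand a pointer and a length to a preexisting C routine.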
PySequence_Fast takes two arguments: a Python object to be presented as a sequence and a string to use as the error message in case the Python object cannot be presented as a sequence, in which case it returns 0 (the null pointer, an error indicator). If the Python object is already a list or tuple, PySequence_Fast returns the same object with the reference count increased by one. If the Python object is any other kind of sequence (or, in Python 2.2, any iterator or iterable), PySequence_Fast builds and returns a new tuple with all items already in place. In any case, PySequence_Fast returns an object on which you can call PySequence_Fast_GET_SIZE to learn the sequence length (as we do in the recipe to malloc the appropriate amount of storage for the C array) and PySequence_Fast_GET_ITEM to get an item given a valid index (between 0, included, and the sequence length, excluded).
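The contract just described can be modeled in a few lines of Python. This is a rough sketch of the behavior as stated in this recipe, not the real implementation: sequence_fast is a hypothetical name, and the container type that the actual C function builds for other iterables may differ between Python versions.

```python
def sequence_fast(obj, message):
    """Pure-Python model of the PySequence_Fast contract (an assumption,
    not the C API itself)."""
    if type(obj) in (list, tuple):
        # Same object comes back; the C function also bumps its refcount.
        return obj
    try:
        iterator = iter(obj)
    except TypeError:
        # Not iterable: the caller-supplied message becomes the error text.
        raise TypeError(message) from None
    # Any other iterable: a new tuple with all items already in place.
    return tuple(iterator)


items = [1.0, 2.0]
print(sequence_fast(items, "argument must be iterable") is items)

fast = sequence_fast((x * 0.5 for x in range(4)), "argument must be iterable")
# len() and indexing play the roles of the GET_SIZE and GET_ITEM macros.
print(len(fast), fast[len(fast) - 1])
```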
The recipe requires quite a bit of care, which is typical of all C-coded Python extensions (and, more generally, any C code), to deal with memory and error conditions properly. For C-coded Python extensions, it’s imperative that you know which functions return new references (which you must Py_DECREF when you are done with them) and which return borrowed references (which you must not Py_DECREF, but on the contrary, Py_INCREF if you want to keep a copy for a longer time). In this specific case, you have to know the following (by reading the Python documentation):
PyArg_ParseTuple always gives you borrowed references.
PySequence_Fast returns a new reference.
PySequence_Fast_GET_ITEM returns a borrowed reference.
PyNumber_Float returns a new reference.
There is method to this madness, even though, as you start your career as a coder of C API Python extensions, you’ll no doubt have to double-check each case. Python’s C API strives to return borrowed references for performance when it knows it can always do so safely (i.e., it knows that the reference it is returning necessarily refers to an already existing object). It has to return a new reference when it’s possible (or certain) that a new object may have to be created.
For example, in the above list, PyNumber_Float and PySequence_Fast may be able to return the same object they were given as an argument, but it’s also quite possible that they may have to create a new object for this purpose to ensure that the returned object has the correct type. Therefore, these two functions are specified as always returning new references. PyArg_ParseTuple and PySequence_Fast_GET_ITEM, on the other hand, will always return references to objects that already exist elsewhere (as items in the arguments’ tuple or items in the fast-sequence container, respectively), and therefore, these two functions can afford to return borrowed references and are thus specified as doing so.
One last note: when we have an item from the fast-sequence container, we immediately try to transform it into a Python float object and deal with the possibility that the transformation will fail (e.g., if we’re passed a sequence containing a string, a complex number, etc.). It is often quite futile to first attempt a check (with PyNumber_Check), because the check might succeed, and the later transformation attempt might fail anyway (e.g., with a complex-number item).
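The same point can be seen from Python itself. In the sketch below, the isinstance test against numbers.Number stands in for PyNumber_Check (an analogy, not the same check), and to_double is a hypothetical helper that mirrors the recipe's strategy of attempting the conversion and handling its failure.

```python
import numbers


def to_double(item):
    """Hypothetical helper: convert first, report failure afterwards."""
    try:
        return float(item)
    except (TypeError, ValueError):
        raise TypeError("all items must be numbers") from None


item = 3 + 4j
# The up-front check passes: a complex is a number...
print(isinstance(item, numbers.Number))
# ...yet the conversion still fails, so the check bought nothing.
try:
    to_double(item)
except TypeError as exc:
    print("conversion failed:", exc)
```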
As usual, the best way to build this extension (assuming you’ve saved it to a total.c file) is with the distutils package. Place a file named setup.py such as:
from distutils.core import setup, Extension

setup(name = "total",
      maintainer = "Luther Blissett",
      maintainer_email = "[email protected]",
      ext_modules = [Extension('total', sources=['total.c'])]
)
in the same directory as the C source, then build and install by running:
$ python setup.py install
The nice thing about this is that it works on any platform (assuming you have Python 2.0 or later and have access to the same C compiler used to build your version of Python).
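Once the module is built and installed, calling it from Python might look like the sketch below. Because the compiled module may not be available wherever this snippet is run, it falls back to a hypothetical pure-Python stand-in that mirrors the observable behavior of totalDoubles (same error messages, same left-to-right summation). The stand-in is an assumption for illustration and can also serve as a simple oracle when testing the real extension.

```python
try:
    from total import total  # the extension built from total.c
except ImportError:
    def total(seq):
        """Hypothetical pure-Python stand-in mirroring totalDoubles."""
        try:
            items = tuple(seq)
        except TypeError:
            raise TypeError("argument must be iterable") from None
        result = 0.0
        for item in items:
            try:
                result += float(item)
            except (TypeError, ValueError):
                raise TypeError("all items must be numbers") from None
        return result


print(total([1, 2, 3.5]))                # a list
print(total(x * 0.5 for x in range(4)))  # any other iterable works too
```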
The Extending and Embedding manual is available as part of the standard Python documentation set at http://www.python.org/doc/current/ext/ext.html; documentation on the Python C API is at http://www.python.org/doc/current/api/api.html; the Distributing Python Modules section of the standard Python documentation set is still incomplete, but it is the best source of information on the distutils package.