Credit: Luther Blissett
You want to write a Python callable C extension that takes as an argument a Python sequence (or iterator) and accesses it sequentially, requiring no extra storage.
If you can afford to access the sequence item by item without knowing
in advance the number of items it has, and you are running Python 2.2
or better, you can sometimes save memory by using
PyObject_GetIter
instead of
PySequence_Fast
:
#include <Python.h>
static PyObject *totalIter(PyObject *self, PyObject *args)
{
PyObject* seq;
PyObject* item;
double result;
/* get one argument as an iterator */
if(!PyArg_ParseTuple(args, "O", &seq))
return 0;
seq = PyObject_GetIter(seq);
if(!seq)
return 0;
/* process data sequentially */
result = 0.0;
while((item=PyIter_Next(seq))) {
PyObject *fitem;
fitem = PyNumber_Float(item);
if(!fitem) {
Py_DECREF(seq);
Py_DECREF(item);
PyErr_SetString(PyExc_TypeError, "all items must be numbers");
return 0;
}
result += PyFloat_AS_DOUBLE(fitem);
Py_DECREF(fitem);
Py_DECREF(item);
}
/* clean up and return result */
Py_DECREF(seq);
return Py_BuildValue("d", result);
}
static PyMethodDef totitMethods[] = {
{"totit", totalIter, METH_VARARGS, "Sum a sequence of numbers."},
{0} /* sentinel */
};
void
inittotit(void)
{
(void) Py_InitModule("totit", totitMethods);
}
PyObject_GetIter
is available only in Python 2.2, and it is
appropriate only when it’s okay for the rest of your
C code to get the items one at a time without knowing beforehand how
many items there will be in total. When these conditions are met,
PyObject_GetIter
gives you roughly the same
performance as PySequence_Fast
if the input
argument is a list or tuple, but it can save memory allocation, and
therefore running time, if the input argument is an iterator or
another kind of sequence or iterable.
PyObject_GetIter
takes one argument: a Python
object from which an iterator is desired (much like
Python’s iter
built-in function).
It either returns 0
, indicating an error, or an
iterator object, on which you can call PyIter_Next
to get the next item (or 0
, which is not an error
at the end of the iteration). Both
PyObject_GetIter
and
PyIter_Next
return new references, so we must
Py_DECREF
when we’re done with
the respective objects.
As usual, the best way to build this extension (assuming
you’ve saved it in a totit.c
file) is with the distutils
package. Place a file
named setup.py
such as:
from distutils.core import setup, Extension setup(name = "totit", maintainer = "Luther Blissett", maintainer_email = "[email protected]", ext_modules = [Extension('total',sources=['totit.c'])] )
in the same directory as the C source, then build and install by running:
$ python setup.py install
The nice thing about this is that it works on any platform (assuming, of course, you have Python 2.0 or later and have access to the same C compiler used to build your version of Python).
Since Python extensions are often coded in C to maximize performance, it’s interesting to measure performance compared to that of pure Python code dealing with the same task. A typical measurement setup might be a script such as:
import time, operator, total, totit def timo(f, xs): start = time.clock( ) for x in xs: res = f(x) stend = time.clock( ) print f._ _name_ _, stend-start def totpy(x): result = 0.0 for item in x: result += item return result def totre(x): return reduce(operator.add, x) seq = range(200000) print 'on lists:' timo(totre, 10*[seq]) timo(totpy, 10*[seq]) timo(total.total, 10*[seq]) timo(totit.totit, 10*[seq]) print 'on iters:' timo(totre, [iter(seq) for i in range(10)]) timo(totpy, [iter(seq) for i in range(10)]) timo(total.total, [iter(seq) for i in range(10)]) timo(totit.totit, [iter(seq) for i in range(10)])
On my machine, running with the command-line switch
-O
so that Python can optimize operations, the
timing results are:
on lists: totre 2.88 totpy 0.91 total 0.31 totit 0.32 on iters: totre 3.02 totpy 0.91 total 0.64 totit 0.32
As you can see, the most important optimization is to avoid the
attractive nuisance of the reduce
built-in
function. We can be about three times as fast with a simple
Python-coded function! But the C-coded extension
total
, which we saw in Recipe 16.3, is three times faster yet when run on lists,
as is the totit
extension in this recipe. The
advantage of totit
over total
is seen when they are run on iterators rather than lists. In that
case, totit
can be roughly twice as fast as
total
, because it saves the overhead of memory
allocation in PySequence_Fast
.
The Extending and Embedding manual is available
as part of the standard Python documentation set at http://www.python.org/doc/current/ext/ext.html;
documentation on the Python C API at http://www.python.org/doc/current/api/api.html;
the Distributing Python Modules section of the
standard Python documentation set is still incomplete, but it is the
best source of information on the distutils
package.