This chapter introduces a collection of more advanced function-related topics: the lambda expression, functional programming tools such as map and list comprehensions, generators, and more. Part of the art of using functions lies in the interfaces between them, so we will also explore some general function design principles here. Because this is the last chapter in Part IV, we’ll close with the usual sets of gotchas and exercises to help you start coding the ideas you’ve read about.
So far, we’ve seen what it takes to write our own functions in Python. The next sections turn to a few more advanced function-related ideas. Most of these are optional features, but can simplify your coding tasks when used well.
Besides the def statement, Python also provides an expression form that generates function objects. Because of its similarity to a tool in the LISP language, it's called lambda.[1] Like def, this expression creates a function to be called later, but returns it instead of assigning it to a name. This is why lambdas are sometimes known as anonymous (i.e., unnamed) functions. In practice, they are often used as a way to inline a function definition, or to defer execution of a piece of code.
The lambda's general form is the keyword lambda, followed by one or more arguments (exactly like the arguments list you enclose in parentheses in a def header), followed by an expression after a colon:

lambda argument1, argument2,... argumentN : expression using arguments
Function objects returned by running lambda expressions work exactly the same as those created and assigned by def. But the lambda has a few differences that make it useful in specialized roles:

lambda is an expression, not a statement. Because of this, a lambda can appear in places a def is not allowed by Python's syntax: inside a list literal or function call, for example. As an expression, the lambda returns a value (a new function), which can optionally be assigned a name; the def statement always assigns the new function to the name in the header, instead of returning it as a result.
lambda bodies are a single expression, not a block of statements. The lambda's body is similar to what you'd put in a def body's return statement; simply type the result as a naked expression, instead of explicitly returning it. Because it is limited to an expression, lambda is less general than a def; you can only squeeze so much logic into a lambda body without using statements such as if (read on for more on this). This is by design, to limit program nesting: lambda is designed for coding simple functions, and def handles larger tasks.
Apart from those distinctions, def and lambda do the same sort of work. For instance, we've seen how to make functions with def statements:

>>> def func(x, y, z): return x + y + z
...
>>> func(2, 3, 4)
9
But you can achieve the same effect with a lambda expression, by explicitly assigning its result to a name through which you can later call the function:

>>> f = lambda x, y, z: x + y + z
>>> f(2, 3, 4)
9
Here, f is assigned the function object the lambda expression creates; this is how def works too, but its assignment is automatic. Defaults work on lambda arguments, just as they do in a def:

>>> x = (lambda a="fee", b="fie", c="foe": a + b + c)
>>> x("wee")
'weefiefoe'
The code in a lambda body also follows the same scope lookup rules as code inside a def. lambda expressions introduce a local scope much like a nested def, which automatically sees names in enclosing functions, the module, and the built-in scope (via the LEGB rule):

>>> def knights( ):
...     title = 'Sir'
...     action = (lambda x: title + ' ' + x)    # Title in enclosing def
...     return action                           # Return a function.
...
>>> act = knights( )
>>> act('robin')
'Sir robin'
Prior to Release 2.2, the value for the name title would typically have been passed in as a default argument value instead; flip back to the scopes coverage of Chapter 13 if you've forgotten why.
Generally speaking, lambdas come in handy as a sort of function shorthand that allows you to embed a function's definition within the code that uses it. They are entirely optional (you can always use def instead), but they tend to be a simpler coding construct in scenarios where you just need to embed small bits of executable code. For instance, we'll see later that callback handlers are frequently coded as inline lambda expressions embedded directly in a registration call's arguments list, instead of being defined with a def elsewhere in a file and referenced by name (see the callbacks sidebar for an example).
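By way of a rough sketch of the idea (the register_callback and fire functions here are invented stand-ins for illustration, not a real GUI toolkit's API), a handler can be written inline as a lambda, right at the point where it is registered:

```python
# Hypothetical callback registry; a real GUI toolkit supplies its own
# registration call (e.g., a button's command option).
handlers = {}

def register_callback(event, func):
    handlers[event] = func          # Store the function object.

def fire(event, *args):
    return handlers[event](*args)   # Call it back later.

# The handler is defined inline, in the registration call's arguments.
register_callback('click', lambda name: 'clicked by ' + name)

result = fire('click', 'bob')       # Invokes the stored lambda.
```

Because the lambda is an expression, it can appear directly in the arguments list; a def would have to be coded elsewhere and referenced by name.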
lambdas are also commonly used to code jump tables: lists or dictionaries of actions to be performed on demand. For example:

L = [(lambda x: x**2), (lambda x: x**3), (lambda x: x**4)]

for f in L:
    print f(2)     # Prints 4, 8, 16

print L[0](3)      # Prints 9

The lambda expression is most useful as a shorthand for def, when you need to stuff small pieces of executable code in places where statements are illegal syntactically. This code snippet, for example, builds up a list of three functions by embedding lambda expressions inside a list literal; def won't work inside a list literal like this, because it is a statement, not an expression.
You can do the same sort of thing with dictionaries and other data structures in Python, to build up action tables:
>>> key = 'got'
>>> {'already': (lambda: 2 + 2),
...  'got':     (lambda: 2 * 4),
...  'one':     (lambda: 2 ** 6)
... }[key]( )
8
Here, when Python makes the dictionary, each of the nested lambdas generates and leaves behind a function to be called later; indexing by key fetches one of those functions, and parentheses force the fetched function to be called. When coded this way, a dictionary becomes a more general multiway branching tool than what we could show you in Chapter 9's coverage of if statements. To make this work without lambda, you'd need to instead code three def statements somewhere else in your file, outside the dictionary in which the functions are to be used:
def f1( ): ...
def f2( ): ...
def f3( ): ...

key = ...
{'already': f1, 'got': f2, 'one': f3}[key]( )
This works too, and avoids lambdas; but your defs may be arbitrarily far away in your file, even if they are just little bits of code. The code proximity that lambdas provide is especially useful for functions that will only be used in a single context: if the three functions here are not useful anywhere else, it makes sense to embed their definitions within the dictionary as lambdas. lambdas also come in handy in function argument lists, as a way to inline temporary function definitions not used anywhere else in your program; we'll meet examples of such uses later in this chapter when we study map.
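For instance (using a hypothetical helper named apply_twice, invented here purely for illustration), a one-off operation can be passed inline at the call, rather than defined with a def elsewhere:

```python
def apply_twice(func, x):
    # Apply a passed-in function object two times in a row.
    return func(func(x))

# The temporary function is written inline, in the arguments list.
result = apply_twice((lambda n: n + 10), 5)    # (5 + 10) + 10
```

Since the lambda is never needed again, embedding it in the call keeps its definition next to its only use.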
The fact that the body of a lambda has to be a single expression (not statements) would seem to place severe limits on how much logic you can pack into a lambda. If you know what you're doing, though, you can code almost every statement in Python as an expression-based equivalent. For example, if you want to print from the body of a lambda function, simply say sys.stdout.write(str(x) + '\n'), instead of print x. (See Chapter 8 if you've forgotten why.) Similarly, it's possible to emulate an if statement by combining Boolean operators in expressions. The expression:
((a and b) or c)

is roughly equivalent to:

if a:
    b
else:
    c
and is almost Python's equivalent of the C language's a ? b : c ternary operator. (To understand why, you need to have read the discussion of Boolean operators in Chapter 9.) In short, Python's and and or short-circuit (they don't evaluate the right side if the left determines the result), and always return either the value on the left or the value on the right. In code:
>>> t, f = 1, 0
>>> x, y = 88, 99
>>> a = (t and x) or y           # If true, x
>>> a
88
>>> a = (f and x) or y           # If false, y
>>> a
99
This works, but only as long as you can be sure that x will not be false too (otherwise, you will always get y). To truly emulate an if statement in an expression, you must wrap the two possible results so as to make them non-false, and then index to pull out the result at the end:[2]

>>> ((t and [x]) or [y])[0]      # If true, x
88
>>> ((f and [x]) or [y])[0]      # If false, y
99
>>> (t and f) or y               # Fails: f is false, skipped
99
>>> ((t and [f]) or [y])[0]      # Works: f returned anyhow
0
Once you’ve muddled through typing this a few times, you’ll probably want to wrap it for reuse:
>>> def ifelse(a, b, c): return ((a and [b]) or [c])[0]
...
>>> ifelse(1, 'spam', 'ni')
'spam'
>>> ifelse(0, 'spam', 'ni')
'ni'
Of course, you can get the same results by using an if statement here instead:

def ifelse(a, b, c):
    if a:
        return b
    else:
        return c
But expressions like these can be placed inside a lambda, to implement selection logic:

>>> lower = (lambda x, y: (((x < y) and [x]) or [y])[0])
>>> lower('bb', 'aa')
'aa'
>>> lower('aa', 'bb')
'aa'
Finally, if you need to perform loops within a lambda, you can also embed things like map calls and list comprehension expressions, tools we'll meet later in this chapter:

>>> import sys
>>> showall = (lambda x: map(sys.stdout.write, x))
>>> t = showall(['spam ', 'toast ', 'eggs '])
spam toast eggs
But now that we've shown you these tricks, we need to ask you to please use them only as a last resort. Without due care, they can lead to unreadable (a.k.a. obfuscated) Python code. In general, simple is better than complex, explicit is better than implicit, and full statements are better than arcane expressions. On the other hand, you may find these useful when taken in moderation.
lambdas are the main beneficiaries of nested function scope lookup (the E in the LEGB rule). In the following, for example, the lambda appears inside a def (the typical case), and so can access the value that the name x had in the enclosing function's scope, at the time the enclosing function was called:

>>> def action(x):
...     return (lambda y: x + y)    # Make, return function.
...
>>> act = action(99)
>>> act
<function <lambda> at 0x00A16A88>
>>> act(2)
101
What we didn't illustrate in the prior discussion is that a lambda also has access to the names in any enclosing lambda. This case is somewhat obscure, but imagine if we recoded the prior def with a lambda:

>>> action = (lambda x: (lambda y: x + y))
>>> act = action(99)
>>> act(3)
102
>>> ((lambda x: (lambda y: x + y))(99))(4)
103
Here, the nested lambda structure makes a function that makes a function when called. In both cases, the nested lambda's code has access to the variable x in the enclosing lambda. This works, but it's fairly convoluted code; in the interest of readability, nested lambdas are generally best avoided.
Some programs need to call arbitrary functions in a generic fashion, without knowing their names or arguments ahead of time. We'll see examples of where this can be useful later, but by way of introduction, both the apply built-in function and the special call syntax do the job. You can call generated functions by passing them as arguments to apply, along with a tuple of arguments:

>>> def func(x, y, z): return x + y + z
...
>>> apply(func, (2, 3, 4))
9
>>> f = lambda x, y, z: x + y + z
>>> apply(f, (2, 3, 4))
9
The apply function simply calls the function passed in its first argument, matching the passed-in arguments tuple to the function's expected arguments. Since the arguments list is passed in as a tuple (i.e., a data structure), it can be built at runtime by a program.[3]
The real power of apply is that it doesn't need to know how many arguments a function is being called with; for example, you can use if logic to select from a set of functions and argument lists, and use apply to call any of them:

if <test>:
    action, args = func1, (1,)
else:
    action, args = func2, (1, 2, 3)
...
apply(action, args)
More generally, apply is useful any time you cannot predict the arguments list ahead of time. If your user selects an arbitrary function via a user interface, for instance, you may be unable to hardcode a function call when writing your script. Simply build up the arguments list with tuple operations, and call the function indirectly through apply:

>>> args = (2, 3) + (4,)
>>> args
(2, 3, 4)
>>> apply(func, args)
9
The apply call also supports an optional third argument, where you can pass in a dictionary that represents keyword arguments to be passed to the function:

>>> def echo(*args, **kwargs): print args, kwargs
...
>>> echo(1, 2, a=3, b=4)
(1, 2) {'a': 3, 'b': 4}

This allows us to construct both positional and keyword arguments at runtime:

>>> pargs = (1, 2)
>>> kargs = {'a': 3, 'b': 4}
>>> apply(echo, pargs, kargs)
(1, 2) {'a': 3, 'b': 4}
Python also allows you to accomplish the same effect as an apply call with special syntax at the call, which mirrors the arbitrary arguments syntax in def headers that we met in Chapter 13. For example, assuming the names of this example are still as assigned earlier:

>>> apply(func, args)              # Traditional: tuple
9
>>> func(*args)                    # New apply-like syntax
9
>>> echo(*pargs, **kargs)          # Keyword dictionaries too
(1, 2) {'a': 3, 'b': 4}
>>> echo(0, *pargs, **kargs)       # Normal, *tuple, **dictionary
(0, 1, 2) {'a': 3, 'b': 4}

This special call syntax is newer than the apply function. There is no obvious advantage of the syntax over an explicit apply call, apart from its symmetry with def headers, and a few fewer keystrokes.
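One common use of this symmetry (sketched here with an invented trace function, purely for illustration) is a wrapper that collects whatever arguments it receives with the def header syntax, and forwards them unchanged with the matching call syntax:

```python
calls = []    # Record of forwarded calls, for demonstration only

def trace(func, *pargs, **kargs):
    # Collect any positional and keyword arguments, note them,
    # then pass them along to func with the special call syntax.
    calls.append((pargs, kargs))
    return func(*pargs, **kargs)

def func(x, y, z):
    return x + y + z

result = trace(func, 2, 3, z=4)    # Same as calling func(2, 3, z=4)
```

The wrapper never needs to know how many arguments func expects; the *pargs and **kargs pair in the header and the call mirror each other exactly.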
One of the more common things programs do with lists and other sequences is to apply an operation to each item, and collect the results. For instance, updating all the counters in a list can be done easily with a for loop:

>>> counters = [1, 2, 3, 4]
>>>
>>> updated = [ ]
>>> for x in counters:
...     updated.append(x + 10)     # Add 10 to each item.
...
>>> updated
[11, 12, 13, 14]
Because this is such a common operation, Python provides a built-in that does most of the work for you. The map function applies a passed-in function to each item in a sequence object, and returns a list containing all the function call results. For example:

>>> def inc(x): return x + 10      # Function to be run
...
>>> map(inc, counters)             # Collect results.
[11, 12, 13, 14]
We introduced map as a parallel loop traversal tool in Chapter 10, where we passed in None for the function argument to pair items up. Here, we make better use of it by passing in a real function to be applied to each item in the list: map calls inc on each list item, and collects all the return values into a list.
Since map expects a function to be passed in, it also happens to be one of the places where lambdas commonly appear:

>>> map((lambda x: x + 3), counters)     # Function expression
[4, 5, 6, 7]
Here, the function adds 3 to each item in the counters list; since this function isn't needed elsewhere, it was written inline as a lambda. Because such uses of map are equivalent to for loops, with a little extra code you could always code a general mapping utility yourself:

>>> def mymap(func, seq):
...     res = [ ]
...     for x in seq: res.append(func(x))
...     return res
...
>>> map(inc, [1, 2, 3])
[11, 12, 13]
>>> mymap(inc, [1, 2, 3])
[11, 12, 13]
However, since map is a built-in, it's always available, always works the same way, and has some performance benefits (in short, it's faster than a for loop). Moreover, map can be used in more advanced ways than shown; for instance, given multiple sequence arguments, it sends items taken from the sequences in parallel as distinct arguments to the function:

>>> pow(3, 4)
81
>>> map(pow, [1, 2, 3], [2, 3, 4])       # 1**2, 2**3, 3**4
[1, 8, 81]
Here, the pow function takes two arguments on each call: one from each sequence passed to map. Although we could simulate this generality too, there is no obvious point in doing so when map is built-in and quick.
The map function is the simplest representative of a class of Python built-ins used for functional programming, which mostly just means tools that apply functions to sequences. Its relatives filter out items based on a test function (filter), and apply functions to pairs of items and running results (reduce). For example, the following filter call picks out items in a sequence greater than zero:

>>> range(-5, 5)
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
>>> filter((lambda x: x > 0), range(-5, 5))
[1, 2, 3, 4]
Items in the sequence for which the function returns true are added to the result list. Like map, it's roughly equivalent to a for loop, but is built-in and fast:

>>> res = [ ]
>>> for x in range(-5, 5):
...     if x > 0:
...         res.append(x)
...
>>> res
[1, 2, 3, 4]
Here are two reduce calls computing the sum and product of the items in a list:

>>> reduce((lambda x, y: x + y), [1, 2, 3, 4])
10
>>> reduce((lambda x, y: x * y), [1, 2, 3, 4])
24
At each step, reduce passes the current sum or product, along with the next item from the list, to the passed-in lambda function. By default, the first item in the sequence initializes the starting value. Here's the for loop equivalent of the first of these calls, with the addition hardcoded inside the loop:

>>> L = [1, 2, 3, 4]
>>> res = L[0]
>>> for x in L[1:]:
...     res = res + x
...
>>> res
10
If this has sparked your interest, also see the built-in operator module, which provides functions that correspond to built-in expressions, and so comes in handy for some uses of functional tools:

>>> import operator
>>> reduce(operator.add, [2, 4, 6])      # Function-based +
12
>>> reduce((lambda x, y: x + y), [2, 4, 6])
12
Some observers might also extend the functional programming toolset in Python to include lambda and apply, as well as list comprehensions (discussed in the next section).
Because mapping operations over sequences and collecting results is such a common task in Python coding, Python 2.0 sprouted a new feature, the list comprehension expression, that can make this even simpler than using map and filter. Technically, this feature is not tied to functions, but we've saved it for this point in the book, because it is usually best understood by analogy to function-based alternatives.
Let's work through an example that demonstrates the basics. Python's built-in ord function returns the integer ASCII code of a single character:

>>> ord('s')
115
The chr built-in is the converse: it returns the character for an ASCII code integer. Now, suppose we wish to collect the ASCII codes of all characters in an entire string. Perhaps the most straightforward approach is to use a simple for loop, and append results to a list:

>>> res = [ ]
>>> for x in 'spam':
...     res.append(ord(x))
...
>>> res
[115, 112, 97, 109]
Now that we know about map, we can achieve similar results with a single function call, without having to manage list construction in the code:

>>> res = map(ord, 'spam')               # Apply func to seq.
>>> res
[115, 112, 97, 109]
But as of Python 2.0, we can get the same results from a list comprehension expression:

>>> res = [ord(x) for x in 'spam']       # Apply expr to seq.
>>> res
[115, 112, 97, 109]
List comprehensions collect the results of applying an arbitrary expression to a sequence of values, and return them in a new list. Syntactically, list comprehensions are enclosed in square brackets (to remind you that they construct a list). In their simple form, within the brackets you code an expression that names a variable, followed by what looks like a for loop header that names the same variable. Python collects the expression's results for each iteration of the implied loop.
The effect of the example so far is similar to both the manual for loop and the map call. List comprehensions become more handy, though, when we wish to apply an arbitrary expression to a sequence:

>>> [x ** 2 for x in range(10)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Here, we've collected the squares of the numbers 0 to 9. To do similar work with a map call, we would probably invent a little function to implement the square operation. Because we won't need this function elsewhere, it would typically be coded inline, with a lambda:

>>> map((lambda x: x**2), range(10))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
This does the same job, and is only a few keystrokes longer than the equivalent list comprehension. For more advanced kinds of expressions, though, list comprehensions will often be less for you to type. The next section shows why.
List comprehensions are more general than shown so far. For instance, you can code an if clause after the for, to add selection logic. List comprehensions with if clauses can be thought of as analogous to the filter built-in of the prior section: they skip sequence items for which the if clause is not true. Here are both schemes picking out even numbers from 0 to 4; like map, filter invents a little lambda function for the test expression. For comparison, the equivalent for loop is shown here as well:
>>> [x for x in range(5) if x % 2 == 0]
[0, 2, 4]
>>> filter((lambda x: x % 2 == 0), range(5))
[0, 2, 4]
>>> res = [ ]
>>> for x in range(5):
...     if x % 2 == 0: res.append(x)
...
>>> res
[0, 2, 4]
All of these use the modulus (remainder of division) operator to detect evens: if there is no remainder after dividing a number by two, it must be even. The filter call is not much longer than the list comprehension here, either. However, the combination of an if clause and an arbitrary expression gives list comprehensions the effect of a filter and a map in a single expression:

>>> [x**2 for x in range(10) if x % 2 == 0]
[0, 4, 16, 36, 64]
This time, we collect the squares of the even numbers from 0 to 9: the for loop skips numbers for which the attached if clause on the right is false, and the expression on the left computes the squares. The equivalent map call would be more work on our part: we would have to combine filter selections with map iteration, making for a noticeably more complex expression:

>>> map((lambda x: x**2), filter((lambda x: x % 2 == 0), range(10)))
[0, 4, 16, 36, 64]
In fact, list comprehensions are even more general still. You may code nested for loops, and each may have an associated if test. The general structure of list comprehensions looks like this:

[ expression for target1 in sequence1 [if condition]
             for target2 in sequence2 [if condition] ...
             for targetN in sequenceN [if condition] ]
When for clauses are nested within a list comprehension, they work like equivalent nested for loop statements. For example, the following:

>>> res = [x+y for x in [0, 1, 2] for y in [100, 200, 300]]
>>> res
[100, 200, 300, 101, 201, 301, 102, 202, 302]

has the same effect as the substantially more verbose equivalent statements:

>>> res = [ ]
>>> for x in [0, 1, 2]:
...     for y in [100, 200, 300]:
...         res.append(x + y)
...
>>> res
[100, 200, 300, 101, 201, 301, 102, 202, 302]
Although list comprehensions construct a list, remember that they can iterate over any sequence type. Here’s a similar bit of code that traverses strings instead of lists of numbers, and so collects concatenation results:
>>> [x+y for x in 'spam' for y in 'SPAM']
['sS', 'sP', 'sA', 'sM', 'pS', 'pP', 'pA', 'pM',
'aS', 'aP', 'aA', 'aM', 'mS', 'mP', 'mA', 'mM']
Finally, here is a much more complex list comprehension. It illustrates the effect of attached if selections on nested for clauses:

>>> [(x, y) for x in range(5) if x % 2 == 0 for y in range(5) if y % 2 == 1]
[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]
This expression permutes the even numbers from 0 to 4 with the odd numbers from 0 to 4. The if clauses filter out items in each sequence iteration. Here's the equivalent statement-based code: nest the list comprehension's for and if clauses inside each other to derive the equivalent statements. The result is longer, but perhaps clearer:

>>> res = [ ]
>>> for x in range(5):
...     if x % 2 == 0:
...         for y in range(5):
...             if y % 2 == 1:
...                 res.append((x, y))
...
>>> res
[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]
The map and filter equivalent would be wildly complex and nested, so we won't even try showing it here. We'll leave its coding as an exercise for Zen masters, ex-LISP programmers, and the criminally insane.
With such generality, list comprehensions can quickly become, well, incomprehensible, especially when nested. Because of that, our advice would normally be to use simple for loops when getting started with Python, and map calls in most other cases (unless they get too complex). The "Keep It Simple" rule applies here, as always; code conciseness is a much less important goal than code readability. However, there is currently a substantial performance advantage to the extra complexity in this case: based on tests run under Python 2.2, map calls are roughly twice as fast as equivalent for loops, and list comprehensions are usually very slightly faster than map. This speed difference owes to the fact that map and list comprehensions run at C language speed inside the interpreter, rather than stepping through Python for loop code within the PVM.
Because for loops make logic more explicit, we recommend them in general, on grounds of simplicity. map, and especially list comprehensions, are worth knowing if your application's speed is an important consideration. In addition, because map and list comprehensions are both expressions, they can show up syntactically in places that for loop statements cannot, such as in the bodies of lambda functions, within list and dictionary literals, and more. Still, you should try to keep your map calls and list comprehensions simple; for more complex tasks, use full statements instead.
It is possible to write functions that may be resumed after they send a value back. Such functions are known as generators because they generate a sequence of values over time. Unlike normal functions that return a value and exit, generator functions automatically suspend and resume their execution and state around the point of value generation. Because of that, they are often a useful alternative to both computing an entire series of values up front, and manually saving and restoring state in classes.
The chief code difference between generator and normal functions is that generators yield a value, rather than returning one: the yield statement suspends the function and sends a value back to the caller, but retains enough state to allow the function to resume from where it left off. This allows functions to produce a series of values over time, rather than computing them all at once and sending them back in something like a list.
Generator functions are bound up with the notion of iterator protocols in Python. In short, functions containing a yield statement are compiled specially as generators; when called, they return a generator object that supports the iterator object interface. Iterator objects, in turn, define a next method, which returns the next item in the iteration, or raises a special exception (StopIteration) to end the iteration. Iterators are fetched with the iter built-in function. Python for loops use this iteration interface protocol to step through a sequence (or sequence generator) if the protocol is supported; if not, for falls back on repeatedly indexing sequences instead.
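The protocol can also be driven by hand. The following sketch steps through any iterable the way a for loop does internally; note that it advances the iterator with the next built-in function, the spelling used by Pythons newer than the release this chapter describes, where the call was written it.next() instead:

```python
def listiter(seq):
    # Step through an iterable manually, as for loops do internally.
    results = []
    it = iter(seq)                     # Fetch an iterator object.
    while True:
        try:
            results.append(next(it))   # Ask for the next item.
        except StopIteration:          # Raised when items run out.
            break
    return results

items = listiter('spam')               # ['s', 'p', 'a', 'm']
```

A for loop hides exactly this machinery: the iter call, the repeated next calls, and the StopIteration catch.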
Generators and iterators are an advanced language feature, so please see the Python library manuals for the full story on generators.
To illustrate the basics, though, the following code defines a generator function that can be used to generate the squares of a series of numbers over time:[4]
>>> def gensquares(N):
...     for i in range(N):
...         yield i ** 2               # Resume here later.
...
This function yields a value, and so returns to its caller, each time through the loop; when it is resumed, its prior state is restored, and control picks up again immediately after the yield statement. For example, when used as the sequence in a for loop, control resumes the function after its yield statement, each time through the loop:

>>> for i in gensquares(5):            # Resume the function.
...     print i, ':',                  # Print last yielded value.
...
0 : 1 : 4 : 9 : 16 :
>>>
To end the generation of values, functions use either a return statement with no value, or simply fall off the end of the function body. If you want to see what is going on inside the for, call the generator function directly:

>>> x = gensquares(10)
>>> x
<generator object at 0x0086C378>
You get back a generator object that supports the iterator protocol: it has a next method, which starts the function, or resumes it from where it last yielded a value:

>>> x.next( )
0
>>> x.next( )
1
>>> x.next( )
4
for loops work with generators in the same way: by calling the next method repeatedly, until an exception is caught. If the object to be iterated over does not support this protocol, for loops instead use the indexing protocol to iterate.
Note that in this example, we could also simply build the list of yielded values all at once:

>>> def buildsquares(n):
...     res = [ ]
...     for i in range(n): res.append(i**2)
...     return res
...
>>> for x in buildsquares(5): print x, ':',
...
0 : 1 : 4 : 9 : 16 :
For that matter, we could simply use any of the for loop, map, or list comprehension techniques:

>>> for x in [n**2 for n in range(5)]:
...     print x, ':',
...
0 : 1 : 4 : 9 : 16 :
>>> for x in map((lambda x: x**2), range(5)):
...     print x, ':',
...
0 : 1 : 4 : 9 : 16 :
However, especially when result lists are large, or when it takes much computation to produce each value, generators allow functions to avoid doing all the work up front. They distribute the time required to produce the series of values among loop iterations. Moreover, for more advanced generator uses, they provide a simpler alternative to manually saving the state between iterations in class objects (more on classes later in Part VI); with generators, function variables are saved and restored automatically.
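To make the difference concrete, here is a small sketch (the helper names build_all and gen_each are invented for illustration). Both produce the same series of values; the eager version computes every result before the caller sees any, while the generator computes each value only when the loop asks for it:

```python
def build_all(n):
    # Eager: all values are computed up front, before any is used.
    res = []
    for i in range(n):
        res.append(i ** 2)
    return res

def gen_each(n):
    # Lazy: each value is computed on demand, one per resumption.
    for i in range(n):
        yield i ** 2

eager = build_all(4)               # [0, 1, 4, 9], built immediately
lazy  = list(gen_each(4))          # Same values, produced on demand
```

If the loop exits early, or each value is expensive to compute, the generator version never pays for the values that are never requested.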
Built-in datatypes are designed to produce iterator objects in response to the iter built-in function. Dictionary iterators, for instance, produce key list items on each iteration:

>>> D = {'a':1, 'b':2, 'c':3}
>>> x = iter(D)
>>> x.next( )
'a'
>>> x.next( )
'c'
In addition, all iteration contexts, including for loops, map calls, and list comprehensions, are in turn designed to automatically call the iter function to see if the protocol is supported. That's why you can loop through a dictionary's keys without calling its keys method, step through lines in a file without calling readlines or xreadlines, and so on:

>>> for key in D:
...     print key, D[key]
...
a 1
c 3
b 2
For file iterators, Python 2.2 simply uses the result of the file xreadlines method; this method returns an object that loads lines from the file on demand, reading in chunks of lines instead of loading the entire file all at once:

>>> for line in open('temp.txt'):
...     print line,
...
Tis but a flesh wound.
It is also possible to implement arbitrary objects with classes that conform to the iterator protocol, and so may be used in for loops and other iteration contexts. Such classes define a special __iter__ method that returns an iterator object (preferred over the __getitem__ indexing method). However, this is well beyond the scope of this chapter; see Part VI for more on classes in general, and Chapter 21 for an example of a class that implements the iterator protocol.
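As a small preview only (this sketch and its names are invented here, and are not the Chapter 21 example), such a class might look like the following. Its __iter__ returns the instance itself, and its next method produces one value per call until it raises StopIteration:

```python
class Squares:
    # A user-defined iterable: each instance yields squares on demand.
    def __init__(self, start, stop):
        self.value = start - 1
        self.stop = stop

    def __iter__(self):                # Called by iter() and for loops
        return self                    # This object is its own iterator.

    def next(self):                    # Iterator protocol, classic spelling
        if self.value == self.stop:
            raise StopIteration        # Signal the end of the iteration.
        self.value += 1
        return self.value ** 2

    __next__ = next                    # Spelling used by later Pythons

squares = [x for x in Squares(1, 5)]   # [1, 4, 9, 16, 25]
```

Because the instance supports the protocol, it works in any iteration context: for loops, list comprehensions, map calls, and so on.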
When you start using functions, you're faced with choices about how to glue components together: for instance, how to decompose a task into functions (cohesion), how functions should communicate (coupling), and so on. Some of this falls into the category of structured analysis and design. Here are a few general hints for Python beginners:
Coupling: use arguments for inputs and return for outputs. Generally, you should strive to make a function independent of things outside of it. Arguments and return statements are often the best ways to isolate external dependencies.
Coupling: use global variables only when truly necessary. Global variables (i.e., names in the enclosing module) are usually a poor way for functions to communicate. They can create dependencies and timing issues that make programs difficult to debug and change.
Coupling: don’t change mutable arguments unless the caller expects it. Functions can also change parts of mutable objects passed in. But as with global variables, this implies lots of coupling between the caller and callee, which can make a function too specific and brittle.
Cohesion: each function should have a single, unified purpose. When designed well, each of your functions should do one thing—something you can summarize in a simple declarative sentence. If that sentence is very broad (e.g., “this function implements my whole program”), or contains lots of conjunctions (e.g., “this function gives employee raises and submits a pizza order”), you might want to think about splitting it into separate and simpler functions. Otherwise, there is no way to reuse the code behind the steps mixed together in such a function.
Size: each function should be relatively small. This naturally follows from the cohesion goal, but if your functions start spanning multiple pages on your display, it’s probably time to split. Especially given that Python code is so concise to begin with, a function that grows long or deeply nested is often a symptom of design problems. Keep it simple, and keep it short.
Figure 14-1 summarizes the ways functions can talk to the outside world; inputs may come from items on the left side, and results may be sent out in any of the forms on the right. Many function designers prefer to use only arguments for inputs and return statements for outputs.
There are plenty of exceptions, including Python’s
OOP support—as you’ll see in Part VI, Python classes depend
on changing a passed-in mutable object. Class functions set
attributes of an automatically passed-in argument called
self
, to change per-object state information
(e.g., self.name='bob'). Moreover, if classes are
not used, global variables are often the best way for functions in
modules to retain state between calls. Such side effects
aren’t dangerous if they’re
expected.
Because Python functions are objects at runtime, you can write programs that process them generically. Function objects can be assigned, passed to other functions, stored in data structures, and so on, as if they were simple numbers or strings. We’ve seen some of these uses in earlier examples. Function objects happen to export a special operation—they can be called by listing arguments in parentheses after a function expression. But functions belong to the same general category as other objects.
For instance, there’s really nothing special about
the name used in a def
statement:
it’s just a variable assigned in the current scope,
as if it had appeared on the left of an =
sign.
After a def
runs, the function name is a reference
to an object; you can reassign that object to other names and call it
through any reference—not just the original name:
>>> def echo(message):       # Name echo assigned to a function object.
...     print message
...
>>> x = echo                 # Now x references it too.
>>> x('Hello world!')        # Call the object by adding ( ).
Hello world!
Since arguments are passed by assigning objects, it’s just as easy to pass functions to other functions, as arguments; the callee may then call the passed-in function just by adding arguments in parentheses:
>>> def indirect(func, arg):
...     func(arg)            # Call the object by adding ( ).
...
>>> indirect(echo, 'Hello jello!')   # Pass function to a function.
Hello jello!
You can even stuff function objects into data structures, as though they were integers or strings. Since Python compound types can contain any sort of object, there’s no special case here either:
>>> schedule = [ (echo, 'Spam!'), (echo, 'Ham!') ]
>>> for (func, arg) in schedule:
...     func(arg)
...
Spam!
Ham!
This code simply steps through the schedule list, calling the
echo
function with one argument each time through.
Python’s lack of type declarations makes for an
incredibly flexible programming language. Notice the tuple unpacking
assignment in the for
loop header, introduced
in
Chapter 8.
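Extending the schedule-list idea a bit, a dictionary of function objects makes a simple dispatch table; the handler names below are invented for illustration, and this is only one of many possible designs:

```python
# Functions stored in a dictionary, keyed by a tag string
def handle_spam(arg):
    return 'spam: ' + arg

def handle_ham(arg):
    return 'ham: ' + arg

dispatch = {'spam': handle_spam, 'ham': handle_ham}

def process(kind, arg):
    func = dispatch[kind]    # Fetch the function object by key...
    return func(arg)         # ...and call it by adding ( ).
```

Because functions are just objects in the dictionary's values, new handlers can be registered at runtime without changing process itself.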
Here are some of the more jagged edges of functions you might not expect. They’re all obscure, and a few have started to fall away from the language completely in recent releases, but most have been known to trip up a new user.
Python classifies names assigned in a function as locals by default;
they live in the function’s scope and exist only
while the function is running. What we didn’t tell
you is that Python detects locals statically, when it compiles the
def
’s code, rather than by
noticing assignments as they happen at runtime. This leads to one of
the most common oddities posted on the Python newsgroup by beginners.
Normally, a name that isn’t assigned in a function is looked up in the enclosing module:
>>> X = 99
>>> def selector( ):         # X used but not assigned
...     print X              # X found in global scope
...
>>> selector( )
99
Here, the X
in the function resolves to the
X
in the module outside. But watch what happens if
you add an assignment to X
after the reference:
>>> def selector( ):
...     print X              # Does not yet exist!
...     X = 88               # X classified as a local name (everywhere)
...                          # Can also happen if "import X", "def X",...
...
>>> selector( )
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in selector
UnboundLocalError: local variable 'X' referenced before assignment
You get an undefined name error, but the reason is subtle. Python
reads and compiles this code when it’s typed
interactively or imported from a module. While compiling, Python sees
the assignment to X
and decides that
X
will be a local name everywhere in the function.
But later, when the function is actually run, the assignment
hasn’t yet happened when the
print
executes, so Python says
you’re using an undefined name. According to its
name rules, it should; local X
is used before
being assigned. In fact, any assignment in a function body makes a
name local. Imports, =, nested defs, nested classes, and so on, are
all susceptible to this behavior.
The problem occurs because assigned names are treated as locals
everywhere in a function, not just after statements where they are
assigned. Really, the previous example is ambiguous at best: did you
mean to print the global X
and then create a local
X
, or is this a genuine programming error? Since
Python treats X
as a local everywhere, it is an
error; but if you really mean to print global X
,
you need to declare it in a global statement:
>>> def selector( ):
...     global X             # Force X to be global (everywhere).
...     print X
...     X = 88
...
>>> selector( )
99
Remember, though, that this means the assignment also changes the
global X
, not a local X
. Within
a function, you can’t use both local and global
versions of the same simple name. If you really meant to print the
global and then set a local of the same name, import the enclosing
module and qualify to get to the global version:
>>> X = 99
>>> def selector( ):
...     import __main__      # Import enclosing module.
...     print __main__.X     # Qualify to get to global version of name.
...     X = 88               # Unqualified X classified as local.
...     print X              # Prints local version of name.
...
>>> selector( )
99
88
Qualification (the .X
part) fetches a value from a
namespace object. The interactive namespace is a module called
__main__
, so __main__.X
reaches the global version of X
. If that
isn’t clear, check out Part V.[5]
Default argument values
are
evaluated and saved when the def
statement is run,
not when the resulting function is called. Internally, Python saves
one object per default argument, attached to the function itself.
That’s usually what you want; because defaults are
evaluated at def
time, it lets you save values
from the enclosing scope if needed. But since defaults retain an
object between calls, you have to be careful about changing mutable
defaults. For instance, the following function uses an empty list as
a default value and then changes it in place each time the function
is called:
>>> def saver(x=[ ]):        # Saves away a list object
...     x.append(1)          # Changes same object each time!
...     print x
...
>>> saver([2])               # Default not used
[2, 1]
>>> saver( )                 # Default used
[1]
>>> saver( )                 # Grows on each call!
[1, 1]
>>> saver( )
[1, 1, 1]
Some see this behavior as a feature—because mutable default arguments retain their state between function calls, they can serve some of the same roles as static local function variables in the C language. In a sense, they work something like global variables, but their names are local to the function, and so will not clash with names elsewhere in a program.
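For instance, here is a minimal sketch of that state-retention idiom (the counter function is hypothetical); the single list object created when the def runs is updated in place on every call:

```python
def counter(state=[0]):      # One list object, created at def time
    state[0] += 1            # Same list updated in place on each call
    return state[0]

counter()                    # Returns 1
counter()                    # Returns 2: state survives between calls
```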
To most observers, though, this seems like a gotcha, especially the first time they run into this. There are better ways to retain state between calls in Python (e.g., using classes, which will be discussed in Part VI).
Moreover, mutable defaults can be tricky to remember, and even
trickier to understand. They depend upon the timing of default object construction. In
the example, there is just one list object for the default
value—the one created when the def
was
executed. You don’t get a new list every time the
function is called, so the list grows with each new append; it is not
reset to empty on each call.
If that’s not the behavior you wish, simply make copies of the default at the start of the function body, or move the default value expression into the function body; as long as the value resides in code that’s actually executed each time the function runs, you’ll get a new object each time through:
>>> def saver(x=None):
...     if x is None:        # No argument passed?
...         x = [ ]          # Run code to make a new list.
...     x.append(1)          # Changes new list object
...     print x
...
>>> saver([2])
[2, 1]
>>> saver( )                 # Doesn't grow here
[1]
>>> saver( )
[1]
By the way, the if
statement in the example could
almost be replaced by the assignment x
= x
or [ ]
, which takes advantage of the
fact that Python’s or
returns one
of its operand objects: if no argument was passed,
x
defaults to None
, so the
or
returns the new empty list on the right.
However, this isn’t exactly the same. When an empty
list is passed in, the or
expression would cause
the function to extend and return a newly created list, rather than
extending and returning the passed-in list like the previous version.
(The expression becomes [ ] or [ ]
, which
evaluates to the new empty list on the right; see Section 9.3 in
Chapter 9 if you don’t recall
why). Real program requirements may call for either behavior.
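The difference can be seen concretely in a short sketch of both variants (the names saver_or and saver_is are invented for this comparison):

```python
def saver_or(x=None):
    x = x or [ ]             # A passed-in empty list is false too!
    x.append(1)
    return x

def saver_is(x=None):
    if x is None:            # Replaces only a missing argument
        x = [ ]
    x.append(1)
    return x

passed = [ ]
saver_or(passed)             # passed untouched: a new list was extended
saver_is(passed)             # passed extended in place to [1]
```

With the or version, a caller's empty list is silently discarded in favor of a new one; with the is None test, only a truly missing argument triggers the default.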
In Python functions, return
(and
yield
) statements are optional. When a function
doesn’t return a value explicitly, the function
exits when control falls off the end of the function body.
Technically, all functions return a value; if you
don’t provide a return, your function returns the
None
object automatically:
>>> def proc(x):
...     print x              # No return is a None return.
...
>>> x = proc('testing 123...')
testing 123...
>>> print x
None
Functions such as this without a return
are
Python’s equivalent of what are called
“procedures” in some languages.
They’re usually invoked as a statement, and the
None
result is ignored, since they do their
business without computing a useful result.
This is worth knowing, because Python won’t tell you
if you try to use the result of a function that
doesn’t return one. For instance, assigning the
result of a list append
method
won’t raise an error, but you’ll
really get back None
, not the modified list:
>>> list = [1, 2, 3]
>>> list = list.append(4)    # append is a "procedure."
>>> print list               # append changes list in-place.
None
As mentioned in Section 11.2 in Chapter 11, such functions do their business as a side effect, and are usually designed to be run as a statement, not an expression.
We’re going to start coding more sophisticated programs in these exercises. Be sure to check solutions in Section B.4, and be sure to start writing your code in module files. You won’t want to retype these exercises from scratch if you make a mistake.
The basics. At the Python interactive prompt, write a function that prints its single argument to the screen and call it interactively, passing a variety of object types: string, integer, list, dictionary. Then try calling it without passing any argument. What happens? What happens when you pass two arguments?
Arguments. Write a function called
adder
in a Python module file. The function
adder
should accept two arguments and return the
sum (or concatenation) of its two arguments. Then add code at the
bottom of the file to call the function with a variety of object
types (two strings, two lists, two floating points), and run this
file as a script from the system command line. Do you have to print
the call statement results to see results on your screen?
varargs. Generalize the adder
function you wrote in the last exercise to compute the sum of an
arbitrary number of arguments, and change the calls to pass more or
fewer than two. What type is the return value sum? (Hints: a slice
such as S[:0]
returns an empty sequence of the
same type as S, and the type
built-in function can
test types; but see the min
examples in Chapter 13 for a simpler approach.) What happens if you
pass in arguments of different types? What about passing in
dictionaries?
Keywords. Change the adder
function from Exercise 2 to accept and add three arguments:
def adder(good, bad, ugly)
. Now, provide default
values for each argument and experiment with calling the function
interactively. Try passing one, two, three, and four arguments. Then,
try passing keyword arguments. Does the call adder(ugly=1,
good=2)
work? Why? Finally, generalize the new adder to
accept and add an arbitrary number of keyword arguments, much like
Exercise 3, but you’ll need to iterate over a
dictionary, not a tuple. (Hint: the dict.keys( )
method returns a list you can step through with a
for
or while
.)
Write a function called copyDict(dict)
that copies
its dictionary argument. It should return a new dictionary with all
the items in its argument. Use the dictionary keys
method to iterate (or, in Python 2.2, step over a
dictionary’s keys without calling
keys
). Copying sequences is easy
(X[:]
makes a top-level copy); does this work for
dictionaries too?
Write a function called addDict(dict1, dict2)
that
computes the union of two dictionaries. It should return a new
dictionary, with all the items in both its arguments (assumed to be
dictionaries). If the same key appears in both arguments, feel free
to pick a value from either. Test your function by writing it in a
file and running the file as a script. What happens if you pass lists
instead of dictionaries? How could you generalize your function to
handle this case too? (Hint: see the type
built-in
function used earlier.) Does the order of arguments passed matter?
More argument matching examples. First, define the following six functions (either interactively, or in a module file that can be imported):
def f1(a, b): print a, b               # Normal args
def f2(a, *b): print a, b              # Positional varargs
def f3(a, **b): print a, b             # Keyword varargs
def f4(a, *b, **c): print a, b, c      # Mixed modes
def f5(a, b=2, c=3): print a, b, c     # Defaults
def f6(a, b=2, *c): print a, b, c      # Defaults and positional varargs
Now, test the following calls interactively and try to explain each result; in some cases, you’ll probably need to fall back on the matching algorithm shown in Chapter 13. Do you think mixing matching modes is a good idea in general? Can you think of cases where it would be useful?
>>> f1(1, 2)
>>> f1(b=2, a=1)
>>> f2(1, 2, 3)
>>> f3(1, x=2, y=3)
>>> f4(1, 2, 3, x=2, y=3)
>>> f5(1)
>>> f5(1, 4)
>>> f6(1)
>>> f6(1, 3, 4)
Primes revisited. Recall the code snippet we saw in Chapter 10, which simplistically determines if a positive integer is prime:
x = y / 2                        # For some y > 1
while x > 1:
    if y % x == 0:               # Remainder
        print y, 'has factor', x
        break                    # Skip else
    x = x-1
else:                            # Normal exit
    print y, 'is prime'
Package this code as a reusable function in a module file, and add
some calls to your function at the bottom of your file. While
you’re at it, replace the first
line’s /
operator with
//
, to make it handle floating point numbers too,
and be immune to the “true”
division change planned for the /
operator in
Python 3.0 as described in Chapter 4. What can you
do about negatives and 0 and 1? How about speeding this up? Your
outputs should look something like this:
13 is prime
13.0 is prime
15 has factor 5
15.0 has factor 5.0
List comprehensions. Write code to build a new
list containing the square roots of all the numbers in this list:
[2, 4, 9, 16, 25]
. Code this as a
for
loop first, then as a map
call, and finally as a list comprehension. Use the
sqrt
function in the built-in
math
module to do the calculation (i.e., import
math
, and say math.sqrt(x)
). Of
the three, which approach do you like best?
[1] The name “lambda” seems to scare people more than it should. It comes from Lisp, which got it from the lambda calculus, which is a form of symbolic logic. In Python, though, it’s really just a keyword that introduces the expression syntactically.
[2] As we write this, a
debate rages on comp.lang.python about adding a more direct ternary
conditional expression to Python; see future release notes for new
developments on this front. Note that you can almost achieve the same
effect as the and/or
with an expression
((falsevalue,truevalue)[condition]
), except that
this does not short-circuit (both possible results are evaluated
every time), and the condition must be 0 or 1.
[3] Be careful
not to confuse apply
with map
,
the topic of the next section. apply
runs a single
function call, passing arguments to the function object just once.
map
calls a function many times instead, for each
item in a sequence.
[4] Generators are available
in Python releases after version 2.2; in 2.2, they must be enabled
with a special import statement of the form: from __future__ import generators
. (See Chapter 18 for
more on this statement form.) Iterators were already available in
2.2, largely because the underlying protocol did not require the new,
non-backward-compatible keyword, yield
.
[5] Python has improved on this story somewhat, by issuing the more specific “unbound local” error message for this case shown in the example listing (it used to simply raise a generic name error); this gotcha is still present in general, though.