Chapter 14. Advanced Function Topics

This chapter introduces a collection of more advanced function-related topics: the lambda expression, functional programming tools such as map and list comprehensions, generators, and more. Part of the art of using functions lies in the interfaces between them, so we will also explore some general function design principles here. Because this is the last chapter in Part IV, we’ll close with the usual sets of gotchas and exercises to help you start coding the ideas you’ve read about.

Anonymous Functions: lambda

So far, we’ve seen what it takes to write our own functions in Python. The next sections turn to a few more advanced function-related ideas. Most of these are optional features, but they can simplify your coding tasks when used well.

Besides the def statement, Python also provides an expression form that generates function objects. Because of its similarity to a tool in the LISP language, it’s called lambda. [1] Like def, this expression creates a function to be called later, but returns it instead of assigning it to a name. This is why lambdas are sometimes known as anonymous (i.e., unnamed) functions. In practice, they are often used as a way to inline a function definition, or defer execution of a piece of code.

lambda Expressions

The lambda’s general form is the keyword lambda, followed by one or more arguments (exactly like the arguments list you enclose in parentheses in a def header), followed by an expression after a colon:

lambda argument1, argument2,... argumentN : expression using arguments

Function objects returned by running lambda expressions work exactly the same as those created and assigned by def. But the lambda has a few differences that make it useful in specialized roles:

  • lambda is an expression, not a statement. Because of this, a lambda can appear in places a def is not allowed by Python’s syntax—inside a list literal or function call, for example. As an expression, the lambda returns a value (a new function), which can optionally be assigned to a name; the def statement always assigns the new function to the name in the header, instead of returning it as a result.

  • lambda bodies are a single expression, not a block of statements. The lambda’s body is similar to what you’d put in a def body’s return statement; simply type the result as a naked expression, instead of explicitly returning it. Because it is limited to an expression, lambda is less general than a def; you can only squeeze so much logic into a lambda body without using statements such as if (read on for more on this). This is by design, to limit program nesting: lambda is designed for coding simple functions, and def handles larger tasks.

Apart from those distinctions, the def and lambda do the same sort of work. For instance, we’ve seen how to make functions with def statements:

>>> def func(x, y, z): return x + y + z
...
>>> func(2, 3, 4)
9

But you can achieve the same effect with a lambda expression, by explicitly assigning its result to a name through which you can later call the function:

>>> f = lambda x, y, z: x + y + z
>>> f(2, 3, 4)
9

Here, f is assigned the function object the lambda expression creates; this is how def works too, but its assignment is automatic. Defaults work on lambda arguments, just like the def:

>>> x = (lambda a="fee", b="fie", c="foe": a + b + c)
>>> x("wee")
'weefiefoe'

The code in a lambda body also follows the same scope lookup rules as code inside a def: lambda expressions introduce a local scope much like a nested def, which automatically sees names in enclosing functions, the module, and the built-in scope (via the LEGB rule):

>>> def knights(  ):
...     title = 'Sir'
...     action = (lambda x: title + ' ' + x)   # Title in enclosing def
...     return action                          # Return a function.
...
>>> act = knights(  )
>>> act('robin')
'Sir robin'

Prior to Release 2.2, the value for name title would typically be passed in as a default argument value instead; flip back to the scopes coverage of Chapter 13 if you’ve forgotten why.

Why lambda?

Generally speaking, lambdas come in handy as a sort of function shorthand that allows you to embed a function’s definition within the code that uses it. They are entirely optional (you can always use def instead), but tend to be a simpler coding construct in scenarios when you just need to embed small bits of executable code.

For instance, we’ll see later that callback handlers are frequently coded as in-line lambda expressions embedded directly in a registration call’s arguments list, instead of being defined with a def elsewhere in a file and referenced by name (see the callbacks sidebar for an example).
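
For a concrete flavor of this pattern, consider the following hypothetical sketch; the register function and the event name here are invented for illustration, and stand in for a real library’s registration API:

```python
# Hypothetical event framework, invented for illustration only.
handlers = {}

def register(event, handler):
    handlers[event] = handler            # Save the callback to run later.

# The handler is created inline, right in the registration call's
# arguments list, instead of with a def elsewhere in the file.
register('click', lambda: 'button pressed')

# Later, the framework fires the event by calling the saved function.
result = handlers['click']()
```

Because the handler is used in only this one place, embedding it as a lambda keeps its definition next to the code that registers it.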

lambdas are also commonly used to code jump tables—lists or dictionaries of actions to be performed on demand. For example:

L = [(lambda x: x**2), (lambda x: x**3), (lambda x: x**4)]

for f in L:
    print f(2)     # Prints 4, 8, 16

print L[0](3)      # Prints 9

The lambda expression is most useful as a shorthand for def, when you need to stuff small pieces of executable code in places where statements are illegal syntactically. This code snippet, for example, builds up a list of three functions by embedding lambda expressions inside a list literal; def won’t work inside a list literal like this, because it is a statement, not an expression.

You can do the same sort of thing with dictionaries and other data structures in Python, to build up action tables:

>>> key = 'got'
>>> {'already': (lambda: 2 + 2),
...  'got':     (lambda: 2 * 4),
...  'one':     (lambda: 2 ** 6)
... }[key](  )
8

Here, when Python makes the dictionary, each of the nested lambdas generates and leaves behind a function to be called later; indexing by key fetches one of those functions, and parentheses force the fetched function to be called. When coded this way, a dictionary becomes a more general multiway branching tool than what we could show you in Chapter 9’s coverage of if statements.

To make this work without lambda, you’d need to instead code three def statements somewhere else in your file, and outside the dictionary in which the functions are to be used:

def f1(  ): ...
def f2(  ): ...
def f3(  ): ...
...
key = ...
{'already': f1, 'got': f2, 'one': f3}[key](  )

This works too, and avoids lambdas; but your defs may be arbitrarily far away in your file, even if they are just little bits of code. The code proximity that lambdas provide is especially useful for functions that will only be used in a single context—if the three functions here are not useful anywhere else, it makes sense to embed their definition within the dictionary as lambdas.

lambdas also come in handy in function argument lists, as a way to inline temporary function definitions not used anywhere else in your program; we’ll meet examples of such other uses later in this chapter when we study map.

How (Not) to Obfuscate Your Python Code

The fact that the body of a lambda has to be a single expression (not statements) would seem to place severe limits on how much logic you can pack into a lambda. If you know what you’re doing, though, you can code almost every statement in Python as an expression-based equivalent.

For example, if you want to print from the body of a lambda function, simply say sys.stdout.write(str(x)+' '), instead of print x. (See Chapter 8 if you’ve forgotten why.) Similarly, it’s possible to emulate an if statement by combining Boolean operators in expressions. The expression:

((a and b) or c)

is roughly equivalent to:

if a:
    b
else:
    c

and is almost Python’s equivalent to the C language’s a?b:c ternary operator. (To understand why, you need to have read the discussion of Boolean operators in Chapter 9.) In short, Python’s and and or short-circuit (they don’t evaluate the right side, if the left determines the result), and always return either the value on the left, or the value on the right. In code:

>>> t, f = 1, 0
>>> x, y = 88, 99

>>> a = (t and x) or y           # If true, x
>>> a
88
>>> a = (f and x) or y           # If false, y
>>> a
99

This works, but only as long as you can be sure that x will not be false too (otherwise, you will always get y). To truly emulate an if statement in an expression, you must wrap the two possible results so as to make them non-false, and then index to pull out the result at the end:[2]

>>> ((t and [x]) or [y])[0]     # If true, x
88
>>> ((f and [x]) or [y])[0]     # If false, y
99
>>> (t and f) or y              # Fails: f is false, skipped
99
>>> ((t and [f]) or [y])[0]     # Works: f returned anyhow
0

Once you’ve muddled through typing this a few times, you’ll probably want to wrap it for reuse:

>>> def ifelse(a, b, c): return ((a and [b]) or [c])[0]
...
>>> ifelse(1, 'spam', 'ni')
'spam'
>>> ifelse(0, 'spam', 'ni')
'ni'

Of course, you can get the same results by using an if statement here instead:

def ifelse(a, b, c):
   if a: return b
   else: return c

But expressions like these can be placed inside a lambda, to implement selection logic:

>>> lower = (lambda x, y: (((x < y) and [x]) or [y])[0])
>>> lower('bb', 'aa')
'aa'
>>> lower('aa', 'bb')
'aa'

Finally, if you need to perform loops within a lambda, you can also embed things like map calls and list comprehension expressions—tools we’ll meet later in this section:

>>> import sys
>>> showall = (lambda x: map(sys.stdout.write, x))
>>> t = showall(['spam\n', 'toast\n', 'eggs\n'])
spam
toast
eggs

But now that we’ve shown you these tricks, we need to ask you to please use them only as a last resort. Without due care, they can lead to unreadable (a.k.a. obfuscated) Python code. In general, simple is better than complex, explicit is better than implicit, and full statements are better than arcane expressions. On the other hand, you may find these useful, when taken in moderation.

Nested lambdas and Scopes

lambdas are the main beneficiaries of nested function scope lookup (the E in the LEGB rule). In the following, for example, the lambda appears inside a def—the typical case—and so can access the value that name x had in the enclosing function’s scope, at the time that the enclosing function was called:

>>> def action(x):
...     return (lambda y: x + y)        # Make, return function.

>>> act = action(99)
>>> act
<function <lambda> at 0x00A16A88>
>>> act(2)
101

What we didn’t illustrate in the prior discussion is that a lambda also has access to the names in any enclosing lambda. This case is somewhat obscure, but imagine if we recoded the prior def with a lambda:

>>> action = (lambda x: (lambda y: x + y))
>>> act = action(99)
>>> act(3)
102
>>> ((lambda x: (lambda y: x + y))(99))(4)
103

Here, the nested lambda structure makes a function that makes a function when called. In both cases, the nested lambda’s code has access to variable x in the enclosing lambda. This works, but it’s fairly convoluted code; in the interest of readability, nested lambdas are generally best avoided.

Applying Functions to Arguments

Some programs need to call arbitrary functions in a generic fashion, without knowing their names or arguments ahead of time. We’ll see examples of where this can be useful later, but by way of introduction, both the apply built-in function and the special call syntax do the job.

The apply Built-in

You can call generated functions by passing them as arguments to apply, along with a tuple of arguments:

>>> def func(x, y, z): return x + y + z
...
>>> apply(func, (2, 3, 4))
9
>>> f = lambda x, y, z: x + y + z
>>> apply(f, (2, 3, 4))
9

The apply function simply calls the function passed in as its first argument, matching the passed-in arguments tuple to the function’s expected arguments. Since the arguments list is passed in as a tuple (i.e., a data structure), it can be built at runtime by a program.[3]

The real power of apply is that it doesn’t need to know how many arguments a function is being called with; for example, you can use if logic to select from a set of functions and argument lists, and use apply to call any of them:

if <test>:
    action, args = func1, (1,)
else:
    action, args = func2, (1, 2, 3)
...
apply(action, args)

More generally, apply is useful any time you cannot predict the arguments list ahead of time. If your user selects an arbitrary function via a user interface, for instance, you may be unable to hardcode a function call when writing your script. Simply build up the arguments list with tuple operations and call indirectly through apply:

>>> args = (2,3) + (4,)
>>> args
(2, 3, 4)
>>> apply(func, args)
9

Passing keyword arguments

The apply call also supports an optional third argument, where you can pass in a dictionary that represents keyword arguments to be passed to the function:

>>> def echo(*args, **kwargs): print args, kwargs

>>> echo(1, 2, a=3, b=4)
(1, 2) {'a': 3, 'b': 4}

This allows us to construct both positional and keyword arguments, at runtime:

>>> pargs = (1, 2)
>>> kargs = {'a':3, 'b':4}
>>> apply(echo, pargs, kargs)
(1, 2) {'a': 3, 'b': 4}

Apply-Like Call Syntax

Python also allows you to accomplish the same effect as an apply call with special syntax at the call, which mirrors the arbitrary arguments syntax in def headers that we met in Chapter 13. For example, assuming the names of this example are still as assigned earlier:

>>> apply(func, args)              # Traditional: tuple
9
>>> func(*args)                    # New apply-like syntax
9
>>> echo(*pargs, **kargs)          # Keyword dictionaries too
(1, 2) {'a': 3, 'b': 4}

>>> echo(0, *pargs, **kargs)       # Normal, *tuple, **dictionary
(0, 1, 2) {'a': 3, 'b': 4}

This special call syntax is newer than the apply function. There is no obvious advantage of the syntax over an explicit apply call, apart from its symmetry with def headers, and slightly fewer keystrokes.

Mapping Functions Over Sequences

One of the more common things programs do with lists and other sequences is to apply an operation to each item, and collect the results. For instance, updating all the counters in a list can be done easily with a for loop:

>>> counters = [1, 2, 3, 4]
>>>
>>> updated = [  ]
>>> for x in counters:
...     updated.append(x + 10)              # Add 10 to each item.
...
>>> updated
[11, 12, 13, 14]

Because this is such a common operation, Python provides a built-in that does most of the work for you. The map function applies a passed-in function to each item in a sequence object, and returns a list containing all the function call results. For example:

>>> def inc(x): return x + 10               # function to be run
...
>>> map(inc, counters)                      # Collect results.
[11, 12, 13, 14]

We introduced map as a parallel loop traversal tool in Chapter 10, where we passed in None for the function argument to pair items up. Here, we make better use of it by passing in a real function to be applied to each item in the list—map calls inc on each list item, and collects all the return values into a list.

Since map expects a function to be passed in, it also happens to be one of the places where lambdas commonly appear:

>>> map((lambda x: x + 3), counters)        # Function expression
[4, 5, 6, 7]

Here, the function adds 3 to each item in the counters list; since this function isn’t needed elsewhere, it was written inline as a lambda. Because such uses of map are equivalent to for loops, with a little extra code, you could always code a general mapping utility yourself:

>>> def mymap(func, seq):
...     res = [  ]
...     for x in seq: res.append(func(x))
...     return res
...
>>> map(inc, [1, 2, 3])
[11, 12, 13]
>>> mymap(inc, [1, 2, 3])
[11, 12, 13]

However, since map is a built-in, it’s always available, always works the same way, and has some performance benefits (in short, it’s faster than an equivalent for loop). Moreover, map can be used in more advanced ways than shown; for instance, given multiple sequence arguments, it sends items taken from the sequences in parallel as distinct arguments to the function:

>>> pow(3, 4)
81
>>> map(pow, [1, 2, 3], [2, 3, 4])       # 1**2, 2**3, 3**4
[1, 8, 81]

Here, the pow function takes two arguments on each call—one from each sequence passed to map. Although we could simulate this generality too, there is no obvious point in doing so, when map is built-in and quick.

Functional Programming Tools

The map function is the simplest representative of a class of Python built-ins used for functional programming—which mostly just means tools that apply functions to sequences. Its relatives filter out items based on a test function (filter), and apply functions to pairs of items and running results (reduce). For example, the following filter call picks out items in a sequence greater than zero:

>>> range(-5, 5)
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]

>>> filter((lambda x: x > 0), range(-5, 5))
[1, 2, 3, 4]

Items in the sequence for which the function returns true are added to the result list. Like map, it’s roughly equivalent to a for loop, but is built-in and fast:

>>> res = [  ]
>>> for x in range(-5, 5):
...     if x > 0:
...         res.append(x)
...
>>> res
[1, 2, 3, 4]

Here are two reduce calls computing the sum and product of items in a list:

>>> reduce((lambda x, y: x + y), [1, 2, 3, 4])
10
>>> reduce((lambda x, y: x * y), [1, 2, 3, 4])
24

At each step, reduce passes the current sum or product, along with the next item from the list, to the passed-in lambda function. By default, the first item in the sequence initializes the starting value. Here’s the for loop equivalent to the first of these, with the addition hardcoded inside the loop:

>>> L = [1,2,3,4]
>>> res = L[0]
>>> for x in L[1:]:
...     res = res + x
...
>>> res
10
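
As with the mymap function coded earlier, you could emulate reduce yourself with an explicit loop too; here is a sketch with the function passed in as a parameter, rather than hardcoded:

```python
def myreduce(func, seq):
    res = seq[0]                  # First item initializes the result.
    for x in seq[1:]:
        res = func(res, x)        # Fold each remaining item into the result.
    return res

# Same results as the built-in reduce calls shown earlier.
sum_result  = myreduce((lambda x, y: x + y), [1, 2, 3, 4])   # 10
prod_result = myreduce((lambda x, y: x * y), [1, 2, 3, 4])   # 24
```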

If this has sparked your interest, also see the built-in operator module, which provides functions that correspond to built-in expressions, and so comes in handy for some uses of functional tools:

>>> import operator
>>> reduce(operator.add, [2, 4, 6])      # function-based +
12
>>> reduce((lambda x, y: x + y), [2, 4, 6])
12

Some observers might also extend the functional programming toolset in Python to include lambda, apply, and list comprehensions (discussed in the next section).

List Comprehensions

Because mapping operations over sequences and collecting results is such a common task in Python coding, Python 2.0 sprouted a new feature—the list comprehension expression—that can make this even simpler than using map and filter. Technically, this feature is not tied to functions, but we’ve saved it for this point in the book, because it is usually best understood by analogy to function-based alternatives.

List Comprehension Basics

Let’s work through an example that demonstrates the basics. Python’s built-in ord function returns the integer ASCII code of a single character:

>>> ord('s')
115

The chr built-in is the converse—it returns the character for an ASCII code integer. Now, suppose we wish to collect the ASCII codes of all characters in an entire string. Perhaps the most straightforward approach is to use a simple for loop, and append results to a list:

>>> res = [  ]
>>> for x in 'spam': 
...     res.append(ord(x))
...
>>> res
[115, 112, 97, 109]

Now that we know about map, we can achieve similar results with a single function call without having to manage list construction in the code:

>>> res = map(ord, 'spam')            # Apply func to seq.
>>> res
[115, 112, 97, 109]

But as of Python 2.0, we get the same results from a list comprehension expression:

>>> res = [ord(x) for x in 'spam']    # Apply expr to seq.
>>> res
[115, 112, 97, 109]

List comprehensions collect the results of applying an arbitrary expression to a sequence of values, and return them in a new list. Syntactically, list comprehensions are enclosed in square brackets (to remind you that they construct a list). In their simple form, within the brackets, you code an expression that names a variable, followed by what looks like a for loop header that names the same variable. Python collects the expression’s results, for each iteration of the implied loop.

The effect of the example so far is similar to both the manual for loop, and the map call. List comprehensions become more handy, though, when we wish to apply an arbitrary expression to a sequence:

>>> [x ** 2 for x in range(10)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Here, we’ve collected the squares of the numbers 0 to 9. To do similar work with a map call, we would probably invent a little function to implement the square operation. Because we won’t need this function elsewhere, it would typically be coded inline, with a lambda:

>>> map((lambda x: x**2), range(10))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

This does the same job, and is only a few keystrokes longer than the equivalent list comprehension. For more advanced kinds of expressions, though, list comprehensions will often require noticeably less typing. The next section shows why.

Adding Tests and Nested Loops

List comprehensions are more general than shown so far. For instance, you can code an if clause after the for, to add selection logic. List comprehensions with if clauses can be thought of as analogous to the filter built-in of the prior section—they skip sequence items for which the if clause is not true. Here are both schemes picking out even numbers from 0 to 4; as with map, the filter version codes a little lambda function for the test expression. For comparison, the equivalent for loop is shown here as well:

>>> [x for x in range(5) if x % 2 == 0]
[0, 2, 4]

>>> filter((lambda x: x % 2 == 0), range(5))
[0, 2, 4]

>>> res = [  ]
>>> for x in range(5):
...     if x % 2 == 0: res.append(x)
...        
>>> res
[0, 2, 4]

All of these use modulus (the remainder of division) to detect evens: if there is no remainder after dividing a number by two, it must be even. The filter call is not much longer than the list comprehension here either. However, the combination of an if clause and an arbitrary expression gives list comprehensions the effect of a filter and a map, in a single expression:

>>> [x**2 for x in range(10) if x % 2 == 0]
[0, 4, 16, 36, 64]

This time, we collect the squares of the even numbers from 0 to 9—the for loop skips numbers for which the attached if clause on the right is false, and the expression on the left computes squares. The equivalent map call would be more work on our part: we would have to combine filter selections with map iteration, making for a noticeably more complex expression:

>>> map((lambda x: x**2), filter((lambda x: x % 2 == 0), range(10)))
[0, 4, 16, 36, 64]

In fact, list comprehensions are even more general still. You may code nested for loops, and each may have an associated if test. The general structure of list comprehensions looks like this:

[ expression for target1 in sequence1 [if condition]
             for target2 in sequence2 [if condition] ...
             for targetN in sequenceN [if condition] ]

When for clauses are nested within a list comprehension, they work like equivalent nested for loop statements. For example, the following:

>>> res = [x+y for x in [0,1,2] for y in [100,200,300]]
>>> res
[100, 200, 300, 101, 201, 301, 102, 202, 302]

has the same effect as the substantially more verbose equivalent statements:

>>> res = [  ]
>>> for x in [0,1,2]:
...     for y in [100,200,300]:
...         res.append(x+y)
...
>>> res
[100, 200, 300, 101, 201, 301, 102, 202, 302]

Although list comprehensions construct a list, remember that they can iterate over any sequence type. Here’s a similar bit of code that traverses strings instead of lists of numbers, and so collects concatenation results:

>>> [x+y for x in 'spam' for y in 'SPAM']
['sS', 'sP', 'sA', 'sM', 'pS', 'pP', 'pA', 'pM', 
'aS', 'aP', 'aA', 'aM', 'mS', 'mP', 'mA', 'mM']

Finally, here is a much more complex list comprehension. It illustrates the effect of attached if selections on nested for clauses:

>>> [(x,y) for x in range(5) if x%2 == 0 for y in range(5) if y%2 == 1]
[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]

This expression pairs even numbers from 0 to 4 with odd numbers from 0 to 4. The if clauses filter out items in each sequence iteration. Here’s the equivalent statement-based code: nest the list comprehension’s for and if clauses inside each other to derive the equivalent statements. The result is longer, but perhaps clearer:

>>> res = [  ]
>>> for x in range(5):
...     if x % 2 == 0:
...         for y in range(5):
...             if y % 2 == 1:
...                 res.append((x, y))
...
>>> res
[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]

The map and filter equivalent would be wildly complex and nested, so we won’t even try showing it here. We’ll leave its coding as an exercise for Zen masters, ex-LISP programmers, and the criminally insane.

Comprehending List Comprehensions

With such generality, list comprehensions can quickly become, well, incomprehensible, especially when nested. Because of that, our advice would normally be to use simple for loops when getting started with Python, and map calls in most other cases (unless they get too complex). The “Keep It Simple” rule applies here, as always; code conciseness is much less important a goal than code readability.

However, there is currently a substantial performance advantage to the extra complexity in this case: based on tests run under Python 2.2, map calls are roughly twice as fast as equivalent for loops, and list comprehensions are usually very slightly faster than map. This speed difference owes to the fact that map and list comprehensions run at C language speed inside the interpreter, rather than stepping through Python for loop code within the PVM.

Because for loops make logic more explicit, we recommend them in general on grounds of simplicity. map, and especially list comprehensions, are worth knowing if your application’s speed is an important consideration. In addition, because map and list comprehensions are both expressions, they can show up syntactically in places that for loop statements cannot, such as in the bodies of lambda functions, within list and dictionary literals, and more. Still, you should try to keep your map calls and list comprehensions simple; for more complex tasks, use full statements instead.
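
If speed matters in your program, it is best to measure rather than guess; the following rough timing sketch compares the three forms (the list call around map is needed only in later Python releases, where map returns an iterator, and both the absolute numbers and the relative rankings vary by Python version and machine):

```python
import time

def with_for(data):
    res = []
    for x in data:                              # Explicit loop and append
        res.append(x + 10)
    return res

def with_map(data):
    return list(map((lambda x: x + 10), data))  # list() needed in 3.X only

def with_comp(data):
    return [x + 10 for x in data]               # List comprehension

data = list(range(100000))
timings = {}
for func in (with_for, with_map, with_comp):
    start = time.time()
    result = func(data)
    timings[func.__name__] = time.time() - start
    assert result[:3] == [10, 11, 12]           # All three agree on results.
```

Whatever the numbers turn out to be on your machine, the point stands: the three forms compute identical results, so the choice between them is one of readability versus measured speed.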

Generators and Iterators

It is possible to write functions that may be resumed after they send a value back. Such functions are known as generators because they generate a sequence of values over time. Unlike normal functions that return a value and exit, generator functions automatically suspend and resume their execution and state around the point of value generation. Because of that, they are often a useful alternative to both computing an entire series of values up front, and manually saving and restoring state in classes.

The chief code difference between generator and normal functions is that generators yield a value, rather than returning one—the yield statement suspends the function and sends a value back to the caller, but retains enough state to allow the function to resume from where it left off. This allows functions to produce a series of values over time, rather than computing them all at once, and sending them back in something like a list.

Generator functions are bound up with the notion of iterator protocols in Python. In short, functions containing a yield statement are compiled specially as generators; when called, they return a generator object that supports the iterator object interface.

Iterator objects, in turn, define a next method, which returns the next item in the iteration, or raises a special exception (StopIteration) to end the iteration. Iterators are fetched with the iter built-in function. Python for loops use this iteration interface protocol to step through a sequence (or sequence generator), if the protocol is supported; if not, for falls back on repeatedly indexing sequences instead.
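
To make the protocol concrete, here is a sketch of the steps a for loop performs, spelled out by hand (this uses the next built-in function of later Python releases; in 2.2, you call the iterator’s next method directly):

```python
seq = [1, 2, 3]
it = iter(seq)                     # Fetch an iterator from the sequence.
results = []
while True:
    try:
        results.append(next(it))   # Same as it.next() in Python 2.
    except StopIteration:          # Raised when the iterator is exhausted.
        break
# results is now [1, 2, 3], just as a for loop would produce.
```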

Generator Example

Generators and iterators are an advanced language feature, so please see the Python library manuals for the full story on generators.

To illustrate the basics, though, the following code defines a generator function that can be used to generate the squares of a series of numbers over time:[4]

>>> def gensquares(N):
...     for i in range(N):
...         yield i ** 2               # Resume here later.

This function yields a value, and so returns to its caller, each time through the loop; when it is resumed, its prior state is restored, and control picks up again immediately after the yield statement. For example, when used as the sequence in a for loop, control will resume the function after its yield statement, each time through the loop:

>>> for i in gensquares(5):        # Resume the function. 
...     print i, ':',              # Print last yielded value.
...
0 : 1 : 4 : 9 : 16 :
>>>

To end the generation of values, functions use either a return statement with no value, or simply fall off the end of the function body. If you want to see what is going on inside the for, call the generator function directly:

>>> x = gensquares(10)
>>> x
<generator object at 0x0086C378>

You get back a generator object that supports the iterator protocol—it has a next method, which starts the function, or resumes it from where it last yielded a value:

>>> x.next(  )
0
>>> x.next(  )
1
>>> x.next(  )
4

for loops work with generators in the same way—by calling the next method repeatedly, until the StopIteration exception is caught. If the object to be iterated over does not support this protocol, for loops instead use the indexing protocol to iterate.

Note that in this example, we could also simply build the list of yielded values all at once:

>>> def buildsquares(n):
...     res = [  ]
...     for i in range(n): res.append(i**2)
...     return res
...
>>> for x in buildsquares(5): print x, ':',
...
0 : 1 : 4 : 9 : 16 :

For that matter, we could simply use any of the for loop, map, or list comprehension techniques:

>>> for x in [n**2 for n in range(5)]:
...     print x, ':',
...
0 : 1 : 4 : 9 : 16 :

>>> for x in map((lambda x:x**2), range(5)):
...     print x, ':',
...
0 : 1 : 4 : 9 : 16 :

However, especially when result lists are large, or when it takes much computation to produce each value, generators allow functions to avoid doing all the work up front. They distribute the time required to produce the series of values among loop iterations. Moreover, for more advanced generator uses, they provide a simpler alternative to manually saving the state between iterations in class objects (more on classes later in Part VI); with generators, function variables are saved and restored automatically.

Iterators and Built-in Types

Built-in datatypes are designed to produce iterator objects in response to the iter built-in function. Dictionary iterators, for instance, produce one dictionary key on each iteration:

>>> D = {'a':1, 'b':2, 'c':3}
>>> x = iter(D)
>>> x.next(  )
'a'
>>> x.next(  )
'c'

In addition, all iteration contexts, including for loops, map calls, and list comprehensions, are in turn designed to automatically call the iter function to see if the protocol is supported. That’s why you can loop through a dictionary’s keys without calling its keys method, step through lines in a file without calling readlines or xreadlines, and so on:

>>> for key in D: 
...     print key, D[key]
...
a 1
c 3
b 2

For file iterators, Python 2.2 simply uses the result of the file xreadlines method; this method returns an object that loads lines from the file on demand, and reads by chunks of lines instead of loading the entire file all at once:

>>> for line in open('temp.txt'):
...     print line,
...
Tis but
a flesh wound.

It is also possible to implement arbitrary objects with classes, which conform to the iterator protocol, and so may be used in for loops and other iteration contexts. Such classes define a special __iter__ method that returns an iterator object (preferred over the __getitem__ indexing method). However, this is well beyond the scope of this chapter; see Part VI for more on classes in general, and Chapter 21 for an example of a class that implements the iterator protocol.
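
As a minimal, hypothetical preview of the idea (Chapter 21 develops a fuller example; the __next__ method name used here is the spelling of later Python releases, where 2.X names this method next):

```python
class Squares:
    def __init__(self, stop):
        self.value = -1
        self.stop = stop
    def __iter__(self):            # Called by iter(); returns the iterator.
        return self
    def __next__(self):            # Named 'next' in Python 2.X.
        if self.value == self.stop:
            raise StopIteration    # Ends the iteration.
        self.value += 1
        return self.value ** 2

# Instances can now be used in any iteration context:
squares = [x for x in Squares(3)]   # [0, 1, 4, 9]
```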

Function Design Concepts

When you start using functions, you’re faced with choices about how to glue components together—for instance, how to decompose a task into functions (cohesion), how functions should communicate (coupling), and so on. Some of this falls into the category of structured analysis and design. Here are a few general hints for Python beginners:

  • Coupling: use arguments for inputs and return for outputs. Generally, you should strive to make a function independent of things outside of it. Arguments and return statements are often the best ways to isolate external dependencies.

  • Coupling: use global variables only when truly necessary. Global variables (i.e., names in the enclosing module) are usually a poor way for functions to communicate. They can create dependencies and timing issues that make programs difficult to debug and change.

  • Coupling: don’t change mutable arguments unless the caller expects it. Functions can also change parts of mutable objects passed in. But as with global variables, this implies lots of coupling between the caller and callee, which can make a function too specific and brittle.

  • Cohesion: each function should have a single, unified purpose. When designed well, each of your functions should do one thing—something you can summarize in a simple declarative sentence. If that sentence is very broad (e.g., “this function implements my whole program”), or contains lots of conjunctions (e.g., “this function gives employee raises and submits a pizza order”), you might want to think about splitting it into separate and simpler functions. Otherwise, there is no way to reuse the code behind the steps mixed together in such a function.

  • Size: each function should be relatively small. This naturally follows from the cohesion goal, but if your functions start spanning multiple pages on your display, it’s probably time to split. Especially given that Python code is so concise to begin with, a function that grows long or deeply nested is often a symptom of design problems. Keep it simple, and keep it short.

Figure 14-1 summarizes the ways functions can talk to the outside world; inputs may come from items on the left side, and results may be sent out in any of the forms on the right. Many function designers recommend using only arguments for inputs and return statements for outputs.

Figure 14-1. Function execution environment

There are plenty of exceptions, including Python’s OOP support—as you’ll see in Part VI, Python classes depend on changing a passed-in mutable object. Class functions set attributes of an automatically passed-in argument called self, to change per-object state information (e.g., self.name='bob'). Moreover, if classes are not used, global variables are often the best way for functions in modules to retain state between calls. Such side effects aren’t dangerous if they’re expected.

Functions Are Objects: Indirect Calls

Because Python functions are objects at runtime, you can write programs that process them generically. Function objects can be assigned, passed to other functions, stored in data structures, and so on, as if they were simple numbers or strings. We’ve seen some of these uses in earlier examples. Function objects happen to export a special operation—they can be called by listing arguments in parentheses after a function expression. But functions belong to the same general category as other objects.

For instance, there’s really nothing special about the name used in a def statement: it’s just a variable assigned in the current scope, as if it had appeared on the left of an = sign. After a def runs, the function name is a reference to an object; you can reassign that object to other names and call it through any reference—not just the original name:

>>> def echo(message):           # Echo assigned to a function object.
...     print message
...
>>> x = echo                     # Now x references it too.
>>> x('Hello world!')            # Call the object by adding (  ).
Hello world!

Since arguments are passed by assigning objects, it’s just as easy to pass functions to other functions as arguments; the callee may then call the passed-in function just by adding arguments in parentheses:

>>> def indirect(func, arg):
...     func(arg)                         # Call the object by adding (  ).
...
>>> indirect(echo, 'Hello jello!')        # Pass function to a function.
Hello jello!

You can even stuff function objects into data structures, as though they were integers or strings. Since Python compound types can contain any sort of object, there’s no special case here either:

>>> schedule = [ (echo, 'Spam!'), (echo, 'Ham!') ]
>>> for (func, arg) in schedule:
...     func(arg)
...
Spam!
Ham!

This code simply steps through the schedule list, calling the echo function with one argument each time through. Python’s lack of type declarations makes for an incredibly flexible programming language. Notice the tuple unpacking assignment in the for loop header, introduced in Chapter 8.
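A common idiom builds on this: a dispatch table, i.e., a dictionary that maps names to function objects, so that the function to run can be picked by a string key at runtime. A small sketch with two hypothetical handlers:

```python
def spam(n):
    return 'spam' * n        # Repeat a string n times

def eggs(n):
    return n + 1             # Add one

dispatch = {'spam': spam, 'eggs': eggs}   # Names mapped to function objects

result = dispatch['spam'](2)  # Look the function up, then call it
```

Here, dispatch['spam'] fetches the function object, and the trailing parentheses call it, just as in the schedule loop above.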

Function Gotchas

Here are some of the more jagged edges of functions you might not expect. They’re all obscure, and a few have started to fall away from the language completely in recent releases, but most have been known to trip up a new user.

Local Names Are Detected Statically

Python classifies names assigned in a function as locals by default; they live in the function’s scope and exist only while the function is running. What we didn’t tell you is that Python detects locals statically, when it compiles the def’s code, rather than by noticing assignments as they happen at runtime. This leads to one of the most common oddities posted on the Python newsgroup by beginners.

Normally, a name that isn’t assigned in a function is looked up in the enclosing module:

>>> X = 99
>>> def selector(  ):        # X used but not assigned
...     print X              # X found in global scope
...
>>> selector(  )
99

Here, the X in the function resolves to the X in the module outside. But watch what happens if you add an assignment to X after the reference:

>>> def selector(  ):
...     print X              # Does not yet exist!
...     X = 88               # X classified as a local name (everywhere)
...                          # Can also happen if "import X", "def X",...
>>> selector(  )
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in selector
UnboundLocalError: local variable 'X' referenced before assignment

You get an undefined name error, but the reason is subtle. Python reads and compiles this code when it’s typed interactively or imported from a module. While compiling, Python sees the assignment to X and decides that X will be a local name everywhere in the function. But later, when the function is actually run, the assignment hasn’t yet happened when the print executes, so Python says you’re using an undefined name. According to its name rules, it should; local X is used before being assigned. In fact, any assignment in a function body makes a name local. Imports, =, nested defs, nested classes, and so on, are all susceptible to this behavior.

The problem occurs because assigned names are treated as locals everywhere in a function, not just after statements where they are assigned. Really, the previous example is ambiguous at best: did you mean to print the global X and then create a local X, or is this a genuine programming error? Since Python treats X as a local everywhere, it is an error; but if you really mean to print global X, you need to declare it in a global statement:

>>> def selector(  ):
...     global X           # Force X to be global (everywhere).
...     print X
...     X = 88
...
>>> selector(  )
99

Remember, though, that this means the assignment also changes the global X, not a local X. Within a function, you can’t use both local and global versions of the same simple name. If you really meant to print the global and then set a local of the same name, import the enclosing module and qualify to get to the global version:

>>> X = 99
>>> def selector(  ):
...     import __main__     # Import enclosing module.
...     print __main__.X    # Qualify to get to global version of name.
...     X = 88                  # Unqualified X classified as local.
...     print X                 # Prints local version of name.
...
>>> selector(  )
99
88

Qualification (the .X part) fetches a value from a namespace object. The interactive namespace is a module called __main__, so __main__.X reaches the global version of X. If that isn’t clear, check out Part V.[5]

Defaults and Mutable Objects

Default argument values are evaluated and saved when the def statement is run, not when the resulting function is called. Internally, Python saves one object per default argument, attached to the function itself.

That’s usually what you want; because defaults are evaluated at def time, it lets you save values from the enclosing scope if needed. But since defaults retain an object between calls, you have to be careful about changing mutable defaults. For instance, the following function uses an empty list as a default value and then changes it in place each time the function is called:

>>> def saver(x=[  ]):            # Saves away a list object
...     x.append(1)             # Changes same object each time!
...     print x
...
>>> saver([2])                  # Default not used
[2, 1]
>>> saver(  )                   # Default used
[1]
>>> saver(  )                   # Grows on each call!
[1, 1]
>>> saver(  )
[1, 1, 1]

Some see this behavior as a feature—because mutable default arguments retain their state between function calls, they can serve some of the same roles as static local function variables in the C language. In a sense, they work something like global variables, but their names are local to the function, and so will not clash with names elsewhere in a program.

To most observers, though, this seems like a gotcha, especially the first time they run into this. There are better ways to retain state between calls in Python (e.g., using classes, which will be discussed in Part VI).

Moreover, mutable defaults can be tricky to remember and to reason about: they depend on the timing of default object construction. In the example, there is just one list object for the default value—the one created when the def was executed. You don’t get a new list every time the function is called, so the list grows with each new append; it is not reset to empty on each call.
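You can see the single shared object directly by returning the list instead of printing it (the return is added to this sketch only so the results can be compared):

```python
def saver(x=[]):
    x.append(1)
    return x          # Return the list so callers can inspect it

first = saver()       # Uses the one saved default list
second = saver()      # Same list object again, now grown
```

Both calls hand back the very same list object (first is second is true), and it holds [1, 1] after the second call.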

If that’s not the behavior you wish, simply make copies of the default at the start of the function body, or move the default value expression into the function body; as long as the value resides in code that’s actually executed each time the function runs, you’ll get a new object each time through:

>>> def saver(x=None):
...     if x is None:         # No argument passed?
...         x = [  ]            # Run code to make a new list.
...     x.append(1)           # Changes new list object
...     print x
...
>>> saver([2])
[2, 1]
>>> saver(  )                   # Doesn't grow here
[1]
>>> saver(  )
[1]

By the way, the if statement in the example could almost be replaced by the assignment x = x or [ ], which takes advantage of the fact that Python’s or returns one of its operand objects: if no argument was passed, x defaults to None, so the or returns the new empty list on the right.

However, this isn’t exactly the same. When an empty list is passed in, the or expression causes the function to append to a newly created list, rather than to the passed-in list as in the previous version. (The test becomes [ ] or [ ], which evaluates to the new empty list on the right; see Section 9.3 in Chapter 9 if you don’t recall why.) Real program requirements may call for either behavior.
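The difference shows up only when a false value such as an empty list is passed in. A sketch with a hypothetical saver2 that returns its list, so the effect is visible:

```python
def saver2(x=None):
    x = x or []       # A new list if x is None *or* otherwise false
    x.append(1)
    return x

passed = []           # An empty list is a false value...
result = saver2(passed)
```

Here result holds [1], but passed is still empty: the or quietly replaced the caller's list with a new one, whereas the if x is None test would have appended to the caller's list.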

Functions Without Returns

In Python functions, return (and yield) statements are optional. When a function doesn’t return a value explicitly, the function exits when control falls off the end of the function body. Technically, all functions return a value; if you don’t provide a return, your function returns the None object automatically:

>>> def proc(x):
...     print x        # No return is a None return.
...
>>> x = proc('testing 123...')
testing 123...
>>> print x
None

Functions such as this without a return are Python’s equivalent of what are called “procedures” in some languages. They’re usually invoked as a statement, and the None result is ignored, since they do their business without computing a useful result.

This is worth knowing, because Python won’t tell you if you try to use the result of a function that doesn’t return one. For instance, assigning the result of a list append method won’t raise an error, but you’ll really get back None, not the modified list:

>>> list = [1, 2, 3]
>>> list = list.append(4)      # append is a "procedure."
>>> print list                 # append changes list in-place. 
None

As mentioned in Section 11.2 in Chapter 11, such functions do their business as a side effect, and are usually designed to be run as a statement, not an expression.

Part IV Exercises

We’re going to start coding more sophisticated programs in these exercises. Be sure to check solutions in Section B.4, and be sure to start writing your code in module files. You won’t want to retype these exercises from scratch if you make a mistake.

  1. The basics. At the Python interactive prompt, write a function that prints its single argument to the screen and call it interactively, passing a variety of object types: string, integer, list, dictionary. Then try calling it without passing any argument. What happens? What happens when you pass two arguments?

  2. Arguments. Write a function called adder in a Python module file. The function adder should accept two arguments and return the sum (or concatenation) of its two arguments. Then add code at the bottom of the file to call the function with a variety of object types (two strings, two lists, two floating points), and run this file as a script from the system command line. Do you have to print the call statement results to see results on your screen?

  3. varargs. Generalize the adder function you wrote in the last exercise to compute the sum of an arbitrary number of arguments, and change the calls to pass more or less than two. What type is the return value sum? (Hints: a slice such as S[:0] returns an empty sequence of the same type as S, and the type built-in function can test types; but see the min examples in Chapter 13 for a simpler approach.) What happens if you pass in arguments of different types? What about passing in dictionaries?

  4. Keywords. Change the adder function from Exercise 2 to accept and add three arguments: def adder(good, bad, ugly). Now, provide default values for each argument and experiment with calling the function interactively. Try passing one, two, three, and four arguments. Then, try passing keyword arguments. Does the call adder(ugly=1, good=2) work? Why? Finally, generalize the new adder to accept and add an arbitrary number of keyword arguments, much like Exercise 3, but you’ll need to iterate over a dictionary, not a tuple. (Hint: the dict.keys( ) method returns a list you can step through with a for or while.)

  5. Write a function called copyDict(dict) that copies its dictionary argument. It should return a new dictionary with all the items in its argument. Use the dictionary keys method to iterate (or, in Python 2.2, step over a dictionary’s keys without calling keys). Copying sequences is easy (X[:] makes a top-level copy); does this work for dictionaries too?

  6. Write a function called addDict(dict1, dict2) that computes the union of two dictionaries. It should return a new dictionary, with all the items in both its arguments (assumed to be dictionaries). If the same key appears in both arguments, feel free to pick a value from either. Test your function by writing it in a file and running the file as a script. What happens if you pass lists instead of dictionaries? How could you generalize your function to handle this case too? (Hint: see the type built-in function used earlier.) Does the order of arguments passed matter?

  7. More argument matching examples. First, define the following six functions (either interactively, or in a module file that can be imported):

    def f1(a, b): print a, b             # Normal args
    
    def f2(a, *b): print a, b            # Positional varargs
    
    def f3(a, **b): print a, b           # Keyword varargs
    
    def f4(a, *b, **c): print a, b, c    # Mixed modes
    
    def f5(a, b=2, c=3): print a, b, c   # Defaults
    
    def f6(a, b=2, *c): print a, b, c    # Defaults and positional varargs

    Now, test the following calls interactively and try to explain each result; in some cases, you’ll probably need to fall back on the matching algorithm shown in Chapter 13. Do you think mixing matching modes is a good idea in general? Can you think of cases where it would be useful?

    >>> f1(1, 2)                  
    >>> f1(b=2, a=1)              
    
    >>> f2(1, 2, 3)               
    >>> f3(1, x=2, y=3)           
    >>> f4(1, 2, 3, x=2, y=3)     
    
    >>> f5(1)                    
    >>> f5(1, 4)                 
    
    >>> f6(1)                    
    >>> f6(1, 3, 4)
  8. Primes revisited. Recall the code snippet we saw in Chapter 10, which simplistically determines if a positive integer is prime:

    x = y / 2                          # For some y > 1
    while x > 1:
        if y % x == 0:                 # Remainder
            print y, 'has factor', x
            break                      # Skip else
        x = x-1
    else:                              # Normal exit
        print y, 'is prime'

    Package this code as a reusable function in a module file, and add some calls to your function at the bottom of your file. While you’re at it, replace the first line’s / operator with //, to make it handle floating point numbers too, and be immune to the “true” division change planned for the / operator in Python 3.0 as described in Chapter 4. What can you do about negatives and 0 and 1? How about speeding this up? Your outputs should look something like this:

    13 is prime
    13.0 is prime
    15 has factor 5
    15.0 has factor 5.0
  9. List comprehensions. Write code to build a new list containing the square roots of all the numbers in this list: [2, 4, 9, 16, 25]. Code this as a for loop first, then as a map call, and finally as a list comprehension. Use the sqrt function in the built-in math module to do the calculation (i.e., import math, and say math.sqrt(x)). Of the three, which approach do you like best?



[1] The name “lambda” seems to scare people more than it should. It comes from Lisp, which got it from the lambda calculus, which is a form of symbolic logic. In Python, though, it’s really just a keyword that introduces the expression syntactically.

[2] As we write this, a debate rages on comp.lang.python about adding a more direct ternary conditional expression to Python; see future release notes for new developments on this front. Note that you can almost achieve the same effect as the and/or with an expression ((falsevalue,truevalue)[condition]), except that this does not short-circuit (both possible results are evaluated every time), and the condition must be 0 or 1.

[3] Be careful not to confuse apply with map, the topic of the next section. apply runs a single function call, passing arguments to the function object just once. map calls a function many times instead, for each item in a sequence.

[4] Generators are available in Python releases after version 2.2; in 2.2, they must be enabled with a special import statement of the form: from __future__ import generators. (See Chapter 18 for more on this statement form.) Iterators were already available in 2.2, largely because the underlying protocol did not require the new, non-backward-compatible keyword, yield.

[5] Python has improved on this story somewhat, by issuing the more specific “unbound local” error message for this case shown in the example listing (it used to simply raise a generic name error); this gotcha is still present in general, though.
