In Part III, we looked at basic procedural statements in Python. Here, we’ll move on to explore a set of additional statements that create functions of our own. In simple terms, a function is a device that groups a set of statements, so they can be run more than once in a program. Functions also let us specify parameters that serve as function inputs, and may differ each time a function’s code is run. Table 12-1 summarizes the primary function-related tools we’ll study in this part of the book.
Before going into the details, let’s get a clear picture of what functions are about. Functions are a nearly universal program-structuring device. Most of you have probably come across them before in other languages, where they may have been called subroutines or procedures. But as a brief introduction, functions serve two primary development roles:
As in most programming languages, Python functions are the simplest way to package logic you may wish to use in more than one place and more than one time. Up until now, all the code we’ve been writing runs immediately; functions allow us to group and generalize code to be used arbitrarily many times later.
Functions also provide a tool for splitting systems into pieces that have well-defined roles. For instance, to make a pizza from scratch, you would start by mixing the dough, rolling it out, adding toppings, baking, and so on. If you were programming a pizza-making robot, functions would help you divide the overall “make pizza” task into chunks, with one function for each subtask in the process. It’s easier to implement the smaller tasks in isolation than it is to implement the entire process at once. In general, functions are about procedure: how to do something, rather than what you’re doing it to. We’ll see why this distinction matters in Part VI.
In this part of the book, we explore the tools used to code functions in Python: function basics, scope rules, and argument passing, along with a few related concepts. As we’ll see, functions don’t imply much new syntax, but they do lead us to some bigger programming ideas.
Although it wasn’t made very formal, we’ve already been using functions in earlier chapters. For instance, to make a file object, we call the built-in open function. Similarly, we use the len built-in function to ask for the number of items in a collection object.
In this chapter, we will learn how to write new functions in Python. Functions we write behave the same way as the built-ins already seen: they are called in expressions, are passed values, and return results. But writing new functions requires a few additional ideas that haven’t yet been applied. Moreover, functions behave very differently in Python than they do in compiled languages like C. Here is a brief introduction to the main concepts behind Python functions, which we will study in this chapter:
def is executable code. Python functions are written with a new statement, the def. Unlike functions in compiled languages such as C, def is an executable statement: your function does not exist until Python reaches and runs the def. In fact, it’s legal (and even occasionally useful) to nest def statements inside if statements, loops, and even other defs. In typical operation, def statements are coded in module files, and are naturally run to generate functions when the module file is first imported.
def creates an object and assigns it to a name. When Python reaches and runs a def statement, it generates a new function object and assigns it to the function’s name. As with all assignments, the function name becomes a reference to the function object. There’s nothing magic about the name of a function: as we’ll see, the function object can be assigned to other names, stored in a list, and so on. Functions may also be created with the lambda expression, a more advanced concept deferred until later in this chapter.
return sends a result object back to the caller. When a function is called, the caller stops until the function finishes its work and returns control to the caller. Functions that compute a value send it back to the caller with a return statement; the returned value becomes the result of the function call. Functions known as generators may also use the yield statement to send a value back and suspend their state, such that they may be resumed later; this is also an advanced topic covered later in this chapter.
Arguments are passed by assignment (object reference). In Python, arguments are passed to functions by assignment (which, as we’ve learned, means object reference). As we’ll see, this isn’t quite like C’s passing rules or C++’s reference parameters—the caller and function share objects by references, but there is no name aliasing. Changing an argument name does not also change a name in the caller, but changing passed-in mutable objects can change objects shared by the caller.
global declares module-level variables that are to be assigned. By default, all names assigned in a function are local to that function and exist only while the function runs. To assign a name in the enclosing module, functions need to list it in a global statement. More generally, names are always looked up in scopes (places where variables are stored), and assignments bind names to scopes.
Arguments, return values, and variables are not declared. As with everything in Python, there are no type constraints on functions. In fact, nothing about a function needs to be declared ahead of time: we can pass in arguments of any type, return any kind of object, and so on. As one consequence, a single function can often be applied to a variety of object types.
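As a rough illustration of the argument-passing rule described above, the following sketch contrasts rebinding an argument name with changing a shared mutable object in place. The function and variable names (rebind, mutate, items) are hypothetical and chosen only for this demonstration.

```python
def rebind(arg):
    arg = [99]                   # Rebinds the local name only; caller unaffected

def mutate(arg):
    arg.append(99)               # Changes the shared list object in place

items = [1, 2]

rebind(items)
after_rebind = list(items)       # Snapshot: the caller's list is unchanged

mutate(items)
after_mutate = list(items)       # Snapshot: the caller now sees the appended item

print(after_rebind)
print(after_mutate)
```

Assigning to arg inside rebind changes only what the local name references, while append inside mutate changes the one list object that both the caller and the function share.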
If some of the preceding words didn’t sink in, don’t worry—we’ll explore all these concepts with real code in this chapter. Let’s get started by expanding on these ideas, and looking at a few first examples along the way.
The def statement creates a function object and assigns it to a name. Its general format is as follows:
def <name>(arg1, arg2,... argN):
    <statements>
As with all compound Python statements, def consists of a header line, followed by a block of statements, usually indented (or a simple statement after the colon). The statement block becomes the function’s body: the code Python executes each time the function is called. The header line specifies a function name that is assigned the function object, along with a list of zero or more arguments (sometimes called parameters) in parentheses. The argument names in the header will be assigned to the objects passed in parentheses at the point of call.
Function bodies often contain a return statement:
def <name>(arg1, arg2,... argN):
    ...
    return <value>
The Python return statement can show up anywhere in a function body; it ends the function call and sends a result back to the caller. It consists of the word return followed by an object expression that gives the function’s result. The return is optional; if it’s not present, a function exits when control flow falls off the end of the function body. Technically, a function without a return returns the None object automatically, but this result is usually ignored.
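The return behavior described above can be sketched as follows. The helper names show and first_negative are hypothetical and exist only for this illustration: one function has no return at all, and the other returns from the middle of its body.

```python
def times(x, y):
    return x * y                 # Explicit result sent back to the caller

def show(x, y):
    print(x * y)                 # Prints, but has no return statement

def first_negative(numbers):
    for n in numbers:
        if n < 0:
            return n             # return may appear anywhere; the call ends here
    # Falling off the end of the body returns None automatically

product = times(2, 4)
nothing = show(2, 4)             # The call's result is the None object
found = first_negative([3, -1, -5])
missing = first_negative([3, 4])

print(product, nothing, found, missing)
```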
The Python def is a true executable statement: when it runs, it creates and assigns a new function object to a name. Because it’s a statement, it can appear anywhere a statement can, even nested in other statements. For instance, it’s completely legal to nest a function def inside an if statement, to select between alternative definitions:
if test:
    def func():            # Define func this way.
        ...
else:
    def func():            # Or else this way instead.
        ...
...
func()                     # Call the version selected and built.
One way to understand this code is to realize that the def is much like an = statement: it simply assigns a name at runtime. Unlike functions in compiled languages such as C, Python functions do not need to be fully defined before the program runs. More generally, defs are not evaluated until they are reached and run, and the code inside defs is not evaluated until the functions are later called.
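To make the timing concrete, here is a small sketch, assuming the code runs as a top-level script. The names greet and helper are hypothetical. The body of greet mentions helper before any def for helper has run, and that only matters at the moment greet is actually called.

```python
def greet():
    return helper()              # helper is looked up only when greet is called

failed_early = False
try:
    greet()                      # helper's def has not run yet
except NameError:
    failed_early = True          # The name did not exist at call time

def helper():                    # Running this def creates and assigns helper
    return 'hello'

result = greet()                 # Now the lookup inside greet succeeds
print(failed_early, result)
```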
Because function definition happens at runtime, there’s nothing special about the function name, only the object it refers to:
othername = func        # Assign function object.
othername()             # Call func again.
Here, the function was assigned to a different name, and called through the new name. Like everything else in Python, functions are just objects; they are recorded explicitly in memory at program execution time.
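A slightly fuller sketch of the same idea follows. The plus function and the operations list are hypothetical additions for this example; they show that a function object can be referenced by a second name and stored in a list like any other object.

```python
def times(x, y):
    return x * y

def plus(x, y):                  # Hypothetical second function for the demo
    return x + y

multiply = times                 # A second name for the same function object
same_object = multiply is times

operations = [times, plus]       # Function objects stored in a list
results = [op(3, 4) for op in operations]

print(same_object, results)
```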
Apart from such runtime concepts (which tend to seem most novel to programmers with backgrounds in traditional compiled languages), Python functions are straightforward to use. Let’s code a first real example to demonstrate the basics. Really, there are two sides to the function picture: a definition (the def that creates a function) and a call (an expression that tells Python to run the function’s body).
Here’s a definition typed interactively that defines a function called times, which returns the product of its two arguments:
>>> def times(x, y):      # Create and assign function.
...     return x * y      # Body executed when called.
...
When Python reaches and runs this def, it creates a new function object that packages the function’s code, and assigns the object to the name times. Typically, this statement is coded in a module file, and it would run when the enclosing file is imported; for something this small, though, the interactive prompt suffices.
After the def has run, the program can call (run) the function by adding parentheses after the function’s name; the parentheses may optionally contain one or more object arguments, to be passed (assigned) to the names in the function’s header:
>>> times(2, 4) # Arguments in parentheses
8
This expression passes two arguments to times: the name x in the function header is assigned the value 2, y is assigned 4, and the function’s body is run. In this case, the body is just a return statement, which sends back the result as the value of the call expression. The returned object is printed here interactively (as in most languages, 2*4 is 8 in Python); it could also be assigned to a variable if we need to use it later:
>>> x = times(3.14, 4)    # Save the result object.
>>> x
12.56
Now, watch what happens when the function is called a third time, with very different kinds of objects passed in:
>>> times('Ni', 4) # Functions are "typeless."
'NiNiNiNi'
In this third call, a string and an integer are passed to x and y, instead of two numbers. Recall that * works on both numbers and sequences; because you never declare the types of variables, arguments, or return values, you can use times to multiply numbers or repeat sequences.
In fact, the very meaning of the expression x * y in the simple times function depends completely upon the kinds of objects that x and y are: it means multiplication in the first two calls and repetition in the third. Python leaves it up to the objects to do something reasonable for this syntax.
This sort of type-dependent behavior is known as polymorphism, which means that the meaning of an operation depends on the objects being operated upon. Because Python is a dynamically typed language, polymorphism runs rampant: every operation is a polymorphic operation in Python.
This is deliberate, and accounts for much of the language’s flexibility. A single function, for instance, can generally be applied to a whole category of object types. As long as those objects support the expected interface (a.k.a. protocol), they can be processed by the function. That is, if the objects passed in to a function have the expected methods and operators, they are plug-and-play compatible with the function’s logic.
Even in our simple times function, this means that any two objects that support a * will work, no matter what they may be, and no matter when they may be coded. Moreover, if the objects passed in do not support this expected interface, Python will detect the error when the * expression is run, and raise an exception automatically. It’s pointless to code error checking ourselves here.
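The following sketch exercises times with several kinds of objects, including a pair that does not support *. The list call and the failing string pair are additions made for illustration; the exact wording of the exception message varies by Python version, so it is printed rather than assumed.

```python
def times(x, y):
    return x * y

print(times(2, 4))               # Two numbers: multiplication
print(times('Ni', 4))            # String and integer: repetition
print(times([1, 2], 2))          # List and integer: repetition

error = None
try:
    times('Ni', 'Ni')            # Two strings do not support *
except TypeError as exc:
    error = exc                  # Python detected the mismatch for us
    print('TypeError:', exc)
```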
This turns out to be a crucial philosophical difference between Python and statically typed languages like C++ and Java: in Python, your code is not supposed to care about specific data types. If it does, it will be limited to working on just the types you anticipated when you wrote your code. It will not support other compatible object types coded in the future. Although it is possible to test for types with tools like the type built-in function, doing so breaks your code’s flexibility. By and large, we code to object interfaces in Python, not data types.
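One way to see the cost of type testing is to compare times with a variant that checks types explicitly. The times_ints_only function below is a hypothetical counterexample written for this sketch, and Fraction from the standard library's fractions module stands in for a compatible type the author of the function never anticipated.

```python
from fractions import Fraction   # Standard library rational number type

def times(x, y):
    return x * y                 # Relies only on the * interface

def times_ints_only(x, y):       # Hypothetical type-testing variant
    if type(x) is not int or type(y) is not int:
        raise TypeError('times_ints_only accepts ints only')
    return x * y

third = times(Fraction(1, 3), 3) # Works: Fraction supports *

try:
    times_ints_only(Fraction(1, 3), 3)
except TypeError:
    rejected = True              # The type test turns away a compatible object
else:
    rejected = False

print(third, rejected)
```

The interface-based version keeps working with any object that supports *, while the type-testing version has to be edited each time a new compatible type comes along.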
Let’s look at a second function example that does something a bit more useful than multiplying arguments, and further illustrates function basics.
In Chapter 10, we saw a for loop that collected items in common in two strings. We noted there that the code wasn’t as useful as it could be because it was set up to work only on specific variables and could not be rerun later. Of course, you could cut and paste the code to each place it needs to be run, but this solution is neither good nor general: you’d still have to edit each copy to support different sequence names, and changing the algorithm then requires changing multiple copies.
By now, you can probably guess that the solution to this dilemma is to package the for loop inside a function. Functions offer a number of advantages over simple top-level code:
Putting the code in a function makes it a tool that can be run as many times as you like.
By allowing callers to pass in arbitrary arguments, you make it general enough to work on any two sequences you wish to intersect.
By packaging the logic in a function, you only have to change code in one place if you ever need to change the way intersection works.
Coding the function in a module file means it can be imported and reused by any program run on your machine.
In effect, wrapping the code in a function makes it a general intersection utility:
def intersect(seq1, seq2):
    res = []                     # Start empty.
    for x in seq1:               # Scan seq1.
        if x in seq2:            # Common item?
            res.append(x)        # Add to end.
    return res
The transformation from the simple code of Chapter 10 to this function is straightforward; we’ve just nested the original logic under a def header and made the objects on which it operates passed-in parameter names. Since this function computes a result, we’ve also added a return statement to send a result object back to the caller.
Before you can call the function, you have to make the function. Run its def statement by typing it interactively, or by coding it in a module file and importing the file. Once you’ve run the def one way or another, you call the function by passing any two sequence objects in parentheses:
>>> s1 = "SPAM"
>>> s2 = "SCAM"
>>> intersect(s1, s2)     # Strings
['S', 'A', 'M']
Here, the code passes in two strings, and gets back a list containing the characters in common. The algorithm the function uses is simple: “for every item in the first argument, if that item is also in the second argument, append the item to the result.” It’s a little shorter to say that in Python than in English, but it works out the same.
Like all functions in Python, intersect is polymorphic: it works on arbitrary types, as long as they support the expected object interface:
>>> x = intersect([1, 2, 3], (1, 4))     # Mixed types
>>> x                                    # Saved result object
[1]
This time, we pass in different types of objects to our function, a list and a tuple (mixed types), and it still picks out the common items. Since you don’t have to specify the types of arguments ahead of time, the intersect function happily iterates through any kind of sequence objects you send it, as long as they support the expected interfaces.
For intersect, this means that the first argument has to support the for loop, and the second has to support the in membership test; any two such objects will work. If you pass in objects that do not support these interfaces (e.g., passing in numbers), Python will automatically detect the mismatch and raise an exception for you, which is exactly what you want, and the best you could do on your own if you coded explicit type tests. The intersect function will even work on class-based objects you code, which you will learn how to build in Part VI.[1]
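As a sketch of how far these two interface requirements stretch, the calls below pass objects that are not strings, lists, or tuples. The dictionary, set, and range arguments are illustrative choices, not examples from the text; they work only because the first argument can be iterated and the second supports the in test.

```python
def intersect(seq1, seq2):
    res = []                     # Start empty.
    for x in seq1:               # First argument must support iteration.
        if x in seq2:            # Second argument must support membership.
            res.append(x)
    return res

common_keys = intersect({'a': 1, 'b': 2}, {'b', 'c'})   # Dict keys against a set
from_range = intersect(range(10), (2, 4, 20))           # Range against a tuple

error = None
try:
    intersect(1, 2)              # Numbers support neither interface
except TypeError as exc:
    error = exc
    print('TypeError:', exc)

print(common_keys, from_range)
```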
The variable res inside intersect is what in Python is called a local variable: a name that is only visible to code inside the function def, and only exists while the function runs. In fact, because all names assigned in any way inside a function are classified as local variables by default, nearly all the names in intersect are local variables:
Because res is obviously assigned, it is a local variable.
Because arguments are passed by assignment, so too are seq1 and seq2.
Because the for loop assigns items to a variable, so is the name x.
All these local variables appear when the function is called, and disappear when the function exits; the return statement at the end of intersect sends back the result object, but the name res goes away. To fully understand the notion of locals, though, we need to move on to Chapter 13.
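A quick way to observe this, assuming the code runs as a fresh top-level script in CPython, is to look for res after the call returns. The sketch also inspects the function's code object through its __code__.co_varnames attribute, which lists the names the compiler treats as local; that attribute is an implementation-level detail used here only for illustration.

```python
def intersect(seq1, seq2):
    res = []
    for x in seq1:
        if x in seq2:
            res.append(x)
    return res

common = intersect("SPAM", "SCAM")   # The result object survives the call

try:
    res                          # The local name does not exist out here
except NameError:
    res_visible = False
else:
    res_visible = True

local_names = set(intersect.__code__.co_varnames)
print(common, res_visible, sorted(local_names))
```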