Now that we’ve looked at the larger ideas behind modules, let’s turn to a simple example of modules in action. Python modules are easy to create; they’re just files of Python program code, created with your text editor. You don’t need to write special syntax to tell Python you’re making a module; almost any text file will do. Because Python handles all the details of finding and loading modules, modules are also easy to use; clients simply import a module, or specific names a module defines, and use the objects they reference.
To define a module, use
your text editor to type Python code into a text file. Names assigned
at the top level of the module become its attributes (names
associated with the module object), and are exported for clients to
use. For instance, if we type the def
below into a
file called module1.py and import it, we create
a module object with one attribute—the name
printer
, which happens to be a reference to a
function object:
def printer(x):          # Module attribute
    print x
A word on module filenames: you can call modules just about anything you like, but module filenames should end in a .py suffix if you plan to import them. The .py is technically optional for top-level files that will be run but not imported; adding it in all cases, though, makes the file’s type more obvious.
Since module names become variables inside a Python program without
the .py, they should also follow the normal
variable name rules we learned in Chapter 8. For
instance, you can create a module file named
if.py, but cannot import it, because
if
is a reserved word—when you try to run
import if
, you’ll get a syntax
error. In fact, both the names of module files and directories used
in package imports must conform to the rules for variable names
presented in Chapter 8. This becomes a larger
concern for package directories; their names cannot contain
platform-specific syntax such as spaces.
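You can see the reserved-word restriction without creating a file at all; this short sketch (written in Python 3 syntax, unlike the book’s Python 2 listings) asks the compiler directly:

```python
# "if" is a reserved word, so "import if" is rejected as a syntax
# error before Python even looks for a file named if.py:
try:
    compile("import if", "<string>", "exec")
except SyntaxError:
    print("import if -> syntax error")
```

A file named if.py can still sit on disk; it’s only the import statement’s syntax that rules the name out.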
When modules are imported, Python maps the internal module name to an
external filename, by adding directory paths in the module search
path to the front, and a .py or other extension
at the end. For instance, a module name M
ultimately maps to some external file
<directory>M.<extension>
that
contains our module’s code.
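The mapping can be sketched in a few lines of (Python 3) code; find_module_file here is an illustrative helper, not part of Python’s import machinery, and real imports also consider packages, compiled files, and C extensions:

```python
import os
import sys
import tempfile

def find_module_file(name):
    # Rough sketch of the name-to-file mapping: scan each directory
    # on the module search path for <directory>/<name>.py.
    for directory in sys.path:
        candidate = os.path.join(directory or ".", name + ".py")
        if os.path.isfile(candidate):
            return candidate
    return None

# Demonstration: drop a module file in a temp directory, put that
# directory on the search path, and watch the name map to the file.
tmpdir = tempfile.mkdtemp()
open(os.path.join(tmpdir, "module1.py"), "w").close()
sys.path.insert(0, tmpdir)
print(find_module_file("module1"))   # <tmpdir>/module1.py
```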
Clients can use the module file we just wrote by running
import
or
from
statements.
Both find, compile, and run a module file’s code if
it hasn’t yet been loaded. The chief difference is
that import
fetches the module as a whole, so you
must qualify to fetch its names, whereas from
fetches (or copies) specific names out of the module.
Let’s see what this means in terms of code. All of
the following examples wind up calling the printer
function defined in the external module file
module1.py, but in different ways.
In the first example, the name module1
serves two
different purposes. It identifies an external file to be loaded and
becomes a variable in the script, which references the module object
after the file is loaded:
>>> import module1                   # Get module as a whole.
>>> module1.printer('Hello world!')  # Qualify to get names.
Hello world!
Because import
gives a name that refers to the
whole module object, we must go through the module name to fetch its
attributes (e.g., module1.printer
).
By contrast, because from
also copies names from
one file over to another scope, we instead use the copied names
directly without going through the module (e.g.,
printer
):
>>> from module1 import printer      # Copy out one variable.
>>> printer('Hello world!')          # No need to qualify name.
Hello world!
Finally, the next example uses a special form of
from
: when we use a *
, we get
copies of all the names assigned at the top
level of the referenced module. Here again, we use the copied name,
and don’t go through the module name:
>>> from module1 import *            # Copy out all variables.
>>> printer('Hello world!')
Hello world!
Technically, both import
and
from
statements invoke the same import operation;
from
simply adds an extra copy-out step.
And that’s it; modules really are simple to use. But to give you a better understanding of what really happens when you define and use modules, let’s move on to look at some of their properties in more detail.
One of the most common questions beginners seem to ask when using modules is: why won’t my imports keep working? The first import works fine, but later imports during an interactive session (or program run) seem to have no effect. They’re not supposed to, and here’s why.
Modules are loaded and run only on the first import
or
from
. This is deliberate: since importing is an expensive
operation, Python does it just once per process by default. Later
import operations simply fetch the already loaded module object.
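The already-loaded-modules table the text refers to is exposed as the sys.modules dictionary, which a quick (Python 3) session can confirm:

```python
import sys
import math                          # First import loads and runs math.

assert "math" in sys.modules         # It's recorded in the modules table.
again = __import__("math")           # A later import operation...
assert again is sys.modules["math"]  # ...fetches the very same object.
```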
As one consequence, because top-level code in a module file is usually executed only once, you can use it to initialize variables. Consider the file simple.py, for example:
print 'hello'
spam = 1            # Initialize variable.
In this example, the print
and
=
statements run the first time the module is
imported, and variable spam
is initialized at
import time:
% python
>>> import simple        # First import: loads and runs file's code
hello
>>> simple.spam          # Assignment makes an attribute.
1
However, second and later imports don’t rerun the
module’s code, but just fetch the already created
module object in Python’s internal modules
table—variable spam
is not reinitialized:
>>> simple.spam = 2      # Change attribute in module.
>>> import simple        # Just fetches already-loaded module.
>>> simple.spam          # Code wasn't rerun: attribute unchanged.
2
Of course, sometimes you really want a module’s code
to be rerun; we’ll see how to do it with the
reload
built-in function later in this chapter.
Just like def
, import
and
from
are
executable
statements, not compile-time declarations. They may be nested in
if
tests, appear in function
def
s, and so on, and are not resolved or run until
Python reaches them while your program executes. Imported modules and
names are never available until import statements run. Also like
def
, import
and
from
are implicit assignments:
import
assigns an entire module object to a single
name.
from
assigns one or more names to objects of the
same name in another module.
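Because import is an ordinary executable statement, it may also appear inside a function body, where it doesn’t run until the function is called, and the module name becomes a local variable in that function’s scope (a Python 3 sketch, using an arbitrary standard library module):

```python
def dump_record():
    # This import statement executes each time the function is
    # called; "json" is assigned as a local name in this scope.
    import json
    return json.dumps({"x": 1})

print(dump_record())   # prints: {"x": 1}
```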
All the things we’ve already said about assignment
apply to module access, too. For instance, names copied with a
from
become references to shared objects; like
function arguments, reassigning a fetched name has no effect on the
module it was copied from, but changing a fetched mutable
object can change it in the module it was imported from.
File small.py illustrates:
x = 1
y = [1, 2]

% python
>>> from small import x, y    # Copy two names out.
>>> x = 42                    # Changes local x only
>>> y[0] = 42                 # Changes shared mutable in-place
Here, we change a shared mutable object we got with the
from
assignment: name y
in the
importer and importee reference the same list object, so changing it
from one place changes it in the other:
>>> import small         # Get module name (from doesn't).
>>> small.x              # Small's x is not my x.
1
>>> small.y              # But we share a changed mutable.
[42, 2]
In fact, for a graphical picture of what from
assignments do, flip back to Figure 13-2 (function
argument passing). Mentally replace
“caller” and
“function” with
“imported” and
“importer” to see what from
assignments do with references. It’s the exact same
effect, except that here we’re dealing with names in
modules, not functions. Assignment works the same everywhere in
Python.
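To see the same picture without creating files, we can simulate small.py with a module object built at runtime; types.ModuleType is the real type of module objects, and the two plain assignments stand in for the from statement’s copy-out step (Python 3 syntax):

```python
import types

# Simulate the small.py example: build a module object and give it
# the same top-level names the file would create.
small = types.ModuleType("small")
small.x = 1
small.y = [1, 2]

# "from small import x, y" is, in effect, two assignments:
x = small.x
y = small.y

x = 42        # Rebinds only the local name...
y[0] = 42     # ...but in-place mutation reaches the shared list.

assert small.x == 1          # The module's x is untouched
assert small.y == [42, 2]    # The shared list shows the change
```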
Note in the prior example how
the
assignment to x
in
the interactive session changes name x
in that
scope only, not the x
in the file—there is
no link from a name copied with from
back to the
file it came from. To really change a global name in another file,
you must use import
:
% python
>>> from small import x, y    # Copy two names out.
>>> x = 42                    # Changes my x only
>>> import small              # Get module name.
>>> small.x = 42              # Changes x in other module
Because changing variables in other modules like this is a common
source of confusion (and often a bad design choice), we’ll
revisit this technique later in this chapter. Note that the change to
y[0]
in the prior session is different: it changes an object, not a
name.
Incidentally, notice that we
also
have to execute an import
statement in the prior
example after the from
, in order to gain access to
the small
module name at all;
from
only copies names from one module to another,
and does not assign the module name itself. At least conceptually, a
from
statement like this one:
from module import name1, name2 # Copy these two names out (only).
is equivalent to this sequence:
import module            # Fetch the module object.
name1 = module.name1     # Copy names out by assignment.
name2 = module.name2
del module               # Get rid of the module name.
Like all assignments, the from
statement creates
new variables in the importer, which initially refer to objects of
the same name in the imported file. We only get the names copied out,
though, not the
module itself.
Modules are probably best understood as simply packages of names—places to define names you want to make visible to the rest of a system. In Python, each module is a namespace—a place where names are created. Names that live in a module are called its attributes. Technically, modules usually correspond to files, and Python creates a module object to contain all the names assigned in the file; but in simple terms, modules are just namespaces.
So how do files morph into namespaces? The short story is that every name that is assigned a value at the top level of a module file (i.e., not nested in a function or class body) becomes an attribute of that module.
For instance, given an assignment statement such as
X=1
at the top level of a module file
M.py, the name X
becomes an
attribute of M
, which we can refer to from outside
the module as M.X
. The name X
also becomes a global variable to other code inside
M.py, but we need to explain the notion of
module loading and scopes a bit more formally to understand why:
Module statements run on the first import. The first time a module is imported anywhere in a system, Python creates an empty module object and executes the statements in the module file one after another, from the top of the file to the bottom.
Top-level assignments create module attributes. During an import, statements at the top level of the file that assign
names (e.g., =
, def
) create
attributes of the module object; assigned names are stored in the
module’s namespace.
Module namespace: attribute
__dict__
, or
dir(M)
. Module namespaces created by imports are dictionaries; they may be
accessed through the built-in __dict__
attribute
associated with module objects and may be inspected with the
dir
function. The dir
function
is roughly equivalent to the sorted keys list of an
object’s __dict__
attribute,
but includes inherited names for classes, may not be complete, and is
prone to change from release to release.
Modules are a single scope (local is global). As we saw in Chapter 13, names at the top level of a module follow the same reference/assignment rules as names in a function, but the local and global scopes are the same (more accurately, it’s the LEGB rule, but without the L and E lookup layers). Unlike a function’s local namespace, though, which exists only while the function runs, a module file’s scope becomes the attribute namespace of a module object and lives on after the import.
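The relationship between a module’s __dict__ and dir can be checked directly; here an empty module object built with types.ModuleType stands in for an imported file (Python 3 syntax):

```python
import types

m = types.ModuleType("demo")   # An empty module object
m.spam = 1                     # Assigning an attribute...

assert "spam" in m.__dict__            # ...stores it in the namespace dict
assert "spam" in dir(m)                # dir reports it too
assert set(m.__dict__) <= set(dir(m))  # dir covers at least the dict keys
```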
Here’s a demonstration of these ideas. Suppose we create the following module file with a text editor and call it module2.py:
print 'starting to load...'

import sys
name = 42
def func( ): pass
class klass: pass

print 'done loading.'
The first time this module is imported (or run as a program), Python
executes its statements from top to bottom. Some statements create
names in the module’s namespace as a side effect,
but others may do actual work while the import is going on. For
instance, the two print
statements in this file
execute at import time:
>>> import module2
starting to load...
done loading.
But once the module is loaded, its scope becomes an attribute
namespace in the module object we get back from
import
—we access attributes in this
namespace by qualifying them with the name of the enclosing module:
>>> module2.sys
<module 'sys'>
>>> module2.name
42
>>> module2.func, module2.klass
(<function func at 765f20>, <class klass at 76df60>)
Here, sys
, name
,
func
, and klass
were all
assigned while the module’s statements were being
run, so they’re attributes after the import.
We’ll talk about classes in Part VI, but notice the
sys
attribute; import
statements really assign module objects to names,
and any type of assignment to a name at the top level of a file
generates a module attribute. Internally, module namespaces are
stored as dictionary objects. In fact, we can access the namespace
dictionary through the module’s __dict__
attribute; it’s just a normal
dictionary object, with the usual methods:
>>> module2.__dict__.keys( )
['__file__', 'name', '__name__', 'sys', '__doc__', '__builtins__', 'klass',
'func']
The names we assigned in the module file become dictionary keys
internally. Some of the names in the module’s
namespace are things Python adds for us; for instance, __file__
gives the name of the file the module was loaded
from, and __name__
gives its name as known to
importers (without the .py extension and
directory path).
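For instance, these properties hold for any file-based standard library module (a Python 3 sketch, using os as the example):

```python
import os
import sys

assert os.__name__ == "os"                    # Name importers use
assert os.__file__.endswith((".py", ".pyc"))  # File it was loaded from

# Built-in modules compiled into the interpreter have a __name__,
# but may have no __file__ attribute at all:
assert sys.__name__ == "sys"
```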
Now that you’re becoming more familiar with modules,
we should clarify the notion of
name
qualification. In Python, you can access
attributes in any object that has attributes, using the qualification
syntax object.attribute
.
Qualification is really an expression that returns the value assigned
to an attribute name associated with an object. For example, the
expression module2.sys
in the previous example
fetches the value assigned to sys
in
module2
. Similarly, if we have a built-in list
object L
, L.append
returns the
method associated with the list.
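The same rule in a Python 3 session: qualification is an expression that fetches the object an attribute name refers to, whether from a module or any other object:

```python
L = [1, 2]
append_method = L.append   # Qualification returns the bound method...
append_method(3)           # ...which we can call later, unqualified.
print(L)                   # prints: [1, 2, 3]

# getattr is the functional equivalent of the dot syntax:
assert getattr(L, "append") == L.append
```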
So what does attribute qualification do to the scope rules we studied in Chapter 13? Nothing, really: it’s an independent concept. When you use qualification to access names, you give Python an explicit object to fetch from. The LEGB rule applies only to bare, unqualified names. Here are the rules:
X
means search
for name X
in
the current scopes (LEGB rule).
X.Y
means find X
in the current
scopes, then search for attribute Y
in the object
X
(not in scopes).
X.Y.Z
means look up name Y
in
object X
, then look up Z
in
object X.Y
.
Qualification works on all objects with attributes: modules, classes, C types, etc.
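A small (Python 3) sketch of the second rule; types.SimpleNamespace stands in for any object with attributes, and note that the unrelated scope name Y plays no part in evaluating X.Y:

```python
import types

X = types.SimpleNamespace(Y=types.SimpleNamespace(Z=99))
Y = "unrelated"      # A scope-level Y does not affect X.Y...

print(X.Y.Z)         # prints: 99 -- Y and Z are found in objects, not scopes
```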
In Part VI, we’ll see that qualification means a bit more for classes (it’s also the place where something called inheritance happens), but in general, the rules here apply to all names in Python.
It is never possible to access names defined in another module file without first importing that file. That is, you never automatically get to see names in another file, regardless of the structure of imports or function calls in your program.
For example, consider the following two simple modules. The first,
moda.py, defines a variable X
global to code in its file only, along with a function that changes
the global X
in this file:
X = 88              # My X: global to this file only

def f( ):
    global X        # Change my X.
    X = 99          # Cannot see names in other modules
The second module, modb.py, defines its own
global variable X
, and imports and calls the
function in the first module:
X = 11              # My X: global to this file only

import moda         # Gain access to names in moda.
moda.f( )           # Sets moda.X, not my X
print X, moda.X
When run, moda.f
changes the X
in moda
, not the X
in
modb
. The global scope for
moda.f
is always the file enclosing it, regardless
of which module it is ultimately called from:
% python modb.py
11 99
In other words, import operations never give upward visibility: code in an imported file cannot see names in the importing file. More formally:
Functions can never see names in other functions, unless they are physically enclosing.
Module code can never see names in other modules unless they are explicitly imported.
Such behavior is part of the lexical scoping notion—in Python, the scopes surrounding a piece of code are completely determined from the code’s physical position in your file. Scopes are never influenced by function calls, or module imports.[1]
In some sense, although imports do not nest namespaces upward, they do nest downward. Using attribute qualification paths, it’s possible to descend into arbitrarily nested modules, and access their attributes. For example, consider the next three files. mod3.py defines a single global name and attribute by assignment:
X = 3
mod2.py imports the first and uses qualification to access the imported module’s attribute:
X = 2
import mod3
print X,            # My global X
print mod3.X        # mod3's X
And mod1.py imports the second, and fetches attributes in both the first and second files:
X = 1
import mod2
print X,            # My global X
print mod2.X,       # mod2's X
print mod2.mod3.X   # Nested mod3's X
Really, when mod1
imports mod2
here, it sets up a two-level namespace nesting. By using a path of
names mod2.mod3.X
, it descends into
mod3
, which is nested in the imported
mod2
. The net effect is that
mod1
can see the X
s in all
three files, and hence has access to all three global scopes:
% python mod1.py
2 3
1 2 3
Conversely, mod3
cannot see names in
mod2
, and mod2
cannot see names
in mod1
. This example may be easier to grasp if
you don’t think in terms of namespaces and scopes;
instead, focus on the objects involved. Within
mod1
, mod2
is just a name that
refers to an object with attributes, some of which may refer to other
objects with attributes (import
is an assignment).
For paths like mod2.mod3.X
, Python simply
evaluates left to right, fetching attributes from objects along the
way.
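That left-to-right evaluation can be written out explicitly; fetch here is an illustrative helper (not a standard function), and os.path serves as a real nested module, much like mod3 inside mod2:

```python
import functools
import os

def fetch(path, namespace):
    # Evaluate a dotted path the way Python does: look up the first
    # name, then fetch each following attribute left to right.
    names = path.split(".")
    obj = namespace[names[0]]
    return functools.reduce(getattr, names[1:], obj)

# os.path is a module nested inside os:
assert fetch("os.path.sep", {"os": os}) is os.path.sep
```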
Note that mod1
can say import
mod2
and then mod2.mod3.X
, but cannot
say import mod2.mod3
—this syntax invokes
something called package (directory) imports, described in the next
chapter. Package imports also create module namespace nesting, but
their import statements are
taken to reflect directory trees,
not simple import chains.
A module’s code is run
only once per process by default. To force a
module’s code to be reloaded and rerun, you need to ask
Python explicitly to do so, by calling the reload
built-in function. In this section, we’ll explore
how to use reloads to make your systems more dynamic. In a nutshell:
Imports (both import
and from
statements) load and run a module’s code only the
first time the module is imported in a process.
Later imports use the already loaded module object without reloading or rerunning the file’s code.
The reload
function forces an already loaded
module’s code to be reloaded and rerun. Assignments
in the file’s new code change the existing module
object in-place.
Why all the fuss about reloading modules? The
reload
function allows parts of programs to be
changed without stopping the whole program. With
reload
, the effects of changes in components can
be observed immediately. Reloading doesn’t help in
every situation, but where it does, it makes for a much shorter
development cycle. For instance, imagine a database program that must
connect to a server on startup; since program changes can be tested
immediately after reloads, you need to connect only once while
debugging.
Because Python is interpreted (more or less), it already gets rid of
the compile/link steps you need to go through to get a C program to
run: modules are loaded dynamically, when imported by a running
program. Reloading adds to this, by allowing you to also change parts
of running programs without stopping. We should note that
reload
currently only works on modules written in
Python; C extension modules can be dynamically loaded at runtime too,
but they can’t be reloaded.
reload
is a built-in function in Python, not a
statement.
reload
is passed an existing module object, not a
name.
Because reload
expects an object, a module must
have been previously imported successfully before you can reload it.
In fact, if the import was unsuccessful due to a syntax or other
error, you may need to repeat an import before you can reload.
Furthermore, the syntax of import statements and
reload
calls differs: reloads require parentheses,
but imports do not. Reloading looks like this:
import module            # Initial import
...use module.attributes...
...                      # Now, go change the module file.
...
reload(module)           # Get updated exports.
...use module.attributes...
You typically import a module, then change its source code in a text
editor and reload. When you call reload
, Python
rereads the module file’s source code and reruns its
top-level statements. But perhaps the most important thing to know
about reload
is that it changes a module object
in-place; it does not delete and recreate the module object. Because
of that, every reference to a module object anywhere in your program
is automatically affected by a reload. The details:
reload
runs a module file’s
new code in the module’s current namespace. Rerunning a module file’s code overwrites its
existing namespace, rather than deleting and recreating it.
Top-level assignments in the file replace names with new values. For instance, rerunning a def
statement replaces
the prior version of the function in the module’s
namespace, by reassigning the function name.
Reloads impact all clients that use import
to
fetch modules. Because clients that use import
qualify to fetch
attributes, they’ll find new values in the module
object after a reload
.
Reloads impact future from
clients only. Clients that used from
to fetch attributes in the
past won’t be affected by a
reload
; they’ll still have
references to the old objects fetched before the reload.
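These rules can be checked end to end with a throwaway module file; the sketch below uses Python 3 syntax, where reload has moved from the built-ins into the importlib module, and changer_demo is an invented name for the temporary module:

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True       # Avoid stale bytecode caches when
                                     # the file is rewritten quickly.

# Create a throwaway module file on the search path.
tmpdir = tempfile.mkdtemp()
sys.path.insert(0, tmpdir)
path = os.path.join(tmpdir, "changer_demo.py")
with open(path, "w") as f:
    f.write("message = 'First version'\n")

import changer_demo                  # Loads and runs the file's code
from changer_demo import message     # Copies the name out

# Edit the file on disk, then force the new code to run in-place.
with open(path, "w") as f:
    f.write("message = 'After editing'\n")
importlib.reload(changer_demo)

print(changer_demo.message)  # import clients see the new value
print(message)               # a prior from copy still has the old one
```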
Here’s a more concrete example of
reload
in action. In the following example, we
change and reload a module file without stopping the interactive
Python session. Reloads are used in many other scenarios, too (see
the sidebar Why You Will Care: Module Reloads), but we’ll
keep things simple for illustration here. First,
let’s write a module file named
changer.py with the text editor of our choice:
message = "First version"

def printer( ):
    print message
This module creates and exports two names—one bound to a string, and another to a function. Now, start the Python interpreter, import the module, and call the function it exports; the function prints the value of the global variable message:
% python
>>> import changer
>>> changer.printer( )
First version
Next, keep the interpreter active and edit the module file in another window:
...modify changer.py without stopping Python...
% vi changer.py
Here, change the global message
variable, as well
as the printer
function body:
message = "After editing"

def printer( ):
    print 'reloaded:', message
Finally, come back to the Python window and reload the module to
fetch the new code we just changed. Notice that importing the module
again has no effect; we get the original message even though the
file’s been changed. We have to call
reload
in order to get the new version:
...back to the Python interpreter/program...
>>> import changer
>>> changer.printer( )        # No effect: uses loaded module
First version
>>> reload(changer)           # Forces new code to load/run
<module 'changer'>
>>> changer.printer( )        # Runs the new version now
reloaded: After editing
Notice that reload
actually returns the module
object for us; its result is usually ignored, but since expression
results are printed at the interactive prompt, Python shows a default
<module name>
representation.
[1] Some languages act differently and provide for dynamic scoping, where scopes really may depend on runtime calls. This tends to make code trickier, though, because the meaning of a variable can differ over time.