Finally, here is the usual collection of boundary cases, which make life interesting for beginners. Some are so obscure it was hard to come up with examples, but most illustrate something important about Python.
As we’ve seen, the module name in an import or from statement is a hardcoded variable name; you can’t use these statements directly to load a module given its name as a Python string. For instance:
>>> import "string"
File "<stdin>", line 1
import "string"
^
SyntaxError: invalid syntax
You need to use special tools to load modules dynamically, from a string that exists at runtime. The most general approach is to construct an import statement as a string of Python code and pass it to the exec statement to run:
>>> modname = "string"
>>> exec "import " + modname       # run a string of code
>>> string                         # imported in this namespace
<module 'string'>
The exec statement (and its cousin, the eval function) compiles a string of code, and passes it to the Python interpreter to be executed. In Python, the bytecode compiler is available at runtime, so you can write programs that construct and run other programs like this. By default, exec runs the code in the current scope, but you can get more specific by passing in optional namespace dictionaries. We’ll say more about these tools later in this book.
The only real drawback to exec is that it must compile the import statement each time it runs; if it runs many times, you might be better off using the built-in __import__ function to load from a name string instead. The effect is similar, but __import__ returns the module object, so we assign it to a name here:
>>> modname = "string"
>>> string = __import__(modname)
>>> string
<module 'string'>
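For readers on Python 3, where exec is a built-in function rather than a statement, the same techniques might be sketched as follows. This is only an illustration under that assumption, not the listing above: the helper name load_by_name and the variable names are hypothetical, and importlib.import_module is a standard-library alternative to calling __import__ directly.

```python
import importlib

def load_by_name(modname):
    """Return the module object whose name is given as a string."""
    return importlib.import_module(modname)

modname = "string"

# 1. exec: build an import statement as text and run it in an explicit
#    namespace dictionary instead of the current scope.
namespace = {}
exec("import " + modname, namespace)
via_exec = namespace[modname]

# 2. __import__: returns the module object, so we assign it to a name.
via_dunder = __import__(modname)

# 3. importlib.import_module: same idea, wrapped in the hypothetical helper.
via_importlib = load_by_name(modname)

# All three routes fetch the same module object from Python's modules table.
print(via_exec is via_dunder is via_importlib)
```

One difference worth knowing: for a dotted name such as "os.path", __import__ returns the top-level package (os), whereas importlib.import_module returns the submodule itself.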
Earlier, we mentioned that the from statement is really an assignment to names in the importer’s scope: a name-copy operation, not a name aliasing. The implications of this are the same as for all assignments in Python, but subtle, especially given that the code that shares objects lives in different files. For instance, suppose we define a module nested1 as follows:
X = 99
def printer(): print X
Now, if we import its two names using from in another module, we get copies of those names, not links to them. Changing a name in the importer resets only the binding of the local version of that name, not the name in nested1:
from nested1 import X, printer # copy names out
X = 88 # changes my "X" only!
printer() # nested1's X is still 99
% python nested2.py
99
On the other hand, if you use import to get the whole module and assign to a qualified name, you change the name in nested1. Qualification directs Python to a name in the module object, rather than a name in the importer:
import nested1 # get module as a whole
nested1.X = 88 # okay: change nested1's X
nested1.printer()
% python nested3.py
88
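The difference between the two styles can be reproduced in a single file. The sketch below assumes Python 3 and builds a stand-in for nested1 in memory with types.ModuleType, purely so the example does not need a second file; that setup, and the fact that printer returns X instead of printing it, are illustrative changes rather than part of the original modules.

```python
import sys
import types

# Hypothetical in-memory stand-in for nested1.py (same two names as above).
nested1 = types.ModuleType("nested1")
exec("X = 99\ndef printer():\n    return X\n", nested1.__dict__)
sys.modules["nested1"] = nested1

# Style 1: from copies the names into this file's scope.
from nested1 import X, printer
X = 88                          # rebinds this file's X only
after_from = printer()          # nested1's X is still 99

# Style 2: import plus qualification reaches the name in the module object.
import nested1 as mod
mod.X = 88                      # changes nested1's X
after_qualified = mod.printer()

print(after_from, after_qualified)
```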
As we also saw earlier, when a module is first imported (or reloaded), Python executes its statements one by one, from the top of file to the bottom. This has a few subtle implications regarding forward references that are worth underscoring here:
Code at the top level of a module file (not nested in a function) runs as soon as Python reaches it during an import; because of that, it can’t reference names assigned lower in the file.
Code inside a function body doesn’t run until the function is called; because names in a function aren’t resolved until the function actually runs, they can usually reference names anywhere in the file.
In general, forward references are only a concern in top-level module code that executes immediately; functions can reference names arbitrarily. Here’s an example that illustrates forward reference dos and don’ts:
func1()                  # error: "func1" not yet assigned

def func1():
    print func2()        # okay:  "func2" looked up later

func1()                  # error: "func2" not yet assigned

def func2():
    return "Hello"

func1()                  # okay:  "func1" and "func2" assigned
When this file is imported (or run as a standalone program), Python executes its statements from top to bottom. The first call to func1 fails because the func1 def hasn’t run yet. The call to func2 inside func1 works as long as func2’s def has been reached by the time func1 is called (it hasn’t when the second top-level func1 call is run). The last call to func1 at the bottom of the file works, because func1 and func2 have both been assigned.
Don’t do that. Mixing defs with top-level code is not only hard to read, it’s dependent on statement ordering. As a rule of thumb, if you need to mix immediate code with defs, put your defs at the top of the file and top-level code at the bottom. That way, your functions are defined and assigned by the time code that uses them runs.
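To watch the forward-reference rules in action without the script dying at the first error, each failing call can be wrapped in a try statement. The following is a sketch that assumes Python 3 and is run as a top-level script; the results list is a hypothetical addition used to record what happened, and func1 returns its value rather than printing it.

```python
results = []

try:
    func1()                      # fails: "func1" not yet assigned
except NameError:
    results.append("first call: NameError")

def func1():
    return func2()               # "func2" is looked up when func1 runs

try:
    func1()                      # fails: "func2" not yet assigned
except NameError:
    results.append("second call: NameError")

def func2():
    return "Hello"

# Both names are assigned by now, so this call succeeds.
results.append("third call: " + func1())

for line in results:
    print(line)
```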
Because imports execute a file’s statements from top to bottom, we sometimes need to be careful when using modules that import each other (something called recursive imports). Since the statements in a module have not all been run when it imports another module, some of its names may not yet exist. If you use import to fetch a module as a whole, this may or may not matter; the module’s names won’t be accessed until you later use qualification to fetch their values. But if you use from to fetch specific names, you only have access to names already assigned.
For instance, take the following modules recur1 and recur2. recur1 assigns a name X, and then imports recur2, before assigning name Y. At this point, recur2 can fetch recur1 as a whole with an import (it already exists in Python’s internal modules table), but it can see only name X if it uses from; the name Y below the import in recur1 doesn’t yet exist, so you get an error:
module recur1.py
X = 1
import recur2                  # run recur2 now if doesn't exist
Y = 2
module recur2.py
from recur1 import X # okay: "X" already assigned
from recur1 import Y # error: "Y" not yet assigned
>>> import recur1
Traceback (innermost last):
File "<stdin>", line 1, in ?
File "recur1.py", line 2, in ?
import recur2
File "recur2.py", line 2, in ?
from recur1 import Y # error: "Y" not yet assigned
ImportError: cannot import name Y
Python is smart enough to avoid rerunning recur1’s statements when they are imported recursively from recur2 (or else the imports would send the script into an infinite loop), but recur1’s namespace is incomplete when imported by recur2.
Don’t do that... really! Python won’t get stuck in a cycle, but your programs will once again be dependent on the order of statements in modules. There are two ways out of this gotcha:
You can usually eliminate import cycles like this by careful design; maximizing cohesion and minimizing coupling are good first steps.
If you can’t break the cycles completely, postpone module name access by using import and qualification (instead of from), or by running your from statements inside functions (instead of at the top level of the module).
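Both the gotcha and the second way out can be tried in one script by writing small module files to a temporary directory. This is a sketch assuming Python 3; the module names recur1a, recur2a, recur1b, and recur2b and the helper write_module are hypothetical stand-ins invented for the demonstration, with the "b" pair postponing name access through import and qualification inside a function.

```python
import importlib
import os
import shutil
import sys
import tempfile

def write_module(directory, name, source):
    """Write source to directory/name.py (helper for this demo only)."""
    with open(os.path.join(directory, name + ".py"), "w") as f:
        f.write(source)

tmpdir = tempfile.mkdtemp()
sys.path.insert(0, tmpdir)

# The gotcha: recur2a uses from at its top level for a name (Y) that
# recur1a has not assigned yet when the recursive import happens.
write_module(tmpdir, "recur1a", "X = 1\nimport recur2a\nY = 2\n")
write_module(tmpdir, "recur2a", "from recur1a import X\nfrom recur1a import Y\n")

# One way out: fetch the module as a whole and qualify later, in a function.
write_module(tmpdir, "recur1b", "X = 1\nimport recur2b\nY = 2\n")
write_module(tmpdir, "recur2b",
             "import recur1b\n"
             "def total():\n"
             "    return recur1b.X + recur1b.Y\n")

importlib.invalidate_caches()

try:
    import recur1a
    outcome = "imported"
except ImportError:
    outcome = "ImportError"      # Y did not exist yet inside recur1a

import recur1b
import recur2b
total = recur2b.total()          # both X and Y exist by the time this runs

# Tidy up the temporary files.
sys.path.remove(tmpdir)
shutil.rmtree(tmpdir, ignore_errors=True)

print(outcome, total)
```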
The from statement is the source of all sorts of gotchas in Python. Here’s another: because from copies (assigns) names when run, there’s no link back to the module where the names came from. Names imported with from simply become references to objects, which happen to have been referenced by the same names in the importee when the from ran. Because of this behavior, reloading the importee has no effect on clients that use from; the client’s names still reference the objects fetched with from, even though names in the original module have been reset:
from module import X         # X may not reflect any module reloads!
 . . .
reload(module)               # changes module, not my names
X                            # still references old object
Don’t do it that way. To make reloads more effective, use import and name qualification, instead of from. Because qualifications always go back to the module, they will find the new bindings of module names after calling reload:
import module                # get module, not names
 . . .
reload(module)               # changes module in-place
module.X                     # get current X: reflects module reloads
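The contrast between a copied name and a qualified name can be checked with a throwaway module file. The sketch below assumes Python 3, where reload lives in importlib; the module name reloaddemo is hypothetical, and bytecode caching is switched off for the duration so that a quick edit to the source file is not masked by a stale cache.

```python
import importlib
import os
import shutil
import sys
import tempfile

old_flag = sys.dont_write_bytecode
sys.dont_write_bytecode = True       # keep a stale cache from hiding the edit

tmpdir = tempfile.mkdtemp()
sys.path.insert(0, tmpdir)
path = os.path.join(tmpdir, "reloaddemo.py")

with open(path, "w") as f:
    f.write("X = 1\n")
importlib.invalidate_caches()

import reloaddemo                    # get module, not names
from reloaddemo import X             # copy the current binding of X

with open(path, "w") as f:
    f.write("X = 'changed'\n")       # simulate editing the module's source

importlib.reload(reloaddemo)         # changes the module in place

copied = X                           # still references the old object
qualified = reloaddemo.X             # reflects the reload

# Restore global state and remove the temporary files.
sys.dont_write_bytecode = old_flag
sys.path.remove(tmpdir)
shutil.rmtree(tmpdir, ignore_errors=True)

print(copied, qualified)
```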
When you reload a module, Python only reloads that particular module’s file; it doesn’t automatically reload modules that the file being reloaded happens to import. For example, if we reload some module A, and A imports modules B and C, the reload only applies to A, not B and C. The statements inside A that import B and C are rerun during the reload, but they’ll just fetch the already loaded B and C module objects (assuming they’ve been imported before):
% cat A.py
import B                     # not reloaded when A is
import C                     # just an import of an already loaded module

% python
>>> . . .
>>> reload(A)
Don’t depend on that. Use multiple reload calls to update subcomponents independently. If desired, you can design your systems to reload their subcomponents automatically by adding reload calls in parent modules like A.[42]
[42] You could also write a general tool to do transitive reloads automatically, by scanning module __dict__s (see Section 5.6.7), and checking each item’s type() to find nested modules to reload recursively. This is an advanced exercise for the ambitious.
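One possible shape for such a transitive reloader is sketched below, assuming Python 3. The function name reload_all, its root parameter (which limits reloading to modules whose source files live under one directory, so that built-in and library modules are left alone), and the demo modules demo_a and demo_b are all hypothetical choices made for this illustration.

```python
import importlib
import os
import shutil
import sys
import tempfile
import types

def reload_all(module, root, visited=None):
    """Reload module, then every module in its __dict__ that lives under root."""
    if visited is None:
        visited = set()
    if module.__name__ in visited:       # already handled: avoid cycles
        return visited
    visited.add(module.__name__)
    importlib.reload(module)
    for value in list(vars(module).values()):
        if isinstance(value, types.ModuleType):
            source = getattr(value, "__file__", None)
            if source and os.path.abspath(source).startswith(os.path.abspath(root)):
                reload_all(value, root, visited)
    return visited

# Demo: demo_a imports demo_b; only demo_b's source file is edited.
old_flag = sys.dont_write_bytecode
sys.dont_write_bytecode = True           # keep a stale cache from hiding the edit
tmpdir = tempfile.mkdtemp()
sys.path.insert(0, tmpdir)
with open(os.path.join(tmpdir, "demo_b.py"), "w") as f:
    f.write("VALUE = 1\n")
with open(os.path.join(tmpdir, "demo_a.py"), "w") as f:
    f.write("import demo_b\n")
importlib.invalidate_caches()

import demo_a

with open(os.path.join(tmpdir, "demo_b.py"), "w") as f:
    f.write("VALUE = 'new'\n")

importlib.reload(demo_a)                 # plain reload: demo_b is not rerun
after_plain = demo_a.demo_b.VALUE

reloaded = reload_all(demo_a, tmpdir)    # transitive reload reaches demo_b
after_transitive = demo_a.demo_b.VALUE

sys.dont_write_bytecode = old_flag
sys.path.remove(tmpdir)
shutil.rmtree(tmpdir, ignore_errors=True)

print(after_plain, after_transitive, sorted(reloaded))
```

This version reloads a parent before its children, which is adequate when the parent reaches them through import and qualification; clients that copied names out with from would still hold references to the old objects, for the reasons described earlier.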