Did all of Chapter 20 make sense? If not, don’t worry; now that we’ve had a quick tour, we’re going to dig a bit deeper and study the concepts we’ve introduced in further detail. This chapter takes a second pass, to formalize and expand on some of the class coding ideas introduced in Chapter 20.
Although the Python class statement seems similar to its counterparts in other OOP languages on the surface, on closer inspection it is quite different from what some programmers are used to. For example, as in C++, the class statement is Python’s main OOP tool. Unlike in C++, though, Python’s class is not a declaration. Like def, class is an object builder and an implicit assignment: when run, it generates a class object and stores a reference to it in the name used in the header. Also like def, class is true executable code: your class doesn’t exist until Python reaches and runs the class statement (typically, while importing the module it is coded in, but not before).
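Because class is an executable statement, it can appear anywhere a statement can, including inside an if or a function. The following is a minimal sketch of that idea; the names make_greeter and Greeter are hypothetical, and the code uses the function-call form of print so that it runs on current Python versions:

```python
# Hypothetical names (make_greeter, Greeter), used only for illustration.
def make_greeter(polite):
    # The class statement runs each time this function is called,
    # building a new class object and binding it to the name Greeter.
    if polite:
        class Greeter:
            greeting = 'Good day'
    else:
        class Greeter:
            greeting = 'Hey'
    return Greeter

Formal = make_greeter(True)      # A class object can be assigned to any name
Casual = make_greeter(False)

print(Formal.greeting)           # Good day
print(Casual.greeting)           # Hey
print(Formal is Casual)          # False: two separate class objects
```

Neither class exists until the call to make_greeter actually runs the corresponding class statement.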
The class statement is a compound statement, with a body of indented statements usually under it. In the header, superclasses are listed in parentheses after the class name, separated by commas. Listing more than one superclass leads to multiple inheritance (which we’ll say more about in the next chapter). Here is the statement’s general form:
class <name>(superclass,...):       # Assign to name.
    data = value                    # Shared class data
    def method(self,...):           # Methods
        self.member = value         # Per-instance data
Within the class
statement, any assignment
generates a class attribute, and specially-named methods overload
operators; for instance, a function called __init__
is called at instance object construction time, if
defined.
Classes are mostly just namespaces—a tool for defining names (i.e., attributes) that export data and logic to clients. So how do you get from the class statement to a namespace?
Here’s how. Just as with module files, the statements nested in a class statement body create its attributes. When Python executes a class statement (not a call to a class), it runs all the statements in its body, from top to bottom. Assignments that happen during this process create names in the class’s local scope, which become attributes in the associated class object. Because of this, classes resemble both modules and functions:
Like functions, class statements are a local scope where names created by nested assignments live.
Like modules, names assigned in a class statement become attributes in a class object.
The main distinction for classes is that their namespaces are also the basis of inheritance in Python; attributes are fetched from other classes if not found in a class or instance object.
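As a rough illustration of how a class body turns into a namespace, the sketch below runs a few different statements inside a class and then inspects the resulting class object. The Settings class and its attribute names are made up for this example, and the code assumes Python 3 syntax:

```python
import sys

class Settings:
    debug = False                          # Plain assignment: class attribute
    if sys.version_info[0] >= 3:           # Any statement may appear in the body
        flavor = 'new'
    else:
        flavor = 'old'
    def describe(self):                    # def assigns a function to a name, too
        return self.flavor

# Names assigned while the body ran are now attributes of the class object.
names = sorted(n for n in Settings.__dict__ if not n.startswith('__'))
print(names)                               # ['debug', 'describe', 'flavor']
print(Settings().describe())               # Found in the class by inheritance
```

The if statement runs once, when the class statement itself runs, so only one of the two flavor assignments ever creates an attribute.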
Because class is a compound statement, any sort of statement can be nested inside its body: print, =, if, def, and so on. All the statements inside the class statement run when the class statement itself runs (not when the class is later called to make an instance). Any name assigned inside the class statement makes a class attribute. Nested defs make class methods, but other assignments make attributes, too. For example:
>>> class SharedData:
...     spam = 42                    # Generates a class attribute.
...
>>> x = SharedData( )                # Make two instances.
>>> y = SharedData( )
>>> x.spam, y.spam                   # They inherit and share spam.
(42, 42)
Here, because the name spam
is assigned at the
top-level of a class
statement, it is attached to
the class, and so will be shared by all instances. Change it by going
through the class name; refer to it through either instances or the
class.[1]
>>> SharedData.spam = 99
>>> x.spam, y.spam, SharedData.spam
(99, 99, 99)
Such class attributes can be used to manage information that spans
all the instances—a counter of the number of instances
generated, for example (we’ll expand on this idea in
Chapter 23). Now, watch what happens if we assign
name spam
through an instance instead of the
class:
>>> x.spam = 88
>>> x.spam, y.spam, SharedData.spam
(88, 99, 99)
Assignments to instance attributes create or change that name in the
instance, rather than the shared class. More generally, inheritance
search occurs only on attribute reference, not
assignment: assigning to an object’s attribute
always changes that object, and no other.[2] For example,
y.spam
is looked up in the class by inheritance,
but the assignment to x.spam
attaches a name to
x
itself.
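One way to see where each name actually lives is to look at the objects’ own namespace dictionaries. This is a small sketch of that check, written in Python 3 syntax and reusing the SharedData class name from the text:

```python
class SharedData:
    spam = 42

x = SharedData()
y = SharedData()
x.spam = 88                        # Creates 'spam' in x's own namespace

print(x.__dict__)                  # {'spam': 88}
print(y.__dict__)                  # {}: y still inherits spam from the class
print(SharedData.spam, y.spam)     # 42 42

del x.spam                         # Remove the instance's own copy...
print(x.spam)                      # 42: ...and the class attribute shows again
```

Deleting the instance’s name uncovers the class attribute, which was hidden by the assignment but never changed.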
Here’s a more comprehensive example of this behavior, that stores the same name in two places. Suppose we run the following class:
class MixedNames:                            # Define class.
    data = 'spam'                            # Assign class attr.
    def __init__(self, value):               # Assign method name.
        self.data = value                    # Assign instance attr.
    def display(self):
        print self.data, MixedNames.data     # Instance attr, class attr
This class contains two def
s, which bind class
attributes to method functions. It also contains an
=
assignment statement; since this class-level
assignment assigns the name data
inside the
class
, it lives in the class’s
local scope and becomes an attribute of the class object. Like all
class attributes, this data
is inherited and
shared by all instances of the class that don’t have
a data
of their own.
When we make instances of this class, the name
data
is also attached to instances, by the
assignment to self.data
in the constructor method:
>>> x = MixedNames(1)              # Make two instance objects.
>>> y = MixedNames(2)              # Each has its own data.
>>> x.display( ); y.display( )     # self.data differs, MixedNames.data same.
1 spam
2 spam
The net result is that data lives in two places: in instance objects (created by the self.data assignment in __init__), and in the class they inherit names from (created by the data assignment in the class). The class’s display method prints both versions, by first qualifying the self instance, and then the class.
By using these techniques to store attributes on different objects, you determine their scope of visibility. When attached to classes, names are shared; in instances, names record per-instance data, not shared behavior or data. Although inheritance looks up names for us, we can always get to an attribute anywhere in a tree, by accessing the desired object directly.
In the example, x.data
and
self.data
will choose an instance name, which
normally hides the same name in the class. But
MixedNames.data
grabs the class name explicitly.
We’ll see various roles for such coding patterns
later; the next section describes one of the most common.
Since you already know about functions, you also know about
methods
in classes. Methods are just function objects created by
def
statements nested in a
class
statement’s body. From an
abstract perspective, methods provide behavior for instance objects
to inherit. From a programming perspective, methods work in exactly
the same way as simple functions, with one crucial exception: their
first argument always receives the instance object that is the
implied subject of a method call.
In other words, Python automatically maps instance method calls to class method functions as follows. Method calls made through an instance:
instance.method(args...)
are automatically translated to class method function calls of this form:
class.method(instance, args...)
where the class is determined by locating the method name using Python’s inheritance search procedure. In fact, both call forms are valid in Python.
Besides the normal inheritance of method attribute names, the special first argument is the only real magic behind method calls. In a class method, the first argument is usually called self by convention (technically, only its position is significant, not its name). This argument provides methods with a hook back to the instance: because classes generate many instance objects, they need to use this argument to manage data that varies per instance.
C++ programmers may recognize Python’s
self
argument as similar to C++’s
“this” pointer. In Python, though,
self
is always explicit in your code. Methods must
always go through self
to fetch or change
attributes of the instance being processed by the current method
call. This explicit nature of self
is by
design—the presence of this name makes it obvious that you are
using attribute names in your script, not a name in the local or
global scope.
Let’s turn to an example; suppose we define the following class:
class NextClass:                    # Define class.
    def printer(self, text):        # Define method.
        self.message = text         # Change instance.
        print self.message          # Access instance.
The name printer
references a function object;
because it’s assigned in the
class
statement’s scope, it
becomes a class object attribute and is inherited by every instance
made from the class. Normally, because methods like
printer
are designed to process instances, we call
them through instances:
>>> x = NextClass( )                 # Make instance
>>> x.printer('instance call')       # Call its method
instance call
>>> x.message                        # Instance changed
'instance call'
When called by qualifying an instance like this, printer is first located by inheritance, and then its self argument is automatically assigned the instance object (x); the text argument gets the string passed at the call ('instance call'). When called this way, we pass one fewer argument than it seems we need, because Python automatically passes the first argument to self for us. Inside printer, the name self is used to access or set per-instance data, because it refers back to the instance currently being processed.
Methods may be called in one of two ways—through an instance,
or through the class itself. For example, we can
also call printer
by going through the class name,
provided we pass an instance to the self
argument
explicitly:
>>> NextClass.printer(x, 'class call')     # Direct class call
class call
>>> x.message                              # Instance changed again
'class call'
Calls routed through the instance and class have the exact same effect, as long as we pass the same instance object ourselves in the class form. By default, in fact, you get an error message if you try to call a method without any instance:
>>> NextClass.printer('bad call')
TypeError: unbound method printer( ) must be called with NextClass instance...
Methods are normally called through instances. Calls to methods through the class, though, show up in a variety of special roles. One common scenario involves the constructor method. The __init__ method, like all attributes, is looked up by inheritance. This means that at construction time, Python locates and calls just one __init__; if subclass constructors need to guarantee that superclass construction-time logic runs too, they generally must call it explicitly through the class:
class Super:
    def __init__(self, x):
        ...default code...

class Sub(Super):
    def __init__(self, x, y):
        Super.__init__(self, x)        # Run superclass init.
        ...custom code...              # Do my init actions.

I = Sub(1, 2)
This is one of the few contexts in which your code calls an overload method directly. Naturally, you should only call the superclass constructor this way if you really want it to run—without the call, the subclass replaces it completely.[3]
This pattern of calling through a class is the general basis of extending (instead of completely replacing) inherited method behavior. In Chapter 23, we’ll also meet a new option in Python 2.2, static and class methods, which allows you to code methods that do not expect an instance object in their first argument. Such methods can act like simple instance-less functions, with names that are local to the class they are coded in. This is an advanced and optional extension, though; normally, you must always pass an instance to a method, whether it is called through the instance or the class.
The whole point of a namespace tool like the class
statement is to support name
inheritance. This section expands
on some of the mechanisms and roles of attribute inheritance.
In Python, inheritance happens when an object is qualified, and involves searching an attribute definition tree (one or more namespaces). Every time you use an expression of the form object.attr, where object is an instance or class object, Python searches the namespace tree at and above object for the first attr it can find. This includes references to self attributes in your methods. Because lower definitions in the tree override higher ones, inheritance forms the basis of specialization.
Figure 21-1 summarizes the way namespace trees are constructed and populated with names. Generally:
Instance attributes are generated by assignments to
self
attributes in methods.
Class attributes are created by statements (assignments) in
class
statements.
Superclass links are made by listing classes in parentheses in a
class
statement header.
The net result is a tree of attribute namespaces, which grows from an instance, to the class it was generated from, to all the superclasses listed in the class headers. Python searches upward in this tree from instances to superclasses, each time you use qualification to fetch an attribute name from an instance object.[4]
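The upward search can be approximated by hand with the namespace dictionaries and superclass links that every object carries. The lookup function below is a hypothetical helper and a deliberate simplification: it returns plain functions rather than bound methods, and it uses a depth-first, left-to-right order, which agrees with Python’s own search for simple single-inheritance trees like this one. It is written in Python 3 syntax:

```python
def lookup(instance, name):
    """Hypothetical helper: a simplified stand-in for inheritance search."""
    if name in instance.__dict__:              # 1. The instance itself
        return instance.__dict__[name]
    pending = [instance.__class__]             # 2. Its class, then superclasses
    while pending:
        klass = pending.pop(0)
        if name in klass.__dict__:
            return klass.__dict__[name]
        pending[0:0] = list(klass.__bases__)   # Depth-first, left to right
    raise AttributeError(name)

class Top:
    color = 'red'
    size = 1

class Middle(Top):
    size = 2                                   # Hides Top.size for lower objects

class Bottom(Middle):
    def __init__(self):
        self.name = 'bob'                      # Instance attribute

b = Bottom()
print(lookup(b, 'name'))     # bob  (found in the instance)
print(lookup(b, 'size'))     # 2    (found in Middle, before Top)
print(lookup(b, 'color'))    # red  (found in Top)
```

For these names, the results match what b.name, b.size, and b.color return through normal qualification.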
The tree-searching model of inheritance just described turns out to be a great way to specialize systems. Because inheritance finds names in subclasses before it checks superclasses, subclasses can replace default behavior by redefining the superclass’s attributes. In fact, you can build entire systems as hierarchies of classes, which are extended by adding new external subclasses rather than changing existing logic in place.
The idea of redefining inherited names leads to a variety of specialization techniques. For instance, subclasses may replace inherited attributes completely, provide attributes that a superclass expects to find, and extend superclass methods by calling back to the superclass from an overridden method. We’ve already seen replacement in action; here’s an example that shows how extension works:
>>> class Super:
...     def method(self):
...         print 'in Super.method'
...
>>> class Sub(Super):
...     def method(self):                       # Override method.
...         print 'starting Sub.method'         # Add actions here.
...         Super.method(self)                  # Run default action.
...         print 'ending Sub.method'
...
Direct superclass method calls are the crux of the matter here. The Sub class replaces Super’s method function with its own specialized version. But within the replacement, Sub calls back to the version exported by Super to carry out the default behavior. In other words, Sub.method just extends Super.method behavior, rather than replacing it completely:
>>> x = Super( )            # Make a Super instance.
>>> x.method( )             # Runs Super.method
in Super.method
>>> x = Sub( )              # Make a Sub instance.
>>> x.method( )             # Runs Sub.method, which calls Super.method
starting Sub.method
in Super.method
ending Sub.method
This extension coding pattern is also commonly used with constructors; see Section 21.2 for an example.
Extension is only one way to interface with a superclass; the following file, specialize.py, defines multiple classes that illustrate a variety of common techniques:
Super
    Defines a method function and a delegate that expects an action in a subclass
Inheritor
    Doesn’t provide any new names, so it gets everything defined in Super
Replacer
    Overrides Super’s method with a version of its own
Extender
    Customizes Super’s method by overriding and calling back to run the default
Provider
    Implements the action method expected by Super’s delegate method
Study each of these subclasses to get a feel for the various ways they customize their common superclass:
class Super:
    def method(self):
        print 'in Super.method'        # Default behavior
    def delegate(self):
        self.action( )                 # Expected to be defined

class Inheritor(Super):                # Inherit method verbatim.
    pass

class Replacer(Super):                 # Replace method completely.
    def method(self):
        print 'in Replacer.method'

class Extender(Super):                 # Extend method behavior.
    def method(self):
        print 'starting Extender.method'
        Super.method(self)
        print 'ending Extender.method'

class Provider(Super):                 # Fill in a required method.
    def action(self):
        print 'in Provider.action'

if __name__ == '__main__':
    for klass in (Inheritor, Replacer, Extender):
        print ' ' + klass.__name__ + '...'
        klass( ).method( )
    print ' Provider...'
    x = Provider( )
    x.delegate( )
A few things are worth pointing out here. The self-test code at the
end of this example creates instances of three different classes in a
for
loop. Because classes are objects, you can put
them in a tuple and create instances generically (more on this idea
later). Classes also have the special __name__
attribute like modules; it’s just preset to a string
containing the name in the class header.
% python specialize.py
Inheritor...
in Super.method
Replacer...
in Replacer.method
Extender...
starting Extender.method
in Super.method
ending Extender.method
Provider...
in Provider.action
Notice how the Provider
class
in the prior example works. When we call the
delegate
method through a
Provider
instance, two
independent inheritance searches occur:
On the initial x.delegate
call, Python finds the
delegate
method in Super
, by
searching at the Provider
instance and above. The
instance x
is passed into the
method’s self
argument as usual.
Inside the Super.delegate
method,
self.action
invokes a new, independent inheritance
search at self
and above. Because
self
references a Provider
instance, the action
method is located in the
Provider
subclass.
This “filling in the blanks” sort of coding structure is typical of OOP frameworks. At least in terms of the delegate method, the superclass in this example is what is sometimes called an abstract superclass: a class that expects parts of its behavior to be provided by subclasses. If an expected method is not defined in a subclass, Python raises an undefined name exception (an AttributeError) after the inheritance search fails. Class coders sometimes make such subclass requirements more obvious with assert statements, or by raising the built-in NotImplementedError exception:
class Super:
    def method(self):
        print 'in Super.method'
    def delegate(self):
        self.action( )
    def action(self):
        assert 0, 'action must be defined!'
We’ll meet assert in Chapter 24; in short, if its expression evaluates to false, it raises an exception with an error message. Here, the expression is always false (0), so as to trigger an error message if a method is not redefined and inheritance locates the version here. Alternatively, some classes simply raise a NotImplementedError exception directly in such method stubs; we’ll study the raise statement in Chapter 24.
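For comparison, here is a sketch of the NotImplementedError variant of the same stub. It reuses the Super and Provider names from the earlier example; having delegate return the result of action is an assumption made here only so the outcome is easy to display, and the code is written in Python 3 syntax:

```python
class Super:
    def delegate(self):
        return self.action()             # Expected to be defined in a subclass
    def action(self):
        # Stub: reached only if no subclass redefines action.
        raise NotImplementedError('action must be defined!')

class Provider(Super):
    def action(self):                    # Fills in the required method
        return 'in Provider.action'

print(Provider().delegate())             # in Provider.action

try:
    Super().delegate()                   # Inheritance finds the stub in Super
except NotImplementedError as exc:
    print('caught: %s' % exc)            # caught: action must be defined!
```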
For a somewhat more realistic example of this section’s concepts in action, see the “Zoo Animal Hierarchy” exercise in Part VI Exercises and its solution in Section B.6. Such taxonomies are a traditional way to introduce OOP, but are a bit removed from most developers’ job descriptions.
We introduced operator overloading in the prior chapter; let’s fill in more details here and look at a few commonly used overloading methods. Here’s a review of the key ideas behind overloading:
Operator overloading lets classes intercept normal Python operations.
Classes can overload all Python expression operators.
Classes can also overload operations: printing, calls, qualification, etc.
Overloading makes class instances act more like built-in types.
Overloading is implemented by providing specially named class methods.
Here’s a simple example of overloading at work. When we provide specially named methods in a class, Python automatically calls them when instances of the class appear in the associated operation. For instance, the Number class in file number.py below provides a method to intercept instance construction (__init__), as well as one for catching subtraction expressions (__sub__). Special methods are the hook that lets you tie into built-in operations:
class Number:
    def __init__(self, start):                 # On Number(start)
        self.data = start
    def __sub__(self, other):                  # On instance - other
        return Number(self.data - other)       # Result is a new instance

>>> from number import Number                  # Fetch class from module.
>>> X = Number(5)                              # Number.__init__(X, 5)
>>> Y = X - 2                                  # Number.__sub__(X, 2)
>>> Y.data                                     # Y is new Number instance.
3
Just about everything you can do to built-in objects such as integers and lists has a corresponding specially named method for overloading in classes. Table 21-1 lists a few of the most common; there are many more. In fact, many overload methods come in multiple versions (e.g., __add__, __radd__, and __iadd__ for addition). See other Python books or the Python Language Reference Manual for an exhaustive list of special method names available.
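Method | Overloads | Called for
__init__ | Constructor | Object creation: Class( )
__del__ | Destructor | Object reclamation
__add__ | Operator '+' | X + Y
__or__ | Operator '|' | X | Y
__repr__, __str__ | Printing, conversions | print X, repr(X), str(X)
__call__ | Function calls | X( )
__getattr__ | Qualification | X.undefined
__setattr__ | Attribute assignment | X.any = value
__getitem__ | Indexing | X[key], for loops, in tests
__setitem__ | Index assignment | X[key] = value
__len__ | Length | len(X)
__cmp__ | Comparison | X == Y, X < Y
__lt__ | Specific comparison | X < Y
__eq__ | Specific comparison | X == Y
__radd__ | Right-side operator '+' | Noninstance + X
__iadd__ | In-place (augmented) addition | X += Y
__iter__ | Iteration contexts | for loops, in tests, others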
All overload methods have names that start and end with two
underscores, to keep them distinct from other names you define in
your classes. The mapping from special method name to expression or
operations is simply predefined by the Python language (and
documented in the standard language manual). For example, name
__add__
always maps to +
expressions by Python language definition, regardless of what an
__add__
method’s code actually
does.
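To underline that the mapping is fixed by the language while the behavior is entirely up to the class, the sketch below uses a hypothetical Recorder class whose __add__ does no arithmetic at all. It is written in Python 3 syntax:

```python
class Recorder:
    """Hypothetical class: '+' appends to a log instead of doing arithmetic."""
    def __init__(self):
        self.log = []
    def __add__(self, other):        # Run for: instance + other
        self.log.append(other)
        return self                  # Returning self allows chained + expressions

r = Recorder()
r + 'spam' + 42                      # Each + is routed to Recorder.__add__
print(r.log)                         # ['spam', 42]

try:
    r - 1                            # No __sub__ is coded, so - is unsupported
except TypeError:
    print('subtraction not supported')
```

Python routes + to __add__ without regard to what the method does; an operator with no matching method simply fails.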
All operator overloading methods are optional—if you
don’t code one, that operation is simply unsupported
by your class (and may raise an exception if attempted). Most
overloading methods are only used in advanced programs that require
objects to behave like built-ins; the __init__ constructor tends to
appear in most classes, however. We’ve already met
the __init__
initialization-time constructor
method, and a few others in Table 21-1.
Let’s explore some of the additional methods in the
table by example.
The __getitem__ method intercepts instance indexing operations. When an instance X appears in an indexing expression like X[i], Python calls a __getitem__ method inherited by the instance (if any), passing X to the first argument and the index in brackets to the second argument. For instance, the following class returns the square of an index value:
>>> class indexer:
...     def __getitem__(self, index):
...         return index ** 2
...
>>> X = indexer( )
>>> X[2]                        # X[i] calls __getitem__(X, i).
4
>>> for i in range(5):
...     print X[i],
...
0 1 4 9 16
Here’s a trick that isn’t always
obvious
to beginners, but turns out to be incredibly useful: the
for
statement works by repeatedly indexing a
sequence from zero to higher indexes, until an out-of-bounds
exception is detected. Because of that, __getitem__
also turns out to be one way to overload iteration in
Python—if defined, for
loops call the
class’s __getitem__
each time
through, with successively higher offsets. It’s a
case of “buy one, get one free”:
any built-in or user-defined object that responds to indexing also
responds to iteration:
>>> class stepper:
...     def __getitem__(self, i):
...         return self.data[i]
...
>>> X = stepper( )              # X is a stepper object.
>>> X.data = "Spam"
>>>
>>> X[1]                        # Indexing calls __getitem__.
'p'
>>> for item in X:              # for loops call __getitem__.
...     print item,             # for indexes items 0..N.
...
S p a m
In fact, it’s really a case of “buy
one, get a bunch for free”: any class that supports
for
loops automatically supports all iteration
contexts in Python, many of which we’ve seen in
earlier chapters. For example, the in
membership
test, list comprehensions, the map
built-in, list
and tuple assignments, and type constructors, will also call
__getitem__
automatically if defined:
>>> 'p' in X                        # All call __getitem__ too.
1
>>> [c for c in X]                  # List comprehension
['S', 'p', 'a', 'm']
>>> map(None, X)                    # map calls
['S', 'p', 'a', 'm']
>>> (a,b,c,d) = X                   # Sequence assignments
>>> a, c, d
('S', 'a', 'm')
>>> list(X), tuple(X), ''.join(X)
(['S', 'p', 'a', 'm'], ('S', 'p', 'a', 'm'), 'Spam')
>>> X
<__main__.stepper instance at 0x00A8D5D0>
In practice, this technique can be used to create objects that provide a sequence interface, and add logic to built-in sequence type operations; we’ll revisit this idea when extending built-in types in Chapter 23.
Today, all iteration contexts in
Python
will first try to find a __iter__
method, which
is expected to return an object that supports the new iteration
protocol. If provided, Python repeatedly calls this
object’s next
method to produce
items, until the StopIteration
exception is
raised. If no such method is found, Python falls back on the
__getitem__
scheme and repeatedly indexes by
offsets as before, until an IndexError
exception.
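The order of preference described above can be checked with two small classes. ByIndex and ByIter are hypothetical names; the first supports only indexing, while the second adds an __iter__ that deliberately produces different items so that it is easy to tell which method an iteration used. The sketch is written in Python 3 syntax:

```python
class ByIndex:
    data = 'abc'
    def __getitem__(self, i):
        return self.data[i]                # IndexError from the string ends loops

class ByIter(ByIndex):
    def __iter__(self):
        return iter(self.data.upper())     # Built-in iter returns an iterator object

print(list(ByIndex()))     # ['a', 'b', 'c']: falls back on __getitem__
print(list(ByIter()))      # ['A', 'B', 'C']: __iter__ is tried first
print(ByIter()[0])         # a: explicit indexing still uses __getitem__
```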
In the new scheme, classes implement user-defined iterators by simply implementing the iterator protocol introduced in Chapter 14 for functions. For example, the following file, iters.py, defines a user-defined iterator class that generates squares:
class Squares:
    def __init__(self, start, stop):
        self.value = start - 1
        self.stop = stop
    def __iter__(self):                    # Get iterator object
        return self
    def next(self):                        # on each for iteration.
        if self.value == self.stop:
            raise StopIteration
        self.value += 1
        return self.value ** 2

% python
>>> from iters import Squares
>>> for i in Squares(1,5):
...     print i,
...
1 4 9 16 25
Here, the iterator object is simply the instance,
self
, because the next
method
is part of this class. The end of the iteration is signaled with a
Python raise
statement (more on raising exceptions
in the next part of this book).
An equivalent coding with __getitem__ might be less natural, because the for would then iterate through offsets zero and higher; offsets passed in would be only indirectly related to the range of values produced (0..N would need to map to start..stop). Because __iter__ objects retain explicitly-managed state between next calls, they can be more general than __getitem__.
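To make the comparison concrete, here is one way such an index-based coding might look. The class name SquaresByIndex is hypothetical, and this is only a sketch of the offset mapping, written in Python 3 syntax:

```python
class SquaresByIndex:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
    def __getitem__(self, offset):
        value = self.start + offset          # Map offset 0..N onto start..stop
        if offset < 0 or value > self.stop:
            raise IndexError(offset)         # Ends for loops and other iterations
        return value ** 2

X = SquaresByIndex(1, 5)
print(list(X))       # [1, 4, 9, 16, 25]
print(X[1])          # 4: the item at offset 1 is 2 ** 2
print(list(X))       # [1, 4, 9, 16, 25] again: no state is consumed
```

The translation from offsets to values has to be coded by hand, but because no traversal state is stored in the object, it can be indexed at random and iterated any number of times.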
On the other hand, __iter__-based iterators can sometimes be more complex and less convenient than __getitem__. They are really designed for iteration, not random indexing. In fact, they don’t overload the indexing expression at all:
>>> X = Squares(1,5)
>>> X[1]
AttributeError: Squares instance has no attribute '__getitem__'
The __iter__ scheme implements the other iteration contexts we saw in action for __getitem__ (membership tests, type constructors, sequence assignment, and so on). However, unlike __getitem__, __iter__ is designed for a single traversal, not many. For example, the Squares class is a one-shot iteration; once iterated, it’s empty. You need to make a new iterator object for each new iteration:
>>> X = Squares(1,5)
>>> [n for n in X]                  # Exhausts items
[1, 4, 9, 16, 25]
>>> [n for n in X]                  # Now it's empty.
[]
>>> [n for n in Squares(1,5)]
[1, 4, 9, 16, 25]
>>> list(Squares(1,3))
[1, 4, 9]
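If repeated traversals of one object are needed, one possible workaround (not something this chapter’s Squares class does) is to have __iter__ return a new iterator object each time, instead of self. The class names MultiSquares and SquaresIter below are hypothetical. The sketch is written for Python 3, where the method is spelled __next__; the next alias is included on the assumption that older releases look for that name instead:

```python
class SquaresIter:
    """Iterator object: holds the state of a single traversal."""
    def __init__(self, start, stop):
        self.value = start - 1
        self.stop = stop
    def __iter__(self):
        return self
    def __next__(self):
        if self.value == self.stop:
            raise StopIteration
        self.value += 1
        return self.value ** 2
    next = __next__                      # Alias for Pythons that call next

class MultiSquares:
    """Iterable object: hands out a fresh iterator for each traversal."""
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
    def __iter__(self):
        return SquaresIter(self.start, self.stop)

X = MultiSquares(1, 5)
print([n for n in X])    # [1, 4, 9, 16, 25]
print([n for n in X])    # [1, 4, 9, 16, 25]: each loop gets its own iterator
```

The cost is a second class; the benefit is that each traversal keeps its own position.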
For more details on iterators, see Chapter 14. Notice that this example would probably be simpler if coded with generator functions—a topic introduced in Chapter 14 and related to iterators:
>>> from __future__ import generators      # Need in 2.2
>>>
>>> def gsquares(start, stop):
...     for i in range(start, stop+1):
...         yield i ** 2
...
>>> for i in gsquares(1, 5):
...     print i,
...
1 4 9 16 25
Unlike the class, the function automatically saves its state between
iterations. Classes may be better at modeling more complex
iterations, though, especially when they can benefit from inheritance
hierarchies. Of course, for this artificial example, you might as
well skip both techniques, and simply use a for
loop, map
, or list comprehension, to build the
list all at once; the best and fastest way to accomplish a task in
Python is often also the simplest:
>>> [x ** 2 for x in range(1, 6)]
[1, 4, 9, 16, 25]
The
__getattr__
method
intercepts attribute qualifications. More specifically,
it’s called with the attribute name as a string,
whenever you try to qualify an instance on an
undefined (nonexistent) attribute name. It is
not called if Python can find the attribute using its inheritance
tree-search procedure. Because of its behavior, __getattr__
is useful as a hook for responding to attribute requests
in a generic fashion. For example:
>>> class empty:
...     def __getattr__(self, attrname):
...         if attrname == "age":
...             return 40
...         else:
...             raise AttributeError, attrname
...
>>> X = empty( )
>>> X.age
40
>>> X.name
...error text omitted...
AttributeError: name
Here, the empty class and its instance X have no real attributes of their own, so the access to X.age gets routed to the __getattr__ method; self is assigned the instance (X), and attrname is assigned the undefined attribute name string ("age"). The class makes age look like a real attribute by returning a real value as the result of the X.age qualification expression (40). In effect, age becomes a dynamically computed attribute.
For other attributes the class doesn’t know how to
handle, it raises the built-in AttributeError
exception, to tell Python that this is a bona fide undefined name;
asking for X.name
triggers the error.
You’ll see __getattr__
again
when we show delegation and properties at work in the next two
chapters, and we will say more about exceptions in Part VII.
A related overloading method, __setattr__, intercepts all attribute assignments. If this method is defined, self.attr=value becomes self.__setattr__('attr',value). This is a bit more tricky to use, because assigning to any self attributes within __setattr__ calls __setattr__ again, causing an infinite recursion loop (and eventually, a stack overflow exception!). If you want to use this method, be sure that it assigns any instance attributes by indexing the attribute dictionary, discussed in the next section. Use self.__dict__['name']=x, not self.name=x:
>>> class accesscontrol:
...     def __setattr__(self, attr, value):
...         if attr == 'age':
...             self.__dict__[attr] = value
...         else:
...             raise AttributeError, attr + ' not allowed'
...
>>> X = accesscontrol( )
>>> X.age = 40                      # Calls __setattr__
>>> X.age
40
>>> X.name = 'mel'
...text omitted...
AttributeError: name not allowed
These two attribute access overloading methods tend to play highly specialized roles, some of which we’ll meet later in this book; in general, they allow you to control or specialize access to attributes in your objects.
The next example exercises the __init__ constructor and the __add__ overload methods we’ve already seen, but also defines a __repr__ that returns a string representation for instances. String formatting is used to convert the managed self.data object to a string. If defined, __repr__, or its sibling __str__, is called automatically when class instances are printed or converted to strings; they allow you to define a better print string for your objects than the default instance display.
>>> class adder:
...     def __init__(self, value=0):
...         self.data = value                  # Initialize data.
...     def __add__(self, other):
...         self.data += other                 # Add other in-place.
...
>>> class addrepr(adder):                      # Inherit __init__, __add__.
...     def __repr__(self):                    # Add string representation.
...         return 'addrepr(%s)' % self.data   # Convert to string as code.
...
>>> x = addrepr(2)              # Runs __init__
>>> x + 1                       # Runs __add__
>>> x                           # Runs __repr__
addrepr(3)
>>> print x                     # Runs __repr__
addrepr(3)
>>> str(x), repr(x)             # Runs __repr__
('addrepr(3)', 'addrepr(3)')
So why two display methods? Roughly, __str__
is
tried first for user-friendly displays, such as the
print
statement and the str
built-in function. The __repr__
method should in
principle return a string that could be used as executable code to
recreate the object, and is used for interactive prompt echoes and
the repr
function. Python falls back on __repr__
if no __str__
is present, but
not vice-versa:
>>> class addstr(adder):
...     def __str__(self):                          # __str__ but no __repr__
...         return '[Value: %s]' % self.data        # Convert to nice string.
...
>>> x = addstr(3)
>>> x + 1
>>> x                                   # Default repr
<__main__.addstr instance at 0x00B35EF0>
>>> print x                             # Runs __str__
[Value: 4]
>>> str(x), repr(x)
('[Value: 4]', '<__main__.addstr instance at 0x00B35EF0>')
Because of this, __repr__ may be best if you want a single display for all contexts. By defining both methods, though, you can support different displays in different contexts:
>>> class addboth(adder):
...     def __str__(self):
...         return '[Value: %s]' % self.data  # User-friendly string
...     def __repr__(self):
...         return 'addboth(%s)' % self.data  # As-code string
...
>>> x = addboth(4)
>>> x + 1
>>> x  # Runs __repr__
addboth(5)
>>> print x  # Runs __str__
[Value: 5]
>>> str(x), repr(x)
('[Value: 5]', 'addboth(5)')
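The one-way fallback between the two display methods can also be checked without looking at printed output. The following is only a minimal sketch: the class names OnlyRepr and OnlyStr are made up for illustration, and the code avoids the print statement so that it should behave the same under Python 3 as under the Python 2 used in this book.

```python
class OnlyRepr:
    def __init__(self, value):
        self.data = value
    def __repr__(self):
        return 'OnlyRepr(%s)' % self.data  # As-code string

class OnlyStr:
    def __init__(self, value):
        self.data = value
    def __str__(self):
        return '[Value: %s]' % self.data  # User-friendly string

a = OnlyRepr(5)
b = OnlyStr(5)

# str falls back on __repr__ when no __str__ is defined...
repr_fallback = (str(a) == repr(a) == 'OnlyRepr(5)')

# ...but repr never falls back on __str__; it uses the default display.
no_reverse_fallback = (str(b) == '[Value: 5]' and repr(b) != str(b))
```

Both repr_fallback and no_reverse_fallback should come out true here, which is why a lone __repr__ covers every display context while a lone __str__ does not.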
Technically, the __add__ method in the prior example does not support the use of instance objects on the right side of the + operator. To implement such expressions, and hence support commutative-style operators, code the __radd__ method as well. Python calls __radd__ only when the object on the right of the + is your class instance, but the object on the left is not an instance of your class. The __add__ method for the object on the left is called instead in all other cases:
>>> class Commuter:
...     def __init__(self, val):
...         self.val = val
...     def __add__(self, other):
...         print 'add', self.val, other
...     def __radd__(self, other):
...         print 'radd', self.val, other
...
>>> x = Commuter(88)
>>> y = Commuter(99)
>>> x + 1  # __add__: instance + noninstance
add 88 1
>>> 1 + y  # __radd__: noninstance + instance
radd 99 1
>>> x + y  # __add__: instance + instance
add 88 <__main__.Commuter instance at 0x0086C3D8>
Notice how the order is reversed in __radd__: self is really on the right of the +, and other is on the left. Every binary operator has a similar right-side overloading method (e.g., __mul__ and __rmul__). Typically, a right-side method like __radd__ just converts if needed and reruns a + to trigger __add__, where the main logic is coded. Also note that x and y are instances of the same class here; when instances of different classes appear mixed in an expression, Python prefers the class of the one on the left.
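As a rough illustration of the "rerun a + to trigger __add__" pattern, here is a variation on the Commuter class that returns new instances instead of printing. It is a sketch rather than a recommended design: unwrapping other with isinstance is our own assumption about what "converts if needed" might mean, and simply swapping the operands is only safe because the underlying addition is commutative.

```python
class Commuter:
    def __init__(self, val):
        self.val = val
    def __add__(self, other):
        # Main logic lives here: unwrap another Commuter if needed.
        if isinstance(other, Commuter):
            other = other.val
        return Commuter(self.val + other)
    def __radd__(self, other):
        # self is on the right of +; swap operands and rerun + to reach __add__.
        return self + other

x = Commuter(88)
y = Commuter(99)
left = x + 1   # __add__: instance + noninstance
right = 1 + y  # __radd__, which reruns + and so triggers __add__
both = x + y   # __add__: instance + instance
```

After this runs, left.val is 89, right.val is 100, and both.val is 187.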
Right-side methods are an advanced topic, and tend to be fairly rarely used; you only code them when you need operators to be commutative, and then only if you need to support operators at all. For instance, a Vector class may use these tools, but an Employee or Button class probably would not.
The __call__ method is called when your instance is called. No, this isn’t a circular definition: if defined, Python runs a __call__ method for function call expressions applied to your instances. This allows class instances to emulate the look and feel of things like functions:
>>> class Prod:
...     def __init__(self, value):
...         self.value = value
...     def __call__(self, other):
...         return self.value * other
...
>>> x = Prod(2)
>>> x(3)
6
>>> x(4)
8
In this example, the __call__ may seem a bit gratuitous; a simple method provides similar utility:
>>> class Prod:
...     def __init__(self, value):
...         self.value = value
...     def comp(self, other):
...         return self.value * other
...
>>> x = Prod(3)
>>> x.comp(3)
9
>>> x.comp(4)
12
However, __call__ can become more useful when interfacing with APIs that expect functions. For example, the Tkinter GUI toolkit we’ll meet later in this book allows you to register functions as event handlers (a.k.a., callbacks); when events occur, Tkinter calls the registered object. If you want an event handler to retain state between events, you can either register a class’s bound method, or an instance that conforms to the expected interface with __call__. In our code, both x.comp from the second example and x from the first can pass as function-like objects this way. More on bound methods in the next chapter.
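To make the callback idea concrete without bringing in a GUI, the sketch below uses a made-up fire_event function as a stand-in for a toolkit; it is not part of Tkinter or any real API, and simply calls whatever object was registered. The Prod class here combines the two earlier versions so that both styles can be compared side by side.

```python
def fire_event(handler, event):
    # Hypothetical stand-in for a toolkit: call whatever was registered.
    return handler(event)

class Prod:
    def __init__(self, value):
        self.value = value  # State retained between events
    def __call__(self, other):
        return self.value * other  # Runs for instance(other)
    def comp(self, other):
        return self.value * other  # Runs for instance.comp(other)

x = Prod(2)
via_instance = fire_event(x, 3)    # Registers a callable instance
via_bound = fire_event(x.comp, 4)  # Registers a bound method
```

Either way, fire_event neither knows nor cares that it was handed something other than a plain function; here via_instance is 6 and via_bound is 8.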
The __init__ constructor is called whenever an instance is generated. Its counterpart, the destructor method __del__, is run automatically when an instance’s space is being reclaimed (i.e., at “garbage collection” time):
>>> class Life:
...     def __init__(self, name='unknown'):
...         print 'Hello', name
...         self.name = name
...     def __del__(self):
...         print 'Goodbye', self.name
...
>>> brian = Life('Brian')
Hello Brian
>>> brian = 'loretta'
Goodbye Brian
Here, when brian is assigned a string, we lose the last reference to the Life instance, and so trigger its destructor method. This works, and may be useful to implement some cleanup activities such as terminating server connections. However, destructors are not as commonly used in Python as in some OOP languages, for a number of reasons.
For one thing, because Python automatically reclaims all space held by an instance when the instance is reclaimed, destructors are not necessary for space management.[5] For another, because you cannot always easily predict when an instance will be reclaimed, it’s often better to code termination activities in an explicitly-called method (or try/finally statement, described in the next part of the book); in some cases, there may be lingering references to your objects in system tables, which prevent destructors from running.
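As a hedged sketch of the explicitly-called alternative, the following pairs a close method with a try/finally statement. The Connection class, its send and close methods, and the is_open flag are all hypothetical names invented for this illustration; a real server connection would do actual work in their place.

```python
class Connection:
    def __init__(self, name):
        self.name = name
        self.is_open = True
    def send(self, data):
        if not self.is_open:
            raise ValueError('connection closed')
        return '%s <- %s' % (self.name, data)
    def close(self):
        self.is_open = False  # Termination logic goes here, not in __del__

conn = Connection('server')
try:
    reply = conn.send('spam')
finally:
    conn.close()  # Runs whether or not send raised an exception
```

Because close is called at a known point in the program, cleanup no longer depends on when, or whether, the instance happens to be reclaimed.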
That’s as many overloading examples as we have space for here. Most work similarly to ones we’ve already seen, and all are just hooks for intercepting built-in type operations; some overload methods have unique argument lists or return values. You’ll see a few others in action later in the book, but for complete coverage, we’ll defer to other documentation sources.
Now that we’ve seen class and instance objects, the Python namespace story is complete; for reference, let’s quickly summarize all the rules used to resolve names. The first things you need to remember are that qualified and unqualified names are treated differently, and that some scopes serve to initialize object namespaces:
Unqualified names (e.g., X) deal with scopes.
Qualified attribute names (e.g., object.X) use object namespaces.
Some scopes initialize object namespaces (modules and classes).
Unqualified names follow the LEGB lexical scoping rules outlined for functions in Chapter 13:
X = value
Makes names local: creates or changes name X in the current local scope, unless declared global.
X
Looks for name X in the current local scope, then any and all enclosing functions, then the current global scope, then the built-in scope.
Qualified names refer to attributes of specific objects and obey the rules for modules and classes. For class and class instance objects, the reference rules are augmented to include the inheritance search procedure:
object.X = value
Creates or alters the attribute name X in the namespace of the object being qualified, and no other. Inheritance tree climbing only happens on attribute reference, not on attribute assignment.
object.X
For class-based objects, searches for the attribute name X in the object, then in all accessible classes above it, using the inheritance search procedure. For non-class objects such as modules, fetches X from object directly.
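The asymmetry between qualified reference and qualified assignment is easy to see in a few lines. This is just a small sketch with made-up class names (Super and Sub), written without print so it should run under Python 3 as well as Python 2:

```python
class Super:
    X = 1  # Class attribute, shared by all instances

class Sub(Super):
    pass

a = Sub()
b = Sub()

inherited = a.X  # Reference: found in Super by the inheritance search
a.X = 99         # Assignment: creates X in a's own namespace, and no other

results = (inherited, a.X, b.X, Sub.X, Super.X)
```

Here results is (1, 99, 1, 1, 1): the assignment changed only the object being qualified, while b and both classes still see the original class attribute.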
With distinct search procedures for qualified and unqualified names, and multiple lookup layers for both, it can sometimes be confusing to know where a name will wind up going. In Python, the place where you assign a name is crucial—it fully determines which scope or which object a name will reside in. File manynames.py illustrates and summarizes how this translates to code:
X = 11  # Module (global) name/attribute

class c:
    X = 22  # Class attribute
    def m(self):
        X = 33  # Local variable in method
        self.X = 44  # Instance attribute

def f( ):
    X = 55  # Local variable in function

def g( ):
    print X  # Access module X (11)
Because this file assigns the same name, X, in five different locations, there are actually five completely different Xs in this program. From top to bottom, the assignments to X names generate a module attribute, a class attribute, a local variable in a method, an instance attribute, and a local in a function.
You should take enough time to study this example carefully, because it collects ideas we’ve been exploring throughout the last few parts of this book. When it makes sense to you, you will have achieved Python namespace nirvana. Of course, an alternative route to nirvana is to simply run this and see what happens. Here’s the remainder of this file, which makes an instance, and prints all the Xs that it can fetch:
obj = c( )
obj.m( )

print obj.X  # 44: instance
print c.X  # 22: class (a.k.a. obj.X if no X in instance)
print X  # 11: module (a.k.a. manynames.X outside file)

#print c.m.X  # FAILS: only visible in method
#print f.X  # FAILS: only visible in function
Notice that we can go through the class to fetch its attribute (c.X), but can never fetch local variables in functions or methods from outside their def statements. Locals are only visible to other code within the def, and in fact only live in memory while a call to the function or method is executing.
In Chapter 16, we learned that module namespaces are actually implemented as dictionaries and exposed with the built-in __dict__ attribute. The same holds for class and instance objects: attribute qualification is really a dictionary indexing operation internally, and attribute inheritance is just a matter of searching linked dictionaries. In fact, instance and class objects are mostly just dictionaries with links inside Python. Python exposes these dictionaries, as well as the links between them, for use in advanced roles (e.g., for coding tools).
To help you understand how attributes work internally, let’s work through an interactive session that traces the way namespace dictionaries grow when classes are involved. First, let’s define a superclass and a subclass with methods that will store data in their instances:
>>> class super:
...     def hello(self):
...         self.data1 = 'spam'
...
>>> class sub(super):
...     def hola(self):
...         self.data2 = 'eggs'
...
When we make an instance of the subclass, the instance starts out with an empty namespace dictionary, but has links back to the class for the inheritance search to follow. In fact, the inheritance tree is explicitly available in special attributes, which you can inspect: instances have a __class__ attribute that links to their class, and classes have a __bases__ attribute that is a tuple containing links to higher superclasses:
>>> X = sub( )
>>> X.__dict__
{ }
>>> X.__class__
<class __main__.sub at 0x00A48448>
>>> sub.__bases__
(<class __main__.super at 0x00A3E1C8>,)
>>> super.__bases__
( )
As classes assign to self attributes, they populate the instance object; that is, attributes wind up in the instance’s attribute namespace dictionary, not in the classes. Instance object namespaces record data that can vary from instance to instance, and self is a hook into that namespace:
>>> Y = sub( )
>>> X.hello( )
>>> X.__dict__
{'data1': 'spam'}
>>> X.hola( )
>>> X.__dict__
{'data1': 'spam', 'data2': 'eggs'}
>>> sub.__dict__
{'__module__': '__main__', '__doc__': None, 'hola': <function hola at 0x00A47048>}
>>> super.__dict__
{'__module__': '__main__', 'hello': <function hello at 0x00A3C5A8>, '__doc__': None}
>>> sub.__dict__.keys( ), super.__dict__.keys( )
(['__module__', '__doc__', 'hola'], ['__module__', 'hello', '__doc__'])
>>> Y.__dict__
{ }
Notice the extra underscore names in the class dictionaries; these are set by Python automatically. Most are not used in typical programs, but some are utilized by tools (e.g., __doc__ holds the docstrings discussed in Chapter 11).
Also observe that Y, a second instance made at the start of this series, still has an empty namespace dictionary at the end, even though X’s has been populated by assignments in methods. Each instance is an independent namespace dictionary, which starts out empty, and can record completely different attributes than other instances of the same class.
Now, because attributes are actually dictionary keys inside Python, there are really two ways to fetch and assign their values—by qualification, or key indexing:
>>> X.data1, X.__dict__['data1']
('spam', 'spam')
>>> X.data3 = 'toast'
>>> X.__dict__
{'data1': 'spam', 'data3': 'toast', 'data2': 'eggs'}
>>> X.__dict__['data3'] = 'ham'
>>> X.data3
'ham'
This equivalence only applies to attributes attached to the instance, though. Because attribute qualification also performs inheritance, it can access attributes that namespace dictionary indexing cannot. The inherited attribute X.hello, for instance, cannot be had by X.__dict__['hello'].
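To see what "searching linked dictionaries" might look like if spelled out by hand, here is a simplified sketch. The lookup and class_lookup functions are our own illustrative helpers, not anything Python provides, and they search depth first and left to right; Python’s real search can differ for some multiple inheritance trees, and a real X.hello fetch returns a bound method rather than the plain function stored in the class dictionary.

```python
class Super:
    def hello(self):
        self.data1 = 'spam'

class Sub(Super):
    def hola(self):
        self.data2 = 'eggs'

def class_lookup(cls, name):
    # Search the class, then its superclasses, depth first, left to right.
    if name in cls.__dict__:
        return cls.__dict__[name]
    for supercls in cls.__bases__:
        try:
            return class_lookup(supercls, name)
        except AttributeError:
            pass  # Not in that branch; try the next superclass
    raise AttributeError(name)

def lookup(inst, name):
    # Check the instance's own dictionary first, then climb to its class.
    if name in inst.__dict__:
        return inst.__dict__[name]
    return class_lookup(inst.__class__, name)

X = Sub()
X.hello()
found_data = lookup(X, 'data1')      # Lives in the instance dictionary
found_func = lookup(X, 'hello')      # Lives in Super's dictionary
in_instance = 'hello' in X.__dict__  # False: inherited, not attached
```

Indexing X.__dict__ alone finds data1 but not hello; following the __class__ and __bases__ links is what makes the inherited name reachable.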
And finally, here is the built-in dir function we met in Chapter 3 and Chapter 11 at work on class and instance objects. This function works on anything with attributes: dir(object) is similar to an object.__dict__.keys( ) call. Notice, though, that dir sorts its list, and includes some system attributes; as of Python 2.2, dir also collects inherited attributes automatically.[6]
>>> X.__dict__
{'data1': 'spam', 'data3': 'ham', 'data2': 'eggs'}
>>> X.__dict__.keys( )
['data1', 'data3', 'data2']
>>> dir(X)
['__doc__', '__module__', 'data1', 'data2', 'data3', 'hello', 'hola']
>>> dir(sub)
['__doc__', '__module__', 'hello', 'hola']
>>> dir(super)
['__doc__', '__module__', 'hello']
Experiment with these special attributes on your own to get a better feel for how namespaces actually do their attribute business. Even if you will never use these in the kinds of programs you write, it helps demystify the notion of namespaces in general when you see that they are just normal dictionaries.
The prior section introduced the special __class__ and __bases__ instance and class attributes without really telling why you might care about them. In short, they allow you to inspect inheritance hierarchies within your own code. For example, they can be used to display a class tree, as in the following example, file classtree.py:
def classtree(cls, indent):
    print '.'*indent, cls.__name__  # Print class name here.
    for supercls in cls.__bases__:  # Recur to all superclasses
        classtree(supercls, indent+3)  # May visit super > once

def instancetree(inst):
    print 'Tree of', inst  # Show instance.
    classtree(inst.__class__, 3)  # Climb to its class.

def selftest( ):
    class A: pass
    class B(A): pass
    class C(A): pass
    class D(B,C): pass
    class E: pass
    class F(D,E): pass
    instancetree(B( ))
    instancetree(F( ))

if __name__ == '__main__': selftest( )
The classtree function in this script is recursive: it prints a class’s name using __name__, and then climbs up to superclasses by calling itself. This allows the function to traverse arbitrarily shaped class trees; the recursion climbs to the top, and stops at root superclasses that have empty __bases__. Most of this file is self-test code; when run standalone, it builds a tree of empty classes, makes two instances from it, and prints their class tree structures:
% python classtree.py
Tree of <__main__.B instance at 0x00ACB438>
... B
...... A
Tree of <__main__.F instance at 0x00AC4DA8>
... F
...... D
......... B
............ A
......... C
............ A
...... E
Here, indentation marked by periods is used to denote class tree height. We can import these functions anywhere we want a quick class tree display:
>>> class Emp: pass
>>> class Person(Emp): pass
>>> bob = Person( )
>>> import classtree
>>> classtree.instancetree(bob)
Tree of <__main__.Person instance at 0x00AD34E8>
... Person
...... Emp
Of course, we could improve on this output format, and perhaps even sketch it in a GUI display. Whether or not you will ever code or use such tools, this example demonstrates one of the many ways that we can make use of special attributes that expose interpreter internals. We’ll meet another when we code a general purpose attribute listing class, in Section 22.6 of Chapter 22.
[1] If you’ve used C++, you may recognize this as similar to the notion of C++’s “static” class data—members that are stored in the class, independent of instances. In Python, it’s nothing special: all class attributes are just names assigned in the class statement, whether they happen to reference functions (C++’s “methods”) or something else (C++’s “members”).
[2] Unless the attribute assignment operation has been redefined by a class with the __setattr__ operator overloading method to do something unique.
[3] On a somewhat related note, you can also code multiple __init__ methods within the same single class, but only the last definition will be used; see Chapter 22 for more details.
[4] This description isn’t 100% complete, because instance and class attributes can also be created by assigning to objects outside class statements. But that’s much less common and sometimes more error prone (changes aren’t isolated to class statements). In Python all attributes are always accessible by default; we talk more about name privacy in Chapter 23.
[5] In the current C implementation of Python, you also don’t need to close file objects held by the instance in destructors, because they are automatically closed when reclaimed. However, as mentioned in Chapter 7, it’s better to explicitly call file close methods, because auto-close-on-reclaim is a feature of the implementation, not the language itself (and can vary under Jython).
[6] The content of attribute dictionaries and dir call results is prone to change over time. For example, because Python now allows built-in types to be subclassed like classes, the contents of dir results for built-in types expanded to include operator overloading methods. In general, attribute names with leading and trailing double underscores are interpreter-specific. More on type subclasses in Chapter 23.