Part VI concludes our look at OOP in Python by presenting a few more advanced class-related topics, along with the gotchas and exercises for this part of the book. We encourage you to do the exercises, to help cement the ideas we’ve studied. We also suggest working on or studying larger OOP Python projects as a supplement to this book. Like much in computing, the benefits of OOP tend to become more apparent with practice.
Besides implementing new kinds of objects, classes are sometimes used to extend the functionality of Python’s built-in types in order to support more exotic data structures. For instance, to add queue insert and delete methods to lists, you can code classes that wrap (embed) a list object, and export insert and delete methods that process the list specially, like the delegation technique studied in Chapter 22. As of Python 2.2, you can also use inheritance to specialize built-in types. The next two sections show both techniques in action.
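As a rough illustration of the wrapping approach just described, the following sketch embeds a list and exports queue-style insert and delete methods. The class name ListQueue and its method names are hypothetical, chosen only for this example, and the code is written to run under current Python 3 releases rather than the 2.2 release that the listings in this chapter target:

```python
class ListQueue:
    """Hypothetical wrapper: embeds a list and exposes queue operations."""
    def __init__(self, start=()):
        self.data = list(start)        # the wrapped (embedded) list

    def insert(self, item):            # add at the back of the queue
        self.data.append(item)

    def delete(self):                  # remove from the front of the queue
        return self.data.pop(0)

    def __len__(self):                 # len(queue) delegates to the list
        return len(self.data)

    def __repr__(self):
        return 'ListQueue:%r' % (self.data,)

q = ListQueue('ab')
q.insert('c')
first = q.delete()                     # items leave in arrival order
```

Clients see only insert and delete; the list itself stays an implementation detail that could later be swapped for another structure.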
Remember those set functions we wrote in Part IV? Here’s what they look like brought back to life as a Python class. The following example, file setwrapper.py, implements a new set object type, by moving some of the set functions to methods, and adding some basic operator overloading. For the most part, this class just wraps a Python list with extra set operations. Because it’s a class, it also supports multiple instances and customization by inheritance in subclasses.
class Set:
    def __init__(self, value = [ ]):    # Constructor
        self.data = [ ]                 # Manages a list
        self.concat(value)

    def intersect(self, other):         # other is any sequence.
        res = [ ]                       # self is the subject.
        for x in self.data:
            if x in other:              # Pick common items.
                res.append(x)
        return Set(res)                 # Return a new Set.

    def union(self, other):             # other is any sequence.
        res = self.data[:]              # Copy of my list
        for x in other:                 # Add items in other.
            if not x in res:
                res.append(x)
        return Set(res)

    def concat(self, value):            # value: list, Set...
        for x in value:                 # Removes duplicates
            if not x in self.data:
                self.data.append(x)

    def __len__(self):          return len(self.data)         # len(self)
    def __getitem__(self, key): return self.data[key]         # self[i]
    def __and__(self, other):   return self.intersect(other)  # self & other
    def __or__(self, other):    return self.union(other)      # self | other
    def __repr__(self):         return 'Set:' + `self.data`   # Print
By overloading indexing, the set class can often masquerade as a real list. Since you will interact with and extend this class in an exercise at the end of this chapter, we won’t say much more about this code until Appendix B.
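To see why overloading indexing goes a long way toward list-like behavior, here is a small sketch, separate from setwrapper.py, that assumes a hypothetical Letters class defining only __getitem__ and __len__. It is written for Python 3, where iteration and membership tests still fall back on integer indexing when no other protocol is defined:

```python
class Letters:
    """Hypothetical wrapper used only to show what __getitem__ enables."""
    def __init__(self, text):
        self.data = list(text)

    def __getitem__(self, key):        # self[i] and self[i:j]
        return self.data[key]

    def __len__(self):                 # len(self)
        return len(self.data)

s = Letters('spam')
by_index = s[1]                        # plain indexing
by_slice = s[1:3]                      # a slice object is passed as the key
looped = [c.upper() for c in s]        # iteration falls back on __getitem__
member = 'a' in s                      # membership falls back on iteration
```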
Beginning with Python 2.2, all the built-in types can be subclassed directly. Type conversion functions such as list, str, dict, and tuple have become built-in type names; although this is transparent to your script, a type conversion call (e.g., list('spam')) is now really an invocation of a type’s object constructor.
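A few calls make the point concrete. This is only a sketch, written for Python 3, and the variable names are arbitrary; each call below invokes a type object as a constructor:

```python
# Type names are ordinary objects that can be called to build instances.
letters = list('spam')                   # builds a new list from any iterable
same_type = type(letters) is list        # the name list is the type itself
pairs = dict([('a', 1), ('b', 2)])       # dict built from key/value pairs
text = str(3.5)                          # string form of an object
frozen = tuple(letters)                  # tuple built from another sequence
```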
This change allows us to customize or extend the behavior of built-in types with user-defined class statements: simply subclass the new type names to customize them. Instances of your type subclasses can be used anywhere that the original built-in type can appear. For example, suppose you have trouble getting used to the fact that Python list offsets begin at 0 instead of 1. Not to worry: you can always code your own subclass that customizes this core behavior of lists. File typesubclass.py shows how:
# Subclass built-in list type/class.
# Map 1..N to 0..N-1; call back to built-in version.

class MyList(list):
    def __getitem__(self, offset):
        print '(indexing %s at %s)' % (self, offset)
        return list.__getitem__(self, offset - 1)

if __name__ == '__main__':
    print list('abc')
    x = MyList('abc')               # __init__ inherited from list
    print x                         # __repr__ inherited from list

    print x[1]                      # MyList.__getitem__
    print x[3]                      # Customizes list superclass method

    x.append('spam'); print x       # Attributes from list superclass
    x.reverse( );     print x
In this file, the MyList subclass extends the built-in list’s __getitem__ indexing method only, in order to map indexes 1 to N back to the required 0 to N-1. All it really does is decrement the index submitted, and call back to the superclass’s version of indexing, but it’s enough to do the trick:
% python typesubclass.py
['a', 'b', 'c']
['a', 'b', 'c']
(indexing ['a', 'b', 'c'] at 1)
a
(indexing ['a', 'b', 'c'] at 3)
c
['a', 'b', 'c', 'spam']
['spam', 'c', 'b', 'a']
This output also includes tracing text the class prints on indexing. Whether or not changing indexing this way is a good idea in general is another issue; users of your MyList class may very well be confused by such a core departure from Python sequence behavior. The fact that you can customize built-in types this way can be a powerful tool in general, though.
For instance, this coding pattern gives rise to an alternative way to code sets: as a subclass of the built-in list type, rather than a standalone class that manages an embedded list object. The following class, coded in file setsubclass.py, customizes lists to add just the methods and operators related to set processing; because all other behavior is inherited from the built-in list superclass, this makes for a shorter and simpler alternative:
class Set(list):
    def __init__(self, value = [ ]):     # Constructor
        list.__init__(self)              # Initialize the list superclass part
        self.concat(value)               # Copies mutable defaults

    def intersect(self, other):          # other is any sequence.
        res = [ ]                        # self is the subject.
        for x in self:
            if x in other:               # Pick common items.
                res.append(x)
        return Set(res)                  # Return a new Set.

    def union(self, other):              # other is any sequence.
        res = Set(self)                  # Copy me and my list.
        res.concat(other)
        return res

    def concat(self, value):             # value: list, Set...
        for x in value:                  # Removes duplicates
            if not x in self:
                self.append(x)

    def __and__(self, other): return self.intersect(other)
    def __or__(self, other):  return self.union(other)
    def __repr__(self):       return 'Set:' + list.__repr__(self)

if __name__ == '__main__':
    x = Set([1,3,5,7])
    y = Set([2,1,4,5,6])
    print x, y, len(x)
    print x.intersect(y), y.union(x)
    print x & y, x | y
    x.reverse( ); print x
Here is the output of this script’s self-test code at the end of the file. Because subclassing core types is an advanced feature, we will omit further details here, but invite you to trace through these results in the code, to study its behavior:
% python setsubclass.py
Set:[1, 3, 5, 7] Set:[2, 1, 4, 5, 6] 4
Set:[1, 5] Set:[2, 1, 4, 5, 6, 3, 7]
Set:[1, 5] Set:[1, 3, 5, 7, 2, 4, 6]
Set:[7, 5, 3, 1]
There are more efficient ways to implement sets with dictionaries in Python, which replace the linear scans in the set implementations we’ve shown with dictionary index operations (hashing), and so run much quicker. (For more details, see Programming Python, Second Edition [O’Reilly].) If you’re interested in sets, also see the new sets module that was added in the Python 2.3 release; this module provides a Set object and set operations as ready-made tools. Sets are fun to experiment with, but coding them yourself is no longer strictly required as of Python 2.3.
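The dictionary-based idea mentioned above can be sketched roughly as follows. This is not the implementation from Programming Python or from the standard library; DictSet and its members method are hypothetical names, items are assumed to be hashable, and the code is written for Python 3:

```python
class DictSet:
    """Hypothetical set that stores its items as dictionary keys."""
    def __init__(self, value=()):
        self.data = {}                       # item -> placeholder value
        self.concat(value)

    def concat(self, value):                 # duplicates collapse onto one key
        for x in value:
            self.data[x] = None

    def intersect(self, other):              # one hashed lookup per item
        return DictSet([x for x in other if x in self.data])

    def union(self, other):
        res = DictSet(self.data)             # iterating a dict yields its keys
        res.concat(other)
        return res

    def __len__(self):
        return len(self.data)

    def __contains__(self, item):            # 'item in s' is a dictionary lookup
        return item in self.data

    def members(self):                       # sorted only to give a stable order
        return sorted(self.data)

x = DictSet([1, 3, 5, 7])
common = x.intersect([2, 1, 4, 5, 6]).members()
merged = x.union([2, 1, 4, 5, 6]).members()
```

The trade-off is that items must be hashable and no insertion order is promised, whereas the list-based versions accept any objects that support equality tests.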
For another type subclass example, see the implementation of the new bool type in Python 2.3: as mentioned earlier, bool is a subclass of int, with two instances True and False that behave like integers 1 and 0, but inherit custom string representation methods that display their names.
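The relationship between bool and int is easy to check interactively. The following lines are a small sketch for Python 3, with arbitrary variable names:

```python
is_subclass = issubclass(bool, int)     # bool derives from int
total = True + True                     # arithmetic treats True as 1
shown = (repr(True), repr(False))       # custom display, not '1' and '0'
as_index = ['no', 'yes'][True]          # usable wherever an int is expected
```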
In Part IV, we learned that every name assigned at the top level of a file is exported by a module. By default, the same holds for classes—data hiding is a convention, and clients may fetch or change any class or instance attribute they like. In fact, attributes are all “public” and “virtual” in C++ terms; they’re all accessible everywhere and all looked up dynamically at runtime.[1]
That’s still true today. However, Python also includes the notion of name “mangling” (i.e., expansion), to localize some names in classes. This is sometimes misleadingly called private attributes—really, it’s just a way to localize a name to the class that created it, and does not prevent access by code outside the class. That is, this feature is mostly intended to avoid namespace collisions in instances, not to restrict access to names in general.
Pseudo-private names are an advanced feature, entirely optional, and probably won’t be very useful until you start writing large class hierarchies in multi-programmer projects. But because you may see this feature in other people’s code, you need to be somewhat aware of it even if you don’t use it in your own.
Here’s how name mangling works. Names inside a class statement that start with two underscores (and don’t end with two underscores) are automatically expanded to include the name of the enclosing class. For instance, a name like __X within a class named Spam is changed to _Spam__X automatically: a single underscore, the enclosing class’s name, and the rest of the original name. Because the modified name is prefixed with the name of the enclosing class, it’s somewhat unique; it won’t clash with similar names created by other classes in a hierarchy.
Name mangling happens only in class statements and only for names you write with two leading underscores. Within a class, though, it happens to every name preceded with double underscores wherever they appear. This includes both method names and instance attributes. For example, an instance attribute reference self.__X is transformed to self._Spam__X. Since more than one class may add attributes to an instance, this mangling helps avoid clashes; but we need to move on to an example to see how.
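Before turning to the collision problem itself, the expansion rules can be observed directly. The sketch below is written for Python 3; the attribute and method names are illustrative only. It uses the Spam class name from the text and inspects the resulting namespaces with the vars built-in:

```python
class Spam:
    def __init__(self):
        self.__X = 1            # stored in the instance as _Spam__X
        self.__Y__ = 2          # ends with two underscores: not mangled
        self.Z = 3              # no leading double underscore: not mangled

    def __helper(self):         # method names are mangled too: _Spam__helper
        return self.__X         # this reference is expanded the same way

    def run(self):
        return self.__helper()

s = Spam()
instance_names = sorted(vars(s))        # names stored in the instance
class_has_helper = '_Spam__helper' in vars(Spam)
result = s.run()
```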
The problem that the pseudo-private attribute feature is meant to alleviate has to do with the way instance attributes are stored. In Python, all instance attributes wind up in the single instance object at the bottom of the class tree. This is very different from the C++ model, where each class gets its own space for data members it defines.
Within a class method in Python, whenever a method assigns to a self attribute (e.g., self.attr=value), it changes or creates an attribute in the instance (inheritance search only happens on reference, not assignment). Because this is true even if multiple classes in a hierarchy assign to the same attribute, collisions are possible.
For example, suppose that when a programmer codes a class, she assumes that she owns the attribute name X in the instance. In this class’s methods, the name is set and later fetched:
class C1:
    def meth1(self): self.X = 88         # Assume X is mine.
    def meth2(self): print self.X
Suppose further that another programmer, working in isolation, makes the same assumption in a class that he codes:
class C2:
    def metha(self): self.X = 99         # Me too
    def methb(self): print self.X
Both of these classes work by themselves. The problem arises if these two classes are ever mixed together in the same class tree:
class C3(C1, C2): ...
I = C3( )                                # Only 1 X in I!
Now, the value that each class will get back when it says self.X depends on which class assigned it last. Because all assignments to self.X refer to the same single instance, there is only one X attribute, I.X, no matter how many classes use that attribute name. To guarantee that an attribute belongs to the class that uses it, prefix the name with double underscores everywhere it is used in the class, as in this file, private.py:
class C1:
    def meth1(self): self.__X = 88       # Now X is mine.
    def meth2(self): print self.__X      # Becomes _C1__X in I

class C2:
    def metha(self): self.__X = 99       # Me too
    def methb(self): print self.__X      # Becomes _C2__X in I

class C3(C1, C2): pass
I = C3( )                                # Two X names in I

I.meth1( ); I.metha( )
print I.__dict__
I.meth2( ); I.methb( )
When thus prefixed, the X attributes are expanded to include the name of the class, before being added to the instance. If you run a dir call on I or inspect its namespace dictionary after the attributes have been assigned, you see the expanded names: _C1__X and _C2__X, but not X. Because the expansion makes the names unique within the instance, the class coders can assume they truly own any names that they prefix with two underscores:
% python private.py
{'_C2__X': 99, '_C1__X': 88}
88
99
This trick can avoid potential name collisions in the instance, but note that it is not true privacy at all. If you know the name of the enclosing class, you can still access these attributes anywhere you have a reference to the instance, by using the fully expanded name (e.g., I._C1__X=77). On the other hand, this feature makes it less likely that you will accidentally step on a class’s names.
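The lack of true privacy is easy to demonstrate. The sketch below restates the C1, C2, and C3 classes from private.py in a form that runs under Python 3 (the methods return values instead of printing them, which is a change made only for this example) and then reaches into the instance from outside by using an expanded name:

```python
class C1:
    def meth1(self): self.__X = 88      # becomes _C1__X in the instance
    def meth2(self): return self.__X

class C2:
    def metha(self): self.__X = 99      # becomes _C2__X in the instance
    def methb(self): return self.__X

class C3(C1, C2): pass

I = C3()
I.meth1(); I.metha()
before = (I.meth2(), I.methb())         # each class sees its own value

I._C1__X = 77                           # outside code can still reach in
after = (I.meth2(), I.methb())
```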
We should note that this feature tends to become more useful for larger, multi-programmer projects, and then only for selected names. That is, don’t clutter your code unnecessarily; only use this feature for names that truly need to be controlled by a single class. For simpler programs, it’s probably overkill.
In Release 2.2, Python introduced a new flavor of classes, known as “new style” classes; the classes covered so far in this part of the book are known as “classic classes” when comparing them to the new kind.
New style classes are only slightly different from classic classes, and the ways in which they differ are completely irrelevant to the vast majority of Python users. Moreover, the classic class model, which has been with Python for over a decade, still works exactly as we have described previously.
New style classes are almost completely backward-compatible with classic classes, in both syntax and behavior; they mostly just add a few advanced new features. However, because they modify one special case of inheritance, they had to be introduced as a distinct tool, so as to avoid impacting any existing code that depends on the prior behavior.
New style classes are coded with all the normal class syntax we have studied. The chief coding difference is that you subclass from a built-in type (e.g., list) to produce a new style class. A new built-in name, object, is provided to serve as a superclass for new style classes if no other built-in type is appropriate to use:
class newstyle(object):
    ...normal code...
More generally, any class derived from object or another built-in type is automatically treated as a new style class. By derived, we mean that this includes subclasses of object, subclasses of subclasses of object, and so on, as long as a built-in is somewhere in the superclass tree. Classes not derived from built-ins are considered classic.
Perhaps the most visible change in new style classes is their slightly different treatment of inheritance for the so-called diamond pattern of multiple inheritance trees—where more than one superclass leads to the same higher superclass further above. The diamond pattern is an advanced design concept, which we have not even discussed for normal classes.
In short, with classic classes inheritance search is strictly depth first, and then left to right—Python climbs all the way to the top before it begins to back up and look to the right in the tree. In new style classes, the search is more breadth-first in such cases—Python chooses a closer superclass to the right before ascending all the way to the common superclass at the top. Because of this change, lower superclasses can overload attributes of higher superclasses, regardless of the sort of multiple inheritance trees they are mixed into.
To illustrate, consider this simplistic incarnation of the diamond inheritance pattern for classic classes:
>>> class A: attr = 1                 # Classic
>>> class B(A): pass
>>> class C(A): attr = 2
>>> class D(B,C): pass                # Tries A before C
>>> x = D( )
>>> x.attr
1
The attribute here was found in superclass A, because inheritance climbs as high as it can before backing up and moving right: it searches D, B, A, then C (and stops when attr is found in A above B). With the new style classes derived from a built-in like object, though, inheritance looks in C first (to the right) before A (above B): it searches D, B, C, then A (and in this case stops in C):
>>> class A(object): attr = 1         # New style
>>> class B(A): pass
>>> class C(A): attr = 2
>>> class D(B,C): pass                # Tries C before A
>>> x = D( )
>>> x.attr
2
This change in inheritance is based upon the assumption that if you mix in C lower in the tree, you probably intend to grab its attributes in preference to A’s. It also assumes that C probably meant to override A’s attribute always: it does when used standalone, but not when mixed into a diamond with classic classes. You might not know that C may be mixed in like this at the time you code it.
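If you want to see the new style search order rather than deduce it, new style classes expose it through their __mro__ attribute, which the text above does not cover. The following sketch is written for Python 3, where every class is new style, so the classic ordering cannot be reproduced there:

```python
class A(object): attr = 1
class B(A): pass
class C(A): attr = 2
class D(B, C): pass

# __mro__ lists the classes in the order inheritance will search them.
search_order = [cls.__name__ for cls in D.__mro__]
found = D().attr                        # located in C, before A is reached
```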
Of course, the problem with assumptions is that they assume things. If this search order deviation seems too subtle to remember, or if you want more control over the search process, you can always force the selection of an attribute from anywhere in the tree by assigning or otherwise naming the one you want at the place where classes are mixed together:
>>> class A: attr = 1                 # Classic
>>> class B(A): pass
>>> class C(A): attr = 2
>>> class D(B,C): attr = C.attr       # Choose C, to the right.
>>> x = D( )
>>> x.attr                            # Works like new style
2
Here, a tree of classic classes is emulating the search order of new style classes; the assignment to the attribute in D picks the version in C, thereby subverting the normal inheritance search path (D.attr will be lowest in the tree). New style classes can similarly emulate classic classes, by choosing the attribute above at the place where the classes are mixed together:
>>> class A(object): attr = 1         # New style
>>> class B(A): pass
>>> class C(A): attr = 2
>>> class D(B,C): attr = B.attr       # Choose A.attr, above.
>>> x = D( )
>>> x.attr                            # Works like classic
1
If you are willing to always resolve conflicts like this, you can largely ignore the search order difference, and not rely on assumptions about what you meant when you coded your classes. Naturally, the attributes we pick this way can also be method functions—methods are normal, assignable objects:
>>> class A:
...     def meth(s): print 'A.meth'
>>> class C(A):
...     def meth(s): print 'C.meth'
>>> class B(A):
...     pass
>>> class D(B,C): pass                # Use default search order.
>>> x = D( )                          # Will vary per class type
>>> x.meth( )                         # Defaults to classic order
A.meth
>>> class D(B,C): meth = C.meth       # Pick C's method: new style.
>>> x = D( )
>>> x.meth( )
C.meth
>>> class D(B,C): meth = B.meth       # Pick B's method: classic.
>>> x = D( )
>>> x.meth( )
A.meth
Here, we select methods by assigning to the same names lower in the tree. We might also simply call the desired class explicitly; in practice, this pattern might be more common, especially for things like constructors:
class D(B,C):
    def meth(self):                   # Redefine lower.
        ...
        C.meth(self)                  # Pick C's method by calling.
Such selections by assignment or call at mix-in points can effectively insulate your code from this difference in class flavors. By explicitly resolving the conflict this way, your code won’t vary per Python version in the future (apart from perhaps needing to derive classes from a built-in for the new style).[2]
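For the constructor case mentioned above, explicit calls might look something like the following sketch. It is written for Python 3, and the log attribute is a hypothetical device used only to record which constructors ran:

```python
class A(object):
    def __init__(self):
        self.log = ['A']

class B(A):
    pass

class C(A):
    def __init__(self):
        A.__init__(self)                # run the superclass constructor first
        self.log.append('C')

class D(B, C):
    def __init__(self):                 # redefine lower in the tree
        C.__init__(self)                # pick C's constructor by calling it
        self.log.append('D')

x = D()
ran = x.log                             # which constructors ran, in order
```

Because D names C directly, the constructor that runs does not depend on how inheritance would have searched the diamond.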
By default, the diamond pattern is searched differently in classic and new style classes, and this is a non-backward-compatible change. However, keep in mind that this change only affects diamond pattern cases; new style class inheritance works unchanged for all other inheritance tree structures. Further, it’s not impossible that this entire issue may be of more theoretical than practical importance; since it wasn’t significant enough to change until 2.2, it seems unlikely to impact much Python code.
Beyond this change in the diamond inheritance pattern (which is itself too obscure to matter to most readers of this book), new style classes open up a handful of even more advanced possibilities. Here’s a brief look at each.
It is possible to define methods within a class that can be called without an instance: static methods work roughly like simple instance-less functions inside a class, and class methods are passed a class instead of an instance. Special built-in functions must be called within the class to enable these method modes: staticmethod and classmethod. Because this is also a solution to a longstanding gotcha in Python, we’ll present these calls later in this chapter in Section 23.4. Note that the new static and class methods also work for classic classes in Python release 2.2.
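As a brief preview, the two built-ins can be applied by reassigning a method name after its def, which is the form available in release 2.2. The Counter class and its method names below are hypothetical, and the sketch is written so that it also runs under Python 3:

```python
class Counter(object):
    count = 0

    def bump(cls):                      # receives the class, not an instance
        cls.count += 1
        return cls.count
    bump = classmethod(bump)            # wrap the function after the def

    def describe(n):                    # receives no automatic argument
        return 'count is %d' % n
    describe = staticmethod(describe)

first = Counter.bump()                  # called through the class
second = Counter().bump()               # or through an instance
label = Counter.describe(second)        # no instance required
```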
By assigning a list of string attribute names to a special __slots__ class attribute, it is possible for new style classes to limit the set of legal attributes that instances of the class will have. This special attribute is typically set by assigning to variable __slots__ at the top level of a class statement. Only those names in the __slots__ list can be assigned as instance attributes. However, like all names in Python, instance attribute names must still be assigned before they can be referenced, even if listed in __slots__. Here’s an example to illustrate:
>>> class limiter(object):
...     __slots__ = ['age', 'name', 'job']
...
>>> x = limiter( )
>>> x.age                             # Must assign before use
AttributeError: age
>>> x.age = 40
>>> x.age
40
>>> x.ape = 1000                      # Illegal: not in slots
AttributeError: 'limiter' object has no attribute 'ape'
This feature is envisioned as a way to catch “typo” errors (assignment to illegal attribute names not in __slots__ is detected) and as a possible optimization mechanism in the future. Slots are something of a break with Python’s dynamic nature, which dictates that any name may be created by assignment. They also have additional constraints and implications that are far too complex for us to discuss here (e.g., some instances with slots may not have an attribute dictionary __dict__); see Python 2.2 release documents for details.
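The point about the missing attribute dictionary can be checked with a few lines. This sketch is written for Python 3 and uses a class like the limiter example above (capitalized here only to keep it distinct); it catches the exception instead of letting it display:

```python
class Limiter(object):
    __slots__ = ['age', 'name', 'job']

x = Limiter()
x.age = 40

has_dict = hasattr(x, '__dict__')       # no per-instance dictionary here
try:
    x.ape = 1000                        # name not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True

name_set = hasattr(x, 'name')           # listed, but never assigned
```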
A mechanism known as properties provides another way for new style classes to define automatically called methods for access or assignment to instance attributes. This feature is an alternative for many current uses of the __getattr__ and __setattr__ overloading methods studied in Chapter 21. Properties have a similar effect to these two methods, but incur an extra method call only for access to names that require dynamic computation. Properties (and slots) are based on a new notion of attribute descriptors, which is too advanced for us to cover here.
In short, properties are a type of object assigned to class attribute names. They are generated by calling a property built-in with three methods (handlers for get, set, and delete operations), as well as a docstring; if any argument is passed as None or omitted, that operation is not supported. Properties are typically assigned at the top level of a class statement (e.g., name=property(...)). When thus assigned, accesses to the class attribute itself (e.g., obj.name) are automatically routed to one of the accessor methods passed into the property. For comparison, recall that the __getattr__ method allows classes to intercept undefined attribute references:
>>> class classic:
...     def __getattr__(self, name):
...         if name == 'age':
...             return 40
...         else:
...             raise AttributeError
...
>>> x = classic( )
>>> x.age                             # Runs __getattr__
40
>>> x.name                            # Runs __getattr__
AttributeError
Here is the same example, coded with properties instead:
>>> class newprops(object):
...     def getage(self):
...         return 40
...     age = property(getage, None, None, None)    # get,set,del,docs
...
>>> x = newprops( )
>>> x.age                             # Runs getage
40
>>> x.name                            # Normal fetch
AttributeError: 'newprops' object has no attribute 'name'
For some coding tasks, properties can be both less complex and quicker to run than the traditional techniques. For example, when we add attribute assignment support, properties become more attractive—there’s less code to type, and you might not incur an extra method call for assignments to attributes you don’t wish to compute dynamically:
>>> class newprops(object):
...     def getage(self):
...         return 40
...     def setage(self, value):
...         print 'set age:', value
...         self._age = value
...     age = property(getage, setage, None, None)
...
>>> x = newprops( )
>>> x.age                             # Runs getage
40
>>> x.age = 42                        # Runs setage
set age: 42
>>> x._age                            # Normal fetch; no getage call
42
>>> x.job = 'trainer'                 # Normal assign; no setage call
>>> x.job                             # Normal fetch; no getage call
'trainer'
The equivalent classic class might trigger extra method calls, and may need to route attribute assignments through the attribute dictionary to avoid loops:
>>> class classic:
...     def __getattr__(self, name):            # On undefined reference
...         if name == 'age':
...             return 40
...         else:
...             raise AttributeError
...     def __setattr__(self, name, value):     # On all assignments
...         print 'set:', name, value
...         if name == 'age':
...             self.__dict__['_age'] = value
...         else:
...             self.__dict__[name] = value
...
>>> x = classic( )
>>> x.age                             # Runs __getattr__
40
>>> x.age = 41                        # Runs __setattr__
set: age 41
>>> x._age                            # Defined: no __getattr__ call
41
>>> x.job = 'trainer'                 # Runs __setattr__ again
set: job trainer
>>> x.job                             # Defined: no __getattr__ call
'trainer'
Properties seem like a win for this simple example. However, some applications of __getattr__ and __setattr__ may still require more dynamic or generic interfaces than properties directly provide. For example, in many cases, the set of attributes to be supported cannot be determined when the class is coded, and may not even exist in any tangible form at all (e.g., when delegating arbitrary method references to a wrapped/embedded object generically). In such cases, a generic __getattr__ or __setattr__ attribute handler with a passed-in attribute name may be an advantage. Because such generic handlers can also handle simpler cases, properties are largely an optional extension.
The __getattribute__ method, available for new style classes only, allows a class to intercept all attribute references, not just undefined references like __getattr__. It is also substantially trickier to use than either __getattr__ or __setattr__ (it is prone to loops). We’ll defer to Python’s standard documentation for more details.
Besides all these feature additions, new style classes integrate with the notion of subclassable types that we met earlier in this chapter; subclassable types and new style classes were both introduced in conjunction with a merging of the type/class dichotomy in 2.2 and beyond.
Because new style class features are all advanced topics, we are going to skip further details in this introductory text. Please see Python 2.2 release documentation and the language reference for more information.
It is not impossible that new style classes might be adopted as the single class model in future Python releases. If they are, you might simply need to make sure your top-level superclasses are derived from object or another built-in type name (if even that will be required at all); everything else we’ve studied in this part of the book should continue to work as described.
Most class issues can usually be boiled down to namespace issues (which makes sense, given that classes are just namespaces with a few extra tricks). Some of the topics in this section are more like case studies of advanced class usage than problems, and one or two of these have been eased by recent Python releases.
Theoretically speaking, classes (and class instances) are all mutable objects. Just as with built-in lists and dictionaries, they can be changed in place, by assigning to their attributes. And like lists and dictionaries, this also means that changing a class or instance object may impact multiple references to it.
That’s usually what we want (and is how objects change their state in general), but this becomes especially critical to know when changing class attributes. Because all instances generated from a class share the class’s namespace, any changes at the class level are reflected in all instances, unless they have their own versions of changed class attributes.
Since classes, modules, and instances are all just objects with attribute namespaces, you can normally change their attributes at runtime by assignments. Consider the following class; inside the class body, the assignment to name a generates an attribute X.a, which lives in the class object at runtime and will be inherited by all of X’s instances:
>>> class X:
...     a = 1                         # Class attribute
...
>>> I = X( )
>>> I.a                               # Inherited by instance
1
>>> X.a
1
So far so good: this is the normal case. But notice what happens when we change the class attribute dynamically outside the class statement: it also changes the attribute in every object that inherits from the class. Moreover, new instances created from the class during this session or program get the dynamically set value, regardless of what the class’s source code says:
>>> X.a = 2                           # May change more than X
>>> I.a                               # I changes too.
2
>>> J = X( )                          # J inherits from X's runtime values
>>> J.a                               # (but assigning to J.a changes a in J, not X or I).
2
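The parenthetical note in the last comment deserves its own demonstration. In this sketch, written for Python 3, a change to the class is visible through both instances, while an assignment through an instance creates a separate attribute in that instance alone:

```python
class X(object):
    a = 1                    # class attribute

I = X()
J = X()

X.a = 2                      # change the class: both instances see it
shared = (I.a, J.a)

J.a = 3                      # assignment creates an attribute in J only
after = (X.a, I.a, J.a)
j_names = list(vars(J))      # J now has its own 'a'
i_names = list(vars(I))      # I still has none; it inherits from X
```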
Is this a useful feature or a dangerous trap? You be the judge, but you can actually get work done by changing class attributes, without ever making a single instance. This technique can simulate “records” or “structs” in other languages. As a refresher on this technique, consider the following unusual but legal Python program:
class X: pass                         # Make a few attribute namespaces.
class Y: pass

X.a = 1                               # Use class attributes as variables.
X.b = 2                               # No instances anywhere to be found
X.c = 3
Y.a = X.a + X.b + X.c

for X.i in range(Y.a): print X.i      # Prints 0..5
Here, classes X and Y work like “file-less” modules: namespaces for storing variables we don’t want to clash. This is a perfectly legal Python programming trick, but is less appropriate when applied to classes written by others; you can’t always be sure that class attributes you change aren’t critical to the class’s internal behavior. If you’re out to simulate a C “struct,” you may be better off changing instances than classes, since only one object is affected:
class Record: pass
X = Record( )
X.name = 'bob'
X.job  = 'Pizza maker'
This may be obvious, but is worth underscoring: if you use multiple inheritance, the order in which superclasses are listed in a class statement header can be critical. Python always searches your superclasses left to right, according to the order in the class header line.
For instance, in the multiple inheritance example we saw in Chapter 22, suppose that the Super class implemented a __repr__ method too; would we then want to inherit Lister’s or Super’s? We would get it from whichever class is listed first in Sub’s class header, since inheritance searches left to right. Presumably, we would list Lister first, since its whole purpose is its custom __repr__:
class Lister:
    def __repr__(self): ...

class Super:
    def __repr__(self): ...

class Sub(Lister, Super):    # Get Lister's __repr__ by listing it first.
But now suppose Super and Lister have their own versions of other same-named attributes, too. If we want one name from Super and another from Lister, no order in the class header will help; we will have to override inheritance by manually assigning to the attribute name in the Sub class:
class Lister:
    def __repr__(self): ...
    def other(self): ...

class Super:
    def __repr__(self): ...
    def other(self): ...

class Sub(Lister, Super):    # Get Lister's __repr__ by listing it first
    other = Super.other      # but explicitly pick Super's version of other.
    def __init__(self): ...

x = Sub( )                   # Inheritance searches Sub before Super/Lister.
Here, the assignment to other within the Sub class creates Sub.other, a reference back to the Super.other object. Because it is lower in the tree, Sub.other effectively hides Lister.other, the attribute that inheritance would normally find. Similarly, if we listed Super first in the class header to pick up its other, we would then need to select Lister’s method:
class Sub(Super, Lister):                 # Get Super's other by order.
    __repr__ = Lister.__repr__            # Explicitly pick Lister.__repr__.
Multiple inheritance is an advanced tool. Even if you understood the last paragraph, it’s still a good idea to use it sparingly and carefully. Otherwise, the meaning of a name may depend on the order in which classes are mixed in an arbitrarily far removed subclass. For another example of the technique shown here in action, see the discussion of explicit conflict resolution in Section 23.3 earlier in this chapter.
As a rule of thumb, multiple inheritance works best when your mix-in classes are as self-contained as possible; since they may be used in a variety of contexts, they should not make assumptions about the names related to other classes in a tree. Moreover, the pseudo-private attributes feature we studied earlier can help by localizing names that a class relies on owning, and limiting the names that your mix-in classes add to the mix. In the example, if Lister only means to export its custom __repr__, it could name its other method __other to avoid clashing with other classes.
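The following sketch shows how the pseudo-private renaming might look in practice. It assumes that Lister's helper method is used only by its own __repr__; the method bodies and strings are hypothetical stand-ins, not the Chapter 22 code:

```python
class Lister:
    def __repr__(self):
        # Inside the class, self.__other is rewritten to self._Lister__other
        return "<Lister: " + self.__other() + ">"
    def __other(self):               # Stored in the class as _Lister__other
        return "lister helper"

class Super:
    def other(self):                 # No clash: Lister adds no attribute named other
        return "Super.other"

class Sub(Lister, Super):            # No manual assignment needed in Sub
    pass

x = Sub()
repr_result = repr(x)                # Uses Lister's private helper
other_result = x.other()             # Inheritance finds Super's method
```

With the helper's name mangled, the only name Lister contributes to the mix is __repr__, so the order of superclasses in Sub's header affects nothing else.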
This gotcha has been fixed by a new optional feature in Python 2.2, static and class methods, but we retain it here for readers with older Python releases, and because it gives us as good a reason as any for presenting the new static and class methods advanced feature.
In Python releases prior to 2.2, class method functions can never be called without an instance. (In Python 2.2 and later, this is also the default behavior, but it can be modified if necessary.) In the prior chapter, we talked about unbound methods: when we fetch a method function by qualifying a class (instead of an instance), we get an unbound method object. Even though they are defined with a def statement, unbound method objects are not simple functions; they cannot be called without an instance.
For example, suppose we want to use class attributes to count how many instances are generated from a class (file spam.py, shown below). Remember, class attributes are shared by all instances, so we can store the counter in the class object itself:
class Spam:
    numInstances = 0
    def __init__(self):
        Spam.numInstances = Spam.numInstances + 1
    def printNumInstances( ):
        print "Number of instances created: ", Spam.numInstances
But this won’t work: the printNumInstances method still expects an instance to be passed in when called, because the function is associated with a class (even though there are no arguments in the def header):
>>> from spam import *
>>> a = Spam( )
>>> b = Spam( )
>>> c = Spam( )
>>> Spam.printNumInstances( )
Traceback (innermost last):
  File "<stdin>", line 1, in ?
TypeError: unbound method must be called with class instance 1st argument
Don’t expect this: unbound instance methods aren’t exactly the same as simple functions. This is mostly a knowledge issue, but if you want to call functions that access class members without an instance, probably the best advice is to just make them simple functions, not class methods. This way, an instance isn’t expected in the call:
def printNumInstances( ):
    print "Number of instances created: ", Spam.numInstances

class Spam:
    numInstances = 0
    def __init__(self):
        Spam.numInstances = Spam.numInstances + 1

>>> import spam
>>> a = spam.Spam( )
>>> b = spam.Spam( )
>>> c = spam.Spam( )
>>> spam.printNumInstances( )
Number of instances created: 3
>>> spam.Spam.numInstances
3
We can also make this work by calling through an instance, as usual, although this can be inconvenient if making an instance changes the class data:
class Spam:
    numInstances = 0
    def __init__(self):
        Spam.numInstances = Spam.numInstances + 1
    def printNumInstances(self):
        print "Number of instances created: ", Spam.numInstances

>>> from spam import Spam
>>> a, b, c = Spam( ), Spam( ), Spam( )
>>> a.printNumInstances( )
Number of instances created: 3
>>> b.printNumInstances( )
Number of instances created: 3
>>> Spam( ).printNumInstances( )
Number of instances created: 4
Some language theorists claim that this means Python doesn’t have class methods, only instance methods. We suspect they really mean Python classes don’t work the same as in some other language. Python really has bound and unbound method objects, with well-defined semantics; qualifying a class gets you an unbound method, which is a special kind of function. Python does have class attributes, but functions in classes expect an instance argument.
Moreover, since Python already provides modules as a namespace partitioning tool, there’s usually no need to package functions in classes unless they implement object behavior. Simple functions within modules usually do most of what instance-less class methods could. For example, in the first code sample in this section, printNumInstances is already associated with the class, because it lives in the same module. The only lost functionality is that the function name has a broader scope: the entire module, rather than the class.
As of Python 2.2, you can code classes with both static and class methods, neither of which requires an instance to be present when it is invoked. To designate such methods, classes call the built-in functions staticmethod and classmethod, as hinted in the earlier discussion of new style classes. For example:
class Multi:
    def imeth(self, x):            # Normal instance method
        print self, x
    def smeth(x):                  # Static: no instance passed
        print x
    def cmeth(cls, x):             # Class: gets class, not instance
        print cls, x
    smeth = staticmethod(smeth)    # Make smeth a static method.
    cmeth = classmethod(cmeth)     # Make cmeth a class method.
Notice how the last two assignments in this code simply reassign the method names smeth and cmeth. Attributes are created and changed by any assignment in a class statement, so these final assignments overwrite the assignments made earlier by the defs.
Technically, Python 2.2 supports three kinds of class-related methods: instance, static, and class. Instance methods are the normal (and default) case that we’ve seen in this book. With instance methods, you always must call the method with an instance object. When you call through an instance, Python passes the instance to the first (leftmost) argument automatically; when called through the class, you pass along the instance manually:
>>> obj = Multi( )                 # Make an instance
>>> obj.imeth(1)                   # Normal call, through instance
<__main__.Multi instance...> 1
>>> Multi.imeth(obj, 2)            # Normal call, through class
<__main__.Multi instance...> 2
By contrast, static methods are called without an instance argument; their names are local to the scope of the class they are defined in, and may be looked up by inheritance; mostly, they work like simple functions that happen to be coded inside a class:
>>> Multi.smeth(3)                 # Static call, through class
3
>>> obj.smeth(4)                   # Static call, through instance
4
Class methods are similar, but Python automatically passes the class (not an instance) in to the method’s first (leftmost) argument:
>>> Multi.cmeth(5)                 # Class call, through class
__main__.Multi 5
>>> obj.cmeth(6)                   # Class call, through instance
__main__.Multi 6
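One consequence of class methods receiving a class is worth a short sketch: the class passed in is the one the call was made through, which may be a subclass. MultiSub below is a hypothetical subclass added only for this illustration, and the method returns its arguments instead of printing them so the result can be inspected directly:

```python
class Multi:
    def cmeth(cls, x):               # Class method: receives a class, not an instance
        return (cls.__name__, x)
    cmeth = classmethod(cmeth)       # Same assignment form as in the listing above

class MultiSub(Multi):               # Hypothetical subclass; inherits cmeth unchanged
    pass

through_class = Multi.cmeth(1)       # cls is Multi
through_subclass = MultiSub.cmeth(2) # cls is MultiSub, the class used in the call
through_instance = MultiSub().cmeth(3)   # cls is the instance's class, MultiSub
```

A static method has no such argument, so it cannot tell which class it was called through.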
Static and class methods are new and advanced features of the language. They have highly specialized roles that we don’t have space to document here. Static methods are commonly used in conjunction with class attributes to manage information that spans all instances generated from the class.
For example, to keep track of the number of instances generated from a class (as in the earlier example), you may use static methods to manage a counter attached as a class attribute. Since such a count has nothing to do with any particular instance, it is inconvenient to have to access methods that process it through an instance (especially since making an instance to access the counter may change the counter). Moreover, static methods’ proximity to the class provides a more natural solution than coding class-oriented functions outside the class. Here is the static method equivalent of this section’s original example:
class Spam:
    numInstances = 0
    def __init__(self):
        Spam.numInstances += 1
    def printNumInstances( ):
        print "Number of instances:", Spam.numInstances
    printNumInstances = staticmethod(printNumInstances)

>>> a = Spam( )
>>> b = Spam( )
>>> c = Spam( )
>>> Spam.printNumInstances( )
Number of instances: 3
>>> a.printNumInstances( )
Number of instances: 3
Compared to simply moving printNumInstances outside the class as prescribed earlier, this version requires an extra staticmethod call, but localizes the function name in the class scope, and moves the function code closer to where it is used (inside the class statement). You should judge for yourself whether this is a net improvement or not.
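For comparison, the same counter could also be coded with a class method. This is only a sketch of an alternative, not a recommendation; the method name getNumInstances is made up for this example, and it returns the count rather than printing it:

```python
class Spam:
    numInstances = 0
    def __init__(self):
        Spam.numInstances += 1           # Counter stays attached to Spam itself
    def getNumInstances(cls):            # cls is passed in automatically
        return cls.numInstances          # Looked up by inheritance from cls
    getNumInstances = classmethod(getNumInstances)

a = Spam()
b = Spam()
c = Spam()
count_through_class = Spam.getNumInstances()     # No instance required
count_through_instance = a.getNumInstances()     # Also works through an instance
```

Here the method reaches the counter through its cls argument instead of naming Spam directly, though the constructor still names Spam when it increments.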
This gotcha went away in Python 2.2, with the introduction of nested function scopes, but we retain it here for historical perspective, for readers working with older Python releases, and because it demonstrates what happens to the new nested function scope rules when a class is a layer of the nesting.
Classes introduce a local scope just as functions do, so the same
sorts of scope behavior can happen in a class
statement body. Moreover, methods are further nested functions, so
the same issues apply. Confusion seems to be especially common when
classes are nested.
In the following example, file nester.py, the generate function returns an instance of the nested Spam class. Within its code, the class name Spam is assigned in the generate function’s local scope. But within the class’s method function, the class name Spam is not visible in Python prior to 2.2, where method has access only to its own local scope, the module surrounding generate, and built-in names:
def generate( ):
    class Spam:
        count = 1
        def method(self):        # Name Spam not visible:
            print Spam.count     # Not local(def),global(module), built-in
    return Spam( )

generate( ).method( )
C:\python\examples> python nester.py
Traceback (innermost last):
  File "nester.py", line 8, in ?
    generate( ).method( )
  File "nester.py", line 5, in method
    print Spam.count     # Not local(def),global(module), built-in
NameError: Spam
As a solution, either upgrade to Python 2.2, or don’t nest code this way. This example works in Python 2.2 and later, because the local scopes of all enclosing function defs are automatically visible to nested defs, including nested method defs, as in this example.
Note that even in 2.2, method defs cannot see the local scope of the enclosing class, only the local scope of enclosing defs. That’s why methods must go through the self instance or the class name, to reference methods and other attributes defined in the enclosing class statement. For example, code in the method must use self.count or Spam.count, not just count.
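The three ways of referring to count can be compared side by side in a short sketch. It assumes Python 2.2 or later (so that the enclosing def's scope is visible), and the method names viaSelf, viaClass, and bare are invented for this example:

```python
def generate():
    class Spam:
        count = 1
        def viaSelf(self):
            return self.count        # Works: attribute fetch through the instance
        def viaClass(self):
            return Spam.count        # Works in 2.2+: Spam is local to generate
        def bare(self):
            return count             # Fails: the class body is not an enclosing scope
    return Spam()

obj = generate()
self_result = obj.viaSelf()
class_result = obj.viaClass()

try:
    obj.bare()
    bare_failed = False
except NameError:                    # count is not local, enclosing, global, or built-in
    bare_failed = True
```

The bare reference fails only when the method is called, because names inside a def are resolved when the code runs, not when the def statement is executed.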
Prior to release 2.2, there are a variety of ways to get the example above to work. One of the simplest is to move the name Spam out to the enclosing module’s scope with global declarations; since method sees global names in the enclosing module, references work:
def generate( ):
    global Spam                  # Force Spam to module scope.
    class Spam:
        count = 1
        def method(self):
            print Spam.count     # Works: in global (enclosing module)
    return Spam( )

generate( ).method( )            # Prints 1
Perhaps better, we can also restructure the code such that class Spam is defined at the top level of the module by virtue of its nesting level, rather than global declarations. Both the nested method function and the top level generate find Spam in their global scopes:
def generate( ):
    return Spam( )

class Spam:                      # Define at module top-level.
    count = 1
    def method(self):
        print Spam.count         # Works: in global (enclosing module)

generate( ).method( )
In fact, this is what we prescribe for all Python releases—your code tends to be simpler in general if you avoid nesting of classes and functions.
If you want to get complicated and tricky, you can also get rid of the Spam reference in method altogether, by using the special __class__ attribute, which returns an instance’s class object:
def generate( ):
    class Spam:
        count = 1
        def method(self):
            print self.__class__.count    # Works: qualify to get class
    return Spam( )

generate( ).method( )
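Going through __class__ is not quite the same as naming the class, which the following sketch tries to show. SubSpam and the two method names are hypothetical, added only to contrast the two spellings:

```python
class Spam:
    count = 1
    def viaName(self):
        return Spam.count                # Always the count attached to Spam
    def viaClass(self):
        return self.__class__.count      # The count of the instance's own class

class SubSpam(Spam):                     # Hypothetical subclass with its own count
    count = 99

name_result = SubSpam().viaName()        # Still reads Spam.count
class_result = SubSpam().viaClass()      # Reads SubSpam.count
base_result = Spam().viaClass()          # For a Spam instance, same as Spam.count
```

Which behavior is preferable depends on whether subclasses are meant to supply their own version of the attribute.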
Sometimes, the abstraction potential of OOP can be abused to the point of making code difficult to understand. If your classes are layered too deeply, it can make code obscure; you may have to search through many classes to discover what an operation does. For example, one of your authors once worked in a C++ shop with thousands of classes (some generated by machine), and up to 15 levels of inheritance; deciphering a method call in such a complex system was often a monumental task. Multiple classes had to be consulted for even the most basic of operations.
The most general rule of thumb applies here too: don’t make things complicated unless they truly must be. Wrapping your code in multiple layers of classes to the point of incomprehensibility is always a bad idea. Abstraction is the basis of polymorphism and encapsulation, and can be a very effective tool when used well. But you’ll simplify debugging and aid maintainability, if you make your class interfaces intuitive, avoid making code overly abstract, and keep your class hierarchies short and flat unless there is a good reason to do otherwise.
These exercises ask you to write a few classes and experiment with some existing code. Of course, the problem with existing code is that it must be existing. To work with the set class in exercise 5, either pull down the class source code off the Internet (see Preface) or type it up by hand (it’s fairly small). These programs are starting to get more sophisticated, so be sure to check the solutions at the end of the book for pointers.
See Section B.6 for the solutions.
Inheritance. Write a class called Adder that exports a method add(self, x, y) that prints a “Not Implemented” message. Then define two subclasses of Adder that implement the add method:
ListAdder
With an add method that returns the concatenation of its two list arguments
DictAdder
With an add method that returns a new dictionary with the items in both its two dictionary arguments (any definition of addition will do)
Experiment by making instances of all three of your classes interactively and calling their add methods.
Now, extend your Adder superclass to save an object in the instance with a constructor (e.g., assign self.data a list or a dictionary) and overload the + operator with an __add__ to automatically dispatch to your add methods (e.g., X+Y triggers X.add(X.data,Y)). Where is the best place to put the constructors and operator overload methods (i.e., in which classes)? What sorts of objects can you add to your class instances?
In practice, you might find it easier to code your add methods to accept just one real argument (e.g., add(self,y)), and add that one argument to the instance’s current data (e.g., self.data+y). Does this make more sense than passing two arguments to add? Would you say this makes your classes more “object-oriented”?
Operator overloading. Write a class called Mylist that shadows (“wraps”) a Python list: it should overload most list operators and operations including +, indexing, iteration, slicing, and list methods such as append and sort. See the Python reference manual for a list of all possible methods to support. Also, provide a constructor for your class that takes an existing list (or a Mylist instance) and copies its components into an instance member. Experiment with your class interactively. Things to explore:
Why is copying the initial value important here?
Can you use an empty slice (e.g., start[:]) to copy the initial value if it’s a Mylist instance?
Is there a general way to route list method calls to the wrapped list?
Can you add a Mylist and a regular list? How about a list and a Mylist instance?
What type of object should operations like + and slicing return; how about indexing?
If you are working with a more recent Python release (Version 2.2 or later), you may implement this sort of wrapper class either by embedding a real list in a standalone class, or by extending the built-in list type with a subclass. Which is easier and why?
Subclassing. Make a subclass of Mylist from Exercise 2 called MylistSub, which extends Mylist to print a message to stdout before each overloaded operation is called and counts the number of calls. MylistSub should inherit basic method behavior from Mylist. Adding a sequence to a MylistSub should print a message, increment the counter for + calls, and perform the superclass’s method. Also, introduce a new method that displays the operation counters to stdout and experiment with your class interactively. Do your counters count calls per instance, or per class (for all instances of the class)? How would you program both of these? (Hint: it depends on which object the count members are assigned to: class members are shared by instances, self members are per-instance data.)
Metaclass methods. Write a class called Meta with methods that intercept every attribute qualification (both fetches and assignments) and print a message with their arguments to stdout. Create a Meta instance and experiment with qualifying it interactively. What happens when you try to use the instance in expressions? Try adding, indexing, and slicing the instance of your class.
Set objects. Experiment with the set class described in Section 23.1.1. Run commands to do the following sorts of operations:
Create two sets of integers, and compute their intersection and union by using & and | operator expressions.
Create a set from a string, and experiment with indexing your set; which methods in the class are called?
Try iterating through the items in your string set using a for loop; which methods run this time?
Try computing the intersection and union of your string set and a simple Python string; does it work?
Now, extend your set by subclassing to handle arbitrarily many operands using a *args argument form (Hint: see the function versions of these algorithms in Chapter 13). Compute intersections and unions of multiple operands with your set subclass. How can you intersect three or more sets, given that & has only two sides?
How would you go about emulating other list operations in the set class? (Hints: __add__ can catch concatenation, and __getattr__ can pass most list method calls off to the wrapped list.)
Class tree links. In Section 21.5 in Chapter 21, and in Section 22.6 in Chapter 22, we mentioned that classes have a __bases__ attribute that returns a tuple of the class’s superclass objects (the ones in parentheses in the class header). Use __bases__ to extend the Lister mixin class (see Chapter 22), so that it prints the names of the immediate superclasses of the instance’s class. When you’re done, the first line of the string representation should look like this (your address may vary):
<Instance of Sub(Super, Lister), address 7841200:
How would you go about listing inherited class attributes too? (Hint: classes have a __dict__.) Try extending your Lister class to display all accessible superclasses and their attributes as well; see Chapter 21’s classtree.py example for hints on climbing class trees, and the Lister footnote about using dir and getattr in Python 2.2 for hints on climbing trees.
Composition. Simulate a fast-food ordering scenario by defining four classes:
Lunch
A container and controller class
Customer
The actor that buys food
Employee
The actor that a customer orders from
Food
What the customer buys
To get you started, here are the classes and methods you’ll be defining:
class Lunch:
    def __init__(self)                        # Make/embed Customer and Employee.
    def order(self, foodName)                 # Start a Customer order simulation.
    def result(self)                          # Ask the Customer what kind of Food it has.

class Customer:
    def __init__(self)                        # Initialize my food to None.
    def placeOrder(self, foodName, employee)  # Place order with an Employee.
    def printFood(self)                       # Print the name of my food.

class Employee:
    def takeOrder(self, foodName)             # Return a Food, with requested name.

class Food:
    def __init__(self, name)                  # Store food name.
The order simulation works as follows:
The Lunch class’s constructor should make and embed an instance of Customer and Employee, and export a method called order. When called, this order method should ask the Customer to place an order, by calling its placeOrder method. The Customer’s placeOrder method should in turn ask the Employee object for a new Food object, by calling the Employee’s takeOrder method.
Food objects should store a food name string (e.g., “burritos”), passed down from Lunch.order to Customer.placeOrder, to Employee.takeOrder, and finally to Food’s constructor. The top-level Lunch class should also export a method called result, which asks the customer to print the name of the food it received from the Employee via the order (this can be used to test your simulation).
Note that Lunch needs to either pass the Employee to the Customer, or pass itself to the Customer, in order to allow the Customer to call Employee methods.
Experiment with your classes interactively by importing the Lunch class, calling its order method to run an interaction, and then calling its result method to verify that the Customer got what he or she ordered. If you prefer, you can also simply code test cases as self-test code in the file where your classes are defined, using the module __name__ trick in Chapter 18. In this simulation, the Customer is the active agent; how would your classes change if Employee were the object that initiated customer/employee interaction instead?
Zoo Animal Hierarchy: Consider the class tree shown in Figure 23-1. Code a set of six class statements to model this taxonomy with Python inheritance. Then, add a speak method to each of your classes that prints a unique message, and a reply method in your top-level Animal superclass that simply calls self.speak to invoke the category-specific message printer in a subclass below (this will kick off an independent inheritance search from self). Finally, remove the speak method from your Hacker class, so that it picks up the default above it. When you’re finished, your classes should work this way:
% python
>>> from zoo import Cat, Hacker
>>> spot = Cat( )
>>> spot.reply( )                  # Animal.reply; calls Cat.speak
meow
>>> data = Hacker( )
>>> data.reply( )                  # Animal.reply; calls Primate.speak
Hello world!
The Dead Parrot Sketch: Consider the object embedding structure captured in Figure 23-2. Code a set of Python classes to implement this structure with composition. Code your Scene object to define an action method, and embed instances of Customer, Clerk, and Parrot classes, all three of which should define a line method that prints a unique message. The embedded objects may either inherit from a common superclass that defines line and simply provide message text, or define line themselves. In the end, your classes should operate like this:
% python
>>> import parrot
>>> parrot.Scene( ).action( )      # Activate nested objects.
customer: "that's one ex-bird!"
clerk: "no it isn't..."
parrot: None
[1] This tends to scare C++ people unnecessarily. In Python, it’s even possible to change or completely delete a class method at runtime. On the other hand, nobody ever does, in practical programs. As a scripting language, Python is more about enabling, than restricting.
[2] Even without the classic/new divergence, this technique may sometimes come in handy in multiple inheritance scenarios in general. If you want part of a superclass on the left, and part of a superclass on the right, you might need to tell Python which same-named attributes to choose by such explicit assignments in subclasses. We’ll revisit this notion in a gotcha at the end of this chapter. Also note that diamond inheritance can be more problematic in some cases than we’ve implied (e.g., what if B and C both have required constructors that call to A’s?), but this is beyond this book’s scope.