Chapter 23. Advanced Class Topics

This chapter concludes Part VI, and our look at OOP in Python, by presenting a few more advanced class-related topics, along with the gotchas and exercises for this part of the book. We encourage you to do the exercises, to help cement the ideas we’ve studied. We also suggest working on or studying larger OOP Python projects as a supplement to this book. Like much in computing, the benefits of OOP tend to become more apparent with practice.

Extending Built-in Types

Besides implementing new kinds of objects, classes are sometimes used to extend the functionality of Python’s built-in types in order to support more exotic data structures. For instance, to add queue insert and delete methods to lists, you can code classes that wrap (embed) a list object, and export insert and delete methods that process the list specially, like the delegation technique studied in Chapter 22. As of Python 2.2, you can also use inheritance to specialize built-in types. The next two sections show both techniques in action.
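As a minimal sketch of the wrapping idea just described, the following class embeds a list and exports only queue-style insert and delete methods. The Queue class and its method names are our own hypothetical illustration, not code from this book's examples; it uses print(...) calls and derives from object so that it also runs on later Python releases.

```python
# Hypothetical Queue class: wraps (embeds) a list and exports
# queue-style insert and delete methods that process it specially.
class Queue(object):
    def __init__(self, start=None):
        self.data = []                      # The embedded list
        for item in (start or []):
            self.insert(item)

    def insert(self, item):                 # Add at the back
        self.data.append(item)

    def delete(self):                       # Remove from the front
        return self.data.pop(0)

    def __len__(self):                      # len(self)
        return len(self.data)

    def __repr__(self):                     # Print
        return 'Queue:' + repr(self.data)

q = Queue(['a', 'b'])
q.insert('c')
first = q.delete()
print(first)
print(q)
```

Because the list is embedded rather than inherited, clients see only the operations the wrapper chooses to export.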

Extending Types by Embedding

Remember those set functions we wrote in Part IV? Here’s what they look like brought back to life as a Python class. The following example, file setwrapper.py, implements a new set object type, by moving some of the set functions to methods, and adding some basic operator overloading. For the most part, this class just wraps a Python list with extra set operations. Because it’s a class, it also supports multiple instances and customization by inheritance in subclasses.

class Set:
   def __init__(self, value = [  ]):     # Constructor
       self.data = [  ]                 # Manages a list
       self.concat(value)

   def intersect(self, other):        # other is any sequence.
       res = [  ]                       # self is the subject.
       for x in self.data:
           if x in other:             # Pick common items.
               res.append(x)
       return Set(res)                # Return a new Set.

   def union(self, other):            # other is any sequence.
       res = self.data[:]             # Copy of my list
       for x in other:                # Add items in other.
           if not x in res:
               res.append(x)
       return Set(res)

   def concat(self, value):           # value: list, Set...
       for x in value:                # Removes duplicates
           if not x in self.data:
               self.data.append(x)

   def __len__(self):          return len(self.data)        # len(self)
   def __getitem__(self, key): return self.data[key]        # self[i]
   def __and__(self, other):   return self.intersect(other) # self & other
   def __or__(self, other):    return self.union(other)     # self | other
   def __repr__(self):         return 'Set:' + repr(self.data)  # Print

By overloading indexing, the set class can often masquerade as a real list. Since you will interact with and extend this class in an exercise at the end of this chapter, we won’t say much more about this code until Appendix B.

Extending Types by Subclassing

Beginning with Python 2.2, all the built-in types can now be subclassed directly. Type conversion functions such as list, str, dict, and tuple have become built-in type names. Although this is transparent to your script, a type conversion call (e.g., list('spam')) is now really an invocation of a type’s object constructor.

This change allows us to customize or extend the behavior of built-in types with user-defined class statements: simply subclass the new type names to customize them. Instances of your type subclasses can be used anywhere that the original built-in type can appear. For example, suppose you have trouble getting used to the fact that Python list offsets begin at 0 instead of 1. Not to worry—you can always code your own subclass that customizes this core behavior of lists. File typesubclass.py shows how:

# Subclass built-in list type/class.
# Map 1..N to 0..N-1; call back to built-in version.

class MyList(list):
    def __getitem__(self, offset):
        print '(indexing %s at %s)' % (self, offset)  
        return list.__getitem__(self, offset - 1) 

if __name__ == '__main__':
    print list('abc')
    x = MyList('abc')               # __init__ inherited from list
    print x                         # __repr__ inherited from list

    print x[1]                      # MyList.__getitem__
    print x[3]                      # Customizes list superclass method

    x.append('spam'); print x       # Attributes from list superclass
    x.reverse(  );      print x

In this file, the MyList subclass extends the built-in list’s __getitem__ indexing method only, in order to map indexes 1 to N back to the required 0 to N-1. All it really does is decrement the index submitted, and call back to the superclass’s version of indexing, but it’s enough to do the trick:

% python typesubclass.py
['a', 'b', 'c']
['a', 'b', 'c']
(indexing ['a', 'b', 'c'] at 1)
a
(indexing ['a', 'b', 'c'] at 3)
c
['a', 'b', 'c', 'spam']
['spam', 'c', 'b', 'a']

This output also includes tracing text the class prints on indexing. Whether or not changing indexing this way is a good idea in general is another issue—users of your MyList class may very well be confused by such a core departure from Python sequence behavior. The fact that you can customize built-in types this way can be a powerful tool in general, though.
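To see how narrow this customization is, and why it may confuse users, here is a small sketch that drops the tracing print and probes a few cases. It is written with print(...) calls so that it also runs on later Python releases; the probes themselves are our own additions.

```python
class MyList(list):
    def __getitem__(self, offset):          # Only indexing is customized
        return list.__getitem__(self, offset - 1)

x = MyList('abc')
x.append('d')                               # Inherited from list, unchanged

print(isinstance(x, list))                  # Still passes list type tests
print(len(x))                               # len is inherited, not remapped

first, last, zero = x[1], x[4], x[0]
print(first)                                # Offset 1 is the first item
print(last)                                 # Offset 4 is the last item
print(zero)                                 # Offset 0 maps to -1: the last item
```

Note the final probe: offset 0 is silently decremented to -1, so it fetches the last item instead of raising an error.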

For instance, this coding pattern gives rise to an alternative way to code sets—as a subclass of the built-in list type, rather than a standalone class that manages an embedded list object. The following class, coded in file setsubclass.py, customizes lists, to add just methods and operators related to set processing; because all other behavior is inherited from the built-in list superclass, this makes for a shorter and simpler alternative:

class Set(list):
    def __init__(self, value = [  ]):     # Constructor
        list.__init__(self, [  ])         # Customizes list
        self.concat(value)               # Copies mutable defaults

    def intersect(self, other):          # other is any sequence.
        res = [  ]                       # self is the subject.
        for x in self:
            if x in other:               # Pick common items.
                res.append(x)
        return Set(res)                  # Return a new Set.

    def union(self, other):            # other is any sequence.
        res = Set(self)                # Copy me and my list.
        res.concat(other)
        return res

    def concat(self, value):           # value: list, Set...
        for x in value:                # Removes duplicates
            if not x in self:
                self.append(x)

    def __and__(self, other): return self.intersect(other)
    def __or__(self, other):  return self.union(other)
    def __repr__(self):       return 'Set:' + list.__repr__(self)

if __name__ == '__main__':
    x = Set([1,3,5,7])
    y = Set([2,1,4,5,6])
    print x, y, len(x)
    print x.intersect(y), y.union(x)
    print x & y, x | y
    x.reverse(  ); print x

Here is the output of this script’s self-test code at the end of the file. Because subclassing core types is an advanced feature, we will omit further details here, but invite you to trace through these results in the code, to study its behavior:

% python setsubclass.py
Set:[1, 3, 5, 7] Set:[2, 1, 4, 5, 6] 4
Set:[1, 5] Set:[2, 1, 4, 5, 6, 3, 7]
Set:[1, 5] Set:[1, 3, 5, 7, 2, 4, 6]
Set:[7, 5, 3, 1]

There are more efficient ways to implement sets with dictionaries in Python, which replace the linear scans in the set implementations we’ve shown with dictionary index operations (hashing), and so run much quicker. (For more details, see Programming Python, Second Edition [O’Reilly].) If you’re interested in sets, also see the new sets module that was added in the Python 2.3 release; this module provides a Set object and set operations as standard library tools. Sets are fun to experiment with, but you are no longer strictly required to code them yourself as of Python 2.3.
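The dictionary-based approach can be sketched roughly as follows. This is not the Programming Python implementation; DictSet is a hypothetical class of our own that stores items as dictionary keys, which assumes the items are hashable. It is written to run on current Python releases.

```python
class DictSet(object):
    def __init__(self, value=()):
        self.data = {}                      # Items are stored as keys
        self.concat(value)

    def concat(self, value):                # value: any iterable
        for x in value:
            self.data[x] = None             # Duplicates collapse onto one key

    def intersect(self, other):
        res = DictSet()
        for x in other:
            if x in self.data:              # Hashed lookup, not a linear scan
                res.data[x] = None
        return res

    def union(self, other):
        res = DictSet(self.data)            # Iterating a dictionary yields keys
        res.concat(other)
        return res

    def __len__(self):        return len(self.data)
    def __iter__(self):       return iter(self.data)
    def __and__(self, other): return self.intersect(other)
    def __or__(self, other):  return self.union(other)
    def __repr__(self):       return 'DictSet:' + repr(sorted(self.data))

x = DictSet([1, 3, 5, 7])
y = DictSet([2, 1, 4, 5, 6])
print(x & y)
print(x | y)
```

The membership test inside intersect is a single dictionary lookup, so the cost of each operation grows with the number of items rather than with the product of the two set sizes.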

For another type subclass example, see the implementation of the new bool type in Python 2.3: as mentioned earlier, bool is a subclass of int, with two instances True and False that behave like integers 1 and 0, but inherit custom string representation methods that display their names.
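A few quick checks make the relationship between bool and int concrete. This is only an illustrative sketch; the variable name total is ours.

```python
print(issubclass(bool, int))        # bool is a subclass of int
print(isinstance(True, int))        # So True is also an int instance
print(True + 1)                     # True behaves like integer 1
print(repr(True))                   # But displays as its name

total = sum([True, False, True])    # Booleans add up like 1 and 0
print(total)
```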

Pseudo-Private Class Attributes

In Part V, we learned that every name assigned at the top level of a file is exported by a module. By default, the same holds for classes: data hiding is a convention, and clients may fetch or change any class or instance attribute they like. In fact, attributes are all “public” and “virtual” in C++ terms; they’re all accessible everywhere and all looked up dynamically at runtime.[1]

That’s still true today. However, Python also includes the notion of name “mangling” (i.e., expansion), to localize some names in classes. This is sometimes misleadingly called private attributes—really, it’s just a way to localize a name to the class that created it, and does not prevent access by code outside the class. That is, this feature is mostly intended to avoid namespace collisions in instances, not to restrict access to names in general.

Pseudo-private names are an advanced feature, entirely optional, and probably won’t be very useful until you start writing large class hierarchies in multi-programmer projects. But because you may see this feature in other people’s code, you need to be somewhat aware of it even if you don’t use it in your own.

Name Mangling Overview

Here’s how name mangling works. Names inside a class statement that start with two underscores (and don’t end with two underscores) are automatically expanded to include the name of the enclosing class. For instance, a name like __X within a class named Spam is changed to _Spam__X automatically: a single underscore, the enclosing class’s name, and the rest of the original name. Because the modified name is prefixed with the name of the enclosing class, it’s somewhat unique; it won’t clash with similar names created by other classes in a hierarchy.

Name mangling happens only in class statements and only for names you write with two leading underscores. Within a class, though, it happens to every name preceded with double underscores wherever they appear. This includes both method names and instance attributes. For example, an instance attribute reference self.__X is transformed to self._Spam__X. Since more than one class may add attributes to an instance, this mangling helps avoid clashes; but we need to move on to an example to see how.
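Before turning to the motivation, here is a small sketch of the expansion rules themselves. The Spam class body and its attribute names are hypothetical, chosen only to exercise each rule; the code is written to run on current Python releases.

```python
class Spam(object):
    __X = 1                         # Class attribute: stored as _Spam__X

    def __init__(self):
        self.__Y = 2                # Instance attribute: stored as _Spam__Y
        self.__Z__ = 3              # Ends with two underscores: not mangled

    def __hidden(self):             # Method name: stored as _Spam__hidden
        return self.__X + self.__Y  # References are expanded too

    def total(self):
        return self.__hidden()

s = Spam()
names = sorted(s.__dict__)
print(names)                        # Expanded and unexpanded instance names
print('_Spam__X' in Spam.__dict__)
print('_Spam__hidden' in Spam.__dict__)
print(s.total())
```

Inside the class, the code still reads and writes the short names; the expansion is applied automatically to every such name in the class statement.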

Why Use Pseudo-Private Attributes?

The problem that the pseudo-private attribute feature is meant to alleviate has to do with the way instance attributes are stored. In Python, all instance attributes wind up in the single instance object at the bottom of the class tree. This is very different from the C++ model, where each class gets its own space for data members it defines.

Within a class method in Python, whenever a method assigns to a self attribute (e.g., self.attr=value), it changes or creates an attribute in the instance (inheritance search only happens on reference, not assignment). Because this is true even if multiple classes in a hierarchy assign to the same attribute, collisions are possible.

For example, suppose that when a programmer codes a class, she assumes that she owns the attribute name X in the instance. In this class’s methods, the name is set and later fetched:

class C1:
    def meth1(self): self.X = 88         # Assume X is mine.
    def meth2(self): print self.X

Suppose further that another programmer, working in isolation, makes the same assumption in a class that he codes:

class C2:
    def metha(self): self.X = 99         # Me too
    def methb(self): print self.X

Both of these classes work by themselves. The problem arises if these two classes are ever mixed together in the same class tree:

class C3(C1, C2): ...
I = C3(  )                                 # Only 1 X in I!

Now, the value that each class will get back when it says self.X depends on which class assigned it last. Because all assignments to self.X refer to the same single instance, there is only one X attribute—I.X, no matter how many classes use that attribute name. To guarantee that an attribute belongs to the class that uses it, prefix the name with double underscores everywhere it is used in the class, as in this file, private.py:

class C1:
    def meth1(self): self.__X = 88       # Now X is mine.
    def meth2(self): print self.__X      # Becomes _C1__X in I

class C2:
    def metha(self): self.__X = 99       # Me too
    def methb(self): print self.__X      # Becomes _C2__X in I

class C3(C1, C2): pass
I = C3(  )                                 # Two X names in I

I.meth1(  ); I.metha(  )
print I.__dict__
I.meth2(  ); I.methb(  )

When thus prefixed, the X attributes are expanded to include the name of the class, before being added to the instance. If you run a dir call on I or inspect its namespace dictionary after the attributes have been assigned, you see the expanded names: _C1__X and _C2__X, but not X. Because the expansion makes the names unique within the instance, the class coders can assume they truly own any names that they prefix with two underscores:

% python private.py
{'_C2__X': 99, '_C1__X': 88}
88
99

This trick can avoid potential name collisions in the instance, but note that it is not true privacy at all. If you know the name of the enclosing class, you can still access these attributes anywhere you have a reference to the instance, by using the fully expanded name (e.g., I._C1__X=77). On the other hand, this feature makes it less likely that you will accidentally step on a class’s names.
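The following sketch shows both halves of that point: the expanded name is reachable from outside the class, while the unexpanded name is not, because no mangling happens outside a class statement. It reuses the shape of C1 but returns values instead of printing them, which is our change for illustration.

```python
class C1(object):
    def meth1(self): self.__X = 88       # Stored as _C1__X
    def meth2(self): return self.__X     # Fetches _C1__X

I = C1()
I.meth1()
I._C1__X = 77                            # Expanded name works from outside
print(I.meth2())                         # The class sees the changed value

try:
    I.__X                                # No mangling outside the class
except AttributeError:
    found = False
else:
    found = True
print(found)
```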

We should note that this feature tends to become more useful for larger, multi-programmer projects, and then only for selected names. That is, don’t clutter your code unnecessarily; only use this feature for names that truly need to be controlled by a single class. For simpler programs, it’s probably overkill.

“New Style” Classes in Python 2.2

In Release 2.2, Python introduced a new flavor of classes, known as “new style” classes; the classes covered so far in this part of the book are known as “classic classes” when comparing them to the new kind.

New style classes are only slightly different from classic classes, and the ways in which they differ are completely irrelevant to the vast majority of Python users. Moreover, the classic class model, which has been with Python for over a decade, still works exactly as we have described previously.

New style classes are almost completely backward-compatible with classic classes, in both syntax and behavior; they mostly just add a few advanced new features. However, because they modify one special case of inheritance, they had to be introduced as a distinct tool, so as to avoid impacting any existing code that depends on the prior behavior.

New style classes are coded with all the normal class syntax we have studied. The chief coding difference is that you subclass from a built-in type (e.g., list) to produce a new style class. A new built-in name, object, is provided to serve as a superclass for new style classes if no other built-in type is appropriate to use:

class newstyle(object):
    ...normal code...

More generally, any class derived from object or other built-in type is automatically treated as a new style class. By derived, we mean that this includes subclasses of object, subclasses of subclasses of object, and so on, as long as a built-in is somewhere in the superclass tree. Classes not derived from built-ins are considered classic.
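One way to check the derivation rule is to ask whether a class is a subclass of object. The class names below are hypothetical, and the sketch is written to run on current Python releases, where every class passes this test; we make no claim here about what a classic class prints.

```python
class NewStyle(object):             # Derived from object directly
    pass

class Stack(list):                  # Derived from a built-in type
    pass

class Deeper(Stack):                # A built-in is further up the tree
    pass

for cls in (NewStyle, Stack, Deeper):
    print(cls.__name__ + ' ' + str(issubclass(cls, object)))

print(type(NewStyle()) is NewStyle) # An instance's type is its class
```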

Diamond Inheritance Change

Perhaps the most visible change in new style classes is their slightly different treatment of inheritance for the so-called diamond pattern of multiple inheritance trees—where more than one superclass leads to the same higher superclass further above. The diamond pattern is an advanced design concept, which we have not even discussed for normal classes.

In short, with classic classes inheritance search is strictly depth first, and then left to right—Python climbs all the way to the top before it begins to back up and look to the right in the tree. In new style classes, the search is more breadth-first in such cases—Python chooses a closer superclass to the right before ascending all the way to the common superclass at the top. Because of this change, lower superclasses can overload attributes of higher superclasses, regardless of the sort of multiple inheritance trees they are mixed into.

Diamond inheritance example

To illustrate, consider this simplistic incarnation of the diamond inheritance pattern for classic classes:

>>> class A:      attr = 1             # Classic
>>> class B(A):   pass
>>> class C(A):   attr = 2
>>> class D(B,C): pass                 # Tries A before C
>>> x = D(  )
>>> x.attr
1

The attribute here was found in superclass A, because inheritance climbs as high as it can before backing up and moving right—it searches D, B, A, then C (and stops when attr is found in A above B). With the new style classes derived from a built-in like object, though, inheritance looks in C first (to the right) before A (above B)—it searches D, B, C, then A (and in this case stops in C):

>>> class A(object): attr = 1          # New style
>>> class B(A):      pass
>>> class C(A):      attr = 2
>>> class D(B,C):    pass              # Tries C before A
>>> x = D(  )
>>> x.attr
2

This change in inheritance is based upon the assumption that if you mix in C lower in the tree, you probably intend to grab its attributes in preference to A’s. It also assumes that C probably meant to override A’s attribute always: it does when used standalone, but not when mixed into a diamond with classic classes. You might not know that C may be mixed in like this at the time you code it.
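For new style classes, the search order can also be inspected directly through the class's __mro__ attribute, rather than inferred from results. The sketch below repeats the new style diamond and lists the order; the order variable is ours.

```python
class A(object): attr = 1
class B(A):      pass
class C(A):      attr = 2
class D(B, C):   pass

order = [cls.__name__ for cls in D.__mro__]
print(order)                        # Classes in the order they are searched
print(D().attr)                     # Found in C, before A is reached
```

The listing confirms that C is visited before A, with object at the very end of the search.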

Explicit conflict resolution

Of course, the problem with assumptions is that they assume things. If this search order deviation seems too subtle to remember, or if you want more control over the search process, you can always force the selection of an attribute from anywhere in the tree by assigning or otherwise naming the one you want at the place where classes are mixed together:

>>> class A:      attr = 1            # Classic
>>> class B(A):   pass
>>> class C(A):   attr = 2
>>> class D(B,C): attr = C.attr       # Choose C, to the right.
>>> x = D(  )
>>> x.attr                            # Works like new style
2

Here, a tree of classic classes is emulating the search order of new style classes; the assignment to the attribute in D picks the version in C, thereby subverting the normal inheritance search path (D.attr will be lowest in the tree). New style classes can similarly emulate classic classes, by choosing the attribute above at the place where the classes are mixed together:

>>> class A(object): attr = 1         # New style
>>> class B(A):      pass
>>> class C(A):      attr = 2
>>> class D(B,C):    attr = B.attr    # Choose A.attr, above.
>>> x = D(  )
>>> x.attr                            # Works like classic
1

If you are willing to always resolve conflicts like this, you can largely ignore the search order difference, and not rely on assumptions about what you meant when you coded your classes. Naturally, the attributes we pick this way can also be method functions—methods are normal, assignable objects:

>>> class A:
...     def meth(s): print 'A.meth'
>>> class C(A):
...     def meth(s): print 'C.meth'
>>> class B(A): 
...     pass

>>> class D(B,C): pass                 # Use default search order.
>>> x = D(  )                          # Will vary per class type
>>> x.meth(  )                         # Defaults to classic order
A.meth

>>> class D(B,C): meth = C.meth      # Pick C's method: new style.
>>> x = D(  )
>>> x.meth(  )
C.meth

>>> class D(B,C): meth = B.meth      # Pick B's method: classic.
>>> x = D(  )
>>> x.meth(  )
A.meth

Here, we select methods by assignments to same names lower in the tree. We might also simply call the desired class explicitly; in practice, this pattern might be more common, especially for things like constructors:

class D(B,C):
    def meth(self):                  # Redefine lower.
        ...
        C.meth(self)                 # Pick C's method by calling.

Such selections by assignment or call at mix-in points can effectively insulate your code from this difference in class flavors. By explicitly resolving the conflict this way, your code won’t vary per Python version in the future (apart from perhaps needing to derive classes from a built-in for the new style).[2]
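Here is a complete, runnable sketch of selection by explicit call. The classes D1 and D2 are hypothetical, and the methods return strings instead of printing so that the results are easy to compare; the code is written for current Python releases.

```python
class A(object):
    def meth(self): return 'A.meth'

class B(A):
    pass

class C(A):
    def meth(self): return 'C.meth'

class D1(B, C):
    def meth(self):                         # Redefine lower
        return 'D1 -> ' + C.meth(self)      # Pick C's method by calling

class D2(B, C):
    def meth(self):
        return 'D2 -> ' + A.meth(self)      # Pick A's method by calling

print(D1().meth())
print(D2().meth())
```

Because each redefinition names the class it wants, the result does not depend on the inheritance search order at all.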

By default, the diamond pattern is searched differently in classic and new style classes, and this is a backward-incompatible change. However, keep in mind that this change only affects diamond pattern cases; new style class inheritance works unchanged for all other inheritance tree structures. Further, it’s not impossible that this entire issue may be of more theoretical than practical importance: since it wasn’t significant enough to change until 2.2, it seems unlikely to impact much Python code.

Other New Style Class Extensions

Beyond this change in the diamond inheritance pattern (which is itself too obscure to matter to most readers of this book), new style classes open up a handful of even more advanced possibilities. Here’s a brief look at each.

Static and class methods

It is possible to define methods within a class that can be called without an instance: static methods work roughly like simple instance-less functions inside a class, and class methods are passed a class instead of an instance. Special built-in functions must be called within the class to enable these method modes: staticmethod and classmethod. Because this is also a solution to a longstanding gotcha in Python, we’ll present these calls later in this chapter in Section 23.4. Note that the new static and class methods also work for classic classes in Python release 2.2.
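As a brief preview, the two built-ins are applied by rebinding a method name inside the class statement. The Counter class and its method names are hypothetical, and the sketch derives from object so that it also runs on later Python releases.

```python
class Counter(object):
    count = 0                               # Shared by all instances

    def __init__(self):
        Counter.count += 1

    def show():                             # No self argument
        return 'instances: %d' % Counter.count
    show = staticmethod(show)               # Callable without an instance

    def make(cls):                          # Receives the class itself
        return cls()
    make = classmethod(make)

class Sub(Counter):
    pass

a = Counter()
b = Sub.make()                              # cls is Sub here, so b is a Sub
print(Counter.show())                       # Called through the class
print(a.show())                             # Or through an instance
print(type(b).__name__)
```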

Instance slots

By assigning a list of string attribute names to a special __slots__ class attribute, it is possible for new style classes to limit the set of legal attributes that instances of the class will have. This special attribute is typically set by assigning to variable __slots__ at the top level of a class statement. Only those names in the __slots__ list can be assigned as instance attributes. However, like all names in Python, instance attribute names must still be assigned before they can be referenced, even if listed in __slots__. Here’s an example to illustrate:

>>> class limiter(object):
...     __slots__ = ['age', 'name', 'job']
...
>>> x = limiter(  )
>>> x.age                     # Must assign before use
AttributeError: age

>>> x.age = 40
>>> x.age
40
>>> x.ape = 1000              # Illegal: not in slots
AttributeError: 'limiter' object has no attribute 'ape'

This feature is envisioned as a way to catch “typo” errors (assignment to illegal attribute names not in __slots__ is detected) and as a possible optimization mechanism in the future. Slots are something of a break with Python’s dynamic nature, which dictates that any name may be created by assignment. They also have additional constraints and implications that are far too complex for us to discuss here (e.g., some instances with slots may not have an attribute dictionary __dict__); see Python 2.2 release documents for details.
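Two of those implications are easy to demonstrate in a short sketch: an instance of a class with slots has no __dict__, and a subclass that does not declare __slots__ gets one back. The Limiter and Relaxed names are hypothetical, and the code is written for current Python releases.

```python
class Limiter(object):
    __slots__ = ['age', 'name', 'job']

x = Limiter()
x.age = 40
print(hasattr(x, '__dict__'))       # No attribute dictionary with slots

try:
    x.ape = 1000                    # Not listed in __slots__
except AttributeError:
    rejected = True
else:
    rejected = False
print(rejected)

class Relaxed(Limiter):             # No __slots__ here: __dict__ returns
    pass

y = Relaxed()
y.age = 41                          # Inherited slot still works
y.ape = 1000                        # And arbitrary names are legal again
print(hasattr(y, '__dict__'))
```

In other words, the restriction applies only as long as every class in the tree declares its slots.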

Class properties

A mechanism known as properties provides another way for new style classes to define automatically called methods for access or assignment to instance attributes. This feature is an alternative for many current uses of the __getattr__ and __setattr__ overloading methods studied in Chapter 21. Properties have a similar effect to these two methods, but incur an extra method call only for access to names that require dynamic computation. Properties (and slots) are based on a new notion of attribute descriptors, which is too advanced for us to cover here.

In short, properties are a type of object assigned to class attribute names. They are generated by calling a property built-in with three methods (handlers for get, set, and delete operations), as well as a docstring; if any argument is passed as None or omitted, it is not supported. Properties are typically assigned at the top level of a class statement (e.g., name=property(...)). When thus assigned, accesses to the class attribute itself (e.g., obj.name) are automatically routed to one of the accessor methods passed into the property. For example, the __getattr__ method allows classes to intercept undefined attribute references:

>>> class classic:
...     def __getattr__(self, name):
...         if name == 'age':
...             return 40
...         else:
...             raise AttributeError
...        
>>> x = classic(  )
>>> x.age                                    # Runs __getattr__
40
>>> x.name                                   # Runs __getattr__
AttributeError

Here is the same example, coded with properties instead:

>>> class newprops(object):
...     def getage(self):
...         return 40
...     age = property(getage, None, None, None)      # get,set,del,docs
... 
>>> x = newprops(  )
>>> x.age                                    # Runs getage
40
>>> x.name                                   # Normal fetch
AttributeError: 'newprops' object has no attribute 'name'

For some coding tasks, properties can be both less complex and quicker to run than the traditional techniques. For example, when we add attribute assignment support, properties become more attractive—there’s less code to type, and you might not incur an extra method call for assignments to attributes you don’t wish to compute dynamically:

>>> class newprops(object):
...     def getage(self):
...         return 40
...     def setage(self, value):
...         print 'set age:', value
...         self._age = value
...     age = property(getage, setage, None, None)
...
>>> x = newprops(  )
>>> x.age                     # Runs getage
40
>>> x.age = 42                # Runs setage 
set age: 42
>>> x._age                    # Normal fetch; no getage call
42
>>> x.job = 'trainer'         # Normal assign; no setage call
>>> x.job                     # Normal fetch; no getage call
'trainer'

The equivalent classic class might trigger extra method calls, and may need to route attribute assignments through the attribute dictionary to avoid loops:

>>> class classic:
...     def __getattr__(self, name):            # On undefined reference
...         if name == 'age':
...             return 40
...         else:
...             raise AttributeError
...     def __setattr__(self, name, value):     # On all assignments
...         print 'set:', name, value
...         if name == 'age':
...             self.__dict__['_age'] = value
...         else:
...             self.__dict__[name] = value
...
>>> x = classic(  )
>>> x.age                     # Runs __getattr__
40
>>> x.age = 41                # Runs __setattr__
set: age 41
>>> x._age                    # Defined: no __getattr__ call
41
>>> x.job = 'trainer'         # Runs __setattr__ again
set: job trainer
>>> x.job                     # Defined: no __getattr__ call
'trainer'

Properties seem like a win for this simple example. However, some applications of __getattr__ and __setattr__ may still require more dynamic or generic interfaces than properties directly provide. For example, in many cases, the set of attributes to be supported cannot be determined when the class is coded, and may not even exist in any tangible form at all (e.g., when delegating arbitrary method references to a wrapped/embedded object generically). In such cases, a generic __getattr__ or __setattr__ attribute handler with a passed-in attribute name may be an advantage. Because such generic handlers can also handle simpler cases, properties are largely an optional extension.

New __getattribute__ overload method

The __getattribute__ method, available for new style classes only, allows a class to intercept all attribute references, not just undefined references like __getattr__. It is also substantially trickier to use than either __getattr__ or __setattr__ (it is prone to loops). We’ll defer to Python’s standard documentation for more details.

Besides all these feature additions, new style classes integrate with the notion of subclassable types that we met earlier in this chapter; subclassable types and new style classes were both introduced in conjunction with a merging of the type/class dichotomy in 2.2 and beyond.

Because new style class features are all advanced topics, we are going to skip further details in this introductory text. Please see Python 2.2 release documentation and the language reference for more information.

It is not impossible that new style classes might be adopted as the single class model in future Python releases. If they are, you might simply need to make sure your top-level superclasses are derived from object or other built-in type name (if even that will be required at all); everything else we’ve studied in this part of the book should continue to work as described.

Class Gotchas

Most class issues can usually be boiled down to namespace issues (which makes sense, given that classes are just namespaces with a few extra tricks). Some of the topics in this section are more like case studies of advanced class usage than problems, and one or two of these have been eased by recent Python releases.

Changing Class Attributes Can Have Side Effects

Theoretically speaking, classes (and class instances) are all mutable objects. Just as with built-in lists and dictionaries, they can be changed in place, by assigning to their attributes. And like lists and dictionaries, this also means that changing a class or instance object may impact multiple references to it.

That’s usually what we want (and is how objects change their state in general), but this becomes especially critical to know when changing class attributes. Because all instances generated from a class share the class’s namespace, any changes at the class level are reflected in all instances, unless they have their own versions of changed class attributes.

Since classes, modules, and instances are all just objects with attribute namespaces, you can normally change their attributes at runtime by assignments. Consider the following class; inside the class body, the assignment to name a generates an attribute X.a, which lives in the class object at runtime and will be inherited by all of X’s instances:

>>> class X:
...     a = 1        # Class attribute
...
>>> I = X(  )
>>> I.a              # Inherited by instance
1
>>> X.a
1

So far so good—this is the normal case. But notice what happens when we change the class attribute dynamically outside the class statement: it also changes the attribute in every object that inherits from the class. Moreover, new instances created from the class during this session or program get the dynamically set value, regardless of what the class’s source code says:

>>> X.a = 2          # May change more than X
>>> I.a              # I changes too.
2
>>> J = X(  )        # J inherits from X's runtime values
>>> J.a              # (but assigning to J.a changes a in J, not X or I).
2
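The exception noted earlier, instances that have their own versions of a class attribute, can be sketched as follows. The variable names before and after are ours, and the class derives from object so the code also runs on later Python releases.

```python
class X(object):
    a = 1                           # Class attribute

I = X()
J = X()
J.a = 99                            # Creates a in J only, not in X or I
X.a = 2                             # Changes the class attribute

print(I.a)                          # I has no a of its own: sees the change
before = J.a
print(before)                       # J's own version hides the class's

del J.a                             # Remove J's version
after = J.a
print(after)                        # Now J inherits from X again
```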

Is this a useful feature or a dangerous trap? You be the judge, but you can actually get work done by changing class attributes, without ever making a single instance. This technique can simulate “records” or “structs” in other languages. As a refresher on this technique, consider the following unusual but legal Python program:

class X: pass                          # Make a few attribute namespaces.
class Y: pass

X.a = 1                                # Use class attributes as variables.
X.b = 2                                # No instances anywhere to be found
X.c = 3
Y.a = X.a + X.b + X.c

for X.i in range(Y.a): print X.i       # Prints 0..5

Here, classes X and Y work like “file-less” modules—namespaces for storing variables we don’t want to clash. This is a perfectly legal Python programming trick, but is less appropriate when applied to classes written by others; you can’t always be sure that class attributes you change aren’t critical to the class’s internal behavior. If you’re out to simulate a C “struct,” you may be better off changing instances than classes, since only one object is affected:

class Record: pass
X = Record(  )
X.name = 'bob'
X.job  = 'Pizza maker'
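
To make the difference between changing a class and changing an instance concrete, here is a minimal sketch. The names C, I, and J are our own, chosen only for illustration, and the code uses the single-argument parenthesized print form so that it should also run on newer interpreters. Assigning through the class changes what every instance without its own version sees, while assigning through an instance creates an attribute in that one object only:

```python
class C:
    a = 1                # Class attribute, shared by all instances

I = C()
J = C()

C.a = 2                  # Class-level change: seen through both I and J
J.a = 3                  # Instance-level change: creates an a in J only

print(I.a)               # 2: I still inherits a from the class
print(J.a)               # 3: J's own attribute hides C.a
print(C.a)               # 2: the class was not touched by J.a = 3

del J.a                  # Remove J's own version...
print(J.a)               # 2: ...and inheritance finds C.a again
```

The final del shows that the instance attribute merely hid the class attribute; nothing at the class level was ever overwritten.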

Multiple Inheritance: Order Matters

This may be obvious, but is worth underscoring: if you use multiple inheritance, the order in which superclasses are listed in a class statement header can be critical. Python always searches your superclasses left to right, according to the order in the class header line.

For instance, in the multiple inheritance example we saw in Chapter 22, suppose that Super implemented a __repr__ method too; would we then want to inherit Lister’s or Super’s? We would get it from whichever class is listed first in Sub’s class header, since inheritance searches left to right. Presumably, we would list Lister first, since its whole purpose is its custom __repr__:

class Lister:
    def __repr__(self): ...

class Super:
    def __repr__(self): ...

class Sub(Lister, Super):  # Get Lister's __repr__ by listing it first.

But now suppose Super and Lister have their own versions of other same-named attributes, too. If we want one name from Super and another from Lister, no order in the class header will help—we will have to override inheritance by manually assigning to the attribute name in the Sub class:

class Lister:
    def __repr__(self): ...
    def other(self): ...

class Super:
    def __repr__(self): ...
    def other(self): ...

class Sub(Lister, Super):  # Get Lister's __repr__ by listing it first
    other = Super.other    # but explicitly pick Super's version of other.
    def __init__(self): 
        ... 

x = Sub(  )                  # Inheritance searches Sub before Super/Lister.

Here, the assignment to other within the Sub class creates Sub.other—a reference back to the Super.other object. Because it is lower in the tree, Sub.other effectively hides Lister.other, the attribute that inheritance would normally find. Similarly, if we listed Super first in the class header to pick up its other, we would then need to select Lister’s method:

class Sub(Super, Lister):          # Get Super's other by order.
    __repr__ = Lister.__repr__  # Explicitly pick Lister.__repr__.

Multiple inheritance is an advanced tool. Even if you understood the last paragraph, it’s still a good idea to use it sparingly and carefully. Otherwise, the meaning of a name may depend on the order in which classes are mixed in an arbitrarily far removed subclass. For another example of the technique shown here in action, see the discussion of explicit conflict resolution in Section 23.3 earlier in this chapter.
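
The explicit selection technique can be sketched in runnable form as follows. The method bodies here are stand-ins of our own (the listings above elide them), and they simply return strings that identify which class's version was found:

```python
class Lister:
    def __repr__(self):
        return 'Lister.__repr__'       # Stand-in body, for illustration only
    def other(self):
        return 'Lister.other'

class Super:
    def __repr__(self):
        return 'Super.__repr__'
    def other(self):
        return 'Super.other'

class Sub(Lister, Super):              # Lister's __repr__ wins by order
    other = Super.other                # Explicitly pick Super's other

x = Sub()
print(repr(x))                         # Found in Lister, the leftmost superclass
print(x.other())                       # Found in Sub, which references Super.other
```

Because Sub.other is found in Sub itself, the search never reaches Lister.other, regardless of the order in the class header.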

As a rule of thumb, multiple inheritance works best when your mix-in classes are as self-contained as possible. Since they may be used in a variety of contexts, they should not make assumptions about the names related to other classes in a tree. Moreover, the pseudo-private attributes feature we studied earlier can help by localizing names that a class relies on owning, and limiting the names that your mix-in classes add to the mix. In the example, if Lister only means to export its custom __repr__, it could name its other method __other to avoid clashing with other classes.
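
As a rough illustration of the pseudo-private idea, the sketch below renames Lister's helper to __other. The method bodies are hypothetical stand-ins of ours; the point is only that Python stores the name as _Lister__other, so it no longer competes with a public other defined elsewhere in the tree:

```python
class Lister:
    def __repr__(self):
        return 'Lister: ' + self.__other()   # Expanded to self._Lister__other
    def __other(self):                       # Stored as _Lister__other
        return 'private helper'

class Super:
    def other(self):
        return 'Super.other'

class Sub(Lister, Super):                    # No explicit selection needed now
    pass

x = Sub()
print(repr(x))                               # Lister's __repr__ uses its own helper
print(x.other())                             # Only Super exports a public other
```

With the helper localized this way, the order of superclasses in Sub's header matters only for __repr__.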

Class Function Attributes Are Special: Static Methods

This gotcha has been fixed by a new optional feature in Python 2.2, static and class methods, but we retain it here for readers with older Python releases, and because it gives us as good a reason as any for presenting the new static and class methods advanced feature.

In Python releases prior to 2.2, class method functions can never be called without an instance. (In Python 2.2 and later, this is also the default behavior, but it can be modified if necessary.) In the prior chapter, we talked about unbound methods: when we fetch a method function by qualifying a class (instead of an instance), we get an unbound method object. Even though they are defined with a def statement, unbound method objects are not simple functions; they cannot be called without an instance.

For example, suppose we want to use class attributes to count how many instances are generated from a class (file spam.py, shown below). Remember, class attributes are shared by all instances, so we can store the counter in the class object itself:

class Spam:
    numInstances = 0
    def __init__(self):
        Spam.numInstances = Spam.numInstances + 1
    def printNumInstances(  ):
        print "Number of instances created: ", Spam.numInstances

But this won’t work: the printNumInstances method still expects an instance to be passed in when called, because the function is associated with a class (even though there are no arguments in the def header):

>>> from spam import *
>>> a = Spam(  )
>>> b = Spam(  )
>>> c = Spam(  )
>>> Spam.printNumInstances(  )
Traceback (innermost last):
  File "<stdin>", line 1, in ?
TypeError: unbound method must be called with class instance 1st argument

Solution (prior to 2.2, and in 2.2 normally)

Don’t expect this: unbound instance methods aren’t exactly the same as simple functions. This is mostly a knowledge issue, but if you want to call functions that access class members without an instance, probably the best advice is to just make them simple functions, not class methods. This way, an instance isn’t expected in the call:

def printNumInstances(  ):
    print "Number of instances created: ", Spam.numInstances

class Spam:
    numInstances = 0
    def __init__(self):
        Spam.numInstances = Spam.numInstances + 1

>>> import spam
>>> a = spam.Spam(  )
>>> b = spam.Spam(  )
>>> c = spam.Spam(  )
>>> spam.printNumInstances(  )
Number of instances created:  3
>>> spam.Spam.numInstances
3

We can also make this work by calling through an instance, as usual, although this can be inconvenient if making an instance changes the class data:

class Spam:
    numInstances = 0
    def __init__(self):
        Spam.numInstances = Spam.numInstances + 1
    def printNumInstances(self):
        print "Number of instances created: ", Spam.numInstances

>>> from spam import Spam
>>> a, b, c = Spam(  ), Spam(  ), Spam(  )
>>> a.printNumInstances(  )
Number of instances created:  3
>>> b.printNumInstances(  )
Number of instances created:  3
>>> Spam(  ).printNumInstances(  )
Number of instances created:  4

Some language theorists claim that this means Python doesn’t have class methods, only instance methods. We suspect they really mean Python classes don’t work the same as in some other language. Python really has bound and unbound method objects, with well-defined semantics; qualifying a class gets you an unbound method, which is a special kind of function. Python does have class attributes, but functions in classes expect an instance argument.

Moreover, since Python already provides modules as a namespace partitioning tool, there’s usually no need to package functions in classes unless they implement object behavior. Simple functions within modules usually do most of what instance-less class methods could. For example, in the first code sample in this section, printNumInstances is already associated with the class, because it lives in the same module. The only lost functionality is that the function name has a broader scope—the entire module, rather than the class.

Static and class methods in Python 2.2

As of Python 2.2, you can code classes with both static and class methods, neither of which requires an instance to be present when it is invoked. To designate such methods, classes call the built-in functions staticmethod and classmethod, as hinted in the earlier discussion of new-style classes. For example:

class Multi:
    def imeth(self, x):          # Normal instance method
        print self, x
    def smeth(x):                # Static: no instance passed
        print x
    def cmeth(cls, x):           # Class: gets class, not instance
        print cls, x
    smeth = staticmethod(smeth)  # Make smeth a static method.
    cmeth = classmethod(cmeth)   # Make cmeth a class method.

Notice how the last two assignments in this code simply reassign the method names smeth and cmeth. Attributes are created and changed by any assignment in a class statement, so these final assignments overwrite the assignments made earlier by the defs.

Technically, Python 2.2 supports three kinds of class-related methods: instance, static, and class. Instance methods are the normal (and default) case that we’ve seen in this book. With instance methods, you must always call the method with an instance object. When you call through an instance, Python passes the instance to the first (leftmost) argument automatically; when you call through the class, you pass along the instance manually:

>>> obj = Multi(  )              # Make an instance
>>> obj.imeth(1)                 # Normal call, through instance
<__main__.Multi instance...> 1
>>> Multi.imeth(obj, 2)          # Normal call, through class
<__main__.Multi instance...> 2

By contrast, static methods are called without an instance argument; their names are local to the scope of the class they are defined in, and may be looked up by inheritance; mostly, they work like simple functions that happen to be coded inside a class:

>>> Multi.smeth(3)             # Static call, through class
3
>>> obj.smeth(4)               # Static call, through instance
4

Class methods are similar, but Python automatically passes the class (not an instance) in to the method’s first (leftmost) argument:

>>> Multi.cmeth(5)             # Class call, through class
__main__.Multi 5
>>> obj.cmeth(6)               # Class call, through instance
__main__.Multi 6

Static and class methods are new and advanced features of the language. They have highly specialized roles that we don’t have space to document here. Static methods are commonly used in conjunction with class attributes to manage information that spans all instances generated from the class.

For example, to keep track of the number of instances generated from a class (as in the earlier example), you may use static methods to manage a counter attached as a class attribute. Since such a count has nothing to do with any particular instance, it is inconvenient to have to access methods that process it through an instance (especially since making an instance to access the counter may change the counter). Moreover, static methods’ proximity to the class provides a more natural solution than coding class-oriented functions outside the class. Here is the static method equivalent of this section’s original example:

class Spam:
    numInstances = 0
    def __init__(self):
        Spam.numInstances += 1
    def printNumInstances(  ):
        print "Number of instances:", Spam.numInstances
    printNumInstances = staticmethod(printNumInstances)

>>> a = Spam(  )
>>> b = Spam(  )
>>> c = Spam(  )
>>> Spam.printNumInstances(  )
Number of instances: 3
>>> a.printNumInstances(  )
Number of instances: 3

Compared to simply moving printNumInstances outside the class as prescribed earlier, this version requires an extra staticmethod call, but localizes the function name in the class scope, and moves the function code closer to where it is used (inside the class statement). You should judge for yourself whether this is a net improvement or not.
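
For comparison, the same counter could also be coded with a class method. This is only a sketch of an alternative, not something the example above requires; it reuses the names from that example and the single-argument parenthesized print form so that it should also run on newer interpreters. Because Python passes in the class automatically, the method can reach the counter through its cls argument instead of naming Spam directly:

```python
class Spam:
    numInstances = 0
    def __init__(self):
        Spam.numInstances += 1
    def printNumInstances(cls):              # Receives the class, not an instance
        print("Number of instances: %s" % cls.numInstances)
    printNumInstances = classmethod(printNumInstances)

a, b, c = Spam(), Spam(), Spam()
Spam.printNumInstances()                     # Call through the class
a.printNumInstances()                        # Call through an instance: same class passed
```

Whether called through the class or an instance, the method receives Spam as cls, so no instance has to be created just to read the counter.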

Methods, Classes, and Nested Scopes

This gotcha went away in Python 2.2, with the introduction of nested function scopes, but we retain it here for historical perspective, for readers working with older Python releases, and because it demonstrates what happens to the new nested function scope rules when a class is a layer of the nesting.

Classes introduce a local scope just as functions do, so the same sorts of scope behavior can happen in a class statement body. Moreover, methods are further nested functions, so the same issues apply. Confusion seems to be especially common when classes are nested.

In the following example, file nester.py, the generate function returns an instance of the nested Spam class. Within its code, the class name Spam is assigned in the generate function’s local scope. But within the class’s method function, the class name Spam is not visible in Python prior to 2.2, where method has access only to its own local scope, the module surrounding generate, and built-in names:

def generate(  ):
    class Spam:
        count = 1
        def method(self):        # Name Spam not visible:
            print Spam.count     # not local(def),global(module), built-in
    return Spam(  )

generate(  ).method(  )

C:\python\examples> python nester.py
Traceback (innermost last):
  File "nester.py", line 8, in ?
    generate(  ).method(  )
  File "nester.py", line 5, in method
    print Spam.count     # not local(def),global(module), built-in
NameError: Spam

As a solution, either upgrade to Python 2.2, or don’t nest code this way. This example works in Python 2.2 and later, because the local scopes of all enclosing function defs are automatically visible to nested defs, including nested method defs, as in this example.

Note that even in 2.2, method defs cannot see the local scope of the enclosing class, only the local scope of enclosing defs. That’s why methods must go through the self instance or the class name, to reference methods and other attributes defined in the enclosing class statement. For example, code in the method must use self.count or Spam.count, not just count.
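
The class-scope rule can be checked with a small sketch. The method names bad and good are ours, added only for illustration, and the code assumes an interpreter with nested function scopes (2.2 or later). A simple reference to count inside a method skips the class statement's local scope and fails, while a qualified reference through self succeeds:

```python
def generate():
    class Spam:
        count = 1
        def bad(self):
            return count             # Class scope is skipped: name not found
        def good(self):
            return self.count        # Qualify through the instance instead
    return Spam()

obj = generate()
print(obj.good())                    # Finds count by inheritance from the class

try:
    obj.bad()
except NameError:
    print('NameError: count is not visible as a simple name in a method')
```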

Prior to release 2.2, there are a variety of ways to get the nester.py example to work. One of the simplest is to move the name Spam out to the enclosing module’s scope with global declarations; since method sees global names in the enclosing module, references work:

def generate(  ):
    global Spam                 # Force Spam to module scope.
    class Spam:
        count = 1
        def method(self):
            print Spam.count        # Works: in global (enclosing module)
    return Spam(  )

generate(  ).method(  )             # Prints 1

Perhaps better, we can also restructure the code such that class Spam is defined at the top level of the module by virtue of its nesting level, rather than global declarations. Both the nested method function and the top level generate find Spam in their global scopes:

def generate(  ):
    return Spam(  )

class Spam:                    # Define at module top-level.
    count = 1
    def method(self):
        print Spam.count       # Works: in global (enclosing module)

generate(  ).method(  )

In fact, this is what we prescribe for all Python releases—your code tends to be simpler in general if you avoid nesting of classes and functions.

If you want to get complicated and tricky, you can also get rid of the Spam reference in method altogether, by using the special __class__ attribute, which returns an instance’s class object:

def generate(  ):
    class Spam:
        count = 1
        def method(self):
            print self.__class__.count       # Works: qualify to get class
    return Spam(  )

generate( ).method( )

Overwrapping-itis

Sometimes, the abstraction potential of OOP can be abused to the point of making code difficult to understand. If your classes are layered too deeply, it can make code obscure; you may have to search through many classes to discover what an operation does. For example, one of your authors once worked in a C++ shop with thousands of classes (some generated by machine), and up to 15 levels of inheritance; deciphering a method call in such a complex system was often a monumental task. Multiple classes had to be consulted for even the most basic of operations.

The most general rule of thumb applies here too: don’t make things complicated unless they truly must be. Wrapping your code in multiple layers of classes to the point of incomprehensibility is always a bad idea. Abstraction is the basis of polymorphism and encapsulation, and can be a very effective tool when used well. But you’ll simplify debugging and aid maintainability, if you make your class interfaces intuitive, avoid making code overly abstract, and keep your class hierarchies short and flat unless there is a good reason to do otherwise.

Part VI Exercises

These exercises ask you to write a few classes and experiment with some existing code. Of course, the problem with existing code is that it must be existing. To work with the set class in exercise 5, either pull down the class source code off the Internet (see Preface) or type it up by hand (it’s fairly small). These programs are starting to get more sophisticated, so be sure to check the solutions at the end of the book for pointers.

See Section B.6 for the solutions.

  1. Inheritance. Write a class called Adder that exports a method add(self, x, y) that prints a “Not Implemented” message. Then define two subclasses of Adder that implement the add method:

    ListAdder

    With an add method that returns the concatenation of its two list arguments

    DictAdder

    With an add method that returns a new dictionary containing the items in both of its dictionary arguments (any definition of addition will do)

    Experiment by making instances of all three of your classes interactively and calling their add methods.

    Now, extend your Adder superclass to save an object in the instance with a constructor (e.g., assign self.data a list or a dictionary) and overload the + operator with an __add__ to automatically dispatch to your add methods (e.g., X+Y triggers X.add(X.data,Y)). Where is the best place to put the constructors and operator overload methods (i.e., in which classes)? What sorts of objects can you add to your class instances?

    In practice, you might find it easier to code your add methods to accept just one real argument (e.g., add(self,y)), and add that one argument to the instance’s current data (e.g., self.data+y). Does this make more sense than passing two arguments to add? Would you say this makes your classes more “object-oriented”?

  2. Operator overloading. Write a class called Mylist that shadows (“wraps”) a Python list: it should overload most list operators and operations including +, indexing, iteration, slicing, and list methods such as append and sort. See the Python reference manual for a list of all possible methods to support. Also, provide a constructor for your class that takes an existing list (or a Mylist instance) and copies its components into an instance member. Experiment with your class interactively. Things to explore:

    1. Why is copying the initial value important here?

    2. Can you use an empty slice (e.g., start[:]) to copy the initial value if it’s a Mylist instance?

    3. Is there a general way to route list method calls to the wrapped list?

    4. Can you add a Mylist and a regular list? How about a list and a Mylist instance?

    5. What type of object should operations like + and slicing return; how about indexing?

    6. If you are working with a more recent Python release (Version 2.2 or later), you may implement this sort of wrapper class either by embedding a real list in a standalone class, or by extending the built-in list type with a subclass. Which is easier and why?

  3. Subclassing. Make a subclass of Mylist from Exercise 2 called MylistSub, which extends Mylist to print a message to stdout before each overloaded operation is called and counts the number of calls. MylistSub should inherit basic method behavior from Mylist. Adding a sequence to a MylistSub should print a message, increment the counter for + calls, and perform the superclass’s method. Also, introduce a new method that displays the operation counters to stdout and experiment with your class interactively. Do your counters count calls per instance, or per class (for all instances of the class)? How would you program both of these? (Hint: it depends on which object the count members are assigned to: class members are shared by instances, self members are per-instance data.)

  4. Metaclass methods. Write a class called Meta with methods that intercept every attribute qualification (both fetches and assignments) and print a message with their arguments to stdout. Create a Meta instance and experiment with qualifying it interactively. What happens when you try to use the instance in expressions? Try adding, indexing, and slicing the instance of your class.

  5. Set objects. Experiment with the set class described in Section 23.1.1. Run commands to do the following sorts of operations:

    1. Create two sets of integers, and compute their intersection and union by using & and | operator expressions.

    2. Create a set from a string, and experiment with indexing your set; which methods in the class are called?

    3. Try iterating through the items in your string set using a for loop; which methods run this time?

    4. Try computing the intersection and union of your string set and a simple Python string; does it work?

    5. Now, extend your set by subclassing to handle arbitrarily many operands using a *args argument form (Hint: see the function versions of these algorithms in Chapter 13). Compute intersections and unions of multiple operands with your set subclass. How can you intersect three or more sets, given that & has only two sides?

    6. How would you go about emulating other list operations in the set class? (Hints: __add__ can catch concatenation, and __getattr__ can pass most list method calls off to the wrapped list.)

  6. Class tree links. In Section 21.5 in Chapter 21, and in Section 22.6 in Chapter 22, we mentioned that classes have a __bases__ attribute that returns a tuple of the class’s superclass objects (the ones in parentheses in the class header). Use __bases__ to extend the Lister mixin class (see Chapter 22), so that it prints the names of the immediate superclasses of the instance’s class. When you’re done, the first line of the string representation should look like this (your address may vary):

    <Instance of Sub(Super, Lister), address 7841200:

    How would you go about listing inherited class attributes too? (Hint: classes have a __dict__.) Try extending your Lister class to display all accessible superclasses and their attributes as well; see Chapter 21’s classtree.py example for hints on climbing class trees, and the Lister footnote about using dir and getattr in Python 2.2.

  7. Composition. Simulate a fast-food ordering scenario by defining four classes:

    Lunch

    A container and controller class

    Customer

    The actor that buys food

    Employee

    The actor that a customer orders from

    Food

    What the customer buys

    To get you started, here are the classes and methods you’ll be defining:

    class Lunch:
        def __init__(self)          # Make/embed Customer and Employee.
        def order(self, foodName)  # Start a Customer order simulation.
        def result(self)           # Ask the Customer what kind of Food it has.
    
    class Customer:
        def __init__(self)                         # Initialize my food to None.
        def placeOrder(self, foodName, employee)  # Place order with an Employee.
        def printFood(self)                       # Print the name of my food.
    
    class Employee:
        def takeOrder(self, foodName)       # Return a Food, with requested name.
    
    class Food:
        def __init__(self, name)         # Store food name.

    The order simulation works as follows:

    1. The Lunch class’s constructor should make and embed an instance of Customer and Employee, and export a method called order. When called, this order method should ask the Customer to place an order, by calling its placeOrder method. The Customer’s placeOrder method should in turn ask the Employee object for a new Food object, by calling the Employee’s takeOrder method.

    2. Food objects should store a food name string (e.g., “burritos”), passed down from Lunch.order to Customer.placeOrder, to Employee.takeOrder, and finally to Food’s constructor. The top-level Lunch class should also export a method called result, which asks the customer to print the name of the food it received from the Employee via the order (this can be used to test your simulation).

    Note that Lunch needs to either pass the Employee to the Customer, or pass itself to the Customer, in order to allow the Customer to call Employee methods.

    Experiment with your classes interactively by importing the Lunch class, calling its order method to run an interaction, and then calling its result method to verify that the Customer got what he or she ordered. If you prefer, you can also simply code test cases as self-test code in the file where your classes are defined, using the module __name__ trick in Chapter 18. In this simulation, the Customer is the active agent; how would your classes change if Employee were the object that initiated customer/employee interaction instead?

  8. Zoo Animal Hierarchy: Consider the class tree shown in Figure 23-1. Code a set of six class statements to model this taxonomy with Python inheritance. Then, add a speak method to each of your classes that prints a unique message, and a reply method in your top-level Animal superclass that simply calls self.speak to invoke the category-specific message printer in a subclass below (this will kick off an independent inheritance search from self). Finally, remove the speak method from your Hacker class, so that it picks up the default above it. When you’re finished, your classes should work this way:

    % python
    >>> from zoo import Cat, Hacker
    >>> spot = Cat(  )
    >>> spot.reply(  )              # Animal.reply; calls Cat.speak
    meow
    >>> data = Hacker(  )
    >>> data.reply(  )              # Animal.reply; calls Primate.speak
    Hello world!
Figure 23-1. A zoo hierarchy
  9. The Dead Parrot Sketch: Consider the object embedding structure captured in Figure 23-2. Code a set of Python classes to implement this structure with composition. Code your Scene object to define an action method, and embed instances of Customer, Clerk, and Parrot classes, all three of which should define a line method that prints a unique message. The embedded objects may either inherit from a common superclass that defines line and simply provide message text, or define line themselves. In the end, your classes should operate like this:

    % python
    >>> import parrot
    >>> parrot.Scene(  ).action(  )       # Activate nested objects.
    customer: "that's one ex-bird!"
    clerk: "no it isn't..."
    parrot: None
Figure 23-2. A scene composite


[1] This tends to scare C++ people unnecessarily. In Python, it’s even possible to change or completely delete a class method at runtime. On the other hand, nobody ever does in practical programs. As a scripting language, Python is more about enabling than restricting.

[2] Even without the classic/new divergence, this technique may sometimes come in handy in multiple inheritance scenarios in general. If you want part of a superclass on the left, and part of a superclass on the right, you might need to tell Python which same-named attributes to choose by such explicit assignments in subclasses. We’ll revisit this notion in a gotcha at the end of this chapter. Also note that diamond inheritance can be more problematic in some cases than we’ve implied (e.g., what if B and C both have required constructors that call to A’s?), but this is beyond this book’s scope.
