Chapter 21. Class Coding Details

Did all of Chapter 20 make sense? If not, don’t worry; now that we’ve had a quick tour, we’re going to dig a bit deeper and study the concepts we’ve introduced in further detail. This chapter takes a second pass, to formalize and expand on some of the class coding ideas introduced in Chapter 20.

The Class Statement

Although the Python class statement seems similar to other OOP languages on the surface, on closer inspection it is quite different from what some programmers are used to. For example, as in C++, the class statement is Python’s main OOP tool. Unlike C++, Python’s class is not a declaration. Like def, class is an object builder and an implicit assignment: when run, it generates a class object and stores a reference to it in the name used in the header. Also like def, class is true executable code: your class doesn’t exist until Python reaches and runs the class statement (typically while importing the module it is coded in, but not before).
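Because class is an executable statement that assigns a name, it can appear anywhere a statement can, including inside an if. The following is a minimal sketch of that idea; the names make_greeter, Greeter, Formal, and Casual are hypothetical and used only for illustration:

```python
# A class statement runs like any other statement, so it can be nested
# in control flow. Only the branch that executes creates a class object.
def make_greeter(formal):
    if formal:
        class Greeter:                 # Runs only when formal is true
            message = 'Good day'
    else:
        class Greeter:                 # Runs only when formal is false
            message = 'Hi'
    return Greeter                     # The name refers to whichever one ran

Formal = make_greeter(True)
Casual = make_greeter(False)
print(Formal.message)
print(Casual.message)
```

Each call runs one of the two class statements afresh, so Formal and Casual are distinct class objects.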

General Form

class is a compound statement with a body of indented statements usually under it. In the header, superclasses are listed in parentheses after the class name, separated by commas. Listing more than one superclass leads to multiple inheritance (which we’ll say more about in the next chapter). Here is the statement’s general form:

class <name>(superclass,...):       # Assign to name.
    data = value                    # Shared class data
    def method(self,...):           # Methods
        self.member = value         # Per-instance data

Within the class statement, any assignment generates a class attribute, and specially named methods overload operators; for instance, a function called __init__ is called at instance object construction time, if defined.

Example

Classes are mostly just namespaces—a tool for defining names (i.e., attributes) that export data and logic to clients. So how do you get from the class statement to a namespace?

Here’s how. Just as with module files, the statements nested in a class statement body create its attributes. When Python executes a class statement (not a call to a class), it runs all the statements in its body, from top to bottom. Assignments that happen during this process create names in the class’s local scope, which become attributes in the associated class object. Because of this, classes resemble both modules and functions:

  • Like functions, class statements are a local scope where names created by nested assignments live.

  • Like modules, names assigned in a class statement become attributes in a class object.

The main distinction for classes is that their namespaces are also the basis of inheritance in Python; attributes are fetched from other classes if not found in a class or instance object.
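To see that a class body really is run from top to bottom when the class statement executes, consider the following sketch. The names trace and Config are hypothetical; trace simply records what ran:

```python
trace = []                          # Hypothetical list that records what runs

class Config:
    trace.append('body started')    # Any statement may appear in the body
    limit = 10                      # Assignment: becomes Config.limit
    if limit > 5:                   # Tested once, when the class statement runs
        size = 'big'                # Becomes Config.size
    else:
        size = 'small'
    trace.append('body finished')

# No instance has been made yet, but the body has already run
print(trace)
print(Config.limit)
print(Config.size)
```

Like a function, the body has its own local scope (limit is visible to the if test); like a module, the names assigned there end up as attributes of the resulting object.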

Because class is a compound statement, any sort of statement can be nested inside its body—print, =, if, def, and so on. All the statements inside the class statement run when the class statement itself runs (not when the class is later called to make an instance). Any name assigned inside the class statement makes a class attribute. Nested defs make class methods, but other assignments make attributes, too. For example:

>>> class SharedData:
...     spam = 42               # Generates a class attribute.
...
>>> x = SharedData()              # Make two instances.
>>> y = SharedData()
>>> x.spam, y.spam              # They inherit and share spam.
(42, 42)

Here, because the name spam is assigned at the top level of a class statement, it is attached to the class, and so will be shared by all instances. Change it by going through the class name; refer to it through either instances or the class.[1]

>>> SharedData.spam = 99
>>> x.spam, y.spam, SharedData.spam
(99, 99, 99)

Such class attributes can be used to manage information that spans all the instances—a counter of the number of instances generated, for example (we’ll expand on this idea in Chapter 23). Now, watch what happens if we assign name spam through an instance instead of the class:

>>> x.spam = 88
>>> x.spam, y.spam, SharedData.spam
(88, 99, 99)

Assignments to instance attributes create or change that name in the instance, rather than the shared class. More generally, inheritance search occurs only on attribute reference, not assignment: assigning to an object’s attribute always changes that object, and no other.[2] For example, y.spam is looked up in the class by inheritance, but the assignment to x.spam attaches a name to x itself.
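One way to confirm where each name lives is to look at the attribute dictionaries of the objects involved. The sketch below repeats the SharedData session in file form; the result names (in_x, in_y, shared, restored) are hypothetical:

```python
class SharedData:
    spam = 42                       # Class attribute, shared by instances

x = SharedData()
y = SharedData()
x.spam = 88                         # Assignment: stored on x only

in_x = 'spam' in x.__dict__         # True: x now has its own spam
in_y = 'spam' in y.__dict__         # False: y still inherits from the class
shared = SharedData.spam            # 42: the class copy was never touched

del x.spam                          # Remove the instance's own name
restored = x.spam                   # 42: the reference falls back to the class
```

Deleting the instance’s name does not affect the class; the next reference through x simply resumes the inheritance search.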

Here’s a more comprehensive example of this behavior, that stores the same name in two places. Suppose we run the following class:

class MixedNames:                          # Define class.
    data = 'spam'                          # Assign class attr.
    def __init__(self, value):              # Assign method name.
        self.data = value                  # Assign instance attr.
    def display(self):
        print self.data, MixedNames.data    # Instance attr, class attr

This class contains two defs, which bind class attributes to method functions. It also contains an = assignment statement; since this class-level assignment assigns the name data inside the class, it lives in the class’s local scope and becomes an attribute of the class object. Like all class attributes, this data is inherited and shared by all instances of the class that don’t have a data of their own.

When we make instances of this class, the name data is also attached to instances, by the assignment to self.data in the constructor method:

>>> x = MixedNames(1)           # Make two instance objects.
>>> y = MixedNames(2)           # Each has its own data.
>>> x.display(); y.display()         # self.data differs, MixedNames.data same.
1 spam
2 spam

The net result is that data lives in two places: in instance objects (created by the self.data assignment in __init__), and in the class they inherit names from (created by the data assignment in the class). The class’s display method prints both versions, by first qualifying the self instance, and then the class.

By using these techniques to store attributes on different objects, you determine their scope of visibility. When attached to classes, names are shared; in instances, names record per-instance data, not shared behavior or data. Although inheritance looks up names for us, we can always get to an attribute anywhere in a tree, by accessing the desired object directly.

In the example, x.data and self.data will choose an instance name, which normally hides the same name in the class. But MixedNames.data grabs the class name explicitly. We’ll see various roles for such coding patterns later; the next section describes one of the most common.

Methods

Since you already know about functions, you also know about methods in classes. Methods are just function objects created by def statements nested in a class statement’s body. From an abstract perspective, methods provide behavior for instance objects to inherit. From a programming perspective, methods work in exactly the same way as simple functions, with one crucial exception: their first argument always receives the instance object that is the implied subject of a method call.

In other words, Python automatically maps instance method calls to class method functions as follows. Method calls made through an instance:

instance.method(args...)

are automatically translated to class method function calls of this form:

class.method(instance, args...)

where the class is determined by locating the method name using Python’s inheritance search procedure. In fact, both call forms are valid in Python.

Besides the normal inheritance of method attribute names, the special first argument is the only real magic behind method calls. In a class method, the first argument is usually called self by convention (technically, only its position is significant, not its name). This argument provides methods with a hook back to the instance: because classes generate many instance objects, they need to use this argument to manage data that varies per instance.

C++ programmers may recognize Python’s self argument as similar to C++’s “this” pointer. In Python, though, self is always explicit in your code. Methods must always go through self to fetch or change attributes of the instance being processed by the current method call. This explicit nature of self is by design—the presence of this name makes it obvious that you are using attribute names in your script, not a name in the local or global scope.

Example

Let’s turn to an example; suppose we define the following class:

class NextClass:                            # Define class.
    def printer(self, text):                # Define method.
        self.message = text                 # Change instance.
        print self.message                  # Access instance.

The name printer references a function object; because it’s assigned in the class statement’s scope, it becomes a class object attribute and is inherited by every instance made from the class. Normally, because methods like printer are designed to process instances, we call them through instances:

>>> x = NextClass()                            # Make instance

>>> x.printer('instance call')              # Call its method
instance call

>>> x.message                               # Instance changed
'instance call'

When called by qualifying an instance like this, printer is first located by inheritance, and then its self argument is automatically assigned the instance object (x); the text argument gets the string passed at the call ('instance call'). When called this way, we pass one fewer argument than it seems we need—Python automatically passes the first argument to self for us. Inside printer, the name self is used to access or set per-instance data, because it refers back to the instance currently being processed.

Methods may be called in one of two ways—through an instance, or through the class itself. For example, we can also call printer by going through the class name, provided we pass an instance to the self argument explicitly:

>>> NextClass.printer(x, 'class call')      # Direct class call
class call

>>> x.message                               # Instance changed again
'class call'

Calls routed through the instance and class have the exact same effect, as long as we pass the same instance object ourselves in the class form. By default, in fact, you get an error message if you try to call a method without any instance:

>>> NextClass.printer('bad call')
TypeError: unbound method printer() must be called with NextClass instance...

Calling Superclass Constructors

Methods are normally called through instances. Calls to methods through the class, though, show up in a variety of special roles. One common scenario involves the constructor method. The __init__ method, like all attributes, is looked up by inheritance. This means that at construction time, Python locates and calls just one __init__; if subclass constructors need to guarantee that superclass construction-time logic runs too, they generally must call it explicitly through the class:

class Super:
    def __init__(self, x): 
        ...default code...

class Sub(Super):
    def __init__(self, x, y):
        Super.__init__(self, x)          # Run superclass init.
        ...custom code...               # Do my init actions.

I = Sub(1, 2)

This is one of the few contexts in which your code calls an overload method directly. Naturally, you should only call the superclass constructor this way if you really want it to run—without the call, the subclass replaces it completely.[3]
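The outline above elides the constructor bodies. As a concrete sketch, suppose the superclass records x and the subclass records y; these attribute names, and the Replacer class, are assumptions made for illustration:

```python
class Super:
    def __init__(self, x):
        self.x = x                       # Superclass construction logic

class Sub(Super):
    def __init__(self, x, y):
        Super.__init__(self, x)          # Run superclass init explicitly
        self.y = y                       # Then do my own init actions

class Replacer(Super):                   # Hypothetical: never calls back
    def __init__(self, x, y):
        self.y = y                       # Super.__init__ does not run here

a = Sub(1, 2)
b = Replacer(1, 2)
a_has_x = hasattr(a, 'x')                # True: both constructors ran
b_has_x = hasattr(b, 'x')                # False: superclass init was replaced
```

Python runs only the lowest __init__ it finds, so the attribute x exists only when the subclass passes the instance up to the superclass constructor itself.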

Other Method Call Possibilities

This pattern of calling through a class is the general basis of extending (instead of completely replacing) inherited method behavior. In Chapter 23, we’ll also meet a new option in Python 2.2, static and class methods, which allows you to code methods that do not expect an instance object in their first argument. Such methods can act like simple instance-less functions, with names that are local to the class they are coded in. This is an advanced and optional extension, though; normally, you must always pass an instance to a method, whether it is called through the instance or the class.

Inheritance

The whole point of a namespace tool like the class statement is to support name inheritance. This section expands on some of the mechanisms and roles of attribute inheritance.

In Python, inheritance happens when an object is qualified, and involves searching an attribute definition tree (one or more namespaces). Every time you use an expression of the form object.attr where object is an instance or class object, Python searches the namespace tree at and above object, for the first attr it can find. This includes references to self attributes in your methods. Because lower definitions in the tree override higher ones, inheritance forms the basis of specialization.

Attribute Tree Construction

Figure 21-1 summarizes the way namespace trees are constructed and populated with names. Generally:

  • Instance attributes are generated by assignments to self attributes in methods.

  • Class attributes are created by statements (assignments) in class statements.

  • Superclass links are made by listing classes in parentheses in a class statement header.

Figure 21-1. Namespaces tree construction and inheritance

The net result is a tree of attribute namespaces, which grows from an instance, to the class it was generated from, to all the superclasses listed in the class headers. Python searches upward in this tree from instances to superclasses, each time you use qualification to fetch an attribute name from an instance object.[4]
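The upward search can be approximated in ordinary Python code. The following is a simplified sketch, not Python’s actual implementation: the helper functions lookup and class_lookup are hypothetical, they search depth-first and left to right through each class’s superclasses, and they ignore refinements that the real search applies:

```python
def class_lookup(klass, name):
    """Search a class, then its superclasses, depth-first and left to right."""
    if name in klass.__dict__:
        return klass.__dict__[name]
    for base in klass.__bases__:         # Superclasses listed in the header
        try:
            return class_lookup(base, name)
        except AttributeError:
            pass                         # Not in this branch; try the next
    raise AttributeError(name)

def lookup(instance, name):
    """Search the instance first, then climb to its class and above."""
    if name in instance.__dict__:
        return instance.__dict__[name]
    return class_lookup(instance.__class__, name)

class Top:
    a = 'from Top'

class Middle(Top):
    b = 'from Middle'

class Bottom(Middle):
    def __init__(self):
        self.c = 'from instance'

obj = Bottom()
found = [lookup(obj, 'c'), lookup(obj, 'b'), lookup(obj, 'a')]
```

Here, c is found in the instance itself, b one level up, and a two levels up; for this simple tree, each result matches what obj.c, obj.b, and obj.a return.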

Specializing Inherited Methods

The tree-searching model of inheritance just described turns out to be a great way to specialize systems. Because inheritance finds names in subclasses before it checks superclasses, subclasses can replace default behavior by redefining the superclass’s attributes. In fact, you can build entire systems as hierarchies of classes, which are extended by adding new external subclasses rather than changing existing logic in place.

The idea of redefining inherited names leads to a variety of specialization techniques. For instance, subclasses may replace inherited attributes completely, provide attributes that a superclass expects to find, and extend superclass methods by calling back to the superclass from an overridden method. We’ve already seen replacement in action; here’s an example that shows how extension works:

>>> class Super:
...     def method(self):
...         print 'in Super.method'
... 
>>> class Sub(Super):
...     def method(self):                       # Override method.
...         print 'starting Sub.method'         # Add actions here.
...         Super.method(self)                  # Run default action.
...         print 'ending Sub.method'
...

Direct superclass method calls are the crux of the matter here. The Sub class replaces Super’s method function with its own specialized version. But within the replacement, Sub calls back to the version exported by Super to carry out the default behavior. In other words, Sub.method just extends Super.method behavior, rather than replacing it completely:

>>> x = Super()              # Make a Super instance.
>>> x.method()               # Runs Super.method
in Super.method

>>> x = Sub()                # Make a Sub instance.
>>> x.method()               # Runs Sub.method, which calls Super.method
starting Sub.method
in Super.method
ending Sub.method

This extension coding pattern is also commonly used with constructors; see Section 21.2 for an example.

Class Interface Techniques

Extension is only one way to interface with a superclass; the following file, specialize.py, defines multiple classes that illustrate a variety of common techniques:

Super

Defines a method function and a delegate that expects an action in a subclass

Inheritor

Doesn’t provide any new names, so it gets everything defined in Super

Replacer

Overrides Super’s method with a version of its own

Extender

Customizes Super’s method by overriding and calling back to run the default

Provider

Implements the action method expected by Super’s delegate method

Study each of these subclasses to get a feel for the various ways they customize their common superclass:

class Super:
    def method(self):
        print 'in Super.method'       # Default behavior
    def delegate(self):
        self.action()                      # Expected to be defined

class Inheritor(Super):               # Inherit method verbatim.
    pass

class Replacer(Super):                # Replace method completely.
    def method(self):
        print 'in Replacer.method'

class Extender(Super):                # Extend method behavior.
    def method(self):
        print 'starting Extender.method'
        Super.method(self)
        print 'ending Extender.method'

class Provider(Super):                # Fill in a required method.
    def action(self):
        print 'in Provider.action'

if __name__ == '__main__':
    for klass in (Inheritor, Replacer, Extender):
        print '\n' + klass.__name__ + '...'
        klass().method()

    print '\nProvider...'
    x = Provider()
    x.delegate()

A few things are worth pointing out here. The self-test code at the end of this example creates instances of three different classes in a for loop. Because classes are objects, you can put them in a tuple and create instances generically (more on this idea later). Classes also have the special __name__ attribute like modules; it’s just preset to a string containing the name in the class header.

% python specialize.py

Inheritor...
in Super.method

Replacer...
in Replacer.method

Extender...
starting Extender.method
in Super.method
ending Extender.method

Provider...
in Provider.action
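The self-test loop relies on the fact that classes are objects that can be stored in a tuple and called later. A stripped-down sketch of that idea follows; the classes Spam and Eggs and the helper make_all are hypothetical names:

```python
class Spam:
    def describe(self):
        return 'spam instance'

class Eggs:
    def describe(self):
        return 'eggs instance'

def make_all(classes):                   # Hypothetical helper
    # Each item is a class object; calling it makes an instance
    return [klass() for klass in classes]

objects = make_all((Spam, Eggs))
names = [klass.__name__ for klass in (Spam, Eggs)]      # ['Spam', 'Eggs']
reports = [obj.describe() for obj in objects]
```

The helper never mentions Spam or Eggs by name; it works with whatever classes it is handed, which is the same generality the for loop in specialize.py depends on.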

Abstract Superclasses

Notice how the Provider class in the prior example works. When we call the delegate method through a Provider instance, two independent inheritance searches occur:

  1. On the initial x.delegate call, Python finds the delegate method in Super, by searching at the Provider instance and above. The instance x is passed into the method’s self argument as usual.

  2. Inside the Super.delegate method, self.action invokes a new, independent inheritance search at self and above. Because self references a Provider instance, the action method is located in the Provider subclass.

This “filling in the blanks” sort of coding structure is typical of OOP frameworks. At least in terms of the delegate method, the superclass in this example is what is sometimes called an abstract superclass: a class that expects parts of its behavior to be provided by subclasses. If an expected method is not defined in a subclass, Python raises an undefined name exception after inheritance search fails. Class coders sometimes make such subclass requirements more obvious with assert statements, or by raising the built-in NotImplementedError exception:

class Super:
    def method(self):
        print 'in Super.method'
    def delegate(self):
        self.action()
    def action(self):
        assert 0, 'action must be defined!'

We’ll meet assert in Chapter 24; in short, if its expression evaluates to false, it raises an exception with an error message. Here, the expression is always false (0), so as to trigger an error message if the method is not redefined and inheritance locates the version here. Alternatively, some classes simply raise a NotImplementedError exception directly in such method stubs; we’ll study the raise statement in Chapter 24.
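For comparison, here is a sketch of the same stub coded with a raise of the built-in NotImplementedError instead of an assert. To make the results easy to inspect, this version returns strings rather than printing them, and the names result and outcome are hypothetical:

```python
class Super:
    def delegate(self):
        return self.action()             # Expected to be defined in a subclass
    def action(self):
        raise NotImplementedError('action must be defined!')

class Provider(Super):
    def action(self):                    # Fills in the required method
        return 'in Provider.action'

result = Provider().delegate()           # 'in Provider.action'

try:
    Super().delegate()                   # No subclass version: the stub runs
    outcome = 'no error'
except NotImplementedError:
    outcome = 'stub reached'
```

Either coding works; the raise form has the small advantage of naming the problem in the exception type itself.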

For a somewhat more realistic example of this section’s concepts in action, see the “Zoo Animal Hierarchy” exercise in Part VI Exercises and its solution in Section B.6. Such taxonomies are a traditional way to introduce OOP, but are a bit removed from most developers’ job descriptions.

Operator Overloading

We introduced operator overloading in the prior chapter; let’s fill in more details here and look at a few commonly used overloading methods. Here’s a review of the key ideas behind overloading:

  • Operator overloading lets classes intercept normal Python operations.

  • Classes can overload all Python expression operators.

  • Classes can also overload operations: printing, calls, qualification, etc.

  • Overloading makes class instances act more like built-in types.

  • Overloading is implemented by providing specially named class methods.

Here’s a simple example of overloading at work. When we provide specially named methods in a class, Python automatically calls them when instances of the class appear in the associated operation. For instance, the Number class in file number.py below provides a method to intercept instance construction (__init__), as well as one for catching subtraction expressions (__sub__). Special methods are the hook that lets you tie into built-in operations:

class Number:
    def __init__(self, start):               # On Number(start)
        self.data = start
    def __sub__(self, other):                # On instance - other
        return Number(self.data - other)    # result is a new instance

>>> from number import Number               # Fetch class from module.
>>> X = Number(5)                           # Number.__init__(X, 5)
>>> Y = X - 2                               # Number.__sub__(X, 2)
>>> Y.data                                  # Y is new Number instance.
3

Common Operator Overloading Methods

Just about everything you can do to built-in objects such as integers and lists has a corresponding specially named method for overloading in classes. Table 21-1 lists a few of the most common; there are many more. In fact, many overload methods come in multiple versions (e.g., __add__, __radd__, and __iadd__ for addition). See other Python books or the Python Language Reference Manual for an exhaustive list of special method names available.

Table 21-1. Common operator overloading methods

Method              Overloads                       Called for
__init__            Constructor                     Object creation: Class()
__del__             Destructor                      Object reclamation
__add__             Operator '+'                    X + Y, X += Y
__or__              Operator '|' (bitwise or)       X | Y, X |= Y
__repr__, __str__   Printing, conversions           print X, `X`, str(X)
__call__            Function calls                  X()
__getattr__         Qualification                   X.undefined
__setattr__         Attribute assignment            X.any = value
__getitem__         Indexing                        X[key], for loops, in tests
__setitem__         Index assignment                X[key] = value
__len__             Length                          len(X), truth tests
__cmp__             Comparison                      X == Y, X < Y
__lt__              Specific comparison             X < Y (or else __cmp__)
__eq__              Specific comparison             X == Y (or else __cmp__)
__radd__            Right-side operator '+'         Noninstance + X
__iadd__            In-place (augmented) addition   X += Y (or else __add__)
__iter__            Iteration contexts              for loops, in tests, others

All overload methods have names that start and end with two underscores, to keep them distinct from other names you define in your classes. The mapping from special method name to expression or operations is simply predefined by the Python language (and documented in the standard language manual). For example, name __add__ always maps to + expressions by Python language definition, regardless of what an __add__ method’s code actually does.
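As a small illustration of this fixed mapping, the sketch below codes two of the addition variants from Table 21-1. The class name Amount and the result names are hypothetical:

```python
class Amount:                            # Hypothetical class for illustration
    def __init__(self, value):
        self.value = value
    def __add__(self, other):            # On Amount + other
        return Amount(self.value + other)
    def __radd__(self, other):           # On other + Amount
        return Amount(other + self.value)

left = Amount(5) + 2                     # Runs Amount.__add__
right = 2 + Amount(5)                    # Runs Amount.__radd__
```

Which method runs depends only on which side of the + the instance appears; what each method does with its arguments is entirely up to the class.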

All operator overloading methods are optional—if you don’t code one, that operation is simply unsupported by your class (and may raise an exception if attempted). Most overloading methods are only used in advanced programs that require objects to behave like built-ins; the __init__ constructor tends to appear in most classes, however. We’ve already met the __init__ initialization-time constructor method, and a few others in Table 21-1. Let’s explore some of the additional methods in the table by example.

__getitem__ Intercepts Index References

The __getitem__ method intercepts instance indexing operations. When an instance X appears in an indexing expression like X[i], Python calls a __getitem__ method inherited by the instance (if any), passing X to the first argument and the index in brackets to the second argument. For instance, the following class returns the square of an index value:

>>> class indexer:
...     def __getitem__(self, index):
...         return index ** 2
...
>>> X = indexer()
>>> X[2]                        # X[i] calls __getitem__(X, i).
4
>>> for i in range(5): 
...     print X[i],             
...
0 1 4 9 16

__getitem__ and __iter__ Implement Iteration

Here’s a trick that isn’t always obvious to beginners, but turns out to be incredibly useful: the for statement works by repeatedly indexing a sequence from zero to higher indexes, until an out-of-bounds exception is detected. Because of that, __getitem__ also turns out to be one way to overload iteration in Python—if defined, for loops call the class’s __getitem__ each time through, with successively higher offsets. It’s a case of “buy one, get one free”: any built-in or user-defined object that responds to indexing also responds to iteration:

>>> class stepper:
...     def __getitem__(self, i):
...         return self.data[i]
...
>>> X = stepper()              # X is a stepper object.
>>> X.data = "Spam"
>>>
>>> X[1]                       # Indexing calls __getitem__.
'p'
>>> for item in X:             # for loops call __getitem__
...     print item,            # with successive indexes 0..N.
...
S p a m

In fact, it’s really a case of “buy one, get a bunch for free”: any class that supports for loops automatically supports all iteration contexts in Python, many of which we’ve seen in earlier chapters. For example, the in membership test, list comprehensions, the map built-in, list and tuple assignments, and type constructors, will also call __getitem__ automatically if defined:

>>> 'p' in X                   # All call __getitem__ too.
1

>>> [c for c in X]             # List comprehension
['S', 'p', 'a', 'm']

>>> map(None, X)               # map calls
['S', 'p', 'a', 'm']

>>> (a,b,c,d) = X              # Sequence assignments
>>> a, c, d
('S', 'a', 'm')

>>> list(X), tuple(X), ''.join(X)
(['S', 'p', 'a', 'm'], ('S', 'p', 'a', 'm'), 'Spam')

>>> X
<__main__.stepper instance at 0x00A8D5D0>

In practice, this technique can be used to create objects that provide a sequence interface, and add logic to built-in sequence type operations; we’ll revisit this idea when extending built-in types in Chapter 23.
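As a rough sketch of that idea, the following class wraps a built-in sequence and adds a bit of logic to each fetch. The class name UpperSeq and its uppercase conversion are assumptions chosen for illustration:

```python
class UpperSeq:                          # Hypothetical wrapper class
    def __init__(self, data):
        self.data = data                 # Any built-in sequence of strings
    def __getitem__(self, index):
        return self.data[index].upper()  # Added logic on each fetch
    def __len__(self):
        return len(self.data)

X = UpperSeq('spam')
first = X[0]                             # 'S'
items = list(X)                          # ['S', 'P', 'A', 'M']
found = 'P' in X                         # True
size = len(X)                            # 4
```

The out-of-bounds IndexError raised by the wrapped string is what ends each iteration, so the class needs no extra code to support list and in.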

User-defined iterators

Today, all iteration contexts in Python will first try to find an __iter__ method, which is expected to return an object that supports the new iteration protocol. If provided, Python repeatedly calls this object’s next method to produce items, until the StopIteration exception is raised. If no such method is found, Python falls back on the __getitem__ scheme and repeatedly indexes by offsets as before, until an IndexError exception is raised.
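The steps a for loop takes under this protocol can be spelled out by hand. The sketch below is an approximation, not Python’s internal code; the helper name manual_for is hypothetical, and it assumes an interpreter that provides the built-in iter and next functions (next as a built-in arrived in later Python releases; on older interpreters you would call the iterator’s next method directly):

```python
def manual_for(iterable):                # Hypothetical helper name
    results = []
    iterator = iter(iterable)            # Uses __iter__, else __getitem__
    while True:
        try:
            item = next(iterator)        # Fetch the next item
        except StopIteration:            # Raised when items are exhausted
            break
        results.append(item)             # Stands in for the loop body
    return results

class stepper:                           # Has __getitem__ only, no __iter__
    def __getitem__(self, i):
        return 'Spam'[i]

letters = manual_for('abc')              # ['a', 'b', 'c']
spam = manual_for(stepper())             # ['S', 'p', 'a', 'm']
```

The second call shows the fallback at work: stepper defines no __iter__, so iter builds an iterator that indexes the object until IndexError occurs.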

In the new scheme, classes implement user-defined iterators by simply implementing the iterator protocol introduced in Chapter 14 for functions. For example, the following file, iters.py, defines a user-defined iterator class that generates squares:

class Squares:
    def __init__(self, start, stop):
        self.value = start - 1
        self.stop  = stop
    def __iter__(self):                    # Get iterator object
        return self
    def next(self):                        # Called on each for iteration
        if self.value == self.stop:
            raise StopIteration
        self.value += 1
        return self.value ** 2

% python
>>> from iters import Squares
>>> for i in Squares(1,5):
...     print i,
...
1 4 9 16 25

Here, the iterator object is simply the instance, self, because the next method is part of this class. The end of the iteration is signaled with a Python raise statement (more on raising exceptions in the next part of this book).

An equivalent coding with __getitem__ might be less natural, because the for would then iterate through offsets zero and higher; offsets passed in would be only indirectly related to the range of values produced (0..N would need to map to start..stop). Because __iter__ objects retain explicitly-managed state between next calls, they can be more general than __getitem__.

On the other hand, __iter__-based iterators can sometimes be more complex and less convenient than __getitem__. They are really designed for iteration, not random indexing. In fact, they don’t overload the indexing expression at all:

>>> X = Squares(1,5)
>>> X[1]
AttributeError: Squares instance has no attribute '__getitem__'

The __iter__ scheme implements the other iteration contexts we saw in action for __getitem__ (membership tests, type constructors, sequence assignment, and so on). However, unlike __getitem__, __iter__ is designed for a single traversal, not many. For example, the Squares class is a one-shot iteration; once iterated, it’s empty. You need to make a new iterator object for each new iteration:

>>> X = Squares(1,5)
>>> [n for n in X]                     # Exhausts items
[1, 4, 9, 16, 25]
>>> [n for n in X]                     # Now it's empty.
[]
>>> [n for n in Squares(1,5)]
[1, 4, 9, 16, 25]
>>> list(Squares(1,3))
[1, 4, 9]
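If multiple traversals of one object are needed, one possible approach is to have __iter__ return a new iterator object each time, instead of self. The following is a sketch under that assumption; the split into Squares and SquaresIter is hypothetical, and the __next__ alias is included only so that the sketch also runs on newer interpreters that look for that name:

```python
class SquaresIter:                       # Hypothetical one-shot iterator
    def __init__(self, start, stop):
        self.value = start - 1
        self.stop = stop
    def __iter__(self):
        return self
    def next(self):
        if self.value == self.stop:
            raise StopIteration
        self.value += 1
        return self.value ** 2
    __next__ = next                      # Assumption: alias for newer Pythons

class Squares:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
    def __iter__(self):                  # Fresh iterator on every traversal
        return SquaresIter(self.start, self.stop)

X = Squares(1, 3)
first = [n for n in X]                   # [1, 4, 9]
second = [n for n in X]                  # [1, 4, 9] again
```

The traversal state now lives in each SquaresIter object rather than in X, so a second loop over X starts from the beginning.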

For more details on iterators, see Chapter 14. Notice that this example would probably be simpler if coded with generator functions—a topic introduced in Chapter 14 and related to iterators:

>>> from __future__ import generators    # Need in 2.2
>>>
>>> def gsquares(start, stop):
...     for i in range(start, stop+1):
...         yield i ** 2
...
>>> for i in gsquares(1, 5):
...     print i,
...
1 4 9 16 25

Unlike the class, the function automatically saves its state between iterations. Classes may be better at modeling more complex iterations, though, especially when they can benefit from inheritance hierarchies. Of course, for this artificial example, you might as well skip both techniques, and simply use a for loop, map, or list comprehension, to build the list all at once; the best and fastest way to accomplish a task in Python is often also the simplest:

>>> [x ** 2 for x in range(1, 6)]
[1, 4, 9, 16, 25]

__getattr__ and __setattr__ Catch Attribute References

The __getattr__ method intercepts attribute qualifications. More specifically, it’s called with the attribute name as a string, whenever you try to qualify an instance on an undefined (nonexistent) attribute name. It is not called if Python can find the attribute using its inheritance tree-search procedure. Because of its behavior, __getattr__ is useful as a hook for responding to attribute requests in a generic fashion. For example:

>>> class empty:
...     def __getattr__(self, attrname):
...         if attrname == "age":
...             return 40
...         else:
...             raise AttributeError, attrname
...
>>> X = empty()
>>> X.age
40
>>> X.name
...error text omitted...
AttributeError: name

Here, the empty class and its instance X have no real attributes of their own, so the access to X.age gets routed to the __getattr__ method; self is assigned the instance (X), and attrname is assigned the undefined attribute name string ("age"). The class makes age look like a real attribute by returning a real value as the result of the X.age qualification expression (40). In effect, age becomes a dynamically computed attribute.

For other attributes the class doesn’t know how to handle, it raises the built-in AttributeError exception, to tell Python that this is a bona fide undefined name; asking for X.name triggers the error. You’ll see __getattr__ again when we show delegation and properties at work in the next two chapters, and we will say more about exceptions in Part VII.
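To check that __getattr__ runs only for names the normal search cannot find, a class can record each call. This is a minimal sketch; the class name Tracker and its attributes real and misses are hypothetical:

```python
class Tracker:                           # Hypothetical class for illustration
    def __init__(self):
        self.real = 'stored'             # A genuine instance attribute
        self.misses = []                 # Names that reached __getattr__
    def __getattr__(self, attrname):
        self.misses.append(attrname)     # Runs only for undefined names
        return 'computed ' + attrname

X = Tracker()
a = X.real                               # Found normally; no __getattr__ call
before = list(X.misses)                  # []
b = X.other                              # Undefined: routed to __getattr__
after = list(X.misses)                   # ['other']
```

Fetching X.real and X.misses leaves the list empty, because both names are found in the instance; only the undefined name other reaches the method.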

A related overloading method, __setattr__, intercepts all attribute assignments. If this method is defined, self.attr=value becomes self.__setattr__('attr',value). This is a bit more tricky to use, because assigning to any self attributes within __setattr__ calls __setattr__ again, causing an infinite recursion loop (and eventually, a stack overflow exception!). If you want to use this method, be sure that it assigns any instance attributes by indexing the attribute dictionary, discussed in the next section. Use self.__dict__['name']=x, not self.name=x:

>>> class accesscontrol:
...     def __setattr__(self, attr, value):
...         if attr == 'age':
...             self.__dict__[attr] = value
...         else:
...             raise AttributeError, attr + ' not allowed'
...
>>> X = accesscontrol(  )
>>> X.age = 40                     # Calls __setattr__
>>> X.age
40
>>> X.name = 'mel'
...text omitted...
AttributeError: name not allowed

These two attribute access overloading methods tend to play highly specialized roles, some of which we’ll meet later in this book; in general, they allow you to control or specialize access to attributes in your objects.
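As one example of such a specialized role, the following sketch uses __setattr__ to validate a value before storing it. The class name Validated and the rule that age must be an integer are assumptions chosen for illustration; the important part is that the method stores through the attribute dictionary, so it never triggers itself:

```python
class Validated(object):
    def __setattr__(self, attr, value):
        # Hypothetical rule: age must be an integer
        if attr == 'age' and not isinstance(value, int):
            raise TypeError('age must be an integer')
        self.__dict__[attr] = value      # Index the dictionary: no recursion

X = Validated()
X.age = 40                               # Passes the check, then is stored
X.name = 'mel'                           # Other names are stored unchecked
```

An assignment such as X.age = 'old' would raise TypeError here and leave the stored value unchanged.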

__repr__ and __str__ Return String Representations

The next example exercises the __init__ constructor and the __add__ overload methods we’ve already seen, but also defines a __repr__ that returns a string representation for instances. String formatting is used to convert the managed self.data object to a string. If defined, __repr__, or its sibling __str__, is called automatically when class instances are printed or converted to strings; they allow you to define a better print string for your objects than the default instance display.

>>> class adder:
...     def __init__(self, value=0):
...         self.data = value                  # Initialize data.
...     def __add__(self, other):
...         self.data += other                 # Add other in-place.

>>> class addrepr(adder):                      # Inherit __init__, __add__.
...     def __repr__(self):                     # Add string representation.
...         return 'addrepr(%s)' % self.data   # Convert to string as code.

>>> x = addrepr(2)              # Runs __init__
>>> x + 1                       # Runs __add__
>>> x                           # Runs __repr__
addrepr(3)
>>> print x                     # Runs __repr__
addrepr(3)
>>> str(x), repr(x)             # Runs   __repr__
('addrepr(3)', 'addrepr(3)')

So why two display methods? Roughly, __str__ is tried first for user-friendly displays, such as the print statement and the str built-in function. The __repr__ method should in principle return a string that could be used as executable code to recreate the object, and is used for interactive prompt echoes and the repr function. Python falls back on __repr__ if no __str__ is present, but not vice-versa:

>>> class addstr(adder):            
...     def __str__(self):                      # __str__ but no __repr__
...         return '[Value: %s]' % self.data   # Convert to nice string.

>>> x = addstr(3)
>>> x + 1
>>> x                                          # Default repr
<__main__.addstr instance at 0x00B35EF0>
>>> print x                                    # Runs __str__
[Value: 4]
>>> str(x), repr(x)
('[Value: 4]', '<__main__.addstr instance at 0x00B35EF0>')

Because of this, __repr__ may be best if you want a single display for all contexts. By defining both methods, though, you can support different displays in different contexts:

>>> class addboth(adder):
...     def __str__(self):
...         return '[Value: %s]' % self.data   # User-friendly string
...     def __repr__(self):
...         return 'addboth(%s)' % self.data   # As-code string

>>> x = addboth(4)
>>> x + 1
>>> x                                  # Runs __repr__
addboth(5)
>>> print x                            # Runs __str__
[Value: 5]
>>> str(x), repr(x)
('[Value: 5]', 'addboth(5)')
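The idea that __repr__ should return a string usable as code can be checked with the built-in eval function. The sketch below is a standalone variation on the classes above, not the same code: the class name Adder is ours, and it formats with %r rather than %s so that string values would be quoted in the result. It runs under both Python 2 and 3:

```python
class Adder(object):
    def __init__(self, value=0):
        self.data = value
    def __repr__(self):
        return 'Adder(%r)' % self.data      # As-code string
    def __str__(self):
        return '[Value: %s]' % self.data    # User-friendly string

x = Adder(5)
y = eval(repr(x))        # Rebuild an equivalent object from the as-code string
```

Here, y is a new Adder object whose data matches x's. This works only because the as-code string names a class that is visible where eval runs.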

__radd__ Handles Right-Side Addition

Technically, the __add__ method in the prior example does not support the use of instance objects on the right side of the + operator. To implement such expressions, and hence support commutative style operators, code the __radd__ method as well. Python calls __radd__ only when the object on the right of the + is your class instance, but the object on the left is not an instance of your class. The __add__ method for the object on the left is called instead in all other cases:

>>> class Commuter:
...     def __init__(self, val):
...         self.val = val
...     def __add__(self, other):
...         print 'add', self.val, other
...     def __radd__(self, other):
...         print 'radd', self.val, other
...
>>> x = Commuter(88)
>>> y = Commuter(99)
>>> x + 1                      # __add__:  instance + noninstance
add 88 1
>>> 1 + y                      # __radd__: noninstance + instance
radd 99 1
>>> x + y                      # __add__:  instance + instance
add 88 <__main__.Commuter instance at 0x0086C3D8>

Notice how the order is reversed in __radd__: self is really on the right of the +, and other is on the left. Every binary operator has a similar right-side overloading method (e.g., __mul__ and __rmul__). Typically, a right-side method like __radd__ just converts if needed and reruns a + to trigger __add__, where the main logic is coded. Also note that x and y are instances of the same class here; when instances of different classes appear mixed in an expression, Python prefers the class of the one on the left.
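A minimal sketch of that "convert and rerun" pattern follows. The class name Number is made up for illustration, and unlike Commuter, its methods return new instances rather than printing. The code runs under both Python 2 and 3:

```python
class Number(object):
    def __init__(self, val):
        self.val = val
    def __add__(self, other):
        if isinstance(other, Number):    # Convert if needed
            other = other.val
        return Number(self.val + other)  # Main logic is coded here
    def __radd__(self, other):
        return self + other              # Rerun + to trigger __add__

x = Number(88)
y = Number(99)
results = ((x + 1).val, (1 + y).val, (x + y).val)
```

Swapping the operands in __radd__ is safe here only because addition is commutative; for an operation like subtraction, the right-side method would have to preserve the original order.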

Right-side methods are an advanced topic, and tend to be fairly rarely used; you only code them when you need operators to be commutative, and then only if you need to support operators at all. For instance, a Vector class may use these tools, but an Employee or Button class probably would not.

__call__ Intercepts Calls

The __call__ method is called when your instance is called. No, this isn’t a circular definition—if defined, Python runs a __call__ method for function call expressions applied to your instances. This allows class instances to emulate the look and feel of things like functions:

>>> class Prod:
...     def __init__(self, value):
...         self.value = value
...     def __call__(self, other):
...         return self.value * other
...
>>> x = Prod(2)
>>> x(3)
6
>>> x(4)
8

In this example, the __call__ may seem a bit gratuitous—a simple method provides similar utility:

>>> class Prod:
...     def __init__(self, value):
...         self.value = value
...     def comp(self, other):
...         return self.value * other
...
>>> x = Prod(3)
>>> x.comp(3)
9
>>> x.comp(4)
12

However, __call__ can become more useful when interfacing with APIs that expect functions. For example, the Tkinter GUI toolkit we’ll meet later in this book allows you to register functions as event handlers (a.k.a., callbacks); when events occur, Tkinter calls the registered object. If you want an event handler to retain state between events, you can either register a class’s bound method, or an instance that conforms to the expected interface with __call__. In our code, both x.comp from the second example and x from the first can pass as function-like objects this way. More on bound methods in the next chapter.
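The two registration options can be sketched without a GUI. In the code below, fire_event is a hypothetical stand-in for a toolkit that only knows how to call whatever object was registered; the Callback class and its method names are also ours. The code runs under both Python 2 and 3:

```python
class Callback(object):
    def __init__(self, color):
        self.color = color              # State retained between events
        self.count = 0
    def __call__(self):                 # Runs when the instance is called
        self.count += 1
        return 'turn %s (%d)' % (self.color, self.count)
    def changeColor(self):              # Same logic, as a normal method
        return self()

def fire_event(handler):
    # Hypothetical toolkit code: it simply calls the registered object
    return handler()

cb = Callback('blue')
first = fire_event(cb)                  # Register the instance itself
second = fire_event(cb.changeColor)     # Register a bound method
```

Both calls reach the same instance, so the count it remembers goes up each time, whichever form was registered.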

__del__ Is a Destructor

The __init__ constructor is called whenever an instance is generated. Its counterpart, the destructor method __del__, is run automatically when an instance’s space is being reclaimed (i.e., at “garbage collection” time):

>>> class Life:
...     def __init__(self, name='unknown'):
...         print 'Hello', name
...         self.name = name
...     def __del__(self):
...         print 'Goodbye', self.name
...
>>> brian = Life('Brian')
Hello Brian
>>> brian = 'loretta'
Goodbye Brian

Here, when brian is assigned a string, we lose the last reference to the Life instance, and so, trigger its destructor method. This works, and may be useful to implement some cleanup activities such as terminating server connections. However, destructors are not as commonly used in Python as in some OOP languages, for a number of reasons.

For one thing, because Python automatically reclaims all space held by an instance when the instance is reclaimed, destructors are not necessary for space management.[5] For another, because you cannot always easily predict when an instance will be reclaimed, it’s often better to code termination activities in an explicitly-called method (or try/finally statement, described in the next part of the book); in some cases, there may be lingering references to your objects in system tables, which prevent destructors from running.
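The explicitly called alternative might look like the following sketch. The Connection class, its close method, and the is_open flag are hypothetical names used only to show the shape of the code; no real server is involved. The code runs under both Python 2 and 3:

```python
class Connection(object):
    def __init__(self, name):
        self.name = name
        self.is_open = True
    def close(self):                    # Explicitly called termination method
        self.is_open = False

log = []
conn = Connection('server')
try:
    log.append('using ' + conn.name)    # Work with the object here
finally:
    conn.close()                        # Runs whether or not an exception occurs
```

With this coding, cleanup happens at a known point in the program, rather than whenever the instance happens to be reclaimed.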

That’s as many overloading examples as we have space for here. Most work similarly to ones we’ve already seen, and all are just hooks for intercepting built-in type operations; some overload methods have unique argument lists or return values. You’ll see a few others in action later in the book, but for complete coverage, we’ll defer to other documentation sources.

Namespaces: The Whole Story

Now that we’ve seen class and instance objects, the Python namespace story is complete; for reference, let’s quickly summarize all the rules used to resolve names. The first things you need to remember are that qualified and unqualified names are treated differently, and that some scopes serve to initialize object namespaces:

  • Unqualified names (e.g., X) deal with scopes.

  • Qualified attribute names (e.g., object.X) use object namespaces.

  • Some scopes initialize object namespaces (modules and classes).

Unqualified Names: Global Unless Assigned

Unqualified names follow the LEGB lexical scoping rules outlined for functions in Chapter 13:

Assignment: X = value

Makes names local: creates or changes name X in the current local scope, unless declared global

Reference: X

Looks for name X in the current local scope, then any and all enclosing functions, then the current global scope, then the built-in scope
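These two rules can be seen at work in a short sketch. The function names are made up for illustration, and the code runs under both Python 2 and 3:

```python
X = 'global'

def outer():
    X = 'enclosing'                 # Assignment: local to outer
    def inner():
        return X                    # Reference: found in the enclosing def
    return inner()

def local():
    X = 'local'                     # Assignment makes X local to this call
    return X

def reader():
    return X                        # No assignment here: found in global scope

found = (outer(), local(), reader(), len('abc'))    # len: built-in scope
```

None of the assignments inside the functions changes the global X; each creates a separate name in its own local scope.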

Qualified Names: Object Namespaces

Qualified names refer to attributes of specific objects and obey the rules for modules and classes. For class and class instance objects, the reference rules are augmented to include the inheritance search procedure:

Assignment: object.X = value

Creates or alters the attribute name X in the namespace of the object being qualified, and no other. Inheritance tree climbing only happens on attribute reference, not on attribute assignment.

Reference: object.X

For class-based objects, searches for the attribute name X in the object, then in all accessible classes above it, using the inheritance search procedure. For non-class objects such as modules, fetches X from object directly.
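The difference between the two rules shows up when an instance assigns a name that also exists in its class. In this sketch, the class name Base is ours, and the code runs under both Python 2 and 3:

```python
class Base(object):
    X = 1                           # Class attribute, shared by all instances

a = Base()
b = Base()
a.X = 2                             # Assignment: creates X in instance a only
fetched = (a.X, b.X, Base.X)        # Reference: b.X is found in the class
```

The assignment does not climb the tree to change the class's X; it simply gives a its own X, which hides the class's version on later references through a.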

Assignments Classify Names

With distinct search procedures for qualified and unqualified names, and multiple lookup layers for both, it can sometimes be confusing to know where a name will wind up going. In Python, the place where you assign a name is crucial—it fully determines which scope or which object a name will reside in. File manynames.py illustrates and summarizes how this translates to code:

X = 11                  # Module (global) name/attribute

class c:
    X = 22              # Class attribute
    def m(self):
        X = 33          # Local variable in method
        self.X = 44     # Instance attribute

def f(  ):
    X = 55              # Local variable in function

def g(  ):
    print X             # Access module X (11)

Because this file assigns the same name, X, in five different locations, there are actually five completely different Xs in this program. From top to bottom, the assignments to X names generate a module attribute, a class attribute, a local variable in a method, an instance attribute, and a local in a function.

You should take enough time to study this example carefully, because it collects ideas we’ve been exploring throughout the last few parts of this book. When it makes sense to you, you will have achieved Python namespace nirvana. Of course, an alternative route to nirvana is to simply run this and see what happens. Here’s the remainder of this file, which makes an instance, and prints all the Xs that it can fetch:

obj = c(  )
obj.m(  )

print obj.X             # 44: instance
print c.X               # 22: class     (a.k.a. obj.X if no X in instance)
print X                 # 11: module    (a.k.a. manynames.X outside file)

#print c.m.X            # FAILS: only visible in method
#print f.X              # FAILS: only visible in function

Notice that we can go through the class to fetch its attribute (c.X), but can never fetch local variables in functions or methods from outside their def statements. Locals are only visible to other code within the def, and in fact only live in memory while a call to the function or method is executing.

Namespace Dictionaries

In Chapter 16, we learned that module namespaces are actually implemented as dictionaries and exposed with the built-in __dict__ attribute. The same holds for class and instance objects: attribute qualification is really a dictionary indexing operation internally, and attribute inheritance is just a matter of searching linked dictionaries. In fact, instance and class objects are mostly just dictionaries with links inside Python. Python exposes these dictionaries, as well as the links between them, for use in advanced roles (e.g., for coding tools).

To help you understand how attributes work internally, let’s work through an interactive session that traces the way namespace dictionaries grow when classes are involved. First, let’s define a superclass and a subclass with methods that will store data in their instances:

>>> class super:
...     def hello(self):
...         self.data1 = 'spam'
...
>>> class sub(super):
...     def hola(self):
...         self.data2 = 'eggs'

When we make an instance of the subclass, the instance starts out with an empty namespace dictionary, but has links back to the class for the inheritance search to follow. In fact, the inheritance tree is explicitly available in special attributes, which you can inspect: instances have a __class__ attribute that links to their class, and classes have a __bases__ attribute that is a tuple containing links to higher superclasses:

>>> X = sub(  )
>>> X.__dict__
{  }

>>> X.__class__
<class __main__.sub at 0x00A48448>

>>> sub.__bases__
(<class __main__.super at 0x00A3E1C8>,)

>>> super.__bases__
(  )

As classes assign to self attributes, they populate the instance object—that is, attributes wind up in the instance’s attribute namespace dictionary, not in the classes. Instance object namespaces record data that can vary from instance to instance, and self is a hook into that namespace:

>>> Y = sub(  )

>>> X.hello(  )
>>> X.__dict__
{'data1': 'spam'}

>>> X.hola(  )
>>> X.__dict__
{'data1': 'spam', 'data2': 'eggs'}
 
>>> sub.__dict__
{'__module__': '__main__', '__doc__': None, 'hola': <function hola at
 0x00A47048>}

>>> super.__dict__
{'__module__': '__main__', 'hello': <function hello at 0x00A3C5A8>,
 '__doc__': None}

>>> sub.__dict__.keys(  ), super.__dict__.keys(  )
(['__module__', '__doc__', 'hola'], ['__module__', 'hello', '__doc__'])

>>> Y.__dict__
{  }

Notice the extra underscore names in the class dictionaries; these are set by Python automatically. Most are not used in typical programs, but some are utilized by tools (e.g., __doc__ holds the docstrings discussed in Chapter 11).

Also observe that Y, a second instance made at the start of this series, still has an empty namespace dictionary at the end, even though X’s has been populated by assignments in methods. Each instance is an independent namespace dictionary, which starts out empty, and can record completely different attributes than other instances of the same class.

Now, because attributes are actually dictionary keys inside Python, there are really two ways to fetch and assign their values—by qualification, or key indexing:

>>> X.data1, X.__dict__['data1']
('spam', 'spam')

>>> X.data3 = 'toast'
>>> X.__dict__
{'data1': 'spam', 'data3': 'toast', 'data2': 'eggs'}

>>> X.__dict__['data3'] = 'ham'
>>> X.data3
'ham'

This equivalence only applies to attributes attached to the instance, though. Because attribute qualification also performs inheritance, it can access attributes that namespace dictionary indexing cannot. The inherited attribute X.hello, for instance, cannot be had by X.__dict__['hello'].
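A short sketch makes the difference concrete. It uses capitalized class names of our own (Super and Sub) rather than the session's classes, and runs under both Python 2 and 3:

```python
class Super(object):
    def hello(self):
        self.data1 = 'spam'

class Sub(Super):
    pass

X = Sub()
X.hello()                           # Qualification: found by inheritance
try:
    X.__dict__['hello']             # Indexing: looks in the instance only
    found = True
except KeyError:
    found = False
```

The method lives in the superclass's dictionary, so indexing the instance's dictionary for it fails with a KeyError, even though X.hello works.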

And finally, here is the built-in dir function we met in Chapter 3 and Chapter 11 at work on class and instance objects. This function works on anything with attributes: dir(object) is similar to an object.__dict__.keys( ) call. Notice though, that dir sorts its list, and includes some system attributes; as of Python 2.2, dir also collects inherited attributes automatically.[6]

>>> X.__dict__
{'data1': 'spam', 'data3': 'ham', 'data2': 'eggs'}
>>> X.__dict__.keys(  )
['data1', 'data3', 'data2']

>>> dir(X)
['__doc__', '__module__', 'data1', 'data2', 'data3', 'hello', 'hola']
>>> dir(sub)
['__doc__', '__module__', 'hello', 'hola']
>>> dir(super)
['__doc__', '__module__', 'hello']

Experiment with these special attributes on your own to get a better feel for how namespaces actually do their attribute business. Even if you will never use these in the kinds of programs you write, it helps demystify the notion of namespaces in general when you see that they are just normal dictionaries.
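As one such experiment, the sketch below mimics the inheritance search by hand, using only __dict__, __class__, and __bases__. It is a simplification, not Python's actual procedure: the function names lookup and classlookup are ours, the search is plain depth-first and left-to-right, and it ignores hooks such as __getattr__ and the way Python turns functions into methods. The code runs under both Python 2 and 3:

```python
def lookup(inst, name):
    # Search the instance's dictionary first, then climb to its class
    if name in inst.__dict__:
        return inst.__dict__[name]
    return classlookup(inst.__class__, name)

def classlookup(cls, name):
    # Depth-first, left-to-right search of linked class dictionaries
    if name in cls.__dict__:
        return cls.__dict__[name]
    for supercls in cls.__bases__:
        try:
            return classlookup(supercls, name)
        except AttributeError:
            pass                    # Not in this branch: try the next one
    raise AttributeError(name)

class Super(object):
    kind = 'base'

class Sub(Super):
    pass

X = Sub()
X.data1 = 'spam'
results = (lookup(X, 'data1'), lookup(X, 'kind'))
```

The first name is found in the instance's own dictionary, and the second only after climbing from Sub to Super.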

Namespace Links

The prior section introduced the special __class__ and __bases__ instance and class attributes without really telling why you might care about them. In short, they allow you to inspect inheritance hierarchies within your own code. For example, they can be used to display a class tree, as in the following example, file classtree.py:

def classtree(cls, indent):
    print '.'*indent, cls.__name__        # Print class name here.
    for supercls in cls.__bases__:        # Recur to all superclasses
        classtree(supercls, indent+3)         # May visit super > once

def instancetree(inst):
    print 'Tree of', inst                     # Show instance.
    classtree(inst.__class__, 3)          # Climb to its class.

def selftest(  ):
    class A: pass
    class B(A): pass
    class C(A): pass
    class D(B,C): pass
    class E: pass
    class F(D,E): pass
    instancetree(B(  ))
    instancetree(F(  ))
    
if __name__ == '__main__': selftest(  )

The classtree function in this script is recursive: it prints a class’s name using __name__, and then climbs up to superclasses by calling itself. This allows the function to traverse arbitrarily shaped class trees; the recursion climbs to the top, and stops at root superclasses that have empty __bases__. Most of this file is self-test code; when run standalone, it builds a tree of empty classes, makes two instances from it, and prints their class tree structures:

% python classtree.py
Tree of <__main__.B instance at 0x00ACB438>
... B
...... A
Tree of <__main__.F instance at 0x00AC4DA8>
... F
...... D
......... B
............ A
......... C
............ A
...... E

Here, indentation marked by periods is used to denote class tree height. We can import these functions anywhere we want a quick class tree display:

>>> class Emp: pass
>>> class Person(Emp): pass
>>> bob = Person(  )
>>> import classtree
>>> classtree.instancetree(bob)
Tree of <__main__.Person instance at 0x00AD34E8>
... Person
...... Emp

Of course, we could improve on this output format, and perhaps even sketch it in a GUI display. Whether or not you will ever code or use such tools, this example demonstrates one of the many ways that we can make use of special attributes that expose interpreter internals. We’ll meet another when we code a general-purpose attribute listing class, in Section 22.6 of Chapter 22.
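One small step toward a more flexible format is to collect the tree as a list of lines instead of printing it, so that other code can display it however it likes. The following is a sketch along those lines, not part of classtree.py: the function name treelines is ours, and the test that skips object is an assumption added so that the result is the same for classes that have this implicit root (as all classes do in Python 3):

```python
def treelines(cls, indent=3):
    # Same climb as classtree, but return lines instead of printing them
    lines = ['.' * indent + ' ' + cls.__name__]
    for supercls in cls.__bases__:
        if supercls is not object:      # Skip the implicit root, if present
            lines.extend(treelines(supercls, indent + 3))
    return lines

class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

tree = '\n'.join(treelines(D))          # One string, ready to print or display
```

Like the original, this version lists a shared superclass such as A once for each path that reaches it.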



[1] If you’ve used C++, you may recognize this as similar to the notion of C++’s “static” class data—members that are stored in the class, independent of instances. In Python, it’s nothing special: all class attributes are just names assigned in the class statement, whether they happen to reference functions (C++’s “methods”) or something else (C++’s “members”).

[2] Unless the attribute assignment operation has been redefined by a class with the __setattr__ operator overloading method to do something unique.

[3] On a somewhat related note, you can also code multiple __init__ methods within the same single class, but only the last definition will be used; see Chapter 22 for more details.

[4] This description isn’t 100% complete, because instance and class attributes can also be created by assigning to objects outside class statements. But that’s much less common and sometimes more error prone (changes aren’t isolated to class statements). In Python all attributes are always accessible by default; we talk more about name privacy in Chapter 23.

[5] In the current C implementation of Python, you also don’t need to close file objects held by the instance in destructors, because they are automatically closed when reclaimed. However, as mentioned in Chapter 7, it’s better to explicitly call file close methods, because auto-close-on-reclaim is a feature of the implementation, not the language itself (and can vary under Jython).

[6] The content of attribute dictionaries and dir call results is prone to change over time. For example, because Python now allows built-in types to be subclassed like classes, the contents of dir results for built-in types expanded to include operator overloading methods. In general, attribute names with leading and trailing double underscores are interpreter-specific. More on type subclasses in Chapter 23.
