Now that we’ve talked about OOP in the abstract, let’s move on to the details of how this translates to actual code. In this chapter and in Chapter 21, we fill in the syntax details behind the class model in Python.
If you’ve never been exposed to OOP in the past, classes can be somewhat complicated if taken in a single dose. To make class coding easier to absorb, we’ll begin our detailed look at OOP by taking a first look at classes in action in this chapter. We’ll expand on the details introduced here in later chapters of this part of the book; but in their basic form, Python classes are easy to understand.
Classes have three primary distinctions. At a base level, they are mostly just namespaces, much like the modules studied in Part V. But unlike modules, classes also have support for generating multiple objects, namespace inheritance, and operator overloading. Let’s begin our class statement tour by exploring each of these three distinctions in turn.
To understand how the multiple objects idea works, you have to first understand that there are two kinds of objects in Python’s OOP model—class objects and instance objects. Class objects provide default behavior and serve as factories for instance objects. Instance objects are the real objects your programs process; each is a namespace in its own right, but inherits (i.e., has automatic access to) names in the class it was created from. Class objects come from statements, and instances from calls; each time you call a class, you get a new instance of that class.
This object generation concept is very different from any of the other program constructs we’ve seen so far in this book. In effect, classes are factories for making many instances. By contrast, there is only one copy of each module imported (in fact, this is one reason that we have to call reload, to update the single module object).
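As a minimal sketch of this contrast, the following compares a module imported twice with a class called twice. The Factory class and the variable names are hypothetical, chosen only for illustration.

```python
import math
import math as math_again      # a second import reuses the already loaded module

class Factory:                 # hypothetical class, for illustration only
    pass

first = Factory()              # each call makes a new instance object
second = Factory()

same_module = math is math_again          # True: one module object per import
distinct_instances = first is not second  # True: many instances per class
print(same_module)
print(distinct_instances)
```

Both tests print True: the two module names refer to one shared object, while the two instances are separate objects.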
Next, we’ll summarize the bare essentials of Python OOP. Classes are in some ways similar to both def and modules, but they may be quite different than what you’re used to in other languages.
The class statement creates a class object and assigns it a name. Just like the function def statement, the Python class statement is an executable statement. When reached and run, it generates a new class object and assigns it to the name in the class header. Also like def, class statements typically run when the file they are coded in is first imported.
Assignments inside class statements make class attributes. Just as in module files, assignments within a class statement generate attributes in a class object. After running a class statement, class attributes are accessed by name qualification: object.name.
Class attributes provide object state and behavior. Attributes of a class object record state information and behavior, to be shared by all instances created from the class; function def statements nested inside a class generate methods, which process instances.
Calling a class object like a function makes a new instance object. Each time a class is called, it creates and returns a new instance object. Instances represent concrete items in your program’s domain.
Each instance object inherits class attributes and gets its own namespace. Instance objects created from classes are new namespaces; they start out empty, but inherit attributes that live in the class object they were generated from.
Assignments to attributes of self in methods make per-instance attributes. Inside class method functions, the first argument (called self by convention) references the instance object being processed; assignments to attributes of self create or change data in the instance, not the class.
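The difference between shared class attributes and per-instance attributes summarized above can be sketched as follows. The Counter class and its attribute names are hypothetical and not part of this chapter’s examples; print is written as a call so the sketch runs on newer Pythons too.

```python
class Counter:                      # hypothetical example class
    step = 1                        # class attribute: shared by all instances

    def start(self, value):
        self.count = value          # assignment to self: per-instance data

    def bump(self):
        self.count = self.count + self.step   # step is found in the class

a = Counter()
b = Counter()
a.start(0)
b.start(100)
a.bump()
b.bump()
print(a.count)                      # 1
print(b.count)                      # 101

Counter.step = 10                   # a change in the class is seen by both
a.bump()
b.bump()
print(a.count)                      # 11
print(b.count)                      # 111
```

Each instance keeps its own count, but neither has a step of its own, so both pick up the new value assigned in the class.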
Let’s turn to a real example to show how these ideas work in practice. To begin, let’s define a class named FirstClass, by running a Python class statement interactively:
>>> class FirstClass:                  # Define a class object.
...     def setdata(self, value):      # Define class methods.
...         self.data = value          # self is the instance.
...     def display(self):
...         print(self.data)           # self.data: per instance
...
We’re working interactively here, but typically, such a statement would be run when the module file it is coded in is imported. Like functions, your class won’t even exist until Python reaches and runs this statement.
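To illustrate that a class does not exist until its statement runs, here is a small sketch; the Widget class is hypothetical and used only for this demonstration.

```python
try:
    early = Widget()                 # fails: the class statement has not run yet
except NameError:
    early = None

class Widget:                        # running this statement creates the class
    pass                             # object and assigns it to the name Widget

late = Widget()                      # works now that the name is assigned
print(early)                         # None
print(isinstance(late, Widget))      # True
```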
Like all compound statements, class starts with a header line that lists the class name, followed by a body of one or more nested and (usually) indented statements. Here, the nested statements are defs; they define functions that implement the behavior the class means to export. As we’ve learned, def is really an assignment; here, it assigns to the names setdata and display in the class statement’s scope, and so generates attributes attached to the class: FirstClass.setdata and FirstClass.display.
Functions inside a class are usually called methods; they’re normal defs, but the first argument automatically receives an implied instance object when called: the subject of the call. We need a couple of instances to see how:
>>> x = FirstClass( )                  # Make two instances.
>>> y = FirstClass( )                  # Each is a new namespace.
Calling the class this way (notice the parentheses) generates instance objects, which are just namespaces that have access to their class’s attributes. Properly speaking, at this point we have three objects: two instances and a class. Really, we have three linked namespaces, as sketched in Figure 20-1. In OOP terms, we say that x “is a” FirstClass, as is y.
The two instances start empty, but have links back to the class they were generated from. If we qualify an instance with the name of an attribute that lives in the class object, Python fetches the name from the class by inheritance search (unless it also lives in the instance):
>>> x.setdata("King Arthur")           # Call methods: self is x
>>> y.setdata(3.14159)                 # Runs: FirstClass.setdata(y, 3.14159)
Neither x nor y has a setdata of its own; instead, Python follows the link from instance to class if an attribute doesn’t exist in an instance. And that’s about all there is to inheritance in Python: it happens at attribute qualification time, and just involves looking up names in linked objects (e.g., by following the is-a links in Figure 20-1).
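One way to see these linked namespaces is to inspect the built-in __dict__ and __class__ attributes. The sketch below repeats the FirstClass definition so it can run on its own; print is written as a call so it also runs on newer Pythons.

```python
class FirstClass:
    def setdata(self, value):
        self.data = value
    def display(self):
        print(self.data)

x = FirstClass()
x.setdata("King Arthur")

print(x.__dict__)                        # {'data': 'King Arthur'}
print('setdata' in x.__dict__)           # False: not stored in the instance
print('setdata' in FirstClass.__dict__)  # True: it lives in the class
print(x.__class__ is FirstClass)         # True: the instance-to-class link

y = FirstClass()
FirstClass.setdata(y, 3.14159)           # same effect as y.setdata(3.14159)
print(y.data)                            # 3.14159
```

The instance’s own namespace holds only data; setdata is found by following the link to the class.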
In the setdata function inside FirstClass, the value passed in is assigned to self.data. Within a method, self (the name given to the leftmost argument by convention) automatically refers to the instance being processed (x or y), so the assignments store values in the instances’ namespaces, not the class (that’s how the data names in Figure 20-1 are created).
Since classes generate multiple instances, methods must go through the self argument to get to the instance to be processed. When we call the class’s display method to print self.data, we see that it’s different in each instance; on the other hand, display is the same in x and y, since it comes (is inherited) from the class:
>>> x.display( )                       # self.data differs in each.
King Arthur
>>> y.display( )
3.14159
Notice that we stored different object types in the data member (a string and a floating-point number). As with everything else in Python, there are no declarations for instance attributes (sometimes called members); they spring into existence the first time they are assigned a value, just like simple variables. In fact, we can change instance attributes either in the class itself by assigning to self in methods, or outside the class by assigning to an explicit instance object:
>>> x.data = "New value"               # Can get/set attributes
>>> x.display( )                       # outside the class too.
New value
Although less common, we could even generate a brand new attribute on the instance, by assigning to its name outside the class’s method functions:
>>> x.anothername = "spam"             # Can create new attributes too.
This would attach a new attribute called anothername to the instance object x, which may or may not be used by any of the class’s methods. Classes usually create all the instance’s attributes by assignment to the self argument, but they don’t have to; programs can fetch, change, or create attributes on any object that they have a reference to.
Besides serving as object generators, classes also allow us to make changes by introducing new components (called subclasses), instead of changing existing components in place. Instance objects generated from a class inherit the class’s attributes. Python also allows classes to inherit from other classes, and this opens the door to coding hierarchies of classes that specialize behavior by overriding attributes lower in the hierarchy. Here, too, there is no parallel in modules: their attributes live in a single, flat namespace.
In Python, instances inherit from classes, and classes inherit from superclasses. Here are the key ideas behind the machinery of attribute inheritance:
Superclasses are listed in parentheses in a class header. To inherit attributes from another class, just list the class in parentheses in a class statement’s header. The class that inherits is called a subclass, and the class that is inherited from is its superclass.
Classes inherit attributes from their superclasses. Just like instances, a class gets all the attribute names defined in its superclasses; they’re found by Python automatically when accessed, if they don’t exist in the subclass.
Instances inherit attributes from all accessible classes. Instances get names from the class they are generated from, as well as all of that class’s superclasses. When looking for a name, Python checks the instance, then its class, then all superclasses above.
Each object.attribute reference invokes a new, independent search. Python performs an independent search of the class tree, for each attribute fetch expression. This includes both references to instances and classes made outside class statements (e.g., X.attr), as well as references to attributes of the self instance argument in class method functions. Each self.attr in a method invokes a new search for attr in self and above.
Logic changes are made by subclassing, not by changing superclasses. By redefining superclass names in subclasses lower in a hierarchy (tree), subclasses replace, and thus customize, inherited behavior.
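The ideas in this list, especially the fresh search triggered by each self.attr reference, can be sketched with two small classes. Base and Custom are hypothetical names used only for this illustration.

```python
class Base:                          # hypothetical superclass
    def run(self):
        # self.action starts a new search at the instance, then its class
        return "Base.run -> " + self.action()
    def action(self):
        return "Base.action"

class Custom(Base):                  # subclass: lower in the tree
    def action(self):                # redefinition replaces Base.action
        return "Custom.action"

print(Base().run())                  # Base.run -> Base.action
print(Custom().run())                # Base.run -> Custom.action
```

Although run is coded only in Base, the self.action reference inside it finds the Custom version when self is a Custom instance, so Base itself never has to change.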
The next example builds on the earlier FirstClass example. Let’s define a new class, SecondClass, which inherits all of FirstClass’s names and provides one of its own:
>>> class SecondClass(FirstClass):                    # Inherits setdata
...     def display(self):                            # Changes display
...         print('Current value = "%s"' % self.data)
...
SecondClass defines the display method to print with a different format. Because SecondClass defines an attribute of the same name, it effectively overrides and replaces the display attribute in FirstClass.
Recall that inheritance works by searching up from instances, to subclasses, to superclasses, and stops at the first appearance of an attribute name it finds. Since it finds the display name in SecondClass before the one in FirstClass, we say that SecondClass overrides FirstClass’s display. Sometimes we call this act of replacing attributes by redefining them lower in the tree overloading.
The net effect here is that SecondClass specializes FirstClass, by changing the behavior of the display method. On the other hand, SecondClass (and instances created from it) still inherits the setdata method in FirstClass verbatim. Figure 20-2 sketches the namespaces involved; let’s make an instance to demonstrate:
>>> z = SecondClass( )
>>> z.setdata(42)                      # setdata found in FirstClass
>>> z.display( )                       # finds overridden method in SecondClass.
Current value = "42"
As before, we make a SecondClass instance object by calling it. The setdata call still runs the version in FirstClass, but this time the display attribute comes from SecondClass and prints a custom message.
Here’s a very important thing to notice about OOP: the specialization introduced in SecondClass is completely external to FirstClass; it doesn’t affect existing or future FirstClass objects, such as x from the prior example:
>>> x.display( ) # x is still a FirstClass instance (old message).
New value
Rather than changing FirstClass, we customized it. Naturally, this is an artificial example; but as a rule, because inheritance allows us to make changes like this in external components (i.e., in subclasses), classes often support extension and reuse better than functions or modules can.
Before we move on, remember that there’s nothing magic about a class name. It’s just a variable assigned to an object when the class statement runs, and the object can be referenced with any normal expression. For instance, if our FirstClass was coded in a module file instead of being typed interactively, we could import and use its name normally in a class header line:
from modulename import FirstClass          # Copy name into my scope.
class SecondClass(FirstClass):             # Use class name directly.
    def display(self): ...
Or, equivalently:
import modulename                          # Access the whole module.
class SecondClass(modulename.FirstClass):  # Qualify to reference
    def display(self): ...
Like everything else, class names always live within a module, and so follow all the rules we studied in Part V. For example, more than one class can be coded in a single module file; like other names in a module, they are run and defined during imports, and become distinct module attributes. More generally, each module may arbitrarily mix any number of variables, functions, and classes, and all names in a module behave the same way. File food.py demonstrates:
var = 1                # food.var
def func( ):           # food.func
    ...
class spam:            # food.spam
    ...
class ham:             # food.ham
    ...
class eggs:            # food.eggs
    ...
This holds true even if the module and class happen to have the same name. For example, given the following file, person.py:
class person:
    ...
We need to go through the module to fetch the class as usual:
import person                  # Import module
x = person.person( )           # class within module.
Although this path may look redundant, it’s required: person.person refers to the person class inside the person module. Saying just person gets the module, not the class, unless the from statement is used:
from person import person      # Get class from module.
x = person( )                  # Use class name.
Like other variables, we can never see a class in a file without first importing and somehow fetching it from its enclosing file. If this seems confusing, don’t use the same name for a module and a class within it.
Also keep in mind that although classes and modules are both namespaces for attaching attributes, they correspond to very different source code structures: a module reflects an entire file, but a class is a statement within a file. We’ll say more about such distinctions later in this part of the book.
Let’s take a look at the third major distinction of classes: operator overloading. In simple terms, operator overloading lets objects coded with classes intercept and respond to operations that work on built-in types: addition, slicing, printing, qualification, and so on. It’s mostly just an automatic dispatch mechanism: expressions route control to implementations in classes. Here, too, there is nothing similar in modules: modules can implement function calls, but not the behavior of expressions.
Although we could implement all class behavior as method functions, operator overloading lets objects be more tightly integrated with Python’s object model. Moreover, because operator overloading makes our own objects act like built-ins, it tends to foster object interfaces that are more consistent and easier to learn. Here are the main ideas behind overloading operators:
Methods with names such as __X__ are special hooks. Python operator overloading is implemented by providing specially named methods to intercept operations.
Such methods are called automatically when Python evaluates operators. For instance, if an object inherits an __add__ method, it is called when the object appears in a + expression.
Classes may override most built-in type operations. There are dozens of special operator method names, for intercepting and implementing nearly every operation available for built-in types.
Operators allow classes to integrate with Python’s object model. By overloading type operations, user-defined objects implemented with classes act just like built-ins, and so provide consistency.
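As a minimal sketch of this dispatch, the following shows that a + expression and an explicit __add__ call run the same method. The Money class and its cents attribute are hypothetical, invented for this illustration.

```python
class Money:                              # hypothetical example class
    def __init__(self, cents):
        self.cents = cents
    def __add__(self, other):             # runs for "self + other"
        return Money(self.cents + other.cents)

total = Money(150) + Money(275)           # dispatched to Money.__add__
same = Money(150).__add__(Money(275))     # the explicit equivalent
print(total.cents)                        # 425
print(same.cents)                         # 425
```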
On to another example. This time, we define a subclass of SecondClass, which implements three specially-named attributes that Python will call automatically: __init__ is called when a new instance object is being constructed (self is the new ThirdClass object), and __add__ and __mul__ are called when a ThirdClass instance appears in + and * expressions, respectively:
>>> class ThirdClass(SecondClass):                # is-a SecondClass
...     def __init__(self, value):                # On "ThirdClass(value)"
...         self.data = value
...     def __add__(self, other):                 # On "self + other"
...         return ThirdClass(self.data + other)
...     def __mul__(self, other):
...         self.data = self.data * other         # On "self * other"
...
>>> a = ThirdClass("abc")              # New __init__ called
>>> a.display( )                       # Inherited method
Current value = "abc"
>>> b = a + 'xyz'                      # New __add__: makes a new instance
>>> b.display( )
Current value = "abcxyz"
>>> a * 3                              # New __mul__: changes instance in-place
>>> a.display( )
Current value = "abcabcabc"
ThirdClass is a SecondClass, so its instances inherit display from SecondClass. But ThirdClass generation calls pass an argument now (e.g., "abc"); it’s passed to the value argument in the __init__ constructor and assigned to self.data there. Further, ThirdClass objects can show up in + and * expressions; Python passes the instance object on the left to the self argument and the value on the right to other, as illustrated in Figure 20-3.
Specially-named methods such as __init__ and __add__ are inherited by subclasses and instances, just like any other name assigned in a class. If the methods are not coded in a class, Python looks for such names in all superclasses as usual. Operator overloading method names are also not built-in or reserved words: they are just attributes that Python looks for when objects appear in various contexts. They are usually called by Python automatically, but may occasionally be called by your code as well.
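Both points can be sketched together: a subclass that codes no __add__ still supports +, and its constructor calls the superclass’s __init__ explicitly. Tally and LabeledTally are hypothetical names used only for this illustration.

```python
class Tally:                              # hypothetical superclass
    def __init__(self, value):
        self.value = value
    def __add__(self, other):             # runs for "self + other"
        return self.value + other

class LabeledTally(Tally):                # codes no __add__ of its own
    def __init__(self, value, label):
        Tally.__init__(self, value)       # explicit call to a special method
        self.label = label

t = LabeledTally(10, "eggs")
print(t + 5)                              # 15: __add__ is found in Tally
print(t.label)                            # eggs
```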
Notice that the __add__ method makes and returns a new instance object of its class (by calling ThirdClass with the result value), but __mul__ changes the current instance object in place (by reassigning a self attribute). This is different from the behavior of built-in types such as numbers and strings, which always make a new object for the * operator. Because operator overloading is really just an expression-to-method dispatch mechanism, you can interpret operators any way you like in your own class objects.[1]
As a class designer, you can choose to use operator overloading or not. Your choice simply depends on how much you want your object to look and feel like a built-in type. If you omit an overloading operator method and do not inherit it from a superclass, the corresponding operation is not supported for your instances, and will simply raise an exception (or use a standard default) if attempted.
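A short sketch of both outcomes follows. The Plain class is hypothetical; it defines no operator methods, so + fails with an exception, while string conversion falls back to a standard default.

```python
class Plain:                              # hypothetical class with no __add__
    def __init__(self, data):
        self.data = data

p = Plain(1)
try:
    p + 1                                 # no __add__ anywhere in the class tree
except TypeError:
    outcome = "unsupported"
else:
    outcome = "supported"
print(outcome)                            # unsupported

default_text = str(p)                     # no __str__ coded: a default is used
print(default_text)                       # a generic description of the object
```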
Frankly, many operator overloading methods tend to be used only when implementing objects that are mathematical in nature; a vector or matrix class may overload addition, for example, but an employee class likely would not. For simpler classes, you might not use overloading at all, and rely instead on explicit method calls to implement your object’s behavior.
On the other hand, you might also use operator overloading if you need to pass a user-defined object to a function that was coded to expect the operators available on a built-in type like a list or a dictionary. By implementing the same operator set in your class, your objects will support the same expected object interface, and so will be compatible with the function.
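For instance, a function written with lists in mind may only index its argument and take its length. The sketch below assumes a hypothetical function first_and_count and a hypothetical Deck class that implements just those two operations.

```python
def first_and_count(seq):                 # hypothetical function written for lists
    return seq[0], len(seq)

class Deck:                               # hypothetical user-defined sequence
    def __init__(self, cards):
        self.cards = list(cards)
    def __getitem__(self, index):         # runs for deck[index]
        return self.cards[index]
    def __len__(self):                    # runs for len(deck)
        return len(self.cards)

print(first_and_count(["ace", "king"]))         # ('ace', 2)
print(first_and_count(Deck(["ace", "king"])))   # ('ace', 2)
```

The function never checks the type of its argument; because Deck supports the same operators, it works wherever the list did.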
Typically, one overloading method seems to show up in almost every realistic class: the __init__ constructor method. Because it allows classes to fill out the attributes in their newly-created instances immediately, the constructor is useful for almost every kind of class you might code. In fact, even though instance attributes are not declared in Python, you can usually find out which attributes an instance will have by inspecting its class’s __init__ method code.
We’ll see additional inheritance and operator overloading techniques in action in Chapter 21.
[1] But you probably shouldn’t. Common practice dictates that overloaded operators should work the same way built-in operator implementations do. In this case, that means our __mul__ method should return a new object as its result, rather than changing the instance (self) in place; for in-place changes, a mul method call may be better style than a * overload here (e.g., a.mul(3) instead of a * 3).
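One possible shape for the alternative this footnote suggests is sketched below. Repeatable is a hypothetical stand-in for ThirdClass, with a __mul__ that returns a new object and a plain mul method for in-place changes.

```python
class Repeatable:                         # hypothetical stand-in for ThirdClass
    def __init__(self, value):
        self.data = value
    def __mul__(self, other):             # returns a new object, like built-ins
        return Repeatable(self.data * other)
    def mul(self, other):                 # explicit method for in-place change
        self.data = self.data * other

a = Repeatable("abc")
b = a * 3                                 # a is left unchanged
print(a.data)                             # abc
print(b.data)                             # abcabcabc
a.mul(2)                                  # in-place change is now explicit
print(a.data)                             # abcabc
```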