Credit: Alex Martelli, author of Python in a Nutshell (O’Reilly)
Object-oriented programming (OOP) is among Python’s greatest strengths. Python’s OOP features continue to improve steadily and gradually, just like Python in general. You could already write better object-oriented programs in Python 1.5.2 (the ancient, long-stable version that was new when I first began to work with Python) than in any other popular language (excluding, of course, Lisp and its variants: I doubt there’s anything you can’t do well in Lisp-like languages, as long as you can stomach parentheses-heavy concrete syntax). For a few years now, since the release of Python 2.2, Python OOP has become substantially better than it was with 1.5.2. I am constantly amazed at the systematic progress Python achieves without sacrificing solidity, stability, and backwards-compatibility.
To get the most out of Python’s great OOP features, you should use them the Python way, rather than trying to mimic C++, Java, Smalltalk, or other languages you may be familiar with. You can do a lot of mimicry, thanks to Python’s power. However, you’ll get better mileage if you invest time and energy in understanding the Python way. Most of the investment is in increasing your understanding of OOP itself: what is OOP, what does it buy you, and which underlying mechanisms can your object-oriented programs use? The rest of the investment is in understanding the specific mechanisms that Python itself offers.
One caveat is in order. For such a high-level language, Python is quite explicit about the OOP mechanisms it uses behind the curtains: they’re exposed and available for your exploration and tinkering. Exploration and understanding are good, but beware the temptation to tinker. In other words, don’t use unnecessary black magic just because you can. Specifically, don’t use black magic in production code. If you can meet your goals with simplicity (and most often, in Python, you can), then keep your code simple. Simplicity pays off in readability, maintainability, and, more often than not, performance, too. To describe something as clever is not considered a compliment in the Python culture.
So what is OOP all about? First of all, it’s about keeping some state (data) and some behavior (code) together in handy packets. “Handy packets” is the key here. Every program has state and behavior—programming paradigms differ only in how you view, organize, and package them. If the packaging is in terms of objects that typically comprise state and behavior, you’re using OOP. Some object-oriented languages force you to use OOP for everything, so you end up with many objects that lack either state or behavior. Python, however, supports multiple paradigms. While everything in Python is an object, you package things as OOP objects only when you want to. Other languages try to force your programming style into a predefined mold for your own good, while Python empowers you to make and express your own design choices.
With OOP, once you have specified how an object is composed, you can instantiate as many objects of that kind as you need. When you don’t want to create multiple objects, consider using other Python constructs, such as modules. In this chapter, you’ll find recipes for Singleton, an object-oriented design pattern that eliminates the multiplicity of instantiation, and Borg, an idiom that makes multiple instances share state. But if you want only one instance, in Python it’s often best to use a module, not an OOP object.
To describe how an object is made, use the class statement:

class SomeName(object):
    """ You usually define data and code here (in the class body). """

SomeName is a class object. It’s a first-class object, like every Python object, meaning that you can put it in lists and dictionaries, pass it as an argument to functions, and so on. You don’t have to include the (object) part in the class header clause—class SomeName: by itself is also valid Python syntax—but normally you should include that part, as we’ll see later.
When you want a new instance of a class, call the class object as if it were a function. Each call returns a new instance object:
anInstance = SomeName()
another = SomeName()
anInstance and another are two distinct instance objects, instances of the SomeName class. (See Recipe 4.18 for a class that does little more than this and yet is already quite useful.) You can freely bind (i.e., assign or set) and access (i.e., get) attributes (i.e., state) of an instance object:

anInstance.someNumber = 23 * 45
print anInstance.someNumber             # emits: 1035
Instances of an “empty” class like SomeName have no behavior, but they may have state. Most often, however, you want instances to have behavior. Specify the behavior you want by defining methods (with def statements, just like you define functions) inside the class body:
class Behave(object):
    def __init__(self, name):
        self.name = name
    def once(self):
        print "Hello,", self.name
    def rename(self, newName):
        self.name = newName
    def repeat(self, N):
        for i in range(N):
            self.once()
You define methods with the same def statement Python uses to define functions, exactly because methods are essentially functions. However, a method is an attribute of a class object, and its first formal argument is (by universal convention) named self. self always refers to the instance on which you call the method.
The method with the special name __init__ is also known as the constructor (or, more properly, the initializer) for instances of the class. Python calls this special method to initialize each newly created instance with the arguments that you passed when calling the class (except for self, which you do not pass explicitly, since Python supplies it automatically). The body of __init__ typically binds attributes on the newly created self instance to appropriately initialize the instance’s state.
Other methods implement the behavior of instances of the class. Typically, they do so by accessing instance attributes. Also, methods often rebind instance attributes, and they may call other methods. Within a class definition, these actions are always done with the self.something syntax. Once you instantiate the class, however, you call methods on the instance, access the instance’s attributes, and even rebind them, using the theobject.something syntax:
beehive = Behave("Queen Bee")
beehive.repeat(3)
beehive.rename("Stinger")
beehive.once()
print beehive.name
beehive.name = 'See, you can rebind it "from the outside" too, if you want'
beehive.repeat(2)
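For reference, this is the output the snippet produces (reconstructed by me; the original text does not show it):

Hello, Queen Bee
Hello, Queen Bee
Hello, Queen Bee
Hello, Stinger
Stinger
See, you can rebind it "from the outside" too, if you want
Hello, See, you can rebind it "from the outside" too, if you want
Hello, See, you can rebind it "from the outside" too, if you want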
If you’re new to OOP in Python, you should try, in an interactive Python environment, the example snippets I have shown so far and those I’m going to show in the rest of this Introduction. One of the best interactive Python environments for such exploration is the GUI shell supplied as part of the free IDLE development environment that comes with Python.
In addition to the constructor (__init__), your class may have other special methods, meaning methods with names that start and end with two underscores. Python calls the special methods of a class when instances of the class are used in various operations and built-in functions. For example, len(x) returns x.__len__(); a+b normally returns a.__add__(b); a[b] returns a.__getitem__(b). Therefore, by defining special methods in a class, you can make instances of that class interchangeable with objects of built-in types, such as numbers, lists, and dictionaries.

Each operation and built-in function can try several special methods in some specific order. For example, a+b first tries a.__add__(b), but, if that doesn’t pan out, the operation also gives object b a say in the matter, by next trying b.__radd__(a). This kind of intrinsic structuring among special methods, which operations and built-in functions can provide, is an important added value of such functions and operations with respect to pure OO notation such as someobject.somemethod(arguments).
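For instance, here is a minimal sketch of my own (the Money class is hypothetical, not from this chapter) showing how defining __add__ and __radd__ lets instances participate in + from either side:

class Money(object):
    def __init__(self, cents):
        self.cents = cents
    def __add__(self, other):
        # handles money + number
        return Money(self.cents + other)
    __radd__ = __add__          # number + money falls back to __radd__
    def __repr__(self):
        return 'Money(%r)' % self.cents

m = Money(100)
print m + 5                     # emits: Money(105), via m.__add__(5)
print 5 + m                     # emits: Money(105), via m.__radd__(5)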
The ability to handle different objects in similar ways, known as polymorphism, is a major advantage of OOP. Thanks to polymorphism, you can call the same method on various objects, and each object can implement the method appropriately. For example, in addition to the Behave class, you might have another class that implements a repeat method with rather different behavior:
class Repeater(object):
    def repeat(self, N):
        print N * "*-*"
You can mix instances of Behave and Repeater at will, as long as the only method you call on each such instance is repeat:
aMix = beehive, Behave('John'), Repeater(), Behave('world')
for whatever in aMix:
    whatever.repeat(3)
Other languages require inheritance, or the formal definition and implementation of interfaces, in order to enable such polymorphism. In Python, all you need is to have methods with the same signature (i.e., methods of the same name, callable with the same arguments). This signature-based polymorphism allows a style of programming that’s quite similar to generic programming (e.g., as supported by C++’s template classes and functions), without syntax cruft and without conceptual complications.
Python also uses inheritance, which is mostly a handy, elegant, structured way to reuse code. You can define a class by inheriting from another (i.e., subclassing the other class) and then adding or redefining (known as overriding) some methods:
class Subclass(Behave):
    def once(self):
        print '(%s)' % self.name

subInstance = Subclass("Queen Bee")
subInstance.repeat(3)
The Subclass class overrides only the once method, but you can also call the repeat method on subInstance, since Subclass inherits that method from the Behave superclass. The body of the repeat method calls once N times on the specific instance, using whatever version of the once method the instance has. In this case, each call uses the method from the Subclass class, which prints the name in parentheses, not the original version from the Behave class, which prints the name after a greeting. The idea of a method calling other methods on the same instance and getting the appropriately overridden version of each is important in every object-oriented language, including Python. It is also known as the Template Method design pattern.
The method of a subclass often overrides a method from the superclass, but also needs to call the method of the superclass as part of its own operation. You can do this in Python by explicitly getting the method as a class attribute and passing the instance as the first argument:
class OneMore(Behave):
    def repeat(self, N):
        Behave.repeat(self, N+1)

zealant = OneMore("Worker Bee")
zealant.repeat(3)
The OneMore class implements its own repeat method in terms of the method with the same name in its superclass, Behave, with a slight change. This approach, known as delegation, is pervasive in all programming. Delegation involves implementing some functionality by letting another existing piece of code do most of the work, often with some slight variation. An overriding method is often best implemented by delegating some of the work to the same method in the superclass. In Python, the syntax Classname.method(self, ...) delegates to Classname’s version of the method. A vastly preferable way to perform superclass delegation, however, is to use Python’s built-in super:
class OneMore(Behave):
    def repeat(self, N):
        super(OneMore, self).repeat(N+1)
This super construct is equivalent to the explicit use of Behave.repeat in this simple case, but it also allows class OneMore to be used smoothly with multiple inheritance. Even if you’re not interested in multiple inheritance at first, you should still get into the habit of using super instead of explicit delegation to your base class by name—super costs nothing, and it may prove very useful to you in the future.
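To see why, consider this minimal sketch of my own (the classes are hypothetical, not from this chapter): in a diamond-shaped hierarchy, super runs each method exactly once, following the method resolution order, whereas explicit delegation by class name would run the shared base’s method twice:

class A(object):
    def greet(self):
        print "A"
class B(A):
    def greet(self):
        print "B"
        super(B, self).greet()
class C(A):
    def greet(self):
        print "C"
        super(C, self).greet()
class D(B, C):
    def greet(self):
        print "D"
        super(D, self).greet()

D().greet()     # emits: D, B, C, A -- each class's method exactly once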
Python does fully support multiple inheritance: one class can inherit from several other classes. In terms of coding, this feature is sometimes just a minor one that lets you use the mix-in class idiom, a convenient way to supply functionality across a broad range of classes. (See Recipe 6.20 and Recipe 6.12, for unusual but powerful examples of using the mix-in idiom.) However, multiple inheritance is particularly important because of its implications for object-oriented analysis—the way you conceptualize your problem and your solution in the first place. Single inheritance pushes you to frame your problem space via taxonomy (i.e., mutually exclusive classification). The real world doesn’t work like that. Rather, it resembles Jorge Luis Borges’ explanation in The Analytical Language of John Wilkins, from a purported Chinese encyclopedia, The Celestial Emporium of Benevolent Knowledge. Borges explains that all animals are divided into:
Those that belong to the Emperor
Embalmed ones
Those that are trained
Suckling pigs
Mermaids
Fabulous ones
Stray dogs
Those included in the present classification
Those that tremble as if they were mad
Innumerable ones
Those drawn with a very fine camelhair brush
Others
Those that have just broken a flower vase
Those that from a long way off look like flies
You get the point: taxonomy forces you to pigeonhole, fitting everything into categories that aren’t truly mutually exclusive. Modeling aspects of the real world in your programs is hard enough without buying into artificial constraints such as taxonomy. Multiple inheritance frees you from these constraints.
Ah, yes, that (object) thing—I had promised to come back to it later. Now that you’ve seen Python’s notation for inheritance, you realize that writing class X(object) means that class X inherits from class object. If you just write class Y:, you’re saying that Y doesn’t inherit from anything—Y, so to speak, “stands on its own”. For backwards compatibility, Python allows you to request such a rootless class, and, if you do, then Python makes class Y an “old-style” class, also known as a classic class, meaning a class that works just like all classes used to work in the Python versions of old. Python is very keen on backwards-compatibility.
For many elementary uses, you won’t notice the difference between classic classes and the new-style classes that are recommended for all new Python code you write. However, it’s important to underscore that classic classes are a legacy feature, not recommended for new code. Even within the limited compass of elementary OOP features that I cover in this Introduction, you will already feel some of the limitations of classic classes: for example, you cannot use super within classic classes, and in practice, you should not make any serious use of multiple inheritance with them. Many important features of today’s Python OOP, such as the property built-in, can’t work completely, if they even work at all, with old-style classes.
In practice, even if you’re maintaining a large body of legacy Python code, the next time you need to do any substantial maintenance on that code, you should take the little effort required to ensure all classes are new style: it’s a small job, and it will ease your future maintenance burden quite a bit. Instead of explicitly having all your classes inherit from object, an equivalent alternative is to add the following assignment statement close to the start of every module that defines any classes:

__metaclass__ = type
The built-in type is the metaclass of object and of every other new-style class and built-in type. That’s why inheriting from object or any built-in type makes a class new style: the class you’re coding gets the same metaclass as its base. A class without bases can get its metaclass from the module-global __metaclass__ variable, which is why the statement I suggest suffices to ensure that any classes without explicit bases are made new-style. Even if you never make any other use of explicit metaclasses (a rather advanced subject that is, nevertheless, mentioned in several of this chapter’s recipes), this one simple use of them will stand you in good stead.
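As a quick check (my own sketch, not part of the original text), you can verify the effect: with the module-global assignment in place, even a bases-less class comes out new-style:

__metaclass__ = type

class NoBases:                         # no explicit bases, yet new-style
    pass

print type(NoBases)                    # emits: <type 'type'>, not <type 'classobj'>
print issubclass(NoBases, object)      # emits: True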
Credit: Artur de Sousa Rocha, Adde Nilsson
You want to convert easily among Kelvin, Celsius, Fahrenheit, and Rankine scales of temperature.
Rather than having a dozen functions to do all possible conversions, we can more elegantly package this functionality into a class:
class Temperature(object):
    coefficients = {'c': (1.0, 0.0, -273.15), 'f': (1.8, -273.15, 32.0),
                    'r': (1.8, 0.0, 0.0)}
    def __init__(self, **kwargs):
        # default to absolute (Kelvin) 0, but allow one named argument,
        # with name being k, c, f or r, to use any of the scales
        try:
            name, value = kwargs.popitem()
        except KeyError:
            # no arguments, so default to k=0
            name, value = 'k', 0
        # error if there are more arguments, or the arg's name is unknown
        if kwargs or name not in 'kcfr':
            kwargs[name] = value             # put it back for diagnosis
            raise TypeError, 'invalid arguments %r' % kwargs
        setattr(self, name, float(value))
    def __getattr__(self, name):
        # maps getting of c, f, r, to computation from k
        try:
            eq = self.coefficients[name]
        except KeyError:
            # unknown name, give error message
            raise AttributeError, name
        return (self.k + eq[1]) * eq[0] + eq[2]
    def __setattr__(self, name, value):
        # maps settings of k, c, f, r, to setting of k; forbids others
        if name in self.coefficients:
            # name is c, f or r -- compute and set k
            eq = self.coefficients[name]
            self.k = (value - eq[2]) / eq[0] - eq[1]
        elif name == 'k':
            # name is k, just set it
            object.__setattr__(self, name, value)
        else:
            # unknown name, give error message
            raise AttributeError, name
    def __str__(self):
        # readable, concise representation as string
        return "%s K" % self.k
    def __repr__(self):
        # detailed, precise representation as string
        return "Temperature(k=%r)" % self.k
Converting between several different scales or units of measure is a task that’s subject to a “combinatorial explosion”: if we tackle it in the apparently obvious way, by providing a function for each conversion, then, to deal with n different units, we will have to write n * (n-1) functions. For the four temperature scales covered here, that would already mean 4 * 3 = 12 separate conversion functions.
A Python class can intercept attribute setting and getting, and perform computation on the fly in response. This power enables a much handier and more elegant architecture, as shown in this recipe for the specific case of temperatures.
Inside the class, we always hold the measurement in one reference unit or scale, Kelvin (absolute) degrees in the case of this recipe. We allow the setting of the value to happen through any of four attribute names ('k', 'r', 'c', 'f', abbreviations of the scales’ names), and compute and set the Kelvin-scale value appropriately. Vice versa, we also allow the “getting” of the value in any scale, through the same attribute names, computing the result on the fly. (Assuming you have saved the code in this recipe as te.py somewhere on your Python sys.path, you can import it as a module.) For example:
>>> from te import Temperature
>>> t = Temperature(f=70)          # 70 F is...
>>> print t.c                      # ...a bit over 21 C
21.1111111111
>>> t.c = 23                       # 23 C is...
>>> print t.f                      # ...a bit over 73 F
73.4
__getattr__ and __setattr__ work better than named properties would in this case, since the form of the computation is the same for every attribute (except the reference 'k' one), and we only need different coefficients, which we can most handily keep in a per-class dictionary, the one we name self.coefficients. It’s important to remember that __setattr__ is called on every setting of any attribute, so it must delegate to object the setting of attributes that need to be recorded in the instance (the __setattr__ implementation in this recipe does just such a delegation for attribute k), and it must raise an AttributeError exception for attributes that can’t be set. __getattr__, on the other hand, is called only upon the “getting” of an attribute that can’t be found by other, “normal” means (e.g., in the case of this recipe’s class, __getattr__ is not called for accesses to attribute k, which is recorded in the instance and thus gets found by normal means). __getattr__ must also raise an AttributeError exception for attributes that can’t be accessed.
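This asymmetry is easy to demonstrate with a tiny sketch of my own (the Tracer class is hypothetical, not part of the recipe):

class Tracer(object):
    def __setattr__(self, name, value):
        print 'setattr: %s' % name            # called on EVERY setting
        object.__setattr__(self, name, value)
    def __getattr__(self, name):
        print 'getattr: %s' % name            # called only on lookup MISSES
        raise AttributeError, name

t = Tracer()
t.x = 1            # emits: setattr: x
print t.x          # emits: 1 -- no getattr call, x is found normally
try:
    t.y            # emits: getattr: y
except AttributeError:
    print 'y is missing, so __getattr__ ran (and raised)'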
Library Reference and Python in a Nutshell documentation on attributes and on special methods __getattr__ and __setattr__.
Credit: Alex Martelli
You need to define module-level variables (i.e., named constants) that client code cannot accidentally rebind.
You can install any object as if it were a module. Save the following code as module const.py in some directory on your Python sys.path:
class _const(object):
    class ConstError(TypeError):
        pass
    def __setattr__(self, name, value):
        if name in self.__dict__:
            raise self.ConstError, "Can't rebind const(%s)" % name
        self.__dict__[name] = value
    def __delattr__(self, name):
        if name in self.__dict__:
            raise self.ConstError, "Can't unbind const(%s)" % name
        raise NameError, name

import sys
sys.modules[__name__] = _const()
Now, any client code can import const, then bind an attribute on the const module just once, as follows:
const.magic = 23
Once the attribute is bound, the program cannot accidentally rebind or unbind it:
const.magic = 88      # raises const.ConstError
del const.magic       # raises const.ConstError
In Python, variables can be rebound at will, and modules, differently from classes, don’t let you define special methods such as __setattr__ to stop rebinding. An easy solution is to install an instance as if it were a module. Python performs no type-checks to force entries in sys.modules to actually be module objects. Therefore, you can install any object there and take advantage of attribute-access special methods (e.g., to prevent rebinding, to synthesize attributes on the fly in __getattr__, etc.), while still allowing client code to access the object with import somename. You may even see it as a more Pythonic Singleton-style idiom (but see Recipe 6.16).
This recipe ensures that a module-level name remains constantly bound to the same object once it has first been bound to it. This recipe does not deal with a certain object’s immutability, which is quite a different issue. Altering an object and rebinding a name are different concepts, as explained in Recipe 4.1. Numbers, strings, and tuples are immutable: if you bind a name in const to such an object, not only will the name always be bound to that object, but the object’s contents also will always be the same, since the object is immutable. However, other objects, such as lists and dictionaries, are mutable: if you bind a name in const to, say, a list object, the name will always remain bound to that list object, but the contents of the list may change (e.g., items in it may be rebound or unbound, more items can be added with the object’s append method, etc.).
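A short sketch of my own illustrating the distinction:

import const
const.data = [1, 2, 3]
const.data.append(4)       # fine: this mutates the list, it rebinds nothing
print const.data           # emits: [1, 2, 3, 4]
try:
    const.data = []        # rebinding the name is what's forbidden
except const.ConstError:
    print 'rebinding blocked, as intended'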
To make “read-only” wrappers around mutable objects, see Recipe 6.5. You might choose to have class _const’s __setattr__ method perform such wrapping implicitly. Say you have saved the code from Recipe 6.5 as module ro.py somewhere along your Python sys.path. Then, at the start of module const.py, you need to add:
import ro
and change the assignment self.__dict__[name] = value, used in class _const’s __setattr__ method, to:
self.__dict__[name] = ro.Readonly(value)
Now, when you set an attribute in const to some value, what gets bound there is a read-only wrapper around that value. The underlying value might still get changed by calling mutators on some other reference to that same value (object), but it cannot be accidentally changed through the attribute of “pseudo-module” const. If you want to avoid such “accidental changes through other references”, you need to take a copy, as explained in Recipe 4.1, so that there exist no other references to the value held by the read-only wrapper. Ensure that at the start of module const.py you have:
import ro, copy
and change the assignment in class _const’s __setattr__ method to:
self.__dict__[name] = ro.Readonly(copy.copy(value))
If you’re sufficiently paranoid, you might even use copy.deepcopy rather than plain copy.copy in this latest snippet. However, you may end up paying substantial amounts of memory, as well as losing some performance, by these kinds of excessive precautions. You should evaluate carefully whether so much prudence is really necessary for your specific application. Whatever you end up deciding about this issue, Python offers all the tools you need to implement exactly the amount of constantness you require.
The _const class presented in this recipe can be seen, in a sense, as the “complement” of the NoNewAttrs class, which is presented next, in Recipe 6.3. This one ensures that already bound attributes can never be rebound but lets you freely bind new attributes; the other one, conversely, lets you freely rebind attributes that are already bound but blocks the binding of any new attribute.
Recipe 6.5; Recipe 6.13; Recipe 4.1; Library Reference and Python in a Nutshell docs on module objects, the import statement, and the modules attribute of the sys built-in module.
Credit: Michele Simionato
Python normally lets you freely add attributes to classes and their instances. However, you want to restrict that freedom for some class.
Special method __setattr__ intercepts every setting of an attribute, so it lets you inhibit the addition of new attributes that were not already present. One elegant way to implement this idea is to code a class, a simple custom metaclass, and a wrapper function, all cooperating for the purpose, as follows:
def no_new_attributes(wrapped_setattr):
    """ raise an error on attempts to add a new attribute, while
        allowing existing attributes to be set to new values. """
    def __setattr__(self, name, value):
        if hasattr(self, name):    # not a new attribute, allow setting
            wrapped_setattr(self, name, value)
        else:                      # a new attribute, forbid adding it
            raise AttributeError("can't add attribute %r to %s" % (name, self))
    return __setattr__

class NoNewAttrs(object):
    """ subclasses of NoNewAttrs inhibit addition of new attributes, while
        allowing existing attributes to be set to new values. """
    # block the addition of new attributes to instances of this class
    __setattr__ = no_new_attributes(object.__setattr__)
    class __metaclass__(type):
        " simple custom metaclass to block adding new attributes to this class "
        __setattr__ = no_new_attributes(type.__setattr__)
For various reasons, you sometimes want to restrict Python’s dynamism. In particular, you may want to get an exception when a new attribute is accidentally set on a certain class or one of its instances. This recipe shows how to go about implementing such a restriction. The key point of the recipe is, don’t use __slots__ for this purpose: __slots__ is intended for a completely different task (i.e., saving memory by avoiding each instance having a dictionary, as it normally would, when you need to have vast numbers of instances of a class with just a few fixed attributes). __slots__ performs its intended task well but has various limitations when you try to stretch it to perform, instead, the task this recipe covers. (See Recipe 6.18 for an example of the appropriate use of __slots__ to save memory.)
Notice that this recipe inhibits the addition of runtime attributes, not only to class instances, but also to the class itself, thanks to the simple custom metaclass it defines. When you want to inhibit accidental addition of attributes, you usually want to inhibit it on the class as well as on each individual instance. On the other hand, existing attributes on both the class and its instances may be freely set to new values.
Here is an example of how you could use this recipe:
class Person(NoNewAttrs):
    firstname = ''
    lastname = ''
    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname
    def __repr__(self):
        return 'Person(%r, %r)' % (self.firstname, self.lastname)

me = Person("Michere", "Simionato")
print me
# emits: Person('Michere', 'Simionato')

# oops, wrong value for firstname, can we fix it? Sure, no problem!
me.firstname = "Michele"
print me
# emits: Person('Michele', 'Simionato')
The point of inheriting from NoNewAttrs is forcing yourself to “declare” all allowed attributes by setting them at class level in the body of the class itself. Any further attempt to set a new, “undeclared” attribute raises an AttributeError:
try:
    Person.address = ''
except AttributeError, err:
    print 'raised %r as expected' % err
try:
    me.address = ''
except AttributeError, err:
    print 'raised %r as expected' % err
In some ways, therefore, subclasses of NoNewAttrs and their instances behave more like Java or C++ classes and instances than like normal Python ones. Thus, one use case for this recipe is when you’re coding, in Python, a prototype that you already know will eventually have to be recoded in a less dynamic language.
Library Reference and Python in a Nutshell documentation on the special method __setattr__ and on custom metaclasses; Recipe 6.18 for an example of an appropriate use of __slots__ to save memory; Recipe 6.2 for a class that is the complement of this one.
Credit: Raymond Hettinger
You have several mappings (usually dicts) and want to look things up in them in a chained way (try the first one; if the key is not there, then try the second one; and so on). Specifically, you want to make a single mapping object that “virtually merges” several others, by looking things up in them in a specified priority order, so that you can conveniently pass that one object around.
A mapping is a generalized, abstract version of a dictionary: a mapping provides an interface that’s similar to a dictionary’s, but it may use very different implementations. All dictionaries are mappings, but not vice versa. Here, you need to implement a mapping which sequentially tries delegating lookups to other mappings. A class is the right way to encapsulate this functionality:
class Chainmap(object):
    def __init__(self, *mappings):
        # record the sequence of mappings into which we must look
        self._mappings = mappings
    def __getitem__(self, key):
        # try looking up into each mapping in sequence
        for mapping in self._mappings:
            try:
                return mapping[key]
            except KeyError:
                pass
        # `key' not found in any mapping, so raise KeyError exception
        raise KeyError, key
    def get(self, key, default=None):
        # return self[key] if present, otherwise `default'
        try:
            return self[key]
        except KeyError:
            return default
    def __contains__(self, key):
        # return True if `key' is present in self, otherwise False
        try:
            self[key]
            return True
        except KeyError:
            return False
For example, you can now implement the same sequence of lookups that Python normally uses for any name: look among locals, then (if not found there) among globals, lastly (if not found yet) among built-ins:
import __builtin__
pylookup = Chainmap(locals(), globals(), vars(__builtin__))
Chainmap relies on minimal functionality from the mappings it wraps: each of those underlying mappings must allow indexing (i.e., supply a special method __getitem__), and it must raise the standard exception KeyError when indexed with a key that the mapping does not know about. A Chainmap instance provides the same behavior, plus the handy get method covered in Recipe 4.9 and the special method __contains__ (which conveniently lets you check whether some key k is present in a Chainmap instance c by just coding if k in c).
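Here is a small usage sketch of my own (the dictionaries are hypothetical, not from the recipe):

defaults = {'color': 'red', 'size': 10}
overrides = {'size': 12}
settings = Chainmap(overrides, defaults)
print settings['size']              # emits: 12 -- found in overrides
print settings['color']             # emits: red -- falls back to defaults
print settings.get('weight', 0)     # emits: 0 -- in no mapping, use default
print 'color' in settings           # emits: True -- via __contains__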
Besides the obvious and sensible limitation of being “read-only”, this Chainmap class has others—essentially, it is not a “full mapping” even within the read-only design choice. You can make any partial mapping into a “full mapping” by inheriting from class DictMixin (in standard library module UserDict) and supplying a few key methods (DictMixin implements the others). Here is how you could make a full (read-only) mapping from Chainmap and UserDict.DictMixin:
import UserDict
from sets import Set

class FullChainmap(Chainmap, UserDict.DictMixin):
    def copy(self):
        return self.__class__(*self._mappings)
    def __iter__(self):
        seen = Set()
        for mapping in self._mappings:
            for key in mapping:
                if key not in seen:
                    yield key
                    seen.add(key)
    iterkeys = __iter__
    def keys(self):
        return list(self)
This class FullChainmap adds one requirement to the mappings it holds, besides the requirements posed by Chainmap: the mappings must be iterable. Also note that the implementation in Chainmap of methods get and __contains__ is redundant (although innocuous) once we subclass DictMixin, since DictMixin also implements those two methods (as well as many others) in terms of lower-level methods, just like Chainmap does. See Recipe 5.14 for more details about DictMixin.
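For instance (my own sketch), the full-mapping interface now supports iteration over keys, with earlier mappings taking priority for duplicated keys:

fcm = FullChainmap({'a': 1}, {'a': 0, 'b': 2})
keys = fcm.keys()
keys.sort()
print keys           # emits: ['a', 'b'] -- each key listed once
print fcm['a']       # emits: 1 -- the first mapping wins
print fcm.items()    # DictMixin builds items() from keys and __getitem__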
Recipe 4.9; Recipe 5.14; the Library Reference and Python in a Nutshell sections on mapping types.
Credit: Alex Martelli, Raymond Hettinger
You’d like to inherit from a class or type, but you need some tweak that inheritance does not provide. For example, you want to selectively hide some of the base class’ methods, which inheritance doesn’t allow.
Inheritance is quite handy, but it’s not all-powerful. For example, it doesn’t let you hide methods or other attributes supplied by a base class. Containment with automatic delegation is often a good alternative. Say, for example, you need to wrap some objects to make them read-only, thus preventing accidental alterations. Therefore, besides stopping attribute-setting, you also need to hide mutating methods. Here’s a way:
# support 2.3 as well as 2.4
try:
    set
except NameError:
    from sets import Set as set

class ROError(AttributeError):
    pass

class Readonly: # there IS a reason to NOT subclass object, see Discussion
    mutators = {
        list: set('''__delitem__ __delslice__ __iadd__ __imul__
                     __setitem__ __setslice__ append extend insert
                     pop remove sort'''.split()),
        dict: set('''__delitem__ __setitem__ clear pop popitem
                     setdefault update'''.split()),
        }
    def __init__(self, o):
        object.__setattr__(self, '_o', o)
        object.__setattr__(self, '_no', self.mutators.get(type(o), ()))
    def __setattr__(self, n, v):
        raise ROError, "Can't set attr %r on RO object" % n
    def __delattr__(self, n):
        raise ROError, "Can't del attr %r from RO object" % n
    def __getattr__(self, n):
        if n in self._no:
            raise ROError, "Can't get attr %r from RO object" % n
        return getattr(self._o, n)
Code using this class Readonly can easily add other wrappable types with Readonly.mutators[sometype] = the_mutators.
Automatic delegation, which the special methods __getattr__, __setattr__, and __delattr__ enable us to perform so smoothly, is a powerful, general technique. In this recipe, we show how to use it to get an effect that is almost indistinguishable from subclassing while hiding some names. In particular, we apply this quasi-subclassing to the task of wrapping objects to make them read-only. Performance isn’t quite as good as it might be with real inheritance, but we get better flexibility and finer-grained control as compensation.
The fundamental idea is that each instance of our class holds an instance of the type we are wrapping (i.e., extending and/or tweaking). Whenever client code tries to get an attribute from an instance of our class, unless the attribute is specifically defined there (e.g., the mutators dictionary in class Readonly), __getattr__ transparently shunts the request to the wrapped instance after appropriate checks. In Python, methods are also attributes, accessed in just the same way, so we don’t need to do anything different to access methods. The __getattr__ approach used to access data attributes works for methods just as well.
This is where the comment in the recipe about there being a specific reason to avoid subclassing object comes in. Our __getattr__-based approach does work on special methods too, but only for instances of old-style classes. In today’s object model, Python operations access special methods on the class, not on the instance. Solutions to this issue are presented next, in Recipe 6.6, and in Recipe 20.8. The approach adopted in this recipe—making class Readonly old style, so that the issue can be locally avoided and delegated to other recipes—is definitely not recommended for production code. I use it here only to keep this recipe shorter and to avoid duplicating coverage that is already amply given elsewhere in this cookbook.
__setattr__ plays a role similar to __getattr__, but it gets called when client code sets an instance attribute; in this case, since we want to make a read-only wrapper, we simply forbid the operation. Remember, to avoid triggering __setattr__ from inside the methods you code, you must never code normal self.n = v statements within the methods of classes that have __setattr__. The simplest workaround is to delegate the setting to class object, just like our class Readonly does twice in its __init__ method. Method __delattr__ completes the picture, dealing with any attempts to delete attributes from an instance.
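A short usage sketch of my own:

data = [1, 2, 3]
ro_data = Readonly(data)
print ro_data.index(2)     # emits: 1 -- non-mutating methods pass through
try:
    ro_data.append(4)      # a mutator, hidden by __getattr__
except ROError, err:
    print 'blocked:', err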
Wrapping by automatic delegation does not work well with client or framework code that, one way or another, does type-testing. In such cases, the client or framework code is breaking polymorphism and should be rewritten. Remember not to use type-tests in your own client code, as you probably do not need them anyway. See Recipe 6.13 for better alternatives.
In old versions of Python, automatic delegation was even more prevalent, since you could not subclass built-in types. In modern Python, you can inherit from built-in types, so you’ll use automatic delegation less often. However, delegation still has its place—it is just a bit farther from the spotlight. Delegation is more flexible than inheritance, and sometimes such flexibility is invaluable. In addition to the ability to delegate selectively (thus effectively “hiding” some of the attributes), an object can delegate to different subobjects over time, or to multiple subobjects at one time, and inheritance doesn’t offer anything comparable.
Here is an example of delegating to multiple specific subobjects. Say that you have classes that are chock full of “forwarding methods”, such as:
class Pricing(object):
    def __init__(self, location, event):
        self.location = location
        self.event = event
    def setlocation(self, location):
        self.location = location
    def getprice(self):
        return self.location.getprice()
    def getquantity(self):
        return self.location.getquantity()
    def getdiscount(self):
        return self.event.getdiscount()
    # ...and many more such methods...
Inheritance is clearly not applicable, because an instance of Pricing must delegate to specific location and event instances, which get passed at initialization time and may even be changed. Automatic delegation to the rescue:
class AutoDelegator(object):
    delegates = ()
    do_not_delegate = ()
    def __getattr__(self, key):
        # look up do_not_delegate on self, so subclasses can override it
        if key not in self.do_not_delegate:
            for d in self.delegates:
                try:
                    return getattr(d, key)
                except AttributeError:
                    pass
        raise AttributeError, key

class Pricing(AutoDelegator):
    def __init__(self, location, event):
        self.delegates = [location, event]
    def setlocation(self, location):
        self.delegates[0] = location
In this case, we do not delegate the setting and deletion of attributes, only the getting of attributes (and nonspecial methods). Of course, this approach is fully applicable only when the methods (and other attributes) of the various objects to which we want to delegate do not interfere with each other; for example, location must not have a getdiscount method; otherwise, it would preempt the delegation of that method, which is intended to go to event.
If a class that does lots of delegation has a few such issues to solve, it can do so by explicitly defining the few corresponding methods, since __getattr__ enters the picture only for attributes and methods that cannot be found otherwise. The ability to hide some attributes and methods that a delegate supplies, but which the delegator does not want to expose, is supported through attribute do_not_delegate, which any subclass may override. For example, if class Pricing wanted to hide a method setdiscount that is supplied by, say, event, only a tiny change would be required:
class Pricing(AutoDelegator):
    do_not_delegate = ('setdiscount',)
while all the rest remains as in the previous snippet.
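To make the delegation concrete, here is a small usage sketch of my own, with stub Location and Event classes that are not part of the recipe:

class Location(object):
    def getprice(self):
        return 100.0

class Event(object):
    def getdiscount(self):
        return 0.1

p = Pricing(Location(), Event())
print p.getprice()        # emits: 100.0 -- found on the location delegate
print p.getdiscount()     # emits: 0.1 -- found on the event delegate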
Recipe 6.13; Recipe 6.6; Recipe 20.8; Python in a Nutshell chapter on OOP; PEP 253 (http://www.python.org/peps/pep-0253.html) for more details about Python’s current (new-style) object model.
Credit: Gonçalo Rodrigues
In the new-style object model, Python operations perform implicit lookups for special methods on the class (rather than on the instance, as they do in the classic object model). Nevertheless, you need to wrap new-style instances in proxies that can also delegate a selected set of special methods to the object they’re wrapping.
You need to generate each proxy’s class on the fly. For example:
class Proxy(object):
    """ base class for all proxies """
    def __init__(self, obj):
        super(Proxy, self).__init__(obj)
        self._obj = obj
    def __getattr__(self, attrib):
        return getattr(self._obj, attrib)

def make_binder(unbound_method):
    def f(self, *a, **k):
        return unbound_method(self._obj, *a, **k)
    # in 2.4, only: f.__name__ = unbound_method.__name__
    return f

known_proxy_classes = {}

def proxy(obj, *specials):
    ''' factory-function for a proxy able to delegate special methods '''
    # do we already have a suitable customized class around?
    obj_cls = obj.__class__
    key = obj_cls, specials
    cls = known_proxy_classes.get(key)
    if cls is None:
        # we don't have a suitable class around, so let's make it
        cls = type("%sProxy" % obj_cls.__name__, (Proxy,), {})
        for name in specials:
            name = '__%s__' % name
            unbound_method = getattr(obj_cls, name)
            setattr(cls, name, make_binder(unbound_method))
        # also cache it for the future
        known_proxy_classes[key] = cls
    # instantiate and return the needed proxy
    return cls(obj)
Proxying and automatic delegation are a joy in Python, thanks to the __getattr__ hook. Python calls it automatically when a lookup for any attribute (including a method—Python draws no distinction there) has not otherwise succeeded.
In the old-style (classic) object model, __getattr__ also applied to special methods that were looked up as part of a Python operation. This required some care to avoid mistakenly supplying a special method one didn’t really want to supply but was otherwise handy. Nowadays, the new-style object model is recommended for all new code: it is faster, more regular, and richer in features. You get new-style classes when you subclass object or any other built-in type. One day, some years from now, Python 3.0 will eliminate the classic object model, as well as other features that are still around only for backwards-compatibility. (See http://www.python.org/peps/pep-3000.html for details about plans for Python 3.0—almost all changes will be language simplifications, rather than new features.)
In the new-style object model, Python operations don’t look up special methods at runtime: they rely on “slots” held in class objects. Such slots are updated when a class object is built or modified. Therefore, a proxy object that wants to delegate some special methods to an object it’s wrapping needs to belong to a specially made and tailored class. Fortunately, as this recipe shows, making and instantiating classes on the fly is quite an easy job in Python.
In this recipe, we don’t use any advanced Python concepts such as custom metaclasses and custom descriptors. Rather, each proxy is built by a factory function proxy, which takes as arguments the object to wrap and the names of special methods to delegate (shorn of leading and trailing double underscores). If you’ve saved the Solution’s code in a file named proxy.py somewhere along your Python sys.path, here is how you could use it from an interactive Python interpreter session:
>>> import proxy
>>> a = proxy.proxy([], 'len', 'iter')   # only delegate __len__ & __iter__
>>> a                                    # __repr__ is not delegated
<proxy.listProxy object at 0x0113C370>
>>> a.__class__
<class 'proxy.listProxy'>
>>> a._obj
[]
>>> a.append                             # all non-specials are delegated
<built-in method append of list object at 0x010F1A10>
Since __len__ is delegated, len(a) works as expected:
>>> len(a)
0
>>> a.append(23)
>>> len(a)
1
Since __iter__ is delegated, for loops work as expected, as does the intrinsic looping performed by built-ins such as list, sum, max, . . . :
>>> for x in a: print x
...
23
>>> list(a)
[23]
>>> sum(a)
23
>>> max(a)
23
However, since __getitem__ is not delegated, a cannot be indexed nor sliced:
>>> a.__getitem__
<method-wrapper object at 0x010F1AF0>
>>> a[1]
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
TypeError: unindexable object
Function proxy uses a “cache” of classes it has previously generated, the global dictionary known_proxy_classes, keyed by the class of the object being wrapped and the tuple of special methods’ names being delegated. To make a new class, proxy calls the built-in type, passing as arguments the name of the new class (made by appending 'Proxy' to the name of the class being wrapped), class Proxy as the only base, and an “empty” class dictionary (since it’s adding no class attributes yet). Base class Proxy deals with initialization and delegation of ordinary attribute lookups. Then, factory function proxy loops over the names of specials to be delegated: for each of them, it gets the unbound method from the class of the object being wrapped, and sets it as an attribute of the new class within a make_binder closure. make_binder deals with calling the unbound method with the appropriate first argument (i.e., the object being wrapped, self._obj).
Once it’s done preparing a new class, proxy saves it in known_proxy_classes under the appropriate key. Finally, whether the class was just built or recovered from known_proxy_classes, proxy instantiates it, with the object being wrapped as the only argument, and returns the resulting proxy instance.
Recipe 6.5 for more information about automatic delegation; Recipe 6.9 for another example of generating classes on the fly (using a class statement rather than a call to type).
Credit: Gonçalo Rodrigues, Raymond Hettinger
Python tuples are handy ways to group pieces of information, but having to access each item by numeric index is a bother. You’d like to build tuples whose items are also accessible as named attributes.
A factory function is the simplest way to generate the required subclass of tuple:
# use operator.itemgetter if we're in 2.4, roll our own if we're in 2.3
try:
    from operator import itemgetter
except ImportError:
    def itemgetter(i):
        def getter(self): return self[i]
        return getter

def superTuple(typename, *attribute_names):
    " create and return a subclass of `tuple', with named attributes "
    # make the subclass with appropriate __new__ and __repr__ specials
    nargs = len(attribute_names)
    class supertup(tuple):
        __slots__ = ()      # save memory, we don't need per-instance dict
        def __new__(cls, *args):
            if len(args) != nargs:
                raise TypeError, '%s takes exactly %d arguments (%d given)' % (
                                  typename, nargs, len(args))
            return tuple.__new__(cls, args)
        def __repr__(self):
            return '%s(%s)' % (typename, ', '.join(map(repr, self)))
    # add a few key touches to our new subclass of `tuple'
    for index, attr_name in enumerate(attribute_names):
        setattr(supertup, attr_name, property(itemgetter(index)))
    supertup.__name__ = typename
    return supertup
You often want to pass data around by means of tuples, which play the role of C’s structs, or that of simple records in other languages. Having to remember which numeric index corresponds to which field, and accessing the fields by indexing, is often bothersome. Some Python Standard Library modules, such as time and os, which in old Python versions used to return tuples, have fixed the problem by returning, instead, instances of tuple-like types that let you access the fields by name, as attributes, as well as by index, as items. This recipe shows you how to get the same effect for your code, essentially by automatically building a custom subclass of tuple.
Orchestrating the building of a new, customized type can be achieved in several ways; custom metaclasses are often the best approach for such tasks. In this case, however, a simple factory function is quite sufficient, and you should never use more power than you need. Here is how you can use this recipe’s superTuple factory function in your code, assuming you have saved this recipe’s Solution as a module named supertuple.py somewhere along your Python sys.path:
>>> import supertuple
>>> Point = supertuple.superTuple('Point', 'x', 'y')
>>> Point
<class 'supertuple.Point'>
>>> p = Point(1, 2, 3)          # wrong number of fields
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "C:\Python24\Lib\site-packages\supertuple.py", line 16, in __new__
    raise TypeError, '%s takes exactly %d arguments (%d given)' % (
TypeError: Point takes exactly 2 arguments (3 given)
>>> p = Point(1, 2)             # let's do it right this time
>>> p
Point(1, 2)
>>> print p.x, p.y
1 2
Function superTuple’s implementation is quite straightforward. To build the new subclass, superTuple uses a class statement, and in that statement’s body, it defines three specials: an “empty” __slots__ (just to save memory, since our supertuple instances don’t need any per-instance dictionary anyway); a __new__ method that checks the number of arguments before delegating to tuple.__new__; and an appropriate __repr__ method. After the new class object is built, we set into it a property for each named attribute we want. Each such property has only a “getter”, since our supertuples, just like tuples themselves, are immutable—no setting of fields. Finally, we set the new class’ name and return the class object.
Each of the getters is easily built by a simple call to function itemgetter from the standard library module operator. Since operator.itemgetter was introduced only in Python 2.4, at the very start of our module we ensure we have a suitable itemgetter at hand anyway, even in Python 2.3, by rolling our own if necessary.
Library Reference and Python in a Nutshell docs for property, __slots__, tuple, and special methods __new__ and __repr__; (Python 2.4 only) module operator’s function itemgetter.
Credit: Yakov Markovitch
Your classes use some property instances where either the getter or the setter is just boilerplate code to fetch or set an instance attribute. You would prefer to just specify the attribute name, instead of writing boilerplate code.
You need a factory function that catches the cases in which either the getter or the setter argument is a string, wraps that argument into a function, and then delegates the rest of the work to Python’s built-in property:
def xproperty(fget, fset, fdel=None, doc=None):
    if isinstance(fget, str):
        attr_name = fget
        def fget(obj):
            return getattr(obj, attr_name)
    elif isinstance(fset, str):
        attr_name = fset
        def fset(obj, val):
            setattr(obj, attr_name, val)
    else:
        raise TypeError, 'either fget or fset must be a str'
    return property(fget, fset, fdel, doc)
Python’s built-in property is very useful, but it presents one minor annoyance (it may be easier to see as an annoyance for programmers with experience in Delphi). It often happens that you want to have both a setter and a “getter”, but only one of them actually needs to execute any significant code; the other one simply needs to read or write an instance attribute. In that case, property still requires two functions as its arguments. One of the functions will then be just “boilerplate code” (i.e., repetitious plumbing code that is boring, and often voluminous, and thus a likely home for bugs).
For example, consider:
class Lower(object):
    def __init__(self, s=''):
        self.s = s
    def _getS(self):
        return self._s
    def _setS(self, s):
        self._s = s.lower()
    s = property(_getS, _setS)
Method _getS is just boilerplate, yet you have to code it because you need to pass it to property. Using this recipe, you can make your code a little bit simpler, without changing the code’s meaning:
class Lower(object):
    def __init__(self, s=''):
        self.s = s
    def _setS(self, s):
        self._s = s.lower()
    s = xproperty('_s', _setS)
The simplification doesn’t look like much in one small example, but, applied widely all over your code, it can in fact help quite a bit.
The implementation of factory function xproperty in this recipe’s Solution is rather rigidly coded: it requires you to pass both fget and fset, and exactly one of them must be a string. No use case requires that both be strings; when neither is a string, or when you want to have just one of the two accessors, you can (and should) use the built-in property directly. It is better, therefore, to have xproperty check that it is being used accurately, considering that such checks remove no useful functionality and impose no substantial performance penalty either.
Library Reference and Python in a Nutshell documentation on the built-in property.
Credit: Alex Martelli
You need to implement the special method __copy__ so that your class can cooperate with the copy.copy function. Because the __init__ method of your specific class happens to be slow, you need to bypass it and get an “empty”, uninitialized instance of the class.
Here’s a solution that works for both new-style and classic classes:
def empty_copy(obj):
    class Empty(obj.__class__):
        def __init__(self):
            pass
    newcopy = Empty()
    newcopy.__class__ = obj.__class__
    return newcopy
Your classes can use this function to implement __copy__ as follows:
class YourClass(object):
    def __init__(self):
        # ...assume there's a lot of work here...
        pass
    def __copy__(self):
        newcopy = empty_copy(self)
        # ...copy some relevant subset of self's attributes to newcopy...
        return newcopy
Here’s a usage example:
if __name__ == '__main__':
    import copy
    y = YourClass()        # This, of course, does run __init__
    print y
    z = copy.copy(y)       # ...but this doesn't
    print z
As covered in Recipe 4.1, Python doesn’t implicitly copy your objects when you assign them, which is a great thing, because it gives fast, flexible, and uniform semantics. When you need a copy, you explicitly ask for it, often with the copy.copy function, which knows how to copy built-in types, has reasonable defaults for your own objects, and lets you customize the copying process by defining a special method __copy__ in your own classes. If you want instances of a class to be noncopyable, you can define __copy__ and raise a TypeError there. In most cases, you can just let copy.copy’s default mechanisms work, and you get free clonability for most of your classes. This is quite a bit nicer than languages that force you to implement a specific clone method for every class whose instances you want to be clonable.
A __copy__ method often needs to start with an “empty” instance of the class in question (e.g., the class of self), bypassing __init__ when that is a costly operation. The simplest general way to do this is to use the ability that Python gives you to change an instance’s class on the fly: create a new object in a local empty class, then set the new object’s __class__ attribute, as the recipe’s code shows. Inheriting class Empty from obj.__class__ is redundant (but quite innocuous) for old-style (classic) classes, but that inheritance makes the recipe compatible with all kinds of objects of classic or new-style classes (including built-in and extension types). Once you choose to inherit from obj’s class, you must override __init__ in class Empty, or else the whole purpose of the recipe is defeated. The override means that the __init__ method of obj’s class won’t execute, since Python, fortunately, does not automatically execute ancestor classes’ initializers.
Once you have an “empty” object of the required class, you typically need to copy a subset of self’s attributes. When you need all of the attributes, you’re better off not defining __copy__ explicitly, since copying all instance attributes is exactly copy.copy’s default behavior. Unless, of course, you need to do a little bit more than just copying instance attributes; in this case, these two alternative techniques to copy all attributes are both quite acceptable:
newcopy.__dict__.update(self.__dict__)      # either this...
newcopy.__dict__ = dict(self.__dict__)      # ...or this
An instance of a new-style class doesn’t necessarily keep all of its state in __dict__, so you may need to do some class-specific state copying in such cases. Alternatives based on the new standard module can’t be made transparent across classic and new-style classes, and neither can the __new__ static method that generates an empty instance—the latter is only defined in new-style classes, not classic ones. Fortunately, this recipe obviates any such issues.
A good alternative to implementing __copy__ is often to implement the methods __getstate__ and __setstate__ instead: these special methods define your object’s state very explicitly and intrinsically bypass __init__. Moreover, they also support serialization (i.e., pickling) of your class instances: see Recipe 7.4 for more information about these methods.
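For instance, here is a minimal sketch (class and attribute names are hypothetical) of how __getstate__ and __setstate__ make an instance copyable, and picklable, without ever re-running a costly __init__:
import copy

class Point(object):
    def __init__(self, x, y):
        # pretend a lot of costly work happens here...
        self.x, self.y = x, y
    def __getstate__(self):
        return self.x, self.y          # the state, stated explicitly
    def __setstate__(self, state):
        self.x, self.y = state         # __init__ is bypassed

p = Point(1, 2)
q = copy.copy(p)    # goes through __getstate__/__setstate__, not __init__
print q.x, q.y      # emits: 1 2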
So far we have been discussing shallow copies, which is
what you want most of the time. With a shallow copy, your object is
copied, but objects it refers to (attributes or items) are not, so the
newly copied object and the original object refer to the same items or
attributes objects—a fast and lightweight operation. A deep copy is a
heavyweight operation, potentially duplicating a large graph of
objects that refer to each other. You get a deep copy by calling
copy.deepcopy
on an object. If you
need to customize the way in which instances of your class are
deep-copied, you can define the special method __deepcopy__:
class YourClass(object):
    ...
    def __deepcopy__(self, memo):
        newcopy = empty_copy(self)
        # use copy.deepcopy(self.x, memo) to get deep copies of elements
        # in the relevant subset of self's attributes, to set in newcopy
        return newcopy
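To make that skeleton concrete, here is a sketch (attribute names are hypothetical) of a class, built on the recipe’s empty_copy, that deep-copies one attribute, deliberately shares another, and passes the memo dictionary along, as discussed next:
import copy

class Config(object):
    def __init__(self, data, registry):
        self.data = data             # per-instance, must be deep-copied
        self.registry = registry     # deliberately shared, never copied
    def __deepcopy__(self, memo):
        newcopy = empty_copy(self)
        # pass memo along, so shared sub-objects get copied only once
        newcopy.data = copy.deepcopy(self.data, memo)
        newcopy.registry = self.registry
        return newcopy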
If you choose to implement __deepcopy__, remember to respect the memoization protocol that is specified in the Python documentation for standard module copy—get deep copies of all the attributes or items that are needed by calling copy.deepcopy with a second argument, the same memo dictionary that is passed to the __deepcopy__ method. Again, implementing __getstate__ and __setstate__ is often a good alternative, since these methods can also support deep copying: Python takes care of deeply copying the “state” object that __getstate__ returns, before passing it to the __setstate__ method of a new, empty instance. See Recipe 7.4 for more information about these special methods.
Recipe 4.1 about shallow and deep copies; Recipe 7.4 about __getstate__ and __setstate__; the Library Reference and Python in a Nutshell sections on the copy module.
Credit: Joseph A. Knapka, Frédéric Jolliton, Nicodemus
You want to hold references to bound methods, while still allowing the associated object to be garbage-collected.
Weak references (i.e., references that let you get at an object as long as that object is alive, but do not keep that object alive if no other, normal references to it remain) are an important tool in some advanced programming situations. The weakref module in the Python Standard Library lets you use weak references.
However, weakref
’s
functionality cannot directly be used for bound methods unless you
take some precautions. To allow an object to be garbage-collected
despite outstanding references to its bound methods, you need some
wrappers. Put the following code in a file named weakmethod.py in some directory on your
Python sys.path
:
import weakref, new

class ref(object):
    """ Wraps any callable, most importantly a bound method, in a way that
        allows a bound method's object to be GC'ed, while providing the same
        interface as a normal weak reference. """
    def __init__(self, fn):
        try:
            # try getting object, function, and class
            o, f, c = fn.im_self, fn.im_func, fn.im_class
        except AttributeError:                # It's not a bound method
            self._obj = None
            self._func = fn
            self._clas = None
        else:                                 # It is a bound method
            if o is None:
                self._obj = None              # ...actually UN-bound
            else:
                self._obj = weakref.ref(o)    # ...really bound
            self._func = f
            self._clas = c
    def __call__(self):
        if self._obj is None:
            return self._func
        elif self._obj() is None:
            return None
        return new.instancemethod(self._func, self._obj(), self._clas)
A normal bound method holds a strong reference to the bound method’s object. That means that the object can’t be garbage-collected until the bound method is disposed of:
>>> class C(object):
...     def f(self):
...         print "Hello"
...     def __del__(self):
...         print "C dying"
...
>>> c = C()
>>> cf = c.f
>>> del c     # c continues to wander about with glazed eyes...
>>> del cf    # ...until we stake its bound method, only then it goes away:
C dying
This behavior is most often handy, but sometimes it’s not what
you want. For example, if you’re implementing an event-dispatch
system, it might not be desirable for the mere presence of an event
handler (i.e., a bound method) to prevent the associated object from
being reclaimed. The instinctive idea should then be to use weak
references. However, a normal weakref.ref to a bound method doesn’t quite work the way one might expect, because bound methods are first-class objects: each attribute access such as c.f creates a new bound-method object on the fly, and that object dies at once unless something else holds a strong reference to it. Weak references to bound methods are therefore dead-on-arrival—that is, they always return None when dereferenced, unless another strong reference to the same bound-method object exists.
For example, the following code, based on the weakref
module from the Python Standard
Library, doesn’t print “Hello” but raises an exception instead:
>>> import weakref
>>> c = C()
>>> cf = weakref.ref(c.f)
>>> cf        # Oops, better try the lightning again, Igor...
<weakref at 80ce394; dead>
>>> cf()()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object of type 'None' is not callable
On the other hand, the class ref
in the
weakmethod
module shown in this recipe allows you to
have weak references to bound methods in a useful way:
>>> import weakmethod
>>> cf = weakmethod.ref(c.f)
>>> cf()()    # It LIVES! Bwahahahaha!
Hello
>>> del c     # ...and it dies
C dying
>>> print cf()
None
Calling the weakmethod.ref
instance, which
refers to a bound method, has the same semantics as calling a weakref.ref
instance that refers to, say, a
function object: if the referent has died, it returns None
; otherwise, it returns the referent.
Actually, in this case, it returns a freshly minted new.instancemethod (holding a strong reference to the object—so, be sure not to hold on to that, unless you do want to keep the object alive for a while!).
Note that the recipe is carefully coded so you can wrap into a
ref
instance any callable you want, be it a method
(bound or unbound), a function, whatever; the weak-reference semantics, however, are provided only when you’re wrapping a bound
method; otherwise, ref
acts as a normal (strong)
reference, holding the callable alive. This basically lets you use
ref
for wrapping arbitrary callables without needing
to check for special cases.
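For instance, here is a hypothetical quick check, continuing the interactive session above, of ref wrapping a plain function, where it simply acts as a strong reference:
>>> def greet(): print "Hello"
...
>>> r = weakmethod.ref(greet)
>>> r()()     # r() just returns greet itself
Hello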
If you want semantics closer to that of a weakref.proxy
, they’re easy to implement,
for example by subclassing the ref
class given in
this recipe. When you call a proxy, the proxy calls the referent with
the same arguments. If the referent’s object no longer lives, then
weakref.ReferenceError
gets raised
instead. Here’s an implementation of such a proxy
class:
class proxy(ref):
    def __call__(self, *args, **kwargs):
        func = ref.__call__(self)
        if func is None:
            raise weakref.ReferenceError('referent object is dead')
        else:
            return func(*args, **kwargs)
    def __eq__(self, other):
        if type(other) != type(self):
            return False
        return ref.__call__(self) == ref.__call__(other)
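Assuming this proxy class is also placed in the weakmethod module, a hypothetical usage sketch, with the same class C as before, might run as follows:
>>> import weakmethod
>>> c = C()
>>> p = weakmethod.proxy(c.f)
>>> p()       # calls c.f() directly
Hello
>>> del c
C dying
>>> p()
Traceback (most recent call last):
  ...
ReferenceError: referent object is dead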
The Library Reference and
Python in a Nutshell sections on the weakref
and new
modules and on bound-method
objects.
Credit: Sébastien Keim, Paul Moore, Steve Alexander, Raymond Hettinger
You want to define a buffer with a fixed size, so that, when it fills up, adding another element overwrites the first (oldest) one. This kind of data structure is particularly useful for storing log and history information.
This recipe changes the buffer object’s class on the fly, from a nonfull buffer class to a full buffer class, when the buffer fills up:
class RingBuffer(object):
    """ class that implements a not-yet-full buffer """
    def __init__(self, size_max):
        self.max = size_max
        self.data = []
    class __Full(object):
        """ class that implements a full buffer """
        def append(self, x):
            """ Append an element overwriting the oldest one. """
            self.data[self.cur] = x
            self.cur = (self.cur+1) % self.max
        def tolist(self):
            """ return list of elements in correct order. """
            return self.data[self.cur:] + self.data[:self.cur]
    def append(self, x):
        """ append an element at the end of the buffer. """
        self.data.append(x)
        if len(self.data) == self.max:
            self.cur = 0
            # Permanently change self's class from non-full to full
            self.__class__ = self.__Full
    def tolist(self):
        """ Return a list of elements from the oldest to the newest. """
        return self.data

# sample usage
if __name__ == '__main__':
    x = RingBuffer(5)
    x.append(1); x.append(2); x.append(3); x.append(4)
    print x.__class__, x.tolist()
    x.append(5)
    print x.__class__, x.tolist()
    x.append(6)
    print x.data, x.tolist()
    x.append(7); x.append(8); x.append(9); x.append(10)
    print x.data, x.tolist()
A ring buffer is a buffer with a fixed size. When it fills up, adding another element overwrites the oldest one that was still being kept. It’s particularly useful for the storage of log and history information. Python has no direct support for this kind of structure, but it’s easy to construct one. The implementation in this recipe is optimized for element insertion.
The notable design choice in the implementation is that, since these objects undergo a nonreversible state transition at some point in their lifetimes—from nonfull buffer to full buffer (and behavior changes at that point)—I modeled that by changing self.__class__. This works just as well for classic classes as for new-style ones, as long as the old and new classes of the object have the same slots (e.g., it works fine for two new-style classes that have no slots at all, such as RingBuffer and __Full in this recipe). Note that, unlike in some other languages, the fact that class __Full is implemented inside class RingBuffer does not imply any special relationship between these classes; that’s a good thing, too, because no such relationship is necessary.
Changing the class of an instance may be strange in many languages, but it is an excellent Pythonic alternative to other ways of representing occasional, massive, irreversible, and discrete changes of state that vastly affect behavior, as in this recipe. Fortunately, Python supports it for all kinds of classes.
Ring buffers (i.e., bounded queues, and other names) are
quite a useful idea, but the inefficiency of testing whether the ring
is full, and if so, doing something different, is a nuisance. The
nuisance is particularly undesirable in a language like Python, where
there’s no difficulty—other than the massive memory cost involved—in
allowing the list to grow without bounds. So, ring buffers end up
being underused in spite of their potential. The idea of assigning to __class__ to switch behaviors when the ring gets full is the key to this recipe’s efficiency: such class switching is a one-off operation, so it doesn’t make the steady-state cases any less efficient.
Alternatively, we might switch just two methods, rather than the whole class, of a ring buffer instance that becomes full:
class RingBuffer(object):
    def __init__(self, size_max):
        self.max = size_max
        self.data = []
    def _full_append(self, x):
        self.data[self.cur] = x
        self.cur = (self.cur+1) % self.max
    def _full_get(self):
        return self.data[self.cur:] + self.data[:self.cur]
    def append(self, x):
        self.data.append(x)
        if len(self.data) == self.max:
            self.cur = 0
            # Permanently change self's methods from non-full to full
            self.append = self._full_append
            self.tolist = self._full_get
    def tolist(self):
        return self.data
This method-switching approach is essentially equivalent to the class-switching one in the recipe’s solution, albeit through rather different mechanisms. The best approach is probably to use class switching when all methods must be switched in bulk and method switching only when you need finer granularity of behavior change. Class switching is the only approach that works if you need to switch any special methods in a new-style class, since intrinsic lookup of special methods during various operations happens on the class, not on the instance (classic classes differ from new-style ones in this aspect).
You can use many other ways to implement a ring buffer. In
Python 2.4, in particular, you should consider subclassing the new
type collections.deque
, which
supplies a “double-ended queue”, allowing equally effective additions
and deletions from either end:
from collections import deque

class RingBuffer(deque):
    def __init__(self, size_max):
        deque.__init__(self)
        self.size_max = size_max
    def append(self, datum):
        deque.append(self, datum)
        if len(self) > self.size_max:
            self.popleft()
    def tolist(self):
        return list(self)
or, to avoid the if
statement
when at steady state, you can mix this idea with the idea of switching
a method:
from collections import deque

class RingBuffer(deque):
    def __init__(self, size_max):
        deque.__init__(self)
        self.size_max = size_max
    def _full_append(self, datum):
        deque.append(self, datum)
        self.popleft()
    def append(self, datum):
        deque.append(self, datum)
        if len(self) == self.size_max:
            self.append = self._full_append
    def tolist(self):
        return list(self)
With this latest implementation, we need to switch only the
append
method (the
tolist
method remains the same), so method switching
appears to be more appropriate than class switching.
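As a quick sanity check of this last, method-switching deque version, one might run the following (the output shown in the comment follows from the buffer overwriting its five oldest elements):
x = RingBuffer(5)
for i in range(1, 11):
    x.append(i)
print x.tolist()    # emits: [6, 7, 8, 9, 10]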
The Reference Manual and
Python in a Nutshell sections on the standard
type hierarchy and classic and new-style object models; Python 2.4
Library Reference on module collections
.
You need to check whether any changes to an instance’s state have occurred, so that you can selectively save instances that have been modified since the last “save” operation.
An effective solution is a mixin class—a class you can multiply inherit from and that is able to take snapshots of an instance’s state and compare the instance’s current state with the last snapshot to determine whether or not the instance has been modified:
import copy

class ChangeCheckerMixin(object):
    containerItems = {dict: dict.iteritems, list: enumerate}
    immutable = False
    def snapshot(self):
        ''' create a "snapshot" of self's state -- like a shallow copy, but
            recursing over container types (not over general instances:
            instances must keep track of their own changes if needed). '''
        if self.immutable:
            return
        self._snapshot = self._copy_container(self.__dict__)
    def makeImmutable(self):
        ''' the instance state can't change any more, set .immutable '''
        self.immutable = True
        try:
            del self._snapshot
        except AttributeError:
            pass
    def _copy_container(self, container):
        ''' semi-shallow copy, recursing on container types only '''
        new_container = copy.copy(container)
        for k, v in self.containerItems[type(new_container)](new_container):
            if type(v) in self.containerItems:
                new_container[k] = self._copy_container(v)
            elif hasattr(v, 'snapshot'):
                v.snapshot()
        return new_container
    def isChanged(self):
        ''' True if self's state is changed since the last snapshot '''
        if self.immutable:
            return False
        # remove snapshot from self.__dict__, put it back at the end
        snap = self.__dict__.pop('_snapshot', None)
        if snap is None:
            return True
        try:
            return self._checkContainer(self.__dict__, snap)
        finally:
            self._snapshot = snap
    def _checkContainer(self, container, snapshot):
        ''' return True if the container and its snapshot differ '''
        if len(container) != len(snapshot):
            return True
        for k, v in self.containerItems[type(container)](container):
            try:
                ov = snapshot[k]
            except LookupError:
                return True
            if self._checkItem(v, ov):
                return True
        return False
    def _checkItem(self, newitem, olditem):
        ''' compare newitem and olditem.  If they are containers, call
            self._checkContainer recursively.  If they're an instance with
            an 'isChanged' method, delegate to that method.  Otherwise,
            return True if the items differ. '''
        if type(newitem) != type(olditem):
            return True
        if type(newitem) in self.containerItems:
            return self._checkContainer(newitem, olditem)
        if newitem is olditem:
            method_isChanged = getattr(newitem, 'isChanged', None)
            if method_isChanged is None:
                return False
            return method_isChanged()
        return newitem != olditem
I often need change-checking functionality in my applications. For example, when a user closes the last GUI window over a certain document, I need to check whether the document was changed since the last “save” operation; if it was, then I need to pop up a small window to give the user a choice between saving the document, losing the latest changes, or canceling the window-closing operation.
The class ChangeCheckerMixin
, which
this recipe describes, satisfies this need. The idea is to multiply
derive all of your data classes, meaning all classes that hold data
the user views and may change, from
ChangeCheckerMixin
(as well as from any other bases
they need). When the data has just been loaded from or saved to
persistent storage, call method snapshot
on the
top-level, document data class instance. This call takes a “snapshot”
of the current state, basically a shallow copy of the object but with
recursion over containers, and calls the snapshot
methods on any contained instance that has such a method. Any time
afterward, you can call method isChanged
on any data
class instance to check whether the instance state was changed since
the time of its last snapshot.
As container types, ChangeCheckerMixin
, as
presented, considers only list
and
dict
. If you also use other types
as containers, you just need to add them appropriately to the
containerItems
dictionary. That dictionary must map
each container type to a function callable on an instance of that type
to get an iterator on indices and values (with indices usable to index
the container). Container type instances must also support being
shallowly copied with standard library Python function copy.copy
. For example, to add Python 2.4’s
collections.deque
as a container to
a subclass of ChangeCheckerMixin
, you can
code:
import collections

class CCM_with_deque(ChangeCheckerMixin):
    containerItems = dict(ChangeCheckerMixin.containerItems)
    containerItems[collections.deque] = enumerate
since collections.deque
can
be “walked over” with enumerate
,
just like list
can.
Here is a toy example of use for ChangeCheckerMixin:
if __name__ == '__main__':
    class eg(ChangeCheckerMixin):
        def __init__(self, *a, **k):
            self.L = list(*a, **k)
        def __str__(self):
            return 'eg(%s)' % str(self.L)
        def __getattr__(self, a):
            return getattr(self.L, a)
    x = eg('ciao')
    print 'x =', x, 'is changed =', x.isChanged()
    # emits: x = eg(['c', 'i', 'a', 'o']) is changed = True

    # now, assume x gets saved, then...:
    x.snapshot()
    print 'x =', x, 'is changed =', x.isChanged()
    # emits: x = eg(['c', 'i', 'a', 'o']) is changed = False

    # now we change x...:
    x.append('x')
    print 'x =', x, 'is changed =', x.isChanged()
    # emits: x = eg(['c', 'i', 'a', 'o', 'x']) is changed = True
In class eg we only subclass ChangeCheckerMixin because we need no other bases. In particular, we cannot usefully subclass list because the change-checking functionality works only on state that is kept in an instance’s dictionary; so, we must hold a list object in our instance’s dictionary, and delegate to it as needed (in this toy example, we delegate all nonspecial methods, automatically, via __getattr__). With this precaution, we see that the isChanged method correctly reflects the crucial tidbit—whether the instance’s state has been changed since the last call to snapshot on the instance.
An implicit assumption of this recipe is that your application’s
data class instances are organized in a hierarchical fashion. The
tired old (but still valid) example is an invoice containing header
data and detail lines. Each instance of the details data class could
contain other instances, such as product details, which may not be
modifiable in the current activity but are probably modifiable
elsewhere. This is the reason for the immutable
attribute and the makeImmutable
method: when the
attribute is set by calling the method, any outstanding snapshot for
the instance is dropped to save memory, and further calls to either
snapshot
or isChanged
can return
very rapidly.
If your data does not lend itself to such hierarchical
structuring, you may have to take full deep copies, or even “snapshot”
a document instance by taking a full pickle of it, and check for
changes by comparing the new pickle with the last one previously
taken. That may be all right on very fast machines, or when the amount
of data you’re handling is rather modest. In my tests, however, it
shows up as being unacceptably slow for substantial amounts of data on
more ordinary machines. This recipe, when your data organization is
suitable for its application, can offer better performance. If some of
your data classes also contain data that is automatically computed or,
for other reasons, does not need to be saved, store such data in
instances of subordinate classes (which do not
inherit from ChangeCheckerMixin
), rather than either
holding the data as attributes or storing it in ordinary containers
such as lists and dictionaries.
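For reference, a minimal sketch of that brute-force, pickle-based alternative might look like the following (PickleSnapshotMixin is a hypothetical name, and the approach assumes all of the instance’s state is picklable; the comparison is byte-for-byte on the pickled state):
import cPickle

class PickleSnapshotMixin(object):
    ''' brute-force change checking: snapshot by pickling all state '''
    def snapshot(self):
        self.__dict__.pop('_snapshot', None)   # keep old snapshots out
        self._snapshot = cPickle.dumps(self.__dict__, 2)
    def isChanged(self):
        # temporarily remove the snapshot itself before re-pickling
        snap = self.__dict__.pop('_snapshot', None)
        if snap is None:
            return True
        try:
            return cPickle.dumps(self.__dict__, 2) != snap
        finally:
            self._snapshot = snap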
Library Reference and Python
in a Nutshell documentation on multiple inheritance, the
iteritems
method of dictionaries,
and built-in functions enumerate
,
isinstance
, and hasattr
.
Credit: Alex Martelli
You need to check whether an object has certain necessary attributes before performing state-altering operations. However, you want to avoid type-testing because you know it interferes with polymorphism.
In Python, you normally just try performing whatever operations you need to perform. For example, here’s the simplest, no-checks code for doing a certain sequence of manipulations on a list argument:
def munge1(alist):
    alist.append(23)
    alist.extend(range(5))
    alist.append(42)
    alist[4] = alist[3]
    alist.extend(range(2))
If alist is missing any of the methods you’re calling (explicitly, such as append and extend; or implicitly, such as the calls to __getitem__ and __setitem__ implied by the assignment statement alist[4] = alist[3]), the attempt to access and call a missing method raises an exception.
Function munge1
makes no attempt to catch the
exception, so the execution of munge1
terminates, and
the exception propagates to the caller of munge1
. The
caller may choose to catch the exception and deal with it, or
terminate execution and let the exception propagate further back along
the chain of calls, as appropriate.
This approach is usually just fine, but problems may
occasionally occur. Suppose, for example, that the alist
object has an append
method but not an extend
method. In this peculiar case, the
munge1
function partially alters alist
before an exception is raised. Such
partial alterations are generally not cleanly undoable; depending on
your application, they can sometimes be a bother.
To forestall the “partial alterations” problem, the first
approach that comes to mind is to check the type of alist
. Such a naive “Look Before You Leap”
(LBYL) approach may look safer than doing no checks at all, but LBYL
has a serious defect: it loses polymorphism! The worst approach of all
is checking for equality of types:
def munge2(alist):
    if type(alist) is list:    # a very bad idea
        munge1(alist)
    else:
        raise TypeError, "expected list, got %s" % type(alist)
This even fails, without any good reason, when alist
is an instance of a
subclass of list
. You can at least remove that huge
defect by using isinstance
instead:
def munge3(alist):
    if isinstance(alist, list):
        munge1(alist)
    else:
        raise TypeError, "expected list, got %s" % type(alist)
However, munge3
still fails, needlessly, when
alist
is an instance of a type or
class that mimics list
but doesn’t
inherit from it. In other words, such type-checking sacrifices one of
Python’s great strengths: signature-based polymorphism. For example,
you cannot pass to munge3
an instance of Python 2.4’s
collections.deque
, which is a real
pity because such a deque
does
supply all needed functionality and indeed can be passed to the
original munge1
and work just fine. Probably a
zillion sequence types are out there that, like deque
, are quite acceptable to
munge1
but not to munge3
.
Type-checking, even with isinstance
, exacts an enormous price.
A far better solution is accurate LBYL, which is both safe and fully polymorphic:
def munge4(alist):
    # Extract all bound methods you need (get immediate exception,
    # without partial alteration, if any needed method is missing):
    append = alist.append
    extend = alist.extend
    # Check operations, such as indexing, to get an exception ASAP
    # if signature compatibility is missing:
    try:
        alist[0] = alist[0]
    except IndexError:
        pass    # An empty alist is okay
    # Operate: no exceptions are expected from this point onwards
    append(23)
    extend(range(5))
    append(42)
    alist[4] = alist[3]
    extend(range(2))
Python functions are naturally polymorphic on their
arguments because they essentially depend on the methods and behaviors
of the arguments, not
on the
arguments’ types. If you check the types of
arguments, you sacrifice this precious polymorphism, so,
don’t! However, you may perform a few early
checks to obtain some extra safety (particularly against partial
alterations) without substantial costs.
The normal Pythonic way of life can be described as the Easier to Ask Forgiveness than Permission (EAFP) approach: just try to perform whatever operations you need, and either handle or propagate any exceptions that may result. It usually works great. The only real problem that occasionally arises is “partial alteration”: when you need to perform several operations on an object, just trying to do them all in natural order could result in some of them succeeding, and partially altering the object, before an exception is raised.
For example, suppose that munge1
, as shown at
the start of this recipe’s Solution, is called with an actual argument
value for alist
that has an
append
method but lacks extend
. In this case, alist
is altered by the first call to
append
; but then, the attempt to
obtain and call extend
raises an
exception, leaving alist
’s state
partially altered, a situation that may be hard to recover from.
Sometimes, a sequence of operations should ideally be
atomic: either all of the alterations happen,
and everything is fine, or none of them do, and an exception gets
raised.
You can get closer to ideal atomicity by switching to the LBYL approach, but in an accurate, careful way. Extract all bound methods you’ll need, then noninvasively test the necessary operations (such as indexing on both sides of the assignment operator). Move on to actually changing the object state only if all of this succeeds. From that point onward, it’s far less likely (although not impossible) that exceptions will occur in midstream, leaving state partially altered. You could not reach 100% safety even with the strictest type-checking, after all: for example, you might run out of memory just smack in the middle of your operations. So, with or without type-checking, you don’t really ever guarantee atomicity—you just approach asymptotically to that desirable property.
Accurate LBYL generally offers a good trade-off in comparison to
EAFP, assuming we need safeguards against partial alterations. The
extra complication is modest, and the slowdown due to the checks is
typically compensated by the extra speed gained by using bound methods
through local names rather than explicit attribute access (at least if
the operations include loops, which is often the case). It’s important
to avoid overdoing the checks, and the assert
statement can help with that. For
example, you can add such checks as assert
callable(append)
to munge4
. In this case,
the compiler removes the assert
entirely when you run the program with optimization (i.e., with flags
-O
or -OO
passed to the python command), while performing the checks
when the program is run for testing and debugging (i.e., without the
optimization flags).
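For instance, a hypothetical variant of munge4 with such assert-based checks (munge4_checked is an invented name) might read:
def munge4_checked(alist):
    append = alist.append
    extend = alist.extend
    # these asserts are removed entirely when Python runs with -O or -OO:
    assert callable(append)
    assert callable(extend)
    try:
        alist[0] = alist[0]
    except IndexError:
        pass    # an empty alist is okay
    append(23)
    extend(range(5))
    append(42)
    alist[4] = alist[3]
    extend(range(2))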
Language Reference and Python
in a Nutshell about assert
and the meaning of the -O
and -OO
command-line arguments;
Library Reference and Python in a
Nutshell about sequence types, and lists in
particular.
Credit: Elmar Bschorer
An object in your program can switch among several “states”, and the object’s behavior must change along with the object’s state.
The key idea of the State Design Pattern is to objectify the “state” (with its several behaviors) into a class instance (with its several methods). In Python, you don’t have to build an abstract class to represent the interface that is common to the various states: just write the classes for the “states” themselves. For example:
class TraceNormal(object):
    ' state for normal level of verbosity '
    def startMessage(self):
        self.nstr = self.characters = 0
    def emitString(self, s):
        self.nstr += 1
        self.characters += len(s)
    def endMessage(self):
        print '%d characters in %d strings' % (self.characters, self.nstr)

class TraceChatty(object):
    ' state for high level of verbosity '
    def startMessage(self):
        self.msg = []
    def emitString(self, s):
        self.msg.append(repr(s))
    def endMessage(self):
        print 'Message: ', ', '.join(self.msg)

class TraceQuiet(object):
    ' state for zero level of verbosity '
    def startMessage(self): pass
    def emitString(self, s): pass
    def endMessage(self): pass

class Tracer(object):
    def __init__(self, state):
        self.state = state
    def setState(self, state):
        self.state = state
    def emitStrings(self, strings):
        self.state.startMessage()
        for s in strings:
            self.state.emitString(s)
        self.state.endMessage()

if __name__ == '__main__':
    t = Tracer(TraceNormal())
    t.emitStrings('some example strings here'.split())
    # emits: 22 characters in 4 strings
    t.setState(TraceQuiet())
    t.emitStrings('some example strings here'.split())
    # emits nothing
    t.setState(TraceChatty())
    t.emitStrings('some example strings here'.split())
    # emits: Message:  'some', 'example', 'strings', 'here'
With the State Design Pattern, you can “factor out” a number of related behaviors of an object (and possibly some data connected with these behaviors) into an auxiliary state object, to which the main object delegates these behaviors as needed, through calls to methods of the “state” object. In Python terms, this design pattern is related to the idioms of rebinding an object’s whole __class__, as shown in Recipe 6.11, and rebinding just certain methods (shown in Recipe 2.14). This design pattern, in a sense, lies in between those Python idioms: you group a set of related behaviors, rather than switching either all behavior, by changing the object’s whole __class__, or each method on its own, without grouping. With relation to the classic design pattern terminology, this recipe presents a pattern that falls somewhere between the classic State Design Pattern and the classic Strategy Design Pattern.
This State Design Pattern has some extra oomph, compared to the
related Pythonic idioms, because an appropriate amount of data can
live together with the behaviors you’re delegating—exactly as much, or
as little, as needed to support each specific behavior. In the
examples given in this recipe’s Solution, for example, the different
state objects differ greatly in the kind and amount of data they need:
none at all for class TraceQuiet
, just a couple of
numbers for TraceNormal
, a whole list of strings for
TraceChatty
. These responsibilities are usefully
delegated from the main object to each specific “state object”.
In some cases, although not in the specific examples shown in
this recipe, state objects may need to cooperate more closely with the
main object, by calling main object methods or accessing main object
attributes in certain circumstances. To allow this, the main object
can pass as an argument either self
or some bound method of self
to
methods of the “state” objects. For example, suppose that the
functionality in this recipe’s Solution needs to be extended, in that
the main object must keep track of how many lines have been emitted by
messages it has sent. Tracer.__init__ will have to add one per-instance initialization self.lines = 0, and the signature of the “state” objects’ endMessage methods will have to be extended to def endMessage(self, tracer):. The implementation of endMessage in class TraceQuiet will just ignore the tracer argument, since it doesn’t actually emit any lines; the implementations in the other two classes will each add a statement tracer.lines += 1, since each of them emits one line per message.
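Here is a minimal sketch of that extension, under the assumptions just stated (only TraceQuiet and TraceNormal are shown; TraceChatty would change in the same way as TraceNormal):
class TraceQuiet(object):
    def startMessage(self): pass
    def emitString(self, s): pass
    def endMessage(self, tracer): pass      # emits no lines

class TraceNormal(object):
    def startMessage(self):
        self.nstr = self.characters = 0
    def emitString(self, s):
        self.nstr += 1
        self.characters += len(s)
    def endMessage(self, tracer):
        print '%d characters in %d strings' % (self.characters, self.nstr)
        tracer.lines += 1                   # one line was just emitted

class Tracer(object):
    def __init__(self, state):
        self.state = state
        self.lines = 0                      # per-instance line counter
    def setState(self, state):
        self.state = state
    def emitStrings(self, strings):
        self.state.startMessage()
        for s in strings:
            self.state.emitString(s)
        self.state.endMessage(self)         # pass the tracer to the state

t = Tracer(TraceNormal())
t.emitStrings('some example strings here'.split())  # emits the summary line
print t.lines                               # emits: 1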
As you see, the kind of closer coupling implied by this kind of extra functionality need not be particularly problematic. In particular, the key feature of the classic State Design Pattern, that state objects are the ones that handle state switching (while, in the Strategy Design Pattern, the switching comes from the outside), is just not enough of a big deal in Python to warrant considering the two design patterns as separate.
See http://exciton.cs.rice.edu/JavaResources/DesignPatterns/ for good coverage of the classic design patterns, albeit in a Java context.
Credit: Jürgen Hermann
The __new__ staticmethod makes the task very simple:
class Singleton(object):
    """ A Pythonic Singleton """
    def __new__(cls, *args, **kwargs):
        if '_inst' not in vars(cls):
            cls._inst = object.__new__(cls, *args, **kwargs)
        return cls._inst
Just have your class inherit from Singleton, and don’t override __new__. Then, all calls to that class (normally creations of new instances) return the same instance. (The instance is created once, on the first such call to each given subclass of Singleton during each run of your program.)
This recipe shows the one obvious way to implement the “Singleton” Design Pattern in Python (see E. Gamma, et al., Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley). A Singleton is a class that makes sure only one instance of it is ever created. Typically, such a class is used to manage resources that by their nature can exist only once. See Recipe 6.16 for other considerations about, and alternatives to, the “Singleton” design pattern in Python.
We can complete the module with the usual self-test idiom and show this behavior:
if __name__ == '__main__':
    class SingleSpam(Singleton):
        def __init__(self, s): self.s = s
        def __str__(self): return self.s
    s1 = SingleSpam('spam')
    print id(s1), s1
    s2 = SingleSpam('eggs')
    print id(s2), s2
When we run this module as a script, we get something like the
following output (the exact value of id
does vary, of course):
8172684 spam
8172684 spam
The 'eggs' parameter passed when trying to instantiate s2 has been ignored, of course—that’s part of the price you pay for having a Singleton!
One issue with Singleton in general is
subclassability. The way class
Singleton
is coded in this recipe, each descendant
subclass, direct or indirect, will get a separate instance. Literally
speaking, this violates the constraint of only one instance
per class, depending on what one exactly means by
it:
class Foo(Singleton): pass
class Bar(Foo): pass
f = Foo(); b = Bar()
print f is b, isinstance(f, Foo), isinstance(b, Foo)
# emits: False True True
f
and b
are separate instances, yet, according to
the built-in function isinstance
,
they are both instances of Foo
because isinstance
applies the IS-A rule of OOP: an
instance of a subclass IS-An instance of the base class too. On the
other hand, if we took pains to return f
again when b
is being instantiated by calling Bar
, we’d be violating the normal assumption
that calling class Bar
gives us an
instance of class Bar
, not an
instance of a random superclass of Bar
that just happens to have been
instantiated earlier in the course of a run of the program.
In practice, subclassability of Singletons is rather a headache, without any obvious solution. If this issue is important to you, the alternative Borg idiom, explained next in Recipe 6.16, may provide a better approach.
Recipe 6.16; E. Gamma, R. Helm, R. Johnson, J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software (Addison-Wesley).
Credit: Alex Martelli, Alex A. Naanou
You want to make sure that only one instance of a class
is ever created: you don’t care about the id
of the resulting instances, just about
their state and behavior, and you need to ensure
subclassability.
Application needs (forces) related to the
“Singleton” Design Pattern can be met by allowing multiple instances
to be created while ensuring that all instances share state and
behavior. This is more flexible than fiddling with instance creation.
Have your class inherit from the following Borg
class:
class Borg(object):
    _shared_state = {}
    def __new__(cls, *a, **k):
        obj = object.__new__(cls, *a, **k)
        obj.__dict__ = cls._shared_state
        return obj
If you override __new__ in your class (very few classes need to do that), just remember to use Borg.__new__, rather than object.__new__, within your override. If you want instances of your class to share state among themselves, but not with instances of other subclasses of Borg, make sure that your class has, at class scope, the “state”ment:

_shared_state = { }

With this “data override”, your class doesn’t inherit the _shared_state attribute from Borg but rather gets its own. It is to enable this “data override” that Borg’s __new__ uses cls._shared_state instead of Borg._shared_state.
Here’s a typical example of Borg
use:
if __name__ == '__main__':
    class Example(Borg):
        name = None
        def __init__(self, name=None):
            if name is not None: self.name = name
        def __str__(self): return 'name->%s' % self.name
    a = Example('Lara')
    b = Example()                  # instantiating b shares self.name with a
    print a, b
    c = Example('John Malkovich')  # making c changes self.name of a & b too
    print a, b, c
    b.name = 'Seven'               # setting b.name changes name of a & c too
    print a, b, c
When running this module as a main script, the output is:
name->Lara name->Lara
name->John Malkovich name->John Malkovich name->John Malkovich
name->Seven name->Seven name->Seven
All instances of Example share state, so any setting of the name attribute of any instance, either in __init__ or directly, affects all instances equally. However, note that the instances’ ids differ; therefore, since we have not defined special methods __eq__ and __hash__, each instance can work as a distinct key in a dictionary. Thus, if we continue our sample code as follows:
adict = { }
j = 0
for i in a, b, c:
    adict[i] = j
    j = j + 1
for i in a, b, c:
    print i, adict[i]
the output is:
name->Seven 0
name->Seven 1
name->Seven 2
If this behavior is not what you want, add __eq__ and __hash__ methods to the Example class or the Borg superclass. Having these methods might better simulate the existence of a single instance, depending on your exact needs. For example, here’s a version of Borg with these special methods added:
class Borg(object):
    _shared_state = {}
    def __new__(cls, *a, **k):
        obj = object.__new__(cls, *a, **k)
        obj.__dict__ = cls._shared_state
        return obj
    def __hash__(self):
        return 9    # any arbitrary constant integer
    def __eq__(self, other):
        try:
            return self.__dict__ is other.__dict__
        except AttributeError:
            return False
With this enriched version of Borg
, the
example’s output changes to:
name->Seven 2
name->Seven 2
name->Seven 2
The Singleton Design Pattern has a catchy name, but unfortunately it also has the wrong focus for most purposes: it focuses on object identity, rather than on object state and behavior. The Borg design nonpattern makes all instances share state instead, and Python makes implementing this idea a snap.
In most cases in which you might think of using Singleton or Borg, you don’t really need either of them. Just write a Python module, with functions and module-global variables, instead of defining a class, with methods and per-instance attributes. You need to use a class only if you must be able to inherit from it, or if you need to take advantage of the class’ ability to define special methods. (See Recipe 6.2 for a way to combine some of the advantages of classes and modules.) Even when you do need a class, it’s usually unnecessary to include in the class itself any code to enforce the idea that one can’t make multiple instances of it; other, simpler idioms are generally preferable. For example:
class froober(object):
    def __init__(self):
        pass    # etc, etc
froober = froober()
Now froober is by nature the only instance of its own class, since the name 'froober' has been rebound to mean the instance, not the class. Of course, one might call froober.__class__(), but it’s not sensible to spend much energy taking precautions against deliberate abuse of your design intentions. Any obstacles you put in the way of such abuse, somebody else can bypass. Taking precautions against accidental misuse is way plenty. If the very simple idiom shown in this latest snippet is sufficient for your needs, use it, and forget about Singleton and Borg. Remember: do the simplest thing that could possibly work.
On rare occasions, though, an idiom as simple as this one cannot
work, and then you do need more.
The Singleton Design Pattern (described previously in Recipe 6.15) is all about ensuring that just one instance of a certain class is ever created. In my experience, Singleton is generally not the best solution to the problems it tries to solve, producing different kinds of issues in various object models. We typically want to let as many instances be created as necessary, but all with shared state. Who cares about identity? It’s state (and behavior) we care about. The alternate pattern based on sharing state, in order to solve roughly the same problems as Singleton does, has also been called Monostate. Incidentally, I like to call Singleton “Highlander” because there can be only one.
In Python, you can implement the Monostate Design Pattern in many ways, but the Borg design nonpattern is often best. Simplicity is Borg’s greatest strength. Since the __dict__ of any instance can be rebound, Borg in its __new__ rebinds the __dict__ of each of its instances to a class-attribute dictionary. Now, any reference or binding of an instance attribute will affect all instances equally. I thank David Ascher for suggesting the appropriate name Borg for this nonpattern. Borg is a nonpattern because it had no known uses at the time of its first publication (although several uses are now known): two or more known uses are part of the prerequisites for being a design pattern. See the detailed discussion at http://www.aleax.it/5ep.html.
An excellent article by Robert Martin about Singleton and Monostate can be found at http://www.objectmentor.com/resources/articles/SingletonAndMonostate.pdf. Note that most of the disadvantages that Martin attributes to Monostate are really due to the limitations of the languages that Martin is considering, such as C++ and Java, and just disappear when using Borg in Python. For example, Martin indicates, as Monostate’s first and main disadvantage, that “A non-Monostate class cannot be converted into a Monostate class through derivation”—but that is obviously not the case for Borg, which, through multiple inheritance, makes such conversions trivial.
The __getattr__ and __setattr__ special methods are not involved in Borg’s operations. Therefore, you can define them independently in your subclass, for whatever other purposes you may require, or you may leave these special methods undefined. Either way is not a problem because Python does not call __setattr__ in the specific case of the rebinding of the instance’s __dict__ attribute.
Borg does not work well for classes that choose to keep some or all of their per-instance state somewhere other than in the instance’s __dict__. So, in subclasses of Borg, avoid defining __slots__—that’s a memory-footprint optimization that would make no sense, anyway, since it’s meant for classes that have a large number of instances, and Borg subclasses will effectively have just one instance! Moreover, instead of inheriting from built-in types such as list or dict, your Borg subclasses should use wrapping and automatic delegation, as shown previously in Recipe 6.5. (I named this latter twist “DeleBorg,” in my paper available at http://www.aleax.it/5ep.html.)
Saying that Borg “is a Singleton” would be as silly as saying that a portico is an umbrella. Both serve similar purposes (letting you walk in the rain without getting wet)—solve similar forces, in design pattern parlance—but since they do so in utterly different ways, they’re not instances of the same pattern. If anything, as already mentioned, Borg has similarities to the Monostate alternative design pattern to Singleton. However, Monostate is a design pattern, while Borg is not; also, a Python Monostate could perfectly well exist without being a Borg. We can say that Borg is an idiom that makes it easy and effective to implement Monostate in Python.
For reasons mysterious to me, people often conflate issues germane to Borg and Highlander with other, independent issues, such as access control and, particularly, access from multiple threads. If you need to control access to an object, that need is exactly the same whether there is one instance of that object’s class or twenty of them, and whether or not those instances share state. A fruitful approach to problem-solving is known as divide and conquer—making problems easier to solve by splitting apart their different aspects. Making problems more difficult to solve by joining together several aspects must be an example of an approach known as unite and suffer!
Recipe 6.5; Recipe 6.15; Alex Martelli, “Five Easy Pieces: Simple Python Non-Patterns” (http://www.aleax.it/5ep.html).
Credit: Dinu C. Gherman, Holger Krekel
You want to reduce the need for conditional statements in your code, particularly the need to keep checking for special cases.
The usual placeholder object for “there’s nothing here” is
None
, but we may be able to do
better than that by defining a class meant exactly to act as such a
placeholder:
class Null(object):
    """ Null objects always and reliably "do nothing." """
    # optional optimization: ensure only one instance per subclass
    # (essentially just to save memory, no functional difference)
    def __new__(cls, *args, **kwargs):
        if '_inst' not in vars(cls):
            cls._inst = object.__new__(cls)
        return cls._inst
    def __init__(self, *args, **kwargs): pass
    def __call__(self, *args, **kwargs): return self
    def __repr__(self): return "Null()"
    def __nonzero__(self): return False
    def __getattr__(self, name): return self
    def __setattr__(self, name, value): return self
    def __delattr__(self, name): return self
You can use an instance of the Null
class
instead of the primitive value None
. By using such an instance as a
placeholder, instead of None
, you
can avoid many conditional statements in your code and can often
express algorithms with little or no checking for special values. This
recipe is a sample implementation of the Null Object Design Pattern.
(See B. Woolf, “The Null Object Pattern” in Pattern
Languages of Programming [PLoP 96, September
1996].)
This recipe’s Null class ignores all parameters passed when constructing or calling instances, as well as any attempt to set or delete attributes. Any call or attempt to access an attribute (or a method, since Python does not distinguish between the two, calling __getattr__ either way) returns the same Null instance (i.e., self—no reason to create a new instance).
For example, if you have a computation such as:
def compute(x, y):
    try:
        ...lots of computation here to return some appropriate object...
    except SomeError:
        return None
and you use it like this:
for x in xs:
    for y in ys:
        obj = compute(x, y)
        if obj is not None:
            obj.somemethod(y, x)
you can usefully change the computation to:
def compute(x, y):
    try:
        ...lots of computation here to return some appropriate object...
    except SomeError:
        return Null()
and thus simplify its use down to:
for x in xs:
    for y in ys:
        compute(x, y).somemethod(y, x)
The point is that you don’t need to check whether compute
has returned a real result or an
instance of Null
: even in the
latter case, you can safely and innocuously call on it whatever method
you want. Here is another, more specific use case:
log = err = Null()
if verbose:
    log = open('/tmp/log', 'w')
    err = open('/tmp/err', 'w')
log.write('blabla')
err.write('blabla error')
This obviously avoids the usual kind of “pollution” of your code
from guards such as if verbose
:
strewn all over the place. You can now call log.write('bla')
, instead of having to
express each such call as if log is not None:
log.write('bla')
.
In the new object model, Python does not call __getattr__ on an instance for any special methods needed to perform an operation on the instance (rather, it looks up such methods in the instance class’ slots). You may have to take care and customize Null to your application’s needs regarding operations on null objects, and therefore special methods of the null objects’ class, either directly in the class’ sources or by subclassing it appropriately. For example, with this recipe’s Null, you cannot index Null instances, nor take their length, nor iterate on them. If this is a problem for your purposes, you can add all the special methods you need (in Null itself or in an appropriate subclass) and implement them appropriately—for example:
class SeqNull(Null):
    def __len__(self): return 0
    def __iter__(self): return iter(())
    def __getitem__(self, i): return self
    def __delitem__(self, i): return self
    def __setitem__(self, i, v): return self
Similar considerations apply to several other operations.
The key goal of Null
objects
is to provide an intelligent replacement for the often-used primitive
value None
in Python. (Other
languages represent the lack of a value using either null or a null
pointer.) These nobody-lives-here markers/placeholders are used for
many purposes, including the important case in which one member of a
group of otherwise similar elements is special. This usage usually
results in conditional statements all over the place to distinguish
between ordinary elements and the primitive null (e.g., None
) value, but Null
objects help you avoid that.
Among the advantages of using Null
objects are
the following:
Superfluous conditional statements can be avoided by
providing a first-class object alternative for the primitive value
None
, thereby improving code
readability.
Null
objects can act as
placeholders for objects whose behavior is not yet
implemented.
Null
objects can be used
polymorphically with instances of just about any other class
(perhaps needing suitable subclassing for special methods, as
previously mentioned).
Null
objects are very
predictable.
The one serious disadvantage of Null
is that it
can hide bugs. If a function returns None
, and the caller did not expect that
return value, the caller most likely will soon thereafter try to call
a method or perform an operation that None
doesn’t support, leading to a
reasonably prompt exception and traceback. If the return value that
the caller didn’t expect is a Null
, the problem might
stay hidden for a longer time, and the exception and traceback, when
they eventually happen, may therefore be harder to reconnect to the
location of the defect in the code. Is this problem serious enough to
make using Null
inadvisable? The answer is a matter
of opinion. If your code has halfway decent unit tests, this problem
will not arise; while, if your code lacks decent
unit tests, then using Null
is the
least of your problems. But, as I said, it boils
down to a matter of opinions. I use Null
very widely,
and I’m extremely happy with the effect it has had on my
productivity.
The Null
class as presented in this recipe uses
a simple variant of the “Singleton” pattern (shown earlier in Recipe 6.15), strictly for
optimization purposes—namely, to avoid the creation of numerous
passive objects that do nothing but take up memory. Given all the
previous remarks about customization by subclassing, it is, of course,
crucial that the specific implementation of “Singleton” ensures a
separate instance exists for each subclass of
Null
that gets instantiated. The number of subclasses
will no doubt never be so high as to eat up substantial amounts of
memory, and anyway this per-subclass distinction can be semantically
crucial.
B. Woolf, “The Null Object Pattern” in Pattern Languages of Programming (PLoP 96, September 1996), http://www.cs.wustl.edu/~schmidt/PLoP-96/woolf1.ps.gz; Recipe 6.15.
Credit: Peter Otten, Gary Robinson, Henry Crutcher, Paul Moore, Peter Schwalm, Holger Krekel
You want to avoid writing and maintaining __init__ methods that consist of almost nothing but a series of self.something = something assignments.
You can “factor out” the attribute-assignment task to an auxiliary function:
def attributesFromDict(d):
    self = d.pop('self')
    for n, v in d.iteritems():
        setattr(self, n, v)
Now, the typical boilerplate code for an __init__ method such as:
def __init__(self, foo, bar, baz, boom=1, bang=2):
    self.foo = foo
    self.bar = bar
    self.baz = baz
    self.boom = boom
    self.bang = bang
can become a short, crystal-clear one-liner:
def __init__(self, foo, bar, baz, boom=1, bang=2):
    attributesFromDict(locals())
As long as no additional logic is in the body of __init__, the dict returned by calling the built-in function locals contains only the arguments that were passed to __init__ (plus those arguments that were not passed but have default values). Function attributesFromDict extracts the object, relying on the convention that the object is always an argument named 'self', and then interprets all other items in the dictionary as names and values of attributes to set. A similar but simpler technique, not requiring an auxiliary function, is:
def __init__(self, foo, bar, baz, boom=1, bang=2):
    self.__dict__.update(locals())
    del self.self
However, this latter technique has a serious defect when compared to the one presented in this recipe’s Solution: by setting attributes directly into self.__dict__ (through the latter’s update method), it does not play well with properties and other advanced descriptors, while the approach in this recipe’s Solution, using built-in setattr, is impeccable in this respect.
attributesFromDict is not meant for use in an __init__ method that contains more code, and specifically one that uses some local variables, because attributesFromDict cannot easily distinguish, in the dictionary that is passed as its only argument d, between arguments of __init__ and other local variables of __init__. If you’re willing to insert a little introspection in the auxiliary function, this limitation may be overcome:
def attributesFromArguments(d):
    self = d.pop('self')
    codeObject = self.__init__.im_func.func_code
    argumentNames = codeObject.co_varnames[1:codeObject.co_argcount]
    for n in argumentNames:
        setattr(self, n, d[n])
By extracting the code object of the __init__ method, function attributesFromArguments is able to limit itself to the names of __init__’s arguments. Your __init__ method can then call attributesFromArguments(locals()), instead of attributesFromDict(locals()), if and when it needs to continue, after the call, with more code that may define other local variables.
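For instance, a hypothetical __init__ that goes on to compute a local variable might look like this (class and attribute names invented for illustration):
class Rect(object):
    def __init__(self, width, height):
        attributesFromArguments(locals())
        area = width * height    # a mere local variable, never set on self
        self.big = area > 100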
The key limitation of attributesFromArguments is that it does not support __init__ having a last special argument of the **kw kind. Such support can be added, with yet more introspection, but it would require more black magic and complication than the functionality is probably worth. If you nevertheless want to explore this possibility, you can use the inspect module of the standard library, rather than the roll-your-own approach used in function attributesFromArguments, for introspection purposes. inspect.getargspec(self.__init__) gives you both the argument names and the indication of whether self.__init__ accepts a **kw form. See Recipe 6.19 for more information about function inspect.getargspec. Remember the golden rule of Python programming: “Let the standard library do it!”
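For instance, here is a quick sketch of what inspect.getargspec reports (class and argument names are hypothetical):
import inspect

class Eg(object):
    def __init__(self, foo, bar=1, **kw): pass

args, varargs, varkw, defaults = inspect.getargspec(Eg.__init__)
print args        # emits: ['self', 'foo', 'bar']
print varkw       # emits: kw
print defaults    # emits: (1,)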
Library Reference and Python in a Nutshell docs for the built-in function locals, methods of type dict, special method __init__, and introspection techniques (including module inspect).
You want to ensure that __init__ is called for all superclasses that define it, and Python does not do this automatically.
As long as your class is new-style, the built-in super makes this task easy (if all superclasses’ __init__ methods also use super similarly):
class NewStyleOnly(A, B, C):
    def __init__(self):
        super(NewStyleOnly, self).__init__()
        # ...initialization specific to subclass NewStyleOnly...
Classic classes are not recommended
for new code development: they exist only to guarantee backwards
compatibility with old versions of Python. Use new-style classes
(deriving directly or indirectly from object
) for all new code. The only thing you
cannot do with a new-style class is to raise
its instances as exception objects;
exception classes must therefore be old style, but then, you do not
need the functionality of this recipe for such classes. Since the rest
of this recipe’s Discussion is therefore both advanced and of limited
applicability, you may want to skip it.
Still, it may happen that you need to retrofit this
functionality into a classic class, or, more likely, into a new-style
class with some superclasses that do not follow
the proper style of cooperative superclass method-calling with the
built-in super
. In such cases, you
should first try to fix the problematic premises—make all classes new
style and make them use super
properly. If you absolutely cannot fix things, the best you can do is
to have your class loop over its base classes—for each base, check
whether it has an __init__, and
if so, then call it:
class LookBeforeYouLeap(X, Y, Z):
    def __init__(self):
        for base in self.__class__.__bases__:
            if hasattr(base, '__init__'):
                base.__init__(self)
        ...initialization specific to subclass LookBeforeYouLeap...
More generally, and not just for method __init__, we often want to call a method on an instance, or class, if and only if that method exists; if the method does not exist on that class or instance, we do nothing, or we default to another action. The technique shown in the “Solution”, based on built-in super, is not applicable in general: it only works on superclasses of the current object, only if those superclasses also use super appropriately, and only if the method in question does exist in some superclass. Note that all new-style classes do have an __init__ method: they all subclass object, and object defines __init__ (as a do-nothing function that accepts and ignores any arguments). Therefore, all new-style classes have an __init__ method, either by inheritance or by override.
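You can verify this claim directly (class Plain is just a stand-in):

class Plain(object):
    pass

print hasattr(object, '__init__')   # emits: True
print hasattr(Plain, '__init__')    # emits: True, by inheritance from object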
The LBYL technique shown in class LookBeforeYouLeap may be of help in more general cases, including ones that involve methods other than __init__. Indeed, LBYL may even be used together with super, for example, as in the following toy example:
class Base1(object):
    def met(self):
        print 'met in Base1'

class Der1(Base1):
    def met(self):
        s = super(Der1, self)
        if hasattr(s, 'met'):
            s.met()
        print 'met in Der1'

class Base2(object):
    pass

class Der2(Base2):
    def met(self):
        s = super(Der2, self)
        if hasattr(s, 'met'):
            s.met()
        print 'met in Der2'

Der1().met()
Der2().met()
This snippet emits:
met in Base1
met in Der1
met in Der2
The implementation of met has the same structure in both derived classes, Der1 (whose superclass Base1 does have a method named met) and Der2 (whose superclass Base2 doesn't have such a method). By binding a local name s to the result of super, and checking with hasattr that the superclass does have such a method before calling it, this LBYL structure lets you code in the same way in both cases. Of course, when
coding a subclass, you do normally know which
methods the superclasses have, and whether and how you need to call
them. Still, this technique can provide a little extra flexibility for
those occasions in which you need to slightly decouple the subclass
from the superclass.
The LBYL technique is far from perfect, though: a superclass
might define an attribute named met
, which is not
callable or needs a different number of arguments. If your need for
flexibility is so extreme that you must ward against such occurrences,
you can extract the superclass’ method object (if any) and check it
with the getargspec
function of
standard library module inspect
.
While pushing this idea towards full generality can lead into rather deep complications, here is one example of how you might code a class with a method that calls the superclass’ version of the same method only if the latter is callable without arguments:
import inspect

class Der(A, B, C, D):
    def met(self):
        s = super(Der, self)
        # get the superclass's bound-method object, or else None
        m = getattr(s, 'met', None)
        try:
            args, varargs, varkw, defaults = inspect.getargspec(m)
        except TypeError:
            # m is not a method, just ignore it
            pass
        else:
            # m is a method: do all its arguments, except the
            # already-bound self, have default values?  (defaults is
            # None, not an empty tuple, when there are no defaults)
            if len(args) - 1 == len(defaults or ()):
                # yes! so, call it:
                m()
        print 'met in Der'
inspect.getargspec raises a TypeError if its argument is not a method or function, so we catch that case with a try/except statement, and if the exception occurs, we just ignore it with a do-nothing pass statement in the except clause. To simplify our code a bit, we do not first check separately with hasattr. Rather, we get the 'met' attribute of the superclass by calling getattr with a third argument of None. Thus, if the superclass does not have any attribute named 'met', m is set to None, later causing exactly the same TypeError that we have to catch (and ignore) anyway—two birds with one stone. If the call to inspect.getargspec in the try clause does not raise a TypeError, execution continues with the else clause.
If inspect.getargspec doesn't raise a TypeError, it returns a tuple of four items, and we bind each item to a local name. In this case, the ones we care about are args, a list of m's argument names (including the self that is already bound in m), and defaults, a tuple of the default values that m provides for its arguments (or None, when m provides no defaults at all, which is why the code substitutes an empty tuple in that case). Clearly, we can call m without arguments if and only if m provides default values for all of its arguments except the already-bound self. So, we check that there are just as many default values as arguments other than self, by comparing the appropriate lengths, and call m only if the lengths are equal.
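For concreteness, here is what getargspec returns for two throwaway functions (invented just for this illustration); note that defaults is None, not an empty tuple, when a function has no default values:

import inspect

def no_defaults(a, b):
    pass

def all_defaults(a=1, b=2):
    pass

args, varargs, varkw, defaults = inspect.getargspec(no_defaults)
print args, defaults    # emits: ['a', 'b'] None
args, varargs, varkw, defaults = inspect.getargspec(all_defaults)
print args, defaults    # emits: ['a', 'b'] (1, 2)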
No doubt you don’t need such advanced introspection and such careful checking in most of the code you write, but, just in case you do, Python does supply all the tools you need to achieve it.
Docs for built-in functions super
, getattr
, and hasattr
, and module inspect
, in the Library
Reference and Python in a
Nutshell.
Credit: Paul McNett, Alex Martelli
You appreciate the cooperative style of
multiple-inheritance coding supported by the super
built-in, but you wish you could use
that style in a more terse and concise way.
A good solution is a mixin
class—a class you can multiply inherit
from, that uses introspection to allow more terse coding:
import inspect

class SuperMixin(object):
    def super(cls, *args, **kwargs):
        frame = inspect.currentframe(1)
        self = frame.f_locals['self']
        methodName = frame.f_code.co_name
        method = getattr(super(cls, self), methodName, None)
        if inspect.ismethod(method):
            return method(*args, **kwargs)
    super = classmethod(super)
Any class cls that inherits from class SuperMixin acquires a magic method named super: calling cls.super(*args) from within a method named somename of class cls is a concise way to call super(cls, self).somename(*args). Moreover, the call is safe even if no class that follows cls in Method Resolution Order (MRO) defines any method named somename.
Here is a usage example:
if __name__ == '__main__':
    class TestBase(list, SuperMixin):
        # note: no myMethod defined here
        pass
    class MyTest1(TestBase):
        def myMethod(self):
            print "in MyTest1"
            MyTest1.super()
    class MyTest2(TestBase):
        def myMethod(self):
            print "in MyTest2"
            MyTest2.super()
    class MyTest(MyTest1, MyTest2):
        def myMethod(self):
            print "in MyTest"
            MyTest.super()
    MyTest().myMethod()
# emits:
# in MyTest
# in MyTest1
# in MyTest2
Python has been offering “new-style” classes for years, as a
preferable alternative to the classic classes that you get by default.
Classic classes exist only for backwards-compatibility with old
versions of Python and are not recommended for new code. Among the
advantages of new-style classes is the ease of calling superclass
implementations of a method in a “cooperative” way that fully supports
multiple inheritance, thanks to the super
built-in.
Suppose you have a method in a new-style class cls
, which needs to perform a task and then
delegate the rest of the work to the superclass implementation of the
same method. The code idiom is:
def somename(self, *args):
    ...some preliminary task...
    return super(cls, self).somename(*args)
This idiom suffers from two minor issues: it’s slightly verbose,
and it also depends on a superclass offering a method somename
. If you want to make cls
less coupled to other classes, and
therefore more robust, by removing the dependency, the code gets even
more verbose:
def somename(self, *args):
    ...some preliminary task...
    try:
        super_method = super(cls, self).somename
    except AttributeError:
        return None
    else:
        return super_method(*args)
The mixin
class
SuperMixin
shown in this recipe removes both issues.
Just ensure cls
inherits, directly
or indirectly, from SuperMixin
(alongside any other
base classes you desire), and then you can code, concisely
and robustly:
def somename(self, *args):
    ...some preliminary task...
    return cls.super(*args)
The classmethod
SuperMixin.super
relies on simple introspection to
get the self
object and the name of
the method, then internally uses built-ins super
and getattr
to get the superclass method, and
safely call it only if it exists. The introspection is performed
through the handy inspect
module of
the standard Python library, making the whole task even
simpler.
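If you want to examine the underlying introspection in isolation, here is a tiny sketch using the same frame-inspection calls as SuperMixin.super (the function name whoami and class Probe are invented for the example):

import inspect

def whoami():
    # look one frame up the stack: the caller's frame
    frame = inspect.currentframe(1)
    # recover the caller's self and the caller's method name
    return frame.f_locals['self'], frame.f_code.co_name

class Probe(object):
    def method(self):
        return whoami()

obj, name = Probe().method()
print name            # emits: method
print obj.__class__   # emits: <class '__main__.Probe'>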
Library Reference and Python
in a Nutshell docs on super
, the new object model and MRO, the
built-in getattr
, and standard
library module inspect
; Recipe 20.12 for another
recipe taking a very different approach to simplify the use of
built-in super
.