Credit: Raymond Hettinger
I had my power drill slung low on my toolbelt and I said, “Go ahead, honey. Break something.”
Tim Allen
on the challenges of figuring out what to do with a new set of general-purpose tools
This chapter is last because it deals with issues that look or sound difficult, although they really aren’t. It is about Python’s power tools.
Though easy to use, the power tools can be considered advanced for several reasons. First, the need for them rarely arises in simple programs. Second, most involve introspection, wrapping, and forwarding techniques available only in a dynamic language like Python. Third, the tools seem advanced because when you learn them, you also develop a deep understanding of how Python works internally.
Last, as with the power tools in your garage, it is easy to get carried away and create a gory mess. Accordingly, to ward off small children, the tools were given scary names such as descriptors, decorators, and metaclasses (such names as pangalacticgargleblaster were considered a bit too long).
Because these tools are so general purpose, it can be a challenge to figure out what to do with them. Rather than resorting to Tim Allen’s tactics, study the recipes in this chapter: they will give you all the practice you need. And, as Tim Peters once pointed out, it can be difficult to devise new uses from scratch, but when a real problem demands a power tool, you’ll know it when you need it.
The concept of descriptors is easy enough. Whenever an attribute is looked up, an action takes place. By default, the action is a get, set, or delete. However, someday you’ll be working on an application with some subtle need and wish that more complex actions could be programmed. Perhaps you would like to create a log entry every time a certain attribute is accessed. Perhaps you would like to redirect a method lookup to another method. The solution is to write a function with the needed action and then specify that it be run whenever the attribute is accessed. An object with such functions is called a descriptor (just to make it sound harder than it really is).
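For instance, here is a minimal sketch of such a descriptor (the names LoggedAttribute and Account are just illustrative, not part of any recipe): an attribute that prints a log line on each get and set.

import sys

class LoggedAttribute(object):
    ''' a data descriptor that logs every get and set of the attribute '''
    def __init__(self, name):
        self.name = name            # where the real value is stored
    def __get__(self, inst, cls):
        if inst is None:
            return self
        print >>sys.stderr, 'get of %s' % self.name
        return getattr(inst, self.name)
    def __set__(self, inst, value):
        print >>sys.stderr, 'set of %s to %r' % (self.name, value)
        setattr(inst, self.name, value)

class Account(object):
    balance = LoggedAttribute('_balance')

a = Account()
a.balance = 100    # logs: set of _balance to 100
print a.balance    # logs: get of _balance, then emits: 100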
While the concept of a descriptor is straightforward, there seems to be no limit to what can be done with them. Descriptors underlie Python’s implementation of methods, bound methods, super, property, classmethod, and staticmethod. Learning about the various applications of descriptors is key to mastering the language.
The recipes in this chapter show how to put descriptors straight to work. However, if you want the full details behind the descriptor protocol or want to know exactly how descriptors are used to implement super, property, and the like, see my paper on the subject at http://users.rcn.com/python/download/Descriptor.htm.
Decorators are even simpler than descriptors. Writing myfunc = wrapper(myfunc) was the common way to modify or log something about another function, which took place somewhere after myfunc was defined. Starting with Python 2.4, we now write @wrapper just before the def statement that defines myfunc. Common examples include @staticmethod and @classmethod. Unlike Java declarations, these wrappers are higher-order functions that can modify the original function or take some other action. Their uses are limitless. Some ideas that have been advanced include @make_constants for bytecode optimization, @atexit to register a function to be run before Python exits, @synchronized to automatically add mutual exclusion locking to a function or method, and @log to create a log entry every time a function is called. Such wrapper functions are called decorators (not an especially intimidating name, but cryptic enough to ward off evil spirits).
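For instance, here is a minimal sketch of the @log idea just mentioned, with a print standing in for a real log entry (the decorator name and output format are just illustrative):

def log(func):
    ''' decorator: emit a line each time func gets called '''
    def logged(*args, **kwds):
        print 'call to %s' % func.__name__
        return func(*args, **kwds)
    return logged

@log
def square(x):
    return x * x

print square(7)
# emits: call to square, then: 49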
The concept of a metaclass sounds strange only because it is so familiar. Whenever you write a class definition, a mechanism uses the name, bases, and class dictionary to create a class object. For old-style classes, that mechanism is types.ClassType. For new-style classes, the mechanism is just type. The former implements the familiar actions of a classic class, including attribute lookup and showing the name of the class when repr is called. The latter adds a few bells and whistles including support for _ _slots_ _ and _ _getattribute_ _. If only that mechanism were programmable, what you could do in Python would be limitless. Well, the mechanism is programmable, and, of course, it has an intimidating name, metaclasses.
The recipes in this chapter show that writing metaclasses can be straightforward. Most metaclasses subclass type and simply extend or override the desired behavior. Some are as simple as altering the class dictionary and then forwarding the arguments to type to finish the job.
For instance, say that you would like to automatically generate getter methods for all the private variables listed in _ _slots_ _. Just define a metaclass M that looks up _ _slots_ _ in the class dictionary, scans it for variable names starting with an underscore, creates an accessor method for each, and adds the new methods to the class dictionary:
class M(type):
    def __new__(cls, name, bases, classdict):
        for attr in classdict.get('__slots__', ()):
            if attr.startswith('_'):
                def getter(self, attr=attr):
                    return getattr(self, attr)
                # 2.4 only: getter.__name__ = 'get' + attr[1:]
                classdict['get' + attr[1:]] = getter
        return type.__new__(cls, name, bases, classdict)
Apply the new metaclass to every class where you want automatically created accessor functions:
class Point(object):
    __metaclass__ = M
    __slots__ = ['_x', '_y']
If you now print dir(Point), you will see the two accessor methods as if you had written them out the long way:
class Point(object):
    __slots__ = ['_x', '_y']
    def getx(self):
        return self._x
    def gety(self):
        return self._y
In both cases, among the output of the print statement, you will see the names 'getx' and 'gety'.
Credit: Sean Ross
Python computes the default values for a function’s optional arguments just once, when the function’s def statement executes. However, for some of your functions, you’d like to ensure that the default values are fresh ones (i.e., new and independent copies) each time a function gets called.
A Python 2.4 decorator offers an elegant solution, and, with a slightly less terse syntax, it’s a solution you can apply in version 2.3 too:
import copy

def freshdefaults(f):
    "a decorator to wrap f and keep its default values fresh between calls"
    fdefaults = f.func_defaults
    def refresher(*args, **kwds):
        f.func_defaults = copy.deepcopy(fdefaults)
        return f(*args, **kwds)
    # in 2.4, only: refresher.__name__ = f.__name__
    return refresher

# usage as a decorator, in python 2.4:
@freshdefaults
def packitem(item, pkg=[]):
    pkg.append(item)
    return pkg

# usage in python 2.3: after the function definition, explicitly assign:
# f = freshdefaults(f)
A function’s default values are evaluated once, and only once, at the time the function is defined (i.e., when the def statement executes). Beginning Python programmers are sometimes surprised by this fact; they try to use mutable default values and yet expect that the values will somehow be regenerated afresh each time they’re needed.
Recommended Python practice is to not use mutable default values. Instead, you should use idioms such as:
def packitem(item, pkg=None):
    if pkg is None:
        pkg = []
    pkg.append(item)
    return pkg
The freshdefaults decorator presented in this recipe provides another way to accomplish the same task. It eliminates the need to set as your default value anything but the value you intend that optional argument to have by default. In particular, you don’t have to use None as the default value, rather than (say) square brackets [ ], as you do in the recommended idiom.
freshdefaults also removes the need to test each argument against the stand-in value (e.g., None) before assigning the intended value: this can be an important simplification in your code when your functions need to have several optional arguments with mutable default values, as long as all of those default values can be deep-copied.
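For instance, here is how the decorated packitem from this recipe’s Solution behaves; contrast with the undecorated version, where the second call would return ['a', 'b']:

print packitem('a')
# emits: ['a']
print packitem('b')
# emits: ['b'] -- the default list was refreshed, not shared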
On the other hand, the implementation of freshdefaults needs several reasonably advanced concepts: decorators, closures, function attributes, and deep copying. All in all, this implementation is no doubt more difficult to explain to beginning Python programmers than the recommended idiom. Therefore, this recipe cannot really be recommended to beginners. However, advanced Pythonistas may find it useful.
Python Language Reference documentation about decorators; Python Language Reference and Python in a Nutshell documentation about closures and function attributes; Python Library Reference and Python in a Nutshell documentation about standard library module copy, specifically function deepcopy.
Credit: Sean Ross, David Niergarth, Holger Krekel
You want to code properties without cluttering up your class namespace with accessor methods that are not called directly.
Functions nested within another function are quite handy for this task:
import math

class Rectangle(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def area():
        doc = "Area of the rectangle"
        def fget(self):
            return self.x * self.y
        def fset(self, value):
            ratio = math.sqrt((1.0*value)/self.area)
            self.x *= ratio
            self.y *= ratio
        return locals()
    area = property(**area())
The standard idiom used to create a property starts with defining in the class body several accessor methods (e.g., getter, setter, deleter), often with boilerplate-like method names such as setThis, getThat, or delTheOther. More often than not, such accessors are not required except inside the property itself; sometimes (rarely) programmers even remember to del them to clean up the class namespace after building the property instance.
The idiom suggested in this recipe avoids cluttering up the class namespace at all. Just write in the class body a function with the same name you intend to give to the property. Inside that function, define appropriate nested functions, which must be named exactly fget, fset, and fdel, and assign an appropriate docstring to a local variable named doc. Have the outer function return a dictionary whose entries have exactly those names, and no others: returning the locals( ) dictionary will work, as long as your outer function has no other local variables at that point. If you do have other names in addition to the fixed ones, you might want to code your return statement, for example, as:
return sub_dict(locals(), 'doc fget fset fdel'.split())
using the sub_dict function shown in Recipe 4.13. Any other way to subset a dictionary will work just as well.
Finally, the call to property uses the ** notation to expand a mapping into named arguments, and the assignment rebinds the name to the resulting property instance, so that the class namespace is left pristine.
As you can see from the example in this recipe’s Solution, you don’t have to define all of the four key names: you may, and should, omit some of them if a particular property forbids the corresponding operation. In particular, the area function in the solution does not define fdel because the resulting area attribute must not be deletable.
In Python 2.4, you can define a simple custom decorator to make this recipe’s suggested idiom even spiffier:
def nested_property(c):
    return property(**c())
With this little helper at hand, you can replace the explicit assignment of the property to the attribute name with the decorator syntax:
@nested_property
def area():
    doc = "Area of the rectangle"
    def fget(self):
        ...  # the rest of the area function remains the same
In Python 2.4, having a decorator line @deco right before a def name statement is equivalent to having, right after the def statement’s body, an assignment name = deco(name). A mere difference of syntax sugar, but it’s useful: anybody reading the source code of the class knows up front that the function or method you’re def’ing is meant to get decorated in a certain way, not to get used exactly as coded. With the Python 2.3 syntax, somebody reading in haste might possibly miss the assignment statement that comes after the def.
Returning locals( ) works only if your outer function has no other local variables besides fget, fset, fdel, and doc. An alternative idiom to avoid this restriction is to move the call to property inside the outer function:
def area():
    what_is_area = "Area of the rectangle"
    def compute_area(self):
        return self.x * self.y
    def scale_both_sides(self, value):
        ratio = math.sqrt((1.0*value)/self.area)
        self.x *= ratio
        self.y *= ratio
    return property(compute_area, scale_both_sides, None, what_is_area)
area = area()
As you see, this alternative idiom enables us to give different names to the getter and setter accessors, which is not a big deal because, as mentioned previously, accessors are often named in uninformative ways such as getThis and setThat anyway. But, if your opinion differs, you may prefer this idiom, or its slight variant based on having the outer function return a tuple of values for property’s arguments rather than a dict. In other words, the variant obtained by changing the last two statements of this latest snippet to:
    return compute_area, scale_both_sides, None, what_is_area
area = property(*area())
Library Reference and Python in a Nutshell docs on built-in functions property and locals.
Credit: Denis S. Otkidach
You want to use an attribute name as an alias for another one, either just as a default value (when the attribute was not explicitly set), or with full setting and deleting abilities too.
Custom descriptors are the right tools for this task:
class DefaultAlias(object):
    ''' unless explicitly assigned, this attribute aliases to another. '''
    def __init__(self, name):
        self.name = name
    def __get__(self, inst, cls):
        if inst is None:
            # attribute accessed on class, return `self' descriptor
            return self
        return getattr(inst, self.name)

class Alias(DefaultAlias):
    ''' this attribute unconditionally aliases to another. '''
    def __set__(self, inst, value):
        setattr(inst, self.name, value)
    def __delete__(self, inst):
        delattr(inst, self.name)
Your class instances sometimes have attributes whose default value must be the same as the current value of other attributes but may be set and deleted independently. For such requirements, custom descriptor DefaultAlias, as presented in this recipe’s Solution, is just the ticket. Here is a toy example:
class Book(object):
    def __init__(self, title, shortTitle=None):
        self.title = title
        if shortTitle is not None:
            self.shortTitle = shortTitle
    shortTitle = DefaultAlias('title')

b = Book('The Life and Opinions of Tristram Shandy, Gent.')
print b.shortTitle
# emits: The Life and Opinions of Tristram Shandy, Gent.
b.shortTitle = "Tristram Shandy"
print b.shortTitle
# emits: Tristram Shandy
del b.shortTitle
print b.shortTitle
# emits: The Life and Opinions of Tristram Shandy, Gent.
DefaultAlias is not what is technically known as a data descriptor class, because it has no _ _set_ _ method. In practice, this means that, when we assign a value to an instance attribute whose name is defined in the class as a DefaultAlias, the instance records the attribute normally, and the instance attribute shadows the class attribute. This is exactly what happens in this snippet after we explicitly assign to b.shortTitle; when we del b.shortTitle, we remove the per-instance attribute, uncovering the per-class one again.
Custom descriptor class Alias is a simple variant of class DefaultAlias, easily obtained by inheritance. Alias aliases one attribute to another, not just upon accesses to the attribute’s value (as DefaultAlias would do), but also upon all operations of value setting and deletion. It easily achieves this by being a “data descriptor” class, which means that it does have a _ _set_ _ method. Therefore, any assignment to an instance attribute whose name is defined in the class as an Alias gets intercepted by Alias’ _ _set_ _ method. (Alias also defines a _ _delete_ _ method, to obtain exactly the same effect upon attribute deletion.)
Alias can be quite useful when you want to evolve a class, which you made publicly available in a previous version, to use more appropriate names for methods and other attributes, while still keeping the old names available for backwards compatibility. For this specific use, you may even want a version that emits a warning when the old name is used:
import warnings

class OldAlias(Alias):
    def _warn(self):
        warnings.warn('use %r, not %r' % (self.name, self.oldname),
                      DeprecationWarning, stacklevel=3)
    def __init__(self, name, oldname):
        super(OldAlias, self).__init__(name)
        self.oldname = oldname
    def __get__(self, inst, cls):
        self._warn()
        return super(OldAlias, self).__get__(inst, cls)
    def __set__(self, inst, value):
        self._warn()
        return super(OldAlias, self).__set__(inst, value)
    def __delete__(self, inst):
        self._warn()
        return super(OldAlias, self).__delete__(inst)
Here is a toy example of using OldAlias:
class NiceClass(object):
    def __init__(self, name):
        self.nice_new_name = name
    bad_old_name = OldAlias('nice_new_name', 'bad_old_name')
Old code using this class may still refer to the instance attribute as bad_old_name, preserving backwards compatibility; when that happens, though, a warning message is presented about the deprecation, encouraging the old code’s author to upgrade the code to use nice_new_name instead. The normal mechanisms of the warnings module of the Python Standard Library ensure that, by default, such warnings are output only once per occurrence and per run of a program, not repeatedly. For example, the snippet:
x = NiceClass(23)
for y in range(4):
    print x.bad_old_name
    x.bad_old_name += 100
emits:
xxx.py:64: DeprecationWarning: use 'nice_new_name', not 'bad_old_name'
  print x.bad_old_name
23
xxx.py:65: DeprecationWarning: use 'nice_new_name', not 'bad_old_name'
  x.bad_old_name += 100
123
223
323
The warning is printed once per line using the bad old name, not repeated again and again as the for loop iterates.
Custom descriptors are best documented on Raymond Hettinger’s web page: http://users.rcn.com/python/download/Descriptor.htm; Library Reference and Python in a Nutshell docs about the warnings module.
Credit: Denis S. Otkidach
You want to be able to compute attribute values, either per instance or per class, on demand, with automatic caching.
Custom descriptors are the right tools for this task:
class CachedAttribute(object):
    ''' Computes attribute value and caches it in the instance. '''
    def __init__(self, method, name=None):
        # record the unbound-method and the name
        self.method = method
        self.name = name or method.__name__
    def __get__(self, inst, cls):
        if inst is None:
            # instance attribute accessed on class, return self
            return self
        # compute, cache and return the instance's attribute value
        result = self.method(inst)
        setattr(inst, self.name, result)
        return result

class CachedClassAttribute(CachedAttribute):
    ''' Computes attribute value and caches it in the class. '''
    def __get__(self, inst, cls):
        # just delegate to CachedAttribute, with 'cls' as ``instance''
        return super(CachedClassAttribute, self).__get__(cls, cls)
If your class instances have attributes that must be computed on demand but don’t generally change after they’re first computed, custom descriptor CachedAttribute as presented in this recipe is just the ticket. Here is a toy example of use (with Python 2.4 syntax):
class MyObject(object):
    def __init__(self, n):
        self.n = n
    @CachedAttribute
    def square(self):
        return self.n * self.n

m = MyObject(23)
print vars(m)      # 'square' not there yet
# emits: {'n': 23}
print m.square     # ...so it gets computed
# emits: 529
print vars(m)      # 'square' IS there now
# emits: {'square': 529, 'n': 23}
del m.square       # flushing the cache
print vars(m)      # 'square' removed
# emits: {'n': 23}
m.n = 42
print vars(m)      # still no 'square'
# emits: {'n': 42}
print m.square     # ...so it gets recomputed
# emits: 1764
print vars(m)      # 'square' IS there again
# emits: {'square': 1764, 'n': 42}
As you see, after the first access to m.square, the square attribute is cached in instance m, so it will not get recomputed for that instance. If you need to flush the cache, for example, to change m.n, so that m.square will get recomputed if it is ever accessed again, just del m.square. Remember, attributes can be removed in Python! To use this code in Python 2.3, remove the decorator syntax @CachedAttribute and insert instead an assignment square = CachedAttribute(square) after the end of the def statement for method square.
Custom descriptor CachedClassAttribute is just a simple variant of CachedAttribute, easily obtained by inheritance: it computes the value by calling a method on the class rather than the instance, and it caches the result on the class, too. This may help when all instances of the class need to see the same cached value. CachedClassAttribute is mostly meant for cases in which you do not need to flush the cache, because its _ _get_ _ method usually wipes away the descriptor itself:
class MyClass(object):
    class_attr = 23
    @CachedClassAttribute
    def square(cls):
        return cls.class_attr * cls.class_attr

x = MyClass()
y = MyClass()
print x.square
# emits: 529
print y.square
# emits: 529
del MyClass.square
print x.square
# raises an AttributeError exception
However, when you do need a cached class attribute that you can occasionally flush, you can still get one with a little trick. To make the snippet above work as intended even after such flushing, just add the statement:
class MyClass(MyClass):
    pass
right after the end of the class MyClass statement and before generating any instance of MyClass. Now, two class objects are named MyClass: a hidden “base” one that always holds the custom descriptor instance, and an outer “subclass” one that is used for everything else, including making instances and holding the cached value, if any. Whether this trick is a reasonable one, or whether it’s too cute and clever for its own good, is a judgment call you can make for yourself! Perhaps it would be clearer to name the base class MyClassBase and use class MyClass(MyClassBase), rather than use the same name for both classes; the mechanism would work in exactly the same fashion, since it does not depend on the names of classes.
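For instance, here is a minimal sketch of that clearer spelling, reusing CachedClassAttribute from the Solution (the names MyClassBase and class_attr are just illustrative):

class MyClassBase(object):
    class_attr = 23
    @CachedClassAttribute
    def square(cls):
        return cls.class_attr * cls.class_attr

class MyClass(MyClassBase):
    pass

x = MyClass()
print x.square        # computed once, then cached on MyClass
del MyClass.square    # flush the cache...
print x.square        # ...recomputed: the descriptor on MyClassBase is intact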
Custom descriptors are best documented at Raymond Hettinger’s web page: http://users.rcn.com/python/download/Descriptor.htm.
Credit: Raymond Hettinger
Python’s built-in property descriptor is quite handy but only as long as you want to use a separate method as the accessor of each attribute you make into a property. In certain cases, you prefer to use the same method to access several different attributes, and property does not support that mode of operation.
We need to code our own custom descriptor, which gets the attribute name in _ _init_ _, saves it, and passes it on to the accessors. For convenience, we also provide useful defaults for the various accessors. You can still pass in None explicitly if you want to forbid certain kinds of access, but the default is to allow it freely.
class CommonProperty(object):
    def __init__(self, realname, fget=getattr, fset=setattr, fdel=delattr,
                 doc=None):
        self.realname = realname
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        self.__doc__ = doc or ""
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError, "can't get attribute"
        return self.fget(obj, self.realname)
    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError, "can't set attribute"
        self.fset(obj, self.realname, value)
    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError, "can't delete attribute"
        self.fdel(obj, self.realname)
Here is a simple example of using this CommonProperty custom descriptor:
class Rectangle(object):
    def __init__(self, x, y):
        self._x = x    # don't trigger _setSide prematurely
        self.y = y     # now trigger it, so area gets computed
    def _setSide(self, attrname, value):
        setattr(self, attrname, value)
        self.area = self._x * self._y
    x = CommonProperty('_x', fset=_setSide, fdel=None)
    y = CommonProperty('_y', fset=_setSide, fdel=None)
The idea of this Rectangle class is that attributes x and y may be freely accessed but never deleted; when either of these attributes is set, the area attribute must be recomputed at once. You could alternatively recompute the area on the fly each time it’s accessed, using a simple property for the purpose; however, if area is accessed often and sides are changed rarely, the architecture of this simple example can obviously be preferable.
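For comparison, here is a minimal sketch of that recompute-on-access alternative (a hypothetical variant, not part of the recipe’s code):

class Rectangle(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    # area is recomputed at every access, never stored
    area = property(lambda self: self.x * self.y,
                    doc="Area of the rectangle")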
In this simple example of CommonProperty use, we just need to be careful on the very first attribute setting in _ _init_ _: if we carelessly used self.x = x, that would trigger the call to _setSide, which, in turn, would try to use self._y before the _y attribute is set.
Another issue worthy of mention is that if any one or more of the fget, fset, or fdel arguments to CommonProperty is defaulted, the realname argument must be different from the attribute name to which the CommonProperty instance is assigned; otherwise, unbounded recursion would occur on trying the corresponding operation (in practice, you’d get a RuntimeError once the maximum recursion depth is exceeded).
The Library Reference and Python in a Nutshell documentation for built-ins getattr, setattr, delattr, and property.
Credit: Ken Seehof, Holger Krekel
You need to add functionality to an existing class, without changing the source code for that class, and inheritance is not applicable (since it would make a new class, rather than changing the existing one). Specifically, you need to enrich a method of the class, adding some extra functionality “around” that of the existing method.
Adding completely new methods (and other attributes) to an existing class object is quite simple, since the built-in function setattr does essentially all the work. We need to “decorate” an existing method to add to its functionality. To achieve this, we can build the new replacement method as a closure. The best architecture is to define general-purpose wrapper and unwrapper functions, such as:
import inspect

def wrapfunc(obj, name, processor, avoid_doublewrap=True):
    """ patch obj.<name> so that calling it actually calls, instead,
        processor(original_callable, *args, **kwargs) """
    # get the callable at obj.<name>
    call = getattr(obj, name)
    # optionally avoid multiple identical wrappings
    if avoid_doublewrap and getattr(call, 'processor', None) is processor:
        return
    # get underlying function (if any), and anyway def the wrapper closure
    original_callable = getattr(call, 'im_func', call)
    def wrappedfunc(*args, **kwargs):
        return processor(original_callable, *args, **kwargs)
    # set attributes, for future unwrapping and to avoid double-wrapping
    wrappedfunc.original = call
    wrappedfunc.processor = processor
    # 2.4 only: wrappedfunc.__name__ = getattr(call, '__name__', name)
    # rewrap staticmethod and classmethod specifically (iff obj is a class)
    if inspect.isclass(obj):
        if hasattr(call, 'im_self'):
            if call.im_self:
                wrappedfunc = classmethod(wrappedfunc)
        else:
            wrappedfunc = staticmethod(wrappedfunc)
    # finally, install the wrapper closure as requested
    setattr(obj, name, wrappedfunc)

def unwrapfunc(obj, name):
    ''' undo the effects of wrapfunc(obj, name, processor) '''
    setattr(obj, name, getattr(obj, name).original)
This approach to wrapping is carefully coded to work just as well on ordinary functions (when obj is a module) as on methods of all kinds (e.g., bound methods, when obj is an instance; unbound, class, and static methods, when obj is a class). This approach doesn’t work when obj is a built-in type, though, because built-ins are immutable.
For example, suppose we want to have “tracing” prints of all that happens whenever a particular method is called. Using the general-purpose wrapfunc function just shown, we could code:
def tracing_processor(original_callable, *args, **kwargs):
    r_name = getattr(original_callable, '__name__', '<unknown>')
    r_args = map(repr, args)
    r_args.extend(['%s=%r' % x for x in kwargs.iteritems()])
    print "begin call to %s(%s)" % (r_name, ", ".join(r_args))
    try:
        result = original_callable(*args, **kwargs)
    except:
        print "EXCEPTION in call to %s" % (r_name,)
        raise
    else:
        print "call to %s result: %r" % (r_name, result)
        return result

def add_tracing_prints_to_method(class_object, method_name):
    wrapfunc(class_object, method_name, tracing_processor)
This recipe’s task occurs fairly often when you’re trying to modify the behavior of a standard or third-party Python module, since editing the source of the module itself is undesirable. In particular, this recipe can be handy for debugging, since the example function add_tracing_prints_to_method presented in the “Solution” lets you see on standard output all details of calls to a method you want to watch, without modifying the library module, and without requiring interactive access to the Python session in which the calls occur.
You can also use this recipe’s approach on a larger scale. For example, say that a library that you imported has a long series of methods that return numeric error codes. You could wrap each of them inside an enhanced wrapper method, which raises an exception when the error code from the original method indicates an error condition. Again, a key issue is not having to modify the library’s own code. However, methodical application of wrappers when building a subclass is also a way to avoid repetitious code (i.e., boilerplate). For example, Recipe 5.12 and Recipe 1.24 might be recoded to take advantage of the general wrapfunc presented in this recipe.
Particularly when “wrapping on a large scale”, it is important to be able to “unwrap” methods back to their normal state, which is why this recipe’s Solution also includes an unwrapfunc function. It may also be handy to avoid accidentally wrapping the same method in the same way twice, which is why wrapfunc supports the optional parameter avoid_doublewrap, defaulting to True, to avoid such double wrapping. (Unfortunately, classmethod and staticmethod do not support per-instance attributes, so the avoidance of double wrapping, as well as the ability to “unwrap”, cannot be guaranteed in all cases.)
You can wrap the same method multiple times with different processors. However, unwrapping must proceed last-in, first-out; as coded, this recipe does not support the ability to remove a wrapper from “somewhere in the middle” of a chain of several wrappers. A related limitation of this recipe as coded is that double wrapping is not detected when another unrelated wrapping occurred in the meantime. (We don’t even try to detect what we might call “deep double wrapping.”)
If you need “generalized unwrapping”, you can extend unwrapfunc to return the processor it has removed; then you can obtain generalized unwrapping by unwrapping all the way, recording a list of the processors that you removed, and then pruning that list of processors and rewrapping. Similarly, generalized detection of “deep” double wrapping could be implemented based on this same idea.
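For instance, here is a minimal sketch of such an extended unwrapper (the name unwrapfunc_ret is just illustrative):

def unwrapfunc_ret(obj, name):
    ''' like unwrapfunc, but also returns the processor just removed '''
    call = getattr(obj, name)
    processor = call.processor
    setattr(obj, name, call.original)
    return processor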
Another generalization, to fully support staticmethod and classmethod, is to use a global dict, rather than per-instance attributes, for the original and processor values; functions, bound and unbound methods, as well as class methods and static methods, can all be used as keys into such a dictionary. Doing so obviates the issue with the inability to set per-instance attributes on class methods and static methods. However, each of these generalizations can be somewhat complicated, so we are not pursuing them further here.
Once you have coded some processors with the signature and semantics required by this recipe’s wrapfunc, you can also use such processors more directly (in cases where modifying the source is OK) with a Python 2.4 decorator, as follows:
def processedby(processor):
    """ decorator to wrap the processor around a function. """
    def processedfunc(func):
        def wrappedfunc(*args, **kwargs):
            return processor(func, *args, **kwargs)
        return wrappedfunc
    return processedfunc
For example, to wrap this recipe’s tracing_processor around a certain method at the time the class statement executes, in Python 2.4, you can code:
class SomeClass(object):
    @processedby(tracing_processor)
    def amethod(self, s):
        return 'Hello, ' + s
Recipe 5.12 and Recipe 1.24 provide examples of the methodical application of wrappers to build a subclass to avoid boilerplate; Library Reference and Python in a Nutshell docs on built-in functions getattr and setattr and module inspect.
Credit: Stephan Diehl, Robert E. Brewer
You need to add functionality to an existing class without changing the source code for that class. Specifically, you need to enrich all methods of the class, adding some extra functionality “around” that of the existing methods.
Recipe 20.6 previously showed a way to solve this task for one method by writing a closure that builds and applies a wrapper, exemplified by function add_tracing_prints_to_method in that recipe’s Solution. This recipe generalizes that one, wrapping methods throughout a class or hierarchy, directly or via a custom metaclass.
Module inspect lets you easily find all methods of an existing class, so you can systematically wrap them all:
import inspect

def add_tracing_prints_to_all_methods(class_object):
    for method_name, v in inspect.getmembers(class_object, inspect.ismethod):
        add_tracing_prints_to_method(class_object, method_name)
If you need to ensure that such wrapping applies to all methods of all classes in a whole hierarchy, the simplest way may be to insert a custom metaclass at the root of the hierarchy, so that all classes in the hierarchy will get that same metaclass. This insertion does normally need a minimum of “invasiveness”—placing a single statement
__metaclass__ = MetaTracer
in the body of that root class. Custom metaclass MetaTracer is, however, quite easy to write:
class MetaTracer(type):
    def __init__(cls, n, b, d):
        super(MetaTracer, cls).__init__(n, b, d)
        add_tracing_prints_to_all_methods(cls)
Even such minimal invasiveness sometimes is unacceptable, or you need a more dynamic way to wrap all methods in a hierarchy. Then, as long as the root class of the hierarchy is new-style, you can arrange to get function add_tracing_prints_to_all_methods dynamically called on all classes in the hierarchy:
def add_tracing_prints_to_all_descendants(class_object):
    add_tracing_prints_to_all_methods(class_object)
    for s in class_object.__subclasses__():
        add_tracing_prints_to_all_descendants(s)
The inverse function unwrapfunc, in Recipe 20.6, may also be similarly applied to all methods of a class and all classes of a hierarchy.
We could code just about all functionality of such a powerful function as add_tracing_prints_to_all_descendants in the function’s own body. However, it would not be a great idea to bunch up such diverse functionality inside a single function. Instead, we carefully split the functionality among the various separate functions presented in this recipe and previously in Recipe 20.6. By this careful factorization, we obtain maximum reusability without code duplication: we have separate functions to dynamically add and remove wrapping from a single method, an entire class, and a whole hierarchy of classes; each of these functions appropriately uses the simpler ones. And for cases in which we can afford a tiny amount of “invasiveness” and want the convenience of automatically applying the wrapping to all methods of classes descended from a certain root, we can use a tiny custom metaclass.
add_tracing_prints_to_all_descendants cannot apply to old-style classes. This limitation is inherent in the old-style object model and is one of the several reasons you should always use new-style classes in new code you write: classic classes exist only to ensure compatibility in legacy programs. Besides the problem with classic classes, however, there’s another issue with the structure of add_tracing_prints_to_all_descendants: in cases of multiple inheritance, the function will repeatedly visit some classes.
Since the method-wrapping function is carefully designed to avoid double wrapping, such multiple visits are not a serious problem, costing just a little avoidable overhead, which is why the function was acceptable for inclusion in the “Solution”. In other cases in which we want to operate on all descendants of a certain root class, however, multiple visits might be unacceptable. Moreover, it is clearly not optimal to entwine the functionality of getting all descendants with that of applying one particular operation to each of them. The best idea is clearly to factor out the recursive structure into a generator, which can avoid duplicating visits with the memo idiom:
def all_descendants(class_object, _memo=None):
    if _memo is None:
        _memo = {}
    elif class_object in _memo:
        return
    _memo[class_object] = None
    yield class_object
    for subclass in class_object.__subclasses__():
        for descendant in all_descendants(subclass, _memo):
            yield descendant
Adding tracing prints to all descendants now simplifies to:
def add_tracing_prints_to_all_descendants(class_object):
    for c in all_descendants(class_object):
        add_tracing_prints_to_all_methods(c)
In Python, whenever you find yourself with an iteration structure of any complexity, or recursion, it’s always worthwhile to check whether it’s feasible to factor out the iterative or recursive control structure into a separate, reusable generator, so that all iterations of that form can become simple for statements. Such separation of concerns can offer important simplifications and make code more maintainable.
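For instance, here is a minimal sketch of reusing all_descendants for a different operation (the toy class names are just illustrative); each class in the diamond gets visited exactly once:

class Root(object): pass
class Mid1(Root): pass
class Mid2(Root): pass
class Leaf(Mid1, Mid2): pass

for c in all_descendants(Root):
    print c.__name__
# emits: Root Mid1 Leaf Mid2 (one name per line)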
Recipe 20.6 for details on how each method gets wrapped; Library Reference and Python in a Nutshell docs on module inspect and the _ _subclasses_ _ special method of new-style classes.
Credit: Moshe Zadka
During debugging, you want to identify certain specific instance objects so that print statements display more information when applied to those specific objects.
The print statement implicitly calls the special method _ _str_ _ of the class of each object you’re printing. Therefore, to ensure that printing certain objects displays more information, we need to give those objects new classes whose _ _str_ _ special methods are suitably modified. For example:
def add_method_to_objects_class(object, method, name=None):
    if name is None:
        name = method.func_name
    class newclass(object.__class__):
        pass
    setattr(newclass, name, method)
    object.__class__ = newclass

import inspect

def _rich_str(self):
    pieces = []
    for name, value in inspect.getmembers(self):
        # don't display specials
        if name.startswith('__') and name.endswith('__'):
            continue
        # don't display the object's own methods
        if inspect.ismethod(value) and value.im_self is self:
            continue
        pieces.extend((name.ljust(15), ' ', str(value), ' '))
    return ''.join(pieces)

def set_rich_str(obj, on=True):
    def isrich():
        return getattr(obj.__class__.__str__, 'im_func', None) is _rich_str
    if on:
        if not isrich():
            add_method_to_objects_class(obj, _rich_str, '__str__')
        assert isrich()
    else:
        if not isrich():
            return
        bases = obj.__class__.__bases__
        assert len(bases) == 1
        obj.__class__ = bases[0]
        assert not isrich()
Here is a sample use of this recipe’s set_rich_str function, guarded in the usual way:
if __name__ == '__main__':    # usual guard for example usage
    class Foo(object):
        def __init__(self, x=23, y=42):
            self.x, self.y = x, y
    f = Foo()
    print f
    # emits: <__main__.Foo object at 0x38f770>
    set_rich_str(f)
    print f
    # emits: x               23 y               42
    set_rich_str(f, on=False)
    print f
    # emits: <__main__.Foo object at 0x38f770>
In old versions of Python (and in Python 2.3 and 2.4, for backwards compatibility on instances of classic classes), intrinsic lookup of special methods (such as the intrinsic lookup for _ _str_ _ in a print statement) started on the instance. In today’s Python, in the new object model that is recommended for all new code, the intrinsic lookup starts on the instance’s class, bypassing names set in the instance’s own _ _dict_ _. This innovation has many advantages, but, at a first superficial look, it may also seem to have one substantial disadvantage: namely, to make it impossible to solve this recipe’s Problem in the general case (i.e., for instances that might belong to either classic or new-style classes).
Fortunately, that superficial impression is not correct, thanks to Python’s power of introspection and dynamism. This recipe’s function add_method_to_objects_class shows how to change special methods on a given object obj’s class, without affecting other “sibling” objects (i.e., other instances of the same class as obj’s): very simply, start by changing obj’s class, that is, by setting obj._ _class_ _ to a newly made class object (which inherits from the original class of obj, so that anything we don’t explicitly modify remains unchanged). Once you’ve done that, you can then alter the newly made class object to your heart’s content.
Function _rich_str shows how you can use introspection to display a lot of information about a specific instance. Specifically, we display every attribute of the instance that doesn’t have a special name (starting and ending with two underscores), except the instance’s own bound methods. Function set_rich_str shows how to set the _ _str_ _ special method of an instance’s class to either “rich” (the _rich_str function we just mentioned) or “normal” (the _ _str_ _ method the object’s original class is coded to supply). To make the object’s _ _str_ _ rich, set_rich_str uses add_method_to_objects_class to set _ _str_ _ to _rich_str. When the object goes back to “normal”, set_rich_str sets the object’s _ _class_ _ back to its original value (which is preserved as the only base class when the object is set to use _rich_str).
Recipe 20.6 and Recipe 20.7 for other cases in which a class’ methods are modified; documentation on the inspect standard library module in the Library Reference.
Credit: Raymond Hettinger
You want to ensure that the classes you define implement the interfaces that they claim to implement.
Python does not have a formal concept of “interface”, but we can easily represent interfaces by means of “skeleton” classes such as:
class IMinimalMapping(object):
    def __getitem__(self, key): pass
    def __setitem__(self, key, value): pass
    def __delitem__(self, key): pass
    def __contains__(self, key): pass

import UserDict
class IFullMapping(IMinimalMapping, UserDict.DictMixin):
    def keys(self): pass

class IMinimalSequence(object):
    def __len__(self): pass
    def __getitem__(self, index): pass

class ICallable(object):
    def __call__(self, *args): pass
We follow the natural convention that any class can represent an interface: the interface is the set of methods and other attributes of the class. We can say that a class C implements an interface i if C has all the methods and other attributes of i (and, possibly, additional ones).
We can now define a simple custom metaclass that checks whether classes implement all the interfaces they claim to implement:
# ensure we use the best available 'set' type with name 'set'
try:
    set
except NameError:
    from sets import Set as set

# a custom exception class that we raise to signal violations
class InterfaceOmission(TypeError):
    pass

class MetaInterfaceChecker(type):
    ''' the interface-checking custom metaclass '''
    def __init__(cls, classname, bases, classdict):
        super(MetaInterfaceChecker, cls).__init__(classname, bases, classdict)
        cls_defines = set(dir(cls))
        for interface in cls.__implements__:
            itf_requires = set(dir(interface))
            if not itf_requires.issubset(cls_defines):
                raise InterfaceOmission, list(itf_requires - cls_defines)
Any class that uses MetaInterfaceChecker as its metaclass must expose a class attribute _ _implements_ _, an iterable whose items are the interfaces the class claims to implement. The metaclass checks the claim, raising an InterfaceOmission exception if the claim is false.
Here’s an example class using the MetaInterfaceChecker custom metaclass:
class Skidoo(object):
    ''' a mapping which claims to contain all keys, each with a value
        of 23; item setting and deletion are no-ops; you can also call
        an instance with arbitrary positional args, result is 23. '''
    __metaclass__ = MetaInterfaceChecker
    __implements__ = IMinimalMapping, ICallable
    def __getitem__(self, key): return 23
    def __setitem__(self, key, value): pass
    def __delitem__(self, key): pass
    def __contains__(self, key): return True
    def __call__(self, *args): return 23

sk = Skidoo()
Any code dealing with an instance of such a class can choose to check whether it can rely on certain interfaces:
def use(sk):
    if IMinimalMapping in sk.__implements__:
        ...code using 'sk[...]' and/or 'x in sk'...
You can, if you want, provide much fancier and more thorough checks, for example by using functions from standard library module inspect to check that the attributes being exposed and required are methods with compatible signatures. However, this simple recipe does show how to automate the simplest kind of checks for interface compliance.
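For instance, here is a minimal sketch of one such fancier check, comparing argument lists with inspect.getargspec (the helper name check_signatures is just illustrative, not part of the recipe):

import inspect

def check_signatures(cls, interface):
    ''' raise InterfaceOmission if any method required by `interface'
        is missing from `cls' or has a different argument list '''
    for name, itf_func in inspect.getmembers(interface, inspect.ismethod):
        cls_func = getattr(cls, name, None)
        if not inspect.ismethod(cls_func):
            raise InterfaceOmission, name
        if inspect.getargspec(itf_func.im_func) != \
           inspect.getargspec(cls_func.im_func):
            raise InterfaceOmission, name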
Library Reference and Python in a Nutshell docs about module sets, (in Python 2.4 only) the set built-in, custom metaclasses, and the inspect module.
Credit: Michele Simionato, Stephan Diehl, Alex Martelli
You are writing a custom metaclass, and you are not sure which tasks your metaclass should perform in its _ _new_ _ method, and which ones it should perform in its _ _init_ _ method instead.
Any preliminary processing that your custom metaclass performs on the name, bases, or dict of the class being built can affect the way in which the class object gets built only if it occurs in the metaclass’ _ _new_ _ method, before your code calls the metaclass’ superclass’ _ _new_ _. For example, that’s the only time when you can usefully affect the new class’ _ _slots_ _, if any:
class MetaEnsure_foo(type):
    def __new__(mcl, cname, cbases, cdict):
        # ensure instances of the new class can have a '_foo' attribute
        if '__slots__' in cdict and '_foo' not in cdict['__slots__']:
            cdict['__slots__'] = tuple(cdict['__slots__']) + ('_foo',)
        return super(MetaEnsure_foo, mcl).__new__(mcl, cname, cbases, cdict)
Metaclass method _ _init_ _ is generally the most appropriate one for any changes that your custom metaclass makes to the class object after the class object is built; for example, continuing the example code for metaclass MetaEnsure_foo:
    def __init__(cls, cname, cbases, cdict):
        super(MetaEnsure_foo, cls).__init__(cname, cbases, cdict)
        cls._foo = 23
The custom metaclass MetaEnsure_foo performs a definitely “toy” task presented strictly as an example: if the class object being built defines a _ _slots_ _ attribute (to save memory), MetaEnsure_foo ensures that the class object includes a slot _foo, so that instances of that class can have an attribute thus named. Further, the custom metaclass sets an attribute with name _foo and value 23 on each new class object. The point of the recipe isn’t really this toy task, but rather, a clarification on how the _ _new_ _ and _ _init_ _ methods of a custom metaclass are best coded, and which tasks are most appropriate for each.
Whenever you instantiate any class X (whether X is a custom metaclass or an ordinary class) with or without arguments (we can employ the usual Python notation *a, **k to mean arbitrary positional and named arguments), Python internally performs the equivalent of the following snippet of code:
new_thing = X.__new__(X, *a, **k)
if isinstance(new_thing, X):
    X.__init__(new_thing, *a, **k)
The new_thing thus built and initialized is the result of instantiating X. If X is a custom metaclass, in particular, this snippet occurs at the end of the execution of a class statement, and the arguments (all positional) are the name, bases, and dictionary of the new class that is being built.
So, your custom metaclass’ _ _new_ _ method is the code that has dibs: it executes first. That’s the moment in which you can adjust the name, bases, and dictionary that you receive as arguments, to affect the way the new class object is built. Most characteristics of the class object, but not all, can also be changed later. An example of an attribute that you have to set before building the class object is _ _slots_ _. Once the class object is built, the slots, if any, are defined, and any further change to _ _slots_ _ has no effect.
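For instance, this minimal demonstration (the class name C is just illustrative) shows that rebinding _ _slots_ _ after the class object is built changes nothing:

class C(object):
    __slots__ = ('a',)

c = C()
c.a = 1                     # fine, 'a' is a defined slot
C.__slots__ = ('a', 'b')    # rebinding __slots__ now changes nothing...
try:
    c.b = 2                 # ...the class still has no slot (and no
except AttributeError:      # per-instance __dict__) for 'b'
    print "no slot for 'b'"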
The custom metaclass in this recipe carefully uses super to delegate work to its superclass, rather than carelessly calling type._ _new_ _ or type._ _init_ _ directly: the latter usage would be a subtle mistake, impeding the proper working of multiple inheritance among metaclasses. Further, this recipe is careful in naming the first parameters to both methods: cls to mean an ordinary class (the object that is the first argument to a custom metaclass’ _ _init_ _), mcl to mean a metaclass (the object that is the first argument to a custom metaclass’ _ _new_ _). The common usage of self should be reserved to mean normal instances, not classes nor metaclasses, and therefore it doesn’t normally occur in the body of a custom metaclass. All of these names are a matter of mere convention, but using appropriate conventions promotes clarity, and this use of cls and mcl was blessed by Guido van Rossum himself, albeit only verbally.
The usage distinction between _ _new_ _ and _ _init_ _ that this recipe advocates for custom metaclasses is basically the same criterion that any class should always follow: use _ _new_ _ when you must, only for jobs that cannot be done later; use _ _init_ _ for all jobs that can be left until _ _init_ _ time. Following these conventions makes life easiest for anybody who must tweak your custom metaclass or make it work well in a multiple inheritance situation, and thus enhances the reusability of your code. _ _new_ _ should contain only the essence of your metaclass: stuff that anybody using your metaclass in any way at all must surely want (or else he wouldn’t be using your metaclass!) because it’s stuff that’s not easy to tweak, modify, or override. _ _init_ _ is “softer”, so most of what your metaclass is doing to the class objects you generate should be there, exactly because it will be easier for reusers to tweak or avoid.
Library Reference and Python in a Nutshell docs on built-in super and _ _slots_ _, and special methods _ _init_ _ and _ _new_ _.
Credit: Stephan Diehl, Alex Martelli
The methods of the list type that mutate a list object in place, such as append and sort, return None. To call a series of such methods, you therefore need to use a series of statements. You would like those methods to return self to enable you to chain a series of calls within a single expression.
A custom metaclass can offer an elegant approach to this task:
def makeChainable(func):
    ''' wrap a method returning None into one returning self '''
    def chainableWrapper(self, *args, **kwds):
        func(self, *args, **kwds)
        return self
    # 2.4 only: chainableWrapper.__name__ = func.__name__
    return chainableWrapper

class MetaChainable(type):
    def __new__(mcl, cName, cBases, cDict):
        # get the "real" base class, then wrap its mutators into the cDict
        for base in cBases:
            if not isinstance(base, MetaChainable):
                for mutator in cDict['__mutators__']:
                    if mutator not in cDict:
                        cDict[mutator] = makeChainable(getattr(base, mutator))
                break
        # delegate the rest to built-in 'type'
        return super(MetaChainable, mcl).__new__(mcl, cName, cBases, cDict)
class Chainable:
    __metaclass__ = MetaChainable
if __name__ == '__main__':
    # example usage
    class chainablelist(Chainable, list):
        __mutators__ = 'sort reverse append extend insert'.split()
    print ''.join(chainablelist('hello').extend('ciao').sort().reverse())
    # emits: oolliheca
Mutator methods of mutable objects such as lists and dictionaries work in place, mutating the object they’re called on, and return None. One reason for this behavior is to avoid confusing programmers who might otherwise think such methods build and return new objects. Returning None also prevents you from chaining a sequence of mutator calls, which some Python gurus consider bad style because it can lead to very dense code that may be hard to read.
Some programmers, however, occasionally prefer the chained-calls, dense-code style. This style is particularly useful in such contexts as lambda forms and list comprehensions. In these contexts, the ability to perform actions within an expression, rather than in statements, can be crucial. This recipe shows one way you can tweak mutators’ return values to allow chaining. Using a custom metaclass means the runtime overhead of introspection is paid only rarely, at class-creation time, rather than repeatedly. If runtime overhead is not a problem for your application, it may be simpler for you to use a delegating wrapper idiom that was posted to comp.lang.python by Jacek Generowicz:
class chainable(object):
    def __init__(self, obj):
        self.obj = obj
    def __iter__(self):
        return iter(self.obj)
    def __getattr__(self, name):
        def proxy(*args, **kwds):
            result = getattr(self.obj, name)(*args, **kwds)
            if result is None:
                return self
            else:
                return result
        # 2.4 only: proxy.__name__ = name
        return proxy
The use of this wrapper is quite similar to that of classes obtained by the custom metaclass presented in this recipe’s Solution—for example:
print ''.join(chainable(list('hello')).extend('ciao').sort().reverse())
# emits: oolliheca
Library Reference and Python in a Nutshell docs on built-in type list and special methods _ _new_ _ and _ _getattr_ _.
Credit: Michele Simionato, Gonçalo Rodrigues
You like the cooperative style of multiple-inheritance coding supported by the super built-in, but you wish you could use that style in a more terse and direct way.
A custom metaclass lets us selectively wrap the methods exposed by a class. Specifically, if the second argument of a method is named super, then that argument gets bound to the appropriate instance of the built-in super:
import inspect

def second_arg(func):
    args = inspect.getargspec(func)[0]
    try:
        return args[1]
    except IndexError:
        return None

def super_wrapper(cls, func):
    def wrapper(self, *args, **kw):
        return func(self, super(cls, self), *args, **kw)
    # 2.4 only: wrapper.__name__ = func.__name__
    return wrapper

class MetaCooperative(type):
    def __init__(cls, name, bases, dic):
        super(MetaCooperative, cls).__init__(name, bases, dic)
        for attr_name, func in dic.iteritems():
            if inspect.isfunction(func) and second_arg(func) == "super":
                setattr(cls, attr_name, super_wrapper(cls, func))

class Cooperative:
    __metaclass__ = MetaCooperative
Here is a usage example of the custom metaclass presented in this recipe’s Solution, in a typical toy case of “diamond-shaped” inheritance:
if __name__ == "__main__":
    class B(Cooperative):
        def say(self):
            print "B",

    class C(B):
        def say(self, super):
            super.say()
            print "C",

    class D(B):
        def say(self, super):
            super.say()
            print "D",

    class CD(C, D):
        def say(self, super):
            super.say()
            print '!'

    CD().say()
    # emits: B D C !
Methods that want to access the super-instance just need to use super as the name of their second argument; the metaclass then arranges to wrap those methods so that the super-instance gets synthesized and passed in as the second argument, as needed.
In other words, when a class cls, whose metaclass is MetaCooperative, has methods whose second argument is named super, then, in those methods, any call of the form super.something(*args, **kw) is a shortcut for super(cls, self).something(*args, **kw). This approach avoids the need to pass the class object as an argument to the built-in super.
Class cls may also perfectly well have other methods that do not follow this convention, and in those methods, it may use the built-in super in the usual way: all it takes for any method to be “normal” is to not use super as the name of its second argument, surely not a major restriction. This recipe offers nicer syntax sugar for the common case of cooperative supercalls, where the first argument to super is the current class; nothing more.
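For instance, class C from this recipe’s example is equivalent to the following plain spelling, with the cooperative supercall written out in full:

class C(B):
    def say(self):
        super(C, self).say()
        print "C",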
Library Reference and Python in a Nutshell docs on module inspect and the super built-in.
Credit: Dan Perl, Shalabh Chaturvedi
Your classes need to initialize some instance attributes when they generate new instances. If you do the initialization, as normal, in the _ _init_ _ method of your classes, then, when anybody subclasses your classes, they must remember to invoke your classes’ _ _init_ _ methods. Your classes often get subclassed by beginners who forget this elementary requirement, and you’re getting tired of the resulting support requests. You’d like an approach that beginners subclassing your classes are less likely to mess up.
Beginners are unlikely to have heard of the _ _new_ _
method, so you can place your
initialization there, instead of in _ _init_
_
:
# a couple of classes that you write:
class super1(object):
    def __new__(cls, *args, **kwargs):
        obj = super(super1, cls).__new__(cls, *args, **kwargs)
        obj.attr1 = []
        return obj
    def __str__(self):
        show_attr = []
        for attr, value in sorted(self.__dict__.iteritems()):
            show_attr.append('%s:%r' % (attr, value))
        return '%s with %s' % (self.__class__.__name__,
                               ', '.join(show_attr))
class super2(object):
    def __new__(cls, *args, **kwargs):
        obj = super(super2, cls).__new__(cls, *args, **kwargs)
        obj.attr2 = {}
        return obj
# typical beginners' code, inheriting your classes but forgetting to
# call its superclasses' __init__ methods
class derived(super1, super2):
    def __init__(self):
        self.attr1.append(111)
        self.attr3 = ()
# despite the typical beginner's error, you won't get support calls:
d = derived()
print d
# emits: derived with attr1:[111], attr2:{}, attr3:()
One of Python's strengths is that it does very little magic behind the curtains—close to nothing, actually. If you know Python in sufficient depth, you know that essentially all internal mechanisms are clearly documented and exposed. This strength, however, means that you yourself must do some things that other languages do magically, such as prefixing self. to methods and attributes of the current object and explicitly calling the __init__ methods of your superclasses in the __init__ method of your own class.
Unfortunately, Python beginners, particularly if they first learned from other languages where they're used to such implicit and magical behavior, can take some time adapting to this brave new world where, if you want something done, you do it. Eventually, they learn. Until they have learned, at times it seems that their favorite pastime is filling my mailbox with help requests, in tones ranging from the humble to the arrogant and angry, complaining that "my classes don't work." Almost invariably, this complaint means they're inheriting from my classes, which are meant to ease such tasks as displaying GUIs and communicating on the Internet, and they have forgotten to call my classes' __init__ methods from the __init__ methods of subclasses they have coded.
To deal with this annoyance, I devised the simple solution shown in this recipe. Beginners generally don't know about the __new__ method, and what they don't know, they cannot mess up. If they do know enough to override __new__, you can hope they also know enough to do a properly cooperative supercall using the super built-in, rather than crudely bypassing your code by directly calling object.__new__. Well, hope springs eternal, or so they say. Truth be told, my hopes lie in beginners' total, blissful ignorance about __new__—and this theory seems to work because I don't get those kinds of help requests any more. The help requests I now receive seem concerned more with how to actually use my classes, rather than displaying fundamental ignorance of Python.
If you work with more advanced but equally perverse beginners, ones quite able to mess up __new__, you should consider giving your classes a custom metaclass that, in its __call__ (which executes at class instantiation time), calls a special hidden method on your classes, enabling you to do your initializations anyway; a minimal sketch of this idea follows. That approach should hold you in good stead—at least until the beginners start learning about metaclasses. Of course, "it is impossible to make anything foolproof, because fools are so ingenious" (Roger Berg). Nevertheless, see Recipe 20.14 for other approaches that avoid __init__ for attribute initialization needs.
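Here is one way such a metaclass might look; this is a sketch of my own, not part of the recipe, and the names MetaInitializer, base1, and _hidden_init_ are invented for the illustration:

class MetaInitializer(type):
    def __call__(cls, *args, **kwargs):
        obj = cls.__new__(cls)           # build the bare instance
        obj._hidden_init_()              # your initializations, always run
        obj.__init__(*args, **kwargs)    # then whatever __init__ exists
        return obj

class base1(object):
    __metaclass__ = MetaInitializer
    def _hidden_init_(self):             # beginners won't override this
        self.attr1 = []

A subclass's __init__ may now freely forget any superclass call: attr1 is already in place before __init__ ever runs.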
Library Reference and Python in a Nutshell documentation on special methods __init__ and __new__, and built-in super; Recipe 20.14.
Credit: Sébastien Keim, Troy Melhase, Peter Cogolo
You want to set some attributes to constant values during object initialization, without forcing your subclasses to call your __init__ method.
For constant values of immutable types, you can just set them in the class. For example, instead of the natural looking:
class counter(object):
    def __init__(self):
        self.count = 0
    def increase(self, addend=1):
        self.count += addend
you can code:
class counter(object):
    count = 0
    def increase(self, addend=1):
        self.count += addend
This style works because self.count += addend, when self.count belongs to an immutable type, is exactly equivalent to self.count = self.count + addend. The first time this code executes for a particular instance self, self.count is not yet initialized as a per-instance attribute, so the per-class attribute is used on the right of the equal sign (=); but the per-instance attribute is nevertheless the one assigned to (on the left of the sign). Any further use, once the per-instance attribute has been initialized in this way, gets or sets the per-instance attribute.
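A quick interactive check (a session invented for illustration, using the class-attribute version just shown) confirms this dance of class versus instance attributes:

>>> c = counter()
>>> c.count          # no instance attribute yet: the class attribute shows
0
>>> c.increase()
>>> c.count          # now a per-instance attribute, initialized to 1
1
>>> counter.count    # the per-class attribute is untouched
0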
This style does not work for values of mutable types, such as lists or dictionaries. Coding this way would then result in all instances of the class sharing the same mutable-type object as their attribute. However, a custom descriptor works fine:
class auto_attr(object):
    def __init__(self, name, factory, *a, **k):
        self.data = name, factory, a, k
    def __get__(self, obj, clas=None):
        name, factory, a, k = self.data
        setattr(obj, name, factory(*a, **k))
        return getattr(obj, name)
With class auto_attr at hand, you can now code, for example:
class recorder(object):
    count = 0
    events = auto_attr('events', list)
    def record(self, event):
        self.count += 1
        self.events.append((self.count, event))
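Here is how recorder behaves at the interactive prompt (a session of my own, not from the recipe); note that each instance gets its own distinct events list:

>>> r1, r2 = recorder(), recorder()
>>> r1.record('spam')
>>> r1.events
[(1, 'spam')]
>>> r2.events        # a fresh, unshared list for each instance
[]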
The simple and standard approach of defining constant initial
values of attributes by setting them as class attributes is just fine,
as long as we’re talking about constants of immutable types, such as
numbers or strings. In such cases, it does no harm for all instances
of the class to share the same initial-value object for such
attributes, and, when you do such operations as self.count += 1
, you intrinsically rebind
the specific, per-instance value of the attribute, without affecting
the attributes of other instances.
However, when you want an attribute to have an initial value of a mutable type, such as a list or a dictionary, you need a little bit more—such as the auto_attr custom descriptor type in this recipe. Each instance of auto_attr needs to know to what attribute name it's being bound, so we pass that name as the first argument when we instantiate auto_attr. Then we have the factory, a callable that will produce the desired initial value when called (often factory will be a type object, such as list or dict); and finally, optional positional and keyword arguments to be passed when factory gets called.
The first time you access an attribute named name on a given instance obj, Python finds in obj's class the descriptor (an instance of auto_attr) and calls the descriptor's method __get__, with obj as an argument. auto_attr's __get__ calls the factory and sets the result under the right name as an instance attribute, so that any further access to the attribute of that name in the instance gets the actual value.
In other words, the descriptor is designed to hide itself when it's first accessed on each instance, to get out of the way of further accesses to the attribute of the same name on that same instance. For this purpose, it's absolutely crucial that auto_attr is technically a nondata descriptor class, meaning it doesn't define a __set__ method. As a consequence, an attribute of the same name may be set in the instance: the per-instance attribute overrides (i.e., takes precedence over) the per-class attribute (i.e., the instance of a nondata descriptor class).
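You can watch this hide-and-seek happen at the prompt (again, a session invented for illustration):

>>> r = recorder()
>>> 'events' in vars(r)   # nothing in the instance yet
False
>>> r.events              # first access: __get__ plants the attribute
[]
>>> 'events' in vars(r)   # the per-instance attribute now takes over
True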
You can regard this recipe's approach as "just-in-time generation" of instance attributes, the first time a certain attribute gets accessed on a certain instance. Beyond allowing attribute initialization to occur without an __init__ method, this approach may therefore be useful as an optimization: consider it when each instance has a potentially large set of attributes, maybe costly to initialize, and most of the attributes may end up never being accessed on each given instance.
It is somewhat unfortunate that this recipe requires you to pass to auto_attr the name of the attribute it's getting bound to; unfortunately, auto_attr has no way to find out for itself. However, if you're willing to add a custom metaclass to the mix, you can fix this little inconvenience, too, as follows:
class smart_attr(object):
    name = None
    def __init__(self, factory, *a, **k):
        self.creation_data = factory, a, k
    def __get__(self, obj, clas=None):
        if self.name is None:
            raise RuntimeError, ("class %r uses a smart_attr, so its "
                                 "metaclass should be MetaSmart, but is "
                                 "%r instead" % (clas, type(clas)))
        factory, a, k = self.creation_data
        setattr(obj, self.name, factory(*a, **k))
        return getattr(obj, self.name)

class MetaSmart(type):
    def __new__(mcl, clasname, bases, clasdict):
        # set all names for smart_attr attributes
        for k, v in clasdict.iteritems():
            if isinstance(v, smart_attr):
                v.name = k
        # delegate the rest to the supermetaclass
        return super(MetaSmart, mcl).__new__(mcl, clasname, bases, clasdict)

# let's let any class use our custom metaclass by inheriting from smart_object
class smart_object:
    __metaclass__ = MetaSmart
Using this variant, you could code:
class recorder(smart_object):
    count = 0
    events = smart_attr(list)
    def record(self, event):
        self.count += 1
        self.events.append((self.count, event))
Once you start considering custom metaclasses, you have more options for this recipe's task, automatic initialization of instance attributes. While a custom descriptor remains the best approach when you do want "just-in-time" generation of initial values, if you prefer to generate all the initial values at the time the instance is being initialized, then you can use a simple placeholder instead of smart_attr, and do more work in the metaclass:
class attr(object):
    def __init__(self, factory, *a, **k):
        self.creation_data = factory, a, k

import inspect

def is_attr(member):
    return isinstance(member, attr)

class MetaAuto(type):
    def __call__(cls, *a, **k):
        obj = super(MetaAuto, cls).__call__(*a, **k)
        # set all values for 'attr' attributes
        for n, v in inspect.getmembers(cls, is_attr):
            factory, a, k = v.creation_data
            setattr(obj, n, factory(*a, **k))
        return obj

# let's let any class use our custom metaclass by inheriting from auto_object
class auto_object:
    __metaclass__ = MetaAuto
Code using this more concise variant looks just about the same as with the previous one:
class recorder(auto_object):
    count = 0
    events = attr(list)
    def record(self, event):
        self.count += 1
        self.events.append((self.count, event))
Recipe 20.13 for another approach that avoids __init__ for attribute initialization needs; Library Reference and Python in a Nutshell docs on special method __init__, and built-ins super and setattr.
Credit: Michael Hudson, Peter Cogolo
You are developing a Python module that defines a class, and you're trying things out in the interactive interpreter. Each time you reload the module, you have to ensure that existing instances are updated to instances of the new, rather than the old, class.
First, we define a custom metaclass, which ensures its classes keep track of all their existing instances:
import weakref

class MetaInstanceTracker(type):
    ''' a metaclass which ensures its classes keep track of their instances '''
    def __init__(cls, name, bases, ns):
        super(MetaInstanceTracker, cls).__init__(name, bases, ns)
        # new class cls starts with no instances
        cls.__instance_refs__ = []
    def __instances__(cls):
        ''' return all instances of cls which are still alive '''
        # get ref and obj for refs that are still alive
        instances = [(r, r()) for r in cls.__instance_refs__
                     if r() is not None]
        # record the still-alive references back into the class
        cls.__instance_refs__ = [r for (r, o) in instances]
        # return the instances which are still alive
        return [o for (r, o) in instances]
    def __call__(cls, *args, **kw):
        ''' generate an instance, and record it (with a weak reference) '''
        instance = super(MetaInstanceTracker, cls).__call__(*args, **kw)
        # record a ref to the instance before returning the instance
        cls.__instance_refs__.append(weakref.ref(instance))
        return instance

class InstanceTracker:
    ''' any class may subclass this one, to keep track of its instances '''
    __metaclass__ = MetaInstanceTracker
Now, we can subclass MetaInstanceTracker to obtain another custom metaclass, which, on top of the instance-tracking functionality, implements the auto-upgrading functionality required by this recipe's Problem:
import inspect

class MetaAutoReloader(MetaInstanceTracker):
    ''' a metaclass which, when one of its classes is re-built, updates all
        instances and subclasses of the previous version to the new one '''
    def __init__(cls, name, bases, ns):
        # the new class may optionally define an __update__ method
        updater = ns.pop('__update__', None)
        super(MetaAutoReloader, cls).__init__(name, bases, ns)
        # inspect locals & globals in the stackframe of our caller
        f = inspect.currentframe().f_back
        for d in (f.f_locals, f.f_globals):
            if name in d:
                # found the name as a variable: is it the old class?
                old_class = d[name]
                if not isinstance(old_class, MetaAutoReloader):
                    # no, keep trying
                    continue
                # found the old class: update its existing instances
                for instance in old_class.__instances__():
                    instance.__class__ = cls
                    if updater:
                        updater(instance)
                    cls.__instance_refs__.append(weakref.ref(instance))
                # also update the old class's subclasses
                for subclass in old_class.__subclasses__():
                    bases = list(subclass.__bases__)
                    bases[bases.index(old_class)] = cls
                    subclass.__bases__ = tuple(bases)
                break

class AutoReloader:
    ''' any class may subclass this one, to get automatic updates '''
    __metaclass__ = MetaAutoReloader
Here is a usage example:
# an 'old class'
class Bar(AutoReloader):
    def __init__(self, what=23):
        self.old_attribute = what

# a subclass of the old class
class Baz(Bar):
    pass

# instances of the old class & of its subclass
b = Bar()
b2 = Baz()

# we rebuild the class (normally via 'reload', but, here, in-line!):
class Bar(AutoReloader):
    def __init__(self, what=42):
        self.new_attribute = what+100
    def __update__(self):
        # compute new attribute from old ones, then delete old ones
        self.new_attribute = self.old_attribute+100
        del self.old_attribute
    def meth(self, arg):
        # add a new method which wasn't in the old class
        print arg, self.new_attribute

if __name__ == '__main__':
    # now b is "upgraded" to the new Bar class, so we can call 'meth':
    b.meth(1)
    # emits: 1 123
    # subclass Baz is also upgraded, both for existing instances...:
    b2.meth(2)
    # emits: 2 123
    # ...and for new ones:
    Baz().meth(3)
    # emits: 3 142
You’re probably familiar with the problem this recipe is meant to address. The scenario is that you’re editing a Python module with your favorite text editor. Let’s say at some point, your module mod.py looks like this:
class Foo(object):
    def meth1(self, arg):
        print arg
In another window, you have an interactive interpreter running to test your code:
>>> import mod
>>> f = mod.Foo()
>>> f.meth1(1)
1
and it seems to be working. Now you edit mod.py to add another method:
class Foo(object):
    def meth1(self, arg):
        print arg
    def meth2(self, arg):
        print -arg
Head back to the test session:
>>> reload(mod)
<module 'mod' from 'mod.pyc'>
>>> f.meth2(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'Foo' object has no attribute 'meth2'
Argh! You forgot that f was an instance of the old mod.Foo!
You can do two things about this situation. After reloading, either regenerate the instance:
>>> f = mod.Foo()
>>> f.meth2(2)
-2
or manually assign to f.__class__:
>>> f.__class__ = mod.Foo
>>> f.meth2(2)
-2
Regenerating works well in simple situations but can become very tedious. Assigning to the class can be automated, which is what this recipe is all about.
Class MetaInstanceTracker is a metaclass that tracks instances of its instances. As metaclasses go, it isn't too complicated. New classes of this metatype get an extra __instance_refs__ class variable (which is used to store weak references to instances) and an __instances__ class method (which strips out dead references from the __instance_refs__ list and returns real references to the still-live instances). Each time a class whose metatype is MetaInstanceTracker gets instantiated, a weak reference to the instance is appended to the class' __instance_refs__ list.
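For instance, the tracking is easy to observe interactively; this is a sketch of my own, with an invented class Widget, and it relies on CPython's immediate reference-counting collection:

>>> class Widget(InstanceTracker): pass
...
>>> w1, w2 = Widget(), Widget()
>>> len(Widget.__instances__())
2
>>> del w1
>>> len(Widget.__instances__())   # the dead weak reference got pruned
1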
When the definition of a class of metatype MetaAutoReloader executes, the namespace of the definition is examined to determine whether a class of the same name already exists. If it does, then it is assumed that this is a class redefinition, instead of a class definition, and all instances of the old class are updated to the new class. (MetaAutoReloader inherits from MetaInstanceTracker, so such instances can easily be found.) All direct subclasses, found through the old class' intrinsic __subclasses__ class method, then get their __bases__ tuples rebuilt with the same change.
The new class definition can optionally include a method __update__, whose job is to update the state (meaning the set of attributes) of each instance, as the instance's class transitions from the old version of the class to the new one. The usage example in this recipe's Solution presents a case in which one attribute has changed name and is computed by different rules, as you can tell by observing the way the __init__ methods of the old and new versions are coded; in this case, the job of __update__ is to compute the new attribute based on the value of the old one, then del the old attribute for tidiness.
This recipe's code should probably do more thorough error checking. Net of error-checking issues, this recipe can also supply some fundamental tools toward solving a problem that is substantially harder than the one explained in this recipe's Problem statement: automatically upgrading classes in a long-running application, without needing to stop and restart that application.
Doing automatic upgrading in production code is more difficult than doing it during development because many more issues must be monitored. For example, you may need a form of locking to ensure the application is in a quiescent state while a number of classes get upgraded, since you probably don’t want to have the application answering requests in the middle of the upgrading procedure, with some classes or instances already upgraded and others still in their old versions. You also often encounter issues of persistent storage because the application probably needs to update whatever persistent storage it keeps from old to new versions when it upgrades classes. And those are just two examples. Nevertheless, the key component of such on-the-fly upgrading, which has to do with updating instances and subclasses of old classes to new ones, can be tackled with the tools shown in this recipe.
Docs for the built-in function reload in the Library Reference and Python in a Nutshell.
Credit: Raymond Hettinger, Skip Montanaro
Runtime lookup of global and built-in names is slower than lookup of local names. So, you would like to bind constant global and built-in names into local constant names at compile time.
To perform this task, we must examine and rewrite bytecodes in the function's code object. First, we get three names from the standard library module opcode, so we can operate symbolically on bytecodes, and define two auxiliary functions for bytecode operations:
from opcode import opmap, HAVE_ARGUMENT, EXTENDED_ARG
globals().update(opmap)

def _insert_constant(value, i, code, constants):
    ''' insert LOAD_CONST for value at code[i:i+3].  Reuse an existing
        constant if values coincide, otherwise append new value to the
        list of constants; return index of the value in constants. '''
    for pos, v in enumerate(constants):
        if v is value:
            break
    else:
        pos = len(constants)
        constants.append(value)
    code[i] = LOAD_CONST
    code[i+1] = pos & 0xFF
    code[i+2] = pos >> 8
    return pos

def _arg_at(i, code):
    ''' return argument number of the opcode at code[i] '''
    return code[i+1] | (code[i+2] << 8)
Next comes the workhorse, the internal function that does all the binding and folding work:
def _make_constants(f, builtin_only=False, stoplist=(), verbose=False):
    # bail out at once, innocuously, if we're in Jython, IronPython, etc
    try:
        co = f.func_code
    except AttributeError:
        return f
    # we'll modify the bytecodes and consts, so make lists of them
    newcode = map(ord, co.co_code)
    codelen = len(newcode)
    newconsts = list(co.co_consts)
    names = co.co_names
    # Depending on whether we're binding only builtins, or ordinary globals
    # too, we build dictionary 'env' to look up name->value mappings, and we
    # build set 'stoplist' to selectively override and cancel such lookups
    import __builtin__
    env = vars(__builtin__).copy()
    if builtin_only:
        stoplist = set(stoplist)
        stoplist.update(f.func_globals)
    else:
        env.update(f.func_globals)
    # First pass converts global lookups into lookups of constants
    i = 0
    while i < codelen:
        opcode = newcode[i]
        # bail out in difficult cases: optimize common cases only
        if opcode in (EXTENDED_ARG, STORE_GLOBAL):
            return f
        if opcode == LOAD_GLOBAL:
            oparg = _arg_at(i, newcode)
            name = names[oparg]
            if name in env and name not in stoplist:
                # get the constant index to use instead
                pos = _insert_constant(env[name], i, newcode, newconsts)
                if verbose:
                    print '%r -> %r[%d]' % (name, newconsts[pos], pos)
        # move accurately to the next bytecode, skipping arg if any
        i += 1
        if opcode >= HAVE_ARGUMENT:
            i += 2
    # Second pass folds tuples of constants and constant attribute lookups
    i = 0
    while i < codelen:
        newtuple = []
        while newcode[i] == LOAD_CONST:
            oparg = _arg_at(i, newcode)
            newtuple.append(newconsts[oparg])
            i += 3
        opcode = newcode[i]
        if not newtuple:
            i += 1
            if opcode >= HAVE_ARGUMENT:
                i += 2
            continue
        if opcode == LOAD_ATTR:
            obj = newtuple[-1]
            oparg = _arg_at(i, newcode)
            name = names[oparg]
            try:
                value = getattr(obj, name)
            except AttributeError:
                continue
            deletions = 1
        elif opcode == BUILD_TUPLE:
            oparg = _arg_at(i, newcode)
            if oparg != len(newtuple):
                continue
            deletions = len(newtuple)
            value = tuple(newtuple)
        else:
            continue
        reljump = deletions * 3
        newcode[i-reljump] = JUMP_FORWARD
        newcode[i-reljump+1] = (reljump-3) & 0xFF
        newcode[i-reljump+2] = (reljump-3) >> 8
        pos = _insert_constant(value, i, newcode, newconsts)
        if verbose:
            print "new folded constant: %r[%d]" % (value, pos)
        i += 3
    codestr = ''.join(map(chr, newcode))
    codeobj = type(co)(co.co_argcount, co.co_nlocals, co.co_stacksize,
                       co.co_flags, codestr, tuple(newconsts), co.co_names,
                       co.co_varnames, co.co_filename, co.co_name,
                       co.co_firstlineno, co.co_lnotab, co.co_freevars,
                       co.co_cellvars)
    return type(f)(codeobj, f.func_globals, f.func_name, f.func_defaults,
                   f.func_closure)
Finally, we use _make_constants to optimize itself and its auxiliary function, and define the functions that are meant to be called from outside this module to perform the optimizations that this module supplies:
# optimize thyself!
_insert_constant = _make_constants(_insert_constant)
_make_constants = _make_constants(_make_constants)

import types

@_make_constants
def bind_all(mc, builtin_only=False, stoplist=(), verbose=False):
    """ Recursively apply constant binding to functions in a module
        or class. """
    try:
        d = vars(mc)
    except TypeError:
        return
    for k, v in d.items():
        if type(v) is types.FunctionType:
            newv = _make_constants(v, builtin_only, stoplist, verbose)
            setattr(mc, k, newv)
        elif type(v) in (type, types.ClassType):
            bind_all(v, builtin_only, stoplist, verbose)

@_make_constants
def make_constants(builtin_only=False, stoplist=[], verbose=False):
    """ Call this metadecorator to obtain a decorator which optimizes
        global references by constant binding on a specific function. """
    if type(builtin_only) == types.FunctionType:
        raise ValueError, 'must CALL, not just MENTION, make_constants'
    return lambda f: _make_constants(f, builtin_only, stoplist, verbose)
Assuming you have saved the code in this recipe's Solution as module optimize.py somewhere on your Python sys.path, the following example demonstrates how to use the make_constants decorator with arguments (i.e., metadecorator) to optimize a function—in this case, a reimplementation of random.sample:
import random
import optimize

@optimize.make_constants(verbose=True)
def sample(population, k):
    " Choose `k' unique random elements from a `population' sequence. "
    if not isinstance(population, (list, tuple, str)):
        raise TypeError('Cannot handle type', type(population))
    n = len(population)
    if not 0 <= k <= n:
        raise ValueError, "sample larger than population"
    result = [None] * k
    pool = list(population)
    for i in xrange(k):           # invariant: non-selected at [0, n-i)
        j = int(random.random() * (n-i))
        result[i] = pool[j]
        pool[j] = pool[n-i-1]     # move non-selected item into vacancy
    return result
Importing this module emits the following output. (Some details, such as the addresses and paths, will, of course, vary.)
'isinstance' -> <built-in function isinstance>[6]
'list' -> <type 'list'>[7]
'tuple' -> <type 'tuple'>[8]
'str' -> <type 'str'>[9]
'TypeError' -> <class exceptions.TypeError at 0x402952cc>[10]
'type' -> <type 'type'>[11]
'len' -> <built-in function len>[12]
'ValueError' -> <class exceptions.ValueError at 0x40295adc>[13]
'list' -> <type 'list'>[7]
'xrange' -> <type 'xrange'>[14]
'int' -> <type 'int'>[15]
'random' -> <module 'random' from '/usr/local/lib/python2.4/random.pyc'>[16]
new folded constant: (<type 'list'>, <type 'tuple'>, <type 'str'>)[17]
new folded constant: <built-in method random of Random object at 0x819853c>[18]
On my machine, with the decorator optimize.make_constants as shown in this snippet, sample(range(1000), 100) takes 287 microseconds; without the decorator (and thus with the usual bytecode that the Python 2.4 compiler produces), the same operation takes 333 microseconds. Thus, using the decorator improves performance by approximately 14% in this example—and it does so while allowing your own functions' source code to remain pristine, without any optimization-induced obfuscation. On functions making use of more constant names within loops, the performance benefit of using this recipe's decorator can be correspondingly greater.
A common and important technique for manual optimization of a Python function, once that function is shown by profiling to be a bottleneck of program performance, is to ensure that all global and built-in name lookups are turned into lookups of local names. In the source of functions that have been thus optimized, you see strange arguments with default values, such as _len=len, and the body of the function uses this local name _len to refer to the built-in function len. This kind of optimization is worthwhile because lookup of local names is much faster than lookup of global and built-in names. However, functions thus optimized can become cluttered and less readable. Moreover, optimizing by hand can be tedious and error prone.
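If you have not met this idiom before, here is what such a hand-optimized function looks like; the function and its _len argument are invented for this illustration:

def count_longer_than(items, limit, _len=len):
    # _len is a local name bound, at def time, to the built-in len:
    # each lookup in the loop is now a fast local lookup
    n = 0
    for item in items:
        if _len(item) > limit:
            n += 1
    return n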
This recipe automates this important optimization technique: by just mentioning a decorator before the def statement, you get all the constant bindings and foldings, while leaving the function source uncluttered, readable, and maintainable. After binding globals to constants, the decorator makes a second pass and folds constant attribute lookups and tuples of constants. Constant attribute lookups often occur when you use a function or other attribute from an imported module, such as the use of random.random in the sample function in the example snippet. Tuples of constants commonly occur in for loops and conditionals using the in operator, such as for x in ('a', 'b', 'c'). The best way to appreciate the bytecode transformations performed by the decorator in this recipe is to run dis.dis(sample) and view the disassembly into bytecodes, both with and without the decorator.
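For example, assuming you have saved the recipe as optimize.py, a comparison along these lines (a sketch, with an invented one-line function f) shows a LOAD_GLOBAL turning into a LOAD_CONST:

import dis, optimize

def f(x):
    return len(x)

dis.dis(f)                          # shows a LOAD_GLOBAL for 'len'
f2 = optimize.make_constants()(f)   # apply the metadecorator by hand
dis.dis(f2)                         # now a LOAD_CONST of built-in len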
If you want to optimize every function and method in a module, you can call optimize.bind_all(sys.modules[__name__]) as the last instruction in the module's body, before the tests. To optimize every method in a class, you can call optimize.bind_all(theclass) just after the end of the body of the class theclass statement. Such wholesale optimization is handy (it does not require you to deal with any details) but generally not the best approach. It's best to bind, selectively, only functions whose speed is important. Functions that particularly benefit from constant-binding optimizations are those that refer to many global and built-in names, particularly with references in loops.
To ensure that the constant-binding optimizations do not alter the behavior of your code, apply them only where dynamic updates of globals are not desired (i.e., the globals do not change). In more dynamic environments, a more conservative approach is to pass argument builtin_only as True, so that only the built-ins get optimized (built-ins include functions such as len, exceptions such as IndexError, and such constants as True or False). Alternatively, you can pass a sequence of names as the stoplist argument, to tell the binding optimization functions to leave unchanged any reference to those names.
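For example (with names invented for this illustration), you might use stoplist to protect a module-level setting that your program rebinds at runtime:

import optimize

CACHE_SIZE = 100     # a global we intend to rebind while running

@optimize.make_constants(stoplist=['CACHE_SIZE'])
def cache_full(cache):
    # len gets bound to a constant; CACHE_SIZE stays a live global lookup
    return len(cache) >= CACHE_SIZE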
While this recipe is meant for use with Python 2.4, you can also use this approach in Python 2.3, with a few obvious caveats. In particular, in version 2.3, you cannot use the new 2.4 @decorator syntax. Therefore, to use this recipe in Python 2.3, you'll have to tweak its code a little, to expose _make_constants directly, without a leading underscore, and use f=make_constants(f) in your code, right after the end of the body of the def f statement. However, if you are interested in optimization, you should consider moving to Python 2.4 anyway: Python 2.4 is very compatible with Python 2.3, with just a few useful additions, and version 2.4 is generally measurably faster than Python 2.3.
Library Reference and Python in a Nutshell docs on the opcode module.
Credit: Michele Simionato, David Mertz, Phillip J. Eby, Alex Martelli, Anna Martelli Ravenscroft
You need to multiply inherit from several classes that may come from several metaclasses, so you need to generate automatically a custom metaclass to solve any possible metaclass conflicts.
First of all, given a sequence of metaclasses, we want to filter out "redundant" ones—those that are already implied by others, being duplicates or superclasses. This job nicely factors into a general-purpose generator yielding the unique, nonredundant items of an iterable, and a function using inspect.getmro to make the set of all superclasses of the given classes (since superclasses are redundant):
# support 2.3, too
try:
    set
except NameError:
    from sets import Set as set
# support classic classes, to some extent
import types

def uniques(sequence, skipset):
    for item in sequence:
        if item not in skipset:
            yield item
            skipset.add(item)

import inspect

def remove_redundant(classes):
    redundant = set([types.ClassType])   # turn old-style classes to new
    for c in classes:
        redundant.update(inspect.getmro(c)[1:])
    return tuple(uniques(classes, redundant))
Using the remove_redundant function, we can generate a metaclass that can resolve metatype conflicts (given a sequence of base classes, and other metaclasses to inject both before and after those implied by the base classes). It's important to avoid generating more than one metaclass to solve the same potential conflicts, so we also keep a "memoization" mapping:
memoized_metaclasses_map = {}

def _get_noconflict_metaclass(bases, left_metas, right_metas):
    # make tuple of needed metaclasses in specified order
    metas = left_metas + tuple(map(type, bases)) + right_metas
    needed_metas = remove_redundant(metas)
    # return existing conflict-solving meta, if any
    try:
        return memoized_metaclasses_map[needed_metas]
    except KeyError:
        pass
    # compute, memoize and return needed conflict-solving meta
    if not needed_metas:           # whee, a trivial case, happy us
        meta = type
    elif len(needed_metas) == 1:   # another trivial case
        meta = needed_metas[0]
    else:                          # nontrivial, darn, gotta work...
        # ward against non-type root metatypes
        for m in needed_metas:
            if not issubclass(m, type):
                raise TypeError('Non-type root metatype %r' % m)
        metaname = '_' + ''.join([m.__name__ for m in needed_metas])
        meta = classmaker()(metaname, needed_metas, {})
    memoized_metaclasses_map[needed_metas] = meta
    return meta

def classmaker(left_metas=(), right_metas=()):
    def make_class(name, bases, adict):
        metaclass = _get_noconflict_metaclass(bases, left_metas, right_metas)
        return metaclass(name, bases, adict)
    return make_class
The internal _get_noconflict_metaclass function, which returns (and, if needed, builds) the conflict-resolution metaclass, and the public classmaker closure must be mutually recursive for a rather subtle reason. If _get_noconflict_metaclass just built the metaclass with the reasonably common idiom:

meta = type(metaname, needed_metas, {})
it would work in all ordinary cases, but it might get into trouble when the metaclasses involved have custom metaclasses themselves! Just like “little fleas have lesser fleas,” so, potentially, metaclasses can have meta-metaclasses, and so on—fortunately not “ad infinitum,” pace Augustus De Morgan, so the mutual recursion does eventually terminate.
The recipe offers minimal support for old-style (i.e., classic) classes, with the simple expedient of initializing the set redundant to contain the metaclass of old-style classes, types.ClassType. In practice, this recipe imposes automatic conversion to new-style classes. Trying to offer more support than this for classic classes, which are after all a mere legacy feature, would be overkill, given the confused and confusing situation of metaclass use for old-style classes.
In all of our code outside of this noconflict.py module, we will only use noconflict.classmaker, optionally passing it metaclasses we want to inject, left and right, to obtain a callable that we can then use just like a metaclass to build new class objects given names, bases, and dictionary, but with the assurance that metatype conflicts cannot occur. Phew. Now that was worth it, wasn't it?!
Here is the simplest case in which a metatype conflict can occur: multiply inheriting from two classes with independent metaclasses. In a pedagogically simplified toy-level example, that could be, say:
>>> class Meta_A(type): pass
...
>>> class Meta_B(type): pass
...
>>> class A: __metaclass__ = Meta_A
...
>>> class B: __metaclass__ = Meta_B
...
>>> class C(A, B): pass
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: Error when calling the metaclass bases
    metaclass conflict: the metaclass of a derived class must be a
(non-strict) subclass of the metaclasses of all its bases
>>>
A class normally inherits its metaclass from its bases, but when the bases have distinct metaclasses, the metatype constraint that Python expresses so tersely in this error message applies. So, we need to build a new metaclass, say Meta_C, which inherits from both Meta_A and Meta_B. For a demonstration of this need, see the book that's so aptly considered the bible of metaclasses: Ira R. Forman and Scott H. Danforth, Putting Metaclasses to Work: A New Dimension in Object-Oriented Programming (Addison-Wesley).
Python does not do magic: it does not automatically create the required Meta_C. Rather, Python raises a TypeError to ensure that the programmer is aware of the problem. In simple cases, the programmer can solve the metatype conflict by hand, as follows:
>>> class Meta_C(Meta_A, Meta_B): pass
...
>>> class C(A, B): __metaclass__ = Meta_C
...
In this case, everything works smoothly.
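Indeed, you can check interactively that C now satisfies the metatype constraint:

>>> type(C) is Meta_C
True
>>> isinstance(C, Meta_A) and isinstance(C, Meta_B)
True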
The key point of this recipe is to show an automatic way to resolve metatype conflicts, rather than having to do it by hand every time. Having saved all the code from this recipe's Solution into noconflict.py somewhere along your Python sys.path, you can make class C with automatic conflict resolution, as follows:
>>> import noconflict
>>> class C(A, B): __metaclass__ = noconflict.classmaker()
...
The call to the noconflict.classmaker closure returns a function that, when Python calls it, obtains the proper metaclass and uses it to build the class object. It cannot yet return the metaclass itself, but that's OK—you can assign anything you want to the __metaclass__ attribute of your class, as long as it's callable with the (name, bases, dict) arguments and nicely builds the class object. Once again, Python's signature-based polymorphism serves us well and unobtrusively.
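To see just how liberal this protocol is, consider this toy factory function (invented for illustration), which is not a metaclass at all and yet serves perfectly well as a __metaclass__:

>>> def chatty_maker(name, bases, adict):
...     print 'building class', name
...     return type(name, bases, adict)
...
>>> class X(object): __metaclass__ = chatty_maker
...
building class X
>>> type(X)
<type 'type'>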
Automating the resolution of the metatype conflict has many pluses, even in simple cases. Thanks to the “memoizing” technique used in noconflict.py, the same conflict-resolving metaclass is used for any occurrence of a given sequence of conflicting metaclasses. Moreover, with this approach you may also explicitly inject other metaclasses, beyond those you get from your base classes, and again you can avoid conflicts. Consider:
>>> class D(A): __metaclass__ = Meta_B
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: Error when calling the metaclass bases
    metaclass conflict: the metaclass of a derived class must be a
(non-strict) subclass of the metaclasses of all its bases
This metatype conflict is resolved just as easily as the former one:
>>> class D(A): __metaclass__ = noconflict.classmaker((Meta_B,))
...
The code presented in this recipe’s Solution takes pains to avoid any subclassing that is not strictly necessary, and it also uses mutual recursion to avoid any meta-level of meta-meta-type conflicts. You might never meet higher-order-meta conflicts anyway, but if you adopt the code presented in this recipe, you need not even worry about them.
Thanks to David Mertz for help in polishing the original version of the code. This version has benefited immensely from discussions with Phillip J. Eby. Alex Martelli and Anna Martelli Ravenscroft did their best to make the recipe's code and discussion as explicit and understandable as they could. The functionality in this recipe is not absolutely complete: for example, it supports old-style classes only in a rather backhanded way, and it does not really cover such exotica as nontype metatype roots (such as Zope 2's old ExtensionClass). These limitations are there primarily to keep this recipe as understandable as possible. You may find a more complete implementation of metatype conflict resolution at Phillip J. Eby's PEAK site, http://peak.telecommunity.com/, in the peak.util.Meta module of the PEAK framework.
Ira R. Forman and Scott H. Danforth, Putting Metaclasses to Work: A New Dimension in Object-Oriented Programming (Addison-Wesley); Michele Simionato’s essay, “Method Resolution Order,” http://www.python.org/2.3/mro.html.