Credit: David Ascher, Mark Hammond
You’re trying to track down memory usage of specific classes in a large system, and Recipe 14.10 either gives too much data to be useful or fails to recognize cycles.
You can design the constructors of suspect classes to keep a list of weak references to the instances in a global cache:
import weakref

# Maps each class name to a list of weak references to its instances
tracked_classes = {}

def logInstanceCreation(instance):
    name = instance.__class__.__name__
    if name not in tracked_classes:
        tracked_classes[name] = []
    tracked_classes[name].append(weakref.ref(instance))

def reportLoggedInstances(classes):  # "*" means all known instances
    if classes == '*':
        classes = list(tracked_classes.keys())
    else:
        classes = classes.split()
    classes.sort()
    for classname in classes:
        for ref in tracked_classes[classname]:
            ob = ref()
            if ob is not None:
                print(ob)
To use this code, add a call to logInstanceCreation(self) to the __init__ methods of the classes whose instances you want to track. When you want to find out which instances are currently alive, call reportLoggedInstances() with the names of the classes in question (e.g., MyClass.__name__).
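As a minimal sketch of the usage just described, the following self-contained example repeats the recipe's two functions in Python 3 syntax and tracks a hypothetical class named Widget. The class, its __repr__ method, and the explicit gc.collect() call are assumptions added for illustration and are not part of the original recipe.

```python
import gc
import weakref

tracked_classes = {}  # class name -> list of weak references to instances

def logInstanceCreation(instance):
    name = instance.__class__.__name__
    if name not in tracked_classes:
        tracked_classes[name] = []
    tracked_classes[name].append(weakref.ref(instance))

def reportLoggedInstances(classes):  # "*" means all known instances
    if classes == '*':
        classes = list(tracked_classes)
    else:
        classes = classes.split()
    classes.sort()
    for classname in classes:
        for ref in tracked_classes[classname]:
            ob = ref()
            if ob is not None:
                print(ob)

class Widget:  # hypothetical class, used only for this demonstration
    def __init__(self, label):
        self.label = label
        logInstanceCreation(self)

    def __repr__(self):
        return "Widget(%r)" % self.label

first = Widget("first")
second = Widget("second")
del second     # drop the only strong reference to the second instance
gc.collect()   # not required on CPython; may help on other implementations

reportLoggedInstances(Widget.__name__)  # reports only the surviving instance
```

Because the cache holds only weak references, tracking an instance does not keep it alive, so the report reflects what the rest of the program still references.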
Tracking memory problems is a key skill for developers of large systems. The above code was dreamed up to deal with memory allocations in a system that involved three different garbage collectors; Python was only one of them. Due to the references between Python objects and non-Python objects, none of the individual garbage collectors could be expected to detect cycles between objects managed in different memory-management systems. Furthermore, being able to ask a class which of its instances are alive can be useful even in the absence of cycles (e.g., when making sure that the right number of instances is created following a particular user action in a GUI program).
The recipe hinges on a global dictionary called tracked_classes, which uses class names as keys and holds, for each key, a list of weak references to instances of that class. The logInstanceCreation function updates the dictionary (adding a new empty list if the name of the class whose instance is being tracked is not yet a key in the dictionary, then appending the new weak reference in any case). The reportLoggedInstances function accepts a string argument that is either '*', meaning all classes, or the names of the pertinent classes separated by whitespace. The function checks the dictionary entry for each of these class names, examining the list and printing out those instances of the class that still exist. It checks whether an instance still exists by calling the weak reference to it that was put in the list. When called, a weak reference returns None if the object it referred to no longer exists; otherwise, it returns a normal (strong) reference to the object in question.
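The behavior of a weak reference when called can be seen in isolation. This is a small sketch using a hypothetical Node class; the explicit gc.collect() call is an assumption added so the example does not depend on CPython's reference counting.

```python
import gc
import weakref

class Node:  # hypothetical class, used only for this demonstration
    pass

node = Node()
ref = weakref.ref(node)

# While a strong reference exists, calling the weak reference yields the object.
alive_before = ref() is node

del node       # remove the only strong reference
gc.collect()   # not required on CPython; may help on other implementations

# Once the object is gone, calling the weak reference returns None.
dead_after = ref() is None
print(alive_before, dead_after)
```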
Something you may want to do when using this kind of code is make sure that the possibly expensive debugging calls are wrapped in an if __debug__: test, as in:
class TrackedClass:
    def __init__(self):
        if __debug__:
            logInstanceCreation(self)
        ...
The pattern if __debug__: is detected by the Python parser in Python 2.0 and later. The body of any such marked block is ignored in the bytecode-generation phase if the -O command-line switch is specified. Consequently, you may write inefficient debug-time code without impacting the production code. In this case, this even avoids some unimportant bytecode generation. These bytecode savings can't amount to much, but the feature is worth noting.
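The effect of the switch can be observed without restarting the interpreter. The sketch below assumes Python 3.2 or later, where the built-in compile function accepts an optimize argument: a value of 0 corresponds to a normal run and 1 corresponds to running with -O. The source string, the track function, and the run helper are hypothetical names made up for this illustration.

```python
# A tiny program whose tracking call is guarded by "if __debug__:".
source = """
calls = []

def track(obj):
    calls.append(obj)

if __debug__:
    track("created")
"""

def run(optimize):
    """Compile and execute the source at the given optimization level."""
    namespace = {}
    code = compile(source, "<demo>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["calls"]

debug_calls = run(0)      # like plain "python": the guarded call runs
optimized_calls = run(1)  # like "python -O": the guarded block is skipped
print(debug_calls, optimized_calls)
```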
Also note that the ignominiously named setdefault dictionary method can be used to compact the logInstanceCreation function into a logical one-liner:
def logInstanceCreation(instance):
    tracked_classes.setdefault(instance.__class__.__name__, []
        ).append(weakref.ref(instance))
But such space savings are hardly worth the obfuscation cost, at least in the eyes of these authors.
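For readers who want to confirm that the two forms behave alike, here is a short comparison sketch. It assumes a hypothetical Thing class and passes the dictionary in as a parameter (named registry here) rather than using the recipe's global, purely so both variants can run side by side.

```python
import weakref

class Thing:  # hypothetical class, used only for this comparison
    pass

def log_explicit(registry, instance):
    # Explicit membership test, as in the recipe's main listing
    name = instance.__class__.__name__
    if name not in registry:
        registry[name] = []
    registry[name].append(weakref.ref(instance))

def log_compact(registry, instance):
    # setdefault returns the existing list, or stores and returns a new one
    registry.setdefault(instance.__class__.__name__, []).append(
        weakref.ref(instance))

explicit, compact = {}, {}
things = [Thing(), Thing()]  # keep strong references so the objects stay alive
for t in things:
    log_explicit(explicit, t)
    log_compact(compact, t)

same_keys = sorted(explicit) == sorted(compact)
same_targets = ([r() for r in explicit["Thing"]] ==
                [r() for r in compact["Thing"]])
print(same_keys, same_targets)
```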