Special attributes, those identifiers with two leading and trailing underscores, are abundant in Jython classes. They are the primary means of creating highly customized objects with complex behaviors. This chapter considers advanced Jython classes as those that leverage these special attributes.
Describing these objects as advanced might be misconstrued to mean difficult or reserved for those more studied in Python, but that is not the case. Adding special attributes to classes is part of the ongoing battle against complexity. The ability to tune an object’s behavior to act like a list or the ability to intercept attribute access only costs a few special methods while potential gains in design, reusability and flexibility are great.
Classes and instances implicitly have certain special attributes—those that automatically appear when a class definition executes or an instance is created. Jython wraps Java classes and instances so that they also have special attributes. Jython classes have five special attributes, whereas Java classes have three of the same five. Note that while all of these attributes are readable, only some allow assignment to them.
To further examine these special attributes, let’s first define a minimal Jython class suitable for exploring. Listing 7.1 is a Jython module that contains an import statement and a single class definition: LineDatum. The LineDatum class merely defines a line expression (datum) based on the slope and intercept supplied to the constructor. The instance then may add points that lie on the line with the addPoint method.

Example 7.1. A Jython Class Tracking Points on a Line

# file: datum.py
import java

class LineDatum(java.util.Hashtable):
    """Collects points that lie on a line.
    Instantiate with line slope and intercept: e.g. LineDatum(.5, 3)"""

    def __init__(self, slope, incpt):
        self.slope = slope
        self.incpt = incpt

    def addPoint(self, x, y):
        """addPoint(x, y) -> 1 or 0
        Accepts coordinates for a cartesian point (x,y). If point is on
        the line, it adds the point to the instance."""
        if y == self.slope * x + self.incpt:
            self.put((self.slope, self.incpt), (x, y))
            return 1
        return 0

Using the LineDatum class from within the interactive interpreter looks like this:

>>> import datum
>>> ld = datum.LineDatum(.5, 3)
>>> ld.addPoint(0, 3)
1
>>> ld.addPoint(2, 3)
0
>>> ld.addPoint(2, 4)
1
The special class variables are described in the next few sections.
The __name__ attribute is read-only and contains the name of the class. The name of the LineDatum class in Listing 7.1 is LineDatum. Even if the import as syntax is used, the __name__ is still LineDatum, as this example shows:

>>> from datum import LineDatum
>>> LineDatum.__name__
'LineDatum'
>>> from datum import LineDatum as ld
>>> ld.__name__
'LineDatum'

Java classes used within Jython also have the __name__ attribute, as demonstrated here:

>>> import java
>>> java.util.Hashtable.__name__
'java.util.Hashtable'
The __doc__ attribute contains the class documentation string, or None if it is not provided. You can assign to __doc__.

>>> from datum import LineDatum
>>> print LineDatum.__doc__
Collects points that lie on a line.
Instantiate with line slope and intercept: LineDatum(.5, 3)

Java classes used within Jython do not have a __doc__ attribute.
The __module__ attribute contains the name of the module in which the class is defined. You can assign to __module__. The LineDatum class in Listing 7.1 is defined in the module datum, which this example confirms:

>>> from datum import LineDatum
>>> LineDatum.__module__
'datum'

Java classes used within Jython do not have a __module__ attribute. Java doesn’t have modules, so this makes sense.
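The three attributes described so far can be inspected on any class. The following is a minimal sketch, not taken from the chapter's listings: the Gauge class and the alias name are hypothetical, and the code is written in portable Python so that it should also run outside Jython.

```python
# Hypothetical class used only for illustration; not part of Listing 7.1.
class Gauge:
    """Tracks a single reading."""
    def __init__(self, reading):
        self.reading = reading

# Binding the class to another name does not change its __name__.
alias = Gauge

print(alias.__name__)    # the class name, regardless of the alias
print(alias.__doc__)     # the class documentation string
print(alias.__module__)  # the module in which the class was defined
```

The value of __module__ depends on how the code is loaded, so it is printed here without assuming a particular result.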
The __dict__ attribute is a PyStringMap object containing all the class attributes. We can guess from looking at the LineDatum class definition in Listing 7.1 what keys should be found in LineDatum.__dict__: There must be an __init__ and an addPoint because those are the two methods defined. There should also be __doc__ and __module__ keys as described previously. The __name__ key is a good guess, but it actually doesn’t appear in the class.__dict__ in Jython. The LineDatum class in Listing 7.1 is more complex than a normal class because it actually subclasses a Java class. The implementation of this requires additional attributes, as seen in this example:

>>> from datum import LineDatum
>>> LineDatum.__dict__
{'__doc__': 'Collects points that lie on a line. Instantiate with line slope and intercept: LineDatum(.5, 3) ', 'rehash': <java function rehash at 1298249>, '__init__': <function __init__ at 913493>, 'addPoint': <function addPoint at 1939121>, 'finalize': <java function finalize at 1068455>, '__module__': 'datum'}
Attributes defined in a class appear in the class’s __dict__. The conventional notation for accessing attribute b in class A is A.b; however, the class __dict__ allows an alternate notation of A.__dict__['b']. Here’s an example of the differing syntaxes for attribute access:

>>> from datum import LineDatum
>>> LineDatum.__module__
'datum'
>>> LineDatum.__dict__['__module__'] # same as 'LineDatum.__module__'
'datum'

You can even call methods with this alternate naming. A bit of a trick is required for the addPoint method of Listing 7.1 because it is an instance method. An instance must be the first parameter. Fortunately, Jython isn’t fussy about which instance, so you can just create an instance before testing the syntax and pass it as the first argument:

>>> from datum import LineDatum
>>> inst = LineDatum(.5, 3) # get a surrogate instance
>>> LineDatum.addPoint(inst, 2, 4)
1
>>> LineDatum.__dict__['addPoint'](inst, 2, 4) # does same as above
1

This indirect way of accessing class and instance attributes is part of some popular Python patterns. Directly using a class __dict__ is something many flexible Jython object designs employ, and is sometimes required when certain special methods are defined, such as the __setattr__ method described later.
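The same two notations can be compared on a pure Python class. This is only a sketch: the Scale class, its factor attribute, and its apply method are hypothetical names chosen for illustration, and the code avoids Java imports so that it should run on other Python interpreters as well.

```python
# Hypothetical class for illustration only.
class Scale:
    factor = 3

    def apply(self, value):
        return self.factor * value

inst = Scale()  # an instance to pass explicitly as the first argument

# Conventional notation: look the method up on the class.
direct = Scale.apply(inst, 5)

# Alternate notation: fetch the function from the class __dict__ by key.
via_dict = Scale.__dict__['apply'](inst, 5)

print(direct == via_dict)
```

Both calls reach the same function, so they return the same value.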
Even more interesting is that Java classes used within Jython also have a __dict__, which contains their members. Looking at the java.io.File class’s __dict__ in Jython requires the following:

>>> import java
>>> java.io.File.__dict__
{'createNewFile': <java function createNewFile at 4294600>, 'lastModified': <java function lastModified at 3759986>, ... }

The full contents of java.io.File.__dict__ are left to the reader to discover due to their length. If you just want to look at the member names, use the following:

>>> java.io.File.__dict__.keys()
['mkdirs', 'exists', ...]

Additionally, you can call a method such as listRoots by using its key in java.io.File.__dict__. What would traditionally be called with java.io.File.listRoots() works with java.io.File.__dict__['listRoots'](), as demonstrated in this example:

>>> java.io.File.__dict__['listRoots']()
array([A:, C:, D:, G:], java.io.File)

Currently, you can also assign to a Java class’s __dict__. While this is likely bad practice for beginners, it clarifies the nature of __dict__. If you wish to alter the lookup of a Java class member, you can change the value of that key in its __dict__. Suppose you wanted a different method called for java.io.File.listRoots(); you can alter it this way:

>>> import java
>>> def newListRoots():
...     return (['c:'])
...
>>> java.io.File.__dict__['listRoots'] = newListRoots
>>> java.io.File.listRoots() # try the circumvented method
['c:']
The __bases__ attribute is a tuple of bases, or superclasses. Jython version 2.0 and the first alpha versions of Jython 2.1 designate this variable as read-only; however, versions after 2.1a1 allow assignments to this variable. The implementation at the time of this writing differs slightly from CPython because you can alter CPython’s __bases__. In Listing 7.1, the superclass of LineDatum is java.util.Hashtable. The special variable __bases__ confirms this:

>>> from datum import LineDatum
>>> LineDatum.__bases__
(<jclass java.util.Hashtable at 5012120>,)

Java classes used within Jython also have the special __bases__ variable, which includes base classes and interfaces implemented:

>>> import java
>>> java.io.File.__bases__
(<jclass java.lang.Object at 7290061>, <jclass java.io.Serializable at 62789>, <jclass java.lang.Comparable at 6728374>)
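Because __bases__ holds only the immediate superclasses, finding every ancestor means following it recursively. The helper below is a sketch under that assumption; ancestors, Shape, Polygon, and Square are hypothetical names, and the classes are pure Python so the example does not need a Java runtime.

```python
def ancestors(cls):
    """Return every class reachable through __bases__, in depth-first order."""
    found = []
    for base in cls.__bases__:
        if base not in found:
            found.append(base)
        # Follow the base's own __bases__ tuple as well.
        for older in ancestors(base):
            if older not in found:
                found.append(older)
    return found

# Hypothetical three-level hierarchy for illustration.
class Shape:
    pass

class Polygon(Shape):
    pass

class Square(Polygon):
    pass

names = [c.__name__ for c in ancestors(Square)]
print(names)
```

The exact list depends on the interpreter, since some Python implementations add an implicit root class to every hierarchy.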
Jython instances have two special variables that are implicitly defined, while Java instances used within Jython have one. These attributes are readable and assignable.
The __class__ variable denotes the class that the current object is an instance of. If we continue abusing Listing 7.1, we can demonstrate how an instance of LineDatum knows its class.

>>> from datum import LineDatum
>>> ld = LineDatum(.66, -2)
>>> ld.__class__
<class datum.LineDatum at 867682>

You can see that the __class__ variable is not just a string representing the class, but an actual reference to the class. If you know the required parameters, you can create an instance of an instance’s class. Continuing the previous example to do so looks like this:

>>> ld2 = ld.__class__(1.2, 6)
>>> ld2.__class__
<class datum.LineDatum at 867682>
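The same idea can be wrapped in a small helper that builds a new object from whatever class an existing instance belongs to. This is a sketch only; make_sibling and Celsius are hypothetical names, and the caller is assumed to know the constructor arguments the class requires.

```python
def make_sibling(obj, *args):
    """Create a new instance of whatever class obj belongs to."""
    return obj.__class__(*args)

# Hypothetical class for illustration.
class Celsius:
    def __init__(self, degrees):
        self.degrees = degrees

first = Celsius(21.5)
second = make_sibling(first, 30)  # no direct reference to Celsius needed

print(second.__class__ is first.__class__)
```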
You can examine all class properties of instance.__class__ just as you can with the actual class. This is especially advantageous when examining Java instances. The dir() of a Java instance isn’t very informative about its instance members because of the nature of the proxy used to access them. That means being able to examine a Java instance’s __class__ dictionary aids in exploring a Java instance in the interactive interpreter. If you forget the methods available in a java.io.File instance, you can examine the instance’s class for attributes.

>>> import java
>>> f = java.io.File("c:\jython")
>>> dir(f)
[]
>>> dir(f.__class__)
['__init__', 'absolute', 'absoluteFile', 'absolutePath', 'canRead', 'canWrite', 'canonicalFile', 'canonicalPath', 'compareTo', 'createNewFile', 'createTempFile', 'delete', 'deleteOnExit', 'directory', 'exists', 'file', 'getAbsoluteFile', 'getAbsolutePath', 'getCanonicalFile', 'getCanonicalPath', 'getName', 'getParent', 'getParentFile', 'getPath', 'hidden', 'isAbsolute', 'isDirectory', 'isFile', 'isHidden', 'lastModified', 'length', 'list', 'listFiles', 'listRoots', 'mkdir', 'mkdirs', 'name', 'parent', 'parentFile', 'path', 'pathSeparator', 'pathSeparatorChar', 'renameTo', 'separator', 'separatorChar', 'setLastModified', 'setReadOnly', 'toURL']
Although three of the four general customization methods for objects were introduced in Chapter 6, “Classes, Instances, and Inheritance,” they are reiterated here to make this chapter a more complete reference of special attributes.
The __init__ method is a Jython object’s constructor; it gets called for instance creation. Jython superclasses requiring explicit initialization should be initialized in a constructor with baseclass.__init__(self, [args...]). The same syntax works for explicitly initializing a Java superclass. If a Java superclass is not explicitly initialized, its empty constructor is called at the completion of a Jython subclass’s __init__ method. See Chapter 6 for more on constructors.

The __del__ method is a Jython object’s destructor or finalizer. It accepts no arguments, so its parameter list should only contain self. There is no guarantee as to when garbage collection will collect an object and thus call the __del__ method. Java does not even guarantee it will be called at all. Because of this, it is best to plan objects so that contents of the __del__ method are minimal or so that a finalizer is unnecessary. Also note that Jython classes that do define __del__ incur a performance penalty. Finalizing methods of Java superclasses are automatically called along with the __del__ method of a Jython instance, but Jython superclass destructors must be explicitly called when their execution is required. Syntax for calling a parent class’s destructor is demonstrated here:

>>> class superclass:
...     def __del__(self):
...         print "superclass destroyed"
...
>>> class subclass(superclass):
...     def __del__(self):
...         superclass.__del__(self)
...         print "subclass destroyed"
...
>>> s = subclass()
>>> del s
>>>
>>> # wait for a while and hit enter a few times until GC comes around
superclass destroyed
subclass destroyed
Exceptions that either Java or Jython finalizing methods raise are all ignored. The only effect a raised exception has is that the finalizing method returns at the point of the exception rather than running to normal completion.
The __repr__ method provides an object with string conversion behavior. The use of reverse-quotes or the built-in repr() function calls an object’s __repr__ method. Also, if no __str__ attribute exists in the object, the object’s __repr__ method is called when it is printed. The __repr__ method should return a valid Python expression as a string that represents the formal data structure of the object. If an appropriate expression is not possible, convention suggests a technical description within angle brackets (<>). Assume that you have an object that is supposed to act like a list. The __repr__ method of this object should return a string that looks like a list (for example, '[1, 2, 3, 4]').

The __str__ method provides an informal representation of an object, called when the object is printed or when the built-in str() function is used on the object. This differs from __repr__: the __repr__ method returns an expression or data-full representation of an object, while the __str__ method usually returns a brief description or characterization of the object.
Listing 7.2 demonstrates the implementation of both the special methods __str__ and __repr__. The class in Listing 7.2 implements both these methods so there may be a canonical data object representation (the __repr__ results) and an HTML characterization (the __str__ results).

Example 7.2. Implementing __str__ and __repr__

# file: html.py
class HtmlMetaTag:
    """Constructor requires "name" field of metatag.
    Use the instance's "append" method to add to the list"""

    def __init__(self, name):
        self.name = name
        self.list = []

    def append(self, item):
        self.list.append(item)

    def __repr__(self):
        return `{'name':self.name, 'list':self.list}`

    def __str__(self):
        S = '<meta name="%s" content="%s">'
        return S % (self.name, ", ".join(self.list))

if __name__=='__main__':
    mt = HtmlMetaTag("keywords")
    map(mt.append, ['Jython', 'Python', 'programming'])
    print "The __str__ results are: ", mt
    print
    print "The __repr__ results are: ", repr(mt)

The results from running jython html.py:

The __str__ results are: <meta name="keywords" content="Jython, Python, programming">

The __repr__ results are: {'name': 'keywords', 'list': ['Jython', 'Python', 'programming']}
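The advice that __repr__ should return a valid Python expression can be checked by passing the result to eval(). The sketch below assumes a hypothetical Point class, which is not part of Listing 7.2, and uses only syntax that should work on other Python interpreters as well as Jython.

```python
# Hypothetical class for illustration only.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # A valid expression that rebuilds an equivalent object.
        return "Point(%r, %r)" % (self.x, self.y)

    def __str__(self):
        # A brief, informal description for display.
        return "(%s, %s)" % (self.x, self.y)

p = Point(1, 2)
rebuilt = eval(repr(p))  # works because repr(p) is a valid expression

print(repr(p))  # prints Point(1, 2)
print(str(p))   # prints (1, 2)
```

The round trip through eval() is a convenient informal test of whether a __repr__ result really is canonical.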
Jython allows programmers to customize the access, setting and deletion of instance attributes with the corresponding special methods __getattr__, __setattr__, and __delattr__. There is an implied interrelationship between these methods, but there is no requirement to define certain ones. If you want dynamic access, define __getattr__. If you want dynamic attribute assignments, define __setattr__, and if you want dynamic attribute deletion, define __delattr__. What makes these different than using normal attribute access is that they are dynamic: They evaluate at the time the attribute is requested during runtime.

In a class that lacks dynamic attribute lookups, accessing a non-existing attribute raises an AttributeError:

>>> class test:
...     pass
...
>>> t = test()
>>> t.a
Traceback (innermost last):
  File "<console>", line 1, in ?
AttributeError: instance of 'test' has no attribute 'a'

A minimal example of dynamic attribute access is the ability to avoid such AttributeErrors, as is done in Listing 7.3. Adding dynamic attribute access to an instance requires defining the __getattr__ method. This method must have two parameter slots, the first for self and the second for the attribute name. Once __getattr__ is defined, instance attribute lookups that fail by traditional means go on to call the __getattr__ method to fulfill the request. Listing 7.3 is a module containing a class that merely avoids the AttributeError by supplying a default __getattr__ value of None.

Example 7.3. Adding Dynamic Attribute Access to a Class

# file: getattr.py
class test:
    a = 10

    def __getattr__(self, name):
        return None

if __name__ == "__main__":
    t = test()
    print "The value of t.a is:", t.a
    print "The value of t.b is:", t.b

Results from running jython getattr.py:

The value of t.a is: 10
The value of t.b is: None
The __getattr__ in Listing 7.3 provides a default value of None for missing attributes. It is a succinct example of using __getattr__, but I should mention that it is somewhat suspect in design. An implementation of __getattr__ normally returns a useful value or raises an AttributeError if it’s unable to compute a useful value. Default values are certainly appropriate at times, but they can also hide design flaws. Listing 7.4 more closely resembles common implementations of __getattr__. The valuable principle exploited in Listing 7.4 is the use of an object separate from the class and instance for locating object attributes. Why this is valuable is related to an asymmetry between __getattr__ and __setattr__, which is explained later. In Listing 7.4 the data object used to hold instance attributes is a module-level dictionary, but it could just as easily be a list, an instance of another class, or a network resource. It all depends on what you do in the __getattr__ method (and __setattr__).

Example 7.4. Attributes Supplied from a Separate Object

# file extern.py
data = {"a":1, "b":2}

class test:
    def __getattr__(self, attr):
        if data.has_key(attr):  # lookup attribute in module-global "data"
            return data[attr]
        else:
            raise AttributeError

if __name__=="__main__":
    t = test()
    print "attribute a =", t.a
    print "attribute b =", t.b
    print "attribute c =", t.c  # doesn't exist in "data"- is error

Results from running jython extern.py:

attribute a = 1
attribute b = 2
attribute c =Traceback (innermost last):
  File "extern.py", line 15, in ?
AttributeError: instance of "test" has no attribute "c"
As noted earlier, calling the __getattr__ method occurs only after traditional attribute lookup fails. What is the traditional lookup? Listing 7.3 shows that it must include class variables; otherwise, the output would not include 10. The typical scenario is that attribute lookup begins with the instance dictionary, then the instance dictionary of initialized base classes, then the class dictionary, and finally base class dictionaries. Only after those fail does Jython call the __getattr__ method. In Listing 7.3, looking up the attribute a does not go through __getattr__ because a is found in the class __dict__.
One catch in instance initialization is that if you define an __init__ method in a subclass, you need to explicitly call the __init__ in Jython superclasses to ensure proper instance lookup. If you do not define an __init__ method, the superclass constructor is automatically called, as demonstrated here:

>>> class A:
...     def __init__(self):
...         self.val = "'val' found in instance of superclass"
...
>>> class B(A):
...     pass
...
>>> c = B()
>>> c.val
"'val' found in instance of superclass"

If you do define an __init__ in a subclass, but fail to call the constructor of the Jython superclass, attributes cannot be resolved in the instance of the superclass:

>>> class B(A): # assume class A is the same as the previous example
...     def __init__(self):
...         pass
...
>>> c = B()
>>> c.val
Traceback (innermost last):
  File "<console>", line 1, in ?
AttributeError: instance of 'B' has no attribute 'val'

This initialization catch does not apply to Java superclasses. If a Java superclass is not explicitly initialized, its empty constructor is called upon completion of the subclass’s __init__ method, and instance attribute lookup proceeds as normal.
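For Jython superclasses, the remedy is the explicit baseclass.__init__(self) call described earlier. The following sketch contrasts a subclass that makes the call with one that does not; Base, Careful, and Forgetful are hypothetical names used only for this illustration.

```python
# Hypothetical classes for illustration only.
class Base:
    def __init__(self):
        self.val = "set by Base.__init__"

class Careful(Base):
    def __init__(self, extra):
        Base.__init__(self)  # explicitly initialize the superclass
        self.extra = extra

class Forgetful(Base):
    def __init__(self, extra):
        self.extra = extra   # Base.__init__ is never called

# Only the subclass that called Base.__init__ has the 'val' attribute.
print(hasattr(Careful(1), 'val'))
print(hasattr(Forgetful(1), 'val'))
```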
Adding dynamic attribute assignment to an instance requires a __setattr__ method. This method must have three parameter slots: the first for self, the second for the attribute name, and the third for the value assigned to the attribute. Once defined, the __setattr__ method intercepts all assignments to instance attributes except the implicitly defined __class__ and __dict__. Because of this, you cannot directly set an instance attribute in the __setattr__ method without creating a circular lookup and overflow exception. You can, however, rebind the instance __dict__ without such errors because of its exemption from the __setattr__ hook, and you can access __dict__ directly because of its exemption from __getattr__.

Listing 7.5 demonstrates using the __setattr__ method to restrict field types to integers. Additionally, the use of both __getattr__ and __setattr__ methods allows storage of instance attributes in a data object other than the instance’s __dict__. Listing 7.5 instead uses a Hashtable called _data to store any instance fields assigned.

The assignment of _data in the class constructor does not use self._data=..., but instead uses the instance __dict__ directly. Why? Instance assignments, including those in the constructor, now all go through __setattr__; however, __setattr__ is expecting a _data key in the instance __dict__. This paradox is avoided by adding _data directly to the instance dictionary with the self.__dict__[key]=value syntax, which avoids the __setattr__ hook.

Another valuable quality of Listing 7.5 is that the __setattr__ method ensures instance variables are not stored in the instance’s __dict__. Why is this good? To understand the value, we must first look at the asymmetry between __setattr__ and __getattr__. The __setattr__ method always intercepts instance attribute assignments, but __getattr__ is called only when normal attribute lookup fails. This is bad if you want each get and set to perform some symmetrical action, such as always accessing and storing values from an external database. Keeping instance values outside the instance __dict__ ensures they are not found, so __getattr__ is called. Instance variables in superclasses and class variables short-circuit this control, so careful planning in subclasses is due. Also note that there is a bit of a performance hit for each __getattr__ call, considering the normal lookup must complete unsuccessfully before calling the __getattr__ method.
Example 7.5. Using __setattr__ and __getattr__

# file: setter.py
from types import IntType, LongType
import java

class IntsOnly:
    def __init__(self):
        self.__dict__['_data'] = java.util.Hashtable()

    def __getattr__(self, name):
        if self._data.containsKey(name):
            return self._data.get(name)
        else:
            raise AttributeError, name

    def __setattr__(self, name, value):
        test = lambda x: type(x)==IntType or type(x)==LongType
        assert test(value), "All fields in this class must be integers"
        self._data.put(name, value)
if __name__ == '__main__':
    c = IntsOnly()
    c.a = 1
    print "c.a=", c.a
    c.a = 200L
    print "c.a=", c.a
    c.a = "string"
    print "c.a=", c.a  # Shouldn't get here

Results from running jython setter.py:

c.a= 1
c.a= 200
Traceback (innermost last):
  File "setter.py", line 25, in ?
  File "setter.py", line 17, in __setattr__
AssertionError: All fields in this class must be integers
Adding dynamic attribute deletion requires defining the __delattr__ method. This method must have two parameter slots, the first for self, and the second for the attribute name. The __delattr__ method is called when using del object.attribute. The attribute could be a resource requiring flushing/closing before deletion. The attribute could also be part of a persistent resource requiring deletion from a database or file system, or could be an attribute you don’t want users of your class to delete. It could also be that attributes are stored in a data object other than the instance __dict__ and require special handling for deletion, as would be required for Listing 7.5. The __delattr__ hook allows the programmer to properly handle these types of situations. Listing 7.6 demonstrates using the __delattr__ hook to prevent deletion of an attribute:

Example 7.6. Using __delattr__ to Protect an Attribute from Deletion

# file: immortal.py
class A:
    def __init__(self, var):
        self.immortalVar = var

    def __delattr__(self, name):
        assert name!="immortalVar", "Cannot delete- it's immortal"
        del self.__dict__[name]

c = A("some value")
print "The immortalVar=", c.immortalVar
del c.immortalVar

Results from running jython immortal.py:

The immortalVar= some value
Traceback (innermost last):
  File "immortal.py", line 12, in ?
  File "immortal.py", line 7, in __delattr__
AssertionError: Cannot delete- it's immortal
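The other case mentioned above, deleting attributes that live outside the instance __dict__, might be handled as in the following sketch. It mirrors the approach of Listing 7.5 but is not that listing: Record is a hypothetical class, and a plain dictionary stands in for the Hashtable so that the example has no Java dependency.

```python
# Hypothetical class; a plain dict stands in for the external data object.
class Record:
    def __init__(self):
        # Bypass __setattr__ so the store itself lands in the instance __dict__.
        self.__dict__['_data'] = {}

    def __getattr__(self, name):
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._data[name] = value

    def __delattr__(self, name):
        # Remove the value from the external store, not the instance __dict__.
        try:
            del self._data[name]
        except KeyError:
            raise AttributeError(name)

r = Record()
r.a = 1
del r.a  # routed through __delattr__

print(hasattr(r, 'a'))
```

Raising AttributeError for a missing name keeps the behavior consistent with deleting an ordinary attribute that does not exist.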
The special method __call__ makes an instance callable. The number of parameters of the call method is not restricted in any way. On a basic level, this makes an instance act like a function. Creating a function-like instance that prints a simple message looks like this:

>>> class hello:
...     def __call__(self):
...         print "Hello"
...
>>> h = hello()
>>> h() # call the instance as if it were a function
Hello

The hello example is a bit misleading in that it disguises the real potential of the __call__ method. Listing 7.7 is a slightly more interesting example that fakes static methods with an inner class that implements the __call__ method. The inner class in Listing 7.7 gets a java.lang.Runtime instance and defines a __call__ method for running a system command and returning its output. The inner class in Listing 7.7 is named _static_runcommand. What a user would call is the instance of the inner class. Remember, the instance is what is callable because of __call__, and is what becomes the static method. A user of the class would instantiate the outer class commands (let’s call this instance A) and would then call the runcommand instance with A.runcommand(command). The runcommand instance looks and acts like a method. Despite numerous instances of the outer commands class, only a single instance of _static_runcommand, and thus a single instance of java.lang.Runtime, is required (this assumes no synchronization requirements).

Example 7.7. Faking Static Methods with __call__ and Inner Classes

# file: staticmeth.py
from java import lang, io

class commands:
    class _static_runcommand:
        "inner class whose instance is used to fake a static method"
        rt = lang.Runtime.getRuntime()

        def __call__(self, cmd):
            stream = self.rt.exec(cmd).getInputStream()
            isr = io.InputStreamReader(stream)
            results = []
            ch = isr.read()
            while (ch > -1):
                results.append(chr(ch))
                ch = isr.read()
            return "".join(results)

    runcommand = _static_runcommand()  # create instance in class scope

if __name__ == '__main__':
    inst1 = commands()
    inst2 = commands()
    # now make sure runcommand is static (is shared by both instances)
    assert inst1.runcommand is inst2.runcommand, "Not class static"
    # now call the "faked" static method from either instance
    print inst1.runcommand("mem")  # for windows users
    #print inst1.runcommand("cat /proc/meminfo")  # for linux users

The results from running jython staticmeth.py on Windows 2000 are:

655360 bytes total conventional memory
655360 bytes available to MS-DOS
633024 largest executable program size
1048576 bytes total contiguous extended memory
0 bytes available contiguous extended memory
941056 bytes available XMS memory
MS-DOS resident in High Memory Area
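The same pattern can be tried without a Java runtime. The sketch below keeps the structure of Listing 7.7 but replaces the system command with a simple shared counter; Tally, _SharedCounter, and bump are hypothetical names chosen for this illustration.

```python
# Hypothetical outer class; the inner class's single instance is shared.
class Tally:
    class _SharedCounter:
        "inner class whose single instance is shared by every Tally"
        def __init__(self):
            self.count = 0

        def __call__(self, step=1):
            # Calling the instance updates state kept on the instance itself.
            self.count = self.count + step
            return self.count

    bump = _SharedCounter()  # created once, in class scope

first = Tally()
second = Tally()
first.bump()    # called through one instance
second.bump(2)  # called through another; same underlying object

print(first.bump is second.bump)
print(Tally.bump.count)  # prints 3
```

Because bump is an instance rather than a function, it is not bound to first or second when accessed, which is what lets it behave like a static method.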
Comparison is the use of the operators ==, !=, <, <=, >, and >=. This is simple for many objects like integers (5 > 4), but what about user-defined classes? Jython allows the definition of methods that implement such comparisons, but Jython version 2.1 (and Python 2.1) introduced some changes in implementing class comparison methods. This new feature is rich comparisons. For the sake of contrast, the old comparison method is dubbed poor comparisons.

Three-way, or poor, comparisons entail a single special comparison method called __cmp__, which returns -1, 0, or 1 depending on whether self evaluates as less than, equal to, or more than another object. The __cmp__ method must have two parameter slots, the first for self and the second for the other object. The verbose version of what compare should do is this:

def __cmp__(self, other):
    if (self < other):
        return -1
    if (self == other):
        return 0
    if (self > other):
        return 1

Listing 7.8 uses the __cmp__ method and a class attribute, role, to determine sort order for a list of members of a family circle. The built-in cmp() function is used on the appropriate instance value as determined by the role the class is set to. If the other object is not an instance of the family class, the family instance always compares as less, so other objects sort last (note that ‘eldest’ inverts things so less is really more). Note that setting a class-level flag, role in this case, is not the preferred way to control sort behavior in more complex classes.

Example 7.8. Comparison by Role

#file: poorcompare.py
class family:
    role = "familyMember"  # default value

    def __init__(self, name, age, relation, communicationSkills):
        self.name = name
        self.age = age
        self.relation = relation
        self.communicationSkills = communicationSkills
        self._roles = {1:"familyMember", 2:"communicator", 3:"eldest"}

    def __cmp__(self, other):
        if other.__class__ != self.__class__:
            return -1  # family instances always sort before other objects
        if self.role=="familyMember":
            relations = {"mother":1, "father":1, "aunt":2, "uncle":2,
                         "cousin":3, "unrelated":4}
            return cmp(relations[self.relation], relations[other.relation])
        elif self.role=="communicator":
            return cmp(self.communicationSkills, other.communicationSkills)
        elif self.role=="eldest":
            return cmp(other.age, self.age)  # reversed so the eldest sorts first

    def __repr__(self):
        # This is an abuse of __repr__- "canonical" data not returned..
        # Included only for sake of example.
        return self.name

if __name__ == '__main__':
    L = []
    # add ppl to list
    L.append(family("Fester", 80, "uncle", 2))
    L.append(family("Gomez", 50, "father", 1))
    L.append(family("Lurch", 75, "unrelated", 3))
    L.append(family("Cousin It", 113, "cousin", 4))
    L.append("other data-type")
    # print list sorted by default role
    L.sort()
    print "by relation:", L
    # print list sorted by communication skills:
    family.role = "communicator"
    L.sort()
    print "by communication skills:", L
    # print list eldest to youngest
    family.role = "eldest"
    L.sort()
    print "eldest to youngest:", L

Output from running jython poorcompare.py is:

by relation: [Gomez, Fester, Cousin It, Lurch, 'other data-type']
by communication skills: [Gomez, Fester, Lurch, Cousin It, 'other data-type']
eldest to youngest: [Cousin It, Fester, Lurch, Gomez, 'other data-type']
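A smaller three-way comparison that orders objects on two fields might look like the following sketch. Version and three_way are hypothetical names; three_way is a stand-in for the built-in cmp(), and __cmp__ is called directly here so that the example also runs on interpreters that do not dispatch comparison operators to __cmp__.

```python
def three_way(a, b):
    """Return -1, 0, or 1 depending on how a orders against b."""
    return (a > b) - (a < b)

# Hypothetical class ordered by major number, then minor number.
class Version:
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __cmp__(self, other):
        # Compare major numbers first; fall back to minor on a tie.
        result = three_way(self.major, other.major)
        if result == 0:
            result = three_way(self.minor, other.minor)
        return result

old = Version(2, 0)
new = Version(2, 1)

print(old.__cmp__(new))  # prints -1
```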
Rich comparisons appear in Jython’s 2.1 versions and are not restricted to the -1, 0, 1 return values that __cmp__ is. If comparing two lists or two matrices, you can return a list or matrix containing the element-wise comparisons, another object, None, NotImplemented, a Boolean or raise an exception. The special rich comparison methods are a set of six special methods representing the six comparison operators. Each method requires two parameters: the first for self and the second for other. Table 7.1 lists the operator and associated special, rich-comparison method.
Table 7.1. Rich Comparison Methods
Operator | Method |
---|---|
< | __lt__ |
<= | __le__ |
== | __eq__ |
!= | __ne__ |
> | __gt__ |
>= | __ge__ |
For objects A and B, the rich comparison of A < B becomes A.__lt__(B). If the comparison is B > A, the method B.__gt__(A) is evaluated. Each operator has a natural complement, but there is no enforcement of an invariant such as A.__lt__(B) == B.__gt__(A). The left-hand object is first searched for an appropriate rich comparison method. If one is not defined, the right-hand object is searched for the complement method. If both sides define an appropriate method, the left-hand object’s method is used.
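The fallback to the right-hand object's complement method can be observed with two small classes, one of which defines no comparison methods at all. This is a sketch only: Plain, Sized, and the calls list are hypothetical names introduced for the illustration.

```python
# Hypothetical classes for illustration only.
class Plain:
    """Defines no comparison methods at all."""
    def __init__(self, size):
        self.size = size

class Sized:
    """Defines only greater-than."""
    def __init__(self, size):
        self.size = size
        self.calls = []  # records which comparison methods ran

    def __gt__(self, other):
        self.calls.append("__gt__")
        return self.size > other.size

left = Plain(1)
right = Sized(2)

# Plain has no __lt__, so the complement right.__gt__(left) is used.
answer = left < right

print(right.calls)
```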
Listing 7.9 defines two classes: A
and B
. Class A
defines all six rich comparison methods such that they compare self ’s class name with the name of the other class. Class B
defines only one comparison method: greater-than (__gt__)
. Note that the means of comparison in class B
’s __gt__
method is by an instance creation timestamp: something incongruous with class A
’s definition of comparability. Such divergent definitions are troublesome and are cautioned against without strong cause. The remainder of Listing 7.9 goes through testing the comparison combinations using one instance of each class. You can see from the output how appropriate comparison methods are resolved.
Example 7.9. Rich Comparison Methods
# file: rich.py
import time

class A:
    def __init__(self):
        self.timestamp = time.time()

    def __lt__(self, other):
        print "...Using A's __lt__ method...",
        return self.__class__.__name__ < other.__class__.__name__

    def __le__(self, other):
        print "...Using A's __le__ method...",
        return self.__class__.__name__ <= other.__class__.__name__

    def __ne__(self, other):
        print "...using A's __ne__ method...",
        return self.__class__.__name__ != other.__class__.__name__

    def __gt__(self, other):
        print "...Using A's __gt__ method...",
        return self.__class__.__name__ > other.__class__.__name__

    def __ge__(self, other):
        print "...Using A's __ge__ method...",
        return self.__class__.__name__ >= other.__class__.__name__

    def __eq__(self, other):
        print "...Using A's __eq__ method...",
        return self.__class__.__name__ == other.__class__.__name__

class B:
    def __init__(self):
        self.timestamp = time.time()

    def __gt__(self, other):
        print "...Using B's __gt__ method...",
        return self.timestamp > other.timestamp

if __name__ == '__main__':
    inst_b = B()
    inst_a = A()
    print "Is a < b?", inst_a < inst_b
    print "Is b < a?", inst_b < inst_a
    print "Is a <= b?", inst_a <= inst_b
    print "Is b <= a?", inst_b <= inst_a
    print "Is a == b?", inst_a == inst_b
    print "Is b == a?", inst_b == inst_a
    print "Is a != b?", inst_a != inst_b
    print "Is b != a?", inst_b != inst_a
    print "Is a > b?", inst_a > inst_b
    print "Is b > a?", inst_b > inst_a
    print "Is a >= b?", inst_a >= inst_b
    print "Is b >= a?", inst_b >= inst_a
The output from running jython
rich.py
is:
Is a < b? ...Using A's __lt__ method... 1
Is b < a? ...Using A's __gt__ method... 0
Is a <= b? ...Using A's __le__ method... 1
Is b <= a? ...Using A's __ge__ method... 0
Is a == b? ...Using A's __eq__ method... 0
Is b == a? ...Using A's __eq__ method... 0
Is a != b? ...using A's __ne__ method... 1
Is b != a? ...using A's __ne__ method... 1
Is a > b? ...Using A's __gt__ method... 0
Is b > a? ...Using B's __gt__ method... 0
Is a >= b? ...Using A's __ge__ method... 0
Is b >= a? ...Using A's __le__ method... 1
Listing 7.9 clarifies the rich comparison methods but is pointless with regard to practical applications. A more plausible usage is the element-wise comparison of list objects. Listing 7.10 defines a class called listemulator
, which includes an __lt__
method definition. The listemulator
class behaves like a list thanks to the help of the UserList
class imported at the beginning of the listing. The details of emulating other types occur later in this chapter, but for now, let’s assume an instance of the listemulator
class acts exactly like a normal Jython list except for the __lt__
comparison. The __lt__
method in Listing 7.10 does two things. First, it compares the length of itself and other
to ensure the element-wise comparison is legitimate (they are the same length). Then, the __lt__
method compares each element of self
and other
and returns the list of comparison results.
Example 7.10. Element-Wise Rich Comparison
# file: richlist.py
from UserList import UserList

class listemulator(UserList):
    def __init__(self, list):
        self.data = list
        UserList.__init__(self, self.data)

    def __lt__(self, other):
        if len(self) != len(other):
            raise ValueError, ("Instance of %s differs in size from %s" %
                (self.__class__.__name__, other.__class__.__name__))
        return map(lambda x, y: x < y, self, other)

L = [2,3,4,5]
LC = listemulator([2,3,3,4])
print LC < L
The results from running jython richlist.py
are:
[0, 0, 1, 1]
Dictionary operations rely on the hash value of those objects used as keys. Jython objects can determine their own hash value for dictionary key operations and the built-in hash
function by defining the special __hash__
instance method. The __hash__
method takes a single parameter, self, and should return an integer. A restriction in implementing __hash__
is that objects of the same value should return the same hash
value. Because dictionary keys must be immutable, objects that define a comparison method but no __hash__
method cannot be used as dictionary keys.
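A minimal sketch of that restriction, assuming a hypothetical Point class whose value is its pair of coordinates: __eq__ and __hash__ are both derived from the same tuple, so two instances that compare equal always hash alike and can stand in for each other as dictionary keys.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Equal points share the same coordinate tuple, hence the same hash.
        return hash((self.x, self.y))


labels = {Point(1, 2): "start"}
# A distinct but equal instance finds the same dictionary entry.
found = labels[Point(1, 2)]
```

The sketch treats x and y as fixed after construction. Changing them while the object is in use as a key would change its hash and strand the entry, which is why keys are expected to be immutable.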
The search for the truth in Jython objects follows these rules:
If a class defines the special __nonzero__
method, its return value (1
or 0
) determines the “truth” of the object.
If an object does not define the __nonzero__
method, the interpreter calls the special method __len__
. The __len__
method appears later in this chapter, but its meaning is sufficiently intuitive: It returns an integer representing the object’s length. If this return value is non-zero, the object is true.
If a class defines neither of the above special methods, instances of that class are always true.
Implementing trueness with the __nonzero__
method looks like this:
import random

class gamble:
    def __nonzero__(self):
        return random.choice([0,1])
The __nonzero__
parameter list includes only self
, and the return value is 1
for true, and 0
for false.
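The second and third rules from the list above can be sketched without __nonzero__ at all. In this hedged example, Basket and Marker are hypothetical classes: Basket defines only __len__, so its truth follows its length, while Marker defines neither special method, so its instances are always true.

```python
class Basket:
    def __init__(self, items):
        self.items = items

    def __len__(self):
        return len(self.items)


class Marker:
    pass


def truth(obj):
    # Report how the interpreter evaluates obj in a truth test.
    if obj:
        return "true"
    return "false"


empty = truth(Basket([]))        # __len__ returns 0
full = truth(Basket(["egg"]))    # __len__ returns 1
plain = truth(Marker())          # neither special method is defined
```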
Numerous occasions call for creating classes that emulate built-in data objects. Maybe a project requires a Jython dictionary, but needs that dictionary to maintain order, or maybe you need a list with a special lookup. Emulating built-in objects allows you to extend their behavior, add constraints and instrument object operations with minimal work, yet end up with a familiar interface. Implementing extended behavior with the familiarity of a built-in interface adds near zero complexity, which is the primary goal of objects. Jython’s special methods allow user-defined classes to emulate Jython’s built-in numeric, sequence and mapping objects. The emulation of objects that have associated methods requires implementations of those non-special methods as well to truly emulate that object.
The examples in this section often use a Jython class that internally uses a Java object to illustrate this functionality. This is not always required, because Jython is very good about converting types to meet the situation. The java.util.Vector
object already supports the PyList
index syntax (v[index]
) and java.util.Hashtable
and java.util.HashMap
already support key assignment (h[key]
), and numeric objects automatically convert to the appropriate types where needed. With this substantial intuitive support for Java objects there is often no need to wrap them in special methods; however, there may be little things like a java.util.Vector
not supporting slice syntax in Jython:
>>> import java
>>> v = java.util.Vector()
>>> v.addElement(10)
>>> v.addElement(20)
>>> v[0] # this works
10
>>> v[0:2]
Traceback (innermost last):
  File "<console>", line 1, in ?
TypeError: only integer keys accepted
Emulating built-in types allows you to specify every behavior to ensure an object acts indistinguishably similar to a built-in type. Because Jython and Java are so very integrated, passing objects between these languages is pervasive. It is often convenient to allow Java objects to better emulate Jython built-ins so users of your code need not care which is Java and which isn’t. Java objects often already contain methods that do the same thing as a comparable method in a Jython built-in, but are named differently.
Listing 7.11 shows a convenient way to map such Java methods to Jython methods to further ease emulating built-in objects. The HashWrap
class in Listing 7.11 is a subclass of java.util.Hashtable
that assigns class identifiers to Hashtable
methods that already perform the expected behavior. Notice that the HashWrap
class doesn’t define values()
, clear()
, or get()
. These names already exist in the superclass, java.util.Hashtable
, and perform close enough to what is expected. The only catch in Listing 7.11 is that some of the Hashtable
functions return unexpected types, such as the Enumeration
returned from keys()
and elements()
. These methods are wrapped in a simple lambda expression to convert them into a list. Some of Jython’s dictionary methods don’t have direct parallels in Java Hashtable
’s, so setdefault()
, popitem()
, and copy()
are defined in the HashWrap
class.
Listing 7.11 also contains special methods—those that begin and end with two underscores. The meaning of these special methods might be discernable from the Java methods they are mapped to, but the idea at this point is only to show how to map identifiers to Java methods.
Example 7.11. Assigning Java Methods to Jython Class Identifiers
# file: hashwrap.py
import java
import copy

class HashWrap(java.util.Hashtable):
    # map jython names to Hashtable names
    has_key = java.util.Hashtable.containsKey
    update = java.util.Hashtable.putAll

    # Hashtable returns an Enumeration for keys() and elements();
    # wrap each in a lambda that converts the Enumeration to a list
    keys = lambda self: map(None, java.util.Hashtable.keys(self))
    items = lambda self: map(None, java.util.Hashtable.elements(self))

    # these don't have direct parallels in Hashtable, so define here
    def setdefault(self, key, value):
        if self.containsKey(key):
            return self.get(key)
        else:
            self.put(key, value)
            return value

    def popitem(self):
        return self.remove(self.keys()[0])

    def copy(self):
        return copy.copy(self)

    # These are the special methods introduced in this section.
    # Read on to find out more.
    __getitem__ = java.util.Hashtable.get
    __setitem__ = java.util.Hashtable.put
    __delitem__ = java.util.Hashtable.remove
    __repr__ = java.util.Hashtable.toString
    __len__ = java.util.Hashtable.size

if __name__ == '__main__':
    hw = HashWrap()
    hw["A"] = "Alpha"
    hw["B"] = "Beta"
    print hw
    print hw.setdefault("G", "Gamma")
    print hw.setdefault("D", "Delta")
    print hw["A"]
    print "keys=", hw.keys()
    print "values=", hw.values()
    print "items=", hw.items()
Output from running jython hashwrap.py is:
{A=Alpha, B=Beta}
Gamma
Delta
Alpha
keys= ['A', 'G', 'D', 'B']
values= [Alpha, Gamma, Delta, Beta]
items= ['Alpha', 'Gamma', 'Delta', 'Beta']
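The aliasing technique itself does not depend on Java. As a rough sketch of the same idea, the hypothetical Stack class below binds new class identifiers to methods its superclass already provides and uses a lambda where a small adaptation is needed. It subclasses the built-in list, which assumes an interpreter that permits subclassing built-in types and so likely does not apply to Jython 2.1 itself.

```python
class Stack(list):
    # Bind new class identifiers to methods the superclass already provides.
    push = list.append
    size = list.__len__
    # A thin lambda adapts behavior that has no direct counterpart.
    peek = lambda self: self[-1]


s = Stack()
s.push("a")
s.push("b")
top = s.peek()       # "b", and the stack is unchanged
count = s.size()     # 2
```

Because the aliases are ordinary class attributes, they are inherited and bound to instances exactly like methods written with def.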
Built-in sequences come in two flavors: mutable and immutable. Immutable sequences (PyTuples
) have no associated methods, while mutable sequences (PyLists
) do; both flavors have similar sequence behaviors such as indexes and slices. Truly emulating a PyList
would involve defining its associated methods (append
, count
, extend
, index
, insert
, pop
, remove
, reverse
, and sort
) as well as the special methods associated with sequence length, indexes and slices. Emulating immutable sequences (PyTuples
) requires only a subset of the special methods as some of those methods implement operations unique to mutable objects (they change object contents). A user-defined object need not implement all sequence behaviors, so you are free to define only those methods that suit your design. However, defining all sequence methods allows users to be blissfully unaware of inconsequential differences between your class and a built-in data object, which is really better for abstraction, reusability and the holy grail of objects: reduced complexity.
The following subsections delineate sequence behaviors, and their associated special methods. The descriptions used in subsections assume a sequence S
, a PyList L
and a PyTuple T
. The use of sequence
and S
indicates special functions applicable to sequences in general. The use of PyTuple
and T
indicates implementation of an immutable sequence and the use of PyList
and L
indicates comments specific to mutable sequences.
A sequence should have a length equal to len(S)
.
The special method that returns an object’s length is __len__(self)
. Calling len(S)
is equivalent to S.__len__()
. The __len__(self)
method must return an integer >= 0
. Note that there are no means of enforcing the accuracy of __len__
. A list-like object with ten elements can define a __len__
method that returns 1
.
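A short sketch of that lack of enforcement, using a hypothetical Misleading class that holds ten elements but reports a length of 1:

```python
class Misleading:
    def __init__(self):
        self.items = list(range(10))   # ten elements

    def __len__(self):
        return 1                       # nothing checks this against the data


m = Misleading()
reported = len(m)        # what callers of len() see
actual = len(m.items)    # what the object really holds
```

Keeping __len__ honest is therefore the responsibility of the class author, usually by delegating to the container that actually holds the data.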
Listing 7.12 keeps sequence elements in a java.util.Vector
instance. The length method returns the results from the vector’s size()
method as the object’s length:
Example 7.12. Implementing Sequence Length
# file: seqlen.py
import java

class usrList:
    def __init__(self):
        # use name-mangled, private identifier for vector
        self.__data = java.util.Vector()

    def append(self, o):
        self.__data.add(o)

    def __len__(self):
        return self.__data.size()

if __name__ == '__main__':
    L = usrList()
    L.append("A")
    L.append("B")
    L.append("C")
    print "The length of object L is:", len(L)
The output of jython seqlen.py
is:
The length of object L is: 3
S[i]
gets the value at sequence index i
. The index i
can be a positive integer (counted from the left side of a sequence), a negative integer (counted from the right side), or a slice object.
The special method used to return an object designated by a specific index or slice is __getitem__(self, index)
. Calling S[i]
is equivalent to S.__getitem__(i)
. The __getitem__
method should raise an IndexError
exception when the specified index is out of range, and a ValueError
for non-supported index types. To truly emulate built-in sequences, you must allow for an index
value of a positive integer, negative integer, or slice object.
The __getitem__
method is sometimes confused with the special class attribute method __getattr__
so it’s worth noting their differences. The __getitem__
method retrieves what the object defines as list or mapping entries (mapping implementations appear later) instead of attributes of the object itself. Additionally, __getitem__
is called for each item retrieval, unlike __getattr__
, which is called only after a normal object attribute lookup fails. Because __getitem__
is always called, it’s a good candidate for implementing persistence and other special behavior that requires symmetry between setting and getting objects.
Listing 7.13 is similar to Listing 7.12 in that it is a list-like class wrapped around a java.util.Vector
. Now we use this concept to illustrate the __getitem__
method. The __getitem__
implementation in Listing 7.13 allows for positive, negative, and slice indexes by converting the specified index value into a list of positive integers.
The implementation should allow for positive, negative, and slice indexes and should raise the IndexError
and ValueError
exceptions where appropriate. Remember that negative indexes mean they are counted from the right, -1
being the last sequence item, -2
the penultimate, and so on. Jython slicing conventions assume default values for missing slice elements, so a user-defined, list-like object should also allow for this. For the slice [::3]
, Jython assumes start of sequence and end of sequence for the first two missing values, then uses the 3
as the step value. Implementing all this in a user-defined object may sound daunting, but it need not be difficult or convoluted. An important tool in the battle against complexity is leveraging functionality already found in familiar objects. The premise of all this is that __getitem__
should emulate the behavior of a built-in PyList
object—there couldn’t be a bigger hint. Use a list object in the __getitem__
implementation. If the internal data is in a vector, make a list the size of the vector and apply the index or slice to the list— you’ve now handled positive, negative and slice indexes, as well as default index values and the IndexError
and ValueError
exceptions. That’s the trick used in Listing 7.13. Here is an example of just the list trick to make it more clear:
>>> import java
>>> v = java.util.Vector()
>>> map(v.addElement, range(10))
[None, None, None, None, None, None, None, None, None, None]
>>> v # take a peek at the vector
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> # create a slice object
>>> i = slice(2, -3, 4)
>>>
>>> ## Next line handles all slice logic, and appropriate exceptions
>>> indexes = range(v.size())[i]
>>> indexes
[2, 6] # remember- these are indexes of values, not values
Whereas the java.util.Vector
object doesn’t support slices, a PyList
does. Make a PyList
containing the vector’s range of index numbers and apply the index or slice to that list. The result is a positive list of vector indexes or a single vector index number. Using the PyList
also ensures that any ValueErrors
or IndexErrors
are raised as needed. Searching for ways to reuse existing functionality, especially that of built-ins, is vital in battling complexity.
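The error handling that the list trick provides can be seen in isolation. The sketch below uses a hypothetical helper named resolve and plain Python only, with no vector involved. One detail worth hedging: in CPython an unsupported index type is reported by the list as a TypeError rather than a ValueError, so a wrapper that wants a ValueError has to translate it.

```python
def resolve(size, index):
    """Return one non-negative index, or a list of them, for a sequence of `size` items."""
    return list(range(size))[index]


single = resolve(10, -1)                        # a negative index
several = resolve(10, slice(2, -3, 4))          # an explicit slice object
defaults = resolve(10, slice(None, None, 3))    # same as [::3]

try:
    resolve(10, 42)
    out_of_range = "no error"
except IndexError:
    out_of_range = "IndexError"

try:
    resolve(10, "two")
    bad_type = "no error"
except TypeError:
    bad_type = "TypeError"
```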
Listing 7.13 acquires either a single, positive index or a list of positive indexes from using the list trick. Once the appropriate positive index values are determined, the __getitem__
method returns the appropriate value or values.
Example 7.13. Sequence Item Retrieval with __getitem__
# file: seqget.py
import java
import types

class usrList:
    def __init__(self, initial_values):
        data = java.util.Vector()
        map(data.add, initial_values)
        self.__data = data

    def __getitem__(self, index):
        # the list trick: let a PyList resolve negative and slice indexes
        indexes = range(self.__data.size())[index]
        try:
            if not isinstance(indexes, types.ListType):
                return self.__data.elementAt(indexes)
            else:
                return map(self.__data.elementAt, indexes)
        except java.lang.ArrayIndexOutOfBoundsException:
            raise IndexError, "index out of range: %s" % index

if __name__ == '__main__':
    S = usrList(range(1,10))
    print "S=", S[:]
    print "S[3]=", S[3]
    print "S[-2]=", S[-2]
    print "S[1:7:2]=", S[1:7:2]
    print "S[-5:8]=", S[-5:8]
    print "S[-8:]=", S[-8:]
The output from running jython seqget.py is:
S= [1, 2, 3, 4, 5, 6, 7, 8, 9]
S[3]= 4
S[-2]= 8
S[1:7:2]= [2, 4, 6]
S[-5:8]= [5, 6, 7, 8]
S[-8:]= [2, 3, 4, 5, 6, 7, 8, 9]
Note that in Listing 7.13, using the vector’s elementAt()
method is inside a try/except
. If an index is out of range, the more verbose Java exception and traceback is caught and replaced with a Jython IndexError
exception. This is only an implementation choice—there is no obligation to wrap Java exceptions, but the IndexError
helps the usrList
class act more like a built-in.
Although Listing 7.13 uses a PyList
to do the dirty work, there may be instances requiring explicit handling of types. This introduces type testing. Normally in Jython, a variable’s type can be tested against a reference type, as in the following three examples:
>>> import types
>>> a = 1024L
>>> if type(a) == types.LongType: "a is a LongType"
...
'a is a LongType'
>>> if type(a) == type(1L): "a is a LongType"
...
'a is a LongType'
>>> if type(a) in [types.IntType, types.LongType]: "a is an ok type"
...
'a is an ok type'
The test for appropriate types is interesting when you introduce Java types. Allowing for sufficient discrimination of Java types is a bit odd. For example, consider the following:
>>> import java
>>> v = java.util.Vector()
>>> i = java.lang.Integer(3)
>>> type(v) == type(i)
1
If type()
can’t tell the difference between an integer and a vector, how can you allow for limited Java types? One way to do so is to test an object’s class. All Jython objects have a
__class__
attribute, including the Java ones, so you could do the following:
>>> import java
>>> i = java.lang.Integer(5)
>>> v = java.util.Vector()
>>> ok_types = [(1).__class__, (1L).__class__, java.lang.Integer, java.lang.Long]
>>> i.__class__ in ok_types
1
>>> v.__class__ in ok_types
0
Another means of confirming the appropriateness of object types is the use of the built-in isinstance
function. The isinstance
function accepts an object and a class as arguments and returns 1
if the object is an instance of the specified class, 0
otherwise. Using isinstance
to check types is preferred because of its appropriateness when working with inheritance hierarchies. Using isinstance
to check types would look like this:
>>> import types
>>> a = 1024L
>>> if isinstance(a, types.LongType): "a is a LongType"
...
'a is a LongType'
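The advantage with inheritance hierarchies shows up as soon as a subclass is involved. In this small sketch, Shape and Circle are hypothetical classes: an exact comparison of __class__ rejects the subclass instance, while isinstance accepts it.

```python
class Shape:
    pass


class Circle(Shape):
    pass


c = Circle()
exact_match = c.__class__ == Shape        # false: the class is Circle, not Shape
by_isinstance = isinstance(c, Shape)      # true: Circle inherits from Shape
```

Code that tests with isinstance keeps working when someone later passes in a subclass, which is usually the behavior a caller expects.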
The expression L[i] = object
should bind object
to index i
in list-like object L
. This is specific to classes designed to emulate lists, not immutable objects (PyTuple
-like). Only mutable objects should implement this behavior.
The special method used to bind an object to a specific index is __setitem__(self, index, value)
. This method should raise an IndexError
exception when the specified index is out of range. The index value could be negative, positive or a slice object. Raise a ValueError
exception for those index values the __setitem__
implementation does not allow for.
Assigning to a slice has some special constraints, at least for the built-in PyList
. You don’t have to respect this behavior in user-defined objects, but it is recommended. The restrictions are that the step
value must be 1 and that the assigned value must be a sequence, although it need not be the same length as the slice. First, look at an assignment to a single index:
>>> S = ["a", "b", "c"]
>>> S[1] = [1, 2, 3]
>>> S
['a', [1, 2, 3], 'c']
The value assigned to index 1 is a list, which shows up as a single object in index 1. Assigning to a slice differs:
>>> S = ["a", "b", "c"]
>>> S[3:4] = [1, 2, 3]
>>> S
['a', 'b', 'c', 1, 2, 3]
There is still only 1 index involved which is index 3, but assigning to a slice means something different in that the right side must be a sequence, and the resulting list is the concatenation:
S[0:slice.start] + values + S[slice.stop:len(S)]
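The concatenation rule can be checked against a built-in list. This sketch uses arbitrary sample values and compares the result of a slice assignment with the expression above, where start and stop stand for slice.start and slice.stop.

```python
S = ["a", "b", "c", "d", "e"]
values = [1, 2, 3]
start, stop = 1, 3

# Build the expected result before the assignment changes S.
expected = S[0:start] + values + S[stop:len(S)]

# Two elements ("b" and "c") are replaced by three values.
S[start:stop] = values
```

After the assignment, S and expected hold the same elements, and the list has grown by one because the value is longer than the slice it replaced.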
Listing 7.14 defines a class that implements the __setitem__
method, but chooses to implement two constraints: All values must be strings and the list is a static size designated in a constructor parameter. Each list index internally represents a line in the file usrList.dat
. Setting L[2] = "Some string"
changes the third line of the file (the line at index 2) to Some string
. This makes the internal list similar to class static variables considering all instances would be reading from a single file (this can be changed with another constructor argument however). The opening and closing of the file within each method is expensive, so this would only really occur if this file were a shared resource. The persistence could otherwise be implemented in the __init__
method and possibly a close
method (note that __del__
would work, but is often avoided because exceptions in __del__
are ignored and there’s no guarantee of when that method gets called).
Listing 7.14 does support assignment to list indexes and slices, but because the lines of the file are stored in a real PyList
as an intermediary, this functionality is automatic. Listing 7.14 raises ValueError
and IndexError
exceptions appropriately, but note that the IndexErrors
would propagate from normal list operations rather than catching and re-raising the exception.
Example 7.14. Adding Persistence with __setitem__
# file: seqset.py
import types
import os

class usrList:
    def __init__(self, size):
        self.__size = size
        self.__file = "usrList.dat"
        if not os.path.isfile(self.__file):
            # create the file with one empty line per list index
            f = open(self.__file, "w")
            print >> f, "\n" * size
            f.close()

    def __repr__(self):
        f = open(self.__file)
        L = f.readlines()[:self.__size]
        f.close()
        return str(map(lambda x: x[:-1], L))  # strip trailing newlines

    def __setitem__(self, index, value):
        f = open(self.__file, "r+")
        L = f.readlines()[:self.__size]
        if isinstance(index, types.SliceType):
            # the list is a static size, so the slice cannot change length
            if len(L[index]) != len(value):
                raise ValueError, "Bad value: %s" % (value,)
            for x in value:
                if not isinstance(x, types.StringType):
                    raise ValueError, "Only String values supported"
            L[index] = map(lambda x: x + "\n", value)
        if (isinstance(index, types.IntType) or
            isinstance(index, types.LongType)):
            if type(value) != types.StringType:
                raise ValueError, "Only String values supported"
            L[index] = value + "\n"
        f.seek(0)
        f.writelines(L)
        f.close()

if __name__ == '__main__':
    S = usrList(10)
    for x in range(10):
        S[x] = str(x)
    print "First List=", S
    S[4:-4] = "four", "five"
    print "Second List =", S
    for x in range(10, 20):
        S[x-10] = str(x)
    print "Last list = ", S
Output from running jython seqset.py
First List= ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
Second List = ['0', '1', '2', '3', 'four', 'five', '6', '7', '8', '9']
Last list = ['10', '11', '12', '13', '14', '15', '16', '17', '18', '19']
The values in the usrList
instance are stored in the usrList.dat
file, so future instantiations of the usrList
class will start with those values. You can confirm this in the interactive interpreter—just make sure to start the interpreter from within the same directory as the usrList.dat
file (the usrList
instance only looks in the current directory):
>>> import seqset
>>> L = seqset.usrList(10)
>>> print L # <- print persistent values
['10', '11', '12', '13', '14', '15', '16', '17', '18', '19']
The convenience of an automatically persistent data type is great, but the performance of Listing 7.14 isn’t. Using this same technique with a speedy database helps greatly.
Listing 7.14 is instructive, but there’s little about handling the unique constraints of assignments to a slice within it because the internal data is a PyList
. Listing 7.15 uses a java.util.Vector
for the internal data so supporting slices in __setitem__
is clarified.
Example 7.15. Wrapping a Java Vector in a List Class
# file: seqset1.py
import java
import types

class usrList:
    def __init__(self):
        self.__data = java.util.Vector()
        map(self.__data.addElement, range(10))

    def __getitem__(self, index):
        indexes = range(self.__data.size())[index]
        if isinstance(index, types.SliceType):
            return map(self.__data.elementAt, indexes)
        else:
            return self.__data.elementAt(indexes)

    def __setitem__(self, index, value):
        if isinstance(index, types.SliceType):
            if index.step not in (None, 1):
                raise ValueError, "Step size must be 1 for setting list slice"
            # copy the existing elements (not their index numbers)
            # from either side of the slice into a new vector
            old = map(None, self.__data)
            newdata = java.util.Vector()
            map(newdata.addElement, old[:index.start])
            map(newdata.addElement, value)
            map(newdata.addElement, old[index.stop:])
            self.__data = newdata
        else:
            self.__data.setElementAt(value, index)

    def __delitem__(self, index):
        indexes = range(self.__data.size())[index]
        if not isinstance(indexes, types.ListType):
            indexes = [indexes]  # a single index, not a slice
        indexes.reverse()  # so we can delete High to Low
        for i in indexes:
            self.__data.removeElementAt(i)

    def __repr__(self):
        return str(map(None, self.__data))

if __name__ == "__main__":
    L = usrList()
    print "L=", L
    print "L[1:]=", L[1:7]
    print "L[-4:9]=", L[-4:9]
    L[3:6:1] = range(100, 110)
    print L
Output from running jython seqset1.py:
L= [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
L[1:]= [1, 2, 3, 4, 5, 6]
L[-4:9]= [6, 7, 8]
[0, 1, 2, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 6, 7, 8, 9]
The expression del L[i]
should delete an object from list-like object L
. This is obviously specific to mutable objects (PyList
-like object).
The special method used to delete a specific index is __delitem__(self, index)
. This method should raise an IndexError
exception when the specified index is out of range and a ValueError
exception when the index type is unsupported.
Listing 7.16 defines a class that implements the __delitem__
method. The class holds its list contents in an internal
java.util.Vector
object, so the __delitem__
method must use the vector’s removeElementAt()
(or remove()
) method for each index deleted. Adding support for slices is familiar from previous examples, but one additional trick is required to delete items from the vector. Deleting a slice becomes consecutive removeElementAt()
operations on the vector. If indexes were deleted from low to high, the index of the next higher item to be removed is reduced by one every time an item is deleted. The list of indexes requiring deletion is reversed in Listing 7.16 to ensure that the indexes are in highest to lowest order. This allows deletion without decrementing indexes of future deletions.
Example 7.16. Implementing __delitem__
# file: seqdel.py
import java
import types

class userList:
    def __init__(self):
        self.__data = java.util.Vector()

    def append(self, object):
        self.__data.addElement(object)

    def __repr__(self):
        return str(list(self.__data))

    def __delitem__(self, index):
        if isinstance(index, types.SliceType):
            # accept a missing step as well as an explicit step of 1
            if index.step not in (None, 1):
                raise ValueError, "Step size must be 1 for deleting list slice"
            delList = range(self.__data.size())[index]
        else:
            delList = [range(self.__data.size())[index]]
        delList.reverse()  # delete from the highest index to the lowest
        map(self.__data.removeElementAt, delList)

if __name__ == '__main__':
    S = userList()
    map(S.append, range(5, 20, 2))
    print "Before deletes:", S
    del S[3]
    del S[4:6]
    print "After deletes: ", S
Output from running jython seqdel.py
:
Before deletes: [5, 7, 9, 11, 13, 15, 17, 19]
After deletes: [5, 7, 9, 13, 19]
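The reason for reversing the index list is easiest to see with a plain list. The two helper functions in this sketch are hypothetical and exist only to contrast the orders: both try to delete indexes 1, 2, and 3 from a copy of the same data.

```python
def delete_low_to_high(data, indexes):
    data = list(data)            # work on a copy
    for i in indexes:
        del data[i]              # later indexes now point at shifted items
    return data


def delete_high_to_low(data, indexes):
    data = list(data)            # work on a copy
    ordered = list(indexes)
    ordered.reverse()            # highest index first
    for i in ordered:
        del data[i]              # items below i have not moved
    return data


letters = ["a", "b", "c", "d", "e", "f"]
wrong = delete_low_to_high(letters, [1, 2, 3])   # ['a', 'c', 'e']
right = delete_high_to_low(letters, [1, 2, 3])   # ['a', 'e', 'f']
```

Deleting from low to high removes "b", "d", and "f" instead of the intended "b", "c", and "d", because every deletion shifts the remaining items one position to the left.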
Table 7.2 contains the math operations a sequence-like object should support as well as the special methods associated with that operation.
Table 7.2. Sequence Math Operations
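Operation | Description | Special Method |
---|---|---|
S + X | The concatenation of two sequences. | __add__, __radd__ |
L += X | Changing L to the concatenation of L and X. | __iadd__ |
S * i | Repeating a sequence for integer i times. | __mul__, __rmul__ |
L *= i | Changing list L to i repetitions of itself. | __imul__ |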
Implementing the concatenation and repetition operations requires implementing the special operator methods for addition and multiplication. Operator methods occur in threes: addition has __add__
, __radd__
, and __iadd__
, whereas multiplication has __mul__
, __rmul__
, and __imul__
. The first of these (__add__
, or __mul__
) is called when the object defining it is on the left side of an operation. For the sequence S
it would mean S
+
X
or S
*
X
are really implemented as S.__add__(X)
and S.__mul__(X)
. The second of these methods (__radd__
, and __rmul__
) are reflected versions of the first—for when the object is on the right side of the expression. The reflected methods are called only if objects on the left do not define __add__
or __mul__
. If S defines __radd__
and __rmul__
, but X
does not, then X + S
and X * S
become S.__radd__(X)
and S.__rmul__(X)
. The __iadd__
and __imul__
methods implement augmented assignment, meaning S
+=
X
and S
*=
X
are implemented as S.__iadd__(X)
and S.__imul__(X)
.
Listing 7.17 implements all six methods listed in Table 7.2. The obligations of these methods are that they raise exceptions for unsupported types, and that only the augmented assignment operations modify self.
Example 7.17. Sequence Concatenation and Repetition
# file: seqmath.py
import types
import java

class usrList:
    def __init__(self):
        self.__data = java.util.Vector()
        map(self.__data.addElement, range(5,8))  # some default values

    def __add__(self, other):
        if isinstance(other, types.ListType):
            return map(None, self.__data) + other
        else:
            raise TypeError, "__add__ only defined for ListType"

    def __radd__(self, other):
        if isinstance(other, types.ListType):
            return other + map(None, self.__data)
        else:
            raise TypeError, "__radd__ only defined for ListType"

    def __iadd__(self, other):
        # Augmented assignment methods usually modify self, then return self
        if isinstance(other, types.ListType):
            map(self.__data.addElement, other)
            return self
        else:
            raise TypeError, "__iadd__ only defined for ListType"

    def __mul__(self, other):
        if (isinstance(other, types.IntType) or
            isinstance(other, types.LongType)):
            return map(None, self.__data) * other
        else:
            raise TypeError, "Only integers allowed for multiplier"

    def __rmul__(self, other):
        if (isinstance(other, types.IntType) or
            isinstance(other, types.LongType)):
            return map(None, self.__data) * other
        else:
            raise TypeError, "Only integers allowed for multiplier"

    def __imul__(self, other):
        if (isinstance(other, types.IntType) or
            isinstance(other, types.LongType)):
            map(self.__data.addElement, [x for x in self.__data] * (other -1))
            return self
        else:
            raise TypeError, "Only integers allowed for multiplier"

    def __repr__(self):
        return str(map(None, self.__data))

if __name__ == '__main__':
    L = usrList()
    print "start :", L
    print "__add__ :", L + [1,0]
    print "__radd__ :", ["a", "b"] + L
    L += ["-", "-"]
    print "__iadd__ :", L
    print "__mull__ :", L * 2
    try:
        print "__rmull__ :", 2 * L
    except TypeError:
        print "__rmull__ raised a TypeError"
    L *= 2
    print "__imull__ :", L
Output from running jython seqmath.py
is:
start : [5, 6, 7]
__add__ : [5, 6, 7, 1, 0]
__radd__ : ['a', 'b', 5, 6, 7]
__iadd__ : [5, 6, 7, '-', '-']
__mull__ : [5, 6, 7, '-', '-', 5, 6, 7, '-', '-']
__rmull__ : [5, 6, 7, '-', '-', 5, 6, 7, '-', '-']
__imull__ : [5, 6, 7, '-', '-', 5, 6, 7, '-', '-']
Python deprecated the following methods in version 2.0. Support for these methods still exists in the Jython code base, so they are included here for completeness. Their inclusion is not meant to encourage their use, but instead is supplied just in case a reader encounters them in legacy code. For new code, use __setitem__
, __getitem__
, and __delitem__
.
__getslice__(self, start, stop, step). Returns a sequence containing those elements of self designated by the slice parameters (start, stop, and step). If L contains [1,2,3,4,5], then L[1:4:1] actually calls (if defined) L.__getslice__(1, 4, 1), and __getslice__(1, 4, 1) should return [2,3,4].
__setslice__(self, start, stop, step, value). Assigns the specified value or values to the designated indexes of self. If L contains [1,2,3,4,5], then L[2:4] = ["a", "b", "c"] actually calls (if defined) L.__setslice__(2, 4, 1, ["a", "b", "c"]). L.__setslice__ should set its internal list to L[0:start] + value + L[stop:].
__delslice__(self, start, stop, step). Removes the internal sequence values designated by the slice. If L contains [1,2,3,4,5], then del L[1:4] actually calls (if defined) L.__delslice__(1, 4, 1), which deletes the internal values at indexes 1, 2, and 3.
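For new code, the item methods can take over this work, because a slice expression reaches them as a slice object instead of separate start and stop arguments. The following is only a minimal sketch: the SliceAware class and its seen attribute are hypothetical names used for illustration, and the exact dispatch for older classic classes may differ between Python and Jython versions.

```python
class SliceAware:
    """List-like wrapper whose item methods also receive slices."""

    def __init__(self, values):
        self.data = list(values)
        self.seen = []            # every index object the methods received

    def __getitem__(self, index):
        self.seen.append(index)   # an int for L[2], a slice for L[1:4]
        return self.data[index]

    def __setitem__(self, index, value):
        self.seen.append(index)
        self.data[index] = value

    def __delitem__(self, index):
        self.seen.append(index)
        del self.data[index]


L = SliceAware([1, 2, 3, 4, 5])
part = L[1:4]                 # handled by __getitem__
L[2:4] = ["a", "b", "c"]      # handled by __setitem__
del L[1:4]                    # handled by __delitem__
```

Because the underlying list already understands slice objects, each method simply forwards the index it was given; a class with a different internal representation would inspect the start, stop, and step attributes of the slice itself.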
Testing whether an object o is a member of sequence S usually loops through S looking for o. Each test therefore takes time proportional to the length of S, and a design that requires numerous membership tests like this faces a harsh, roughly quadratic performance penalty. You do, however, have the option to optimize this membership test with the special method __contains__. If the __contains__ method is defined, membership tests call S.__contains__(o) rather than looping through S. The __contains__ method has the self parameter and a parameter slot for the item whose membership is in question. The __contains__ method should return 0 (false) if the sequence does not contain the object, and 1 or any non-zero value (true) if it does.
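The difference between the two paths can be observed by counting element lookups. The sketch below is illustrative only: the Scanned and Indexed classes and the lookups counter are hypothetical, and the fallback scan through __getitem__ is the behavior of current Python, which is assumed here to match the looping described above.

```python
class Scanned:
    """No __contains__: 'in' walks the sequence one index at a time."""

    def __init__(self, values):
        self.data = list(values)
        self.lookups = 0

    def __getitem__(self, index):
        self.lookups += 1
        return self.data[index]      # an IndexError ends the scan


class Indexed(Scanned):
    """Defines __contains__, so 'in' makes a single keyed lookup."""

    def __init__(self, values):
        Scanned.__init__(self, values)
        self.members = {}
        for v in self.data:
            self.members[v] = 1

    def __contains__(self, item):
        # has_key(item) is the equivalent spelling in older Jython
        return item in self.members


slow = Scanned(range(1000))
fast = Indexed(range(1000))
found_slow = 999 in slow     # scans up to the last element
found_fast = 999 in fast     # never touches __getitem__
```

Both tests succeed, but slow.lookups ends at 1000 while fast.lookups stays at 0.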
Listing 7.18 is a class that emulates a list, but also sets members as keys in an internal dictionary. The dictionary value is the number of times the object appears in the list. This allows speedy membership tests by checking if the dictionary has the key rather than looping through the sequence. The tradeoff is increased memory usage and slower setting and deleting of items. Listing 7.18 adds the __setitem__ and __delitem__ methods, as these operations must be intercepted to keep the internal dictionary and list in sync. Two helper methods, __incrementMember and __decrementMember, are defined to help in handling the syncing process by determining each key’s count (value) and deleting or creating the key when necessary.
Example 7.18. Accelerating Membership Tests with __contains__
# file: seqin.py
import types

class usrList:
    def __init__(self, initialValues):
        self.__data = initialValues
        self.__membership = {}
        # count every occurrence so that duplicates are tracked correctly
        map(self.__incrementMember, self.__data)

    def __contains__(self, item):
        return self.__membership.has_key(item)

    # for __contains__ to work, assignment and deletion must
    # change self.__data and self.__membership
    def __setitem__(self, index, value):
        if isinstance(index, types.SliceType):
            if index.step not in (None, 1):
                raise ValueError, "Assignment to slice requires step=1"
            oldValues = self.__data[index]
            newValues = value
        else:
            # a single index replaces one item with one item
            oldValues = [self.__data[index]]
            newValues = [value]
        # update self.__data _and_ self.__membership
        self.__data[index] = value
        map(self.__decrementMember, oldValues)
        map(self.__incrementMember, newValues)

    def __delitem__(self, index):
        indexes = self.__data[index]
        del self.__data[index]
        if isinstance(indexes, types.ListType):
            map(self.__decrementMember, indexes)
        else:
            self.__decrementMember(indexes) # it's really only one index

    def __incrementMember(self, member):
        if self.__membership.has_key(member):
            self.__membership[member] += 1
        else:
            self.__membership[member] = 1

    def __decrementMember(self, member):
        if self.__membership.has_key(member):
            if self.__membership[member] == 1:
                del self.__membership[member]
            else:
                self.__membership[member] -= 1

    def __repr__(self):
        return str(self.__data)

if __name__ == '__main__':
    from time import time
    t1 = time()
    pyList = range(0, 12000, 3)
    print "The PyList took %f seconds to fill." % (time()-t1,)

    t1 = time()
    newList = usrList(range(0, 12000, 3))
    print "The usrList took %f seconds to fill." % (time()-t1,)

    t1 = time()
    count = 0
    for x in range(10, 12000, 7):
        if x in pyList:
            count += 1
    print "Found %i items in pyList in %f seconds" % (count, time()-t1)

    t1 = time()
    count = 0
    for x in range(10, 12000, 7):
        if x in newList:
            count += 1
    print "Found %i items in newList in %f seconds" % (count, time()-t1)
Output from running jython seqin.py is:
The PyList took 0.000000 seconds to fill.
The usrList took 0.110000 seconds to fill.
Found 571 items in pyList in 6.430000 seconds
Found 571 items in newList in 0.220000 seconds
Listing 7.18 has a verbose testing section to illustrate the item setting penalty and the membership test benefit inherent in this approach.
The Jython (and Python) library contains a module aimed at easing the creation of list-like, user-defined objects. The previous examples used a class called usrList, a name intended to foreshadow this module without creating naming confusion (note the spelling difference). The UserList module defines one class: UserList. This class optionally uses a PyList object internally to represent the list data, and supplies default methods for working with this data. If you choose not to use a PyList object for the internal data, you need to override all required methods.
Listing 7.19 uses the UserList class to keep statistics about how frequently items are requested from the list. The important points of Listing 7.19 are the internal data and the methods defined. The ListStats class in Listing 7.19 chooses to use a PyList object as internal data and passes that object to the UserList constructor. ListStats does not need to implement every method to fully act like a built-in list, because UserList handles everything not explicitly defined in the ListStats class. If it did not pass a PyList to the UserList superclass, much more work would be needed to fully act like a list.
Example 7.19. UserList and List-Like Objects
# file: liststats.py
import UserList

class ListStats(UserList.UserList):
    def __init__(self, data=None):
        if data is None:
            data = [] # avoid sharing one default list between instances
        assert type(data)==type([]), "Constructor arg must be a list"
        # initialize the superclass in place; this sets self.data
        UserList.UserList.__init__(self, data)
        self.stats = {}
        self.requestCount = 0

    def __getitem__(self, index):
        result = self.data[index]
        items = result
        if type(items) != type([]):
            # make plain integers into a list for convenience
            items = [items]
        for x in items:
            self.requestCount += 1
            self.stats[x] = self.stats.get(x, 0) + 1
        return result # an item for an index, a list for a slice

    def printStats(self):
        for x in self.data:
            use = self.stats.get(x, 0)
            if not use:
                continue
            print ("%i, %1.3f%% " %
                   (x, float(use)/float(self.requestCount)*100)),
        print # to put prompt on a new line

if __name__ == '__main__':
    import random
    L = ListStats(range(10))
    for x in range(2000):
        L[random.randint(0, 9)] # access a random index
    L.printStats()
The output from running jython liststats.py is:
0, 9.100% 1, 9.600% 2, 11.000% 3, 10.900% 4, 11.350% 5, 8.750% 6, 10.850% 7, 9.500% 8, 9.550% 9, 9.400%
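How little a UserList subclass needs to define can be seen in a smaller sketch. The CountingList class and its appends attribute are hypothetical names chosen for illustration. The import fallback is an assumption made for portability: the class is found in the UserList module on Jython and Python 2, and in the collections module on Python 3.

```python
try:
    from UserList import UserList        # Jython / Python 2 module
except ImportError:
    from collections import UserList     # same class in Python 3


class CountingList(UserList):
    """Counts appends; everything else is inherited from UserList."""

    def __init__(self, initlist=None):
        UserList.__init__(self, initlist)   # sets up self.data
        self.appends = 0

    def append(self, item):
        self.appends += 1
        UserList.append(self, item)


c = CountingList([3, 1, 2])
c.append(0)
c.sort()            # inherited, operates on the internal list
total = len(c)      # inherited __len__
```

Only append is overridden; sorting, length, indexing, and the remaining list behavior come from the superclass working on the internal data attribute.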
Emulating a mapping type is extremely similar to emulating a list type, except you work with keys instead of indexes. A built-in mapping object implements the methods clear, copy, get, has_key, items, keys, setdefault, update, and values, so truly emulating a mapping type requires implementing these methods. The special methods that a mapping object should implement are __len__, __getitem__, __setitem__, and __delitem__. These should look familiar from the section on lists.
__len__(self). Returns the number of key:value pairs within the mapping object.
__getitem__(self, key). Returns the value associated with the specified key.
__setitem__(self, key, value). Sets the specified key to the specified value in the internal mapping representation.
__delitem__(self, key). Deletes the specified key from the internal mapping.
Because the implementation of these special methods is so familiar from emulating lists, the following code listing should be sufficient to demonstrate these special mapping methods. Listing 7.20 borrows ideas from Listing 7.14 in that it also implements data elements as files, but adds a bit of a twist. The dictionary represents a directory, the keys represent files, and the values represent file contents.
Listing 7.20 also allows for numbers and data objects to be stored by using Jython’s pickle module. Pickling is one of Jython’s serialization mechanisms. The two pickle functions employed are dumps(), which converts from object to string, and loads(), which converts from string to object. To serialize the list in Listing 7.20, we use the following:
string = pickle.dumps(list)
To restore the list, we use this:
pickle.loads(string)
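A complete round trip might look like the following sketch. The variable names are illustrative and deliberately avoid list and string, which would shadow built-in names. Note that on current Python 3 dumps() returns a bytes object where older Python and Jython return a string.

```python
import pickle

original = [2, 3, 5, 7, 11, 13, 17]
serialized = pickle.dumps(original)    # object -> serialized form
restored = pickle.loads(serialized)    # serialized form -> new object

same_value = (restored == original)    # the contents survive the trip
same_object = (restored is original)   # but loads() builds a fresh object
```

The restored list compares equal to the original but is a separate object, which is what makes pickling suitable for storing values in files.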
There is another trick used for safety’s sake in Listing 7.20. A JythonIDfile is added to each directory created for this mapping. There are no checks to guarantee that this class created a certain directory, or any file within it, so it must somehow identify its directories as special (lest someone try this example with /etc or C:\windows\system).
Some of the library methods used have been introduced before, but for clarity, here is a list of methods and what they do:
os.mkdir(directory). Creates a directory. An OS that enforces file and directory permissions may raise an OSError if it is unable to create the directory.
os.path.join(path, filename). Joins a path and filename, adding the platform-specific path separator where appropriate.
os.path.isdir(directory). Returns 1 if the specified directory exists, 0 otherwise.
os.path.isfile(filename). Returns 1 if the specified name exists and is a file, 0 otherwise.
os.listdir(directory). Returns a list of all the names defined in the specified directory.
os.stat(file). Returns a ten-element tuple representing file statistics. Listing 7.20 uses only the file size, designated with ST_SIZE from the stat module.
pickle.dumps(object). Returns a string, which is the serialized object.
pickle.loads(string). Returns the object that the string was serialized from.
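The file-system calls in this list can be tried together in a short sketch. It works in a scratch directory obtained from tempfile.mkdtemp() and removes it afterward with shutil.rmtree(); both calls, along with the directory name mapdir and the file name prime, are assumptions made for this illustration and do not appear in Listing 7.20.

```python
import os
import shutil
import tempfile
from stat import ST_SIZE

base = tempfile.mkdtemp()                    # scratch location for the demo
directory = os.path.join(base, "mapdir")     # hypothetical directory name
os.mkdir(directory)

filename = os.path.join(directory, "prime")  # hypothetical file name
f = open(filename, "wb")
f.write(b"2 3 5 7")                          # seven bytes of data
f.close()

is_dir = os.path.isdir(directory)            # true: it exists and is a directory
is_file = os.path.isfile(filename)           # true: it exists and is a file
names = os.listdir(directory)                # names defined in the directory
size = os.stat(filename)[ST_SIZE]            # size entry of the stat tuple

shutil.rmtree(base)                          # remove the scratch directory
```

Indexing the result of os.stat() with ST_SIZE avoids hard-coding the position of the size field within the tuple.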
Example 7.20. A Persistent Dictionary
# file: specialmap.py
import types
import os
import pickle
from stat import ST_SIZE

class mappingDirectory:
    def __init__(self, directory):
        self.__ID = None
        self.__dir = directory
        if not os.path.exists(directory):
            os.mkdir(directory)
            # mark the new directory as belonging to this mapping
            idfile = os.path.join(directory, "JythonIDfile")
            f = open(idfile, "wb")
            print >> f, str(id(self))
            f.close()
        elif not os.path.isdir(directory):
            raise ValueError, "File %s already exists." % directory
        elif not os.path.isfile(os.path.join(directory, "JythonIDfile")):
            msg = "Directory exists, but it isn't a mapping directory."
            raise ValueError, msg

    def __repr__(self):
        listing = os.listdir(self.__dir)
        results = {}
        for x in listing:
            if x == "JythonIDfile":
                continue
            size = os.stat(os.path.join(self.__dir, x))[ST_SIZE]
            results[x] = "<datafile: size=%i>" % size
        return str(results)

    def __setitem__(self, key, value):
        self.__testKey(key)
        pathandname = os.path.join(self.__dir, key)
        f = open(pathandname, "w+b")
        print >> f, pickle.dumps(value)
        f.close()

    def __getitem__(self, key):
        self.__testKey(key)
        pathandname = os.path.join(self.__dir, key)
        try:
            f = open(pathandname, "rb")
        except IOError:
            raise KeyError, key
        value = f.read()
        f.close()
        return pickle.loads(value)

    def __delitem__(self, key):
        self.__testKey(key)
        pathandname = os.path.join(self.__dir, key)
        if not os.path.isfile(pathandname):
            raise KeyError, key
        os.remove(pathandname)

    def __testKey(self, key):
        if not isinstance(key, types.StringType):
            raise KeyError, "This mapping restricts keys to strings"
        if key == "JythonIDfile":
            raise KeyError, "The name JythonIDfile is reserved."

if __name__ == '__main__':
    # raw string keeps the backslashes in the Windows path literal
    md = mappingDirectory(r"c:\windows\desktop\jythontestdir")
    md["odd"] = filter(lambda x: x%2, range(10000))
    md["even"] = filter(lambda x: not x%2, range(10000))
    md["prime"] = [2, 3, 5, 7, 11, 13, 17]
    print "Mapping =", md
    print "primes =", md["prime"]
    del md["prime"]
    print "primes deleted"
    print "Mapping =", md
Output from running jython specialmap.py is:
Mapping = {'prime': '<datafile: size=38>', 'odd': '<datafile: size=34452>', 'even': '<datafile: size=34452>'}
primes = [2, 3, 5, 7, 11, 13, 17]
primes deleted
Mapping = {'odd': '<datafile: size=34452>', 'even': '<datafile: size=34452>'}
Emulating a numeric type requires defining the special methods for each numeric operation the object should support. The special methods associated with numeric operations are those that implement unary and binary operators, conversion to other types, and coercion. The majority of the special methods are for the binary operators, and these methods appear in triples. For example, implementing addition involves defining the __add__ method for when the object is on the left side of the addition operator, the __radd__ method for when the object is on the right side of the operator, and the __iadd__ method for when using augmented assignment (+=). These methods are called, respectively, normal, reflected, and augmented methods.
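Which of the first two methods runs depends only on the side of the operator the object appears on. The sketch below uses a hypothetical Meters class, with a via attribute added purely to record which special method produced each result.

```python
class Meters:
    """Hypothetical numeric-like class used only for illustration."""

    def __init__(self, value, via=None):
        self.value = value
        self.via = via           # name of the method that built this object

    def __add__(self, other):    # normal: Meters + other
        return Meters(self.value + other, "__add__")

    def __radd__(self, other):   # reflected: other + Meters
        return Meters(other + self.value, "__radd__")


left = Meters(3) + 2     # the object is the left operand
right = 2 + Meters(3)    # the integer cannot add a Meters, so __radd__ runs
```

Both expressions produce a Meters holding 5, but left.via is "__add__" and right.via is "__radd__".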
The augmented assignment methods (__i*__) are unique in that their implementation usually modifies self in place and then returns self, rather than returning a new object. However, if a numeric object does not define the augmented method, it can still be used in an augmented assignment. If the object N defines __add__, but not __iadd__, the expression N += N executes the following:
N = N.__add__(N)
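The two behaviors can be told apart by checking object identity before and after the assignment. This is a minimal sketch with hypothetical class names (Immutable and Accumulator), not code from the chapter's listings.

```python
class Immutable:
    """Defines only __add__, so += rebinds the name to a new object."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return Immutable(self.value + other.value)


class Accumulator(Immutable):
    """Also defines __iadd__, which updates self and returns it."""

    def __iadd__(self, other):
        self.value += other.value
        return self


n = Immutable(2)
before = n
n += n                       # falls back to n = n.__add__(n)
rebound = n is not before    # the name now refers to a new object

m = Accumulator(2)
same = m
m += Immutable(3)            # runs m = m.__iadd__(Immutable(3))
in_place = m is same         # still the original object, now updated
```

The returned value matters in both cases because it is what gets bound to the name on the left; an __iadd__ that forgot to return self would leave that name bound to None.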
Table 7.3 lists numeric operations and their associated methods. On the left side of Table 7.3 is the operation, with N representing the user-defined numeric object. The right side of Table 7.3 is the method signature of the associated special method. If a method should return a specific type of object, the return type is noted by --> type. For example, the operation N + 2 translates into N.__add__(2), and the method signature is __add__(self, other).
Something that plays an important role in the numeric operations is __coerce__. The __coerce__ method is called whenever the two operands are of differing types. The operation N1 + N2, where N1 and N2 are different types, actually calls N1.__coerce__(N2). The __coerce__ method returns a tuple of N1 and N2 converted to a common type; let’s call them T1 and T2. Then T1.__add__(T2) is called. If the left operand does not have the __coerce__ method, the right operand’s __coerce__ method is called.
Table 7.3. Numeric Binary Operators and Their Special Methods
Operators | Methods |
---|---|