Credit: Donn Cave, University of Washington
In this chapter, we consider a class of programmer—the humble system administrator—in contrast to other chapters’ focus on functional domains. As a programmer, the system administrator faces most of the same problems that other programmers face and should find the rest of this book of at least equal interest.
Python’s advantages in the system administration domain are also quite familiar to other Python programmers, but Python’s competition is different. On Unix platforms, at any rate, the landscape is dominated by a handful of lightweight languages such as the Bourne shell and awk that aren’t exactly made obsolete by Python. These little languages can often support a simpler, clearer, and more concise solution than Python, particularly for commands that you’re typing interactively at the shell command prompt. But Python can do things these languages can’t, and it’s often more robust when dealing with issues such as unusually large data inputs. Another notable competitor, especially on Unix systems, is Perl (which isn’t really a little language at all), with just about the same overall power as Python, and usable for typing a few commands interactively at the shell’s command prompt. Python’s strength here is readability and maintainability: when you dust off a script you wrote in a hurry eight months ago, because you need to make some changes to it, you don’t spend an hour figuring out exactly what you had in mind when you wrote this or that subtle trick. You just don’t use any tricks at all, subtle or gross, so that your Python scripts work just fine and you don’t burn your time, months later, striving to reverse-engineer them for understanding.
One item that stands out in this chapter’s solutions is the wrapper: the alternative, programmed interface to a software system. On Unix (including, these days, Mac OS X), this is usually a fairly prosaic matter of diversion and analysis of text I/O. Life is easy when the programs you’re dealing with are able to just give clean textual output, without requiring complex interaction (see Eric Raymond, The Art of Unix Programming, http://www.faqs.org/docs/artu/, for an informative overview of how programs should be architected to make your life easy). However, even when you have to wrap a program that’s necessarily interactive, all is far from lost. Python has very good support in this area, thanks, first of all, to the fact that it places C-level pseudo-TTY functions at your disposal (see the pty module of the Python Standard Library). The pseudo-TTY device is like a bidirectional pipe with TTY driver support, so it’s essential for things such as password prompts that insist on a TTY. Because it appears to be a TTY, applications writing to a pseudo-TTY normally use line buffering, instead of the block buffering that gives problems with pipes. Pipes are more portable and less trouble to work with, but they don’t work for interfacing to every application. Excellent third-party extensions exist that wrap pty into higher-level layers for ease of use, most notably Pexpect, http://pexpect.sourceforge.net/.
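To make the pseudo-TTY idea concrete, here is a minimal sketch in present-day Python 3 (the recipes in this chapter target Python 2). It assumes a Unix-like system, and the helper name run_on_pty is hypothetical, not part of any library. The child process gets the slave end of the pseudo-TTY as its standard streams, so it believes it is talking to a terminal; the demonstration asks the child to report exactly that.

```python
import os
import pty
import subprocess
import sys

def run_on_pty(argv):
    """Run argv with a pseudo-TTY as its stdin/stdout/stderr; return its output."""
    master, slave = pty.openpty()
    proc = subprocess.Popen(argv, stdin=slave, stdout=slave, stderr=slave,
                            close_fds=True)
    os.close(slave)            # only the child keeps the slave end open
    chunks = []
    while True:
        try:
            data = os.read(master, 1024)
        except OSError:        # Linux reports EIO once the child has closed the slave
            break
        if not data:           # other platforms may signal end-of-file instead
            break
        chunks.append(data)
    os.close(master)
    proc.wait()
    return b''.join(chunks)

# Ask a child Python process whether its stdout looks like a terminal
output = run_on_pty([sys.executable, '-c',
                     'import sys; print(sys.stdout.isatty())'])
```

For anything beyond simple capture (prompts, timeouts, pattern matching), a higher-level layer such as Pexpect is likely to be less trouble than hand-written read loops like this one.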
On Windows, the situation often is not as prosaic as on Unix-like platforms, since the information you need to do your system administration job may be somewhere in the registry, may be available via some Windows APIs, and/or may be available via COM. The standard Python library _winreg module, Mark Hammond’s PyWin32 package, and Thomas Heller’s ctypes, taken together, give the Windows administrator reasonably easy access to all of these sources, and you’ll see more Windows administration recipes here than you will ones for Unix. The competition for Python as a system administration language on Windows is feeble compared to that on Unix, which is yet another reason for the platform’s prominence here. The PyWin32 extensions are available for download at http://sourceforge.net/projects/pywin32/. PyWin32 also comes with ActiveState’s ActivePython distribution of Python (http://www.activestate.com/ActivePython/). To use this rich and extremely useful package most effectively, you also need Mark Hammond and Andy Robinson, Python Programming on Win32 (O’Reilly). ctypes is available for download at http://sourceforge.net/projects/ctypes.
While it may sometimes be difficult to see what brought all the recipes together in this chapter, it isn’t difficult to see why system administrators deserve their own chapter: Python would be nowhere without them! Who else, back when Python was still an obscure, fledgling language, could bring it into an organization and almost covertly infiltrate it into the working environment? If it weren’t for the offices of these benevolent and pragmatic anarchists, Python might well have languished in obscurity despite its merits.
Credit: Devin Leung
You need to create new passwords randomly—for example, to assign them automatically to new user accounts.
One of the chores of system administration is installing new user accounts. Assigning a different, totally random password to each new user is a good idea. Save the following code as makepass.py:
from random import choice
import string
def GenPasswd(length=8, chars=string.letters+string.digits):
    return ''.join([ choice(chars) for i in range(length) ])
This recipe is useful when you are creating new user accounts and assigning each of them a different, totally random password. For example, you can print six passwords of length 12:
>>> import makepass
>>> for i in range(6):
...     print makepass.GenPasswd(12)
...
uiZWGSJLWjOI
FVrychdGsAaT
CGCXZAFGjsYI
TPpQwpWjQEIi
HMBwIvRMoIvh
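For readers on Python 3, a rough equivalent might look like the following sketch. Two substitutions are mine rather than the recipe’s: string.ascii_letters replaces string.letters (which does not exist in Python 3), and the standard library module secrets replaces random.choice, since secrets is the module intended for security-sensitive random choices. The name gen_passwd is just a restyled version of the recipe’s GenPasswd.

```python
import secrets
import string

def gen_passwd(length=8, chars=string.ascii_letters + string.digits):
    """Return a password of `length` characters drawn uniformly from `chars`."""
    return ''.join(secrets.choice(chars) for _ in range(length))

# Six passwords of length 12, as in the interactive session shown earlier
passwords = [gen_passwd(12) for _ in range(6)]
for p in passwords:
    print(p)
```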
Of course, such totally random passwords, while providing an excellent theoretical basis for security, are impossibly hard to remember for most users. If you require users to stick with their assigned passwords, many users will probably write them down. The best you can hope for is that new users will set their own passwords at their first login, assuming, of course, that the system you’re administering lets each user change his own password. (Most operating systems do, but you might be assigning passwords for other kinds of services that unfortunately often lack such facilities.)
A password that is written down anywhere is a serious security risk: pieces of paper get lost, misplaced, and peeked at. From a pragmatic point of view, you might be better off assigning passwords that are not totally random; users are more likely to remember them and less likely to write them down (see Recipe 10.2). This practice may violate the theory of password security, but, as all practicing system administrators know, pragmatism trumps theory.
Recipe 10.2; documentation of the standard library module random in the Library Reference and Python in a Nutshell.
Credit: Luther Blissett
You need to create new passwords randomly—for example, to assign them automatically to new user accounts. You want the passwords to be somewhat feasible to remember for typical users, so they won’t be written down.
We can use a pastiche approach for this, mimicking letter n-grams in actual English words. A grander way to look at the same approach is to call it a Markov Chain Simulation of English:
import random, string
class password(object):
    # Any substantial file of English words will do just as well: we
    # just need self.data to be a big string, the text we'll pastiche
    data = open("/usr/share/dict/words").read( ).lower( )
    def renew(self, n, maxmem=3):
        ''' accumulate into self.chars `n' random characters, with a
            maximum-memory "history" of `maxmem` characters back. '''
        self.chars = [ ]
        for i in range(n):
            # Randomly "rotate" self.data
            randspot = random.randrange(len(self.data))
            self.data = self.data[randspot:] + self.data[:randspot]
            # Get the n-gram
            where = -1
            # start by trying to locate the last maxmem characters in
            # self.chars.  If i<maxmem, we actually only get the last
            # i, i.e., all of self.chars -- but that's OK: slicing
            # is quite tolerant in this way, and it fits the algorithm
            locate = ''.join(self.chars[-maxmem:])
            while where<0 and locate:
                # Locate the n-gram in the data
                where = self.data.find(locate)
                # Back off to a shorter n-gram if necessary
                locate = locate[1:]
            # if where==-1 and locate='', we just pick self.data[0] --
            # it's a random item within self.data, thanks to the rotation
            c = self.data[where+len(locate)+1]
            # we only want lowercase letters, so, if we picked another
            # kind of character, we just choose a random letter instead
            if not c.islower( ): c = random.choice(string.lowercase)
            # and finally we record the character into self.chars
            self.chars.append(c)
    def __str__(self):
        return ''.join(self.chars)
if __name__ == '__main__':
    "Usage: pastiche [passwords [length [memory]]]"
    import sys
    if len(sys.argv)>1: dopass = int(sys.argv[1])
    else: dopass = 8
    if len(sys.argv)>2: length = int(sys.argv[2])
    else: length = 10
    if len(sys.argv)>3: memory = int(sys.argv[3])
    else: memory = 3
    onepass = password( )
    for i in range(dopass):
        onepass.renew(length, memory)
        print onepass
This recipe is useful when creating new user accounts and assigning each user a different, random password: it uses passwords that a typical user will find it feasible to remember, hopefully so they won’t get written down. See Recipe 10.1 if you prefer totally random passwords.
The recipe’s idea is based on the good old pastiche concept. Each letter (always lowercase) in the password is chosen pseudo-randomly from data that is a collection of words in a natural language familiar to the users. This recipe uses the file /usr/share/dict/words, supplied with Linux systems (on my machine, a file of over 45,000 words), but any large document in plain text will do just as well. The trick that makes the passwords sort of memorable, and not fully random, is that each letter is chosen based on the last few letters already picked for the password as it stands so far. Thus, letter transitions will tend to be “repetitive” according to patterns that are familiar to the user.
The code in the recipe takes some care to locate, each time, a random occurrence, in the text being pastiched, of the last maxmem characters picked so far. Since it’s easy to find the first occurrence of a substring, the code “rotates” the text string randomly, to ensure that the first occurrence is a random one from the point of view of the original text. If the substring made up of the last maxmem characters picked is not found in the text, the code “backs down” to search for just the last maxmem-1, and so on, backing down until, worst case, it just picks the first character in the rotated text (which is a random character from the point of view of the original text).
A break in this Markov Chain process occurs when this picking procedure chooses a character that is not a lowercase letter, in which case, a random lowercase letter is chosen instead (any lowercase letter is picked with equal probability).
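The rotate, locate, and back-off steps described above can be sketched compactly in Python 3. This is an illustration under assumptions, not the recipe’s class: the function name pastiche_password and its rng parameter are hypothetical, the corpus is passed in as a string instead of being read from /usr/share/dict/words, and the index wraps around at the end of the text (a small safety measure the recipe does not take).

```python
import random
import string

def pastiche_password(text, n=10, maxmem=3, rng=random):
    """Build an n-letter password whose letter transitions mimic `text`."""
    text = text.lower()
    chars = []
    for _ in range(n):
        # Rotate the text at a random spot, so that str.find (which returns
        # the FIRST match) effectively picks a random occurrence.
        spot = rng.randrange(len(text))
        rotated = text[spot:] + text[:spot]
        # Start from the last maxmem letters picked so far...
        suffix = ''.join(chars[-maxmem:]) if maxmem > 0 else ''
        # ...and back off to ever-shorter suffixes until one is found
        # (the empty suffix always "matches" at position 0).
        while suffix and rotated.find(suffix) < 0:
            suffix = suffix[1:]
        where = rotated.find(suffix) + len(suffix)
        # Take the character that follows the match, wrapping around in
        # case the match sits at the very end of the rotated text.
        c = rotated[where % len(rotated)]
        # Break the chain with a uniformly random letter if we hit a
        # character that is not a lowercase ASCII letter.
        if c not in string.ascii_lowercase:
            c = rng.choice(string.ascii_lowercase)
        chars.append(c)
    return ''.join(chars)

sample = pastiche_password("the quick brown fox jumps over the lazy dog", n=12)
print(sample)
```

With a corpus this tiny the results are hardly word-like; a large word list gives the letter statistics something to work with.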
Here are a couple of typical sample runs of this pastiche.py password-generation script:
[situ@tioni cooker]$ python pastiche.py
yjackjaceh
ackjavagef
aldsstordb
dingtonous
stictlyoke
cvaiwandga
lidmanneck
olexnarinl
[situ@tioni cooker]$ python pastiche.py
ptiontingt
punchankin
cypresneyf
sennemedwa
iningrated
fancejacev
sroofcased
nryjackman
As you can see, some of these are definitely word-like, others less so, but for a typical human being, none are more problematic to remember than a sequence of even fewer totally random, uncorrelated letters. No doubt some theoretician will complain (justifiably, in a way) that they aren’t as random as all that. Well, tough. My point is that they had better not be, if some poor fellow is going to have to remember them! You can compensate for this limitation by making them a bit longer. If said theoretician demonstrates how to compute the entropy per character of this method of password generation (versus the obvious 4.7 bits/character, the base-2 logarithm of 26, for passwords made up of totally random lowercase letters), now that would be a useful contribution indeed. Meanwhile, I’ll keep generating passwords this way, rather than in a totally random way. If nothing else, it’s the closest thing I’ve found to a useful application for the lovely pastiche concept.
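The baseline figure quoted above is easy to check, and the same arithmetic shows how “a bit longer” compensates for lower randomness. In this Python 3 sketch, the value 3.0 bits per character for pastiche passwords is purely an illustrative assumption (the text is explicit that the real figure has not been computed), and letters_needed is a hypothetical helper name.

```python
import math

# Entropy per character for uniformly random choices
bits_lowercase = math.log2(26)        # about 4.70 bits: random lowercase letters
bits_alnum = math.log2(26 + 26 + 10)  # about 5.95 bits: letters plus digits

def letters_needed(target_bits, bits_per_char):
    """Smallest password length that reaches target_bits of entropy."""
    return math.ceil(target_bits / bits_per_char)

# Entropy of 8 totally random lowercase letters
target = 8 * bits_lowercase           # about 37.6 bits

# ASSUMPTION, for illustration only: suppose pastiche passwords carried
# 3.0 bits per character. Matching the target would then take this many letters.
assumed_pastiche_bits = 3.0
length_for_parity = letters_needed(target, assumed_pastiche_bits)
print(round(bits_lowercase, 2), length_for_parity)
```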
The concept of passwords that are not totally random, but rather a bit more memorable, goes back a long way—at least to the 1960s and to works by Morrie Gasser and Daniel Edwards. A Federal Information Processing Standard (FIPS), FIPS 181, specifies in detail how “pronounceable” passwords are to be generated; see http://www.itl.nist.gov/fipspubs/fip181.htm.
Recipe 10.1; documentation of the standard library module random in the Library Reference and Python in a Nutshell.
Credit: Magnus Lyckå
You are writing a Python application that must authenticate users. All of the users have accounts on some POP servers, and you’d like to reuse, for your own authentication, the user IDs and passwords that your users have on those servers.
To log into the application, a user must provide the server, user ID and password for his mail account. We try logging into that POP server with these credentials—if that attempt succeeds, then the user has authenticated successfully. (Of course, we don’t peek into the user’s mailbox!)
def popauth(popHost, user, passwd):
    """ Log in and log out, only to verify user identity.
        Raise exception in case of failure.
    """
    import poplib
    try:
        pop = poplib.POP3(popHost)
    except:
        raise RuntimeError("Could not establish connection "
                           "to %r for password check" % popHost)
    try:
        # Log in and perform a small sanity check
        pop.user(user)
        pop.pass_(passwd)
        length, size = pop.stat( )
        assert type(length) == type(size) == int
        pop.quit( )
    except:
        # try to close the connection before reporting the failure
        try: pop.quit( )
        except: pass
        raise RuntimeError("Could not verify identity. "
                           "User name %r or password incorrect." % user)
To use this recipe, the application must store somewhere the list of known users and either the single POP server they all share, or the specific POP server on which each user authenticates—it need not be the same POP server for all users. Either a text file, or a simple table in any kind of database, will do just fine for this purpose.
This solution is neat, but it does have some weaknesses:
Users must trust that any application implementing this authentication system won’t abuse their email accounts.
POP passwords are, alas!, sent in plain text over the Internet.
We have to trust that the POP server security isn’t compromised.
Logging in might take a few seconds if the POP server is slow.
Logging in won’t work if the POP server is down.
However, offsetting all of these potential drawbacks is the convenience of applications not having to store any passwords, and of not forcing a poor overworked system administrator to administer password changes. It’s also quite simple! In short, I wouldn’t use this approach for a bank system, but I would have no qualms using it, for example, to give users rights to edit web pages at a somewhat restricted WikiWiki, or for similarly low-risk applications.
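Two of the weaknesses listed above (plain-text passwords and slow servers) can be softened when the POP server supports SSL. The following Python 3 sketch assumes that it does; poplib.POP3_SSL and its timeout argument are part of the standard library, while the function name pop_auth and the pop_factory parameter (there only so that a stand-in class can be substituted for testing) are hypothetical. Unlike the recipe, it returns True or False for the credential check and raises only when no connection can be made.

```python
import poplib

def pop_auth(host, user, password, pop_factory=poplib.POP3_SSL, timeout=10):
    """Return True if the POP server at `host` accepts the credentials."""
    try:
        # POP3_SSL encrypts the session; timeout bounds a slow or dead server
        pop = pop_factory(host, timeout=timeout)
    except OSError as err:
        raise RuntimeError("Could not connect to %r: %s" % (host, err))
    try:
        pop.user(user)
        pop.pass_(password)
        return True
    except poplib.error_proto:
        # the server answered, but rejected the user name or password
        return False
    finally:
        # always try to end the session politely, whatever happened
        try:
            pop.quit()
        except (OSError, poplib.error_proto):
            pass
```

A call such as pop_auth('pop.example.com', 'alice', 'secret') would then be the whole login check; the host name here is a placeholder.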
Documentation of the standard library module poplib in the Library Reference and Python in a Nutshell.
Credit: Mark Nenadov, Ivo Woltring
You need to examine a log file from Apache to count the number of hits recorded from each individual IP address that accessed it.
Many of the chores of administering a web server have to do with analyzing Apache logs, which Python makes easy:
def calculateApacheIpHits(logfile_pathname):
    ''' return a dict mapping IP addresses to hit counts '''
    ipHitListing = { }
    contents = open(logfile_pathname, "r")
    # go through each line of the logfile
    for line in contents:
        # split the string to isolate the IP address
        ip = line.split(" ", 1)[0]
        # Ensure length of the IP address is proper (see discussion)
        if 6 < len(ip) <= 15:
            # Increase by 1 if IP exists; else set hit count = 1
            ipHitListing[ip] = ipHitListing.get(ip, 0) + 1
    return ipHitListing
This recipe supplies a function that returns a dictionary containing the hit counts for each individual IP address that has accessed your Apache web server, as recorded in an Apache log file. For example, a typical use would be:
HitsDictionary = calculateApacheIpHits( "/usr/local/nusphere/apache/logs/access_log")
This function has many quite useful applications. For example, I often use it in my code to determine the number of hits that are actually originating from locations other than my local host. This function is also used to chart which IP addresses are most actively viewing the pages that are served by a particular installation of Apache.
This function performs a modest validation of each IP address, which is really just a length check: an IP address cannot be longer than 15 characters (4 sets of triplets and 3 periods) nor shorter than 7 (4 sets of single digits and 3 periods). This validation is not stringent, but it does reduce, at tiny runtime cost, the probability of placing into the dictionary some data that is obviously garbage. As a general technique, low-cost, highly approximate sanity checks for data that is expected to be OK (but one never knows for sure) are worth considering. However, if you want to be stricter, regular expressions can help. Change the loop in this recipe’s function’s body to:
import re
# an IP is: 4 strings, each of 1-3 digits, joined by periods
ip_specs = r'\.'.join([r'\d{1,3}']*4)
re_ip = re.compile(ip_specs)
for line in contents:
    match = re_ip.match(line)
    if match:
        # Increase by 1 if IP exists; else set hit count = 1
        ip = match.group( )
        ipHitListing[ip] = ipHitListing.get(ip, 0) + 1
In this variant, we use a regular expression to extract and validate the IP at the same time. This approach enables us to avoid the split operation as well as the length check, and thus amortizes most of the runtime cost of matching the regular expression. This variant is only a few percentage points slower than the recipe’s solution.
Of course, the pattern given here as ip_specs is not entirely precise either, since it accepts, as components of an IP quad, arbitrary strings of one to three digits, while the components should be more constrained. But to ward off garbage lines, this level of sanity check is sufficient.
Another alternative is to convert and check the address: extract string ip just as we do in this recipe’s Solution, then:
# Ensure the IP address is proper
try:
    quad = map(int, ip.split('.'))
except ValueError:
    pass
else:
    if len(quad)==4 and min(quad)>=0 and max(quad)<=255:
        # Increase by 1 if IP exists; else set hit count = 1
        ipHitListing[ip] = ipHitListing.get(ip, 0) + 1
This approach is more work, but it does guarantee that only IP addresses that are formally valid get counted at all.
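In Python 3, the same “convert and check” idea can lean on the standard library module ipaddress, which did not exist when this recipe was written. The sketch below makes two assumptions of its own: the function name count_ip_hits is hypothetical, and it takes any iterable of log lines rather than a pathname. Note that ipaddress.ip_address also accepts IPv6 addresses, which the recipe’s checks would reject.

```python
import ipaddress
from collections import Counter

def count_ip_hits(lines):
    """Return a Counter mapping each valid client IP address to its hit count."""
    hits = Counter()
    for line in lines:
        fields = line.split(None, 1)
        if not fields:
            continue                  # skip blank lines
        try:
            # raises ValueError for anything that is not a well-formed address
            ip = ipaddress.ip_address(fields[0])
        except ValueError:
            continue
        hits[str(ip)] += 1
    return hits

sample_log = [
    '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326',
    '10.0.0.1 - - [10/Oct/2000:13:55:40 -0700] "GET /a HTTP/1.0" 200 512',
    '192.168.1.7 - - [10/Oct/2000:13:56:01 -0700] "GET / HTTP/1.0" 304 -',
    '999.1.1.1 - - [10/Oct/2000:13:56:02 -0700] "GET / HTTP/1.0" 200 10',
    'garbage line',
    '',
]
hits = count_ip_hits(sample_log)
busiest = hits.most_common(1)
```

Counter.most_common also covers the “most active addresses” use mentioned earlier without any extra sorting code.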
The Apache web server is available and documented at http://httpd.apache.org; regular expressions are covered in the docs of the re module in the Library Reference and Python in a Nutshell.
Credit: Mark Nenadov
You need to monitor how often client requests are refused by your Apache web server because the client’s cache of the page is already up to date.
When a browser queries a server for a page that the browser has in its cache, the browser lets the server know about the cached data, and the server returns a special status code (rather than serving the page again) if the client’s cache is up to date. Here’s how to find the statistics for such occurrences in your server’s logs:
def clientCachePercentage(logfile_pathname):
    contents = open(logfile_pathname, "r")
    totalRequests = 0
    cachedRequests = 0
    for line in contents:
        totalRequests += 1
        if line.split(" ")[8] == "304":
            # if server returned "not modified"
            cachedRequests += 1
    return int(0.5+float(100*cachedRequests)/totalRequests)
The percentage of requests to your Apache server that are met by the client’s own cache is an important factor in the perceived performance of your server. The code in this recipe helps you get this information from the server’s log. Typical use would be:
log_path = "/usr/local/nusphere/apache/logs/access_log"
print "Percentage of requests that were client-cached: " + str(
    clientCachePercentage(log_path)) + '%'
The recipe reads the log file one line at a time by looping over the file, which is the normal way to read a file nowadays. Trying to read the whole log file into memory, by calling the readlines method on the file object, would be an unsuitable approach for very large files, which server log files can certainly be. That approach might not work at all, or might work but damage performance considerably by swamping your machine’s virtual memory. Even when it works, readlines offers no advantage over the approach used in this recipe.
The body of the for loop calls the split method on each line string, with a string of a single space as the argument, to split the line into a list of its space-separated fields. Then it uses indexing ([8]) to get the ninth such field. Apache puts the status code into the ninth field of each line in the log. Code "304" means “not modified” (i.e., the client’s cache was already correctly updated). We count those cases in the cachedRequests variable and all lines in the log in the totalRequests variable, so that, in the end, we can return the percentage of cache hits. The expression we use in the return statement computes the percentage as a float number, then rounds it correctly to the closest int, because an integer result is most useful in practice.
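To see the field indexing and the rounding at work on concrete data, here is a small Python 3 sketch. It departs from the recipe in ways that are my own assumptions: the hypothetical function client_cache_percentage takes any iterable of lines, skips lines with fewer than nine fields instead of failing on them, and returns 0 for an empty log rather than dividing by zero.

```python
def client_cache_percentage(lines):
    """Return the rounded percentage of requests answered with status 304."""
    total = cached = 0
    for line in lines:
        fields = line.split(" ")
        if len(fields) < 9:
            continue              # malformed or blank line: ignore it
        total += 1
        if fields[8] == "304":    # ninth field holds the status code
            cached += 1
    if not total:
        return 0
    # adding 0.5 before truncating rounds to the nearest integer
    return int(0.5 + 100.0 * cached / total)

sample_log = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 304 -',
    '127.0.0.1 - - [10/Oct/2000:13:55:37 -0700] "GET /logo.png HTTP/1.0" 200 2326',
    '127.0.0.1 - - [10/Oct/2000:13:55:38 -0700] "GET /style.css HTTP/1.0" 200 811',
]
percentage = client_cache_percentage(sample_log)
```

Splitting on a single space works here because the timestamp and the request line each contribute a fixed number of space-separated pieces; a request whose URL contains a literal space would shift the status code out of position.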
The Apache web server is available and documented at http://httpd.apache.org.
Credit: Larry Price, Peter Cogolo
You want users to work with their favorite text-editing programs to edit text files, to provide input to your script.
Module tempfile lets you create temporary files, and module os has many tools to check the environment and to work with files and external programs, such as text editors. A couple of functions can wrap this functionality into an easy-to-use form:
import sys, os, tempfile
def what_editor( ):
    editor = os.getenv('VISUAL') or os.getenv('EDITOR')
    if not editor:
        # sys.platform is 'win32' on Windows
        if sys.platform == 'win32':
            editor = 'Notepad.Exe'
        else:
            editor = 'vi'
    return editor
def edited_text(starting_text=''):
    temp_fd, temp_filename = tempfile.mkstemp(text=True)
    os.write(temp_fd, starting_text)
    os.close(temp_fd)
    editor = what_editor( )
    x = os.spawnlp(os.P_WAIT, editor, editor, temp_filename)
    if x:
        raise RuntimeError, "Can't run %s %s (%s)" % (editor, temp_filename, x)
    result = open(temp_filename).read( )
    os.unlink(temp_filename)
    return result
if __name__=='__main__':
    text = edited_text('''Edit this text a little,
go ahead,
it's just a demonstration, after all...!
''')
    print 'Edited text is:', text
Your scripts may often need a substantial amount of textual input from the user. Letting users edit the text with their favorite text editor is an excellent feature for your script to have, and this recipe shows how you can obtain it. I have used variants of this approach for such purposes as adjusting configuration files, writing blog posts, and sending emails.
If your scripts do not need to run on Windows, a more secure and slightly simpler way to code function edited_text is available:
def edited_text(starting_text=''):
    temp_file = tempfile.NamedTemporaryFile( )
    temp_file.write(starting_text)
    temp_file.seek(0)
    editor = what_editor( )
    x = os.spawnlp(os.P_WAIT, editor, editor, temp_file.name)
    if x:
        raise RuntimeError, "Can't run %s %s (%s)" % (editor, temp_file.name, x)
    return temp_file.read( )
Unfortunately, this alternative relies on the editor we’re spawning being able to open and modify the temporary file while we are holding that file open, and this capability is not supported on most versions of Windows. The version of edited_text given in the recipe is more portable.
When you’re using this recipe to edit text files that must respect some kind of syntax or other constraints, such as a configuration file, you can make your script simpler and more effective by using a cycle of “input/parse/re-edit in case of errors,” providing immediate feedback to users when you can diagnose they’ve made a mistake in editing the file. Ideally, in such cases, you should reopen the editor already pointing at the line in error, which is possible with most Unix editors by passing them a first argument such as '+23', specifying that they start editing at line 23, before the filename argument. Unfortunately, such an argument would confuse many Windows editors, so you have to make some hard decisions here (if you do need to support Windows).
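The input/parse/re-edit cycle might be sketched as follows in Python 3, for Unix editors only. Everything named here is hypothetical: editor_command builds the argument list (with the '+N' argument placed before the filename), and edit_until_valid assumes a caller-supplied validate function that returns None when the text is acceptable, or a (line_number, message) pair when it is not.

```python
import os
import subprocess
import tempfile

def editor_command(editor, filename, line=None):
    """Build the argv list for the editor, optionally starting at `line`."""
    cmd = [editor]
    if line is not None:
        cmd.append('+%d' % line)   # understood by vi, emacs, nano, and others
    cmd.append(filename)
    return cmd

def edit_until_valid(starting_text, validate, editor=None):
    """Keep reopening the editor until validate(text) returns None."""
    editor = editor or os.getenv('VISUAL') or os.getenv('EDITOR') or 'vi'
    fd, path = tempfile.mkstemp(text=True)
    try:
        with os.fdopen(fd, 'w') as f:
            f.write(starting_text)
        line = None
        while True:
            # raises CalledProcessError if the editor exits with an error
            subprocess.check_call(editor_command(editor, path, line))
            with open(path) as f:
                text = f.read()
            problem = validate(text)
            if problem is None:
                return text
            line, message = problem
            print('Line %d: %s' % (line, message))
    finally:
        os.unlink(path)
```

As written, the loop gives the user no way out other than fixing the text or making the editor fail; a real script would probably also offer an explicit “give up” option.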
Documentation for modules tempfile and os in the Library Reference and Python in a Nutshell.
Credit: Anand Pillai, Tiago Henriques, Mario Ruggier
You want to make frequent backup copies of all files you have modified within a directory tree, so that further changes won’t accidentally obliterate some of your editing.
Version-control systems, such as RCS, CVS, and SVN, are very powerful and useful, but sometimes a simple script that you can easily edit and customize can be even handier. The following script checks for new files to back up in a tree that you specify. Run the script periodically to keep your backup copies up to date.
import sys, os, shutil, filecmp
MAXVERSIONS=100
def backup(tree_top, bakdir_name='bakdir'):
    for dir, subdirs, files in os.walk(tree_top):
        # ensure each directory has a subdir called bakdir
        backup_dir = os.path.join(dir, bakdir_name)
        if not os.path.exists(backup_dir):
            os.makedirs(backup_dir)
        # stop any recursing into the backup directories
        subdirs[:] = [d for d in subdirs if d != bakdir_name]
        for file in files:
            filepath = os.path.join(dir, file)
            destpath = os.path.join(backup_dir, file)
            # check existence of previous versions
            for index in xrange(MAXVERSIONS):
                backup = '%s.%2.2d' % (destpath, index)
                if not os.path.exists(backup): break
            if index > 0:
                # no need to backup if file and last version are identical
                old_backup = '%s.%2.2d' % (destpath, index-1)
                try:
                    if os.path.isfile(old_backup
                       ) and filecmp.cmp(filepath, old_backup, shallow=False):
                        continue
                except OSError:
                    pass
            try:
                shutil.copy(filepath, backup)
            except OSError:
                pass
if __name__ == '__main__':
    # run backup on the specified directory (default: the current directory)
    try: tree_top = sys.argv[1]
    except IndexError: tree_top = '.'
    backup(tree_top)
Although version-control systems are more powerful, this script can be useful in development work. I often customize it, for example, to keep backups only of files with certain extensions (or, when that’s handier, of all files except those with certain extensions); it suffices to add an appropriate test at the very start of the for file in files loop, such as:
name, ext = os.path.splitext(file)
if ext not in ('.py', '.txt', '.doc'):
    continue
This snippet first uses function splitext from the standard library module os.path to extract the file extension (starting with a period) into local variable ext, then conditionally executes statement continue, which passes to the next leg of the loop, unless the extension is one of a few that happen to be the ones of interest in the current subtree.
Other potentially useful variants include backing files up to some other subtree (potentially on a removable drive, which has some clear advantages for backup purposes) rather than the current one, compressing the files that are being backed up (look at standard library module gzip for this purpose), and more refined ones yet. However, rather than complicating function backup by offering all of these variants as options, I prefer to copy the entire script to the root of each of the various subtrees of interest, and customize it with a little simple editing. While this strategy would be a very bad one for any kind of complicated, highly reusable production-level code, it is reasonable for a simple, straightforward system administration utility such as the one in this recipe.
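As one example of such a customization, the compression variant could replace the shutil.copy call with something like this Python 3 sketch. The helper name backup_compressed and the choice to append '.gz' to the versioned backup name are assumptions made for illustration; the small self-check at the end runs in a throwaway temporary directory.

```python
import gzip
import os
import shutil
import tempfile

def backup_compressed(filepath, backup_path):
    """Write a gzip-compressed copy of filepath to backup_path + '.gz'."""
    target = backup_path + '.gz'
    with open(filepath, 'rb') as src, gzip.open(target, 'wb') as dst:
        # stream the data across in chunks, so large files are no problem
        shutil.copyfileobj(src, dst)
    return target

# Self-check: back up a file, then read the compressed copy back
with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, 'notes.txt')
    with open(original, 'wb') as f:
        f.write(b'some text\n' * 100)
    saved = backup_compressed(original, os.path.join(tmp, 'notes.txt.00'))
    with gzip.open(saved, 'rb') as f:
        restored = f.read()
```

One consequence of this variant: the “identical to the last version” test can no longer use filecmp.cmp directly against the backup, because the backup is compressed; it would have to compare against the decompressed contents instead.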
Worthy of note in this recipe’s implementation is the use of function os.walk, a generator from the standard Python library’s module os, which makes it very simple to iterate over all or most of a filesystem subtree, with no need for such subtleties as recursion or callbacks, just a straightforward for statement. To avoid backing up the backups, this recipe uses one advanced feature of os.walk: the second one of the three values that os.walk yields at each step through the loop is a list of subdirectories of the current directory. We can modify this list in place, removing some of the subdirectory names it contains. When we perform such an in-place modification, os.walk does not recurse through the subdirectories whose names we removed. The following steps deal only with the subdirectories whose names are left in. This subtle but useful feature of os.walk is one good example of how a generator can receive information from the code that’s iterating on it, to affect details of the iteration being performed.
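The in-place pruning is easy to observe in isolation. This Python 3 sketch (walk_skipping is a hypothetical helper name) builds a tiny throwaway tree containing a bakdir subdirectory and shows that files under it are never visited. The slice assignment subdirs[:] = ... matters: rebinding the name with subdirs = ... would create a new list and leave os.walk’s own list untouched.

```python
import os
import tempfile

def walk_skipping(tree_top, skip_name='bakdir'):
    """Yield paths of all files under tree_top, ignoring skip_name directories."""
    for dirpath, subdirs, files in os.walk(tree_top):
        # modify the list os.walk handed us, in place, to prune the walk
        subdirs[:] = [d for d in subdirs if d != skip_name]
        for name in files:
            yield os.path.join(dirpath, name)

with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, 'sub', 'bakdir'))
    for rel in ('a.txt',
                os.path.join('sub', 'b.txt'),
                os.path.join('sub', 'bakdir', 'b.txt.00')):
        open(os.path.join(top, rel), 'w').close()
    found = sorted(os.path.relpath(p, top) for p in walk_skipping(top))
```

Pruning works only with the default top-down traversal; with topdown=False the subdirectories have already been visited by the time their parent’s list is yielded.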
Documentation of standard library modules os, shutil, and gzip in the Library Reference and Python in a Nutshell.
Credit: Noah Spurrier, Dave Benjamin
You need to selectively copy a large mailbox file (in mbox style), passing each message through a filtering function that may alter or skip the message.
The Python Standard Library package email is the modern Python approach for this kind of task. However, standard library modules mailbox and rfc822 can also supply the base functionality to implement this task:
import mailbox
def process_mailbox(mailboxname_in, mailboxname_out, filter_function):
    mbin = mailbox.PortableUnixMailbox(file(mailboxname_in,'r'))
    fout = file(mailboxname_out, 'w')
    for msg in mbin:
        if msg is None: break
        document = filter_function(msg, msg.fp.read( ))
        if document:
            # the body must end with an empty line
            assert document.endswith('\n\n')
            fout.write(msg.unixfrom)
            fout.writelines(msg.headers)
            # an empty line separates headers from body
            fout.write('\n')
            fout.write(document)
    fout.close( )
I often write lots of little scripts to filter my mailbox, so I wrote this recipe’s small module. I can import the module from each script and call the module’s function process_mailbox as needed. Python’s future direction is to perform email processing with the standard library package email, but lower-level modules, such as mailbox and rfc822, are still available in the Python Standard Library. They are sometimes easier to use than the rich, powerful, and very general functionality offered by package email.
The function you pass to process_mailbox as the third argument, filter_function, must take two arguments: msg, an rfc822 message object, and document, a string that is the message’s entire body, ending with two line-end characters (\n\n). filter_function can return False, meaning that this message must be skipped (i.e., not copied at all to the output), or else it must return a string terminated with \n\n that is written to the output as the message body. Normally, filter_function returns either False or the same document argument it was called with, but in some cases you may find it useful to write to the output file an altered version of the message’s body rather than the original message body.
Here is an example of a filter function that removes duplicate messages:
import sets
found_ids = sets.Set( )
def no_duplicates(msg, document):
    msg_id = msg.getheader('Message-ID')
    if msg_id in found_ids:
        return False
    found_ids.add(msg_id)
    return document
In Python 2.4, you could use the built-in set rather than sets.Set, but for a case as simple as this, it makes no real difference in performance (and the usage is exactly the same, anyway).
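Module rfc822 and class mailbox.PortableUnixMailbox were removed in Python 3, so the recipe’s code does not carry over unchanged. A rough Python 3 counterpart, sketched here under my own assumptions, uses mailbox.mbox and a simplified contract: filter_function takes just the message object and returns either a message to copy or None to skip it. The names make_no_duplicates and RAW are hypothetical, and the demonstration runs on a three-message mailbox written to a temporary directory.

```python
import mailbox
import os
import tempfile

def process_mailbox(path_in, path_out, filter_function):
    """Copy messages from one mbox file to another through filter_function."""
    mb_in = mailbox.mbox(path_in, create=False)
    mb_out = mailbox.mbox(path_out)
    try:
        for msg in mb_in:
            result = filter_function(msg)
            if result is not None:
                mb_out.add(result)
        mb_out.flush()
    finally:
        mb_out.close()
        mb_in.close()

def make_no_duplicates():
    """Return a filter that drops messages whose Message-ID was already seen."""
    seen = set()
    def no_duplicates(msg):
        msg_id = msg['Message-ID']
        if msg_id in seen:
            return None
        seen.add(msg_id)
        return msg
    return no_duplicates

RAW = (
    "From ann@example.com Thu Jan  1 00:00:00 2004\n"
    "Message-ID: <1@example.com>\n\nfirst\n\n"
    "From ann@example.com Thu Jan  1 00:00:01 2004\n"
    "Message-ID: <1@example.com>\n\nfirst again\n\n"
    "From bob@example.com Thu Jan  1 00:00:02 2004\n"
    "Message-ID: <2@example.com>\n\nsecond\n"
)
with tempfile.TemporaryDirectory() as tmp:
    path_in = os.path.join(tmp, 'in.mbox')
    path_out = os.path.join(tmp, 'out.mbox')
    with open(path_in, 'w') as f:
        f.write(RAW)
    process_mailbox(path_in, path_out, make_no_duplicates())
    out = mailbox.mbox(path_out)
    kept_ids = [m['Message-ID'] for m in out]
    out.close()
```

Wrapping the set of seen IDs in a closure, instead of using a module-level global as the recipe does, lets each run start with a fresh filter.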
Documentation about modules mailbox and rfc822, and package email, in the Library Reference and Python in a Nutshell.
Credit: Noah Spurrier
To help you configure an antispam system, you want a list of email addresses, commonly known as a whitelist, that you can trust won’t send you spam. The addresses to which you send email are undoubtedly good candidates for this whitelist.
Here is a script to output “To” addresses given a mailbox path:
#!/usr/bin/env python
""" Extract and print all 'To:' addresses from a mailbox """
import mailbox
def main(mailbox_path):
    addresses = { }
    mb = mailbox.PortableUnixMailbox(file(mailbox_path))
    for msg in mb:
        toaddr = msg.getaddr('To')[1]
        addresses[toaddr] = 1
    addresses = addresses.keys( )
    addresses.sort( )
    for address in addresses:
        print address
if __name__ == '__main__':
    import sys
    main(sys.argv[1])
In addition to bypassing spam filters, identifying addresses of people you’ve sent mail to may also help in other ways, such as flagging emails from them as higher priority, depending on your mail-reading habits and your mail reader’s capabilities. As long as your mail reader keeps mail you have sent in some kind of “Sent Items” mailbox in standard mailbox format, you can call this script with the path to the mailbox as its only argument, and the addresses to which you’ve sent mail will be emitted to standard output.
The script is simple because the Python Standard Library module mailbox does all the hard work. All the script needs to do is collect the set of email addresses as it loops through all messages, then emit them. While collecting, we keep addresses as a dictionary, since that’s much faster than keeping a list and checking each toaddr in order to append it only if it wasn’t already in the list. When we’re done collecting, we just extract the addresses from the dictionary as a list because we want to emit its items in sorted order. In Python 2.4, function main can be made even slightly more elegant, thanks to the new built-ins set and sorted:
def main(mailbox_path):
    addresses = set()
    mb = mailbox.PortableUnixMailbox(file(mailbox_path))
    for msg in mb:
        toaddr = msg.getaddr('To')[1]
        addresses.add(toaddr)
    for address in sorted(addresses):
        print address
If your mailbox is not in the Unix mailbox style supported by mailbox.PortableUnixMailbox, you may want to use other classes supplied by the Python Standard Library module mailbox. For example, if your mailbox is in Qmail maildir format, you can use the mailbox.Maildir class to read it.
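As a rough sketch of the maildir variation, here is how the same collection of “To” addresses might look with mailbox.Maildir. This sketch is written for Python 3, where mailbox.PortableUnixMailbox is no longer available, and it swaps the recipe’s getaddr call for email.utils.getaddresses; the function name to_addresses is an assumption made for illustration:

```python
import mailbox
from email.utils import getaddresses

def to_addresses(maildir_path):
    """Return the sorted, de-duplicated 'To' addresses found in a maildir."""
    addresses = set()
    # create=False: fail rather than silently create a missing maildir.
    mb = mailbox.Maildir(maildir_path, create=False)
    try:
        for msg in mb:
            # get_all returns every 'To' header; getaddresses splits them
            # into (real name, address) pairs.
            for _name, addr in getaddresses(msg.get_all('To', [])):
                if addr:
                    addresses.add(addr)
    finally:
        mb.close()
    return sorted(addresses)
```

Unlike the recipe’s script, this version also picks up every address of a message that has several recipients in its “To” header.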
Documentation of the standard library module mailbox in the Library Reference and Python in a Nutshell.
Credit: Marina Pianu, Peter Cogolo
Many of the mails you receive are duplicates. You need to block the duplicates with a fast, simple filter before they reach a more time-consuming step, such as an anti-spam filter, in your email pipeline.
Many mail systems, such as the popular procmail, and KDE’s KMail, enable you to control your mail-reception pipeline. Specifically, you can insert in the pipeline your filter programs, which get messages on standard input, may modify them, and emit them again on standard output. Here is one such filter, with the specific purpose of performing the task described in the Problem—blocking messages that are duplicates of other messages that you have received recently:
#!/usr/bin/python
import time, sys, os, email
now = time.time()
# get archive of previously-seen message-ids and times
kde_dir = os.path.expanduser('~/.kde')
if not os.path.isdir(kde_dir):
    os.mkdir(kde_dir)
arfile = os.path.join(kde_dir, 'duplicate_mails')
duplicates = {}
try:
    archive = open(arfile)
except IOError:
    pass
else:
    for line in archive:
        when, msgid = line[:-1].split(' ', 1)
        duplicates[msgid] = float(when)
    archive.close()
redo_archive = False
# suck message in from stdin and study it
msg = email.message_from_file(sys.stdin)
msgid = msg['Message-ID']
if msgid:
    if msgid in duplicates:
        # duplicate message: alter its subject
        subject = msg['Subject']
        if subject is None:
            msg['Subject'] = '**** DUP **** ' + msgid
        else:
            del msg['Subject']
            msg['Subject'] = '**** DUP **** ' + subject
    else:
        # non-duplicate message: redo the archive file
        redo_archive = True
        duplicates[msgid] = now
else:
    # invalid (missing message-id) message: alter its subject
    subject = msg['Subject']
    if subject is None:
        msg['Subject'] = '**** NID **** '
    else:
        del msg['Subject']
        msg['Subject'] = '**** NID **** ' + subject
# emit message back to stdout
print msg
if redo_archive:
    # redo archive file, keep only msgs from the last two hours
    keep_last = now - 2*60*60.0
    archive = file(arfile, 'w')
    for msgid, when in duplicates.iteritems():
        if when > keep_last:
            # one "time message-id" record per line
            archive.write('%9.2f %s\n' % (when, msgid))
    archive.close()
Whether it is because of spammers’ malice or incompetence, or because of hiccups at my ISP (Internet service provider), at times I get huge amounts of duplicate messages that can overload my mail-reception pipeline, particularly antispam filters. Fortunately, like many other mail systems, KDE’s KMail, the one I use, lets me insert my own filters in the mail reception pipeline. In particular, I can diagnose duplicate messages, alter their headers (I use “Subject” for clarity), and tell later stages in the filters’ pipeline to throw away messages with such subjects or to shunt them aside into a dedicated mailbox for later perusal, without passing them on to the antispam and other filters.
The email module from the Python Standard Library performs all the required parsing of the message and lets me access headers with dictionary-like indexing syntax. I need some “memory” of recently seen messages. Fortunately, I have noticed all duplicates happen within a few minutes of each other, so I don’t have to keep that memory for long: two hours are plenty. Therefore, I keep that memory in a simple text file, which records the time when a message was received and the message ID. I thought I might have to find a more advanced way to keep this kind of FIFO (first-in, first-out) archive, but I tried a simple approach first: a simple text file that is entirely rewritten whenever a new nonduplicate message arrives. This approach appears to perform quite adequately for my needs (at most a couple hundred messages an hour), even on my somewhat dated PC. “Do the simplest thing that could possibly work” strikes again!
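The text-file “memory” described above can be sketched in isolation as a pair of helper functions. This is only an illustration written for Python 3: the names load_archive and save_archive, and the now and keep_seconds parameters, are assumptions and do not appear in the recipe:

```python
import time

def load_archive(path):
    """Read 'timestamp message-id' lines into a {message-id: timestamp} dict."""
    seen = {}
    try:
        with open(path) as archive:
            for line in archive:
                when, msgid = line.rstrip('\n').split(' ', 1)
                seen[msgid] = float(when)
    except IOError:
        # No archive yet: start with an empty memory.
        pass
    return seen

def save_archive(path, seen, now=None, keep_seconds=2 * 60 * 60):
    """Rewrite the whole archive, dropping entries older than keep_seconds."""
    if now is None:
        now = time.time()
    cutoff = now - keep_seconds
    with open(path, 'w') as archive:
        for msgid, when in seen.items():
            if when > cutoff:
                archive.write('%.2f %s\n' % (when, msgid))
```

Rewriting the whole file on every new message is what keeps the format trivial: pruning old entries and appending the new one happen in the same pass.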
Documentation about package email and modules time, sys and os in the Library Reference and Python in a Nutshell.
Credit: Anand Pillai
The winsound module of the Python Standard Library makes this check really simple:
import winsound
try:
    winsound.PlaySound("*", winsound.SND_ALIAS)
except RuntimeError, e:
    print 'Sound system has problems,', e
else:
    print 'Sound system is OK'
The sound system might pass this test and still be unable to produce sound correctly, due to a variety of possible problems—starting from simple ones such as powered loudspeakers being turned off (there’s no sensible way you can check for that in your program!), all the way to extremely subtle and complicated ones. When sound is a problem in your applications, using this recipe at least you know whether you should be digging into a subtle issue of device driver configuration or start by checking whether the loudspeakers are on!
Documentation on the Python Standard Library winsound module.
Credit: Bill Bell
You want to register or unregister a DLL in Windows, just as it is normally done by regsvr32.exe, but you want to do it from Python, without requiring that executable to be present or bothering to find it.
All that Microsoft’s regsvr32.exe does is load a DLL and call its entries named DllRegisterServer or DllUnregisterServer. This behavior is very easy to replicate via Thomas Heller’s ctypes extension:
from ctypes import windll
dll = windll[r'C:\Path\To\Some.DLL']
result = dll.DllRegisterServer()
result = dll.DllUnregisterServer()
The result is of Windows type HRESULT, so, if you wish, ctypes can also implicitly check it for you, raising a ctypes.WindowsError exception when an error occurs; you just need to use ctypes.oledll instead of ctypes.windll. In other words, to have the result automatically checked and an exception raised in case of errors, instead of the previous script, use this one:
from ctypes import oledll
dll = oledll[r'C:\Path\To\Some.DLL']
dll.DllRegisterServer()
dll.DllUnregisterServer()
Thomas Heller’s ctypes enables your Python code to load DLLs on Windows (and similar dynamic/shared libraries on other platforms) and call functions from such libraries, and it manages to perform these tasks with a high degree of both power and elegance. On Windows, in particular, it offers even further “added value” through such mechanisms as the oledll object, which, besides loading DLLs and calling functions from them, also checks the returned HRESULT instances and raises appropriate exceptions when the HRESULT values indicate errors.
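To make the idea of “checking the HRESULT” concrete, here is a small pure-Python sketch of roughly the test involved: an HRESULT is a 32-bit value whose top bit signals failure. The helper name hresult_failed is hypothetical, and this is an illustration of the convention, not a copy of what ctypes does internally:

```python
# Well-known HRESULT values from the Windows SDK headers.
S_OK = 0x00000000      # success
S_FALSE = 0x00000001   # success, with a "false" answer
E_FAIL = 0x80004005    # unspecified failure

def hresult_failed(hr):
    """True when the severity bit (bit 31) of a 32-bit HRESULT is set.

    Accepts the value either as an unsigned integer or as the negative
    signed integer that a C 'long' return value would produce.
    """
    return ((hr & 0xFFFFFFFF) & 0x80000000) != 0
```

With windll you would have to apply a check like this to each result yourself; with oledll the failure surfaces as an exception instead.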
In this recipe, we’re using ctypes (either the windll or oledll objects from that module) specifically to avoid the need to use Microsoft’s regsvr32.exe to register or unregister DLLs that implement in-process COM servers for some CLSIDs. (A CLSID is a globally unique identifier that identifies a COM class object, and the abbreviation presumably stands for class identifier.) The cases in which you’ll use this specific recipe are only those in which you need to register or unregister such COM DLLs (whether they’re implemented in Python or otherwise makes no difference). Be aware, however, that the applicability of ctypes is far wider, as it extends to any case in which you wish your Python code to load and interact with a DLL (or, on platforms other than Windows, equivalent dynamically loaded libraries, such as .so files on Linux and .dylib files on Mac OS X).
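As a small illustration of that wider applicability, the following sketch loads the C runtime library on a POSIX system and calls its strlen function. It assumes a Unix-like platform (it is not expected to work on Windows as written) and uses the ctypes module as bundled with modern Python; the wrapper name c_strlen is hypothetical:

```python
import ctypes
import ctypes.util

# find_library may return None on some systems; on POSIX, CDLL(None)
# then falls back to the symbols already loaded into the process,
# which include the C runtime.
libc = ctypes.CDLL(ctypes.util.find_library('c'))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

def c_strlen(data):
    """Return the length of a bytes string as computed by C's strlen."""
    return libc.strlen(data)
```

Declaring argtypes and restype is optional for simple cases, but it lets ctypes reject wrongly typed arguments instead of passing garbage to the C function.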
The protocol that regsvr32.exe implements is well documented and very simple, so our own code can reimplement it in a jiffy. That’s much more practical than requiring regsvr32.exe to be installed on the machine on which we want to register or unregister the DLLs, not to mention finding where the EXE file might be to run it directly (via os.spawn or whatever) and also finding an effective way to detect errors and show them to the user.
ctypes is at http://sourceforge.net/projects/ctypes.
Credit: Daniel Kinnaer
You need to check which tasks Windows is set to automatically run at login and possibly change this set of tasks.
When administering Windows machines, it’s crucial to keep track of the tasks each machine runs at login. Like so many Windows tasks, this requires working with the registry, and standard Python module _winreg enables this:
import _winreg as wr
aReg = wr.ConnectRegistry(None, wr.HKEY_LOCAL_MACHINE)
try:
    targ = r'SOFTWARE\Microsoft\Windows\CurrentVersion\Run'
    print "*** Reading from", targ, "***"
    aKey = wr.OpenKey(aReg, targ)
    try:
        for i in xrange(1024):
            try:
                n, v, t = wr.EnumValue(aKey, i)
                print i, n, v, t
            except EnvironmentError:
                print "You have", i, "tasks starting at logon"
                break
    finally:
        wr.CloseKey(aKey)
    print "*** Writing to", targ, "***"
    aKey = wr.OpenKey(aReg, targ, 0, wr.KEY_WRITE)
    try:
        try:
            wr.SetValueEx(aKey, "MyNewKey", 0, wr.REG_SZ,
                          r"c:\winnt\explorer.exe")
        except EnvironmentError:
            print "Encountered problems writing into the Registry..."
            raise
    finally:
        wr.CloseKey(aKey)
finally:
    wr.CloseKey(aReg)
The Windows registry holds a wealth of crucial system administration data, and the Python standard module _winreg makes it feasible to read and alter data held in the registry. One of the items held in the Windows registry is a list of tasks to be run at login (in addition to other lists held elsewhere, such as the user-specific Startup folder that this recipe does not deal with).
This recipe shows how to examine the registry list of login tasks, and how to add a task to the list so it is run at login. (This recipe assumes you have Explorer installed at the specific location c:\winnt. If you have it installed elsewhere, edit the recipe accordingly.)
If you want to remove the specific key added by this recipe, you can use the following simple script:
import _winreg as wr
aReg = wr.ConnectRegistry(None, wr.HKEY_LOCAL_MACHINE)
targ = r'SOFTWARE\Microsoft\Windows\CurrentVersion\Run'
aKey = wr.OpenKey(aReg, targ, 0, wr.KEY_WRITE)
wr.DeleteValue(aKey, "MyNewKey")
wr.CloseKey(aKey)
wr.CloseKey(aReg)
The try/finally constructs used in the recipe are far more robust than the simple sequence of function calls used in this latest snippet, since they ensure that everything is closed correctly regardless of whether the intervening calls succeed or fail. This care and prudence are strongly advisable for scripts that are meant to be run in production, particularly for system-administration scripts that must generally run with administrator privileges. Such scripts therefore might harm a system’s setup if they don’t clean up after themselves properly. However, you can omit the try/finally when you know the calls will succeed or don’t care what happens if they fail. In this case, if you have successfully added a task with the recipe’s script, the calls in this simple cleanup script should work just fine.
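The guarantee that nested try/finally blocks provide can be seen without touching a real registry. The following sketch uses a stand-in handle class that merely logs when it is closed; FakeHandle, write_value, and the fail flag are all hypothetical names invented for this illustration:

```python
class FakeHandle(object):
    """Stand-in for a registry handle; it only records that it was closed."""
    def __init__(self, name, log):
        self.name = name
        self.log = log
    def close(self):
        self.log.append('closed ' + self.name)

def write_value(log, fail=False):
    """Mimic the recipe's open/write/close sequence with nested try/finally."""
    reg = FakeHandle('registry', log)
    try:
        key = FakeHandle('key', log)
        try:
            if fail:
                raise EnvironmentError('simulated write failure')
            log.append('wrote value')
        finally:
            key.close()    # runs whether or not the write succeeded
    finally:
        reg.close()        # runs last, even if the exception propagates
```

Calling write_value with fail=True still closes both handles, innermost first, before the exception reaches the caller.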
Documentation for the standard module _winreg in the Library Reference; Windows API documentation available from Microsoft (http://msdn.microsoft.com); information on what is where in the registry tends to be spread among many sources, but for some useful collections of such information, see http://www.winguides.com/registry and http://www.activewin.com/tips/reg/index.shtml.
Credit: John Nielsen
PyWin32’s win32net module makes this task very easy:
import win32net
import win32netcon
shinfo = {}
shinfo['netname'] = 'python test'
shinfo['type'] = win32netcon.STYPE_DISKTREE
shinfo['remark'] = 'data files'
shinfo['permissions'] = 0
shinfo['max_uses'] = -1
shinfo['current_uses'] = 0
shinfo['path'] = r'c:\my_data'
shinfo['passwd'] = ''
server = 'servername'
win32net.NetShareAdd(server, 2, shinfo)
While the task of sharing a folder is indeed fairly easy to accomplish, finding the information on how you do so isn’t. All I could find in the win32net documentation was that you needed to pass a dictionary holding the share’s data “in the format of SHARE_INFO_*.” I finally managed to integrate this tidbit with the details from the Windows SDK (http://msdn.microsoft.com) and produce the information in this recipe. One detail that took me some effort to discover is that the constants you need to use as the value for the 'type' entry are “hidden away” in the win32netcon module.
PyWin32 docs at http://sourceforge.net/projects/pywin32/; Microsoft’s MSDN site, http://msdn.microsoft.com.
Credit: Bill Bell, Graham Fawcett
Instantiating Internet Explorer to access its interfaces via COM is easy, but you want to connect to an already running instance.
The simplest approach is to rely on Internet Explorer’s CLSID:
from win32com.client import Dispatch
ShellWindowsCLSID = '{9BA05972-F6A8-11CF-A442-00A0C90A8F39}'
ShellWindows = Dispatch(ShellWindowsCLSID)
print '%d instances of IE' % len(ShellWindows)
print
for shellwindow in ShellWindows:
    print shellwindow
    print shellwindow.LocationName
    print shellwindow.LocationURL
    print
Dispatching on the CLSID provides a sequence of all the running instances of the application with that class. Of course, there could be none, one, or more. If you’re interested in a specific instance, you may be able to identify it by checking, for example, for its properties LocationName and LocationURL.
You’ll see that Windows Explorer and Internet Explorer have the same CLSID—they’re basically the same application. If you need to distinguish between them, you can try adding at the start of your script the statement:
from win32gui import GetClassName
and then checking each shellwindow in the loop with:
if GetClassName(shellwindow.HWND) == 'IEFrame':...
'IEFrame' is supposed to result from this call (according to the docs) for all Internet Explorer instances and those only. However, I have not found this check to be wholly reliable across all versions and patch levels of Windows and Internet Explorer, so take this approach as just one possibility (which is why I haven’t added this check to the recipe’s official “Solution”).
This recipe does not let you receive IE events. The most important event is probably DocumentComplete. You can roughly substitute checks on the Busy property for the inability to wait for that event, but remember not to poll too frequently (for that or any other property) or you may slow down your PC excessively. Something like:
while shellwindow.Busy:
    time.sleep(0.2)
Sleeping 0.2 seconds between checks may be a reasonable compromise between responding promptly and not loading your PC too heavily with a busy-waiting loop.
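The polling loop above waits forever if the window never stops being busy. One possible refinement, sketched here under the assumption that giving up after a while is acceptable, wraps the same idea in a helper with a timeout. The name wait_while and its parameters are hypothetical; is_busy is any zero-argument callable, such as lambda: shellwindow.Busy:

```python
import time

def wait_while(is_busy, interval=0.2, timeout=30.0):
    """Poll is_busy() every `interval` seconds until it returns false.

    Returns True if the busy condition cleared, or False if `timeout`
    seconds went by first.
    """
    deadline = time.time() + timeout
    while is_busy():
        if time.time() >= deadline:
            return False
        # Sleep between checks so that polling does not hog the CPU.
        time.sleep(interval)
    return True
```

Passing a callable rather than the COM object itself keeps the helper independent of Internet Explorer, so it can be tried out without any COM machinery.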
PyWin32 docs at http://sourceforge.net/projects/pywin32/; Microsoft’s MSDN site, http://msdn.microsoft.com.
Credit: Kevin Altis
Your Microsoft Outlook Contacts house a wealth of useful information, and you need to extract some of it in text form.
Like many other problems of system administration on Windows, this one is best approached by using COM. The most popular way to interface Python to COM is to use the win32com package, which is part of Mark Hammond’s pywin32 extension package:
import sys
from win32com.client import gencache, constants
DEBUG = False
class MSOutlook(object):
    def __init__(self):
        try:
            self.oOutlookApp = gencache.EnsureDispatch("Outlook.Application")
            self.outlookFound = True
        except:
            print "MSOutlook: unable to load Outlook"
            self.outlookFound = False
        self.records = []
    def loadContacts(self, keys=None):
        if not self.outlookFound:
            return
        onMAPI = self.oOutlookApp.GetNamespace("MAPI")
        ofContacts = onMAPI.GetDefaultFolder(constants.olFolderContacts)
        if DEBUG:
            print "number of contacts:", len(ofContacts.Items)
        for oc in range(len(ofContacts.Items)):
            contact = ofContacts.Items.Item(oc + 1)
            if contact.Class == constants.olContact:
                if keys is None:
                    # no keys were specified, so build up a list of all keys
                    # that belong to some types we know we can deal with
                    good_types = int, str, unicode
                    keys = [key for key in contact._prop_map_get_
                            if isinstance(getattr(contact, key), good_types)]
                    if DEBUG:
                        print "Fields == == == == == == == == == == == =="
                        keys.sort()
                        for key in keys:
                            print key
                record = {}
                for key in keys:
                    record[key] = getattr(contact, key)
                self.records.append(record)
                if DEBUG:
                    print oc, contact.FullName
if __name__ == '__main__':
    if '-d' in sys.argv:
        DEBUG = True
    if DEBUG:
        print "attempting to load Outlook"
    oOutlook = MSOutlook()
    if not oOutlook.outlookFound:
        print "Outlook not found"
        sys.exit(1)
    fields = ['FullName', 'CompanyName',
              'MailingAddressStreet', 'MailingAddressCity',
              'MailingAddressState', 'MailingAddressPostalCode',
              'HomeTelephoneNumber', 'BusinessTelephoneNumber',
              'MobileTelephoneNumber', 'Email1Address', 'Body',
             ]
    if DEBUG:
        import time
        print "loading records..."
        startTime = time.time()
    # to get all fields just call oOutlook.loadContacts()
    # but getting a specific set of fields is much faster
    oOutlook.loadContacts(fields)
    if DEBUG:
        print "loading took %f seconds" % (time.time() - startTime)
    print "Number of contacts: %d" % len(oOutlook.records)
    print "Contact: %s" % oOutlook.records[0]['FullName']
    print "Body: %s" % oOutlook.records[0]['Body']
This recipe’s code could use more error-checking, and you could get it by using nested try/except blocks, but I didn’t want to obscure the code’s fundamental simplicity in this recipe. This recipe should work with different versions of Outlook, but I’ve tested it only with Outlook 2000. If you have applied the Outlook security patches then you will be prompted with a dialog requesting access to Outlook for 1-10 minutes from an external program, which in this case is Python.
The code has already been optimized in two important ways. The first is ensuring that the Python COM wrappers for Outlook have been generated, which is guaranteed by calling gencache.EnsureDispatch. The second is that, in the loop that reads the contacts, the Contact reference is obtained only once and then kept in a local variable contact to avoid repeated references. This simple but crucial optimization is the role of the statement:
contact = ofContacts.Items.Item(oc + 1)
Both of these optimizations have a dramatic impact on total import time, and both are important enough to keep in mind. Specifically, the EnsureDispatch idea is important for most uses of COM in Python; the concept of getting an object reference, once, into a local variable (rather than repeating indexing, calls, and attribute accesses) is even more important and applies to every use of Python.
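The local-variable idea can be demonstrated without Outlook. In the following sketch, a hypothetical CountingFolder class stands in for a COM folder object and counts how often its Items property is fetched; with real COM objects each such fetch can be costly. All names here are invented for the illustration:

```python
class CountingFolder(object):
    """Stand-in for a COM folder: counts every access to its Items property."""
    def __init__(self, names):
        self._names = list(names)
        self.lookups = 0
    @property
    def Items(self):
        self.lookups += 1
        return self._names

def names_slow(folder):
    # Looks up folder.Items once for len() and again on every iteration.
    result = []
    for i in range(len(folder.Items)):
        result.append(folder.Items[i].upper())
    return result

def names_fast(folder):
    # Fetches the reference once and keeps it in a local variable.
    items = folder.Items
    return [name.upper() for name in items]
```

For a folder holding three names, names_slow performs four lookups of Items while names_fast performs one, and both return the same list.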
Simple variations of this script can be applied to other elements of the Outlook object model such as the Calendar and Tasks. You’ll want to look at the Python wrappers generated for Outlook in the C:\Python23\Lib\site-packages\win32com\gen_py directory. I also suggest that you look at the Outlook object model documentation on MSDN and/or pick up a book on the subject.
PyWin32 docs at http://sourceforge.net/projects/pywin32/; Microsoft’s MSDN site, http://msdn.microsoft.com.
Credit: Brian Quinlan
You want to retrieve detailed information about a Mac OS X system. You want either complete information about the system or information about particular keys in the system-information database.
Mac OS X’s system_profiler command can provide system information as an XML stream that we can parse and examine:
#!/usr/bin/env python
from xml import dom
from xml.dom.xmlbuilder import DOMInputSource, DOMBuilder
import datetime, time, os
def group(seq, n):
    """group([0, 3, 4, 10, 2, 3, 1], 3) => [[0, 3, 4], [10, 2, 3]]
       Group a sequence into n-subseqs, discarding incomplete subseqs.
    """
    return [ seq[i:i+n] for i in xrange(0, len(seq)-n+1, n) ]
def remove_whitespace_nodes(node):
    """Removes all of the whitespace-only text descendants of a DOM node."""
    remove_list = []
    for child in node.childNodes:
        if child.nodeType == dom.Node.TEXT_NODE and not child.data.strip():
            remove_list.append(child)
        elif child.hasChildNodes():
            remove_whitespace_nodes(child)
    for child in remove_list:
        node.removeChild(child)
        child.unlink()
class POpenInputSource(DOMInputSource):
    "Use stdout from an external program as a DOMInputSource"
    def __init__(self, command):
        super(POpenInputSource, self).__init__()
        self.byteStream = os.popen(command)
class OSXSystemProfiler(object):
    "Provide information from the Mac OS X System Profiler"
    def __init__(self, detail=-1):
        """detail can range from -2 to +1. Larger numbers return more info.
           Beware of +1, can take many minutes to get all info!"""
        b = DOMBuilder()
        self.document = b.parse(
            POpenInputSource('system_profiler -xml -detailLevel %d' % detail))
        remove_whitespace_nodes(self.document)
    def _content(self, node):
        "Get the text node content of an element, or an empty string"
        if node.firstChild:
            return node.firstChild.nodeValue
        else:
            return ''
    def _convert_value_node(self, node):
        """Convert a 'value' node (i.e. anything but 'key') into a Python
           data structure"""
        if node.tagName == 'string':
            return self._content(node)
        elif node.tagName == 'integer':
            return int(self._content(node))
        elif node.tagName == 'real':
            return float(self._content(node))
        elif node.tagName == 'date':
            # <date>2004-07-05T13:29:29Z</date>
            # the first six fields are year, month, day, hour, minute, second
            return datetime.datetime(
                *time.strptime(self._content(node), '%Y-%m-%dT%H:%M:%SZ')[:6])
        elif node.tagName == 'array':
            return [self._convert_value_node(n) for n in node.childNodes]
        elif node.tagName == 'dict':
            return dict([(self._content(n), self._convert_value_node(m))
                         for n, m in group(node.childNodes, 2)])
        else:
            raise ValueError, 'Unknown tag %r' % node.tagName
    def __getitem__(self, key):
        from xml import xpath
        # pyxml's xpath does not support /element1[...]/element2...
        nodes = xpath.Evaluate('//dict[key=%r]' % key, self.document)
        results = []
        for node in nodes:
            v = self._convert_value_node(node)[key]
            if isinstance(v, dict) and '_order' in v:
                # this is just information for display
                pass
            else:
                results.append(v)
        return results
    def all(self):
        """Return the complete information from the system profiler
           as a Python data structure"""
        return self._convert_value_node(
            self.document.documentElement.firstChild)
def main():
    from optparse import OptionParser
    from pprint import pprint
    info = OSXSystemProfiler()
    parser = OptionParser()
    parser.add_option("-f", "--field", action="store", dest="field",
                      help="display the value of the specified field")
    options, args = parser.parse_args()
    if args:
        parser.error("no arguments are allowed")
    if options.field is not None:
        pprint(info[options.field])
    else:
        # print some keys known to exist in only one important dict
        for k in ['cpu_type', 'current_processor_speed', 'l2_cache_size',
                  'physical_memory', 'user_name', 'os_version', 'ip_address']:
            print '%s: %s' % (k, info[k][0])
if __name__ == '__main__':
    main()
Mac OS X puts at your disposal a wealth of information about your system through the system_profiler application. This recipe shows how to access that information from your Python code. First, you have to instantiate class OSXSystemProfiler, for example, via a statement such as info = OSXSystemProfiler(); once you have done that, you can obtain all available information by calling info.all(), or information for one specific key by indexing info[thekey]. The main function in the recipe, which executes when you run this module as a main script, emits information to standard output: either a specific key, requested by using switch -f when invoking the script, or, by default, a small set of keys known to be generally useful.
For example, when run on the old Apple iBook belonging to one of this book’s editors (no prize for guessing which one), the script in this recipe emits the following output:
cpu_type: PowerPC G4 (3.3)
current_processor_speed: 800 MHz
l2_cache_size: 256 KB
physical_memory: 640 MB
user_name: Alex (alex)
os_version: Mac OS X 10.3.6 (7R28)
ip_address: [u'192.168.0.190']
system_profiler returns XML data in pinfo format, so this recipe implements a partial pinfo parser, using Python’s standard library XML-parsing facilities, and the xpath implementation from the PyXML extensions. More information about Python’s facilities that help you deal with XML can be found in Chapter 12.
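The heart of the recipe’s parser is the conversion of alternating key and value elements into Python data. The following sketch shows that conversion on a small, made-up XML fragment. It is written for Python 3 and deliberately swaps the recipe’s DOM and PyXML xpath machinery for the standard library’s xml.etree.ElementTree; the sample data and the function name convert are assumptions made for illustration:

```python
import xml.etree.ElementTree as ET

def convert(node):
    """Convert a value element into a Python data structure."""
    tag = node.tag
    text = node.text or ''
    if tag == 'string':
        return text
    elif tag == 'integer':
        return int(text)
    elif tag == 'real':
        return float(text)
    elif tag == 'array':
        return [convert(child) for child in node]
    elif tag == 'dict':
        children = list(node)
        # Children alternate <key>, value, <key>, value, ... so pair
        # every even-indexed child with the odd-indexed one after it.
        return dict((k.text or '', convert(v))
                    for k, v in zip(children[0::2], children[1::2]))
    raise ValueError('Unknown tag %r' % tag)

# Invented sample data in the same key/value layout.
SAMPLE = """<dict>
  <key>cpu_type</key><string>PowerPC G4</string>
  <key>cores</key><integer>1</integer>
  <key>ip_address</key><array><string>192.168.0.190</string></array>
</dict>"""

info = convert(ET.fromstring(SAMPLE))
```

ElementTree keeps whitespace in the text and tail attributes of elements rather than in separate child nodes, so this variant needs no counterpart to the recipe’s remove_whitespace_nodes function.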
Documentation of the standard Python library support for XML in the Library Reference and Python in a Nutshell; PyXML docs at http://pyxml.sourceforge.net/; Mac OS X system_profiler docs at http://developer.apple.com/documentation/Darwin/Reference/ManPages/man8/system_profiler.8.html; Chapter 12.