Describing a directory. There are several solutions to this exercise, naturally. One simple solution is:
import os, sys, stat

def describedir(start):
    def describedir_helper(arg, dirname, files):
        """ Helper function for describing directories """
        print "Directory %s has files:" % dirname
        for file in files:
            # find the full path to the file (directory + filename)
            fullname = os.path.join(dirname, file)
            if os.path.isdir(fullname):
                # if it's a directory, say so; no need to find the size
                print ' '+ file + ' (subdir)'
            else:
                # find out the size, and print the info.
                size = os.stat(fullname)[stat.ST_SIZE]
                print ' '+file+' size=' + `size`
    # Start the 'walk'.
    os.path.walk(start, describedir_helper, None)
which uses the walk function in the os.path module, and works just fine:
>>> import describedir
>>> describedir.describedir('testdir')
Directory testdir has files:
 describedir.py size=939
 subdir1 (subdir)
 subdir2 (subdir)
Directory testdir\subdir1 has files:
 makezeros.py size=125
 subdir3 (subdir)
Directory testdir\subdir1\subdir3 has files:
Directory testdir\subdir2 has files:
Note that you could have found the size of the files by doing len(open(fullname, 'rb').read()), but this works only when you have read access to all the files and is quite inefficient. The stat call in the os module returns all kinds of useful information in a tuple, and the stat module defines some names that make it unnecessary to remember the order of the elements in that tuple. See the Library Reference for details.
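For readers on a current interpreter, the same idea can be sketched in Python 3, where os.path.walk no longer exists and os.walk takes its place. This is only a sketch under that assumption, not the solution above: the name describe_tree is hypothetical, and the function returns its lines instead of printing them so that the result is easy to inspect.

```python
import os

def describe_tree(start):
    """Return a list of description lines for start and its subdirectories."""
    lines = []
    # os.walk yields (directory, subdirectory names, file names), top-down
    for dirname, subdirs, files in os.walk(start):
        lines.append("Directory %s has files:" % dirname)
        for name in sorted(subdirs):
            # a directory: say so, no need to find the size
            lines.append(" %s (subdir)" % name)
        for name in sorted(files):
            fullname = os.path.join(dirname, name)
            # st_size is the named attribute matching the stat.ST_SIZE index
            size = os.stat(fullname).st_size
            lines.append(" %s size=%d" % (name, size))
    return lines
```

Printing "\n".join(describe_tree('testdir')) produces a listing in the same spirit as the one above, although os.walk lists subdirectories and files separately rather than interleaved.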
Modifying the prompt. The key to this exercise is to remember that the ps1 and ps2 attributes of the sys module can be anything, including a class instance with a __repr__ or __str__ method. For example:
import sys, os

class MyPrompt:
    def __init__(self, subprompt='>>> '):
        self.lineno = 0
        self.subprompt = subprompt
    def __repr__(self):
        self.lineno = self.lineno + 1
        return os.getcwd()+'|%d'%(self.lineno)+self.subprompt

sys.ps1 = MyPrompt()
sys.ps2 = MyPrompt('... ')
This code works as shown (use the -i option of the Python interpreter to make sure your program starts right away):
h:\David\book>python -i modifyprompt.py
h:\David\book|1>>> x = 3
h:\David\book|2>>> y = 3
h:\David\book|3>>> def foo():
h:\David\book|3...     x = 3     # the secondary prompt is supported
h:\David\book|3...
h:\David\book|4>>> import os
h:\David\book|5>>> os.chdir('..')
h:\David|6>>>                    # note the prompt changed!
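A Python 3 rendering of the same trick might look like the sketch below. It assumes a current interpreter, which calls str() on sys.ps1 and sys.ps2 whenever they are not plain strings; the class name CountingPrompt is hypothetical and not part of the solution above.

```python
import os
import sys

class CountingPrompt:
    """Prompt object showing the current directory and a running count."""
    def __init__(self, subprompt=">>> "):
        self.lineno = 0
        self.subprompt = subprompt

    def __str__(self):
        # The interactive interpreter calls str() on the prompt object
        # each time it is about to read a new command
        self.lineno += 1
        return "%s|%d%s" % (os.getcwd(), self.lineno, self.subprompt)

sys.ps1 = CountingPrompt()
sys.ps2 = CountingPrompt("... ")
```

In this sketch the two prompts are separate instances, so each keeps its own count. Sharing a single counter between them would take a little extra bookkeeping.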
Avoiding regular expressions. This program is long and tedious, but not especially complicated. See if you can understand how it works. Whether this is easier for you than regular expressions depends on many factors, such as your familiarity with regular expressions and your comfort with the functions in the string module. Use whichever type of programming works for you.
import string

file = open('pepper.txt')
text = file.read()
paragraphs = string.split(text, '\n\n')

def find_indices_for(big, small):
    indices = []
    cum = 0
    while 1:
        index = string.find(big, small)
        if index == -1:
            return indices
        indices.append(index+cum)
        big = big[index+len(small):]
        cum = cum + index + len(small)

def fix_paragraphs_with_word(paragraphs, word):
    lenword = len(word)
    for par_no in range(len(paragraphs)):
        p = paragraphs[par_no]
        wordpositions = find_indices_for(p, word)
        # work from the last match backward, so that a replacement
        # doesn't shift the positions still to be examined
        wordpositions.reverse()
        for start in wordpositions:
            # look for 'pepper' ahead of this occurrence of the word
            indexpepper = string.find(p, 'pepper', start+lenword)
            if indexpepper == -1:
                continue
            if string.strip(p[start+lenword:indexpepper]) != '':
                # something other than whitespace in between!
                continue
            where = indexpepper+len('pepper')
            if p[where:where+len('corn')] == 'corn':
                # it's immediately followed by 'corn'!
                continue
            if string.find(p, 'salad', where) == -1:
                # it's not followed by 'salad'
                continue
            # Finally! we get to do a change!
            p = p[:start] + 'bell' + p[start+lenword:]
        paragraphs[par_no] = p    # change mutable argument!

fix_paragraphs_with_word(paragraphs, 'red')
fix_paragraphs_with_word(paragraphs, 'green')

for paragraph in paragraphs:
    print paragraph+'\n\n'
We won’t repeat the output here; it’s the same as that of the regular expression solution.
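The heart of the program is the search for every occurrence of a word. As a rough Python 3 sketch of that one step, the optional start argument of str.find can replace the slicing and running offset used above. The function name is reused purely for illustration, and the guard against an empty search string is an addition.

```python
def find_indices_for(big, small):
    """Return the start index of every non-overlapping match of small in big."""
    if not small:
        # an empty pattern would match everywhere and never advance
        raise ValueError("search string must not be empty")
    indices = []
    start = 0
    while True:
        # resume the search just past the previous match
        index = big.find(small, start)
        if index == -1:
            return indices
        indices.append(index)
        start = index + len(small)
```

For instance, find_indices_for("red pepper, red salad", "red") returns [0, 12].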
Wrapping a text file with a class. This one is surprisingly easy, if you understand classes and the split function in the string module. The following is a version that has one little twist over and beyond what we asked for:
import string

class FileStrings:
    def __init__(self, filename=None, data=None):
        if data is None:
            self.data = open(filename).read()
        else:
            self.data = data
        self.paragraphs = string.split(self.data, '\n\n')
        self.lines = string.split(self.data, '\n')
        self.words = string.split(self.data)
    def __repr__(self):
        return self.data
    def paragraph(self, index):
        return FileStrings(data=self.paragraphs[index])
    def line(self, index):
        return FileStrings(data=self.lines[index])
    def word(self, index):
        return self.words[index]
This solution, when applied to the file pepper.txt, gives:
>>> from FileStrings import FileStrings
>>> bigtext = FileStrings('pepper.txt')
>>> print bigtext.paragraph(0)
This is a paragraph that mentions bell peppers multiple times. For
one, here is a red Pepper and dried tomato salad recipe. I don't like
to use green peppers in my salads as much because they have a harsher
flavor.
>>> print bigtext.line(0)
This is a paragraph that mentions bell peppers multiple times. For
>>> print bigtext.line(-4)
aren't peppers, they're chilies, but would you rather have a good cook
>>> print bigtext.word(-4)
botanist
How does it work? The constructor simply reads the whole file into a big string (the instance attribute data) and then splits it according to the various criteria, keeping the results of the splits in instance attributes that are lists of strings. When returning from one of the accessor methods, the data itself is wrapped in a FileStrings object. This isn’t required by the assignment, but it’s nice because it means you can chain the operations, so that to find out what the last word of the third line of the third paragraph is, you can just write:
>>> print bigtext.paragraph(2).line(2).word(-1)
cook
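To see the chaining at work without pepper.txt at hand, here is a small Python 3 sketch of the same design. The class name TextStrings and the sample text are made up for illustration, and the sketch takes its data directly as a string instead of reading a file.

```python
class TextStrings:
    """Wrap a string and expose its paragraphs, lines, and words."""
    def __init__(self, data):
        self.data = data
        self.paragraphs = data.split("\n\n")   # blank line between paragraphs
        self.lines = data.split("\n")
        self.words = data.split()              # any run of whitespace

    def __str__(self):
        return self.data

    def paragraph(self, index):
        # wrap the result so that further accessors can be chained
        return TextStrings(self.paragraphs[index])

    def line(self, index):
        return TextStrings(self.lines[index])

    def word(self, index):
        # a word is the end of the chain, so it stays a plain string
        return self.words[index]

# hypothetical sample text: two paragraphs of two lines each
sample = "one two\nthree four\n\nfive six\nseven eight"
text = TextStrings(sample)
last = text.paragraph(1).line(0).word(-1)
```

Here last is the string 'six': the last word of the first line of the second paragraph.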