11. Documenting Python Code

Chapter 11. Documenting Python Code

This chapter concludes Part III with a look at techniques and tools used for documenting Python code. Although Python code is designed to be readable in general, a few well-placed human-readable comments can do much to help others understand the workings of your programs. To support comments, Python includes both syntax and tools to make documentation easier. Although this is something of a tools-related concept, the topic is presented here, partly because it involves Python’s syntax model, and partly as a resource for readers struggling to understand Python’s toolset. As usual, this chapter ends with pitfalls and exercises.

The Python Documentation Interlude

By this point in the book you’re probably starting to realize that Python comes with an awful lot of prebuilt functionality: built-in functions, exceptions, predefined object attributes, standard library modules, and more. Moreover, we’ve really only scratched the surface of each of these categories.

One of the first questions that bewildered beginners often ask is: how do I find information on all the built-in tools? This section provides hints on the various documentation sources available in Python. It also presents documentation strings and the PyDoc system that makes use of them. These topics are somewhat peripheral to the core language itself, but become essential knowledge as soon as your code reaches the level of the examples and exercises in this chapter.

Documentation Sources

As summarized in Table 11-1, there are a variety of places to look for information in Python, with generally increasing verbosity. Since documentation is such a crucial tool in practical programming, let’s look at each of these categories.

Table 11-1. Python documentation sources

Form	Role
`#` comments	In-file documentation
The dir function	Lists of attributes available on objects
Docstrings: __doc__	In-file documentation attached to objects
PyDoc: The help function	Interactive help for objects
PyDoc: HTML reports	Module documentation in a browser
Standard manual set	Official language and library descriptions
Web resources	Online tutorial, examples, and so on
Published books	Commercially-available reference texts

# Comments

Hash-mark comments are the most basic way to document your code. All the text following a # (that is not inside a string literal) is simply ignored by Python. Because of that, this provides a place for you to write and read words meaningful to programmers. Such comments are only accessible in your source files; to code comments that are more widely available, use docstrings.
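As a small illustration of the rule about string literals, the sketch below shows one # that is part of a string and one that starts a real comment. The variable names are ours, chosen only for this example, and the code assumes a Python 3 interpreter, where print is a function rather than the statement used in this chapter's listings.

```python
# A hash mark inside a string literal is ordinary data, not a comment.
text = "call 555 # 1234"   # this trailing note, however, is ignored by Python

hash_count = text.count("#")   # counts the one hash stored in the string
print(text)
print(hash_count)
```

Only the text after the second # on the assignment line is dropped; the # inside the quotes survives as part of the string.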

The dir Function

The built-in dir function is an easy way to grab a list that shows all the attributes available inside an object (i.e., its methods and simple data items). It can be called with any object that has attributes. For example, to find out what’s available in the standard library’s sys module, import it and pass it to dir:

>>> import sys
>>> dir(sys)
['__displayhook__', '__doc__', '__excepthook__', '__name__', 
'__stderr__', '__stdin__', '__stdout__', '_getframe', 'argv',
'builtin_module_names', 'byteorder', 'copyright', 'displayhook', 'dllhandle', 
'exc_info', 'exc_type', 'excepthook', 
...more names omitted...]

Only some of the many names are displayed; run these statements on your machine to see the full list. To find out what attributes are provided in built-in object types, run dir on a literal of that type. For example, to see list and string attributes, you can pass empty objects:

>>> dir([  ])
['__add__', '__class__', ...more...
'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove',
'reverse', 'sort']

>>> dir('')
['__add__', '__class__', ...more...
'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 
'find', 'index', 'isalnum', 'isalpha', 'isdigit', 
'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 
...more names omitted...]

dir results for built-in types include a set of attributes that are related to the implementation of the type (technically, operator overloading methods); they all begin and end with double underscores to make them distinct, and can be safely ignored at this point in the book, so they are not shown here.

Incidentally, you can achieve the same effect by passing a type name to dir instead of a literal:

>>> dir(str) == dir('')     # Same result as prior example
1
>>> dir(list) == dir([  ])
1

This works because names like str and list, which were once type-converter functions, are now the names of types; calling them invokes their constructors to generate instances of those types. We’ll say more about constructors and operator overloading methods when we meet classes in Part VI.
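The point about type names can be tried directly. The following is a minimal sketch, assuming a Python 3 interpreter (where the comparisons display True rather than the 1 shown in the listing above); the variable names are ours.

```python
# Type names double as constructors: calling one builds an instance.
made_str = str(42)          # converts the integer to a string
made_list = list('abc')     # builds a list from the characters

# dir reports the same attribute names for a type and for one of its instances.
same_for_str = dir(str) == dir('')
same_for_list = dir(list) == dir([])

print(made_str, made_list)
print(same_for_str, same_for_list)
```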

The dir function serves as a sort of memory-jogger—it provides a list of attribute names, but does not tell you anything about what those names mean. For such extra information, we need to move on to the next topic.
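Because dir returns an ordinary list of strings, it is easy to trim it down to the names you are likely to care about. The sketch below filters out the underscore-prefixed names mentioned earlier; public_names is a hypothetical helper of our own, not something Python provides, and the code assumes Python 3.

```python
def public_names(obj):
    """Return the attribute names of obj that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith('_')]

list_names = public_names([])
print(list_names)
```

The result still says nothing about what each name does, which is the gap that docstrings fill.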

Docstrings: __doc__

Besides # comments, Python supports documentation that is retained at runtime for inspection, and automatically attached to objects. Syntactically, such comments are coded as strings at the top of module files, and the top of both function and class statements, before any other executable code. Python automatically stuffs the string, known as a docstring, into the __doc__ attribute of the corresponding object.

User-defined docstrings

For example, consider the following file, docstrings.py. Its docstrings appear at the beginning of the file, and at the start of a function and class within it. Here, we use triple-quoted block strings for multiline comments in the file and function, but any sort of string will work. We haven’t studied the def or class statements yet, so ignore everything about them, except the strings at their tops:

"""
Module documentation
Words Go Here
"""

spam = 40

def square(x):
    """
    function documentation
    can we have your liver then?
    """
    return x ** 2

class employee:
    "class documentation"
    pass

print square(4)
print square.__doc__

The whole point of this documentation protocol is that your comments are retained for inspection in __doc__ attributes, after the file is imported:

>>> import docstrings
16

    function documentation
    can we have your liver then?

>>> print docstrings.__doc__

Module documentation
Words Go Here

>>> print docstrings.square.__doc__

    function documentation
    can we have your liver then?
    
>>> print docstrings.employee.__doc__
class documentation

Here, after importing, we display the docstrings associated with the module and its objects, by printing their __doc__ attributes, where Python has saved the text. Note that you will generally want to print docstrings explicitly; otherwise, you’ll get a single string with embedded newline characters.
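To see the difference between echoing and printing a docstring, the sketch below repeats the square function from docstrings.py and shows both forms. It also uses inspect.getdoc from the standard library, which is our addition rather than something this chapter covers; it returns the same text with the leading blank line and common indentation stripped. The code assumes Python 3.

```python
import inspect

def square(x):
    """
    function documentation
    can we have your liver then?
    """
    return x ** 2

raw = square.__doc__            # the string exactly as coded, indentation included
clean = inspect.getdoc(square)  # same text with indentation and blank edges removed

print(repr(raw))   # one string with embedded \n characters
print(raw)         # printed form: line breaks are interpreted
print(clean)
```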

You can also attach docstrings to methods of classes (covered later), but because these are just def statements nested in a class, they’re not a special case. To fetch the docstring of a method function inside a class within a module, follow the path and go through the class: module.class.method.__doc__ (see the example of method docstrings in Chapter 22).
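The attribute path for a method docstring can be sketched as follows. The Employee class and its give_raise method are hypothetical names made up for this illustration; because everything lives in one file here, the path starts at the class rather than at a module name. The code assumes Python 3.

```python
class Employee:
    "class documentation"

    def give_raise(self, percent):
        "method documentation (hypothetical example)"
        return percent

# Follow the path through the class to reach the method's docstring.
class_doc = Employee.__doc__
method_doc = Employee.give_raise.__doc__

print(class_doc)
print(method_doc)
```

If the class were imported from a module file, the same lookup would simply gain the module name as a prefix.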

Docstring standards

There is no broad standard about what should go into the text of a docstring (although some companies have internal standards). There have been various mark-up language and template proposals (e.g., HTML), but they do not seem to have caught on in the Python world.

This is probably related to the priority of documentation among programmers in general. Usually, if you get any comments in a file at all, you count yourself lucky; asking programmers to hand-code HTML or other formats in their comments seems unlikely to fly. Of course, we encourage you to document your code liberally.

Built-in docstrings

It turns out that built-in modules and objects in Python use similar techniques to attach documentation above and beyond the attribute lists returned by dir. For example, to see actual words that give a human-readable description of a built-in module, import it and print its __doc__ string:

>>> import sys
>>> print sys.__doc__
This module provides access to some objects 
used or maintained by the interpreter and to 
...more text omitted...

Dynamic objects:

argv -- command line arguments; argv[0] is the script pathname if known
path -- module search path; path[0] is the script directory, else ''
modules -- dictionary of loaded modules
...more text omitted...

Similarly, functions, classes, and methods within built-in modules have attached words in their __doc__ attributes as well:

>>> print sys.getrefcount.__doc__
getrefcount(object) -> integer

Return the current reference count for the object.  
...more text omitted...

In addition, you can read about built-in functions via their docstrings:

>>> print int.__doc__
int(x[, base]) -> integer

Convert a string or number to an integer, if possible.  
...more text omitted...

>>> print open.__doc__
file(name[, mode[, buffering]]) -> file object

Open a file.  The mode can be 'r', 'w' or 'a' for reading 
...more text omitted...

PyDoc: The help Function

The docstring technique proved to be so useful that Python ships with a tool that makes them even easier to display. The standard PyDoc tool is Python code that knows how to extract and format your docstrings, together with automatically extracted structural information, into nicely arranged reports of various types.

There are a variety of ways to launch PyDoc, including command-line script options. Perhaps the two most prominent PyDoc interfaces are the built-in help function, and the PyDoc GUI/HTML interface. The newly introduced help function invokes PyDoc to generate a simple textual report (which looks much like a manpage on Unix-like systems):

>>> import sys
>>> help(sys.getrefcount)
Help on built-in function getrefcount:

getrefcount(...)
    getrefcount(object) -> integer
    
    Return the current reference count for the object.  
    ...more omitted...

Note that you do not have to import sys in order to call help, but you do have to import sys to get help on sys. For larger objects such as modules and classes, the help display is broken down into multiple sections, a few of which are shown here. Run this interactively to see the full report.

>>> help(sys)
Help on built-in module sys:

NAME
    sys

FILE
    (built-in)

DESCRIPTION
    This module provides access to some objects used 
    or maintained by the interpreter and to functions
    ...more omitted...

FUNCTIONS
    __displayhook__ = displayhook(...)
        displayhook(object) -> None
        
        Print an object to sys.stdout and also save it
    ...more omitted...
DATA
    __name__ = 'sys'
    __stderr__ = <open file '<stderr>', mode 'w' at 0x0082BEC0>
    ...more omitted...

Some of the information in this report is docstrings, and some of it (e.g., function call patterns) is structural information that PyDoc gleans automatically by inspecting objects’ internals. You can also use help on built-in functions, methods, and types. To get help for a built-in type, use the type name (e.g., dict for dictionary, str for string, list for list); you’ll get a large display that describes all the methods available for that type:

>>> help(dict)
Help on class dict in module __builtin__:

class dict(object)
 |  dict(  ) -> new empty dictionary.
 ...more omitted...

>>> help(str.replace)
Help on method_descriptor:

replace(...)
    S.replace (old, new[, maxsplit]) -> string
    
    Return a copy of string S with all occurrences
    ...more omitted...

>>> help(ord)
Help on built-in function ord:

ord(...)
    ord(c) -> integer
    
    Return the integer ordinal of a one-character string.

Finally, the help function works just as well on your modules as built-ins. Here it is reporting on the docstrings.py file coded in the prior section; again, some of this is docstrings, and some is automatic by structure:

>>> help(docstrings.square)
Help on function square in module docstrings:

square(x)
    function documentation
    can we have your liver then?

>>> help(docstrings.employee)
...more omitted...

>>> help(docstrings)
Help on module docstrings:

NAME
    docstrings

FILE
    c:\python22\docstrings.py

DESCRIPTION
    Module documentation
    Words Go Here

CLASSES
    employee
    ...more omitted...

FUNCTIONS
    square(x)
        function documentation
        can we have your liver then?

DATA
    __file__ = 'C:\PYTHON22\docstrings.pyc'
    __name__ = 'docstrings'
    spam = 40
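The help function writes its report to the screen, but the same text can also be captured as a string. The sketch below is one way to do that, assuming a Python 3 interpreter in which the pydoc module offers render_doc and the plaintext renderer; square is repeated from docstrings.py so the example stands alone.

```python
import pydoc

def square(x):
    """
    function documentation
    can we have your liver then?
    """
    return x ** 2

# Build the same kind of text report that help() displays, but keep it as a string.
report = pydoc.render_doc(square, renderer=pydoc.plaintext)
print(report)
```

Having the report as a string makes it possible to save it to a file or search it, while help remains the more convenient choice at the interactive prompt.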

PyDoc: HTML Reports

The help function is nice for grabbing documentation when working interactively. For a more grandiose display, PyDoc also provides a GUI interface (a simple, but portable Python/Tkinter script), and can render its report in HTML page format, viewable in any web browser. In this mode, PyDoc can run locally or as a remote server, and reports contain automatically-created hyperlinks that allow you to click your way through the documentation of related components in your application.

To start PyDoc in this mode, you generally first launch the search engine GUI captured in Figure 11-1. You can start this by either selecting the Module Docs item in Python’s Start button menu on Windows, or launching the pydocgui script in Python’s tools directory. Enter the name of a module you’re interested in knowing about, and press the Enter key; PyDoc will march down your module import search path looking for references to the module.

Figure 11-1. PyDoc GUI top-level search interface

Once you’ve found a promising entry, select and click “go to selected”; PyDoc spawns a web browser on your machine to display the report rendered in HTML format. Figure 11-2 shows information PyDoc displays for the built-in glob module.

Figure 11-2. PyDoc HTML report, built-in module

Notice the hyperlinks in the Modules section of this page—click these to jump to the PyDoc pages for related (imported) modules. For larger pages, PyDoc also generates hyperlinks to sections within the page. As for the help function interface, the GUI interface works on user-defined modules as well; Figure 11-3 shows the page generated for the docstrings.py module file.

Figure 11-3. PyDoc HTML report, user-defined module

PyDoc can be customized and launched in various ways. The main thing to take away from this section is that PyDoc essentially gives you implementation reports “for free”—if you are good about using docstrings in your files, PyDoc does all the work of collecting and formatting them for display. PyDoc also provides an easy way to access a middle level of documentation for built-in tools—its reports are more useful than raw attribute lists, and less exhaustive than the standard manuals.

Standard Manual Set

For the complete and most up-to-date description of the Python language and its tool set, Python’s standard manuals stand ready to serve. Python’s manuals ship in HTML format and are installed with the Python system on Windows—they are available in your Start button’s menu for Python, and can also be opened from the Help menu within IDLE. You can also fetch the manual set separately at http://www.python.org in a variety of formats, or read them online at that site (follow the Documentation link).

When opened, the HTML format of the manuals displays a root page like that in Figure 11-4. The two most important entries here are most likely the Library Reference (which documents built-in types, functions, exceptions, and standard library modules) and the Language Reference (which provides a formal description of language-level details). The tutorial listed on this page also provides a brief introduction for newcomers.

Figure 11-4. Python’s standard manual set

Web Resources

At http://www.python.org you’ll find links to various tutorials, some of which cover special topics or domains. Look for the Documentation and Newbies (i.e., newcomers) links. This site also lists non-English Python resources.

Published Books

Finally, you can today choose from a collection of reference books for Python. In general, books tend to lag behind the cutting edge of Python changes, partly because of the work involved in writing, and partly because of the natural delays built in to the publishing cycle. Usually, by the time a book comes out, it’s six or more months behind the current Python state. Unlike standard manuals, books are also generally not free.

For many, the convenience and quality of a professionally published text is worth the cost. Moreover, Python changes so slowly that books are usually still relevant years after they are published, especially if their authors post updates on the Web. See the Preface for more pointers on Python books.

Common Coding Gotchas

Before the programming exercises for this part of the book, here are some of the most common mistakes beginners make when coding Python statements and programs. You’ll learn to avoid these once you’ve gained a bit of Python coding experience; but a few words might help you avoid falling into some of these traps initially.

Don’t forget the colons. Don’t forget to type a : at the end of compound statement headers (the first line of an if, while, for, etc.). You probably will at first anyhow (we did too), but you can take some comfort in the fact that it will soon become an unconscious habit.
Start in column 1. Be sure to start top-level (unnested) code in column 1. That includes unnested code typed into module files, as well as unnested code typed at the interactive prompt.
Blank lines matter at the interactive prompt. Blank lines in compound statements are always ignored in module files, but when you are typing code at the interactive prompt, a blank line ends the statement. In other words, blank lines tell the interactive command line that you’ve finished a compound statement; if you want to continue, don’t hit the Enter key at the ... prompt until you’re really done.
Indent consistently. Avoid mixing tabs and spaces in the indentation of a block, unless you know what your text editor system does with tabs. Otherwise, what you see in your editor may not be what Python sees when it counts tabs as a number of spaces. It’s safer to use all tabs or all spaces for each block.
Don’t code C in Python. A note to C/C++ programmers: you don’t need to type parentheses around tests in if and while headers (e.g., if (X==1):); you can if you like (any expression can be enclosed in parentheses), but they are fully superfluous in this context. Also, do not terminate all your statements with a semicolon; it’s technically legal to do this in Python as well, but is totally useless, unless you’re placing more than one statement on a single line (the end of a line normally terminates a statement). And remember, don’t embed assignment statements in while loop tests, and don’t use { } around blocks (indent your nested code blocks consistently instead).
Use simple for loops instead of while or range. A simple for loop (e.g., for x in seq:) is almost always simpler to code, and quicker to run than a while or range-based counter loop. Because Python handles indexing internally for a simple for, it can sometimes be twice as fast as the equivalent while.
Don’t expect results from functions that change objects in-place. In-place change operations like the list.append( ) and list.sort( ) methods of Chapter 6 do not return a value (other than None); call them without assigning the result. It’s not uncommon for beginners to say something like mylist=mylist.append(X) to try to get the result of an append; instead, this assigns None to mylist, rather than the modified list (in fact, you’ll lose a reference to the list altogether).
A more devious example of this pops up when trying to step through dictionary items in sorted fashion. It’s fairly common to see this sort of code: for k in D.keys( ).sort( ):. This almost works: the keys method builds a keys list, and the sort method orders it. But since the sort method returns None, the loop fails because it is ultimately a loop over None (a nonsequence). To code this correctly, split the method calls out to statements: Ks = D.keys( ), then Ks.sort( ), and finally for k in Ks:. This, by the way, is one case where you’ll still want to call the keys method explicitly for looping, instead of relying on the dictionary iterators.
Always use parentheses to call a function. You must add parentheses after a function name to call it, whether it takes arguments or not (e.g., use function( ), not function). In Part IV, we’ll see that functions are simply objects that have a special operation, a call, that you trigger with the parentheses.
In classes, this seems to occur most often with files. It’s common to see beginners type file.close to close a file, rather than file.close( ); because it’s legal to reference a function without calling it, the first version with no parentheses succeeds silently, but does not close the file!
Don’t use extensions or paths in imports and reloads. Omit directory paths and file suffixes in import statements (e.g., say import mod, not import mod.py). (We met module basics in Chapter 6, and will continue studying them in Part V.) Because modules may have other suffixes besides .py (a .pyc, for instance), hardcoding a particular suffix is not only illegal syntax, it doesn’t make sense. And platform-specific directory path syntax comes from module search path settings, not the import statement.
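Three of the traps above are easy to reproduce. The following is a minimal sketch, assuming a Python 3 interpreter; the names mylist, D, Ks, and greet are ours. One difference from the Python described in this chapter is labeled in the code: in Python 3 the keys method returns a view rather than a list, so the sketch wraps it in list( ) before sorting.

```python
# 1. In-place changes return None, so don't assign their results.
mylist = [3, 1, 2]
result = mylist.append(4)     # mylist is changed; result is None

# 2. Sorting dictionary keys: split the calls into separate statements.
D = {'b': 2, 'a': 1, 'c': 3}
Ks = list(D.keys())           # list() is needed in Python 3 (assumption noted above)
Ks.sort()                     # sorts Ks in place and returns None
ordered = [(k, D[k]) for k in Ks]

# 3. Naming a function without parentheses does not call it.
def greet():
    return 'hello'

reference = greet             # just another name for the function object
value = greet()               # the parentheses trigger the call

print(result, mylist)
print(ordered)
print(reference is greet, value)
```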

Part III Exercises

Now that you know how to code basic program logic, the exercises ask you to implement some simple tasks with statements. Most of the work is in exercise 4, which lets you explore coding alternatives. There are always many ways to arrange statements, and part of learning Python is learning which arrangements work better than others.

See Section B.3 for the solutions.

Coding basic loops.
1. Write a for loop that prints the ASCII code of each character in a string named S. Use the built-in function ord(character) to convert each character to an ASCII integer. (Test it interactively to see how it works.)
2. Next, change your loop to compute the sum of the ASCII codes of all characters in a string.
3. Finally, modify your code again to return a new list that contains the ASCII codes of each character in the string. Does this expression have a similar effect—map(ord,S)? (Hint: see Part IV.)
Backslash characters. What happens on your machine when you type the following code interactively?
```
for i in range(50):
    print 'hello %d\n\a' % i
```
Beware that if run outside of the IDLE interface, this example may beep at you, so you may not want to run it in a crowded lab. IDLE prints odd characters instead (see the backslash escape characters in Table 5-2).
Sorting dictionaries. In Chapter 6, we saw that dictionaries are unordered collections. Write a for loop that prints a dictionary’s items in sorted (ascending) order. Hint: use the dictionary keys and list sort methods.
Program logic alternatives. Consider the following code, which uses a while loop and found flag to search a list of powers of 2, for the value of 2 raised to the 5th power (32). It’s stored in a module file called power.py.
```
L = [1, 2, 4, 8, 16, 32, 64]
X = 5

found = i = 0
while not found and i < len(L):
    if 2 ** X == L[i]:
        found = 1
    else:
        i = i+1

if found:
    print 'at index', i
else:
    print X, 'not found'

C:\book\tests> python power.py
at index 5
```
As is, the example doesn’t follow normal Python coding techniques. Follow the steps below to improve it. For all the transformations, you may type your code interactively or store it in a script file run from the system command line (using a file makes this exercise much easier).
1. First, rewrite this code with a while loop else, to eliminate the found flag and final if statement.
2. Next, rewrite the example to use a for loop with an else, to eliminate the explicit list indexing logic. Hint: to get the index of an item, use the list index method (L.index(X) returns the offset of the first X in list L).
3. Next, remove the loop completely by rewriting the examples with a simple in operator membership expression. (See Chapter 6 for more details, or type this to test: 2 in [1,2,3].)
4. Finally, use a for loop and the list append method to generate the powers-of-2 list (L), instead of hard-coding a list literal.
5. Deeper thoughts: (1) Do you think it would improve performance to move the 2**X expression outside the loops? How would you code that? (2) As we saw in Exercise 1, Python also includes a map(function, list) tool that can generate the powers-of-2 list too: map(lambda x: 2**x, range(7)). Try typing this code interactively; we’ll meet lambda more formally in Chapter 14.
