7. Tuples, Files, and Everything Else

Tuples

The last collection type in our survey is the Python tuple. Tuples construct simple groups of objects. They work exactly like lists, except that tuples can’t be changed in-place (they’re immutable) and are usually written as a series of items in parentheses, not square brackets. Although they don’t support any method calls, tuples share most of their properties with lists. Tuples are:

Ordered collections of arbitrary objects: Like strings and lists, tuples are a positionally-ordered collection of objects; like lists, they can embed any kind of object.
Accessed by offset: Like strings and lists, items in a tuple are accessed by offset (not key); they support all the offset-based access operations, such as indexing and slicing.
Of the category immutable sequence: Like strings, tuples are immutable; they don’t support any of the in-place change operations applied to lists. Like strings and lists, tuples are sequences; they support many of the same operations.
Fixed length, heterogeneous, arbitrarily nestable: Because tuples are immutable, they cannot grow or shrink without making a new tuple; on the other hand, tuples can hold other compound objects (e.g., lists, dictionaries, other tuples) and so support arbitrary nesting.
Arrays of object references: Like lists, tuples are best thought of as object reference arrays; tuples store access points to other objects (references), and indexing a tuple is relatively quick.

Table 7-1 highlights common tuple operations. Tuples are written as a series of objects (really, expressions that generate objects), separated by commas, and enclosed in parentheses. An empty tuple is just a parentheses pair with nothing inside.

Table 7-1. Common tuple literals and operations

Operation	Interpretation
`( )`	An empty tuple
`t1 = (0,)`	A one-item tuple (not an expression)
`t2 = (0, 'Ni', 1.2, 3)`	A four-item tuple
`t2 = 0, 'Ni', 1.2, 3`	Another four-item tuple (same as prior line)
`t3 = ('abc', ('def', 'ghi'))`	Nested tuples
`t1[i]`, `t3[i][j]`, `t1[i:j]`, `len(t1)`	Index, slice, length
`t1 + t2`, `t2 * 3`	Concatenate, repeat
`for x in t2`, `3 in t2`	Iteration, membership

Notice that tuples have no methods (e.g., an append call won’t work here), but do support the usual sequence operations that we saw for strings and lists:

>>> (1, 2) + (3, 4)             # Concatenation
(1, 2, 3, 4)

>>> (1, 2) * 4                  # Repetition
(1, 2, 1, 2, 1, 2, 1, 2)

>>> T = (1, 2, 3, 4)            # Indexing, slicing
>>> T[0], T[1:3]
(1, (2, 3))

The second and fourth entries in Table 7-1 merit a bit more explanation. Because parentheses can also enclose expressions (see Section 4.3), you need to do something special to tell Python when a single object in parentheses is a tuple object and not a simple expression. If you really want a single-item tuple, simply add a trailing comma after the single item and before the closing parenthesis:

>>> x = (40)         # An integer
>>> x
40
>>> y = (40,)        # A tuple containing an integer
>>> y
(40,)

As a special case, Python also allows you to omit the opening and closing parentheses for a tuple in contexts where it isn’t syntactically ambiguous to do so. For instance, the fourth line of the table simply listed four items, separated by commas. In the context of an assignment statement, Python recognizes this as a tuple, even though it didn’t have parentheses. For beginners, the best advice is that it’s probably easier to use parentheses than it is to figure out when they’re optional. Many programmers also find that parentheses tend to aid script readability.
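The two special cases above can be checked in a short script-style sketch. The variable names are illustrative only, and the isinstance tests assume Python 2.2 or later, where tuple is a built-in type name.

```python
# Parentheses alone only group an expression; the comma is what makes a tuple.
x = (40)                  # Just the integer 40
y = (40,)                 # A one-item tuple
z = 0, 'Ni', 1.2, 3       # Parentheses omitted: still a four-item tuple

x_is_tuple = isinstance(x, tuple)    # False
y_is_tuple = isinstance(y, tuple)    # True
z_is_tuple = isinstance(z, tuple)    # True
```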

Apart from literal syntax differences, tuple operations (the last three rows in Table 7-1) are identical to string and list operations. The only differences worth noting are that the +, *, and slicing operations return new tuples when applied to tuples, and tuples don’t provide the methods you saw for strings, lists, and dictionaries. If you want to sort a tuple, for example, you’ll usually have to first convert it to a list, which makes it a mutable object and gives you access to a sorting method call:

>>> T = ('cc', 'aa', 'dd', 'bb')
>>> tmp = list(T)
>>> tmp.sort(  )
>>> tmp
['aa', 'bb', 'cc', 'dd']
>>> T = tuple(tmp)
>>> T
('aa', 'bb', 'cc', 'dd')

Here, the list and tuple built-in functions were used to convert to a list, and then back to a tuple; really, both calls make new objects, but the net effect is like a conversion. Also note that the rule about tuple immutability only applies to the top-level of the tuple itself, not to its contents; a list inside a tuple, for instance, can be changed as usual:

>>> T = (1, [2, 3], 4)
>>> T[1][0] = 'spam'                  # Works
>>> T
(1, ['spam', 3], 4)
>>> T[1] = 'spam'                     # Fails
TypeError: object doesn't support item assignment
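As a rough sketch, both points can also be exercised in a script. It assumes Python 2.4 or later, where a built-in sorted function is available; that is newer than the releases this chapter describes, so treat it as an alternative to the list/sort/tuple steps, not a replacement for them. The names T_sorted, U, and assignment_failed are made up for the illustration.

```python
T = ('cc', 'aa', 'dd', 'bb')

# sorted() accepts any sequence and returns a new list, so the explicit
# list(T) and tmp.sort() steps collapse into one call.
# Assumption: Python 2.4 or later, where sorted() is a built-in.
T_sorted = tuple(sorted(T))

# Top-level immutability: replacing a slot fails, but a nested list can change.
U = (1, [2, 3], 4)
U[1][0] = 'spam'              # Changes the nested list in-place
try:
    U[1] = 'spam'             # Attempts to change the tuple itself
    assignment_failed = False
except TypeError:
    assignment_failed = True
```

The original tuple T is left untouched; sorted builds a new list, and tuple builds a new tuple from it.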

Why Lists and Tuples?

This seems to be the first question that always comes up when teaching beginners about tuples: why do we need tuples if we have lists? Some of it may be historic. But the best answer seems to be that the immutability of tuples provides some integrity—you can be sure a tuple won’t be changed through another reference elsewhere in a program. There’s no such guarantee for lists.

Tuples can also be used in places that lists cannot—for example, as dictionary keys (see the sparse matrix example in Chapter 6). Some built-in operations may also require or imply tuples, not lists. As a rule of thumb, lists are the tool of choice for ordered collections that might need to change; tuples handle the other cases.
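A minimal sketch of the dictionary-key point follows. The dictionary name matrix and the coordinate values are hypothetical; the only claim is that a tuple is accepted as a key where a list is rejected.

```python
# A tuple of coordinates works as a dictionary key because it is immutable.
matrix = {}
matrix[(2, 3, 4)] = 88
matrix[(7, 8, 9)] = 99

value = matrix[(2, 3, 4)]

# A list in the same role is rejected, since lists can change in-place.
try:
    matrix[[2, 3, 4]] = 88
    list_key_ok = True
except TypeError:
    list_key_ok = False
```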

Files

Most readers are probably familiar with the notion of files—named storage compartments on your computer that are managed by your operating system. This last built-in object type provides a way to access those files inside Python programs. The built-in open function creates a Python file object, which serves as a link to a file residing on your machine. After calling open, you can read and write the associated external file, by calling file object methods. The built-in name file is a synonym for open, and files may be opened by calling either name.

Compared to the types you’ve seen so far, file objects are somewhat unusual. They’re not numbers, sequences, or mappings; instead, they export methods only for common file processing tasks.

Table 7-2 summarizes common file operations. To open a file, a program calls the open function, with the external name first, followed by a processing mode ('r' to open for input—the default; 'w' to create and open for output; 'a' to open for appending to the end; and others we’ll omit here). Both arguments must be Python strings. The external file name argument may include a platform-specific and absolute or relative directory path prefix; without a path, the file is assumed to exist in the current working directory (i.e., where the script runs).

Table 7-2. Common file operations

Operation	Interpretation
`output = open('/tmp/spam', 'w')`	Create output file ('`w`' means write).
`input = open('data', 'r')`	Create input file ('`r`' means read).
`S = input.read( )`	Read entire file into a single string.
`S = input.read(N)`	Read N bytes (1 or more).
`S = input.readline( )`	Read next line (through end-line marker).
`L = input.readlines( )`	Read entire file into list of line strings.
`output.write(S)`	Write string `S` into file.
`output.writelines(L)`	Write all line strings in list `L` into file.
`output.close( )`	Manual close (done for you when file collected).

Once you have a file object, call its methods to read from or write to the external file. In all cases, file text takes the form of strings in Python programs; reading a file returns its text in strings, and text is passed to the write methods as strings. Reading and writing both come in multiple flavors; Table 7-2 gives the most common.

Calling the file close method terminates your connection to the external file. In Python, an object’s memory space is automatically reclaimed as soon as the object is no longer referenced anywhere in the program. When file objects are reclaimed, Python also automatically closes the file if needed. Because of that, you don’t need to always manually close your files, especially in simple scripts that don’t run long. On the other hand, manual close calls can’t hurt and are usually a good idea in larger systems. Strictly speaking, this auto-close-on-collection feature of files is not part of the language definition, and may change over time. Because of that, manual file close method calls are a good habit to form.
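One way to make the manual close habit dependable is to wrap the file work in a try/finally statement, so the close call runs even if a write fails. This is only a sketch: the scratch path is a made-up location built with the standard tempfile module so that the code can run anywhere.

```python
import os
import tempfile

# Hypothetical scratch location, chosen only so the sketch can run anywhere.
path = os.path.join(tempfile.mkdtemp(), 'spam.txt')

output = open(path, 'w')
try:
    output.write('hello text file\n')
finally:
    output.close()            # Runs even if the write raises an exception

closed_after = output.closed  # File objects record whether they were closed
```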

Files in Action

Here is a simple example that demonstrates file-processing basics. It first opens a new file for output, writes a string (terminated with a newline marker, '\n'), and closes the file. Later, the example opens the same file again in input mode, and reads the line back. Notice that the second readline call returns an empty string; this is how Python file methods tell you that you’ve reached the end of the file (empty lines in the file come back as strings with just a newline character, not empty strings).

>>> myfile = open('myfile', 'w')             # Open for output (creates).
>>> myfile.write('hello text file\n')        # Write a line of text.
>>> myfile.close(  )

>>> myfile = open('myfile', 'r')             # Open for input.
>>> myfile.readline(  )                           # Read the line back.
'hello text file\n'
>>> myfile.readline(  )                        # Empty string: end of file
''

There are additional, more advanced file methods not shown in Table 7-2; for instance, seek resets your current position in a file (the next read or write happens at the position), flush forces buffered output to be written out to disk (by default, files are always buffered), and so on.
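The following sketch shows flush and seek working together on one file object. It assumes the 'w+' mode, one of the modes not covered in this chapter, which creates a file for both writing and reading; the scratch path is hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'myfile.txt')   # Hypothetical scratch file

myfile = open(path, 'w+')          # 'w+' creates the file for writing and reading
myfile.write('hello text file\n')
myfile.flush()                     # Push buffered output out to the file
myfile.seek(0)                     # Move the current position back to the start
first = myfile.readline()          # The next read happens at the new position
second = myfile.readline()         # Empty string: end of file
myfile.close()
```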

The sidebar Why You Will Care: File Scanners in Chapter 10 sketches common file-scanning loop code patterns, and the examples in later parts of this book discuss larger file-based code. In addition, the Python standard library manual and the reference books described in the Preface provide a complete list of file methods.

Type Categories Revisited

Now that we’ve seen all of Python’s core built-in types, let’s take a look at some of the properties they share.

Table 7-3 classifies all the types we’ve seen, according to the type categories we introduced earlier. Objects share operations according to their category—for instance, strings, lists, and tuples all share sequence operations. Only mutable objects may be changed in-place. You can change lists and dictionaries in-place, but not numbers, strings, or tuples. Files only export methods, so mutability doesn’t really apply (they may be changed when written, but this isn’t the same as Python type constraints).

Table 7-3. Object classifications

Object type	Category	Mutable?
Numbers	Numeric	No
Strings	Sequence	No
Lists	Sequence	Yes
Dictionaries	Mapping	Yes
Tuples	Sequence	No
Files	Extension	n/a

Later, we’ll see that objects we implement with classes can pick and choose from these categories arbitrarily. For instance, if you want to provide a new kind of specialized sequence object that is consistent with built-in sequences, code a class that overloads things like indexing and concatenation:

class MySequence:
    def __getitem__(self, index):
        pass    # Called on self[index], others
    def __add__(self, other):
        pass    # Called on self + other

and so on. You can also make the new object mutable or not, by selectively implementing methods called for in-place change operations (e.g., __setitem__ is called on self[index]=value assignments). It’s also possible to implement new objects in C, as C extension types. For these, you fill in C function pointer slots to choose between number, sequence, and mapping operation sets.
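To make the idea concrete, here is one possible filled-in version of such a class. It is a sketch under stated assumptions, not the way later chapters necessarily code it: the items attribute, the constructor, and the choice to wrap a built-in list are all invented for the example. Because no __setitem__ method is defined, the object behaves as an immutable sequence.

```python
class MySequence:
    """Hypothetical wrapper around a list, used only to illustrate overloading."""
    def __init__(self, items):
        self.items = list(items)
    def __len__(self):
        return len(self.items)                        # Called on len(self)
    def __getitem__(self, index):
        return self.items[index]                      # Called on self[index]
    def __add__(self, other):
        return MySequence(self.items + other.items)   # Called on self + other

a = MySequence([1, 2])
b = MySequence([3, 4])
c = a + b                     # Runs __add__ and makes a new MySequence

try:
    c[0] = 99                 # No __setitem__ defined, so this is refused
    assignable = True
except (TypeError, AttributeError):
    assignable = False
```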

Object Generality

We’ve seen a number of compound object types (collections with components). In general:

Lists, dictionaries, and tuples can hold any kind of object.
Lists, dictionaries, and tuples can be arbitrarily nested.
Lists and dictionaries can dynamically grow and shrink.

Because they support arbitrary structures, Python’s compound object types are good at representing complex information in a program. For example, values in dictionaries may be lists, which may contain tuples, which may contain dictionaries, and so on—as deeply nested as needed to model the data to be processed.

Here’s an example of nesting. The following interaction defines a tree of nested compound sequence objects, shown in Figure 7-1. To access its components, you may include as many index operations as required. Python evaluates the indexes from left to right, and fetches a reference to a more deeply nested object at each step. Figure 7-1 may be a pathologically complicated data structure, but it illustrates the syntax used to access nested objects in general:

>>> L = ['abc', [(1, 2), ([3], 4)], 5]
>>> L[1]
[(1, 2), ([3], 4)]
>>> L[1][1]
([3], 4)
>>> L[1][1][0]
[3]
>>> L[1][1][0][0]
3

Figure 7-1. A nested object tree

References Versus Copies

Section 4.6 in Chapter 4 mentioned that assignments always store references to objects, not copies. In practice, this is usually what you want. But because assignments can generate multiple references to the same object, you sometimes need to be aware that changing a mutable object in-place may affect other references to the same object elsewhere in your program. If you don’t want such behavior, you’ll need to tell Python to copy the object explicitly.

For instance, the following example creates a list assigned to X, and another assigned to L that embeds a reference back to list X. It also creates a dictionary D that contains another reference back to list X:

>>> X = [1, 2, 3]
>>> L = ['a', X, 'b']           # Embed references to X's object.
>>> D = {'x':X, 'y':2}

At this point, there are three references to the first list created: from name X, from inside the list assigned to L, and from inside the dictionary assigned to D. The situation is illustrated in Figure 7-2.

Figure 7-2. Shared object references

Since lists are mutable, changing the shared list object from any of the three references changes what the other two reference:

>>> X[1] = 'surprise'         # Changes all three references!
>>> L
['a', [1, 'surprise', 3], 'b']
>>> D
{'x': [1, 'surprise', 3], 'y': 2}

References are a higher-level analog of pointers in other languages. Although you can’t grab hold of the reference itself, it’s possible to store the same reference in more than one place: variables, lists, and so on. This is a feature—you can pass a large object around a program without generating copies of it along the way. If you really do want copies, you can request them:

Slice expressions with empty limits copy sequences.
The dictionary copy method copies a dictionary.
Some built-in functions such as list also make copies.
The copy standard library module makes full copies.

For example, if you have a list and a dictionary, and don’t want their values to be changed through other variables:

>>> L = [1,2,3]
>>> D = {'a':1, 'b':2}

simply assign copies to the other variables, not references to the same objects:

>>> A = L[:]              # Instead of: A = L (list(L) copies too)
>>> B = D.copy(  )            # Instead of: B = D

This way, changes made from other variables change the copies, not the originals:

>>> A[1] = 'Ni'
>>> B['c'] = 'spam'
>>>
>>> L, D
([1, 2, 3], {'a': 1, 'b': 2})
>>> A, B
([1, 'Ni', 3], {'a': 1, 'c': 'spam', 'b': 2})

In terms of the original example, you can avoid the reference side effects by slicing the original list, instead of simply naming it:

>>> X = [1, 2, 3]
>>> L = ['a', X[:], 'b']          # Embed copies of X's object.
>>> D = {'x':X[:], 'y':2}

This changes the picture in Figure 7-2—L and D will point to different lists than X. The net effect is that changes made through X will impact only X, not L and D; similarly, changes to L or D will not impact X.

One note on copies: empty-limit slices and the copy method of dictionaries still make only a top-level copy; they do not copy nested data structures, if any are present. If you need a complete, fully independent copy of a deeply nested data structure, use the standard copy module: import copy, and say X=copy.deepcopy(Y) to fully copy an arbitrarily nested object Y. This call recursively traverses objects to copy all their parts. This is the much rarer case, though (which is why you have to say more to make it go). References are usually the behavior you will want; when they are not, slices and copy methods are usually as much copying as you’ll need to do.
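The difference between a top-level copy and a deep copy shows up as soon as a nested object is changed. In this sketch the names top and full, and the sample data, are illustrative only.

```python
import copy

Y = [1, [2, 3], {'a': [4, 5]}]

top = Y[:]                    # Top-level copy: nested objects are still shared
full = copy.deepcopy(Y)       # Recursive copy: nothing is shared

Y[1][0] = 'changed'           # In-place change to a nested list
Y[2]['a'].append(6)           # In-place change two levels down

shared_nested = top[1] is Y[1]        # True: same nested list object
independent = full[1] is not Y[1]     # True: deepcopy made its own
```

After the two changes, top reflects them through its shared nested objects, while full still holds the original values.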

Comparisons, Equality, and Truth

All Python objects also respond to comparisons: tests for equality, relative magnitude, and so on. Python comparisons always inspect all parts of compound objects, until a result can be determined. In fact, when nested objects are present, Python automatically traverses data structures to apply comparisons recursively, left to right, and as deep as needed.

For instance, a comparison of list objects compares all their components automatically:

>>> L1 = [1, ('a', 3)]         # Same value, unique objects
>>> L2 = [1, ('a', 3)]
>>> L1 == L2, L1 is L2         # Equivalent? Same object?
(1, 0)

Here, L1 and L2 are assigned lists that are equivalent, but distinct objects. Because of the nature of Python references (studied in Chapter 4), there are two ways to test for equality:

The == operator tests value equivalence. Python performs an equivalence test, comparing all nested objects recursively.
The is operator tests object identity. Python tests whether the two are really the same object (i.e., live at the same address in memory).

In the example, L1 and L2 pass the == test (they have equivalent values because all their components are equivalent), but fail the is check (they are two different objects, and hence two different pieces of memory). Notice what happens for short strings:

>>> S1 = 'spam'
>>> S2 = 'spam'
>>> S1 == S2, S1 is S2
(1, 1)

Here, we should have two distinct objects that happen to have the same value: == should be true, and is should be false. Because Python internally caches and reuses short strings as an optimization, there really is just a single string, 'spam', in memory, shared by S1 and S2; hence, the is identity test reports a true result. To trigger the normal behavior, we need to use longer strings that fall outside the cache mechanism:

>>> S1 = 'a longer string'
>>> S2 = 'a longer string'
>>> S1 == S2, S1 is S2
(1, 0)

Because strings are immutable, the object caching mechanism is irrelevant to your code; strings can’t be changed in-place, regardless of how many variables refer to them. If identity tests seem confusing, see Section 4.6 in Chapter 4 for a refresher on object reference concepts.

As a rule of thumb, the == operator is what you will want to use for almost all equality checks; is is reserved for highly specialized roles. We’ll see cases of both operators put to use later in the book.
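A short sketch of the two tests side by side, using lists so that the result does not depend on any string caching. The name L3 is added here purely to show what a true identity result looks like.

```python
L1 = [1, ('a', 3)]
L2 = [1, ('a', 3)]            # Same value, but built separately
L3 = L1                       # A second reference to the same object

same_value = (L1 == L2)       # True: equivalent values
same_object = (L1 is L2)      # False: two distinct list objects
alias = (L1 is L3)            # True: both names reference one object

L3.append('x')                # An in-place change through L3 shows up in L1
```

Once L3 has been changed, L1 and L2 no longer compare equal, because L1 and L3 are the same object.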

Notice that relative magnitude comparisons are also applied recursively to nested data structures:

>>> L1 = [1, ('a', 3)]
>>> L2 = [1, ('a', 2)]
>>> L1 < L2, L1 == L2, L1 > L2     # less,equal,greater: tuple of results
(0, 0, 1)

Here, L1 is greater than L2 because the nested 3 is greater than 2. The result of the last line above is really a tuple of three objects—the results of the three expressions typed (an example of a tuple without its enclosing parentheses).

The three values in this tuple represent true and false values; in Python, an integer 0 represents false and an integer 1 represents true. Python also recognizes any empty data structure as false, and any nonempty data structure as true. More generally, the notions of true and false are intrinsic properties of every object in Python—each object is true or false, as follows:

Numbers are true if nonzero.
Other objects are true if nonempty.

Table 7-4 gives examples of true and false objects in Python.

Table 7-4. Example object truth values

Object	Value
`"spam"`	True
`""`	False
`[ ]`	False
`{ }`	False
`1`	True
`0.0`	False
`None`	False
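The table can be reproduced by using each object directly in a test, just as an if statement would. The helper name truth is hypothetical, and it returns the integers 1 and 0 to match the output style used in this chapter.

```python
def truth(obj):
    # Use the object itself as the test; no comparison is needed.
    if obj:
        return 1
    else:
        return 0

samples = ["spam", "", [], {}, 1, 0.0, None]
results = [truth(obj) for obj in samples]
```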

Python also provides a special object called None (the last item in Table 7-4), which is always considered to be false. None is the only value of a special data type in Python; it typically serves as an empty placeholder, much like a NULL pointer in C.

For example, recall that for lists, you cannot assign to an offset unless that offset already exists (the list does not magically grow if you assign out of bounds). To preallocate a 100-item list such that you can add to any of the 100 offsets, you can fill one with None objects:

>>> L = [None] * 100
>>>
>>> L
[None, None, None, None, None, None, None, . . . ]
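The contrast between a preallocated list and an empty one can be sketched as follows; the names short and grew are illustrative.

```python
L = [None] * 100          # 100 placeholder slots
L[99] = 'last'            # Any offset from 0 through 99 can now be assigned

short = []
try:
    short[0] = 'first'    # No such offset yet: lists do not grow on assignment
    grew = True
except IndexError:
    grew = False
```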

Python 2.3 introduces a new explicit Boolean data type called bool, with values True and False available as new preassigned built-in names. Because of the way this new type is implemented, this is really just a minor extension to the notions of true and false outlined in this chapter, designed to make truth values more explicit. Most programmers were preassigning True and False to 1 and 0 anyway, so the new type makes this a standard. For instance, an infinite loop can now be coded as while True: instead of the less intuitive while 1:. Similarly, flags can be initialized with flag = False.

Internally, the new names True and False are instances of bool, which is in turn just a subclass of the built-in integer data type int. True and False behave exactly like integers 1 and 0, except that they have customized printing logic: they print themselves as the words True and False, instead of the digits 1 and 0 (technically, bool redefines its str and repr string formats). Because of this customization, the output of Boolean expressions typed at the interactive prompt prints as the words True and False as of Python 2.3, instead of the 1 and 0 you see in this book.

For all other practical purposes, you can treat True and False as though they are predefined variables set to integer 1 and 0 (e.g., True + 3 yields 4). In truth tests, True and False evaluate to true and false, because they truly are just specialized versions of integers 1 and 0. Moreover, you are not required to use only Boolean types in if statements; all objects are still inherently true or false, and all the Boolean concepts mentioned in this chapter still work as before. More on Booleans in Chapter 9.
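These claims are easy to check in a few lines, assuming Python 2.3 or later, where bool, True, and False are built in. The variable names are illustrative only.

```python
# Assumption: Python 2.3 or later, where True, False, and bool are built in.
is_int = isinstance(True, int)       # bool is a subclass of int
total = True + 3                     # Behaves like 1 + 3
equal_to_ints = (True == 1 and False == 0)
shown = (str(True), repr(False))     # Only the display form differs

count = 0
while True:                          # Reads better than: while 1:
    count += 1
    if count == 3:
        break
```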

In general, Python compares types as follows:

Numbers are compared by relative magnitude.
Strings are compared lexicographically, character-by-character ("abc" < "ac").
Lists and tuples are compared by comparing each component, from left to right.
Dictionaries are compared as though comparing sorted (key, value) lists.

In later chapters, we’ll see other object types that can change the way they get compared.
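The comparison rules listed above can be sketched as follows. For dictionaries, the sketch spells the rule out by hand as a comparison of sorted (key, value) lists instead of applying < to the dictionaries directly, because later Python releases no longer support magnitude comparisons between dictionaries. It also assumes the sorted built-in of Python 2.4 and later. All names are illustrative.

```python
numbers = (2 < 10)
strings = ("abc" < "ac")                      # 'b' < 'c' decides it at offset 1
nested = ([1, ('a', 3)] > [1, ('a', 2)])      # Decided by the nested 3 > 2

# Dictionaries: the stated rule, written out as sorted (key, value) lists.
# Assumption: sorted() is available (Python 2.4 or later).
D1 = {'a': 1, 'b': 2}
D2 = {'a': 1, 'b': 3}
dicts = (sorted(D1.items()) < sorted(D2.items()))
```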

Python’s Type Hierarchies

Figure 7-3 summarizes all the built-in object types available in Python and their relationships. We’ve looked at the most prominent of these; most other kinds of objects in Figure 7-3 either correspond to program units (e.g., functions and modules), or exposed interpreter internals (e.g., stack frames and compiled code).

Figure 7-3. Built-in type hierarchies

The main point to notice here is that everything is an object type in a Python system and may be processed by your Python programs. For instance, you can pass a class to a function, assign it to a variable, stuff it in a list or dictionary, and so on.

Even types are an object type in Python: a call to the built-in function type(X) returns the type object of object X. Type objects can be used for manual type comparisons in Python if statements. However, for reasons to be explained in Part IV, manual type testing is usually not the right thing to do in Python.

A note on type names: as of Python 2.2, each core type has a new built-in name added to support type subclassing: dict, list, str, tuple, int, long, float, complex, unicode, type, and file (file is a synonym for open). Calls to these names are really object constructor calls, not simply conversion functions.

The types module provides additional type names (now largely synonyms for the built-in type names), and it is possible to do type tests with the isinstance function. For example, in Version 2.2, all of the following type tests are true:

isinstance([1],list)
type([1])==list
type([1])==type([  ])
type([1])==types.ListType

Because types can be subclassed in 2.2, the isinstance technique is generally recommended. See Chapter 23 for more on subclassing built-in types in 2.2 and later.
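A small sketch of why isinstance is the safer test once subclassing is in play. The class MyList is hypothetical, and the code assumes Python 2.2 or later, where list can be subclassed.

```python
class MyList(list):           # Hypothetical subclass of the built-in list type
    pass

x = MyList([1, 2, 3])

exact_match = (type(x) == list)        # False: x's type is MyList, not list
instance_match = isinstance(x, list)   # True: MyList instances are lists too
plain_match = (type([1]) == list)      # True for an ordinary list
```

Code that tests type(x) == list would turn away x even though it supports every list operation.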

Other Types in Python

Besides the core objects studied in this chapter, a typical Python installation has dozens of other object types available as linked-in C extensions or Python classes. You’ll see examples of a few later in the book—regular expression objects, DBM files, GUI widgets, and so on. The main difference between these extra tools and the built-in types just seen is that the built-ins provide special language creation syntax for their objects (e.g., 4 for an integer, [1,2] for a list, the open function for files). Other tools are generally exported in a built-in module that you must first import to use. See Python’s library reference for a comprehensive guide to all the tools available to Python programs.

Built-in Type Gotchas

Part II concludes with a discussion of common problems that seem to bite new users (and the occasional expert), along with their solutions.

Assignment Creates References, Not Copies

Because this is such a central concept, it is mentioned again: you need to understand what’s going on with shared references in your program. For instance, in the following example, the list object assigned to name L is referenced both from L and from inside the list assigned to name M. Changing L in-place changes what M references too:

>>> L = [1, 2, 3]
>>> M = ['X', L, 'Y']       # Embed a reference to L.
>>> M
['X', [1, 2, 3], 'Y']

>>> L[1] = 0                # Changes M too
>>> M
['X', [1, 0, 3], 'Y']

This effect usually becomes important only in larger programs, and shared references are often exactly what you want. If they’re not, you can avoid sharing objects by copying them explicitly; for lists, you can always make a top-level copy by using an empty-limits slice:

>>> L = [1, 2, 3]
>>> M = ['X', L[:], 'Y']       # Embed a copy of L.
>>> L[1] = 0                   # Changes only L, not M 
>>> L
[1, 0, 3]
>>> M
['X', [1, 2, 3], 'Y']

Remember, slice limits default to 0 and the length of the sequence being sliced; if both are omitted, the slice extracts every item in the sequence, and so makes a top-level copy (a new, unshared object).

Repetition Adds One-Level Deep

Sequence repetition is like adding a sequence to itself a number of times. That’s true, but when mutable sequences are nested, the effect might not always be what you expect. For instance, in the following, X is assigned to L repeated four times, whereas Y is assigned to a list containing L repeated four times:

>>> L = [4, 5, 6]
>>> X = L * 4           # Like [4, 5, 6] + [4, 5, 6] + ...
>>> Y = [L] * 4         # [L] + [L] + ... = [L, L,...]

>>> X
[4, 5, 6, 4, 5, 6, 4, 5, 6, 4, 5, 6]
>>> Y
[[4, 5, 6], [4, 5, 6], [4, 5, 6], [4, 5, 6]]

Because L was nested in the second repetition, Y winds up embedding references back to the original list assigned to L, and is open to the same sorts of side effects noted in the last section:

>>> L[1] = 0            # Impacts Y but not X
>>> X
[4, 5, 6, 4, 5, 6, 4, 5, 6, 4, 5, 6]
>>> Y
[[4, 0, 6], [4, 0, 6], [4, 0, 6], [4, 0, 6]]

Solutions

This is really another way to create the shared mutable object reference case, so the same solutions as above apply here. And if you remember that repetition, concatenation, and slicing copy only the top level of their operand objects, these sorts of cases make much more sense.
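One way to apply the copying solution to the repetition case is to build each nested item from its own slice copy. This is a sketch, not the only fix; the names shared and copied are made up, and the list comprehension assumes Python 2.0 or later.

```python
L = [4, 5, 6]

shared = [L] * 4                       # Four references to the one list L
copied = [L[:] for i in range(4)]      # Four separate top-level copies of L

L[1] = 0                               # Visible through shared, not copied
copied[0][2] = 'x'                     # Touches only the first copy
```

Note that [L[:]] * 4 would not be enough: it copies L once and then repeats a reference to that single copy four times.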

Cyclic Data Structures

We actually encountered this gotcha in a prior exercise: if a collection object contains a reference to itself, it’s called a cyclic object. Python prints a “[...]” whenever it detects a cycle in the object, rather than getting stuck in an infinite loop:

>>> L = ['grail']              # Append reference to same object.
>>> L.append(L)                # Generates cycle in object: [...]
>>> L
['grail', [...]]

Besides understanding that the three dots represent a cycle in the object, this case is worth knowing about in general because it can lead to gotchas—cyclic structures may cause code of your own to fall into unexpected loops if you don’t anticipate them. For instance, some programs keep a list or dictionary of items already visited, and check it to know if they have reached a cycle. See the solutions to Part I Exercises in Appendix B for more on the problem, and the reloadall.py program at the end of Chapter 18 for a solution.

Don’t use a cyclic reference, unless you need to. There are good reasons to create cycles, but unless you have code that knows how to handle them, you probably won’t want to make your objects reference themselves very often in practice.
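The visited-items technique mentioned above might look roughly like this for nested lists. It is a hypothetical helper written for this illustration, not the reloadall.py program: it counts the non-list items in a structure and records the identity of each list it enters in a dictionary, so a list is never descended into twice.

```python
def count_items(obj, visited=None):
    """Count non-list leaves in a nested list, visiting each list only once."""
    if visited is None:
        visited = {}                   # Maps id(list) -> 1 for lists already seen
    if not isinstance(obj, list):
        return 1
    if id(obj) in visited:
        return 0                       # Cycle (or repeat): do not descend again
    visited[id(obj)] = 1
    total = 0
    for item in obj:
        total += count_items(item, visited)
    return total

L = ['grail']
L.append(L)                            # L now contains a reference to itself
leaves = count_items(L)                # Finishes instead of recursing forever
```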

Immutable Types Can’t Be Changed in-Place

Finally, you can’t change an immutable object in-place:

T = (1, 2, 3)
T[2] = 4             # Error!
T = T[:2] + (4,)     # Okay: (1, 2, 4)

Construct a new object with slicing, concatenation, and so on, and assign it back to the original reference if needed. That might seem like extra coding work, but the upside is that the previous gotchas can’t happen when using immutable objects such as tuples and strings; because they can’t be changed in-place, they are not open to the sorts of side effects that lists are.

Part II Exercises

This session asks you to get your feet wet with built-in object fundamentals. As before, a few new ideas may pop up along the way, so be sure to flip to Section B.2 when you’re done (and even when you’re not). If you have limited time, we suggest starting with exercise 11 (the most practical of the bunch), and then working from first to last as time allows. This is all fundamental material, though, so try to do as many of these as you can.

The basics. Experiment interactively with the common type operations found in the tables in Part II. To get started, bring up the Python interactive interpreter, type each of the expressions below, and try to explain what’s happening in each case:

2 ** 16
2 / 5, 2 / 5.0

"spam" + "eggs"
S = "ham"
"eggs " + S
S * 5
S[:0]
"green %s and %s" % ("eggs", S)

('x',)[0]
('x', 'y')[1]

L = [1,2,3] + [4,5,6]
L, L[:], L[:0], L[-2], L[-2:]
([1,2,3] + [4,5,6])[2:4]
[L[2], L[3]]
L.reverse(  ); L
L.sort(  ); L
L.index(4)

{'a':1, 'b':2}['b']
D = {'x':1, 'y':2, 'z':3}
D['w'] = 0
D['x'] + D['w']
D[(1,2,3)] = 4
D.keys(  ), D.values(  ), D.has_key((1,2,3))

[[  ]], ["",[  ],(  ),{  },None]

Indexing and slicing. At the interactive prompt, define a list named L that contains four strings or numbers (e.g., L=[0,1,2,3]). Then, experiment with some boundary cases.
1. What happens when you try to index out of bounds (e.g., L[4])?
2. What about slicing out of bounds (e.g., L[-1000:100])?
3. Finally, how does Python handle it if you try to extract a sequence in reverse—with the lower bound greater than the higher bound (e.g., L[3:1])? Hint: try assigning to this slice (L[3:1]=['?']) and see where the value is put. Do you think this may be the same phenomenon you saw when slicing out of bounds?
Indexing, slicing, and del. Define another list L with four items again, and assign an empty list to one of its offsets (e.g., L[2]=[ ]). What happens? Then assign an empty list to a slice (L[2:3]=[ ]). What happens now? Recall that slice assignment deletes the slice and inserts the new value where it used to be. The del statement deletes offsets, keys, attributes, and names. Use it on your list to delete an item (e.g., del L[0]). What happens if you del an entire slice (del L[1:])? What happens when you assign a nonsequence to a slice (L[1:2]=1)?
Tuple assignment. Type this sequence:
```
>>> X = 'spam'
>>> Y = 'eggs'
>>> X, Y = Y, X
```
What do you think is happening to X and Y when you type this sequence?
Dictionary keys. Consider the following code fragments:
```
>>> D = {  }
>>> D[1] = 'a'
>>> D[2] = 'b'
```
We learned that dictionaries aren’t accessed by offsets, so what’s going on here? Does the following shed any light on the subject? (Hint: strings, integers, and tuples share which type category?)
```
>>> D[(1, 2, 3)] = 'c'
>>> D
{1: 'a', 2: 'b', (1, 2, 3): 'c'}
```
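The hint about type categories can be explored with a small sketch like the following. It assumes a current Python 3 interpreter, and the name inventory is hypothetical; the point is only that immutable objects are accepted as keys, while a mutable list is rejected.

```python
inventory = {}
inventory[1] = 'integer key'        # integers are immutable
inventory['one'] = 'string key'     # so are strings
inventory[(1, 0)] = 'tuple key'     # and tuples of immutable items

try:
    inventory[[1, 0]] = 'list key'  # lists are mutable, so this fails
except TypeError as exc:
    print('rejected:', exc)

print(len(inventory))               # 3
```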
Dictionary indexing. Create a dictionary named D with three entries, for keys 'a', 'b', and 'c'. What happens if you try to index a nonexistent key (D['d'])? What does Python do if you try to assign to a nonexistent key d (e.g., D['d']='spam')? How does this compare to out-of-bounds assignments and references for lists? Does this sound like the rule for variable names?
Generic operations. Run interactive tests to answer the following questions:
1. What happens when you try to use the + operator on different/mixed types (e.g., string + list, list + tuple)?
2. Does + work when one of the operands is a dictionary?
3. Does the append method work for both lists and strings? How about using the keys method on lists? (Hint: What does append assume about its subject object?)
4. Finally, what type of object do you get back when you slice or concatenate two lists or two strings?
String indexing. Define a string S of four characters: S = "spam". Then type the following expression: S[0][0][0][0][0]. Any clues as to what’s happening this time? (Hint: recall that a string is a collection of characters, but Python characters are one-character strings.) Does this indexing expression still work if you apply it to a list such as: ['s', 'p', 'a', 'm']? Why?
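The hint can be checked directly with a short sketch such as this one, which uses a different string than the exercise; the names word and first are hypothetical, and a current Python 3 interpreter is assumed.

```python
word = 'ham'
first = word[0]

# Indexing a string yields another string, one character long.
print(type(first) is str, len(first))   # True 1

# Indexing that one-character string at offset 0 gives the same string back.
print(first[0] == first)                # True
```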
Immutable types. Define a string S of four characters again: S = "spam". Write an assignment that changes the string to "slam", using only slicing and concatenation. Could you perform the same operation using just indexing and concatenation? How about index assignment?
Nesting. Write a data-structure that represents your personal information: name (first, middle, last), age, job, address, email address, and phone number. You may build the data structure with any combination of built-in object types you like: lists, tuples, dictionaries, strings, numbers. Then access the individual components of your data structures by indexing. Do some structures make more sense than others for this object?
Files. Write a script that creates a new output file called myfile.txt and writes the string "Hello file world!" into it. Then write another script that opens myfile.txt, and reads and prints its contents. Run your two scripts from the system command line. Does the new file show up in the directory where you ran your scripts? What if you add a different directory path to the filename passed to open? Note: file write methods do not add newline characters to your strings; add an explicit '\n' at the end of the string if you want to fully terminate the line in the file.
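The note about newlines can be seen in a minimal sketch like the one below. It is not a solution to the exercise: it writes to a scratch file in a temporary directory (the name demo.txt and the variables path and text are hypothetical), and it assumes a current Python 3 interpreter.

```python
import os
import tempfile

# Hypothetical scratch file, kept out of the current directory.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

with open(path, 'w') as out:
    out.write('first')        # write adds no newline of its own
    out.write('second\n')     # an explicit '\n' terminates the line

with open(path) as source:
    text = source.read()

print(repr(text))             # 'firstsecond\n'
```

Because the first write carries no '\n', the two strings run together on a single line in the file.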
The dir function revisited. Try typing the following expressions at the interactive prompt. Starting with Version 1.5, the dir function has been generalized to list all attributes of any Python object you’re likely to be interested in. If you’re using a version earlier than 1.5, the __methods__ scheme has the same effect. If you’re using Python 2.2, dir is probably the only one of these that will work.
```
[  ].__methods__      # 1.4 or 1.5
dir([  ])                 # 1.5 and later

{  }.__methods__      # Dictionary
dir({  })
```
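One way to put dir results to use is to test for a method name before calling it, as in the hedged sketch below. The variable names are hypothetical, and a current Python 3 interpreter is assumed, so only the dir form is used here.

```python
list_attrs = dir([])    # attribute names of a list object
dict_attrs = dir({})    # attribute names of a dictionary object

print('append' in list_attrs)   # True
print('keys' in dict_attrs)     # True
print('keys' in list_attrs)     # False
```

dir returns an ordinary list of strings, so the usual membership test applies to it.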
