Hopefully, most readers are familiar with the notion of files: named
storage compartments on your computer that are managed by your
operating system. Our last built-in object type provides a way to
access those files inside Python programs. The built-in open function
creates a Python file object, which serves as a link to a file
residing on your machine. After calling open, you can read and write
the associated external file by calling file object methods.
Compared to types we’ve seen so far, file objects are somewhat
unusual. They’re not numbers, sequences, or mappings; instead,
they export methods only for common file processing tasks.
Technically, files are a prebuilt C extension type that provides a
thin wrapper over the underlying C stdio
filesystem; in fact, file object methods have an almost 1-to-1
correspondence to file functions in the standard C library.
Table 2-10 summarizes common file operations. To open a file, a
program calls the open function, with the external name first,
followed by a processing mode ('r' means open for input, 'w' means
create and open for output, 'a' means open for appending to the end,
and others we’ll ignore here). Both arguments must be Python strings.
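As a rough illustration of the three modes named above, the sketch below writes, appends to, and then reads back a scratch file. The file name and the use of the tempfile module are assumptions made for the example (so nothing real is overwritten), and the code is written for a current Python 3 interpreter.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'modes.txt')

f = open(path, 'w')        # 'w': create the file (or empty an existing one)
f.write('first\n')
f.close()

f = open(path, 'a')        # 'a': keep existing content and add to the end
f.write('second\n')
f.close()

f = open(path, 'r')        # 'r': open for input
text = f.read()
f.close()

print(repr(text))          # both lines are present, in the order written
```

Opening the same file with 'w' a second time would have discarded the first line; 'a' is what preserves it.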
Table 2-10. Common File Operations
Once you have a file object, call its methods to read from or write
to the external file. In all cases, file text takes the form of
strings in Python programs; reading a file returns its text in
strings, and text is passed to the write methods as strings. Reading
and writing both come in multiple flavors; Table 2-10 gives the most
common.
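To make the "multiple flavors" concrete, here is a small sketch using the standard read, readline, readlines, write, and writelines methods. The scratch file name is hypothetical, and the code assumes a current Python 3 interpreter.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'flavors.txt')

out = open(path, 'w')
out.write('line one\n')                          # write a single string
out.writelines(['line two\n', 'line three\n'])   # write each string in a list, as is
out.close()

inp = open(path, 'r')
whole = inp.read()         # the entire file as one string
inp.close()

inp = open(path, 'r')
first = inp.readline()     # the next line, including its end-of-line character
rest = inp.readlines()     # all remaining lines, as a list of strings
inp.close()
```

Note that writelines does not add end-of-line characters for you; each string in the list must already carry its own.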
Calling the file close method terminates your connection to the
external file. We talked about garbage collection in a footnote
earlier; in Python, an object’s memory space is automatically
reclaimed as soon as the object is no longer referenced anywhere in
the program. When file objects are reclaimed, Python automatically
closes the file if needed. Because of that, you don’t always need to
close your files manually, especially in simple scripts that don’t
run long. On the other hand, manual close calls can’t hurt and are
usually a good idea in larger systems.
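One way to make manual close calls dependable in a larger program is to put them in a try/finally block, sketched below. The file name is hypothetical, the try/finally arrangement is a suggestion rather than something the text prescribes, and the code assumes a current Python 3 interpreter.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'closing.txt')

log = open(path, 'w')
try:
    log.write('work in progress\n')
finally:
    log.close()                  # runs even if the write raises an exception

was_closed = log.closed          # True once close() has been called

try:
    log.write('too late\n')      # using a file after closing it is an error
    write_failed = False
except ValueError:
    write_failed = True
```

The closed attribute offers a quick way to check whether a file object is still connected to its external file.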
Here is a simple example that demonstrates file-processing basics. We
first open a new file for output, write a string (terminated with an
end-of-line marker, '\n'), and close the file. Later, we open the same
file again in input mode, and read the line back. Notice that the
second readline call returns an empty string; this is how Python file
methods tell us we’ve reached the end of the file (empty lines are
strings with just an end-of-line character, not empty strings).
>>> myfile = open('myfile', 'w')             # open for output (creates)
>>> myfile.write('hello text file\n')        # write a line of text
>>> myfile.close()
>>> myfile = open('myfile', 'r')             # open for input
>>> myfile.readline()                        # read the line back
'hello text file\n'
>>> myfile.readline()                        # empty string: end of file
''
There are additional, more advanced file methods not shown in
Table 2-10; for instance, seek resets your current position in a
file, flush forces buffered output to be written, and so on. See the
Python library manual or other Python books for a complete list of
file methods. Since we’re going to see file examples in Chapter 9, we
won’t present more examples here.
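For readers who want a quick look at the two methods just mentioned, the sketch below uses flush to force output to disk before the file is closed, and seek to rewind a file and read it again. The file name is hypothetical and the code assumes a current Python 3 interpreter.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'seeking.txt')

out = open(path, 'w')
out.write('hello text file\n')
out.flush()                                # hand buffered output to the operating system now
size_after_flush = os.path.getsize(path)   # nonzero: the data has left Python's buffer
out.close()

inp = open(path, 'r')
first_read = inp.readline()    # the line we wrote
at_end = inp.readline()        # '' because we are at the end of the file
inp.seek(0)                    # reset the current position to the start
second_read = inp.readline()   # the same line again
inp.close()
```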
File objects returned by the open function handle basic
file-interface chores. In Chapter 8, you’ll see a handful of related
but more advanced Python tools. Here’s a quick preview of all the
file-like tools available:

The os module provides interfaces for using low-level descriptor-based files.
The anydbm module provides an interface to access-by-key files.
The shelve and pickle modules support saving entire objects (beyond simple strings).
The os module also provides POSIX interfaces for processing pipes.
There are also optional interfaces to database systems, B-Tree based files, and more.
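As a small taste of two of these tools, the sketch below saves a dictionary with pickle and then writes and reads a file through the descriptor-based calls in os. The file names and the sample record are hypothetical, and the code assumes a current Python 3 interpreter, where pickled data is bytes and the files must therefore be opened with a 'b' (binary) mode, one of the modes not covered above. The anydbm module is omitted because it goes by the name dbm in Python 3.

```python
import os
import pickle
import tempfile

workdir = tempfile.mkdtemp()   # hypothetical scratch directory

# pickle: save an entire object, not just a string
record = {'name': 'Bob', 'langs': ['python', 'c']}
pkl_path = os.path.join(workdir, 'record.pkl')
f = open(pkl_path, 'wb')       # 'b' because pickled data is bytes in Python 3
pickle.dump(record, f)
f.close()

f = open(pkl_path, 'rb')
restored = pickle.load(f)      # an equal, but separate, dictionary
f.close()

# os: low-level, descriptor-based file access
raw_path = os.path.join(workdir, 'raw.txt')
fd = os.open(raw_path, os.O_WRONLY | os.O_CREAT)   # returns an integer descriptor
os.write(fd, b'hello\n')
os.close(fd)

fd = os.open(raw_path, os.O_RDONLY)
data = os.read(fd, 100)        # read up to 100 bytes
os.close(fd)
```

Unlike file objects, the descriptor calls work with integers and bytes and offer none of the line-oriented conveniences such as readline.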