Text editors are among the most important applications in the Unix world. They are used so often that many people spend more time within an editor than anywhere else on their Unix system. The same holds true for Linux.
The choice of an editor can be a religious one. Many editors exist, but the Unix community has arranged itself into two major groups: the Emacs camp and the vi camp. Because of vi’s somewhat nonintuitive user interface, many people (newcomers and seasoned users alike) prefer Emacs over vi. However, long-time users of vi (and single-finger typists) use it more efficiently than a more complex editor such as Emacs.
If vi is one end of the text-editor spectrum, Emacs is the other; they are widely different in their design and philosophy. Emacs is partly the brainchild of Richard Stallman, founder of the Free Software Foundation and author of much of the GNU software.
Emacs is a very large system with more features than any single Unix application to date (some people would even go so far as not to call it an editor but an “integrated environment”). It contains its own LISP language engine that you can use to write extensions for the editor. (Many of the functions within Emacs are written in Emacs LISP.) Emacs includes extensions for everything from compiling and debugging programs to reading and sending electronic mail to X Window System support and more. Emacs also includes its own online tutorial and documentation. The book Learning GNU Emacs by Debra Cameron, Bill Rosenblatt, and Eric Raymond (O’Reilly) is a popular guide to the editor.
Most Linux distributions include two variants of Emacs. GNU Emacs is the original version, which is still being developed, but development seems to have slowed down. XEmacs is larger, but much more user-friendly and better integrated with the X Window System (even though you can also use it from the command line, despite its name). If you are not tight on memory and have a reasonably fast computer, we suggest using XEmacs. Another advantage of XEmacs is that many useful packages that you would need to download and install separately with GNU Emacs are already shipped with XEmacs. We will not cover the differences here, though; the discussion in this section applies to both. Whenever we talk about Emacs in this section, we mean either version.
GNU Emacs is simply invoked as:
$ emacs options
Likewise, XEmacs is invoked as:
$ xemacs options
Most of the time, you don’t need options. You can specify filenames on the command line, but it’s more straightforward to read them in after starting the program.
In Emacs lingo, C-x means Ctrl-X, and M-p is equivalent to Alt-P. As you might guess, C-M-p means Ctrl-Alt-P.
Using these conventions, press C-x followed by C-f to read in a file or create a new one. The keystrokes display a prompt at the bottom of your screen showing your current working directory. You can create a buffer now to hold what will end up being the content of a new file; let’s call the file wibble.txt. We now see the following:
The mode line at the bottom indicates the name of the file as well as the type of buffer you’re in (which here is Fundamental). Emacs supports many kinds of editing modes; Fundamental is the default for plain-text files, but other modes exist for editing C and TeX source, modifying directories, and so on. Each mode has certain key bindings and commands associated with it, as we’ll see soon. Emacs typically determines the mode of the buffer based on the filename extension.
To the right of the buffer type is the word All, which means that you are currently looking at the entire file (which is empty). Typically, you will see a percentage, which represents how far into the file you are.
If you’re running Emacs under the X Window System, a new window will be created for the editor with a menu bar at the top, scrollbars, and other goodies. In Section 11.6.2 in Chapter 11, we discuss Emacs’s special features when used within X.
Emacs is more straightforward than vi when it comes to basic text editing. The arrow keys should move the cursor around the buffer; if they don’t (in case Emacs is not configured for your terminal), use the keys C-p (previous line), C-n (next line), C-f (forward character), and C-b (backward character).
If you find using the Alt key uncomfortable, press Escape and then p. Pressing and releasing Escape is equivalent to holding down Alt.
Already we must take the first aside on our tour of Emacs. Literally every command and key within Emacs is customizable. That is, with a “default” Emacs configuration, C-p maps to the internal function previous-line, which moves the cursor (also called “point”) to the previous line. However, you can easily rebind different keys to these functions, or write new functions and bind keys to them, and so forth. Unless otherwise stated, the keys we introduce here work for the default Emacs configuration. Later we’ll show you how to customize the keys for your own use.
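To see this mapping for yourself, you can ask Emacs which command a key currently runs. The following is only a small sketch, meant to be evaluated in the *scratch* buffer (type an expression and press C-j after it). key-binding is a standard Emacs function; the results noted in the comments assume a default configuration.

```elisp
;; Look up the command bound to a key in the current buffer.
(key-binding "\C-p")        ; previous-line in a default configuration

;; Multi-key sequences work the same way.
(key-binding "\C-x\C-f")    ; find-file in a default configuration
```

If a key has been rebound in your setup, the lookup reports the replacement command instead.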
Back to editing: using the arrow keys or one of the equivalents moves the cursor around the current buffer. Just start typing text, and it is inserted at the current cursor location. Pressing the Backspace or Delete key should delete text at the cursor. If it doesn’t, we’ll show how to fix it in Section 9.2.8 later in this chapter. Now begin to type:
The keys C-a and C-e move the cursor to the beginning and end of the current line, respectively. C-v moves forward a page; M-v moves back a page. There are many more basic editing commands, but we’ll allow the Emacs online documentation (discussed shortly) to fill those in.
In order to get out of Emacs, use the command C-x C-c. This is the first of the extended commands we’ve seen; many Emacs commands require several keys. C-x alone is a “prefix” to other keys. In this case, pressing C-x followed by C-c quits Emacs, first asking for confirmation if you want to quit without saving changes to the buffer.
You can use C-x C-s to save the current file, and C-x C-f to “find” another file to edit. For example, typing C-x C-f presents you with a prompt, such as:
Find file: /home/loomer/mdw/
where the current directory is displayed. After this, type the name of the file to find. Pressing the Tab key will do filename completion similar to that used in bash and tcsh. For example, entering:
Find file: /home/loomer/mdw/.bash
and pressing Tab opens another buffer, showing all possible completions, as so:
After you complete the filename, the *Completions* buffer goes away and the new file is displayed for editing. This is one example of how Emacs uses temporary buffers to present information.
Emacs allows you to use multiple buffers when editing text; each buffer may contain a different file you’re editing. When you load a file with C-x C-f, a new buffer is created to edit the file, but the original buffer isn’t deleted.
You can switch to another buffer using the C-x b command, which asks you for the name of the buffer (usually the name of the file within the buffer). For example, pressing C-x b presents the prompt:
Switch to buffer: (default wibble.txt)
The default buffer is the previous one visited. Press Enter to switch to the default buffer, or type another buffer name. Using C-x C-b will present a buffer list (in a buffer of its own), as so:
Popping up the buffer menu splits the Emacs screen into two “windows,” which you can switch between using C-x o. More than two concurrent windows are possible as well. In order to view just one window at a time, switch to the appropriate one and press C-x 1. This hides all the other windows, but their buffers remain available, and you can switch to them later using the C-x b command just described.
Using C-x k actually deletes a buffer from Emacs’s memory.
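The same buffer operations are available from Emacs LISP, which can help make the idea of a buffer concrete. Here is a minimal sketch to evaluate in the *scratch* buffer; the buffer name wibble-demo is our own invention, and the functions used (get-buffer-create, with-current-buffer, buffer-name, kill-buffer) are standard.

```elisp
;; Create a buffer that is not tied to any file, put text in it,
;; then delete it again. "wibble-demo" is a hypothetical name.
(let ((buf (get-buffer-create "wibble-demo")))
  (with-current-buffer buf
    (insert "some text"))         ; the text goes into buf, not the current buffer
  (prog1 (buffer-name buf)        ; the name you would type at the C-x b prompt
    (kill-buffer buf)))           ; what C-x k does interactively
```

Because the buffer visits no file, kill-buffer removes it without asking for confirmation.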
Already Emacs looks a bit complex; that is simply because it’s such a flexible system. Before we go any further, it is instructive to introduce Emacs’s built-in online help and tutorial. This documentation has also been published in book form as the GNU Emacs Manual, by Richard M. Stallman (GNU Press).
Using the C-h command gives you a list of help options on the last line of the display. Pressing C-h again describes what they are. In particular, C-h followed by t drops you into the Emacs tutorial. It should be self-explanatory, and an interactive tutorial about Emacs tells you more about the system than we can hope to cover here.
After going through the Emacs tutorial you should get accustomed to the Info system, where the rest of the Emacs documentation resides. C-h followed by i enters the Info reader. A mythical Info page might look like this:
File: intercal.info,  Node: Top,  Next: Instructions,  Up: (dir)

This file documents the Intercal interpreter for Linux.

* Menu:

* Instructions::    How to read this manual.
* Overview::        Preliminary information.
* Examples::        Example Intercal programs and bugs.
* Concept Index::   Index of concepts.
As you see, text is presented along with a menu to other “nodes.” Pressing m and then entering a node name from the menu will allow you to read that node. You can read nodes sequentially by pressing the spacebar, which jumps to the next node in the document (indicated by the information line at the top of the buffer). Here, the next node is Instructions, which is the first node in the menu.
Each node also has a link to the parent node (Up), which here is (dir), meaning the Info page directory. Pressing u takes you to the parent node. In addition, each node has a link to the previous node, if it exists (in this case, it does not). The p command moves to the previous node. The l command returns you to the node most recently visited.
Within the Info reader, pressing ? gives you a list of commands and pressing h presents you with a short tutorial on using the system. Since you’re running Info within Emacs, you can use Emacs commands as well (such as C-x b to switch to another buffer).
If you think that the Info system is arcane and obsolete, please keep in mind that it was designed to work on all kinds of systems, including those lacking graphics or powerful processing capabilities.
Other online help is available within Emacs. Pressing C-h C-h gives you a list of help options. One of these is C-h k, after which you press a key, and documentation about the function that is bound to that key appears.
There are various ways to move and duplicate blocks of text within Emacs. These methods involve use of the mark, which is simply a “remembered” cursor location you can set using various commands. The block of text between the current cursor location (point) and the mark is called the region.
You can set the mark using the key C-@ (or C-Space on most systems). Moving the cursor to a location and pressing C-@ sets the mark at that position. You can now move the cursor to another location within the document, and the region is defined as the text between mark and point.
Many Emacs commands operate on the region. The most important of these commands deal with deleting and yanking text. The command C-w deletes the current region and saves it in the kill ring. The kill ring is a list of text blocks that have been deleted. You can then paste (yank) the text at another location, using the C-y command. (Note that the semantics of the term yank differ between vi and Emacs. In vi, “yanking” text means copying it into a register without deleting it, while in Emacs, “yank” means to paste text.) Using the kill ring, you can paste not only the most recently deleted block of text, but also blocks of text that were deleted previously.
For example, type the following text into an Emacs buffer:
Now, move the cursor to the beginning of the second line (“Here is a line...”), and set the mark with C-@. Move to the end of the line (with C-e), and delete the region using C-w. The buffer should now look like the following:
In order to yank the text just deleted, move the cursor to the end of the buffer and press C-y. The line should be pasted at the new location:
Pressing C-y repeatedly will insert the text multiple times.
You can copy text in a similar fashion. Using M-w instead of C-w will copy the region into the kill ring without deleting it. (Remember that M- means holding down the Alt key or pressing Escape before the w.)
Text that is deleted using other kill commands, such as C-k, is also added to the kill ring. This means that you don’t need to set the mark and use C-w to move a block of text; any command that deletes more than one character will do.
In order to recover previously deleted blocks of text (which are saved on the kill ring), use the command M-y after yanking with C-y. M-y replaces the yanked text with the previous block from the kill ring. Pressing M-y repeatedly cycles through the contents of the kill ring. This feature is useful if you wish to move or copy multiple blocks of text.
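The kill and yank commands are ordinary Emacs LISP functions, so the sequence described above can be replayed in a scratch buffer. This is only a sketch with made-up sample text; kill-line and yank are the functions behind C-k and C-y, and with-temp-buffer keeps the experiment away from your real files.

```elisp
;; Move a line to the end of a throwaway buffer by killing and yanking it.
(with-temp-buffer
  (insert "first line\nsecond line\n")
  (goto-char (point-min))
  (kill-line 1)               ; like C-k with an argument: kills "first line" and its newline
  (goto-char (point-max))
  (yank)                      ; like C-y: reinserts the most recent kill
  (buffer-string))            ; => "second line\nfirst line\n"
```

Note that the killed text also lands on your real kill ring, so a later C-y in another buffer would paste it.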
Emacs also provides a more general register mechanism, similar to that found in vi. Among other things, you can use this feature to save text you want to paste in later. A register has a one-character name; let’s use a for this example:
At the beginning of the text you want to save, set the mark by pressing the Control key and spacebar together (or if that doesn’t work, press C-@).
Move point (the cursor) to the end of the region you want to save.
Press C-x r s (C-x x in some older Emacs versions) followed by the name of the register (a in this case).
When you want to paste the text somewhere, press C-x r i (C-x g in some older Emacs versions) followed by the name of the register, a.
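The register steps above can be sketched in Emacs LISP as well. The sample text is our own; copy-to-register and insert-register are the standard functions behind the register keys, and a register name is written as a character, such as ?a.

```elisp
;; Save a region into register a, then paste it back at another place.
(with-temp-buffer
  (insert "keep this text")
  ;; Copy everything between the two positions into register a.
  (copy-to-register ?a (point-min) (point-max))
  (goto-char (point-max))
  (insert " / ")
  ;; Insert the contents of register a at point.
  (insert-register ?a)
  (buffer-string))            ; => "keep this text / keep this text"
```

Unlike the kill ring, a register keeps its contents until you store something else under the same name.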
The most common way to search for a string within Emacs is to press C-s. This starts what is called an incremental search. You then start entering the characters you are looking for. Each time you press a character, Emacs searches forward for a string matching everything you’ve typed so far. If you make a mistake, just press the Delete key and continue typing the right characters. If the string cannot be found, Emacs beeps. If you find an occurrence but you want to keep searching for another one, press C-s again.
You can also search backward this way using the C-r key. Several other types of searches exist, including a regular expression search that you can invoke by pressing M-C-s. This lets you search for something like jo.*n, which matches names like John, Joan, and Johann. (By default, searches are not case-sensitive.)
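You can try a regular expression without touching a file by evaluating a few expressions in the *scratch* buffer. The sketch below uses the standard function string-match, which returns the position where the match starts or nil if there is none; the name Jack is our own counterexample, and case-fold-search is bound explicitly so that the result does not depend on your settings.

```elisp
;; Test the pattern jo.*n against several names, ignoring case.
(let ((case-fold-search t))
  (list (string-match "jo.*n" "John")
        (string-match "jo.*n" "Joan")
        (string-match "jo.*n" "Johann")
        (string-match "jo.*n" "Jack")))   ; => (0 0 0 nil)
```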
To replace a string, enter M-%. You are prompted for the string that is currently in the buffer, and then the one with which you want to replace it. Emacs displays each place in the buffer where the string is and asks you if you want to replace this occurrence. Press the spacebar to replace the string, the Delete key to skip this string, or a period to stop the search.
If you know you want to replace all occurrences of a string that follow your current place in the buffer, without being queried for each one, enter M-x replace-string. (The M-x key allows you to enter the name of an Emacs function and execute it, without use of a key binding. Many Emacs functions are available only via M-x, unless you bind them to keys yourself.) A regular expression can be replaced by entering M-x replace-regexp.
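Inside your own Emacs LISP functions, an unconditional replacement is usually written as a small search loop rather than a call to the interactive command. Here is a minimal sketch with invented sample text; search-forward and replace-match are standard functions.

```elisp
;; Replace every "cat" after point with "dog", without asking.
(with-temp-buffer
  (insert "the cat sat; the cat ran")
  (goto-char (point-min))                ; replacement starts at point
  (while (search-forward "cat" nil t)    ; t: return nil instead of signaling an error
    (replace-match "dog" nil t))         ; final t: insert the replacement literally
  (buffer-string))                       ; => "the dog sat; the dog ran"
```

Swapping search-forward for re-search-forward gives the regular expression variant.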
The name Emacs comes partly from the word “macros.” A macro is a simple but powerful feature that makes Emacs a pleasure to use. If you plan on doing anything frequently and repetitively, just press C-x (, perform the operation once, and then press C-x ). The two C-x commands with the opening and closing parentheses remember all the keys you pressed. Then you can execute the commands over and over again by pressing C-x e.
Here’s an example you can try on any text file; it capitalizes the first word of each line.
Press C-x ( to begin the macro.
Press C-a to put point at the beginning of the current line. It’s important to know where you are each time a macro executes. By pressing C-a you are making sure the macro will always go to the beginning of the line, which is where you want to be.
Press M-c to make the first letter of the first word a capital letter.
Press C-a again to return to the beginning of the line and C-n or the down arrow to go to the beginning of the following line. This ensures that the macro will start execution at the right place next time.
Press C-x ) to end the macro.
Press C-x e repeatedly to capitalize the following lines. Or press C-u several times, followed by C-x e. The repeated uses of C-u are prefix keys, causing the following command to execute many times (each C-u multiplies the repeat count by four). If you get to the end of the document while the macro is still executing, no harm is done; Emacs just beeps and stops executing the macro.
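For comparison, the macro recorded in the steps above could be written as an Emacs LISP function. This is a sketch of one possible equivalent, not something Emacs generates for you; the function name my-capitalize-line-starts is our own.

```elisp
;; A hand-written equivalent of the keyboard macro (hypothetical name).
(defun my-capitalize-line-starts (count)
  "Capitalize the first word of COUNT lines, starting at the current line."
  (interactive "p")              ; a prefix argument such as C-u becomes COUNT
  (dotimes (_ count)
    (beginning-of-line)          ; C-a
    (capitalize-word 1)          ; M-c
    (forward-line 1)))           ; C-a followed by C-n
```

After evaluating the definition, M-x my-capitalize-line-starts handles one line, and C-u M-x my-capitalize-line-starts handles four.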
Emacs provides interfaces for many programs, which you can run within an Emacs buffer. For example, Emacs modes exist for reading and sending electronic mail, reading Usenet news, compiling programs, and interacting with the shell. In this section, we’ll introduce some of these features.
To send electronic mail from within Emacs, press C-x m. This opens up a buffer that allows you to compose and send an email message:
Simply enter your message within this buffer and use C-c C-s to send it. You can also insert text from other buffers, extend the interface with your own Emacs LISP functions, and so on. Furthermore, an Emacs mode called RMAIL lets you read your electronic mail right within Emacs, but we won’t discuss it here because most people prefer standalone mailers. (Usually, these mailers let you choose Emacs as your editor for email messages.)
Similar to the RMAIL mail interface is GNUS, the Emacs-based newsreader, which you can start with the M-x gnus command. After startup (and a bit of chewing on your .newsrc file), a list of newsgroups will be presented, along with a count of unread articles for each:
GNUS is an example of the power of using Emacs interfaces to other tools. You get all the convenience of Emacs’s navigation, search, and macro capabilities, along with specific key sequences appropriate for the tool you’re using.
Using the arrow keys, you can select a newsgroup to read. Press the spacebar to begin reading articles from that group. Two buffers will be displayed, one containing a list of articles and the other displaying the current article.
Use n and p to move to the next and previous articles, respectively. Then use f and F to post a follow-up to the current article (either including or excluding the current article), and r and R to reply to the article via electronic mail. There are many other GNUS commands; use C-h m to get a list of them. If you’re used to a newsreader, such as rn, GNUS will be somewhat familiar.
Emacs provides a number of modes for editing various types of files. For example, there is C mode for editing C source code, and TeX mode for editing (surprise) TeX source. Each mode boasts features that make editing the appropriate type of file easier.
For example, within C mode, you can use the command M-x compile, which, by default, runs make -k in the current directory and redirects errors to another buffer. For example, the compilation buffer may contain:
cd /home/loomer/mdw/pgmseq/
make -k
gcc -O -O2 -I. -I../include -c stream_load.c -o stream_load.o
stream_load.c:217: syntax error before `struct'
stream_load.c:217: parse error before `struct'
You can move the cursor to a line containing an error message and press C-c C-c to make the cursor jump to that line in the corresponding source buffer. Emacs opens a buffer for the appropriate source file if one does not already exist. Now you can edit and compile programs entirely within Emacs.
Emacs also provides a complete interface to the gdb debugger, which is described in Section 14.1.6.3 in Chapter 14.
Usually, Emacs selects the appropriate mode for the buffer based on the filename extension. For example, editing a file with the extension .c in the filename automatically selects C mode for that buffer.
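The table Emacs consults for this choice is the standard variable auto-mode-alist, which pairs filename patterns with mode functions. As a sketch, the following line for your .emacs file adds one entry; the choice of the .txt extension and Text mode is purely our illustration.

```elisp
;; Open files whose names end in .txt in Text mode.
;; "\\.txt\\'" is a regular expression: a literal dot, "txt", then end of name.
(add-to-list 'auto-mode-alist '("\\.txt\\'" . text-mode))
```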
Shell mode is one of the most popular Emacs extensions. Shell mode allows you to interact with the shell in an Emacs buffer, using the command M-x shell. You can edit, cut, and paste command lines with standard Emacs commands. You can also run single shell commands from Emacs using M-!. If you use M-| instead, the contents of the current region are piped to the given shell command as standard input. This is a general interface for running subprograms from within Emacs.
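Both facilities can also be called from Emacs LISP. The sketch below assumes a Unix-like system where the echo and sort commands are available; shell-command-to-string and shell-command-on-region are standard Emacs functions, and the sample text is ours.

```elisp
;; Run a single shell command and capture its output as a string (compare M-!).
(shell-command-to-string "echo hello")

;; Pipe text through a command, as M-| does with the region.
;; The final t asks Emacs to replace the region with the command's output.
(with-temp-buffer
  (insert "pear\napple\n")
  (shell-command-on-region (point-min) (point-max) "sort" nil t)
  (buffer-string))              ; the two lines, now in sorted order
```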
The Emacs online documentation should be sufficient to get you on track to learning more about the system and growing accustomed to it. However, sometimes it is hard to locate some of the most helpful hints for getting started. Here we’ll present a rundown on certain customization options many Emacs users choose to employ to make life easier.
The Emacs personal customization file is .emacs, which should reside in your home directory. This file should contain code, written in Emacs LISP, which runs or defines functions to customize your Emacs environment. (If you’ve never written LISP before, don’t worry; most customizations using it are quite simple.)
One of the most common things users customize is key bindings. For instance, if you use Emacs to edit SGML documents, you can bind the key C-c s to switch to SGML mode. Put this in your .emacs file:
; C-c followed by s will put buffer into SGML mode.
(global-set-key "\C-cs" 'sgml-mode)
Comments in Emacs LISP start with a semicolon. The line that follows runs the command global-set-key. Now you don’t have to type in the long sequence M-x sgml-mode to start editing in SGML. Just press the two characters C-c s. This works anywhere in Emacs, no matter what mode your buffer is in, because it is global. (Of course, Emacs may also recognize an SGML or XML file by its suffix and put it in SGML mode for you automatically.)
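The same binding can also be written with the kbd function, which accepts the key notation used throughout this section. This is only an alternative spelling, sketched on the assumption that your Emacs provides kbd (current versions do).

```elisp
;; Same effect as binding the string "\C-cs".
(global-set-key (kbd "C-c s") 'sgml-mode)

;; Check the result: returns sgml-mode once the line above has been evaluated.
(key-binding (kbd "C-c s"))
```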
A customization that you might want to use is making the text mode the default mode and turning on the “auto-fill” minor mode (which makes text automatically wrap if it is too long for one line) like this:
; Make text mode the default, with auto-fill
(setq default-major-mode 'text-mode)
(add-hook 'text-mode-hook 'turn-on-auto-fill)
You don’t always want your key mappings to be global. As you use TeX mode, C mode, and other modes defined by Emacs, you’ll find useful things you’d like to do only in a single mode. Here, we define a simple LISP function to insert some characters into C code, and then bind the function to a key for our convenience:
(defun start-if-block ()
  (interactive)
  (insert "if ( ) {\n}\n")
  (backward-char 6))
We start the function by declaring it “interactive” so that we can invoke it (otherwise, it would be used only internally by other functions). Then we use the insert function to put the following characters into our C buffer:
if ( ) {
}
Strings in Emacs can contain standard C escape characters. Here, we’ve used \n for a newline.
Now we have a template for an if block. To put on the ribbon and the bow, our function also moves backward six characters so that point is within the parentheses, and we can immediately start typing an expression.
Our whole goal was to make it easy to insert these characters, so now let’s bind our function to a key:
(define-key c-mode-map "\C-ci" 'start-if-block)
The define-key function binds a key to a function. By specifying c-mode-map, we indicate that the key works only in C mode. There is also a tex-mode-map for TeX mode, and a lisp-mode-map that you will want to know about if you play with your .emacs file a lot.
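One caveat, stated as an assumption about your setup: in many Emacs versions c-mode-map does not exist until C mode has been loaded for the first time, so a define-key line near the top of .emacs may fail with a “void variable” error. A common workaround, sketched here, is to make the binding from the mode’s hook instead; c-mode-hook and local-set-key are standard, and start-if-block is the function defined earlier.

```elisp
;; Bind C-c i each time a buffer enters C mode, instead of
;; touching c-mode-map directly at startup.
(add-hook 'c-mode-hook
          (lambda ()
            (local-set-key "\C-ci" 'start-if-block)))
```

The binding takes effect in C mode buffers opened after this line has been evaluated.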
If you’d like to write your own Emacs LISP functions, you should read the Info pages for elisp, which should be available on your system. Two good books on writing Emacs LISP functions are An Introduction to Programming in Emacs Lisp, by Robert J. Chassell (GNU Press) and Writing GNU Emacs Extensions, by Bob Glickstein (O’Reilly).
Now here’s an important customization you may need. On many terminals the Backspace key sends the character C-h, which is the Emacs help key. To fix this, you should change the internal table Emacs uses to interpret keys, as follows:
(keyboard-translate ?\C-h ?\C-?)
Pretty cryptic code. \C-h is recognizable as the Control key pressed with h, which happens to produce the same ASCII code (8) as the Backspace key. \C-? represents the Delete key (ASCII code 127). Don’t confuse this question mark with the question marks that precede each backslash. ?\C-h means “the ASCII code corresponding to \C-h.” You could just as well specify 8 directly.
So now, both Backspace and C-h will delete. You’ve lost your help key. Therefore, another good customization would be to bind another key to C-h. Let’s use C-\, which isn’t used often for anything else. You have to double the backslash when you specify it as a key:
(keyboard-translate ?\C-\\ ?\C-h)
On the X Window System, there is a way to change the code sent by your Backspace key using the xmodmap command, but we’ll have to leave it up to you to do your own research. It is not a completely portable solution (so we can’t show you an example guaranteed to work), and it may be too sweeping for your taste (it also changes the meaning of the Backspace key in your xterm shell and everywhere else).
There are other key bindings you may want to use. For example, you may prefer to use the keys C-f and C-b to scroll forward (or backward) one page at a time, as in vi. In your .emacs file you might include the following lines:
(global-set-key "\C-f" 'scroll-up)
(global-set-key "\C-b" 'scroll-down)
Again, we have to issue a caveat: be careful not to redefine keys that have other important uses. (One way to find out is to use C-h k to tell you what a key does in the current mode. You should also consider that the key may have definitions in other modes.) In particular, you’ll lose access to a lot of functions if you rebind the prefix keys that start commands, such as C-x and C-c.
You can create your own prefix keys, if you really want to extend your current mode with lots of new commands. Use something like:
(global-unset-key "\C-d")
(global-set-key "\C-d\C-f" 'my-function)
First, we must unbind the C-d key (which simply deletes the character under the cursor) in order to use it as a prefix for other keys. Now, pressing C-d C-f will execute my-function.
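If you expect to hang many commands off one prefix, another approach is to give the prefix a keymap of its own. The following is a sketch only: my-prefix-map is a name we made up, my-function stands for whatever command you have written, and make-sparse-keymap and define-key are standard.

```elisp
;; A personal keymap reached through C-d (hypothetical name).
(defvar my-prefix-map (make-sparse-keymap)
  "Keymap for personal commands that start with C-d.")

;; Binding a key to a keymap turns that key into a prefix.
(global-set-key "\C-d" my-prefix-map)

;; C-d C-f now runs my-function.
(define-key my-prefix-map "\C-f" 'my-function)
```

Further commands can then be added with more define-key lines on my-prefix-map, without touching the global map again.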
You may also prefer to use another mode besides Fundamental or Text for editing “vanilla” files. Indented Text mode, for example, automatically indents lines of text relative to the previous line so that it starts in the same column (as with the :set ai function in vi). To turn on this mode by default, use:
; Default mode for editing text
(setq default-major-mode 'indented-text-mode)
You should also rebind the Enter key to indent the next line of text:
(define-key indented-text-mode-map "\C-m" 'newline-and-indent)
Emacs also provides “minor” modes, which are modes you use along with major modes. For example, Overwrite mode is a minor mode that causes newly typed characters to overwrite the text in the buffer, instead of inserting it. To bind the key C-r to toggle overwrite mode, use the command:
; Toggle overwrite mode.
(global-set-key "\C-r" 'overwrite-mode)
Another minor mode is Autofill, which automatically wraps lines as you type them. That is, instead of pressing the Enter key at the end of each line of text, you may continue typing and Emacs automatically breaks the line for you. To enable Autofill mode, use the commands:
(setq text-mode-hook 'turn-on-auto-fill)
; fill-column is local to each buffer, so set its default value
(setq-default fill-column 72)
This turns on Autofill mode whenever you enter Text mode (through the text-mode-hook variable). It also sets the point at which to break lines at 72 characters.
Even a few regular expression tricks can vastly increase your power to search for text and alter it in bulk. Regular expressions were associated only with Unix tools and languages for a long time; now they are popping up in other environments, such as Microsoft’s .NET, but only Unix offers them in a wide variety of places, such as text editors and the grep command where ordinary users can exploit them.
Let’s suppose you’re looking through a file that contains mail messages. You’re on a bunch of mailing lists with names, such as gyro-news and gyro-talk, so you’re looking for Subject lines with gyro- in them. You can use your text editor or the grep command to search for:
^Subject:.*gyro-
This means “look for lines beginning with Subject:, followed by any number of any kind of character, followed by gyro-.”
The regular expression is made up of a number of parts, some reproducing the plain text you’re looking for and others expressing general concepts like “beginning of line.” Figure 9-1 shows what the parts mean and how they fit together.
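To watch the parts work together, you can test the expression against a few sample lines in the Emacs *scratch* buffer. The sample strings below are invented for illustration; string-match is a standard function that returns the starting position of the match, or nil when nothing matches.

```elisp
;; Only a line that begins with Subject: and later contains gyro- matches.
(let ((re "^Subject:.*gyro-"))
  (list (string-match re "Subject: [gyro-news] meeting moved")
        (string-match re "From: gyro-talk-owner@example.org")
        (string-match re "Re: Subject: gyro-")))   ; => (0 nil nil)
```

The third string fails because the caret requires Subject: to sit at the very beginning of the line.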
Just to give a hint of how powerful and sophisticated regular expressions can be, let’s refine the one in Figure 9-1 for a narrower search. This time, we know that mailing lists on gyros send out mail with Subject lines that begin with the name of the list in brackets, such as Subject: [gyro-news] or Subject: [gyro-talk]. We can search for precisely such lines, as follows:
^Subject: *\[gyro-[a-z]*\]
Figure 9-2 shows what the parts of this expression mean. We’ll just mention a couple of interesting points here.
Brackets, like carets and asterisks, are special characters in regular expressions. Brackets are used to mark whole classes of characters you want to search for, such as [a-z] to represent “any lowercase character.” We don’t want the bracket before gyro to have this special meaning, so we put a backslash in front of it; this is called escaping the bracket. (In other words, we let the bracket escape being considered a metacharacter in the regular expression.)
The first asterisk in our expression follows a space, so it means “match any number of spaces in succession.” The second asterisk follows the [a-z] character class, so it applies to that entire construct. By itself, [a-z] matches one and only one lowercase letter. Together, [a-z]* means “match any number of lowercase letters in succession.”
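The bracketed expression can be checked the same way from the *scratch* buffer. This is a sketch with invented sample lines; note that inside an Emacs LISP string each backslash of the regular expression must be doubled, so \[ is written as \\[.

```elisp
;; Match Subject lines whose list name appears in literal brackets.
(let ((re "^Subject: *\\[gyro-[a-z]*\\]"))
  (list (string-match re "Subject: [gyro-news] hello")
        (string-match re "Subject:    [gyro-talk]")
        (string-match re "Subject: gyro-news")))   ; => (0 0 nil)
```

The second string shows the first asterisk absorbing several spaces; the third fails because the escaped bracket demands a literal [ before gyro.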
A sophisticated use of regular expressions can take weeks to learn, and readers who want to base applications on regular expressions would do well to read Mastering Regular Expressions, by Jeffrey Friedl (O’Reilly).