While sed is line-oriented and lets you fiddle and diddle to your heart’s content, awk is field-oriented and is ideal for manipulating database or delimited files. For example, if you have an address book file, you can use awk to find and change information in fields you specify, as in Code Listing 6.11. In the following steps, we’ll show you a sampling of the things you can do with awk, using an address book file as the example.
[ejr@hobbes manipulate]$ awk '{ print $1 }' address.book
Schmidt,
Feldman,
Brown,
Smith,
Jones,
[ejr@hobbes manipulate]$
De-what?
A delimited file uses a specific character to show where one bit of information ends and another begins. Each piece of information is a separate field. For example, a file that contains “John, Doe, Thornton, Colorado” is comma-delimited, sporting a comma between fields. Other files, such as the /etc/passwd file, use a colon (:) to separate the fields. Just about any symbol that’s not used in the content could be used as a delimiter.
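To make the idea of a delimiter concrete, here is a minimal sketch that pulls one field out of a comma-delimited line and one out of a colon-delimited line. Both sample lines are made up for illustration (the second only imitates the layout of an /etc/passwd entry), and the comma example omits the spaces after the commas so that the fields come out without leading blanks.

```shell
# Hypothetical one-line records; neither is read from a real file.
csv_line='John,Doe,Thornton,Colorado'
passwd_line='ejr:x:500:500:Sample User:/home/ejr:/bin/bash'

# -F sets the field separator; $3 is the third field, $NF the last one.
city=$(printf '%s\n' "$csv_line" | awk -F, '{ print $3 }')
login_shell=$(printf '%s\n' "$passwd_line" | awk -F: '{ print $NF }')

echo "$city"          # Thornton
echo "$login_shell"   # /bin/bash
```

The same awk program works on either line; only the character given to -F changes.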
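1. awk '{ print $1 }' address.book
At the shell prompt, use awk '{ print $1 }' address.book to look at the address.book file and select (and send to standard output) the first field in each record (line). More specifically, starting from the inside out: $1 refers to the first field of each record, print sends it to standard output, the braces enclose that action, and the single quotes keep the shell from interpreting the awk program before awk sees it.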
2. awk -F, '{ print $1 }' address.book
The -F flag tells awk to use the character following it (in this case, a comma) as the field separator. This change makes the output of the command a little cleaner and more accurate. If you were working with /etc/passwd, you’d use -F: to specify that the : is the field separator.
3. awk -F, '{ print $2 " " $1 " " $7 }' address.book > phone.list
With this code, you can pull specific fields, in an arbitrary order, from your database. Although it looks complex, it’s just one additional step from the previous example. Rather than printing a single field from the address book, we’re printing field 2, then a space, then field 1, then a space, then field 7. The final bit just redirects the output into a new file. This example would produce a list of names and phone numbers, as shown in Code Listing 6.12.
4. awk -F, '/CA/{ print $2 " " $1 " " $7 }' address.book > phone.list
You can also specify a matching pattern. Here, we added /CA/ to search and act on only the lines that contain CA, so only those lines will be in the phone.list file.
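The first three steps above can be tried end to end with the sketch below. The sample records and the field layout (last name, first name, street, city, state, ZIP, phone) are assumptions made up for this example, since the real address.book isn't shown in full; unlike the file in the listings, this sample has no space after each comma, so the step 1 output looks different from Code Listing 6.11.

```shell
# Made-up sample data. Assumed layout: last,first,street,city,state,zip,phone
book=$(mktemp)
cat > "$book" <<'EOF'
Schmidt,Sven,1 Main St,Tulsa,OK,74101,555-555-8382
Brown,John,22 Oak Ave,Tulsa,OK,74102,918-555-1234
Jones,Kelly,9 Elm Rd,San Jose,CA,95101,408-555-7253
EOF

# Step 1: with no -F, awk splits on whitespace, so the first "field"
# is everything up to the first space in the line.
step1=$(awk '{ print $1 }' "$book")

# Step 2: -F, splits on commas, so $1 is just the last name.
step2=$(awk -F, '{ print $1 }' "$book")

# Step 3: print fields 2, 1, and 7, with a literal space between them.
step3=$(awk -F, '{ print $2 " " $1 " " $7 }' "$book")

printf '%s\n' "$step1" "$step2" "$step3"
rm -f "$book"
```

On this sample, step 1 prints chunks such as Schmidt,Sven,1, which is why adding -F, makes the result "more accurate": the fields then line up with the commas rather than with the spaces.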
[ejr@hobbes manipulate]$ awk -F, '{ print $2 " " $1 " " $7 }' address.book > phone.list
[ejr@hobbes manipulate]$ more phone.list
Sven Schmidt 555-555-8382
Fester Feldman
John Brown 918-555-1234
Sally Smith 801-555-8982
Kelly Jones 408-555-7253
[ejr@hobbes manipulate]$
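One thing worth knowing about the /CA/ pattern in step 4 is that it matches CA anywhere in the line, not only in a state field. The sketch below shows the difference on two made-up records; the layout, and the guess that the state sits in field 5, are assumptions for illustration only.

```shell
# Made-up sample data. Assumed layout: last,first,street,city,state,zip,phone
book=$(mktemp)
cat > "$book" <<'EOF'
Smith,Sally,4 CANYON RD,Provo,UT,84601,801-555-8982
Jones,Kelly,9 Elm Rd,San Jose,CA,95101,408-555-7253
EOF

# /CA/ matches the letters CA anywhere in the record,
# so the CANYON RD line is selected along with the California one.
anywhere=$(awk -F, '/CA/ { print $2 " " $1 " " $7 }' "$book")

# Comparing a single field (assumed here to be the state, field 5) is stricter.
state_only=$(awk -F, '$5 == "CA" { print $2 " " $1 " " $7 }' "$book")

echo "$anywhere"
echo "---"
echo "$state_only"
rm -f "$book"
```

If your file has a dedicated state field, testing that field directly is usually the safer choice; the plain /CA/ form is fine when CA can't turn up anywhere else in a record.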
✓ Tips
You can load awk scripts from a file with awk -f script.awk filename. Just as with sed, this keeps retyping to a minimum, which is helpful with these long and convoluted commands. Refer to Chapter 10 for more details about scripting.
Take a glance at “Sorting Files with sort” later in this chapter and consider piping your awk output to sort. Let Unix do the tedious work for you!
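Both tips can be combined in one short sketch: the awk program lives in a file, and its output is piped to sort. The file name phone.awk, the sample records, and the field layout are all hypothetical; the script sets the field separator in a BEGIN block, which has the same effect as passing -F, on the command line.

```shell
# Work in a scratch directory so nothing of yours is overwritten.
work=$(mktemp -d)

# Made-up sample data. Assumed layout: last,first,street,city,state,zip,phone
cat > "$work/address.book" <<'EOF'
Schmidt,Sven,1 Main St,Tulsa,OK,74101,555-555-8382
Jones,Kelly,9 Elm Rd,San Jose,CA,95101,408-555-7253
Brown,John,22 Oak Ave,Tulsa,OK,74102,918-555-1234
EOF

# phone.awk (hypothetical name): the awk program, stored in a file.
cat > "$work/phone.awk" <<'EOF'
BEGIN { FS = "," }           # same effect as -F, on the command line
{ print $1 " " $2 " " $7 }   # last name first, so sort orders by last name
EOF

# -f loads the program from the file; sort then orders the lines.
sorted=$(awk -f "$work/phone.awk" "$work/address.book" | sort)

echo "$sorted"
rm -rf "$work"
```

Printing the last name first is a small design choice that lets plain sort do the ordering without any extra options.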