grep

grep (global regular expression print) is a utility that searches one or more files for lines that match a particular pattern. Unlike the find command, which searches directory structures for files that meet certain criteria, grep searches the contents of files.

We can use grep to search for a simple fixed character string or for more complex patterns called “regular expressions.” First let’s take a look at the command syntax, and then I’ll describe how to use grep to search for simple strings.

The command syntax for the grep command is as follows:

grep <-options> search-pattern files-to-search
					

Some of the more common options to the grep command are described in Table 4.1.

Table 4.1. grep Options
Option  Description
-c      Prints only a count of the lines that contain the pattern. The lines themselves are not displayed.
-h      Prevents the name of the file containing the matching line from being prefixed to that line. This is used when searching multiple files.
-i      Ignores the distinction between upper- and lowercase when searching for the string.
-l      Prints only the names of files with matching lines, separated by newline characters. Does not repeat the name of a file when the pattern is found more than once.
-n      Precedes each line by its line number in the file. (The first line is 1.)
-s      Suppresses error messages about nonexistent or unreadable files. The default is to display errors.
-v      Prints all lines except those that contain the pattern.
-w      Searches for the expression as a word, as if it were surrounded by \< and \>.
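
The effect of the -w option is easiest to see next to a plain search. The following is a minimal sketch, not taken from the original examples: the scratch directory (created with mktemp), the file contents, and the variable names are all assumptions made for illustration. Note that -w is listed in Table 4.1 and is also supported by GNU and BSD grep, but it is not required by every grep implementation.

```shell
# Hypothetical sample data in a scratch directory (illustrative only).
workdir=$(mktemp -d)
printf '%s\n' 'bcalkins Bill Calkins' \
              'calkins-admin shared account' > "$workdir/file1"

# Without -w the pattern may match inside a longer word such as "bcalkins".
plain_count=$(grep -c 'calkins' "$workdir/file1")

# With -w the match must be bounded by non-word characters or line edges.
word_count=$(grep -wc 'calkins' "$workdir/file1")

echo "plain: $plain_count  whole-word: $word_count"
rm -rf "$workdir"
```

Here the plain search counts both lines, while the -w search counts only the line in which calkins stands alone as a word.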

Searching for Fixed Character Strings

The simplest pattern to search for using grep is the fixed character string. In this example, I’ll use grep to search a file named file1 for any lines that contain the string bcalkins:

grep bcalkins file1 <cr> 

The system responds by listing all of the lines in the file that contain the string bcalkins:

bcalkins        Bill Calkins    ext. 123        Engineering 

To search for lines in every file in the current directory that have the string bcalkins, issue the grep command as follows:

grep bcalkins * <cr> 

The system will list each file that contains the text string, followed by the line in that file that contains the string, as follows:

file1:bcalkins          Bill Calkins    ext. 123        Engineering 
userlist:bcalkins       Bill Calkins    ext. 123        Engineering 

When searching multiple files, use the -h option if you do not want the filename displayed with each line. Alternatively, use the -l option to list only the filenames that contain the search pattern, as follows:

grep -l bcalkins * <cr> 

The system responds with the filenames but not the lines that contain the search string:

file1 
userlist 
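
To compare the default output with the -h and -l variants side by side, the following sketch may help. It assumes a scratch directory created with mktemp and two sample files whose contents are abbreviated from the examples above; the variable names are hypothetical.

```shell
# Hypothetical sample files in a scratch directory (illustrative only).
workdir=$(mktemp -d)
printf '%s\n' 'bcalkins Bill Calkins ext. 123 Engineering' \
              'sburge Steve Burge ext. 234 IT' > "$workdir/file1"
printf '%s\n' 'bcalkins Bill Calkins ext. 123 Engineering' > "$workdir/userlist"

# Each command substitution runs in a subshell, so the cd does not leak out.
with_names=$(cd "$workdir" && grep bcalkins *)        # filename:line
without_names=$(cd "$workdir" && grep -h bcalkins *)  # line only
names_only=$(cd "$workdir" && grep -l bcalkins *)     # filename only

printf 'default:\n%s\n\nwith -h:\n%s\n\nwith -l:\n%s\n' \
       "$with_names" "$without_names" "$names_only"
rm -rf "$workdir"
```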

So far, we’ve used grep to print all lines that contain a particular pattern. Use the -v option to list all lines that do not match a particular pattern, as follows:

grep -v bcalkins file1 <cr> 

The following lines are listed:

sburge          Steve Burge     ext. 234        IT 
dschurman       Dan Shurman     ext. 345        Purchasing 
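
One way to convince yourself that -v is the exact complement of a normal search is to count both sides. This is only a sketch under assumptions: the scratch directory, the abbreviated sample lines, and the variable names are illustrative and not part of the original example.

```shell
# Hypothetical sample file in a scratch directory (illustrative only).
workdir=$(mktemp -d)
printf '%s\n' 'bcalkins Bill Calkins ext. 123 Engineering' \
              'sburge Steve Burge ext. 234 IT' \
              'dschurman Dan Shurman ext. 345 Purchasing' > "$workdir/file1"

matching=$(grep -c bcalkins "$workdir/file1")       # lines that contain the string
non_matching=$(grep -vc bcalkins "$workdir/file1")  # lines that do not
total=$(( $(wc -l < "$workdir/file1") ))            # arithmetic strips any padding

echo "$matching matching + $non_matching non-matching = $total lines"
rm -rf "$workdir"
```

Every line falls into exactly one of the two groups, so the two counts always add up to the number of lines in the file.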

The grep command is case sensitive, meaning the string you specify is matched exactly. To ignore case, use the -i option as follows:

grep -i bcalkins * <cr> 

All files will be searched and the string bcalkins can be uppercase, lowercase, or any combination.
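
The difference that -i makes can be sketched with a small made-up file. The scratch directory, the sample lines, and the variable names below are assumptions for illustration only.

```shell
# Hypothetical sample file with mixed-case variants (illustrative only).
workdir=$(mktemp -d)
printf '%s\n' 'bcalkins' 'Bcalkins' 'BCALKINS' 'sburge' > "$workdir/file1"

sensitive=$(grep -c bcalkins "$workdir/file1")     # exact case only
insensitive=$(grep -ic bcalkins "$workdir/file1")  # any mix of upper/lowercase

echo "case sensitive: $sensitive  with -i: $insensitive"
rm -rf "$workdir"
```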

To display the line number of each matching line, use the -n option as follows:

grep -n bcalkins * <cr> 

grep will search all files in the current directory for lines that contain the string bcalkins. The system displays the filename that contains the string, followed by the line number in the file where the string appears, and then the line that contains the string, as follows:

file1:1:bcalkins        Bill Calkins    ext. 123        Engineering 
userlist:1:bcalkins     Bill Calkins    ext. 123        Engineering 

To get only a count of the number of lines in a file that contain the text string bcalkins, use the following command:

grep -c bcalkins file1 <cr> 

The system responds with the number of lines:

1 
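
Because -n and -c are easy to mix up, here is a small sketch that runs both against the same file. The scratch directory, the sample lines (including the made-up bcalkins-test account), and the variable names are assumptions for illustration.

```shell
# Hypothetical sample file; the string appears on lines 1 and 3 (illustrative only).
workdir=$(mktemp -d)
printf '%s\n' 'bcalkins Bill Calkins' \
              'sburge Steve Burge' \
              'bcalkins-test Test account' > "$workdir/file1"

numbered=$(grep -n bcalkins "$workdir/file1")  # each match prefixed by its line number
count=$(grep -c bcalkins "$workdir/file1")     # only the number of matching lines

printf 'with -n:\n%s\n\nwith -c: %s\n' "$numbered" "$count"
rm -rf "$workdir"
```

With a single file argument, grep omits the filename, so -n output begins directly with the line number.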

Using Regular Expressions in the Text Pattern

grep uses a set of special characters, known as regular expression metacharacters, for forming a search pattern. The regular expression metacharacters that can be used in a search pattern are as follows:

. Any single character

.* Any number of characters

x* Any number of the x character

[ ] Any character in the set

[ - ] Any character in the range

^ Pattern at the beginning of the line

$ Pattern at the end of the line

These regular expression metacharacters have a special meaning. They are called “metacharacters” because they represent something other than themselves. Because the shell also gives many of these characters a special meaning, it is best to enclose the pattern in single quotes (' ') or double quotes (" ") so that the shell passes it to grep unchanged.

Note

Don’t confuse these regular expression metacharacters with shell metacharacters, which are used to expand filenames. Shell metacharacters for filenames are described in Chapter 1.
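
To see why quoting matters, it helps to look at what the shell hands to a command before the command ever runs. In this sketch, echo stands in for grep so that the argument itself is visible; the scratch directory and the file names bfile and bnotes are hypothetical.

```shell
# Hypothetical scratch directory containing two files that start with "b".
workdir=$(mktemp -d)
touch "$workdir/bfile" "$workdir/bnotes"

# Unquoted: the shell expands b* to matching filenames before the command runs.
unquoted=$(cd "$workdir" && echo b*)

# Quoted: the pattern reaches the command exactly as typed.
quoted=$(cd "$workdir" && echo 'b*')

echo "unquoted argument(s): $unquoted"
echo "quoted argument:      $quoted"
rm -rf "$workdir"
```

Had this been grep b* file1, grep would have received bfile as its pattern and bnotes as a file to search, which is almost never what was intended.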


There might be times when you need to search for a pattern at the beginning or end of each line in a file. Here’s how to use the ^ metacharacter to search for a pattern only at the beginning of the line:

grep '^bcalkins' file1 <cr> 

In the example, grep will only locate lines in file1 that begin with the string bcalkins.

To only search for a pattern at the end of the line, use this command:

grep 'bcalkins$' file1 <cr> 

To extract only the lines that consist entirely of the phrase The network is the computer, type the following:

grep "^The network is the computer$" file1  <cr> 

Note

Don’t forget to use quotes around the entire expression.
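
The three anchored searches above can be tried together in one sketch. The scratch directory, the sample lines, and the variable names are assumptions for illustration and do not come from the original file1.

```shell
# Hypothetical sample file (illustrative only).
workdir=$(mktemp -d)
printf '%s\n' 'bcalkins Bill Calkins' \
              'backup owner: bcalkins' \
              'The network is the computer' \
              'The network is the computer, they say' > "$workdir/file1"

starts=$(grep '^bcalkins' "$workdir/file1")  # pattern at the beginning of the line
ends=$(grep 'bcalkins$' "$workdir/file1")    # pattern at the end of the line
whole=$(grep -c '^The network is the computer$' "$workdir/file1")  # entire line
anywhere=$(grep -c 'The network is the computer' "$workdir/file1") # unanchored

printf 'starts: %s\nends:   %s\nwhole-line matches: %s of %s\n' \
       "$starts" "$ends" "$whole" "$anywhere"
rm -rf "$workdir"
```

Using both ^ and $ rejects the fourth line, which contains the phrase but continues past it.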


Use the . (dot) to match any single character in a search string. For example, to search for the letter b followed by any three characters, use the . (dot) to specify the unknown characters, as follows:

grep "b..." * <cr> 

The system lists all lines that contain a b followed by at least three more characters:

file1:bcalkins      Bill Calkins    ext. 123        Engineering 
file1:sburge        Steve Burge     ext. 234        IT 
userlist:bcalkins   Bill Calkins    ext. 123        Engineering 
userlist:sburge     Steve Burge     ext. 234        IT 
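
Each dot must be filled by a real character, which the following sketch illustrates. The scratch directory, the one-word sample lines (including the made-up entry bob), and the variable name are assumptions for illustration.

```shell
# Hypothetical sample file (illustrative only).
workdir=$(mktemp -d)
printf '%s\n' 'bcalkins' 'sburge' 'bob' 'dschurman' > "$workdir/file1"

# "b..." needs a b with at least three characters after it on the same line.
dot_matches=$(grep 'b...' "$workdir/file1")

printf '%s\n' "$dot_matches"
rm -rf "$workdir"
```

The line bob is not listed because neither b in it is followed by three more characters.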

Use the * (asterisk) to match any number of repetitions of the character that precedes it; combined with the dot as .*, it matches any number of characters. For example, if I want to search for the letter w followed by any number of characters, I could issue the following grep command:

grep 'w.*' file1  <cr> 

Note

A dot followed by an asterisk (.*) represents zero or more characters in a regular expression.


All lines that contain a w followed by any number of characters would be listed as follows:

wcalkins 
www 
sburge  www admin 
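
Because the asterisk means "zero or more of the preceding character," it behaves differently from the shell's filename wildcard. The sketch below contrasts three patterns; the scratch directory, the sample lines, and the variable names are assumptions for illustration.

```shell
# Hypothetical sample file (illustrative only).
workdir=$(mktemp -d)
printf '%s\n' 'www' 'wcalkins' 'sburge' 'admin' > "$workdir/file1"

w_then_any=$(grep -c 'w.*' "$workdir/file1")  # a w followed by anything (or nothing)
zero_or_more=$(grep -c 'w*' "$workdir/file1") # zero or more w's: matches every line
one_or_more=$(grep -c 'ww*' "$workdir/file1") # one w, then zero or more w's

echo "w.*: $w_then_any  w*: $zero_or_more  ww*: $one_or_more"
rm -rf "$workdir"
```

The pattern w* on its own matches every line, because zero occurrences of w still counts as a match.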

When I enclose characters in brackets, [ ], I can specify a set of characters that are to be searched for. The system will match any one of the characters inside the brackets. For example:

grep '[Bb]calkins' *  <cr> 

grep will search for the string Bcalkins or bcalkins. The following results are displayed:

file1:bcalkins     Bill Calkins    ext. 123        Engineering 
file1:Bcalkins     Bruce Calkins   ext. 123        Engineering 
userlist:bcalkins  Bill Calkins    ext. 123        Engineering 

I also can search for a range of characters using the following expression:

grep '[a-ew-z]calkins' *  <cr> 

The system will search for variations of the word calkins that begin with a, b, c, d, e, w, x, y, or z. The results are as follows:

file1:bcalkins       Bill Calkins    ext. 123        Engineering 
file1:wcalkins       William Calkins ext. 123        Engineering 
userlist:bcalkins    Bill Calkins    ext. 123        Engineering 
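
Both bracket forms can be tried on one small file, as in the sketch below. The scratch directory, the sample lines (including the made-up entry mcalkins), and the variable names are assumptions for illustration. Setting LC_ALL=C is also my own precaution rather than part of the original commands: in some locales a range such as a-e may be interpreted by collation order rather than by plain ASCII order.

```shell
# Hypothetical sample file (illustrative only).
workdir=$(mktemp -d)
printf '%s\n' 'bcalkins' 'Bcalkins' 'wcalkins' 'mcalkins' > "$workdir/file1"

# A set: either B or b may come before "calkins".
set_matches=$(LC_ALL=C grep '[Bb]calkins' "$workdir/file1")

# Two ranges in one bracket: a through e, or w through z.
range_matches=$(LC_ALL=C grep '[a-ew-z]calkins' "$workdir/file1")

printf 'set:\n%s\n\nrange:\n%s\n' "$set_matches" "$range_matches"
rm -rf "$workdir"
```

The range search skips mcalkins because m lies outside both ranges, and it skips Bcalkins because the ranges cover lowercase letters only.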
