grep (global regular expression print) is a utility that searches one or more files for lines that contain a particular string pattern. Unlike the find command, which searches directory structures for files that meet certain criteria, grep searches the contents of files.
You can use grep to search for a simple fixed character string or for more complex patterns called “regular expressions.” First let’s take a look at the command syntax, and then I’ll describe how to use grep to search for simple strings.
The command syntax for the grep command is as follows:
grep <-options> search-pattern files-to-search
Some of the more common options to the grep command are described in Table 4.1.
The simplest pattern to search for using grep is the fixed character string. In this example, I’ll use grep to search a file named file1 for any lines that contain the string bcalkins:
grep bcalkins file1 <cr>
The system responds by listing all of the lines in the file that contain the string bcalkins:
bcalkins Bill Calkins ext. 123 Engineering
To search for lines in every file in the current directory that have the string bcalkins, issue the grep command as follows:
grep bcalkins * <cr>
The system will list each file that contains the text string, followed by the line in that file that contains the string, as follows:
file1:bcalkins Bill Calkins ext. 123 Engineering
userlist:bcalkins Bill Calkins ext. 123 Engineering
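The two searches above can be reproduced in a scratch directory. This is a minimal sketch: the file contents are modeled on the sample output, and the variable names (workdir, single, multi) and the use of mktemp -d are assumptions made for illustration.

```shell
# Build hypothetical sample files in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' \
  'bcalkins Bill Calkins ext. 123 Engineering' \
  'sburge Steve Burge ext. 234 IT' \
  'dschurman Dan Shurman ext. 345 Purchasing' > file1
printf '%s\n' 'bcalkins Bill Calkins ext. 123 Engineering' > userlist

# One file named: matching lines are printed as-is.
single=$(grep bcalkins file1)

# Several files named: each matching line gets a "filename:" prefix.
multi=$(grep bcalkins *)

printf '%s\n' "$single" "$multi"
```

The filename prefix appears only when grep is given more than one file to search.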
When searching multiple files, use the -h option if you do not want the filename displayed with each line. Alternatively, use the -l option to list only the names of the files that contain the search pattern, as follows:
grep -l bcalkins * <cr>
The system responds with the filenames but not the lines that contain the search string:
file1
userlist
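The -l and -h options can be compared side by side with a short sketch. The file named other and the variable names are hypothetical, added only to show that a file without a match is left out of the -l listing.

```shell
# Hypothetical sample files in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo 'bcalkins Bill Calkins ext. 123 Engineering' > file1
echo 'bcalkins Bill Calkins ext. 123 Engineering' > userlist
echo 'sburge Steve Burge ext. 234 IT' > other

# -l: print only the names of files that contain at least one match.
names=$(grep -l bcalkins *)

# -h: print the matching lines without the "filename:" prefix.
lines=$(grep -h bcalkins *)

printf '%s\n' "$names" "$lines"
```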
So far, we’ve used grep to print all lines that contain a particular pattern. Use the -v option to list all lines that do not match a particular pattern, as follows:
grep -v bcalkins file1 <cr>
The following lines are listed:
sburge Steve Burge ext. 234 IT
dschurman Dan Shurman ext. 345 Purchasing
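A quick way to check the inverted match is to rebuild the sample file and capture what -v keeps. This is only a sketch; the file contents are modeled on the output above, and the variable names are assumptions.

```shell
# Hypothetical copy of file1 in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' \
  'bcalkins Bill Calkins ext. 123 Engineering' \
  'sburge Steve Burge ext. 234 IT' \
  'dschurman Dan Shurman ext. 345 Purchasing' > file1

# -v inverts the test: only lines WITHOUT the string are printed.
kept=$(grep -v bcalkins file1)

printf '%s\n' "$kept"
```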
The grep command is case sensitive, meaning the string you specify is matched exactly as typed. To ignore case with grep, use the -i option as follows:
grep -i bcalkins * <cr>
All files will be searched and the string bcalkins can be uppercase, lowercase, or any combination.
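The effect of -i is easiest to see by counting matches with and without it. In this sketch the file named staff and its contents are hypothetical; the -c option used for counting is described later in this section.

```shell
# Hypothetical file with the same login name in different cases.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' \
  'bcalkins Bill Calkins' \
  'BCALKINS Bill Calkins' \
  'BCalkins Bill Calkins' \
  'sburge Steve Burge' > staff

# Default search is case sensitive: only the all-lowercase line matches.
exact=$(grep -c bcalkins staff)

# With -i, every capitalization of the string matches.
anycase=$(grep -c -i bcalkins staff)

echo "case sensitive: $exact, case insensitive: $anycase"
```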
To display the line number of each line that contains the string pattern, use the -n option as follows:
grep -n bcalkins * <cr>
grep will search all files in the current directory for lines that contain the string bcalkins. The system displays the filename that contains the string, followed by the line number in the file where the string appears, and then the line that contains the string, as follows:
file1:1:bcalkins Bill Calkins ext. 123 Engineering
userlist:1:bcalkins Bill Calkins ext. 123 Engineering
To get only a count of the number of lines in a file that contain the text string bcalkins, use the following command:
grep -c bcalkins file1 <cr>
The system responds with the number of lines:
1
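The difference between -n (show line numbers) and -c (show a count) can be sketched as follows. The file contents are modeled on the earlier examples, and the variable names are assumptions.

```shell
# Hypothetical copy of file1 in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' \
  'bcalkins Bill Calkins ext. 123 Engineering' \
  'sburge Steve Burge ext. 234 IT' \
  'dschurman Dan Shurman ext. 345 Purchasing' > file1

# -n: prefix each matching line with its line number in the file.
numbered=$(grep -n sburge file1)

# -c: print only how many lines matched, not the lines themselves.
count=$(grep -c bcalkins file1)

printf '%s\n' "$numbered" "$count"
```

With a single file, -n prints the line number without a filename in front of it.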
grep uses a set of special characters, known as regular expression metacharacters, to form a search pattern. The regular expression metacharacters covered in this section are as follows:
^ Beginning of the line
$ End of the line
. Any single character
x* Zero or more occurrences of the x character
[ ] Any one of the characters inside the brackets
[ - ] Any character in the range
These regular expression metacharacters have a special meaning. They are called “metacharacters” because they represent something other than themselves. Because some of these characters are also special to the shell, it is best to enclose the pattern in single quotes (' ') or double quotes (" ") so that it reaches grep unchanged.
Note
Don’t confuse these regular expression metacharacters with shell metacharacters, which are used to expand filenames. Shell metacharacters for filenames are described in Chapter 1.
There might be times when you need to search for a pattern at the beginning or end of each line in a file. Here’s how to use the ^ (caret) metacharacter to search for a pattern only at the beginning of the line:
grep '^bcalkins' file1 <cr>
In the example, grep will only locate lines in file1 that begin with the string bcalkins.
To search for a pattern only at the end of the line, use the $ (dollar sign) metacharacter:
grep 'bcalkins$' file1 <cr>
To extract lines that contain only the phrase The network is the computer, and nothing else, type the following:
grep "^The network is the computer$" file1 <cr>
Note
Don’t forget to use quotes around the entire expression.
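The three anchored searches can be tried against a small test file. This is a sketch under stated assumptions: the file named notes, its contents, and the variable names are all hypothetical.

```shell
# Hypothetical test file in a throwaway directory.
workdir=$(mktemp -d)
notes="$workdir/notes"
printf '%s\n' \
  'bcalkins Bill Calkins ext. 123 Engineering' \
  'Manager: bcalkins' \
  'The network is the computer' \
  'The network is the computer, they say' > "$notes"

# ^ anchors the pattern to the start of the line.
at_start=$(grep '^bcalkins' "$notes")

# $ anchors the pattern to the end of the line.
at_end=$(grep 'bcalkins$' "$notes")

# Both anchors together: the pattern must be the entire line.
whole_line=$(grep '^The network is the computer$' "$notes")

printf '%s\n' "$at_start" "$at_end" "$whole_line"
```

The quotes keep the shell from interpreting the $ and the spaces before grep sees the pattern.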
Use the . (dot) to match any single character in a search string. For example, to search for the letter b followed by any three characters, use the . (dot) to stand in for the unknown characters, as follows:
grep "b..." * <cr>
The system lists all lines that contain a b followed by any three characters:
file1:bcalkins Bill Calkins ext. 123 Engineering
file1:sburge Steve Burge ext. 234 IT
userlist:bcalkins Bill Calkins ext. 123 Engineering
userlist:sburge Steve Burge ext. 234 IT
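It may help to see that each dot stands for any character at all, including a space, and that the match is not limited to whole words. The input strings in this sketch are hypothetical.

```shell
# Hypothetical input, one candidate string per line.
matches=$(printf '%s\n' 'bcalkins' 'cab fare' 'bob' 'abc' | grep 'b...')

# 'bcalkins' matches (b + "cal"); 'cab fare' matches (b + " fa").
# 'bob' and 'abc' do not: no b is followed by three more characters.
printf '%s\n' "$matches"
```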
Use the * (asterisk) to match zero or more occurrences of the character that precedes it. Combined with the dot, .* matches any number of characters. For example, if I want to search for a string that begins with the letter w followed by any number of characters, I could issue the following grep command:
grep 'w.*' file1 <cr>
Note
A dot followed by an asterisk (.*) represents zero or more characters in a regular expression.
All lines that contain a w followed by any number of characters are listed as follows:
wcalkins www sburge www admin
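The asterisk applies to whatever precedes it, which a short sketch can make concrete. The input strings and variable names here are hypothetical.

```shell
# .* after w: a w followed by zero or more of any character.
any_after_w=$(printf '%s\n' 'wcalkins' 'sburge' 'admin www' | grep 'w.*')

# b* on its own: zero or more b characters between a and c.
star_only=$(printf '%s\n' 'ac' 'abc' 'abbc' 'adc' | grep 'ab*c')

printf '%s\n' "$any_after_w" "$star_only"
```

Because .* can also match zero characters, the pattern 'w.*' selects the same lines as the pattern 'w' alone.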
When I enclose characters in brackets, [ ], I can specify a set of characters that are to be searched for. The system will match any one of the characters inside the brackets. For example:
grep '[Bb]calkins' * <cr>
grep will search for the string Bcalkins or bcalkins. The following results are displayed:
file1:bcalkins Bill Calkins ext. 123 Engineering
file1:Bcalkins Bruce Calkins ext. 123 Engineering
userlist:bcalkins Bill Calkins ext. 123 Engineering
I also can search for a range of characters using the following expression:
grep '[a-ew-z]calkins' * <cr>
The system will search for variations of the word calkins that begin with a, b, c, d, e, w, x, y, or z. The results are as follows:
file1:bcalkins Bill Calkins ext. 123 Engineering
file1:wcalkins William Calkins ext. 123 Engineering
userlist:bcalkins Bill Calkins ext. 123 Engineering
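Both bracket forms can be checked against a short list of names. This is a sketch under stated assumptions: the names are hypothetical, and LC_ALL=C is set because the meaning of a range such as a-e can vary with the locale.

```shell
# Assumption: use the C locale so ranges follow plain ASCII order.
LC_ALL=C
export LC_ALL

# Hypothetical list of login names, one per line.
names='bcalkins Bill
Bcalkins Bruce
wcalkins William
mcalkins Mary'

# A set: either B or b may come before "calkins".
either_case=$(printf '%s\n' "$names" | grep '[Bb]calkins')

# Two ranges in one set: a through e, or w through z.
in_range=$(printf '%s\n' "$names" | grep '[a-ew-z]calkins')

printf '%s\n' "$either_case" "$in_range"
```

Neither Bcalkins nor mcalkins is selected by the range pattern, because B and m fall outside both ranges.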