Manucomp Systems


Grep this

Using grep, fgrep, and egrep to search for strings of words

The grep utility, which allows files to be searched for strings of words, uses a syntax similar to the regular expression syntax of the vi, ex, ed, and sed editors. grep comes in three flavors, grep, fgrep, and egrep, all of which I'll cover in this article.

The name grep is derived from the editor command g/re/p, which literally translates to "globally search for a regular expression and print what you find." Regular expressions are at the core of grep, and I'll cover them after a brief description of some of the utility's command options.

The simplest grep command is grep (search pattern) (files list), as in:

grep hello *

The output of this command might be something like this:

$ grep hello *
story.txt: so I said hello and she smiled back
intro.txt: use the hello.c program as an example of C programming
$

grep is case sensitive, so in order to change the search to include "hello," "Hello," or "HELLO," use the -y or -i option. Earlier versions of grep used -y, and later versions use -i. -y is now considered obsolete, although some versions of grep do support both. In the following example, more hellos show up because the search is case independent.

$ grep -i hello *
story.txt: so I said hello and she smiled back
story.txt: I could hear my echo, "HELLO."
intro.txt: use the hello.c program as an example of C programming
hello.c: printf("Hello, world. \n");
$

This command searches all files in the current directory and prints the file name and the line containing the string "hello" for any files that contain that string.

The output of grep varies depending on whether you're searching one or several files. If only one file is named on the command line, the output doesn't include the file name, as in the following example:

$ grep -i hello hello.c
printf("Hello, world. \n");
$

The one-file rule applies whether you use a wild card in your file list or not. If hello.c were the only file in the current directory, using a wild card to locate the file would still produce an unnamed file output. In the following example, the user is searching for any C files containing "hello." There is only one C file in the directory, so the output is identical to the previous example.

$ grep -i hello *.c
printf("Hello, world. \n");
$

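Standard grep has no option that changes this behavior (GNU grep adds -H to force the file name), but naming /dev/null as an extra file makes grep treat the search as a multi-file one and print the name. You could also use the -l option instead, which prints the file name only and not the line containing the string. At least you would know the name of the file that contained the string.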

$ grep -il hello *.c
hello.c
$
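
If you want both the file name and the matching line when only one file is named, one common trick is to add /dev/null as a second file argument. The following is a minimal sketch, assuming a scratch directory made with mktemp and a made-up hello.c; neither comes from the examples above.

```shell
# Scratch directory with a single, hypothetical C file.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) {
    printf("Hello, world.\n");
    return 0;
}
EOF

# One file named: grep prints the line without the file name.
single=$(grep -i hello *.c)

# /dev/null counts as a second file, so grep prefixes each match
# with the name of the file it came from.
named=$(grep -i hello /dev/null *.c)

printf '%s\n' "$single" "$named"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Because /dev/null is always empty, it never contributes a match of its own; it only changes how grep labels the output.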

The -l option can be used to extract a list of files containing the string. The file name is printed only once, even though the string may appear in multiple lines within that file. In the following example, story.txt appears only once, even though it contains more than one "hello."

$ grep -il hello *
hello.c
intro.txt
story.txt
$

The -l option suppresses most of the other output options from grep. On the other hand, the -n option will print a line number as well as the text, as in the following example:

$ grep -in hello *
hello.c:7: printf("Hello, world. \n");
intro.txt:44: use the hello.c program as an example of C programming
story.txt:110: so I said hello and she smiled back
story.txt:187: I could hear my echo, "HELLO."
$

The -v option outputs the complement of the search, i.e., all lines not containing the requested search pattern.

$ grep -iv hello intro.txt
You will be able to get more practice if you
at its simplest
$

The -c option prints only a count of lines matched. It also has the interesting and useful side effect of listing all the files it searches, not just the successful hits.

$ grep -ic hello *
data.txt:0
hello.c:1
intro.txt:1
intro2.txt:0
story.txt:2
$
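
It is worth stressing that -c counts matching lines, not individual occurrences of the string. Here is a small sketch that shows the difference, assuming a made-up file named echo.txt in a scratch directory.

```shell
# Scratch directory with one hypothetical text file.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
printf '%s\n' 'hello, hello, hello' 'Hello again' 'goodbye' > echo.txt

# The first line holds three hellos but is counted once,
# so two matching lines give a count of 2.
count=$(grep -ic hello echo.txt)
echo "$count"    # prints 2

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```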

Some versions of grep come with -r as an option, which prompts grep to search recursively through subdirectories. The default behavior is to search only one directory, so the -r option, as provided in GNU and other implementations of grep, is the exception rather than the rule.
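
Where -r is not available, a recursive search can usually be approximated by letting find walk the directory tree and hand the files to grep. This is only a sketch under assumed file names (story.txt, docs/old/notes.txt, and docs/empty.txt are invented for the demonstration); it lists the files that contain the string, in the same way as the -l examples above.

```shell
# Scratch tree with files at different depths (all names hypothetical).
dir=$(mktemp -d) || exit 1
mkdir -p "$dir/docs/old"
printf 'so I said hello\n' > "$dir/story.txt"
printf 'Hello again\n' > "$dir/docs/old/notes.txt"
printf 'nothing here\n' > "$dir/docs/empty.txt"
cd "$dir" || exit 1

# find visits every regular file under the current directory and
# passes the names to grep; -il prints each matching file name once.
found=$(find . -type f -exec grep -il hello {} + | sort)
printf '%s\n' "$found"
# ./docs/old/notes.txt
# ./story.txt

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```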

Going wild with grep

So far I've covered some of the input and output options, but the real power of grep is in its search pattern, which uses regular expressions. grep can match simple strings, as we saw in the "hello" example we played with above; but it can also use a variety of wild cards and special symbols to create a regular expression to search for more complex strings.

I will begin with some of the simpler characters in a regular expression. A ^ (caret) character means the start of a line and a $ (dollar) character means the end of one.

The wild cards used by grep frequently clash with the special symbols that the shell uses, so the usual practice is to enclose complex search strings within single quotes. The two following examples would match "hello" at the start and at the end of a line, respectively. Add the -i option to match any case version.

$ grep '^hello' *

$ grep 'hello$' *
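
To see the two anchors at work, here is a short sketch you can paste into a shell. The file greet.txt and its three lines are made up for the demonstration.

```shell
# Scratch directory with one hypothetical text file.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
printf '%s\n' 'hello there' 'I said hello' 'well, hello again' > greet.txt

# Only the line that begins with "hello" matches.
starts=$(grep '^hello' greet.txt)
echo "$starts"    # prints: hello there

# Only the line that ends with "hello" matches.
ends=$(grep 'hello$' greet.txt)
echo "$ends"      # prints: I said hello

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

The third line contains "hello" in the middle, so neither anchored pattern selects it.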

The dot or period character (.) will match any single character. For example, the following would match any character followed by "ello," as in "aello," "bello," "cello," and so on all the way through "zello." Odd combinations, like "1ello" and "?ello," would also be included; any combination of one initial character followed by "ello" is valid. The dot does not match the beginning or end of a line; therefore, "ello" at the start of a line would not be matched.

$ grep '.ello' *
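
The following sketch, using an invented file named words.txt, shows that the dot needs a real character to match, so a line that starts with "ello" is skipped.

```shell
# Scratch directory with one hypothetical word list.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
printf '%s\n' 'cello' 'ello there' 'say hello' '1ello' > words.txt

# "ello there" is not printed: nothing precedes "ello" on that line.
dot=$(grep '.ello' words.txt)
printf '%s\n' "$dot"
# cello
# say hello
# 1ello

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```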

Optional characters can be enclosed in square brackets ([ ]) causing any of the enclosed characters to be matched. The following search string would match "hello," "cello," or "jello."

$ grep '[hcj]ello' *

Optional characters can also be specified by using a range consisting of two characters separated by a hyphen. The following example would match "bay," "cay," or "day."

$ grep '[b-d]ay' *

A caret (^) placed immediately after the opening bracket inverts the sense of the match. The following would match any character followed by "ay" except the combinations "bay," "cay," and "day."

$ grep '[^b-d]ay' *

Note that options and ranges represent a match of a single character.
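
Here is a small sketch comparing a range with its inverted form. The file words.txt is made up, and LC_ALL=C is set only as a precaution so that the range b-d is interpreted as plain ASCII regardless of the local language settings.

```shell
# Scratch directory with one hypothetical word list.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
printf '%s\n' 'bay' 'cay' 'day' 'hay' 'say' 'ay' 'okay' > words.txt

# Range: b, c, or d followed by "ay".
in_range=$(LC_ALL=C grep '[b-d]ay' words.txt)
printf '%s\n' "$in_range"
# bay
# cay
# day

# Inverted range: any single character other than b, c, or d, then "ay".
# The bare "ay" line is skipped because the brackets must still match
# one character.
negated=$(LC_ALL=C grep '[^b-d]ay' words.txt)
printf '%s\n' "$negated"
# hay
# say
# okay

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```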

Any single character match (including a single character matched by an option/range specification) can be repeated by using the asterisk character (*). An asterisk following a single character means "zero or more occurrences" of the preceding match. The following search requests any line containing "hello" followed by "dolly" where the words are separated by zero or more spaces. Note that the asterisk follows the space after "hello" and therefore applies to the space character.

$ grep 'hello *dolly' *

This search would match any of the following, without regard to the number of spaces between the words.

hellodolly    
hello dolly  
hello   dolly
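
A quick way to check this is to count the matching lines in a test file. The sketch below assumes a made-up file named dolly.txt that also contains two lines that should not match.

```shell
# Scratch directory with one hypothetical text file.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
printf '%s\n' 'hellodolly' 'hello dolly' 'hello   dolly' \
    'hello, dolly' 'hello' > dolly.txt

# The asterisk applies to the space, so zero, one, or three spaces all
# match. The comma in "hello, dolly" is not a space, and the last line
# has no "dolly" at all.
count=$(grep -c 'hello *dolly' dolly.txt)
echo "$count"    # prints 3

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```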

The asterisk can be applied to an option or range. The following search matches "c" and "t" with any number of vowels (or no vowels) in between.

$ grep 'c[aeiou]*t' somewords.txt
cat
coat
coot
cot
cout
cut
ct
$
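
Keep in mind that grep prints any line that contains a match, so a longer word with "cat" buried inside it would also show up. The sketch below, using an invented word list, combines the pattern with the ^ and $ anchors described earlier to accept only lines that consist entirely of the match.

```shell
# Scratch directory with one hypothetical word list.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
printf '%s\n' 'cat' 'coat' 'ct' 'scatter' 'cart' > somewords.txt

# Unanchored: "scatter" is printed because it contains "cat".
# "cart" is not, since "r" is neither a vowel nor "t".
loose=$(grep 'c[aeiou]*t' somewords.txt)
printf '%s\n' "$loose"
# cat
# coat
# ct
# scatter

# Anchored at both ends: the whole line must be c, vowels, t.
strict=$(grep '^c[aeiou]*t$' somewords.txt)
printf '%s\n' "$strict"
# cat
# coat
# ct

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```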