1.13 Analyze text using basic regular expressions

The grep utility filters input line by line and looks for matches. In its simplest use, grep prints the lines that contain text matching a pattern. It can find fixed sequences of characters and can ignore case with the -i option.

Now we create a file, superheroes.txt, with these lines:

Ant-Man
Batman
Batgirl
Captain America
Catwoman
Daredevil
Deadpool
Iron Man
Magneto
Spider-Man
Wonder Woman
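
One way to create the file is with a here-document. This is only a sketch: any text editor works just as well, and the file is assumed to be written to the current directory.

```shell
# Write the eleven sample names to superheroes.txt, one per line.
# Quoting 'EOF' keeps the shell from expanding anything inside the block.
cat > superheroes.txt <<'EOF'
Ant-Man
Batman
Batgirl
Captain America
Catwoman
Daredevil
Deadpool
Iron Man
Magneto
Spider-Man
Wonder Woman
EOF

# Show the file to confirm its contents.
cat superheroes.txt
```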

In the example below, grep scans each line in the superheroes.txt file and looks for an m, followed by an a, and then followed by an n. Apart from having to be contiguous, those letters can appear anywhere on the line, even embedded in a larger word. Each listed line contains the string man, ignoring case (the -i option).

# grep -i man superheroes.txt
Ant-Man
Batman
Catwoman
Iron Man
Spider-Man
Wonder Woman

The tool can also exclude, rather than include, the lines that match. Use the -v option to omit lines that match.

# grep -i -v man superheroes.txt
Batgirl
Captain America
Daredevil
Deadpool
Magneto
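
As a quick sanity check, the two listings should account for every line in the file between them. The sketch below uses the standard -c option, which prints a count of selected lines instead of the lines themselves. The variable names are illustrative only.

```shell
# Recreate the sample file so this snippet runs on its own.
printf '%s\n' Ant-Man Batman Batgirl 'Captain America' Catwoman Daredevil \
  Deadpool 'Iron Man' Magneto Spider-Man 'Wonder Woman' > superheroes.txt

matching=$(grep -i -c man superheroes.txt)      # lines containing "man", any case
excluded=$(grep -i -c -v man superheroes.txt)   # lines that do not contain it
total=$(wc -l < superheroes.txt)                # every line in the file

# $((...)) also strips the padding some wc implementations add.
echo "matching=$matching excluded=$excluded total=$((total))"
```

Every line either matches or does not, so the two counts always add up to the total.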

A regular expression can filter for a specific location, such as the start or end of a line or the beginning or end of a word. A regular expression, commonly abbreviated regex, can also describe alternates (which you might describe as "this" or "that"); fixed-, variable-, or indefinite-length repetition; ranges (for example, "any of the letters from a-m"); and classes, or kinds of characters ("printable characters" or "punctuation"), among other techniques.
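
A minimal sketch of three of these ideas, run against superheroes.txt: a line anchor, a whole-word match, and a character class. It assumes a grep that supports the -w option, which GNU and BSD grep provide but POSIX does not require.

```shell
# Recreate the sample file so this snippet runs on its own.
printf '%s\n' Ant-Man Batman Batgirl 'Captain America' Catwoman Daredevil \
  Deadpool 'Iron Man' Magneto Spider-Man 'Wonder Woman' > superheroes.txt

# Location: ^ anchors the match to the start of the line.
grep -E '^D' superheroes.txt            # prints Daredevil and Deadpool

# Word boundaries: -w requires "Man" to stand alone as a word,
# so Batman and Catwoman are skipped.
grep -w 'Man' superheroes.txt           # prints Ant-Man, Iron Man, Spider-Man

# Character class: [[:punct:]] matches any punctuation character,
# here the hyphen.
grep -E '[[:punct:]]' superheroes.txt   # prints Ant-Man and Spider-Man
```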

The table below shows some common regular expression operators. You can string together the primitives in the table (and other operators) to build complex regular expressions.

Operator         Purpose
. (period)       Match any single character.
^ (caret)        Match the empty string that occurs at the beginning of a line or string.
$ (dollar sign)  Match the empty string that occurs at the end of a line.
A                Match an uppercase letter A.
a                Match a lowercase a.
\d               Match any single digit. This is a Perl-style operator: GNU grep accepts it only with the -P option; with -E, write [0-9] or [[:digit:]] instead.
\D               Match any single non-digit character (Perl-style; with -E, write [^0-9]).
\w               Match any single word character (a letter, a digit, or an underscore); roughly equivalent to [[:alnum:]_].
[A-E]            Match any of uppercase A, B, C, D, or E.
[^A-E]           Match any character except uppercase A, B, C, D, or E.
X?               Match zero or one occurrence of the capital letter X.
X*               Match zero or more capital Xs.
X+               Match one or more capital Xs.
X{n}             Match exactly n capital Xs.
X{n,m}           Match at least n and no more than m capital Xs. If you omit m, the expression matches at least n Xs.
(abc|def)+       Match a sequence of one or more occurrences of abc or def; abc, def, and abcdef would all match.
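
A few of these operators can be tried directly against superheroes.txt. The patterns below are illustrative sketches, not an exhaustive tour; each uses -E so that ?, +, {n}, and | keep their special meaning without backslashes.

```shell
# Recreate the sample file so this snippet runs on its own.
printf '%s\n' Ant-Man Batman Batgirl 'Captain America' Catwoman Daredevil \
  Deadpool 'Iron Man' Magneto Spider-Man 'Wonder Woman' > superheroes.txt

# Range plus anchor: names whose first letter is A, B, or C.
grep -E '^[A-C]' superheroes.txt
# prints Ant-Man, Batman, Batgirl, Captain America, Catwoman

# Fixed repetition: two consecutive o characters.
grep -E 'o{2}' superheroes.txt               # prints Deadpool

# Any character, repeated exactly seven times between both anchors:
# lines that are exactly seven characters long.
grep -E '^.{7}$' superheroes.txt             # prints Ant-Man, Batgirl, Magneto

# Alternation with an optional group: Bat or Cat, an optional "wo", then man.
grep -E '^(Bat|Cat)(wo)?man$' superheroes.txt   # prints Batman and Catwoman
```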

Here are a few examples of regular expressions using grep as the search tool. Many other UNIX tools, including the interactive editors vi and Emacs, the stream editors sed and awk, and most modern programming languages, also support regular expressions. Once you learn the (admittedly cryptic) syntax of regular expressions, you can transfer your expertise among tools, programming languages, and operating systems.

Find lines that begin with "Bat":

# grep -E '^Bat' superheroes.txt
Batman
Batgirl

Find lines that end with "man":

# grep -E 'man$' superheroes.txt
Batman
Catwoman
Wonder Woman

Find lines that end with "man" or "Man":

# grep -E '[Mm]an$' superheroes.txt
Ant-Man
Batman
Catwoman
Iron Man
Spider-Man
Wonder Woman

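Find lines that begin with "cat" or "bat", ignoring case (the third form accepts any first character, so it is broader, but it gives the same result on this file):

# grep -i -E '^(bat|cat)' superheroes.txt
or
# grep -i -E '^[bc]at' superheroes.txt
or
# grep -i -E '^.at' superheroes.txt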
Batman
Batgirl
Catwoman
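
The forms above agree on superheroes.txt, but similar-looking patterns are not interchangeable in general. As a minimal sketch, using a hypothetical list of names (Matador and Wombat are not in the sample file), this shows what changes when the first character is a wildcard or the ^ anchor is dropped:

```shell
# Hypothetical input, chosen only to expose the differences.
names=$(printf '%s\n' Batman Catwoman Matador Wombat)

# Alternation anchored at the start: only names beginning with bat or cat.
printf '%s\n' "$names" | grep -i -E '^(bat|cat)'   # prints Batman, Catwoman

# . accepts any first character, so Matador matches too.
printf '%s\n' "$names" | grep -i -E '^.at'         # prints Batman, Catwoman, Matador

# Without ^ the match can occur anywhere on the line, so Wombat matches.
printf '%s\n' "$names" | grep -i -E '[bc]at'       # prints Batman, Catwoman, Wombat
```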

Regular expressions are extremely powerful; the number and kind of operators and techniques you can command are enormous. There is so much information and practical knowledge that only a fraction of it can be presented here.

At the command line, you'll find many ways to use regular expressions. Virtually every command that processes text supports regular expressions of one form or another. Most shells also expand patterns to match file names, although these wildcard (glob) patterns are not regular expressions: some operators behave slightly differently and others greatly so.
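
To make the difference between file-name patterns and regular expressions concrete, here is a small sketch that selects the same files both ways. The file names are hypothetical, and the work happens in a temporary directory created with mktemp -d (assumed to be available, as it is on GNU and BSD systems).

```shell
# Work in a throwaway directory so no real files are touched.
dir=$(mktemp -d)
touch "$dir/heroes.txt" "$dir/villains.txt" "$dir/notes.md"

# Shell pattern: * stands for any run of characters and the dot is literal.
# The shell expands *.txt into file names before ls runs.
glob_matches=$(cd "$dir" && ls *.txt)

# Regular expression: * would mean "repeat the previous item" and . means
# any character, so the dot is escaped and $ anchors the suffix.
regex_matches=$(cd "$dir" && ls | grep -E '\.txt$')

printf 'glob:\n%s\nregex:\n%s\n' "$glob_matches" "$regex_matches"

rm -r "$dir"
```

Both approaches list heroes.txt and villains.txt, but the pattern text needed to get there differs.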

CentOS 7

No features.

openSUSE Leap 42.3

No features.

Ubuntu 17.04

No features.

Publication/Release Date: Jul 06, 2017
