Use Extended Regular Expressions with the grep command
The grep command accepts a pattern and a location as arguments. It searches the provided location for text that matches the pattern. The pattern is also called an expression. grep assigns special meanings to several symbols so that users can customize the expression. This tutorial explains these special-meaning symbols and how they work.
Regular and Extended regular expressions
Grep assigns special meanings to a few characters. These characters are known as metacharacters. Initially, grep assigned the following characters as metacharacters. These characters are also known as default metacharacters.
^ $ . [ ] *
Later, grep added the following characters to this list.
( ) { } ? + |
Use the -E switch with the grep command to enable the special meanings of the characters that were added later. Without the -E option, grep treats them as literals. A character is a literal when the command uses its original meaning. If the command uses the special meaning of a character instead, the character is a metacharacter.
For example, if you use the + sign without the -E option, grep searches for the + (plus) sign itself. Here, the plus is a literal. However, if you use it with the -E option, grep uses its special meaning: it matches the preceding character or group one or more times. Here, the plus is a metacharacter.
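The difference is easy to see on a small file. The following sketch assumes a throwaway sample file whose name and four lines are invented for this demonstration. It runs the same pattern once without -E and once with -E.

```shell
# Create a small sample file (its contents are made up for this demo).
sample=$(mktemp)
printf '%s\n' 'a+b' 'aab' 'ab' 'b' > "$sample"

# Without -E, the plus sign is a literal: only lines containing the text "a+" match.
basic=$(grep 'a+' "$sample")

# With -E, the plus sign is a metacharacter: "a+" means one or more "a" characters.
extended=$(grep -E 'a+' "$sample")

echo "Without -E:"
echo "$basic"
echo "With -E:"
echo "$extended"

rm -f "$sample"
```

Without -E, only the line a+b matches. With -E, the lines a+b, aab and ab match, because each of them contains at least one a.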
The pipe sign (|) is another example. The original implementation treats it as a regular character, while the new implementation defines it as a metacharacter. As a metacharacter, it allows you to search for multiple words.
Suppose you want to search for two user accounts, sanjay and rick, in the local database file. The /etc/passwd file stores local user accounts. The following command will not work for this.
#grep "sanjay|rick" /etc/passwd
Without the -E option, grep will treat it as a single word. It will search for the string sanjay|rick instead of searching for two separate words: sanjay and rick. Use the -E option to use the pipe sign for the special meaning. The following command searches for two words: sanjay and rick separately.
#grep -E "sanjay|rick" /etc/passwd

The -E option instructs grep to use the special meaning of the pipe. In its special meaning, the pipe works as an OR operator that separates alternative words. It instructs grep to match lines that contain either word. You can use the pipe sign multiple times in a pattern to search for several words at once. For example, the following command searches for the words abc, fgh, xyz, mno and jkl.
#grep -E "abc|fgh|xyz|mno|jkl" [source file]
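To try the pipe without touching the real /etc/passwd file, the following sketch builds a stand-in file. The three account lines are invented for this demonstration and only imitate the layout of /etc/passwd.

```shell
# Build a stand-in for /etc/passwd (these three entries are invented for the demo).
accounts=$(mktemp)
printf '%s\n' \
  'sanjay:x:1001:1001::/home/sanjay:/bin/bash' \
  'rick:x:1002:1002::/home/rick:/bin/bash' \
  'maria:x:1003:1003::/home/maria:/bin/bash' > "$accounts"

# Without -E the pipe is a literal, so grep looks for the text "sanjay|rick".
# grep -c prints the number of matching lines; "|| true" ignores the
# non-zero exit status that grep returns when nothing matches.
literal_hits=$(grep -c 'sanjay|rick' "$accounts" || true)

# With -E the pipe means "or", so lines containing either name match.
# cut keeps only the first colon-separated field (the account name).
either_hits=$(grep -E 'sanjay|rick' "$accounts" | cut -d: -f1)

echo "Matches without -E: $literal_hits"
echo "Accounts found with -E:"
echo "$either_hits"

rm -f "$accounts"
```

The first search finds no lines, while the second one prints sanjay and rick and skips maria.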
The grep command regex example
This section presents a small project-based example of regexes. This project extracts all links from an HTML source file. You can use the source code of any webpage to create an HTML source file for practice.
Start a web browser and open the webpage from which you want to extract links. Press CTRL + U to display the source code of the web page.
Press CTRL + A to select the entire source code, and then press CTRL + C to copy it.

Open a terminal and create a new text file in a text editor. In the editor's edit mode, right-click and select the paste option.

Save the file.

The following command extracts all links from the file.
#grep -Eoi '<a[^>]+>.*</a>' html_file

The following outline explains the above command.
Options
-E This option tells grep that the search pattern contains the metacharacters that were added later.
-o By default, grep prints the entire line that contains a match. This option makes it print only the matching part of the line.
-i This option tells grep to ignore case while matching the pattern.
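The effect of each option can be checked on a single line of HTML. The sketch below assumes a one-line sample file that is invented for this demonstration; it deliberately writes the anchor tag in upper case so that the -i option makes a visible difference.

```shell
# A one-line sample (invented for this demo) with an upper-case anchor tag.
page=$(mktemp)
printf '%s\n' '<p>See <A HREF="https://example.com/">the site</A> today.</p>' > "$page"

# Without -i the match is case-sensitive, so "<a" does not match "<A".
# "|| true" ignores the non-zero exit status grep returns for no match.
case_sensitive=$(grep -E '<a[^>]+>' "$page" || true)

# With -i the case is ignored; without -o the whole line is printed.
whole_line=$(grep -Ei '<a[^>]+>' "$page")

# With -o only the part of the line that matched the pattern is printed.
only_match=$(grep -Eoi '<a[^>]+>' "$page")

echo "Case-sensitive: '$case_sensitive'"
echo "Whole line:     $whole_line"
echo "Only the match: $only_match"

rm -f "$page"
```

The first search returns nothing, the second prints the complete paragraph line, and the third prints only the opening anchor tag.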
Special meaning metacharacters
- <a The starting point of the anchor tag.
- [^>] Matches any single character except the > symbol.
- + Matches the preceding item one or more times.
- > The ending point of the opening anchor tag.
The first part of the pattern, <a[^>]+>, matches text that starts with <a and continues through every following character until it reaches a > sign. The > sign ends the opening anchor tag. The + sign applies only to the bracket expression before it. It requires at least one character other than > between <a and >.
- .* A dot (.) matches any single character. A star (*) matches the preceding item zero or more times. Together, they match any run of characters, which here is the text between the opening and closing anchor tags.
- </a> The closing point of the anchor tag.
Collectively, the above pattern matches a text string that starts with <a, continues up to the first >, is followed by any text, and ends with </a>. Because .* matches as much text as it can, a line that contains several links is reported as a single match that runs from the first <a to the last </a> on that line.
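The pattern can be tried on a few lines of HTML before running it on a full page. The following sketch assumes a small sample file; its contents and the example.com URLs are invented for this demonstration. The last line holds two links so that the behavior of .* on such a line is visible.

```shell
# A three-line HTML sample (invented for this demo).
links=$(mktemp)
printf '%s\n' \
  '<h1>Links</h1>' \
  '<p><a href="https://example.com/docs">Docs</a></p>' \
  '<p><a href="https://example.com/a">A</a> and <a href="https://example.com/b">B</a></p>' > "$links"

# Print every part of the file that matches the complete anchor pattern.
matches=$(grep -Eoi '<a[^>]+>.*</a>' "$links")
echo "$matches"

rm -f "$links"
```

The heading line produces no output. The second line yields one anchor element. The third line yields one long match that covers both links and the text between them, because .* extends to the last </a> on the line.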
Displaying only anchor tags
If you need only the opening anchor tags, use the following command. It leaves out the part of the expression that matches the linked text and the closing tag.
#grep -Eoi '<a[^>]+>' html_file
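A quick way to check this shorter pattern is to run it on a line that contains two links. The sample file in the sketch below is invented for this demonstration.

```shell
# Two lines of HTML (invented for this demo); the second line has two links.
links=$(mktemp)
printf '%s\n' \
  '<p><a href="https://example.com/docs">Docs</a></p>' \
  '<p><a href="https://example.com/a">A</a> and <a href="https://example.com/b">B</a></p>' > "$links"

# Print only the opening anchor tags, one per output line.
tags=$(grep -Eoi '<a[^>]+>' "$links")
tag_count=$(printf '%s\n' "$tags" | grep -c '<a')

echo "$tags"
echo "Opening anchor tags found: $tag_count"

rm -f "$links"
```

Each opening tag is printed on its own line, so the two links that share a line are reported separately. The closing </a> tags are not printed, because they do not start with <a.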

Extracting all links and saving them in a text file
Combine the following three commands to extract all links or URLs from an HTML file.
#grep -Eoi '<a[^>]+>' html_file
#grep -Eo 'href="[^"]+"'
#grep -Eo 'https://[^"]+' > link-only
The following syntax combines the above commands.
#grep -Eoi '<a[^>]+>' html_file | grep -Eo 'href="[^"]+"' | grep -Eo 'https://[^"]+'

To save all links in a text file, redirect the final output to the file.
#grep -Eoi '<a[^>]+>' html_file | grep -Eo 'href="[^"]+"' | grep -Eo 'https://[^"]+' > link-only

- The first command receives its input from the file named html_file. The second command receives its input from the first command. The third command receives its input from the second command.
- The first command extracts all opening anchor tags from the file and sends them to the second command.
- The second command extracts the href attributes from the output of the first command and sends the result to the third command.
- The third command extracts the URLs that start with https:// from the second command's output and saves them to the text file. Links that use http:// and relative links are not included.
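The three stages can be checked on a small sample before using them on a real page. In the sketch below, the sample file, its example.com and example.org URLs, and the temporary output file are invented for this demonstration. The last command is a variant that is not part of the original pipeline: it assumes you also want http:// links and uses the ? metacharacter to make the s optional.

```shell
# Sample input and output files (invented for this demo).
html_file=$(mktemp)
link_only=$(mktemp)
printf '%s\n' \
  '<a href="https://example.com/docs">Docs</a>' \
  '<a class="nav" href="http://example.org/old">Old</a>' \
  '<a href="/about">About</a>' > "$html_file"

# The tutorial's pipeline: opening anchor tags -> href attributes -> https:// URLs.
grep -Eoi '<a[^>]+>' "$html_file" | grep -Eo 'href="[^"]+"' | grep -Eo 'https://[^"]+' > "$link_only"
https_links=$(cat "$link_only")

# Variant (an assumption, not from the tutorial): "s?" matches zero or one "s",
# so both http:// and https:// URLs are kept.
all_links=$(grep -Eoi '<a[^>]+>' "$html_file" | grep -Eo 'href="[^"]+"' | grep -Eo 'https?://[^"]+')

echo "https only:"
echo "$https_links"
echo "http and https:"
echo "$all_links"

rm -f "$html_file" "$link_only"
```

The original pipeline saves only https://example.com/docs. The variant also keeps http://example.org/old. The relative link /about is dropped in both cases, because it contains no scheme for the last pattern to match.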
This tutorial is part of the series "The grep command in Linux: - usage, options, and syntax explained through examples". The parts of this series are as follows:
Chapter 1 grep options, regex, parameters and regular expressions
Chapter 2 Grep Command in Linux Explained with Practical Examples
Chapter 3 Use Extended Regular Expressions with the grep command
Chapter 4 grep regex Practical Examples of Regular Expressions
Conclusion
grep is a powerful tool for extracting specific data from a text source. This tutorial explained extended regular expressions through an example that extracts all links from an HTML source file.
Author Laxmi Goswami Updated on 2025-11-28