RHCSA Series: Editing Text Files with Nano and Vim / Analyzing text with grep and regexps – Part 4

Analyzing Text with Grep and Regular Expressions

By now you have learned how to create and edit files using nano or vim. Say you become a text editor ninja, so to speak – now what? Among other things, you will also need how to search for regular expressions inside text.

A regular expression (also known as “regex” or “regexp“) is a way of identifying a text string or pattern so that a program can compare the pattern against arbitrary text strings. Although the use of regular expressions along with grep would deserve an entire article on its own, let us review the basics here:

1. The simplest regular expression is an alphanumeric string (i.e., the word “svm”) or two (when two are present, you can use the | (OR) operator):

# grep -Ei 'svm|vmx' /proc/cpuinfo

The presence of either of those two strings indicate that your processor supports virtualization:

Regular Expression Example
Regular Expression Example

2. A second kind of a regular expression is a range list, enclosed between square brackets.

For example, c[aeiou]t matches the strings cat, cet, cit, cot, and cut, whereas [a-z] and [0-9] match any lowercase letter or decimal digit, respectively. If you want to repeat the regular expression X certain number of times, type {X} immediately following the regexp.

For example, let’s extract the UUIDs of storage devices from /etc/fstab:

# grep -Ei '[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{12}' -o /etc/fstab
Extract String from a File in Linux
Extract String from a File

The first expression in brackets [0-9a-f] is used to denote lowercase hexadecimal characters, and {8} is a quantifier that indicates the number of times that the preceding match should be repeated (the first sequence of characters in an UUID is a 8-character long hexadecimal string).

The parentheses, the {4} quantifier, and the hyphen indicate that the next sequence is a 4-character long hexadecimal string, and the quantifier that follows ({3}) denote that the expression should be repeated 3 times.

Finally, the last sequence of 12-character long hexadecimal string in the UUID is retrieved with [0-9a-f]{12}, and the -o option prints only the matched (non-empty) parts of the matching line in /etc/fstab.

3. POSIX character classes.

Character Class Matches…
 [[:alnum:]]  Any alphanumeric [a-zA-Z0-9] character
 [[:alpha:]]  Any alphabetic [a-zA-Z] character
 [[:blank:]]  Spaces or tabs
 [[:cntrl:]]  Any control characters (ASCII 0 to 32)
 [[:digit:]]  Any numeric digits [0-9]
 [[:graph:]]  Any visible characters
 [[:lower:]]  Any lowercase [a-z] character
 [[:print:]]  Any non-control characters
 [[:space:]]  Any whitespace
 [[:punct:]]  Any punctuation marks
 [[:upper:]]  Any uppercase [A-Z] character
 [[:xdigit:]]  Any hex digits [0-9a-fA-F]
 [:word:]  Any letters, numbers, and underscores [a-zA-Z0-9_]

For example, we may be interested in finding out what the used UIDs and GIDs (refer to Part 2 of this series to refresh your memory) are for real users that have been added to our system. Thus, we will search for sequences of 4 digits in /etc/passwd:

# grep -Ei [[:digit:]]{4} /etc/passwd
Search For a String in File
Search For a String in File

The above example may not be the best case of use of regular expressions in the real world, but it clearly illustrates how to use POSIX character classes to analyze text along with grep.

Conclusion

In this article we have provided some tips to make the most of nano and vim, two text editors for the command-line users. Both tools are supported by extensive documentation, which you can consult in their respective official web sites (links given below) and using the suggestions given in Part 1 of this series.

Reference Links

http://www.nano-editor.org/
http://www.vim.org/

Gabriel Cánepa
Gabriel Cánepa is a GNU/Linux sysadmin and web developer from Villa Mercedes, San Luis, Argentina. He works for a worldwide leading consumer product company and takes great pleasure in using FOSS tools to increase productivity in all areas of his daily work.

Each tutorial at TecMint is created by a team of experienced Linux system administrators so that it meets our high-quality standards.

Join the TecMint Weekly Newsletter (More Than 156,129 Linux Enthusiasts Have Subscribed)
Was this article helpful? Please add a comment or buy me a coffee to show your appreciation.

2 thoughts on “RHCSA Series: Editing Text Files with Nano and Vim / Analyzing text with grep and regexps – Part 4”

  1. Thanks for contacting your team, I am the root user I set the file password via vim eg: 123, i forget the password i try to open the file it’s coming to encrypted method, at the time i just try to add the content it was attend, then save my file.

    Again i try to open the file via the correct password it was open but it’s was coming to encrypted. My data was lose, How to recover my data via vim…!

    Reply

Leave a Reply to Ragunath Cancel reply

Thank you for taking the time to share your thoughts with us. We appreciate your decision to leave a comment and value your contribution to the discussion. It's important to note that we moderate all comments in accordance with our comment policy to ensure a respectful and constructive conversation.

Rest assured that your email address will remain private and will not be published or shared with anyone. We prioritize the privacy and security of our users.