RHCSA Series: Editing Text Files with Nano and Vim / Analyzing text with grep and regexps – Part 4

Analyzing Text with Grep and Regular Expressions

By now you have learned how to create and edit files using nano or vim. Say you become a text editor ninja, so to speak – now what? Among other things, you will also need to know how to search for regular expressions inside text.

A regular expression (also known as “regex” or “regexp”) is a way of describing a text string or pattern so that a program can compare the pattern against arbitrary text strings. Although the use of regular expressions along with grep would deserve an entire article of its own, let us review the basics here:

1. The simplest regular expression is an alphanumeric string (e.g., the word “svm”). To match either of two strings, join them with the | (OR) operator, which requires extended regular expressions (the -E option of grep):

# grep -Ei 'svm|vmx' /proc/cpuinfo

The presence of either of those two strings indicates that your processor supports virtualization:

Regular Expression Example
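
If you want to try the | operator without depending on the flags of your own CPU, the following minimal sketch runs the same pattern against a small made-up sample. The three "flags" lines are invented for illustration and are not real /proc/cpuinfo output.

```shell
# Hypothetical sample data standing in for /proc/cpuinfo
sample='flags : fpu vme de pse VMX est
flags : fpu vme de pse
flags : fpu svm lahf_lm'

# -E enables extended regexps (needed for |), -i ignores case,
# -c counts the lines that match either alternative
matches=$(printf '%s\n' "$sample" | grep -Eic 'svm|vmx')
echo "$matches"
```

The first and third sample lines match (the first one only because -i ignores case), so the count reflects two matching lines.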

2. A second kind of regular expression is a bracket expression (a list or range of characters), enclosed between square brackets.

For example, c[aeiou]t matches the strings cat, cet, cit, cot, and cut, whereas [a-z] and [0-9] match any single lowercase letter or decimal digit, respectively. If you want to repeat a regular expression a certain number of times, N, type {N} immediately after it.
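
As a quick illustration of bracket expressions and the {N} quantifier, here is a small sketch that filters a made-up word list. The word list and the variable names are assumptions chosen for this example only.

```shell
# Made-up word list, one word per line
words='cat
cot
cart
ct
caat
c4t'

# -x forces the pattern to match the whole line
vowel=$(printf '%s\n' "$words" | grep -Ex 'c[aeiou]t')
# {2} repeats the preceding bracket expression exactly twice
double=$(printf '%s\n' "$words" | grep -Ex 'c[aeiou]{2}t')
# [0-9] matches a single decimal digit
digit=$(printf '%s\n' "$words" | grep -Ex 'c[0-9]t')

printf '%s\n' "$vowel" "$double" "$digit"
```

Here c[aeiou]t selects cat and cot, c[aeiou]{2}t selects caat, and c[0-9]t selects c4t; cart and ct match none of the three patterns.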

For example, let’s extract the UUIDs of storage devices from /etc/fstab:

# grep -Ei '[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{12}' -o /etc/fstab

Extract String from a File

The first expression in brackets, [0-9a-f], denotes lowercase hexadecimal characters (because of the -i option, uppercase hexadecimal digits are accepted as well), and {8} is a quantifier that indicates the number of times that the preceding match should be repeated (the first sequence of characters in a UUID is an 8-character long hexadecimal string).

The parentheses, the {4} quantifier, and the hyphen indicate that the next sequence is a 4-character long hexadecimal string followed by a hyphen, and the quantifier that follows ({3}) denotes that the whole parenthesized expression should be repeated 3 times.

Finally, the last part of the UUID, a 12-character long hexadecimal string, is matched by [0-9a-f]{12}, and the -o option prints only the matched (non-empty) parts of each matching line in /etc/fstab.
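
To see the pattern at work without touching your real /etc/fstab, you can feed grep a few invented fstab-style lines. In this sketch the UUIDs are placeholders rather than real devices, and the variable names are only illustrative.

```shell
# Invented fstab-style lines; the UUIDs are placeholders, not real devices
fstab_sample='UUID=0f3e1a2b-4c5d-6e7f-8091-a2b3c4d5e6f7 /boot xfs defaults 0 0
/dev/mapper/rhel-root / xfs defaults 0 0
UUID=DEADBEEF-0000-4111-8222-333344445555 /data ext4 defaults 0 2'

# Same pattern as above: 8 hex digits, three groups of 4, then 12
uuid_re='[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{12}'

# -o prints only the matched text; -i also accepts uppercase hex digits
uuids=$(printf '%s\n' "$fstab_sample" | grep -Eio "$uuid_re")
echo "$uuids"
```

Only the two UUID strings are printed; the /dev/mapper line contains no match and the rest of each matching line is dropped by -o.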

3. POSIX character classes.

 Character Class  Matches…
 [[:alnum:]]      Any alphanumeric character [a-zA-Z0-9]
 [[:alpha:]]      Any alphabetic character [a-zA-Z]
 [[:blank:]]      Spaces or tabs
 [[:cntrl:]]      Any control character (ASCII 0 to 31, and 127)
 [[:digit:]]      Any numeric digit [0-9]
 [[:graph:]]      Any visible character (the space is excluded)
 [[:lower:]]      Any lowercase character [a-z]
 [[:print:]]      Any printable (non-control) character, including the space
 [[:space:]]      Any whitespace (space, tab, newline, and so on)
 [[:punct:]]      Any punctuation mark
 [[:upper:]]      Any uppercase character [A-Z]
 [[:xdigit:]]     Any hexadecimal digit [0-9a-fA-F]
 \w               Any letter, digit, or underscore [a-zA-Z0-9_] (not a POSIX class; some tools spell it [[:word:]], GNU grep uses \w)

For example, we may be interested in finding out which UIDs and GIDs (refer to Part 2 of this series to refresh your memory) are in use for real users that have been added to our system. Thus, we will search for sequences of 4 digits in /etc/passwd:

# grep -E '[[:digit:]]{4}' /etc/passwd

Search For a String in File

The above example may not be the best real-world use of regular expressions, but it clearly illustrates how to use POSIX character classes along with grep to analyze text.
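
A slightly stricter variation can be sketched as follows. It anchors the 4 digits between the colons that delimit the fields and then prints only the user names with cut. The passwd-style entries are invented for illustration, and the user names and IDs are placeholders.

```shell
# Invented passwd-style entries (user names and IDs are placeholders)
passwd_sample='root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash
bob:x:1001:1001:Bob:/home/bob:/bin/bash'

# Quote the pattern so the shell passes the brackets and braces to grep
# untouched; the colons require a field made of exactly 4 digits
users=$(printf '%s\n' "$passwd_sample" | grep -E ':[[:digit:]]{4}:' | cut -d: -f1)
echo "$users"
```

With this sample only alice and bob are listed, since root and daemon have single-digit IDs.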

Conclusion

In this article we have provided some tips to make the most of nano and vim, two text editors for command-line users. Both tools are supported by extensive documentation, which you can consult on their respective official websites (links given below) and using the suggestions given in Part 1 of this series.

Reference Links

http://www.nano-editor.org/
http://www.vim.org/

Gabriel Cánepa

Gabriel Cánepa is a GNU/Linux sysadmin and web developer from Villa Mercedes, San Luis, Argentina. He works for a worldwide leading consumer product company and takes great pleasure in using FOSS tools to increase productivity in all areas of his daily work.

2 Responses

  1. Ragunath says:

    Thanks for contacting your team, I am the root user I set the file password via vim eg: 123, i forget the password i try to open the file it’s coming to encrypted method, at the time i just try to add the content it was attend, then save my file.

    Again i try to open the file via the correct password it was open but it’s was coming to encrypted. My data was lose, How to recover my data via vim…!

  2. Rizwan says:

    Thanks for the Nice explanations.
    For vim editor, here is the nice interactive tutorial http://www.openvim.com/tutorial.html
