How to Count Word Occurrences in a Text File

Graphical User Interface word processors and note-taking applications have information or detail indicators for document details such as the count of pages, words, and characters, a headings list in word processors, a table of content in some markdown editors, etc. and finding the occurrence of words or phrases are as easy as hitting Ctrl + F and typing in the characters you want to search for.

A GUI does make everything easy but what happens when you can only work from the command line and you want to check the number of times a word, phrase, or character occurs in a text file? It’s almost as easy as it is when using a GUI as long as you’ve got the right command and I’m about to narrate to you how it is done.

Suppose you have an example.txt file containing the sentences:

Praesent in mauris eu tortor porttitor accumsan. Mauris suscipit, ligula sit amet pharetra semper, 
nibh ante cursus purus, vel sagittis velit mauris vel metus enean fermentum risus.

You can use grep command to count the number of times "mauris" appears in the file as shown.

$ grep -o -i mauris example.txt | wc -l
Count Word Occurrence in Linux File
Count Word Occurrence in Linux File

Using grep -c alone will count the number of lines that contain the matching word instead of the number of total matches. The -o option is what tells grep to output each match in a unique line and then wc -l tells wc to count the number of lines. This is how the total number of matching words is deduced.

A different approach is to transform the content of the input file with tr command so that all words are in a single line and then use grep -c to count that match count.

$ tr '[:space:]' '[\n*]' < example.txt | grep -i -c mauris
Count Occurrences of Word in File
Count Occurrences of Word in File

Is this how you would check word occurrence from your terminal? Share your experience with us and let us know if you’ve got another way of accomplishing the task.

If this article helped, with someone on your team.

TecMint Weekly Newsletter
Get the Learn Linux 7 Days Crash Course free when you join 34,000+ Linux professionals reading every Thursday.
Check your email for a magic link to get started.
Something went wrong. Please try again.
TecMint has been free for 14 years. Help keep it that way.
Google AI Overviews and tools like ChatGPT have cut into search traffic for independent tech sites like TecMint. Running this site costs over $2,000 every month for hosting, infrastructure, and paying authors to keep the content accurate and tested.

If this article helped you solve a problem, consider buying a coffee. It helps keep TecMint free, supports the authors, and keeps the project going.
☕ Buy Me a Coffee
Martins D. Okoi
Martins Divine Okoi is a graduate of Computer Science with a passion for Linux and the Open Source community. He works as a Graphic Designer, Web Developer, and programmer.

Each tutorial at TecMint is created by a team of experienced Linux system administrators so that it meets our high-quality standards.

5 Comments

Leave a Reply
  1. It’s not perfect but I adapted the ‘tr‘ approach to print a count of each word in some standard input:

    tr -c "'[:alnum:]" "\n" | grep "[[:alnum:]]" | sort | uniq -c | sort -n
    

    While grep -c works on a line, this puts every word or number on its own line and sorts them. Then uniq -c deduplicates them as well as printing the number of occurrences. grep is used to remove blank lines only because if you don’t, uniq prints out the number of blank lines as well, and I didn’t yet come up with a better way to do that.

    The final sort is optional, used to list the words by frequency of appearance instead of alphanumerically. Note the apostrophe in the first set given to tr such that possessives and contractions remain whole words but parenthesis, quotation marks, and other punctuation are stripped off. Also note that this breaks on longer, comma-separated numbers, turning each group into a (probably meaningless) lone 1-, 2-, or 3-digit number. So just don’t try to handle those as though they are words, and there is no problem.

    Reply

Got Something to Say? Join the Discussion...

Thank you for taking the time to share your thoughts with us. We appreciate your decision to leave a comment and value your contribution to the discussion. It's important to note that we moderate all comments in accordance with our comment policy to ensure a respectful and constructive conversation.

Rest assured that your email address will remain private and will not be published or shared with anyone. We prioritize the privacy and security of our users.

Free Course
Get a free Linux course before you go.
Subscribe to TecMint Weekly and get the Learn Linux 7 Days Crash Course free. Read by 34,000+ Linux professionals every Thursday.
Something went wrong. Please try again.
Check your email for a magic link to get started.