How to Use Comparison Operators & Data Filtering with Awk – Part 4

When dealing with numerical or string values in a line of text, filtering text or strings using comparison operators comes in handy for awk command users.

In this part of the Awk series, we shall take a look at how you can filter text or strings using comparison operators.

If you are a programmer, you are probably already familiar with comparison operators; for those who are not, the section below explains them.

What are Comparison operators in Awk?

Before diving into how to use comparison operators with Awk, let’s first understand what comparison operators are.

Comparison operators consist of symbols or keywords utilized to compare values in programming languages.

In Awk, comparison operators are often used to compare the value of numbers or strings and they include the following:

  • > – greater than
  • < – less than
  • >= – greater than or equal to
  • <= – less than or equal to
  • == – equal to
  • != – not equal to
  • some_value ~ /pattern/ – true if some_value matches the pattern
  • some_value !~ /pattern/ – true if some_value does not match the pattern
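
As a quick sketch of how these operators behave, the commands below run three of them against a small, made-up input. The names and numbers in it are invented purely for illustration and are not part of this article's sample file.

```shell
# Hypothetical three-line input: a name followed by a number.
sample='apple 5
banana 12
cherry 7'

# == compares the second field as a number because it looks like one.
printf '%s\n' "$sample" | awk '$2 == 12'      # prints: banana 12

# ~ is true when the first field matches the regular expression /an/.
printf '%s\n' "$sample" | awk '$1 ~ /an/'     # prints: banana 12

# !~ is true when the first field does not match /an/.
printf '%s\n' "$sample" | awk '$1 !~ /an/'    # prints the apple and cherry lines
```

Note that there is no space between the slashes and the pattern; a space inside /.../ would be treated as part of the pattern.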

Now that we have looked at the various comparison operators in Awk, let us understand them better using an example.

Filtering Data with Awk

In this example, we have a file named food_list.txt, a shopping list of different food items. I would like to flag the items whose quantity is less than or equal to 20 by adding (**) at the end of each such line.

File – food_list.txt
No      Item_Name               Quantity        Price
1       Mangoes                    45           $3.45
2       Apples                     25           $2.45
3       Pineapples                 5            $4.45
4       Tomatoes                   25           $3.45
5       Onions                     15           $1.45
6       Bananas                    30           $3.45

The general syntax for using comparison operators in Awk is:

expression { actions; }
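
To make the syntax concrete, here is a minimal sketch that uses the food_list.txt file shown above. It assumes the columns are separated by spaces or tabs, which is what Awk's default field splitting expects. The threshold of 20 is only an example value.

```shell
# Recreate the sample file from this article.
cat > food_list.txt <<'EOF'
No      Item_Name               Quantity        Price
1       Mangoes                    45           $3.45
2       Apples                     25           $2.45
3       Pineapples                 5            $4.45
4       Tomatoes                   25           $3.45
5       Onions                     15           $1.45
6       Bananas                    30           $3.45
EOF

# Expression and action: print only the item name where Quantity is below 20.
awk '$3 < 20 { print $2 }' food_list.txt

# Expression only: the default action prints the whole matching line.
awk '$3 < 20' food_list.txt

# Action only: it runs for every line, including the header.
awk '{ print $2 }' food_list.txt
```

Either half may be left out: a missing action means "print the line", and a missing expression means "apply the action to every line".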

To achieve the above goal, I will have to run the command below:

awk '$3 <= 20 {print $0 " (**)" } $3 > 20 {print $0}' food_list.txt
Flagging Food Items with Awk

Here is the explanation of the command:

  • awk – This command invokes the Awk text processing utility.
  • $3 <= 20 {print $0 " (**)" } – This part of the command is a condition followed by an action. It checks if the value in the third column (Quantity) of each line is less than or equal to 20. If the condition is true, it prints the entire line ($0) with " (**)" appended to it.
  • $3 > 20 {print $0} – This part of the command is another condition followed by an action. It checks if the value in the third column (Quantity) of each line is greater than 20. If the condition is true, it prints the entire line ($0) without any modifications.
  • food_list.txt – This is the input file that the Awk command will process. It contains the data on which the conditions and actions specified in the command will be applied.
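
If you would rather keep the header line out of the comparison altogether, one possible variation is sketched below. It is not the only way to do this: it handles line 1 separately using the built-in NR (record number) variable and moves the test into an if/else statement inside a single action.

```shell
# Recreate the sample file from this article.
cat > food_list.txt <<'EOF'
No      Item_Name               Quantity        Price
1       Mangoes                    45           $3.45
2       Apples                     25           $2.45
3       Pineapples                 5            $4.45
4       Tomatoes                   25           $3.45
5       Onions                     15           $1.45
6       Bananas                    30           $3.45
EOF

# NR == 1 is the header line: print it unchanged and skip to the next line.
# Every other line gets " (**)" appended when Quantity is 20 or less.
awk 'NR == 1 { print; next }
     { if ($3 <= 20) print $0 " (**)"; else print $0 }' food_list.txt
```

For this file, both forms flag the same two lines (Pineapples and Onions). The pattern form keeps each rule separate, while the if/else form keeps the whole decision in one action.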

Another example is to mark lines where the quantity is less than or equal to 20 with the word "TRUE" (separated by a tab) at the end.

awk '$3 <= 20 { printf "%s\t%s\n", $0,"TRUE" ; } $3 > 20  { print $0 ;} ' food_list.txt
Print Lines with True

Combining Operators in Awk

We can also combine multiple comparison operators to create more complex conditions. For example, to print only the food items whose quantity is between 20 and 50 (inclusive), we can use the logical AND operator (&&) as shown.

awk '$3 >= 20 && $3 <= 50' food_list.txt
Filtering Food Items by Quantity Range

The above command will print lines where the quantity (third column) falls between 20 and 50.
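
Comparisons can be combined with the logical OR operator (||) in the same way. The sketch below prints items whose quantity is below 20 or above 40; those two limits are example values only. The NR > 1 guard is there because the header's "Quantity" field is text rather than a number, so Awk compares it as a string, which can let the header slip through a test such as $3 > 40.

```shell
# Recreate the sample file from this article.
cat > food_list.txt <<'EOF'
No      Item_Name               Quantity        Price
1       Mangoes                    45           $3.45
2       Apples                     25           $2.45
3       Pineapples                 5            $4.45
4       Tomatoes                   25           $3.45
5       Onions                     15           $1.45
6       Bananas                    30           $3.45
EOF

# Skip the header (NR > 1), then keep lines where Quantity is below 20 OR above 40.
awk 'NR > 1 && ($3 < 20 || $3 > 40)' food_list.txt
```

With the sample file, this prints the Mangoes, Pineapples, and Onions lines.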

Summary

This is an introductory tutorial on comparison operators in Awk, so try out other operators and conditions on your own to discover more.

If you run into any problems or have additions in mind, drop a comment in the comment section below. Remember to read the next part of the Awk series, where I will take you through compound expressions.

Aaron Kili
Aaron Kili is a Linux and F.O.S.S enthusiast, an upcoming Linux SysAdmin, web developer, and currently a content creator for TecMint who loves working with computers and strongly believes in sharing knowledge.


11 Comments

  1. Forward read.

    Filename SRR11910146_1.fastq
    Total Sequences 705425
    %GC 46
    PASS Adapter Content SRR11910146_1.fastq
    reverse read
    Filename SRR11910146_2.fastq
    Total Sequences 705425
    %GC 46
    PASS Adapter Content SRR11910146_2.fastq

    I have a file that contains the lines above. How can I compare each row separately? For example, how can I check whether %GC is greater than 50?
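
One possible approach to the question above, offered as a sketch only: test the first field for the literal label %GC and compare the second field numerically. The file name fastqc_summary.txt and the output wording are assumptions made for illustration; the data lines are the ones quoted in the comment.

```shell
# Hypothetical file name; the contents are the lines quoted in the comment.
cat > fastqc_summary.txt <<'EOF'
Forward read.
Filename SRR11910146_1.fastq
Total Sequences 705425
%GC 46
PASS Adapter Content SRR11910146_1.fastq
reverse read
Filename SRR11910146_2.fastq
Total Sequences 705425
%GC 46
PASS Adapter Content SRR11910146_2.fastq
EOF

# Remember the most recent file name, then report on each %GC row.
awk '$1 == "Filename" { name = $2 }
     $1 == "%GC"      { print name, $2, ($2 > 50 ? "above 50" : "50 or below") }' fastqc_summary.txt
```

Each %GC row is checked on its own, and the remembered file name shows which read the value belongs to.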

  2. Apparently the script is not smart enough to validate whether $3 is a number or a character string.

    $ awk '$3 <= 30 { print $0, "<-- quantity is less than or equal to 30" ;} $3 > 30 { print $0, "<-- quantity greater than 30" ;}' food_list.txt
    
    No      Item_Name               Quantity        Price <-- quantity greater than 30
    1       Mangoes                    45           $3.45 <-- quantity greater than 30
    2       Apples                     25           $2.45 <-- quantity is less than or equal to 30
    3       Pineapples                 5            $4.45 <-- quantity is less than or equal to 30
    4       Tomatoes                   25           $3.45 <-- quantity is less than or equal to 30
    5       Onions                     15           $1.45 <-- quantity is less than or equal to 30
    6       Bananas                    30           $3.45 <-- quantity is less than or equal to 30
    
      • Here is my attempt. The output is good, but it feels a bit redundant, particularly checking ($3 ~ /^[0-9]/) twice. Please simplify :)

        $ awk '($3 ~ "^[a-zA-Z]") { print $0 } (($3 ~ /^[0-9]/) && ($3 <= 30)) { print $0, "<-- quantity is less than or equal to 30" ;} (($3 ~ /^[0-9]/) && ($3 > 30)) { print $0, "<-- quantity is greater than 30" ;}' food_list.txt

        No      Item_Name               Quantity        Price
        1       Mangoes                    45           $3.45 <-- quantity is greater than 30
        2       Apples                     25           $2.45 <-- quantity is less than or equal to 30
        3       Pineapples                 5            $4.45 <-- quantity is less than or equal to 30
        4       Tomatoes                   25           $3.45 <-- quantity is less than or equal to 30
        5       Onions                     15           $1.45 <-- quantity is less than or equal to 30
        6       Bananas                    30           $3.45 <-- quantity is less than or equal to 30
        
  3. What is the difference between using these expressions directly and using them inside an if statement? What difference does it make?

  4. awk 'NR==1 {print }; NR>1 && $3 > 30 { printf "%s\t%s\n", $0,"**" ; };NR>1 && $3 <= 30 { print $0 ;}' food_list.txt
    would be better

  5. If I want to filter for quantities greater than 30, something goes wrong:
    awk '$3 > 30 { printf "%s\t%s\n", $0,"**" ; } $3 1 && $3 > 30 { printf "%s\t%s\n", $0,"**" ; };NR>1 && $3 <= 30 { print $0 ;}' food_list.txt
