Friday, November 27, 2015

Unix - AWK Command by examples. Part 2

In this part, we will learn on AWK Built in Variables:

FS - Input field separator variable:

By default, AWK assumes that fields in a file are separated by space.
If the fields in the file are separated by any other column separator, we can use the FS variable for that.

Example 5. awk 'BEGIN {FS=":"} {print $2}' test_file
OR
awk -F: '{print $2} test_file

output will be as below.

39 AAAA
15 BBBB
39 EEEE
39 GGGG

We can use -F option for to change column separator in command to display output.

awk -F: '{if ($2 == "") print $1 ": no password!"}'

We can easily move from one column separator to another column separator using awk.

#!/bin/awk -f
{
if ($1 == "#START") {
FS=":";
} else if ($1 == "#STOP") {
FS=" ";
} else {
#print the Roman number in column 3
print $3
}

}


OFS - Output field separator variable: 

Example 6. awk '{print $4,$5}' test_file

When we use print statement in awk, it uses default column seperator while displaying result.

The output of this command will be
unixrun 0
unixrun 175
unixrun 430
unixrun 428

We can change this default column seperator by using the OFS variable as

awk 'BEGIN {OFS=":"} {print $4,$5}' test_file

output will be as below.

unixrun:0
unixrun:175
unixrun:430
unixrun:428

If I run same command in this way, it will not work same as earlier, Here no column separator showing in output.

awk 'BEGIN {OFS=":"} {print $4$5}' test_file

unixrun0
unixrun175
unixrun430
unixrun428

NF - Number of fields variable:

The NF used to get the number of fields in line of a file.

Example 7. awk '{print NF}' test_file

This will display the number of columns in each row, output as below.

9
9
9
9

NR - number of records variable: 

The NR used to get the line number or no of lines in the file.

Example 8. awk '{print NR}' test_file

This will display the line numbers from 1.

1

Example 9. awk 'END {print NR}' test_file

This will display the total number of lines in the file, here output will be.

4

FILENAME - The Current Filename Variable

The last variable known to regular AWK is "FILENAME", which tells you the name of the file.

#!/bin/awk -f
# reports which file is being read
BEGIN {
f="";
}
{ if (f != FILENAME) {
print "reading", FILENAME;
f=FILENAME;
}
print;
}


Unix - AWK Command by examples. Part 1

AWK command is one of the most powerful tools in Unix used for processing the rows and columns in a file. AWK has built in string functions and associative arrays and supports all functionalities of C language. We can also convert AWK scripts into Perl scripts using a2p utility, an added advantage of AWK.

awk 'BEGIN {start_action} {action} END {stop_action}' file

The actions in the begin block are performed before processing the file.
The actions in the end block are performed after processing the file.
The rest of the actions are performed while processing the file.

Create a file test_file with the following data. This file can be easily created using the output of ls -l.
-rw-rw-rw- 1 unixrun unixrun   0 Nov  26 21:39 AAAA
-rw-rw-rw- 1 unixrun unixrun 175 Nov  26 21:15 BBBB
-rw-rw-rw- 1 unixrun unixrun 430 Nov  26 21:39 EEEE
-rw-rw-rw- 1 unixrun unixrun 428 Nov  26 21:39 GGGG

Example 1. awk '{print $1}' test_file

$1, $2, $3... represents the first, second, third column,.. in a row respectively.
Here, awk command will print the first column in each row as shown below.

The variable "$0" refers to the entire line that AWK reads in. That is, if you had nine fields in a line,
print $0
is similar to
print $1, $2, $3, $4, $5, $6, $7, $8, $9

-rw-rw-rw-
-rw-rw-rw-
-rw-rw-rw-
-rw-rw-rw-

To print the 2th and 5th columns in a file use awk '{print $2,$5}' test_file

Example 2. awk 'BEGIN {cnt=0} {cnt=cnt+5+$3} END {print cnt}' test_file

This will prints the sum of the value as  (value of 3th column + 5 ).
In the Begin block the cnt variable is assigned with value 0.
In the next block the v (value of 3rd column + 5) is added to cnt variable.
In addition of this, same scenario is repeated for no of rows in a file.
When all the rows are processed the cnt variable will hold the sum of the values as per our formula (value of 3rd column + 5).
Finally, the output printed in the End block.

We can perform the above steps in another way also - by creating command file.

vi summation.awk
#!/usr/bin/awk -f
BEGIN {cnt=0}
{cnt=cnt+$3+5}
END {print cnt}
:wq

This will create a file named as summation.awk, it will be executed by using awk command as below:

awk -f summation.awk test_file.

This will run the script in summation.awk file and displays the sum as per our formula (sum of the 3rd column + 5 ) in the test_file.

Example 3. awk '{ if($5 == "time") print $0;}' test_file

Above command is used for to find a particular string in a file in a specific column.
Here, it searches the string "time" in the 5th column  of the file and prints entire line in output.

-rw-r--r-- 1 unixrun unixrun 43 Nov  26 21:39 time

Example 4. awk 'BEGIN { for(i=1;i<=4;i++) print "Cube of", i, "is",i*i*i; }'

This will print the Cube of numbers from 1 to 4, The output of the command is

Cube of 1 is 1
Cube of 2 is 8
Cube of 3 is 27
Cube of 4 is 64