The Linux awk command: easy to understand, quick to learn

Introduction

Awk is a powerful text analysis tool. Where grep searches and sed edits, awk excels at analyzing data and generating reports. Simply put, awk reads a file line by line, splits each line into fields using whitespace as the default separator, and then processes those fields.

There are three common versions of awk: awk, nawk, and gawk. Unless otherwise stated, awk in this article means gawk, the GNU version of AWK.

The name awk comes from the initials of the surnames of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan. AWK is in fact a language in its own right: the AWK programming language, which its three creators formally define as a "pattern scanning and processing language." It lets you write short programs that read input files, sort data, process data, perform calculations on the input, and generate reports, among many other things.

Usage

awk 'pattern { action }' filename(s)

Although the operations can become complicated, the syntax is always the same: pattern is what awk looks for in the data, and action is a series of commands executed when a match is found. Braces ({}) do not have to appear in every program; they group the series of instructions that belong to a particular pattern. A pattern is typically a regular expression enclosed in slashes.
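A minimal sketch of the three rule forms described above: a pattern alone, an action alone, and a pattern with an action. The three-line sample and the shell variable names are made up for illustration.

```shell
# Hypothetical sample input; the fields are separated by spaces.
sample='alice 30
bob 25
carol 35'

# Pattern only: lines matching /bob/ are printed whole (the default action).
pattern_only=$(printf '%s\n' "$sample" | awk '/bob/')

# Action only: with no pattern, the action runs on every line.
action_only=$(printf '%s\n' "$sample" | awk '{ print $1 }')

# Pattern plus action: print the second field of lines that start with carol.
both=$(printf '%s\n' "$sample" | awk '/^carol/ { print $2 }')

printf '%s\n' "$pattern_only" "$action_only" "$both"
```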

The most basic function of the awk language is to browse files or strings and extract information according to specified rules. Once awk has extracted the information, other text operations can be performed on it. A complete awk script is usually used to format the information in a text file.

Usually, awk processes a file one line at a time: each time awk receives a line, it executes the corresponding commands to process that text.

Invoking awk

There are three ways to invoke awk:

1. Command line mode

awk [-F field-separator] 'commands' input-file(s)

Here, commands are the actual awk commands, and [-F field-separator] is optional. input-file(s) is the file or files to be processed.

In awk, each item on a line that is delimited by the field separator is called a field. When no separator is named with -F, the default field separator is whitespace (spaces or tabs).
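The effect of -F can be checked on a single line. This is a small sketch with a made-up record in the style of /etc/passwd; NF, one of awk's built-in variables, holds the number of fields in the current line.

```shell
# Hypothetical colon-separated record.
line='root:x:0:0:root:/root:/bin/bash'

# Default separator: the line contains no whitespace, so it is a single field.
default_count=$(printf '%s\n' "$line" | awk '{ print NF }')

# With -F ':' the same line splits into seven fields.
colon_count=$(printf '%s\n' "$line" | awk -F ':' '{ print NF }')
first_field=$(printf '%s\n' "$line" | awk -F ':' '{ print $1 }')

echo "default: $default_count, with -F: $colon_count, first field: $first_field"
```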

2. Shell script mode

Put all the awk commands into a file and make that file executable, with the awk interpreter named on the first line of the script; the program is then run by typing the script name.

This is the equivalent of the first line of a shell script: #!/bin/sh

which is replaced with: #!/bin/awk -f (the path to awk may differ between systems)

3. Put all the awk commands into a separate file and then call:

awk -f awk-script-file input-file(s)

Here, the -f option loads the awk script from awk-script-file, and input-file(s) is the same as above.
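The second and third modes can be sketched together. The program below is written to a temporary file whose name is chosen by mktemp; the sample input is made up, and the interpreter path on the first line is an assumption that varies between systems.

```shell
# Write a small awk program to a temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/awk -f
# The first line lets the file be run directly after chmod +x (mode 2).
# When the file is loaded with "awk -f", awk treats that line as a comment.
BEGIN { FS = ":" }
{ print $1 }
EOF

# Mode 3: load the program with -f and feed it input on standard input.
names=$(printf 'root:x:0:0\ndaemon:x:1:1\n' | awk -f "$script")
echo "$names"

rm -f "$script"
```

Setting FS in a BEGIN block inside the script has the same effect as passing -F ':' on the command line.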

This article focuses on the command line mode.

Introductory examples

Suppose the output of last -n 5 is as follows:

# last -n 5    (take only the first five lines)
root     pts/1   192.168.1.100  Tue Feb 10 11:21   still logged in
root     pts/1   192.168.1.100  Tue Feb 10 00:46 - 02:28  (01:41)
root     pts/1   192.168.1.100  Mon Feb  9 11:41 - 18:30  (06:48)
dmtsai   pts/1   192.168.1.100  Mon Feb  9 11:41 - 11:41  (00:00)
root     tty1                   Fri Sep  5 14:09 - 14:10  (00:01)

To display only the five most recently logged-in accounts:

# last -n 5 | awk '{print $1}'
root
root
root
dmtsai
root

The awk workflow is as follows: read one record, delimited by the '\n' newline character; split the record into fields at the specified field separator; and fill in the fields, where $0 is the whole record, $1 is the first field, and $n is the nth field. The default field separator is a space or a tab, so $1 is the login user, $3 is the login user's IP address, and so on.
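The field variables can be tried on one line at a time. This sketch uses a made-up record in the layout of the last output above, plus a second made-up line that mixes a tab with spaces.

```shell
# One hypothetical record in the layout of the "last" output.
record='root pts/1 192.168.1.100 Tue Feb 10 11:21 still logged in'

whole=$(printf '%s\n' "$record" | awk '{ print $0 }')  # the entire record
user=$(printf '%s\n' "$record" | awk '{ print $1 }')   # first field
ip=$(printf '%s\n' "$record" | awk '{ print $3 }')     # third field

# Tabs and runs of spaces both count as a single default separator.
tabbed=$(printf 'a\tb  c\n' | awk '{ print $2 }')

echo "user=$user ip=$ip tabbed=$tabbed"
```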

To display only the account names in /etc/passwd:

# cat /etc/passwd | awk -F ':' '{print $1}'
root
daemon
bin
sys

This is an example of awk with an action only: the action {print $1} is executed on every line.

-F sets the field separator to ':'.

To display only the account names in /etc/passwd and each account's shell, with the account and the shell separated by a tab:

# cat /etc/passwd | awk -F ':' '{print $1"\t"$7}'
root    /bin/bash
daemon  /bin/sh
bin     /bin/sh
sys     /bin/sh

To display only the account names in /etc/passwd and each account's shell, separated by a comma, with a header line "name,shell" before all the rows and a final line "blue,/bin/nosh" after them:

# cat /etc/passwd | awk -F':' 'BEGIN {print "name,shell"} {print $1","$7} END {print "blue,/bin/nosh"}'
name,shell
root,/bin/bash
daemon,/bin/sh
bin,/bin/sh
sys,/bin/sh
....
blue,/bin/nosh

The awk workflow here is as follows: first execute the BEGIN block; then read the file, taking one record delimited by the \n newline character, splitting it into fields at the specified field separator, and filling in the fields ($0 is the whole record, $1 the first field, $n the nth field); then execute the action that corresponds to the matching pattern. awk then reads the second record, and so on until every record has been read, and finally executes the END block.
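The order of execution can be made visible by having each block announce itself. This is a sketch on two made-up records; the messages are illustrative only.

```shell
report=$(printf 'root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/bin/sh\n' |
  awk -F ':' '
    BEGIN { print "BEGIN: before any input" }    # runs once, before record 1
    { print "record " NR ": " $1 "," $7 }        # runs once per record
    END   { print "END: after " NR " records" }  # runs once, after the last record
  ')
echo "$report"
```

NR, one of awk's built-in variables, is the number of records read so far, so inside END it holds the total.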

Search for all lines in /etc/passwd that contain the keyword root:

# awk -F: '/root/' /etc/passwd
root:x:0:0:root:/root:/bin/bash

This is an example of a pattern used on its own. Lines that match the pattern (here, root) have the action executed on them; since no action is specified, the default is to print the whole line.

The search pattern supports regular expressions; for example, to find lines that begin with root: awk -F: '/^root/' /etc/passwd

Search for all lines in /etc/passwd that contain the keyword root and display the corresponding shell:

# awk -F: '/root/{print $7}' /etc/passwd
/bin/bash

Here the action {print $7} is specified.
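The difference between an unanchored and an anchored pattern shows up as soon as root appears somewhere other than the start of a line. The sample below is made up, and the last command uses a field comparison, a standard awk pattern form that the examples above do not cover.

```shell
# Hypothetical sample: "root" starts line 1 and appears inside line 2.
sample='root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
daemon:x:1:1:daemon:/usr/sbin:/bin/sh'

# /root/ matches anywhere in the line, so two lines qualify.
anywhere=$(printf '%s\n' "$sample" | awk -F ':' '/root/ { print $1 }')

# /^root/ is anchored to the start of the line, so only one line qualifies.
anchored=$(printf '%s\n' "$sample" | awk -F ':' '/^root/ { print $1 }')

# $1 == "root" compares a single field exactly instead of searching the line.
exact=$(printf '%s\n' "$sample" | awk -F ':' '$1 == "root" { print $7 }')

printf '%s\n' "$anywhere" "$anchored" "$exact"
```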

awk built-in variables

awk has a number of built-in variables that hold information about its environment. Many of them can be changed. The most commonly used ones are listed below.

ARGC       number of command-line arguments
ARGV       array of command-line arguments
ENVIRON    array of system environment variables
FILENAME   name of the file awk is currently reading
FNR        number of records read so far in the current file
FS         input field separator, equivalent to the command-line -F option
NF         number of fields in the current record
NR         number of records read so far
OFS        output field separator
ORS        output record separator
RS         input record separator

In addition, the $0 variable refers to the entire record, $1 is the first field of the current line, $2 is the second field of the current line, and so on.
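The difference between NR and FNR only appears when awk reads more than one file. This sketch creates two small made-up files in a temporary directory (names a.txt and b.txt are arbitrary) and prints four of the variables for every record.

```shell
# Two small hypothetical files in a temporary directory.
dir=$(mktemp -d)
printf 'one two\nthree\n' > "$dir/a.txt"
printf 'four five six\n' > "$dir/b.txt"

# Print FILENAME, FNR, NR, and NF for every record of both files.
# FNR restarts at 1 for each file; NR keeps counting across files.
table=$(cd "$dir" && awk '{ print FILENAME, FNR, NR, NF }' a.txt b.txt)
echo "$table"

rm -rf "$dir"
```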

Report on /etc/passwd: the file name, the line number of each line, the number of columns in each line, and the full content of the line:

# awk -F ':' '{print "filename:" FILENAME ",linenumber:" NR ",columns:" NF ",linecontent:"$0}' /etc/passwd
filename:/etc/passwd,linenumber:1,columns:7,linecontent:root:x:0:0:root:/root:/bin/bash
filename:/etc/passwd,linenumber:2,columns:7,linecontent:daemon:x:1:1:daemon:/usr/sbin:/bin/sh
filename:/etc/passwd,linenumber:3,columns:7,linecontent:bin:x:2:2:bin:/bin:/bin/sh
filename:/etc/passwd,linenumber:4,columns:7,linecontent:sys:x:3:3:sys:/dev:/bin/sh

Using printf instead of print makes the code more concise and easier to read:

# awk -F':' '{printf("filename:%10s,linenumber:%s,columns:%s,linecontent:%s\n",FILENAME,NR,NF,$0)}' /etc/passwd

print and printf

awk provides both print and printf for producing output.

The arguments to print can be variables, numbers, or strings. Strings must be enclosed in double quotes, and arguments are separated by commas. Without a comma, the arguments are concatenated and run together in the output. The comma stands for the output field separator, which is a space by default.

printf works much like printf in C and formats its output according to a format string. When the output is complicated, printf is easier to use and the code is easier to understand.
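The comma rule and the printf format string can be compared side by side. This is a sketch on one made-up record; the " -> " separator and the column widths are arbitrary choices for illustration.

```shell
line='root:x:0:0:root:/root:/bin/bash'

# Comma: arguments are joined with OFS, which defaults to one space.
with_comma=$(printf '%s\n' "$line" | awk -F ':' '{ print $1, $7 }')

# No comma: adjacent values are concatenated with nothing in between.
concatenated=$(printf '%s\n' "$line" | awk -F ':' '{ print $1 $7 }')

# Changing OFS changes what the comma inserts.
custom_ofs=$(printf '%s\n' "$line" | awk -F ':' 'BEGIN { OFS = " -> " } { print $1, $7 }')

# printf takes a C-style format; unlike print, it adds no newline on its own.
formatted=$(printf '%s\n' "$line" | awk -F ':' '{ printf("%-8s|%5d|%s\n", $1, $3, $7) }')

printf '%s\n' "$with_comma" "$concatenated" "$custom_ofs" "$formatted"
```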

Awk programming

Variables and assignments

In addition to its built-in variables, awk lets you define your own variables.

The following counts the number of accounts in /etc/passwd:

# awk '{count++; print $0;} END {print "user count is ", count}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
......
user count is  40

count is a user-defined variable. The earlier action{} blocks contained only a single print, but print is just one statement; an action{} block can hold several statements separated by ;

count is not initialized here. Although it defaults to 0, the proper practice is to initialize it to 0 explicitly:

# awk 'BEGIN {count=0; print "[start]user count is ", count} {count=count+1; print $0;} END {print "[end]user count is ", count}' /etc/passwd
[start]user count is  0
root:x:0:0:root:/root:/bin/bash
...
[end]user count is  40
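The default value of an uninitialized variable can be inspected directly. In this sketch the names unset_number, unset_text, and count are arbitrary user-defined variables, and the three-line input is made up.

```shell
# An unset awk variable acts as 0 in arithmetic and as "" in string context.
defaults=$(awk 'BEGIN { print unset_number + 0; print "[" unset_text "]" }')

# Counting three hypothetical input lines without initializing count first.
total=$(printf 'a\nb\nc\n' | awk '{ count++ } END { print count }')

printf '%s\n' "$defaults" "$total"
```

With empty input, the uninitialized version prints an empty line rather than 0, which is one practical reason to initialize the counter in BEGIN.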

Count the number of bytes occupied by the files in a directory:

# ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END {print "[end]size is ", size}'
[end]size is  8657198

To display the result in megabytes:

# ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END {print "[end]size is ", size/1024/1024,"M"}'
[end]size is  8.25889 M

Note that the total does not include the files inside subdirectories.
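The same summation can be reproduced on a fixed listing, which makes the arithmetic easy to check. The listing below is made up in the layout of ls -l; skipping the first line with NR > 1 is an optional refinement, since the "total" line has no fifth field and would add 0 anyway.

```shell
# Hypothetical listing in "ls -l" layout; column 5 is the size in bytes.
listing='total 12
-rw-r--r-- 1 root root 1048576 Feb 10 11:21 a.log
-rw-r--r-- 1 root root 524288 Feb 10 11:22 b.log
drwxr-xr-x 2 root root 4096 Feb 10 11:23 subdir'

# Sum the fifth column of every line after the "total" header.
bytes=$(printf '%s\n' "$listing" | awk 'NR > 1 { size += $5 } END { print size }')

# printf controls how many decimals the megabyte figure shows.
megabytes=$(printf '%s\n' "$listing" |
  awk 'NR > 1 { size += $5 } END { printf("%.2f M\n", size / 1024 / 1024) }')

echo "$bytes bytes = $megabytes"
```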

Conditional statements

The conditional statements in awk are borrowed from C. They take the following forms:

if (expression) {
    statement;
    statement;
    ......
}

if (expression) {
    statement1;
} else {
    statement2;
}

if (expression) {
    statement1;
} else if (expression1) {
    statement2;
} else {
    statement3;
}

Count the number of bytes occupied by the files in a directory, excluding entries whose size is 4096 (usually directories):

# ls -l | awk 'BEGIN {size=0; print "[start]size is ", size} {if($5!=4096){size=size+$5;}} END {print "[end]size is ", size/1024/1024,"M"}'
[start]size is  0
[end]size is  8.22339 M
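The three-branch form can be sketched on a small made-up table of names and sizes. The labels and the one-megabyte threshold are arbitrary choices for illustration.

```shell
# Hypothetical input: a name and a size in bytes on each line.
sizes='a.log 100
b.log 4096
c.log 2000000'

labels=$(printf '%s\n' "$sizes" | awk '{
  if ($2 == 4096) {
    kind = "directory-sized"   # same test as the 4096 filter above
  } else if ($2 > 1024 * 1024) {
    kind = "large"
  } else {
    kind = "small"
  }
  print $1, kind
}')
echo "$labels"
```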

Loop statements

The loop statements in awk are also borrowed from C: while, do/while, for, break, and continue are supported, and these keywords have the same semantics as in C.
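Each of the loop keywords can be tried in a few lines. This is a sketch with made-up numbers; the variable names are arbitrary.

```shell
# for: walk the fields of each record from 1 to NF and add them up.
sums=$(printf '1 2 3\n10 20\n' | awk '{
  total = 0
  for (i = 1; i <= NF; i++) total += $i
  print total
}')

# while with continue and break: collect odd numbers, stop once n passes 7.
odds=$(awk 'BEGIN {
  n = 0
  while (1) {
    n++
    if (n > 7) break
    if (n % 2 == 0) continue
    out = (out == "" ? n : out " " n)
  }
  print out
}')

# do/while: the body runs once even though the condition is false at the start.
do_result=$(awk 'BEGIN { k = 5; do { k++ } while (k < 3); print k }')

printf '%s\n' "$sums" "$odds" "$do_result"
```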

Arrays

Because array subscripts in awk can be either numbers or strings, a subscript is usually called a key. Keys and values are stored internally in a hash table of key/value pairs. Because a hash table is not stored in sequential order, the contents of an array may not be displayed in the order you expect. Like variables, arrays are created automatically when they are first used, and awk determines automatically whether an element holds a number or a string. In general, arrays in awk are used to collect information from records: to calculate sums, to count words, to track how many times a pattern is matched, and so on.

Display the accounts in /etc/passwd:

# awk -F':' 'BEGIN {count=0;} {name[count] = $1; count++;} END {for (i = 0; i < count; i++) print i, name[i]}' /etc/passwd
0 root
1 daemon
2 bin
3 sys
4 sync
5 games
......

A for loop is used here to traverse the array.
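Arrays with string keys work the same way. This sketch counts how many made-up accounts use each login shell; the array name users is arbitrary. Because the for (key in array) form visits keys in no particular order, the result is piped through sort to make it stable.

```shell
# Hypothetical sample in /etc/passwd layout.
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh'

# Use the shell field ($7) as the key and count occurrences.
shell_counts=$(printf '%s\n' "$sample" |
  awk -F ':' '{ users[$7]++ } END { for (sh in users) print sh, users[sh] }' |
  LC_ALL=C sort)
echo "$shell_counts"
```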
