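15. Using the gawk Command

The gawk command

The gawk program is the GNU version of the original awk program from Unix.

It lets you process text data by writing scripts.

It can:

Define variables to hold data
Process data with arithmetic and string operators
Process data with structured logic statements
Extract data and reformat it (for example, filter a log file for error lines so they can be read and handled in one place)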
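
As a rough sketch of the log-filtering use case in the list above, the snippet below keeps only the error lines and reformats them. The log lines and their layout (date, time, level, message) are made up for illustration, and the fallback to the system awk is an assumption for machines without gawk.

```shell
# Use gawk if it is installed, otherwise fall back to the system awk
# (nothing in this sketch is gawk-specific).
awk_bin=$(command -v gawk || command -v awk)

# Hypothetical log lines, invented for this example.
log='2024-01-01 10:00:01 INFO service started
2024-01-01 10:00:05 ERROR disk full
2024-01-01 10:00:09 INFO retrying
2024-01-01 10:00:12 ERROR write failed'

# $3 is the level field; for error lines print the time ($2) and the message ($4, $5).
errors=$(printf '%s\n' "$log" | "$awk_bin" '{ if ($3 == "ERROR") print $2, $4, $5 }')
printf '%s\n' "$errors"
```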

Syntax

gawk options program file

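gawk options

Option	Description
-F fs	Specifies the field separator
-f file	Specifies a file containing the program commands
-v var=value	Defines a variable and its value before the program runs
-mf N	Specifies the maximum number of fields to process in the data file (legacy awk option; gawk has no such limit)
-mr N	Specifies the maximum record size in the data file (legacy awk option; gawk has no such limit)
-W keyword	Specifies the compatibility mode or warning level for gawk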
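
A minimal sketch of the -F and -v options used together. The input line and the variable name prefix are made up for illustration.

```shell
# Use gawk if it is installed, otherwise fall back to the system awk.
awk_bin=$(command -v gawk || command -v awk)

# -F, splits each line on commas; -v defines the awk variable prefix
# before the script starts running.
result=$(echo "alice,30" | "$awk_bin" -F, -v prefix="User: " '{print prefix $1}')
echo "$result"
```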

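Reading the program script from the command line

A gawk program script is enclosed in curly braces.

Because gawk treats the script as a single text string, the script must also be enclosed in single quotes.

gawk runs the program script against every line of text in the data stream.

Note: the command below names no file on the command line, so gawk reads its data from STDIN, which by default is the keyboard.

Whatever text you type, gawk prints Hello World! after each line and then waits for more input.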

[root@tzPC 19Unit]# gawk '{print "Hello World!"}'
This is a test
Hello World!

Press Ctrl+D (end of input) to terminate the gawk program

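Using data field variables

gawk automatically assigns a variable to each data field in a line

$0 represents the entire line of text

$1 represents the first data field in the line

$2 represents the second data field in the line

...

Data fields in a line are delimited by the field separator. By default this is any run of whitespace (spaces or tabs).

Example: use gawk to read a text file and display the first data field of each line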

[root@tzPC 19Unit]# cat data2.txt 
One line of test text.
Two lines of test text.
Three     lines of test text.
[root@tzPC 19Unit]# gawk '{print $1}' data2.txt 
One
Two
Three
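
Note that the third line of the transcript above has several spaces after "Three", yet $1 is still just "Three": with the default separator, a run of spaces or tabs counts as a single delimiter. A small sketch with a made-up input line that mixes spaces and a tab:

```shell
# Use gawk if it is installed, otherwise fall back to the system awk.
awk_bin=$(command -v gawk || command -v awk)

# Five spaces separate fields 1 and 2, a tab separates fields 2 and 3.
second=$(printf 'Three     lines\tof test text.\n' | "$awk_bin" '{print $2}')
third=$(printf 'Three     lines\tof test text.\n' | "$awk_bin" '{print $3}')
echo "$second"
echo "$third"
```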

The -F option changes the field separator

[root@tzPC 19Unit]# gawk -F: '{print $1}' /etc/passwd
root
bin
daemon
adm
lp
sync
shutdown
...

Using multiple commands in the program script

Separate the commands with semicolons.

[root@tzPC 19Unit]# echo "My name is tz" | gawk '{$4="root";print $0}'
My name is root

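The commands can also be entered one per line, as follows

No input file or pipe is given here, so gawk reads its data from STDIN, that is, from the keyboard. You have to type the data yourself; gawk then runs the script on each line you enter and prints the result. Press Ctrl+D to exit the program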

[root@tzPC ~]# gawk '{
> $4="root"
> print $0}'
My name is tz
My name is root
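
Both forms behave the same way: each command runs in order on every input line, and assigning to a field rebuilds $0. A short sketch that changes two fields before printing (the replacement values are arbitrary):

```shell
# Use gawk if it is installed, otherwise fall back to the system awk.
awk_bin=$(command -v gawk || command -v awk)

# Three commands separated by semicolons: two field assignments, then a print.
out=$(echo "My name is tz" | "$awk_bin" '{$4 = "root"; $1 = "Your"; print $0}')
echo "$out"
```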

Reading the program commands from a file

Use the -f option to specify the program file

[root@tzPC 19Unit]# cat script2.gawk 
{print $1 "'s home directory is " $6}
[root@tzPC 19Unit]# gawk -F: -f script2.gawk /etc/passwd
root's home directory is /root
bin's home directory is /bin
daemon's home directory is /sbin
adm's home directory is /var/adm
...
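
Because /etc/passwd differs from system to system, here is a self-contained sketch of the same -f usage. It writes the program to a temporary file and feeds it one made-up passwd-style line; the user alice and the temporary file are assumptions for illustration.

```shell
# Use gawk if it is installed, otherwise fall back to the system awk.
awk_bin=$(command -v gawk || command -v awk)

# Write the program to a temporary file (stands in for script2.gawk).
script=$(mktemp)
cat > "$script" <<'EOF'
{print $1 "'s home directory is " $6}
EOF

# Made-up passwd-style record; field 6 is the home directory.
out=$(echo "alice:x:1000:1000::/home/alice:/bin/bash" | "$awk_bin" -F: -f "$script")
echo "$out"
rm -f "$script"
```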

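Using a program file with multiple lines of commands

Note: the variable text is defined here. Unlike in the shell, you do not put a $ in front of it when you reference it, because in gawk the $ sign refers to a data field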

[root@tzPC 19Unit]# cat script3.gawk 
{
text="'s home directory is "
print $1 text $6
}
[root@tzPC 19Unit]# gawk -F: -f script3.gawk /etc/passwd
root's home directory is /root
bin's home directory is /bin
daemon's home directory is /sbin
adm's home directory is /var/adm
lp's home directory is /var/spool/lpd
sync's home directory is /sbin
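
To make the difference between a variable and a field reference concrete, the sketch below stores a number in a variable n (a name chosen for illustration) and prints both n and $n.

```shell
# Use gawk if it is installed, otherwise fall back to the system awk.
awk_bin=$(command -v gawk || command -v awk)

# n is the value 2; $n is the field whose number is stored in n, i.e. $2.
out=$(echo "alpha beta gamma" | "$awk_bin" '{n = 2; print n, $n}')
echo "$out"
```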

Running a script before processing data

Use the BEGIN keyword

[root@tzPC 19Unit]# cat data3.txt 
Line 1
Line 2
Line 3
[root@tzPC 19Unit]# gawk 'BEGIN {print "The data3 File Contents:"}
> {print $0}' data3.txt
The data3 File Contents:
Line 1
Line 2
Line 3

Running a script after processing data

Use the END keyword

[root@tzPC 19Unit]# gawk 'BEGIN {print "The data3 File Contents:"}
> {print $0}
> END {print "End of File"}' data3.txt
The data3 File Contents:
Line 1
Line 2
Line 3
End of File
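
BEGIN and END are also handy for setting up and reporting a running total. A minimal sketch, assuming the same three-line input as data3.txt; the counter name count is made up:

```shell
# Use gawk if it is installed, otherwise fall back to the system awk.
awk_bin=$(command -v gawk || command -v awk)

# BEGIN runs once before any input, the middle block once per line,
# and END once after the last line has been read.
out=$(printf 'Line 1\nLine 2\nLine 3\n' | "$awk_bin" '
BEGIN {count = 0; print "Start"}
{count++}
END {print "Lines read: " count}')
echo "$out"
```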

Now for a more polished use: producing a formatted report. This one is worth remembering!

[root@tzPC 19Unit]# cat script4.gawk 
BEGIN{
print "The latest list of users and shells"
print " UserID 	 Shell"
print "------- 	 -------"
FS=":"
}

{
print $1 "    	 "    $7
}

END{
print "This concludes the listing"
}

[root@tzPC 19Unit]# gawk -f script4.gawk /etc/passwd
The latest list of users and shells
 UserID      Shell
-------      -------
root         /bin/bash
bin         /sbin/nologin
postfix         /sbin/nologin
sshd         /sbin/nologin
tz         /bin/bash
This concludes the listing
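
The columns in the report above drift because the padding in the script is fixed while the user names vary in length. One possible alternative, not used in the original script, is awk's printf with a fixed-width format. A sketch with two made-up passwd-style lines:

```shell
# Use gawk if it is installed, otherwise fall back to the system awk.
awk_bin=$(command -v gawk || command -v awk)

# %-10s left-justifies the first column in a 10-character field,
# so the shell column always starts at the same position.
out=$(printf 'root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin\n' | "$awk_bin" '
BEGIN {FS = ":"; printf "%-10s %s\n", "UserID", "Shell"}
{printf "%-10s %s\n", $1, $7}')
printf '%s\n' "$out"
```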

Source: Linux Command Line and Shell Scripting Bible, 3rd Edition, Chapter 19

Today's study is meant to make tomorrow's work easier!