awk命令

简介

awk是一个强大的文本分析工具,是一种编程语言,相对于grep,它在数据分析方面表现更为强大,用于linux下对数据和文本进行处理。

awk命令形式:awk '{pattern+action}' {filenames}

awk是行处理器,它每接收文件的一行,就执行相应的处理命令。

应用示例

1.取出前五行,打印第一个数据项

[root@www ~]#last -n 5 
root     pts/1   192.168.1.100  Tue Feb 10 11:21   still logged in
root     pts/1   192.168.1.100  Tue Feb 10 00:46 - 02:28  (01:41)
root     pts/1   192.168.1.100  Mon Feb  9 11:41 - 18:30  (06:48)
dmtsai   pts/1   192.168.1.100  Mon Feb  9 11:41 - 11:41  (00:00)
root     tty1                   Fri Sep  5 14:09 - 14:10  (00:01)
[root@www ~]#last -n 5 | awk '{print $1}'
root
root
root
dmtsai
root

awk工作流程是这样的:读入有' '换行符分割的一条记录,然后将记录按指定的域分隔符划分域,填充域,$0则表示所有域,$1表示第一个域,$n表示第n个域。默认域分隔符是"空白键" 或 "[tab]键",所以$1表示登录用户,$3表示登录用户ip,以此类推。

2.统计PV、UV、IP等

从nginx的access.log中分析统计出来:

192.168.40.2 - [02/Nov/2016:15:44:35 +0800]  "GET /wcm/app/main/refresh.jsp?r=1478072325778 HTTP/1.1"  - 200 "User_Cookie:7F00000122A5597C46607B1C0A7EC016"
192.168.40.2 - [02/Nov/2016:15:44:35 +0800]  "GET /webpic/W0201611/W020161102/W020161102566715167404.jpg HTTP/1.1"  - 200 "User_Cookie:7F00000122A5597C46607B1C0A7EC016"
119.255.31.109 - [02/Nov/2016:15:44:36 +0800]  "GET /wcm/app/main/refresh.jsp?r=1478072510132 HTTP/1.1"  - 200 "User_Cookie:7F000001237921BE9237838AEC65704D"
119.255.31.109 - [02/Nov/2016:15:44:36 +0800]  "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1"  - 200 "User_Cookie:7F000001237921BE9237838AEC65704D"
192.168.40.2 - [02/Nov/2016:15:44:37 +0800]  "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1"  - 200 "User_Cookie:7F00000123D3BF2345115EAAC21F71E0"
192.168.40.2 - [02/Nov/2016:15:44:37 +0800]  "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1"  - 200 "User_Cookie:7F00000123EF73896DF98EDA9950944E"
192.168.40.2 - [02/Nov/2016:15:44:37 +0800]  "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1"  - 200 "User_Cookie:7F00000123FE0F9C397E1A8F0C4F044B"
192.168.40.2 - [02/Nov/2016:15:44:37 +0800]  "GET /wcm/app/main/refresh.jsp?r=1478072511427 HTTP/1.1"  - 200 "User_Cookie:7F00000123A465B7EA1DE0AF0AE671B7"
119.255.31.109 - [02/Nov/2016:15:44:38 +0800]  "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1"  - 200 "User_Cookie:7F00000123D89B11302DF80AE773C900"

 PV统计

--总PV量
awk '{print $6}' access.log | wc -l
--独立IP
awk '{print $1}' access.log | sort -r |uniq -c | wc -l
--UV统计
awk '{print $10}' access.log | sort -r |uniq -c |wc -l
原文地址:https://www.cnblogs.com/liluredhat/p/6702755.html