Simple AWK examples

1. Process the following file: extract the domain names, then sort and count the entries by domain

[root@salt-master test]# cat oldboy.txt
http://www.etiantian.org/index.html
http://www.etiantian.org/1.html
http://post.etiantian.org/index.html
http://www.etiantian.org/3.html
http://mp3.etiantian.org/index.html
http://post.etiantian.org/2.html

Solution:

[root@salt-master test]# awk -F / '{print $3}' oldboy.txt|sort|uniq -c|sort -rn
3 www.etiantian.org
2 post.etiantian.org
1 mp3.etiantian.org

[root@salt-master test]# cut -d / -f3 oldboy.txt|sort|uniq -c|sort -rn
3 www.etiantian.org
2 post.etiantian.org
1 mp3.etiantian.org
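
Both pipelines above sort the whole file before counting. As a rough sketch of an alternative, the counting can be done inside awk with an associative array, so that only the short summary needs sorting. The function name count_domains and the array name hits are made up for this illustration; the input is assumed to be one http://host/path URL per line.

```shell
#!/bin/sh
# Count requests per domain in a single awk pass.
# Reads URLs (one per line) on stdin; prints "count domain", highest count first.
count_domains() {
    awk -F/ 'NF >= 3 { hits[$3]++ }      # $3 is the host part of http://host/path
             END { for (d in hits) print hits[d], d }' |
        sort -k1,1nr -k2,2               # count descending, then domain name for ties
}

# Sample input copied from oldboy.txt above.
count_domains <<'EOF'
http://www.etiantian.org/index.html
http://www.etiantian.org/1.html
http://post.etiantian.org/index.html
http://www.etiantian.org/3.html
http://mp3.etiantian.org/index.html
http://post.etiantian.org/2.html
EOF
```

The order of "for (d in hits)" is unspecified in awk, which is why the final sort is still needed.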

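2. Count the connections in each network connection state on a high-concurrency web server, using the following netstat output saved as netstat.log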

Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 0.0.0.0:42081           0.0.0.0:*               LISTEN     
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN     
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN     
tcp        0      1 172.30.0.54:47570       74.125.204.102:443      SYN_SENT   
tcp        0      1 172.30.0.54:47564       74.125.204.102:443      SYN_SENT   
tcp        0      1 172.30.0.54:47566       74.125.204.102:443      SYN_SENT   
tcp        0      0 172.30.0.54:32840       165.254.134.121:80      ESTABLISHED
tcp        0      0 192.168.1.125:49202     192.168.1.142:139       ESTABLISHED
tcp        0      1 172.30.0.54:47560       74.125.204.102:443      SYN_SENT   
tcp        0      1 172.30.0.54:47562       74.125.204.102:443      SYN_SENT   
tcp        0      1 172.30.0.54:47568       74.125.204.102:443      SYN_SENT   
tcp6       0      0 :::56937                :::*                    LISTEN     
tcp6       0      0 :::3306                 :::*                    LISTEN     
tcp6       0      0 :::111                  :::*                    LISTEN     
tcp6       0      0 ::1:631                 :::*                    LISTEN

Solution:

[root@salt-master test]# awk '/^tcp/ {print $NF}' netstat.log|sort|uniq -c|sort -rn|head
7 LISTEN
6 SYN_SENT
2 ESTABLISHED

[root@salt-master test]# awk '/^tcp/ {S[$NF]++}END{for(k in S) print S[k],k}' netstat.log|sort -rn|head
7 LISTEN
6 SYN_SENT
2 ESTABLISHED
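
The second command can also be wrapped in a small function that reads from stdin, so it works on a saved log or on live output alike. This is only a sketch: the function name count_tcp_states is invented here, and it reads the state from column 6 instead of $NF, on the assumption that the input uses the six-column layout shown above (column 6 remains the State column even when extra columns follow it).

```shell
#!/bin/sh
# Summarise TCP connection states from netstat-style text on stdin.
# Prints "count STATE", most frequent state first.
count_tcp_states() {
    awk '/^tcp/ { states[$6]++ }         # column 6 is State on tcp and tcp6 rows
         END { for (s in states) print states[s], s }' |
        sort -k1,1nr -k2,2               # count descending, then state name for ties
}

# A few rows in the same layout as netstat.log above.
count_tcp_states <<'EOF'
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN
tcp        0      1 172.30.0.54:47570       74.125.204.102:443      SYN_SENT
tcp        0      1 172.30.0.54:47564       74.125.204.102:443      SYN_SENT
tcp6       0      0 :::3306                 :::*                    LISTEN
tcp        0      0 172.30.0.54:32840       165.254.134.121:80      ESTABLISHED
EOF
```

On a live host the same function could be fed with netstat -ant | count_tcp_states, assuming the net-tools package is installed.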

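3. Analyze an image server log: rank the URLs by total traffic (request count × file size, summed per URL) and take the top 10. In other words, compute the total bytes served for each URL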

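Note: practical use in production. When a site hosted in an IDC consumes a lot of bandwidth, analyzing the server log shows which resources account for the most traffic, so they can be optimized, for example by cropping or resizing the images and compressing the JS.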

Output format: [request count × size of a single file] [request count] file name

[root@salt-master test]# cat access_log
59.33.26.105 - - [08/Dec/2010:15:43:56 +0800] "GET /static/images/photos/2.jpg HTTP/1.1" 200 11299 "http://oldboy.blog.51cto.com/static/web/column/17/index.shtml?courseId=43" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

59.33.26.105 - - [08/Dec/2010:15:43:56 +0800] "GET /static/images/photos/2.jpg HTTP/1.1" 200 11299 "http://oldboy.blog.51cto.com/static/web/column/17/index.shtml?courseId=43" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

59.33.26.105 - - [08/Dec/2010:15:44:02 +0800] "GET /static/flex/vedioLoading.swf HTTP/1.1" 200 3583 "http://oldboy.blog.51cto.com/static/flex/AdobeVideoPlayer.swf?width=590&height=328&url=/`DYNAMIC`/2" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

124.115.4.18 - - [08/Dec/2010:15:44:15 +0800] "GET /?= HTTP/1.1" 200 46232 "-" "-"

124.115.4.18 - - [08/Dec/2010:15:44:25 +0800] "GET /static/js/web_js.js HTTP/1.1" 200 4460 "-" "-"

124.115.4.18 - - [08/Dec/2010:15:44:25 +0800] "GET /static/js/jquery.lazyload.js HTTP/1.1" 200 1627 "-" "-"

Solution:

[root@salt-master test]# awk '{print $7" " $10}' access_log|sort|uniq -c|awk '{print $1*$3,$1,$2}'|sort -rn|head
46232 1 /?=
22598 2 /static/images/photos/2.jpg
4460 1 /static/js/web_js.js
3583 1 /static/flex/vedioLoading.swf
1627 1 /static/js/jquery.lazyload.js
0 5

[root@salt-master test]# awk '{array_num[$7]++;array_size[$7]+=$10}END{for(x in array_num){print array_size[x],array_num[x],x}}' access_log |sort -rn -k1|head -10
46232 1 /?=
22598 2 /static/images/photos/2.jpg
4460 1 /static/js/web_js.js
3583 1 /static/flex/vedioLoading.swf
1627 1 /static/js/jquery.lazyload.js
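0 5

Note: the final "0 5" row in both outputs comes from the five blank lines in access_log, where $7 and $10 are empty. Putting the pattern NF in front of the first action (awk 'NF{...}') skips those lines.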

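Command walkthrough:

Take the first line of access_log as an example:

$7  = /static/images/photos/2.jpg
$10 = 11299

Request counter (printed as the second column):

array_num[$7]++
array_num["/static/images/photos/2.jpg"] = 1

Byte total (printed as the first column):

array_size[$7] = array_size[$7] + $10
array_size["/static/images/photos/2.jpg"] = 11299

After the second, identical log line the counter becomes 2 and the byte total becomes 22598, which matches the output above.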
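
The bookkeeping traced above can be written out as a commented script. This is a minimal sketch, not the original author's command: the names top_urls_by_bytes, hits and bytes are invented here, and the NF >= 10 guard is an added assumption that lines with fewer than ten fields (such as the blank lines in access_log) should be ignored. In the sample input the referer and user agent of the 2.jpg lines are shortened to "-" for brevity; the other two lines are copied from the log above.

```shell
#!/bin/sh
# Rank URLs by total bytes served, reading an access log on stdin.
# Output columns: total_bytes hits url (top 10 by total_bytes).
top_urls_by_bytes() {
    awk 'NF >= 10 {                  # skip blank or truncated lines
             hits[$7]++              # $7 is the request path
             bytes[$7] += $10        # $10 is the response size in bytes
         }
         END { for (u in hits) print bytes[u], hits[u], u }' |
        sort -k1,1nr -k3,3 | head -n 10
}

top_urls_by_bytes <<'EOF'
59.33.26.105 - - [08/Dec/2010:15:43:56 +0800] "GET /static/images/photos/2.jpg HTTP/1.1" 200 11299 "-" "-"

59.33.26.105 - - [08/Dec/2010:15:43:56 +0800] "GET /static/images/photos/2.jpg HTTP/1.1" 200 11299 "-" "-"

124.115.4.18 - - [08/Dec/2010:15:44:15 +0800] "GET /?= HTTP/1.1" 200 46232 "-" "-"

124.115.4.18 - - [08/Dec/2010:15:44:25 +0800] "GET /static/js/web_js.js HTTP/1.1" 200 4460 "-" "-"
EOF
```

Summing $10 per URL gives the same totals as count × size only while every request for a URL returns the same size; when sizes differ (partial responses, for instance), the sum is the figure that reflects the bytes actually logged.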

Original article: https://www.cnblogs.com/liuhui-xzz/p/9781197.html