awk

Use the ab tool to load-test a page in your own test environment:

  ab -n 10000 -c 1000 http://192.168.223.136/index.html

Then use awk to count the connections in each state:

  [root@node1 html]# netstat -tna|awk '/^tcp/ {state[$NF]++}END{for(k in state) print k,state[k]}'

  TIME_WAIT 10091

  ESTABLISHED 2

  LISTEN 6

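Analysis:
state is an associative array and $NF is the last field of each line, which for netstat output is the connection state. state[$NF] is therefore the counter for that state; an element that has never been set is empty and counts as 0.
For every line that starts with tcp, the counter for its state is incremented: when $NF is ESTABLISHED, state["ESTABLISHED"] goes from empty to 1.
When $NF is LISTEN, state["LISTEN"] goes from empty to 1.
And so on, line by line. Once all input has been read, the END block loops over the keys k and prints each state with its count. This is very helpful when looking at how the server is handling client connection requests.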
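The counting idiom described above can be tried without a live server. The following is a minimal sketch: the netstat lines are made up for illustration (they are not from the original run), and sample_netstat and count_states are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical sample of `netstat -tna` output, invented for this illustration.
sample_netstat() {
cat <<'EOF'
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN
tcp        0      0 192.168.223.136:80      192.168.223.1:50001     ESTABLISHED
tcp        0      0 192.168.223.136:80      192.168.223.1:50002     TIME_WAIT
tcp        0      0 192.168.223.136:80      192.168.223.1:50003     TIME_WAIT
tcp6       0      0 :::22                   :::*                    LISTEN
EOF
}

# count_states: read netstat-style lines on stdin and print "STATE COUNT".
# /^tcp/ skips the two header lines and matches both tcp and tcp6 rows.
# The trailing sort is only there to make the output order stable,
# because the order of "for (k in state)" is unspecified in awk.
count_states() {
    awk '/^tcp/ { state[$NF]++ }
         END    { for (k in state) print k, state[k] }' | sort
}

sample_netstat | count_states
```

Piping the real netstat -tna output into count_states should give the same kind of table as the transcript above, only in sorted order.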
 
Processing the nginx access log with awk:
  The log format is as follows:

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" '
    '"$http_user_agent" "$http_x_forwarded_for"';

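Extract the log entries for a given time window ($4 is compared as a plain string, so this is reliable only when both bounds fall on the same day):
[root@node1 logs]# awk '$4 >= "[24/Jul/2017:09:56:22" && $4 <= "[24/Jul/2017:09:57:00"' access_20170725.log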
192.168.223.1 - - [24/Jul/2017:09:56:22 +0800] "GET / HTTP/1.1" 192.168.223.136 200 41 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
192.168.223.1 - - [24/Jul/2017:09:56:22 +0800] "GET /favicon.ico HTTP/1.1" 192.168.223.136 404 142 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; Trident/7.0; rv:11.0) like Gecko" "-"
192.168.223.1 - - [24/Jul/2017:09:56:23 +0800] "GET / HTTP/1.1" 192.168.223.136 200 41 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
192.168.223.1 - - [24/Jul/2017:09:56:23 +0800] "GET /favicon.ico HTTP/1.1" 192.168.223.136 404 142 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; Trident/7.0; rv:11.0) like Gecko" "-"
192.168.223.1 - - [24/Jul/2017:09:56:25 +0800] "GET / HTTP/1.1" 192.168.223.136 200 41 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
192.168.223.1 - - [24/Jul/2017:09:56:25 +0800] "GET /favicon.ico HTTP/1.1" 192.168.223.136 404 142 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; Trident/7.0; rv:11.0) like Gecko" "-"
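The window bounds do not have to be hard-coded in the awk program; they can be passed in with -v. Below is a minimal sketch under a few assumptions: the log lines are invented and follow the log_format main shown earlier, and filter_window and sample_window_log are hypothetical names.

```shell
#!/bin/sh
# Hypothetical access-log lines in the "main" format (invented for illustration).
sample_window_log() {
cat <<'EOF'
192.168.223.1 - - [24/Jul/2017:09:55:59 +0800] "GET / HTTP/1.1" 200 41 "-" "curl/7.29.0" "-"
192.168.223.1 - - [24/Jul/2017:09:56:22 +0800] "GET / HTTP/1.1" 200 41 "-" "curl/7.29.0" "-"
192.168.223.1 - - [24/Jul/2017:09:57:00 +0800] "GET /favicon.ico HTTP/1.1" 404 142 "-" "curl/7.29.0" "-"
192.168.223.1 - - [24/Jul/2017:09:57:01 +0800] "GET / HTTP/1.1" 200 41 "-" "curl/7.29.0" "-"
EOF
}

# filter_window FROM TO: print the lines whose timestamp ($4) lies in [FROM, TO].
# FROM and TO are written like 24/Jul/2017:09:56:22; the leading "[" that nginx
# puts in front of the timestamp is added here.
# The comparison is a string comparison, so both bounds must be on the same day.
filter_window() {
    awk -v lo="[$1" -v hi="[$2" '$4 >= lo && $4 <= hi'
}

sample_window_log | filter_window "24/Jul/2017:09:56:22" "24/Jul/2017:09:57:00"
```

With the sample data, only the two middle lines fall inside the window; both bounds are inclusive.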
Per-IP request statistics:
List each client IP in the access log with its request count, sorted by count in ascending order:
[root@node1 logs]# awk '{num[$1]++}END{for(key in num) print key,num[key]}' access_20170725.log|sort -k2,2n
192.168.223.1 17
192.168.223.136 20000
 
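Count how many distinct IPs appear in the log (asort() is a gawk extension; it sorts the array and returns the number of elements, which is the only part used here):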
[root@node1 logs]# awk '{ip[$1]++}END{print asort(ip)}' access_20170725.log
2
[root@node1 logs]# awk '{print $1}' access_20170725.log|sort -u|wc -l
2
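Where gawk is not available, the distinct-IP count can also be obtained with plain POSIX awk. This is a sketch under the assumption that the client address is the first field; the log lines are invented, and count_ips and sample_ip_log are hypothetical names.

```shell
#!/bin/sh
# Hypothetical access-log lines (invented for illustration).
sample_ip_log() {
cat <<'EOF'
192.168.223.1 - - [24/Jul/2017:09:56:22 +0800] "GET / HTTP/1.1" 200 41 "-" "curl/7.29.0" "-"
192.168.223.136 - - [24/Jul/2017:09:56:23 +0800] "GET / HTTP/1.1" 200 41 "-" "ApacheBench/2.3" "-"
192.168.223.1 - - [24/Jul/2017:09:56:25 +0800] "GET /favicon.ico HTTP/1.1" 404 142 "-" "curl/7.29.0" "-"
EOF
}

# count_ips: print the number of distinct values seen in field 1.
# seen[$1]++ is 0 (false) only the first time an address appears,
# so n is incremented once per distinct IP.
# "n + 0" forces a numeric 0 when the input is empty.
count_ips() {
    awk '!seen[$1]++ { n++ } END { print n + 0 }'
}

sample_ip_log | count_ips
```

This avoids both the sort pipeline and the gawk-only asort() call.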
 
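Total bandwidth served, according to the access log:
[root@node1 logs]# awk -v total=0 '{total+=$10}END{print total/1024/1024}' access_20170725.log
3.8195
-v var_name=value: defines an awk variable and assigns it a value before the program starts
-v total=0: initializes total to 0 (optional, since an unset awk variable already counts as 0 in arithmetic)
The byte counts in $10 are summed and divided by 1024*1024 to convert bytes to MB. With the log_format shown above, $10 is $body_bytes_sent. Note that the sample log lines earlier in this article carry an extra address field after the request, which shifts the status code to $10 and the byte count to $11; check which field holds the byte count in your own log before trusting the total.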
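The byte-to-MB arithmetic can be checked on a small input. The sketch below assumes log lines that follow the log_format main shown earlier, so that $10 is $body_bytes_sent; the lines and the byte counts are invented, and total_mb and sample_bytes_log are hypothetical names.

```shell
#!/bin/sh
# Hypothetical access-log lines in the "main" format (invented for illustration).
# Field 10 is the response body size in bytes: 524288 + 262144 + 262144 = 1048576.
sample_bytes_log() {
cat <<'EOF'
192.168.223.1 - - [24/Jul/2017:09:56:22 +0800] "GET /image.jpeg HTTP/1.1" 200 524288 "-" "curl/7.29.0" "-"
192.168.223.1 - - [24/Jul/2017:09:56:23 +0800] "GET /a.bin HTTP/1.1" 200 262144 "-" "curl/7.29.0" "-"
192.168.223.1 - - [24/Jul/2017:09:56:25 +0800] "GET /b.bin HTTP/1.1" 200 262144 "-" "curl/7.29.0" "-"
EOF
}

# total_mb: sum field 10 over all lines and print the total in MB
# (1 MB taken as 1024*1024 bytes, as in the command above), with 4 decimals.
total_mb() {
    awk '{ total += $10 } END { printf "%.4f\n", total / 1024 / 1024 }'
}

sample_bytes_log | total_mb
```

If a log uses a different format, only the field number in total_mb needs to change.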
 
Request count per URL (top 20):
[root@node1 logs]# awk '{url[$7]++}END{for (k in url){print url[k],k}}' access_20170725.log | sort -rn | head -20
20006 /
8 /favicon.ico
2 /image.jpeg
1 /proxy_path/image.jpeg
 
Requests per minute:
[root@node1 logs]# awk -F: '{count[$2":"$3]++} END {for (minute in count) print minute, count[minute]}' access_20170725.log | sort|head
09:55 10002
09:56 10006
13:59 3
14:01 2
14:39 3
14:40 1
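Splitting on ":" works as long as no colon appears before the timestamp (an IPv6 client address would break it, for example). An alternative sketch takes the minute directly from $4 with substr(); it assumes the timestamp has the fixed layout [dd/Mon/yyyy:HH:MM:SS. The log lines are invented, and count_per_minute and sample_minute_log are hypothetical names.

```shell
#!/bin/sh
# Hypothetical access-log lines (invented for illustration).
sample_minute_log() {
cat <<'EOF'
192.168.223.1 - - [24/Jul/2017:09:55:59 +0800] "GET / HTTP/1.1" 200 41 "-" "curl/7.29.0" "-"
192.168.223.1 - - [24/Jul/2017:09:56:22 +0800] "GET / HTTP/1.1" 200 41 "-" "curl/7.29.0" "-"
192.168.223.1 - - [24/Jul/2017:09:56:23 +0800] "GET /favicon.ico HTTP/1.1" 404 142 "-" "curl/7.29.0" "-"
192.168.223.1 - - [24/Jul/2017:09:57:00 +0800] "GET / HTTP/1.1" 200 41 "-" "curl/7.29.0" "-"
EOF
}

# count_per_minute: print "HH:MM COUNT" for every minute present in the input.
# In "[24/Jul/2017:09:56:22" the hour starts at character 14,
# so substr($4, 14, 5) yields "09:56".
count_per_minute() {
    awk '{ minute = substr($4, 14, 5); count[minute]++ }
         END { for (m in count) print m, count[m] }' | sort
}

sample_minute_log | count_per_minute
```

Because the default field separator is kept, the other fields (such as $7 for the URL) remain available in the same program.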
 
Request count per URL for each timestamp (one-second resolution), highest count first:

[root@node1 logs]# awk '{count[$4" "$7]++} END {for (key in count) print key, count[key]}' access.log | sort -rnk3
[02/Aug/2017:11:38:58 /index.html 6276
[02/Aug/2017:11:38:59 /index.html 3826
[02/Aug/2017:11:41:21 / 1
[02/Aug/2017:11:36:44 /favicon.ico 1
[02/Aug/2017:11:36:44 / 1
[02/Aug/2017:11:36:38 /favicon.ico 1
[02/Aug/2017:11:36:38 / 1
[02/Aug/2017:11:35:10 / 1
[02/Aug/2017:11:34:45 /favicon.ico 1
[02/Aug/2017:11:34:45 / 1
[02/Aug/2017:11:34:16 /favicon.ico 1
[02/Aug/2017:11:34:16 / 1
[02/Aug/2017:11:34:10 /favicon.ico 1
[02/Aug/2017:11:34:10 / 1
[02/Aug/2017:11:31:22 /favicon.ico 1
[02/Aug/2017:11:31:22 / 1

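Request count per URL for each minute (the field separator is one or more colons or spaces, so $5 is the hour, $6 the minute, and $10 the URL):

[root@node1 logs]# awk -F"[: ]+" '{count[$5":"$6" "$10]++}END{for (key in count) print key, count[key]}' access.log|sort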
11:31 / 1
11:31 /favicon.ico 1
11:34 / 3
11:34 /favicon.ico 3
11:35 / 1
11:36 / 2
11:36 /favicon.ico 2
11:38 /index.html 10102
11:41 / 1

Original article: https://www.cnblogs.com/jsonhc/p/7273546.html