Linux shell study notes: awk exercises

Sample of the log file used in the examples:

120.197.87.216 - - [04/Jan/2012:00:00:02 +0800] "GET /home.php?mod=space&uid=563413&mobile=yes HTTP/1.1" 200 3388 "-" "-"
123.126.50.73 - - [04/Jan/2012:00:00:02 +0800] "GET /thread-679411-1-1.html HTTP/1.1" 200 5251 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
203.208.60.187 - - [04/Jan/2012:00:00:02 +0800] "GET /archiver/tid-3003.html HTTP/1.1" 200 2056 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
114.112.141.6 - - [04/Jan/2012:00:00:02 +0800] "GET /ctp080113.php?action=getgold HTTP/1.1" 200 13886 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; InfoPath.3; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

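Print the client IP address (the first field):

awk '{print $1}' access.20120104.log

Print the timestamp without its leading "[" (the fourth field, starting from its second character):

awk '{print substr($4,2)}' access.20120104.log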
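To see what these two commands pick out, the sketch below feeds a single line from the sample above through awk. The variable names line and fields are only illustrative; the log line itself is copied from the excerpt.

```shell
# One line in the same format as the log excerpt above.
line='120.197.87.216 - - [04/Jan/2012:00:00:02 +0800] "GET /home.php?mod=space&uid=563413&mobile=yes HTTP/1.1" 200 3388 "-" "-"'

# $1 is the client IP. $4 is "[04/Jan/2012:00:00:02", and substr($4,2)
# drops the leading "[" because awk string positions start at 1.
fields=$(printf '%s\n' "$line" | awk '{print $1; print substr($4,2)}')
printf '%s\n' "$fields"
# prints:
# 120.197.87.216
# 04/Jan/2012:00:00:02
```

awk splits on runs of whitespace by default, which is why the time zone "+0800]" ends up in $5 rather than in $4.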

Examples:

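1. Imitate the output style of the Windows dir command (NR>1 skips the "total" line that ls -l prints first)

ls -l | awk 'NR>1 {printf "%s %s %s ", $6, $7, $8; if (substr($1,1,1)=="d") {printf "<dir>  "} else {printf "%s  ", $5}; print $9}'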

4月 27 2014 <dir>  a
3月 11 2014 688211055  access.20120104.log
10月 9 2013 3025757  access.log.10
5月 31 2014 <dir>  Algorithms

Enhanced version: a. date shown as year-month-day hour:minute  b. aligned columns (using printf format strings)

ls -al --time-style=long-iso | awk 'NR>1 {printf "%s %s ", $6, $7; if (substr($1,1,1)=="d") {printf "%-15s", "<dir>"} else {printf "%15s", $5}; print "    " $8}'

2014-03-11 14:11       688211055    access.20120104.log
2013-10-09 10:35         3025757    access.log.10
2014-05-31 00:56 <dir>              Algorithms
2015-05-02 18:47           18748    .bash_history
2014-03-13 14:06             220    .bash_logout
2014-03-13 14:06            3486    .bashrc
2014-05-09 15:45 <dir>              .cache
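The alignment in the listing above comes from the width and the "-" flag in the printf format. A minimal sketch of just that part, with "|" added only to make the column edges visible (the variable name aligned is illustrative):

```shell
# "%-15s" left-justifies in a 15-character column, "%15s" right-justifies.
aligned=$(awk 'BEGIN {
    printf "|%-15s|\n", "<dir>"
    printf "|%15s|\n", 3486
}')
printf '%s\n' "$aligned"
```

Each output line is 17 characters wide: the 15-character column plus the two "|" markers.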

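Note: --time-style=long-iso is a GNU ls option. With it the date and time take up fields $6 and $7, so the file name is in $8 instead of $9.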

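2. Count the site's distinct IPs and its PVs (page views)

Distinct IPs: print one line per IP and count the lines with wc -l. Do not print a header in BEGIN here, or the count will be off by one.

awk '{ip[$1]++} END{for (i in ip) print i, ip[i]}' access.20120104.log | wc -l

PVs, treating each log line as one page view, so the count is simply the number of records:

awk 'END{print NR}' access.20120104.log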

The same counts can also be obtained by combining sort and uniq.
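One possible sort/uniq version is sketched below. The four-line sample and its 192.0.2.x addresses are made up for illustration and stand in for access.20120104.log; only the first field matters.

```shell
# Hypothetical sample data; field 1 is the client IP.
sample='192.0.2.1 - - a
192.0.2.2 - - b
192.0.2.1 - - c
192.0.2.3 - - d'

# Distinct IPs: extract field 1, sort with duplicates removed, count lines.
ip_count=$(printf '%s\n' "$sample" | awk '{print $1}' | sort -u | wc -l | tr -d ' ')

# Hits per IP, busiest first: uniq -c prefixes each IP with its count.
top=$(printf '%s\n' "$sample" | awk '{print $1}' | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2, $1}')

echo "distinct IPs: $ip_count"
echo "busiest IP and hits: $top"
```

uniq only collapses adjacent duplicates, so the input has to be sorted first. On a real log, replace the printf with a redirect from the log file.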

3. Compute the average size of the regular files whose names start with a

find . -name "a*" -type f -exec ls -l {} \; | awk '{count+=1; sum+=$5} END{if (count>0) print "avg=" sum/count}'

Of course, count is not needed here; NR can be used directly in the calculation instead.
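A sketch of the NR variant, run on two made-up "ls -l" style lines so the result is easy to check by hand (the file names, owner and sizes are hypothetical):

```shell
# Hypothetical listing; field 5 is the size in bytes.
listing='-rw-r--r-- 1 user user 100 2014-03-11 14:11 a.txt
-rw-r--r-- 1 user user 300 2014-03-11 14:12 access.log'

# In the END block NR still holds the total number of records read,
# so it can replace a hand-maintained counter.
avg=$(printf '%s\n' "$listing" | awk '{sum += $5} END {if (NR > 0) print "avg=" sum/NR}')
echo "$avg"
```

This only works because every input line is one file. If some lines had to be skipped, a separate counter would still be needed.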

4. Compute each person's total and average from the table below

vi pay.txt

Name    1st     2st     3st
grid    23000   24000   25000
lily    21000   23000   20000
david   25000   19000   24000

awk '{if (NR==1) {printf "%10s %10s %10s %10s %10s %10s\n", $1, $2, $3, $4, "total", "avg"}; if (NR>=2) {total=$2+$3+$4; avg=total/(NF-1); printf "%10s %10d %10d %10d %10.2f %10.2f\n", $1, $2, $3, $4, total, avg}}' pay.txt

      Name        1st        2st        3st      total        avg
      grid      23000      24000      25000   72000.00   24000.00
      lily      21000      23000      20000   64000.00   21333.33
     david      25000      19000      24000   68000.00   22666.67

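Notes: 1. Get fluent with printf format strings. 2. When using conditionals, take care not to mismatch the braces. Remember that if-else and for are each a single statement, and that statements are separated only by semicolons.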
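The point about if-else and semicolons can be illustrated with a small sketch that writes the same test twice, once with braces over several lines and once as a one-liner. The two permission strings and the variable names are made up for the example.

```shell
# Two sample permission strings: a directory and a regular file.
modes='drwxr-xr-x
-rw-r--r--'

# Multi-line form: braces group each branch.
braced=$(printf '%s\n' "$modes" | awk '{
    if (substr($1, 1, 1) == "d") {
        kind = "dir"
    } else {
        kind = "file"
    }
    print kind
}')

# One-line form: without braces, a ";" must end the statement before "else".
oneline=$(printf '%s\n' "$modes" | awk '{if (substr($1,1,1)=="d") kind="dir"; else kind="file"; print kind}')

printf '%s\n' "$braced"
[ "$braced" = "$oneline" ] && echo "both forms agree"
```

In both forms the final print sits outside the if-else, so it runs for every line regardless of which branch was taken.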

Original article: https://www.cnblogs.com/bowenlearning/p/4485372.html