The Three Musketeers of Text Processing: grep

grep

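  • grep: Global search for a Regular Expression (RE) and Print out the line

It searches the input for lines that match a regular expression and prints those lines.

  • grep works line by line
    The principle is simple: grep reads one line of text at a time into memory, compares it against the pattern and options given on the command line, and prints the line if it matches.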
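
The line-by-line model described above can be imitated with a small shell loop. This is only a sketch of the idea, not how grep is actually implemented: the function name mini_grep and the sample text are made up for illustration, and it matches a literal substring rather than a regular expression.

```shell
# mini_grep: hypothetical helper that mimics grep's line-by-line model.
# It matches a literal substring only, not a regular expression.
mini_grep() {
    keyword=$1
    # Read one line at a time; also handle a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *"$keyword"*) printf '%s\n' "$line" ;;   # print lines that match
        esac
    done
}

result=$(printf 'root:x:0:0\ndaemon:x:2:2\noperator:x:11:0:/root\n' | mini_grep root)
printf '%s\n' "$result"
```

The loop holds only the current line in memory, which is why this style of processing copes with very large inputs.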
[21:19:21 root@C8-3-55 ~]#grep --help
Usage: grep [OPTION]... PATTERN [FILE]...
Search for PATTERN in each FILE.
Example: grep -i 'hello world' menu.h main.c

Pattern selection and interpretation:
  -E, --extended-regexp     PATTERN is an extended regular expression
  -F, --fixed-strings       PATTERN is a set of newline-separated strings
  -G, --basic-regexp        PATTERN is a basic regular expression (default)
  -P, --perl-regexp         PATTERN is a Perl regular expression
  -e, --regexp=PATTERN      use PATTERN for matching
  -f, --file=FILE           obtain PATTERN from FILE
  -i, --ignore-case         ignore case distinctions
  -w, --word-regexp         force PATTERN to match only whole words
  -x, --line-regexp         force PATTERN to match only whole lines
  -z, --null-data           a data line ends in 0 byte, not newline

Miscellaneous:
  -s, --no-messages         suppress error messages
  -v, --invert-match        select non-matching lines
  -V, --version             display version information and exit
      --help                display this help text and exit

Output control:
  -m, --max-count=NUM       stop after NUM selected lines
  -b, --byte-offset         print the byte offset with output lines
  -n, --line-number         print line number with output lines
      --line-buffered       flush output on every line
  -H, --with-filename       print file name with output lines
  -h, --no-filename         suppress the file name prefix on output
      --label=LABEL         use LABEL as the standard input file name prefix
  -o, --only-matching       show only the part of a line matching PATTERN
  -q, --quiet, --silent     suppress all normal output
      --binary-files=TYPE   assume that binary files are TYPE;
                            TYPE is 'binary', 'text', or 'without-match'
  -a, --text                equivalent to --binary-files=text
  -I                        equivalent to --binary-files=without-match
  -d, --directories=ACTION  how to handle directories;
                            ACTION is 'read', 'recurse', or 'skip'
  -D, --devices=ACTION      how to handle devices, FIFOs and sockets;
                            ACTION is 'read' or 'skip'
  -r, --recursive           like --directories=recurse
  -R, --dereference-recursive
                            likewise, but follow all symlinks
      --include=FILE_PATTERN
                            search only files that match FILE_PATTERN
      --exclude=FILE_PATTERN
                            skip files and directories matching FILE_PATTERN
      --exclude-from=FILE   skip files matching any file pattern from FILE
      --exclude-dir=PATTERN directories that match PATTERN will be skipped.
  -L, --files-without-match print only names of FILEs with no selected lines
  -l, --files-with-matches  print only names of FILEs with selected lines
  -c, --count               print only a count of selected lines per FILE
  -T, --initial-tab         make tabs line up (if needed)
  -Z, --null                print 0 byte after FILE name

Context control:
  -B, --before-context=NUM  print NUM lines of leading context
  -A, --after-context=NUM   print NUM lines of trailing context
  -C, --context=NUM         print NUM lines of output context
  -NUM                      same as --context=NUM
      --group-separator=SEP use SEP as a group separator
      --no-group-separator  use empty string as a group separator
      --color[=WHEN],
      --colour[=WHEN]       use markers to highlight the matching strings;
                            WHEN is 'always', 'never', or 'auto'
  -U, --binary              do not strip CR characters at EOL (MSDOS/Windows)

When FILE is '-', read standard input.  With no FILE, read '.' if
recursive, '-' otherwise.  With fewer than two FILEs, assume -h.
Exit status is 0 if any line is selected, 1 otherwise;
if any error occurs and -q is not given, the exit status is 2.

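Common commands

grep: simple yet unbeatable basic usage

  • grep <keyword> <file>

With no options, just a keyword and a file name, grep prints every line of the file that contains the keyword.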

[22:27:31 root@C8-3-55 ~]#grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

Combined with a pipe, grep filters another command's output down to the lines that contain a given string.

[22:36:23 root@C8-3-55 ~]#df -h
Filesystem           Size  Used Avail  Use% Mounted on
devtmpfs             886M     0  886M    0% /dev
tmpfs                904M     0  904M    0% /dev/shm
tmpfs                904M  8.8M  895M    1% /run
tmpfs                904M     0  904M    0% /sys/fs/cgroup
/dev/mapper/cl-root   17G  3.4G   14G   20% /
/dev/sda1            976M  139M  771M   16% /boot
tmpfs                181M     0  181M    0% /run/user/0
[22:36:31 root@C8-3-55 ~]#df -h | grep /dev
devtmpfs             886M     0  886M    0% /dev
tmpfs                904M     0  904M    0% /dev/shm
/dev/mapper/cl-root   17G  3.4G   14G   20% /
/dev/sda1            976M  139M  771M   16% /boot
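
Note that the pattern /dev matches anywhere in the line, which is why the devtmpfs and /dev/shm rows also appear above. If only the rows whose first column is a device under /dev are wanted, anchoring the pattern with ^ is one option. A sketch, using a few sample rows in place of live df output:

```shell
# Sample text standing in for `df -h` output (values are illustrative only).
sample='devtmpfs             886M     0  886M    0% /dev
tmpfs                904M     0  904M    0% /dev/shm
/dev/sda1            976M  139M  771M   16% /boot'

# Unanchored: counts every row with /dev anywhere in it (all three).
unanchored=$(printf '%s\n' "$sample" | grep -c /dev)

# Anchored with ^: only rows that start with /dev.
anchored=$(printf '%s\n' "$sample" | grep '^/dev')

echo "unanchored matches: $unanchored"
printf '%s\n' "$anchored"
```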

grep -v

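  • -v inverts the match: it prints the lines that do not match
    When reading a configuration file, grep -v can strip out the comments. Note that the pattern '#' used here drops every line containing a '#' anywhere; to drop only lines that begin with '#', anchor it as '^#'.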
[22:44:17 root@C8-3-55 ~]#grep -v '#' /web-back/html/etc/yum.conf
[main]
cachedir=/var/cache/yum/$basearch/$releasever
keepcache=0
debuglevel=2
logfile=/var/log/yum.log
exactarch=1
obsoletes=1
gpgcheck=1
plugins=1
installonly_limit=5
bugtracker_url=http://bugs.centos.org/set_project.php?project_id=23&ref=http://bugs.centos.org/bug_report_page.php?category=yum
distroverpkg=centos-release
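
The difference between an unanchored '#' and an anchored '^#' is easy to miss, so here is a small sketch. The configuration content is invented for the example; the second command also drops empty lines with '^$', a common companion pattern.

```shell
conf=$(mktemp)
cat > "$conf" <<'EOF'
# main settings
[main]
keepcache=0   # inline comment

gpgcheck=1
EOF

# -v '#' drops every line containing '#', including the keepcache line.
loose=$(grep -v '#' "$conf")

# Anchored patterns drop only comment lines (^#) and empty lines (^$).
strict=$(grep -v -e '^#' -e '^$' "$conf")

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
rm -f "$conf"
```

The strict form keeps keepcache=0 even though that line carries an inline comment, while the loose form silently loses the setting.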

grep -f

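  • -f FILE reads the patterns from FILE, one per line, and prints the lines that match any of them. It is not a diff of two files; at most it shows which lines of one file match the contents of another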
[22:21:47 root@C8-3-55 ~]#echo 123 > 123.txt ;echo 123456 > 123456.txt ;grep -f 123.txt 123456.txt
123456
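
A slightly larger sketch of -f with more than one pattern. The file contents are invented for the example. The second command adds -F (fixed strings) and -x (whole-line match), both listed in the help output above, to report lines that appear verbatim in both files.

```shell
patterns=$(mktemp)
data=$(mktemp)
printf 'root\nsshd\n' > "$patterns"          # one pattern per line
printf 'root:x:0:0\nbin:x:1:1\nsshd:x:74:74\n' > "$data"

# Print the lines of $data that match any pattern listed in $patterns.
matched=$(grep -f "$patterns" "$data")
printf '%s\n' "$matched"

# With -F and -x, -f reports lines present verbatim in both files.
printf 'bin:x:1:1\nnobody\n' > "$patterns"
common=$(grep -F -x -f "$patterns" "$data")
printf '%s\n' "$common"

rm -f "$patterns" "$data"
```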

grep -r -R

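  • -r searches recursively through every file under a directory
  • -R does the same and also follows symbolic links

When you need to find a setting but are not sure which file contains it, use -r to search recursively.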

[22:26:49 root@C8-3-55 ~]#grep -r grep /
/boot/System.map-4.18.0-147.el8.x86_64:ffffffff81183af0 t kdb_grep_help
/boot/System.map-4.18.0-147.el8.x86_64:ffffffff8215c990 r __ksymtab_kdb_grepping_flag
/boot/System.map-4.18.0-147.el8.x86_64:ffffffff82187894 r __kstrtab_kdb_grepping_flag
/boot/System.map-4.18.0-147.el8.x86_64:ffffffff82b9f160 b suspend_grep
/boot/System.map-4.18.0-147.el8.x86_64:ffffffff82b9f2a0 B kdb_grep_trailing
/boot/System.map-4.18.0-147.el8.x86_64:ffffffff82b9f2a4 B kdb_grep_leading
/boot/System.map-4.18.0-147.el8.x86_64:ffffffff82b9f2a8 B kdb_grepping_flag
/boot/System.map-4.18.0-147.el8.x86_64:ffffffff82b9f2c0 B kdb_grep_string
/boot/grub2/i386-pc/modinfo.sh:grub_package_bugreport="bug-grub@gnu.org"
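
Searching from / as above scans everything, including binaries, and can take a long time. Narrowing the starting directory and combining -r with --include and -l (both listed in the help output) usually gives more manageable results. A sketch using a throwaway directory with made-up files:

```shell
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf 'timeout=30\n' > "$dir/app.conf"
printf 'timeout=60\n' > "$dir/sub/db.conf"
printf 'timeout notes\n' > "$dir/sub/readme.txt"

# -r: recurse; --include: only files whose name matches the glob;
# -l: print just the names of files that contain a match.
files=$(grep -r -l --include='*.conf' timeout "$dir" | sort)
printf '%s\n' "$files"

rm -rf "$dir"
```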

grep -m

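  • -m NUM stops after NUM matching lines
    grep may match a large number of lines; if only a fixed number is wanted, -m limits how many are printed (per file)

Find only the first 5 users whose shell is /bin/bash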

[22:41:10 root@C8-3-55 ~]#grep -m 5 /bin/bash /etc/passwd
root:x:0:0:root:/root:/bin/bash
python:x:1000:1000::/home/python:/bin/bash
sun3:x:1002:1002::/home/sun3:/bin/bash
sun4:x:1003:1003::/home/sun4:/bin/bash
sun2:x:8889:8889::/home/sun2:/bin/bash
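
The same option is handy for pulling just the first few hits out of a long log. A minimal sketch with an invented log file:

```shell
log=$(mktemp)
printf 'INFO start\nERROR disk full\nINFO retry\nERROR timeout\nERROR giving up\n' > "$log"

# Stop reading after the first 2 matching lines.
first_two=$(grep -m 2 ERROR "$log")
printf '%s\n' "$first_two"

rm -f "$log"
```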

grep -c

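  • -c prints only the number of matching lines, not the lines themselves

Count the users in /etc/passwd whose shell is /bin/bash, and those whose shell is /sbin/nologin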

[22:44:56 root@C8-3-55 ~]#grep -c '/bin/bash' /etc/passwd
107
[22:48:07 root@C8-3-55 ~]#grep -c '/sbin/nologin' /etc/passwd
29
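
Keep in mind that -c counts lines, not individual occurrences. When a pattern can appear several times on one line, one way to count occurrences is to combine -o with wc -l. A sketch with two sample passwd-style lines:

```shell
f=$(mktemp)
printf 'root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin\n' > "$f"

lines=$(grep -c root "$f")                       # lines containing "root"
hits=$(grep -o root "$f" | wc -l | tr -d ' ')    # individual occurrences

echo "matching lines: $lines, occurrences: $hits"
rm -f "$f"
```

Here only one line matches, but "root" occurs three times on it.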

grep -q

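  • -q is quiet mode: nothing is written to standard output, much like redirecting the output to /dev/null. Unlike &>/dev/null it does not hide error messages; -s does that

When all that matters is whether a match exists, check the exit status in $?: 0 means found, 1 means not found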

[22:49:09 root@C8-3-55 ~]#grep -q root /etc/passwd
[22:51:43 root@C8-3-55 ~]#echo $?
0
[22:51:51 root@C8-3-55 ~]#grep -q roooot /etc/passwd
[22:52:00 root@C8-3-55 ~]#echo $?
1
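
In scripts the exit status is usually tested directly with if rather than through $?. A minimal sketch; the helper name user_exists and the sample file are hypothetical:

```shell
users=$(mktemp)
printf 'root:x:0:0:root:/root:/bin/bash\n' > "$users"

# user_exists: hypothetical helper; succeeds if a line starts with "NAME:".
user_exists() {
    grep -q "^$1:" "$users"
}

if user_exists root; then r1=found; else r1=missing; fi
if user_exists roooot; then r2=found; else r2=missing; fi
echo "root: $r1, roooot: $r2"

rm -f "$users"
```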

grep -e

  • -e can be given several times to match any of several patterns
    Match lines that contain bash or nologin
[23:08:45 root@C8-3-55 ~]#grep -e bash -e nologin /etc/passwd | wc -l
136
  • To require both bash and root on the same line, pipe the output through a second grep
[23:09:15 root@C8-3-55 ~]#grep bash /etc/passwd | grep root
root:x:0:0:root:/root:/bin/bash
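
For the OR case, an extended regular expression with | (the -E option from the help output) is a common alternative to repeating -e. The sketch below checks on a small invented file that both forms select the same lines, and repeats the AND pipeline:

```shell
f=$(mktemp)
printf 'root:/bin/bash\nbin:/sbin/nologin\nsync:/bin/sync\n' > "$f"

# OR: two -e patterns and one -E alternation select the same lines.
or_e=$(grep -c -e bash -e nologin "$f")
or_E=$(grep -c -E 'bash|nologin' "$f")

# AND: chain two greps so a line must pass both filters.
both=$(grep bash "$f" | grep root)

echo "or_e=$or_e or_E=$or_E both=$both"
rm -f "$f"
```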

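grep -w: match whole words only

A "word" is an unbroken run of letters, digits, and underscores. With -w, the match must be bounded on both sides by a non-word character or by the start or end of the line, so root matches in "non-root" but not in "root1" or "chroot".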
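
A short sketch of how -w changes the result, using a handful of made-up lines:

```shell
f=$(mktemp)
printf 'root\nroot1\nchroot\nnon-root user\nroot_dir\n' > "$f"

plain=$(grep -c root "$f")    # substring match: every line contains "root"
words=$(grep -w root "$f")    # whole-word matches only

echo "plain matches: $plain"
printf '%s\n' "$words"
rm -f "$f"
```

Only "root" and "non-root user" survive with -w, because a digit, a letter, or an underscore next to the match makes it part of a longer word.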

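grep -A -B -C: along with each match, also show neighboring lines: -A NUM lines after, -B NUM lines before, -C NUM lines on both sides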
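
A minimal sketch of the three context options on a five-line sample file (contents invented for the example):

```shell
f=$(mktemp)
printf 'line1\nline2\nTARGET\nline4\nline5\n' > "$f"

after=$(grep -A 1 TARGET "$f")    # match plus 1 line after it
before=$(grep -B 1 TARGET "$f")   # match plus 1 line before it
around=$(grep -C 1 TARGET "$f")   # match plus 1 line on each side

printf 'after:\n%s\nbefore:\n%s\naround:\n%s\n' "$after" "$before" "$around"
rm -f "$f"
```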

grep -i: ignore case

grep -n: show line numbers
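
The -i and -n options are often used together when scanning logs. A sketch with an invented four-line file:

```shell
f=$(mktemp)
printf 'Error: disk\nok\nERROR: net\nerror: cpu\n' > "$f"

# -i ignores case; -n prefixes each matching line with its line number.
out=$(grep -i -n error "$f")
printf '%s\n' "$out"
rm -f "$f"
```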

Original article: https://www.cnblogs.com/bpzblog/p/14499377.html