Perl Study Notes, Part 9: Regular Expressions

Regular expressions are a very powerful way of matching patterns in text.

They are often called "regex", or "regexes" in the plural.

They are used within other languages, such as Perl.

They are commonly used for search-and-replace operations.

Regexes can be very simple or very complex.

The regex language is compact, so it can look more intimidating than it really is:

($hour,$min,$sec) = $time =~ /(\d\d):(\d\d):(\d\d)/;

while ($n =~ s/^(-?\d+)(\d{3})/$1,$2/) {}
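To make those two one-liners less intimidating, here is a minimal sketch that wraps each in a small sub. The names `parse_time` and `commify` are made up for this illustration and are not part of the original scripts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split "HH:MM:SS" into three two-digit fields.
# A match in list context returns the captured groups.
sub parse_time {
    my $time = shift;
    my ($hour, $min, $sec) = $time =~ /^(\d\d):(\d\d):(\d\d)$/;
    return ($hour, $min, $sec);
}

# Hypothetical helper: insert thousands separators into an integer.
# Each pass puts a comma before the last three digits of the leading
# run of digits; the loop stops when the substitution no longer matches.
sub commify {
    my $n = shift;
    1 while $n =~ s/^(-?\d+)(\d{3})/$1,$2/;
    return $n;
}

print join('-', parse_time('05:24:37')), "\n";   # 05-24-37
print commify(1234567), "\n";                    # 1,234,567
print commify(-1234), "\n";                      # -1,234
```

The replacement side uses `$1` and `$2` rather than `\1` and `\2`; under `use warnings`, Perl complains about the backslash form there.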

search.pl

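 1 #!/usr/bin/perl
 2 #
 3 use strict;
 4 use warnings;
 5
 6 main(@ARGV);
 7
 8 sub main
 9 {
10     open(FH, '<', 'short.txt') or error("cannot open short.txt: $!");
11     while ($_ = <FH>) {
12         print if $_ =~ /regular/;   # explicit match against $_
13         print if /\/\//;            # literal "//": each slash must be escaped
14         print if m/regular/;        # same as line 12: $_ is the default target
15         print if m|//|;             # same as line 13, with | as the delimiter
16     }
17     close FH;
18 }
19
20 sub message
21 {
22     my $m = shift or return;
23     print("$m\n");
24 }
25
26 sub error
27 {
28     my $e = shift || 'unknown error';
29     print(STDERR "$0: $e\n");
30     exit 1;   # non-zero status signals failure
31 }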

Line 12 is equivalent to line 14, and line 13 is equivalent to line 15.
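The same idea can be tried without a data file. This is only a sketch: the sample lines (including the example.com URL) are invented stand-ins for short.txt.

```perl
use strict;
use warnings;

# Made-up sample lines standing in for short.txt.
my @lines = (
    "regular expressions are compact",
    "see http://example.com/ for details",
    "nothing to match here",
);

my (@word_hits, @slash_hits);
for (@lines) {
    # With no =~, the match operator tests $_; m/.../ and /.../ are the same.
    push @word_hits, $_ if /regular/;
    # m|...| picks a different delimiter, so slashes need no escaping.
    push @slash_hits, $_ if m|//|;
}

print "word:  $_\n" for @word_hits;
print "slash: $_\n" for @slash_hits;
```

Choosing a delimiter that does not occur in the pattern avoids runs of backslash-escaped slashes.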

search1.pl

#!/usr/bin/perl
use strict;
use warnings;

main(@ARGV);

sub main
{
    open(FH, '<', 'short.txt') or error("cannot open short.txt: $!");
    while (my $line = <FH>) {
        $line =~ s/regular/BOB/;   # replace the first "regular" on the line
        $line =~ s/a/BOB/gi;       # replace every "a" or "A" on the line
        print $line;
    }
    close FH;
}

sub message
{
    my $m = shift or return;
    print("$m\n");
}

sub error
{
    my $e = shift || 'unknown error';
    print(STDERR "$0: $e\n");
    exit 1;   # non-zero status signals failure
}

s: substitute, i.e. search and replace;

g: global, replace every match rather than only the first;

i: ignore case.
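A small sketch of how the flags change the result, using a made-up sample string. Each substitution works on its own copy so the three outcomes can be compared side by side.

```perl
use strict;
use warnings;

my $text = "A regular banana";

# (my $copy = $text) =~ s/.../.../ copies first, then edits the copy.
(my $first = $text) =~ s/a/BOB/;     # first lowercase "a" only
(my $all   = $text) =~ s/a/BOB/g;    # every lowercase "a"
(my $any   = $text) =~ s/a/BOB/gi;   # every "a" or "A"

print "$first\n";   # A regulBOBr banana
print "$all\n";     # A regulBOBr bBOBnBOBnBOB
print "$any\n";     # BOB regulBOBr bBOBnBOBnBOB
```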

extract.pl

 1 #!/usr/bin/perl
 2 #
 3 use strict;
 4 use warnings;
 5
 6 main(@ARGV);
 7
 8 sub main
 9 {
10     my $time = "05:24:37";
11     $time =~ /(..):(..):(..)/;   # three two-character groups separated by colons
12     my $hour = $1;
13     my $min  = $2;
14     my $sec  = $3;
15     #my ($hour,$min,$sec) = $time =~ /(..):(..):(..)/;
16     message("hour:$hour,min:$min,sec:$sec");
17 }
18
19 sub message
20 {
21     my $m = shift or return;
22     print("$m\n");
23 }
24
25 sub error
26 {
27     my $e = shift || 'unknown error';
28     print(STDERR "$0: $e\n");
29     exit 1;   # non-zero status signals failure
30 }

The commented-out line 15 is equivalent to lines 11 through 14.
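One thing the script does not do is check whether the match succeeded before reading `$1`, `$2` and `$3`. The sketch below, with made-up input strings, shows one way to guard against that.

```perl
use strict;
use warnings;

my @results;
for my $time ("05:24:37", "not a time") {
    # Test the match itself: a failed match does not reset $1..$3, so they
    # could still hold captures from an earlier successful match.
    if ($time =~ /(..):(..):(..)/) {
        push @results, "hour:$1,min:$2,sec:$3";
    } else {
        push @results, "no match for '$time'";
    }
}

print "$_\n" for @results;
```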

wild.pl

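#!/usr/bin/perl
use strict;
use warnings;

main(@ARGV);

sub main
{
    open(FH, '<', 'short.txt') or error("cannot open short.txt: $!");
    while (<FH>) {
        # Each successful match overwrites $1; if a later pattern fails,
        # $1 keeps the capture from the previous successful match.
        # Comment out the others to try one pattern at a time.
        /(^....)/;   # the first four characters of the line
        /(....)$/;   # the last four characters of the line
        /(a...)/;    # an "a" followed by any three characters
        message($1) if $1;
    }
    close FH;
}

sub message
{
    my $m = shift or return;
    print("$m\n");
}

sub error
{
    my $e = shift || 'unknown error';
    print(STDERR "$0: $e\n");
    exit 1;   # non-zero status signals failure
}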
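To see what each wildcard pattern captures on its own, here is a minimal sketch on a made-up sample line. Each match is done in list context so the three captures do not overwrite one another.

```perl
use strict;
use warnings;

my $line = "regular expressions are handy";

my ($head) = $line =~ /^(....)/;   # first four characters
my ($tail) = $line =~ /(....)$/;   # last four characters
my ($a4)   = $line =~ /(a...)/;    # first "a" plus the next three characters

print "$head|$tail|$a4\n";         # regu|andy|ar e
```

A dot matches any character except a newline, which is why the space in "ar e" counts as one of the three.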

wild1.pl

#!/usr/bin/perl
use strict;
use warnings;

main(@ARGV);

sub main
{
    open(FH, '<', 'short.txt') or error("cannot open short.txt: $!");
    while (<FH>) {
        my @list = /(a...)/g;     # every "a" plus the next three characters on the line
        #my @list = /(a.{9})/g;   # braces give a repeat count: "a" plus exactly nine characters
        #my @list = /(a.*s)/g;    # greedy: from an "a" to the last "s" on the line
        #my @list = /(a.*?s)/g;   # non-greedy: from an "a" to the first "s" that follows

        message(join(':', @list)) if @list;
    }
    close FH;
}

sub message
{
    my $m = shift or return;
    print("$m\n");
}

sub error
{
    my $e = shift || 'unknown error';
    print(STDERR "$0: $e\n");
    exit 1;   # non-zero status signals failure
}
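The difference between the greedy and non-greedy forms is easier to see side by side. A minimal sketch on a made-up sample line:

```perl
use strict;
use warnings;

my $line = "a cat has paws";

my @four   = $line =~ /(a...)/g;     # every "a" plus three characters
my @braces = $line =~ /(a.{3})/g;    # same pattern written with a repeat count
my @greedy = $line =~ /(a.*s)/g;     # .* grabs as much as it can
my @lazy   = $line =~ /(a.*?s)/g;    # .*? stops at the nearest "s"

print join(':', @four), "\n";        # a ca:as p
print join(':', @braces), "\n";      # a ca:as p
print join(':', @greedy), "\n";      # a cat has paws
print join(':', @lazy), "\n";        # a cat has:aws
```

The final "aws" does not show up in the four-character results because only two characters follow that "a".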

class.pl

#!/usr/bin/perl
use strict;
use warnings;

main(@ARGV);

sub main
{
    open(FH, '<', 'perlre.txt') or error("cannot open perlre.txt: $!");
    while (<FH>) {
        #my @list = /[0-9]/g;             # single digits
        #my @list = /\d+/g;               # runs of digits
        #my @list = /[a-zA-Z]+/g;         # runs of ASCII letters
        #my @list = /([173]+)/g;          # runs made up of 1, 7 and 3 only
        #my @list = /([[:digit:]]+)/g;    # POSIX class for digits
        #my @list = /([[:punct:]]+)/g;    # POSIX class for punctuation
        #my @list = /(\w+)/g;             # word characters: letters, digits, underscore
        my @list = /(\W+)/g;              # runs of non-word characters
        message(join(':', @list)) if @list;
    }
    close FH;
}

sub message
{
    my $m = shift or return;
    print("$m\n");
}

sub error
{
    my $e = shift || 'unknown error';
    print(STDERR "$0: $e\n");
    exit 1;   # non-zero status signals failure
}

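The patterns in this example show the different results that each character class produces; uncomment one at a time to compare them.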
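For a quick comparison without perlre.txt, this sketch runs several of the classes over one sample sentence. The sentence is just test data chosen for illustration.

```perl
use strict;
use warnings;

my $line = "Perl 5.10 was released in 2007!";

my @numbers = $line =~ /(\d+)/g;            # runs of digits
my @letters = $line =~ /([a-zA-Z]+)/g;      # runs of ASCII letters
my @posix   = $line =~ /([[:digit:]]+)/g;   # POSIX class, same result as \d here
my @punct   = $line =~ /([[:punct:]]+)/g;   # runs of punctuation
my @words   = $line =~ /(\w+)/g;            # letters, digits, underscore

print join(':', @numbers), "\n";   # 5:10:2007
print join(':', @letters), "\n";   # Perl:was:released:in
print join(':', @posix), "\n";     # 5:10:2007
print join(':', @punct), "\n";     # .:!
print join(':', @words), "\n";     # Perl:5:10:was:released:in:2007
```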

split.pl

#!/usr/bin/perl
use strict;
use warnings;

main(@ARGV);

sub main
{
    my $time = "05:27:32";
    my ($hour, $min, $sec) = split(/:/, $time);
    message("hour: $hour, minute: $min, second: $sec");
    error("this is an error message");
}

sub message
{
    my $m = shift or return;
    print("$m\n");
}

sub error
{
    my $e = shift || 'unknown error';
    # Split $0 on / or \ so the last element is the script's file name.
    my @me = split(m|[\\/]|, $0);
    print(STDERR "$me[-1]: $e\n");
    exit 1;   # non-zero status signals failure
}
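Both uses of `split` can be tried in isolation. In this sketch the path is a hypothetical value standing in for `$0`, mixing both slash styles on purpose.

```perl
use strict;
use warnings;

# Split on a single-character pattern.
my ($hour, $min, $sec) = split(/:/, "05:27:32");

# Hypothetical path; a character class matches either slash style,
# so the file name always ends up as the last element.
my $path  = 'C:\scripts/demo/split.pl';
my @parts = split(m|[\\/]|, $path);
my $name  = $parts[-1];

print "hour: $hour, minute: $min, second: $sec\n";
print "script: $name\n";   # script: split.pl
```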


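That wraps up regular expressions for now. I don't think this write-up is very good; many matching techniques were not covered.

For more, see the articles at http://perltraining.com.au/.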






Original article: https://www.cnblogs.com/hanleilei/p/2379838.html