Capturing Groups in Regular Expressions (Java)

 Types of capturing groups

  1. Ordinary capturing group: (Expression)
  2. Named capturing group: (?<name>Expression)

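Ordinary capturing groups

Reading the regular expression from left to right, every left parenthesis "(" that opens a capturing group starts a new group, and the groups are numbered from 1. Group 0 stands for the whole expression.

For the date string 2017-04-25, the expression is:

(\d{4})-((\d{2})-(\d{2}))

It has 4 left parentheses, so it has 4 groups.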

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaptureGroupDemo {
    public static final String DATE_STRING = "2017-04-25";
    // In a Java string literal the backslash must be doubled: \\d
    public static final String P_COMM = "(\\d{4})-((\\d{2})-(\\d{2}))";

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(P_COMM);
        Matcher matcher = pattern.matcher(DATE_STRING);
        matcher.find(); // required: group() throws IllegalStateException before a match is attempted
        System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
        System.out.printf("\nmatcher.group(1) value:%s", matcher.group(1));
        System.out.printf("\nmatcher.group(2) value:%s", matcher.group(2));
        System.out.printf("\nmatcher.group(3) value:%s", matcher.group(3));
        System.out.printf("\nmatcher.group(4) value:%s", matcher.group(4));
    }
}
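
To make the numbering rule concrete, here is a minimal sketch that walks over every group of the first match instead of hard-coding the indexes. The class name GroupNumberingSketch and the helper groupsOfFirstMatch are made up for this illustration; only Pattern, Matcher, find(), group(int) and groupCount() come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class GroupNumberingSketch {

    // Returns group 0 followed by every numbered group of the first match,
    // or an empty list when the pattern does not match anywhere in the input.
    static List<String> groupsOfFirstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> groups = new ArrayList<>();
        if (matcher.find()) {
            // groupCount() does not count group 0, hence the inclusive upper bound.
            for (int i = 0; i <= matcher.groupCount(); i++) {
                groups.add(matcher.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> groups =
                groupsOfFirstMatch("(\\d{4})-((\\d{2})-(\\d{2}))", "2017-04-25");
        for (int i = 0; i < groups.size(); i++) {
            System.out.printf("group(%d) = %s%n", i, groups.get(i));
        }
        // Prints:
        // group(0) = 2017-04-25
        // group(1) = 2017
        // group(2) = 04-25
        // group(3) = 04
        // group(4) = 25
    }
}
```

Group 2 is the outer parenthesis around the month and day, which is why it comes before groups 3 and 4: the number follows the position of the opening parenthesis, not the nesting depth.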

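Named capturing groups

In a named capturing group, the left parenthesis is immediately followed by "?<name>", and the regular expression for the group comes after that.

For the date string 2017-04-25, the expression is:

(?<year>\d{4})-(?<md>(?<month>\d{2})-(?<date>\d{2}))

It has 4 named capturing groups: year, md, month and date.

The value of a named capturing group can also be retrieved by its number.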

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupDemo {
    // In a Java string literal the backslash must be doubled: \\d
    public static final String P_NAMED = "(?<year>\\d{4})-(?<md>(?<month>\\d{2})-(?<date>\\d{2}))";
    public static final String DATE_STRING = "2017-04-25";

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(P_NAMED);
        Matcher matcher = pattern.matcher(DATE_STRING);
        matcher.find();
        System.out.printf("\n=========== By name =============");
        System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
        System.out.printf("\nmatcher.group('year') value:%s", matcher.group("year"));
        System.out.printf("\nmatcher.group('md') value:%s", matcher.group("md"));
        System.out.printf("\nmatcher.group('month') value:%s", matcher.group("month"));
        System.out.printf("\nmatcher.group('date') value:%s", matcher.group("date"));
        matcher.reset(); // discard the match state so find() starts again from the beginning
        System.out.printf("\n=========== By number =============");
        matcher.find();
        System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
        System.out.printf("\nmatcher.group(1) value:%s", matcher.group(1));
        System.out.printf("\nmatcher.group(2) value:%s", matcher.group(2));
        System.out.printf("\nmatcher.group(3) value:%s", matcher.group(3));
        System.out.printf("\nmatcher.group(4) value:%s", matcher.group(4));
    }
}
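
Named groups are also handy in replacement strings, where Java lets you refer to a group as ${name}. The following is a small sketch under that assumption; the class NamedGroupReorderSketch, the method toDayFirst and the day-first output format are invented for the example, and the pattern drops the md group because it is not needed here.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class NamedGroupReorderSketch {

    private static final Pattern DATE =
            Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<date>\\d{2})");

    // Rewrites every yyyy-MM-dd date found in the text as dd/MM/yyyy.
    // ${name} in the replacement string refers to the named group.
    static String toDayFirst(String text) {
        return DATE.matcher(text).replaceAll("${date}/${month}/${year}");
    }

    public static void main(String[] args) {
        System.out.println(toDayFirst("2017-04-25")); // prints 25/04/2017
    }
}
```

Compared with $3/$2/$1, the named form keeps working if another group is later added to the front of the pattern.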

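PS: non-capturing groups
Putting "?:" right after the left parenthesis, followed by the regular expression, makes a non-capturing group: (?:Expression).

For the date string 2017-04-25, the expression is:

(?:\d{4})-((\d{2})-(\d{2}))

Although this regular expression has four left parentheses, it has only 3 capturing groups: the first group, (?:\d{4}), is non-capturing and is skipped when groups are numbered. Calling matcher.group(4) therefore throws an exception.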

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonCapturingGroupDemo {
    // In a Java string literal the backslash must be doubled: \\d
    public static final String P_UNCAP = "(?:\\d{4})-((\\d{2})-(\\d{2}))";
    public static final String DATE_STRING = "2017-04-25";

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(P_UNCAP);
        Matcher matcher = pattern.matcher(DATE_STRING);
        matcher.find();
        System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
        System.out.printf("\nmatcher.group(1) value:%s", matcher.group(1));
        System.out.printf("\nmatcher.group(2) value:%s", matcher.group(2));
        System.out.printf("\nmatcher.group(3) value:%s", matcher.group(3));

        // Exception in thread "main" java.lang.IndexOutOfBoundsException: No group 4
        System.out.printf("\nmatcher.group(4) value:%s", matcher.group(4));
    }
}
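
One way to avoid the IndexOutOfBoundsException is to check groupCount() before asking for a group. The sketch below compares the group counts of the capturing and non-capturing patterns and guards the lookup; the class NonCapturingCountSketch and its two helpers are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class NonCapturingCountSketch {

    // Number of capturing groups declared by the pattern (group 0 is not counted).
    static int capturingGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Returns the requested group of the first match, or null when the input
    // does not match or the pattern has no group with that number.
    static String groupOrNull(String regex, String input, int group) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        if (group < 0 || !matcher.find() || group > matcher.groupCount()) {
            return null;
        }
        return matcher.group(group);
    }

    public static void main(String[] args) {
        String capturing = "(\\d{4})-((\\d{2})-(\\d{2}))";
        String nonCapturing = "(?:\\d{4})-((\\d{2})-(\\d{2}))";
        System.out.println(capturingGroups(capturing));                    // prints 4
        System.out.println(capturingGroups(nonCapturing));                 // prints 3
        System.out.println(groupOrNull(nonCapturing, "2017-04-25", 1));    // prints 04-25
        System.out.println(groupOrNull(nonCapturing, "2017-04-25", 4));    // prints null
    }
}
```

Note that with the non-capturing pattern the remaining groups shift down by one: group 1 is now the month-day part that was group 2 in the fully capturing pattern.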
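
// If the pattern matches at several positions in the string, which match does a capturing group refer to?
// Each call to find() moves on to the next match, and group(n) returns group n of the current match,
// so the index used inside the loop stays fixed, e.g. the 2 in m.group(2) does not change.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindLoopDemo {
    public static void main(String[] args) {
        String s = " from aaa from bbb";
        Pattern p = Pattern.compile("\\s+(from|join)\\s+(\\w+)");
        Matcher m = p.matcher(s);
        while (m.find()) {
            System.out.println(m.group(2));
        }
    }
}
// Output:
// aaa
// bbb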
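
As a variation on the loop above, the following sketch collects group 2 of every match into a list and also prints where each one starts, which shows that the group number stays the same while the match position moves forward. The class FindLoopSketch and the method namesAfterKeyword are illustrative names, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class FindLoopSketch {

    private static final Pattern KEYWORD_AND_NAME =
            Pattern.compile("\\s+(from|join)\\s+(\\w+)");

    // Collects the word that follows each "from" or "join" keyword.
    static List<String> namesAfterKeyword(String text) {
        Matcher m = KEYWORD_AND_NAME.matcher(text);
        List<String> names = new ArrayList<>();
        while (m.find()) {
            // Always group 2; find() is what advances to the next match.
            names.add(m.group(2));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(namesAfterKeyword(" from aaa from bbb")); // prints [aaa, bbb]

        Matcher m = KEYWORD_AND_NAME.matcher(" from aaa from bbb");
        while (m.find()) {
            // start(2) is the index in the input where group 2 of the current match begins.
            System.out.printf("%s starts at %d%n", m.group(2), m.start(2));
        }
        // Prints:
        // aaa starts at 6
        // bbb starts at 15
    }
}
```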

Original source: https://www.cnblogs.com/lyy-blog/p/9817361.html