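05. Repeating Matches (Study Notes)

5. 正则表达式必知必会: Repeating Matches

5.1 How Many Matches

Matching email addresses.

\w@\w\.\w matches only addresses like a@b.c, where each part is a single character; it cannot match addresses like abcd@dbcd.com.

5.1.1 Matching One or More Characters

To match one or more repetitions of the same character or character set, append the + metacharacter to it.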

const email1 = 'luwl@qq.com';
const email2 = 'luwl2@163.com';
const reg = /\w+@\w+\.\w+/; // Do not add the g flag here; otherwise the second console.log below prints false (see the note on the global flag)
console.log(reg.test(email1)); // true
console.log(reg.test(email2)); // true

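Regex global flag: with g, test() and exec() resume searching at the regex's lastIndex, so reusing one regex on a second string can miss a match near its start.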
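
The lastIndex behavior can be observed directly. This is a minimal sketch; the variable names are illustrative and not part of the original notes.

```javascript
// A regex with the g flag is stateful: test() and exec() start searching at lastIndex.
const globalReg = /\w+@\w+\.\w+/g;

const firstResult = globalReg.test('luwl@qq.com');
const indexAfterFirst = globalReg.lastIndex; // position just past the first match

// The second search starts at that position inside a different string,
// finds nothing there, and resets lastIndex to 0.
const secondResult = globalReg.test('luwl2@163.com');
const indexAfterSecond = globalReg.lastIndex;

console.log(firstResult, indexAfterFirst); // true 11
console.log(secondResult, indexAfterSecond); // false 0
```

Dropping the g flag, or setting lastIndex back to 0 before each call, avoids the surprise.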

Matching one or more characters from a set

const str =
  'Send personal email to luwl@qq.com or luwl.p@qq.com. For questions use support@qq.com or support@vip.qq.com.';
const reg = /\w+@\w+\.\w+/g;
let match;
while ((match = reg.exec(str))) {
  console.log(`Match: ${match[0]}, index: ${match.index}`);
}
// Match: luwl@qq.com, index: 23
// Match: p@qq.com, index: 43 <-- wrong: \w does not match the dot in luwl.p
// Match: support@qq.com, index: 71
// Match: support@vip.qq, index: 89 <-- wrong: the match stops before .com

Adding . to the character sets lets the pattern cover dotted names and subdomains:

const str =
  'Send personal email to luwl@qq.com or luwl.p@qq.com. For questions use support@qq.com. If your message is urgent try support@vip.qq.com.';
const reg = /[\w.]+@[\w.]+\.\w+/g; // <-- changed to match the character set [\w.]
let match;
while ((match = reg.exec(str))) {
  console.log(`Match: ${match[0]}, index: ${match.index}`);
}
// Match: luwl@qq.com, index: 23
// Match: luwl.p@qq.com, index: 38
// Match: support@qq.com, index: 71
// Match: support@vip.qq.com, index: 117
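
The same pattern can be wrapped in a small helper. The sketch below assumes a runtime that supports String.prototype.matchAll (ES2020); findEmails is a hypothetical name, not something from the original notes.

```javascript
// Hypothetical helper: collect every email-like substring in a text.
function findEmails(text) {
  const emailPattern = /[\w.]+@[\w.]+\.\w+/g; // matchAll requires the g flag
  // matchAll works on its own copy of the regex, so no lastIndex state leaks between calls.
  return Array.from(text.matchAll(emailPattern), (m) => m[0]);
}

console.log(findEmails('Write to luwl.p@qq.com or support@vip.qq.com.'));
// [ 'luwl.p@qq.com', 'support@vip.qq.com' ]
```

The trailing period of the sentence is not captured because the pattern must end with \w+, which does not match a dot.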

5.1.2 Matching Zero or More Characters

The + metacharacter matches one or more characters; the * metacharacter matches zero or more.

const str = '.luwl@qq.com';
const reg1 = /[\w.]+@[\w.]+\.\w+/;
console.log(reg1.exec(str)); // [".luwl@qq.com", index: 0, input: ".luwl@qq.com", groups: undefined]
// The match starts with a dot, which is not a valid email format
const reg2 = /\w+[\w.]+@[\w.]+\.\w+/; // works here, but needs at least two characters before @
console.log(reg2.exec(str)); // ["luwl@qq.com", index: 1, input: ".luwl@qq.com", groups: undefined]
const reg3 = /\w+[\w.]*@[\w.]+\.\w+/; // <-- the set before * is optional
console.log(reg3.exec(str)); // ["luwl@qq.com", index: 1, input: ".luwl@qq.com", groups: undefined]
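
The difference between + and * shows up with a one-character name before @. A minimal sketch (the variable names and the sample address are illustrative):

```javascript
const requiresTwo = /\w+[\w.]+@[\w.]+\.\w+/; // \w+ and [\w.]+ each need one character
const allowsOne = /\w+[\w.]*@[\w.]+\.\w+/; // [\w.]* may match nothing

const shortAddress = 'a@qq.com';
const plusResult = requiresTwo.test(shortAddress);
const starResult = allowsOne.test(shortAddress);

console.log(plusResult); // false
console.log(starResult); // true
```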

5.1.3 Matching Zero or One Character

The ? metacharacter matches zero or one occurrence of a character or character set.

const str =
  'The url is http://wendys.com/, to connect securely use https://wendys.com/ . This is a wrong url: httpssssss://wendys.com/ .';
const reg1 = /http:\/\/[\w.\/]+/g;
let match1;
while ((match1 = reg1.exec(str))) {
  console.log(match1[0]);
}
// http://wendys.com/
// https://wendys.com/ is not matched
const reg2 = /https*:\/\/[\w.\/]+/g; // <-- matches s zero or more times
let match2;
while ((match2 = reg2.exec(str))) {
  console.log(match2[0]);
}
// http://wendys.com/
// https://wendys.com/
// httpssssss://wendys.com/ <-- the wrong url is matched too
const reg3 = /https?:\/\/[\w.\/]+/g; // <-- matches s zero or one time
let match3;
while ((match3 = reg3.exec(str))) {
  console.log(match3[0]);
}
// http://wendys.com/
// https://wendys.com/
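
Another common use of ? is an optional carriage return before a line feed. The sketch below is an illustration added to these notes; splitLines is a hypothetical helper name.

```javascript
// Split text into lines whether it uses Windows (\r\n) or Unix (\n) line endings.
function splitLines(text) {
  return text.split(/\r?\n/); // \r is optional, \n is required
}

console.log(splitLines('a\r\nb\nc')); // [ 'a', 'b', 'c' ]
```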

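5.2 Specifying the Number of Repetitions

The number of repetitions can be written as an interval in braces, such as {n}.

5.2.1 Setting an Exact Number of Repetitions

Matching hexadecimal colors.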

const color1 = '#d8d8d8';
const color2 = '#666666';
const color3 = '#ddd';
const reg = /#[0-9a-fA-F]{6}/; // <-- exactly 6 hexadecimal digits (a-f only, not a-z)
console.log(reg.exec(color1)); // ["#d8d8d8", index: 0, input: "#d8d8d8", groups: undefined]
console.log(reg.exec(color2)); // ["#666666", index: 0, input: "#666666", groups: undefined]
console.log(reg.exec(color3)); // null
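
To accept the short form #ddd as well, one option is to repeat a three-digit group once or twice. This is a sketch that goes slightly beyond the notes: it assumes the anchors ^ and $ and a non-capturing group (?:...), and isHexColor is a hypothetical name.

```javascript
// Hypothetical helper: accept #rgb or #rrggbb and nothing else.
function isHexColor(value) {
  // ^ and $ anchor the whole string; (?:...){1,2} repeats a 3-digit group once or twice.
  return /^#(?:[0-9a-fA-F]{3}){1,2}$/.test(value);
}

console.log(isHexColor('#ddd')); // true
console.log(isHexColor('#d8d8d8')); // true
console.log(isHexColor('#d8d8')); // false (4 digits)
console.log(isHexColor('#ggg')); // false (g is not a hex digit)
```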

5.2.2 Setting a Range for the Number of Repetitions

Validating date formats.

const date1 = '07-19-2019';
const date2 = '7-19-2019';
const date3 = '7/19/2019';
const date4 = '7/9/2019';
const date5 = '7/9/19';
const date6 = '7/9/1';
const reg = /\d{1,2}[\/-]\d{1,2}[\/-]\d{2,4}/; // <-- checks only the format, not whether the date itself is valid
console.log(reg.exec(date1)); // ["07-19-2019", index: 0, input: "07-19-2019", groups: undefined]
console.log(reg.exec(date2)); // ["7-19-2019", index: 0, input: "7-19-2019", groups: undefined]
console.log(reg.exec(date3)); // ["7/19/2019", index: 0, input: "7/19/2019", groups: undefined]
console.log(reg.exec(date4)); // ["7/9/2019", index: 0, input: "7/9/2019", groups: undefined]
console.log(reg.exec(date5)); // ["7/9/19", index: 0, input: "7/9/19", groups: undefined]
console.log(reg.exec(date6)); // null
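
Without anchors, the interval pattern also matches a valid-looking prefix of a longer string. The sketch below contrasts it with an anchored variant; the anchors ^ and $ and the variable names are additions for illustration.

```javascript
const looseDate = /\d{1,2}[\/-]\d{1,2}[\/-]\d{2,4}/;
const strictDate = /^\d{1,2}[\/-]\d{1,2}[\/-]\d{2,4}$/;

const prefixMatch = looseDate.exec('7/9/20199')[0];
console.log(prefixMatch); // '7/9/2019' (the fifth year digit is simply left out)
console.log(strictDate.test('7/9/20199')); // false
console.log(strictDate.test('7/9/201')); // true: {2,4} also allows three digits
```

If only two-digit or four-digit years should pass, {2,4} is too permissive and the pattern needs a different construct.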

5.2.3 Matching "At Least N Repetitions"

const str = '1001: $496.80; 1002: $1290.69; 1003: $26.43; 1004: $613.42; 1005: $7.61; 1006: $414.90; 1007: $25.00;';
const reg = /\d{4}: \$\d{3,}\.\d{2}/g;
let match;
while ((match = reg.exec(str))) {
  console.log('Match: ' + match[0] + ' index: ' + match.index); // <-- finds amounts of 100 or more
}
// Match: 1001: $496.80 index: 0
// Match: 1002: $1290.69 index: 15
// Match: 1004: $613.42 index: 45
// Match: 1006: $414.90 index: 73
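
The earlier metacharacters can be read as shorthand for intervals: + behaves like {1,}, * like {0,}, and ? like {0,1}. A small sketch that checks this on a few sample strings (the samples and names are illustrative):

```javascript
const samples = ['', 'a', 'aa', 'aaaa'];

// Each pair holds a shorthand quantifier and its interval form.
const pairs = [
  [/^a+$/, /^a{1,}$/],
  [/^a*$/, /^a{0,}$/],
  [/^a?$/, /^a{0,1}$/],
];

// For every pair, both patterns should accept or reject each sample together.
const allAgree = pairs.every(([shorthand, interval]) =>
  samples.every((s) => shorthand.test(s) === interval.test(s))
);

console.log(allAgree); // true
```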

5.3 Preventing Over-Matching

const str = '<p>this is a test, and this is <b>important</b>, this is also <b>important</b>.</p>';
const reg = /<b>.*<\/b>/g; // <-- intended to match the content of each <b> element
console.log(reg.exec(str));
// ["<b>important</b>, this is also <b>important</b>", index: 31, input: "<p>this is a test, and this is <b>important</b>, this is also <b>important</b>.</p>", groups: undefined]
// Not what was expected: there should be two separate matches
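
Both * and + are "greedy" metacharacters: they match as much text as possible.

The "lazy" versions of these metacharacters match as little as possible, which solves the problem.

Greedy metacharacter    Lazy metacharacter
*                       *?
+                       +?
{n,}                    {n,}?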

const str = '<p>this is a test, and this is <b>important</b>, this is also <b>important</b>.</p>';
const reg = /<b>.*?<\/b>/g; // <-- lazy: stops at the first </b>
let match;
while ((match = reg.exec(str))) {
  console.log(`Match: ${match[0]}, index: ${match.index}`);
}
// Match: <b>important</b>, index: 31
// Match: <b>important</b>, index: 62
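
A negated character set is a common alternative to a lazy quantifier. The sketch below is an addition to the notes and assumes the <b> elements contain no nested tags or other < characters; the variable names are illustrative.

```javascript
const html = '<p>this is <b>important</b>, this is also <b>important</b>.</p>';

// Lazy quantifier: . matches anything, but as little as possible.
const lazyMatches = html.match(/<b>.*?<\/b>/g);

// Negated set: [^<]* cannot run past the next <, so it never swallows a closing tag.
const negatedMatches = html.match(/<b>[^<]*<\/b>/g);

console.log(lazyMatches); // [ '<b>important</b>', '<b>important</b>' ]
console.log(negatedMatches); // [ '<b>important</b>', '<b>important</b>' ]
```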

Original post: https://www.cnblogs.com/lwl0812/p/11217136.html