PHP Regular Expression Concepts

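【Delimiters】

A pattern usually starts and ends with "/" as its delimiter, but "#" works as well. "#" is the better choice when the pattern contains many "/" characters, as in a URI, because every occurrence of the delimiter character inside the pattern has to be escaped.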

<?php
  /**
   * Delimiters
   */
  // With "/" as the delimiter, every "/" inside the pattern must be escaped as "\/".
  $regex = '/^http:\/\/([\w.]+)\/([\w]+)\/([\w]+)\/([\w]+)\/([\w]+)\/([\w]+)\.html$/i';
  $str = 'http://php1234.cn/a/functions/2016/0905/50.html';
  $matches = array();
  if(preg_match($regex, $str, $matches)){
       var_dump($matches);
  }
  // With "#" as the delimiter, "/" needs no escaping.
  $regex = '#^http://([\w.]+)/([\w]+)/([\w]+)/([\w]+)/([\w]+)/([\w]+)\.html$#i';
  if(preg_match($regex, $str, $matches)){
       var_dump($matches);
  }
?>

Output:
array(7) {
  [0]=>
  string(47) "http://php1234.cn/a/functions/2016/0905/50.html"
  [1]=>
  string(10) "php1234.cn"
  [2]=>
  string(1) "a"
  [3]=>
  string(9) "functions"
  [4]=>
  string(4) "2016"
  [5]=>
  string(4) "0905"
  [6]=>
  string(2) "50"
}
array(7) {
  [0]=>
  string(47) "http://php1234.cn/a/functions/2016/0905/50.html"
  [1]=>
  string(10) "php1234.cn"
  [2]=>
  string(1) "a"
  [3]=>
  string(9) "functions"
  [4]=>
  string(4) "2016"
  [5]=>
  string(4) "0905"
  [6]=>
  string(2) "50"
}
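
When part of a pattern comes from a literal string, escaping the delimiter by hand is easy to get wrong. The following is a minimal sketch, assuming the hypothetical variable $prefix holds text that should be matched verbatim; it relies on preg_quote(), whose optional second argument names the delimiter to escape as well.

```php
<?php
// Hypothetical literal text that should be matched exactly as written.
$prefix = 'http://php1234.cn/a/';

// preg_quote() escapes regex metacharacters; passing "/" as the second
// argument makes it escape the delimiter too.
$quoted = preg_quote($prefix, '/');
$regex  = '/^' . $quoted . '(\w+)/i';

$str = 'http://php1234.cn/a/functions/2016/0905/50.html';
$matches = array();
$found = preg_match($regex, $str, $matches);

var_dump($quoted);      // the escaped prefix
var_dump($matches[1]);  // the first path segment after the prefix
```

Switching the delimiter to "#" would work here too; preg_quote() is mainly useful when the literal text is not known in advance.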

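【Modifiers】

A modifier is a symbol that changes how the regular expression behaves. The i that follows the closing delimiter in the example above is a modifier that makes matching case-insensitive. Another commonly used one is "x", which makes the engine ignore whitespace in the pattern.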
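
As an illustration of the "x" modifier, here is a small sketch with a hypothetical date string as input. With "x", unescaped whitespace in the pattern is ignored and "#" starts a comment that runs to the end of the line, so a pattern can be spread over several lines.

```php
<?php
// The pattern is laid out over several lines; "x" makes the engine
// skip the whitespace and the "#" comments.
$regex = '/
    ^(\d{4})   # year
    -(\d{2})   # month
    -(\d{2})$  # day
/x';

$str = '2016-09-05';  // hypothetical sample input
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);  // full match, then year, month, day
}

// The "i" modifier: case-insensitive matching.
$caseInsensitive = preg_match('/^php$/i', 'PHP');
var_dump($caseInsensitive);  // int(1)
```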

【Character Classes】

The part enclosed in square brackets is a character class, such as the [\w] in the example above.
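
A short sketch of character classes in use, with a made-up sample string. A class matches exactly one character out of the set it lists, and most metacharacters (such as ".") lose their special meaning inside the brackets.

```php
<?php
$str = 'PHP 7.4, php 8.1';  // hypothetical sample text

// [0-9.] matches one digit or a literal dot.
preg_match_all('/[0-9.]+/', $str, $versions);

// [Pp] matches either an uppercase or a lowercase "p".
preg_match_all('/[Pp][Hh][Pp]/', $str, $names);

var_dump($versions[0]);  // "7.4" and "8.1"
var_dump($names[0]);     // "PHP" and "php"
```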

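【Quantifiers】

In [\w]{3,5}, [\w]* or [\w]+, the symbol that follows [\w] is a quantifier.
{3,5} matches 3 to 5 characters.
{3,} matches 3 or more characters; {0,5} matches at most 5. (The shorthand {,5} is treated as literal text by older PCRE versions, so {0,5} is the portable form.)
{3} matches exactly 3 characters.
* matches zero or more.
+ matches one or more.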
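
The quantifiers listed above can be compared side by side. This is a small sketch that runs each one against the same hypothetical string of seven "a" characters and records how many characters the (greedy) match consumed.

```php
<?php
$str = 'aaaaaaa';  // seven "a" characters, hypothetical sample

$patterns = array('/a{3,5}/', '/a{3,}/', '/a{0,5}/', '/a{3}/', '/a*/', '/a+/');
$lengths = array();
foreach ($patterns as $regex) {
    preg_match($regex, $str, $m);
    // Quantifiers are greedy by default, so each takes as many as it may.
    $lengths[$regex] = strlen($m[0]);
}

var_dump($lengths);
// a{3,5} => 5, a{3,} => 7, a{0,5} => 5, a{3} => 3, a* => 7, a+ => 7
```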

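【Caret】

^: inside a character class (e.g. [^\w]) it negates the class, so the class matches any character that is not listed.
  At the start of an expression it anchors the match to the beginning of the string (/^n/i matches strings that start with n).
Note: "\" is commonly called the escape character. It escapes special symbols such as "." and "/".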
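
The two roles of "^" and the escape character can be sketched as follows. The sample strings are made up for illustration.

```php
<?php
// Anchor: the subject must start with "n" ("i" makes it case-insensitive).
$startsWithN    = preg_match('/^n/i', 'Name');
$notStartsWithN = preg_match('/^n/i', 'anchor');
var_dump($startsWithN);     // int(1)
var_dump($notStartsWithN);  // int(0)

// Negation: [^\w] matches one character that is NOT a word character.
$slug = preg_replace('/[^\w]+/', '-', 'php regex: a quick tour!');
var_dump($slug);  // "php-regex-a-quick-tour-"

// Escaping: "\." matches a literal dot and "\/" a literal slash.
$literal    = preg_match('/^a\.b\/c$/', 'a.b/c');
$notLiteral = preg_match('/^a\.b\/c$/', 'aXb/c');
var_dump($literal);     // int(1)
var_dump($notLiteral);  // int(0)
```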

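【Lookaround Assertions】

Lookarounds test whether certain characters are present or absent next to the current position, without including them in the match.
Syntax:
Lookahead: (?=...), with (?!...) as its negative form
Lookbehind: (?<=...), with (?<!...) as its negative form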

<?php
  /**
   * Lookaround assertions
   */
  $regex = '/(?<=c)d(?=e)/';  /* d immediately preceded by c and immediately followed by e */
  $str = 'abcdefgk';
  $matches = array();
  if(preg_match($regex, $str, $matches)){
      var_dump($matches);
  }
  $regex = '/(?<!c)d(?!e)/';  // negative forms: d not preceded by c and not followed by e
  $str = 'abcdefgdk';
  if(preg_match($regex, $str, $matches)){
      var_dump($matches);
  }
?>

Output:
array(1) {
  [0]=>
  string(1) "d"
}
array(1) {
  [0]=>
  string(1) "d"
}
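
Because a lookaround consumes no characters, it can also match an empty position between characters. The sketch below, using a hypothetical number string, inserts thousands separators at every position that has a digit before it and a whole number of three-digit groups after it.

```php
<?php
$number = '1234567';  // hypothetical sample input

// (?<=\d)        a digit stands immediately before this position
// (?=(\d{3})+$)  one or more groups of three digits run to the end
$formatted = preg_replace('/(?<=\d)(?=(\d{3})+$)/', ',', $number);

var_dump($formatted);  // "1,234,567"
```

Only the empty positions are replaced; the digits themselves are never part of the match.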

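【Lazy Matching】

Syntax: a quantifier followed by ?
How it works: a "?" placed after a quantifier makes it match as few characters as possible. "*" takes 0, "+" takes 1, and {3,5} takes 3, and each expands only if the rest of the pattern requires it.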

<?php
  /**
   * Lazy matching
   */
  $regex = '/heL*/i';       // greedy: takes every L
  $str = 'heLLLLLLLLLLLLLLLL';
  if(preg_match($regex, $str, $matches)){
       var_dump($matches);
  }
  $regex = '/heL*?/i';      // lazy *: takes zero L
  $str = 'heLLLLLLLLLLLLLLLL';
  if(preg_match($regex, $str, $matches)){
       var_dump($matches);
  }
  $regex = '/heL+?/i';      // lazy +: takes one L
  $str = 'heLLLLLLLLLLLLLLLL';
  if(preg_match($regex, $str, $matches)){
       var_dump($matches);
  }
  $regex = '/heL{5,8}?/i';  // lazy {5,8}: takes five L
  $str = 'heLLLLLLLLLLLLLLLL';
  if(preg_match($regex, $str, $matches)){
       var_dump($matches);
  }
?>

Output:
array(1) {
  [0]=>
  string(18) "heLLLLLLLLLLLLLLLL"
}
array(1) {
  [0]=>
  string(2) "he"
}
array(1) {
  [0]=>
  string(3) "heL"
}
array(1) {
  [0]=>
  string(7) "heLLLLL"
}
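
Lazy matching matters most when something follows the quantifier. Here is a minimal sketch with a made-up markup string: the greedy ".*" runs to the last closing tag, while the lazy ".*?" stops at the first one.

```php
<?php
$str = '<b>one</b> and <b>two</b>';  // hypothetical sample markup

// "#" is used as the delimiter so that "/" needs no escaping.
preg_match('#<b>.*</b>#', $str, $greedy);
preg_match('#<b>.*?</b>#', $str, $lazy);

var_dump($greedy[0]);  // "<b>one</b> and <b>two</b>"
var_dump($lazy[0]);    // "<b>one</b>"
```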

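* : zero or more times
+ : one or more times; can also be written as {1,}
? : zero or one time
. : matches any single character except a newline
\w: [a-zA-Z0-9_]
\s: whitespace characters (space, tab, newline, carriage return), e.g. [ \t\n\r]
\d: [0-9]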
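
The shorthand classes from the list above can be tried out with a short sketch. The sample string is hypothetical and contains a tab character to show that \s covers more than plain spaces.

```php
<?php
$str = "order_42 shipped\ton 2016-09-05";  // hypothetical sample text

preg_match_all('/\d+/', $str, $digits);  // runs of digits
preg_match_all('/\w+/', $str, $words);   // runs of letters, digits, "_"
$parts = preg_split('/\s+/', $str);      // split on any whitespace run

var_dump($digits[0]);  // "42", "2016", "09", "05"
var_dump($words[0]);   // "order_42", "shipped", "on", "2016", "09", "05"
var_dump($parts);      // "order_42", "shipped", "on", "2016-09-05"
```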