The re module

#With the groundwork from the previous sections, we can now use regular expressions in Python. Python supports regular expressions through the re module.

1. The re.match function

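#The usual workflow with re has three steps: compile the regular expression string into a Pattern instance, use that Pattern instance to process the text and obtain the match result (a Match instance), and finally use the Match instance to read information and carry out further operations.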
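
#The three steps above can be sketched roughly as follows. This is only a minimal illustration; the pattern and the sample text are made up for the example.

```python
import re

# Step 1: compile the regular expression string into a Pattern instance.
pattern = re.compile(r'hello')

# Step 2: use the Pattern instance to process text. The result is a Match
# instance on success, or None on failure.
match = pattern.match('hello world')

# Step 3: read information from the Match instance.
if match:
    print(match.group())  # the matched text
    print(match.span())   # (start, end) positions of the match
```

#Compiling first is mainly useful when the same pattern is applied many times; for one-off use, module-level functions such as re.match accept the pattern string directly.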

#The re.match function tries to match a pattern at the start of a string. Its syntax is:

re.match(pattern,string,flags=0)

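#Parameters: pattern is the regular expression to match; string is the string to match against; flags is a set of flags that controls how the regular expression is matched, for example whether matching is case-sensitive or multi-line.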

#If the match succeeds, re.match returns a match object; otherwise it returns None.
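
#To make the flags parameter more concrete, here is a small sketch using re.I (ignore case) and re.M (multi-line). The sample strings are invented for illustration; flags are combined with the | operator.

```python
import re

text = 'Hello World'

# Without flags the match is case-sensitive, so 'hello' does not match 'Hello'.
case_sensitive = re.match(r'hello', text)

# re.I (re.IGNORECASE) makes the match case-insensitive.
case_insensitive = re.match(r'hello', text, flags=re.I)

# Flags can be combined with |. With re.M, '$' also matches at the end of
# each line, so the pattern can end right before the newline.
combined = re.match(r'hello$', 'HELLO\nworld', flags=re.I | re.M)

print(case_sensitive)            # None
print(case_insensitive.group())  # Hello
print(combined.group())          # HELLO
```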

#For example:

#!/usr/bin/python3
#-*-coding=UTF-8-*-
#re.match

import re

print(re.match('hello','hello world').span())  #pattern is at the start of the string
print(re.match('world','hello world'))         #pattern is not at the start of the string
#The output is:

D:Pythonworkspacedatatime20180110>python re.match.py
(0, 5)
None
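2. The re.search function
#Besides match, the search function in the re module is also used frequently.
#re.search scans the whole string and returns the first successful match. Its syntax is: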
re.search(pattern,string,flags=0)
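#Parameters: pattern is the regular expression to match; string is the string to match against; flags is a set of flags that controls how the regular expression is matched, for example whether matching is case-sensitive or multi-line.
#If the match succeeds, re.search returns a match object; otherwise it returns None.
#For example: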

#!/usr/bin/python3
#-*-coding:UTF-8-*-
#re.search

import re
print(re.search('hello','hello world').span())   #pattern is at the start of the string
print(re.search('world','hello world').span())   #pattern is not at the start of the string
#The output is:

D:Pythonworkspacedatatime20180110>python re.search.py
(0, 5)
(6, 11)
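#As a further illustration of "first successful match", the sketch below uses an invented sentence in which the pattern occurs several times. It also shows why it is usually safer to check for None before calling methods on the result.

```python
import re

text = 'one cat, two cats, three cats'

# re.search stops at the first (leftmost) match, even though 'cat' occurs
# three times in the text.
first = re.search(r'cat', text)
print(first.span())  # (4, 7)

# When nothing matches, re.search returns None, so calling .span() on the
# result directly would raise AttributeError.
missing = re.search(r'dog', text)
if missing is None:
    print('No match string!!')
```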
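3. The difference between re.match and re.search
#re.match matches only at the beginning of the string; if the beginning of the string does not fit the regular expression, the match fails and the function returns None.
#re.search scans through the whole string until it finds a match, and returns None only if no match is found anywhere in the string.
#For example: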

#!/usr/bin/python3
#-*-coding:UTF-8-*-
#re.match_re.search

import re 

line='Cats are smarter than dogs'

matchObj=re.match(r'dogs',line,re.M|re.I)
if matchObj:
   print('use match,the match string is:',matchObj.group())
else:
   print('No match string!!')

matchObj=re.search(r'dogs',line,re.M|re.I)
if matchObj:
   print('use search,the match string is:',matchObj.group())
else:
   print('No match string!!')
#The output is:

D:Pythonworkspacedatatime20180110>python re.match_re.search.py
No match string!!
use search,the match string is: dogs
#This example uses group, one of the grouping methods of the Match class. The method is declared as follows:

def group(self,*args):
   """Return one or more subgroups of the match.
   :rtype:T|tuple
   """
   pass
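#group([group1, ...]): returns the strings captured by one or more groups; when several arguments are given, the result is a tuple. group1 can be a group number or a group name. Number 0 stands for the whole matched string. With no arguments, group(0) is returned; a group that captured nothing returns None; a group that captured more than once returns its last captured substring.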
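#The rules for group can be checked with a short sketch. The patterns and the group name suffix are hypothetical, chosen only to exercise each case.

```python
import re

# Two numbered groups plus an optional named group (hypothetical pattern).
m = re.match(r'(\w+) (\w+)(?P<suffix>!)?', 'hello world')

print(m.group())          # hello world  (same as m.group(0))
print(m.group(1))         # hello
print(m.group(1, 2))      # ('hello', 'world')
print(m.group('suffix'))  # None: this group captured nothing

# A group that matches several times keeps only its last capture.
repeated = re.match(r'(\d)+', '123')
print(repeated.group(1))  # 3
```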
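#Another commonly used grouping method is groups.
#groups([default]): returns a tuple containing the strings captured by all groups, equivalent to calling group(1,2,...last). default is the value substituted for groups that captured nothing; it defaults to None.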
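#A brief sketch of groups and its default argument, using a pattern whose second group is optional (the sample input is chosen so that this group captures nothing):

```python
import re

# The second group is optional and does not take part in this match.
m = re.match(r'(\d+)\.?(\d+)?', '24')

print(m.groups())      # ('24', None)
print(m.groups('0'))   # ('24', '0'): default replaces the missing group
print(m.group(1, 2))   # ('24', None), same as m.groups()
```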