A clever use of Python's re.sub to analyze the regular expression matching process

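Statement: the method used in this article was researched and coded by LaoYuan himself, and the copyright of the code belongs to LaoYuan. Reposting this article is prohibited, and the code may not be used for commercial purposes!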

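"Section 11.23 Search and replace in Python's re module: the sub and subn functions" introduced re.sub, whose replacement argument can be a function. re.sub calls that function once for every match and passes it the match object, so we can use it to show the order in which the target substrings are matched, the text of each match, and the position of each match in the searched text. The implementation is as follows: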

import re

matchcount = 0  # number of matches seen during the current parsematch() call

def parsematch(patstr, text):
    """Print every match of patstr in text, in the order re.sub finds them."""
    global matchcount
    matchcount = 0
    re.sub(patstr, matchrsult, text)

def matchrsult(m):
    """Replacement callback for re.sub: report one match, leave the text unchanged."""
    global matchcount
    matchcount += 1
    print(f"Match {matchcount}:")
    # m.re.groups is the number of capturing groups in the pattern. m.lastindex is
    # not used here because it is only 1 for a nested pattern such as ((a)(b)).
    # m.re and m.string are also available if the pattern and the searched text
    # should be printed as well.
    for i in range(m.re.groups + 1):
        print(f"    group({i}): {m.group(i)}, span: {m.span(i)}")
    return m.group(0)

Example calls:

>>> parsematch(r'(?i)(?P<lab>py\w*)', 'Python?PYTHON!Learning python with LaoYuan! ')
Match 1:
    group(0): Python, span: (0, 6)
    group(1): Python, span: (0, 6)
Match 2:
    group(0): PYTHON, span: (7, 13)
    group(1): PYTHON, span: (7, 13)
Match 3:
    group(0): python, span: (23, 29)
    group(1): python, span: (23, 29)
>>>
>>> parsematch(r'(.?)*', "abc")
Match 1:
    group(0): abc, span: (0, 3)
    group(1): , span: (3, 3)
Match 2:
    group(0): , span: (3, 3)
    group(1): , span: (3, 3)
>>>
>>> parsematch(r'(?P<l1>Lao)(?P<l2>\w+)(Python)', 'LaoYuanPython')
Match 1:
    group(0): LaoYuanPython, span: (0, 13)
    group(1): Lao, span: (0, 3)
    group(2): Yuan, span: (3, 7)
    group(3): Python, span: (7, 13)
>>>
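
Because the callback returns m.group(0), re.sub rebuilds the original text unchanged and is used here only for its side effect. The following is a minimal sketch of the same idea that collects the match information in a list instead of printing it, which avoids the global counter. The helper names collect_matches and on_match are hypothetical and do not come from the original code.

```python
import re

def collect_matches(patstr, text):
    """Return (rebuilt_text, records); records holds one list per match."""
    records = []

    def on_match(m):
        # One (group index, matched text, span) tuple per group, group 0 first.
        records.append([(i, m.group(i), m.span(i)) for i in range(m.re.groups + 1)])
        return m.group(0)  # returning the matched text leaves the input unchanged

    rebuilt = re.sub(patstr, on_match, text)
    return rebuilt, records

text = 'Python?PYTHON!Learning python with LaoYuan! '
rebuilt, records = collect_matches(r'(?i)(?P<lab>py\w*)', text)
print(rebuilt == text)   # True
print(len(records))      # 3
print(records[1])        # [(0, 'PYTHON', (7, 13)), (1, 'PYTHON', (7, 13))]
```

The position of a record in the list gives the match order, so no separate counter is needed.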

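However, this analysis is only useful when the pattern matches the searched text several times. If there is only a single match, there is no need for it, because the match object itself already makes the match information easy to inspect.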
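
For the single-match case, a short sketch of reading the same information straight from the match object returned by re.search, reusing the pattern and text from the last example call:

```python
import re

m = re.search(r'(?P<l1>Lao)(?P<l2>\w+)(Python)', 'LaoYuanPython')
if m:
    print(m.group(0), m.span(0))  # LaoYuanPython (0, 13)
    print(m.groups())             # ('Lao', 'Yuan', 'Python')
    print(m.groupdict())          # {'l1': 'Lao', 'l2': 'Yuan'}
    print(m.span('l2'))           # (3, 7)
```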

LaoYuan Python: learn Python with LaoYuan!
Blog: https://blog.csdn.net/LaoYuanPython