Python re (regular expression module)

Reference: http://blog.csdn.net/wusuopubupt/article/details/29379367

In an IPython session, entering "?re" prints the module's built-in help:

This module exports the following functions:
    match    Match a regular expression pattern to the beginning of a string.
    search   Search a string for the presence of a pattern.
    sub      Substitute occurrences of a pattern found in a string.
    subn     Same as sub, but also return the number of substitutions made.
    split    Split a string by the occurrences of a pattern.
    findall  Find all occurrences of a pattern in a string.
    finditer Return an iterator yielding a match object for each match.
    compile  Compile a pattern into a RegexObject.
    purge    Clear the regular expression cache.
    escape   Backslash all non-alphanumerics in a string.
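
To make the function list above more concrete, here is a minimal sketch that runs most of these functions against one sample string. The sample string and variable names are made up for illustration; only the `re` calls themselves come from the module.

```python
import re

# Hypothetical sample input: words separated by runs of digits.
text = "one1two22three333"

# match only succeeds at the start of the string; search scans the whole string.
at_start = re.match(r"\d+", text)             # None, the string starts with a letter
first_digits = re.search(r"\d+", text).group()

# findall returns every non-overlapping match as a list of strings.
digits = re.findall(r"\d+", text)

# finditer yields match objects, which also carry the match positions.
spans = [m.span() for m in re.finditer(r"\d+", text)]

# split cuts the string wherever the pattern matches.
words = re.split(r"\d+", text)

# sub replaces matches; subn also reports how many replacements were made.
replaced = re.sub(r"\d+", "-", text)
replaced_with_count = re.subn(r"\d+", "-", text)

# escape backslash-escapes characters that are special inside a pattern.
literal = re.escape("a.b")

print(digits, spans, words, replaced, replaced_with_count, literal)
```

Note that split returns a trailing empty string here because the sample text ends with a match.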

Some of the functions in this module takes flags as optional parameters:
    I  IGNORECASE  Perform case-insensitive matching.
    L  LOCALE      Make \w, \W, \b, \B, dependent on the current locale.
    M  MULTILINE   "^" matches the beginning of lines (after a newline)
                   as well as the string.
                   "$" matches the end of lines (before a newline) as well
                   as the end of the string.
    S  DOTALL      "." matches any character at all, including the newline.
    X  VERBOSE     Ignore whitespace and comments for nicer looking RE's.
    U  UNICODE     Make \w, \W, \b, \B, dependent on the Unicode locale.
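
The effect of the flags listed above is easiest to see side by side. The following is a small sketch, with an invented two-line sample string, comparing the same pattern with and without a flag.

```python
import re

# Hypothetical sample input containing a newline.
text = "Hello\nworld"

# IGNORECASE: the lowercase pattern matches "Hello".
ignorecase = re.search(r"hello", text, re.I) is not None

# Without MULTILINE, "^" only matches at the start of the string.
plain_starts = re.findall(r"^\w+", text)
# With MULTILINE, "^" also matches right after each newline.
multiline_starts = re.findall(r"^\w+", text, re.M)

# Without DOTALL, "." does not match the newline; with it, "." does.
plain_dot = re.search(r"Hello.world", text) is not None
dotall_dot = re.search(r"Hello.world", text, re.S) is not None

# VERBOSE: whitespace and "#" comments inside the pattern are ignored.
year_month_re = re.compile(r"""
    (\d{4})  # year
    -
    (\d{2})  # month
""", re.X)
year_month = year_month_re.match("2014-06").groups()

# Flags are combined with the bitwise OR operator.
combined = re.findall(r"^world$", "Hello\nWORLD", re.I | re.M)

print(ignorecase, plain_starts, multiline_starts, plain_dot, dotall_dot)
print(year_month, combined)
```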

1. re.compile

The compile function converts a pattern string into a RegexObject, for example:

import re

# Precompile the patterns
regexes = [re.compile(p) for p in ['this', 'that']]
text = "Does this text match the pattern?"
print('Text: %r\n' % text)

# Search the text with each precompiled pattern
for regex in regexes:
    if regex.search(text):
        result = 'match!'
    else:
        result = 'no match'
    print('Seeking "%s" -> %s' % (regex.pattern, result))

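The module-level functions maintain a cache of compiled expressions. Using an already compiled expression directly avoids the cache lookup overhead, and precompiling expressions at startup means no compilation has to happen later at run time, which saves time.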
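
The difference between the two styles can be sketched as follows. The helper names `count_words_precompiled` and `count_words_module_level` are hypothetical and exist only for this illustration; both return the same results, and the sketch makes no claim about how large the time saving is in practice.

```python
import re

# Compiled once, at import time, rather than on every call.
WORD = re.compile(r"\w+")

def count_words_precompiled(line):
    # Uses the pattern object directly, so no cache lookup is involved.
    return len(WORD.findall(line))

def count_words_module_level(line):
    # re.findall looks the pattern string up in the module's cache on each
    # call and compiles it first if it is not cached yet.
    return len(re.findall(r"\w+", line))

lines = ["Does this text match the pattern?", "no match", ""]
precompiled_counts = [count_words_precompiled(l) for l in lines]
module_level_counts = [count_words_module_level(l) for l in lines]

# re.purge() empties the cache, so the next module-level call recompiles.
re.purge()
after_purge = count_words_module_level(lines[0])

print(precompiled_counts, module_level_counts, after_purge)
```

The precompiled pattern object is unaffected by `re.purge()`, since it is held by the `WORD` name rather than only by the cache.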