LeetCode 10: Regular Expression Matching

Description:

Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

Note:

s may be empty and contains only lowercase letters a-z.
p may be empty and contains only lowercase letters a-z and the characters '.' and '*'.

Example 1:

Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:

Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".

Example 3:

Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".

Example 4:

Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".

Example 5:

Input:
s = "mississippi"
p = "mis*is*p*."
Output: false

Approach 1: Recursion

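First, consider the case where the pattern contains no '*'. A '.' matches any single character. We check whether the first characters of s and p match; if they do, we recursively check whether the remaining characters match.

The code is as follows: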

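#include <string>
using std::string;

class Solution {
public:
    bool isMatch(string s, string p) {
        // An empty pattern matches only an empty string.
        if(p.length() == 0) return s.length() == 0;
        // The first characters match if s is non-empty and p[0] is the same letter or '.'.
        bool first_match = (s.length() != 0 && (s[0] == p[0] || p[0] == '.'));
        // Recurse on the remainder of both strings.
        return first_match && isMatch(s.substr(1), p.substr(1));
    }
};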
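
For a pattern without '*', the recursion above amounts to comparing the two strings position by position. A minimal loop sketch of the same check, assuming the pattern really contains no '*' (the name matchWithoutStar is hypothetical and not part of the original solution):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: loop form of the star-free check.
// Assumption: p contains no '*', only letters and '.'.
bool matchWithoutStar(const std::string& s, const std::string& p) {
    // Without '*', every pattern character consumes exactly one input character,
    // so the lengths must agree.
    if (s.size() != p.size()) return false;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // '.' matches any character; a letter must match itself.
        if (p[i] != '.' && p[i] != s[i]) return false;
    }
    return true;
}
```

Unlike the recursive version, this form makes no substring copies, but it cannot be extended to '*' as directly.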

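Next, consider the case where the pattern contains '*'. When the second character of the pattern is '*', there are two possibilities:

1. The '*' stands for zero occurrences of the preceding character. This is the only option when the first character of the input does not match the first character of the pattern, e.g. s="abc", p="c*abc". We skip the first two characters of p and continue the comparison with s="abc", p="abc".

2. The first character of the input matches the first character of the pattern, e.g. s="abc", p="a*bc". Because '*' can stand for several occurrences of the preceding character, we skip the first character of s, keep the pattern unchanged, and continue the comparison with s="bc", p="a*bc".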

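#include <string>
using std::string;

class Solution {
public:
    bool isMatch(string s, string p) {
        // An empty pattern matches only an empty string.
        if(p.length() == 0) return s.length() == 0;
        bool first_match = (s.length() != 0 && (s[0] == p[0] || p[0] == '.'));
        if(p.length() >= 2 && p[1] == '*') {
            // Either drop "x*" from the pattern (zero occurrences),
            // or consume one matching character of s and keep the pattern.
            return isMatch(s, p.substr(2)) || (first_match && isMatch(s.substr(1), p));
        } else {
            // No '*' follows: match one character and recurse on both remainders.
            return first_match && isMatch(s.substr(1), p.substr(1));
        }
    }
};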
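
As a quick sanity check, the recursive matcher can be run against the five examples from the problem statement. The sketch below repeats the class so that it compiles on its own; the Example struct and the countPassingExamples helper are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

using std::string;

// Same recursive matcher as in the text, repeated so the sketch is self-contained.
class Solution {
public:
    bool isMatch(string s, string p) {
        if (p.length() == 0) return s.length() == 0;
        bool first_match = (s.length() != 0 && (s[0] == p[0] || p[0] == '.'));
        if (p.length() >= 2 && p[1] == '*') {
            return isMatch(s, p.substr(2)) || (first_match && isMatch(s.substr(1), p));
        } else {
            return first_match && isMatch(s.substr(1), p.substr(1));
        }
    }
};

// Hypothetical container for one example: input, pattern, expected answer.
struct Example {
    string s;
    string p;
    bool expected;
};

// Hypothetical helper: counts how many of the statement's examples
// produce the expected output.
int countPassingExamples() {
    const std::vector<Example> examples = {
        {"aa", "a", false},
        {"aa", "a*", true},
        {"ab", ".*", true},
        {"aab", "c*a*b", true},
        {"mississippi", "mis*is*p*.", false},
    };
    Solution solver;
    int passed = 0;
    for (const Example& e : examples) {
        if (solver.isMatch(e.s, e.p) == e.expected) ++passed;
    }
    return passed;
}
```

If the matcher behaves as described, countPassingExamples() returns 5.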

Complexity analysis:

Time complexity: Let $T$ and $P$ be the lengths of s and p. In the worst case, the call on the suffixes s.substr(i) and p.substr(2 * j) is made $\binom{i+j}{i}$ times, and each such call builds strings of length $O(T - i)$ and $O(P - 2j)$. The total cost is therefore of order $\sum_{i = 0}^{T} \sum_{j = 0}^{P/2} \binom{i+j}{i} O(T+P-i-2j)$. With some effort outside the scope of this article, this can be shown to be bounded by $O\big((T+P)2^{T + \frac{P}{2}}\big)$.
Space complexity: Every call to isMatch creates the strings described above, possibly creating duplicates. If that memory is not freed, the total is also $O\big((T+P)2^{T + \frac{P}{2}}\big)$ space, even though the unique suffixes of s and p that are actually required take only order $O(T^2 + P^2)$ space.
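
The remark about unique suffixes suggests one way to avoid the repeated work: identify each subproblem by a pair of indices (i, j) instead of copying substrings, and cache its result. The following is a minimal sketch of that idea, not part of the original solution. The class name MemoSolution and its members are hypothetical, and the recursion itself is the same as in the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical memoized variant of the recursive matcher.
class MemoSolution {
public:
    bool isMatch(const std::string& s, const std::string& p) {
        // memo_[i][j] caches the answer for s[i..] against p[j..]:
        // -1 = not computed yet, 0 = no match, 1 = match.
        memo_.assign(s.size() + 1, std::vector<int>(p.size() + 1, -1));
        return match(s, p, 0, 0);
    }

private:
    std::vector<std::vector<int>> memo_;

    bool match(const std::string& s, const std::string& p,
               std::size_t i, std::size_t j) {
        if (memo_[i][j] != -1) return memo_[i][j] == 1;
        bool result;
        if (j == p.size()) {
            // An exhausted pattern matches only an exhausted input.
            result = (i == s.size());
        } else {
            bool first_match = i < s.size() && (s[i] == p[j] || p[j] == '.');
            if (j + 1 < p.size() && p[j + 1] == '*') {
                // Skip "x*" entirely, or consume one matching character of s.
                result = match(s, p, i, j + 2) ||
                         (first_match && match(s, p, i + 1, j));
            } else {
                result = first_match && match(s, p, i + 1, j + 1);
            }
        }
        memo_[i][j] = result ? 1 : 0;
        return result;
    }
};
```

Under these assumptions there are at most $(T+1)(P+1)$ distinct subproblems and each is computed once with constant extra work, so both time and space are $O(TP)$.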