Leetcode: 30. Substring with Concatenation of All Words

Description

You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.

Example

Input:
  s = "barfoothefoobarman",
  words = ["foo","bar"]
Output: [0,9]
Explanation: Substrings starting at index 0 and 9 are "barfoo" and "foobar" respectively.
The output order does not matter, returning [9,0] is fine too.

分析

采用搜索的方法， 具体做法如下
   0 （预处理）， 将 words 中中的所有单词用 26 进制转换为 数字
   1. 先统计 words 中所有单词中的 字母 个数， 然后 iter 输入字符串 ss。 找到一个窗口，其中所有的字符个数和  words 中一致。
   2. 窗口中的字符以 26 进制转换为数字， 和 words 中的数字表示进行比较。

code

class Solution(object):
    def findSubstring(self, ss, words):
        """
        :type s: str
        :type words: List[str]
        :rtype: List[int]
        """
        N, M, A = len(ss), len(words), ord('a')
        if N*M == 0:
            return []
        
        L = M * len(words[0])
        def build(word):
            s = 0
            for i in word:
                s = s * 26 + ord(i) - A
            return s

        def build2(str):
            i = 0
            lWordN = []
            step = len(words[0])
            for i in range(0, len(str), step):
                lWordN.append(build(str[i:i+step]))
            return sorted(lWordN)
        wordsNum = sorted([build(v) for v in words])

        if M*len(words[0]) > N:
            return []
        dp = [0 for i in range(26)]
        for v in words:
            for k in v:
                dp[ord(k) - A] += 1
        for i, v in enumerate(dp):
            if v == 0:
                dp[i] = -1
        s, e, ret, flag = 0, 0, [], False
        while N > e:
            if dp[ord(ss[e])-A]> - 1:
                dp[ord(ss[e])-A] -= 1
                s, e = e, e+1
                break
            e += 1
        if e == N:
            if build2(ss[s:e+1]) == wordsNum:
                ret.append(s)
            return ret
        
        while N > e:
            chn = ord(ss[e]) - A
            if dp[chn] == -1:
                e += 1
                flag = True
                continue
            if flag:
                while e > s:
                    s_key = ord(ss[s]) - A
                    s += 1
                    if dp[s_key] > -1:
                        dp[s_key] += 1
                flag = False
                # can be more fast if we can just reset array dp

            if dp[chn] == 0:
                while e > s:
                    s_key = ord(ss[s]) - A
                    s += 1
                    if s_key == chn:
                        break
                    if dp[s_key] > -1:
                        dp[s_key] += 1
                dp[chn] += 1
            dp[chn] -= 1
            if e - s +1 == L:
                if build2(ss[s:e+1]) == wordsNum:
                    ret.append(s)
            e += 1
        return ret

Runtime: 936 ms, faster than 36.72% of Python online submissions for Substring with Concatenation of All Words.
速度上并不是很快，需要思考一波