算法：一道题学会位操作和子集运算（LeetCode1178）

算法：一道题学会位操作和子集运算

算法：一道题学会位操作和子集运算
- 题目
  - 朴素解法
  - 哈希+子集解法
- 参考

题目

外国友人仿照中国字谜设计了一个英文版猜字谜小游戏，请你来猜猜看吧。

字谜的迷面 puzzle 按字符串形式给出，如果一个单词 word 符合下面两个条件，那么它就可以算作谜底：

单词 word 中包含谜面 puzzle 的第一个字母。
单词 word 中的每一个字母都可以在谜面 puzzle 中找到。
例如，如果字谜的谜面是 "abcdefg"，那么可以作为谜底的单词有 "faced", "cabbage", 和 "baggage"；而 "beefed"（不含字母 "a"）以及 "based"（其中的 "s" 没有出现在谜面中）都不能作为谜底。
返回一个答案数组 answer，数组中的每个元素 answer[i] 是在给出的单词列表 words 中可以作为字谜迷面 puzzles[i] 所对应的谜底的单词数目。

示例：

输入：
words = ["aaaa","asas","able","ability","actt","actor","access"], 
puzzles = ["aboveyz","abrodyz","abslute","absoryz","actresz","gaswxyz"]
输出：[1,1,3,2,4,0]
解释：
1 个单词可以作为 "aboveyz" 的谜底 : "aaaa" 
1 个单词可以作为 "abrodyz" 的谜底 : "aaaa"
3 个单词可以作为 "abslute" 的谜底 : "aaaa", "asas", "able"
2 个单词可以作为 "absoryz" 的谜底 : "aaaa", "asas"
4 个单词可以作为 "actresz" 的谜底 : "aaaa", "asas", "actt", "access"
没有单词可以作为 "gaswxyz" 的谜底，因为列表中的单词都不含字母 'g'。

提示：

1 <= words.length <= 10^5
4 <= words[i].length <= 50
1 <= puzzles.length <= 10^4
puzzles[i].length == 7
words[i][j], puzzles[i][j] 都是小写英文字母。
每个 puzzles[i] 所包含的字符都不重复。

朴素解法

观察到words[i][j], puzzles[i][j] 都是小写英文字母，很自然想到用位运算来表示

举个例子

aaccd可以表示为00001101

思路：

为每个单词建立对应的位表示
为每个谜底，建立位表示，其中包括
- 第一位的表示
- 所有的表示
对于每一个谜底，遍历每个单词，判断是否合法（通过位运算((wordMap[i] & puzzleMap) == wordMap[i]) && ((wordMap[i] & first) == first)）

class Solution {
public:
    vector<int> findNumOfValidWords(vector<string> &words, vector<string> &puzzles) {
        vector<int> res;
        vector<int> wordMap(words.size());
        
        for (int i = 0; i < words.size(); ++i) {
            int tmp = 0;
            for (char c:words[i]) {
                tmp |= 1 << (c - 'a');
            }
            wordMap[i] = tmp;
        }
        for (int i = 0; i < puzzles.size(); ++i) {
            res.push_back(helper(wordMap, puzzles[i]));
        }
        return res;
    }

    int helper(vector<int> &wordMap, string& puzzle) {
        int res = 0;
        int first = 1<<(puzzle[0]-'a');
        int puzzleMap = 0;
        for (char c:puzzle) {
            puzzleMap |= 1 << (c - 'a');
        }
        for (int i = 0; i < wordMap.size(); ++i) {
            if (((wordMap[i] & puzzleMap) == wordMap[i]) && ((wordMap[i] & first) == first))
                res++;
        }
        return res;
    }
};

时间复杂度度O(m*n)，超时

哈希+子集解法

观察到题设一个有用信息

puzzles[i].length == 7

我们可以使用哈希+子集，来优化时间复杂度和空间复杂度

思路：

为每个单词建立对应的位表示tmp，建立哈希表，记录每种位表示的单词个数
为每个谜底，建立位表示，其中包括
遍历谜底位表示的子集，哈希查表
```
        while (subSet!=0){
            int h = first|subSet;
            res+=wordMap[h];
            subSet = (subSet-1)&puzzleMap;
        }
```
这段代码是精髓，直接算出了位示数的子集，比如1101，它可以算出子集为1100，1001，1000，0101，0100，0001，时间复杂度为O(1的个数)

这里借用leetcode上一张图来解释

class Solution {
public:
    vector<int> findNumOfValidWords(vector<string> &words, vector<string> &puzzles) {
        vector<int> res;
        unordered_map<int,int> wordMap;
        for (int i = 0; i < words.size(); ++i) {
            int tmp = 0;
            for (char c:words[i]) {
                tmp |= 1 << (c - 'a');
            }
            wordMap[tmp]+=1;
        }
        for (int i = 0; i < puzzles.size(); ++i) {
            res.push_back(helper(wordMap, puzzles[i]));
        }
        return res;
    }

    int helper(unordered_map<int,int> &wordMap, string& puzzle) {
        int res = 0;
        int first = 1<<(puzzle[0]-'a');
        int puzzleMap = 0;
        for (int i = 1; i < puzzle.size(); ++i) {
            int c= puzzle[i];
            puzzleMap |= 1 << (c - 'a');
        }
        int subSet = puzzleMap;
        res+=wordMap[first];
        while (subSet!=0){
            int h = first|subSet;
            res+=wordMap[h];
            subSet = (subSet-1)&puzzleMap;
        }
        return res;
    }
};

顺利AC

参考

https://leetcode-cn.com/problems/number-of-valid-words-for-each-puzzle/solution/shou-hua-tu-jie-si-lu-jie-xi-leetcode-11-12dy/