leetcode187- Repeated DNA Sequences- medium

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].

位操作+hashmap。1.把ACGT转化成00,01,10,11然后每次<<2来实现字符到数字的转换,从而节省时间复杂度。2.hashmap要记录出现次数,避免同一串出现两次最后加两次到结果里。

细节:for循环开头的数字遍历是for(int i = 0; i < s.length() - 9; i++)  最后是-9,不是-10,因为你想本来是遍历到最后长度为1的字符串,你要遍历到最后长度为10的字符串,那差了9个位置啊。

(普通一点的也可以直接用substring,或者stringBuilder开始暖场。)

class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        
        List<String> result = new ArrayList<>();
        if (s == null || s.length() <= 10) {
            return result;
        }
        int[] map = new int[26];
        map['A' - 'A'] = 0;
        map['C' - 'A'] = 1;
        map['G' - 'A'] = 2;
        map['T' - 'A'] = 3;
        
        Map<Integer, Integer> Cntmap = new HashMap<Integer, Integer>();
        
        for (int i = 0; i < s.length() - 9; i++) {
            int trans = 0;
            for (int j = i; j < i + 10; j++) {
                trans |= (map[s.charAt(j) - 'A']) << ((j - i) * 2);
            }
            if (Cntmap.containsKey(trans)) {
                if (Cntmap.get(trans) == 1) {
                    result.add(s.substring(i, i + 10));
                } 
                Cntmap.put(trans, Cntmap.get(trans) + 1);
            } else {
                Cntmap.put(trans, 1);
            }
        }
        return result;
    }
}
原文地址:https://www.cnblogs.com/jasminemzy/p/7838490.html