Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].
Reference: http://blog.csdn.net/coderhuhy/article/details/43647731
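
For comparison, here is a minimal baseline sketch that solves the problem directly with string windows and two sets. It is not taken from the referenced post; the class name `RepeatedDnaBaseline` and the method name `findRepeated` are made up for illustration. It stores every 10-letter substring as a `String`, so it uses more memory than an integer-encoded approach, but it makes the expected behaviour easy to check.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical baseline, for illustration only.
class RepeatedDnaBaseline {
    static List<String> findRepeated(String s) {
        Set<String> seen = new HashSet<>();            // every window seen so far
        Set<String> repeated = new LinkedHashSet<>();  // windows seen at least twice, in discovery order
        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // Set.add returns false when the element is already present.
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }
}
```

Calling `RepeatedDnaBaseline.findRepeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT")` yields the two sequences from the example above, in the same order.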

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;


public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        if(s.length() < 10)
            return new ArrayList<String>();
        List<String> result = new ArrayList<String>();                       // result list
        Map<Character, Integer> dict = new HashMap<Character, Integer>();    // 2-bit integer code for each of A, C, G, T
        Map<Integer, Integer> check = new HashMap<Integer, Integer>();       // codes already added to the result, used for de-duplication
        Map<Integer, Integer> subValue = new HashMap<Integer, Integer>();    // codes of substrings seen so far, used to detect repeats
        int erase = 0x0003ffff;                                              // keeps the low 18 bits, i.e. the 9 most recent letters

        dict.put('A', 0);
        dict.put('C', 1);
        dict.put('G', 2);
        dict.put('T', 3);

        int hint = 0;

        // Encode the first 10 letters as a 20-bit integer.
        for(int i = 0; i < 10; i++){
            hint <<= 2;
            hint += dict.get(s.charAt(i));
        }

        subValue.put(hint, 1);
        // Slide the window one letter at a time: drop the oldest letter, append the new one.
        for(int i = 10; i < s.length(); i++){
            hint = ((hint & erase) << 2) + dict.get(s.charAt(i));
            if(subValue.get(hint) != null){                 // found a repeated substring
                if(check.get(hint) == null){                // not in the result yet: add it to the result and to check
                    result.add(s.substring(i - 9, i + 1));
                    check.put(hint, 1);
                }
            }
            subValue.put(hint, 1);
        }

        return result;
    }
}
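
The sliding update `((hint & erase) << 2) + code` in the solution above can be checked in isolation. The sketch below is only an illustration under assumed names (`RollingDnaHash`, `encode`, `roll`, and `KEEP_MASK` are hypothetical and do not appear in the original code). Each letter takes 2 bits, so a 10-letter window fits in 20 bits. Masking with `0x3FFFF` keeps the 18 bits of the 9 newest letters, and the shift makes room for the incoming one.

```java
// Hypothetical helper class, for illustration only.
class RollingDnaHash {
    static final int WINDOW = 10;
    static final int KEEP_MASK = 0x3FFFF; // low 18 bits = the 9 most recent letters

    // Same 2-bit mapping as the dict in the solution: A=0, C=1, G=2, T=3.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("Unexpected nucleotide: " + c);
        }
    }

    // Encode the 10 letters starting at 'start' from scratch.
    static int encode(String s, int start) {
        int h = 0;
        for (int i = start; i < start + WINDOW; i++) {
            h = (h << 2) | code(s.charAt(i));
        }
        return h;
    }

    // Slide the window by one letter: drop the oldest letter, append 'next'.
    static int roll(int h, char next) {
        return ((h & KEEP_MASK) << 2) | code(next);
    }

    public static void main(String[] args) {
        String s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT";
        int h = encode(s, 0);
        for (int i = WINDOW; i < s.length(); i++) {
            h = roll(h, s.charAt(i));
            // The rolled value should match a fresh encoding of the same window.
            if (h != encode(s, i - WINDOW + 1)) {
                throw new AssertionError("Mismatch at index " + i);
            }
        }
        System.out.println("Rolling update matches re-encoding for every window.");
    }
}
```

Because distinct 10-letter windows always map to distinct 20-bit integers, comparing the integers is equivalent to comparing the substrings, with no risk of hash collisions.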
Original post: https://www.cnblogs.com/luckygxf/p/4323898.html