LeetCode

题目

URL：https://leetcode.com/problems/longest-substring-without-repeating-characters

解法

一、Hash by HashMap

核心思想是，判断字符是否在 HashMap 中，如果是，那么计算当前不重复字串的长度，和之前不重复子串的长度比较并决定是否更新，之后更新在 HashMap 该字符的下标，如果否，将字符加入 HashMap 中。遍历到最后再次计算所有不重复最长子串，因为最后子串也是最长的。

    public int lengthOfLongestSubstring(String s) {
        char[] c = s.toCharArray();
        HashMap<Character, Integer> map = new HashMap<>();
        int max = 0, noRepeatingStart = 0;
        for (int i = 0; i < c.length; i++) {
            if (map.containsKey(c[i])) {
                max = i - noRepeatingStart > max ? i - noRepeatingStart : max;
                noRepeatingStart = map.get(c[i]) + 1 > noRepeatingStart ? map.get(c[i]) + 1 : noRepeatingStart;
                map.replace(c[i], i);
            } else {
                map.put(c[i], i);
            }
        }
        return max > c.length - noRepeatingStart ? max : c.length - noRepeatingStart;
    }

Hash by HashMap，时间复杂度O(n)，运行时间约为 54 ms。

二、Hash by Array（Best Solution）

参考：https://discuss.leetcode.com/topic/33361/java-runtime-4ms/11

其实没有必要用 HashMap 映射，比较耗时。这里因为只有 256 个 ASCII 字符，所以可以直接 Hash 到数组上。

start 记录不重复子串初始位置， end 记录不重复子串终止位置，max 记录最长长度。chars 映射数组表示字符上一次出现的下标，值为 0 表示未出现。遍历字符时，当发现字符出现在不重复子串之后，计算当前不重复子串长度，比较之前不重复子串长度决定是否更新不重复子串最长长度，之后在 chars 映射数组上更新字符的下标（这里为从 1 作为起始下标）。遍历到最后再次计算所有不重复最长子串，因为最后子串也是最长的。

    public int lengthOfLongestSubstring(String s) {
        if (s.length() == 0) return 0;
        int start = 1, end = 1, max = 0;
        int[] chars = new int[256];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (chars[c - ' '] >= start) {
                max = max > end - start ? max : end - start + 1;
                start = chars[c - ' '] + 1;
            }
            end = chars[c - ' '] = i + 1;
        }
        max = max > end - start ? max : end - start + 1;
        return max;
    }

Hash by Array，时间复杂度O(n)，运行时间约为 54 ms。

总结

同样利用 Hash 技术，手工的 Hash by Array 比系统自带的 Hash by HashMap 快了 25% 左右。能简单 Hash 的还是手工 Hash 比较好。

其实我也不清楚到底哪种算法比较好，我有的时候会觉得 Hash by HashMap 比较科学。因为它有利于程序设计，编写代码者只需要考虑如何组织 Hash，并不需要设计 Hash，此外，手工 Hash by Array 很容易因为下标的问题产生错误。

不管怎么样，HashMap 的原理还是有必要了解的，不然很容易产生性能问题。