Longest Consecutive Sequence

Problem Statement

  • Given an unsorted array of integers, find the length of the longest sequence of consecutive elements.
  • For example, given [100, 4, 200, 1, 3, 2], the longest consecutive sequence is [1, 2, 3, 4], so the answer is its length, 4.
  • Your algorithm should run in $O(n)$ time.

There is an easy $O(n \log n)$ solution.

We define an ordered map mp keyed by the array elements, and we initialize every value to 1:

map<int, int> mp;  //ordered map: keys are visited in ascending order
for(int i = 0; i < n; ++i)
    mp[A[i]] = 1;

The value mp[A[i]] is the maximum length of a consecutive sequence whose maximum element is A[i].

So at the beginning, every element by itself is a consecutive sequence, and the maximum length of a consecutive sequence ending at A[i] is 1.

Then, for each key A[i] of mp, visited in ascending order, we look up the element one less, A[i]-1. If it exists, the sequence ending at A[i]-1 extends to A[i], i.e.

mp[A[i]] = mp[A[i]-1] + 1

Because the map is ordered by key, these updates start from the minimum element, so mp[A[i]-1] is already final when A[i] is processed. Thus the algorithm is correct, and the answer is the largest value stored in mp.
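
The ordered-map idea above can be sketched as a complete function. This is only a minimal sketch: the name longestConsecutiveOrdered is hypothetical, and the INT_MIN guard is an added assumption to avoid overflow when computing the key minus one.

```cpp
#include <algorithm>
#include <climits>
#include <map>
#include <vector>

// Hypothetical helper name; a sketch of the O(n log n) ordered-map approach.
int longestConsecutiveOrdered(const std::vector<int>& A) {
    // mp[x] = length of the longest consecutive run whose maximum element is x.
    std::map<int, int> mp;
    for (int x : A) mp[x] = 1;

    int best = 0;
    // std::map iterates keys in ascending order, so the value stored for
    // x - 1 is already final by the time x is visited.
    for (auto& kv : mp) {
        if (kv.first != INT_MIN) {  // assumption: guard against x - 1 overflowing
            auto prev = mp.find(kv.first - 1);
            if (prev != mp.end()) kv.second = prev->second + 1;
        }
        best = std::max(best, kv.second);
    }
    return best;
}
```

Each insertion and lookup in the ordered map costs $O(\log n)$, which is where the $O(n \log n)$ bound comes from. Duplicates in A collapse into a single key, so they do not affect the result.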


The $O(n)$ solution:

Our algorithm extracts every connected component (consecutive sequence) from the array A.

For every element $A[i] \in C_i = [C_{min}, C_{max}] \subseteq A$, we search upward from $A[i]$ to the upper bound $C_{max}$, and downward from $A[i]$ to the lower bound $C_{min}$.

We first use a hash table to store every element of A as a key, with true as each key's value.

unordered_map<int, bool> mp;
for(int i = 0; i < length; ++i)
    mp[num[i]] = true;

Then, every time we access an element A[i], we erase it from mp, because it is one element of the connected component $C_i$. Next we check whether A[i]+1 exists in mp. If it does, we erase it from mp as well and increment the length of $C_i$. We keep going until the next larger element can no longer be found in mp.

int consecutiveLength = 1;  //A[i] itself is a consecutive sequence
int findNum = A[i];
mp.erase(findNum);

while(mp.find(findNum+1) != mp.end()){
    ++consecutiveLength;
    ++findNum;  //advance findNum to the next larger element
    mp.erase(findNum);
}

Then, we search to the lower bound of $C_i$ from $A[i]$ with the same method.

findNum = A[i];
while(mp.find(findNum-1) != mp.end()){
    ++consecutiveLength;
    --findNum;  //move findNum to the next smaller element
    mp.erase(findNum);
}

If the current consecutive length is larger than the best result so far, we update the result.

So, based on the above, we can iterate through the array and extract every connected component from A. To keep the running time at $O(n)$, each time we access an element A[i], we only do the search when A[i] is still in mp. Every element is erased from mp at most once, so with expected constant-time hash operations the total work over all searches is linear.
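
To make the extraction step concrete, here is a small sketch that records the size of each component in the order it is discovered. The helper names extractComponent and componentLengths are hypothetical, and the sketch assumes the values stay away from INT_MAX and INT_MIN so that x + 1 and x - 1 do not overflow.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical helper: erases the whole component containing `start`
// from mp and returns its size. Assumes `start` is currently in mp and
// that neighbouring values do not overflow int.
int extractComponent(std::unordered_map<int, bool>& mp, int start) {
    int len = 1;
    mp.erase(start);

    int x = start;
    while (mp.find(x + 1) != mp.end()) {  // walk up to the upper bound
        ++x;
        ++len;
        mp.erase(x);
    }

    x = start;
    while (mp.find(x - 1) != mp.end()) {  // walk down to the lower bound
        --x;
        ++len;
        mp.erase(x);
    }
    return len;
}

// Hypothetical helper: returns the size of every component of A,
// in the order the components are first touched.
std::vector<int> componentLengths(const std::vector<int>& A) {
    std::unordered_map<int, bool> mp;
    for (int a : A) mp[a] = true;

    std::vector<int> lengths;
    for (int a : A) {
        if (mp.find(a) == mp.end()) continue;  // already erased with its component
        lengths.push_back(extractComponent(mp, a));
    }
    return lengths;
}
```

For the array [100, 4, 200, 1, 3, 2], the element 100 gives a component of size 1, the element 4 pulls in 3, 2 and 1 for a component of size 4, and 200 gives another component of size 1. The elements 1, 3 and 2 are skipped because they were already erased, so the recorded lengths are 1, 4, 1 and the answer is their maximum, 4.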


The complete code is below:

#include <unordered_map>
#include <vector>
using namespace std;

int longestConsecutive(vector<int> &num) {
    int length = num.size();
    if(0 == length) return 0;
    
    unordered_map<int, bool> mp;
    for(int i = 0; i < length; ++i){
        mp[num[i]] = true;
    }
    
    int res = 0;
    for(int i = 0; i < length; ++i){
        if(mp.find(num[i]) == mp.end()) continue;
        
        int consecutiveLength = 1;
        int findNum = num[i];
        
        mp.erase(findNum);
        
        while(mp.find(findNum+1) != mp.end()){
            ++consecutiveLength;
            ++findNum;
            mp.erase(findNum);
        }
        
        findNum = num[i];
        while(mp.find(findNum-1) != mp.end()){
            ++consecutiveLength;
            --findNum;
            mp.erase(findNum);
        }
        
        res = consecutiveLength > res ? consecutiveLength : res ;
    }
    
    return res;
    
}

Original article: https://www.cnblogs.com/kid551/p/4114564.html