LeetCode: Find Median from Data Stream

The median is the middle value in an ordered integer list. If the size of the list is even, there is no single middle value, so the median is the mean of the two middle values.

Examples: 

[2,3,4] , the median is 3

[2,3], the median is (2 + 3) / 2 = 2.5
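
As a quick illustration of the definition above, here is a minimal sketch that computes the median of a fixed array by sorting it first. The class name MedianOfArray and the helper median are hypothetical and not part of the problem's API; they only restate the two examples in code.

```java
import java.util.Arrays;

class MedianOfArray {
    // Hypothetical helper (not part of the LeetCode API): median of a fixed array.
    static double median(int[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("median of an empty array is undefined");
        }
        int[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int n = sorted.length;
        if (n % 2 == 1) {
            return sorted[n / 2];      // odd size: the single middle value
        }
        // even size: mean of the two middle values, added as double to avoid int overflow
        return ((double) sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(median(new int[] {2, 3, 4})); // 3.0
        System.out.println(median(new int[] {2, 3}));    // 2.5
    }
}
```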

Design a data structure that supports the following two operations:

  • void addNum(int num) - Add an integer number from the data stream to the data structure.
  • double findMedian() - Return the median of all elements so far.

For example:

add(1)
add(2)
findMedian() -> 1.5
add(3) 
findMedian() -> 2

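The task is to design a data structure that stores a data stream and can report the median of everything seen so far.
Approach: finding the median requires the data in sorted order. Here the data does not arrive all at once but accumulates one element at a time, so a sorted sequence has to be maintained.
The first thing that came to mind was Java's SortedSet, which keeps its elements ordered. To read the median, though, the set has to be copied into an array on every call, which costs O(n) per findMedian, and this version timed out. (A set also discards duplicate values, so it gives wrong medians for streams with repeated numbers.)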
import java.util.SortedSet;
import java.util.TreeSet;

class MedianFinder {
    
    // Kept in sorted order, but a set: duplicate values are silently dropped.
    private SortedSet<Integer> set;
    
    public MedianFinder() {
        set = new TreeSet<Integer>();
    }

    // Adds a number into the data structure.
    public void addNum(int num) {
        set.add(num);
    }

    // Returns the median of current data stream
    public double findMedian() {
        Integer[] list = set.toArray(new Integer[0]);
        int size = set.size();
        double res = 0.0;
        if(size % 2 == 0) {
            // Add as double so two large ints cannot overflow.
            res = ((double) list[size/2] + list[size/2 - 1]) / 2.0;
        }
        else {
            res = (double)list[size/2];
        }
        return res;
    }
}

// Your MedianFinder object will be instantiated and called as such:
// MedianFinder mf = new MedianFinder();
// mf.addNum(1);
// mf.findMedian();
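
One possible way to keep the "maintain a sorted sequence" idea while preserving duplicates is an ArrayList with binary-search insertion. The sketch below is my own illustration, not the code that was submitted; the class name SortedListMedianFinder is hypothetical. Finding the insertion point takes O(log n), but shifting elements makes addNum O(n) overall, while findMedian becomes O(1).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedListMedianFinder {
    // Always kept in ascending order; duplicates are allowed.
    private final List<Integer> sorted = new ArrayList<>();

    void addNum(int num) {
        int pos = Collections.binarySearch(sorted, num);
        if (pos < 0) {
            pos = -pos - 1; // not found: binarySearch returns -(insertion point) - 1
        }
        sorted.add(pos, num); // shifts later elements, so this step is O(n)
    }

    double findMedian() {
        int n = sorted.size();
        if (n == 0) {
            throw new IllegalStateException("no elements added yet");
        }
        if (n % 2 == 1) {
            return sorted.get(n / 2);
        }
        // Add as double so two large ints cannot overflow.
        return ((double) sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
    }
}
```

With the stream 2, 2, 4 this returns 2.0, whereas the set-based version would keep only {2, 4} and report 3.0.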

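I then found solutions online that use priority queues. The idea is to maintain two priority queues, which are implemented internally as heaps: a max-heap holding the smaller half of the stream and a min-heap holding the larger half. When the total count is odd, the max-heap holds one more element than the min-heap. The median is then the top of the max-heap when the count is odd, and the average of the two heap tops when the count is even. The key point is keeping the sizes of the two heaps balanced. For a priority queue, offer and poll take O(log n) and peek takes O(1), so addNum runs in O(log n) and findMedian in O(1).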

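import java.util.Collections;
import java.util.PriorityQueue;
import java.util.Queue;

public class MedianFinder {
    
    private Queue<Integer> maxHeap; // max-heap: the smaller half of the stream
    private Queue<Integer> minHeap; // min-heap: the larger half of the stream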
    
    
    public MedianFinder() {
        maxHeap = new PriorityQueue<Integer>(11,Collections.reverseOrder());
        minHeap = new PriorityQueue<Integer>();
    }
    
    
    public void addNum(int num) {
        // num belongs in the max-heap (smaller half)
        if(maxHeap.size()==0 || maxHeap.peek()>=num) {
            maxHeap.offer(num);
            if(maxHeap.size()-1 > minHeap.size()) {
                minHeap.offer(maxHeap.poll());
            }
        }
        // num belongs in the min-heap (larger half)
        else if(minHeap.size()==0 || minHeap.peek() < num) {
            minHeap.offer(num);
            if(minHeap.size() > maxHeap.size()) {
                maxHeap.offer(minHeap.poll());
            }
        }
        // num lies between the two heap tops: try the max-heap first
        else {
            if(maxHeap.size() <= minHeap.size()) {
                maxHeap.offer(num);
            }
            else {
                minHeap.offer(num);
            }
        }
        
        
    }
    
    
    
    public double findMedian() {
        if(maxHeap.size() == minHeap.size()) {
            // Add as double so two large ints cannot overflow.
            return ((double) maxHeap.peek() + minHeap.peek()) / 2.0;
        }
        else {
            return (double)maxHeap.peek();
        }
    }
    
    
}
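
To check the two-heap idea against the example from the problem statement, here is a compact variant. It is a sketch of the same technique, not the code above: instead of three branches, every number passes through the max-heap first and the heaps are then rebalanced. The names TwoHeapMedian, lower and upper are my own.

```java
import java.util.Collections;
import java.util.PriorityQueue;

class TwoHeapMedian {
    // lower: max-heap with the smaller half; upper: min-heap with the larger half.
    private final PriorityQueue<Integer> lower = new PriorityQueue<>(Collections.reverseOrder());
    private final PriorityQueue<Integer> upper = new PriorityQueue<>();

    void addNum(int num) {
        lower.offer(num);          // route the new value through the smaller half
        upper.offer(lower.poll()); // hand its largest element to the larger half
        if (upper.size() > lower.size()) {
            lower.offer(upper.poll()); // keep lower.size() equal to upper.size() or one more
        }
    }

    double findMedian() {
        if (lower.isEmpty()) {
            throw new IllegalStateException("no elements added yet");
        }
        if (lower.size() > upper.size()) {
            return lower.peek();   // odd count: top of the max-heap
        }
        // even count: mean of the two tops, added as double to avoid int overflow
        return ((double) lower.peek() + upper.peek()) / 2.0;
    }

    public static void main(String[] args) {
        TwoHeapMedian mf = new TwoHeapMedian();
        mf.addNum(1);
        mf.addNum(2);
        System.out.println(mf.findMedian()); // 1.5
        mf.addNum(3);
        System.out.println(mf.findMedian()); // 2.0
    }
}
```

This variant always does two or three heap operations per addNum, so it trades a little constant-factor work for shorter code; the O(log n) and O(1) bounds are unchanged.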






Original post: https://www.cnblogs.com/wxisme/p/4916800.html