[LeetCode] 692. Top K Frequent Words

Given a non-empty list of words, return the k most frequent elements.

Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

Example 1:

Input: ["i", "love", "leetcode", "i", "love", "coding"], k = 2
Output: ["i", "love"]
Explanation: "i" and "love" are the two most frequent words.
    Note that "i" comes before "love" due to a lower alphabetical order.

Example 2:

Input: ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"],
k = 4
Output: ["the", "is", "sunny", "day"]
Explanation: "the", "is", "sunny" and "day" are the four most frequent words,
    with the number of occurrence being 4, 3, 2 and 1 respectively.

Note:

You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
Input words contain only lowercase letters.

Follow up:

Try to solve it in O(n log k) time and O(n) extra space.

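Top K frequent words.

In short: given a non-empty list of words, return the K words that occur most often. The answer should be sorted by frequency from highest to lowest. Words with the same frequency are sorted alphabetically.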

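This is a "top K of something" problem, so the approach is either a priority queue or bucket sort. This solution uses a heap (priority queue). First, a hashmap stores each word together with its frequency. Then a min-heap is built from a priority queue whose comparator compares entries of the map's entrySet: if two keys have the same value, they are compared alphabetically; otherwise they are ordered by their values. Because this is a min-heap, the entry that ranks lowest in the answer sits on top: the less frequent word, or on a tie the lexicographically larger one, so that the lexicographically smaller word survives and appears first in the result. Finally, a LinkedList collects the output. Note that each word must be added at the head of the list: the heap pops its lowest-ranked element first, so the last element popped belongs at the front of the list.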
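To make the ordering rule concrete, here is a minimal sketch that counts the words and then fully sorts the entries with the result-order comparator. It is not the heap solution (a full sort costs O(n log n)); it only illustrates the tie-breaking. The names RankingDemo, RESULT_ORDER and rank are made up for this illustration.

```java
import java.util.*;

class RankingDemo {
    // Result order: higher count first; on equal counts, the alphabetically smaller word first.
    // equals() is used because == on Integer objects compares references, not values.
    static final Comparator<Map.Entry<String, Integer>> RESULT_ORDER =
            (a, b) -> a.getValue().equals(b.getValue())
                    ? a.getKey().compareTo(b.getKey())
                    : Integer.compare(b.getValue(), a.getValue());

    // Counts the words, then sorts all entries with RESULT_ORDER (illustration only).
    static List<String> rank(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(RESULT_ORDER);
        List<String> ranked = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            ranked.add(e.getKey());
        }
        return ranked;
    }

    public static void main(String[] args) {
        String[] words = {"i", "love", "leetcode", "i", "love", "coding"};
        System.out.println(rank(words)); // [i, love, coding, leetcode]
    }
}
```

A min-heap for this problem would use the reverse of this comparator, so that the lowest-ranked entry is the one on top.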

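On my second pass through this problem I could not remember why a min-heap was required. A max-heap works too, but a min-heap can be capped at size K: once the heap holds K entries, a new entry that outranks the top pushes the top out, while a new entry that ranks lower does not stay in the heap. Each heap operation then costs O(log k) instead of O(log n), which saves some time.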
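The size cap can be sketched on plain integers. Java's PriorityQueue has no built-in size limit, so this sketch enforces the cap by hand: it offers each value and then polls whenever the size exceeds k, which means a too-small value enters briefly and is evicted at once. BoundedHeapDemo and largestK are hypothetical names used only here.

```java
import java.util.*;

class BoundedHeapDemo {
    // Returns the k largest values in descending order, using a min-heap capped at k entries.
    static List<Integer> largestK(int[] values, int k) {
        PriorityQueue<Integer> minHeap = new PriorityQueue<>(); // natural order: smallest on top
        for (int v : values) {
            minHeap.offer(v);
            if (minHeap.size() > k) {
                minHeap.poll(); // evict the smallest so only the k largest remain
            }
        }
        LinkedList<Integer> result = new LinkedList<>();
        while (!minHeap.isEmpty()) {
            result.addFirst(minHeap.poll()); // smallest pops first, so fill the list from the back
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(largestK(new int[]{4, 1, 3, 2}, 2)); // [4, 3]
    }
}
```

Since the heap never holds more than k + 1 entries, each offer or poll costs O(log k).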

Time: O(n log k)

Space: O(n)

Java implementation with a min-heap

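import java.util.*;

class Solution {
    public List<String> topKFrequent(String[] words, int k) {
        // Count how often each word occurs.
        HashMap<String, Integer> map = new HashMap<>();
        for (String word : words) {
            map.put(word, map.getOrDefault(word, 0) + 1);
        }

        // Min-heap: the least frequent word is on top; if two counts are equal,
        // the lexicographically larger word is on top so it is evicted first.
        // equals() is used because == on Integer compares references.
        PriorityQueue<Map.Entry<String, Integer>> queue = new PriorityQueue<>(
                (a, b) -> a.getValue().equals(b.getValue())
                        ? b.getKey().compareTo(a.getKey())
                        : a.getValue() - b.getValue());
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            queue.offer(entry);
            // Keep at most k entries by evicting the lowest-ranked one.
            if (queue.size() > k) {
                queue.poll();
            }
        }

        // The heap pops the lowest-ranked word first, so insert at the head.
        LinkedList<String> res = new LinkedList<>();
        while (!queue.isEmpty()) {
            res.addFirst(queue.poll().getKey());
        }
        return res;
    }
}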

Time: O(n log n), since every distinct word is pushed into the heap

Space: O(n)

Java implementation with a max-heap

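import java.util.*;

class Solution {
    public List<String> topKFrequent(String[] words, int k) {
        // Count how often each word occurs.
        HashMap<String, Integer> map = new HashMap<>();
        for (String s : words) {
            map.put(s, map.getOrDefault(s, 0) + 1);
        }
        // Max-heap: the most frequent word is on top; if two counts are equal,
        // the lexicographically smaller word is on top.
        // equals() is used because == on Integer compares references.
        // The k passed here is only an initial capacity, not a size limit.
        PriorityQueue<Map.Entry<String, Integer>> maxHeap = new PriorityQueue<>(k,
                (a, b) -> a.getValue().equals(b.getValue())
                        ? a.getKey().compareTo(b.getKey())
                        : b.getValue() - a.getValue());
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            maxHeap.offer(entry);
        }
        // Pop the k highest-ranked words; they come out in answer order.
        List<String> res = new ArrayList<>();
        while (res.size() < k) {
            res.add(maxHeap.poll().getKey());
        }
        return res;
    }
}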
