LeetCode 274. H-Index


Difficulty: Medium

Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher's h-index.

According to the definition of h-index on Wikipedia: "A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each."

For example, given citations = [3, 0, 6, 1, 5], which means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, his h-index is 3.

Note: If there are several possible values for h, the maximum one is taken as the h-index.

Hint:

An easy approach is to sort the array first.
What are the possible values of h-index?
A faster approach is to use extra space.

Credits:
Special thanks to @jianchao.li.fighter for adding this problem and creating all test cases.


【Problem Analysis】

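Given an array of a researcher's per-paper citation counts, return a number h such that at least h numbers in the array are greater than or equal to h and the remaining numbers are all less than or equal to h. If several values of h satisfy the condition, return the largest one.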
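
The condition above can be checked directly, without any cleverness. The following is a minimal brute-force sketch, not code from the original solutions: the class name HIndexBruteForce is hypothetical, and the O(n^2) running time makes it useful mainly as a reference to test faster versions against. It relies on the fact that for the largest h with at least h papers cited h or more times, the remaining papers automatically have no more than h citations.

```java
// Hypothetical reference implementation; the class name is an assumption
// made for illustration and does not appear in the original post.
class HIndexBruteForce {
    // Tries every candidate h from n down to 0 and returns the first one
    // for which at least h papers have h or more citations.
    static int hIndex(int[] citations) {
        int n = citations.length;
        for (int h = n; h >= 0; h--) {
            int atLeast = 0; // number of papers with at least h citations
            for (int c : citations) {
                if (c >= h) atLeast++;
            }
            if (atLeast >= h) return h;
        }
        return 0; // not reached in practice: h = 0 always qualifies
    }
}
```

Scanning from n downward means the first hit is already the maximum, which matches the rule that the largest valid h is taken.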

【Approach】

At first glance this problem is confusing, so let's work through the reasoning step by step.

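1. h ranges from 0 to L, where L is the length of the array. For example, h = L means every element of the array is greater than or equal to the array length, and h = 0 means every element is less than 1.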

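2. We can sort the array in ascending order and then try every possible h. For a chosen h, the last h numbers of the sorted array must be greater than or equal to h, and the first L-h numbers must be less than or equal to h. Because of the sort, this algorithm runs in O(n log n) time.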

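3. Without sorting, a neat idea is the following. We use an array count[L+1] to bucket the elements of the original array by value. For i < L, count[i] is the number of elements equal to i; count[L] is the number of elements greater than or equal to L. The number of elements greater than or equal to i is then the sum of count[i] through count[L].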

We first create a new array counts of size L+1 (named count in the code below), where L is the length of the citations array. For i = 0 to L-1, counts[i] stores the number of papers with exactly i citations; counts[L] stores the number of papers with L or more citations. The h-index can be at most L, which happens when every paper has at least L citations. So for the purpose of computing the h-index, a researcher with L papers ends up with the same h-index whether one of the papers has 10L citations or L citations.

Once the counts array is filled, we can locate the h-index by scanning from right (L) to left (0) while keeping a running sum. The h-index is the first k reached in this scan, that is, the largest k for which the sum of counts[k] through counts[L] is no less than k.
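
To make the bucket idea concrete, here is a small sketch that exposes the two intermediate arrays for inspection. The class and method names (HIndexCountsTrace, buildCounts, suffixCounts) are hypothetical and chosen only for this illustration; the full solution further down does not need to materialize the suffix sums.

```java
// Hypothetical helper for tracing the bucket approach; names are assumptions.
class HIndexCountsTrace {
    // counts[i] = papers with exactly i citations for i < n;
    // counts[n] = papers with n or more citations (values are capped at n).
    static int[] buildCounts(int[] citations) {
        int n = citations.length;
        int[] counts = new int[n + 1];
        for (int c : citations) {
            counts[Math.min(c, n)]++;
        }
        return counts;
    }

    // atLeast[k] = counts[k] + ... + counts[n], i.e. papers with >= k citations.
    static int[] suffixCounts(int[] counts) {
        int[] atLeast = new int[counts.length];
        int running = 0;
        for (int k = counts.length - 1; k >= 0; k--) {
            running += counts[k];
            atLeast[k] = running;
        }
        return atLeast;
    }
}
```

For citations = [3, 0, 6, 1, 5], buildCounts gives [1, 1, 0, 1, 0, 2] and suffixCounts gives [5, 4, 3, 3, 2, 2]. Scanning k from 5 down, the sums are 2, 2, 3, and k = 3 is the first position where the sum is at least k, so the h-index is 3.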

【Java code: sorting】

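import java.util.Arrays;

public class Solution {
    public int hIndex(int[] citations) {
        if (citations.length <= 0) return 0;
        Arrays.sort(citations); // ascending; sorts the input array in place
        int len = citations.length;

        // Largest value is 0: no paper has been cited, so h = 0.
        if (citations[len - 1] <= 0) return 0;
        // Smallest value is at least len: all len papers qualify, so h = len.
        if (citations[0] >= len) return len;

        // Try h = i from large to small: the last i values must be >= i
        // and the value just before them must be <= i.
        for (int i = len - 1; i >= 1; i--) {
            if (citations[len - 1 - i] <= i && citations[len - i] >= i) return i;
        }

        return 0;
    }
}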
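
The sorting solution above tests two conditions for each candidate. When candidates are tried from large to small, the check on the front part appears to be redundant: if h+1 was rejected because the (h+1)-th value from the end is less than h+1, that value is already at most h. Under that observation, a shorter variant might look like the sketch below. This is a simplification offered for comparison, not the original author's code, and the class name HIndexSortedScan is hypothetical.

```java
import java.util.Arrays;

// Hypothetical variant of the sorting approach; the name is an assumption.
class HIndexSortedScan {
    static int hIndex(int[] citations) {
        int[] sorted = citations.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);              // ascending order
        int n = sorted.length;
        for (int h = n; h >= 1; h--) {
            // sorted[n - h] is the smallest of the last h values, so the
            // last h papers all have at least h citations iff it is >= h.
            if (sorted[n - h] >= h) return h;
        }
        return 0; // also covers the empty array and the all-zero case
    }
}
```

Starting the loop at h = n also removes the need for the two special cases before the loop, at the cost of copying the array so the input is not reordered.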

【Java code: without sorting】

public class Solution {
    public int hIndex(int[] citations) {
        int L = citations.length;
        if (L == 0) return 0;

        // count[i] = number of papers with exactly i citations (i < L);
        // count[L] = number of papers with L or more citations.
        int[] count = new int[L + 1];
        for (int i : citations) {
            if (i > L) count[L]++;
            else count[i]++;
        }

        // res = number of papers with at least k citations.
        int res = 0;
        for (int k = L; k >= 0; k--) {
            res += count[k];
            if (res >= k) return k;
        }

        return 0;
    }
}