12.7

Word frequency counting:

import nltk
tokens = ['my', 'dog', 'has', 'flea', 'problems', 'help', 'please',
          'maybe', 'not', 'take', 'him', 'to', 'dog', 'park', 'stupid',
          'my', 'dalmation', 'is', 'so', 'cute', 'I', 'love', 'him']
# Count how often each token occurs
freq = nltk.FreqDist(tokens)

# Print each word with its frequency
for key, val in freq.items():
    print(str(key) + ':' + str(val))

# Pull out the 5 most common words as (word, count) pairs
standard_freq=freq.most_common(5)
print(standard_freq)
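
In NLTK 3, FreqDist is a subclass of collections.Counter, so the same counts can be cross-checked without NLTK. The following is only a minimal sketch under that assumption; the helper name count_words is hypothetical and not part of the original post or of NLTK.

```python
from collections import Counter


def count_words(tokens):
    """Return a Counter mapping each token to its number of occurrences.

    Hypothetical helper for illustration; it mirrors nltk.FreqDist(tokens)
    for plain counting only.
    """
    return Counter(tokens)


tokens = ['my', 'dog', 'has', 'flea', 'problems', 'help', 'please',
          'maybe', 'not', 'take', 'him', 'to', 'dog', 'park', 'stupid',
          'my', 'dalmation', 'is', 'so', 'cute', 'I', 'love', 'him']

counts = count_words(tokens)

# Print each word with its frequency, as in the NLTK version
for word, n in counts.items():
    print(word + ':' + str(n))

# The 5 most common words as (word, count) pairs
top_five = counts.most_common(5)
print(top_five)
```

Only 'my', 'dog' and 'him' occur twice in this list; every other token occurs once, so the last two entries of the top five are drawn from words that are tied at a count of 1.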

Original post: https://www.cnblogs.com/1329197745a/p/15651657.html