Natural Language 9: Computing High-Frequency Chinese Words with NLTK

The code below works only in Python 2.

Computing high-frequency Chinese words with NLTK

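>>> import nltk
>>> from nltk.corpus import sinica_treebank
>>> # Count every token in the Sinica Treebank sample corpus
>>> sinica_fd = nltk.FreqDist(sinica_treebank.words())
>>> # Relies on older NLTK releases, where items() is sorted by descending count
>>> top100 = sinica_fd.items()[0:100]
>>> for (x, y) in top100:
...     print x, y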

的 6776
、 1482
在 1331
是 1317
了 1190
有 759
我 724
他 688
就 627
上 612
和 580
也 542
不 526
人 467
都 417
與 404
著 389
我們 384
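
The transcript above depends on Python 2 and on an older NLTK in which `items()` comes back already sorted. As a rough sketch of the same idea for Python 3: in NLTK 3, `FreqDist` subclasses `collections.Counter`, so `most_common(n)` gives the top entries directly. The helper names `top_words` and `sinica_top_words` and the tiny `sample` list are made up for illustration and are not from the original post. The NLTK part assumes the corpus has been fetched with `nltk.download('sinica_treebank')`.

```python
from collections import Counter


def top_words(tokens, n=100):
    """Return the n most frequent tokens as (token, count) pairs, highest count first."""
    return Counter(tokens).most_common(n)


def sinica_top_words(n=100):
    """Same computation on the Sinica Treebank sample (hypothetical helper)."""
    # Imported lazily so top_words() stays usable without NLTK installed.
    import nltk
    from nltk.corpus import sinica_treebank  # needs nltk.download('sinica_treebank')
    return nltk.FreqDist(sinica_treebank.words()).most_common(n)


# Hand-made sample tokens (not taken from the corpus) to show the ordering.
sample = ["的", "在", "的", "是", "的", "在"]
for word, count in top_words(sample, 2):
    print(word, count)
```

Calling `sinica_top_words(100)` should produce a list comparable to the one printed above, although exact counts may differ between corpus versions.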

Original article: https://www.cnblogs.com/webRobot/p/6068858.html