Python tools: wordcloud

Generating word clouds

Installing the wordcloud module

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ wordcloud

Building a word cloud from a single repeated word

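import numpy as np
from wordcloud import WordCloud

text = "square"
x, y = np.ogrid[:300, :300]

# True for every pixel outside a circle of radius 130 centred at (150, 150)
mask = (x - 150) ** 2 + (y - 150) ** 2 > 130 ** 2
# 255 (white) is masked out; words are drawn only where the mask is 0
mask = 255 * mask.astype(int)

# repeat=True repeats the word until max_words or min_font_size is reached
wc = WordCloud(background_color="white", repeat=True, mask=mask)
wc.generate(text)
wc.to_file('wc.png')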
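
To see what the mask in this example contains, the small check below rebuilds it with NumPy alone and inspects a few pixels. It is only an illustrative sketch: the name `outside` is introduced here and does not appear in the original code. wordcloud treats pure white (255) entries as masked out, so the words end up inside the circle.

```python
import numpy as np

# Same grid and circle as in the example: centre (150, 150), radius 130.
x, y = np.ogrid[:300, :300]          # x has shape (300, 1), y has shape (1, 300)
outside = (x - 150) ** 2 + (y - 150) ** 2 > 130 ** 2   # True outside the circle
mask = 255 * outside.astype(int)

print(mask.shape)       # broadcasting yields a full (300, 300) array
print(mask[150, 150])   # centre of the circle: 0, free to draw on
print(mask[0, 0])       # corner of the image: 255, masked out
print(np.unique(mask))  # the mask holds only the values 0 and 255
```

Changing the radius or the centre in the inequality changes the drawable region accordingly.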

Generating a word cloud from a single sentence

from wordcloud import WordCloud
wc = WordCloud()    # create the word cloud object
wc.generate('This is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning.')    # build the word cloud
wc.to_file('wc.png')    # save the word cloud

Generating a word cloud from a txt file

import os
from os import path

import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Directory of this script, or the current working directory in an interactive session
d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()
# Read the whole file; the with block closes it afterwards
with open(path.join(d, 'test.txt')) as f:
    text = f.read()

wordcloud = WordCloud(max_font_size=40).generate(text)
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()

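Producing a word cloud file takes three steps:

1. Configure the WordCloud object's parameters

2. Load the text for the word cloud

3. Write out the word cloud file (if no size is specified, the image defaults to 400 * 200 pixels)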

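wordcloud's word-frequency processing consists of the following steps:

1. Split: separate the text into words on whitespace

2. Count: count how often each word occurs and filter the result

3. Font: assign each word a font size according to its count

4. Layout: arrange the words in the image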
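
The first three steps can be illustrated with the standard library alone. This is a simplified sketch, not wordcloud's actual implementation: the stop-word set, the function names `word_frequencies` and `font_sizes`, and the linear scaling rule are all assumptions made for illustration, and the layout step is left out.

```python
import re
from collections import Counter

# Hypothetical stop-word list for this sketch; wordcloud ships its own, larger set.
STOPWORDS = {"the", "is", "it", "of", "not", "but", "this", "even"}

def word_frequencies(text, stopwords=STOPWORDS):
    """Steps 1 and 2: split the text into words, count them, drop stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)

def font_sizes(counts, min_size=10, max_size=40):
    """Step 3: scale each count linearly into [min_size, max_size] (an assumed rule)."""
    top = max(counts.values())
    return {w: min_size + (max_size - min_size) * c / top for w, c in counts.items()}

text = ("This is not the end. It is not even the beginning of the end. "
        "But it is, perhaps, the end of the beginning.")
counts = word_frequencies(text)
sizes = font_sizes(counts)

print(counts.most_common(3))   # the three most frequent remaining words
print(sizes)                   # font size assigned to each word
```

With this sentence, "end" is the most frequent word left after filtering, so it receives the largest font size.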

Common parameters

Example:

import os
from os import path

import matplotlib.pyplot as plt
from wordcloud import WordCloud

d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()
text = open(path.join(d, 'test.txt')).read()
text = text.lower()    # normalize the text to lowercase
# White background, 800 x 660 pixel canvas
wordcloud = WordCloud(background_color="white", width=800, height=660).generate(text)

plt.imshow(wordcloud)
plt.axis("off")
plt.show()
wordcloud.to_file('test.png')    # save the image; the object here is named wordcloud, not wc

 

Getting test.txt

Link: https://pan.baidu.com/s/1zfuK9-W5tyq1P8ftlQJuJQ
Extraction code: iet4

Further reading: http://amueller.github.io/word_cloud/

    https://github.com/amueller/word_cloud

Original article: https://www.cnblogs.com/baby123/p/13024713.html