htm2txt

1. Install BeautifulSoup

pip install beautifulsoup4

2. Read the htm file and extract its text

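from bs4 import BeautifulSoup

with open(path, 'r') as f:
    htmcontent = f.read()

# Name the parser explicitly; 'html.parser' ships with Python
soup = BeautifulSoup(htmcontent, 'html.parser')

# Keep only the text, dropping all tags
htmcontent = soup.get_text()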
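
The steps above can be wrapped into a small helper that also writes the result to a .txt file. This is only a sketch: the function name htm_to_txt, its src/dst parameters, the UTF-8 default encoding, and the removal of script/style elements are assumptions for illustration, not part of the original post.

```python
from pathlib import Path

from bs4 import BeautifulSoup


def htm_to_txt(src, dst, encoding="utf-8"):
    """Convert the HTML file at src into a plain-text file at dst.

    Hypothetical helper; the encoding default is an assumption.
    """
    html = Path(src).read_text(encoding=encoding)
    soup = BeautifulSoup(html, "html.parser")

    # Remove <script> and <style> elements so their contents
    # cannot end up in the extracted text.
    for tag in soup(["script", "style"]):
        tag.decompose()

    # Put each text fragment on its own line and trim whitespace.
    text = soup.get_text(separator="\n", strip=True)

    Path(dst).write_text(text, encoding=encoding)
    return text
```

Calling htm_to_txt("page.htm", "page.txt") would read page.htm and write the extracted text to page.txt. If the file is not UTF-8 (older Chinese pages are often GBK), pass the matching encoding explicitly.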

Original article: https://www.cnblogs.com/levy/p/htm2txt.html