Extracting article body text with Python

References:
cx-extractor, URLs: https://code.google.com/archive/p/cx-extractor/ and https://github.com/chrislinan/cx-extractor-python

Boilerpipe, URL: http://code.google.com/p/boilerpipe/

Html2Article, URLs:

http://www.cnblogs.com/jasondan/p/3497757.html

https://github.com/stanzhai/Html2Article

Python version: https://github.com/zhuyf8899/Html2Article

python-goose, URL: https://github.com/grangier/python-goose

Readability, Python version: https://github.com/timbertson/python-readability

newspaper, URL: https://github.com/codelucas/newspaper

arex, URL: https://github.com/ahkimkoo/arex

Original article: https://www.cnblogs.com/microtiger/p/14882829.html