Some URLs and information you need for writing Python crawlers [to be expanded]

PhantomJS headless browser (rarely used now)
http://phantomjs.org/download.html
If you get an error like the following:

[root@hwgz01 ~]# phantomjs
phantomjs: error while loading shared libraries: libfontconfig.so.1: cannot open shared object file: No such file or directory

you need to install the fontconfig package:
CentOS family: sudo yum install fontconfig
Ubuntu family: sudo apt-get install libfontconfig1
If the problem persists, see:
https://stackoverflow.com/questions/480764/linux-error-while-loading-shared-libraries-cannot-open-shared-object-file-no-s
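
To check whether the install fixed the missing library before running PhantomJS again, something like the following sketch may help. It only uses the Python standard library (ctypes). The function name fontconfig_available is made up for this example, and ctypes.util.find_library can return None on minimal systems that lack ldconfig or gcc even when the library is present, so treat a negative result as a hint rather than proof.

```python
import ctypes
import ctypes.util


def fontconfig_available() -> bool:
    """Return True if the dynamic loader can find and load libfontconfig."""
    # find_library searches the usual loader locations (the ldconfig cache on Linux).
    name = ctypes.util.find_library("fontconfig")
    if name is None:
        return False
    try:
        # Loading the library confirms the file really exists and is usable.
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    if fontconfig_available():
        print("libfontconfig found")
    else:
        print("libfontconfig not found; install the fontconfig package first")
```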

Fetching data

selenium (PyPI page)
https://pypi.org/project/selenium/

requests
http://docs.python-requests.org/zh_CN/latest/user/quickstart.html

ChromeDriver - WebDriver for Chrome (download)
http://chromedriver.chromium.org/downloads
Download the ChromeDriver release that matches your installed version of Chrome.
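
A rough way to check the match is to compare major version numbers. The sketch below assumes both programs report four-part versions, as `google-chrome --version` and `chromedriver --version` normally do. Very old ChromeDriver 2.x releases used a different numbering scheme and are not handled. The helper names and the sample version strings are only illustrative.

```python
import re


def major_version(version_text: str) -> int:
    """Extract the major version from text such as 'Google Chrome 114.0.5735.90'."""
    # Look for the first four-part dotted version number in the text.
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_text)
    if match is None:
        raise ValueError(f"no four-part version number found in {version_text!r}")
    return int(match.group(1))


def versions_match(chrome_text: str, driver_text: str) -> bool:
    """Return True when Chrome and ChromeDriver share the same major version."""
    return major_version(chrome_text) == major_version(driver_text)


if __name__ == "__main__":
    # Example strings only; paste the real output of the two --version commands.
    chrome = "Google Chrome 114.0.5735.90"
    driver = "ChromeDriver 114.0.5735.90"
    print("match" if versions_match(chrome, driver) else "mismatch")
```

If the major versions differ, download a different ChromeDriver build from the page above before starting selenium.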

Modules for parsing HTML

pyquery
https://pythonhosted.org/pyquery/
bs4
https://www.crummy.com/software/BeautifulSoup/bs4/doc/index.zh.html

Crawler-related articles

https://zhuanlan.zhihu.com/p/56157552

Original article: https://www.cnblogs.com/lovesKey/p/10934339.html