Python Scrapy crawler framework

http://scrapy-chs.readthedocs.io/zh_CN/0.24/intro/tutorial.html

Extracting the content of HTML tags with Scrapy

from scrapy.selector import Selector

selector = Selector(response)
ul = selector.xpath('//ul[@class="movieList"]')

To get every div whose class contains test, for example <div class="test website"></div>:

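Change the argument passed to xpath above to "//div[contains(@class,'test')]". The leading // makes the query search the whole document; without it, the expression only matches div elements that are direct children of the current node.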

Reference: http://blog.csdn.net/iefreer/article/details/20745065

Original post: https://www.cnblogs.com/cxscode/p/8184043.html