Setting up a distributed crawler with scrapy-redis

For details, see the original article: https://www.cnblogs.com/pythoner6833/p/9148937.html

Five items need to be added to the project's settings file:

1. DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

2. SCHEDULER = "scrapy_redis.scheduler.Scheduler"

3. SCHEDULER_PERSIST = True

4. ITEM_PIPELINES = {
       'scrapy_redis.pipelines.RedisPipeline': 100,
   }

5. REDIS_URL = "redis://127.0.0.1:6379"

   or, equivalently:

   REDIS_HOST = '127.0.0.1'
   REDIS_PORT = 6379
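Putting the five items together, a minimal `settings.py` fragment might look like the sketch below (it assumes a Redis server running locally on the default port 6379 and an existing Scrapy project; the pipeline priority 100 is just the value used above):

```python
# settings.py additions for scrapy-redis (sketch)

# 1. Deduplicate requests via a fingerprint set stored in Redis,
#    shared by all spider processes
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# 2. Schedule requests through a shared Redis queue so multiple
#    spider instances pull from the same backlog
SCHEDULER = "scrapy_redis.scheduler.Scheduler"

# 3. Keep the request queue and fingerprint set in Redis after the
#    spider closes, so a crawl can be paused and resumed
SCHEDULER_PERSIST = True

# 4. Push scraped items into Redis for later consumption
ITEM_PIPELINES = {
    "scrapy_redis.pipelines.RedisPipeline": 100,
}

# 5. Redis connection, either as a single URL...
REDIS_URL = "redis://127.0.0.1:6379"
# ...or as separate host/port settings:
# REDIS_HOST = "127.0.0.1"
# REDIS_PORT = 6379
```

With these settings in place, every worker running the same spider shares one request queue and one dedup filter, which is what makes the crawl distributed.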

Original source: https://www.cnblogs.com/znh8/p/11802884.html