I start two scrapy tasks (A and B) at the same time and then push a start_url into redis, but only task A runs; task B only begins to crawl once A is stopped.
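For reference, the start_url is pushed with the redis client, roughly like this (the key name myspider:start_urls is an assumption; scrapy-redis reads from "<spider name>:start_urls" by default, or from whatever redis_key is configured):

import redis

r = redis.StrictRedis(host="localhost", port=6379)
# Key name is hypothetical; the default pattern is "<spider name>:start_urls"
r.lpush("myspider:start_urls", "http://example.com")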
The reason seems to be that requests are not saved to redis while scrapy-redis is running; only the dupefilter is saved. Task B only gets requests when A is stopped (and its pending requests are flushed back to redis), or when I push another start_url to redis before B starts crawling.
What's going on?
Versions:
Python 3.6
Scrapy 1.5.0
scrapy-redis 0.6.8
settings.py
SCHEDULER = "scrapy_redis.scheduler.Scheduler"  # redis-backed scheduler shared by all spiders
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"  # request fingerprints kept in redis
SCHEDULER_PERSIST = True  # don't clear the redis queue/dupefilter when the spider closes
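The spiders themselves are plain RedisSpider subclasses, roughly like this (class name, spider name, and selectors are illustrative, not the actual project code):

from scrapy_redis.spiders import RedisSpider

class MySpider(RedisSpider):
    name = "myspider"
    redis_key = "myspider:start_urls"  # list the start_url above is pushed to

    def parse(self, response):
        # follow links; new requests go through the shared redis scheduler
        for href in response.css("a::attr(href)").extract():
            yield response.follow(href, callback=self.parse)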