Problem description
I am crawling answers from Zhihu. Because there are so many answers, I use `response.save` to carry the results crawled so far forward with each task.
Since Zhihu cannot be crawled too fast, tasks may not be completed in time,
so taskdb ends up holding a large amount of data. The taskdb for this project is close to 250 GB.
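For context, the pattern looks roughly like the sketch below (the question URL, CSS selectors, and field names are placeholders, not taken from the real project). The point is that the `save` payload is stored with each pending task in taskdb, which is why it grows so large:

```python
from pyspider.libs.base_handler import *


class Handler(BaseHandler):
    crawl_config = {}

    def on_start(self):
        # hypothetical question URL; the real project crawls many such pages
        self.crawl('https://www.zhihu.com/question/12345',
                   callback=self.answer_page, save={'answers': []})

    def answer_page(self, response):
        # collect answers from this page (selector is a placeholder)
        answers = response.save['answers'] + [
            a.text() for a in response.doc('.AnswerItem').items()
        ]
        next_url = response.doc('.Pagination-next').attr.href
        if next_url:
            # carrying everything collected so far in `save` means the whole
            # payload is persisted with the pending task in taskdb
            self.crawl(next_url, callback=self.answer_page,
                       save={'answers': answers})
        else:
            return {'answers': answers}
```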
What result do you expect? What error message do you actually see?
Does pyspider support using a MongoDB cluster as the taskdb,
and if so, how should it be configured?
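For reference, this is how I understand taskdb configuration today (a sketch only; the host names, port, replica-set name, and database name are placeholders, and I am assuming the connection URL is ultimately handed to pymongo's `MongoClient`). Whether a multi-host cluster URI is accepted is exactly what I am asking:

```python
from pyspider.database import connect_database

# Single-host MongoDB taskdb, as shown in the Deployment docs:
#   pyspider --taskdb "mongodb+taskdb://127.0.0.1:27017/taskdb" all
taskdb = connect_database('mongodb+taskdb://127.0.0.1:27017/taskdb')

# What I would like to do for a cluster / replica set (placeholder hosts and
# replica-set name); I don't know whether pyspider's URL parsing forwards a
# multi-host URI and its options to MongoClient unchanged:
cluster_url = ('mongodb+taskdb://mongo1:27017,mongo2:27017,mongo3:27017'
               '/taskdb?replicaSet=rs0')
# taskdb = connect_database(cluster_url)  # does this work, and if not, how should it be configured?
```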