There are more than 30 pages with 10 entries per page, but for some pages only one or two entries are scraped, and the total comes to just over 20 records.
Is there a problem with the loop below?
The approximate code is as follows (the code that extracts and processes the data is omitted):
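def getCurList(self, response):
    # totalPage and start_url are set in code omitted here
    for x in range(totalPage):
        pageUrl = start_url + "&rPage=" + str(x + 1)
        yield Request(pageUrl, headers=self.headers, callback=self.getPageList)

def getPageList(self, response):
    # detailUrlList is extracted from response in code omitted here
    for good in detailUrlList:
        # detailUrl must be derived from good on every iteration
        yield Request(detailUrl, headers=self.headers, callback=self.getDetail)

def getDetail(self, response):
    # item is built from response in code omitted here
    yield item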
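
One possible cause, offered as a guess since the omitted code is not shown: by default Scrapy filters out duplicate requests unless dont_filter=True is passed to Request. If detailUrl is not reassigned from good on each pass through the loop, every request on a page points to the same URL and all but the first are dropped. The sketch below imitates that behaviour in plain Python without Scrapy. The function simulate_dupefilter, the example.com URLs, and the snake_case variable names are hypothetical and only stand in for the real scheduler and data.

```python
def simulate_dupefilter(urls):
    """Keep only the first request for each URL, roughly like a duplicate filter."""
    seen = set()
    kept = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            kept.append(url)
    return kept


# Hypothetical detail links for one listing page (10 entries per page).
detail_url_list = [f"https://example.com/item/{i}" for i in range(10)]

# Buggy pattern: detail_url is set once and the loop variable is never used.
detail_url = detail_url_list[0]
buggy = [detail_url for good in detail_url_list]

# Fixed pattern: each request uses the URL of the current loop element.
fixed = [good for good in detail_url_list]

print(len(simulate_dupefilter(buggy)))  # 1
print(len(simulate_dupefilter(fixed)))  # 10
```

If the real spider shows the same pattern, the dupefilter/filtered counter in the stats that Scrapy logs at the end of the crawl should be well above zero, which is a quick way to confirm or rule out this explanation.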