Some questions and doubts about Scrapy

My proxy middleware (in settings it is registered with priority 544, and the default middleware is set to None):

import logging

import requests


class IpProxyMiddleware(object):
    def __init__(self, ip=""):
        self.ip = ip

    def process_request(self, request, spider):
        # Fetch a fresh proxy ("host:port") from the local proxy pool for each request.
        self.ip = requests.get("http://localhost:5555/random").text.strip()
        logging.info("IP:" + self.ip)
        request.meta["proxy"] = "http://" + self.ip
  • Wonder: every time a Request is created with a url and a callback function, is the process_request method executed, so that the API is called once to get a proxy IP from the local pool? That's what the docs say:
this method is called for each request that passes through the downloader middleware
  • Question: in the callback's parsing function, how do I switch the proxy IP and retry when the response has a non-200 status code? Is there a problem with the following code? I used it, but it didn't work, and I don't know where to check.
        if response.status != 200:
            logging.error("-------- IP has been banned! Retrying --------")
            yield Request(url=response.url, meta={"change_proxy": True}, callback=self.followees_parse)
Mar.09,2021

Run the spider under a debugger and step through it with breakpoints.
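While stepping through, two Scrapy behaviors are worth checking, since either could explain why the retry in the question never seems to fire. First, by default HttpErrorMiddleware drops non-2xx responses before they reach the spider callback unless the spider lists those codes in handle_httpstatus_list, so the `response.status != 200` branch may never run. Second, a new Request for a URL that has already been seen is discarded by the duplicate filter unless it is created with dont_filter=True. Also, the `change_proxy` meta key is not read anywhere in the middleware shown, so it has no effect on its own.

One possible way around all three is to handle the retry inside the downloader middleware itself. The following is only a sketch under assumptions: the class name `RotatingProxyMiddleware`, the `proxy_retries` meta key, and the limit of 3 retries are made up for illustration, and the proxy pool is assumed to return a plain `host:port` string as in the question.

```python
import logging
from urllib.request import urlopen

logger = logging.getLogger(__name__)

PROXY_API = "http://localhost:5555/random"  # proxy pool endpoint from the question


def fetch_proxy(api_url=PROXY_API):
    """Ask the local proxy pool for one proxy, assumed to be 'host:port'."""
    with urlopen(api_url, timeout=5) as resp:
        return resp.read().decode().strip()


class RotatingProxyMiddleware:
    """Hypothetical downloader middleware: fresh proxy per request, retry on non-200."""

    max_proxy_retries = 3  # assumed limit, not from the original post

    def __init__(self, get_proxy=fetch_proxy):
        # get_proxy is injectable so the class can be exercised without a proxy pool.
        self.get_proxy = get_proxy

    def process_request(self, request, spider):
        # Called for every request passing through the downloader middleware,
        # including requests that were rescheduled by process_response below.
        request.meta["proxy"] = "http://" + self.get_proxy()
        return None  # let Scrapy continue processing this request

    def process_response(self, request, response, spider):
        if response.status == 200:
            return response
        retries = request.meta.get("proxy_retries", 0)
        if retries >= self.max_proxy_retries:
            return response  # give up and pass the response along
        logger.warning("Got status %s for %s, retrying with a new proxy",
                       response.status, request.url)
        # dont_filter=True keeps the duplicate filter from dropping the retry.
        retry = request.replace(dont_filter=True)
        retry.meta["proxy_retries"] = retries + 1
        return retry  # returning a Request makes Scrapy reschedule it
```

With something like this enabled in DOWNLOADER_MIDDLEWARES at the same priority (544), the rescheduled request passes through process_request again and picks up a new proxy, so the spider callback needs no special retry branch. Note that a blocking HTTP call inside process_request stalls the crawler while it waits, which is acceptable for a local proxy pool but worth keeping in mind.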
