When I implemented a spider with Scrapy, I wanted to rotate its proxy so that the server wouldn't block my requests for sending too many from a single IP. I also know how to change the proxy in Scrapy, either through a downloader middleware or by setting the request's meta directly.
However, I use the scrapy_splash package to execute JavaScript for my spider, and there I found it difficult to change the proxy, because as I understand it, scrapy_splash hands the request to a Splash server, which renders the site's JS for us.
In fact, the proxy works fine when I use plain Scrapy, but stops working once I use scrapy_splash.
So is there any way to set a proxy for scrapy_splash requests?
Edit (4 hours later):
I have set the related settings in settings.py and written this middleware in middlewares.py. As I mentioned before, it works with plain Scrapy but not with scrapy_splash:
import json
import random


class RandomIpProxyMiddleware(object):
    def __init__(self, ip=""):
        self.ip = ip
        ip_get()  # helper defined elsewhere in my project
        with open("carhome\\ip.json", "r") as f:
            self.IPPool = json.loads(f.read())

    def process_request(self, request, spider):
        thisip = random.choice(self.IPPool)
        request.meta["proxy"] = "http://{}".format(thisip["ipaddr"])
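If I understand it correctly, a SplashRequest is fetched by the Splash server itself, so meta["proxy"] only affects the Scrapy-to-Splash hop, not the download of the target page. A sketch of how the middleware might be adapted, assuming Splash's HTTP API `proxy` render argument can be used (the class name and the pool-injection style are mine, not from my working code):

```python
import random


class RandomIpSplashProxyMiddleware(object):
    """Sketch: rotate proxies for both plain Scrapy requests and
    SplashRequests. For a SplashRequest, meta["proxy"] only covers the
    Scrapy -> Splash hop, so the proxy is put into Splash's own `proxy`
    render argument instead."""

    def __init__(self, ip_pool):
        # ip_pool: list of dicts like {"ipaddr": "1.2.3.4:8080"},
        # e.g. loaded from ip.json as in my middleware above.
        self.ip_pool = ip_pool

    def process_request(self, request, spider):
        if not self.ip_pool:
            return
        proxy = "http://{}".format(random.choice(self.ip_pool)["ipaddr"])
        if "splash" in request.meta:
            # Splash fetches the page itself, so tell Splash to use the proxy.
            request.meta["splash"].setdefault("args", {})["proxy"] = proxy
        else:
            # Plain Scrapy request: the downloader honours meta["proxy"].
            request.meta["proxy"] = proxy
```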
And here is the code in the spider that uses scrapy_splash:
yield scrapy_splash.SplashRequest(
    item, callback=self.parse, args={"wait": 0.5})
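One thing that may work here: everything in `args` is forwarded to the Splash HTTP API, which accepts a `proxy` render argument. A sketch (the proxy address is a placeholder and `build_splash_args` is a hypothetical helper, not part of scrapy_splash):

```python
def build_splash_args(proxy, wait=0.5):
    """Hypothetical helper: merge the proxy into the Splash render
    arguments so Splash itself fetches the page through the proxy."""
    return {"wait": wait, "proxy": proxy}


# Usage inside the spider (placeholder proxy address):
# yield scrapy_splash.SplashRequest(
#     item, callback=self.parse,
#     args=build_splash_args("http://1.2.3.4:8080"))
```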
And here is the code in the spider without this plugin:
yield scrapy.Request(item, callback=self.parse)