Problem description: crawling FAERS data with the Luigi framework raises an error; the IDE is PyCharm. Error message: "No task specified", then "Process finished with exit code 1". 2. Source code: import os import re import shutil import requests from io imp...
Problem description: cannot get the next page. Related code (pasted as text, not a screenshot): import scrapy from qsbk.items import QsbkItem from scrapy.http.response.html import HtmlResponse from scra...
I can run the single file directly without import errors. Likewise, using pymongo in a standalone .py file works fine, but when I run it inside the Scrapy project it reports that the import failed. Why? import json import pymongo from scrapy.utils.pr...
squares = []; for x in range(1, 5): squares.append(x); print(squares) — the output is [1] [1, 2] [1, 2, 3] [1, 2, 3, 4]. My understanding is as follows; is this correct, or is my explanation forced? x = 1, append(x) adds 1 to the list. A...
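A minimal sketch of the loop above, with the print inside the loop so each intermediate list is shown:

```python
squares = []
for x in range(1, 5):    # x takes the values 1, 2, 3, 4 (range excludes 5)
    squares.append(x)    # append adds the current x to the end of the list
    print(squares)       # printing inside the loop shows the list grow one step at a time
```

This prints `[1]`, `[1, 2]`, `[1, 2, 3]`, `[1, 2, 3, 4]` on successive lines. Note that despite the name `squares`, the code appends `x` itself; appending `x * x` would instead build `[1, 4, 9, 16]`.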
This is the core code of my simulated login: def __init__(self): dcap = dict(webdriver.DesiredCapabilities.PHANTOMJS) # userAgent dcap["phantomjs.page.settings.userAgent"] = "Mozilla/5.0 (...
Question about a Scrapy crawler: I sent scrapy.Request to https://www.tianyancha.com/reportContent/24505794/2017, but the url printed inside the callback becomes https://www.tianyancha.com/login?from=https://www.tianyancha.com/reportContent/24505794/2017...
Appium + emulator: I found the element's id with uiautomatorviewer, but find_element_by_id("com.ss.android.me:id/i7") raises selenium.common.exceptions.NoSuchElementException: Message: An element could not be located on the page using ...
Problem description: I want to crawl the contents of all the td tags inside each tr tag, and get the absolute path inside the onclick attribute. Environment background and what I have tried: I tried to directly ignore onclick ...
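As a hedged sketch (the actual markup is not shown above), one way to pull the path out of an onclick attribute is a regular expression over the tag text; the sample row and the `window.open('...')` pattern here are assumptions, as is the `example.com` site root:

```python
import re
from urllib.parse import urljoin

# Hypothetical row markup; the real onclick format on the target site may differ.
html = '<tr onclick="window.open(\'/report/2017/24505794.html\')"><td>a</td><td>b</td></tr>'

# Capture the quoted argument inside onclick="window.open('...')".
m = re.search(r"onclick=\"window\.open\('([^']+)'\)\"", html)
path = m.group(1) if m else None
print(path)  # -> /report/2017/24505794.html

# Join with the (assumed) site root to turn the relative path into an absolute URL.
absolute = urljoin("https://example.com", path)
print(absolute)  # -> https://example.com/report/2017/24505794.html
```

In a Scrapy spider the same idea is usually written as `response.xpath('//tr/@onclick').re_first(...)` followed by `response.urljoin(path)`.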
<h4>1</h4> text text text <h4>2</h4> text text text <span>asdf</span> <h4>3</h4> With HTML code as above, how do I get the content between two <h4> tags? For exam...
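A minimal stdlib sketch, assuming the fragment looks like the sample above: split the fragment on the <h4>…</h4> markers so the text between consecutive headings can be read off. (On a real page an HTML parser such as lxml or BeautifulSoup with a following-sibling XPath would be more robust than a regex.)

```python
import re

html = ("<h4>1</h4> text text text "
        "<h4>2</h4> text text text <span>asdf</span> "
        "<h4>3</h4>")

# Split on the headings; after re.split with a capture group, odd indexes hold
# the heading texts and the element right after each heading holds the content
# up to the next <h4>.
parts = re.split(r"<h4>(.*?)</h4>", html)

between = {}
for i in range(1, len(parts) - 1, 2):
    heading = parts[i]
    content = parts[i + 1]
    between[heading] = content.strip()

print(between["1"])  # -> text text text  (content between <h4>1</h4> and <h4>2</h4>)
```

The content between headings "2" and "3" still contains the inner `<span>` markup; strip or parse it separately if only the text is wanted.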
The website I am crawling displays only 20 items at a time; only when the mouse scrolls to the bottom does it load another 20, and scrolling to the bottom again shows all 60 items. How can I achieve this effect with s...
Question: the project uses the RedisCrawlSpider crawler template to achieve two-way crawling, i.e. one Rule handles horizontal crawling of next-page urls and another Rule handles vertical crawling of detail-page urls. Then the effect of distributed ...
<table> <thead><tr></tr></thead> <tbody> <tr class="aaa"></tr> <tr></tr> <tr class="aaa"></tr> <tr></tr> <tr></tr> <tr cla...
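Assuming the goal is to pick out the `<tr class="aaa">` rows in a table like the one above, here is a stdlib sketch with html.parser; in a Scrapy project this would normally be a one-line CSS selector such as `response.css('tbody tr.aaa')`:

```python
from html.parser import HTMLParser

class RowCollector(HTMLParser):
    """Collects the class attribute of every <tr> inside <tbody>."""
    def __init__(self):
        super().__init__()
        self.in_tbody = False
        self.classes = []

    def handle_starttag(self, tag, attrs):
        if tag == "tbody":
            self.in_tbody = True
        elif tag == "tr" and self.in_tbody:
            # dict(attrs).get("class") is None for rows without a class.
            self.classes.append(dict(attrs).get("class"))

    def handle_endtag(self, tag):
        if tag == "tbody":
            self.in_tbody = False

html = ('<table><thead><tr></tr></thead><tbody>'
        '<tr class="aaa"></tr><tr></tr><tr class="aaa"></tr>'
        '<tr></tr><tr></tr></tbody></table>')

p = RowCollector()
p.feed(html)
print(p.classes)                                   # -> ['aaa', None, 'aaa', None, None]
aaa_rows = [c for c in p.classes if c == "aaa"]
print(len(aaa_rows))                               # -> 2
```

The header row inside `<thead>` is skipped because the collector only records rows while inside `<tbody>`.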
It is said that on_message can do it, but I still can't get it to work in testing. Is there any way to achieve this? def detail_page(self, response): results = json.loads(response.text) for result in results: date = result['date'] number = response.ur...
For example, if you use Python to crawl the on-screen comments (danmaku) of a Douyu live room, do you need to ensure that 100 threads connect at the same time? ...
I recently crawled a video app and got to the last step, but I don't know how to break this encryption ...
It used to be fine, but now it doesn't work. I don't know the reason, and searching Baidu didn't turn up why. Asking the experts for help, thank you. D:\python.ptc > pyspider all ... anaconda\lib\site-packages\pyspider\libs\utils.py ...
<div class="container"> <div class="col-12 col-sm-3"> <p class="title"> 001 </p> </div> <div class="col-12 col-sm-3"> <p class="title"> 999 </p> </div&...
The idea is to first construct the url list all_url, and then: for i in range(0, len(all_url)): urlqueue.put(all_url[i]); after that, get() can pull one url from the queue each time. The problem now is that writing range from 0 to the list length will sh...
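A small sketch of the queue-filling idea: iterating over the list directly sidesteps `range(0, len(all_url))` and indexing entirely (the URLs here are placeholders, not the real target site):

```python
from queue import Queue

# Placeholder URL list; the real all_url would be built from the site's page structure.
all_url = [f"https://example.com/page/{n}" for n in range(1, 4)]

urlqueue = Queue()
for url in all_url:          # no need for range(0, len(all_url)) and all_url[i]
    urlqueue.put(url)

# Each get() pulls one url off the queue, in FIFO order.
first = urlqueue.get()
print(first)                 # -> https://example.com/page/1
print(urlqueue.qsize())      # -> 2  (two urls still waiting)
```

`queue.Queue` is thread-safe, so worker threads can call `get()` concurrently without extra locking.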
The website I'm dealing with now seems to use Distil Networks, an anti-crawler service. To get the data you must carry a cookie; without the cookie, all requests directly return: <!DOCTYPE html> <html> <head> <META NAM...