Scraping web page content with Python

I want to scrape the read count shown at the end of the article at http://blog.sina.com.cn/s/blo.

Read (332)  Comments (0)  Favorites (0)

Web page source code:

<span id="r_6f72ff900102xqgi" class="SG_txtb"></span>

The span is empty: there is no value in the page source.

I am using the pyquery library, as follows:

from pyquery import PyQuery as pq

url = "http://blog.sina.com.cn/s/blog_6f72ff900102xqgi.html"
doc = pq(url=url, encoding="utf-8")
print(doc("#r_6f72ff900102xqgi"))

Output of the code:

<span id="r_6f72ff900102xqgi" class="SG_txtb"></span>

What do I need to do to get the read count from the page?

Mar.01,2021

I have seen read counts like this before on some video pages; the value may be stored in JSON returned by a separate request. Open the developer tools (F12) to see whether any JSON data has been returned.


First confirm whether the read count is dynamic data (fetched by an asynchronous request, such as AJAX) or static data (loaded and rendered synchronously with the page).

  • If it is dynamic data, you can try to simulate the request yourself to get the back-end data. Press F12 and look at all the requests the page sends and the data they return.
  • If it is static data, you can extract it from the HTML with a regular expression after the page has been fetched.
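
One way to tell the two cases apart is to run the regular expression against the raw HTML and see whether the element has any text. The following is only a minimal sketch using the standard library; the helper name span_text is made up for illustration, and the sample HTML is the span quoted in the question.

```python
import re

def span_text(html, element_id):
    """Return the text inside <span id="element_id">...</span>, or None if no such span exists.

    span_text is a hypothetical helper name, not part of any library.
    """
    pattern = r'<span[^>]*\bid="' + re.escape(element_id) + r'"[^>]*>(.*?)</span>'
    match = re.search(pattern, html, re.S)
    if match is None:
        return None
    return match.group(1).strip()

# The span exactly as it appears in the page source quoted in the question.
static_html = '<span id="r_6f72ff900102xqgi" class="SG_txtb"></span>'

text = span_text(static_html, "r_6f72ff900102xqgi")
if text:
    print("static value:", text)
else:
    # An empty or missing span suggests the value is filled in later by JavaScript.
    print("span is empty or missing: the value is probably loaded dynamically")
```

If the span already contained the count in the raw HTML, the same function would return it directly and no extra request would be needed.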

This should be dynamic data fetched by JavaScript.

The request is similar to the following address:
http://comet.blog.sina.com.cn.

It returns this result:

requestId_57944281= {"pv": 773757, "av": 362}

The read count shown on the page corresponds to the av value in this response.
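
As a rough sketch of how this could be done in Python, the response body can be stripped down to its JSON object and parsed. The function names parse_counts and fetch_counts are hypothetical, and the full request URL is not filled in here: it is assumed that you copy the complete address from the F12 network tab and pass it in. The sample string is the response shown above.

```python
import json
import re
import urllib.request

def parse_counts(body):
    """Extract the JSON object from a body such as 'requestId_1= {...}' and parse it."""
    match = re.search(r"\{.*\}", body, re.S)
    if match is None:
        raise ValueError("no JSON object found in response body")
    return json.loads(match.group(0))

def fetch_counts(url):
    """Hypothetical helper: request the counter URL copied from the browser's network tab.

    The URL is an assumption supplied by the caller; the site may also
    expect extra headers that are not covered in this sketch.
    """
    with urllib.request.urlopen(url, timeout=10) as response:
        body = response.read().decode("utf-8", errors="replace")
    return parse_counts(body)

# The response text quoted in the answer above.
sample = 'requestId_57944281= {"pv": 773757, "av": 362}'
counts = parse_counts(sample)
print(counts["av"])  # prints 362
```

Only parse_counts is exercised here; fetch_counts is left uncalled because it needs the complete request address.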
