Crawler gets a 302 redirect

When the crawler starts, it is redirected to an error page. What should I do?
http://www.gzcc.gov.cn/data/l.
The crawler's error log is in the attached screenshot (clipboard.png).

Mar.11,2021

Open the web page in a browser, check which request headers it sends, and then configure the same request headers in your crawler.


This kind of website has only light anti-crawling measures. If you just want to get the page source, disguising your request headers is enough. These are the headers the browser sends:

Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate, sdch
Accept-Language:zh-CN,zh;q=0.8
Cache-Control:max-age=0
Connection:keep-alive
Cookie:ASP.NET_SessionId=bqtygl55xovvgp45ajwmuj45
DNT:1
Host:www.gzcc.gov.cn
Upgrade-Insecure-Requests:1
User-Agent:Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0
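
A header block like the one above can be pasted as text and converted into the dict that a crawler expects. The following is only a minimal sketch: `parse_raw_headers` is a hypothetical helper name, and the sample uses just three of the lines above. The `Cookie` value in the dump is tied to one browser session, so it is assumed to be stale and is left out.

```python
def parse_raw_headers(raw):
    """Turn 'Name:value' lines copied from the browser into a dict."""
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        # Split on the first colon only, since values may contain colons.
        name, value = line.split(":", 1)
        headers[name.strip()] = value.strip()
    return headers


# Three lines taken from the header dump above.
RAW = """
Accept-Language:zh-CN,zh;q=0.8
Connection:keep-alive
Host:www.gzcc.gov.cn
"""

headers = parse_raw_headers(RAW)
```

The resulting `headers` dict can be passed as the `headers=` argument of `requests.get`.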

Of course, in practice you don't need to fill in that many headers. Take Python's requests library as an example:

import requests

url = 'http://www.gzcc.gov.cn/data/l.'  # URL as given in the question (truncated there)
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0'}
# json
resp = requests.get(url=url, headers=header)
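
To check whether the disguised header really avoids the 302, it can help to stop requests from following redirects and look at the raw status code and `Location` header. This is a rough sketch rather than a definitive check: `redirect_target` is a hypothetical helper, and `/error.html` in the comment is a made-up path used only for illustration.

```python
REDIRECT_CODES = (301, 302, 303, 307, 308)


def redirect_target(status_code, headers):
    """Return the Location target if the response is a redirect, else None."""
    if status_code in REDIRECT_CODES:
        return headers.get("Location")
    return None


# Usage with requests (needs network access, so it is left commented out):
# import requests
# resp = requests.get(url, headers=header, allow_redirects=False)
# print(resp.status_code, redirect_target(resp.status_code, resp.headers))
#
# A 302 with a Location such as '/error.html' (made-up path) would mean the
# site still rejects the request; a 200 would mean the header was accepted.
```

Passing `allow_redirects=False` matters here: by default `requests.get` follows the redirect, so `resp` would already be the error page and the 302 would only appear in `resp.history`.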

Disguise the request headers.
