PHP cURL returns 403 when crawling a search results page
I am building a movie search feature and need to crawl the search results from btso. No matter how I configure cURL, the request comes back with a 403. I have also tried adding request headers, but that did not help either. Any help would be appreciated.
Thank you!
$url = "https://btso.pw/search/".$keyword;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, FALSE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_REFERER, "https://btso.pw/search/");
dd(curl_exec($ch));
First, capture the request your browser sends when you open the page normally (for example, with "Copy as cURL" in the developer tools' network panel):
curl 'https://btso.pw/search/FSET-391' \
  -H 'Connection: keep-alive' \
  -H 'Pragma: no-cache' -H 'Cache-Control: no-cache' \
  -H 'Upgrade-Insecure-Requests: 1' -H 'DNT: 1' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8' \
  -H 'Referer: https://btso.pw/search/' \
  -H 'Accept-Encoding: gzip, deflate, br' \
  -H 'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,zh-TW;q=0.7' \
  -H 'Cookie: AD_enterTime=1540175511; AD_adca_b_SM_T_728x90=0; AD_jav_b_SM_T_728x90=0; AD_javu_b_SM_T_728x90=0; AD_wav_b_SM_T_728x90=0; AD_adst_b_SM_T_728x90=1; AD_exoc_b_SM_T_728x90=1; AD_adma_b_POPUNDER=2' \
  --compressed
Then convert that request into PHP:
$headers = array(
'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
'Cookie: AD_enterTime=1540175511; AD_adca_b_SM_T_728x90=0; AD_jav_b_SM_T_728x90=0; AD_javu_b_SM_T_728x90=0; AD_wav_b_SM_T_728x90=0; AD_adst_b_SM_T_728x90=1; AD_exoc_b_SM_T_728x90=1; AD_adma_b_POPUNDER=2',
'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,zh-TW;q=0.7',
'Upgrade-Insecure-Requests: 1',
'Connection: keep-alive',
'Referer: https://btso.pw/search/'
);
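$url = 'https://btso.pw/search/FSET-391';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
curl_setopt($ch, CURLOPT_HEADER, 0); // leave response headers out of the returned string
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); // send the headers captured from the browser
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // turns off certificate verification; avoid in production
$response = curl_exec($ch);
if ($response === false) {
    var_dump(curl_error($ch)); // transport-level failure (DNS, TLS, timeout)
}
var_dump($response);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
var_dump($httpCode); // 200 on success, 403 if the request is still rejected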
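To reuse this with an arbitrary keyword, as in the question, it may help to wrap the request in a small function. The following is only a sketch: `search_url()` and `fetch_search()` are hypothetical helper names, and the 15-second timeout is an arbitrary choice. It percent-encodes the keyword with `rawurlencode()` and sets `CURLOPT_ENCODING` to an empty string so that cURL advertises the compression it supports and decodes the response itself, which roughly mirrors the `--compressed` flag in the copied command. Whether the site still answers 403 depends on its own checks, so treat this as a starting point.

```php
<?php
// Sketch only: search_url() and fetch_search() are hypothetical helper names.

/** Build the search URL, percent-encoding the keyword so spaces, slashes or non-ASCII text stay valid. */
function search_url(string $keyword, string $base = 'https://btso.pw/search/'): string
{
    return $base . rawurlencode($keyword);
}

/**
 * Fetch a search page with browser-like headers.
 * Returns [HTTP status code, body]; the body is null if the transfer itself failed.
 */
function fetch_search(string $keyword, array $headers): array
{
    $ch = curl_init(search_url($keyword));
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,     // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects
        CURLOPT_HTTPHEADER     => $headers, // headers captured from the browser
        CURLOPT_ENCODING       => '',       // accept any supported compression and decode it
        CURLOPT_TIMEOUT        => 15,       // arbitrary limit so a stalled request cannot hang
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    return [$status, $body === false ? null : $body];
}
```

With the `$headers` array from above, a call would look like `[$status, $html] = fetch_search('FSET-391', $headers);`, after which `$status` can be checked for 200 before parsing `$html`.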