I have just started learning Python, following https://blog.csdn.net/mtbaby/.
I wanted to crawl Xiaozhu short-term rental listings, but my IP got blocked.
I then looked into using a proxy IP, but I still can't get the information.
import time

import requests
from lxml import etree

# Proxy used for plain-HTTP requests
proxies = {
    "http": "http://61.135.217.7:80",
}
user_agent = "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.162 Safari/537.36"
url = "http://hz.xiaozhu.com/"
headers = {"User-Agent": user_agent}
data = requests.get(url, headers=headers, proxies=proxies).text
h = etree.HTML(data)
# Single quotes around the XPath so the inner double quotes do not end the string
home = h.xpath('//*[@id="page_list"]/ul/li')
time.sleep(2)
for div in home:
    # Query relative to the current <li> element, not the whole document
    title = div.xpath("./div[2]/div/a/span/text()")[0]
    price = div.xpath("./div[2]/span[1]/i/text()")[0]
    print("{}-->{}".format(title, price))
The running result is as follows:
I hope someone can help me solve this. Thank you very much!
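One thing that can be checked without the network or a proxy is the extraction logic itself. The sketch below is only an illustration: it uses the standard library's xml.etree.ElementTree instead of lxml so it runs with no extra packages, and the SAMPLE markup and the extract_listings helper are made up for this example; they imitate the paths used above and are not the real page. The point it shows is that the per-item queries are run on each li element rather than on the whole document. With lxml the same idea is div.xpath(...) inside the loop.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written markup that only imitates the listing structure;
# the real page's HTML may differ.
SAMPLE = """
<html><body>
<div id="page_list"><ul>
<li><div>img</div><div><span><i>198</i></span><div><a><span>Room A</span></a></div></div></li>
<li><div>img</div><div><span><i>268</i></span><div><a><span>Room B</span></a></div></div></li>
</ul></div>
</body></html>
"""


def extract_listings(markup):
    """Return (title, price) pairs found under the element with id="page_list"."""
    root = ET.fromstring(markup)
    results = []
    for li in root.findall(".//*[@id='page_list']/ul/li"):
        # Paths start from the current <li>, so each item yields its own values.
        title = li.find("./div[2]/div/a/span")
        price = li.find("./div[2]/span[1]/i")
        if title is None or price is None:
            continue  # skip items that do not match the expected layout
        results.append((title.text.strip(), price.text.strip()))
    return results


if __name__ == "__main__":
    for title, price in extract_listings(SAMPLE):
        print("{}-->{}".format(title, price))
```

If this works on a saved copy of the page but the live request still returns nothing, the problem is more likely the response itself (a block page or a dead proxy) than the XPath.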