When I use Python to crawl a novel website, a few words are always missing in the first few paragraphs. I'm deeply confused.
Crawl address: https://www.biqukan.com/1_109.
The code is as follows:
from bs4 import BeautifulSoup
import requests

if __name__ == "__main__":
    target = "https://www.biqukan.com/1_1094/5403177.html"
    req = requests.get(url=target)
    html = req.text
    bf = BeautifulSoup(html)
    texts = bf.find_all("div", class_="showtxt")
    # print(texts)
    print(texts[0].text.replace("\xa0" * 8, "\n\n"))
Here is the result of my crawl. One word is missing at each spot marked with a red box.
These are the words that are missing, respectively.
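One guess I have not been able to confirm: the page may be encoded as GBK while `req.text` decodes it with a narrower charset such as GB2312, so any character outside that charset gets lost. The sketch below only illustrates that effect offline. The sample sentence and the `decode_page` helper are made up for illustration, and the GBK assumption about the site is mine, not something I have verified.

```python
# Hypothetical sample: a short sentence containing a character (喆) that
# exists in GBK but not in the smaller GB2312 character set.
original = "他叫陶喆，是个歌手。"
raw_bytes = original.encode("gbk")  # stands in for req.content


def decode_page(raw, encoding):
    """Decode raw bytes, marking anything undecodable with U+FFFD."""
    return raw.decode(encoding, errors="replace")


as_gb2312 = decode_page(raw_bytes, "gb2312")  # narrower charset: loses 喆
as_gbk = decode_page(raw_bytes, "gbk")        # wider charset: round-trips

print("gb2312:", as_gb2312)
print("gbk   :", as_gbk)
print("lost characters:", "\ufffd" in as_gb2312)
```

If this is what is happening, then in my script it might be enough to set `req.encoding = "gbk"` (or `req.encoding = req.apparent_encoding`) before reading `req.text`, but I am not sure.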
I'm a beginner and would really appreciate any help from kind people here. Thank you.