I once wrote an article on crawling Weibo with Puppeteer. It drives a real browser and fully simulates user behavior, so the crawler is not detected and blocked.
You can take a look at this library.
Crawling Weibo is not allowed; please read Weibo's user agreement carefully. So if you do it, do it quietly rather than with so much fanfare.
Java
I have never crawled Weibo, but the idea is to first obtain the authentication cookie, token, and so on, then capture traffic with Fiddler (mainly to find the endpoints that request data), and then parse the Weibo content with Jsoup and persist it.
As for the source, there should be an app API, a PC page, and an H5 (mobile web) page; see which one is easiest to work with.
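To make the "get the cookie first, then call the data endpoint" idea concrete, here is a minimal sketch using only the JDK's `java.net.http` (Java 11+). The URL and cookie value are placeholders I made up, not real Weibo endpoints or cookie names; you would substitute whatever Fiddler shows for your own session. It only builds the request and does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

public class CookieRequestSketch {
    // Build a GET request that carries a cookie string copied from a logged-in session.
    // Both arguments are supplied by the caller; nothing here is Weibo-specific.
    static HttpRequest buildRequest(String url, String cookie) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .header("Cookie", cookie)            // authentication cookie captured earlier
                .header("User-Agent", "Mozilla/5.0") // look like a normal browser
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical endpoint and cookie, for illustration only.
        HttpRequest req = buildRequest("https://example.com/api/timeline?page=1",
                                       "SESSION=placeholder");
        System.out.println(req.method() + " " + req.uri());
        System.out.println(req.headers().firstValue("Cookie").orElse("(none)"));
    }
}
```

Sending it would be a call to `HttpClient.newHttpClient().send(req, HttpResponse.BodyHandlers.ofString())`; the returned HTML could then be handed to Jsoup for parsing.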
I previously wrote a simulated login in Java and crawled my own private messages.
Because I was lazy, instead of using Weibo's API,
I used Fiddler to capture packets, analyze the parameters, simulate a browser login, send the requests, and parse the JSON.
The disadvantage is that this approach is fairly passive: as soon as the other side changes a parameter, the program stops working.
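Roughly, replaying a captured login means re-encoding the form parameters you saw in Fiddler and POSTing them back. The sketch below shows only that step, with the JDK's own classes (Java 11+). The parameter names `username` and `password` and the URL are hypothetical; the real login form will have different and probably more fields, which is exactly why this approach is fragile.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class FormReplaySketch {
    // Encode parameters as application/x-www-form-urlencoded, keeping map order.
    static String encodeForm(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Build (but do not send) a POST request carrying the encoded form body.
    static HttpRequest buildPost(String url, Map<String, String> params) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodeForm(params)))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical parameter names and values, for illustration only.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("username", "me@example.com");
        params.put("password", "secret value");
        System.out.println(encodeForm(params));
        System.out.println(buildPost("https://example.com/login", params).method());
    }
}
```

If the site renames or adds a single parameter, `params` has to be updated by hand after another capture session, which is the "passive" drawback mentioned above.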
If I were asked to write another one now, I would write a Chrome extension instead. It runs inside the browser, after all, so there is no need to worry about authentication; you just crawl.
If you can't be bothered to write an extension, you can take a look at this:
"Without writing code: Webscraper grabs Li Xiaolai in 30 seconds"
Weibo has its own open platform, and you can get the data through Weibo's API. There is no need to use a crawler.