Problem description
I want to crawl InfoQ articles, for example the articles under the AI topic, but I can't figure out how the site requests its article list.
I'm using Gecco, a Java crawler framework.
Background and what I have tried
Looking at the XHR requests, the request payload is as follows:
{"type":1,"size":12,"id":31,"score":1546988400000}
This is the request sent on the first load. When I scroll down with the mouse wheel, Ajax loads more articles, and the request is as follows:
{"type":1,"size":12,"id":31,"score":1546495717917}
After that, loading further articles requires clicking the "load more" button, and the request format is the same as above.
All of these requests go to the same address:
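One thing I noticed: the only field that differs between the two payloads is score, and both values look like timestamps in epoch milliseconds. That is only my guess, since I have no documentation for this endpoint. A small Java check that decodes the two values under that assumption (the class and method names are just for illustration):

```java
import java.time.Instant;

public class ScoreDecode {
    // Interprets a score as milliseconds since the Unix epoch.
    // ASSUMPTION: the site has not confirmed that score is a timestamp.
    static String decode(long score) {
        return Instant.ofEpochMilli(score).toString();
    }

    public static void main(String[] args) {
        // score from the first request
        System.out.println(decode(1546988400000L)); // 2019-01-08T23:00:00Z
        // score from the request sent after scrolling
        System.out.println(decode(1546495717917L)); // 2019-01-03T06:08:37.917Z
    }
}
```

If the guess is right, the second score is an earlier point in time than the first, which would fit a cursor that moves back through older articles rather than a scroll distance. I have not verified where the second value comes from.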
https://www.infoq.cn/public/v1/article/getList
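For reference, this is roughly how I would replay the request outside the browser with the JDK's built-in java.net.http client (Java 11+), instead of going through Gecco. It is a minimal sketch with several assumptions: that the request is a POST with a JSON body, that a Content-Type header is enough (the site may also require Referer or cookie headers), and that the response is JSON, whose structure I don't know. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class InfoqListRequest {
    static final String LIST_URL = "https://www.infoq.cn/public/v1/article/getList";

    // Builds the JSON payload in the same field order seen in the XHR request.
    static String buildBody(int type, int size, int id, long score) {
        return "{\"type\":" + type + ",\"size\":" + size
                + ",\"id\":" + id + ",\"score\":" + score + "}";
    }

    // ASSUMPTION: the endpoint expects POST with a JSON body.
    static HttpRequest buildRequest(String body) {
        return HttpRequest.newBuilder(URI.create(LIST_URL))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // Sends the request and returns the raw response text (format unknown to me).
    static String fetch(String body) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildRequest(body), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        String body = buildBody(1, 12, 31, 1546988400000L);
        System.out.println(body);
        // Only touch the network when explicitly asked to.
        if (args.length > 0 && args[0].equals("--send")) {
            System.out.println(fetch(body));
        }
    }
}
```

Running it with --send should show whether the raw response contains the article list and whether any field in it matches the score sent in the next request.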
How does the site decide which articles to load next?
Does it depend on how far the page has been scrolled?
How can I get the full list of articles?