Question about choosing a crawler.

I need to crawl an article from a website, including all of its JS, CSS, and HTML files, and save it as a local copy. The article content is loaded asynchronously through AJAX. Which approach is better for this kind of requirement? Scrapy with Splash and Puppeteer seem similar in principle. Are there any other frameworks besides these two that fit my needs? I am choosing between Node and Python and would appreciate advice.

Apr.19,2022

Selenium works well, although it is inefficient.


If the articles are fetched through AJAX, why not just call that interface directly?
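A rough sketch of what calling the interface directly could look like in Python, using only the standard library. The endpoint URL and the `title` and `content` keys are assumptions for illustration; the real ones have to be read from the browser's Network tab for the site in question.

```python
import json
import urllib.request
from html import escape


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """GET the AJAX endpoint and decode its JSON body."""
    request = urllib.request.Request(
        url,
        headers={
            # Some sites only answer requests that look like browser AJAX calls.
            "X-Requested-With": "XMLHttpRequest",
            "User-Agent": "Mozilla/5.0",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def article_to_html(payload: dict) -> str:
    """Wrap the article fields in a minimal standalone HTML page.

    The keys "title" and "content" are assumed, not taken from any real API.
    """
    title = escape(payload["title"])
    body = payload["content"]  # assumed to already be an HTML fragment
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{title}</title></head>\n'
        f"<body><h1>{title}</h1>\n{body}\n</body></html>\n"
    )


# Usage (hypothetical endpoint):
#   data = fetch_json("https://example.com/api/article?id=1")
#   with open("article.html", "w", encoding="utf-8") as fh:
#       fh.write(article_to_html(data))
```

If the interface returns everything the article needs, this avoids running a browser at all. It does not help when the endpoint requires signed parameters or cookies that only the page's own JavaScript can produce.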


In the end, I chose Puppeteer.


I think the old-school combination of Scrapy and bs4 would also work here.


For dynamic web pages loaded through AJAX, Selenium is recommended.
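Whichever tool renders the page (Selenium, Puppeteer, or Splash), the question also asks to save the JS and CSS files along with the HTML. One possible way to do that step, sketched with Python's standard library: once you have the rendered HTML as a string, collect the asset URLs and download them separately. The class and function names below are made up for this example.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collect absolute URLs of scripts, stylesheets and images in a page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.assets: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        url = None
        if tag in ("script", "img"):
            url = attributes.get("src")
        elif tag == "link":
            # Only keep <link> tags whose rel includes "stylesheet".
            rel = (attributes.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                url = attributes.get("href")
        if url:
            # Resolve relative paths against the page URL; skip duplicates.
            absolute = urljoin(self.base_url, url)
            if absolute not in self.assets:
                self.assets.append(absolute)


def collect_assets(html: str, base_url: str) -> List[str]:
    """Return the asset URLs referenced by the given HTML, in document order."""
    parser = AssetCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.assets
```

Each URL in the returned list can then be fetched and written next to the saved HTML. This sketch ignores assets referenced from inside CSS files (fonts, background images), which would need a second pass.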
