Why write a web crawler?

Why scrape data?

To quote Wikipedia:

> The key element that distinguishes data scraping from regular parsing is that the output being scraped was intended for display to an *end-user*, rather than as input to another program, and is therefore usually *neither documented nor structured* for convenient parsing.

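* Approach to crawling an entire site: use a graph traversal algorithm over its links
* Approach to crawling updates: find the list pages and refresh them repeatedly to pick up new items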
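
The two ideas above can be sketched roughly as follows. This is a minimal illustration, not a full crawler: it assumes a breadth-first traversal (any graph traversal would do), and the names `crawl_site`, `new_items`, `SITE`, and `toy_get_links` are hypothetical. The toy in-memory `SITE` stands in for real HTTP fetching and HTML link extraction.

```python
from collections import deque
from typing import Callable, Iterable


def crawl_site(start_url: str, get_links: Callable[[str], Iterable[str]]) -> list[str]:
    """Breadth-first traversal of a link graph, visiting each URL once."""
    visited = {start_url}
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in get_links(url):
            if link not in visited:  # skip pages already queued or visited
                visited.add(link)
                queue.append(link)
    return order


def new_items(list_page_links: Iterable[str], seen: set[str]) -> list[str]:
    """Return links that did not appear on earlier refreshes, and remember them."""
    fresh = []
    for link in list_page_links:
        if link not in seen:
            seen.add(link)
            fresh.append(link)
    return fresh


# Assumption: a toy site as an adjacency mapping (page -> links on that page).
SITE = {
    "/": ["/list", "/about"],
    "/list": ["/post/1", "/post/2", "/"],
    "/about": ["/"],
    "/post/1": ["/list"],
    "/post/2": ["/list"],
}


def toy_get_links(url: str) -> list[str]:
    """Hypothetical stand-in for fetching a page and extracting its links."""
    return SITE.get(url, [])
```

In a real crawler, `get_links` would download the page and parse its anchors, and `new_items` would be called with the links from each refresh of a list page, with `seen` persisted between runs.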

How do we find the list pages?
Crawl the entire site, then use machine learning to identify which pages are list pages.
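
The post does not say which features or model to use, so the following is only a guess at what the input to such a classifier might look like. As a stand-in for a trained model, it uses a hand-written rule: a page whose outgoing links mostly share one URL shape (for example `/post/{n}`) is probably a list page. The function names and both thresholds are assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def link_pattern(url: str) -> str:
    """Reduce a URL to its path shape, so /post/1 and /post/2 map to /post/{n}."""
    path = urlparse(url).path
    segments = ["{n}" if seg.isdigit() else seg for seg in path.split("/") if seg]
    return "/" + "/".join(segments)


def list_page_features(links: list[str]) -> dict[str, float]:
    """Features a classifier could consume: link count and dominant-pattern share."""
    patterns = Counter(link_pattern(link) for link in links)
    top = patterns.most_common(1)[0][1] if patterns else 0
    share = top / len(links) if links else 0.0
    return {"n_links": float(len(links)), "top_pattern_share": share}


def looks_like_list_page(links: list[str], min_links: int = 5, min_share: float = 0.6) -> bool:
    """Hand-written rule standing in for a trained model; thresholds are arbitrary."""
    features = list_page_features(links)
    return features["n_links"] >= min_links and features["top_pattern_share"] >= min_share
```

With labeled examples from a full-site crawl, the output of `list_page_features` could be fed to an actual classifier in place of the fixed thresholds.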

About 逸飞

Backend engineer
