spider tricks


Author: yifei / Created: May 30, 2017, 10:33 a.m. / Modified: May 30, 2017, 10:34 a.m.

http://www.cnblogs.com/jexus/p/5471665.html

  1. Use your users' user agents (their browsers) as crawl nodes
  2. Use free online resources that have a web interface as nodes
  3. For metadata, consult Google or other competing websites
  4. Reverse engineer the site template

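Page types

  1. Static page
  2. Dynamic page, content held in a JS array
  3. Dynamic page, content held in a JS template
  4. Dynamic page, data loaded via JSONP
  5. Dynamic page, data loaded via JSON
  6. Dynamic page, data loaded via WebSocket (ws)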
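
For types 2 and 4, the data can often be recovered without running a browser. The following is a minimal sketch, not a general solution: the helper names `unwrap_jsonp` and `extract_js_array`, the callback name `cb`, and the variable name `items` are all hypothetical, and it assumes the array literal is JSON-compatible (double-quoted keys, no trailing commas).

```python
import json
import re


def unwrap_jsonp(text):
    """Strip the callback wrapper from a JSONP body and parse the payload."""
    # Matches e.g. `cb({...});` and captures what is between the outer parentheses.
    match = re.match(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", text, re.DOTALL)
    if match is None:
        raise ValueError("not a JSONP response")
    return json.loads(match.group(1))


def extract_js_array(html, var_name):
    """Parse an array literal assigned to `var_name` in an inline script."""
    # Assumption: the assignment ends with `];` and no string inside contains `];`.
    pattern = (
        r"\b(?:var|let|const)\s+" + re.escape(var_name) + r"\s*=\s*(\[.*?\])\s*;"
    )
    match = re.search(pattern, html, re.DOTALL)
    if match is None:
        raise ValueError("variable %r not found" % var_name)
    return json.loads(match.group(1))


if __name__ == "__main__":
    # Invented sample inputs, for illustration only.
    print(unwrap_jsonp('cb({"a": 1});'))
    page = '<script>var items = [{"id": 1}, {"id": 2}];</script>'
    print(extract_js_array(page, "items"))
```

Type 5 needs no unwrapping (the response body goes straight to `json.loads`), while types 3 and 6 usually require rendering the template or speaking the WebSocket protocol, which this sketch does not cover.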
