downdawn/amazon_spider

amazon_spider

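The middlewares module adds a RandomUserAgentMiddleware class that sets a random User-Agent on requests. Truly large-scale crawling will run into IP-based anti-scraping measures; extend the project with your own IP pool or use a paid proxy service. That topic is not covered in detail here.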
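As a rough illustration of the idea, a random User-Agent downloader middleware could look like the sketch below. This is not the repository's actual code: the `USER_AGENTS` sample pool and the `USER_AGENT_LIST` setting name are assumptions made for the example, and only the class name comes from the README.

```python
import random

# Hypothetical sample pool (an assumption, not taken from the repository).
# Replace it with a larger, up-to-date list in real use.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


class RandomUserAgentMiddleware:
    """Downloader middleware that picks a random User-Agent per request."""

    def __init__(self, user_agents=None):
        # Fall back to the sample pool when no list is supplied.
        self.user_agents = list(user_agents or USER_AGENTS)

    @classmethod
    def from_crawler(cls, crawler):
        # USER_AGENT_LIST is a hypothetical custom setting name.
        return cls(crawler.settings.getlist("USER_AGENT_LIST") or None)

    def process_request(self, request, spider):
        # Overwrite the header so each request may present a different agent.
        request.headers["User-Agent"] = random.choice(self.user_agents)
        # Returning None lets Scrapy continue processing the request.
        return None
```

To enable a middleware like this, it would be registered under `DOWNLOADER_MIDDLEWARES` in the project settings, using whatever module path the project actually has. Scrapy's built-in `scrapy.downloadermiddlewares.useragent.UserAgentMiddleware` can be mapped to `None` there if it should be switched off.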

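To be polite to the site, set DOWNLOAD_DELAY in settings to add a delay between downloads. To increase crawl speed, you can distribute the crawl with scrapy_redis.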
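A minimal settings sketch for these two points might look like the following. The delay value and the Redis URL are illustrative assumptions rather than the repository's configuration, and the scrapy_redis lines only take effect if that package is installed and a Redis server is reachable.

```python
# settings.py (fragment): example values only, not the project's real config.

# Wait roughly this many seconds between requests to the same site.
DOWNLOAD_DELAY = 2
# Vary the actual wait between 0.5x and 1.5x of DOWNLOAD_DELAY.
RANDOMIZE_DOWNLOAD_DELAY = True

# Optional: share the request queue and duplicate filter through Redis so
# that several spider processes can work on the same crawl (scrapy_redis).
SCHEDULER = "scrapy_redis.scheduler.Scheduler"
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"
# Keep the queue in Redis between runs so a crawl can be resumed.
SCHEDULER_PERSIST = True
# Assumed local Redis instance; point this at your own server.
REDIS_URL = "redis://localhost:6379"
```

A longer delay reduces load on the target site at the cost of throughput. Distributing the crawl spreads requests across workers, but each worker still honours its own delay.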

About

A Scrapy-based Amazon spider
