DSL
Tony Shen edited this page Aug 3, 2019 · 13 revisions
When writing crawlers in Kotlin, you can use a DSL to define NetDiscovery components.
Defining a Request
val request = request {
    url = "https://www.baidu.com/"
    httpMethod = HttpMethod.GET
    spiderName = "tony"
    header {
        "111" to "2222"
        "333" to "44444"
    }
    extras {
        "tt" to "qqq"
    }
}
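The `header { "111" to "2222" }` syntax above relies on two Kotlin features: lambdas with receivers and a scoped `infix` function. The following is a minimal sketch of how such a builder could be written. `RequestSketch`, `HeaderScope`, and `requestSketch` are hypothetical names used for illustration only; they are not NetDiscovery's actual classes, and the library's real implementation may differ.

```kotlin
// Hypothetical stand-ins for illustration; not NetDiscovery's real classes.
class HeaderScope {
    val entries = mutableMapOf<String, String>()

    // Member extension: inside a HeaderScope lambda it takes precedence over
    // the standard library's `to`, so `"k" to "v"` stores the pair instead of
    // creating a Pair that is thrown away.
    infix fun String.to(value: String) {
        entries[this] = value
    }
}

class RequestSketch {
    var url: String = ""
    var spiderName: String = ""
    val headers = mutableMapOf<String, String>()

    // Runs the block against a fresh HeaderScope and copies the result.
    fun header(init: HeaderScope.() -> Unit) {
        headers.putAll(HeaderScope().apply(init).entries)
    }
}

// Entry point of the DSL: the block runs with a RequestSketch as `this`.
fun requestSketch(init: RequestSketch.() -> Unit): RequestSketch =
    RequestSketch().apply(init)

fun main() {
    val r = requestSketch {
        url = "https://www.baidu.com/"
        spiderName = "tony"
        header {
            "111" to "2222"
            "333" to "44444"
        }
    }
    println(r.headers)
}
```

Declaring `to` as a member extension keeps the override local to the `header` block; outside it, `to` still builds an ordinary `Pair`.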
Defining a Spider
val spider = spider {
    name = "tony"
    urls = listOf("http://www.163.com/", "https://www.baidu.com/")
    pipelines = listOf(ConsolePipeline())
}
spider.run()
It is equivalent to the following Java code:
Spider.create().name("tony")
    .url("http://www.163.com/", "https://www.baidu.com/")
    .pipeline(new ConsolePipeline())
    .run();
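One way to see why the two forms are equivalent is to treat the DSL function as a thin wrapper that collects properties and then replays them onto a fluent builder. The sketch below illustrates that idea under stated assumptions: `FluentSpider`, `SpiderSpec`, and `spiderSketch` are hypothetical names, and nothing here is taken from NetDiscovery's source.

```kotlin
// Hypothetical fluent builder standing in for a Java-style chained API.
class FluentSpider {
    var spiderName: String = ""
        private set
    val targets = mutableListOf<String>()

    fun name(n: String): FluentSpider = apply { spiderName = n }
    fun url(vararg u: String): FluentSpider = apply { targets.addAll(u) }
}

// Plain property holder that the DSL block fills in.
class SpiderSpec {
    var name: String = ""
    var urls: List<String> = emptyList()
}

// The DSL entry point: run the block, then forward each property
// to the corresponding fluent call.
fun spiderSketch(init: SpiderSpec.() -> Unit): FluentSpider {
    val spec = SpiderSpec().apply(init)
    return FluentSpider()
        .name(spec.name)
        .url(*spec.urls.toTypedArray())
}

fun main() {
    val s = spiderSketch {
        name = "tony"
        urls = listOf("http://www.163.com/", "https://www.baidu.com/")
    }
    println("${s.spiderName}: ${s.targets}")
}
```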
Defining a SpiderEngine
val spiderEngine = spiderEngine {
    port = 7070
    addSpider {
        name = "tony1"
    }
    addSpider {
        name = "tony2"
        urls = listOf("https://www.baidu.com")
    }
}
val spider = spiderEngine.getSpider("tony1")
spider.repeatRequest(10000, "https://github.com/fengzhizi715")
    .initialDelay(10000)
spiderEngine.run()
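The nested `addSpider { ... }` blocks above follow the usual Kotlin pattern of builders inside builders. The sketch below shows one possible shape for such a DSL, with `@DslMarker` added so that an inner block cannot silently set properties of the outer one. `EngineSketch`, `SpiderEntry`, `engineSketch`, and the default port are all assumptions made for illustration; whether NetDiscovery itself uses `@DslMarker` is not stated in this page.

```kotlin
// Marks the builder classes so implicit access to an outer receiver
// from an inner block is a compile-time error.
@DslMarker
annotation class EngineDsl

// Hypothetical per-spider settings holder.
@EngineDsl
class SpiderEntry {
    var name: String = ""
    var urls: List<String> = emptyList()
}

// Hypothetical engine builder; the default port is an arbitrary placeholder.
@EngineDsl
class EngineSketch {
    var port: Int = 8080
    private val spiders = linkedMapOf<String, SpiderEntry>()

    // Each call builds one entry and registers it under its name.
    fun addSpider(init: SpiderEntry.() -> Unit) {
        val entry = SpiderEntry().apply(init)
        spiders[entry.name] = entry
    }

    // Returns null when no spider was registered under that name.
    fun getSpider(name: String): SpiderEntry? = spiders[name]
}

fun engineSketch(init: EngineSketch.() -> Unit): EngineSketch =
    EngineSketch().apply(init)

fun main() {
    val engine = engineSketch {
        port = 7070
        addSpider { name = "tony1" }
        addSpider {
            name = "tony2"
            urls = listOf("https://www.baidu.com")
        }
    }
    println(engine.getSpider("tony2")?.urls)
}
```

Registering spiders by name is what makes a later lookup such as `getSpider("tony1")` possible after the block has finished.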