Protocols
Sebastian Nagel edited this page Jul 16, 2025 · 7 revisions
The following network protocols are implemented in StormCrawler:
See HTTPProtocol for the effect of metadata content on protocol behaviour.
To change the protocol implementation, add the following lines to your crawler-conf.yaml:
http.protocol.implementation: "org.apache.stormcrawler.protocol.okhttp.HttpProtocol"
https.protocol.implementation: "org.apache.stormcrawler.protocol.okhttp.HttpProtocol"
| Feature | HttpClient | OkHttp | Selenium |
|---|---|---|---|
| Basic authentication | Y | Y | N |
| proxy (with credentials?) | Y / Y | Y / Y | ? |
| interruptible / trimmable #463 | N / Y | Y / Y | Y / N |
| cookies | Y | Y | N |
| response headers | Y | Y | N |
| trust all certificates | N | Y | N |
| HEAD method | Y | Y | N |
| POST method | N | Y | N |
| verbatim response header | Y | Y | N |
| verbatim request header | N | Y | N |
| IP address capture | N | Y | N |
| navigation and javascript | N | N | Y |
| HTTP/2 | N | Y | (Y) |
| configurable connection pool | N | Y | N |
- OkHttp: the protocol supports HTTP/2 if the JDK includes ALPN (Java 9 and later, or Java 8 builds released from early/mid 2020 onwards).
- HttpClient: HTTP/2 is not yet supported.
- Selenium: whether HTTP/2 is used depends on the driver in use.
Since #829, the HTTP protocol versions used are configurable via http.protocol.versions (see also the comments in crawler-default.yaml). E.g., to force the use of HTTP/1.1 only:
http.protocol.versions:
- "http/1.1"
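Conversely, a configuration that allows HTTP/2 but falls back to HTTP/1.1 might look like the sketch below. It rests on two assumptions not confirmed on this page: that the list is read in order of preference, and that `h2` is the identifier for HTTP/2 over TLS. Check both against the comments in crawler-default.yaml before relying on it. Per the feature table, this would only have an effect with the OkHttp implementation.

```yaml
# Sketch only: the identifiers and their ordering semantics are assumptions;
# verify them against the comments in crawler-default.yaml.
http.protocol.versions:
  # assumed identifier for HTTP/2 over TLS (negotiated via ALPN)
  - "h2"
  # fallback for servers that do not offer HTTP/2
  - "http/1.1"
```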