A Kafka Connect source connector that reads items from Hacker News and streams them into Kafka. Because why not ¯\_(ツ)_/¯
This source connector reads items (stories, comments, jobs, Ask HNs, polls) from Hacker News via the official API (https://github.com/HackerNews/API). Items are read serially by item id, starting from initial.start.item (defaults to 1). Currently, only a single connector task is supported.
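As a rough illustration of what reading items serially means, the sketch below builds the per-item URLs of the public Hacker News API for a run of consecutive ids. The endpoint shape (`/v0/item/<id>.json`) comes from the HackerNews/API documentation; the class and method names (`HnItemUrls`, `itemUrl`, `nextBatch`) are hypothetical and are not taken from this connector's source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the connector.
final class HnItemUrls {
    // Base URL of the public Hacker News API.
    static final String BASE_URL = "https://hacker-news.firebaseio.com/v0";

    // URL of a single item (story, comment, job, poll, ...) by id.
    static String itemUrl(long id) {
        return BASE_URL + "/item/" + id + ".json";
    }

    // URLs for `count` consecutive items, beginning at `startItem`.
    static List<String> nextBatch(long startItem, int count) {
        List<String> urls = new ArrayList<>();
        for (long id = startItem; id < startItem + count; id++) {
            urls.add(itemUrl(id));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Print the URLs of the first three items, as if initial.start.item were 1.
        nextBatch(1, 3).forEach(System.out::println);
    }
}
```

A real task would fetch each URL in turn and turn the JSON response into a Kafka record; that part is omitted here.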
Run mvn clean package from the repo's root, then unzip the archive created in target/components/packages/ into any directory on your Connect worker's plugin path.
These are the supported configs:
Name | Description | Type | Importance |
---|---|---|---|
kafka.topic | Topic to write to | String | High |
poll.interval.ms | Interval between polls (ms) | Long | High |
initial.start.item | Hacker News item id to start reading from | Long | Medium |
max.items | Maximum number of items to read from Hacker News; a value less than 1 means unlimited | Long | Medium |
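The max.items rule (a value below 1 means no limit) can be sketched as a small predicate. This is only an assumption about how such a check might look; `MaxItemsLimit` and `shouldReadMore` are hypothetical names, not the connector's actual code.

```java
// Hypothetical illustration of the max.items semantics; not the connector's code.
final class MaxItemsLimit {
    // Returns true while the task should keep reading.
    // maxItems < 1 is treated as "unlimited", as described in the table above.
    static boolean shouldReadMore(long maxItems, long itemsRead) {
        return maxItems < 1 || itemsRead < maxItems;
    }

    public static void main(String[] args) {
        long maxItems = 3;   // stand-in for the max.items config value
        long itemsRead = 0;
        long nextId = 1;     // stand-in for initial.start.item
        while (shouldReadMore(maxItems, itemsRead)) {
            System.out.println("would read item " + nextId);
            nextId++;
            itemsRead++;
        }
    }
}
```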
An example config for this connector:
{
"name": "HN",
"connector.class": "com.github.yashmayya.kafka.connect.hackernews.HackerNewsSourceConnector",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "false",
"kafka.topic": "hn-items",
"poll.interval.ms": "100",
"initial.start.item": "1"
}
- Implement offset tracking and recovery
- Support dynamic reloading of max item id so that the connector can run forever
- Add support for schemas