
[Feature Request] Support writing metrics to Kafka #343

Open
bk-khaled opened this issue May 9, 2018 · 7 comments · May be fixed by #522
Labels
enhancement output-db Related to new output devices to send metrics

Comments

@bk-khaled

Hi,

I think it would be great if snmpcollector could send metrics to Kafka.
Influx is great, but this way we can have more flexibility and control over the data flow.

Best regards

@toni-moreno
Owner

Hi @bk-khaled, we would like to add this kind of new feature, but I would like to understand your use case.

Why do you need Kafka between snmpcollector and InfluxDB? How much data ingestion are you currently generating that InfluxDB is not able to support?

I think you would be able to do this kind of data transformation with telegraf. Did you test it before?

@lerela
Contributor

lerela commented Jun 5, 2018

Hi @toni-moreno. After discussing this with @bk-khaled, the goal isn't necessarily to address a data ingestion issue. For instance, a few use cases:

  • need for clustering & redundancy (without InfluxDB Enterprise): Kafka can be clustered and ensures reliable persistence to subscribers,
  • storing the data elsewhere: the overall architecture would be simpler if the flow is snmpcollector -> kafka -> logstash -> ... (for instance), instead of snmpcollector -> influxdb -> telegraf -> kafka -> logstash -> ...,
  • more broadly, Kafka offers interesting features, so being able to push measurements directly to topics would allow finer control of the data flow (and of course data could also be pushed to influxdb after kafka, making kafka the message broker and influx one consumer among others)

What are your thoughts?

@lerela
Contributor

lerela commented Aug 22, 2018

@toni-moreno another use case: we are collecting many other, non-SNMP metrics and need a message broker to isolate the DB and to feed the stream processing engine (as in most metrics infrastructures I know of); it would make a lot of sense to feed all our metrics into the same solution and process them with the same pipeline.

I understand that you don't have time to devote to this task right now, so we might try to implement it ourselves, but could you share some insights or implementation tips so that I can assess the feasibility of adding a Kafka sink? E.g., what in your opinion would need to be done so that both InfluxDB and Kafka can be supported in a "clean" way?

Many thanks.

@toni-moreno
Owner

toni-moreno commented Aug 23, 2018

Hi @lerela. You are right, I'm sorry but I don't have enough time, but you are welcome to add this new feature if you need it.

This new feature will need a lot of redesign and refactoring; the following steps would be one way to do it.

1.- Refactor the output configuration and its store.

This means changing the way snmpcollector stores config data and lets users configure, from the web UI, the influxdb/kafka or any other kind of backend.

1.1- Refactor the way generic output configurations are stored in the database.

Right now influxdb is stored with this struct:
https://github.com/toni-moreno/snmpcollector/blob/master/pkg/config/dbconfig.go#L50-L68

And it has an I/O API to the database here:
https://github.com/toni-moreno/snmpcollector/blob/master/pkg/config/influxcfg.go

1.2- Refactor the web UI (Angular & Golang).

This Go file connects with config.InfluxCFG to load/save data from/to the database:
https://github.com/toni-moreno/snmpcollector/blob/master/pkg/webui/apicfg-influxerver.go
This directory should perhaps be renamed to GenericOutput:
https://github.com/toni-moreno/snmpcollector/tree/master/src/influxserver

2.- Refactor/redesign the output interfaces.

2.1- Create an output.Backend interface:

 "pkg/agent/output/interface.go"
 "pkg/agent/output/Backend.go"

Be careful: this module has internal statistics that should be maintained for all kinds of output backends.
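To make the statistics requirement concrete, here is a minimal sketch of what such an interface could look like. All names here (Metric, Stats, memBackend) are illustrative assumptions, not the project's actual types:

```go
package main

import "fmt"

// Metric is a stand-in for whatever generic metric type the agent
// produces (the project later adopted telegraf's metric definition).
type Metric struct {
	Name   string
	Tags   map[string]string
	Fields map[string]interface{}
}

// Stats holds the internal counters that every backend must keep
// updating, regardless of where the data ends up.
type Stats struct {
	PointsSent  int64
	WriteErrors int64
}

// Backend is the generic output contract: influxdb, kafka or any
// future sink would implement it.
type Backend interface {
	Init() error                  // validate config, open connections
	Write(metrics []Metric) error // push one batch to the sink
	Stats() Stats                 // expose the shared statistics
	Close() error                 // flush and release resources
}

// memBackend is a toy in-memory implementation showing that the
// statistics live behind the same interface for every sink.
type memBackend struct {
	stats Stats
	data  []Metric
}

func (m *memBackend) Init() error { return nil }
func (m *memBackend) Write(metrics []Metric) error {
	m.data = append(m.data, metrics...)
	m.stats.PointsSent += int64(len(metrics))
	return nil
}
func (m *memBackend) Stats() Stats { return m.stats }
func (m *memBackend) Close() error { return nil }

func main() {
	var b Backend = &memBackend{}
	b.Init()
	b.Write([]Metric{{Name: "ifInOctets"}, {Name: "ifOutOctets"}})
	fmt.Println(b.Stats().PointsSent) // 2
}
```

With a contract like this, the agent can hold a slice of Backend values instead of concrete InfluxDB clients, so adding Kafka becomes one more implementation rather than a special case.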

2.2- Adapt the influxdb output to the new interface:

 "pkg/agent/output/influxdb.go"

2.3- Modify the agent to store an array of generic output.Backend instead of output.InfluxDB devices:

https://github.com/toni-moreno/snmpcollector/blob/master/pkg/agent/agent.go#L64

3.- Create your new Kafka output:

 "pkg/agent/output/kafka.go"

Let me know if you need more info to begin working on this new feature.

@lerela
Contributor

lerela commented Aug 23, 2018

Thanks for your detailed answer @toni-moreno. Will look into this and let you know if I have additional questions.

@toni-moreno toni-moreno added this to the 2.0 milestone Mar 7, 2021
@toni-moreno toni-moreno added output-db Related to new output devices to send metrics enhancement labels Mar 7, 2021
@steffenschumacher

I would also like to support the development of this feature, and we will be in touch on the specifics. Did anything ever get done, @lerela?

@lerela
Contributor

lerela commented Apr 13, 2021

Hi @steffenschumacher, no. snmpcollector is great, but we ended up writing a new tool more suited to our use case (https://github.com/kosctelecom/horus).

sbengo added a commit that referenced this issue Nov 9, 2022
The SNMPCollector is currently focused on sending metrics to InfluxDB,
internally using InfluxDB-client-specific functions to manage and send
those metrics.

This PR breaks that dependency in internal metric management by
adopting the current telegraf metric definitions.
With this abstraction, a new engine is created in order to send those
generic metrics to specific backends in a centralized way.

The new metric sender engine also adopts the current telegraf buffer.
Metrics are now stored in the buffer instead of being written each time
a measurement produces a small set of them.

The engine follows the current telegraf approach: a ticker is defined
and the metrics are flushed in several batches, drastically reducing
the number of requests being made and making it possible to control the
current buffer size and operate on top of the output.

Since all metrics are stored in a top-level buffer, the send process is
done by the backend implementation. This PR redefines the current
influxdb output as a backend that only performs connection, write and
close.

The internal selfmon metrics (measurement, runtime, outdb stats) are
refactored to follow the same generic metric definition, using, as
always, the 'default' backend, now defined as an Output.

With this behaviour, a new Kafka backend has been created to write
metrics to several brokers using the JSON output format. The code is
copied/adapted from the current telegraf.output Kafka plugin.

The SNMPCollector OutDB is now related to an Output instead of an
InfluxDB Server. All management of Outputs and the new Kafka Server can
be done in the UI and via the API. The output-to-backend relation is,
for now, 1:1, and a Device can have only one output attached.

BREAKING CHANGE: metrics are now stored in an internal buffer and
written to the final backend based on time. The write process runs
every FlushInterval and is split into several requests based on the
buffer length and the MetricsBatchSize.

BREAKING CHANGE: the InfluxDB UI component is moved to a new section,
and its relation with the SNMPDevice is broken and changed to an
Output. The BufferSize field is removed from API requests and new DB
initializations, but maintained in old configurations to perform a
migration process.

fix #343
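The flush behaviour the commit message describes (buffer everything, then write every FlushInterval in chunks of MetricsBatchSize) can be sketched as follows. This illustrates the batching arithmetic only; it is not the PR's actual code:

```go
package main

import "fmt"

// splitBatches chunks a buffered slice of metrics into writes of at
// most batchSize items, which is what turns "one request per
// measurement" into a handful of requests per FlushInterval.
func splitBatches(buf []string, batchSize int) [][]string {
	if batchSize < 1 {
		batchSize = 1 // guard against a misconfigured batch size
	}
	var batches [][]string
	for len(buf) > 0 {
		n := batchSize
		if n > len(buf) {
			n = len(buf)
		}
		batches = append(batches, buf[:n])
		buf = buf[n:]
	}
	return batches
}

func main() {
	// In the real engine a time.Ticker firing every FlushInterval
	// would drain the buffer; here we only show the splitting.
	buf := make([]string, 0, 10)
	for i := 0; i < 10; i++ {
		buf = append(buf, fmt.Sprintf("metric-%d", i))
	}
	batches := splitBatches(buf, 4) // MetricsBatchSize = 4
	fmt.Println(len(batches))       // 3 requests instead of 10
}
```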
sbengo added a commit that referenced this issue Nov 9, 2022
sbengo added a commit that referenced this issue Nov 9, 2022
sbengo added a commit that referenced this issue Nov 10, 2022
sbengo added a commit that referenced this issue Nov 14, 2022
sbengo added a commit that referenced this issue Nov 24, 2022
sbengo added a commit that referenced this issue Nov 29, 2022