Description
Today, most Logstash monitoring is done by tailing logs or emitting debug messages. Users typically send specially tagged tracer events to check the health of the system, and they use the same events to measure pipeline latency. This is not straightforward, and it makes a large-scale Logstash cluster hard to administer.
We plan to introduce a Logstash monitoring API endpoint that provides visibility into the pipeline. Some important metrics are:
- health
- number of events processed
- latency metrics (average, percentiles, etc.)
- size of the persistent queues (see #2606, "Provide option to have variable size internal queues which are persisted")
- number of errors and successes
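To make the list above concrete, here is a minimal sketch of what a metrics snapshot could look like. Everything in it is an assumption for illustration: the `PipelineMetrics` class, the field names, the "green"/"yellow" health values, and the nearest-rank percentile method are hypothetical and are not part of any existing Logstash API.

```python
import json
import math
from dataclasses import dataclass, field


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


@dataclass
class PipelineMetrics:
    """Hypothetical in-memory holder for the metrics listed above."""
    events_processed: int = 0
    errors: int = 0
    queue_size: int = 0
    latencies_ms: list = field(default_factory=list)

    def record(self, latency_ms, ok=True):
        """Count one event and remember how long it spent in the pipeline."""
        self.events_processed += 1
        if not ok:
            self.errors += 1
        self.latencies_ms.append(latency_ms)

    def snapshot(self):
        """Build a JSON-serializable view a monitoring endpoint could return."""
        values = sorted(self.latencies_ms)
        latency = {}
        if values:
            latency = {
                "avg_ms": sum(values) / len(values),
                "p50_ms": percentile(values, 50),
                "p99_ms": percentile(values, 99),
            }
        return {
            # Assumed health rule: any error downgrades the status.
            "health": "green" if self.errors == 0 else "yellow",
            "events_processed": self.events_processed,
            "errors": self.errors,
            "successes": self.events_processed - self.errors,
            "queue_size": self.queue_size,
            "latency": latency,
        }


if __name__ == "__main__":
    metrics = PipelineMetrics(queue_size=3)
    for latency in (10, 20, 30, 40):
        metrics.record(latency)
    metrics.record(100, ok=False)
    print(json.dumps(metrics.snapshot(), indent=2))
```

Keeping every raw latency in a list is only reasonable for a sketch; a real collector would likely use a bounded histogram or reservoir instead.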
In the medium term, we should provide plugin-level granularity. For example, it would be great to know how long (on average) an event spends in grok filters, geoip filters, etc. This would help users drill into the expensive parts of the pipeline.
Care should be taken to make sure that metrics collection does not put additional stress on the pipeline or affect the latency and throughput of events.
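As a rough illustration of plugin-level granularity, the sketch below accumulates the time spent per plugin and reports the average. The `PluginTimings` class and its methods are hypothetical names chosen for this example, and the plugin names passed in are just labels; nothing here reflects how Logstash actually invokes filters.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PluginTimings:
    """Hypothetical per-plugin timer: tracks call count and total seconds."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock  # injectable so the timer can be tested
        self._count = defaultdict(int)
        self._total = defaultdict(float)

    @contextmanager
    def measure(self, plugin_name):
        """Time the enclosed block and attribute it to plugin_name."""
        start = self._clock()
        try:
            yield
        finally:
            self._total[plugin_name] += self._clock() - start
            self._count[plugin_name] += 1

    def averages(self):
        """Average seconds per event for each plugin seen so far."""
        return {name: self._total[name] / self._count[name]
                for name in self._count}

    def slowest(self):
        """Name of the plugin with the highest average, or None if empty."""
        avgs = self.averages()
        return max(avgs, key=avgs.get) if avgs else None


if __name__ == "__main__":
    timings = PluginTimings()
    with timings.measure("grok"):
        time.sleep(0.01)  # stand-in for real filter work
    with timings.measure("geoip"):
        time.sleep(0.001)
    print(timings.averages(), timings.slowest())
```

Storing only a count and a running total per plugin keeps memory use constant regardless of event volume, at the cost of not being able to report percentiles.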
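One possible way to limit that overhead, offered here as an assumption and not as something the proposal specifies, is to time only every Nth event so that most events take a fast path with no clock reads. The `SampledLatency` class and its `sample_every` parameter are hypothetical.

```python
import time


class SampledLatency:
    """Hypothetical sampler: times one event out of every sample_every."""

    def __init__(self, sample_every=100, clock=time.perf_counter):
        if sample_every < 1:
            raise ValueError("sample_every must be at least 1")
        self.sample_every = sample_every
        self._clock = clock
        self.seen = 0
        self.sampled = 0
        self.total_seconds = 0.0

    def process(self, event, handler):
        """Run handler(event), timing it only on sampled events."""
        self.seen += 1
        if self.seen % self.sample_every != 0:
            return handler(event)  # fast path: no clock reads
        start = self._clock()
        try:
            return handler(event)
        finally:
            self.total_seconds += self._clock() - start
            self.sampled += 1

    def average_seconds(self):
        """Average duration of sampled events, or None if none were sampled."""
        return self.total_seconds / self.sampled if self.sampled else None


if __name__ == "__main__":
    sampler = SampledLatency(sample_every=10)
    for n in range(1000):
        sampler.process(n, lambda e: e * 2)
    print(sampler.seen, sampler.sampled, sampler.average_seconds())
```

Sampling trades accuracy for lower cost: the average is an estimate, and rare slow events can be missed when `sample_every` is large.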