Beats central monitoring Phase 1 #3422

Closed
10 tasks done
monicasarbu opened this issue Jan 19, 2017 · 13 comments


@monicasarbu
Contributor

monicasarbu commented Jan 19, 2017

When you deploy a large number of Beats, it becomes challenging to monitor the Beats themselves.
A solution would be for the Beats to report their health status to a collection point, such as Elasticsearch, and to visualize it with Kibana.

The following health metrics should be sent to Elasticsearch:

Each Beat exports more metrics via expvar, but it should send only a subset of these metrics to Elasticsearch.

By default, the health metrics are sent directly to the Elasticsearch cluster configured under outputs.elasticsearch, but you can also configure a separate Elasticsearch cluster to receive the monitoring data.

TODO:

  • Differentiate between the metrics exported via expvar to send only a subset
  • Send the health metrics to Elasticsearch
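The first TODO item could be sketched roughly as follows. This is only an illustration of filtering expvar output through an allowlist, not how Beats implements it; the function names and the `demo.*` metric names are hypothetical.

```go
package main

import (
	"expvar"
	"fmt"
)

// publishDemoMetrics registers two hypothetical counters. The names are
// made up for this sketch. expvar panics if a name is registered twice,
// so each registration is guarded with expvar.Get.
func publishDemoMetrics() {
	if expvar.Get("demo.events.published") == nil {
		expvar.NewInt("demo.events.published").Add(42)
	}
	if expvar.Get("demo.internal.debug_counter") == nil {
		expvar.NewInt("demo.internal.debug_counter").Add(7)
	}
}

// selectMetrics walks every variable published via expvar and keeps only
// those whose name is in the allowlist. Values are returned in the JSON
// string form produced by expvar.Var.String().
func selectMetrics(allow map[string]bool) map[string]string {
	out := make(map[string]string)
	expvar.Do(func(kv expvar.KeyValue) {
		if allow[kv.Key] {
			out[kv.Key] = kv.Value.String()
		}
	})
	return out
}

func main() {
	publishDemoMetrics()
	// Only the allowlisted counter is kept; the debug counter and the
	// variables expvar publishes by default (cmdline, memstats) are dropped.
	subset := selectMetrics(map[string]bool{"demo.events.published": true})
	fmt.Println(subset)
}
```

An allowlist keeps the set of shipped metrics explicit, so a newly added debug counter is not sent to the monitoring cluster by accident.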

Configuration:

monitoring:
    enabled: true
    period: 10s
    elasticsearch: ["localhost:9201"]

UPDATE: The CPU usage is exported under different fields. See #3422

cc-ed @bohyun-e @brandonmensing

@ruflin
Member

ruflin commented Jan 20, 2017

For reference, here is the old issue where this all started: #463

@bohyun-e

enabled: true

Would the default value here be true or false?

period: 10s

I'm not 100% sure what happens when you have the collection period that is different from the ES monitoring collection interval. But reading from the doc, my gut feeling is that whatever ES's collection interval is - should be applied in other products, such as Kibana Monitoring collection interval. I'm guessing it would be the same for Beats, but it'd be a good idea to confirm.


@uboness

uboness commented Jan 20, 2017

I'm not 100% sure what happens when you have the collection period that is different from the ES monitoring collection interval. But reading from the doc, my gut feeling is that whatever ES's collection interval is - should be applied in other products, such as Kibana Monitoring collection interval. I'm guessing it would be the same for Beats, but it'd be a good idea to confirm.

I don't think we should have this restriction (that all collection intervals are equal). I also don't know how we can even enforce it.

Different systems may need different intervals, and the monitoring UI should deal with that.

@monicasarbu monicasarbu mentioned this issue Jan 20, 2017
36 tasks
@lswith

lswith commented Jan 24, 2017

It would be extremely nice to simply have one more commandline option to turn on only expvar variables, rather than the httpprof commandline option. This would allow other tools to scrape each beat type on their own interval.

@bohyun-e

cc: @pickypg @tsullivan @skearns64

@valentin-fischer

Hi,

Any progress on this? Is there a way to export this as JSON instead of sending everything to Elasticsearch?

Thank you!

@jeremydonahue

+1 for a progress update.

It would be extremely nice to simply have one more commandline option to turn on only expvar variables, rather than the httpprof commandline option. This would allow other tools to scrape each beat type on their own interval.

+1

It would also be very nice to support outputs other than Elasticsearch. Ideally, any of the already supported outputs (e.g. Kafka, Redis, Logstash) would work:

monitoring:
    enabled: true
    period: 10s
    output.elasticsearch: 
        hosts: ["localhost:9201"]
        ...
    output.kafka:
        hosts: ["localhost:9092"]
        ...

Thanks!

@monicasarbu
Contributor Author

Unfortunately, we haven't made much progress here. For now, we are planning to store all the monitoring data in Elasticsearch only. The first version sends the monitoring data to Elasticsearch, but we are considering sending it to other supported outputs in the future.

@superwhykz

For Kafka, it would be nice to be able to send to a different topic, defined under the monitoring struct. We're heavily scaling Filebeat in our infra (18k+) and none of them ship directly to Elasticsearch.

@trondhindenes

I'd love it if Beats followed the same model as Logstash and simply exposed a local metrics endpoint. We're struggling with writing robust monitoring for Filebeat, as it's very much a "black box" when it comes to state. I'm not sure an Elasticsearch metrics integration would help all that much. Beats log files are not geared towards getting to the "current state" of the Beat (e.g. "is the Beat able to ship data to Logstash right now?"). Enabling a local metrics endpoint à la Logstash that exposes items such as "queued events" and "percentage of dead/alive shipper targets", so that we could pick up that info locally using a monitoring tool/agent such as Prometheus or Datadog, would be of much higher value to us than getting it in Elasticsearch.
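As a rough illustration of what such a local endpoint could look like, the sketch below serves every expvar variable as JSON using Go's standard `expvar.Handler()`. The path, the helper names, and the `demo.queue.events` metric are assumptions for this example and do not describe the endpoint Beats actually ships.

```go
package main

import (
	"encoding/json"
	"expvar"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// setQueueDepth publishes a hypothetical "queued events" gauge, creating
// it on first use (expvar panics if a name is registered twice).
func setQueueDepth(n int64) {
	v, ok := expvar.Get("demo.queue.events").(*expvar.Int)
	if !ok {
		v = expvar.NewInt("demo.queue.events")
	}
	v.Set(n)
}

// newMetricsMux returns a mux that serves all expvar variables as one
// JSON object. The path is an arbitrary choice for this sketch.
func newMetricsMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/debug/vars", expvar.Handler())
	return mux
}

// scrapeOnce plays the role of a local monitoring agent: it issues one
// in-process GET against the mux and decodes the top-level JSON keys.
func scrapeOnce(mux http.Handler) (map[string]json.RawMessage, error) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/debug/vars", nil)
	mux.ServeHTTP(rec, req)
	if rec.Code != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", rec.Code)
	}
	var vars map[string]json.RawMessage
	if err := json.Unmarshal(rec.Body.Bytes(), &vars); err != nil {
		return nil, err
	}
	return vars, nil
}

func main() {
	setQueueDepth(3)
	vars, err := scrapeOnce(newMetricsMux())
	if err != nil {
		fmt.Println("scrape failed:", err)
		return
	}
	fmt.Println("queued events:", string(vars["demo.queue.events"]))
	// In a real process the mux would be served on a loopback address,
	// for example: http.ListenAndServe("localhost:8080", newMetricsMux())
	// (the address is arbitrary here).
}
```

Binding to a loopback address keeps the metrics local to the host, which matches the use case of an agent such as Prometheus or Datadog scraping on its own interval.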

@tsg
Contributor

tsg commented Jan 31, 2018

@trondhindenes An experimental HTTP endpoint was added in Beats 6.2. See #3717

@trondhindenes

Awesome!

@tsg
Contributor

tsg commented Feb 28, 2018

We can consider phase 1 completed.


10 participants