Metric "elasticsearch_cluster_health_timed_out" should be 1 and not absent when querying cluster API fails. #212
Comments
Actually, absent() won't work. The absent() function only returns a result when querying the API fails for all members of an Elasticsearch cluster. How is this metric supposed to be used? I want to use this metric to monitor that the exporter can indeed query the search engine, and to alert otherwise. UPDATE:
Would raise an alert if that metric turned to 1 for any node in the cluster.
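To illustrate the point about absent() made in this comment, here is a minimal Python sketch that mimics the documented behaviour of PromQL's absent(): it yields a single value of 1 only when no series at all matches the metric name. The sample layout, the `instance` label values (`es-1`, `es-2`, `es-3`), and the helper name are hypothetical and only serve the illustration; this is not how Prometheus itself is implemented.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory sample: (metric name, labels, value).
Sample = Tuple[str, Dict[str, str], float]

METRIC = "elasticsearch_cluster_health_timed_out"


def absent(samples: List[Sample], metric: str) -> List[float]:
    """Mimic PromQL absent(): [1.0] if no series matches, else an empty result."""
    matching = [s for s in samples if s[0] == metric]
    return [] if matching else [1.0]


# All three (hypothetical) nodes report the metric: absent() yields nothing.
all_nodes = [
    (METRIC, {"instance": "es-1"}, 0.0),
    (METRIC, {"instance": "es-2"}, 0.0),
    (METRIC, {"instance": "es-3"}, 0.0),
]

# One node's series has disappeared: absent() still yields nothing,
# because two series with that name remain.
one_missing = all_nodes[:2]

# Only when every series is gone does absent() produce a value.
all_missing: List[Sample] = []

for label, samples in [
    ("all nodes present", all_nodes),
    ("one node missing", one_missing),
    ("all nodes missing", all_missing),
]:
    print(label, absent(samples, METRIC))
```

Under these assumptions, an alert built on absent() with only the metric name stays silent while at least one node still exports the series, which is the limitation described above.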
Hi @sarfarazahmad89. So in fact, having field from the … You can use the following metrics as an indicator of whether the cluster responds to HTTP requests:
These metrics are set to 1 if the endpoint was reachable, and 0 otherwise.
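The "1 if reachable, 0 otherwise" behaviour described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the exporter's actual code: the metric name `example_cluster_health_up`, the `probe` callables, and the function name are all hypothetical. The key property is that a line for the metric is always produced, so the series never disappears when the endpoint is down.

```python
from typing import Callable


def render_up_metric(name: str, probe: Callable[[], object]) -> str:
    """Return one text-exposition line: value 1 if probe() succeeds, else 0."""
    try:
        probe()  # e.g. an HTTP request to the cluster health endpoint
        value = 1
    except Exception:
        # The endpoint was not reachable; still export the metric, with 0.
        value = 0
    return f"{name} {value}"


def reachable_probe() -> None:
    """Hypothetical probe that succeeds."""
    return None


def unreachable_probe() -> None:
    """Hypothetical probe that fails the way a refused connection would."""
    raise ConnectionError("connection refused")


print(render_up_metric("example_cluster_health_up", reachable_probe))
print(render_up_metric("example_cluster_health_up", unreachable_probe))
```

With a metric shaped like this, an alert can compare the value against 0 per instance instead of relying on the series going missing.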
The metric elasticsearch_cluster_health_timed_out was removed in prometheus-community@320d8b3 per prometheus-community#212 Signed-off-by: Frank Ritchie <[email protected]>
elasticsearch_cluster_health_timed_out is a gauge metric.
I was under the impression that its value would oscillate between 0 and 1 depending on whether the exporter can query the cluster health API.
So in a situation where the Elasticsearch service has gone down, I was expecting the metric to turn to 1. Instead, it just goes away.
I could configure alert rules using something like absent(elasticsearch_cluster_health_timed_out), but I don't think that is the right way to do this.
Even the official Prometheus docs recommend avoiding missing metrics:
https://prometheus.io/docs/practices/instrumentation/#avoid-missing-metrics