Split the exporter into multiple collectors #65
Wrap prometheus.Desc to have it all in one place
LGTM
It would be nice to add a proper README, example recording/alerting rules, and an example Grafana dashboard, but we can merge first and add those later.
collector/cluster_health.go
Outdated
	),
	NumberOfPendingTasks: prometheus.NewDesc(
		prometheus.BuildFQName(namespace, subsystem, "number_of_pending_tasks"),
		"XXX WHAT DOES THIS MEAN?",
Intended or left over? If we can't figure out a proper description, we may want to put "undocumented" instead.
collector/cluster_health.go
Outdated
	),
	DelayedUnassignedShards: prometheus.NewDesc(
		prometheus.BuildFQName(namespace, subsystem, "delayed_unassigned_shards"),
		"XXX WHAT DOES THIS MEAN?",
document, please
collector/cluster_health.go
Outdated
	),
	TimedOut: prometheus.NewDesc(
		prometheus.BuildFQName(namespace, subsystem, "timed_out"),
		"XXX WHAT DOES THIS MEAN?",
document, please
collector/cluster_health.go
Outdated
	)

	var statusValue float64
	if clusterHealthResponse.Status == "green" {
What about red and yellow? I would like to be able to distinguish between those two as well.
I currently assume that elasticsearch_cluster_health_status is 0 if red or yellow and only 1 when green.
I can only think about doing something like red => 0, yellow => 1, green => 2, which honestly feels weird.
Yes, indeed. I see that issue. But on the other hand I'd like to differentiate between red and yellow ...
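One way to distinguish all three states without an arbitrary 0/1/2 scale is to emit one series per color and set only the matching one to 1, which is in line with the PR description's note that the status metric carries the color as a label. The following is a minimal, dependency-free sketch of that idea, not the code from this PR: the label name `color`, the `statusSample` type, and the `statusSamples` helper are assumptions made for illustration.

```go
package main

import "fmt"

// statusColors lists the health colors Elasticsearch reports for a cluster.
var statusColors = []string{"green", "yellow", "red"}

// statusSample is a hypothetical stand-in for one exported time series.
type statusSample struct {
	Color string  // value of the assumed "color" label
	Value float64 // 1 if the cluster is in this state, 0 otherwise
}

// statusSamples returns one sample per color. Only the color that matches
// the reported status gets the value 1; all others get 0.
func statusSamples(status string) []statusSample {
	samples := make([]statusSample, 0, len(statusColors))
	for _, color := range statusColors {
		value := 0.0
		if status == color {
			value = 1
		}
		samples = append(samples, statusSample{Color: color, Value: value})
	}
	return samples
}

func main() {
	// Print the three series for a cluster that reports "yellow".
	for _, s := range statusSamples("yellow") {
		fmt.Printf("elasticsearch_cluster_health_status{color=%q} %g\n", s.Color, s.Value)
	}
}
```

With this shape, an alert can match on the red series alone, and a dashboard can still show yellow separately, without assigning meaning to an ordinal number.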
collector/cluster_health.go
Outdated
		"The number of shards that are currently moving from one node to another node.",
		[]string{"cluster"}, nil,
	),
	StatusIsGreen: prometheus.NewDesc(
Could we drop this, please?
We already have the status metric below.
collector/cluster_health.go
Outdated
	NumberOfNodes        *prometheus.Desc
	NumberOfPendingTasks *prometheus.Desc
	RelocatingShards     *prometheus.Desc
	StatusIsGreen        *prometheus.Desc
Can we drop this in favor of Status?
collector/cluster_health.go
Outdated
	RelocatingShards *prometheus.Desc
	StatusIsGreen    *prometheus.Desc
	Status           *prometheus.Desc
	StatusIsYellow   *prometheus.Desc
Can we drop this in favor of Status?
collector/cluster_health.go
Outdated
	StatusIsGreen  *prometheus.Desc
	Status         *prometheus.Desc
	StatusIsYellow *prometheus.Desc
	StatusIsRed    *prometheus.Desc
Can we drop this in favor of Status?
Since prometheus-community/elasticsearch_exporter#65 the elasticsearch_up metric is in the elasticsearch_cluster_health namespace
We want to get rid of the big maps with counters and gauges.
This is an approach I could come up with.
Put everything a metric needs (type, desc, value, labels) into a struct to have it in one place. The improvement is that these metrics now have value and label funcs to retrieve their values where they're declared. This way it's all combined in one place and not scattered all over the place.
Changes to metrics:
- elasticsearch_up
We're doing multiple requests concurrently to the endpoint now. I'm not really sure how this metric could be helpful now; it would need some sort of global shared state between collectors. Dropping it for now.
These are duplicates of elasticsearch_cluster_health_status, which has the color as a label, as it should be.
Percentages have been computed in this exporter up until now. We're dropping these metrics. This should be calculated in Prometheus itself. We're probably going to provide recording rules that make these metrics optional.
Because we're talking about seconds.
- indices_search_fetch_time_seconds_total
This is a duplicate of indices_search_fetch_time_seconds
- elasticsearch_indices_search_query_time_seconds_total
This is a duplicate of indices_search_query_time_seconds
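The struct-based approach described above (type, desc, value, and labels kept together per metric) could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from this PR: to stay dependency-free, `Desc` is a plain string standing in for `*prometheus.Desc` and `valueType` stands in for `prometheus.ValueType`; `clusterHealthResponse`, `clusterHealthMetric`, `sample`, and `collect` are hypothetical names, and the two metric names are only examples.

```go
package main

import "fmt"

// clusterHealthResponse is a hypothetical, trimmed-down stand-in for the
// decoded cluster health response.
type clusterHealthResponse struct {
	ClusterName      string
	NumberOfNodes    int
	RelocatingShards int
}

// valueType stands in for prometheus.ValueType in this sketch.
type valueType int

const (
	gaugeValue valueType = iota
	counterValue
)

// clusterHealthMetric keeps everything one metric needs in one place.
type clusterHealthMetric struct {
	Type   valueType
	Desc   string // stand-in for *prometheus.Desc
	Value  func(r clusterHealthResponse) float64
	Labels func(r clusterHealthResponse) []string
}

// clusterLabel returns the label values shared by all metrics below.
func clusterLabel(r clusterHealthResponse) []string {
	return []string{r.ClusterName}
}

// clusterHealthMetrics declares each metric next to the funcs that read it.
var clusterHealthMetrics = []clusterHealthMetric{
	{
		Type:   gaugeValue,
		Desc:   "elasticsearch_cluster_health_number_of_nodes",
		Value:  func(r clusterHealthResponse) float64 { return float64(r.NumberOfNodes) },
		Labels: clusterLabel,
	},
	{
		Type:   gaugeValue,
		Desc:   "elasticsearch_cluster_health_relocating_shards",
		Value:  func(r clusterHealthResponse) float64 { return float64(r.RelocatingShards) },
		Labels: clusterLabel,
	},
}

// sample is one rendered metric value.
type sample struct {
	Name   string
	Labels []string
	Value  float64
}

// collect walks the declarations once; adding a metric needs no change here.
func collect(r clusterHealthResponse) []sample {
	out := make([]sample, 0, len(clusterHealthMetrics))
	for _, m := range clusterHealthMetrics {
		out = append(out, sample{Name: m.Desc, Labels: m.Labels(r), Value: m.Value(r)})
	}
	return out
}

func main() {
	r := clusterHealthResponse{ClusterName: "demo", NumberOfNodes: 3, RelocatingShards: 1}
	for _, s := range collect(r) {
		fmt.Printf("%s%v %g\n", s.Name, s.Labels, s.Value)
	}
}
```

In a real collector the loop body would presumably hand `Desc`, `Type`, the value, and the label values to the Prometheus client instead of building `sample` values, but the declaration side stays the same: one struct literal per metric.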