dashboard Namespace (Workloads) is broken when you have HA prometheus #680
Comments
That sounds like an issue with evaluating rules not in Prometheus but in something like Thanos Ruler or Cortex.
No, rules are evaluated on the Prometheus side. I use remote_write on top of that and store data in VictoriaMetrics.
I am just wondering how your setup is configured to get these results. Usually what I find is that there are two Prometheus replicas using identical scrape configuration and the same recording/alerting rules. The only difference between replicas is what is set up in … The only way I can see this happening is if recording rules are evaluated based on data from both replica A and replica B.
Thanks @Andor
error:
"fixed" by:
FWIW, we ran into the same issue after setting up HA Prometheus (backed by Thanos using the Thanos sidecar), and it affected a lot of dashboards. The best fix for us was to turn on deduplication in Thanos Query, so that Thanos would reply with a single time series rather than one time series for each scraping Prometheus. All dashboards are back to normal!
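For anyone wondering what deduplication means here, below is a rough Python sketch of the idea, not Thanos's actual implementation. It assumes the replicas differ only in the prometheus_replica label (in Thanos Query the replica label is named with the --query.replica-label flag). The series type, the deduplicate helper, and the sample label values are all made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a series is a (labels, value) pair.
Series = Tuple[Dict[str, str], float]


def deduplicate(series: List[Series],
                replica_label: str = "prometheus_replica") -> List[Series]:
    """Keep one series per label set once the replica label is ignored."""
    seen: Dict[tuple, Series] = {}
    for labels, value in series:
        # Build a hashable key from every label except the replica label.
        key = tuple(sorted((k, v) for k, v in labels.items()
                           if k != replica_label))
        # Simplification: the first replica seen wins.
        if key not in seen:
            seen[key] = (dict(key), value)
    return list(seen.values())


# Made-up samples: two replicas scraping the same two pods.
samples: List[Series] = [
    ({"namespace": "default", "pod": "web-0", "prometheus_replica": "prometheus-0"}, 0.25),
    ({"namespace": "default", "pod": "web-0", "prometheus_replica": "prometheus-1"}, 0.25),
    ({"namespace": "default", "pod": "web-1", "prometheus_replica": "prometheus-0"}, 0.5),
    ({"namespace": "default", "pod": "web-1", "prometheus_replica": "prometheus-1"}, 0.5),
]

deduped = deduplicate(samples)
naive_total = sum(value for _, value in samples)    # counts every pod twice
deduped_total = sum(value for _, value in deduped)  # counts every pod once
```

Without deduplication, anything that aggregates over these series sees every pod once per replica, which is why the dashboards look wrong until the replica label is collapsed.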
When you have HA Prometheus, you will usually have multiple Prometheus instances with different prometheus_replica label values. For example, this query from the dashboard Namespace (Workloads) will return the error: