
dashboard Namespace (Workloads) is broken when you have HA prometheus #680

Open

Andor opened this issue Oct 19, 2021 · 5 comments

Comments

@Andor

Andor commented Oct 19, 2021

When you run HA Prometheus, you usually have multiple Prometheus instances whose series differ only in the value of the prometheus_replica label.

For example, this query from the Namespace (Workloads) dashboard returns an error:

sum(
  node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

error:

duplicate time series on the right side of `* on (namespace, pod) group_left (workload, workload_type)`
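
For reference, one way to make this join tolerate duplicate right-hand series is to aggregate the replica label away before joining. This is only a sketch of a workaround, not what the shipped dashboard does; max by (...) keeps exactly one series per (cluster, namespace, pod, workload, workload_type) group, so group_left sees a unique right side:

sum(
  node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type)
  max by (cluster, namespace, pod, workload, workload_type) (
    namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
  )
) by (workload, workload_type)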
@paulfantom
Member

That sounds like an issue with rules being evaluated not in Prometheus but in something like Thanos Ruler or Cortex.

@Andor
Author

Andor commented Oct 21, 2021

No, the rules are evaluated on the Prometheus side. On top of that I use remote_write and store the data in VictoriaMetrics.
I managed to mitigate the issue by removing the prometheus_replica label in the remote_write relabeling.
You can probably close this issue now.
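
For anyone hitting the same thing, the equivalent in plain Prometheus configuration would be roughly the following (a sketch; the URL is a placeholder, and labeldrop is the relabel action that removes a label rather than the series):

remote_write:
  - url: http://victoriametrics.example:8428/api/v1/write
    write_relabel_configs:
      - regex: prometheus_replica
        action: labeldrop

Note the trade-off: once the label is gone, VictoriaMetrics can no longer tell the replicas apart, so both replicas write overlapping samples for the same series.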

@paulfantom
Member

I am just wondering how your setup is configured to produce these results. Usually there are 2 Prometheus replicas using identical scrape configuration and the same recording/alerting rules; the only difference between the replicas is the external_labels section. In such a scenario replica A does not have access to data from replica B, so there is no way for the data duplication described here to occur.

The only way I can see this happening is if recording rules are evaluated over data from both replica A and replica B.
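
For context, the per-replica difference described above looks roughly like this in each replica's configuration (a sketch with example values; with prometheus-operator the replica label is typically injected automatically rather than written by hand):

global:
  external_labels:
    cluster: my-cluster
    prometheus_replica: prometheus-k8s-0   # prometheus-k8s-1 on the other replica

Because each replica evaluates rules only against its own locally scraped data, the recorded series never mix; the duplication only appears once both replicas' output lands in a shared backend.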

@pschulten

Thanks @Andor
I ran into the same behavior. Query:

sum(
    node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster="tortilla", namespace="loki"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="tortilla", namespace="loki", workload="compactor", workload_type="deployment"}
) by (pod)

error:

Query error
422: error when executing query="sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"tortilla\", namespace=\"loki\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"tortilla\", namespace=\"loki\", workload=\"compactor\", workload_type=\"deployment\"}\n) by (pod)\n" on the time range (start=1640087850000, end=1640088750000, step=30000): cannot execute query: cannot evaluate "node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster=\"tortilla\", namespace=\"loki\"} * on (namespace, pod) group_left (workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=\"tortilla\", namespace=\"loki\", workload=\"compactor\", workload_type=\"deployment\"}": duplicate time series on the right side of `* on (namespace, pod) group_left (workload, workload_type)`: {cluster="tortilla", namespace="loki", pod="compactor-8564fbf8b4-mdr7p", prometheus="monitoring/k8s", prometheus_replica="prometheus-k8s-1", stage="prod", workload="compactor", workload_type="deployment"} and {cluster="tortilla", namespace="loki", pod="compactor-8564fbf8b4-mdr7p", prometheus="monitoring/k8s", prometheus_replica="prometheus-k8s-0", stage="prod", workload="compactor", workload_type="deployment"}

"fixed" by:

prometheus+: {
  prometheus+: {
    spec+: {
      //...
      externalLabels: {
        cluster: 'xyz',
        //...
      },
      remoteWrite: [
        // Also write metrics to VictoriaMetrics
        {
          writeRelabelConfigs: [
            {
              sourceLabels: ['prometheus_replica'],
              action: 'drop',
            },
          ],
          url: 'https://example.com:8427/api/v1/write',
          //...
        },
      ],
    },
  },
}

@Pluies

Pluies commented Mar 9, 2022

FWIW, we ran into the same issue after setting up HA Prometheus (backed by Thanos using the Thanos sidecar), and it affected a lot of dashboards.

The best fix for us was to turn on deduplication in Thanos Query, so that Thanos replies with a single time series rather than one per scraping Prometheus. All dashboards are back to normal!
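
For anyone else going this route: Thanos Query deduplication is driven by the replica-label flag, roughly as below (a sketch; addresses are placeholders, and the flag value must match the replica label set in your external_labels):

thanos query \
  --http-address=0.0.0.0:10902 \
  --store=prometheus-sidecar.example:10901 \
  --query.replica-label=prometheus_replica

With this flag set, Thanos treats series that differ only in prometheus_replica as the same logical series and merges them, so the dashboards' group_left joins see a unique right-hand side without any relabeling.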
