node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate #1056

Closed
xiaozhangzhang1 opened this issue Mar 24, 2021 · 26 comments

@xiaozhangzhang1

xiaozhangzhang1 commented Mar 24, 2021

record: node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate
expr: sum by(cluster, namespace, pod, container) (rate(container_cpu_usage_seconds_total{container!="POD",image!="",job="kubelet",metrics_path="/metrics/cadvisor"}[5m])) * on(cluster, namespace, pod) group_left(node) topk by(cluster, namespace, pod) (1, max by(cluster, namespace, pod, node) (kube_pod_info{node!=""}))

This recording rule does not work in Prometheus. If I change on(cluster, namespace, pod) to on(namespace, pod), it works.

  • Prometheus Operator version:
    release-0.6

  • Kubernetes version information:

    kubectl version
    Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.14", GitCommit:"89182bdd065fbcaffefec691908a739d161efc03", GitTreeState:"clean", BuildDate:"2020-12-18T12:11:25Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.14", GitCommit:"89182bdd065fbcaffefec691908a739d161efc03", GitTreeState:"clean", BuildDate:"2020-12-18T12:02:35Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

@paulfantom
Member

It looks like you have a cluster label in one metric but not in the other. Does container_cpu_usage_seconds_total{container!="POD",image!="",job="kubelet",metrics_path="/metrics/cadvisor", cluster!=""} and kube_pod_info{node!="",cluster!=""} return any output?

@xiaozhangzhang1
Author

> It looks like you have a cluster label in one metric but not in the other. Does container_cpu_usage_seconds_total{container!="POD",image!="",job="kubelet",metrics_path="/metrics/cadvisor", cluster!=""} and kube_pod_info{node!="",cluster!=""} return any output?

I did; no data is returned.

@xiaozhangzhang1
Author

xiaozhangzhang1 commented Mar 24, 2021

> It looks like you have a cluster label in one metric but not in the other. Does container_cpu_usage_seconds_total{container!="POD",image!="",job="kubelet",metrics_path="/metrics/cadvisor", cluster!=""} and kube_pod_info{node!="",cluster!=""} return any output?

kube_pod_info{node!="",cluster!=""} returns many series, for example:
kube_pod_info{cluster="",container="kube-rbac-proxy-main",created_by_kind="",created_by_name="",host_ip="",instance="",job="kube-state-metrics",namespace="default",node="master01",pod="netshoot",pod_ip="",uid="ef6d61ac-fed4-4ee3-b757-de912a6863fb"}

@xiaozhangzhang1
Author

container_cpu_usage_seconds_total{container!="POD",image!="",job="kubelet",metrics_path="/metrics/cadvisor", cluster!=""}
returns no data.

@paulfantom
Member

It seems that your cluster is not configured correctly: you have a cluster label attached to the metrics from kube-state-metrics, but not to the metrics from kubelet. You need to have it in both places.
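
As a rough way to confirm this diagnosis, the two queries below count series per cluster value on each side of the join. They are a sketch, not part of the shipped rules, and each one has to be run on its own in the Prometheus expression browser. PromQL treats a missing label the same as an empty one, so on(cluster, namespace, pod) can only pair series whose cluster values are identical.

```promql
# Side 1: cAdvisor series scraped through the kubelet job.
# Series without a cluster label are grouped into one result with no cluster label.
count by (cluster) (
  container_cpu_usage_seconds_total{job="kubelet", metrics_path="/metrics/cadvisor"}
)

# Side 2: pod metadata from kube-state-metrics.
count by (cluster) (kube_pod_info{node!=""})
```

If one query returns results with a cluster value and the other returns a single result with no cluster label, the join has nothing to match on, which would be consistent with on(namespace, pod) working.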

@xiaozhangzhang1
Author

> It seems that your cluster is not configured correctly: you have a cluster label attached to the metrics from kube-state-metrics, but not to the metrics from kubelet. You need to have it in both places.

Thanks, I got it. Yes, I added the cluster label to kube-state-metrics. I did the same for kubelet, but it still does not work.

@ArchiFleKs

I have the same issue: CPU usage no longer works in Grafana.

@rouja

rouja commented Oct 5, 2021

Hi,

I'm not sure, but I think this issue was fixed in:

78a4677

It seems that the recording rule node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate was replaced by node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate

@SonalJain1707

SonalJain1707 commented Nov 16, 2022

I am also facing the same issue

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster="$cluster", namespace="$namespace"}) / sum(kube_pod_container_resource_requests{job="kube-state-metrics", cluster="$cluster", namespace="$namespace", resource="cpu"})

Does not return data

@rmn-lux

rmn-lux commented Dec 12, 2022

+1, the same thing here.

> I am also facing the same issue
>
> sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster="$cluster", namespace="$namespace"}) / sum(kube_pod_container_resource_requests{job="kube-state-metrics", cluster="$cluster", namespace="$namespace", resource="cpu"})
>
> Does not return data

@jeremydescamps

jeremydescamps commented Dec 22, 2022

Querying node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate on my Prometheus instance does not return anything.
Which exporter does this metric come from?
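
Going by the rule definitions quoted in this thread, this series does not come from an exporter directly. It is produced by a Prometheus recording rule that joins container_cpu_usage_seconds_total (cAdvisor, scraped through the kubelet job) with kube_pod_info (kube-state-metrics). A hedged way to find which input is empty is to run each of the following on its own; the matchers are copied from the rule as quoted here and may differ in your chart version:

```promql
# Input 1: cAdvisor series with the matchers the rule applies.
count(container_cpu_usage_seconds_total{job="kubelet", metrics_path="/metrics/cadvisor", image!=""})

# Input 2: pod metadata from kube-state-metrics.
count(kube_pod_info{node!=""})
```

If either query returns nothing, the recorded series will be empty as well.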

@fguiet

fguiet commented Dec 22, 2022

Hi there,

Prometheus stack chart: kube-prometheus-stack-43.1.1, App Version: 0.61.1
K8s deployed with Rancher Docker version 2.7: 1.24.4

For me, it was related to this issue: k3s-io/k3s#5782
As mentioned in that issue, the image label is now missing.

Workaround: I removed the image!="" matcher from all the rules in the prometheus-stack-kube-prom-k8s.rules.yaml file, and now my Grafana dashboards work like a charm.

# Extract from file: prometheus-stack-kube-prom-k8s.rules.yaml
- name: k8s.rules
  rules:
    - expr: >-
        sum by (cluster, namespace, pod, container) (
          irate(container_cpu_usage_seconds_total{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}[5m])
        ) * on (cluster, namespace, pod) group_left(node) topk by (cluster,
        namespace, pod) (
          1, max by(cluster, namespace, pod, node) (kube_pod_info{node!=""})
        )
      record: >-
        node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate


This file has to be modified as well: prometheus-stack-kube-prom-k8s-resources-workload.yaml
Remove: container!="" and image!=""
Don't forget to kill the prometheus-stack-grafana pod so the dashboards get updated!
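
For reference, the workaround described above amounts to evaluating the k8s.rules expression with the image!="" matcher dropped. The following is a sketch of that expression for testing in the Prometheus expression browser, not the chart's shipped rule:

```promql
# Same expression as the k8s.rules extract, with only image!="" removed.
sum by (cluster, namespace, pod, container) (
  irate(container_cpu_usage_seconds_total{job="kubelet", metrics_path="/metrics/cadvisor"}[5m])
) * on (cluster, namespace, pod) group_left(node) topk by (cluster, namespace, pod) (
  1, max by (cluster, namespace, pod, node) (kube_pod_info{node!=""})
)
```

Without the image and container matchers, pod-level cgroup series may be included too, so totals could come out higher than expected on runtimes that do expose those labels.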

@github-actions

This issue has been automatically marked as stale because it has not had any activity in the last 60 days. Thank you for your contributions.

@bmgante

bmgante commented Mar 17, 2023

Hi @fguiet
I got stuck on this problem as well. I am using the latest Helm chart version with minikube v1.28.0.

I've already removed the image!="" matcher from k8s.rules, and the CPU dashboards started working.

However, I still have issues with the memory dashboards, which mostly use the container_memory_working_set_bytes metric.

Example:
sum(container_memory_working_set_bytes{job="kubelet", metrics_path="/metrics/cadvisor", cluster="$cluster", namespace="$namespace", pod="$pod", container!="", image!=""}) by (container)

In Prometheus, this metric has no cluster, container or image labels. Did you face this issue as well, and if so, how did you fix it?


Similar issue with dashboards using fs metrics (and probably a lot of other metrics from cadvisor):
sum by(container) (rate(container_fs_reads_total{job="kubelet", metrics_path="/metrics/cadvisor", device=~"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|md.+|dasd.+)", container!="", cluster="$cluster", namespace="$namespace", pod="$pod"}[$__rate_interval]))

Thanks
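
One way to see which of these labels the cAdvisor series actually carry is to group by them. This is only a diagnostic sketch; namespace="default" is a placeholder, not a value from the report above:

```promql
# Count series per (container, image) combination in one namespace.
# Series missing both labels collapse into a single unlabeled result.
count by (container, image) (
  container_memory_working_set_bytes{job="kubelet", metrics_path="/metrics/cadvisor", namespace="default"}
)
```

If everything collapses into a single result with no container or image label, the container!="" and image!="" matchers in the dashboard query will filter out every series.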

@github-actions github-actions bot removed the stale label Mar 18, 2023
@sirajkrm

FWIW, this has been addressed in later versions of Rancher Server (2.6.11), along with upgrading k8s to 1.24.10-rancher4-1.

@anthosz

anthosz commented May 25, 2023

> Querying node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate on my Prometheus instance does not return anything. Which exporter does this metric come from?

Same for me. Did you find the answer in the end?

I'm using the latest Helm chart of the Prometheus stack.

@zuchka

zuchka commented Jul 12, 2023

I'm hitting this as well. Any ideas?

@jpiazza35

same issue here

@DhruvPatel2647

DhruvPatel2647 commented Aug 1, 2023

Check whether you have this in your Prometheus values:

Before:
kubelet:
  serviceMonitor:
    https: false

After (this works):
kubelet:
  serviceMonitor:
    https: true

This matters because the kubelet is responsible for those metrics.

In my case I had disabled HTTPS in the ServiceMonitor for the kubelet. After some research I found that HTTPS should be enabled for the kubelet, that is, https: true.
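
To check whether the kubelet endpoints are being scraped at all after changing this setting, the built-in up series can be queried. This sketch assumes the job and metrics_path target labels used by the rules in this thread:

```promql
# 1 means the last scrape of that target succeeded, 0 means it failed.
up{job="kubelet", metrics_path="/metrics/cadvisor"}
```

An empty result or a value of 0 suggests a scrape problem rather than a label problem.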

@gustavofbreunig

I'm facing this same issue, reported here.

Removing container!="" and image!="" from prometheus-stack-kube-prom-k8s.rules.yaml worked.

@mohamadkhani

mohamadkhani commented Sep 8, 2023

Thanks to @gustavofbreunig.

Removing all image!="" matchers from the charts/kube-prometheus-stack/templates/prometheus/rules-1.14/k8s.rules.yaml file fixed my problem too.

This is my fork if anyone wants to check it.

@gustavofbreunig

Related issue: google/cadvisor#3336

@gustavofbreunig

It was a Rancher issue, fixed in v1.24.10-rancher4-1

rancher/rancher#38934

This issue can be closed.

@nathanmcgarvey-modopayments

nathanmcgarvey-modopayments commented Sep 8, 2023

> Related issue: google/cadvisor#3336

Also related details if you are on Docker-Desktop: docker/for-mac#6969

Edit: ...or potentially just using the Docker driver for minikube, Docker Desktop, or really anything that involves Docker.


github-actions bot commented Nov 8, 2023

This issue has been automatically marked as stale because it has not had any activity in the last 60 days. Thank you for your contributions.

@github-actions github-actions bot added the stale label Nov 8, 2023

github-actions bot commented Mar 8, 2024

This issue was closed because it has not had any activity in the last 120 days. Please reopen if you feel this is still valid.

@github-actions github-actions bot closed this as not planned Mar 8, 2024