No metrics with containerd #764
Comments
I was having a similar issue with cadvisor not returning full metrics when using containerd. Is this related? #724. I was expecting the latest release to include that fix, but it doesn't seem to be present. The workaround in the PR fixes the issue for me.
Yes, definitely. Thanks a lot; I banged my head on this for hours.
We are experiencing a possibly related issue. When scraping https://localhost:10250/metrics/cadvisor, we noticed that the container label of the metrics is always empty when running the containerd runtime:
Example of expected behaviour:
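For reference, a healthy node exposes cadvisor series with the container and image labels populated, while an affected containerd node emits them empty. The metric lines below are hypothetical illustrations of the two shapes, not captured output:

```text
# Expected (labels populated):
container_cpu_usage_seconds_total{container="nginx",image="docker.io/library/nginx:1.21",pod="nginx-abc"} 1.23
# Observed on affected containerd nodes (labels empty):
container_cpu_usage_seconds_total{container="",image="",pod="nginx-abc"} 1.23
```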
PR #724 fixes the issue of missing metric labels when containerd is used with amazon-eks-node-v20211004 AMIs. But it is still broken for GPU-enabled AMIs (amazon-eks-gpu-node) as of v20211004: GPU instances running amazon-eks-gpu-node-1.21-v20211004 failed to join the cluster.
Hi @ArchiFleKs,
What happened:
I have several EKS 1.21 clusters with Docker as the runtime that work well with the prometheus operator (v18+). I have tested the EKS AMI with containerd, and I cannot get some Prometheus recording rules to work like on the other clusters.
The issue seems to be with containerd, but I'm not entirely sure. I have other bare-metal clusters using a pure containerd 1.5 install that work fine and where I can get CPU usage from Prometheus.
Here is a detailed explanation: prometheus-operator/kube-prometheus#1389
Basically the following query:
container_cpu_usage_seconds_total{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}
On EKS with containerd 1.4.6, if I remove the `image!=""` matcher the query works; if not, it returns zero. I'm kind of lost here because this only happens with EKS. For now I guess I'll roll back to using Docker.
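One way to confirm whether the bug is in the exposed metrics themselves (rather than in the recording rules) is to fetch the raw /metrics/cadvisor output from a node and count how many samples carry a non-empty image label. The sketch below parses Prometheus text-format output; the sample payload and the helper name are illustrative assumptions, not from this issue:

```python
import re

def count_nonempty_label(metrics_text, metric, label):
    """Count samples of `metric` whose `label` has a non-empty value."""
    count = 0
    for line in metrics_text.splitlines():
        if not line.startswith(metric + "{"):
            continue
        # Extract the label value; empty string means the label is unset.
        m = re.search(r'\b%s="([^"]*)"' % re.escape(label), line)
        if m and m.group(1):
            count += 1
    return count

# Hypothetical sample: one sample with the label populated, one without,
# mimicking the empty-label behaviour reported on affected containerd nodes.
sample = (
    'container_cpu_usage_seconds_total{container="",image="",pod="nginx-abc"} 1.23\n'
    'container_cpu_usage_seconds_total{container="nginx",image="docker.io/library/nginx:1.21",pod="nginx-abc"} 1.23\n'
)
print(count_nonempty_label(sample, "container_cpu_usage_seconds_total", "image"))
```

On an affected node, every `container_cpu_usage_seconds_total` sample would have an empty image value, which is exactly why `image!=""` filters out all series.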
What you expected to happen:
I expected the out-of-the-box Prometheus metrics to work as they do with the Docker runtime.
How to reproduce it (as minimally and precisely as possible):
Launch a cluster in EKS with containerd as the runtime.
Anything else we need to know?:
Environment:
- EKS Platform version (`aws eks describe-cluster --name <name> --query cluster.platformVersion`): eks.2
- Kubernetes version (`aws eks describe-cluster --name <name> --query cluster.version`): v1.21
- Kernel (`uname -a`):
- Release information (`cat /etc/eks/release` on a node):