No metrics with containerd #764
Comments
I was having a similar issue with cadvisor not returning full metrics when using containerd. Is this related? #724. I was expecting the latest release to include that fix, but it doesn't seem to be present. The workaround in the PR fixes the issue for me.
Yes, definitely. Thanks a lot; I banged my head on this for hours.
We are experiencing a possibly related issue. When scraping https://localhost:10250/metrics/cadvisor, we noticed that the container label of the metrics is always empty when running the containerd runtime:
Example of expected behaviour:
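For reference, a healthy node exposes cadvisor series with the container and image labels populated, while an affected containerd node emits them empty. The metric lines below are hypothetical illustrations of the two shapes, not captured output:

```text
# Expected (labels populated):
container_cpu_usage_seconds_total{container="nginx",image="docker.io/library/nginx:1.21",pod="nginx-abc"} 1.23
# Observed on affected containerd nodes (labels empty):
container_cpu_usage_seconds_total{container="",image="",pod="nginx-abc"} 1.23
```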
PR #724 fixes the issue of missing metric labels when containerd is used with amazon-eks-node-v20211004 AMIs. But it is still broken for GPU-enabled AMIs (amazon-eks-gpu-node) as of v20211004: GPU instances running amazon-eks-gpu-node-1.21-v20211004 failed to join the cluster.
Hi @ArchiFleKs,
What happened:
I have several EKS 1.21 clusters with Docker as the runtime that work well with the prometheus operator (v18+). I have tested the EKS AMI with containerd, and I cannot get some Prometheus recording rules to work like on the other clusters.
The issue seems to be with containerd, but I'm not entirely sure. I have other bare-metal clusters using a pure containerd 1.5 install that work fine and where I can get CPU usage from Prometheus.
Here is a detailed explanation: prometheus-operator/kube-prometheus#1389
Basically the following query:
container_cpu_usage_seconds_total{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}
On EKS with containerd 1.4.6, if I remove the `image!=""` matcher the query works; if not, it returns zero. I'm kind of lost here because this only happens with EKS. For now I guess I'll roll back to using Docker.
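One way to confirm whether the bug is in the exposed metrics themselves (rather than in the recording rules) is to fetch the raw /metrics/cadvisor output from a node and count how many samples carry a non-empty image label. The sketch below parses Prometheus text-format output; the sample payload and the helper name are illustrative assumptions, not from this issue:

```python
import re

def count_nonempty_label(metrics_text, metric, label):
    """Count samples of `metric` whose `label` has a non-empty value."""
    count = 0
    for line in metrics_text.splitlines():
        if not line.startswith(metric + "{"):
            continue
        # Extract the label value; empty string means the label is unset.
        m = re.search(r'\b%s="([^"]*)"' % re.escape(label), line)
        if m and m.group(1):
            count += 1
    return count

# Hypothetical sample: one sample with the label populated, one without,
# mimicking the empty-label behaviour reported on affected containerd nodes.
sample = (
    'container_cpu_usage_seconds_total{container="",image="",pod="nginx-abc"} 1.23\n'
    'container_cpu_usage_seconds_total{container="nginx",image="docker.io/library/nginx:1.21",pod="nginx-abc"} 1.23\n'
)
print(count_nonempty_label(sample, "container_cpu_usage_seconds_total", "image"))
```

On an affected node, every `container_cpu_usage_seconds_total` sample would have an empty image value, which is exactly why `image!=""` filters out all series.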
What you expected to happen:
I expected the out-of-the-box Prometheus metrics to work as they do with the Docker runtime.
How to reproduce it (as minimally and precisely as possible):
Launch a cluster in EKS with containerd as the runtime.
Anything else we need to know?:
Environment:
- EKS Platform version (`aws eks describe-cluster --name <name> --query cluster.platformVersion`): eks.2
- Kubernetes version (`aws eks describe-cluster --name <name> --query cluster.version`): v1.21
- Kernel (`uname -a`):
- Release information (`cat /etc/eks/release` on a node):