Unable to see container's metrics about external volumes attached (k8s persistent volumes) #1702
The current scope of cAdvisor is restricted to monitoring containers. It provides the kubelet with container metrics only, and has no concept of volumes, pods, or any other higher-level Kubernetes API objects. The kubelet has a built-in cAdvisor, which monitors containers in Kubernetes. The kubelet takes the container metrics provided by cAdvisor, combines them with volume metrics and Kubernetes-specific metadata (e.g. container->pod mappings), and produces the summary API, which is exposed by the kubelet at :10255/stats/summary. The underlying issue you are raising here is that the kubelet does not expose this via prometheus.
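For reference, a quick way to inspect that summary API from a node (a sketch; 10255 is the kubelet's default read-only port and may be disabled or require authentication on some clusters):

```sh
# Dump the volume stats of the first pod from the kubelet's summary API.
# The jq filter is illustrative; the full document contains node, pod,
# container, and volume stats.
curl -s http://localhost:10255/stats/summary | jq '.pods[0].volume'
```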
Hi @dashpole, so you're saying all the relevant information is already available/provided by cAdvisor, but not yet mapped/interpreted by the kubelet's summary implementation?
No, cAdvisor does not provide volume metrics, and has no concept of volumes or pods, which is why there are no volume metrics exposed via prometheus.
Interesting, thanks for the insight @dashpole. Seems like we should maybe work towards exposing the stats API as Prometheus metrics by the kubelet.
Would just adding volume metrics to the kubelet's prometheus endpoint be sufficient?
Hi @dashpole, yes, the information we are looking for is available in :10255/stats/summary, and as you said, it's not translated into any kind of metric (that was actually the issue for me). I don't want to ask for anything out of scope for any component, so my question is more or less: who do you think should be responsible for translating that into a prometheus metric? Your proposal of "just adding volume metrics to the kubelet's prometheus endpoint" is exactly what I was looking for, but I don't know if that should be done by the kubelet or by something else (controller-manager?). I also fully agree that cadvisor should know nothing about k8s concepts like persistent volumes, but from the container's perspective, if a container has a filesystem mounted, I expect prometheus metrics to be exported about that filesystem, which is actually very important to track. Let me know your view, and thanks in advance!
@brancz so would this be escalated with kubernetes/kubernetes then?
@gnufied told me there is something already in the works, maybe he can comment on where we should go next.
@brancz yes I was talking about this proposal - kubernetes/community#855 cc @jingxu97
@gnufied I don't want to hijack that PR so I'll ask here. Do you think it's reasonable to also add all of those metrics as metrics exposed on the kubelet's Prometheus endpoint?
I think it is reasonable. We should also have a discussion on the relationship between kubernetes and prometheus. Seems odd to have some metrics on the cadvisor port, and others on the kubelet port. I think ideally, the prometheus endpoint should mirror the information provided by the kubelet's http endpoints (e.g. the summary API).
I guess my biggest concern/question is that I don't know how prometheus deals with metrics changing. What about a metric (name, format, labels) can change across a release without causing disruption? Prometheus doesn't appear to have versioning.
Anyone please correct me if I'm wrong - but @dashpole prometheus merely reads what's there - and it's entirely up to the prometheus consumers to deal with changed labels/data... The major usages are basically prometheus alert rules and grafana to visualise based on the metrics.
@dashpole the proposal you mentioned will simply expose the storage metrics - and seems to aim to map/provide them at the PVC level.
or e.g. mirror existing metrics like container_fs_*.
@gnufied: I think it should be :10255/metrics, what other endpoint do you see? @dashpole: when you mention the cadvisor port, what port do you mean? I thought that, in terms of metrics, cadvisor/kubelet were going to expose only one set of metrics. If there are 2 endpoints, let me know which is the other one and I will check if the info we are looking for is already available there. As you mention, the summary API has the correct and complete set of information, so someone (I don't know who) needs to put that information into a "metrics" format. But when you talk about the "prometheus" and "kubelet" relationship I don't completely get you.
Regarding metric names and changes, I agree with @hartmut-pq: there won't be any important disruption in prometheus if that happens, just that all components that read data from prometheus will need to be updated to start digging for the new names. The information we are looking for, from the container's point of view, is the same information you are already providing for the other filesystems of the container (size, usage bytes, free bytes, ...) (container_fs_*). This is the info for a container as reported by the stats/summary endpoint:
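For illustration, the per-pod volume entry in the summary API looks roughly like this (field names from the summary API; the values and volume name are invented):

```json
{
  "volume": [
    {
      "time": "2017-09-01T12:00:00Z",
      "name": "my-persistent-volume",
      "capacityBytes": 8415989760,
      "usedBytes": 512335872,
      "availableBytes": 7903653888,
      "inodes": 2048512,
      "inodesUsed": 36,
      "inodesFree": 2048476
    }
  ]
}
```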
And that volume (container_fs) is completely missing in the kubelet metrics endpoint (which I also thought was the cadvisor endpoint).
@eedugon Thanks for your comment and feedback. I am currently working on a feature to expose storage metrics to users, which will address your issue.
@jingxu97 that sounds great! I think this is something we can discuss further in the proposal. Feel free to tag us once you have something ready. 🙂
@jingxu97: just for curiosity... any progress on this issue? Thanks in advance!
@eedugon sorry that I missed your message. Currently you can use the following ways to get PVC metrics:
Please let me know if you have problems or questions. Thanks~!
@eedugon Kubernetes 1.8 exposes the following volume-related metrics for Prometheus, which can be used to monitor PVC disk usage:

- kubelet_volume_stats_capacity_bytes
- kubelet_volume_stats_available_bytes
- kubelet_volume_stats_used_bytes
- kubelet_volume_stats_inodes
- kubelet_volume_stats_inodes_free
- kubelet_volume_stats_inodes_used
Would love to see a dashboard and/or alerting rules with these! 🙂
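As a starting point, a hedged sketch of a Prometheus alerting rule built on these metrics (the threshold, rule name, and annotation are illustrative):

```yaml
groups:
- name: volume-usage
  rules:
  # Fire when a PVC-backed volume has less than 10% of its space left.
  - alert: PersistentVolumeUsageHigh
    expr: |
      kubelet_volume_stats_available_bytes
        / kubelet_volume_stats_capacity_bytes < 0.10
    for: 5m
    annotations:
      summary: "PVC {{ $labels.persistentvolumeclaim }} is almost full"
```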
@tiloso I run k8s 1.8.3 with vsphere volumes, and I do not have these metrics...
PVC volume-related metrics have been introduced by this commit, which seems to be part of Kubernetes >= 1.8.0. Here's a very simple example of how we use them to get an idea of the disk usage:
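A sketch of the kind of query and legend in question (the exact expression and legend format here are illustrative, assuming the kubelet_volume_stats_* names above):

Grafana query:

```
# Percentage of each PVC-backed volume in use
kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes * 100
```

Grafana legend:

```
{{ namespace }}/{{ persistentvolumeclaim }}
```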
@tiloso I see, but I do not have metrics with the name kubelet_volume* (not in prometheus, not when curling the kubelet), but I do have PVCs.
@tiloso awesome! I can definitely see very nice kubelet_volume_stats_* metrics now.
I don't see these metrics either... @f0 did you find a solution?
Yes, these metrics were added in kubernetes/kubernetes#51553, which first became available in 1.8.0.
@cyrus-mc that was fixed recently and we have backported it to 1.9: kubernetes/kubernetes#60013. If you are still on 1.9, you should upgrade to the next version with the fix.
Apologies for joining late on this thread... but I am confused by @dashpole's comment. Regardless of how the kubelet collects and aggregates cAdvisor metrics, your explanation does not make it clear to me why the device in question (/dev/xvdbg in the example above) never shows up in the metrics.

I'm running into a similar issue on our k8s clusters, where we need to gather metrics on volumes mounted by each container. Independently of k8s concepts, I expected cAdvisor to report metrics on every device it finds inside the container. I'm not sure the answer provides an explanation. I'm glad to use the kubelet's stats endpoint if needed.

Anyone else still having issues on this? I can't find a way to get volume metrics for our NFS-backed volumes.
@juliohm1978 You need to query the kubelet's metrics endpoint.
I could be wrong. Isn't this the kubelet endpoint?
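A sketch of the kind of check in question (assuming the kubelet's default read-only port 10255, which may be disabled on newer clusters):

```sh
# List the volume-related series the kubelet exposes for Prometheus.
curl -s http://localhost:10255/metrics | grep kubelet_volume_stats
```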
@juliohm1978 you are not seeing those metrics for NFS volumes because the NFS volume plugin does not implement the necessary metric interface. If you are up to it, see the kubernetes/kubernetes#62644 github issue about how to fix it. The reason some of the volume types don't implement the metric interface is that we haven't had a pressing need until now. We welcome any patches for fixing it though.
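For context, a simplified sketch of the interface in question (modeled on MetricsProvider in k8s.io/kubernetes/pkg/volume; the field set here is trimmed for illustration):

```go
package volume

// Metrics holds capacity/usage data for one mounted volume.
// (The real struct uses resource.Quantity fields and also carries
// inode counts; plain int64 is used here to keep the sketch short.)
type Metrics struct {
	Used      int64 // bytes in use on the volume
	Capacity  int64 // total bytes on the volume
	Available int64 // bytes still free
}

// MetricsProvider is what the kubelet calls when building volume stats
// for the summary API and the kubelet_volume_stats_* series. A plugin
// that returns an "unsupported" error here (as the NFS plugin did)
// simply produces no volume metrics.
type MetricsProvider interface {
	GetMetrics() (*Metrics, error)
}
```

Filesystem-backed plugins typically satisfy this by delegating to a statfs-based helper over the volume's mount path, which is the kind of patch the linked issue asks for.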
Excellent. Thank you!
Strange that I am seeing stale metrics for volumes that are no longer attached to the host. I am running 1.8.11. Has anyone seen this issue? Is upgrading to 1.9.latest the only way to solve this?
@tiloso Actually, I can fetch the kubelet_volume_* metrics, but I get multiple values for a single persistent volume on different nodes. I have a cluster on GKE and claim a PV for a pod in namespace X. When I query for kubelet_volume_stats_used_bytes, I get multiple records for the same persistent volume in the same namespace, with different values on different cluster nodes. Do you have any idea, or have you ever faced this issue?
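One hedged workaround for the duplicate series, assuming the standard namespace and persistentvolumeclaim labels on these metrics, is to aggregate the per-node dimension away, e.g.:

```
max by (namespace, persistentvolumeclaim) (kubelet_volume_stats_used_bytes)
```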
Have the same situation: PV+PVC under an nfs deployment + GCE disk.
Do host-path and nfs volumes not implement the metric interface for now? @gnufied
Is it possible to extend
I am running a k8s cluster on digital ocean; is the do-block-storage volume type supported in prometheus monitoring via the published kubelet_volume* metrics? I cannot access these metrics via grafana (Prometheus Operator setup). Running K8S 1.13+.
I don't see these metrics either... with the latest k8s version...
I'm closing this issue, as it is out of the scope of cAdvisor. If there is a specific kubernetes volume type that doesn't have metrics, or a bug with the existing kubelet volume metrics, that would probably be best tracked with an issue in k/k.
For anyone struggling to find an answer like I did: if you use a CSI plugin for persistent volumes and that plugin doesn't expose metrics, you won't find them. For example: kubernetes-sigs/aws-ebs-csi-driver#223
Hello,
We are having a discussion in the "prometheus-operator" project and, based on @brancz's suggestion, we are raising the topic here, as our problem seems related to cadvisor in some way.
(This is the original thread, just in case: prometheus-operator/prometheus-operator#485)
In a kubernetes + prometheus-operator environment we are not able to get any kind of metric about persistent volumes that are attached to containers in pods, and we don't really understand why.
Leaving aside the question of whether Kubernetes itself should provide information about them (PVs are, after all, K8s resources), we are wondering why we can't find any information in the cadvisor/kubelet-level metrics.
For example, a container might show the following disks (from df -h run directly in a shell inside the container):
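A rough reconstruction of that kind of output (the device names match the description below; sizes and mount points are illustrative):

```
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      154G   40G  114G  26% /etc/hosts
/dev/xvdbg       50G   12G   38G  24% /var/lib/kafka
```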
In that output, the "154G" device is the main disk of the container's owner (the k8s node), and we have no problem finding metrics about that disk at all levels.
But the other device (/dev/xvdbg) is actually the persistent volume that is mounted in the container, and there's no trace of it at all in the metrics.
The container_fs_usage_bytes information for that pod/container has only:
container_fs_usage_bytes{container_name="kafka",device="/dev/xvda1",.....}
Which actually represents the physical disk of the k8s node that owns the container.
But there's nothing else about /dev/xvdbg, and that's what we are looking for...
Do you have any suggestion, idea or explanation for this behavior? Are we missing something at the configuration level to make the kubelet report this?
Note 1: At the node level, lsblk shows the disk of the PV:
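Roughly (device names as above; sizes, minor numbers, and mount points are illustrative):

```
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
xvda    202:0    0  154G  0 disk
└─xvda1 202:1    0  154G  0 part /
xvdbg   202:80   0   50G  0 disk /var/lib/kubelet/pods/<pod-uid>/volumes/...
```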
On the node we have cadvisor/kubelet metrics and also node_exporter... none of them are reporting what we are looking for. I suspect we are missing some kind of mapping or configuration somewhere, because the disk is there...
Thanks very much in advance, any help will be appreciated,
Eduardo
(+ @hartmut-pq)