Doc "Important Components for Kubernetes / Kubeletstats receiver": ClusterRole missing optional permissions #6033

Open
r-asiebert opened this issue Jan 22, 2025 · 2 comments

Comments

@r-asiebert
URL

https://opentelemetry.io/docs/kubernetes/collector/components/#kubeletstats-receiver

Recommended change

The documented ClusterRole for the kubeletstats receiver should also list the receiver's optional permissions.
They could be included as commented-out rules, for example:

...
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector
rules:
  - apiGroups: ['']
    resources: ['nodes/stats']
    verbs: ['get', 'watch', 'list']
  # The following is needed when using extra_metadata_labels or any of the {request|limit}_utilization metrics
  #- apiGroups: ['']
  #  resources: ['nodes/proxy']
  #  verbs: ['get']
...

Context

After enabling the optional metric k8s.container.memory_limit_utilization in the kubeletstats receiver, my OTel Collectors started partially failing with these errors:

[email protected]/scraper.go:113 call to /pods endpoint failed {"kind": "receiver", "name": "kubeletstats", "data_type": "metrics", "error": "kubelet request GET https://<snipped_node_ip>:10250/pods failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:otel:otel-node-service-account, verb=get, resource=nodes, subresource=proxy)""}
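
A minimal sketch of a receiver configuration that enables this metric and so triggers the call to the /pods endpoint. The auth_type, endpoint, and collection_interval values are assumptions based on a typical node-level setup, not the configuration from this report:

```yaml
receivers:
  kubeletstats:
    # Assumed values for a DaemonSet deployment; adjust to your cluster.
    collection_interval: 20s
    auth_type: serviceAccount
    endpoint: 'https://${env:K8S_NODE_NAME}:10250'
    metrics:
      # Optional metric, disabled by default. Enabling it makes the
      # receiver query the kubelet /pods endpoint, which requires
      # the nodes/proxy permission.
      k8s.container.memory_limit_utilization:
        enabled: true
```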

This caught me by surprise, as the ClusterRole described on the linked page was already bound to the service account these pods use and had worked fine for weeks.

After granting the permission named in the error (get on nodes/proxy), my otelcol daemonset was back in operation.
I did so after finding the same optional permission addressed in open-telemetry/opentelemetry-operator#3155.
https://github.com/open-telemetry/opentelemetry-operator/blob/1980f0877e5cff8e41ff3eafafe4c57133d7c899/internal/components/receivers/kubeletstats.go#L65-L93 shows that nodes/proxy is needed "when using extra_metadata_labels or any of the {request|limit}_utilization metrics".
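
A minimal sketch of the resulting role with the optional rule enabled. The name otel-collector is taken from the documented example and is an assumption about any particular cluster:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector  # assumed name, copied from the documented example
rules:
  - apiGroups: ['']
    resources: ['nodes/stats']
    verbs: ['get', 'watch', 'list']
  # Needed only when using extra_metadata_labels or any of the
  # {request|limit}_utilization metrics.
  - apiGroups: ['']
    resources: ['nodes/proxy']
    verbs: ['get']
```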

@tiffany76
Contributor

Sounds like a good fix to me, @r-asiebert. Thanks for raising the issue!

@open-telemetry/operator-approvers, please take a quick look.

@r-asiebert
Author

Complete guess: this might be for sig:collector instead, since the OTel operator addressed the same concern in the aforementioned open-telemetry/opentelemetry-operator#3155.
I couldn't use that operator feature for my otelcols because I needed to set up the k8s service account manually, which is why I relied on the doc in question.
