agent is not starting with error /sys/fs/cgroup/freezer/kubepods/guaranteed: no such file or directory #26
Comments
Hello.
What do they say? |
Here is the thing: the cluster has 10 nodes, all running the same OS (COS) and kernel.
Only one (random) perforator agent works well and sends profiles etc.; all the others are crashlooping with the error above.
i'm a bit limited with ssh to these nodes, once i can get in i will provide cgroup info. thanks |
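For anyone gathering the same cgroup info on a node, here is a minimal Python sketch of what could be collected. It is not part of perforator; the function name `describe_kubepods` and its parameters are hypothetical. It assumes the cgroup v1 layout implied by the error path (`/sys/fs/cgroup/freezer/kubepods`) and treats a `cgroup.controllers` file at the cgroup root as a sign of a cgroup v2 unified hierarchy.

```python
import os


def describe_kubepods(cgroup_root="/sys/fs/cgroup", controller="freezer"):
    """Report which directories exist under <cgroup_root>/<controller>/kubepods.

    Hypothetical diagnostic helper; it only reads directory names.
    """
    kubepods = os.path.join(cgroup_root, controller, "kubepods")
    info = {
        # On a cgroup v2 unified hierarchy the root contains cgroup.controllers.
        "unified_v2": os.path.isfile(os.path.join(cgroup_root, "cgroup.controllers")),
        "kubepods_path": kubepods,
        "kubepods_exists": os.path.isdir(kubepods),
        "children": [],
    }
    if info["kubepods_exists"]:
        # List only subdirectories, sorted for stable output.
        info["children"] = sorted(
            entry.name for entry in os.scandir(kubepods) if entry.is_dir()
        )
    return info
```

Calling `describe_kubepods()` on a healthy node and on a crashlooping one and comparing the `children` lists should show whether a `guaranteed` directory is present on either.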
hello @MikailBag
|
The agent that is not crashlooping runs on a node that also has no guaranteed directory.
|
Thank you, I think this clarifies things a lot. It seems that your nodes have
This is strange. Maybe for some reason it has a different configuration (i.e. it profiles the whole system instead of contacting the kubelet). Anyway, for now we need to add support for cgroupsPerQOS being disabled; that should help. |
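To illustrate the idea of tolerating a missing QoS directory, here is a rough Python sketch. It is not the actual perforator change and makes no claim about how the agent is implemented. The helper names (`existing_qos_dirs`, `dirs_to_watch`) are hypothetical, and the assumption that QoS directories are named `guaranteed`, `burstable` and `besteffort` under the kubepods root is taken from the error path plus the Kubernetes QoS class names.

```python
import os

# Assumed per-QoS directory names; "guaranteed" comes from the error message.
QOS_CLASSES = ("guaranteed", "burstable", "besteffort")


def existing_qos_dirs(kubepods_root):
    """Return {qos_class: path} for the QoS directories that actually exist."""
    found = {}
    for qos in QOS_CLASSES:
        path = os.path.join(kubepods_root, qos)
        if os.path.isdir(path):
            found[qos] = path  # missing directories are skipped, not fatal
    return found


def dirs_to_watch(kubepods_root):
    """Directories to scan for pod cgroups.

    Always includes the kubepods root itself, followed by whichever QoS
    subdirectories are present. Returns [] if the root does not exist.
    """
    if not os.path.isdir(kubepods_root):
        return []
    return [kubepods_root] + sorted(existing_qos_dirs(kubepods_root).values())
```

The point of the sketch is only that an absent `guaranteed` directory yields a shorter list instead of a `no such file or directory` failure.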
Interesting, I don't see it set to false
|
full config
|
I think 5fcb948 should resolve this issue once we make a new release. |
Thanks @MikailBag, I look forward to it. |
Update: release v0.0.2 is now available |
thanks @MikailBag now the agent works |
Thank you for the report, it was very helpful! |
Hello, I'm trying to run perforator in k8s using the helm chart, but most of the agent pods are crashlooping with the error:
tskv ts=2025-02-04T15:14:49.326760602Z level=error logger=profiler worker=pods cgroup tracker msg=Worker failed error=open /sys/fs/cgroup/freezer/kubepods/guaranteed: no such file or directory
tskv ts=2025-02-04T15:14:49.326874872Z level=error logger=profiler worker=process poller msg=Worker failed error=context canceled
The control plane version is 1.30.
Kernel Version: 6.6.56+
OS Image: Container-Optimized OS from Google
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.7.24
Kubelet Version: v1.31.4-gke.1256000
Kube-Proxy Version: v1.31.4-gke.1256000
What is odd is that I always have one agent running without any errors on a node with the same characteristics as above.
Please advise.
Thanks