Detect container ID via cgroup inode when pa…#1367
…th is masked

When the profiler runs non-privileged with hostPID: true, it stays in its own private cgroup namespace (cgroup v2 default). Its cgroup path appears as "0::/" rather than the full host path, making container ID extraction via the standard regex impossible.

At startup, walk /proc/1/root/sys/fs/cgroup (the host's view) to find the directory whose inode matches the profiler's. The path of that directory contains the container ID.
florianl
left a comment
Good complementary addition to #1172 👍
Startup time might be slightly delayed depending on the number of elements traversed in hostCgroupRoot in DetectSelfContainerIDViaInode(). However, this is a one-time cost that should be negligible.
Co-authored-by: Florian Lehner <florianl@users.noreply.github.com>
| selfCgroupIno uint64 | ||
| // selfContainerID is the profiler's own container ID, detected once at startup. | ||
| // Used as a fallback when /proc/<pid>/cgroup yields no container ID for processes | ||
| // that share the profiler's cgroup directory (e.g., private cgroup namespace). | ||
| selfContainerID libpf.String |
nit (no need to apply): These seem to be constant for the whole execution. One option would be to replace them with a closure that captures the values.
| selfCgroupIno uint64 | |
| // selfContainerID is the profiler's own container ID, detected once at startup. | |
| // Used as a fallback when /proc/<pid>/cgroup yields no container ID for processes | |
| // that share the profiler's cgroup directory (e.g., private cgroup namespace). | |
| selfContainerID libpf.String | |
| fillContainerIDFallback func(pid libpf.PID, meta *process.ProcessMeta) |
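A rough sketch of what the suggested closure could look like, under stated assumptions: `PID` and `ProcessMeta` are simplified stand-ins for libpf.PID and process.ProcessMeta, and `newContainerIDFallback` and its `cgroupIno` lookup parameter are hypothetical names introduced only for illustration. The two startup values are captured once, so the struct no longer needs to carry them as fields.

```go
package main

// PID and ProcessMeta are simplified, hypothetical stand-ins for
// libpf.PID and process.ProcessMeta.
type PID uint32

type ProcessMeta struct {
	ContainerID string
}

// newContainerIDFallback returns a closure that captures the profiler's own
// cgroup inode and container ID, both detected once at startup.
// cgroupIno is an injected lookup (an assumption of this sketch) that returns
// the inode of a process's cgroup directory.
func newContainerIDFallback(
	selfCgroupIno uint64,
	selfContainerID string,
	cgroupIno func(PID) (uint64, error),
) func(pid PID, meta *ProcessMeta) {
	if selfCgroupIno == 0 || selfContainerID == "" {
		// Detection failed at startup: the fallback does nothing.
		return func(PID, *ProcessMeta) {}
	}
	return func(pid PID, meta *ProcessMeta) {
		if meta.ContainerID != "" {
			return // /proc/<pid>/cgroup already yielded an ID
		}
		ino, err := cgroupIno(pid)
		if err != nil || ino != selfCgroupIno {
			return // not in the profiler's cgroup directory
		}
		meta.ContainerID = selfContainerID
	}
}
```

Injecting the inode lookup keeps the closure testable without touching /proc. Whether that indirection is worth it for values that never change is the same judgment call the nit leaves open.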
| if err != nil { | ||
| return nil // skip inaccessible directories | ||
| } |
What happens if /proc/1/root/sys/fs/cgroup can't be walked? Seems like we'll return nil and swallow an error that we should probably be logging?
Right, thanks for flagging this. Fixed in 4cd3c0c
| if err != nil { | ||
| return libpf.NullString, 0, fmt.Errorf("failed to walk host cgroup tree: %w", err) | ||
| } |
Will this logic ever be triggered?
It can now be triggered in the case added by 4cd3c0c.
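For readers following the two threads above, here is one possible shape of a walk callback that makes the outer error check reachable. It is a sketch only and not necessarily what 4cd3c0c does: the function name `walkCgroupDirs` and the `visit` callback are hypothetical. The idea is to propagate a failure on the root itself, while logging and skipping inaccessible subdirectories.

```go
package main

import (
	"fmt"
	"io/fs"
	"log"
	"path/filepath"
)

// walkCgroupDirs calls visit for every directory under root.
// A failure on root itself is returned to the caller; failures on
// subdirectories are logged and skipped. Hypothetical helper.
func walkCgroupDirs(root string, visit func(path string)) error {
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			if path == root {
				return err // the root cannot be walked: surface the error
			}
			log.Printf("skipping inaccessible cgroup directory %s: %v", path, err)
			return nil
		}
		if d.IsDir() {
			visit(path)
		}
		return nil
	})
	if err != nil {
		return fmt.Errorf("failed to walk host cgroup tree: %w", err)
	}
	return nil
}
```

With a callback that always returns nil, `filepath.WalkDir` itself returns nil, which is why the wrapping branch could not fire before the root error was propagated.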
| return | ||
| } | ||
| var st unix.Stat_t | ||
| if err := unix.Stat(fmt.Sprintf("/proc/%d/root/sys/fs/cgroup", pid), &st); err != nil { |
Nit: We can turn this into an exported, documented function in process/process.go, since hostCgroupRoot there already uses the same pattern.
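A sketch of what such an exported helper might look like. The names `CgroupRootPath` and `CgroupRootInode` are hypothetical, and the standard `syscall` package stands in for `golang.org/x/sys/unix` so the snippet has no external dependencies.

```go
package main

import (
	"fmt"
	"syscall"
)

// CgroupRootPath returns the cgroup filesystem root as seen through the
// root directory of the given process. For pid 1 this is the host's view
// when running with hostPID: true. Hypothetical name.
func CgroupRootPath(pid int) string {
	return fmt.Sprintf("/proc/%d/root/sys/fs/cgroup", pid)
}

// CgroupRootInode stats the cgroup root of pid and returns its inode number.
// Hypothetical name.
func CgroupRootInode(pid int) (uint64, error) {
	var st syscall.Stat_t
	if err := syscall.Stat(CgroupRootPath(pid), &st); err != nil {
		return 0, fmt.Errorf("failed to stat cgroup root of pid %d: %w", pid, err)
	}
	return uint64(st.Ino), nil
}
```

Both call sites could then share one documented definition of the path layout instead of repeating the format string.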
I am currently working on removing the profiler's need to run privileged and instead granting it specific capabilities to mitigate security risks. When the profiler runs non-privileged with hostPID: true, it stays in its own private cgroup namespace. Its cgroup path appears as "0::/" rather than the full host path, making container ID extraction via the standard regex impossible. This PR adds an inode-based fallback to recover the container ID.