Detect container ID via cgroup inode when pa…#1367

Merged
christos68k merged 4 commits into
open-telemetry:mainfrom
theomagellan:theo/self-container-id-inode
Apr 28, 2026

Conversation

@theomagellan
Contributor

@theomagellan theomagellan commented Apr 23, 2026

I'm currently working on removing the profiler's need to run privileged, granting it specific capabilities instead to reduce security risk. When the profiler runs non-privileged with hostPID: true, it stays in its own private cgroup namespace. Its cgroup path then appears as 0::/ rather than the full host path, so the standard regex cannot extract the container ID.

This PR adds an inode-based fallback to recover the container ID.

  1. At startup: walk /proc/1/root/sys/fs/cgroup (the host's view) looking for the directory whose inode matches the profiler's. The path of the matching directory contains the container ID.
  2. Per new PID: if the standard extraction returned empty, compare the inode of /proc/<pid>/root/sys/fs/cgroup against the profiler's cached inode. Match -> same container -> same container ID.
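
The startup walk in step 1 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `inodeOf`, `findDirByInode`, and the 64-hex-character `containerIDRe` pattern are hypothetical names and assumptions, it uses the standard library's `syscall` package where the PR's snippet uses `unix.Stat`, and a temporary directory stands in for /proc/1/root/sys/fs/cgroup.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"regexp"
	"strings"
	"syscall"
)

// containerIDRe is an illustrative pattern (64 hex characters). It is an
// assumption for this sketch, not the profiler's actual regex.
var containerIDRe = regexp.MustCompile(`[0-9a-f]{64}`)

// inodeOf returns the inode number of path.
func inodeOf(path string) (uint64, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, errors.New("no syscall.Stat_t for " + path)
	}
	return uint64(st.Ino), nil
}

// findDirByInode walks root and returns the first directory whose inode
// equals ino, or "" if no directory matches.
func findDirByInode(root string, ino uint64) (string, error) {
	found := ""
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			if p == root {
				return err // the root itself cannot be walked: report it
			}
			return nil // skip inaccessible subdirectories
		}
		if !d.IsDir() {
			return nil
		}
		got, err := inodeOf(p)
		if err != nil {
			return nil // directory vanished or is unreadable: skip it
		}
		if got == ino {
			found = p
			return fs.SkipAll // stop at the first match
		}
		return nil
	})
	return found, err
}

func main() {
	// A temporary tree stands in for the host's cgroup hierarchy.
	root, err := os.MkdirTemp("", "cgroup-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	id := strings.Repeat("ab", 32) // made-up 64-character container ID
	target := filepath.Join(root, "kubepods.slice", "cri-containerd-"+id+".scope")
	if err := os.MkdirAll(target, 0o755); err != nil {
		panic(err)
	}

	ino, err := inodeOf(target)
	if err != nil {
		panic(err)
	}
	match, err := findDirByInode(root, ino)
	if err != nil {
		panic(err)
	}
	fmt.Println("matched directory:", match)
	fmt.Println("container ID recovered:", containerIDRe.FindString(match) == id)
}
```

The sketch compares inode numbers only, which assumes both views refer to the same cgroup filesystem.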

…th is masked

When the profiler runs non-privileged with hostPID: true, it stays in
its own private cgroup namespace (cgroup v2 default). Its cgroup path
appears as "0::/" rather than the full host path, making container ID
extraction via the standard regex impossible.

At startup, walk /proc/1/root/sys/fs/cgroup (the host's view) to find
the directory whose inode matches the profiler's. The path of that
directory contains the container ID.
@theomagellan theomagellan marked this pull request as ready for review April 24, 2026 08:27
@theomagellan theomagellan requested review from a team as code owners April 24, 2026 08:27
Member

@florianl florianl left a comment

Good complementary addition to #1172 👍

Startup time might be slightly delayed depending on the number of elements traversed in hostCgroupRoot in DetectSelfContainerIDViaInode(). However, this is a one-time cost that should be negligible.

Comment thread processmanager/processinfo.go
Co-authored-by: Florian Lehner <florianl@users.noreply.github.com>
Comment thread processmanager/types.go
Comment on lines +120 to +125
selfCgroupIno uint64

// selfContainerID is the profiler's own container ID, detected once at startup.
// Used as a fallback when /proc/<pid>/cgroup yields no container ID for processes
// that share the profiler's cgroup directory (e.g., private cgroup namespace).
selfContainerID libpf.String
Contributor

@rogercoll rogercoll Apr 27, 2026

nit (no need to apply): These seem to be constant across the whole execution. One option would be to replace them with a closure that captures the values.

Suggested change
- selfCgroupIno uint64
- // selfContainerID is the profiler's own container ID, detected once at startup.
- // Used as a fallback when /proc/<pid>/cgroup yields no container ID for processes
- // that share the profiler's cgroup directory (e.g., private cgroup namespace).
- selfContainerID libpf.String
+ fillContainerIDFallback func(pid libpf.PID, meta *process.ProcessMeta)
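
For illustration, the closure idea could look something like the sketch below. It is only a sketch under assumptions: `ProcessMeta` is a stand-in for `process.ProcessMeta`, plain `int` and `string` replace `libpf.PID` and `libpf.String`, and `newContainerIDFallback` and its injected `cgroupInode` lookup are hypothetical names that do not come from the PR.

```go
package main

import "fmt"

// ProcessMeta is a stand-in for the profiler's process.ProcessMeta
// (assumption: only the container ID matters for this sketch).
type ProcessMeta struct {
	ContainerID string
}

// newContainerIDFallback captures the values detected once at startup and
// returns a function that fills in meta.ContainerID for processes sharing
// the profiler's cgroup directory. cgroupInode is injected so the sketch
// can run without touching /proc.
func newContainerIDFallback(selfIno uint64, selfID string,
	cgroupInode func(pid int) (uint64, error)) func(pid int, meta *ProcessMeta) {
	if selfID == "" {
		// Startup detection found nothing: the fallback is a no-op.
		return func(int, *ProcessMeta) {}
	}
	return func(pid int, meta *ProcessMeta) {
		if meta.ContainerID != "" {
			return // the standard extraction already succeeded
		}
		ino, err := cgroupInode(pid)
		if err != nil || ino != selfIno {
			return // different cgroup directory, or lookup failed
		}
		meta.ContainerID = selfID
	}
}

func main() {
	// Stub lookup: PID 42 shares the profiler's cgroup inode, others do not.
	stub := func(pid int) (uint64, error) {
		if pid == 42 {
			return 1234, nil
		}
		return 9999, nil
	}
	fill := newContainerIDFallback(1234, "abc123", stub)

	same, other := &ProcessMeta{}, &ProcessMeta{}
	fill(42, same)
	fill(7, other)
	fmt.Printf("%q %q\n", same.ContainerID, other.ContainerID)
}
```

With this shape the struct would hold a single function field instead of two values, at the cost of making the captured state harder to inspect.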

Comment thread process/process.go
Comment on lines +205 to +207
if err != nil {
	return nil // skip inaccessible directories
}
Member

What happens if /proc/1/root/sys/fs/cgroup can't be walked? Seems like we'll return nil and swallow an error that we should probably be logging?

Contributor Author

Right, thanks for flagging this. Fixed in 4cd3c0c

Comment thread process/process.go
Comment on lines +223 to +225
if err != nil {
	return libpf.NullString, 0, fmt.Errorf("failed to walk host cgroup tree: %w", err)
}
Member

Will this logic ever be triggered?

Contributor Author

It can now be triggered in the case added by 4cd3c0c.

Comment thread processmanager/processinfo.go Outdated
return
}
var st unix.Stat_t
if err := unix.Stat(fmt.Sprintf("/proc/%d/root/sys/fs/cgroup", pid), &st); err != nil {
Member

Nit: We can turn this into an exported, documented function in process/process.go, since hostCgroupRoot there already uses the same pattern.

Contributor Author

Done in 09ffd67
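
As a rough illustration of what such a helper might look like (the function actually added in 09ffd67 is not shown in this conversation), here is a sketch. `cgroupRootPath` and `CgroupRootInode` are hypothetical names, and the sketch uses the standard library's `syscall` package where the reviewed snippet uses `unix.Stat`.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// cgroupRootPath builds the path through which pid's view of the cgroup
// filesystem root is reachable from the profiler.
func cgroupRootPath(pid int) string {
	return fmt.Sprintf("/proc/%d/root/sys/fs/cgroup", pid)
}

// CgroupRootInode (hypothetical name) returns the inode of the cgroup
// directory that pid sees as its cgroup root. Two processes in the same
// private cgroup namespace are expected to report the same inode.
func CgroupRootInode(pid int) (uint64, error) {
	path := cgroupRootPath(pid)
	fi, err := os.Stat(path)
	if err != nil {
		return 0, fmt.Errorf("stat %s: %w", path, err)
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, errors.New("no syscall.Stat_t for " + path)
	}
	return uint64(st.Ino), nil
}

func main() {
	// Look up the inode for this process; this can fail on systems
	// without /proc or without a mounted cgroup filesystem.
	ino, err := CgroupRootInode(os.Getpid())
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("cgroup root inode:", ino)
}
```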

@christos68k christos68k merged commit f1388e8 into open-telemetry:main Apr 28, 2026
31 checks passed

5 participants