feat(containers): comprehensive metadata extraction with kubernetes integration #232
implement image, resource limits, and labels extraction from container runtime metadata files, addressing multiple container discovery enhancements. all metadata extraction failures are handled gracefully to ensure container discovery succeeds even when metadata files are unavailable or permissions are restricted. this implementation extracts metadata across all supported container runtimes (docker, containerd, cri-o, podman) and both cgroup v1 and v2 systems, providing consistent metadata access regardless of runtime environment.

image metadata extraction:
- parse container image references from runtime configuration files
- support all major runtimes: docker (config.v2.json), containerd (config.json annotations), cri-o (state.json annotations), podman (userdata/config.json)
- handle various image reference formats including registries with ports, digests, and tags
- extract clean image names by stripping registry paths and repository prefixes
- default to "latest" tag when unspecified
- handle both tagged (name:tag) and digest (name@sha256:...) references

resource limits extraction:
- read cpu limits from cgroup files: shares, quota, period, cpuset constraints
- read memory limits with proper handling of "max" sentinel values
- support both cgroup v1 (cpu.shares, memory.limit_in_bytes) and v2 (cpu.weight, memory.max)
- convert cgroup v2 cpu.weight to shares-equivalent using formula: shares = (weight - 1) * 1024 / 9999 + 2
- properly handle controller-specific paths in cgroup v1 (cpu,cpuacct vs memory)

container labels extraction:
- extract labels/annotations from runtime-specific configuration files
- support docker labels, containerd/cri-o annotations, podman labels
- merge both annotations and labels for cri-o (which maintains both)
- include kubernetes-specific labels (pod names, namespaces, app labels)

technical implementation:
- graceful degradation: all metadata extraction errors are silently logged
- handle truncated container ids via glob pattern matching
- proper path handling for both rootful and rootless container installations
- comprehensive error handling for missing files and permission denials
- thorough unit test coverage with 11+ test cases for image parsing edge cases

integration:
- integrate metadata extraction into internal/containers/manager.go GetContainers()
- metadata automatically populated when building container graph snapshots
- empty hostRoot parameter since container paths are already absolute

Closes #199 Closes #200 Closes #201 Closes #202
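The tagged/digest reference handling above can be sketched as follows. This is a minimal illustration, not the PR's actual code: `parseImageRef` is a hypothetical helper name, and the splitting rules are inferred from the bullet list (digest after `@`, tag after the last `:` that is not part of a registry host, `latest` as the default, registry/repository prefixes stripped).

```go
package main

import (
	"fmt"
	"strings"
)

// parseImageRef splits an image reference into a clean short name and a
// tag or digest, defaulting the tag to "latest" when unspecified.
// Hypothetical helper for illustration only.
func parseImageRef(ref string) (name, tag string) {
	if i := strings.Index(ref, "@"); i >= 0 {
		// Digest form: name@sha256:...
		name, tag = ref[:i], ref[i+1:]
	} else if i := strings.LastIndex(ref, ":"); i >= 0 && !strings.Contains(ref[i+1:], "/") {
		// A ':' after the last '/' is a tag; a ':' inside a registry
		// host (e.g. localhost:5000/app) is not.
		name, tag = ref[:i], ref[i+1:]
	} else {
		name, tag = ref, "latest"
	}
	// Strip registry and repository prefixes to get the short name.
	if i := strings.LastIndex(name, "/"); i >= 0 {
		name = name[i+1:]
	}
	return name, tag
}

func main() {
	fmt.Println(parseImageRef("localhost:5000/team/app:v1.2")) // app v1.2
	fmt.Println(parseImageRef("nginx"))                        // nginx latest
	fmt.Println(parseImageRef("redis@sha256:abc123"))          // redis sha256:abc123
}
```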
add container-specific human-readable identifier fields (container_name, workload_name) to metadata extraction and container graph nodes, enabling intuitive container identification across all runtimes without duplicating kubernetes pod-level fields.
container_name extraction:
- prioritize kubernetes container name from io.kubernetes.container.name label
- fall back to docker compose service name (com.docker.compose.service)
- default to image name when no explicit container name available
- provides consistent naming across kubernetes, docker, containerd, cri-o, podman runtimes
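The fallback chain above can be sketched in a few lines. The two label keys come from the PR text; `extractContainerName` is a hypothetical function name, and `imageName` is assumed to be the short image name produced by the image extraction step.

```go
package main

import "fmt"

// extractContainerName sketches the priority order described above:
// Kubernetes container name label, then Docker Compose service name,
// then the image name as a last resort.
func extractContainerName(labels map[string]string, imageName string) string {
	// 1. Kubernetes-managed containers carry an explicit name label.
	if name := labels["io.kubernetes.container.name"]; name != "" {
		return name
	}
	// 2. Docker Compose services expose the service name as a label.
	if name := labels["com.docker.compose.service"]; name != "" {
		return name
	}
	// 3. Default to the image name.
	return imageName
}

func main() {
	k8s := map[string]string{"io.kubernetes.container.name": "api"}
	fmt.Println(extractContainerName(k8s, "nginx"))                 // api
	fmt.Println(extractContainerName(map[string]string{}, "nginx")) // nginx
}
```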
workload_name extraction with hash stripping:
- derive workload names from kubernetes pod names by stripping generated hashes
- strip both replicaset hash and pod hash (e.g., "web-server-7d4f8bd9c-abc12" -> "web-server")
- detect kubernetes hashes using alphanumeric pattern matching (5-10 chars with both letters and digits)
- preserve non-deployment pod names like statefulsets ("cassandra-0" -> "cassandra-0")
- only populate for kubernetes containers (requires io.kubernetes.pod.name label)
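The hash-stripping heuristic above can be illustrated as follows. This is a sketch based on the description, not the PR's actual implementation; the function names (`isKubernetesHash`, `stripPodHash`) mirror those mentioned in the testing notes but the bodies are reconstructed from the stated rules (5-10 alphanumeric characters containing both letters and digits).

```go
package main

import (
	"fmt"
	"strings"
)

// isKubernetesHash reports whether a name segment looks like a
// Kubernetes-generated hash: 5-10 lowercase alphanumeric characters
// containing both letters and digits.
func isKubernetesHash(s string) bool {
	if len(s) < 5 || len(s) > 10 {
		return false
	}
	var hasLetter, hasDigit bool
	for _, r := range s {
		switch {
		case r >= '0' && r <= '9':
			hasDigit = true
		case r >= 'a' && r <= 'z':
			hasLetter = true
		default:
			return false
		}
	}
	return hasLetter && hasDigit
}

// stripPodHash derives a workload name from a pod name by dropping
// trailing hash-like segments (pod hash, then replicaset hash).
// Ordinal suffixes like "-0" on StatefulSet pods are preserved.
func stripPodHash(podName string) string {
	parts := strings.Split(podName, "-")
	for len(parts) > 1 && isKubernetesHash(parts[len(parts)-1]) {
		parts = parts[:len(parts)-1]
	}
	return strings.Join(parts, "-")
}

func main() {
	fmt.Println(stripPodHash("web-server-7d4f8bd9c-abc12")) // web-server
	fmt.Println(stripPodHash("cassandra-0"))                // cassandra-0
}
```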
integration:
- add fields to internal/containers/graph/builder.go ContainerInfo struct
- populate fields in graph/nodes.go createContainerNode()
- extract names in manager.go collectRuntimeSnapshot() with sample logging
- update protobuf bindings (pkg/api/antimetal/runtime/v1/linux.pb.go) from jra3-apis PR #14
design decisions:
- container-specific fields only (no duplication of pod name, namespace, app)
- pod-level fields available via kubernetes pod resources and container->pod relationships
- graceful degradation when labels unavailable (fields remain empty)
- hash detection algorithm balances precision (avoid false positives) with recall (catch k8s hashes)
testing:
- 24 test cases across 6 test functions
- comprehensive coverage of extractHumanNames() with kubernetes/docker/fallback scenarios
- thorough stripPodHash() testing with deployments, statefulsets, edge cases
- helper function tests (isAlphanumeric, isKubernetesHash)
Note: 🤖 This commit includes significant code written with Claude Code assistance
Depends-On: jra3-apis#14
Most container runtimes have a daemon process that exposes a socket. Shouldn't we be fetching metadata through it? That seems more stable.
I considered using runtime daemon sockets (Docker API, containerd CRI, …).

Multi-Runtime Support Without Heavy Dependencies

The biggest advantage is supporting multiple container runtimes with a single, unified implementation. Using socket APIs would require:
Each runtime has different:
The filesystem approach handles all runtimes with ~400 lines of code.

Stability Considerations

The file formats we're reading are quite stable:
If we encounter issues with specific runtime versions, we can add …
Consolidates container metadata directly into the Container struct rather than maintaining a separate Metadata type. This simplifies the API and eliminates unnecessary field copying since metadata is always extracted during container discovery.

Changes:
- Add all metadata fields (image, labels, limits, names) to Container
- Update ExtractMetadata() to populate Container in-place
- Remove intermediate Metadata struct and 14-field copying in manager
- Update tests to use Container directly

Addresses PR feedback about struct separation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
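The consolidated shape can be sketched like this. The field names mirror the protobuf fields listed later in this PR (container_name, workload_name, image_name, image_tag, labels, cpu_shares, cpu_quota_us, cpu_period_us, memory_limit_bytes, cpuset_cpus, cpuset_mems), but the Go struct here is illustrative, not the actual API.

```go
package main

import "fmt"

// Container sketches the consolidated struct described above: metadata
// fields live directly on Container instead of a separate Metadata type.
type Container struct {
	ID         string
	CgroupPath string

	// image metadata
	ImageName string
	ImageTag  string

	// human-readable identifiers
	ContainerName string
	WorkloadName  string

	// labels/annotations merged from runtime config files
	Labels map[string]string

	// resource limits read from cgroup files
	CPUShares        uint64
	CPUQuotaUs       int64
	CPUPeriodUs      uint64
	MemoryLimitBytes uint64
	CpusetCpus       string
	CpusetMems       string
}

func main() {
	c := Container{ContainerName: "web", ImageName: "nginx", ImageTag: "latest"}
	fmt.Printf("%s (%s:%s)\n", c.ContainerName, c.ImageName, c.ImageTag) // web (nginx:latest)
}
```

With everything on one struct, ExtractMetadata() can fill fields in place, avoiding the 14-field copy the commit message mentions.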
Swap the order of cgroup version checks in extractResourceLimits() to check v2 before v1. While functionally equivalent (a container only exists in one cgroup hierarchy), this aligns with our v2-first philosophy used throughout the discovery code. Addresses PR feedback about preferring v2 over v1.
Summary
This PR adds comprehensive container metadata extraction capabilities to the Antimetal Agent, enabling enriched observability data for containerized workloads. The implementation extracts container-specific metadata (image info, labels, resource limits, human-readable identifiers) while avoiding duplication of Kubernetes Pod-level data that's already available through K8s resources.
Key capabilities added:
Motivation
Container metadata enrichment is critical for:
Without this feature, the agent could only report raw cgroup paths and PIDs, making it difficult for users to understand which applications are consuming resources.
Changes
New Package:
pkg/containers
- metadata.go: Core metadata extraction logic (670 lines)
- ExtractMetadata(): Main entry point for metadata extraction
- metadata_test.go: Comprehensive test suite (408 lines, 24 test cases)

Integration Points
- internal/containers/manager.go (+48 lines): ContainerNode with extracted metadata fields
- internal/containers/graph/builder.go (+30 lines):
- internal/containers/graph/nodes.go (+14 lines):

API Changes
- pkg/api/antimetal/runtime/v1/linux.pb.go (binary protocol buffer update): container_name, workload_name, image_name, image_tag, labels, cpu_shares, cpu_quota_us, cpu_period_us, memory_limit_bytes, cpuset_cpus, cpuset_mems

Dependencies
- ContainerNode protobuf schema fields for metadata

Testing
Unit Tests (24 comprehensive test cases):
Integration Testing:
Implementation Details
Hash Stripping Algorithm
Kubernetes appends hash suffixes to workload names (e.g., web-server-7d4f8b9c5d-abc123). The implementation strips these hashes to reveal the logical workload name. Supports Deployment, StatefulSet, ReplicaSet, DaemonSet, and Job patterns.
Runtime-Specific Paths
The implementation searches multiple paths for metadata files, ensuring compatibility across runtimes:
Image metadata paths:
- /sys/fs/cgroup/.../io.kubernetes.cri.image-name (Kubernetes CRI)
- /proc/<pid>/root/.dockerenv, /proc/<pid>/root/.containerenv (runtime markers)
- /var/lib/docker, /var/lib/containerd, etc.

Label paths:
- /var/lib/docker/containers/<id>/config.v2.json (Docker)
- /var/run/containerd/io.containerd.runtime.v2.task/k8s.io/<id>/config.json (containerd)
- /var/lib/containers/storage/overlay-containers/<id>/userdata/config.json (Podman)

Resource Limit Extraction
Reads cgroup files with proper v1/v2 detection:
cgroup v1:
cpu.shares,cpu.cfs_quota_us,cpu.cfs_period_usmemory.limit_in_bytescpuset.cpus,cpuset.memscgroup v2:
- cpu.weight (converted to shares)
- cpu.max (quota/period in a single file)
- memory.max
- cpuset.cpus, cpuset.mems

Breaking Changes
None. This PR is additive only:
Review Checklist
- (make fmt, make fmt.clang)
- (make gen-license-headers)