Skip to content

Merge/exec arg wildcards#38

Open
entlein wants to merge 26 commits intomainfrom
merge/exec-arg-wildcards
Open

Merge/exec arg wildcards#38
entlein wants to merge 26 commits intomainfrom
merge/exec-arg-wildcards

Conversation

@entlein
Copy link
Copy Markdown

@entlein entlein commented May 4, 2026

  • f8df60f (now a83bb69) — CompareExecArgs hookup on the unified CP cache
  • eb0145d — R0040 'Unexpected process arguments' rule + Test_32 e2e
  • 7c10baa — fix(tamper_alert): accept self-signed, only flag actual tamper
  • 6f2a5b4 — fix(containerprofilecache): re-wire R1016 + expand Test_31
  • 9385562 — Test_33 AnalyzeOpensWildcardAnchoring component test
  • f828777 — AP-fixture linter (R-AP-* rules) + canonical reference profile
  • 4dd0b39, 484d11c, cb57674, 4d374d5, 6dda020 — Test_28/30/31/33 flakiness + cluster-pressure fixes
  • fbf98b0 — CI: Test_32 in component-tests matrix
  • 23d3d2f — port Test_28 to upstream's unified user-overlay label
  • fe80d73 — Test_32 profile uses full-path argv[0]
  • bcf41ea, a59e284, 0816393 — historical storage replace bumps (could be squashed if you want a tidier branch — say the word)
  • a5af261 — final storage replace bump to wildcard-tip-with-CRDs

Entlein added 18 commits May 4, 2026 11:01
Upstream PR kubescape#788 (Replace AP and NN cache with CP) collapsed the two
legacy workload-keyed caches into a single ContainerProfileCache that
reads ONE pod label `kubescape.io/user-defined-profile=<name>` and
uses <name> as the lookup key for BOTH the user ApplicationProfile and
the user NetworkNeighborhood.

The fork's earlier two-label scheme (`user-defined-profile` for AP +
separate `user-defined-network` for NN, with potentially different
names) is no longer honored — the second label is silently ignored.

Port:
  - tests/resources/nginx-user-defined-deployment.yaml: drop the
    `user-defined-network` label, point the surviving label at one
    shared name `curl-28-overlay`.
  - tests/component_test.go Test_28_UserDefinedNetworkNeighborhood:
    create both AP and NN under that single shared name. Assertions
    unchanged — the test still verifies that the user NN's egress
    restriction (only fusioncore.ai allowed on TCP/80) is enforced
    once the pod is running.

Verified locally: go vet -tags=component ./tests/... clean;
go test -tags=component -run='^$' ./tests/... compiles cleanly.
PR #35's wildcard-aware exec arg matching needs reapplication on top
of the new ContainerProfileCache (upstream kubescape#788) baseline. The original
PR sat on the legacy applicationprofilecache, which has been deleted;
the call site now reads cp.Spec.Execs from a ContainerProfile.

Same semantic change as PR #35:
  '⋯' (DynamicIdentifier)  — matches exactly one argument position.
  '*' (WildcardIdentifier) — matches zero or more consecutive args.

Wiring:
  - pkg/rulemanager/cel/libraries/applicationprofile/exec.go:
    drop slices.Compare exact-equality on the cp.Spec.Execs loop;
    route through dynamicpathdetector.CompareExecArgs.
  - go.mod: bump fork storage replace to feat/exec-arg-wildcards tip
    (3fc287210729) which carries the matcher.
  - exec_test.go: re-add TestExecWithArgsWildcardInProfile (13 subtests
    across curl --user ⋯, sh -c *, ls -l ⋯, echo hello *, plus negative
    literal-anchor / under-consumed-⋯ / mid-profile-* cases). Mirrors
    the test set that lived on PR #35 before the upstream merge.

Verified: full applicationprofile package green
(`go test ./pkg/rulemanager/cel/libraries/applicationprofile/`).
R0040 is an additive companion to R0001. It evaluates:

  ap.was_executed(...) && !ap.was_executed_with_args(..., event.args)

so it ONLY fires when the exec'd path IS in the user-defined profile
(R0001 stays silent) but the runtime arg vector does not match any
profile entry's pattern. With wildcard tokens supported by
dynamicpathdetector.CompareExecArgs:

  '⋯' (DynamicIdentifier)  — exactly one argument position.
  '*' (WildcardIdentifier) — zero or more consecutive args.

Use case: profile entry {Path: /bin/sh, Args: [sh, -c, *]} flags
'sh -x ...' as drift while permitting 'sh -c <anything>'.

Wiring:
  - tests/chart/templates/node-agent/default-rules.yaml: new R0040
    CEL rule definition immediately after R0001, same MITRE tagging
    (TA0002/T1059) and same applicationprofile-anomaly tag set.
  - tests/chart/templates/node-agent/default-rule-binding.yaml:
    R0040 added to the all-rules-all-pods binding next to R0001.
  - tests/resources/curl-exec-arg-wildcards-deployment.yaml: new
    fixture, curl pod labelled with the unified
    kubescape.io/user-defined-profile=curl-32-overlay label.
  - tests/component_test.go: Test_32_UnexpectedProcessArguments with
    4 subtests:
      32a sh_dash_c_matches_wildcard_trailing — sh -c <cmd> matches
          profile [sh, -c, *] — R0040 silent.
      32b sh_dash_x_mismatches_R0040 — sh -x <cmd> mismatches the
          literal -c anchor — R0040 fires.
      32c echo_hello_matches_wildcard_trailing — echo hello world
          matches [echo, hello, *] — R0040 silent.
      32d echo_goodbye_mismatches_R0040 — echo goodbye world
          mismatches the literal hello anchor — R0040 fires.

Verified locally: go vet -tags=component ./tests/... clean;
go test -tags=component -run='^$' ./tests/... compiles cleanly.
End-to-end alert assertions run in CI.
After rebasing storage feat/exec-arg-wildcards onto storage main, the
matcher branch now sits on top of the latest fork main commit
(352395a3 — Internal-field merge fix). Bump the node-agent storage
replace to that new pseudo-version so this branch's tests run against
storage main + matcher in one consistent baseline.

Verified locally: 47/47 non-eBPF unit packages green; vet clean; the
applicationprofile CEL package's TestExecWithArgsWildcardInProfile is
13/13 green; component-tests compile under the component tag. The two
failing packages (pkg/containerwatcher/v2/tracers and pkg/validator)
fail with the same pre-existing /sys/fs/bpf mount-permission error
they have on every recent run — env, not code.
The component-tests workflow uses a hardcoded matrix list, not a
dynamic discovery from the test source. Test_32 (added in a613cf6)
must be listed explicitly to be picked up — without this entry the
test is silently skipped.
Upstream PR kubescape#788 (Replace AP and NN cache with CP) deleted the legacy
applicationprofilecache where the fork's emitTamperAlert (commit
c2d681e 'Feat/tamperalert' #22) lived. After the merge, R1016 alerts
no longer fired for tampered user-defined profiles, breaking
Test_31_TamperDetectionAlert (passed 3/3 on main, fails on the merged
branch — confirmed regression introduced by PR #36).

This restores the contract: every cache load of a user-supplied
ApplicationProfile or NetworkNeighborhood overlay re-verifies the
signature, and emits an R1016 'Signed profile tampered' alert
through the rule-alert exporter when the signature is present but no
longer valid. Alert shape preserved from the legacy cache so dashboards
and component tests keep matching.

Implementation:
  - new file pkg/objectcache/containerprofilecache/tamper_alert.go:
    verifyUserApplicationProfile / verifyUserNetworkNeighborhood /
    emitTamperAlert / extractWlidFromContainerID. Self-contained;
    keeps containerprofilecache.go diff small.
  - containerprofilecache.go: new tamperAlertExporter field +
    SetTamperAlertExporter setter + verify hooks immediately after
    GetApplicationProfile / GetNetworkNeighborhood succeed in the
    user-overlay branch of addContainer.
  - cmd/main.go: wire the alert exporter via SetTamperAlertExporter
    after the cache constructor (kept the constructor signature
    unchanged to avoid blast radius on tests).

The setter is nil-safe: when no exporter is wired, verification still
runs and is logged but no alert is emitted — matches the legacy
behavior for unit-tests-with-no-exporter.

Test_31 expanded from one scenario to four subtests, each in its own
namespace to avoid alert cross-contamination:
  31a tampered_user_defined_AP_fires_R1016 — original regression case
  31b untampered_signed_AP_no_R1016        — negative: clean signature
  31c unsigned_AP_no_R1016                 — signing is opt-in
  31d tampered_user_defined_NN_fires_R1016 — parallel NN code path

Verified locally:
  - go build ./... clean
  - go test ./pkg/objectcache/containerprofilecache/... green
  - go test ./pkg/signature/... green
  - go vet -tags=component ./tests/... clean
  - go test -tags=component -run='^$' ./tests/... compiles
Empirical finding from CI run 25178930763 — Test_32's positive
subtests (32a sh_dash_c_matches, 32c echo_hello_matches) fired
R0040 when they should not. Cause: at runtime, the eBPF tracer
captures argv[0] as the FULL exec path (e.g. "/bin/sh") rather
than the basename ("sh"). My profile entries used basenames, so
the matcher's first-position literal compare missed and the cache
fell through to 'no exec entry matches' — R0040 fires.

Aligns Test_32's profile with the convention already used by
Test_27's wildcard_yaml_profile_allowed_opens fixture
(known-application-profile-wildcards.yaml predecessor): argv[0]
is the full path, subsequent positions are flags/values.

Subtest expectations after this fix:
  32a sh -c <cmd>      → matches [/bin/sh, -c, *]    → R0040 silent
  32b sh -x <cmd>      → -c anchor mismatch          → R0040 fires
  32c echo hello <…>   → matches [/bin/echo, hello, *]→ R0040 silent
  32d echo goodbye <…> → hello anchor mismatch       → R0040 fires
Catches the class of bug Test_32 hit on its first CI run (PR #37 run
25178930763): profile entries used basename argv[0] ("sh") while the
eBPF tracer captures the full path ("/bin/sh"), so the matcher
silently misses and the rule fires when it shouldn't. Without a linter,
this kind of fixture drift only surfaces in a 15-minute component-test
run on a kind cluster — too late, too expensive.

The linter (LintApplicationProfile / LintApplicationProfileYAML in
tests/resources/aplint_test.go) is intentionally written as a pure
function returning []Violation. Zero testing-package coupling on the
hot path so it can be lifted into a future bobctl subcommand
`bobctl lint <ap.yaml>` without rewrite — see backlog at
~/biz/sbob-business-plan/state.yaml.

Rules:
  R-AP-01 — kind must be ApplicationProfile
  R-AP-02 — at least one container
  R-AP-03 — container.name non-empty
  R-AP-10 — exec.path absolute (catches relative paths)
  R-AP-11 — exec.path no wildcards (binary identity is exact)
  R-AP-12 — exec.args[0] equals exec.path or wildcard token
            (Test_32-style argv[0] basename trap)
  R-AP-13 — exec.args wildcard tokens are whole-word (no embedding)
  R-AP-20 — open.path non-empty + absolute
  R-AP-21 — open.flags non-empty (real auto-recorded opens always have ≥1)
  R-AP-22 — open.flags from known O_* set (catches typos)

Each rule has a dedicated self-test that constructs a minimal-bad YAML
and asserts the rule fires (5 negative tests). One positive test
(TestLinter_canonical_AP_passes) parses the fork's reference
known-application-profile.yaml — extracted from a real auto-recorded
AP for curlimages/curl:8.5.0 in fea3b06 — and asserts zero violations.
The reference YAML is restored to tests/resources/ so the canonical
shape is in-tree and visible to humans + CI.

Why a Go test rather than a shell linter: keeps the rule set in the
same language as the storage matcher (`dynamicpathdetector`), so
extending CompareExecArgs and the linter together stays cheap.

Local-cluster organic learning was the original plan but k3s on
OrbStack is currently flapping (LXC-related boot loop). The fea3b06
profile was extracted from real auto-learning at an earlier moment of
stability, which is the next-best ground truth.
Switch verifyUser{ApplicationProfile,NetworkNeighborhood} from strict
VerifyObject to VerifyObjectAllowUntrusted. The strict variant requires
a Sigstore Fulcio trust chain and rejects locally-signed profiles even
when the signature against the embedded cert is valid. That made
Test_31b 'untampered_signed_AP_no_R1016' fire R1016 against an
untampered AP, and broke Test_30's 'tampered_profile_loaded_without_
enforcement' subtest the same way.

The intent is: tamper detection, not trust-chain enforcement. Matches
cmd/sign-object/main.go's default verifier.
…Eventually

The single-shot wget exec before Eventually was racy: if the eBPF event
landed before the CP cache projected the user-defined AP, the rule
manager evaluated against an empty baseline and R0001 never fired
within the 60s polling window. Same race Test_29 already documents.

Drive the wget exec inside the Eventually loop (10s tick, 120s deadline)
so cache-load latency is absorbed by retries. Filter R0001 to comm=wget
to make the assertion specific instead of catching any R0001. Drops
the blind 15s pre-sleep and the redundant settle-then-recount block.
Picks up the upstream-PR-kubescape#316 review fix: trailing WildcardIdentifier
now requires at least one regular-path segment, matching standard glob
semantics. Closes the R0002 blind spot where '/etc/*' would silently
match the bare '/etc' directory.
Pulls in the full PR-kubescape#316 review fix set that just landed on storage main:
proper splitPath-based trailing-* anchoring, DefaultCollapseConfigs()
defensive-copy accessor, FindConfigForPath value-return, splitEndpoint
defensive guard, plus the BenchmarkCompareDynamic baseline.
End-to-end pin of the storage-side CompareDynamic contract through
R0002. Each subtest deploys a fresh nginx pod with a user-defined AP
carrying ONE Opens entry, then `cat`s a target path that probes a
boundary case from the storage analyzer fixes (kubescape/storage kubescape#316
review by matthyx + entlein):

  - Anchored trailing `*` matches one OR MORE remaining segments —
    never zero. So /etc/* matches /etc/passwd but NOT bare /etc.
  - DynamicIdentifier (⋯) consumes EXACTLY ONE segment.
  - Mid-path `*` is zero-or-more, so /etc/*/* matches /etc/ssh
    (inner * consumes zero, trailing * consumes one).
  - Mixed ⋯/* combinations: ⋯ pins one, * consumes the rest.
  - splitPath normalises trailing slashes on both sides.

11 subtests covering:
  trailing_star_matches_immediate_child           — basic /etc/* match
  trailing_star_matches_deep_child                — multi-segment under prefix
  trailing_star_does_not_match_bare_parent        — the security fix
  deep_prefix_trailing_star_does_not_match_parent — same rule, deeper
  ellipsis_pin_one_segment_then_literal           — ⋯ rejects zero
  ellipsis_then_trailing_star_matches_two_*       — ⋯/* combo, 2 levels
  ellipsis_then_trailing_star_matches_three_*     — ⋯/* combo, 3 levels
  double_trailing_matches_one_child               — /*/* mid-zero
  double_trailing_matches_deep_child              — /*/* mid-one
  double_trailing_does_not_match_parent           — /*/* needs ≥1 child
  trailing_slash_in_profile_normalises_to_literal — splitPath on profile

Pinned at component level on TOP of the unit suite in
storage/pkg/registry/file/dynamicpathdetector/tests/coverage_test.go.
Both layers must agree — a drift in either lights up R0002 with a
false positive or false negative.

Matrix entry added to component-tests.yaml so the test runs in CI.
… monitored prefix

R0002's CEL ruleExpression has a strict path-prefix filter:
  event.path.startsWith('/etc/') || event.path.startsWith('/var/log/') || ...
All with trailing slash. Bare /etc and /var/log don't pass the filter,
so R0002 never evaluates on those events — the matcher's bare-parent
anchoring contract stays invisible at runtime even though the storage
unit tests pin it.

Probe one level deeper instead: /etc/ssl IS under the /etc/ monitored
prefix, so the rule CAN see whether a /etc/ssl/* profile entry matches
the bare /etc/ssl parent. Same security guarantee, observable layer.

Reworked subtests:
  - trailing_star_does_not_match_bare_parent_under_monitored_prefix:
    profile /etc/ssl/*, cat /etc/ssl → R0002 fires
  - deep_prefix_trailing_star_does_not_match_parent:
    profile /etc/ssl/certs/*, cat /etc/ssl/certs → R0002 fires
  - ellipsis_requires_one_segment_not_zero:
    profile /etc/passwd/⋯, cat /etc/passwd → ⋯ requires one more segment
  - double_trailing_does_not_match_parent_under_monitored_prefix:
    profile /etc/ssl/*/*, cat /etc/ssl → R0002 fires

The 7 positive subtests that already passed are untouched. Added a
comment block documenting why we probe at /etc/ssl rather than /etc.
Two distinct fixes for what looked like the same intermittent failures
across PR #37 runs:

Test_31 31b 'untampered_signed_AP_no_R1016' — root cause: storage's
PreSave runs DeflateSortString on Syscalls (and Capabilities,
Architectures), which sorts + dedupes. The signSignedAP helper signed
the AP BEFORE pushing, against unsorted syscalls
{socket, connect, read, write, close, openat}. After PreSave the
stored AP had sorted {close, connect, openat, read, socket, write},
so the content hash differed from the signature → server-side verify
correctly failed → R1016 fired even though the profile was untampered.
Test_29 + Test_30 30b had the same fixture but didn't observe the bug
because they only assert R0001 counts, never R1016. Pre-sort the
syscalls in all three test fixtures so storage's normalization is a
no-op on round-trip.

Test_28 28a 'allowed_fusioncore_no_alert' — root cause: 15s post-deploy
sleep wasn't always enough for the upstream ContainerProfileCache to
project the user-defined NN. Failure mode is alert payload
`profileMetadata.errorMessage:"waiting for profile update"` —
the rule manager evaluated against an unloaded NN and fired
R0005/R0011 spuriously. Bumped to 30s with a comment documenting why.
A real fix would poll a cache-loaded signal but no such signal is
exposed from outside the node-agent today.
…ift R1016 false positive

Test_31 31b 'untampered_signed_AP_no_R1016' kept flaking because the
AP's content hash drifted between client-side sign and server-side
verify across the K8s/storage roundtrip. Sources of drift include
storage's PreSave normalisation (DeflateSortString, DeflateStringer,
DeflateRulePolicies), signature/profiles GetContent's nil→empty-map
mutation on PolicyByRuleId, and any K8s server-side defaulting of
spec/metadata fields. Pre-sorting Syscalls in the previous fix only
covered one of these.

Sign-after-roundtrip closes the whole class:

  1. Push the AP UNSIGNED to storage. PreSave runs, normalises content.
  2. Read it back — this is what node-agent will see at verify time.
  3. Sign THAT normalised content.
  4. Delete the unsigned in-storage copy so deployAndWait can Create
     the signed version without an AlreadyExists conflict.
  5. Strip server-managed metadata (resourceVersion / uid / etc.) from
     the returned AP so the second Create succeeds cleanly.

Second push goes through deflate again. Idempotent on already-normalised
content → stored bytes identical to signed bytes → content hash matches
→ verify succeeds → no R1016 false positive.

Tampered subtests (31a, 31d) keep working: signSignedAP returns a
known-good signed AP, the test mutates it post-helper, deployAndWait
Creates the mutated version, storage round-trip preserves the mutation,
and verify correctly detects the divergence.
…re tolerance

Test_33 deploys 11 fresh pods sequentially, one per subtest. Later
subtests race against an increasingly loaded kind cluster — CP cache
reconciler, alertmanager, prometheus all chew CPU at boot. 80s
WaitForReady deadline timed out on the post-23ea224 run with
'workload not ready in ns ...' for early subtests once the cluster
got busy. 180s gives headroom without changing total runtime regime.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 4, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: f30c8744-a7a5-4865-93d1-86006d246314

📥 Commits

Reviewing files that changed from the base of the PR and between 045940c and eb40bbf.

📒 Files selected for processing (4)
  • .github/workflows/benchmark.yaml
  • .github/workflows/go-basic-tests.yaml
  • .github/workflows/incluster-comp-pr-created.yaml
  • .github/workflows/incluster-comp-pr-merged.yaml
💤 Files with no reviewable changes (4)
  • .github/workflows/benchmark.yaml
  • .github/workflows/go-basic-tests.yaml
  • .github/workflows/incluster-comp-pr-created.yaml
  • .github/workflows/incluster-comp-pr-merged.yaml

📝 Walkthrough

Walkthrough

Adds a signature-mismatch sentinel and wraps verification errors, implements tamper detection with R1016 alerting and exporter wiring in ContainerProfileCache, enables wildcard exec-arg matching and R0040, adds an ApplicationProfile YAML linter, new/rewritten component tests and fixtures, and expands the CI component-test matrix.

Changes

Tamper Detection and Wildcard Argument Matching

Layer / File(s) Summary
Core Type & Export Registration
pkg/objectcache/containerprofilecache/containerprofilecache.go
ContainerProfileCacheImpl adds tamperAlertExporter field and tamperEmitted dedup map. Profile loads now call signature-verification hooks before merging user-defined overlays.
Signature Verification Logic
pkg/signature/annotations.go, pkg/signature/verify.go
Adds exported ErrSignatureMismatch and wraps verification failures with it so callers can classify signature-mismatch (tamper) vs operational errors.
Tamper Detection Implementation
pkg/objectcache/containerprofilecache/tamper_alert.go
New file: SetTamperAlertExporter registers an optional exporter. verifyUserApplicationProfile and verifyUserNetworkNeighborhood re-verify signatures on load; on ErrSignatureMismatch, clear overlay reference and emit R1016 once per (kind,namespace,name,resourceVersion). emitTamperAlert builds/sends the GenericRuleFailure when exporter is set.
Integration / Wiring
cmd/main.go
Runtime-detection ContainerProfileCache is wired with the tamper alert exporter via cpc.SetTamperAlertExporter(exporter).
Dynamic Argument Matching
pkg/rulemanager/cel/libraries/applicationprofile/exec.go
wasExecutedWithArgs now uses dynamicpathdetector.CompareExecArgs instead of exact slice equality, enabling wildcard (* and ) token matching in profile argument vectors.
Exec Argument Matching Tests
pkg/rulemanager/cel/libraries/applicationprofile/exec_test.go
New TestExecWithArgsWildcardInProfile validates CEL execution against profiles containing wildcard argument tokens ( and *).
Rule Definitions & Bindings
tests/chart/templates/node-agent/default-rules.yaml, tests/chart/templates/node-agent/default-rule-binding.yaml
Inserted rule R0040 ("Unexpected process arguments") into runtime rules and added the binding entry.
Test Fixtures & Linter
tests/resources/aplint_test.go, tests/resources/known-application-profile.yaml, tests/resources/curl-exec-arg-wildcards-deployment.yaml, tests/resources/nginx-user-defined-deployment.yaml
ApplicationProfile YAML linter validates profile structure, path/arg/flag rules, and wildcard token usage. Reference profile (fusioncore-profile) and updated deployment manifests with unified overlay naming for AP/NN. New curl-32 deployment for arg-wildcard testing.
Component Tests & Test Logic
tests/component_test.go
Major rewrite of tamper/signed-profile tests with signing helpers and explicit subtests, added Test_32_UnexpectedProcessArguments and Test_33_AnalyzeOpensWildcardAnchoring, adjusted waits and retry loops to accommodate cache-load latency.
Module Dependencies
go.mod
github.com/k8sstormcenter/storage replace directive pinned to a newer revision supporting the changes.
CI Workflows
.github/workflows/*
Removed top-level permissions: read-all from several workflows and added optional GITGUARDIAN_API_KEY secret to incluster-comp-pr-created workflow_call. Component-tests matrix extended to run the two new per-test targets.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • k8sstormcenter/node-agent#39: Modifies workflow-level permission blocks in the opposite direction (adds permissions: read-all) and thus relates to the workflow permission edits in this PR.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 64.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Merge/exec arg wildcards' clearly references the main feature — wildcard argument matching for exec events via dynamicpathdetector integration — which is extensively documented across the changeset (R0040 rule, Test_32, CEL library updates, component tests).
Description check ✅ Passed The description enumerates commits addressing exec-arg wildcards, tamper detection, test fixes, linter additions, and storage bumps, all of which are reflected in the changeset summaries and PR objectives.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch merge/exec-arg-wildcards

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/objectcache/containerprofilecache/tamper_alert.go`:
- Around line 66-96: Deduplicate repeated R1016 alerts by tracking which objects
have already emitted a tamper alert and only call emitTamperAlert on the first
transition to invalid; add a field on ContainerProfileCacheImpl (e.g.,
tamperEmitted map[string]struct{}) and construct a stable key from kind,
namespace, name, and resourceVersion, then in verifyUserApplicationProfile and
verifyUserNetworkNeighborhood check the map before emitting (if present, skip
emit), insert the key when emitting, and remove the key when verification passes
again so future transitions can re-alert.
- Around line 135-155: The current code derives a WLID from containerID using
extractWlidFromContainerID before calling GenericRuleFailure.SetWorkloadDetails,
which loses proper workload attribution; change emitTamperAlert to accept the
real WLID (pass sharedData.Wlid from the caller in containerprofilecache.go),
and inside emitTamperAlert call ruleFailure.SetWorkloadDetails(wlid) instead of
extractWlidFromContainerID(containerID); update the emitTamperAlert signature
and all its call sites accordingly (remove the fallback guess when real WLID is
available) so alerts use the authoritative sharedData.Wlid.

In `@tests/chart/templates/node-agent/default-rules.yaml`:
- Around line 50-61: The rule R0040 currently calls
ap.was_executed_with_args(...) unconditionally; update either the rule or the
args-contract so empty/nil Execs.Args means “no argv constraint.” Specifically,
modify the ruleExpression (around R0040 where parse.get_exec_path and
ap.was_executed_with_args are used) to only call ap.was_executed_with_args when
the matching profile entry actually declares an arg pattern (Args != nil and
length>0), or change ap.was_executed_with_args implementation to treat nil/empty
event.args/Execs.Args as a wildcard (always match). Ensure references to
Execs.Args, event.args, ap.was_executed_with_args and parse.get_exec_path are
updated so existing tests that use path-only Execs entries do not trigger R0040.

In `@tests/resources/aplint_test.go`:
- Around line 214-216: The loop is shadowing the loop variable unnecessarily
with `p := p`, which triggers the copyloopvar linter; remove the redundant `p :=
p` line and let the closure capture the loop variable directly in the `for _, p
:= range matches { ... t.Run(filepath.Base(p), func(t *testing.T) { ... }) }`
loop (no other changes needed to the `t.Run` closure).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 7dd1e0db-7754-481d-920b-7f306fc23550

📥 Commits

Reviewing files that changed from the base of the PR and between f0d0f72 and a5af261.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (14)
  • .github/workflows/component-tests.yaml
  • cmd/main.go
  • go.mod
  • pkg/objectcache/containerprofilecache/containerprofilecache.go
  • pkg/objectcache/containerprofilecache/tamper_alert.go
  • pkg/rulemanager/cel/libraries/applicationprofile/exec.go
  • pkg/rulemanager/cel/libraries/applicationprofile/exec_test.go
  • tests/chart/templates/node-agent/default-rule-binding.yaml
  • tests/chart/templates/node-agent/default-rules.yaml
  • tests/component_test.go
  • tests/resources/aplint_test.go
  • tests/resources/curl-exec-arg-wildcards-deployment.yaml
  • tests/resources/known-application-profile.yaml
  • tests/resources/nginx-user-defined-deployment.yaml

Comment thread pkg/objectcache/containerprofilecache/tamper_alert.go Outdated
Comment thread pkg/objectcache/containerprofilecache/tamper_alert.go Outdated
Comment thread tests/chart/templates/node-agent/default-rules.yaml
Comment thread tests/resources/aplint_test.go
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/chart/templates/storage/collapseconfiguration-default.yaml`:
- Around line 1-11: The top-of-file Helm block comment in
collapseconfiguration-default.yaml (the opening token sequence "{{- /*" and the
closing "*/}}") causes YAMLlint failure; replace the entire Helm block comment
wrapper with regular YAML comment lines (# ...) so the content remains
documentation but is valid YAML/Helm template syntax — i.e., remove "{{- /*" and
"*/}}" and convert each line inside to begin with "#" (preserve wording and any
leading spacing) so the template still documents the singleton
CollapseConfiguration without breaking YAML parsing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: b257a3c8-7233-40d0-ba2b-8f6278b7a908

📥 Commits

Reviewing files that changed from the base of the PR and between a5af261 and 09aee0c.

📒 Files selected for processing (1)
  • tests/chart/templates/storage/collapseconfiguration-default.yaml

Comment thread tests/chart/templates/storage/collapseconfiguration-default.yaml Outdated
Entlein added 3 commits May 4, 2026 11:36
…int)

CodeRabbit (PR #38). Loop variable shadowing is no longer required
since Go 1.22 made each iteration capture its own variable. The shadow
trips golangci-lint's copyloopvar check.
CodeRabbit (PR #38) flagged two issues on the refresh-loop tamper path:

1. Dedup R1016 (Major). The cache refresh re-runs verifyUser*() on every
   reconcile cycle, so a long-lived tampered profile would emit R1016 on
   every cycle and once per container reference. Track emitted alerts in
   a sync.Map keyed on (kind|ns/name@resourceVersion). LoadOrStore gives
   atomic 'first transition to invalid' semantics; a re-tamper at a new
   RV uses a fresh key and alerts again. Verification passing clears the
   key so future tampers re-alert.

2. Pass real WLID (Major). emitTamperAlert previously called
   extractWlidFromContainerID(containerID) — but containerID is a runtime
   ID like 'containerd://...' which GenericRuleFailure.SetWorkloadDetails
   parses as a WLID and silently drops kind/name/cluster attribution.
   Thread sharedData.Wlid (already in scope at the call site in
   containerprofilecache.go) through verifyUser*() into emitTamperAlert.
   Drop extractWlidFromContainerID — no longer needed.

Test_31_TamperDetectionAlert exercises this path end-to-end; expecting
it to keep passing (one alert per tampered profile, with proper
workload attribution in Alertmanager).
Picks up storage's CompareExecArgs fix for path-only Execs entries
(CodeRabbit PR #38, finding #3). With this bump, R0040 no longer
fires on every legit exec of a path that the user-defined profile
allowed but didn't constrain by argv.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/objectcache/containerprofilecache/tamper_alert.go`:
- Around line 74-86: The code treats any error from
signature.VerifyObjectAllowUntrusted as a tamper event; change the logic in the
tamper handling blocks (both the block around
signature.VerifyObjectAllowUntrusted at the first check and the similar block at
lines ~105-114) to distinguish verification failures from other operational
errors: call signature.VerifyObjectAllowUntrusted(adapter) as before, but if the
returned error wraps or contains the sentinel "signature verification failed"
(use errors.Is/errors.As or string-match the wrapped error type/message from the
verifier), then run the existing tamper path (log the warning, dedupe via
c.tamperEmitted.LoadOrStore(key), and call c.emitTamperAlert(...)); for all
other errors (hash computation/verifier construction), log as an operational
error without emitting R1016 or touching c.tamperEmitted, and return
!c.cfg.EnableSignatureVerification. Ensure this change is applied to both
occurrences and reference the functions/variables
signature.VerifyObjectAllowUntrusted, c.emitTamperAlert, c.tamperEmitted, and
c.cfg.EnableSignatureVerification.

In `@tests/resources/aplint_test.go`:
- Around line 220-221: Replace the raw substring check that uses
strings.Contains(string(data), "kind: ApplicationProfile") with parsing the YAML
header and inspecting the actual kind field: unmarshal the YAML from the
existing data variable into a map[string]interface{} (or a small struct with
Kind string), read the "kind" value (handling strings and quoted forms), and
branch on kind == "ApplicationProfile" instead of substring matching so t.Skipf
is only invoked when the parsed kind differs.
- Around line 158-167: The R-AP-12 check currently rejects profiles when
exec.Args[0] doesn't equal exec.Path, which diverges from runtime matching
(wasExecutedWithArgs treats Path and Args separately); remove the strict
equality check and stop adding the "R-AP-12" error based on e.Args[0] != e.Path
— preserve the wildcardIdentifier allowance but do not require Args[0] to equal
Path (i.e., delete or bypass the if block that calls add("R-AP-12", ... ) that
references e.Args[0] and e.Path).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 7219528e-a03b-4046-bd2d-26229f82dc39

📥 Commits

Reviewing files that changed from the base of the PR and between 09aee0c and 5c09cbf.

📒 Files selected for processing (3)
  • pkg/objectcache/containerprofilecache/containerprofilecache.go
  • pkg/objectcache/containerprofilecache/tamper_alert.go
  • tests/resources/aplint_test.go

Comment thread pkg/objectcache/containerprofilecache/tamper_alert.go Outdated
Comment thread tests/resources/aplint_test.go
Comment thread tests/resources/aplint_test.go Outdated
@entlein entlein force-pushed the merge/exec-arg-wildcards branch from 5c09cbf to 98bdf97 Compare May 4, 2026 10:18
Three CR concerns addressed in one commit (all submitted 2026-05-04 10:09):

1. tamper_alert.go (Major) — only emit R1016 on actual signature mismatch.
   Previously every error from VerifyObjectAllowUntrusted (hash computation,
   verifier construction, malformed annotations, signature mismatch) was
   treated as a tamper event. With EnableSignatureVerification=true that
   meant a transient operational failure could drop a valid overlay AND
   emit a false R1016. Fix:
     - Add signature.ErrSignatureMismatch sentinel in pkg/signature/annotations.go
     - Wrap the signature-fail return in verify.go with the sentinel
       (Go 1.20+ multiple-%w form)
     - Classify in tamper_alert.go via errors.Is(err, ErrSignatureMismatch);
       operational errors log with a 'NOT tamper' tag, do NOT touch the
       tamperEmitted dedup map, and do NOT emit R1016. They still honour
       strict-mode (return !EnableSignatureVerification) so behaviour is
       conservative.
   Adds tamper_alert_test.go: pins LoadOrStore semantics, confirms each
   operational-error variant returns false on errors.Is(ErrSignatureMismatch),
   smoke-tests the sentinel value.

2. aplint_test.go:221 (Minor) — replace strings.Contains substring check
   for fixture-type detection with proper YAML-header parse. The substring
   form silently skipped fixtures with quoted/non-canonical 'kind' values
   AND would false-positive on nested OwnerReferences with the same
   substring. Now uses sigs.k8s.io/yaml (already imported) to unmarshal
   into a {Kind string} struct and branch on the parsed value.

3. aplint_test.go R-AP-12 (Major, 'Heavy lift') — REJECTED with reasoning
   to be posted as CR reply. Evidence shows R-AP-12 enforces the actual
   runtime contract:
     - pkg/rulemanager/cel/libraries/parse/parse.go's getExecPath returns
       args[0] (the full binary path)
     - ap.was_executed_with_args(containerID, parse.get_exec_path(event.args,
       event.comm), event.args) passes the FULL argv to the matcher
     - Integration test pkg/rulemanager/cel/libraries/applicationprofile/
       integration_test.go uses ["/bin/bash", "-c", ...] — full argv
     - Real fixture tests/resources/known-application-profile.yaml uses
       args: ["/bin/sleep", "infinity"] with args[0] == path
   The compare_exec_args_test.go cases that look like flags-only ([-s, ⋯])
   are matcher-isolation unit tests, not the wired contract. Removing
   R-AP-12 would let users author flags-only profiles that silently fail
   to silence R0040 in production.

Bumps storage replace to a7e6234349ab (storage main HEAD post-PR #27).
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/objectcache/containerprofilecache/tamper_alert_test.go`:
- Around line 25-47: Current test only exercises the tamperEmitted map directly;
instead add an integration-style test that invokes the real verify path (e.g.,
verifyUserApplicationProfile and/or verifyUserNetworkNeighborhood) and asserts
the dedup map is updated when signature.ErrSignatureMismatch is produced:
arrange the verifier used by ContainerProfileCacheImpl to return an error
wrapping signature.ErrSignatureMismatch (mock or inject a stubbed verifier),
call the appropriate verify* method with a crafted profile so it triggers
signature verification, then compute the same tamper key with
tamperKey("ApplicationProfile", ...) and assert
ContainerProfileCacheImpl.tamperEmitted contains that key (using
Load/LoadOrStore to check alreadyEmitted behavior); reference
ContainerProfileCacheImpl, tamperEmitted,
verifyUserApplicationProfile/verifyUserNetworkNeighborhood,
signature.ErrSignatureMismatch, and tamperKey when locating code to modify.

In `@tests/resources/aplint_test.go`:
- Around line 229-230: The test currently calls t.Skipf when YAML parsing or
missing canonical fixtures occur (e.g., the yaml.Unmarshal(data, &head) error
branch and the other t.Skipf at the later fixture check), which hides broken
fixtures; change those t.Skipf calls to t.Fatalf (or t.Fatalf with the same
message and err) so the test fails on malformed or missing canonical fixtures
instead of skipping them, updating both places where t.Skipf is used (the
yaml.Unmarshal error branch and the similar block around lines 345-347).
- Around line 169-175: The loop over e.Args only flags embedded
dynamicIdentifier (⋯) but not embedded '*' tokens; update the condition in the
e.Args loop (the block using e.Args, dynamicIdentifier and add("R-AP-13", ...))
to also detect embedded '*' by adding a second check: if strings.Contains(a,
"*") && a != "*" then call add("R-AP-13", fmt.Sprintf(...)) with the same
message format; ensure both checks remain so standalone tokens
(dynamicIdentifier or "*" alone) are allowed but any embedding (e.g., "foo*bar"
or "foo⋯bar") produces R-AP-13.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: ff75a2fa-6b86-431a-ac61-c53d61e2ac5a

📥 Commits

Reviewing files that changed from the base of the PR and between 98bdf97 and 0cf4a50.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (6)
  • go.mod
  • pkg/objectcache/containerprofilecache/tamper_alert.go
  • pkg/objectcache/containerprofilecache/tamper_alert_test.go
  • pkg/signature/annotations.go
  • pkg/signature/verify.go
  • tests/resources/aplint_test.go

Comment thread pkg/objectcache/containerprofilecache/tamper_alert_test.go
Comment thread tests/resources/aplint_test.go
Comment thread tests/resources/aplint_test.go Outdated
Three new findings on the May 9 review batch:

1. aplint_test.go R-AP-13 (Major) — embedded * was not flagged, only embedded ⋯.
   An arg like "foo*bar" linted clean, defeating the purpose of the
   wildcard-must-be-standalone rule. Added a parallel check for embedded
   wildcardIdentifier mirroring the existing dynamicIdentifier check.

2. aplint_test.go canonical-fixture skip path (Major) — t.Skipf on missing
   reference fixture silently disabled the gold-standard test if someone
   deleted/renamed it. Switched to t.Fatalf with a message that explains
   the file's role. Also switched the YAML-parse-failure path from Skipf
   to Fatalf — un-parseable YAML in tests/resources/ is a fixture-quality
   bug, not something to skip past. Kept Skipf only for the kind!=AP path,
   which correctly handles the legit non-AP fixtures (Deployments,
   Services etc.) that share the directory.

3. tamper_alert_test.go (Trivial nitpick, 'Low value') — added an
   integration-style test that exercises the full verifyUserApplicationProfile
   path: real signature.SignObjectDisableKeyless, tamper the spec,
   call verifyUserApplicationProfile, assert tamperEmitted carries the
   key. Then re-sign at a new RV and assert the dedup is cleared. Confirms
   the wiring 'verifier returns ErrSignatureMismatch → classify as tamper
   → LoadOrStore in dedup map' actually fires, not just the LoadOrStore
   call in isolation. Uses the real cosign adapter — no mocking needed.
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 9, 2026

Performance Benchmark Results

Node-Agent Resource Usage
Metric BEFORE AFTER Delta
Avg CPU (cores) 0.000 0.000 N/A
Peak CPU (cores) 0.000 0.000 N/A
Avg Memory (MiB) 0.000 0.000 N/A
Peak Memory (MiB) 0.000 0.000 N/A
Dedup Effectiveness

No data available.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/objectcache/containerprofilecache/tamper_alert_test.go`:
- Around line 145-159: The test is checking that c.tamperEmitted clears a dedup
key for ResourceVersion "43" but never added that key, so make the test actually
populate then verify removal: before calling
signature.SignObjectDisableKeyless(adapter) or before re-signing, compute newKey
:= tamperKey("ApplicationProfile", profile.Namespace, profile.Name,
profile.ResourceVersion) and insert it into the map (e.g.,
c.tamperEmitted.Store(newKey, struct{}{})), then perform the re-sign and call
c.verifyUserApplicationProfile(...) and finally assert that
c.tamperEmitted.Load(newKey) no longer finds the key; use the existing
tamperKey, c.tamperEmitted, signature.SignObjectDisableKeyless, and
verifyUserApplicationProfile symbols to locate and implement this change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 2148409e-a6cc-460e-8735-cdabf2263662

📥 Commits

Reviewing files that changed from the base of the PR and between 98bdf97 and 8d89240.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (6)
  • go.mod
  • pkg/objectcache/containerprofilecache/tamper_alert.go
  • pkg/objectcache/containerprofilecache/tamper_alert_test.go
  • pkg/signature/annotations.go
  • pkg/signature/verify.go
  • tests/resources/aplint_test.go

Comment on lines +145 to +159
// Re-sign over the mutated content — verification now succeeds, and
// the dedup entry should be cleared so a future tamper at a NEW
// resourceVersion can re-alert.
profile.ResourceVersion = "43" // simulate the cluster bumping RV on update
if err := signature.SignObjectDisableKeyless(adapter); err != nil {
t.Fatalf("re-sign profile: %v", err)
}
ok = c.verifyUserApplicationProfile(profile, "wlid://test/cluster/ns/Pod/p")
if !ok {
t.Errorf("verify after re-sign returned false; expected true")
}
newKey := tamperKey("ApplicationProfile", profile.Namespace, profile.Name, profile.ResourceVersion)
if _, found := c.tamperEmitted.Load(newKey); found {
t.Errorf("tamperEmitted has key %q after a successful re-verify; the verify-clean path must clear it", newKey)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Test assertion doesn't meaningfully verify the clearing behavior.

The test comment says "the dedup entry should be cleared", but key "43" was never added to the map—only key "42" was added during the tamper phase (line 133). The assertion at lines 157-159 checks that a never-added key is absent, which is trivially true.

To actually test the clearing behavior, the test would need to:

  1. Add key "43" to the map (by tampering a profile with RV "43")
  2. Re-sign and verify successfully
  3. Assert key "43" was removed

The production code at tamper_alert.go:78 is correct—this is just a test coverage gap.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/objectcache/containerprofilecache/tamper_alert_test.go` around lines 145
- 159, The test is checking that c.tamperEmitted clears a dedup key for
ResourceVersion "43" but never added that key, so make the test actually
populate then verify removal: before calling
signature.SignObjectDisableKeyless(adapter) or before re-signing, compute newKey
:= tamperKey("ApplicationProfile", profile.Namespace, profile.Name,
profile.ResourceVersion) and insert it into the map (e.g.,
c.tamperEmitted.Store(newKey, struct{}{})), then perform the re-sign and call
c.verifyUserApplicationProfile(...) and finally assert that
c.tamperEmitted.Load(newKey) no longer finds the key; use the existing
tamperKey, c.tamperEmitted, signature.SignObjectDisableKeyless, and
verifyUserApplicationProfile symbols to locate and implement this change.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 9, 2026

Performance Benchmark Results

Node-Agent Resource Usage
Metric BEFORE AFTER Delta
Avg CPU (cores) 0.222 0.219 -1.3%
Peak CPU (cores) 0.236 0.234 -0.9%
Avg Memory (MiB) 319.062 272.515 -14.6%
Peak Memory (MiB) 321.348 274.832 -14.5%
Dedup Effectiveness (AFTER only)
Event Type Passed Deduped Ratio
capabilities 2 0 0.0%
hardlink 6000 0 0.0%
http 1761 119401 98.5%
network 901 78000 98.9%
open 31809 619943 95.1%
symlink 6000 0 0.0%
syscall 993 1895 65.6%
Event Counters
Metric BEFORE AFTER
capability_counter 10 10
dns_counter 1412 1398
exec_counter 7061 6993
network_counter 92884 91966
open_counter 767916 761127
syscall_counter 3511 3365

Entlein added 3 commits May 9, 2026 22:56
CodeRabbit nitpick on PR #38 (tamper_alert_test.go:159): the prior
version of TestVerifyAP_TamperedProfile_PopulatesDedupMap bumped the
profile's ResourceVersion to '43' before the re-sign, then asserted
that the key for RV='43' was absent from tamperEmitted. But '43' was
never added in the first place — only '42' was added during the tamper
phase. The assertion was trivially true.

Drop the RV bump: re-sign at the SAME RV='42' so that
verifyUserApplicationProfile takes the verify-clean branch and Deletes
the existing dedup entry. The assertion now actually exercises the
clearing path.

Production code unchanged.
Chore PR #39 added 'permissions: read-all' to every workflow file
including the workflow_call reusables. That broke the reusable-call
contract: a called workflow's top-level permissions are the FLOOR the
caller must grant. Callers of these reusables (pr-created.yaml's
pr-created job, pr-merged.yaml's pr-merged job, etc.) only grant a
narrow set of scopes (id-token:write, packages:write, security-events:
write, pull-requests:write, contents). They do NOT grant the full
read-all set (actions:read, artifact-metadata:read, attestations:read,
checks:read, deployments:read, discussions:read, issues:read,
models:read, vulnerability-alerts:read, packages:read, pages:read,
repository-projects:read, statuses:read, id-token:read).

Result: every PR opened on this fork since chore #39 merged (May 6)
has had pr-created.yaml startup_failure with:

  Error calling workflow ... incluster-comp-pr-created.yaml@<sha>.
  The workflow is requesting 'actions: read, ...', but is only allowed [...]

Fix: strip top-level 'permissions: read-all' from the four reusables:
- benchmark.yaml (also direct-trigger; falls back to per-job perms)
- go-basic-tests.yaml
- incluster-comp-pr-created.yaml
- incluster-comp-pr-merged.yaml

Each reusable's per-job 'permissions:' blocks remain intact and
correctly request only what the job needs. Callers' existing grants
are sufficient.

Top-level 'permissions: read-all' stays on the OUTERMOST workflows
(pr-created.yaml, pr-merged.yaml, build.yaml, bypass.yaml,
component-tests.yaml, sign-object.yaml, scorecard.yml) — they're
event-triggered, not workflow_call'd, so the read-all default still
hardens GITHUB_TOKEN there.
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 9, 2026

Performance Benchmark Results

Node-Agent Resource Usage
Metric BEFORE AFTER Delta
Avg CPU (cores) 0.000 0.000 N/A
Peak CPU (cores) 0.000 0.000 N/A
Avg Memory (MiB) 0.000 0.000 N/A
Peak Memory (MiB) 0.000 0.000 N/A
Dedup Effectiveness

No data available.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 9, 2026

Performance Benchmark Results

Node-Agent Resource Usage
Metric BEFORE AFTER Delta
Avg CPU (cores) 0.219 0.222 +1.5%
Peak CPU (cores) 0.239 0.235 -1.8%
Avg Memory (MiB) 304.403 268.783 -11.7%
Peak Memory (MiB) 306.719 277.336 -9.6%
Dedup Effectiveness (AFTER only)
Event Type Passed Deduped Ratio
capabilities 1 0 0.0%
hardlink 6000 0 0.0%
http 1705 119455 98.6%
network 903 77997 98.9%
open 31812 619937 95.1%
symlink 6000 0 0.0%
syscall 978 1894 65.9%
Event Counters
Metric BEFORE AFTER
capability_counter 10 10
dns_counter 1448 1441
exec_counter 7244 7208
network_counter 95250 94802
open_counter 788255 782938
syscall_counter 3513 3493

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant