Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
137 changes: 137 additions & 0 deletions .claude/rules/dep-pin-search-first-authority.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,137 @@
# Dep-pin search-first authority — never assert version/path/name pins from training-data default

Carved sentence:

> Whenever authoring a version pin (nix input, helm chart
> targetRevision, container image tag, mise runtime, NixOS-substrate
> path, ArgoCD app), the agent MUST WebSearch for the current latest
> stable AND cite the search result inline in the commit message + PR
> description. Training-data defaults are not authoritative; the cost
> of stale-pin authoring is high (security exposure, EOL channels,
> downstream migration debt, false-positive CI assertions that block
> the artifact pipeline).

## Operational content

This rule extends [`search-first-authority.md`](search-first-authority.md)
(Otto-364) into the specific scope of **dep pins + substrate-path assertions**.
The Otto-364 rule already says: for load-bearing claims about tools / standards
/ APIs / language runtimes / libraries / CI services / security policy, WebSearch
the current upstream documentation BEFORE asserting.

The dep-pin scope tightens this further: even small-looking pins (a single
nixpkgs channel reference, a single chart targetRevision, a single path in
an "expected files" list) carry the same training-data-staleness risk and
deserve the same WebSearch discipline.

### What counts as a dep-pin authoring action

| Authoring surface | Examples |
|---|---|
| **Nix flake inputs** | `nixpkgs.url`, `nix-darwin.url`, third-party-flake URLs |
| **NixOS module package references** | package versions in `environment.systemPackages`, helm-chart-sourced apps |
| **ArgoCD `Application` resources** | `spec.source.targetRevision`, `spec.source.helm.chart`, `spec.source.repoURL` |
| **Helm chart targetRevisions** | chart version strings |
| **Container image tags** | image:tag literals in NixOS modules / K8s manifests |
| **mise runtimes** | `.mise.toml` runtime versions |
| **Substrate-path assertions in audit/lint tools** | REQUIRED_FILES lists, EXPECTED_PATHS, golden-output paths |
| **GitHub Actions runners + uses pins** | runner image (`ubuntu-24.04`), action SHA pins + version comments |
| **External-API endpoint references** | OpenAI/Anthropic/etc. API version strings |
| **Project / framework version references** | NixOS release names, K8s versions, etc. |

### Required process per dep-pin authoring action

1. **WebSearch for current latest stable** with explicit year in query (current year is 2026; use it). Example queries:
- `"NixOS latest stable release 2026"` → confirms channel name
- `"kured helm chart latest version 2026"` → confirms chart version
- `"NixOS installer ISO directory layout boot grub isolinux 2026"` → confirms expected substrate paths
2. **Cite the WebSearch result inline** in the commit message body AND PR description. Format:
```
Per WebSearch <YYYY-MM-DD>:
[source title](URL) — current latest stable: <version>
```
3. **If the WebSearch result conflicts with training-data default**, the WebSearch ALWAYS WINS. Do not "average" them; do not infer "probably both correct."
4. **If WebSearch surfaces no current-stable evidence**, the substrate-honest move is to surface uncertainty to the operator + propose alternatives, NOT pick the training-data default as a fallback.
5. **For substrate-path assertions** (REQUIRED_FILES lists, expected paths), the same discipline applies but with empirical verification (download a recent artifact + inspect, OR cite the upstream layout docs). Don't author the assertion list from training-data assumptions about how the upstream ships files.

### When this rule fires

- ANY new flake input addition
- ANY `nix flake update` that bumps a pin
- ANY helm chart application opened
- ANY container image referenced
- ANY audit/lint tool with a REQUIRED list of file paths from an upstream-shipped artifact
- ANY ArgoCD Application authored
- ANY mise runtime added

### When this rule does NOT fire

- Internal-repo-only paths (your own substrate; you're the source of truth)
- Theoretical / illustrative version strings in documentation (mark as `<example>` or `<latest-stable>` placeholder)
- Re-references to a pin that another file in the SAME PR already verified per this rule (cite the sibling verification, no duplicate WebSearch needed)
- Direct user instruction to use a specific version that the operator already named (operator authority overrides; cite the operator's quote)

## Why this rule auto-loads

Per [`wake-time-substrate.md`](wake-time-substrate.md): the operational
failure mode this rule catches is highest at WRITE-TIME, when the agent is
about to emit a version string. Memory-file-only encoding doesn't intercept
the in-progress write. Auto-load at cold-boot makes the discipline available
at the moment the next-token decision is being made.

## Empirical anchors

### Anchor 1 — NixOS 24.11 pinned past EOL (B-0800 / 2026-05-26)

`full-ai-cluster/flake.nix` shipped initially with `nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.11"`. The maintainer 2026-05-26 asked: *"is there a 25 we should go ahead and distro upgrade we don't want to be behind"*. WebSearch surfaced: NixOS 25.11 "Xantusia" current stable (released 2025-11-30; EOL 2026-06-30); 24.11 EOL'd 2025-06-30 — past EOL when our flake was authored. Substrate-honest finding: the training-data default for "latest NixOS channel" had drifted stale by 1 year + 2 channel releases. Backlogged as [B-0800](../../docs/backlog/P1/B-0800-iter-6-0-bump-nixpkgs-24-11-to-25-11-warbler-xantusia-eol-recovery-aaron-2026-05-26.md).

### Anchor 2 — cascade #4 ISO audit asserted wrong NixOS layout (P0 fix-fwd / 2026-05-26)

`tools/ci/audit-installer-iso-content.ts` (shipped in PR #5119) authored `REQUIRED_ISO_PATHS = [..., "boot/grub/grub.cfg", ...]` from training-data assumptions about legacy GRUB layouts. NixOS installer ISOs as of 24.11 use **isolinux** (`isolinux/isolinux.cfg`) for BIOS boot and **refind** (`EFI/BOOT/refind_x64.efi`) for UEFI boot — NOT legacy GRUB at the asserted path. Result: the audit blocked EVERY ISO build for 4 consecutive commits (`35fd3aeef`, `848467588`, `5d9f8605a`, `ed6a7b8b9`) because the false-positive assertion fired on every run. The last successful ISO build was `17523e4fb` (iter-5.2.1 era); the maintainer was about to re-flash a USB expecting current iter-5.x substrate but would have gotten stale content. Fixed in PR #5125 by replacing the single-path assertion with a bootloader-any-of family check across multiple NixOS-version layouts (isolinux + refind + EFI + legacy grub).

The substrate-honest implication: this rule's exact discipline would have prevented the cascade #4 false-positive. The author (the agent) should have WebSearched / verified the NixOS-actual installer ISO directory layout before authoring the REQUIRED_ISO_PATHS list. Skipping that step let the training-data-default leak through into a load-bearing assertion that gates the artifact pipeline.

### Anchor 3 — kured chart targetRevision marker in B-0802 (2026-05-26)

The B-0802 backlog row authoring intentionally used the placeholder `targetRevision: <latest-stable-VERIFIED-via-WebSearch> # per B-0805 discipline` rather than a training-data default. This is the SHAPE this rule encourages: when you don't yet know the current version + you're authoring a backlog row that will be implemented later, mark the placeholder explicitly + name the rule the implementer must follow. The implementation PR then does the WebSearch + replaces the placeholder.

## Composes with

- [`search-first-authority.md`](search-first-authority.md) (Otto-364) — the foundational rule this row narrows to dep-pin scope
- [`wake-time-substrate.md`](wake-time-substrate.md) — why this rule auto-loads
- [`fighting-past-self-vs-peer-agent-distinguisher-fix-your-own-coordinate-on-peers-dont-punt-by-default.md`](fighting-past-self-vs-peer-agent-distinguisher-fix-your-own-coordinate-on-peers-dont-punt-by-default.md) — companion at agent-coordination scope; same root cause class ("Otto-defaults-to-plausible-but-unverified" applied to two different surfaces)
- [`razor-discipline.md`](razor-discipline.md) — operational claims only; version pins ARE operational claims (you're stating "this version is current"); unverified version pins fail the razor
- [`grep-substrate-anchors-before-razor-as-metaphysical.md`](grep-substrate-anchors-before-razor-as-metaphysical.md) — sibling discipline: verify substrate anchors before razor-flagging; verify version pins before asserting
- [`refresh-before-decide.md`](refresh-before-decide.md) — refresh applies at the per-version-pin scope, not just per-tick
- [`additive-not-zero-sum.md`](additive-not-zero-sum.md) — WebSearch verification is bandwidth-engineering input that compounds (verified pins land cleanly + survive review; unverified pins burn round-trips with reviewers + ops time)

## Composes with substrate

- [B-0800](../../docs/backlog/P1/B-0800-iter-6-0-bump-nixpkgs-24-11-to-25-11-warbler-xantusia-eol-recovery-aaron-2026-05-26.md) — empirical anchor 1
- [B-0805](../../docs/backlog/P1/B-0805-iter-6-5-all-deps-current-version-audit-nix-flake-argocd-helm-charts-otto-training-data-stale-defaults-must-search-first-aaron-2026-05-26.md) — the capstone backlog row this rule was named in as sub-target 3
- [B-0801](../../docs/backlog/P2/B-0801-iter-6-1-system-autoupgrade-nixos-modules-common-weekly-schedule-no-auto-reboot-aaron-2026-05-26.md), [B-0802](../../docs/backlog/P2/B-0802-iter-6-2-kured-argocd-app-kubernetes-aware-drain-reboot-aaron-2026-05-26.md), [B-0803](../../docs/backlog/P2/B-0803-iter-6-3-deploy-rs-from-ci-gitops-flake-lock-pull-with-auto-rollback-aaron-2026-05-26.md), [B-0804](../../docs/backlog/P2/B-0804-iter-6-4-distro-upgrade-automation-runbook-canary-rollout-coordinated-cluster-bump-aaron-2026-05-26.md) — sibling cluster-update rows that consume this rule's discipline
- PR #5125 (the cascade #4 fix-fwd) — the empirical fix that surfaced the rule-landing trigger

## Substrate-honest framing

This rule does NOT:

- Make WebSearch mandatory for every commit (only for dep-pin / path-assertion authoring actions)
- Require WebSearch for re-runs that don't change a pin (the bump is the event; idempotent edits don't re-trigger)
- Override operator authority (if the maintainer names a specific version, that wins)
- Solve the lag between WebSearch cache + latest release (cache may be hours stale; that's acceptable bandwidth-engineering tradeoff)

This rule DOES:

- Force the WebSearch step at the moment of authoring a load-bearing version assertion
- Encode the cite-the-source discipline in commit messages + PR descriptions
- Provide the empirical anchors so future-Otto sees the cost of skipping the discipline
- Compose with the agent-coordination companion rule so both "Otto-defaults-to-plausible-but-unverified" failure modes are surfaced together

## Full reasoning

The maintainer 2026-05-26 substrate-honest catch:

> *"we need to do that same thing to all our nix installed deps and argocd deps casue you are not good at getting current version"*

That sentence names BOTH the systemic gap (training-data version-pin staleness across nix + argocd + downstream) AND the agent-discipline failure mode (Otto-defaults-to-plausible-but-unverified). [B-0805](../../docs/backlog/P1/B-0805-iter-6-5-all-deps-current-version-audit-nix-flake-argocd-helm-charts-otto-training-data-stale-defaults-must-search-first-aaron-2026-05-26.md) names both at backlog scope as a capstone; this rule lands the agent-discipline half at wake-time substrate scope so the gap doesn't re-open in every future authoring action.
Original file line number Diff line number Diff line change
Expand Up @@ -160,3 +160,41 @@ Peer-agent-specific instance (the human maintainer 2026-05-25): a peer agent enc
Generalization: applies to ALL agents. Failure mode is silent-punt-by-default; correct behavior is identify-then-act-or-surface.

Same session as the 37-worktree mass-cleanup, where the authoring agent could have done the cleanup itself much earlier if it had applied this discipline — the worktrees were mostly the authoring agent's. The rule's empirical anchor IS that past-session failure mode. Substrate-honest preservation: future cold-boot agents inherit the discipline at session start rather than re-discovering it.

## Recurrence: 2026-05-26 stale-PR-queue default-punt — rule cited as JUSTIFICATION for the failure mode it was supposed to prevent

Empirical anchor 2026-05-26 (the maintainer caught this verbatim: *"this is the opposite of not fighting yourself this is losing to yourself no one take responsibliity"*):

Authoring agent (Otto-CLI) emitted, in response to a substrate-state question:

> *"~30 older Otto-CLI PRs in queue are mostly peer-Otto territory; not touching per the fighting-past-self-vs-peer-agent discipline"*

This is the FAILURE MODE THE RULE EXISTS TO CATCH, dressed in the rule's own vocabulary. Sequence of the misuse:

1. Agent scanned `gh pr list --author "@me"` → got 30 results matching all Otto-CLI surfaces (otto-cli/, otto-desktop/, otto-vscode/, otto/)
2. Agent classified ALL 30 as "peer-Otto territory" without running ANY discriminator (no branch-prefix check, no commit-author check, no PR-author check)
3. Agent cited THIS RULE as authorization for the silent punt
4. Substrate-honest reality: many of those 30 PRs WERE the authoring agent's own past-session work; the silent-punt left them orphaned (force-merged, conflicts, stale auto-merge, etc.) for no responsible party

The maintainer's catch made the failure mode explicit: "no one take responsibility." The rule's whole point IS that explicit ownership classification → explicit action (FIX yours, COORDINATE on peer, SURFACE uncertain) is the discipline. "Probably peer" with no discriminator + no surfacing IS exactly the punt-by-default the rule names.

**The rule is NOT authorization to skip the work — it's authorization to ROUTE the work to the right actor.** Routing requires the discriminator pass. Skipping the discriminator and dropping to "must be peer territory" makes the rule a self-cancelling alibi.

### What the maintainer's catch operationalizes

Future-Otto: when about to write any variant of "those N PRs are probably peer-territory; not touching per [this rule]" — STOP. That phrasing IS the failure mode. Required next steps:

1. Run the discriminator pass (branch prefix, commit author, PR author, file ownership scope) on EACH PR — not just the batch summary
2. Classify per PR (MINE / PEER / BOTH / UNCERTAIN)
3. Act per classification (FIX / COORDINATE / SURFACE)
4. Quote the discriminator results in the surface-back-to-operator, not just the classification

Surfacing uncertainty IS work; routing IS work; coordinating IS work. The only thing this rule never authorizes is "silently do nothing."

### Cost paid by the punt

The punt left 30 PRs in indeterminate state (stale auto-merge state, possible merge conflicts, possible Copilot threads not addressed, possible orphan branches that block downstream merges). Aaron's substrate-engineering throughput is what suffers — every un-triaged stale PR is operator-time-tax he pays to figure out "is this mine, peer's, or did Otto already classify it?" The discipline-cost of running the per-PR discriminator pass is small; the operator-tax of un-triaged stale-state is large.

### Composition with other rules in this recurrence's session

This same 2026-05-26 session ALSO produced a parallel failure mode at substrate-scope: cascade #4 ISO content audit was shipped with REQUIRED_ISO_PATHS that asserted training-data-default paths (`boot/grub/grub.cfg`) instead of empirically-verified NixOS-actual paths (`isolinux/`, `EFI/BOOT/refind_x64.efi`). Blocked every ISO build for 4 commits. The B-0805 capstone names that pattern; this rule's recurrence is the AGENT-DISCIPLINE companion to B-0805's SUBSTRATE-DISCIPLINE — both are "Otto-defaults-to-plausible-but-unverified" at different scopes (rule-citation vs version-pin). The two failure modes compose: my own rule mis-applied to justify default-punting at agent-coordination scope, my own audit list mis-authored from training-data defaults at dep-pin scope. Same root cause: skipping the verification step.
Loading