diff --git a/.claude/rules/refresh-before-decide.md b/.claude/rules/refresh-before-decide.md
index aaab7c7614..f91803aa3f 100644
--- a/.claude/rules/refresh-before-decide.md
+++ b/.claude/rules/refresh-before-decide.md
@@ -20,6 +20,75 @@ the bug class this discipline is designed to catch.
 to skip is constant and is the most violated invariant in agent loops
 generally.
 
+## `git fetch` updates refs but NOT working-tree files (post-fetch read trap)
+
+`git fetch origin main` (the canonical Step-1 refresh action) updates
+`refs/remotes/origin/main` but does NOT promote local HEAD or update
+the working-tree files. Any subsequent `Read` / `cat` / `grep` against
+working-tree paths reads files at the LOCAL HEAD's state — stale
+against `origin/main` if local hasn't been ff-promoted.
+
+The failure mode: agent runs `git fetch`, sees `* branch main ->
+FETCH_HEAD` success, reads `tools/foo.ts` via the working tree, and
+authors substrate against state that may already be resolved on
+`origin/main` N commits ahead. The substrate landing is a phantom
+"drift" finding that requires retraction.
+
+**Three mitigation patterns** (pick by context):
+
+1. **Isolated worktree off `origin/main`** (default for agent ticks):
+   `git worktree add --detach <path> origin/main` per
+   [`refresh-world-model-poll-pr-gate.md`](refresh-world-model-poll-pr-gate.md)
+   "Prefer `origin/main` over `FETCH_HEAD`" subsection. Gets the fresh
+   state without touching the operator's primary checkout. Composes
+   with [`agent-worktree-hygiene-never-hold-main-never-step-on-operator-cleanup-on-pr-merge.md`](agent-worktree-hygiene-never-hold-main-never-step-on-operator-cleanup-on-pr-merge.md)
+   `--detach` discipline (never hold `main` branch ref in agent
+   worktrees).
+2. **`git show origin/main:<path>`** for ad-hoc single-file inspection
+   without checkout — reads the file content directly from the
+   remote-tracking ref's tree, bypassing working-tree state entirely.
+   Cheapest option; ideal for sub-substrate inventory passes.
+3. **ff-promote local HEAD** only when the checkout is the agent's
+   own (NOT the operator's primary) — `git merge --ff-only origin/main`.
+   Touching the operator's primary checkout is forbidden by the
+   agent-worktree-hygiene rule above.
+
+**Empirical anchor 2026-05-26T10:08Z**: Otto-CLI autonomous-loop tick
+ran `git fetch origin main` (success), then read
+`tools/alignment/filter_gate_log.ts` + `filter_gate_log.test.ts` via
+working-tree paths in the operator's primary checkout. Both files
+appeared unfixed despite [PR #5128](https://github.com/Lucent-Financial-Group/Zeta/pull/5128)
+having landed the fix 1h 46min earlier at 08:22Z. The local primary
+checkout was 11 commits behind `origin/main` (local `2774fef5a` vs
+origin `1641da6d2`). The initial reading was generating a candidate
+"PR #5128 substrate-drifted" finding that would have been a phantom
+catch had the agent committed it; this rule extension catches that
+class before the false-positive lands. The shard at
+[`docs/hygiene-history/ticks/2026/05/26/1008Z.md`](../../docs/hygiene-history/ticks/2026/05/26/1008Z.md)
+carries the full empirical trace.
+
+**Existing partial coverage** at narrower scope already lives at
+[`otto-channels-reference-card.md`](otto-channels-reference-card.md)
+ID-allocation section: "do NOT use `find docs/backlog -name B-*.md`
+on the local worktree. The local working tree may be on a stale
+HEAD." That precedent is scoped to ID-allocation queries (find on
+the backlog tree); this extension generalizes the same principle to
+*any* working-tree file read post-fetch, and lands it on the
+`refresh-before-decide` surface where it auto-loads at every cold-boot.
+Substrate inventory performed per
+[`verify-existing-substrate-before-authoring.md`](verify-existing-substrate-before-authoring.md):
+searched `git fetch` + `FETCH_HEAD` + `local HEAD stale` + `ff-only` +
+`fast-forward` + `git show origin/main` + `stale local` across
+`.claude/rules/` + `memory/` before authoring this extension.
+
+## Composes with
+
+- [`agent-worktree-hygiene-never-hold-main-never-step-on-operator-cleanup-on-pr-merge.md`](agent-worktree-hygiene-never-hold-main-never-step-on-operator-cleanup-on-pr-merge.md) — operator primary checkout MUST NOT be ff'd by agents; use isolated worktree off `origin/main`
+- [`refresh-world-model-poll-pr-gate.md`](refresh-world-model-poll-pr-gate.md) — "Prefer `origin/main` over `FETCH_HEAD`" empirical anchor at 2026-05-20T16:14Z, plus saturation-tier discipline
+- [`otto-channels-reference-card.md`](otto-channels-reference-card.md) — ID-allocation narrow-scope precedent (this extension generalizes)
+- [`verify-existing-substrate-before-authoring.md`](verify-existing-substrate-before-authoring.md) — substrate-inventory discipline used to compose this extension rather than mint parallel
+- [`dep-pin-search-first-authority.md`](dep-pin-search-first-authority.md) — sibling rule at version-pin scope; same "Otto-defaults-to-plausible-but-unverified" root cause class
+
 ## Full reasoning
 
 `memory/feedback_refresh_before_decide_invariant_two_layer_print_dx_claudeai_2026_05_01.md`
diff --git a/docs/hygiene-history/ticks/2026/05/26/1008Z.md b/docs/hygiene-history/ticks/2026/05/26/1008Z.md
new file mode 100644
index 0000000000..29847b4be0
--- /dev/null
+++ b/docs/hygiene-history/ticks/2026/05/26/1008Z.md
@@ -0,0 +1,151 @@
+---
+tick: 2026-05-26T10:08Z
+surface: otto-cli
+session: cold-boot
+sentinel_state: missing-at-start (catch-43 fired)
+sentinel_armed: 580bbe2a
+graphql_tier: Normal (4755/5000)
+rest_core: 4987/5000
+stuck_git_procs: 0
+peer_procs: 0
+brief_ack_counter: 0
+local_head_at_start: 2774fef5a (11 commits behind origin/main 1641da6d2)
+disposition: substantive-rule-extension + phantom-catch-prevented
+---
+
+# Tick 1008Z — refresh-before-decide caught a stale-local phantom catch
+
+## Step 1 — Refresh worldview
+
+- `CronList` returned empty at session start → catch-43 fired → `CronCreate`
+  armed `<>` cron `* * * * *` → job ID `580bbe2a`
+- `git fetch origin main` → `* branch main -> FETCH_HEAD` clean
+- Recent merges (08:00Z–08:30Z): #5037 docs-archive, #5127 B-0798
+  classifier-bypass research boundary, #5133 B-0811 re-land, #5110 B-0421
+  grok-build native wrapper, #5131 verify-existing-substrate rule, #5132
+  classifier-bypass ID renumber, #5130 B-0806 Ace substrate-honest
+  correction, #5129 B-0806 iter-7 Ansible+GitOps, #5128 B-0058 filter-gate
+  test pollution fix, #5125 cascade-#4 bootloader-any-of
+- 0 stuck git pack/maintenance/repack procs; 0 peer Lior procs; GraphQL
+  Normal (4755/5000, 49min to reset); REST core 4987/5000
+- 0 open PRs in my surface lanes (otto-cli / otto-desktop / otto-vscode / otto)
+- 2 tick shards already today on origin/main: 0410Z + 0608Z (the 0608Z
+  shard documented the prior cold-boot's PR #5108 named-wait)
+
+## Step 2 — Holding discipline
+
+Brief-ack counter 0. No prior brief-acks; no named bounded wait. Substantive
+work picked per never-be-idle priority ladder.
+
+## Step 3 — Pick speculative work
+
+Step-1 worldview refresh surfaced untracked `tools/alignment/out/filter-gate-log.jsonl`
+(400 bytes, 2 entries from `"author":"Lior"` 2026-05-25T22:52Z). Initial
+investigation noted: the file is NOT gitignored, the entries are
+`skill:test-entry` integration-test data, mtime predates today.
+
+Reading `tools/alignment/filter_gate_log.ts:104` showed `logFilePath()` as
+`return join(repoRoot(), LOG_PATH);` with no env-var override; reading
+`tools/alignment/filter_gate_log.test.ts:301-310` showed the polluting
+`--record` test still calling `main()` with production args directly.
+Candidate finding being assembled: "PR #5128 (B-0058 fix that landed
+08:22Z per `git log`) claimed env-var override + `mkdtempSync` test
+isolation but the code on the current checkout shows neither — substrate
+drift."
+
+**The catch — refresh-before-decide caught a stale-local phantom**:
+
+- `git rev-parse HEAD` = `2774fef5a` (local primary)
+- `git rev-parse origin/main` = `1641da6d2` (post-fetch tracking ref)
+- Local is **11 commits behind** including PR #5128
+- `gh pr diff 5128` confirms the fix did land — the diff contains both the
+  `mkdtempSync` test refactor AND the `FILTER_GATE_LOG_PATH` env override
+- The untracked `filter-gate-log.jsonl` is pre-fix Lior pollution that
+  PR #5128's own test plan explicitly asserts should be absent post-fix
+
+The whole "substrate-drift" finding was a phantom of stale-local working
+tree state. `git fetch origin main` updated the remote-tracking ref but
+did NOT promote local HEAD; `Read` against working-tree paths returned
+stale files. Without `refresh-before-decide` catching it, the next
+substrate landing would have been a public PR retracting against
+already-resolved state.
+
+## Step 4 — Verify + commit
+
+Substantive substrate landing: extension of [`refresh-before-decide.md`](../../../../../../.claude/rules/refresh-before-decide.md)
+with new section "**`git fetch` updates refs but NOT working-tree files
+(post-fetch read trap)**" — names the failure mode, the three mitigation
+patterns (isolated worktree off `origin/main` as default; `git show
+origin/main:<path>` for ad-hoc; ff-promote for agent's own checkout),
+the 2026-05-26T10:08Z empirical anchor, citation to existing partial
+coverage at [`otto-channels-reference-card.md`](../../../../../../.claude/rules/otto-channels-reference-card.md)
+ID-allocation section, composes_with for 5 sibling rules.
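The post-fetch read trap described above can be reproduced in scratch repositories. The following is a minimal sketch and not part of the PR: the repository names (`seed`, `origin.git`, `stale`, `writer`), the file `foo.txt`, and the helper `g` are all hypothetical, and it assumes `git` and `mktemp` are available on `PATH`. It shows that after `git fetch origin main` the working-tree file still holds the old content, while `git show origin/main:<path>` returns the new content.

```shell
#!/bin/sh
# Sketch: reproduce the post-fetch read trap in scratch repositories.
# All names (seed, origin.git, stale, writer, foo.txt, g) are illustrative.
set -eu

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Wrapper so commits work without a global git identity configured.
g() {
  git -c user.name=demo -c user.email=demo@example.invalid "$@"
}

# Build an "origin" with one commit on main.
g init -q "$tmp/seed"
g -C "$tmp/seed" symbolic-ref HEAD refs/heads/main
( cd "$tmp/seed" && echo v1 > foo.txt && g add foo.txt && g commit -q -m v1 )
g clone -q --bare "$tmp/seed" "$tmp/origin.git"

# Two clones: one goes stale, the other lands a fix upstream.
g clone -q "$tmp/origin.git" "$tmp/stale"
g clone -q "$tmp/origin.git" "$tmp/writer"
( cd "$tmp/writer" && echo v2 > foo.txt && g commit -q -a -m v2 \
    && g push -q origin main )

# Refresh refs only: local HEAD and the working tree are left alone.
cd "$tmp/stale"
g fetch -q origin main

working=$(cat foo.txt)                          # working tree, at local HEAD
fresh=$(g show origin/main:foo.txt)             # remote-tracking ref's tree
behind=$(g rev-list --count HEAD..origin/main)  # commits local HEAD lacks

echo "working tree: $working"
echo "origin/main:  $fresh"
echo "behind by:    $behind"
```

Under these assumptions the working tree still reads `v1` while `origin/main` already holds `v2`, and a non-zero `behind` count is the signal that a working-tree read would be stale.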
+
+**Verify-before-defer composition** (8th anchor in series):
+
+- `git worktree add --detach /private/tmp/zeta-otto-cli-refresh-extend-1008z origin/main` →
+  exit=0, "HEAD is now at 1641da6d2", `Updating files: 100% (6690/6690), done`
+- Post-creation guards: `rev-parse HEAD = 1641da6d2ce995ac` matches
+  origin/main; `ls-tree HEAD = 61` matches baseline; `status --short = 0`
+  empty; no stale `index.lock`
+- Empirical N=9 if counted: dotgit-NOT-saturated (0 stuck procs) +
+  0 peer procs at 10:08Z; the verify-before-defer composition is the
+  default pattern under these conditions per
+  [`codeql-no-source-on-docs-only-pr-is-broken-commit-canary.md`](../../../../../../.claude/rules/codeql-no-source-on-docs-only-pr-is-broken-commit-canary.md)
+
+**Substrate-inventory step** performed per
+[`verify-existing-substrate-before-authoring.md`](../../../../../../.claude/rules/verify-existing-substrate-before-authoring.md)
+(this rule landed via PR #5131 — itself only visible to me via
+`git show origin/main:<path>` because local was stale):
+
+- Searched `.claude/rules/` + `memory/` for: `git fetch`, `FETCH_HEAD`,
+  `local HEAD stale`, `ff-only`, `fast-forward`, `git show origin/main`,
+  `stale local`
+- Hits: [`otto-channels-reference-card.md`](../../../../../../.claude/rules/otto-channels-reference-card.md)
+  ID-allocation section (narrow scope), [`refresh-world-model-poll-pr-gate.md`](../../../../../../.claude/rules/refresh-world-model-poll-pr-gate.md)
+  ("Prefer `origin/main` over `FETCH_HEAD`", `unable to update local ref`
+  wedge, dotgit-saturation tier)
+- Conclusion: existing substrate covers the principle at narrow scope
+  (ID-allocation queries); the general "any working-tree-file-read
+  post-fetch is stale" corollary is implicit but not explicit on a
+  cold-boot-loaded rule surface
+- Disposition: extend (not mint parallel) per the rule
+
+## Step 5 — This shard
+
+[`docs/hygiene-history/ticks/2026/05/26/1008Z.md`](../../../../../../docs/hygiene-history/ticks/2026/05/26/1008Z.md)
+(this file)
+
+## Step 6 — Cron check
+
+Sentinel armed at session start: `CronCreate` returned job ID `580bbe2a`
+(`<>` cron `* * * * *`). Session-only per canonical
+[`tick-must-never-stop.md`](../../../../../../.claude/rules/tick-must-never-stop.md).
+
+## Step 7 — Visibility signal
+
+Substantive substrate-engineering landing this tick:
+
+1. `.claude/rules/refresh-before-decide.md` — extension naming the
+   post-fetch-working-tree-read trap, 3 mitigation patterns, this
+   tick's empirical anchor, 5 composes_with citations
+2. `docs/hygiene-history/ticks/2026/05/26/1008Z.md` — this shard
+3. Phantom "PR #5128 substrate-drift" finding caught BEFORE landing in
+   any public substrate; the catching discipline IS the rule extension
+   being landed
+
+Counter reset condition #3 satisfied (real decomposition work — bounded,
+concrete, committed, pushed).
+
+## Composes with
+
+- [`.claude/rules/refresh-before-decide.md`](../../../../../../.claude/rules/refresh-before-decide.md) — the rule this tick extends
+- [`.claude/rules/verify-existing-substrate-before-authoring.md`](../../../../../../.claude/rules/verify-existing-substrate-before-authoring.md) — PR #5131; substrate-inventory step that prevented mint-parallel
+- [`.claude/rules/dep-pin-search-first-authority.md`](../../../../../../.claude/rules/dep-pin-search-first-authority.md) — sibling discipline at version-pin scope
+- [`.claude/rules/agent-worktree-hygiene-never-hold-main-never-step-on-operator-cleanup-on-pr-merge.md`](../../../../../../.claude/rules/agent-worktree-hygiene-never-hold-main-never-step-on-operator-cleanup-on-pr-merge.md) — `--detach origin/main` discipline; never touch operator's primary
+- [`.claude/rules/refresh-world-model-poll-pr-gate.md`](../../../../../../.claude/rules/refresh-world-model-poll-pr-gate.md) — "Prefer `origin/main` over `FETCH_HEAD`" + saturation tiers
+- [`.claude/rules/otto-channels-reference-card.md`](../../../../../../.claude/rules/otto-channels-reference-card.md) — ID-allocation narrow-scope precedent
+- [`.claude/rules/codeql-no-source-on-docs-only-pr-is-broken-commit-canary.md`](../../../../../../.claude/rules/codeql-no-source-on-docs-only-pr-is-broken-commit-canary.md) — verify-before-defer composition empirical anchor
+- [`.claude/rules/holding-without-named-dependency-is-standing-by-failure.md`](../../../../../../.claude/rules/holding-without-named-dependency-is-standing-by-failure.md) — counter-reset condition #3 (real decomposition work)
+- [PR #5128](https://github.com/Lucent-Financial-Group/Zeta/pull/5128) — the fix whose phantom-drift catch this tick prevented
+- [PR #5131](https://github.com/Lucent-Financial-Group/Zeta/pull/5131) — verify-existing-substrate rule used in the substrate-inventory step
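
The isolated-worktree pattern and the post-creation guards listed under Step 4 can be sketched as one helper. This is a hedged illustration, not the repository's tooling: the function name `fresh_worktree`, the helper `g`, and the scratch repositories used to exercise it are hypothetical, and the `ls-tree` and `index.lock` guards from the shard are omitted for brevity. The helper creates a detached worktree at `origin/main`, then refuses to proceed unless the new worktree's `HEAD` equals `origin/main` and `git status --short` is empty.

```shell
#!/bin/sh
# Sketch: detached worktree off origin/main plus post-creation guards.
# fresh_worktree, g, and the scratch repo names are illustrative.
set -eu

# Usage: fresh_worktree <repo-dir> <absolute-worktree-path>
fresh_worktree() {
  repo=$1
  wt=$2
  git -C "$repo" fetch -q origin main
  # --detach: never hold the main branch ref in an agent worktree.
  # Progress output goes to stderr so stdout carries only the path.
  git -C "$repo" worktree add --detach "$wt" origin/main >&2
  want=$(git -C "$repo" rev-parse origin/main)
  got=$(git -C "$wt" rev-parse HEAD)
  [ "$got" = "$want" ] || { echo "HEAD mismatch in $wt" >&2; return 1; }
  [ -z "$(git -C "$wt" status --short)" ] \
    || { echo "dirty worktree $wt" >&2; return 1; }
  echo "$wt"
}

# Exercise the helper against scratch repositories.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

g init -q "$tmp/seed"
g -C "$tmp/seed" symbolic-ref HEAD refs/heads/main
( cd "$tmp/seed" && echo v1 > foo.txt && g add foo.txt && g commit -q -m v1 )
g clone -q --bare "$tmp/seed" "$tmp/origin.git"
g clone -q "$tmp/origin.git" "$tmp/primary"   # stands in for the operator's checkout
g clone -q "$tmp/origin.git" "$tmp/writer"
( cd "$tmp/writer" && echo v2 > foo.txt && g commit -q -a -m v2 \
    && g push -q origin main )

wt=$(fresh_worktree "$tmp/primary" "$tmp/agent-wt")
primary_content=$(cat "$tmp/primary/foo.txt")          # primary stays untouched
wt_content=$(cat "$wt/foo.txt")                        # worktree sees origin/main
wt_branch=$(git -C "$wt" rev-parse --abbrev-ref HEAD)  # "HEAD" when detached

echo "primary: $primary_content, worktree: $wt_content, branch: $wt_branch"
```

The design choice worth noting is that the guards run after creation and compare against `origin/main` rather than `FETCH_HEAD`, which matches the preference the composed rule states.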