diff --git a/docs/hygiene-history/ticks/2026/05/03/0043Z.md b/docs/hygiene-history/ticks/2026/05/03/0043Z.md new file mode 100644 index 000000000..7681c20d6 --- /dev/null +++ b/docs/hygiene-history/ticks/2026/05/03/0043Z.md @@ -0,0 +1 @@ +| 2026-05-03T00:43:00Z | opus-4-7 / autonomous-loop continuation | a2e2cc3a | **#1255 merged on main with one post-merge thread (row-7 grep -ilrE substitution was a SECOND semantic-equivalence drift on the same substitution attempt); fix opened as PR #1257; verify-then-claim drift catalogue extended 9 → 15+ instances with recurring-sub-classes taxonomy.** Cycle worked: my "fix" replacing `ls\|grep` with `find -iname` (caught as shell-glob-only) was followed by another substitution `grep -ilrE` (caught as content-search not filename-search). TWO recursive failures on the same substitution attempt — each "fix" introduced a new equivalence-class drift. Reverted to canonical `ls \| grep -iE` form pulled out of table cell into fenced code block below the table. Drift catalogue extended with 6 new instances (all post-naming): self-recursive (catalogue containing its own drift), semantic-equivalence (twice), convention (ADR supersession marker), path-form (fully-qualified vs bare). Added recurring-sub-classes section: existence-drift, count-drift, semantic-equivalence-drift, empirical-output-drift, convention-drift, path-form-drift, self-recursive-drift. Updated headline: 6 of 15 instances landed AFTER the discipline was named — strongest possible empirical urgency for `tools/substrate-claim-checker/` mechanization. Manual discipline insufficient against trained-prior pull-toward-claim-without-empirical-verification. Cron a2e2cc3a still armed. 
| #1255 (skill-design fixes + verify-then-claim memo) merged 67950a62; #1256 (#1254 follow-up: ADR convention + path consistency + MD038) wait-ci, auto-merge armed; #1257 (row-7 prose fix + drift catalogue extension to 15 instances) opened, auto-merge armed | This tick teaches the operational pattern of double-recursive-substitution-drift: the SAME substitution attempt failed twice in succession because each "fix" was authored without empirical-verification of the new substitution's equivalence to the original. The class is dangerous because Otto reads the new attempt as fixing the previous attempt, which it does for the surface-syntactic concern, but introduces a NEW semantic drift. Eval-set for `tools/substrate-claim-checker/`: any substitution claim should fire a verification check on input-output equivalence. The recurring-sub-classes taxonomy gives the tool 7 distinct check-types to implement. | diff --git a/memory/feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md b/memory/feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md index 264278f95..2f4d4fdda 100644 --- a/memory/feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md +++ b/memory/feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md @@ -137,9 +137,15 @@ The decision-archaeology skill body (B-0169 future SKILL.md) has 11 procedure la | 4 | `git log -S "" -- memory/ CLAUDE.md` | `bun tools/decision-archaeology/string-archaeology.ts ""` | | 5 | `git log -L :func:file` | `bun tools/decision-archaeology/function-archaeology.ts ` | | 6 | `grep -rlnE "" docs/hygiene-history/ticks/` | `bun tools/decision-archaeology/shard-search.ts ` | -| 7 | `grep -ilrE "" docs/DECISIONS/` (single-command, regex-capable equivalent of `ls .. 
| grep -iE`; preserves alternation semantics; avoids markdown-table pipe-escape awkwardness) | `bun tools/decision-archaeology/adr-search.ts ` | +| 7 | ADR-filename search by regex pattern (canonical command shown in code block below the table — pipe in table cells is awkward; the right shape is `ls docs/DECISIONS/` piped through `grep -iE`, which searches **filenames**; do NOT substitute `grep -ilrE PATTERN docs/DECISIONS/` since `-r` searches **file contents** instead) | `bun tools/decision-archaeology/adr-search.ts ` | | 8-11 | Various searches | TS-wrapped where they involve multi-flag patterns | +Layer-7 canonical command (filename search; intentionally pulled out of the table cell because pipe-in-cell breaks markdown rendering): + +```bash +ls docs/DECISIONS/ | grep -iE "" +``` + Each TS file is small (often <100 lines), single-purpose, type-checked, and re-runnable. Skill body becomes carved-sentence pointers ("invoke `bun tools/decision-archaeology/blame.ts`") rather than embedded bash. 
### What's already correct under this rule diff --git a/memory/feedback_verify_then_claim_discipline_dominant_failure_mode_substrate_authoring_otto_2026_05_03.md b/memory/feedback_verify_then_claim_discipline_dominant_failure_mode_substrate_authoring_otto_2026_05_03.md index eaa5cddb7..609366ae2 100644 --- a/memory/feedback_verify_then_claim_discipline_dominant_failure_mode_substrate_authoring_otto_2026_05_03.md +++ b/memory/feedback_verify_then_claim_discipline_dominant_failure_mode_substrate_authoring_otto_2026_05_03.md @@ -6,7 +6,7 @@ type: feedback # Verify-then-claim discipline — dominant failure mode for substrate authoring -## Empirical evidence (this session, 7 PRs, 9 distinct drift instances) +## Empirical evidence (this session, 9+ PRs, 15+ distinct drift instances) | Drift instance | PR | Wrong claim | Actual reality | |---|---|---|---| @@ -19,8 +19,24 @@ type: feedback | 7 | #1250 (post-merge) | Layer-10 docs/research grep returns no specific double-hop artifact | adjacent-substrate artifacts ARE there (5+) | | 8 | #1252 (post-merge) | future-domain memo references `docs/courier-ferry-protocol.md` | doesn't exist | | 9 | #1253 (post-merge) | skill-design memo references `tools/backlog/expand-from-closure.ts` as the mechanizing tool | doesn't exist; only proposed | - -**9 drift instances across 7 PRs in one session.** Each one a Copilot catch; each one a real claim Otto wrote without verifying. The pattern is consistent enough that "verify-then-claim" needs to be a named discipline. 
+| 10 | #1255 (in-flight) | drift catalogue itself contained `\|` table-cell escapes (rows 5 and 7 of THIS table, in earlier draft) | the catalogue was itself drifting; rewrote rows in prose form | +| 11 | #1255 (in-flight) | mechanization path claimed pre-commit hook validates commit-message claims | the git pre-commit hook fires BEFORE the commit message exists; that surface needs a commit-msg hook | +| 12 | #1255 (in-flight, recursive #1) | replaced `ls\|grep` with `find -iname`, claimed equivalent | `find -iname` matches shell globs only, with no regex alternation; semantic-equivalence drift | +| 13 | #1255 (in-flight, recursive #2) | replaced the `find -iname` attempt with `grep -ilrE PATTERN docs/DECISIONS/`, claimed equivalent | `grep -r` searches file CONTENTS, not filenames; semantic-equivalence drift, attempt #2 | +| 14 | #1254 (post-merge) | recommended `superseded:` / `current_status:` ADR frontmatter marker | canonical convention is `> **Superseded by** [link]` blockquote (per `docs/DECISIONS/2026-04-21-router-coherence-claims-vs-complexity.md`) | +| 15 | #1256 (post-merge) | path-form inconsistency in adjacent ADR citations (mixing fully-qualified with bare filename) | a recurring sub-class; pick one form and apply uniformly per document | + +**15 drift instances across 9 PRs (and counting; instances #10-#15 landed AFTER the discipline was named: the strongest possible empirical urgency for mechanization, since manual discipline already provably hit its wall on the very memo defining the discipline).** Each one a Copilot catch; each one a real claim Otto wrote without verifying. Instances #12 and #13 are particularly diagnostic: the same substitution attempt failed twice in succession (first `find -iname` was claimed equivalent to `ls | grep -iE`, then `grep -ilrE` was claimed equivalent to it), and each "fix" introduced a new equivalence-class drift. 
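The equivalence claims behind instances #12 and #13 could have been checked empirically before being written down. The sketch below is one hedged way to do that: it builds a scratch directory standing in for `docs/DECISIONS/`, runs the original command and both substitutes, and compares their output. The fixture filenames, the pattern, and the variable and function names are all hypothetical and chosen only for illustration; this is not the planned `tools/substrate-claim-checker/` implementation.

```bash
#!/usr/bin/env bash
set -u

# Scratch fixture standing in for docs/DECISIONS/ (hypothetical filenames and contents).
fixture="$(mktemp -d)"
trap 'rm -rf "$fixture"' EXIT

printf 'Decision about caching.\n'         > "$fixture/2026-01-01-router-coherence.md"
printf 'Mentions the router in passing.\n' > "$fixture/2026-01-02-cache-policy.md"
printf 'Unrelated text.\n'                 > "$fixture/2026-01-03-Ferry-Protocol.md"

# Alternation pattern: the feature the substitutes have to preserve.
pattern='router|ferry'

# Original form: case-insensitive regex over FILENAMES.
by_name="$(ls "$fixture" | grep -iE "$pattern" | sort)"

# Substitute 1: -iname takes a glob, so '|' is a literal character here, not alternation.
by_find="$(find "$fixture" -iname "*${pattern}*" | sed 's|.*/||' | sort)"

# Substitute 2: -r makes grep read file CONTENTS; -l lists the files that matched.
by_content="$(grep -ilrE "$pattern" "$fixture" | sed 's|.*/||' | sort)"

report() {
  # $1 = label, $2 = substitute's output; compared against the original's output.
  if [ "$2" = "$by_name" ]; then
    echo "$1: same output as ls | grep -iE on this fixture"
  else
    echo "$1: DIFFERS from ls | grep -iE"
  fi
}

report "find -iname" "$by_find"
report "grep -ilrE"  "$by_content"
```

Matching output on one fixture would not prove two commands equivalent, but a single mismatch is enough to refute the claim, which is the cheaper check to run before asserting equivalence in prose.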
+ +Recurring sub-classes within the broader claim-vs-reality drift: + +- **Existence drift** (file/dir/tool claimed to exist; doesn't): instances #1, #6, #8, #9 +- **Count drift** (table claims N rows; actually M): instances #2, #3 +- **Semantic-equivalence drift** (substituted command claimed equivalent; actually changes semantics): instances #4, #12, #13 +- **Empirical-output drift** (command claimed to return X; actually returns Y): instances #5, #7 +- **Convention drift** (recommended pattern doesn't match canonical convention): instance #14 +- **Path-form drift** (fully-qualified vs bare paths inconsistent in adjacent citations): instance #15 +- **Self-recursive drift** (the memo about drift contains its own drift): instances #10, #11 ## The carved rule