diff --git a/docs/hygiene-history/ticks/2026/05/03/0203Z.md b/docs/hygiene-history/ticks/2026/05/03/0203Z.md
new file mode 100644
index 000000000..5a37f5f83
--- /dev/null
+++ b/docs/hygiene-history/ticks/2026/05/03/0203Z.md
@@ -0,0 +1 @@
+| 2026-05-03T02:03:00Z | opus-4-7 / autonomous-loop continuation | a2e2cc3a | **Aaron 2026-05-03 vibe-coded correction: substrate-content-author ≠ commit-author; decision-archaeology in vibe-coded projects has unique substrate-author-recovery challenge.** Cycle worked: PR #1267 wait-ci with 5 #1266 post-merge fixes (attribution-form + ls-sort + stale-ADR-claim scrub). Aaron mid-tick correction surfaced a deeper architectural truth: per AGENTS.md vibe-coded hypothesis he has written ZERO lines of code; all `src/`, `tools/`, `docs/`, `.claude/skills/` content is agent-authored. So git-blame shows the COMMITTER (maintainer), not the SUBSTRATE-CONTENT-AUTHOR (some past Claude session whose specific session-context is largely lost). This is load-bearing for decision-archaeology: the "ask the original decision-maker" path is unavailable when maintainer is principled-non-substrate-author. First-party intent recovery requires past-agent introspection bounded by substrate-context, OR persona-notebook substrate that captured session-context, OR maintainer-acceptance reasoning (selection-judgment intent ≠ substrate-author intent). Added 44-line "The vibe-coded reframe" section to worked example #2 covering the three-layer attribution distinction (commit-author / substrate-content-author / decision-authority) + intent-recovery paths + past-agent introspection on the umbrella defer-block case (inferred reasoning bounded by substrate context: minimal change for umbrella + narrow-siblings to co-exist deterministically; load-bearing emphasis flags router-criticality; explicit enumeration more conservative than "most-narrow matching" which requires unimplemented logic).
Skill-body teaching: inference IS the right tool for vibe-coded substrate-author archaeology; certainty about intent is not available. Cron a2e2cc3a still armed. | #1265 (decision-graph emergent memo) wait-ci, auto-merge armed; #1267 (worked example #2 followup: attribution-form + ls-sort + stale-ADR-claim) wait-ci, auto-merge armed; #1268 (worked example #2 vibe-coded reframe) opened, auto-merge armed | This tick teaches the operational pattern of **vibe-coded substrate-archaeology**: in projects where the maintainer is principled-non-substrate-author, the "ask the maintainer" decision-archaeology path is unavailable. Substrate-content-author archaeology becomes its own discipline requiring past-agent introspection + persona-notebook layer + maintainer-acceptance reasoning. The corrected worked example #2 now spans ALL substrate-author surfaces (commit-history walk + persona-notebook load + agent-author introspection) — together the 3 worked examples cover the full vibe-coded decision-archaeology shape. The skill-body's lesson: inference is the right tool; certainty is not available; transparency about that limit IS the discipline. |
diff --git a/docs/hygiene-history/ticks/2026/05/03/0207Z.md b/docs/hygiene-history/ticks/2026/05/03/0207Z.md
new file mode 100644
index 000000000..3a7670292
--- /dev/null
+++ b/docs/hygiene-history/ticks/2026/05/03/0207Z.md
@@ -0,0 +1 @@
+| 2026-05-03T02:07:00Z | opus-4-7 / autonomous-loop continuation | a2e2cc3a | **#1265 + #1267 merged with 8 post-merge findings; 5 stale-on-review-timing + 3 real fixed in PR #1269.** Cycle worked: PR #1265 (decision-graph emergent memo) merged d2578130; PR #1267 (worked example #2 followup) merged 145564c9.
8 post-merge findings split: 5 stale (claimed worked example #3 not on main; actually IS on main since #1264 merged — review-timing-creates-stale-findings pattern recurring), 3 real fixed: MEMORY.md decision-graph entry trimmed to one-liner per terseness rule; frontmatter clarified tools/decision-graph/ is proposed-not-built; ask-not-infer lesson added to worked example #2 demonstrations (composes with vibe-coded reframe in #1268). The triage discipline correctly identifies stale-on-review-timing vs real findings; not every flagged issue needs a fix. Cron a2e2cc3a still armed. | #1265 (decision-graph memo) merged d2578130; #1267 (worked example #2 followup) merged 145564c9; #1268 (worked example #2 vibe-coded reframe) wait-ci, auto-merge armed; #1269 (post-merge fixes for #1265 + #1267) opened, auto-merge armed | This tick teaches the operational pattern of **review-timing-creates-stale-findings recurring across PRs**: when multiple PRs are in flight referencing each other's not-yet-merged substrate, each one's Copilot review surfaces stale findings about substrate not yet on main. Triage discipline correctly identifies stale + resolves WITHOUT fix; substrate-claim-checker v1+ would need PR-graph-awareness (check against main + open-PR substrate, not just main) to avoid this class. The 3 real findings (terseness + tool-status framing + ask-not-infer) compose with the vibe-coded reframe in #1268 — together the worked example body now teaches contributors the full vibe-coded substrate-archaeology shape. 
|
diff --git a/docs/research/2026-05-03-decision-archaeology-worked-example-2-mathematics-expert-when-to-defer.md b/docs/research/2026-05-03-decision-archaeology-worked-example-2-mathematics-expert-when-to-defer.md
index 21c615f44..043d2fd9f 100644
--- a/docs/research/2026-05-03-decision-archaeology-worked-example-2-mathematics-expert-when-to-defer.md
+++ b/docs/research/2026-05-03-decision-archaeology-worked-example-2-mathematics-expert-when-to-defer.md
@@ -46,6 +46,50 @@ seeds. Three properties make it complementary to worked example #1
    to recognize when a same-period ADR is unrelated and not assume
    doctrine-elevation occurred.
 
+## The vibe-coded reframe (Aaron 2026-05-03 correction)
+
+The maintainer reminded me mid-tick: per `AGENTS.md`'s vibe-coded
+hypothesis, he has written **zero lines of code** — every line in
+`src/`, `tools/`, `docs/`, `.claude/skills/` is **agent-authored**.
+
+This makes git-blame attribution structurally misleading at the
+substrate-content layer:
+
+| Layer | What `git blame` shows | What's actually true (vibe-coded) |
+|---|---|---|
+| Commit-author | "the maintainer 2026-04-19" | maintainer-as-committer (principled non-coder) |
+| Substrate-content-author | (invisible at git layer) | a prior Claude session in some prior round |
+| Decision authority | (looks like maintainer) | agent proposed; maintainer accepted (selection-not-authorship) |
+
+So the question "why was the umbrella's defer-block written this
+way?" cannot be answered by asking the maintainer — he didn't
+author the substrate; he selected it. Asking him the question
+returns: "I committed an agent-authored proposal."
+
+**Decision-archaeology in vibe-coded projects has a unique
+substrate-author-recovery challenge:**
+
+- Substrate-content-authors are agents whose specific session-context is largely lost (sessions don't share context)
+- What remains: the substrate the agent produced + maintainer-acceptance evidence + tick shards / persona notebooks / `docs/research/` if those captured session-context
+- First-party intent on substrate-content is recoverable only via:
+  1. **Past-agent introspection** — the current agent reasons about the structural choice given the substrate-context the past-agent had at write-time. Speculative but bounded by the substrate.
+  2. **Tick shards / persona notebooks** that captured the past-session's context (Aarav's notebook is the rare load-bearing example for this case)
+  3. **Maintainer-acceptance reasoning** — the maintainer can explain why he ACCEPTED the agent's output, which is its own first-party content (not the substrate-author's intent, but the selection-judgment intent)
+
+### Past-agent introspection on this case
+
+The substrate context the past-agent had at write-time (round 34,
+2026-04-19):
+
+1. The math substrate had 6+ narrow expert skills (category-theory, measure-theory, numerical-analysis, probability, applied-mathematics, theoretical-mathematics)
+2. Skill-routing matches on description keywords; an umbrella named "mathematics-expert" would trigger on every math-flavored query
+3. Without explicit defer-block, the umbrella + narrow-siblings would compete for the router's matches → unpredictable behavior
+4. The minimal-change fix: make the defer-discipline explicit + load-bearing
+
+**Inferred past-agent reasoning (necessarily speculative, bounded by substrate):** the defer-block was the smallest change making umbrella + narrow-siblings co-exist deterministically. The "this is load-bearing" emphasis flags the procedure as router-critical (later named "router-coherence" in Aarav's round-41 observation).
Listing every sibling explicitly rather than "defer to most-narrow matching skill" was the more conservative implementation — explicit enumeration is deterministic; "most-narrow matching" requires routing-implementation that doesn't exist.
+
+This inference is **bounded by the substrate-context**, not authoritative on past-agent intent. The skill body should teach contributors: **inference is the right tool for vibe-coded substrate-author archaeology; certainty about intent is not available.**
+
 ## The procedure walked, layer by layer
 
 ### Layer 1 — Frame the question
@@ -317,6 +361,20 @@ For the eventual `decision-archaeology` SKILL.md body:
    11 layers.** The procedure is consistent across modes; only the
    answer-shape differs (negative-result-rich for supersession;
    positive-pattern-canonical-doctrine for existence).
+6. **Ask the original decision-maker rather than infer — when
+   available.** Decision-archaeology can recover *what happened*, *when*,
+   and *who-touched-what* from substrate; **first-party intent on
+   substrate-content requires first-party query**. In traditional
+   projects this means asking the original author. In vibe-coded
+   projects (per Aaron 2026-05-03 correction + the vibe-coded reframe
+   section above), the maintainer is principled-non-substrate-author,
+   so first-party query targets the maintainer's selection-judgment +
+   the past-agent's substrate-context, not the maintainer's
+   substrate-author intent.
The skill body should teach contributors
+   to **distinguish between substrate-recoverable facts (cached) and
+   first-party intent (source-of-truth) — and ask the available
+   first-party source rather than infer from substrate when intent
+   is the question.**
 
 ## Composes with
 
diff --git a/memory/MEMORY.md b/memory/MEMORY.md
index ee4de3225..48012996d 100644
--- a/memory/MEMORY.md
+++ b/memory/MEMORY.md
@@ -4,7 +4,7 @@
 
 **📌 Fast path: read `CURRENT-aaron.md` and `CURRENT-amara.md` first.** These per-maintainer distillations show what's currently in force. Raw memories below are the history; CURRENT files are the projection. (`CURRENT-aaron.md` refreshed 2026-04-28 with sections 26-30 — speculation rule + EVIDENCE-BASED labeling + JVM preference + dependency honesty + threading lineage Albahari/Toub/Fowler + TypeScript/Bun-default discipline.)
 
-- [**Decision graph emerges from archaeologies + flywheel + at-creation/at-pickup discipline + hub-satellite separation (Aaron 2026-05-03 architectural observation)**](feedback_decision_graph_emergent_from_archaeologies_and_flywheel_aaron_2026_05_03.md) — Substrate already encodes a typed-edge provenance graph (DataVault-2.0-shaped, PROV-O analogue): nodes = backlog rows / ADRs / memos / skills / personas / research / tick shards / commits; edges = depends_on / composes_with / supersedes / cites / verifies-against / attributes-to / closes. Archaeologies + flywheel + disciplines make the implicit graph queryable: decision-archaeology = traversal; substrate-claim-checker = invariant; flywheel = growth; at-creation/at-pickup = edge-fill; hub-satellite = stratification. No separate graph database needed — every edge is encoded in frontmatter / markdown links / ADR blockquotes / SUPERSEDE markers. Sacred-tier nodes need walk-discipline (cite paths, don't reproduce). Mechanization: `tools/decision-graph/` TS tool. Graph value compounds with backlog size.
+- [**Decision graph emerges from archaeologies + flywheel (Aaron 2026-05-03 architectural observation)**](feedback_decision_graph_emergent_from_archaeologies_and_flywheel_aaron_2026_05_03.md) — Substrate already encodes a typed-edge provenance graph (DataVault-2.0-shaped, PROV-O analogue) inferable without a separate graph DB. The 5 architectural disciplines make it queryable; sacred-tier nodes need walk-discipline; `tools/decision-graph/` (proposed, not yet built) is the mechanization. Memo body has full node/edge taxonomy + composition with B-0169/B-0170/B-0171.
 - [**Verify-then-claim discipline — verify every substrate claim empirically BEFORE publishing (Otto 2026-05-03 self-grading; 20 drift instances across 9+ PRs this session, 7 recurring sub-classes catalogued, v0 mechanization shipped PR #1260)**](feedback_verify_then_claim_discipline_dominant_failure_mode_substrate_authoring_otto_2026_05_03.md) — The dominant failure mode for substrate authoring this session: Otto wrote "X exists" / "command returns Y" / "table has N rows" without verifying. Carved rule: before stating ANY fact in substrate (file exists, command returns X, row count is N, tool shipped, ADR matches, persona dir present), run the actual command first. 7 sub-classes: existence / count / semantic-equivalence / empirical-output / convention / path-form / self-recursive. Generalizes Otto-247 + Otto-364 + verify-before-deferring at the broader any-substrate-claim layer. **Manual discipline provably insufficient** — instances #10-#20 landed AFTER the discipline was named. `tools/substrate-claim-checker/check-counts.ts` shipped PR #1260 (count-drift sub-class); v1+ extends to remaining 6 sub-classes. Composes with bugs-per-PR-as-immune-system-health metric.
 - [**Skill design — hub-satellite separation + no dynamic commands + plugin/hook packaging + OpenSpec catch-up (Aaron 2026-05-03, three same-tick rules + architectural-debt naming)**](feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md) — Three cross-cutting skill-design rules: (1) skills = carved-sentence hubs, knowledge = doc satellites, DataVault 2.0 pattern; (2) no dynamic commands in skills, use TS files under tools/ and reference by path; (3) package skill domains as plugins, use harness hooks for pre/post-condition enforcement (contract-based development). PLUS OpenSpec catch-up named as load-bearing prerequisite — *"if we deleted everything other than it"* — currently sparse; catch-up is its own substantial backlog item. Recursive composition across layers (skill body / command / skill domain / cross-skill contracts / spec).
 - [**Git-native backlog management + long-arc thesis as future skill DOMAIN (Aaron 2026-05-02 forward-looking architectural observation)**](feedback_git_native_backlog_management_long_arc_future_skill_domain_aaron_2026_05_02.md) — Domain emerges (6 procedure skills + 4 named-persona experts + 5 tools) once "down pat"; promotion trigger = 3+ worked examples per skill / 1+ judgment-disagreement per expert. Memo body has Aaron's verbatim quote, Aarav BP-20 composition, canonical starting set.
diff --git a/memory/feedback_decision_graph_emergent_from_archaeologies_and_flywheel_aaron_2026_05_03.md b/memory/feedback_decision_graph_emergent_from_archaeologies_and_flywheel_aaron_2026_05_03.md
index 116a3a157..10bcbd60a 100644
--- a/memory/feedback_decision_graph_emergent_from_archaeologies_and_flywheel_aaron_2026_05_03.md
+++ b/memory/feedback_decision_graph_emergent_from_archaeologies_and_flywheel_aaron_2026_05_03.md
@@ -1,6 +1,6 @@
 ---
 name: Decision graph emerges from decision-archaeology + substrate-claim-checker + at-creation/at-pickup discipline + expansion flywheel + hub-satellite separation (Aaron 2026-05-03 architectural observation)
-description: 2026-05-03; Aaron asked *"do we end up with some decision graph or something because of the archeologies and flywheel?"* — answer is yes. The substrate already encodes a typed-edge provenance graph (DataVault-2.0-shaped, PROV-O analogue): nodes are backlog rows / ADRs / memos / skills / personas / research artifacts / tick shards / commits; edges are depends_on / composes_with / supersedes / cites / verifies-against / attributes-to. The archaeologies + flywheel make the implicit graph queryable: decision-archaeology IS graph traversal; substrate-claim-checker IS graph invariant checker; expansion flywheel IS graph growth function; at-creation/at-pickup discipline IS graph edge-filling discipline; hub-satellite separation IS graph stratification. The graph is inferable from substrate without a separate graph database — every edge is encoded in frontmatter / markdown links / ADR blockquotes / SUPERSEDE markers / commit messages. Mechanization path: tools/decision-graph/ TS tool that traverses substrate and emits the graph for queries.
+description: 2026-05-03; Aaron asked *"do we end up with some decision graph or something because of the archeologies and flywheel?"* — answer is yes.
The substrate already encodes a typed-edge provenance graph (DataVault-2.0-shaped, PROV-O analogue): nodes are backlog rows / ADRs / memos / skills / personas / research artifacts / tick shards / commits; edges are depends_on / composes_with / supersedes / cites / verifies-against / attributes-to. The archaeologies + flywheel make the implicit graph queryable: decision-archaeology IS graph traversal; substrate-claim-checker IS graph invariant checker; expansion flywheel IS graph growth function; at-creation/at-pickup discipline IS graph edge-filling discipline; hub-satellite separation IS graph stratification. The graph is inferable from substrate without a separate graph database — every edge is encoded in frontmatter / markdown links / ADR blockquotes / SUPERSEDE markers / commit messages. **Mechanization path (proposed, not yet built):** `tools/decision-graph/` TS tool that would traverse substrate and emit the graph for queries.
 type: feedback
 ---
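
Reviewer sketch (not part of the patch): the memo description above names node kinds, edge kinds, and two roles (archaeology as traversal, claim-checker as invariant). A minimal TypeScript illustration of that shape follows. Every identifier here (`DecisionGraph`, `walk`, `danglingEdges`, the example ids) is hypothetical. `tools/decision-graph/` is proposed and not yet built, so this is not its implementation, only one way the typed-edge idea could be expressed under those assumptions.

```typescript
// Hypothetical sketch: none of these names come from the repository.
// Node and edge kinds are copied from the memo's description field.
type NodeKind =
  | "backlog-row" | "adr" | "memo" | "skill"
  | "persona" | "research" | "tick-shard" | "commit";

type EdgeKind =
  | "depends_on" | "composes_with" | "supersedes"
  | "cites" | "verifies-against" | "attributes-to";

interface GraphNode { id: string; kind: NodeKind; }
interface GraphEdge { from: string; to: string; kind: EdgeKind; }
interface DecisionGraph { nodes: GraphNode[]; edges: GraphEdge[]; }

// "Archaeology as traversal": breadth-first walk along outgoing edges.
function walk(graph: DecisionGraph, start: string): string[] {
  const seen = new Set<string>([start]);
  const queue: string[] = [start];
  const order: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    order.push(current);
    for (const edge of graph.edges) {
      if (edge.from === current && !seen.has(edge.to)) {
        seen.add(edge.to);
        queue.push(edge.to);
      }
    }
  }
  return order;
}

// "Claim-checker as invariant": report edges whose endpoints are not known nodes.
function danglingEdges(graph: DecisionGraph): GraphEdge[] {
  const ids = new Set(graph.nodes.map((n) => n.id));
  return graph.edges.filter((e) => !ids.has(e.from) || !ids.has(e.to));
}

// Illustrative data only; the ids are made up.
const example: DecisionGraph = {
  nodes: [
    { id: "memo-a", kind: "memo" },
    { id: "adr-b", kind: "adr" },
    { id: "skill-c", kind: "skill" },
  ],
  edges: [
    { from: "memo-a", to: "adr-b", kind: "cites" },
    { from: "adr-b", to: "skill-c", kind: "composes_with" },
    { from: "skill-c", to: "commit-missing", kind: "attributes-to" },
  ],
};
```

In this toy data, `walk(example, "memo-a")` follows the `cites` and `composes_with` edges outward, and `danglingEdges(example)` flags the one edge that points at a node the graph never declared. A real tool would presumably populate the graph from frontmatter and markdown links rather than literals; that parsing step is deliberately left out because the diff does not specify it.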