From cfaa92c80071cb474555bc9f5b8318684177bc8c Mon Sep 17 00:00:00 2001 From: Aaron Stainback Date: Sat, 25 Apr 2026 03:08:15 -0400 Subject: [PATCH] =?UTF-8?q?hygiene(#268+):=20pr-preservation=20=5Fpatterns?= =?UTF-8?q?.md=20=E2=80=94=20synthesis=20index=20over=20the=20corpus?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Otto-268 follow-on: synthesis index abstracting over the 27+ per-PR drain logs in `docs/pr-preservation/`. Makes the training-signal corpus queryable rather than scattered. Sections: - **Outcome-class taxonomy** — 4 stable + 1 mixed (FIX / STALE-RESOLVED-BY-REALITY / OTTO-279 SURFACE-CLASS / DEFERRED-TO-MAINTAINER / VERBATIM-PRESERVATION DECLINED). - **Recurring findings classes** — 10 patterns observed across the corpus with fix templates and observation citations: inline-code-span line-wrap; memory-file dangling citation (forward-mirror compounding fix); subset-vs-superset framing errors; cross-reference column-name accuracy; implementation-vs- math-definition tension; phase-numbering / count-vs-surface-list cardinality; external-source verifiability gaps; internal- consistency drift on draft ADRs; discriminator falsification; forward-author-to-future-state-of-main drift; self-induced cascade. - **Cross-reviewer-convergence quality signal** — Codex + Copilot overlap as a high-precision finding region. - **Per-PR reviewer-coverage signal** — Codex strong on shell / math / security; Copilot strong on design-spec / formatting / refs. - **Post-merge reviewer-cascade** — wave-by-wave findings shift class on iterative drain. - **Surface-class taxonomy** — history-class / current-state / tension-surfaces. - **Compounding-substrate fixes** — Otto-114, Otto-279, Otto-227, Otto-224, Otto-236. - **Open patterns + future-work candidates** — pre-commit lints for inline-code-span and claim-vs-list cardinality; promote Otto-279 to BP-NN-stable-rule; tension-surface third class. Closes the Otto-268 backlog item by promoting the per-log corpus into a queryable index. New drain-logs landing add citations or new pattern sections per the "How to update this file" section. --- docs/pr-preservation/_patterns.md | 372 ++++++++++++++++++++++++++++++ 1 file changed, 372 insertions(+) create mode 100644 docs/pr-preservation/_patterns.md diff --git a/docs/pr-preservation/_patterns.md b/docs/pr-preservation/_patterns.md new file mode 100644 index 00000000..1574ff88 --- /dev/null +++ b/docs/pr-preservation/_patterns.md @@ -0,0 +1,372 @@ +# PR-preservation drain patterns — recurring findings across the corpus + +**Status:** synthesis index, growing as new drain-logs land. +**Audience:** future drain-runners, reviewers calibrating severity, +authors avoiding common findings at write-time. +**Source corpus:** all `docs/pr-preservation/*-drain-log.md` files +(currently 27+ logs across 2026-04-22 through 2026-04-25). + +This file indexes the recurring patterns across the per-PR drain +logs so the training signal Otto-250 identified is queryable rather +than scattered. Per-PR logs preserve verbatim reviewer text + outcome +class + resolution path; this file abstracts over them to surface +the patterns that compound across PRs. + +--- + +## Outcome-class taxonomy (4 stable + 1 mixed) + +The drain corpus has converged on four stable outcome classes plus +combinations: + +1. **FIX** — apply the suggested change; reviewer's finding is correct + and the fix improves the artifact. +2. **STALE-RESOLVED-BY-REALITY** — the doc claim was inaccurate when + the review was filed but is accurate now (forward-mirror landed, + adjacent PR fixed it, current main has the cited file). Reply + pattern: verify-with-evidence + resolve, not re-fix. +3. **OTTO-279 SURFACE-CLASS** — the finding applies a rule that has a + carve-out for the relevant surface (research / decisions / + round-history / aurora-archive / pr-preservation are history-class + surfaces; first-name attribution allowed there). Reply is a one- + paragraph stamp citing Otto-279 + surface class + provenance-vs- + policy distinction. +4. **DEFERRED-TO-MAINTAINER** — the finding is correct against the + stated rule but applying it would lose meaning (e.g., bootstrap/ + docs documenting the maintainer's personal framework where by-name + attribution is faithful representation, not arbitrary labeling). + Reply acknowledges + surfaces tension + escalates without blocking + the merge gate. +5. **VERBATIM-PRESERVATION DECLINED (Otto-227)** — the finding sits + inside verbatim-preserved ferry content (Amara absorb / external + import). Editing would violate Otto-227 signal-in-signal-out + discipline. Brittleness valid as future-implementation work, not + absorb-edit work. + +Mixed outcomes (FIX + STALE-RESOLVED-BY-REALITY for combined +findings, etc.) are also common. + +--- + +## Recurring findings classes + +### Inline-code-span line-wrap rendering bug + +Observed on: #191, #195, #219 (4 threads alone), #423. + +Pattern: file paths inside backtick spans split across newlines render +as two adjacent code spans rather than one path, breaking copy/paste +and pointer-integrity audits. + +Fix template: reflow to single-line code spans, or convert to a +markdown link with `[memory/path/file.md](../../memory/path/file.md)` +shape. + +**Pre-commit-lint candidate**: regex check for backtick spans crossing +newlines. + +### Memory-file dangling citation (forward-mirror compounding fix) + +Observed on: #135 (8 threads), #195 (6 threads), #219 (3 threads), +#235 (3 threads), #377 (5 threads), #206 (multiple). At least 25 +threads across the corpus. + +Pattern: research/aurora docs cite memory files that don't exist in +`memory/` at review time but landed via Otto-114 forward-mirror by +drain time. + +Fix template: STALE-RESOLVED-BY-REALITY reply with `ls memory/` +verification. The substrate fix (Otto-114) converted what would have +been 25+ per-doc fixes into a one-line verify-and-resolve. **Highest- +leverage compounding fix in the corpus.** + +### Subset-vs-superset framing errors + +Observed on: #206 (K-relations: rings ARE semirings; GKT homomorphism: +positive relational algebra IS a subset), #135 (DORA canonical +definitions: deployments not commits; ratio not raw count). + +Pattern: claim asserts a property over a broad class but the property +holds only for a narrower subset (or vice versa). + +Fix template: identify the precision-restricting condition; reword to +explicitly scope the claim. "Pure-semiring-without-additive-inverses" +distinguishes from rings cleanly; "positive relational algebra" +distinguishes from full relational algebra cleanly. + +**Codex catches this class reliably across math/research docs.** + +### Cross-reference column-name accuracy + +Observed on: #206 (TECH-RADAR "11" in Round column not Ring column), +#85 (line count drift 5,957 vs 6,094). + +Pattern: doc cross-references a table cell by column name but the +source-table evolved + the cited column name no longer matches. + +Fix template: parse referenced table headers + verify cited column. +For volatile counts, use order-of-magnitude approximations + drift- +tolerance notes. + +**Future doc-lint candidate.** + +### Implementation-vs-math-definition tension + +Observed on: #206 (Tropical math is `ℝ ∪ {+∞}`; Zeta uses +`ℤ ∪ {+∞}` via `int64`), other algebra docs. + +Pattern: textbook formalization is broader than the implementation +pin. + +Fix template: name the implementation-version explicitly, note the +math-definition-extends-to-broader-set with explicit pointer. +Captures both the math content + the engineering choice without +losing either. + +### Phase-numbering / count-vs-surface-list cardinality + +Observed on: #219 ("fifth phase" + "Phase 6" + 5 listed phases), +#191 ("18 audits" + 8 actual audit sections). + +Pattern: a count claim doesn't match the cardinality of the surface +list it summarizes. + +Fix template: sum the actual list, update the count, or rename the +heading to scope explicitly. + +**Future doc-lint candidate**: claim-vs-list cardinality check. + +### External-source verifiability gaps + +Observed on: #219 (OpenAI help-center / DBSP paper / provenance- +semiring paper without resolvable identifiers), #231 (parity-matrix +claims without release-notes citations). + +Pattern: research doc cites prior art without URLs / arXiv IDs / +DOIs / specific release-notes references. + +Fix template: add resolvable citations one-line-each. Converts +"grounded somewhere" into "reviewers can verify." + +**Future skill or pre-commit lint candidate.** + +### Internal-consistency drift on draft ADRs + +Observed on: #85 (5 of 7 fixes were self-reference inconsistencies: +tier-scheme-vs-tree, id-vs-filename, contradiction, claimed-as- +existing, PR-description-vs-ADR). + +Pattern: draft ADRs evolve their own self-references during authoring; +reorganizations don't propagate uniformly to all cross-references. + +Fix template: pick canonical wording in one place, sweep references +for consistency before merge. Reviewers reliably catch the +accumulated drift. + +### Discriminator falsification + +Observed on: #231 (AGENTS.md-read test relying on values repeated in +same doc — false-positive readiness path). + +Pattern: a test or check uses content that's available in the test's +own surface, so passing the test doesn't prove what it's intended to +prove. + +Fix template: pick unique-to-target content. Same shape as +randomized-canary in security testing. + +### Forward-author-to-future-state-of-main drift + +Observed on: #377 (38% stale-resolved density), #135 (66% stale- +resolved-by-reality on memory citations). + +Pattern: research doc forward-authors against a future state of main +that adjacent PRs land during the review window, producing high +stale-resolved density. + +Fix template: STALE-RESOLVED-BY-REALITY reply pattern + verify against +current main. The pattern is high-leverage: research docs naturally +describe the post-review-window state. + +### Self-induced cascade + +Observed on: #435 (wave-1 fix introduced wave-2 finding — "separate +BACKLOG items" plural vs actual single-row state). + +Pattern: when fixing a claim, the new claim itself is also +inaccurate against current-state. + +Fix template: don't just remove the bad claim; replace it with a +verified-from-grep claim. Verify-the-fix-itself is the discipline. + +--- + +## Cross-reviewer convergence as a quality signal + +Observed on: #432 (`warn` unbound: Codex P1 + Copilot P0), #206 +(K-relations retraction: Codex P2 + Copilot P1), #435 (FACTORY-HYGIENE +row: Codex P2 + Copilot P1). + +Pattern: when two unrelated reviewers converge on the same finding +with high confidence, the prior on "real bug" goes way up. Severity +typically differs (e.g., P0 vs P1) because reviewers weight common- +case-paths differently. + +This is the strongest single quality signal in the drain corpus and +worth preserving as a reviewer-roster property — Codex + Copilot +fire on different surfaces (Codex strong on shell, math, security; +Copilot strong on design-spec, formatting, broken-link refs); their +overlap is a high-precision finding region. + +## Per-PR reviewer-coverage signal + +Observed on: #434 (Copilot-only on design-spec text; Codex silent), +#432 (both fired heavily on shell-portability), #206 (both fired +heavily on math content). + +Pattern: Codex tends to fire on shell scripts, math docs, security +threat models, research-grade content with citations to verify. +Copilot tends to fire on design-spec text, formatting/grammar, +broken-link references, schema-vs-doc consistency. Different +surfaces produce different reviewer coverage. + +Could inform per-PR reviewer-selection or per-finding severity- +weighting. **Future tooling/triage candidate.** + +## Post-merge reviewer-cascade + +Observed on: #135 (10 + 4 = 14 across 2 waves), #231 (2 + 1 + 3 + 2 = +9 across 4 waves), #432 (7 first-wave only, no cascade), #435 (2 + +1 = 3 across 2 waves). + +Pattern: every commit on a PR triggers a fresh Codex/Copilot review +wave. Wave-by-wave the findings shift class: structural → rendering +→ internal-consistency → version-currency. The cascade is a property +of the merge-trigger surface, not per-PR oddity. + +4 of 7 cascade-PRs in the 2026-04-25 session followed wave-1 + +wave-2 cascade pattern. + +## Post-merge cascade triggering version-currency on the doc + +Observed on: #231 (Wave 4 reclassifications — TodoWrite Sept 15 2025; +Hooks `rust-v0.117.0` March 26 2026). + +Pattern: Codex enforces version-currency on the doc's own claims via +release-notes citations. This is the inverse of CLAUDE.md's author- +side version-currency rule. Bidirectional drift correction: +author-side searches at write-time + reviewer-side enforces against +new releases at review-time. + +Score-summary propagation (Parity 10→11, Gap 4→3 on #231) is +load-bearing — without it the matrix's running counts drift from +the row data. + +--- + +## Surface-class taxonomy (where rules apply / don't apply) + +**History-class surfaces (Otto-279)** — first-name attribution +permitted; preserves provenance: + +- `docs/research/**` +- `docs/DECISIONS/**` +- `docs/ROUND-HISTORY.md` +- `docs/aurora/**` +- `docs/pr-preservation/**` +- `memory/**` (out-of-scope for code-style rules anyway) + +**Current-state-class surfaces** — role-ref discipline applies; no +first-name attribution: + +- Skill bodies (`.claude/skills/**/SKILL.md`) +- Code (`src/**`, `tools/**`, `tests/**`) +- README, public-facing prose +- Behavioural docs, threat models +- Spec docs (`docs/**` not in history-class above) + +**Tension surfaces** — both apply; defer to maintainer: + +- `docs/bootstrap/**` (#195) — current-state operational substrate + AND maintainer's personal framework documentation simultaneously. +- `docs/craft/**` Attribution sections (#206) — current-state + operational substrate AND PR-level provenance simultaneously. + +The **Otto-279 history-vs-current-state carve-out is mature enough +to codify in `docs/AGENT-BEST-PRACTICES.md` as a stable BP-NN rule.** +Tension surfaces need either a third surface class or explicit +carve-out within current-state. + +--- + +## Fixes-to-substrate that compound across the corpus + +These are the structural changes that converted per-doc fix-toil +into one-line stale-resolved-by-reality replies: + +1. **Otto-114 forward-mirror** — Anthropic AutoMemory mirrored to + in-repo `memory/` tree. Converts 25+ memory-file-dangling + findings across the corpus into verify-with-`ls`-and-resolve. +2. **Otto-279 surface-class refinement** — history-vs-current-state + carve-out for first-name attribution. Converts every "Aaron + appears in research surface" finding into a one-paragraph stamp + reply citing the surface class. +3. **Auto-merge SQUASH armed by default** (Otto-224) — drain-author + doesn't need to manually merge after CI clears. +4. **Otto-227 verbatim-preservation discipline** — absorb docs + (Amara ferries) carry an explicit non-edit boundary. Converts + "brittleness in proposed validation" findings inside verbatim + ferry content into VERBATIM-PRESERVATION DECLINED replies + (#235, also second-wave on same PR). +5. **Otto-236 reply+resolve always paired** — every thread ends + resolved (branch-protection requires it), so even DEFERRED-TO- + MAINTAINER replies close the merge-gate. + +--- + +## Open patterns / future-work candidates + +- **Pre-commit lint for inline-code-span line-wrap** (most-recurring + formatting bug in the corpus). +- **Doc-lint for claim-vs-list cardinality** (catches phase-numbering + + audit-count drift). +- **Doc-lint for unresolvable external-source references** (research + docs without URLs / arXiv IDs / DOIs). +- **`_patterns.md` self-update** — when a new drain-log lands, add + any new pattern here + cite the log. +- **Promote Otto-279 to BP-NN-stable-rule** in + `docs/AGENT-BEST-PRACTICES.md` (currently memory-only). +- **Tension-surface third class or carve-out** for bootstrap/ + + craft/ Attribution sections. +- **Per-PR reviewer-coverage routing** (Codex vs Copilot strengths + on different surfaces — informs per-finding severity-weighting). + +--- + +## How to update this file + +When a new drain-log lands in `docs/pr-preservation/`: + +1. Read the new log's "Pattern observations" section. +2. For each observation: + - If it matches an existing pattern here, add a citation + (`Observed on: ... ; #NNN`). + - If it's a new pattern, add a section under "Recurring findings + classes" with a name, pattern description, fix template, and + observation citations. +3. If an outcome class needs renaming or a tension surface gets a + maintainer-decision, update the taxonomy sections. + +This file is itself current-state operational substrate: it should +reflect the corpus state, not the historical sequence in which +patterns were discovered. + +## Reference patterns + +- `docs/pr-preservation/*-drain-log.md` — the per-PR training-signal + records this index abstracts over. +- `memory/feedback_pr_reviews_are_training_signals_*.md` — Otto-250 + framing of the discipline. +- `docs/AGENT-BEST-PRACTICES.md` — where BP-NN-stable-rule promotions + land if/when the four stable outcome classes + history-class + surfaces get codified.