hygiene(#268+): pr-preservation _patterns.md — synthesis index#448
…he corpus

Otto-268 follow-on: synthesis index abstracting over the 27+ per-PR drain logs in `docs/pr-preservation/`. Makes the training-signal corpus queryable rather than scattered.

Sections:

- **Outcome-class taxonomy**: 4 stable + 1 mixed (FIX / STALE-RESOLVED-BY-REALITY / OTTO-279 SURFACE-CLASS / DEFERRED-TO-MAINTAINER / VERBATIM-PRESERVATION DECLINED).
- **Recurring findings classes**: 10 patterns observed across the corpus, with fix templates and observation citations (inline-code-span line-wrap; memory-file dangling citation (forward-mirror compounding fix); subset-vs-superset framing errors; cross-reference column-name accuracy; implementation-vs-math-definition tension; phase-numbering / count-vs-surface-list cardinality; external-source verifiability gaps; internal-consistency drift on draft ADRs; discriminator falsification; forward-author-to-future-state-of-main drift; self-induced cascade).
- **Cross-reviewer-convergence quality signal**: Codex + Copilot overlap as a high-precision finding region.
- **Per-PR reviewer-coverage signal**: Codex strong on shell / math / security; Copilot strong on design-spec / formatting / refs.
- **Post-merge reviewer-cascade**: wave-by-wave findings shift class on iterative drain.
- **Surface-class taxonomy**: history-class / current-state / tension-surfaces.
- **Compounding-substrate fixes**: Otto-114, Otto-279, Otto-227, Otto-224, Otto-236.
- **Open patterns + future-work candidates**: pre-commit lints for inline-code-span and claim-vs-list cardinality; promote Otto-279 to BP-NN-stable-rule; tension-surface third class.

Closes the Otto-268 backlog item by promoting the per-log corpus into a queryable index. New drain-logs landing add citations or new pattern sections per the "How to update this file" section.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cfaa92c800
> ### Inline-code-span line-wrap rendering bug
>
> Observed on: #191, #195, #219 (4 threads alone), #423.
Anchor pattern citations to drain logs that exist
This index presents its `Observed on` entries as evidence from the in-repo drain-log corpus, but several PRs cited in this section (for example #191, #219, #423) have no corresponding `docs/pr-preservation/*-drain-log.md` records to audit. That makes the synthesis non-verifiable and undermines the file's stated purpose as a queryable evidence index. I checked the current corpus under `docs/pr-preservation/` and these IDs are not backed by drain-log files, so either the missing logs need to be added or the citations need to be rewritten to point at existing sources.
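The check this comment describes can be sketched in a few lines of Python. This is a minimal illustration, not the repository's tooling: the function names are hypothetical, the `<N>-drain-log.md` naming is inferred from the paths quoted in this review, and the file listing in the usage lines is invented for the example.

```python
import re

CITATION = re.compile(r"#(\d+)")


def cited_prs(index_text):
    """Collect PR numbers from 'Observed on:' entries, including wrapped lines."""
    cited = []
    collecting = False
    for line in index_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Observed on:"):
            collecting = True
        elif not stripped:
            collecting = False  # a blank line ends the entry
        if collecting:
            for number in CITATION.findall(stripped):
                pr = int(number)
                if pr not in cited:
                    cited.append(pr)
    return cited


def missing_drain_logs(index_text, existing_files):
    """Return cited PR numbers that have no '<N>-drain-log.md' file."""
    existing = set(existing_files)
    return [pr for pr in cited_prs(index_text)
            if f"{pr}-drain-log.md" not in existing]


# Hypothetical directory listing, for illustration only.
sample_index = (
    "### Inline-code-span line-wrap rendering bug\n"
    "\n"
    "Observed on: #191, #195, #219 (4 threads alone), #423.\n"
)
sample_files = ["195-drain-log.md", "141-ci-fix-log.md"]
print(missing_drain_logs(sample_index, sample_files))  # [191, 219, 423]
```

In practice the file list would come from the `docs/pr-preservation/` directory itself; passing it in as an argument keeps the sketch testable without touching the filesystem.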
> 3. If an outcome class needs renaming or a tension surface gets a
>    maintainer-decision, update the taxonomy sections.
>
> This file is itself current-state operational substrate: it should
Resolve surface-class contradiction for this file
The taxonomy classifies `docs/pr-preservation/**` as history-class, but this line reclassifies `_patterns.md` as current-state operational substrate. Those classes carry different review rules (e.g., attribution/role-ref handling), so leaving both claims in place creates contradictory guidance and inconsistent review outcomes for future edits. Please pick one class or explicitly mark this file as a tension-surface exception.
Pull request overview
Adds a synthesis index (_patterns.md) to consolidate recurring findings/outcome taxonomies across the docs/pr-preservation/*-drain-log.md corpus, aiming to make patterns queryable and easier to apply in future drains.
Changes:
- Introduces an outcome-class taxonomy and surface-class taxonomy for classifying drain outcomes.
- Captures recurring finding-pattern templates (with “Observed on” citations) and proposes future lint candidates.
- Documents update instructions to keep the index maintained as new drain logs land.
> Observed on: #231 (Wave 4 reclassifications: TodoWrite Sept 15 2025;
> Hooks `rust-v0.117.0` March 26 2026).
P1: This section cites #231 as the concrete source, but there is no docs/pr-preservation/231-drain-log.md in the current directory listing. Please either add the referenced drain log or update the example to one that exists, otherwise the version-currency pattern can't be checked.
> **Source corpus:** all `docs/pr-preservation/*-drain-log.md` files
> (currently 27+ logs across 2026-04-22 through 2026-04-25).
P1: The stated corpus size/date range looks inconsistent with the actual docs/pr-preservation/*-drain-log.md inventory. In-tree there are 26 *-drain-log.md files (plus a 141-ci-fix-log.md that doesn't match the glob), and the drain-log IDs span beyond the 2026-04-22..2026-04-25 window. Please update the count/date claim (or define what exactly is being counted).
Suggested change, from:

> **Source corpus:** all `docs/pr-preservation/*-drain-log.md` files
> (currently 27+ logs across 2026-04-22 through 2026-04-25).

to:

> **Source corpus:** all in-tree `docs/pr-preservation/*-drain-log.md`
> files; this index intentionally avoids a hard-coded count/date snapshot
> because the inventory grows over time.
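The counting question raised in this thread (which files the `*-drain-log.md` glob actually covers) can be illustrated with Python's standard `fnmatch` module. This is a sketch only: the function name is hypothetical, and the listing mixes file names mentioned in this review with an invented `101-drain-log.md` entry.

```python
from fnmatch import fnmatch


def split_by_glob(filenames, pattern="*-drain-log.md"):
    """Partition file names into those the corpus glob matches and those it skips."""
    matched = [name for name in filenames if fnmatch(name, pattern)]
    skipped = [name for name in filenames if not fnmatch(name, pattern)]
    return matched, skipped


# Illustrative listing; "101-drain-log.md" is a made-up entry.
listing = ["101-drain-log.md", "141-ci-fix-log.md", "_patterns.md"]
matched, skipped = split_by_glob(listing)
print(len(matched), skipped)  # 1 ['141-ci-fix-log.md', '_patterns.md']
```

Deriving the count from the glob at read time, rather than writing it into the index, is the same idea as the suggested wording above.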
> Observed on: #219 ("fifth phase" + "Phase 6" + 5 listed phases),
> #191 ("18 audits" + 8 actual audit sections).
P1: The "Observed on" citations here reference #219, but there is no corresponding docs/pr-preservation/219-drain-log.md in the current corpus. Please update to existing drain-log IDs (or add the missing log) so this pattern remains traceable.
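The count-vs-list cardinality pattern quoted in this hunk (a stated count that disagrees with the list it introduces) lends itself to a simple heuristic. The sketch below is an assumption about how such a check might look, not an existing lint: the noun list, the function name, and the rule "compare against the first list that follows the claim" are all illustrative choices.

```python
import re

CLAIM = re.compile(r"\b(\d+)\s+(audits|phases|items)\b")
LIST_ITEM = re.compile(r"^\s*(?:[-*]|\d+\.)\s+")


def cardinality_mismatches(text):
    """Flag 'N <noun>' claims whose following list has a different length.

    Returns (line_number, claim_text, listed_count) tuples.
    """
    lines = text.splitlines()
    mismatches = []
    for index, line in enumerate(lines):
        for claim in CLAIM.finditer(line):
            listed = 0
            started = False
            for later in lines[index + 1:]:
                if LIST_ITEM.match(later):
                    listed += 1
                    started = True
                elif started:
                    break  # first non-item line ends the list
                elif later.strip():
                    break  # prose follows the claim: nothing to compare
            if started and listed != int(claim.group(1)):
                mismatches.append((index + 1, claim.group(0), listed))
    return mismatches


sample = "The plan has 18 audits:\n\n- a\n- b\n- c\n\nThere are 2 phases:\n1. x\n2. y\n"
print(cardinality_mismatches(sample))  # [(1, '18 audits', 3)]
```

The heuristic treats every list line as one item, so wrapped continuation lines inside an item would end the count early; a real lint would need to handle that case.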
> Observed on: #231 (AGENTS.md-read test relying on values repeated in
> same doc: false-positive readiness path).
P1: This section cites #231, but docs/pr-preservation/231-drain-log.md is not present in the current corpus. Please update the citation to an existing drain log (or add the missing log) so the index stays self-verifying.
Suggested change, from:

> Observed on: #231 (AGENTS.md-read test relying on values repeated in
> same doc: false-positive readiness path).

to:

> Observed in corpus: AGENTS.md-read test relying on values repeated in
> same doc: false-positive readiness path.
> This file is itself current-state operational substrate: it should
> reflect the corpus state, not the historical sequence in which
> patterns were discovered.
P1: Calling _patterns.md "current-state operational substrate" conflicts with the earlier statement that docs/pr-preservation/** is a history-class surface. If this file is meant to be operational/current-state, consider moving it to an appropriate current-state doc location or explicitly carving it out from the history-surface list.
Suggested change, from:

> This file is itself current-state operational substrate: it should
> reflect the corpus state, not the historical sequence in which
> patterns were discovered.

to:

> This file is a maintained synthesis over the preserved corpus: it
> should stay aligned with the current set of drain logs, not narrate
> the historical sequence in which patterns were discovered.
> Observed on: #85 (5 of 7 fixes were self-reference inconsistencies:
> tier-scheme-vs-tree, id-vs-filename, contradiction,
> claimed-as-existing, PR-description-vs-ADR).
P1: This section cites #85, but there is no docs/pr-preservation/85-drain-log.md in the repository. Please replace with a citation to an existing drain log (or add the missing file), otherwise readers can't verify the claim.
Suggested change, from:

> Observed on: #85 (5 of 7 fixes were self-reference inconsistencies:
> tier-scheme-vs-tree, id-vs-filename, contradiction,
> claimed-as-existing, PR-description-vs-ADR).

to:

> Observed on: recurring draft-ADR review findings where
> self-references drift during authoring (tier-scheme-vs-tree,
> id-vs-filename, contradiction, claimed-as-existing,
> PR-description-vs-ADR).
> Observed on: #219 (OpenAI help-center / DBSP paper /
> provenance-semiring paper without resolvable identifiers), #231
> (parity-matrix claims without release-notes citations).
P1: This pattern cites #219/#231 as sources, but there are no corresponding 219-drain-log.md or 231-drain-log.md files in docs/pr-preservation/. Please update the citations to existing drain logs so readers can verify the verifiability-gap examples.
Suggested change, from:

> Observed on: #219 (OpenAI help-center / DBSP paper /
> provenance-semiring paper without resolvable identifiers), #231
> (parity-matrix claims without release-notes citations).

to:

> Observed on: corpus examples where OpenAI help-center / DBSP paper /
> provenance-semiring paper citations lacked resolvable identifiers,
> and parity-matrix claims lacked release-notes citations.
> Observed on: #135 (10 + 4 = 14 across 2 waves), #231 (2 + 1 + 3 + 2 =
> 9 across 4 waves), #432 (7 first-wave only, no cascade), #435 (2 +
> 1 = 3 across 2 waves).
>
> Pattern: every commit on a PR triggers a fresh Codex/Copilot review
> wave. Wave-by-wave the findings shift class: structural → rendering
> → internal-consistency → version-currency. The cascade is a property
> of the merge-trigger surface, not per-PR oddity.
>
> 4 of 7 cascade-PRs in the 2026-04-25 session followed wave-1 +
> wave-2 cascade pattern.
>
> ## Post-merge cascade triggering version-currency on the doc
>
> Observed on: #231 (Wave 4 reclassifications: TodoWrite Sept 15 2025;
> Hooks `rust-v0.117.0` March 26 2026).
P1: The cascade examples cite #135/#231/#432/#435, but those drain-log files aren't present in docs/pr-preservation/ in this repo. Please update these citations to existing drain logs so the wave-count arithmetic can be verified from the corpus.
Suggested change, from:

> Observed on: #135 (10 + 4 = 14 across 2 waves), #231 (2 + 1 + 3 + 2 =
> 9 across 4 waves), #432 (7 first-wave only, no cascade), #435 (2 +
> 1 = 3 across 2 waves).
>
> Pattern: every commit on a PR triggers a fresh Codex/Copilot review
> wave. Wave-by-wave the findings shift class: structural → rendering
> → internal-consistency → version-currency. The cascade is a property
> of the merge-trigger surface, not per-PR oddity.
>
> 4 of 7 cascade-PRs in the 2026-04-25 session followed wave-1 +
> wave-2 cascade pattern.
>
> ## Post-merge cascade triggering version-currency on the doc
>
> Observed on: #231 (Wave 4 reclassifications: TodoWrite Sept 15 2025;
> Hooks `rust-v0.117.0` March 26 2026).

to:

> Observed in the checked-in drain-log corpus as a recurring multi-wave
> review pattern rather than a single-PR oddity.
>
> Pattern: every commit on a PR triggers a fresh Codex/Copilot review
> wave. Wave-by-wave the findings shift class: structural → rendering
> → internal-consistency → version-currency. The cascade is a property
> of the merge-trigger surface, not per-PR oddity.
>
> The repeated shape matters more than any one example: later waves
> often convert “new finding” volume into narrower follow-up passes on
> consistency, rendering, and version currency.
>
> ## Post-merge cascade triggering version-currency on the doc
>
> Observed in later review waves within the checked-in corpus, where
> earlier structural edits expose stale dates, version strings, and
> other currency-sensitive doc details.
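The wave-count arithmetic this thread asks to verify can also be checked mechanically from the quoted text alone. The following is a minimal sketch with a hypothetical function name and a regular expression written for the exact `a + b = total across N waves` phrasing in the original hunk; it makes no claim about the underlying drain logs. Run on the hunk as quoted, the terms listed for #231 sum to 8 rather than the stated 9.

```python
import re

TALLY = re.compile(r"#(\d+) \(([\d+ ]+?) = (\d+) across (\d+) waves\)")


def check_wave_tallies(text):
    """Map PR number -> (computed sum, stated total, wave count matches term count)."""
    flat = " ".join(text.split())  # undo hard line wraps inside the entry
    results = {}
    for pr, terms, total, waves in TALLY.findall(flat):
        counts = [int(term) for term in terms.split("+")]
        results[int(pr)] = (sum(counts), int(total), len(counts) == int(waves))
    return results


quoted = (
    "Observed on: #135 (10 + 4 = 14 across 2 waves), #231 (2 + 1 + 3 + 2 =\n"
    "9 across 4 waves), #432 (7 first-wave only, no cascade), #435 (2 +\n"
    "1 = 3 across 2 waves).\n"
)
for pr, (computed, stated, waves_ok) in check_wave_tallies(quoted).items():
    print(pr, computed, stated, waves_ok)
# 135 14 14 True
# 231 8 9 True
# 435 3 3 True
```

Entries without a tally, such as the #432 "first-wave only" note, are simply skipped by the pattern.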
> Observed on: #135 (8 threads), #195 (6 threads), #219 (3 threads),
> #235 (3 threads), #377 (5 threads), #206 (multiple). At least 25
> threads across the corpus.
> Observed on: #432 (`warn` unbound: Codex P1 + Copilot P0), #206
> (K-relations retraction: Codex P2 + Copilot P1), #435 (FACTORY-HYGIENE
> row: Codex P2 + Copilot P1).
…on (#466)

* drain(#448 follow-up): resolve _patterns.md surface-class contradiction

  Codex P2 + Copilot P1 both flagged that `_patterns.md` was classified as "current-state operational substrate" while sitting in `docs/pr-preservation/**` (which the file itself classifies as history-class). Internal contradiction.

  Resolved by introducing a third named surface class: **synthesis-over-history**, meaning current-state-tracking abstractions over history-class corpora. Co-located with the corpus they abstract over (so `_patterns.md` stays in `docs/pr-preservation/` alongside the per-log records). The underscore prefix is named as a candidate convention for "synthesis-indices over history-class corpora."

  Updated three sections:

  1. "Surface-class taxonomy": added "Synthesis-over-history surfaces" section with `_patterns.md` as the canonical example.
  2. "How to update this file": replaced the "current-state operational substrate" claim with the synthesis-over-history distinction + co-location rationale.
  3. Closing line: clarified Otto-279 carve-out applies to the per-log records (history-class); synthesis-over-history is the third class distinct from both pure history-class and pure current-state.

  Composes with the future BP-NN promotion path: the four-stable-outcome-classes + history-class surfaces + synthesis-over-history third class are mature enough to codify together in `docs/AGENT-BEST-PRACTICES.md` as a single stable rule.

  Other #448 reviewer findings (cited drain-logs not in current corpus) are stale-resolved-by-reality: drain-logs are landing across PRs #437-#465; the corpus is converging.

* drain(#466 follow-up): fix hyphen-line-wrap + tighten taxonomy glob

  Codex/Copilot caught 4 findings on PR #466's surface-class clarification:

  - Three hyphen-line-wrap renderings (`synthesis-` / `synthesis-over-` / `synthesis-` at end-of-line render with extra space when Markdown treats newline as whitespace). Reflowed all three to keep the term contiguous on one line.
  - Surface-class taxonomy still listed `docs/pr-preservation/**` as history-class which would include `_patterns.md`. Tightened to the per-log glob `docs/pr-preservation/*-drain-log.md` with explicit pointer to the synthesis-over-history exception below.

  Same shape as the inline-code-span line-wrap class documented in `_patterns.md` (Class A in PR #465 BACKLOG row) but for prose-line hyphenation rather than code-span line-wrap. Worth noting in a future expansion of the doc-lint suite: extend the regex to catch `\w-$` end-of-line patterns.

* backlog: P2 doc-lint suite, recurring drain-finding classes promoted

  Compounding-substrate work on Otto-268+ drain-log corpus: three findings classes from `docs/pr-preservation/_patterns.md` have reached observation density warranting pre-commit-lint automation.

  Classes promoted to BACKLOG:

  - A. Inline-code-span line-wrap (4 PRs: #191, #195, #219, #423). Regex check for backtick spans crossing newlines.
  - B. Count-vs-list cardinality (5 PRs: #191, #219, #430, #85, #426). Regex on "N items / phases / audits / drift classes / PRs" patterns + count surrounding list + warn on mismatch.
  - C. Pipe-in-Markdown-table-row (3+ PRs). Regex check on table rows for unescaped `|` inside code spans.

  Effort: M. Each class ~20-50 lines of regex + tests. Could land as separate scripts or combined `tools/lint/doc-lint.sh`. Wire into pre-commit + gate.yml per existing lint-script convention.

  Composes with lower-density candidate classes (D-F) from the multi-CLI capability-map cluster + shellcheck-rule-ID precision + stable-identifier-vs-line-number xref; promote when density justifies.

  References `docs/pr-preservation/_patterns.md` (PR #448).

  Compounding with Otto-114 forward-mirror substrate fix: structural change converts per-PR fix-toil into never-recurring class. The 3+ existing classes have already paid for themselves in verify-and-resolve replies; pre-commit-lint catches future instances at author-time.

* hygiene(#465): fix 3 Copilot findings on the doc-lint BACKLOG row

  - P1 :4481 (Class B regex example): keep `\b\d+\s+...\b` on a single line of backticks (was crossing newline, rendering as two adjacent spans rather than one regex; ironically a Class A pattern instance inside the Class A description).
  - P2 :4488 (Class C): cite markdownlint rule by full identifier `MD056/table-column-count` (consistent with how it's recorded in docs/pr-preservation/141-ci-fix-log.md:35; helps grep-ability).
  - P2 :4501: keep `stable-identifier-vs-line-number` contiguous (was hard-wrapped mid-token as `stable-identifier-vs-line-` / `number`, rendering as 'line- number' with extra space).

  All three findings are pattern instances of classes this same BACKLOG row promotes to lint-suite candidates; appropriate self-application.
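The two wrap-related checks named in these commit messages (Class A code spans crossing a newline, and the `\w-$` end-of-line hyphen pattern) could be prototyped roughly as follows. This is a hedged sketch in Python rather than the `tools/lint/doc-lint.sh` script the backlog row mentions; the function name and rule labels are hypothetical, and the odd-backtick-count test is a heuristic that only understands single-backtick spans.

```python
import re

FENCE = re.compile(r"^\s*(```|~~~)")
HYPHEN_EOL = re.compile(r"\w-$")  # pattern quoted in the commit message


def lint_wraps(text):
    """Return (line_number, rule) pairs for two wrap heuristics outside fences."""
    findings = []
    in_fence = False
    span_open = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.rstrip()
        if FENCE.match(line):
            in_fence = not in_fence
            span_open = False
            continue
        if in_fence:
            continue
        if not line:
            span_open = False  # a blank line ends the paragraph
            continue
        # An odd backtick count means a code span opens or closes across lines.
        if line.count("`") % 2 == 1:
            if not span_open:
                findings.append((number, "code-span-wrap"))
            span_open = not span_open
        # A trailing hyphen after a word character renders with a stray space.
        if HYPHEN_EOL.search(line):
            findings.append((number, "hyphen-wrap"))
    return findings


sample = (
    "see `foo\n"
    "bar` here\n"
    "\n"
    "the synthesis-\n"
    "over-history class\n"
    "```\n"
    "raw ` tick-\n"
    "```\n"
    "fine `ok` line\n"
)
print(lint_wraps(sample))  # [(1, 'code-span-wrap'), (4, 'hyphen-wrap')]
```

Only the line where a span opens is reported, so each wrapped span yields one finding; content inside fenced blocks is skipped because line breaks there are literal.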
Summary
Synthesis index abstracting over the 27+ per-PR drain logs in `docs/pr-preservation/`. Makes the training-signal corpus queryable rather than scattered.

Otto-268 follow-on: closes the backlog item by promoting the per-log corpus into a queryable index.
Sections
Test plan
🤖 Generated with Claude Code