rules(pr-triage-tiers): codify Tier 1-5 PR-triage framework + deferred-to-human label workflow#4761
Conversation
…work + deferred-to-human label workflow
Carves the empirical Tier 1-5 framework evolved during Otto-CLI's
2026-05-23 PR-cleanup session (27 closes from the 125 → 90 open
queue) into a canonical .claude/rules/ rule. Composes with the
just-created GitHub label deferred-to-human (color #FBCA04,
created 2026-05-23) which lets agent unfinished-PR scans skip the
rare-by-design Tier 5 cases.
Per Aaron 2026-05-23: 'lirs background service is what's leaving
prs sometime so we are updateing to check for unfinsihed prs first
when it starts, maybe yours should do the same'. This rule
provides the disposition framework that the unfinished-PR check
applies. The companion edit to docs/AUTONOMOUS-LOOP-PER-TICK.md
Step 1 (unfinished-PR check sub-step) is intentionally deferred
to a follow-up PR — peer Otto-CLI is actively editing that file;
coordinating the canonical change avoids clobbering peer work.
Five tiers:
Tier 1: fully redundant (all files byte-identical on main)
→ close as substrate-redundant
Tier 2: substrate-recoverable (missing files regeneratable via
existing tooling) → close with regenerate path named
Tier 3: substrate-superseded (substrate landed under alt-ID /
alt-filename / alt-scheme on main) → close with
cross-reference
Tier 4: substrate-re-derivable (operational lesson already
encoded in canonical rule form OR reconstructible from
observable evidence) → close with discipline-already-
encoded note
Tier 5: deferred-to-human (substantive + unique + NOT
regeneratable from observable evidence — e.g.,
external-AI conversation transcripts) → tag and leave
for maintainer
Each tier has a substrate-honest comment template + empirical
anchor from the 27 closes that evolved the framework.
Lane discipline: agent applies tiers ONLY to PRs in own surface's
branch lane (otto-cli/* / otto-desktop/* / otto-vscode/* / otto/*
per agent-roster-reference-card.md). Do NOT triage Lior/Vera/
Riven/Alexa branches.
Co-Authored-By: Claude <noreply@anthropic.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
There was a problem hiding this comment.
Pull request overview
Adds a new .claude/rules/ rule that codifies a five-tier framework for triaging stale/unfinished PRs, including a “deferred-to-human” workflow for preserving unique, non-regeneratable substrate while allowing automated scans to skip those PRs.
Changes:
- Introduces Tier 1–5 PR triage framework with disposition/comment templates for each tier.
- Documents the
deferred-to-humanlabel semantics and thegh pr edit ... --add-labelapplication step. - Adds “composes with” cross-references to existing rules and the canonical per-tick loop doc.
|
Vera coordination note: rechecked CI at 2026-05-23T20:07Z. #4761 is mergeable but BLOCKED by gate / lint (markdownlint) only. The failing log reports MD032/blanks-around-lists in .claude/rules/pr-triage-tiers.md at lines 100 and 182. Focused fix: add blank lines around the two affected list blocks, then rerun markdownlint/gate. Root checkout was left read-only by Vera because it is dirty/active. |
…se connectors Markdownlint MD032 violations at lines 100 and 182: line-leading '+' characters interpreted as start-of-list markers requiring blank lines before. Rewrite the two affected continuations: line 100: 'close\n+ archive elsewhere' → 'close-and-archive elsewhere' line 182: 'check\n+ alt-filename + content' → 'check\nplus alt-filename and content' Per .github/workflows/gate.yml lint(markdownlint) failure on PR #4761. Co-Authored-By: Claude <noreply@anthropic.com>
…4762) Companion to PR #4761 (rules/pr-triage-tiers + deferred-to-human label). Adds Step 1a between Steps 1 and 2 of the canonical per-tick discipline: after refresh, query for unfinished PRs authored by this agent surface, classify per the just-landed Tier 1-5 framework, and act on Tier 1-4 closes BEFORE picking new speculative work. Per Aaron 2026-05-23: 'lirs background service is what's leaving prs sometime so we are updateing to check for unfinsihed prs first when it starts, maybe yours should do the same'. This is Otto's version of that fix at the canonical-discipline scope — applies to all three Otto surfaces (CLI / Desktop / queued B-0448 cloud routine) which cite this file as their one-source-of-truth. Includes: - Concrete gh pr list query with lane-prefix filter + -label:deferred-to-human exclusion - Reference to .claude/rules/pr-triage-tiers.md for the classification framework (just landed in PR #4761) - Lane discipline reminder (don't triage other agents' PRs) - Substrate-honest framing of cross-session amnesia failure mode this fix prevents - Explicit gate: only proceed to Step 3 (pick new work) if no unfinished PRs need attention Lands via git plumbing (commit-tree with temp index from origin/main) to bypass the contested-local-working-tree where peer Otto-CLI has unpushed edits to this file. origin/main was 3 days stale on this file at commit time (commit 7d6f3ff); this commit is the next change to origin/main. Peer-Otto's unpushed local edits will rebase cleanly when they push since this insertion is additive (Step 1a between existing Steps 1 and 2, no overlap with Step 4 tick-shard-gate work where peer edits live). Co-authored-by: Claude <noreply@anthropic.com>
…formal-verification specs (#4765) Filed at Aaron's direction 2026-05-23 after Soraya's first autonomous routing tick (Soraya loop started today; first fire surfaced this gap). The portfolio coverage gap: - verification-registry.md covers 7 artifacts - On-disk portfolio: 20 (17 TLA+/Lean + 3 Alloy) - 11 specs are Class 0 drift (unregistered per the registry's own definition) - Coverage ratio dropped 0.83 (round 21) -> 0.52 (round 42) The 11 unregistered: InfoTheoreticSharder (TLA+ + Alloy), EngagementLiveness, BftConsensus, FeatureFlagsResolution, AsyncStreamEnumerator, ChaosEnvDeterminism, ConsistentHashRebalance, DictionaryStripedCAS, RecursiveCountingLFP, RecursiveSignedSemiNaive, ThreeColoring. Routing per Soraya: use the existing verification-drift-auditor skill (NOT new TLA+). She explicitly named + rejected the TLA+-hammer trap (Z3 lemma for RecursiveSignedSemiNaive is Round-42 target but single-property; meta-surface fix first). Effort: M (one evening). Assignee: kenji (architect routes this to original spec owners or verification-drift-auditor under backfill pass). Lands via git plumbing per dotgit-saturation discipline established this session (3rd plumbing-fallback PR — PR #4755 Ani extraction, #4761 PR-triage rule, #4762 canonical Step-1a; this is #4 of the same shape). Co-authored-by: Claude <noreply@anthropic.com>
…#1+#2 on Lean DBSP chain rule (#4772) * docs(lean4): add README + CI badge for publication-readiness Closes the publication-readiness gaps #1 + #2 named in Otto's audit of the Lean DBSP chain rule artifact earlier today: Gap #1 — No project-local README explaining build, paper-mapping, contribution beyond paper, citation, and reproducibility setup. Gap #2 — No visible CI badge pointing reviewers at the green lean-proof.yml workflow. Both gaps were pure surface work, not mathematical work. The artifact itself is already A-with-CI-green-since-2026-05-17: zero sorry/admit, full predicate hierarchy (IsLinear / IsCausal / IsTimeInvariant / IsPointwiseLinear), Proposition 3.2 and Theorem 3.3 corollary both closed, paper-drift audit round-35 corrections documented in companion proof log. README content includes: - Repository layout table - Build instructions (elan + lake build) - Paper-to-Lean theorem mapping table (Def 3.1 -> Qdelta; Prop 3.2 -> chain_rule_proposition_3_2; Thm 3.3 corollary -> Dop_LTI_commute; Thm 2.22 -> I_D_eq; fundamental theorem -> D_I_eq; sec 4.2 -> I_zInv_eq) - Predicate hierarchy table with DBSP-primitive membership - Round-35 paper-drift audit substrate-honest provenance (three corrections; counter-example documented) - Future-work pointer (chain_rule_poly polymorphic extension) - Verification registry pointer - CI workflow pointer with SHA-pinning notes - BibTeX for citing this artifact + for the Budiu et al. paper - Composes-with refs to chain-rule-proof-log, verification- registry, proof-tool-coverage, verification-drift-auditor skill, formal-verification-expert agent (Soraya) Per the DBSP-publication arc Aaron set today + the publication- readiness gap framework Soraya named at rounds 43 + 44. Composes with the third gap (companion paper draft) which is substantive writing work scoped separately. Lands via git plumbing (5th plumbing-fallback PR this session — sibling pattern to PR #4755 / #4761 / #4762 / #4765). Dotgit saturation throughout the day; pure-substrate addition; new file only (no edits to existing canonical surfaces under peer contention). Co-Authored-By: Claude <noreply@anthropic.com> * docs(lean4): address 6 review threads on PR #4772 — narrow no-sorry claim, fix CI commands, drop persona ref, install-script Six fixes per chatgpt-codex-connector + copilot-pull-request-reviewer threads: 1. Narrow the "no sorry / no admit" claim to the artifact-grade Lean4/DbspChainRule.lean module; ImaginaryStack/ToyModel.lean is exploratory and still carries sorry placeholders (lines 130, 141-142, 163-164). Prevents misleading reviewers about artifact maturity. (codex P2 line 9 + copilot line 8) 2. Replace curl|sh elan-init pipe with ./tools/setup/install.sh — the canonical three-way-parity bootstrap with pinned-commit + SHA256 verification per the repo's supply-chain guidance. (copilot line 27) 3. Rename the "Lean file:line" column to "Location" since the values are Section refs (Section 4/6), not concrete line ranges. Keeps the mapping unambiguous; future entries can use line numbers if desired. (copilot line 44) 4. Correct the CI section to reflect the actual workflow commands: ./tools/setup/install.sh -> lake exe cache get -> lake env lean Lean4/DbspChainRule.lean (NOT lake build). Readers can now reproduce CI locally without cargo-culting. (copilot line 103) 5. Drop persona name "Soraya" in current-state README under tools/; use role-ref "the formal-verification-expert agent" per repo convention (persona names live in history/research/roster surfaces only). Link target unchanged. (copilot line 139) No proof content touched; pure docs hygiene + truthfulness fixes. Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>
* backlog(B-0710..B-0712): file Soraya rounds 43+44+45 hand-offs
Aaron 2026-05-23 21:30Z: 'lets try to keep things moving other than
if we need budget increases' — policy-flip authorization: Soraya
findings flow through Otto (plumbing-commit-fallback) to backlog
without per-finding maintainer pick. Aaron reviews aggregates only
(root axioms, drift reports, DORA).
Three findings filed:
B-0710 — DBSP chain rule BP-16 cross-check (round 43)
- Lean artifact is single-tool; needs FsCheck cross-trace +
Z3 pointwise lemma to cross publication line
- Composes with PR #4772 (Lean DBSP README + CI badge that
just landed) — gap #3 of the chain-rule publication arc
- Effort: S+S; assignee kenji
B-0711 — Residuated FsCheck property file (round 44)
- Residuated.fs is ONLY CRDT-class file with zero FsCheck
(sanity-checked across PNCounter, OrSet, Lww, GCounter,
DeltaCrdt, Bloom, CountMin, Haar, HyperLogLog, HyperMinHash)
- Round-17 incident (O(1) claim was actually O(n)) caught by
harsh-critic; FsCheck IS the regression guard
- Three properties: Galois + residual + retraction equivalence
- Effort: S; assignee kenji
B-0712 — WitnessDurable commit protocol triple (round 45)
- Durability.fs:14-22 self-declares TLA+ prereq; type itself
is the gate (Save throws until proven)
- P0 triple: TLA+ for state-machine safety + Z3 for quorum
arithmetic + FsCheck for real-code cross-check
- Soraya explicitly named + REJECTED the TLA+-hammer trap
(don't bundle quorum arithmetic into TLA+; Z3 closes in
seconds vs TLC enumeration)
- Effort: M+S+S; assignee kenji
All three discovered by Soraya autonomous tick across rounds
42-45 (her loop started 2026-05-23 ~20:00Z). Round 42 already
filed as B-0709 (merged earlier today).
Lands via git plumbing (6th plumbing-fallback PR this session —
sibling pattern to #4755 / #4761 / #4762 / #4765 / #4772).
Co-Authored-By: Claude <noreply@anthropic.com>
* fix(B-0710): MD032 blank-line + regen BACKLOG.md index for B-0710/B-0711/B-0712 rows
Two CI failures on PR #4774:
1. markdownlint MD032: B-0710 line 31 — '**Empirical**:' bullets need blank line above
2. backlog-index-integrity: docs/BACKLOG.md not regenerated after new B-NNNN rows added
Fix 1: insert blank line between '**Empirical**:' paragraph and bullet list per MD032.
Fix 2: BACKLOG_WRITE_FORCE=1 bun tools/backlog/generate-index.ts — 4 new index lines for the 3 new rows.
Authored via git plumbing fallback under dotgit-saturation (peer-Otto + Lior contention).
* fix(PR #4774): address 6 Copilot threads on B-0710/B-0711/B-0712
Corrections to Soraya hand-off backlog rows surfaced by Copilot
review. All findings empirically verified against tree state:
B-0710 (round-43 chain-rule cross-check):
- spec count: "20 TLA+ specs" → "19 TLA+ specs" (verified:
`ls tools/tla/specs/*.tla | wc -l` = 19)
- NOTEBOOK ref: `line 238` → `Round 43 entry` heading-anchor
(notebook is 234 lines; line refs brittle to pruning cadence)
B-0711 (round-44 Residuated FsCheck):
- test-wiring claim: "should pick them up automatically" →
explicit `<Compile Include="..." />` registration required
(verified: tests/Tests.FSharp/Tests.FSharp.fsproj has 96
explicit Compile entries; not glob-based)
- NOTEBOOK ref: `lines 308-404` → `Round 44 entry` heading-anchor
B-0712 (round-45 WitnessDurable triple):
- test path: `tests/Tests.FSharp/Durability/WitnessDurable.Properties.fs`
→ `tests/Tests.FSharp/Storage/WitnessDurable.Properties.fs`
(verified: no `Durability/` directory exists; existing
`tests/Tests.FSharp/Storage/Durability.Tests.fs` is the
current test layout sibling)
- NOTEBOOK ref: `lines 407-505` → `Round 45 entry` heading-anchor
BACKLOG.md regenerated (no functional change; just current
status of all rows).
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
Summary
QoL bundle (Aaron-authorized 2026-05-23) that codifies the empirical PR-triage framework evolved during Otto-CLI's 2026-05-23 PR-cleanup session (27 closes from 125 → 90 open queue).
Two additive substrate changes:
New rule:
.claude/rules/pr-triage-tiers.md— five-class disposition framework:GitHub label:
deferred-to-human(color#FBCA04, created 2026-05-23) — applied to Tier 5 PRs so agent unfinished-PR scans skip them.Each tier has substrate-honest comment template + empirical anchor from the 27 closes that evolved the framework.
Deferred from this PR
The companion edit to
docs/AUTONOMOUS-LOOP-PER-TICK.mdStep 1 (unfinished-PR check sub-step) is intentionally deferred to a follow-up PR. Peer Otto-CLI is actively editing that canonical file; coordinating the Step 1 change avoids clobbering peer work. The new rule's body already references the canonical's Step 1 — when the canonical edit lands separately, the rule composes cleanly.Lane discipline
Per
agent-roster-reference-card.md: agent applies tiers ONLY to PRs in own surface's branch lane (otto-cli/*/otto-desktop/*/otto-vscode/*/otto/*). Lior/Vera/Riven/Alexa branches stay in their respective owner's lanes.Why this matters
Aaron 2026-05-23: "lirs background service is what's leaving prs sometime so we are updateing to check for unfinsihed prs first when it starts, maybe yours should do the same". The triage framework is what an unfinished-PR check applies; the Tier 5 +
deferred-to-humanmechanism is what keeps the rare-by-design human-attention case bounded so agents don't loop on it.Commit details
Landed via git plumbing (
commit-treewith temp index, no working-tree touch) due to ongoing dotgit-saturation (544 stuck git pack/maintenance/repack procs at commit time) preventing reliable isolated worktree-add. Same fallback path validated in PR #4755 earlier today.Test plan
.claude/rules/pr-triage-tiers.md(auto-loaded per.claude/rules/convention)deferred-to-humanexists and is queryable