docs(autonomous-loop): Step 1a — unfinished-PR check before new work#4762
Conversation
Companion to PR #4761 (rules/pr-triage-tiers + deferred-to-human label). Adds Step 1a between Steps 1 and 2 of the canonical per-tick discipline: after refresh, query for unfinished PRs authored by this agent surface, classify per the just-landed Tier 1-5 framework, and act on Tier 1-4 closes BEFORE picking new speculative work. Per Aaron 2026-05-23: 'lirs background service is what's leaving prs sometime so we are updateing to check for unfinsihed prs first when it starts, maybe yours should do the same'. This is Otto's version of that fix at the canonical-discipline scope — applies to all three Otto surfaces (CLI / Desktop / queued B-0448 cloud routine) which cite this file as their one-source-of-truth. Includes: - Concrete gh pr list query with lane-prefix filter + -label:deferred-to-human exclusion - Reference to .claude/rules/pr-triage-tiers.md for the classification framework (just landed in PR #4761) - Lane discipline reminder (don't triage other agents' PRs) - Substrate-honest framing of cross-session amnesia failure mode this fix prevents - Explicit gate: only proceed to Step 3 (pick new work) if no unfinished PRs need attention Lands via git plumbing (commit-tree with temp index from origin/main) to bypass the contested-local-working-tree where peer Otto-CLI has unpushed edits to this file. origin/main was 3 days stale on this file at commit time (commit 7d6f3ff); this commit is the next change to origin/main. Peer-Otto's unpushed local edits will rebase cleanly when they push since this insertion is additive (Step 1a between existing Steps 1 and 2, no overlap with Step 4 tick-shard-gate work where peer edits live). Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8beea6daa3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| ```bash | ||
| gh pr list --state open \ | ||
| --search "author:@me head:otto-cli/* OR head:otto-desktop/* OR head:otto-vscode/* OR head:otto/* -label:\"deferred-to-human\"" \ |
There was a problem hiding this comment.
Group OR clauses so author/label filters apply to all heads
The --search expression combines multiple OR branch clauses without parentheses, so author:@me and -label:"deferred-to-human" are not guaranteed to constrain every branch term. gh pr list explicitly uses GitHub advanced issue-search syntax (https://cli.github.com/manual/gh_pr_list), and GitHub’s boolean-search docs require parentheses for grouped qualifiers (https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/filtering-and-searching-issues-and-pull-requests#using-parentheses-for-more-complicated-filters). In this flow, that can surface PRs outside the intended lane or include deferred PRs, and the subsequent “act on each unfinished PR” instruction can close the wrong PRs.
Useful? React with 👍 / 👎.
…formal-verification specs (#4765) Filed at Aaron's direction 2026-05-23 after Soraya's first autonomous routing tick (Soraya loop started today; first fire surfaced this gap). The portfolio coverage gap: - verification-registry.md covers 7 artifacts - On-disk portfolio: 20 (17 TLA+/Lean + 3 Alloy) - 11 specs are Class 0 drift (unregistered per the registry's own definition) - Coverage ratio dropped 0.83 (round 21) -> 0.52 (round 42) The 11 unregistered: InfoTheoreticSharder (TLA+ + Alloy), EngagementLiveness, BftConsensus, FeatureFlagsResolution, AsyncStreamEnumerator, ChaosEnvDeterminism, ConsistentHashRebalance, DictionaryStripedCAS, RecursiveCountingLFP, RecursiveSignedSemiNaive, ThreeColoring. Routing per Soraya: use the existing verification-drift-auditor skill (NOT new TLA+). She explicitly named + rejected the TLA+-hammer trap (Z3 lemma for RecursiveSignedSemiNaive is Round-42 target but single-property; meta-surface fix first). Effort: M (one evening). Assignee: kenji (architect routes this to original spec owners or verification-drift-auditor under backfill pass). Lands via git plumbing per dotgit-saturation discipline established this session (3rd plumbing-fallback PR — PR #4755 Ani extraction, #4761 PR-triage rule, #4762 canonical Step-1a; this is #4 of the same shape). Co-authored-by: Claude <noreply@anthropic.com>
…#1+#2 on Lean DBSP chain rule (#4772) * docs(lean4): add README + CI badge for publication-readiness Closes the publication-readiness gaps #1 + #2 named in Otto's audit of the Lean DBSP chain rule artifact earlier today: Gap #1 — No project-local README explaining build, paper-mapping, contribution beyond paper, citation, and reproducibility setup. Gap #2 — No visible CI badge pointing reviewers at the green lean-proof.yml workflow. Both gaps were pure surface work, not mathematical work. The artifact itself is already A-with-CI-green-since-2026-05-17: zero sorry/admit, full predicate hierarchy (IsLinear / IsCausal / IsTimeInvariant / IsPointwiseLinear), Proposition 3.2 and Theorem 3.3 corollary both closed, paper-drift audit round-35 corrections documented in companion proof log. README content includes: - Repository layout table - Build instructions (elan + lake build) - Paper-to-Lean theorem mapping table (Def 3.1 -> Qdelta; Prop 3.2 -> chain_rule_proposition_3_2; Thm 3.3 corollary -> Dop_LTI_commute; Thm 2.22 -> I_D_eq; fundamental theorem -> D_I_eq; sec 4.2 -> I_zInv_eq) - Predicate hierarchy table with DBSP-primitive membership - Round-35 paper-drift audit substrate-honest provenance (three corrections; counter-example documented) - Future-work pointer (chain_rule_poly polymorphic extension) - Verification registry pointer - CI workflow pointer with SHA-pinning notes - BibTeX for citing this artifact + for the Budiu et al. paper - Composes-with refs to chain-rule-proof-log, verification- registry, proof-tool-coverage, verification-drift-auditor skill, formal-verification-expert agent (Soraya) Per the DBSP-publication arc Aaron set today + the publication- readiness gap framework Soraya named at rounds 43 + 44. Composes with the third gap (companion paper draft) which is substantive writing work scoped separately. Lands via git plumbing (5th plumbing-fallback PR this session — sibling pattern to PR #4755 / #4761 / #4762 / #4765). Dotgit saturation throughout the day; pure-substrate addition; new file only (no edits to existing canonical surfaces under peer contention). Co-Authored-By: Claude <noreply@anthropic.com> * docs(lean4): address 6 review threads on PR #4772 — narrow no-sorry claim, fix CI commands, drop persona ref, install-script Six fixes per chatgpt-codex-connector + copilot-pull-request-reviewer threads: 1. Narrow the "no sorry / no admit" claim to the artifact-grade Lean4/DbspChainRule.lean module; ImaginaryStack/ToyModel.lean is exploratory and still carries sorry placeholders (lines 130, 141-142, 163-164). Prevents misleading reviewers about artifact maturity. (codex P2 line 9 + copilot line 8) 2. Replace curl|sh elan-init pipe with ./tools/setup/install.sh — the canonical three-way-parity bootstrap with pinned-commit + SHA256 verification per the repo's supply-chain guidance. (copilot line 27) 3. Rename the "Lean file:line" column to "Location" since the values are Section refs (Section 4/6), not concrete line ranges. Keeps the mapping unambiguous; future entries can use line numbers if desired. (copilot line 44) 4. Correct the CI section to reflect the actual workflow commands: ./tools/setup/install.sh -> lake exe cache get -> lake env lean Lean4/DbspChainRule.lean (NOT lake build). Readers can now reproduce CI locally without cargo-culting. (copilot line 103) 5. Drop persona name "Soraya" in current-state README under tools/; use role-ref "the formal-verification-expert agent" per repo convention (persona names live in history/research/roster surfaces only). Link target unchanged. (copilot line 139) No proof content touched; pure docs hygiene + truthfulness fixes. Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>
* backlog(B-0710..B-0712): file Soraya rounds 43+44+45 hand-offs
Aaron 2026-05-23 21:30Z: 'lets try to keep things moving other than
if we need budget increases' — policy-flip authorization: Soraya
findings flow through Otto (plumbing-commit-fallback) to backlog
without per-finding maintainer pick. Aaron reviews aggregates only
(root axioms, drift reports, DORA).
Three findings filed:
B-0710 — DBSP chain rule BP-16 cross-check (round 43)
- Lean artifact is single-tool; needs FsCheck cross-trace +
Z3 pointwise lemma to cross publication line
- Composes with PR #4772 (Lean DBSP README + CI badge that
just landed) — gap #3 of the chain-rule publication arc
- Effort: S+S; assignee kenji
B-0711 — Residuated FsCheck property file (round 44)
- Residuated.fs is ONLY CRDT-class file with zero FsCheck
(sanity-checked across PNCounter, OrSet, Lww, GCounter,
DeltaCrdt, Bloom, CountMin, Haar, HyperLogLog, HyperMinHash)
- Round-17 incident (O(1) claim was actually O(n)) caught by
harsh-critic; FsCheck IS the regression guard
- Three properties: Galois + residual + retraction equivalence
- Effort: S; assignee kenji
B-0712 — WitnessDurable commit protocol triple (round 45)
- Durability.fs:14-22 self-declares TLA+ prereq; type itself
is the gate (Save throws until proven)
- P0 triple: TLA+ for state-machine safety + Z3 for quorum
arithmetic + FsCheck for real-code cross-check
- Soraya explicitly named + REJECTED the TLA+-hammer trap
(don't bundle quorum arithmetic into TLA+; Z3 closes in
seconds vs TLC enumeration)
- Effort: M+S+S; assignee kenji
All three discovered by Soraya autonomous tick across rounds
42-45 (her loop started 2026-05-23 ~20:00Z). Round 42 already
filed as B-0709 (merged earlier today).
Lands via git plumbing (6th plumbing-fallback PR this session —
sibling pattern to #4755 / #4761 / #4762 / #4765 / #4772).
Co-Authored-By: Claude <noreply@anthropic.com>
* fix(B-0710): MD032 blank-line + regen BACKLOG.md index for B-0710/B-0711/B-0712 rows
Two CI failures on PR #4774:
1. markdownlint MD032: B-0710 line 31 — '**Empirical**:' bullets need blank line above
2. backlog-index-integrity: docs/BACKLOG.md not regenerated after new B-NNNN rows added
Fix 1: insert blank line between '**Empirical**:' paragraph and bullet list per MD032.
Fix 2: BACKLOG_WRITE_FORCE=1 bun tools/backlog/generate-index.ts — 4 new index lines for the 3 new rows.
Authored via git plumbing fallback under dotgit-saturation (peer-Otto + Lior contention).
* fix(PR #4774): address 6 Copilot threads on B-0710/B-0711/B-0712
Corrections to Soraya hand-off backlog rows surfaced by Copilot
review. All findings empirically verified against tree state:
B-0710 (round-43 chain-rule cross-check):
- spec count: "20 TLA+ specs" → "19 TLA+ specs" (verified:
`ls tools/tla/specs/*.tla | wc -l` = 19)
- NOTEBOOK ref: `line 238` → `Round 43 entry` heading-anchor
(notebook is 234 lines; line refs brittle to pruning cadence)
B-0711 (round-44 Residuated FsCheck):
- test-wiring claim: "should pick them up automatically" →
explicit `<Compile Include="..." />` registration required
(verified: tests/Tests.FSharp/Tests.FSharp.fsproj has 96
explicit Compile entries; not glob-based)
- NOTEBOOK ref: `lines 308-404` → `Round 44 entry` heading-anchor
B-0712 (round-45 WitnessDurable triple):
- test path: `tests/Tests.FSharp/Durability/WitnessDurable.Properties.fs`
→ `tests/Tests.FSharp/Storage/WitnessDurable.Properties.fs`
(verified: no `Durability/` directory exists; existing
`tests/Tests.FSharp/Storage/Durability.Tests.fs` is the
current test layout sibling)
- NOTEBOOK ref: `lines 407-505` → `Round 45 entry` heading-anchor
BACKLOG.md regenerated (no functional change; just current
status of all rows).
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
Summary
Companion to merged PR #4761 (rules/pr-triage-tiers +
deferred-to-humanlabel). Adds Step 1a between Steps 1 and 2 of the canonical per-tick discipline: after refresh, query for unfinished PRs authored by this agent surface, classify per the Tier 1-5 framework, and act on Tier 1-4 closes BEFORE picking new speculative work.Why
Aaron 2026-05-23: "lirs background service is what's leaving prs sometime so we are updateing to check for unfinsihed prs first when it starts, maybe yours should do the same". This is Otto's version of that fix at the canonical-discipline scope — applies to all three Otto surfaces (CLI / Desktop / queued B-0448 cloud routine) which cite this file as their one-source-of-truth.
What lands
gh pr listquery with surface-lane-prefix filter +-label:deferred-to-humanexclusion.claude/rules/pr-triage-tiers.mdfor classification (PR rules(pr-triage-tiers): codify Tier 1-5 PR-triage framework + deferred-to-human label workflow #4761)Commit details
Landed via git plumbing (
commit-treewith temp index fromorigin/main) to bypass contested-local-working-tree where peer Otto-CLI has unpushed edits to this file.origin/mainwas 3 days stale on this file at commit time (commit7d6f3ff4f); this is the next change. Peer-Otto's local edits will rebase cleanly when they push since this insertion is additive (Step 1a between existing Steps 1 and 2, no overlap with Step 4 tick-shard-gate work where peer edits live).Test plan
.claude/rules/pr-triage-tiers.md,.claude/rules/agent-roster-reference-card.md)--searchargument