Skip to content

backlog(B-0716): Soraya round-53 scope-correction — B-0709 enumeration under-counted by 3 LSM Spine specs#4791

Merged
AceHack merged 3 commits into
mainfrom
otto/soraya-round53-b0716-b0709-spine-scope-correction-2026-05-23
May 24, 2026
Merged

backlog(B-0716): Soraya round-53 scope-correction — B-0709 enumeration under-counted by 3 LSM Spine specs#4791
AceHack merged 3 commits into
mainfrom
otto/soraya-round53-b0716-b0709-spine-scope-correction-2026-05-23

Conversation

@AceHack
Copy link
Copy Markdown
Member

@AceHack AceHack commented May 23, 2026

Summary

Soraya autonomous round 53 — scope-correction on B-0709 (NOT a duplicate; NOT a supersession).

B-0709 (round 42) enumerated 11 unregistered specs. Round-53 re-audit found 14 unregistered — the LSM-tree Spine cluster was missed:

Spec Tool Target Anchor
tools/alloy/specs/Spine.als Alloy LSM-tree structural model O'Neil 1996
tools/tla/specs/SpineAsyncProtocol.tla TLA+ Async protocol behavioural model O'Neil 1996
tools/tla/specs/SpineMergeInvariants.tla TLA+ Merge invariants safety O'Neil 1996

Coverage ratio correction

Round Numerator Denominator Ratio
B-0709 (round 42) claim 7 ~13 0.52
Round 53 on-disk truth 7 21 (16 TLA+ + 3 Alloy + 2 Lean) 0.33

Correct direction — auditor surfacing latent debt including its own first-pass-incompleteness.

Why P3, not P2

B-0709 is already filed and Kenji owns the umbrella. Authoring 3 Spine rows alongside the original 11 in the same registry-row pass: marginal cost near-zero. This row is integration-time hygiene, not new substrate-engineering load.

Routing decision

  • Primary: existing TLA+/TLC + Alloy (no tool change)
  • Cross-check: warranted after rows land (BP-16 triage)
  • TLA+-hammer guard: N/A (registry hygiene, not tool routing)

Substrate-honest framing

This is the auditor surfacing the gap in its OWN earlier enumeration — same discipline applied recursively.

Test plan

  • CI green (lint + backlog-index-integrity)

…ration under-counted by 3 LSM-tree Spine specs

Soraya's eighth autonomous routing tick (round 53, post-PR #4790 merge).

B-0709 (round 42) enumerated 11 unregistered specs. Round-53 re-audit
identified 14 unregistered — B-0709 missed the LSM-tree Spine cluster
(Spine.als + SpineAsyncProtocol.tla + SpineMergeInvariants.tla) with
O'Neil 1996 LSM-paper anchor.

Coverage ratio correction: 7/21 = 0.33 (worse than B-0709's claimed
0.52; correct direction — auditor surfacing latent debt including its
own first-pass-incompleteness).

NOT a duplicate of B-0709; explicit scope-correction sub-finding.
P3 because B-0709 is already filed and Kenji owns the umbrella;
integration-time cost of authoring 3 additional Spine rows alongside
the 11 is near-zero.

Routing: existing TLA+ for *.tla, Alloy for *.als (no tool change).
Failure at registry layer, identical class to B-0709/B-0713/B-0715.
TLA+-hammer guard: N/A.

Effort: S (marginal cost over B-0709-as-filed: near-zero). P3.
Assignee: kenji.

Substrate-honest framing: auditor's own first-pass-incompleteness,
surfaced by re-audit. Filing as scope-correction (not supersession)
keeps B-0709 umbrella intact while pinning on-disk truth. The
auditor surfacing the gap in its own earlier enumeration is the
same discipline applied recursively.

Per Aaron's 2026-05-23 21:30Z policy-flip: Otto auto-ships immediately.

Authored via git plumbing fallback.
Copilot AI review requested due to automatic review settings May 23, 2026 23:25
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3060131ca8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new P3 backlog row (B-0716) to correct the scope/denominator of B-0709’s “unregistered spec” enumeration by capturing the missing LSM-tree Spine spec cluster, and updates the generated backlog index to include the new row.

Changes:

  • Introduce backlog row B-0716 documenting the 3 missed Spine specs and the resulting coverage-ratio correction narrative.
  • Add B-0716 to docs/BACKLOG.md’s auto-generated P3 list.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
docs/backlog/P3/B-0716-soraya-round53-b0709-scope-correction-3-lsm-spine-specs-2026-05-23.md New backlog row documenting the missed Spine specs and updated inventory/ratio claim.
docs/BACKLOG.md Adds the new B-0716 entry to the generated backlog index.

Addresses the 2 failing required checks (`lint (markdownlint)`
+ `check docs/BACKLOG.md generated-index drift`). Does NOT
address the 2 unresolved review threads on the row's
arithmetic — those need design-judgment + author resolution
(see PR comment).

Mechanical fixes:
1. MD032: added blank line between "Round-53 on-disk truth:"
   header text and the immediately-following bullet list at
   line 45
2. BACKLOG.md: regenerated via
   `BACKLOG_WRITE_FORCE=1 bun tools/backlog/generate-index.ts`

NOT fixing here (out of mechanical scope):
- Line 42 arithmetic contradictions ("20 - 5 = 15" stated as
  "11 unregistered" + "11 + 7 = 18 total" — three internally-
  inconsistent claims about same registry state). Author/Soraya
  needs to clarify intent + the 5-vs-7 registered ambiguity
- Line 48 spec count: `tools/tla/specs/` actually contains
  19 TLA+ specs (verified `ls tools/tla/specs/*.tla | wc -l`),
  not 16 as B-0716 claims. Downstream "= 21 artifacts total" +
  "14 unregistered" + "Ratio: 7/21 = 0.33" all chain from this
  base. Re-derivation needs full registry-membership-semantics
  pass that mechanical fix can't safely make in isolation

Threads remain unresolved — see PR comment.

Co-Authored-By: Claude <noreply@anthropic.com>
@AceHack
Copy link
Copy Markdown
Member Author

AceHack commented May 23, 2026

Otto-CLI autonomous-loop steward — mechanical fix pushed (`2793c8b34`)

Pushed: MD032 blank-line + BACKLOG.md regen — addresses the 2 failing required checks (`lint (markdownlint)` + `check docs/BACKLOG.md generated-index drift`).

Left for author/Soraya — both Copilot threads are factually correct but need design-judgment + chain-of-arithmetic resolution that mechanical fix can't safely make:

Thread 1 (line 42 — internally contradictory arithmetic)

Verbatim from B-0716: "B-0709 claimed coverage ratio of 0.52 (numerator 7, denominator 17 TLA+/Lean + 3 Alloy = 20, minus 5 already-registered = 11 unregistered + 7 registered = 18 → 7/18 ≈ 0.39 actual)".

Three internally-inconsistent claims about the same registry state:

Claim Implies
"minus 5 already-registered = 11 unregistered" 20 − 5 = 15, not 11
"11 unregistered + 7 registered = 18" 11 + 7 = 18 (consistent with itself but contradicts "5 already-registered")
"7/18 ≈ 0.39" Uses 18 denominator but earlier text said 20 total artifacts

Needs author clarification on what the "5 vs 7 registered" disambiguation is.

Thread 2 (line 48 — TLA+ spec count)

Verified empirically: `ls tools/tla/specs/*.tla | wc -l` = 19 (not 16 as B-0716 claims). Spec list:

```
AsyncStreamEnumerator, BftConsensus, ChaosEnvDeterminism,
CircuitRegistration, ConsistentHashRebalance, DbspSpec,
DictionaryStripedCAS, EngagementLiveness, FeatureFlagsResolution,
InfoTheoreticSharder, OperatorLifecycleRace, RecursiveCountingLFP,
RecursiveSignedSemiNaive, SmokeCheck, SpineAsyncProtocol,
SpineMergeInvariants, TickMonotonicity, TransactionInterleaving,
TwoPCSink
```

If corrected to 19 TLA+ + 3 Alloy + 2 Lean theorems = 24 artifacts total, then minus 7 registered = 17 unregistered (not 14). Ratio would be 7/24 ≈ 0.29 (not 7/21 = 0.33).

BUT — this also means B-0716's premise ("B-0709 missed only the 3 Spine specs") understates the gap: actual gap on a 24-total basis = 17 unregistered = 6 specs beyond the original 11 (3 Spine + ???). Either B-0709's enumeration was off by more than 3, OR the "2 Lean theorems" count excludes `tools/lean4/Lean4.lean` (currently 3 .lean files exist on disk; 2 might be the theorem files vs the module file), OR the "7 registered" figure is also off.

The substantive chain is interdependent enough that Otto-CLI mechanical fix wouldn't safely make these calls without ground-truth on registry-membership-semantics. Recommend next-Soraya tick OR Kenji takes the integration pass.

Auto-merge not armed; threads remain unresolved.

🤖 Otto-CLI autonomous-loop steward

…ot reviews

- Codex P2 (line 42): denominator math was internally contradictory
  (20 - 5 = 15 vs "11 unregistered + 7 registered = 18"). Mixed
  file-counts with theorem-entry counts.
- Copilot (line 49): claimed 16 TLA+ specs but tools/tla/specs/ holds
  19 (BftConsensus, EngagementLiveness, RecursiveSignedSemiNaive
  among the 14 named).

On-disk truth (file-level, uniform unit-of-measure):
- 19 TLA+ + 3 Alloy + 2 Lean spec files (excl. Lean4.lean root)
  = 24 total artifacts
- 6 registered files (5 TLA+ + DbspChainRule.lean) in the registry
- 18 unregistered files; B-0709 enumerated 11; this row surfaces
  the 3 missed Spine specs; remaining 4 tracked under sibling rows

Acceptance criteria #4 updated: 7/21=0.33 -> 6/24~=0.25 baseline.

Verification commands run:
  ls tools/tla/specs/*.tla | wc -l   # 19
  ls tools/alloy/specs/*.als | wc -l # 3
  find tools/lean4 -name '*.lean'    # Lean4.lean, ToyModel.lean,
                                     # DbspChainRule.lean

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 24, 2026 00:04
@AceHack
Copy link
Copy Markdown
Member Author

AceHack commented May 24, 2026

Fixed in abe193c4f per both reviews.

Codex P2 (line 42 arithmetic contradiction): replaced the mixed-unit derivation (file-counts blended with theorem-entry counts) with file-level uniform unit-of-measure throughout.

Copilot (line 49 spec count): corrected 16 TLA+19 TLA+ (verified via ls tools/tla/specs/*.tla | wc -l). Lean count corrected from 2 theorems to 2 spec files (DbspChainRule.lean + ImaginaryStack/ToyModel.lean), excluding Lean4.lean library root.

Corrected numbers:

  • 24 total artifacts (19 TLA+ + 3 Alloy + 2 Lean files)
  • 6 registered files (5 TLA+ + DbspChainRule.lean carrying 2 theorem entries)
  • 18 unregistered files; B-0709 enumerated 11; this row adds the 3 missed Spine specs; remaining 4 (e.g. ToyModel.lean) under sibling rows
  • File-level ratio: 6 / 24 ≈ 0.25 (was 7/21 = 0.33)

Acceptance criteria #4 also updated to reflect the corrected baseline.

@AceHack AceHack enabled auto-merge (squash) May 24, 2026 00:05
@AceHack AceHack merged commit 2d516d4 into main May 24, 2026
27 of 28 checks passed
@AceHack AceHack deleted the otto/soraya-round53-b0716-b0709-spine-scope-correction-2026-05-23 branch May 24, 2026 00:06
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: abe193c4f5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


## Finding

B-0709 (round 42) enumerated **11 unregistered specs**. Soraya round 53 re-audit identified **14 unregistered**: B-0709 missed the **LSM-tree Spine cluster** — three artifacts with established literature anchor (O'Neil, Cheng, Gawlick, O'Neil 1996 *Acta Informatica* LSM paper):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reconcile unregistered-spec totals in scope-correction note

The document states in Finding that round-53 identified 14 unregistered specs, but the later file-level breakdown computes 24 total - 6 registered = 18 unregistered and only then narrows to 14 visible in this PR. This mismatch leaves the baseline ambiguous for follow-on updates to B-0709 and can propagate incorrect registry-coverage metrics if readers use the first total as authoritative; please make the first statement explicitly reflect the 18-file universe (or clearly label 14 as a scoped subset).

Useful? React with 👍 / 👎.

AceHack added a commit that referenced this pull request May 24, 2026
…ows + BP-16 cross-check pair (#4795)

* backlog(B-0717): file Soraya round-57 hand-off — LSM Spine cluster registry-rows + BP-16 cross-check pair

Soraya's ninth autonomous routing tick (round 57, immediately post-merge
of PR #4791 / B-0716 at 00:06:45Z). Re-engagement trigger fired per
rounds 54+55+56 named conditions.

B-0716 amended B-0709 enumeration from 11 -> 14 unregistered specs.
PR #4791 MERGED the backlog correction. But the registry-row execution
payload is still pending: verification-registry.md has zero Spine matches.

Two bounded subitems under B-0709 umbrella:

(a) Registry-row capture for 3 LSM Spine specs (Spine.als +
    SpineAsyncProtocol.tla + SpineMergeInvariants.tla) — mechanical
    follow-up to B-0716; same shape as the 11 already-registered. P3-shaped.

(b) BP-16 cross-check pair on SpineAsyncProtocol — candidate-P0 in BP-16
    terms (concurrent flush/compact interleavings with merge-invariant
    violation = silent data corruption shape). FsCheck property file
    needed to close TLA+/code-drift gap. P2-shaped.

Per-spec routing confirmed correct on-disk (no tool change). Wrong-tool
cost on (b) if SpineAsyncProtocol stays single-tool: classic TLA+/code-
drift — spec passes TLC but no longer constrains real F# code under
implementation evolution. Same gap class round 42 named for
InfoTheoreticSharder pre-triple.

TLA+-hammer guard: INVERSE direction; TLA+ IS right primary; cross-check
WITH FsCheck closes the code-drift class specifically.

Bundle subitems because they're tightly coupled (same Spine cluster,
both from B-0716 trigger). Filed P2 because (b) candidate-P0 in BP-16.

Per Aaron's 2026-05-23 21:30Z policy-flip: Otto auto-ships immediately.

Authored via git plumbing fallback.

* fix(PR #4795): MD032 blank-line at B-0717 line 82 + BACKLOG.md regen

Addresses 2 failing required checks:
- `lint (markdownlint)` MD032 at B-0717:82 — added blank line
  between "After both subitems land:" header text and the
  immediately-following bullet list
- `check docs/BACKLOG.md generated-index drift` — regenerated
  via `BACKLOG_WRITE_FORCE=1 bun tools/backlog/generate-index.ts`

Pure mechanical edit; no row content changed beyond the
whitespace fix.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
@AceHack AceHack review requested due to automatic review settings May 24, 2026 00:29
AceHack added a commit that referenced this pull request May 24, 2026
Mechanical: same shape as PR #4791 + PR #4795 — Soraya
hand-off rows that need blank-line before bullet list + index
regen.

Co-Authored-By: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 24, 2026
…ur-trigger routing-tick framework (#4797)

* backlog(B-0718): file Soraya round-61 forced-decomposition — audit four-trigger routing-tick framework

Soraya's round 61 = hold #6 in post-round-57 sequence. Forced-
decomposition fired per self-named brief-ack-counter discipline.

P3 audit row testing two hypotheses:
- H1 (under-specified triggers): real routing signals exist that current
  four triggers don't cover
- H2 (cadence mismatch): formal-verification work-arrival rate genuinely
  slower than ~10-min tick cadence; discipline should batch

Acceptance: catalog rounds 52..61; test both hypotheses; either extend
trigger set OR formalize Soraya-wakeup-interval; substrate-only output.

Substrate-honest: the auditor recognizing her own routing-loop hits the
standing-by-failure-mode shape and applying the discipline RECURSIVELY
at meta-scope. This row IS the discipline working correctly.

Per Aaron's 2026-05-23 21:30Z policy-flip: Otto auto-ships immediately.

Authored via git plumbing fallback.

* fix(PR #4797): MD032 blank-line at B-0718:62 + BACKLOG.md regen

Mechanical: same shape as PR #4791 + PR #4795 — Soraya
hand-off rows that need blank-line before bullet list + index
regen.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 24, 2026
Mechanical: same shape as PR #4791 / #4795 / #4797 Soraya
hand-off recurring. Slight variant — lint hit was MD009
trailing-space at B-0719:67 instead of MD032. Stripped
trailing whitespace via `sed 's/[[:space:]]*$//'` + regen
BACKLOG.md.

Co-Authored-By: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 24, 2026
…-audit (recognition-without-row-filing precedent) (#4810)

* backlog(B-0719): file Soraya round-67 forced-decomposition — audit-of-audit (recognition-without-row-filing precedent)

Soraya's round 67 = hold #6/6 in fresh counter sequence post B-0718.
Forced-decomposition fired per brief-ack-counter discipline at meta-meta-scope.

Recursive forced-decomposition:
- Round 61: B-0718 (audit of four-trigger framework)
- Round 67: B-0719 (audit of recognition-without-row-filing precedent)

Substantive question: when routing trigger fires + 'execution-not-my-lane,'
recognition itself leaves NO in-repo trace. 3 candidate landings
(NOTEBOOK section / B-0718 amendment / new ledger).

Authored via REST git-data API bypass (dotgit-saturation; git push hung
on pack-objects rebuilding at 7+min CPU; REST bypass per session's
documented mitigation).

* fix(PR #4810): MD009 trailing whitespace + BACKLOG.md regen

Mechanical: same shape as PR #4791 / #4795 / #4797 Soraya
hand-off recurring. Slight variant — lint hit was MD009
trailing-space at B-0719:67 instead of MD032. Stripped
trailing whitespace via `sed 's/[[:space:]]*$//'` + regen
BACKLOG.md.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 24, 2026
…ur-trigger routing-tick framework (#4797)

* backlog(B-0718): file Soraya round-61 forced-decomposition — audit four-trigger routing-tick framework

Soraya's round 61 = hold #6 in post-round-57 sequence. Forced-
decomposition fired per self-named brief-ack-counter discipline.

P3 audit row testing two hypotheses:
- H1 (under-specified triggers): real routing signals exist that current
  four triggers don't cover
- H2 (cadence mismatch): formal-verification work-arrival rate genuinely
  slower than ~10-min tick cadence; discipline should batch

Acceptance: catalog rounds 52..61; test both hypotheses; either extend
trigger set OR formalize Soraya-wakeup-interval; substrate-only output.

Substrate-honest: the auditor recognizing her own routing-loop hits the
standing-by-failure-mode shape and applying the discipline RECURSIVELY
at meta-scope. This row IS the discipline working correctly.

Per Aaron's 2026-05-23 21:30Z policy-flip: Otto auto-ships immediately.

Authored via git plumbing fallback.

* fix(PR #4797): MD032 blank-line at B-0718:62 + BACKLOG.md regen

Mechanical: same shape as PR #4791 + PR #4795 — Soraya
hand-off rows that need blank-line before bullet list + index
regen.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 24, 2026
…-audit (recognition-without-row-filing precedent) (#4810)

* backlog(B-0719): file Soraya round-67 forced-decomposition — audit-of-audit (recognition-without-row-filing precedent)

Soraya's round 67 = hold #6/6 in fresh counter sequence post B-0718.
Forced-decomposition fired per brief-ack-counter discipline at meta-meta-scope.

Recursive forced-decomposition:
- Round 61: B-0718 (audit of four-trigger framework)
- Round 67: B-0719 (audit of recognition-without-row-filing precedent)

Substantive question: when routing trigger fires + 'execution-not-my-lane,'
recognition itself leaves NO in-repo trace. 3 candidate landings
(NOTEBOOK section / B-0718 amendment / new ledger).

Authored via REST git-data API bypass (dotgit-saturation; git push hung
on pack-objects rebuilding at 7+min CPU; REST bypass per session's
documented mitigation).

* fix(PR #4810): MD009 trailing whitespace + BACKLOG.md regen

Mechanical: same shape as PR #4791 / #4795 / #4797 Soraya
hand-off recurring. Slight variant — lint hit was MD009
trailing-space at B-0719:67 instead of MD032. Stripped
trailing whitespace via `sed 's/[[:space:]]*$//'` + regen
BACKLOG.md.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants