Skip to content

backlog: file B-0179 + B-0180 + B-0181 — 3 broken TLA+ specs from #1397 verify-then-claim sweep#1398

Merged
AceHack merged 1 commit intomainfrom
backlog/three-broken-tla-specs-from-1397-verify-then-claim-2026-05-03
May 3, 2026
Merged

backlog: file B-0179 + B-0180 + B-0181 — 3 broken TLA+ specs from #1397 verify-then-claim sweep#1398
AceHack merged 1 commit intomainfrom
backlog/three-broken-tla-specs-from-1397-verify-then-claim-2026-05-03

Conversation

@AceHack
Copy link
Copy Markdown
Member

@AceHack AceHack commented May 3, 2026

Summary

Per the verify-then-claim sweep in #1397, files 3 backlog rows tracking the broken TLA+ specs that need fixes before they can be registered in CI.

The 3 broken specs

Spec Failure mode New row Effort
SpineAsyncProtocol TTrace dump at depth 9 (counterexample) B-0179 M
CircuitRegistration Config bug — invariant Safety not defined in spec B-0180 S (smallest of 3)
SpineMergeInvariants TTrace dump at depth 17 (counterexample) B-0181 M (escalation possible)

Each row includes

  • Symptom (verbatim TLC output for reproducibility)
  • Investigation steps (read TTrace, classify failure mode)
  • Fix path (spec fix vs invariant weakening vs real-bug escalation)
  • After-fix steps (re-verify, add [Fact], add registry row)
  • Composes-with links

B-0181 escalation note

If TLA+ found an actual concurrency bug in BalancedSpine.fs rather than a spec-modeling error, escalate to BUGS.md P0 immediately. The depth-17 termination makes that more plausible than the depth-9 SpineAsyncProtocol case.

Test plan

🤖 Generated with Claude Code

… verify-then-claim sweep

Per the verify-then-claim sweep in #1397, the 4 deferred B1 specs
from #1383 math-proofs honest assessment broke down as:

- DbspSpec: PASSES (1M states / ~11s) → registered as [Fact] in
  #1397
- SpineAsyncProtocol: FAILS (TTrace dump at depth 9) → B-0179
- CircuitRegistration: FAILS (config bug — invariant Safety not
  defined in spec) → B-0180
- SpineMergeInvariants: FAILS (TTrace dump at depth 17) → B-0181

3 backlog rows file the broken specs as tracked work. Each row
includes:

- Symptom (verbatim TLC output for reproducibility)
- Investigation steps (read TTrace, classify failure mode)
- Fix path (spec fix vs invariant weakening vs real-bug
  escalation)
- After-fix steps (re-verify, add [Fact], add registry row)
- Composes-with links to sibling rows + #1383 + the production
  code being modeled

B-0181 (SpineMergeInvariants) calls out the real-bug-found case
— if TLA+ found an actual concurrency bug in BalancedSpine.fs
rather than a spec-modeling error, escalate to BUGS.md P0
immediately. The depth-17 termination makes that more plausible
than the depth-9 SpineAsyncProtocol case.

B-0180 (CircuitRegistration) is the smallest fix — just a
config-syntax error, not a semantic counterexample.

All 3 P2 since they're pre-existing failures, not new
regressions. Closure of B1 → A in the assessment matrix
requires all 3 to land before the [Fact] entries can be added.

Composes with #1397 (DbspSpec → CI; partial B1 closure) +
#1383 (the assessment) + proof-tool-coverage.md §2 (the
"deliberately skipped pending re-verification" framing —
verification has now happened, with results filed as these
3 rows).
Copilot AI review requested due to automatic review settings May 3, 2026 12:37
@AceHack AceHack enabled auto-merge (squash) May 3, 2026 12:37
@AceHack AceHack merged commit 9f4c307 into main May 3, 2026
23 of 24 checks passed
@AceHack AceHack deleted the backlog/three-broken-tla-specs-from-1397-verify-then-claim-2026-05-03 branch May 3, 2026 12:38
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds three new backlog rows documenting the deferred TLA+ specs that failed the verify-then-claim sweep, so the formal-verification work needed before CI registration is tracked in the backlog alongside the broader proof-readiness effort.

Changes:

  • Add a P2 backlog row for SpineAsyncProtocol capturing the TLC trace-dump failure and proposed follow-up steps.
  • Add a P2 backlog row for CircuitRegistration capturing the undefined Safety invariant config error and candidate fix directions.
  • Add a P2 backlog row for SpineMergeInvariants capturing the TLC trace-dump failure and escalation path if it maps to a production bug.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File Description
docs/backlog/P2/B-0179-fix-spineasyncprotocol-tla-counterexample-blocks-b1-ci-registration-2026-05-03.md Tracks the SpineAsyncProtocol TLC failure and the intended path to re-verify and register it in CI.
docs/backlog/P2/B-0180-fix-circuitregistration-tla-config-bug-safety-invariant-undefined-2026-05-03.md Tracks the CircuitRegistration config/spec mismatch around the missing Safety invariant.
docs/backlog/P2/B-0181-fix-spinemergeinvariants-tla-counterexample-blocks-b1-ci-registration-2026-05-03.md Tracks the SpineMergeInvariants TLC failure and possible escalation depending on root cause.

Comment on lines +1 to +13
---
id: B-0179
priority: P2
status: open
title: Fix SpineAsyncProtocol.tla counterexample (blocks B1 → A CI registration; depth-9 trace dump on TLC run)
tier: formal-verification
effort: M
ask: Otto 2026-05-03 verify-then-claim sweep (#1397) — running TLC on the 4 deferred B1 specs from the math-proofs honest assessment (#1383) found SpineAsyncProtocol produces a TTrace dump at depth 9, indicating counterexample / invariant violation. The "deliberately skipped pending re-verification" framing in `docs/research/proof-tool-coverage.md` §2 was correct; this row tracks the actual fix.
created: 2026-05-03
last_updated: 2026-05-03
depends_on: []
composes_with: [docs/research/2026-05-03-math-proofs-honest-assessment.md, docs/research/proof-tool-coverage.md, B-0180, B-0181]
tags: [tla-plus, formal-verification, spine-async, counterexample, b1, math-proofs-assessment, verify-then-claim]
Comment on lines +32 to +35
1. Read the TTrace file to identify which invariant fails
2. Determine whether the invariant or the spec needs the fix:
- If the spec models the protocol incorrectly, fix the spec
- If the invariant is over-strong (asks for a property the protocol doesn't actually guarantee), weaken the invariant
Comment on lines +32 to +38
Two directions, decision needed:

1. **Define `Safety` in the spec.** Read the spec; identify which invariant the original author intended `Safety` to capture; add `Safety == ...` operator that captures it. Most likely the existing `FlagSound` post-condition or a similar property.

2. **Remove `Safety` from the cfg.** Inspect what other invariants the cfg references; if those alone are sufficient validation, drop the missing `Safety` reference. Risk: may drop an intentional invariant the author meant to add but never finished writing.

The verify-then-claim follow-up tick should pick (1) if the spec has a clear "what was Safety supposed to mean" thread, else (2) with a comment in the cfg documenting why.

Two directions, decision needed:

1. **Define `Safety` in the spec.** Read the spec; identify which invariant the original author intended `Safety` to capture; add `Safety == ...` operator that captures it. Most likely the existing `FlagSound` post-condition or a similar property.
Comment on lines +28 to +37
TLC dumps a trace file when it finds an invariant violation. The depth-17 termination suggests TLC explored 17 transitions before finding the violation — deeper than the SpineAsyncProtocol failure (depth 9), which may indicate the violation is more subtle (further from initial state, requires more interleaving to manifest).

## Investigation needed

1. Read the TTrace file to identify which invariant fails at the depth-17 state
2. Determine the failure class:
- **Spec bug**: spec models the LSM-spine merge protocol incorrectly
- **Invariant over-specification**: invariant asks for a property the merge protocol doesn't actually guarantee
- **Real bug found**: the invariant is correct but the modeled protocol violates it; this would indicate a bug in `BalancedSpine.fs` itself
3. The third class is the most interesting — TLA+ found a real concurrency bug. If that's the case, file as a P0/P1 in BUGS.md instead of treating it as a spec-fix.
AceHack added a commit that referenced this pull request May 3, 2026
…1 (post-#1398) (#1399)

* hygiene(backlog): regenerate BACKLOG.md index for B-0179/B-0180/B-0181 (post-#1398)

Auto-generated index regenerated via:
  BACKLOG_WRITE_FORCE=1 bun tools/backlog/generate-index.ts

Adds 3 new P2 row entries from the #1398 backlog landing:
- B-0179 SpineAsyncProtocol counterexample fix
- B-0180 CircuitRegistration config bug fix
- B-0181 SpineMergeInvariants counterexample fix

Closes the BACKLOG.md generated-index drift warning that fired
on #1398 (non-required check, didn't block merge but flagged
substrate hygiene).

* fix(backlog-index): regenerate via .sh (CI canonical) — closes drift on #1399

CI's  uses
ok: /Users/acehack/Documents/src/repos/Zeta/docs/BACKLOG.md matches generator output, NOT the .ts version.
The .ts and .sh generators disagree on a single blank line at the
header section boundary — the .ts emits an extra blank that .sh
doesn't.

My initial #1399 regeneration used the .ts version
(wrote /Users/acehack/Documents/src/repos/Zeta/docs/BACKLOG.md)
which produced output the CI .sh-check rejected as drift.

Fix: regenerated via the .sh canonical version.

Verified post-fix:
  - `./tools/backlog/generate-index.sh --check` → ok
  - `bun tools/backlog/generate-index.ts --check` → fails
    (the .ts/.sh parity bug; filed as follow-up below)

Files a follow-up parity-debt finding: the .ts generator has a
parity defect with the canonical .sh. Should be tracked as a
backlog row in a follow-up tick — or fixed inline if quick. The
migration substrate (docs/trajectories/typescript-bun-migration/
RESUME.md) lists generate-index.sh as ported in #885 but the
parity bug suggests the port wasn't fully equivalent.
AceHack added a commit that referenced this pull request May 3, 2026
…6-05-03 EOD progress (#1402)

Reflects substantive progress this session across the math-proofs
honest assessment matrix. Key state changes:

**P0 items — 3 of 3 closed:**
- Lean lake-build CI job ✓ (#1394)
- A4 registry rows ✓ (#1393)
- Peer-review email draft ✓ (#1387)
- Stryker B3 → partial (config-fix #1395; CI wire deferred to
  follow-up substantial-design)

**P1 items — significant progress:**
- Alloy B2 → A ✓ (#1396 — silent-no-op was the failure mode;
  spec-path fixed)
- Semgrep B4 → A ✓ (verify-then-claim correction; was already
  in CI)
- B1 4 deferred specs → 2 of 4 done:
  - DbspSpec ✓ #1397 (1M states / 11s)
  - CircuitRegistration ✓ #1401 (B-0180 closed; 3538 states / <1s)
  - SpineAsyncProtocol B-0179 still open (counterexample inv.)
  - SpineMergeInvariants B-0181 still open (counterexample inv.)

**Sibling work tracked:**
- Phase 0 substrate-discovery PoC ✓ (#1392 — 4.0 MB AOT binary
  on osx-arm64; cross-platform CI matrix)
- 3 broken-spec backlog rows filed (#1398 → B-0179 + B-0180 +
  B-0181); B-0180 closed (#1401)
- `.ts/.sh` parity bug in `tools/backlog/generate-index.ts`
  closed ✓ (#1400 — both generators byte-identical)

This update is bounded substrate work documenting the actual
state of the matrix; doesn't add new work, just captures
completion. Future matrix re-grades happen as work-items land
(per the assessment doc's audit-trail discipline).

Composes with #1383 (the original assessment) + every PR
referenced above.

§33 archive-header lint passes.
AceHack added a commit that referenced this pull request May 3, 2026
…egister in CI

Per B-0180 (filed in #1398) — `CircuitRegistration.cfg` referenced
`INVARIANT Safety` but the spec didn't define a `Safety` operator,
causing TLC to error out:

  Error: The invariant Safety specified in the configuration file
  is not defined in the specification.

The intended composite invariant is documented at the bottom of the
spec via the THEOREM `Spec => [](TypeOK /\ NoRegisterAfterBuild)`.
Defined `Safety` as exactly that conjunction so the cfg directive
resolves without changing TLC's check semantics.

**Local verification:**

  - `java -cp tools/tla/tla2tools.jar tlc2.TLC -workers 4 -config
     tools/tla/specs/CircuitRegistration.cfg
     tools/tla/specs/CircuitRegistration.tla` →
    "Model checking completed. No error has been found."
    3538 distinct states explored, depth 14, <1s wall.

  - `dotnet test --filter "FullyQualifiedName~CircuitRegistration"`
    passes in 1.7s.

**Composes with:**

- B-0180 (the backlog row this closes; one of 3 broken-spec
  follow-ups from #1397's verify-then-claim sweep)
- #1397 (DbspSpec → CI; the partial-B1 closure)
- #1383 math-proofs honest assessment (the matrix this
  incrementally closes — 2 of 4 deferred B1 specs now in CI;
  remaining: B-0179 SpineAsyncProtocol + B-0181 SpineMergeInvariants
  both need counterexample investigation, not config fixes)

**Discipline note:** the original spec author left "Safety:" as a
prose comment label on `NoRegisterAfterBuild` (line 81) and on
`ConnectAtMostOnce` (line 85), implying both were intended as
"safety properties" but never finished the operator definition.
The THEOREM at line 96 disambiguates: only `TypeOK /\
NoRegisterAfterBuild` belongs in INVARIANT (a state property);
`ConnectAtMostOnce` uses `[]_vars` and is a temporal PROPERTY
(deferred — not in this fix's scope).
AceHack added a commit that referenced this pull request May 3, 2026
…ister in CI (#1401)

* fix(B-0180): define `Safety` invariant in CircuitRegistration.tla + register in CI

Per B-0180 (filed in #1398) — `CircuitRegistration.cfg` referenced
`INVARIANT Safety` but the spec didn't define a `Safety` operator,
causing TLC to error out:

  Error: The invariant Safety specified in the configuration file
  is not defined in the specification.

The intended composite invariant is documented at the bottom of the
spec via the THEOREM `Spec => [](TypeOK /\ NoRegisterAfterBuild)`.
Defined `Safety` as exactly that conjunction so the cfg directive
resolves without changing TLC's check semantics.

**Local verification:**

  - `java -cp tools/tla/tla2tools.jar tlc2.TLC -workers 4 -config
     tools/tla/specs/CircuitRegistration.cfg
     tools/tla/specs/CircuitRegistration.tla` →
    "Model checking completed. No error has been found."
    3538 distinct states explored, depth 14, <1s wall.

  - `dotnet test --filter "FullyQualifiedName~CircuitRegistration"`
    passes in 1.7s.

**Composes with:**

- B-0180 (the backlog row this closes; one of 3 broken-spec
  follow-ups from #1397's verify-then-claim sweep)
- #1397 (DbspSpec → CI; the partial-B1 closure)
- #1383 math-proofs honest assessment (the matrix this
  incrementally closes — 2 of 4 deferred B1 specs now in CI;
  remaining: B-0179 SpineAsyncProtocol + B-0181 SpineMergeInvariants
  both need counterexample investigation, not config fixes)

**Discipline note:** the original spec author left "Safety:" as a
prose comment label on `NoRegisterAfterBuild` (line 81) and on
`ConnectAtMostOnce` (line 85), implying both were intended as
"safety properties" but never finished the operator definition.
The THEOREM at line 96 disambiguates: only `TypeOK /\
NoRegisterAfterBuild` belongs in INVARIANT (a state property);
`ConnectAtMostOnce` uses `[]_vars` and is a temporal PROPERTY
(deferred — not in this fix's scope).

* fix(tlc-runner): remove brittle line-number reference in CircuitRegistration test comment

Reviewer #1401 caught: the comment said 'per the spec's THEOREM at
line ~96' but the actual line drifts as the spec evolves (currently
line 106 after my Safety operator addition).

Reworded to reference the THEOREM by content rather than line:
'matching the spec's stated THEOREM Spec => [](TypeOK /NoRegisterAfterBuild)'. Doesn't drift, doesn't require maintenance
when the spec is edited.

* fix(tests/tlc + tla/specs): strip historical narrative per copilot-instructions §282

Reviewer #1401 caught: my CircuitRegistration test comments + the
Safety operator's spec comment carried historical narrative (B-0180
ticket ID, #1397 PR ref, 2026-05-03 sweep date, 'Closes 1 of 3
broken sibling specs', 'was missing until B-0180 fixed', etc.) on
current-state surfaces. Per .github/copilot-instructions.md §282:

  *Documentation is current state, not history. Historical
  narrative lives in docs/ROUND-HISTORY.md and ADRs under
  docs/DECISIONS/; everywhere else in docs/ edit in place.*

Same rule applies to spec files + test comments — they describe
what the artifact currently does, not how it got there.

Stripped:
- Test comments now describe ONLY the current invariants checked
  + cfg coverage scope + complementary-coverage pointers (no
  ticket IDs, no dates, no PR refs)
- Spec's Safety operator comment now says only what Safety is
  (composite safety invariant matching the THEOREM) without the
  B-0180 / 2026-05-03 / verify-then-claim history
- Compatible-coverage pointers retained (FsCheck Z-set tests +
  Lean proof) since those are current-state composes-with refs

Build still 0/0; functional behavior unchanged.
AceHack added a commit that referenced this pull request May 3, 2026
…ster + cache-clobber discipline encoded (#1408)

Substantial multi-tick session shard. 18 PRs touched (#1383 + #1387
+ #1392-#1407 inclusive); 14 merged + 4 in-flight as of shard time.

**Math-proofs assessment progress** (#1383 outstanding-work matrix):
- A1+A2 → A-with-CI ✓ (#1394 Lean lake-build workflow)
- A4 registry rows ✓ (#1393)
- B1 → 2 of 4 deferred specs in CI ✓ (#1397 DbspSpec + #1401
  CircuitRegistration B-0180 closed)
- B2 Alloy → A ✓ (#1396 silent-no-op spec-path fix)
- B4 Semgrep → A ✓ (correction)
- Peer-review email template ✓ (#1387)
- Phase 0 substrate-discovery PoC ✓ (#1392)
- Stryker config-fix ✓ (#1395; CI wire deferred)
- 3 broken-spec backlog rows filed ✓ (#1398)

**Cache-clobber silent-bug class discovered + fully encoded:**
B-0180 fix passing locally + failing CI → verify-then-claim
identified gate.yml + low-memory.yml caching whole tools/tla and
tools/alloy directories. Fix cluster: #1403 (gate.yml) + #1404
(low-memory.yml + audit-ci-cache-paths.ts) + #1406 (CI lint gate)
+ #1407 (memory file + bug-locus disambiguation per Aaron's 'real
github bug?' question — answer: usage-bug, not tool-bug).

**Other substrate work:** #1399 BACKLOG.md regen, #1400 .ts/.sh
parity bug, #1402 assessment matrix doc update, #1405 B-0182
backlog row (Linux-only formal verification — orthogonal-axes
split per Aaron 2026-05-03).

**Discipline lessons captured:** chat-is-assertion-channel,
substrate-corrections-cluster, search-first-before-architectural-
expansion, verify-then-claim CI fidelity, documentation-is-
current-state-not-history.

Carved sentence: 'When a lucky catch surfaces a class of bug,
build the structural fix that eliminates the luck — audit + lint
gate + carved-sentence rule + memory file.'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants