Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
159 changes: 155 additions & 4 deletions docs/research/verification-registry.md
Original file line number Diff line number Diff line change
Expand Up @@ -125,14 +125,165 @@ because <one-line>.`

---

## TLA+ specs in CI (round 2026-05-03 cluster)

The 5 TLA+ specs that run in CI via `tests/Tests.FSharp/
TlcRunnerTests.fs` get registry rows so the math-proofs
Comment on lines +130 to +131
honest assessment (`docs/research/2026-05-03-math-proofs-
honest-assessment.md`) can claim A-grade for both the *spec
runs* claim AND the *spec matches source* claim.
Comment on lines +130 to +134
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Register all TLC specs that CI already model-checks

The section asserts there are only 5 TLA+ specs running in CI, but tests/Tests.FSharp/Formal/Tlc.Runner.Tests.fs in the same commit also runs TLC for InfoTheoreticSharder, RecursiveCountingLFP, and FeatureFlagsResolution (plus SmokeCheck). Because those CI-validated specs are omitted from this registry update, the stated closure of the external-fidelity gap is incomplete and drift audits will miss active CI specs.

Useful? React with 👍 / 👎.

Comment on lines +130 to +134

For specs with explicit external-paper citations (TwoPCSink
cites Skeen/Stonebraker 2PC), the registry shape mirrors the
Lean rows above. For internal-correctness specs (Tick
monotonicity, transaction interleaving, lifecycle race), the
"Paper" field becomes "Internal correctness target" naming
the source-code surface the spec models, and the
"Preconditions diff / Definition map" fields are scoped to
spec-vs-implementation alignment rather than spec-vs-paper.

## `TwoPCSink`

- **Artifact.** `tools/tla/specs/TwoPCSink.tla` (TLA+ spec
with `.cfg`; runs in CI via
`tests/Tests.FSharp/TlcRunnerTests.fs`).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Point CI runner references to the real test file

This row links CI execution to tests/Tests.FSharp/TlcRunnerTests.fs, but that file does not exist in this commit; the runner lives at tests/Tests.FSharp/Formal/Tlc.Runner.Tests.fs. Keeping a non-existent path here breaks the audit trail this registry is supposed to provide, because reviewers cannot follow the documented evidence chain from spec rows to the actual CI gate.

Useful? React with 👍 / 👎.

Comment on lines +147 to +149
- **Paper.** Skeen, D. (1981) *Nonblocking Commit
Protocols*; Skeen, D. & Stonebraker, M. (1983) *A Formal
Model of Crash Recovery in a Distributed System* —
classical 2-phase-commit literature.
- **Paper statement (informal).** A 2PC protocol with one
coordinator and N participants, where the coordinator
collects votes, broadcasts decision, and participants
commit-or-rollback per the decision; satisfies safety
(no participant commits while another rolls back) and
liveness (every prepared transaction eventually
commits-or-aborts under fair scheduling).
- **Our statement.** Per-tick DBSP variant where
coordinator = circuit scheduler, participants = `ISink`
instances. Spec verifies four invariants:
`Idempotent` (epoch commits at most once), `AllOrNothing`
(committed → all sinks done-or-pending),
`AbortSafe` (aborted → no sinks done), `NoOrphans`
(every preCommitted tx eventually committed-or-aborted
after checkpoint).
Comment on lines +163 to +168
- **Preconditions diff.** Paper assumes coordinator-
failure recovery via decision log; our spec assumes
scheduler is in-process (no coordinator-failure
recovery scope).
Paper assumes async network with reordering; our spec
assumes per-tick synchronous step (TLA+ models the
interleaving of `Tick` calls but not network delay).
These narrowings are intentional — the DBSP variant
scopes 2PC to per-tick semantics, not full distributed-
recovery.
- **Definition map.**
- Our `txnState` ↔ paper's transaction state ∈ {open,
preparing, committed, aborted}.
- Our `sinkVote` ↔ paper's prepare-vote ∈ {none, yes,
no}.
- Our `sinkCommit` ↔ paper's commit-state ∈ {pending,
done, rolledback}.
- Our `sinkLog` ↔ paper's per-participant durable audit
log (Seq of committed txn IDs).
- Our `coord` ↔ paper's coordinator state-machine ∈
{init, prep, commit, abort, done}.
- **Last audit.** None yet — registered 2026-05-03.
Cadenced re-audit owed under
`verification-drift-auditor` (every 5-10 rounds).

## `TickMonotonicity`

- **Artifact.** `tools/tla/specs/TickMonotonicity.tla`
(TLA+ spec with `.cfg`; runs in CI).
- **Internal correctness target.** `Circuit.tick` field
in `src/Core/Circuit.fs`; specifically the
`Interlocked.Increment(ref _tick)` + `[<VolatileField>]`
combination.
- **Internal correctness claim.** `tick` is monotone non-
decreasing from any observer's perspective; under N
concurrent increments, the final value equals N (no
lost updates); on a 32-bit platform, no torn-long-read
produces a backwards tick.
- **Spec-vs-implementation alignment.** Spec models
`tick` as `Int` with `AcquireStep / AdvanceTick /
ObserveTick / ReleaseStep` actions; implementation
uses `Interlocked.Increment` (atomic) +
`[<VolatileField>]` (read fence). Spec checks
`Monotone` invariant + `MaxIncrements` count match;
implementation enforces same via .NET memory model.
Comment on lines +209 to +213
- **Last audit.** None yet — registered 2026-05-03.

## `TransactionInterleaving`

- **Artifact.** `tools/tla/specs/TransactionInterleaving.tla`
(TLA+ spec with `.cfg`; runs in CI).
- **Internal correctness target.** `TransactionZ1Op.fs`
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Cite the existing Transaction implementation file

This target names TransactionZ1Op.fs, but no such file exists in this commit; the TransactionZ1Op implementation is in src/Core/Transaction.fs. Referencing a non-existent source file makes the spec-to-implementation mapping unverifiable, which undermines the purpose of this internal-correctness row.

Useful? React with 👍 / 👎.

CAS-based semantics — concurrent `Begin` / `Commit` /
`Rollback` calls against `AfterStepAsync`-driven `Tick`.
- **Internal correctness claim.** No torn state snapshot;
`state ≤ pending` reachability holds; under
`autoCommit`, `state = pending` after every `Tick`; no
two concurrent `Commit` calls double-advance.
- **Spec-vs-implementation alignment.** Spec models
`state`, `pending`, `autoCommit`, `inputVal`,
`tickPhase` per-thread; implementation uses
`Interlocked.CompareExchange` for the CAS. Spec checks
the three claimed invariants directly.
Comment on lines +224 to +231
- **Last audit.** None yet — registered 2026-05-03.

## `OperatorLifecycleRace`

- **Artifact.** `tools/tla/specs/OperatorLifecycleRace.tla`
(TLA+ spec with `.cfg`; runs in CI).
- **Internal correctness target.** `Circuit`'s
`Register` / `Build` interleaving — specifically the
`anyAsync` flag soundness invariant.
- **Internal correctness claim.** `anyAsync` equals the
disjunction of async flags across every registered op
at every step. V2 formulation drops the broken
`ScanAsync` action (which read without committing) and
tightens `FlagSound` to its post-condition.
- **Spec-vs-implementation alignment.** Spec models
`ops` as Seq of `[id, async]` records; implementation
registers ops under a conceptual `registerLock` so
`anyAsync = OR(op.async)` holds at every step. Spec
is the V2 (post-bug-fix) formulation.
- **Last audit.** None yet — registered 2026-05-03.

## `SmokeCheck`

- **Artifact.** `tools/tla/specs/SmokeCheck.tla` (trivial
TLA+ module; runs in CI).
- **Internal correctness target.** TLC + tla2tools.jar
toolchain wiring itself, NOT a Zeta source artifact.
- **Internal correctness claim.** A trivial spec with one
variable that increments by 1 up to 3 satisfies the
invariant `x ≤ 3`. Catches "TLC can't even parse a
spec" regressions in the toolchain integration.
- **Spec-vs-implementation alignment.** Not applicable —
this spec exists to validate the harness, not a code
artifact. Treated as a meta-test of the
verification-portfolio infrastructure.
- **Last audit.** None yet — registered 2026-05-03.
Audit cadence may be looser since this is a toolchain
smoke-test, not a fidelity claim.

---

## How to add a new row

1. New verification artifact with an external citation lands.
1. New verification artifact with an external citation OR
internal-correctness spec lands.
2. Author (or the auditor, if unclaimed) drops a row here in
the same round.
3. Fill all seven fields (Artifact, Paper, Paper statement,
Our statement, Preconditions diff, Definition map, Last
audit).
3. For external-citation rows: fill all seven fields
Comment on lines +275 to +279
(Artifact, Paper, Paper statement, Our statement,
Preconditions diff, Definition map, Last audit).
For internal-correctness rows: substitute "Internal
correctness target" for "Paper", "Internal correctness
claim" for "Paper statement", and "Spec-vs-implementation
alignment" for the Preconditions diff + Definition map
pair.
Comment on lines +282 to +286
4. `verification-drift-auditor` re-audits on the next
scheduled cadence.

Expand Down
Loading