From ae6437cea18b4d4bccf09e42aee167c1deb4f528 Mon Sep 17 00:00:00 2001
From: Aaron Stainback <aaron_bond@yahoo.com>
Date: Sat, 18 Apr 2026 14:51:35 -0400
Subject: [PATCH 1/4] =?UTF-8?q?Round=2028=20=E2=80=94=20stateful=20harness?=
 =?UTF-8?q?=20design=20+=20lean4=20scaffolding=20cleanup?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Design doc for the FsCheck law runner captures the DST framing
(single seeded RNG, no wall-clock, total tick order, seed + schedule
printed on failure) and locks the build-vs-test decision: LawRunner
lives as a test-time library, not a Circuit.Build() gate, so Core
stays free of FsCheck and plugin authors opt in from their test
project.

Key design call — Option B (trace-based retraction check against the
existing marker IStatefulStrictOperator) this round, Option A
(Init/Step/Retract triple matching the DBSP paper's σ,λ,ρ shape)
as a planned additive promotion in round-29+. Rationale documented
in full: Option A needs real async + retraction-contract design
work; Option B lands today, teaches us what retraction actually
looks like, Option A absorbs the lessons. Option B remains a
fallback even after A ships.

Long-term payoff of Option A (documented so we don't lose it):
- Matches DbspChainRule.lean's (σ,λ,ρ) triple → compositional
  reasoning across math/proofs/impl
- Generic WDC checkpointing without per-op serialisation
- Planner fusion of adjacent stateful ops

Lean4 scaffolding cleanup — `lake new` leftovers removed:
- tools/lean4/README.md (pure GitHub-Pages setup boilerplate)
- tools/lean4/Lean4/Basic.lean (hello-world sample)
- tools/lean4/.github/workflows/ (upstream Lean CI templates)
- tools/lean4/.gitignore (redundant with root .lake/ ignore)

tools/lean4/Lean4.lean now imports Lean4.DbspChainRule so
`lake build` walks the real proof file; lakefile.toml,
lake-manifest.json, and lean-toolchain kept (load-bearing for
Mathlib resolution).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/research/stateful-harness-design.md      | 278 ++++++++++++++++++
 .../.github/workflows/create-release.yml      |  22 --
 .../.github/workflows/lean_action_ci.yml      |  21 --
 tools/lean4/.github/workflows/update.yml      |  41 ---
 tools/lean4/.gitignore                        |   1 -
 tools/lean4/Lean4.lean                        |   5 +-
 tools/lean4/Lean4/Basic.lean                  |   1 -
 tools/lean4/README.md                         |  13 -
 8 files changed, 282 insertions(+), 100 deletions(-)
 create mode 100644 docs/research/stateful-harness-design.md
 delete mode 100644 tools/lean4/.github/workflows/create-release.yml
 delete mode 100644 tools/lean4/.github/workflows/lean_action_ci.yml
 delete mode 100644 tools/lean4/.github/workflows/update.yml
 delete mode 100644 tools/lean4/.gitignore
 delete mode 100644 tools/lean4/Lean4/Basic.lean
 delete mode 100644 tools/lean4/README.md

diff --git a/docs/research/stateful-harness-design.md b/docs/research/stateful-harness-design.md
new file mode 100644
index 00000000..b053da9a
--- /dev/null
+++ b/docs/research/stateful-harness-design.md
@@ -0,0 +1,278 @@
+# Stateful plugin harness — design draft
+
+**Round:** 28
+**Status:** draft (design only; no code landed)
+**Scope:** design the test-time harness that verifies
+`IStatefulStrictOperator<'TIn,'TState,'TOut>` plugins against
+the `RetractionCompletenessLaw` under deterministic simulation.
+
+## Framing: deterministic simulation
+
+The whole point of a plugin-law harness is **deterministic
+simulation testing** in the FoundationDB / TigerBeetle sense:
+given a seed and an input schedule, the harness replays
+bit-exact and a failing trace is reducible to its seed. That
+constrains the design up front:
+
+1. **No wall-clock time.** Ticks are integers; the plugin
+   never observes `DateTime.UtcNow` via the harness.
+2. **All randomness comes from one seeded `System.Random`**
+   (or an FsCheck `StdGen`). A plugin that reaches for a new
+   `Random()` defeats DST and is a bug — we lint for this
+   but cannot statically prevent it.
+3. **Tick order is total.** Within a tick, input is fed,
+   `StepAsync` runs to completion, output is captured,
+   `AfterStepAsync` runs (for strict operators), and the
+   next tick proceeds. No concurrent ticks in the harness.
+4. **A failing run produces a replay.** On law violation we
+   print the seed, the input schedule as a list, and the
+   tick index at which the law first broke. Copy-paste into
+   the next test run to reproduce.
+
+DST is the lens for *all* the law runners, not just the
+stateful one. But stateful-strict is where the lens earns
+its keep — non-determinism in a stateful op can mask a
+retraction bug for thousands of ticks before failing.
+
+## What `IStatefulStrictOperator` promises
+
+Quoting `src/Core/PluginApi.fs` for the tag:
+
+> the operator carries explicit stateful strict semantics —
+> init / step / retract triple. Distinct from
+> `IStrictOperator` (feedback-cut): stateful-strict ops hold
+> per-key or per-instance state that must retract cleanly
+> when a negative weight arrives.
+
+The marker interface today is opaque — a plugin just
+inherits `IOperator<'TOut>` under the tag, there is no
+exposed `init` / `step` / `retract` triple on the interface
+itself. That is a **design gap**: the law cannot fire
+against an opaque op. Two options:
+
+### Option A — enrich the interface
+
+Add abstract methods on `IStatefulStrictOperator`:
+
+```fsharp
+type IStatefulStrictOperator<'TIn, 'TState, 'TOut> =
+    inherit IOperator<'TOut>
+    abstract InitState : unit -> 'TState
+    abstract Step : state: 'TState * input: 'TIn -> 'TState * 'TOut
+    abstract Retract : state: 'TState * input: 'TIn -> 'TState
+```
+
+**Pros:** law runner has direct access to the state
+transitions, can check `Retract ∘ Step = id` at every tick,
+inspect state for invariants, drive FsCheck against `Step`
+and `Retract` directly without going through `StepAsync`.
+
+**Cons:** second authoring path — plugin authors now write
+`Step` and `Retract` *and* wire `StepAsync` to call them.
+The two can drift. Unless `StepAsync` has a default
+implementation that calls `Step` — which forces `'TState`
+into a mutable field on the op, which is exactly the
+shared-state pattern Ilyana pushed back on in round-27.
+
+### Option B — trace the op through `StepAsync`
+
+Keep the marker-interface-only shape. The harness feeds
+inputs tick-by-tick through `PluginHarness.runSingleInput`.
+To check retraction completeness, the harness runs *two*
+traces and compares:
+
+1. **Forward trace:** feed a sequence `A = [a₁; a₂; ...; aₙ]`
+   and record outputs `[y₁; y₂; ...; yₙ]`.
+2. **Round-trip trace:** feed `A ++ retract(A)` where
+   `retract(A)` is the negation of each input as a Z-set
+   (weights flipped to their negatives). Record outputs
+   `[y₁; ...; yₙ; y'₁; ...; y'ₙ]`.
+
+The **law**: at the end of the round-trip, the cumulative
+output Z-set must be zero — i.e. summing `y_i` across all
+`2n` ticks is the zero Z-set. If any term remains, the op
+leaked state through retraction.
+
+**Pros:** no interface change; works against the existing
+marker-only tag; authoring path stays narrow (one
+`StepAsync` method).
+
+**Cons:** only observable output is checked, not internal
+state. A plugin that retracts cleanly at the boundary but
+leaks memory internally (keeps dead state entries forever)
+is not caught. That is a separate property — "bounded
+state-growth" — and can be a `Watch`-class law under DST
+by running for a fixed number of ticks and asserting state
+byte-size stays bounded. Covered by a follow-up law, not
+`RetractionCompletenessLaw`.
+
+### Recommendation: Option B now, Option A as planned round-29+ work
+
+**Option A is the right long-term direction.** It matches the
+DBSP paper's canonical `(σ, λ, ρ)` triple — the same shape
+`tools/lean4/Lean4/DbspChainRule.lean` proves against — so
+math, Lean proofs, and F# implementation share one shape and
+compositional reasoning travels across them. Option A also
+unlocks generic WDC checkpointing (pickle `'TState` without
+per-op serialisation), planner fusion of adjacent stateful
+ops (reuse state buffers, batch step+retract), and sharper
+law-failure diagnostics ("state diverged by δ" vs Option B's
+"cumulative Z-set wasn't zero").
+
+**Option B is the right first step.** Option A needs real
+design work on (1) the async path — `Step : σ * 'TIn * ct
+-> ValueTask<σ * 'TOut>` duplicates the sync signature —
+and (2) a precise retraction contract (is `Retract(Step(s,
+a)) = s` always, or only for linear `a`? Partial retraction
+semantics?). Prototyping Option A in a vacuum, without
+having run Option B against real plugins, ships a contract
+we would almost certainly need to revise. Option B lands
+the law runner today against the existing marker interface,
+teaches us what retraction actually looks like in practice,
+and Option A absorbs those lessons.
+
+**Path forward:**
+
+1. **This round (28):** Option B trace-based
+   `checkRetractionCompleteness`. Marker interface stays as
+   it is.
+2. **Round 29+ — Option A promotion.** Additive interface
+   enrichment: `Init` / `Step` / `Retract` become abstract
+   methods on `IStatefulStrictOperator`; `StepAsync`
+   default-implements by calling `Step` so existing plugins
+   keep compiling. The law runner uses the new hooks when
+   present, falls back to Option B trace for plugins that
+   stay marker-only. Same capability-tagging shape the
+   plugin API already uses.
+3. **Round 29+ — generic checkpointing.** Once `'TState` is
+   in the type signature, WDC can pickle it via FsPickler
+   (schema-bound) without per-op `Save`/`Load` code.
+4. **Round 30+ — planner fusion.** With `Step`/`Retract`
+   exposed, the planner can fuse adjacent stateful ops
+   (reuse state buffers, batch step+retract). Out of scope
+   until we have data on which op pairs dominate hot paths.
+
+Option B is not throwaway — it remains the fallback for
+plugins that genuinely cannot expose `Step`/`Retract`
+(black-box ML wrappers, third-party system integrations).
+
+## API sketch
+
+```fsharp
+namespace Zeta.Core
+
+module LawRunner =
+
+    /// Deterministic-simulation result. `None` on pass;
+    /// `Some` carries enough to reproduce.
+    type LawViolation =
+        { Seed: int
+          Inputs: obj list            // boxed for diagnostic printing
+          TickOfFirstBreak: int
+          Message: string }
+
+    /// Linearity: `op(a + b) = op(a) + op(b)` and `op(0) = 0`.
+    /// Not stateful — single-tick check per sample.
+    val checkLinear<'TIn, 'TOut when 'TIn : (static member (+) : 'TIn * 'TIn -> 'TIn)
+                                  and 'TOut : (static member (+) : 'TOut * 'TOut -> 'TOut)
+                                  and 'TOut : equality>
+        : seed: int
+        -> samples: int
+        -> makeOp: (Stream<'TIn> -> ILinearOperator<'TIn, 'TOut>)
+        -> genInput: (System.Random -> 'TIn)
+        -> zeroIn: 'TIn
+        -> zeroOut: 'TOut
+        -> Result<unit, LawViolation>
+
+    /// Retraction completeness (stateful): forward-then-retract
+    /// a schedule, assert cumulative output is zero. Option-B
+    /// trace approach; no interface enrichment required.
+    val checkRetractionCompleteness<'TIn, 'TState, 'TOut>
+        : seed: int
+        -> samples: int
+        -> scheduleLength: int
+        -> makeOp: (Stream<ZSet<'TIn>> ->
+                    IStatefulStrictOperator<ZSet<'TIn>, 'TState, ZSet<'TOut>>)
+        -> genInput: (System.Random -> ZSet<'TIn>)
+        -> Result<unit, LawViolation>
+```
+
+Notes on the signature:
+
+- **Generators are `System.Random -> 'T`, not FsCheck `Gen<'T>`.**
+  Keeps `Zeta.Core` free of FsCheck. Plugin authors wire
+  FsCheck in their test project and hand the law runner a
+  thin `Random -> 'T` adapter.
+- **Seed is explicit.** No hidden clock-seeding. A failing
+  run prints the seed; the author pastes it back to
+  reproduce.
+- **`makeOp` is the factory, same shape as
+  `PluginHarness.runSingleInput`.** This means every
+  stateful operator already unit-testable via the harness
+  is law-runnable with no rewrite.
+- **`scheduleLength` bounds the trace.** DST prefers a
+  generous upper bound (1000 ticks is cheap) so state
+  churn has time to expose leaks; small (say 10) bounds
+  produce false-green runs that make the law useless.
+
+## Where it does not fire
+
+**Not at `Circuit.Build()`.** See `docs/PLUGIN-AUTHOR.md`
+round-28 rewrite — the law runner is a library consumer
+opts-in mechanism, not a gate the scheduler enforces. A
+`Circuit.Build()` gate would pay a probabilistic-testing
+cost on every construction, slow circuits measurably, and
+couple `Zeta.Core` to FsCheck. Plugin authors call
+`LawRunner.*` from their test project instead.
+
+The reason for the choice is **not** that Build-time
+verification is wrong in principle — it is that the
+cheapest reliable form is a test-time check driven by a
+DST harness. A future `DebugCircuit.BuildWithLawChecks`
+could exist as a slow-path gate for CI-only smoke runs, but
+that is round-29+ work and out of scope here.
+
+## Open design questions
+
+1. **Z-set negation helper.** The Option B trace approach
+   needs `ZSet<'T>` negation (`-w` on every weight). Is
+   there a canonical `ZSet.negate` already, or does the law
+   runner synthesize it? (Core grep should answer this
+   before implementation.)
+2. **Schedule shape.** Uniform random Z-sets per tick vs
+   structured schedules (e.g. "insert 100, retract 50,
+   insert 100, retract all"). DST orthodoxy favours
+   structured schedules over pure uniform for shrinking
+   quality — FsCheck's shrinker does not understand the
+   semantic of a retraction trace.
+3. **Async ops.** Does the stateful harness need to run
+   `IAsyncOperator`-tagged stateful ops under a controlled
+   scheduler (cooperative `Task.Yield` interleaving)? DST
+   orthodoxy says yes — but the first landing should be
+   sync-only and flag async as a follow-up.
+4. **Per-key vs per-instance state.** "Stateful-strict"
+   currently covers both. The law holds for per-instance
+   state trivially; per-key state needs the retract-step
+   pair to fire on the matching key. The current trace-
+   based law treats the op as opaque, so it works for both
+   — but means the diagnostic print on a failing per-key
+   op will be less actionable. A future option-A
+   enrichment would split these cleanly.
+
+## Next steps
+
+- Land `checkLinear` in `src/Core/LawRunner.fs` this round.
+  Easiest law, cleanest proof-of-concept. One test in
+  `tests/Tests.FSharp/Plugin/LawRunner.Tests.fs` against a
+  trivial `id * 2` linear op.
+- Land `checkRetractionCompleteness` this round only if
+  Option B stays clean end-to-end; otherwise ship the
+  design doc and the Linear check, defer stateful to
+  round-29. **Explicit acceptance of a partial landing is
+  better than a rushed stateful-harness implementation
+  that lies about coverage.**
+- Retract the `IStatefulStrictOperator` soft-claim in
+  `docs/PLUGIN-AUTHOR.md` only when `checkRetractionCompleteness`
+  is actually live. A partial landing updates the doc to
+  "Linear law live; stateful-strict law in round-29"
+  — honest rather than aspirational.
diff --git a/tools/lean4/.github/workflows/create-release.yml b/tools/lean4/.github/workflows/create-release.yml
deleted file mode 100644
index 6dacf77c..00000000
--- a/tools/lean4/.github/workflows/create-release.yml
+++ /dev/null
@@ -1,22 +0,0 @@
-name: Create Release
-
-on:
-  push:
-    branches:
-      - 'main'
-      - 'master'
-    paths:
-      - 'lean-toolchain'
-
-jobs:
-  lean-release-tag:
-    name: Add Lean release tag
-    runs-on: ubuntu-latest
-    permissions:
-      contents: write
-    steps:
-    - name: lean-release-tag action
-      uses: leanprover-community/lean-release-tag@v1
-      with:
-        do-release: true
-        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
diff --git a/tools/lean4/.github/workflows/lean_action_ci.yml b/tools/lean4/.github/workflows/lean_action_ci.yml
deleted file mode 100644
index db09247d..00000000
--- a/tools/lean4/.github/workflows/lean_action_ci.yml
+++ /dev/null
@@ -1,21 +0,0 @@
-name: Lean Action CI
-
-on:
-  push:
-  pull_request:
-  workflow_dispatch:
-
-# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
-permissions:
-  contents: read # Read access to repository contents
-  pages: write # Write access to GitHub Pages
-  id-token: write # Write access to ID tokens
-
-jobs:
-  build:
-    runs-on: ubuntu-latest
-
-    steps:
-      - uses: actions/checkout@v5
-      - uses: leanprover/lean-action@v1
-      - uses: leanprover-community/docgen-action@v1
diff --git a/tools/lean4/.github/workflows/update.yml b/tools/lean4/.github/workflows/update.yml
deleted file mode 100644
index 96b7622a..00000000
--- a/tools/lean4/.github/workflows/update.yml
+++ /dev/null
@@ -1,41 +0,0 @@
-name: Update Dependencies
-
-on:
-  # schedule:             # Sets a schedule to trigger the workflow
-  #   - cron: "0 8 * * *" # Every day at 08:00 AM UTC (see https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#schedule)
-  workflow_dispatch:    # Allows the workflow to be triggered manually via the GitHub interface
-
-jobs:
-  check-for-updates: # Determines which updates to apply.
-    runs-on: ubuntu-latest
-    outputs:
-      is-update-available: ${{ steps.check-for-updates.outputs.is-update-available }}
-      new-tags: ${{ steps.check-for-updates.outputs.new-tags }}
-    steps:
-      - name: Run the action
-        id: check-for-updates
-        uses: leanprover-community/mathlib-update-action@v1
-        # START CONFIGURATION BLOCK 1
-        # END CONFIGURATION BLOCK 1
-  do-update: # Runs the upgrade, tests it, and makes a PR/issue/commit.
-    runs-on: ubuntu-latest
-    permissions:
-      contents: write      # Grants permission to push changes to the repository
-      issues: write        # Grants permission to create or update issues
-      pull-requests: write # Grants permission to create or update pull requests
-    needs: check-for-updates
-    if: ${{ needs.check-for-updates.outputs.is-update-available == 'true' }}
-    strategy: # Runs for each update discovered by the `check-for-updates` job.
-      max-parallel: 1 # Ensures that the PRs/issues are created in order.
-      matrix:
-        tag: ${{ fromJSON(needs.check-for-updates.outputs.new-tags) }}
-    steps:
-      - name: Run the action
-        id: update-the-repo
-        uses: leanprover-community/mathlib-update-action/do-update@v1
-        with:
-          tag: ${{ matrix.tag }}
-          # START CONFIGURATION BLOCK 2
-          on_update_succeeds: pr # Create a pull request if the update succeeds
-          on_update_fails: issue # Create an issue if the update fails
-          # END CONFIGURATION BLOCK 2
diff --git a/tools/lean4/.gitignore b/tools/lean4/.gitignore
deleted file mode 100644
index bfb30ec8..00000000
--- a/tools/lean4/.gitignore
+++ /dev/null
@@ -1 +0,0 @@
-/.lake
diff --git a/tools/lean4/Lean4.lean b/tools/lean4/Lean4.lean
index a1e608b3..9c4f9ba5 100644
--- a/tools/lean4/Lean4.lean
+++ b/tools/lean4/Lean4.lean
@@ -1 +1,4 @@
-import Lean4.Basic
+-- Library root for the `Lean4` lean_lib declared in
+-- `lakefile.toml`. Imports the machine-checked proof files so
+-- `lake build` walks them transitively.
+import Lean4.DbspChainRule
diff --git a/tools/lean4/Lean4/Basic.lean b/tools/lean4/Lean4/Basic.lean
deleted file mode 100644
index 99415d9d..00000000
--- a/tools/lean4/Lean4/Basic.lean
+++ /dev/null
@@ -1 +0,0 @@
-def hello := "world"
diff --git a/tools/lean4/README.md b/tools/lean4/README.md
deleted file mode 100644
index 83c247e0..00000000
--- a/tools/lean4/README.md
+++ /dev/null
@@ -1,13 +0,0 @@
-# lean4
-
-## GitHub configuration
-
-To set up your new GitHub repository, follow these steps:
-
-* Under your repository name, click **Settings**.
-* In the **Actions** section of the sidebar, click "General".
-* Check the box **Allow GitHub Actions to create and approve pull requests**.
-* Click the **Pages** section of the settings sidebar.
-* In the **Source** dropdown menu, select "GitHub Actions".
-
-After following the steps above, you can remove this section from the README file.

From 07c3c00d22a08b9637ed68242da42cfba6d2b4a8 Mon Sep 17 00:00:00 2001
From: Aaron Stainback <aaron_bond@yahoo.com>
Date: Sat, 18 Apr 2026 14:56:15 -0400
Subject: [PATCH 2/4] =?UTF-8?q?Round=2028=20=E2=80=94=20LawRunner:=20Linea?=
 =?UTF-8?q?r=20+=20retraction-completeness=20laws=20live?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

First concrete deliverable for the round-28 anchor. `LawRunner`
lives in `src/Core/LawRunner.fs` as a test-time library (not a
Circuit.Build() gate — see design doc for build-vs-test rationale).
Generators are `System.Random -> 'T` so Core stays free of FsCheck;
plugin authors wire FsCheck from their test project.

Both checks are deterministic-simulation: a failing run prints the
seed and sample index so plugin authors reproduce bit-exact.

- `checkLinear` — generates trace pairs (A, B) and asserts
  `op(A + B) = op(A) + op(B)` tick-by-tick. `addIn` / `addOut` /
  `equalOut` are parameters so it works for ZSet, numerics, any
  additive carrier.
- `checkRetractionCompleteness` — Option B trace-based: forward-
  runs a random Z-set trace, retracts each tick in the same order,
  asserts the cumulative output Z-set is empty. Catches operators
  that leak state through retraction without any interface
  enrichment (Option A promotion planned for round-29+).

Tests in `tests/Tests.FSharp/Plugin/LawRunner.Tests.fs` — 5 tests,
63ms:
- genuine linear op passes
- falsely-tagged squarer (non-linear) is caught
- same seed reproduces bit-exact
- clean-retracting ZSet echo passes
- positive-only liar is caught

`docs/PLUGIN-AUTHOR.md` soft-claim retracted — the doc now
documents `LawRunner.checkLinear` / `checkRetractionCompleteness`
as live, with `checkBilinear` / `checkSinkTerminal` flagged as
round-29+.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/PLUGIN-AUTHOR.md                        |  81 +++++++--
 src/Core/Core.fsproj                         |   1 +
 src/Core/LawRunner.fs                        | 143 ++++++++++++++++
 tests/Tests.FSharp/Plugin/LawRunner.Tests.fs | 170 +++++++++++++++++++
 tests/Tests.FSharp/Tests.FSharp.fsproj       |   1 +
 5 files changed, 379 insertions(+), 17 deletions(-)
 create mode 100644 src/Core/LawRunner.fs
 create mode 100644 tests/Tests.FSharp/Plugin/LawRunner.Tests.fs

diff --git a/docs/PLUGIN-AUTHOR.md b/docs/PLUGIN-AUTHOR.md
index 34f51762..724b0363 100644
--- a/docs/PLUGIN-AUTHOR.md
+++ b/docs/PLUGIN-AUTHOR.md
@@ -43,10 +43,13 @@ only when your operator genuinely needs them.
 
 Pick the **strongest** algebra tag that honestly describes
 your operator. Lying (claiming Linear when you are not) is
-detected by Zeta's algebra layer; the full law-verification
-suite runs at `Circuit.Build()` once the FsCheck generators
-for each tag are implemented — see **Known limits of
-round-27** at the bottom of this doc for current coverage.
+caught by `Zeta.Core.LawRunner` — a deterministic-simulation
+law runner that plugin authors call from their test project.
+Run `LawRunner.checkLinear` against an `ILinearOperator` and
+`LawRunner.checkRetractionCompleteness` against an
+`IStatefulStrictOperator`; both take a seed so failures
+reproduce bit-exact. See **Verifying your plugin's
+algebra tag** below.
 
 ## The canonical example
 
@@ -131,6 +134,51 @@ let outputs =
 Use it in unit tests. `Tests.FSharp/Plugin/*.Tests.fs` has
 more examples.
 
+## Verifying your plugin's algebra tag
+
+`Zeta.Core.LawRunner` is the test-time harness that verifies
+your algebra tag is honest. It runs under deterministic
+simulation: given a seed and a schedule length, a failing
+run prints enough to reproduce bit-exact.
+
+```fsharp
+open Zeta.Core
+
+let result =
+    LawRunner.checkLinear
+        42      // seed — reproduces the exact trace on failure
+        20      // samples (a, b) trace pairs
+        8       // ticks per trace
+        (fun input -> MyLinearOp input :> IOperator<int>)
+        (fun rng -> rng.Next(-100, 101))   // genInput
+        (+)                                 // addIn
+        (+)                                 // addOut
+        (=)                                 // equalOut
+
+match result with
+| Ok () -> ()
+| Error v -> failwithf "Linearity broke: %A" v
+```
+
+`checkRetractionCompleteness` works against
+`IStatefulStrictOperator<ZSet<'TIn>, _, ZSet<'TOut>>`:
+it forward-runs a random Z-set trace, retracts each tick
+in the same order, and asserts the cumulative output Z-set
+is empty. Any residual means the operator leaked state
+through retraction.
+
+```fsharp
+let result =
+    LawRunner.checkRetractionCompleteness
+        7 15 6
+        (fun input -> MyStatefulOp input :> IOperator<ZSet<int>>)
+        (fun rng -> ZSet.ofSeq [ rng.Next(0,5), int64 (rng.Next(-3,4)) ])
+```
+
+Bilinear and sink-terminal law runners are round-29+ work —
+see `docs/research/stateful-harness-design.md` for the
+sequenced plan.
+
 ## What you cannot do
 
 - **You cannot mutate another operator's output.** The
@@ -150,20 +198,19 @@ more examples.
   cleanly, declare `ISinkOperator` — which tells Zeta to
   keep you at terminal edges only.
 
-## Known limits of round-27
+## Known limits
 
 Safe-to-ship caveats — each is tracked in `docs/DEBT.md`
 with a target round.
 
-- **FsCheck algebra-law verification is not yet wired at
-  `Circuit.Build()`.** `ILinearOperator` / `IBilinearOperator`
-  / `IStatefulStrictOperator` / `ISinkOperator` are marker
-  interfaces today; the scheduler does not yet run the
-  `LinearLaw`, `BilinearLaw`, `RetractionCompletenessLaw`,
-  or `SinkTerminalLaw` generators against tagged ops. A
-  wrongly-tagged op compiles and runs; the runtime does not
-  catch it. **Fine for internal plugins; hold off publishing
-  to NuGet until law coverage lands.**
+- **Law coverage.** `LawRunner.checkLinear` and
+  `LawRunner.checkRetractionCompleteness` (Option B, trace-
+  based) are live. `checkBilinear` and `checkSinkTerminal`
+  are round-29+ work. An Option-A promotion of
+  `IStatefulStrictOperator` to an explicit `Init` / `Step` /
+  `Retract` triple (matching the DBSP paper's `(σ, λ, ρ)`
+  shape) is also scheduled — see
+  `docs/research/stateful-harness-design.md`.
 - **`PluginHarness.runSingleInput` asserts exactly-one-
   `Publish`-per-tick**, but the production `Circuit` path
   currently does not. An op that forgets to call `Publish`
@@ -174,9 +221,9 @@ with a target round.
   undefined behaviour.** Don't stash the buffer in a field
   or pass it to another thread to call `Publish` later. A
   tick-stamped guard is tracked as DEBT.
-- **`dotnet new zeta-plugin` scaffolding template** is a
-  round-28+ deliverable. Until it exists, start from the
-  Bayesian example in `src/Bayesian/BayesianAggregate.fs`.
+- **`dotnet new zeta-plugin` scaffolding template** is
+  still pending. Until it exists, start from the Bayesian
+  example in `src/Bayesian/BayesianAggregate.fs`.
 - **Multi-input operators** beyond one-input and two-input
   shapes need a custom `IOperator<'TOut>` with a longer
   `ReadDependencies` array; there's no generic N-input
diff --git a/src/Core/Core.fsproj b/src/Core/Core.fsproj
index a70b0e9d..8c0c5761 100644
--- a/src/Core/Core.fsproj
+++ b/src/Core/Core.fsproj
@@ -19,6 +19,7 @@
     <Compile Include="Circuit.fs" />
     <Compile Include="PluginApi.fs" />
     <Compile Include="PluginHarness.fs" />
+    <Compile Include="LawRunner.fs" />
     <Compile Include="Primitive.fs" />
     <Compile Include="Operators.fs" />
     <Compile Include="Handles.fs" />
diff --git a/src/Core/LawRunner.fs b/src/Core/LawRunner.fs
new file mode 100644
index 00000000..3b57fdf5
--- /dev/null
+++ b/src/Core/LawRunner.fs
@@ -0,0 +1,143 @@
+namespace Zeta.Core
+
+
+/// Deterministic-simulation law runner for plugin operators.
+/// Design and rationale live in `docs/research/stateful-harness-design.md`.
+///
+/// Plugin authors call these checks from their test project —
+/// LawRunner is a test-time library, not a `Circuit.Build()` gate.
+/// That keeps `Zeta.Core` free of FsCheck; generators are plain
+/// `System.Random -> 'T` so authors wire FsCheck (or any other
+/// generator) at their end.
+///
+/// All checks are **deterministic** given the same seed. A failing
+/// run reports the seed and the sample index; re-running with the
+/// same seed and `samples >= sampleIndex + 1` reproduces bit-exact.
+[<RequireQualifiedAccess>]
+module LawRunner =
+
+    /// Reported when a law fails. `Seed` + `SampleIndex` are
+    /// enough to reproduce; `Message` describes the specific
+    /// failure mode.
+    type LawViolation =
+        { Seed: int
+          SampleIndex: int
+          Message: string }
+
+    /// Drive a factory through one fresh op instance per sample
+    /// so state cannot leak across samples. Each trace runs on
+    /// its own `PluginHarness.runSingleInput` invocation which
+    /// also asserts exactly-one-`Publish`-per-tick.
+    let private runTrace<'TIn, 'TOut>
+        (makeOp: Stream<'TIn> -> IOperator<'TOut>)
+        (inputs: 'TIn list)
+        : 'TOut list =
+        PluginHarness.runSingleInput makeOp inputs
+
+    let private generateTrace<'TIn>
+        (rng: System.Random)
+        (length: int)
+        (genInput: System.Random -> 'TIn)
+        : 'TIn list =
+        [ for _ in 1 .. length -> genInput rng ]
+
+    /// Linearity: for every trace pair `(A, B)`, the output
+    /// trace on `A + B` (elementwise) must equal the elementwise
+    /// sum of the output traces on `A` and `B`.
+    ///
+    /// - `samples` — number of (A, B) pairs to test.
+    /// - `scheduleLength` — ticks per trace. DST prefers a
+    ///   generous bound (tens to low hundreds) so stateful
+    ///   linear ops (e.g. integration) have time to expose
+    ///   drift.
+    /// - `addIn` / `addOut` / `equalOut` — user-supplied so
+    ///   this works for both `ZSet<'T>` (via `ZSet.add`) and
+    ///   plain numeric types. Equality is explicit for the same
+    ///   reason — `ZSet<'T>` structural equality is not always
+    ///   cheap, so the caller chooses.
+    let checkLinear<'TIn, 'TOut>
+        (seed: int)
+        (samples: int)
+        (scheduleLength: int)
+        (makeOp: Stream<'TIn> -> IOperator<'TOut>)
+        (genInput: System.Random -> 'TIn)
+        (addIn: 'TIn -> 'TIn -> 'TIn)
+        (addOut: 'TOut -> 'TOut -> 'TOut)
+        (equalOut: 'TOut -> 'TOut -> bool)
+        : Result<unit, LawViolation> =
+        if samples < 1 then invalidArg "samples" "must be >= 1"
+        if scheduleLength < 1 then invalidArg "scheduleLength" "must be >= 1"
+        let rng = System.Random(seed)
+        let mutable failure : LawViolation option = None
+        let mutable i = 0
+        while failure.IsNone && i < samples do
+            let traceA = generateTrace rng scheduleLength genInput
+            let traceB = generateTrace rng scheduleLength genInput
+            let traceSum = List.map2 addIn traceA traceB
+            let outA = runTrace makeOp traceA
+            let outB = runTrace makeOp traceB
+            let outSum = runTrace makeOp traceSum
+            // Compare elementwise: op(a_t + b_t) == op(a_t) + op(b_t)
+            // at every tick t. Short-circuit on first divergence so
+            // the reported message points at the offending tick.
+            let mutable tick = 0
+            let mutable broke = false
+            while not broke && tick < scheduleLength do
+                let lhs = outSum.[tick]
+                let rhs = addOut outA.[tick] outB.[tick]
+                if not (equalOut lhs rhs) then
+                    failure <-
+                        Some { Seed = seed
+                               SampleIndex = i
+                               Message =
+                                 sprintf
+                                     "Linearity broke at sample %d, tick %d: \
+                                      op(a+b) ≠ op(a) + op(b)."
+                                     i tick }
+                    broke <- true
+                tick <- tick + 1
+            i <- i + 1
+        match failure with
+        | Some v -> Error v
+        | None -> Ok ()
+
+    /// Retraction completeness (Option B, trace-based).
+    /// Forward-run a generated Z-set trace, then retract each
+    /// tick's input in the same order. After the round-trip the
+    /// cumulative output Z-set must be empty — any residual
+    /// means the operator leaked state through retraction.
+    ///
+    /// Works against the existing marker `IStatefulStrictOperator`
+    /// without interface enrichment (see the design doc for the
+    /// Option A follow-up path).
+    let checkRetractionCompleteness<'TIn, 'TOut when 'TIn : comparison and 'TOut : comparison>
+        (seed: int)
+        (samples: int)
+        (scheduleLength: int)
+        (makeOp: Stream<ZSet<'TIn>> -> IOperator<ZSet<'TOut>>)
+        (genInput: System.Random -> ZSet<'TIn>)
+        : Result<unit, LawViolation> =
+        if samples < 1 then invalidArg "samples" "must be >= 1"
+        if scheduleLength < 1 then invalidArg "scheduleLength" "must be >= 1"
+        let rng = System.Random(seed)
+        let mutable failure : LawViolation option = None
+        let mutable i = 0
+        while failure.IsNone && i < samples do
+            let forward = generateTrace rng scheduleLength genInput
+            let retract = forward |> List.map ZSet.neg
+            let outputs = runTrace makeOp (forward @ retract)
+            let cumulative = outputs |> List.fold ZSet.add ZSet.empty
+            if not cumulative.IsEmpty then
+                failure <-
+                    Some { Seed = seed
+                           SampleIndex = i
+                           Message =
+                             sprintf
+                                 "Retraction incomplete at sample %d: after \
+                                  forward+retract round-trip of %d ticks, \
+                                  cumulative output has %d residual entries."
+                                 i scheduleLength cumulative.Count }
+            i <- i + 1
+        match failure with
+        | Some v -> Error v
+        | None -> Ok ()
diff --git a/tests/Tests.FSharp/Plugin/LawRunner.Tests.fs b/tests/Tests.FSharp/Plugin/LawRunner.Tests.fs
new file mode 100644
index 00000000..4454de58
--- /dev/null
+++ b/tests/Tests.FSharp/Plugin/LawRunner.Tests.fs
@@ -0,0 +1,170 @@
+module Zeta.Tests.Plugin.LawRunnerTests
+
+open System.Threading.Tasks
+open Xunit
+open FsUnit.Xunit
+open Zeta.Core
+
+
+// ────────────────────────────────────────────────────────────────
+// Linearity fixtures
+// ────────────────────────────────────────────────────────────────
+
+/// Linear plugin — doubling is linear over integers.
+type private DoublerOp(input: Stream<int>) =
+    let deps = [| input.AsDependency() |]
+    interface ILinearOperator<int, int> with
+        member _.Name = "doubler"
+        member _.ReadDependencies = deps
+        member _.StepAsync(out, _ct) =
+            out.Publish (input.Current * 2)
+            ValueTask.CompletedTask
+
+
+/// **Not** linear — squaring breaks `op(a+b) = op(a)+op(b)`.
+/// Falsely tagged `ILinearOperator` so the law runner catches
+/// the lie.
+type private SquarerOp(input: Stream<int>) =
+    let deps = [| input.AsDependency() |]
+    interface ILinearOperator<int, int> with
+        member _.Name = "squarer-liar"
+        member _.ReadDependencies = deps
+        member _.StepAsync(out, _ct) =
+            let v = input.Current
+            out.Publish (v * v)
+            ValueTask.CompletedTask
+
+
+// ────────────────────────────────────────────────────────────────
+// Retraction-completeness fixtures (ZSet<int> → ZSet<int>)
+// ────────────────────────────────────────────────────────────────
+
+/// Clean retraction — pure passthrough. Each tick echoes the
+/// input Z-set; forward+retract round-trip sums to empty.
+type private ZSetEchoOp(input: Stream<ZSet<int>>) =
+    let deps = [| input.AsDependency() |]
+    interface IStatefulStrictOperator<ZSet<int>, unit, ZSet<int>> with
+        member _.Name = "zset-echo"
+        member _.ReadDependencies = deps
+        member _.StepAsync(out, _ct) =
+            out.Publish input.Current
+            ValueTask.CompletedTask
+
+
+/// Retraction-lossy — drops negative-weight entries. Under a
+/// forward+retract round-trip the retracted entries never make
+/// it through, so cumulative output ≠ empty. Falsely tagged
+/// `IStatefulStrictOperator` so the law runner catches the lie.
+type private PositiveOnlyOp(input: Stream<ZSet<int>>) =
+    let deps = [| input.AsDependency() |]
+    interface IStatefulStrictOperator<ZSet<int>, unit, ZSet<int>> with
+        member _.Name = "positive-only-liar"
+        member _.ReadDependencies = deps
+        member _.StepAsync(out, _ct) =
+            let positives =
+                input.Current.AsSpan().ToArray()
+                |> Array.filter (fun e -> e.Weight > 0L)
+                |> Array.map (fun e -> e.Key, e.Weight)
+            out.Publish (ZSet.ofSeq positives)
+            ValueTask.CompletedTask
+
+
+// ────────────────────────────────────────────────────────────────
+// Generators
+// ────────────────────────────────────────────────────────────────
+
+let private genInt (rng: System.Random) : int = rng.Next(-100, 101)
+
+let private genZSet (rng: System.Random) : ZSet<int> =
+    let n = rng.Next(0, 4)
+    [ for _ in 1 .. n ->
+        let k = rng.Next(0, 5)
+        // Weight in [-3, 3] \ {0} — both polarities appear.
+        let mutable w = rng.Next(-3, 4)
+        if w = 0 then w <- 1
+        (k, int64 w) ]
+    |> ZSet.ofSeq
+
+
+// ────────────────────────────────────────────────────────────────
+// Linearity
+// ────────────────────────────────────────────────────────────────
+
+[<Fact>]
+let ``checkLinear passes on a genuine linear op`` () =
+    let result =
+        LawRunner.checkLinear
+            42      // seed
+            20      // samples
+            8       // scheduleLength
+            (fun s -> DoublerOp s :> IOperator<int>)
+            genInt
+            (+)
+            (+)
+            (=)
+    match result with
+    | Ok () -> ()
+    | Error v -> Assert.Fail (sprintf "expected Ok, got %A" v)
+
+
+[<Fact>]
+let ``checkLinear catches a falsely-tagged non-linear op`` () =
+    let result =
+        LawRunner.checkLinear
+            42
+            20
+            8
+            (fun s -> SquarerOp s :> IOperator<int>)
+            genInt
+            (+)
+            (+)
+            (=)
+    match result with
+    | Ok () -> Assert.Fail "expected linearity violation"
+    | Error v ->
+        v.Seed |> should equal 42
+        v.Message |> should haveSubstring "Linearity broke"
+
+
+[<Fact>]
+let ``checkLinear reproduces bit-exact on the same seed`` () =
+    let run () =
+        LawRunner.checkLinear
+            99 10 5
+            (fun s -> SquarerOp s :> IOperator<int>)
+            genInt (+) (+) (=)
+    let first = run ()
+    let second = run ()
+    first |> should equal second
+
+
+// ────────────────────────────────────────────────────────────────
+// Retraction completeness
+// ────────────────────────────────────────────────────────────────
+
+[<Fact>]
+let ``checkRetractionCompleteness passes on a clean-retracting op`` () =
+    let result =
+        LawRunner.checkRetractionCompleteness
+            7       // seed
+            15      // samples
+            6       // scheduleLength
+            (fun s -> ZSetEchoOp s :> IOperator<ZSet<int>>)
+            genZSet
+    match result with
+    | Ok () -> ()
+    | Error v -> Assert.Fail (sprintf "expected Ok, got %A" v)
+
+
+[<Fact>]
+let ``checkRetractionCompleteness catches a retraction-lossy op`` () =
+    let result =
+        LawRunner.checkRetractionCompleteness
+            7 15 6
+            (fun s -> PositiveOnlyOp s :> IOperator<ZSet<int>>)
+            genZSet
+    match result with
+    | Ok () -> Assert.Fail "expected retraction-completeness violation"
+    | Error v ->
+        v.Seed |> should equal 7
+        v.Message |> should haveSubstring "Retraction incomplete"
diff --git a/tests/Tests.FSharp/Tests.FSharp.fsproj b/tests/Tests.FSharp/Tests.FSharp.fsproj
index f00f938a..932738ef 100644
--- a/tests/Tests.FSharp/Tests.FSharp.fsproj
+++ b/tests/Tests.FSharp/Tests.FSharp.fsproj
@@ -83,6 +83,7 @@
 
     <!-- Plugin/ -->
     <Compile Include="Plugin/Harness.Tests.fs" />
+    <Compile Include="Plugin/LawRunner.Tests.fs" />
 
     <!-- Formal/ -->
     <Compile Include="Formal/Z3.Laws.Tests.fs" />

From 37e197125dd93cca34fcaf3074f2282800fa2aa2 Mon Sep 17 00:00:00 2001
From: Aaron Stainback <aaron_bond@yahoo.com>
Date: Sat, 18 Apr 2026 15:06:15 -0400
Subject: [PATCH 3/4] =?UTF-8?q?LawRunner=20=E2=80=94=20reviewer=20P0s=20+?=
 =?UTF-8?q?=20high-value=20P1s=20+=20polish?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Kira (harsh-critic) + Rune (maintainability-reviewer) floor per
GOVERNANCE.md §20. P0s from Kira's pass were load-bearing; fixing
them in this commit before the round closes.

P0 fixes:
- Per-sample `System.Random(seed + i)` so `(seed, sampleIndex)`
  genuinely reproduces bit-exact, independent of whether earlier
  samples failed fast. Previous whole-loop RNG made the
  reproducibility claim false.
- Retraction-completeness law rewritten to state-restoration via
  continuation: forward + retract + continuation, compare
  continuation outputs against a fresh-op run of the continuation
  alone. The old "cumulative output = 0" formulation passed
  trivially for empty-emitting ops and a floored-counter could
  leak state while keeping cumulative zero; the new law catches
  both. Test fixture replaced accordingly — `FlooredCounterOp`
  (genuinely stateful and retraction-lossy) supersedes the
  `PositiveOnlyOp` filter (which is non-linear, not a retraction
  fixture; properly belongs under `checkLinear`).
- `invalidArg` → `Error` on bad args — every public entry now
  returns `Result`, matching CLAUDE.md's result-over-exception
  rule.

P1 fixes worth doing now:
- `List.toArray` before the tick loop in `checkLinear` — kills
  the O(scheduleLength²) `List.item` indexing Kira flagged.
- Extracted `runSamples` helper so the two law implementations
  share one loop shape — removes paste-and-modify risk before
  round-29 adds a third law.
- Rune polish: cross-ref added from `IOperator<'TOut>` doc to
  `LawRunner` for discoverability; `≠` → `!=` in error strings
  so logs survive every terminal; WHY comments on non-obvious
  lines (`if w = 0 then w <- 1`); module-level note that
  `'TState = unit` is fine because trace-based Option B never
  inspects state.

Deferred to DEBT (docs/DEBT.md): promote `check*` to a config
record before `checkBilinear` lands; structured `LawViolation.Reason`
DU; test where op omits the marker tag.

Tests: 7/7 green in 61ms (added "bit-exact reproduces" for
retraction + "Error on bad samples arg" coverage).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/DEBT.md                                 |  21 ++
 src/Core/LawRunner.fs                        | 208 +++++++++++--------
 src/Core/PluginApi.fs                        |   5 +-
 tests/Tests.FSharp/Plugin/LawRunner.Tests.fs | 111 ++++++----
 4 files changed, 224 insertions(+), 121 deletions(-)

diff --git a/docs/DEBT.md b/docs/DEBT.md
index 030afe0e..68ab3645 100644
--- a/docs/DEBT.md
+++ b/docs/DEBT.md
@@ -37,6 +37,27 @@ feature + debt budget).
 
 ## Live debt
 
+### `LawRunner.check*` takes 8-11 positional args — promote to config record
+- **Site:** `src/Core/LawRunner.fs`
+- **Found:** round 28 by Rune (maintainability-reviewer)
+- **Effort:** S
+- **Friction:** call sites stack integer literals with trailing `// seed` / `// samples` comments; the next law (`checkBilinear`) will need 10+ positionals and multiply the drift surface.
+- **Fix:** introduce `LinearityConfig<'TIn,'TOut>` / `RetractionConfig<'TIn,'TOut>` records and take one argument. Do it before `checkBilinear` lands.
+
+### `LawViolation.Message` is a string — promote to a structured DU
+- **Site:** `src/Core/LawRunner.fs`
+- **Found:** round 28 by Kira (harsh-critic)
+- **Effort:** S
+- **Friction:** tests string-grep on `"Linearity broke"` / `"Retraction incomplete"`; downstream tooling can't pattern-match on cause.
+- **Fix:** `type Reason = LinearityBreak of tick:int | RetractionResidual of count:int | BadArgs of string`; keep `Message` as a rendering helper.
+
+### `LawRunner` has no test covering operators that omit the marker tag
+- **Site:** `tests/Tests.FSharp/Plugin/LawRunner.Tests.fs`
+- **Found:** round 28 by Kira (harsh-critic)
+- **Effort:** S
+- **Friction:** law runner claims to verify the tag; no test proves the runner doesn't secretly rely on the tag to compile-time dispatch.
+- **Fix:** add a fixture that implements `IOperator<_>` directly (no `ILinearOperator` tag) and confirm `checkLinear` runs against it.
+
 ### `Op<'T>` implicitly publicised as a plugin subclass-extension point
 - **Site:** `src/Core/Circuit.fs` (`Op`, `Op<'T>`) and every subtype under
   `src/Core/Operators.fs`
diff --git a/src/Core/LawRunner.fs b/src/Core/LawRunner.fs
index 3b57fdf5..b5ab097f 100644
--- a/src/Core/LawRunner.fs
+++ b/src/Core/LawRunner.fs
@@ -10,29 +10,42 @@ namespace Zeta.Core
 /// `System.Random -> 'T` so authors wire FsCheck (or any other
 /// generator) at their end.
 ///
-/// All checks are **deterministic** given the same seed. A failing
-/// run reports the seed and the sample index; re-running with the
-/// same seed and `samples >= sampleIndex + 1` reproduces bit-exact.
+/// All checks are **deterministic** given the same seed. Each
+/// sample uses its own `System.Random(seed + sampleIndex)` so a
+/// failing `(seed, sampleIndex)` pair reproduces bit-exact on
+/// re-run with any `samples >= sampleIndex + 1` — sample N does
+/// not depend on whether earlier samples failed fast.
+///
+/// `checkRetractionCompleteness` uses the **state-restoration
+/// via continuation** formulation: feed a forward trace, then
+/// its retraction, then a continuation; compare continuation
+/// outputs to a fresh-op run of the continuation alone. Any
+/// divergence means the op's internal state survived what was
+/// supposed to be a full cancel. This catches stateful ops that
+/// mistag themselves as retraction-clean (Option B per the
+/// design doc; Option A enrichment — `Init`/`Step`/`Retract`
+/// triple — is round-29+ work).
+///
+/// `'TState` on `IStatefulStrictOperator` is unused by the
+/// trace-based law — the check runs through `StepAsync` and
+/// never inspects state. Tests pass `unit` for this parameter.
 [<RequireQualifiedAccess>]
 module LawRunner =
 
-    /// Reported when a law fails. `Seed` + `SampleIndex` are
-    /// enough to reproduce; `Message` describes the specific
-    /// failure mode.
+    /// Reported when a law fails or a check cannot run.
+    /// `Seed` + `SampleIndex` are enough to reproduce via a
+    /// fresh `checkX` call on the same seed; `Message` describes
+    /// the specific failure mode.
     type LawViolation =
         { Seed: int
           SampleIndex: int
           Message: string }
 
-    /// Drive a factory through one fresh op instance per sample
-    /// so state cannot leak across samples. Each trace runs on
-    /// its own `PluginHarness.runSingleInput` invocation which
-    /// also asserts exactly-one-`Publish`-per-tick.
-    let private runTrace<'TIn, 'TOut>
-        (makeOp: Stream<'TIn> -> IOperator<'TOut>)
-        (inputs: 'TIn list)
-        : 'TOut list =
-        PluginHarness.runSingleInput makeOp inputs
+    /// Bad arguments surface as `Error` rather than exceptions
+    /// so every public entry returns a `Result` — CLAUDE.md's
+    /// result-over-exception rule.
+    let private badArgs (seed: int) (message: string) : LawViolation =
+        { Seed = seed; SampleIndex = -1; Message = message }
 
     let private generateTrace<'TIn>
         (rng: System.Random)
@@ -41,6 +54,29 @@ module LawRunner =
         : 'TIn list =
         [ for _ in 1 .. length -> genInput rng ]
 
+    /// Shared sample-loop shape used by every law. Each sample
+    /// gets a fresh `System.Random(seed + i)` so reproducibility
+    /// is per-sample, not whole-loop. `check` returns `None` on
+    /// pass and `Some message` on failure; the framer wraps it
+    /// as a `LawViolation`.
+    let private runSamples
+        (seed: int)
+        (samples: int)
+        (check: System.Random -> int -> string option)
+        : Result<unit, LawViolation> =
+        let mutable failure : LawViolation option = None
+        let mutable i = 0
+        while failure.IsNone && i < samples do
+            let rng = System.Random(seed + i)
+            match check rng i with
+            | Some msg ->
+                failure <- Some { Seed = seed; SampleIndex = i; Message = msg }
+            | None -> ()
+            i <- i + 1
+        match failure with
+        | Some v -> Error v
+        | None -> Ok ()
+
     /// Linearity: for every trace pair `(A, B)`, the output
     /// trace on `A + B` (elementwise) must equal the elementwise
     /// sum of the output traces on `A` and `B`.
@@ -65,79 +101,89 @@ module LawRunner =
         (addOut: 'TOut -> 'TOut -> 'TOut)
         (equalOut: 'TOut -> 'TOut -> bool)
         : Result<unit, LawViolation> =
-        if samples < 1 then invalidArg "samples" "must be >= 1"
-        if scheduleLength < 1 then invalidArg "scheduleLength" "must be >= 1"
-        let rng = System.Random(seed)
-        let mutable failure : LawViolation option = None
-        let mutable i = 0
-        while failure.IsNone && i < samples do
-            let traceA = generateTrace rng scheduleLength genInput
-            let traceB = generateTrace rng scheduleLength genInput
-            let traceSum = List.map2 addIn traceA traceB
-            let outA = runTrace makeOp traceA
-            let outB = runTrace makeOp traceB
-            let outSum = runTrace makeOp traceSum
-            // Compare elementwise: op(a_t + b_t) == op(a_t) + op(b_t)
-            // at every tick t. Short-circuit on first divergence so
-            // the reported message points at the offending tick.
-            let mutable tick = 0
-            let mutable broke = false
-            while not broke && tick < scheduleLength do
-                let lhs = outSum.[tick]
-                let rhs = addOut outA.[tick] outB.[tick]
-                if not (equalOut lhs rhs) then
-                    failure <-
-                        Some { Seed = seed
-                               SampleIndex = i
-                               Message =
-                                 sprintf
+        if samples < 1 then
+            Error (badArgs seed "samples must be >= 1")
+        elif scheduleLength < 1 then
+            Error (badArgs seed "scheduleLength must be >= 1")
+        else
+            runSamples seed samples (fun rng i ->
+                let traceA = generateTrace rng scheduleLength genInput
+                let traceB = generateTrace rng scheduleLength genInput
+                let traceSum = List.map2 addIn traceA traceB
+                // Convert outputs to arrays once — tick indexing
+                // is O(n) on `List.item`, so a List scan would
+                // be O(scheduleLength²) per sample.
+                let outA = PluginHarness.runSingleInput makeOp traceA |> List.toArray
+                let outB = PluginHarness.runSingleInput makeOp traceB |> List.toArray
+                let outSum = PluginHarness.runSingleInput makeOp traceSum |> List.toArray
+                let mutable result = None
+                let mutable tick = 0
+                while result.IsNone && tick < scheduleLength do
+                    let lhs = outSum.[tick]
+                    let rhs = addOut outA.[tick] outB.[tick]
+                    if not (equalOut lhs rhs) then
+                        result <-
+                            Some (sprintf
                                      "Linearity broke at sample %d, tick %d: \
-                                      op(a+b) ≠ op(a) + op(b)."
-                                     i tick }
-                    broke <- true
-                tick <- tick + 1
-            i <- i + 1
-        match failure with
-        | Some v -> Error v
-        | None -> Ok ()
+                                      op(a+b) != op(a) + op(b)."
+                                     i tick)
+                    tick <- tick + 1
+                result)
 
-    /// Retraction completeness (Option B, trace-based).
-    /// Forward-run a generated Z-set trace, then retract each
-    /// tick's input in the same order. After the round-trip the
-    /// cumulative output Z-set must be empty — any residual
-    /// means the operator leaked state through retraction.
+    /// Retraction completeness via state restoration.
+    /// A forward trace of random Z-sets is cancelled by its
+    /// elementwise negation; a continuation trace is then fed
+    /// to the same op instance, and the continuation outputs
+    /// are compared to a fresh-op run of the continuation
+    /// alone. Any divergence means state survived the cancel.
     ///
-    /// Works against the existing marker `IStatefulStrictOperator`
-    /// without interface enrichment (see the design doc for the
-    /// Option A follow-up path).
+    /// The earlier "cumulative output = 0" formulation was
+    /// rejected in review: it passes trivially for empty-emitting
+    /// ops, it is not the correct law for stateful-strict ops
+    /// whose outputs are not themselves cancelling (e.g.
+    /// integration-shaped aggregates), and a pathological op
+    /// can trivially satisfy it while leaking state. State
+    /// restoration is the law the tag actually promises.
     let checkRetractionCompleteness<'TIn, 'TOut when 'TIn : comparison and 'TOut : comparison>
         (seed: int)
         (samples: int)
         (scheduleLength: int)
+        (continuationLength: int)
         (makeOp: Stream<ZSet<'TIn>> -> IOperator<ZSet<'TOut>>)
         (genInput: System.Random -> ZSet<'TIn>)
         : Result<unit, LawViolation> =
-        if samples < 1 then invalidArg "samples" "must be >= 1"
-        if scheduleLength < 1 then invalidArg "scheduleLength" "must be >= 1"
-        let rng = System.Random(seed)
-        let mutable failure : LawViolation option = None
-        let mutable i = 0
-        while failure.IsNone && i < samples do
-            let forward = generateTrace rng scheduleLength genInput
-            let retract = forward |> List.map ZSet.neg
-            let outputs = runTrace makeOp (forward @ retract)
-            let cumulative = outputs |> List.fold ZSet.add ZSet.empty
-            if not cumulative.IsEmpty then
-                failure <-
-                    Some { Seed = seed
-                           SampleIndex = i
-                           Message =
-                             sprintf
-                                 "Retraction incomplete at sample %d: after \
-                                  forward+retract round-trip of %d ticks, \
-                                  cumulative output has %d residual entries."
-                                 i scheduleLength cumulative.Count }
-            i <- i + 1
-        match failure with
-        | Some v -> Error v
-        | None -> Ok ()
+        if samples < 1 then
+            Error (badArgs seed "samples must be >= 1")
+        elif scheduleLength < 1 then
+            Error (badArgs seed "scheduleLength must be >= 1")
+        elif continuationLength < 1 then
+            Error (badArgs seed "continuationLength must be >= 1")
+        else
+            runSamples seed samples (fun rng i ->
+                let forward = generateTrace rng scheduleLength genInput
+                let retract = forward |> List.map ZSet.neg
+                let continuation = generateTrace rng continuationLength genInput
+                let fullTrace = forward @ retract @ continuation
+                let outFull =
+                    PluginHarness.runSingleInput makeOp fullTrace
+                    |> List.toArray
+                let outFresh =
+                    PluginHarness.runSingleInput makeOp continuation
+                    |> List.toArray
+                let prefix = scheduleLength + scheduleLength
+                let mutable result = None
+                let mutable tick = 0
+                while result.IsNone && tick < continuationLength do
+                    let afterCancel = outFull.[prefix + tick]
+                    let fresh = outFresh.[tick]
+                    let diff = ZSet.add afterCancel (ZSet.neg fresh)
+                    if not diff.IsEmpty then
+                        result <-
+                            Some (sprintf
+                                     "Retraction incomplete at sample %d, \
+                                      continuation tick %d: state survived the \
+                                      forward+retract cancel (diff has %d \
+                                      residual entries)."
+                                     i tick diff.Count)
+                    tick <- tick + 1
+                result)
diff --git a/src/Core/PluginApi.fs b/src/Core/PluginApi.fs
index a0673709..e36eed8d 100644
--- a/src/Core/PluginApi.fs
+++ b/src/Core/PluginApi.fs
@@ -45,7 +45,10 @@ type OutputBuffer<'TOut> =
 /// Plugin-author contract for a custom operator with a typed
 /// output. External plugin libraries implement this to extend
 /// Zeta's operator catalogue without touching internal scheduler
-/// types.
+/// types. Conformance to an algebra capability tag
+/// (`ILinearOperator`, `IBilinearOperator`, etc.) is verified
+/// in the plugin author's test project via `LawRunner` — see
+/// `docs/PLUGIN-AUTHOR.md` §"Verifying your plugin's algebra tag".
 type IOperator<'TOut> =
 
     /// Short diagnostic label for the operator instance. Used
diff --git a/tests/Tests.FSharp/Plugin/LawRunner.Tests.fs b/tests/Tests.FSharp/Plugin/LawRunner.Tests.fs
index 4454de58..46b44dae 100644
--- a/tests/Tests.FSharp/Plugin/LawRunner.Tests.fs
+++ b/tests/Tests.FSharp/Plugin/LawRunner.Tests.fs
@@ -1,5 +1,6 @@
 module Zeta.Tests.Plugin.LawRunnerTests
 
+open System.Collections.Generic
 open System.Threading.Tasks
 open Xunit
 open FsUnit.Xunit
@@ -36,11 +37,17 @@ type private SquarerOp(input: Stream<int>) =
 
 
 // ────────────────────────────────────────────────────────────────
-// Retraction-completeness fixtures (ZSet<int> → ZSet<int>)
+// Retraction-completeness fixtures
+//
+// The law checks state restoration: after forward+retract
+// cancel, continuation outputs must match a fresh-op run of
+// the same continuation. Catches stateful ops that survive
+// what was supposed to be a full cancel.
 // ────────────────────────────────────────────────────────────────
 
-/// Clean retraction — pure passthrough. Each tick echoes the
-/// input Z-set; forward+retract round-trip sums to empty.
+/// Stateless echo — trivially passes state restoration
+/// because it has no state. Keeps a baseline for the law on
+/// clean ops.
 type private ZSetEchoOp(input: Stream<ZSet<int>>) =
     let deps = [| input.AsDependency() |]
     interface IStatefulStrictOperator<ZSet<int>, unit, ZSet<int>> with
@@ -51,21 +58,37 @@ type private ZSetEchoOp(input: Stream<ZSet<int>>) =
             ValueTask.CompletedTask
 
 
-/// Retraction-lossy — drops negative-weight entries. Under a
-/// forward+retract round-trip the retracted entries never make
-/// it through, so cumulative output ≠ empty. Falsely tagged
-/// `IStatefulStrictOperator` so the law runner catches the lie.
-type private PositiveOnlyOp(input: Stream<ZSet<int>>) =
+/// Genuinely stateful *and* retraction-lossy — a floored per-
+/// key counter. Accumulates weights but refuses to go below
+/// zero; a `-1` on a key at count 0 is silently dropped. Under
+/// forward+retract the dropped decrements leave residual
+/// positive state that survives the cancel, so any continuation
+/// input sees the leaked state and diverges from a fresh op.
+/// Falsely tagged `IStatefulStrictOperator` so the law catches
+/// it.
+type private FlooredCounterOp(input: Stream<ZSet<int>>) =
     let deps = [| input.AsDependency() |]
-    interface IStatefulStrictOperator<ZSet<int>, unit, ZSet<int>> with
-        member _.Name = "positive-only-liar"
+    let state = Dictionary<int, int64>()
+    interface IStatefulStrictOperator<ZSet<int>, Dictionary<int, int64>, ZSet<int>> with
+        member _.Name = "floored-counter-liar"
         member _.ReadDependencies = deps
         member _.StepAsync(out, _ct) =
-            let positives =
-                input.Current.AsSpan().ToArray()
-                |> Array.filter (fun e -> e.Weight > 0L)
-                |> Array.map (fun e -> e.Key, e.Weight)
-            out.Publish (ZSet.ofSeq positives)
+            let delta = ResizeArray<int * int64>()
+            let span = input.Current.AsSpan()
+            for i in 0 .. span.Length - 1 do
+                let k = span.[i].Key
+                let w = span.[i].Weight
+                let current =
+                    match state.TryGetValue k with
+                    | true, v -> v
+                    | _ -> 0L
+                let proposed = current + w
+                let applied = if proposed < 0L then 0L else proposed
+                let emitted = applied - current
+                if applied = 0L then state.Remove k |> ignore
+                else state.[k] <- applied
+                if emitted <> 0L then delta.Add (k, emitted)
+            out.Publish (ZSet.ofSeq delta)
             ValueTask.CompletedTask
 
 
@@ -79,7 +102,8 @@ let private genZSet (rng: System.Random) : ZSet<int> =
     let n = rng.Next(0, 4)
     [ for _ in 1 .. n ->
         let k = rng.Next(0, 5)
-        // Weight in [-3, 3] \ {0} — both polarities appear.
+        // Exclude zero weight so both polarities are represented
+        // without burning a generator draw on a no-op.
         let mutable w = rng.Next(-3, 4)
         if w = 0 then w <- 1
         (k, int64 w) ]
@@ -94,14 +118,9 @@ let private genZSet (rng: System.Random) : ZSet<int> =
 let ``checkLinear passes on a genuine linear op`` () =
     let result =
         LawRunner.checkLinear
-            42      // seed
-            20      // samples
-            8       // scheduleLength
+            42 20 8
             (fun s -> DoublerOp s :> IOperator<int>)
-            genInt
-            (+)
-            (+)
-            (=)
+            genInt (+) (+) (=)
     match result with
     | Ok () -> ()
     | Error v -> Assert.Fail (sprintf "expected Ok, got %A" v)
@@ -111,14 +130,9 @@ let ``checkLinear passes on a genuine linear op`` () =
 let ``checkLinear catches a falsely-tagged non-linear op`` () =
     let result =
         LawRunner.checkLinear
-            42
-            20
-            8
+            42 20 8
             (fun s -> SquarerOp s :> IOperator<int>)
-            genInt
-            (+)
-            (+)
-            (=)
+            genInt (+) (+) (=)
     match result with
     | Ok () -> Assert.Fail "expected linearity violation"
     | Error v ->
@@ -129,8 +143,7 @@ let ``checkLinear catches a falsely-tagged non-linear op`` () =
 [<Fact>]
 let ``checkLinear reproduces bit-exact on the same seed`` () =
     let run () =
-        LawRunner.checkLinear
-            99 10 5
+        LawRunner.checkLinear 99 10 5
             (fun s -> SquarerOp s :> IOperator<int>)
             genInt (+) (+) (=)
     let first = run ()
@@ -138,17 +151,26 @@ let ``checkLinear reproduces bit-exact on the same seed`` () =
     first |> should equal second
 
 
+[<Fact>]
+let ``checkLinear returns Error on bad samples arg`` () =
+    let result =
+        LawRunner.checkLinear 0 0 1
+            (fun s -> DoublerOp s :> IOperator<int>)
+            genInt (+) (+) (=)
+    match result with
+    | Ok () -> Assert.Fail "expected bad-args Error"
+    | Error v -> v.Message |> should haveSubstring "samples"
+
+
 // ────────────────────────────────────────────────────────────────
 // Retraction completeness
 // ────────────────────────────────────────────────────────────────
 
 [<Fact>]
-let ``checkRetractionCompleteness passes on a clean-retracting op`` () =
+let ``checkRetractionCompleteness passes on a stateless echo`` () =
     let result =
         LawRunner.checkRetractionCompleteness
-            7       // seed
-            15      // samples
-            6       // scheduleLength
+            7 15 6 4
             (fun s -> ZSetEchoOp s :> IOperator<ZSet<int>>)
             genZSet
     match result with
@@ -157,14 +179,25 @@ let ``checkRetractionCompleteness passes on a clean-retracting op`` () =
 
 
 [<Fact>]
-let ``checkRetractionCompleteness catches a retraction-lossy op`` () =
+let ``checkRetractionCompleteness catches a stateful retraction-lossy op`` () =
     let result =
         LawRunner.checkRetractionCompleteness
-            7 15 6
-            (fun s -> PositiveOnlyOp s :> IOperator<ZSet<int>>)
+            7 15 6 4
+            (fun s -> FlooredCounterOp s :> IOperator<ZSet<int>>)
             genZSet
     match result with
     | Ok () -> Assert.Fail "expected retraction-completeness violation"
     | Error v ->
         v.Seed |> should equal 7
         v.Message |> should haveSubstring "Retraction incomplete"
+
+
+[<Fact>]
+let ``checkRetractionCompleteness reproduces bit-exact on the same seed`` () =
+    let run () =
+        LawRunner.checkRetractionCompleteness 13 8 5 3
+            (fun s -> FlooredCounterOp s :> IOperator<ZSet<int>>)
+            genZSet
+    let first = run ()
+    let second = run ()
+    first |> should equal second

From 602ac50089ba7c762f4dff3d72c3ae14172c94ec Mon Sep 17 00:00:00 2001
From: Aaron Stainback <aaron_bond@yahoo.com>
Date: Sat, 18 Apr 2026 15:28:41 -0400
Subject: [PATCH 4/4] =?UTF-8?q?Round=2028=20close=20+=20round=2029=20open?=
 =?UTF-8?q?=20=E2=80=94=20CI=20anchor=20committed?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Round 28 anchor (FsCheck law runner) shipped across 3 commits on
this branch; this commit is the round-close bookkeeping + the
round-29 anchor commitment.

- docs/CURRENT-ROUND.md resets for round 29 with CI pipeline as
  the anchor; round-28 deliverables summarised; carryover split
  between DEBT-tracked law-runner follow-ups, product law-
  coverage (bilinear / sink-terminal / Option-A promotion), and
  deferred items from round 27.
- docs/ROUND-HISTORY.md — round 28 narrative prepended (anchor,
  reviewer-floor catch, Option-A-vs-B decision, lean cleanup,
  soft-claim retracted).
- docs/WINS.md — three round-28 wins: reviewer-floor paying on
  first applicable round, Option-B-with-planned-followups over
  verbal agreement, deterministic-simulation as contract not
  decoration.
- docs/BACKLOG.md — new P0 entry for CI / build-machine setup
  with the full discipline rules Aaron committed:
  * `../scratch` + `../SQLSharp` are read-only references —
    never copy files, hand-craft every artefact
  * Aaron reviews every CI design decision before landing
  * Cost discipline on CI minutes; narrow default matrix
  * macOS + Linux first; Windows when justified
  * Product + CI work parallelisable on one machine
  Sub-task sequence: build-machine-setup audit →
  ci-workflow-design audit → gate inventory → first workflow
  (build-and-test), each its own Aaron review gate.

AlloyRunner.java confirmed load-bearing: Alloy is a JVM tool
with no convenient CLI, driven from F# tests via `javac` + `java`.
Keep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 docs/BACKLOG.md       |  70 +++++++++++++++
 docs/CURRENT-ROUND.md | 203 +++++++++++++++++++++++++-----------------
 docs/ROUND-HISTORY.md | 117 ++++++++++++++++++++++++
 docs/WINS.md          |  61 +++++++++++++
 4 files changed, 369 insertions(+), 82 deletions(-)

diff --git a/docs/BACKLOG.md b/docs/BACKLOG.md
index 02e82b56..2abe062f 100644
--- a/docs/BACKLOG.md
+++ b/docs/BACKLOG.md
@@ -93,6 +93,76 @@ within each priority tier.
   side (Window.fs wiring pending). Target: measured numbers in
   `docs/BENCHMARKS.md` by end of round 20.
 
+## P0 — CI / build-machine setup (round-29 anchor)
+
+- [ ] **First-class CI pipeline for Zeta.** Every agent-written
+  commit eventually has to pass the same gate a human commit
+  does; right now the only gate is `dotnet build -c Release`
+  on the maintainer's laptop. Aaron's framing (round 28):
+
+  > "Our CI setup is as first class for this software factory
+  > as is the agents themselves, it does not ultimately work
+  > without both."
+
+  Discipline rules committed up front:
+
+  1. **Read-only reference.** `../scratch` is the model for
+     build-machine setup (scripts, installers, install
+     locations, tool pins). `../SQLSharp` is the model for
+     GitHub Actions workflows. **Never copy files.** Read to
+     understand the shape and the intent; hand-craft every
+     artefact from scratch for Zeta so no cruft or
+     assumptions from another repo's context sneaks in.
+  2. **Human review on every decision.** Aaron reviews the
+     OS matrix, tool versions, caching strategy, artefact
+     retention, secret handling, permissions model, workflow
+     triggers, and the per-job concurrency/timeout settings
+     before any workflow lands. This is *not* a place for
+     "ship and iterate".
+  3. **Cost discipline.** CI minutes are the expensive
+     resource. Every job earns its slot: justify any
+     matrix-axis expansion, any scheduled run, any
+     always-on-PR gate. Default to narrow (one OS, one
+     dotnet, current branch only) and widen only with a
+     stated reason.
+  4. **Cross-platform, eventually.** Zeta is cross-platform
+     (.NET 10, macOS arm64 dev box, Linux CI, Windows
+     supported). Aaron can run rounds on Windows when
+     Windows-specific work is under way; say so when useful.
+     CI matrix should cover at least macOS + Linux; Windows
+     gets added once we have a Windows-breaking test to
+     justify the slot.
+  5. **Parallelism note.** Product work and CI work run on
+     one maintainer machine but rarely interfere at the
+     shell level. Architect is free to dispatch CI-design
+     subagents in parallel with product work as long as
+     neither agent writes the same file.
+
+  **First sub-tasks (round-29 anchor; each its own Aaron
+  review gate):**
+
+  1. Audit `../scratch` for install-script patterns (what
+     tools, what versions, what pinning method, what target
+     paths). Output: a design doc at
+     `docs/research/build-machine-setup.md` citing every
+     borrowed idea and nothing else. No file copies.
+  2. Audit `../SQLSharp` `.github/workflows/` for workflow
+     shape (triggers, jobs, matrix, caching, permissions).
+     Output: a design doc at
+     `docs/research/ci-workflow-design.md`.
+  3. Map Zeta's actual gate list: `dotnet build`,
+     `dotnet test`, Semgrep, Alloy (needs JDK), Lean proofs
+     (needs elan/lake), TLC (needs JDK + tla2tools.jar),
+     Stryker (slow — scheduled not per-PR), FsCheck tests.
+     Output: a gate inventory at
+     `docs/research/ci-gate-inventory.md`.
+  4. First workflow: `build-and-test.yml` covering
+     `dotnet build -c Release` + `dotnet test Zeta.sln -c
+     Release` on `ubuntu-latest` + `macos-latest`. Nothing
+     else until Aaron has signed off on the gate list.
+  5. Subsequent workflows added one at a time, each with
+     explicit Aaron sign-off on the design doc first.
+
 ## P0 — security / SDL artifacts
 
 - [ ] **`docs/security/CRYPTO.md`** — justify CRC32C (integrity, not
diff --git a/docs/CURRENT-ROUND.md b/docs/CURRENT-ROUND.md
index 3130e915..5948b2f8 100644
--- a/docs/CURRENT-ROUND.md
+++ b/docs/CURRENT-ROUND.md
@@ -1,94 +1,133 @@
-# Current Round — 28 (open)
+# Current Round — 29 (open)
 
-Round 27 closed; narrative absorbed into
-`docs/ROUND-HISTORY.md`. Round 28 opens with these
-deferred items carried over from round 27:
+Round 28 closed; narrative absorbed into
+`docs/ROUND-HISTORY.md`. Round 29 opens with **CI setup as
+the anchor** — see `docs/BACKLOG.md` §"P0 — CI /
+build-machine setup" for the full discipline rules and
+sub-tasks.
 
 ## Status
 
-- **Round number:** 28
-- **Opened:** 2026-04-18 (continuous from round-27 close)
-- **Classification:** split — non-trivial P1 carryover
-  from reviewer pass + other deferred work.
-- **Reviewer budget:** 2-3 per §13 + mandatory Kira+Rune
-  floor per §20 on any code-landing phase.
-
-## Carried from round 27
-
-**Reviewer P1 findings (Kira + Rune, logged to DEBT):**
-- `OutputBuffer` tick-stamp + invalidate-on-tick-end.
-- `ReadDependencies` defensive copy at registration.
-- `box plugin` triple-evaluation → `let boxed = box plugin` once.
-- `BayesianRateOp` `int64` accumulator → `Checked.(+)` or saturate.
-- `INestedFixpointParticipant` inherits `IOperator<'TOut>`.
-- `PluginHarness` id-space via `Int32.MinValue`-range synthetics.
-- `IOperator<'T>` → `IZetaOperator<'T>` rename (before external adoption).
-- `Op<'T>.Value` mixed-accessibility hover-doc pointing at `OutputBuffer.Publish`.
-- `PluginApi.fs` split when >300 lines.
-- PLUGIN-AUTHOR.md `[<Extension>]` explanation in sample.
-- Extract `internal assignHarnessId` helper shared by `Circuit.Build` + `PluginHarness`.
-
-**Design work still open:**
+- **Round number:** 29
+- **Opened:** 2026-04-18 (continuous from round-28 close)
+- **Classification:** infrastructure round — CI pipeline
+  design is the anchor; product work (bilinear / sink-
+  terminal laws, Option-A stateful promotion) runs in
+  parallel where it doesn't block on Aaron's CI-design
+  review gates.
+- **Reviewer budget:** Kira + Rune floor per GOVERNANCE.md
+  §20 on every code landing. Aaron personally reviews
+  every CI-decision artefact.
+
+## Round 28 close — what landed
+
+Anchor goal achieved: FsCheck law runner live as a test-
+time library.
+
+- `LawRunner.checkLinear` — generic additivity check; works
+  over `ZSet` and plain numerics via user-supplied `add`/
+  `equal` callbacks.
+- `LawRunner.checkRetractionCompleteness` — Option B
+  (trace-based), state-restoration via continuation after
+  reviewer P0 rewrite. Catches retraction-lossy stateful
+  ops (tested against a floored-counter fixture).
+- Per-sample `System.Random(seed + i)` for true bit-exact
+  reproducibility of `(seed, sampleIndex)`.
+- Deterministic-simulation framing locked in
+  `docs/research/stateful-harness-design.md`; Option A
+  (enrich `IStatefulStrictOperator` with `Init`/`Step`/
+  `Retract` triple matching the DBSP paper's `(σ, λ, ρ)`
+  shape) is the planned additive promotion in round-30+.
+- Scaffolding cleanup: `tools/lean4/` `lake new` leftovers
+  removed; `Lean4.lean` rewired to import the real
+  `DbspChainRule` proof file.
+
+## Round 29 anchor — CI pipeline setup
+
+**Discipline rules (committed up front; `docs/BACKLOG.md`
+has the full text):**
+
+1. Read `../scratch` (build-machine setup) and
+   `../SQLSharp` (GitHub workflows) for shape + intent.
+   **Never copy files.** Hand-craft every artefact.
+2. Aaron reviews every CI design decision before it lands.
+   This is not a "ship and iterate" surface.
+3. Cost discipline: every CI minute earns its slot; default
+   to narrow matrix and widen with a stated reason.
+4. Cross-platform eventual: macOS + Linux first; Windows
+   when there's a Windows-breaking test to justify it.
+   Aaron can run rounds on Windows on request.
+5. Product work and CI work run in parallel on the same
+   machine; dispatch research subagents for CI design
+   concurrently with product work as long as neither
+   writes the same file.
+
+**Sub-task sequence (each its own Aaron review gate):**
+
+1. Audit `../scratch` → `docs/research/build-machine-setup.md`.
+2. Audit `../SQLSharp/.github/workflows/` →
+   `docs/research/ci-workflow-design.md`.
+3. Gate inventory → `docs/research/ci-gate-inventory.md`.
+4. First workflow: `build-and-test.yml` covering
+   `dotnet build -c Release` + `dotnet test Zeta.sln -c
+   Release` on `ubuntu-latest` + `macos-latest`. Nothing
+   else until Aaron signs off on the gate list.
+5. Subsequent workflows land one at a time, each with an
+   explicit design doc and sign-off.
+
+## Carried from round 28
+
+**Law runner follow-ups (DEBT-tracked):**
+- `check*` take 8-11 positional args → promote to config
+  record before `checkBilinear` lands.
+- `LawViolation.Message` → structured DU.
+- Test covering ops that omit the marker tag.
+
+**Law coverage (product work; can run in parallel with CI):**
+- `checkBilinear` — join-shaped ops; standard DBSP
+  incrementalisation form.
+- `checkSinkTerminal` — Sink-tagged ops must not compose
+  into a relational path.
+- Option-A promotion of `IStatefulStrictOperator` to
+  explicit `(σ, λ, ρ)` triple — round-30+ unless CI work
+  stalls on a review gate, in which case advance here.
+
+**Open from round 27 (deferred to round-30+ pool):**
 - `IsDbspLinear` Lean predicate + B1/B2/B3/chain_rule
-  closures (Tariq option-c; half-day B2, two days full).
-- FsCheck law runner at `Circuit.Build()` per capability
-  tag — unblocks PLUGIN-AUTHOR.md's soft-claim in "Known
-  limits of round-27."
-
-**Persona-notes migration tail:**
-- 6 remaining persona notebooks still in single-file
-  layout (`public-api-designer.md`, `skill-tune-up-ranker.md`,
-  `best-practices-scratch.md`, `algebra-owner.md`,
-  `formal-verification-expert.md`, `agent-experience-researcher.md`).
-  Lazy migration per §21; convert when a persona next
-  writes a typed memory entry.
-
-**Other deferred:**
-- Rune on `docs/STYLE.md` decision (small).
-- UX + DX persona proposals.
-- Empathy-coach persona spawn (naming pending).
-
-## Workflow cadence (established round-26, codified round-27)
-
-- Each round runs on its own branch (`round-N`).
-- Coherent changes within a round become separate commits
-  where it helps readability.
-- Round-close = PR from `round-N` to `main` + merge.
-- Reviewer pass per §20 before round-close.
-- Maintainer may request a review pass on the branch diff
-  before merge; ask before pushing the merge.
+  closures (Tariq option-c).
+- Reviewer P1 list: `OutputBuffer` tick-stamp,
+  `ReadDependencies` defensive copy, BayesianRate
+  `Checked.(+)`, `IOperator` → `IZetaOperator` rename
+  window, `PluginApi.fs` split when >300 lines.
 
 ## Open asks to the maintainer
 
-- **NuGet prefix reservation** on `nuget.org` for
-  `Zeta.*` — maintainer owns.
-- **`global.json` `rollForward`** — status quo vs relaxed
-  (silent-pick status quo unless objection).
-- **Eval-harness MVP scope** — still pending since round
-  23.
-- **Repo visibility** — currently private on AceHack;
-  flip to public when ready.
-
-## Next architect actions
-
-1. Open `round-28` branch off `main` once round-27 PR
-   merges.
-2. Anchor choice: **FsCheck law runner** (unblocks
-   plugin-author trust + validates Tariq's design) or
-   **OutputBuffer tick-stamp** (closes Kira's P0 that
-   DEBT'd this round). Recommend law runner — larger
-   impact, removes the soft-claim from PLUGIN-AUTHOR.md.
-3. Dispatch code-phase reviewer floor (Kira + Rune) per
-   §20 on any code that lands.
+- **Aaron decisions blocking round-29 progress:**
+  - Sign-off on the build-machine-setup design doc (first
+    sub-task).
+  - Sign-off on the CI-workflow design doc (second sub-
+    task).
+  - Sign-off on the gate inventory (third sub-task).
+  - Sign-off on the first workflow's OS matrix and dotnet
+    pin (fourth sub-task).
+- **NuGet prefix reservation** on `nuget.org` for `Zeta.*`
+  — still maintainer-owned.
+- **`global.json` `rollForward`** — status quo vs relaxed.
+- **Eval-harness MVP scope** — pending since round 23.
+- **Repo visibility** — private on AceHack; flip to public
+  when ready.
 
 ## Notes for the next Kenji waking
 
-- `memory/` is canonical shared memory (not the
-  sandbox). See GOVERNANCE.md §18 + §22.
-- `memory/persona/<persona>/` is per-persona memory.
-  Kenji's seat already folder-migrated; others lazy.
-- Reviewer pass per §20 is mandatory every code-landing
-  round. Kira + Rune is the floor.
-- Public API changes go through Ilyana per §19.
+- `memory/` is canonical shared memory; `memory/persona/
+  <name>/` is per-persona (name-keyed).
+- Reviewer pass per GOVERNANCE.md §20 is mandatory every
+  code-landing round. Kira + Rune is the floor.
+- Public API changes go through Ilyana per GOVERNANCE.md
+  §19.
 - `~/.claude/projects/` is Claude Code sandbox, not git.
-  Do not cite as canonical (§22).
+  Do not cite as canonical (GOVERNANCE.md §22).
+- **CI decisions need Aaron sign-off before landing** —
+  round-29 discipline rule.
+- `../scratch` and `../SQLSharp` are **read-only references**
+  — never copy files.
diff --git a/docs/ROUND-HISTORY.md b/docs/ROUND-HISTORY.md
index 93a4af73..7b24482e 100644
--- a/docs/ROUND-HISTORY.md
+++ b/docs/ROUND-HISTORY.md
@@ -9,6 +9,123 @@ New rounds are appended at the top.
 
 ---
 
+## Round 28 — FsCheck law runner (Option B), stateful-harness design doc, lean4 cleanup
+
+### Anchor — FsCheck law runner at plugin-law surface
+
+Round 28's committed anchor from `CURRENT-ROUND.md` round-27
+close was a law runner that could catch a falsely-tagged
+plugin operator. Design call early in the round: move it
+from `Circuit.Build()` gate (the round-27 soft-claim in
+`docs/PLUGIN-AUTHOR.md`) to a **test-time library** —
+`Zeta.Core.LawRunner`. Rationale committed: keeps `Core`
+free of FsCheck, defers probabilistic-testing cost to the
+plugin author's test project, idiomatic F#.
+
+**Live laws:**
+- `LawRunner.checkLinear` — generates trace pairs `(A, B)`
+  via a `System.Random -> 'TIn` generator, asserts
+  `op(A + B) = op(A) + op(B)` tick-by-tick using user-
+  supplied `addIn` / `addOut` / `equalOut`. Works for
+  `ZSet<'T>` (via `ZSet.add`) and plain numerics.
+- `LawRunner.checkRetractionCompleteness` — Option B
+  (trace-based, no interface change). State-restoration
+  via continuation: feed `forward ++ retract ++
+  continuation`, compare continuation outputs to a fresh-
+  op run of the continuation alone. Any divergence means
+  state survived the cancel.
+
+Deterministic-simulation framing throughout: one seeded
+`System.Random(seed + sampleIndex)` per sample, failing
+run reports `(seed, sampleIndex)`, re-run reproduces
+bit-exact. Each `runSingleInput` call creates a fresh op
+instance so state cannot leak across samples.
+
+**Design doc.** `docs/research/stateful-harness-design.md`
+captures the build-vs-test decision, the Option A vs
+Option B analysis, and the sequenced follow-up plan.
+Aaron's question mid-round — "what does Option A get us
+for the future?" — produced an explicit long-term
+recommendation baked into the doc: **Option A is the right
+long-term direction** because it matches the DBSP paper's
+`(σ, λ, ρ)` triple (the same shape
+`tools/lean4/Lean4/DbspChainRule.lean` proves against) and
+unlocks generic WDC checkpointing + planner fusion of
+adjacent stateful ops. **Option B is the right first step**
+because Option A needs real design work on the async path
+and retraction contract, and prototyping A in a vacuum
+would ship a contract we would need to revise. Option B
+also stays as a fallback for plugins that cannot expose
+`Init`/`Step`/`Retract` (ML wrappers, third-party system
+integrations).
+
+### Reviewer floor — GOVERNANCE.md §20 caught real bugs
+
+Kira + Rune dispatched after the first LawRunner landing.
+Kira P0: original retraction law ("cumulative output over
+forward+retract = 0") is too weak — passes trivially for
+empty-emitting ops and a floored-counter can leak state
+while keeping cumulative zero. Law rewritten to state-
+restoration via continuation (the law the tag actually
+promises), test fixture replaced: `FlooredCounterOp`
+(genuinely stateful and lossy) supersedes the mistagged
+`PositiveOnlyOp` filter (which was non-linear, not a
+retraction fixture). Second Kira P0: `invalidArg` in Core
+public surface violates CLAUDE.md's result-over-exception
+rule — all entries now return `Result<unit, LawViolation>`.
+Third Kira P0: whole-loop RNG meant `(seed, sampleIndex)`
+didn't actually reproduce; fixed with per-sample
+`System.Random(seed + i)`.
+
+Rune P1 findings logged to DEBT: `check*` 8-11 positional
+args → promote to config record before `checkBilinear`
+lands; `LawViolation.Message: string` → structured DU; add
+a test covering ops that omit the marker tag.
+
+Lesson carried forward: **reviewer floor is not a
+formality.** Kira's P0 on retraction semantics was a real
+bug in the law definition, not a style nit. The round 27
+codification of Kira + Rune as mandatory per §20 paid off
+on its first applicable round.
+
+### Lean4 scaffolding cleanup
+
+`lake new` boilerplate removed from `tools/lean4/`:
+`README.md` (GitHub-Pages setup template), `Basic.lean`
+(hello-world sample), `.github/workflows/` (upstream Lean
+CI templates), redundant `.gitignore` (root `.gitignore`
+already covers `.lake/`). `Lean4.lean` rewired to `import
+Lean4.DbspChainRule` so `lake build` walks the real proof
+file. Load-bearing pieces kept: `lakefile.toml`,
+`lake-manifest.json`, `lean-toolchain`. Sample deletion
+flagged by Aaron: "that was a sample hello world [lake
+scaffold] project and we don't need the [Lake] project
+itself we are calling it from dotnet."
+
+### PLUGIN-AUTHOR.md soft-claim retracted
+
+Round 27's "law verification at `Circuit.Build()` once the
+FsCheck generators are implemented" was honest-but-
+aspirational. Round 28's doc now points at
+`LawRunner.checkLinear` / `checkRetractionCompleteness` as
+live, `checkBilinear` / `checkSinkTerminal` flagged as
+round-29+ work.
+
+### Round 29 anchor — CI pipeline
+
+Aaron (round 28 close): "Our CI setup is as first class
+for this software factory as is the agents themselves, it
+does not ultimately work without both." Round 29 opens
+with CI as the anchor; `../scratch` + `../SQLSharp` are
+**read-only reference** repos (never copy files, hand-
+craft every artefact from scratch). Discipline rules
+committed in `docs/BACKLOG.md` §"P0 — CI / build-machine
+setup": Aaron reviews every design decision before it
+lands, cost discipline on CI minutes, cross-platform
+eventual (macOS + Linux first, Windows when justified).
+
+---
+
 ## Round 27 — big round: governance §20-§22, plugin API redesign landed, memory moved in-repo
 
 ### Shipped — governance rules (§20 / §21 / §22)
diff --git a/docs/WINS.md b/docs/WINS.md
index 0ce80a1c..bba3f48d 100644
--- a/docs/WINS.md
+++ b/docs/WINS.md
@@ -11,6 +11,67 @@ shipped." **Ordered newest-first** — recent rounds lead,
 older rounds trail below. Entries stay even after the moment
 passes, because the pattern is the value.
 
+## Wins — round 28
+
+### Reviewer floor caught a P0 in the law definition itself
+
+Kira's harsh-critic pass on the first `LawRunner` landing
+didn't flag style — it flagged the retraction law's
+semantics. The original "cumulative output over
+forward+retract = 0" formulation passes trivially for
+empty-emitting ops and a floored-counter can leak state
+while satisfying it. The rewritten law (state-restoration
+via continuation) is what the `IStatefulStrictOperator`
+tag actually promises. Without the round-27 codification
+of Kira + Rune as mandatory per GOVERNANCE.md §20, the
+wrong law would have landed as canonical and the test
+fixture (`PositiveOnlyOp` — non-linear, not even a
+retraction fixture) would have silently misled the next
+plugin author.
+
+**What it teaches.** Reviewer-floor-as-rule pays the first
+time a round produces novel semantics. The cost of
+dispatching two reviewers on a 130-line file is hours of
+future confusion averted. §20 is load-bearing, not
+ceremonial.
+
+### Option B with a planned Option A path, not "ship and pray"
+
+Mid-round Aaron asked "what does Option A get us for the
+future, is it the right long-term decision?" The answer
+shipped as part of the design doc, not as verbal
+agreement: Option A matches the DBSP paper's `(σ, λ, ρ)`
+triple and unlocks generic WDC checkpointing + planner
+fusion, so it's the right long-term direction — but Option
+B is the right first step because prototyping A in a
+vacuum would ship a contract we'd revise. The document
+names the sequenced round-29+ follow-ups explicitly.
+
+**What it teaches.** When a design question has a future,
+write the future down. "Decided Option B" is a weaker
+commit than "Option B with these specific round-29+
+follow-ups and this specific rationale for why not A
+today." The latter survives the next Kenji waking.
+
+### Deterministic simulation as the harness framing, not an afterthought
+
+Aaron's mid-round "I'm assuming this is for deterministic
+simulation?" re-shaped the design doc. Rather than a
+loose "tests should be reproducible" intent, the doc
+encodes the constraint: one seeded RNG per sample (not
+per whole loop), no wall-clock, total tick order, failing
+run prints seed + sample index, fresh op instance per
+sample so state can't leak across samples. The
+implementation followed the framing; per-sample RNG was a
+Kira P0 because the initial implementation had drifted
+from the doc and the doc was the source of truth.
+
+**What it teaches.** Framing labels ("deterministic
+simulation", "result-over-exception") are contracts, not
+decoration. When the framing and the code disagree, the
+framing wins and the code gets rewritten — not the other
+way round.
+
 ## Wins — round 27
 
 ### 1. Reviewer cadence caught two P0s in the same round the code landed