Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 70 additions & 0 deletions docs/BACKLOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,76 @@ within each priority tier.
side (Window.fs wiring pending). Target: measured numbers in
`docs/BENCHMARKS.md` by end of round 20.

## P0 — CI / build-machine setup (round-29 anchor)

- [ ] **First-class CI pipeline for Zeta.** Every agent-written
commit eventually has to pass the same gate a human commit
does; right now the only gate is `dotnet build -c Release`
on the maintainer's laptop. Aaron's framing (round 28):

> "Our CI setup is as first class for this software factory
> as is the agents themselves, it does not ultimately work
> without both."

Discipline rules committed up front:

1. **Read-only reference.** `../scratch` is the model for
build-machine setup (scripts, installers, install
locations, tool pins). `../SQLSharp` is the model for
GitHub Actions workflows. **Never copy files.** Read to
understand the shape and the intent; hand-craft every
artefact from scratch for Zeta so no cruft or
assumptions from another repo's context sneaks in.
2. **Human review on every decision.** Aaron reviews the
OS matrix, tool versions, caching strategy, artefact
retention, secret handling, permissions model, workflow
triggers, and the per-job concurrency/timeout settings
before any workflow lands. This is *not* a place for
"ship and iterate".
3. **Cost discipline.** CI minutes are the expensive
resource. Every job earns its slot: justify any
matrix-axis expansion, any scheduled run, any
always-on-PR gate. Default to narrow (one OS, one
dotnet, current branch only) and widen only with a
stated reason.
4. **Cross-platform, eventually.** Zeta is cross-platform
(.NET 10, macOS arm64 dev box, Linux CI, Windows
supported). Aaron can run rounds on Windows when
Windows-specific work is under way; say so when useful.
CI matrix should cover at least macOS + Linux; Windows
gets added once we have a Windows-breaking test to
justify the slot.
5. **Parallelism note.** Product work and CI work run on
one maintainer machine but rarely interfere at the
shell level. Architect is free to dispatch CI-design
subagents in parallel with product work as long as
neither agent writes the same file.

**First sub-tasks (round-29 anchor; each its own Aaron
review gate):**

1. Audit `../scratch` for install-script patterns (what
tools, what versions, what pinning method, what target
paths). Output: a design doc at
`docs/research/build-machine-setup.md` citing every
borrowed idea and nothing else. No file copies.
2. Audit `../SQLSharp` `.github/workflows/` for workflow
shape (triggers, jobs, matrix, caching, permissions).
Output: a design doc at
`docs/research/ci-workflow-design.md`.
3. Map Zeta's actual gate list: `dotnet build`,
`dotnet test`, Semgrep, Alloy (needs JDK), Lean proofs
(needs elan/lake), TLC (needs JDK + tla2tools.jar),
Stryker (slow — scheduled not per-PR), FsCheck tests.
Output: a gate inventory at
`docs/research/ci-gate-inventory.md`.
4. First workflow: `build-and-test.yml` covering
`dotnet build -c Release` + `dotnet test Zeta.sln -c
Release` on `ubuntu-latest` + `macos-latest`. Nothing
else until Aaron has signed off on the gate list.
5. Subsequent workflows added one at a time, each with
explicit Aaron sign-off on the design doc first.

## P0 — security / SDL artifacts

- [ ] **`docs/security/CRYPTO.md`** — justify CRC32C (integrity, not
Expand Down
203 changes: 121 additions & 82 deletions docs/CURRENT-ROUND.md
Original file line number Diff line number Diff line change
@@ -1,94 +1,133 @@
# Current Round — 28 (open)
# Current Round — 29 (open)

Round 27 closed; narrative absorbed into
`docs/ROUND-HISTORY.md`. Round 28 opens with these
deferred items carried over from round 27:
Round 28 closed; narrative absorbed into
`docs/ROUND-HISTORY.md`. Round 29 opens with **CI setup as
the anchor** — see `docs/BACKLOG.md` §"P0 — CI /
build-machine setup" for the full discipline rules and
sub-tasks.

## Status

- **Round number:** 28
- **Opened:** 2026-04-18 (continuous from round-27 close)
- **Classification:** split — non-trivial P1 carryover
from reviewer pass + other deferred work.
- **Reviewer budget:** 2-3 per §13 + mandatory Kira+Rune
floor per §20 on any code-landing phase.

## Carried from round 27

**Reviewer P1 findings (Kira + Rune, logged to DEBT):**
- `OutputBuffer` tick-stamp + invalidate-on-tick-end.
- `ReadDependencies` defensive copy at registration.
- `box plugin` triple-evaluation → `let boxed = box plugin` once.
- `BayesianRateOp` `int64` accumulator → `Checked.(+)` or saturate.
- `INestedFixpointParticipant` inherits `IOperator<'TOut>`.
- `PluginHarness` id-space via `Int32.MinValue`-range synthetics.
- `IOperator<'T>` → `IZetaOperator<'T>` rename (before external adoption).
- `Op<'T>.Value` mixed-accessibility hover-doc pointing at `OutputBuffer.Publish`.
- `PluginApi.fs` split when >300 lines.
- PLUGIN-AUTHOR.md `[<Extension>]` explanation in sample.
- Extract `internal assignHarnessId` helper shared by `Circuit.Build` + `PluginHarness`.

**Design work still open:**
- **Round number:** 29
- **Opened:** 2026-04-18 (continuous from round-28 close)
- **Classification:** infrastructure round — CI pipeline
design is the anchor; product work (bilinear / sink-
terminal laws, Option-A stateful promotion) runs in
parallel where it doesn't block on Aaron's CI-design
review gates.
- **Reviewer budget:** Kira + Rune floor per GOVERNANCE.md
§20 on every code landing. Aaron personally reviews
every CI-decision artefact.

## Round 28 close — what landed

Anchor goal achieved: FsCheck law runner live as a test-
time library.

- `LawRunner.checkLinear` — generic additivity check; works
over `ZSet` and plain numerics via user-supplied `add`/
`equal` callbacks.
- `LawRunner.checkRetractionCompleteness` — Option B
(trace-based), state-restoration via continuation after
reviewer P0 rewrite. Catches retraction-lossy stateful
ops (tested against a floored-counter fixture).
- Per-sample `System.Random(seed + i)` for true bit-exact
reproducibility of `(seed, sampleIndex)`.
- Deterministic-simulation framing locked in
`docs/research/stateful-harness-design.md`; Option A
(enrich `IStatefulStrictOperator` with `Init`/`Step`/
`Retract` triple matching the DBSP paper's `(σ, λ, ρ)`
shape) is the planned additive promotion in round-30+.
- Scaffolding cleanup: `tools/lean4/` `lake new` leftovers
removed; `Lean4.lean` rewired to import the real
`DbspChainRule` proof file.

## Round 29 anchor — CI pipeline setup

**Discipline rules (committed up front; `docs/BACKLOG.md`
has the full text):**

1. Read `../scratch` (build-machine setup) and
`../SQLSharp` (GitHub workflows) for shape + intent.
**Never copy files.** Hand-craft every artefact.
2. Aaron reviews every CI design decision before it lands.
This is not a "ship and iterate" surface.
3. Cost discipline: every CI minute earns its slot; default
to narrow matrix and widen with a stated reason.
4. Cross-platform eventual: macOS + Linux first; Windows
when there's a Windows-breaking test to justify it.
Aaron can run rounds on Windows on request.
5. Product work and CI work run in parallel on the same
machine; dispatch research subagents for CI design
concurrently with product work as long as neither
writes the same file.

**Sub-task sequence (each its own Aaron review gate):**

1. Audit `../scratch` → `docs/research/build-machine-setup.md`.
2. Audit `../SQLSharp/.github/workflows/` →
`docs/research/ci-workflow-design.md`.
3. Gate inventory → `docs/research/ci-gate-inventory.md`.
4. First workflow: `build-and-test.yml` covering
`dotnet build -c Release` + `dotnet test Zeta.sln -c
Release` on `ubuntu-latest` + `macos-latest`. Nothing
else until Aaron signs off on the gate list.
5. Subsequent workflows land one at a time, each with an
explicit design doc and sign-off.

## Carried from round 28

**Law runner follow-ups (DEBT-tracked):**
- `check*` take 8-11 positional args → promote to config
record before `checkBilinear` lands.
- `LawViolation.Message` → structured DU.
- Test covering ops that omit the marker tag.

**Law coverage (product work; can run in parallel with CI):**
- `checkBilinear` — join-shaped ops; standard DBSP
incrementalisation form.
- `checkSinkTerminal` — Sink-tagged ops must not compose
into a relational path.
- Option-A promotion of `IStatefulStrictOperator` to
explicit `(σ, λ, ρ)` triple — round-30+ unless CI work
stalls on a review gate, in which case advance here.

**Open from round 27 (deferred to round-30+ pool):**
- `IsDbspLinear` Lean predicate + B1/B2/B3/chain_rule
closures (Tariq option-c; half-day B2, two days full).
- FsCheck law runner at `Circuit.Build()` per capability
tag — unblocks PLUGIN-AUTHOR.md's soft-claim in "Known
limits of round-27."

**Persona-notes migration tail:**
- 6 remaining persona notebooks still in single-file
layout (`public-api-designer.md`, `skill-tune-up-ranker.md`,
`best-practices-scratch.md`, `algebra-owner.md`,
`formal-verification-expert.md`, `agent-experience-researcher.md`).
Lazy migration per §21; convert when a persona next
writes a typed memory entry.

**Other deferred:**
- Rune on `docs/STYLE.md` decision (small).
- UX + DX persona proposals.
- Empathy-coach persona spawn (naming pending).

## Workflow cadence (established round-26, codified round-27)

- Each round runs on its own branch (`round-N`).
- Coherent changes within a round become separate commits
where it helps readability.
- Round-close = PR from `round-N` to `main` + merge.
- Reviewer pass per §20 before round-close.
- Maintainer may request a review pass on the branch diff
before merge; ask before pushing the merge.
closures (Tariq option-c).
- Reviewer P1 list: `OutputBuffer` tick-stamp,
`ReadDependencies` defensive copy, BayesianRate
`Checked.(+)`, `IOperator` → `IZetaOperator` rename
window, `PluginApi.fs` split when >300 lines.

## Open asks to the maintainer

- **NuGet prefix reservation** on `nuget.org` for
`Zeta.*` — maintainer owns.
- **`global.json` `rollForward`** — status quo vs relaxed
(silent-pick status quo unless objection).
- **Eval-harness MVP scope** — still pending since round
23.
- **Repo visibility** — currently private on AceHack;
flip to public when ready.

## Next architect actions

1. Open `round-28` branch off `main` once round-27 PR
merges.
2. Anchor choice: **FsCheck law runner** (unblocks
plugin-author trust + validates Tariq's design) or
**OutputBuffer tick-stamp** (closes Kira's P0 that
DEBT'd this round). Recommend law runner — larger
impact, removes the soft-claim from PLUGIN-AUTHOR.md.
3. Dispatch code-phase reviewer floor (Kira + Rune) per
§20 on any code that lands.
- **Aaron decisions blocking round-29 progress:**
- Sign-off on the build-machine-setup design doc (first
sub-task).
- Sign-off on the CI-workflow design doc (second sub-
task).
- Sign-off on the gate inventory (third sub-task).
- Sign-off on the first workflow's OS matrix and dotnet
pin (fourth sub-task).
- **NuGet prefix reservation** on `nuget.org` for `Zeta.*`
— still maintainer-owned.
- **`global.json` `rollForward`** — status quo vs relaxed.
- **Eval-harness MVP scope** — pending since round 23.
- **Repo visibility** — private on AceHack; flip to public
when ready.

## Notes for the next Kenji waking

- `memory/` is canonical shared memory (not the
sandbox). See GOVERNANCE.md §18 + §22.
- `memory/persona/<persona>/` is per-persona memory.
Kenji's seat already folder-migrated; others lazy.
- Reviewer pass per §20 is mandatory every code-landing
round. Kira + Rune is the floor.
- Public API changes go through Ilyana per §19.
- `memory/` is canonical shared memory; `memory/persona/
<name>/` is per-persona (name-keyed).
- Reviewer pass per GOVERNANCE.md §20 is mandatory every
code-landing round. Kira + Rune is the floor.
- Public API changes go through Ilyana per GOVERNANCE.md
§19.
- `~/.claude/projects/` is Claude Code sandbox, not git.
Do not cite as canonical (§22).
Do not cite as canonical (GOVERNANCE.md §22).
- **CI decisions need Aaron sign-off before landing** —
round-29 discipline rule.
- `../scratch` and `../SQLSharp` are **read-only references**
— never copy files.
21 changes: 21 additions & 0 deletions docs/DEBT.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,27 @@ feature + debt budget).

## Live debt

### `LawRunner.check*` takes 8-11 positional args — promote to config record
- **Site:** `src/Core/LawRunner.fs`
- **Found:** round 28 by Rune (maintainability-reviewer)
- **Effort:** S
- **Friction:** call sites stack integer literals with trailing `// seed` / `// samples` comments; the next law (`checkBilinear`) will need 10+ positionals and multiply the drift surface.
- **Fix:** introduce `LinearityConfig<'TIn,'TOut>` / `RetractionConfig<'TIn,'TOut>` records and take one argument. Do it before `checkBilinear` lands.

### `LawViolation.Message` is a string — promote to a structured DU
- **Site:** `src/Core/LawRunner.fs`
- **Found:** round 28 by Kira (harsh-critic)
- **Effort:** S
- **Friction:** tests string-grep on `"Linearity broke"` / `"Retraction incomplete"`; downstream tooling can't pattern-match on cause.
- **Fix:** `type Reason = LinearityBreak of tick:int | RetractionResidual of count:int | BadArgs of string`; keep `Message` as a rendering helper.

### `LawRunner` has no test covering operators that omit the marker tag
- **Site:** `tests/Tests.FSharp/Plugin/LawRunner.Tests.fs`
- **Found:** round 28 by Kira (harsh-critic)
- **Effort:** S
- **Friction:** law runner claims to verify the tag; no test proves the runner doesn't secretly rely on the tag to compile-time dispatch.
- **Fix:** add a fixture that implements `IOperator<_>` directly (no `ILinearOperator` tag) and confirm `checkLinear` runs against it.

### `Op<'T>` implicitly publicised as a plugin subclass-extension point
- **Site:** `src/Core/Circuit.fs` (`Op`, `Op<'T>`) and every subtype under
`src/Core/Operators.fs`
Expand Down
Loading