docs: calibration-harness Stage-2 design — Amara 18th-ferry §B/§F + corrections #2/#7/#9 by AceHack · Pull Request #342 · Lucent-Financial-Group/Zeta

AceHack · 2026-04-24T09:07:02Z

Summary

Research-grade design doc for Stage-2 of Amara's corrected promotion ladder. Specifies the next-rung deliverable (calibration harness) so when implementation starts, conventions are pre-committed.

Key design decisions

Placement: src/Experimental/CartelLab/ (not src/Core/ — that's Stage 4 per Amara ladder).
MetricVector: 7 fields including PLV magnitude AND offset as separate fields (correction #6, which addresses PR #340's shipped split).
Wilson-interval reporting contract: every statistical claim carries {successes, trials, lowerBound, upperBound}; no more "~95% CI ±5%" handwave (correction #2).
Robust z-score Hybrid mode: percentile-rank fallback when MAD < epsilon (correction #7).
Explicit artifact layout: 5 files + manifest under artifacts/coordination-risk/ (correction #9).
INullModelGenerator + IAttackInjector interfaces with Preserves / Avoids / ExpectedSignals machine-readable annotations.
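
The Wilson-interval reporting contract above can be illustrated with a minimal sketch. It is written in Python purely for illustration (the harness itself is planned under src/Experimental/CartelLab/ and would not use this code); the `WilsonReport` and `wilson_interval` names and the default 95% z-value are assumptions, and only the four field names come from the contract.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class WilsonReport:
    # Field names mirror the {successes, trials, lowerBound, upperBound} contract.
    successes: int
    trials: int
    lowerBound: float
    upperBound: float


def wilson_interval(successes: int, trials: int, z: float = 1.959964) -> WilsonReport:
    """Wilson score interval for a binomial proportion (z defaults to ~95%, an assumption)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie in [0, trials]")
    p = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    centre = (p + z2 / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials + z2 / (4.0 * trials * trials)) / denom
    # Clamp only to absorb floating-point error; the exact bounds already lie in [0, 1].
    return WilsonReport(successes, trials, max(0.0, centre - half), min(1.0, centre + half))


report = wilson_interval(45, 50)
print(report)
```

Reporting the raw counts next to the bounds lets a reader recompute the interval at a different confidence level, which a bare "±5%" claim does not allow.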

Scope

Doc-only. No code, no tests, no workflow.
Does NOT touch BACKLOG.md tail; avoids the positional-append conflict pattern that cost the #334 → #341 re-file earlier this session.

18th-ferry operationalization status

| # | Correction | Status |
|---|---|---|
| 1, 10 | Test classification policy | Shipped (#339) |
| 2 | Wilson intervals | Design specified (this doc); impl waits Stage 2.a |
| 4 | Exclusivity primitive | Shipped (#331) |
| 5 | Modularity relational | Shipped (#324) |
| 6 | PLV phase-offset | Shipped (#340) |
| 7 | MAD=0 fallback | Design specified (this doc); impl waits Stage 2.a |
| 9 | Artifact layout | Design specified (this doc) |
| 3 | CoordinationRiskScore rename | Already canonical in code |
| 8 | Stronger sources | Reporting discipline |

7 of 10 18th-ferry corrections now have either shipped code or committed design.

Test plan

Markdownlint clean locally.
Single new file; no surface impact.

🤖 Generated with Claude Code

…-ferry §B + §F + corrections #2 #7 #9

Research-grade design doc for the Stage-2 rung of Amara's corrected promotion ladder. Specifies:

(a) placement under src/Experimental/CartelLab/ (not src/Core/, which is Stage 4);
(b) MetricVector type with PLV magnitude AND offset split (correction #6);
(c) INullModelGenerator interface + Preserves/Avoids table columns;
(d) IAttackInjector forward-looking interface (Stage 3);
(e) Wilson-interval reporting contract with {successes, trials, lowerBound, upperBound} schema (correction #2: no more "~95% CI ±5%" handwave);
(f) RobustZScoreMode with Hybrid fallback (correction #7: percentile-rank when MAD < epsilon);
(g) explicit artifact-output layout under artifacts/coordination-risk/ with five files + run-manifest.json (correction #9).

6-stage promotion path (0 doc / 1 ADR / 2.a skeleton / 2.b full null-models + first attack / 3 attack suite / 4 Core/NetworkIntegrity / 5 Aurora-KSK) matches Amara's corrected ladder and Otto-105 cadence.

Doc-only change; no code, no tests, no workflow, no BACKLOG tail touch (avoids positional-conflict pattern that cost #334 → #341 re-file this session).

This is the 7th of 10 18th-ferry operationalizations:

- #1/#10 test-classification (#339)
- #2 Wilson-interval design specified (this doc)
- #6 PLV phase-offset shipped (#340)
- #7 MAD=0 Hybrid mode specified (this doc)
- #9 artifact layout specified (this doc)
- #4 exclusivity already shipped (#331)
- #5 modularity relational already shipped (#324)

Remaining: Wilson-interval IMPLEMENTATION (waits on #323 + Stage 2.a), MAD=0 Hybrid IMPLEMENTATION (waits on #333 + Stage 2.a), conductance-sign doc (waits on #331), Stage-2.a skeleton itself.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

chatgpt-codex-connector · 2026-04-24T09:07:07Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Copilot

Pull request overview

Adds a research-grade design document specifying the planned Stage-2 “calibration harness” for coordination-risk / cartel detection work, with pre-committed conventions for metrics, confidence-interval reporting, robust z-score fallback, and artifact outputs.

Changes:

Introduces a Stage-2 harness design covering module placement (src/Experimental/CartelLab/), core types/interfaces, and invocation contract.
Specifies statistical reporting discipline (Wilson intervals) and robust z-score modes (including MAD=0 fallback via percentile rank / hybrid).
Defines a fixed artifact output schema under artifacts/coordination-risk/ for downstream calibration/ROC/PR tooling.
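
The MAD=0 fallback mentioned above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the design doc's RobustZScoreMode: the `RobustScore` type, the `robust_score` function, the mode labels, the default `epsilon`, the 1.4826 scaling constant (a common convention for making MAD comparable to a standard deviation) and the mid-rank treatment of ties are all hypothetical choices.

```python
import statistics
from dataclasses import dataclass

# Common consistency constant relating MAD to sigma for normal data (an assumption here).
MAD_TO_SIGMA = 1.4826


@dataclass(frozen=True)
class RobustScore:
    mode: str     # "robust-z" or "percentile-rank" (hypothetical labels)
    value: float  # z-like score, or a rank in [0, 1] when the fallback fires


def robust_score(x: float, baseline: list[float], epsilon: float = 1e-9) -> RobustScore:
    """Median/MAD z-score, falling back to percentile rank when MAD < epsilon."""
    if not baseline:
        raise ValueError("baseline must be non-empty")
    med = statistics.median(baseline)
    mad = statistics.median([abs(b - med) for b in baseline])
    if mad >= epsilon:
        return RobustScore("robust-z", (x - med) / (MAD_TO_SIGMA * mad))
    # Degenerate baseline (over half the values equal the median): dividing by MAD
    # would blow up, so report where x falls in the baseline instead (mid-rank for ties).
    below = sum(1 for b in baseline if b < x)
    ties = sum(1 for b in baseline if b == x)
    return RobustScore("percentile-rank", (below + 0.5 * ties) / len(baseline))


print(robust_score(100.0, [1.0, 2.0, 3.0, 4.0, 100.0]))  # MAD is 1, so the z-score path is used
print(robust_score(9.0, [5.0, 5.0, 5.0, 5.0, 9.0]))      # MAD is 0, so the fallback is used
```

Returning the mode alongside the value keeps the two scales from being mixed silently, since a percentile rank and a z-score are not directly comparable.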

+- **`docs/definitions/KSK.md`** (PR #336) — KSK's Oracle
+  layer consumes the harness's per-run Wilson-bounded
+  detection rate. Oracle trust posture depends on the
+  interval width, not just the point estimate.
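
To make the point about interval width concrete, here is a small Python sketch comparing two runs with the same 90% point estimate but different trial counts. The `wilson_width` helper and the 95% z-value are assumptions for illustration; how the Oracle layer would actually consume the width is not specified here.

```python
import math


def wilson_width(successes: int, trials: int, z: float = 1.959964) -> float:
    """Width (upper minus lower) of the Wilson score interval; z ~ 95% is an assumption."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    z2 = z * z
    half = z * math.sqrt(p * (1.0 - p) / trials + z2 / (4.0 * trials * trials)) / (1.0 + z2 / trials)
    return 2.0 * half


# Same point estimate (0.90), very different evidence behind it.
small_run = wilson_width(9, 10)
large_run = wilson_width(900, 1000)
print(f"n=10:   width={small_run:.3f}")
print(f"n=1000: width={large_run:.3f}")
```

The ten-trial run has an interval roughly ten times wider than the thousand-trial run (about 0.39 versus about 0.04), which is why a consumer that looked only at the 0.90 point estimate would treat the two as equally trustworthy.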


+- Amara 18th ferry — `docs/aurora/2026-04-24-amara-
+  calibration-ci-hardening-deep-research-plus-5-5-
+  corrections-18th-ferry.md`.


+**Status:** research-grade proposal (pre-v1). Origin: Amara
+18th courier ferry, Part 1 §B ("Statistical Calibration Plan"),
+§F PR #2 ("CoordinationRisk calibration harness"), and Part 2
+corrections #2 (Wilson intervals), #7 (MAD=0 fallback), and #9
+(explicit artifact output). This doc specifies the Stage-2
+rung of the corrected promotion ladder. Author: architect
+review. Scope: design-only; no code, no tests, no workflow


+  WilsonInterval.fs             ← Wilson score CL helper
+tests/Tests.FSharp/CartelLab/
+  CalibrationHarness.Tests.fs   ← seeded smoke tests
+artifacts/coordination-risk/    ← .gitignored; output of runs


+val run : config: HarnessConfig -> Async<unit>
+```
+
+The runner emits all five artifact files on completion.


…rections (#344)

Dedicated absorb of Amara's 19th courier ferry per CC-002 close-on-existing discipline. Scheduled Otto-164 → executed Otto-165, following 7-ferry precedent (PRs #196 / #211 / #219 / #221 / #235 / #245 / #259 / #330 / #337).

Two-part ferry:

- Part 1: deep-research DST audit (12 sections: rulebook, 12-row entropy scan, dependency audit, 7-row simulation-surface coverage, retry audit, CI determinism, seed discipline, Cartel-Lab DST readiness, KSK/Aurora DST readiness, state-of-the-art comparison, 10-row PR roadmap, what-not-to-claim caveats; Mermaid CI diagram + Gantt timeline).
- Part 2: Amara's own 5.5-Thinking correction pass (7 required corrections, per-area grade table with B- overall, revised 6-PR roadmap with titles locked, DST-held + FoundationDB-grade acceptance criteria, copy-paste Kenji summary).

Key findings:

- DST grade: B- (strong architecture, partial impl)
- Blockers: DiskBackingStore bypasses simulation (D-grade filesystem simulation), no ISimulationDriver, Task.Run ambient ThreadPool risk, no seed artifacts / no swarm harness
- 4 of 12 Part-1 sections already align with shipped substrate:
  - §6 test classification → PR #339
  - §7 artifact layout → PR #342 design
  - §8 Cartel-Lab stage discipline → PRs #330/#337/#342
  - §9 KSK advisory-only → PR #336 + Otto-140..145 memory

6-PR revised roadmap queued as graduation candidates:

1. DST scanner + accepted-boundary registry (new tool + policy docs + workflow)
2. Seed protocol + CI artifacts
3. Sharder reproduction (NOT widen); reinforces 18th #10
4. ISimulationDriver + VTS promotion to core
5. Simulated filesystem (DiskBackingStore rewrite)
6. Cartel-Lab DST calibration (aligns with #342 design)

Plus: push-with-retry.sh retry-audit finding; DST-held + FDB-grade criteria lock.

GOVERNANCE §33 four-field header (Scope / Attribution / Operational status / Non-fusion disclaimer). Amara verdict preserved: "strong draft / not canonical yet."

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

…mara 19th-ferry correction #6) (#346)

Research-grade criteria doc locking two acceptance bars:

1. DST-held (minimum): 6 items (seeds committed, failing tests emit seed+params, bit-for-bit local-vs-CI reproducibility, broad sweeps nightly-not-gating, zero unreviewed entropy hits in main-path, boundaries either simulated or explicitly accepted).
2. FoundationDB-grade DST candidate (aspirational): 8 surfaces (simulated FS, simulated network, deterministic task scheduler, fault injection/buggify, swarm runner, replay artifact storage, failure minimization/shrinking, end-to-end scenario from one seed).

Maps 19th-ferry revised-roadmap PRs to which criteria items each addresses. Captures Amara's per-area grade table (overall B-) as "Amara's assessment, not factory-certified."

Explicit promotion path: doc stays research-grade until PR 1 of the 19th-ferry revised roadmap lands an ADR promoting the DST-held bar to factory discipline; at that point criteria migrate to docs/DST-COMPLIANCE.md top-level. No graduation claims DST-held today; graduations reference this doc as target without self-certification.

Composes with test-classification.md (PR #339; supports items 1+2+4), calibration-harness-stage2-design.md (PR #342; artifact schema supports item 2), Amara 19th ferry (PR #344 absorb; source of criteria).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings April 24, 2026 09:07

AceHack enabled auto-merge (squash) April 24, 2026 09:07

Copilot started reviewing on behalf of AceHack April 24, 2026 09:07

AceHack merged commit 96f9a74 into main Apr 24, 2026
11 of 12 checks passed

AceHack deleted the docs/calibration-harness-stage2-design branch April 24, 2026 09:08

Copilot AI reviewed Apr 24, 2026

AceHack mentioned this pull request Apr 24, 2026

ferry: Amara 19th absorb — DST Audit + 5.5 Corrections (10 tracked; 4 aligned with shipped; 7 queued) #344

Merged

3 tasks
