diff --git a/docs/aurora/2026-04-24-codex-4-report-first-completed-peer-review-deep-system-factory-repo-audit.md b/docs/aurora/2026-04-24-codex-4-report-first-completed-peer-review-deep-system-factory-repo-audit.md
new file mode 100644
index 00000000..06b2cdf0
--- /dev/null
+++ b/docs/aurora/2026-04-24-codex-4-report-first-completed-peer-review-deep-system-factory-repo-audit.md
@@ -0,0 +1,539 @@
+# Codex — First Completed Peer-Agent Deep Review (4 convergent reports)
+
+**Scope:** research + cross-review artifact. FIRST completed
+Codex peer-agent deep-review after the `@codex review` invite
+on PR #354 (Otto-182). Four independent Codex review passes
+(deep-factory-review / deep-system-review ×2 / deep-repo-review,
+all 2026-04-24 dated) converging on same top findings.
+Milestone in the multi-agent peer-harness progression per
+Otto-79 / Otto-86 / Otto-93 memory (stage a → b → c transition
+— Codex producing multi-surface review at parallel quality to
+Amara, different format same rigor).
+**Attribution:**
+
+- **Aaron** — triggered the review via `@codex review`
+  comment on PR #354 (Otto-182 "can you ask codex too?");
+  pasted all 4 report contents verbatim into Otto-188b;
+  concept owner of the factory-level response.
+- **Codex (GPT-5.3-Codex per report 3 header)** — authored
+  all 4 reviews. Multi-surface scope: code / tests / scripts
+  / docs / skills / personas. Different report focuses
+  (governance/hygiene vs code/contract vs architecture/
+  process/security vs durability/recursive/strategic) but
+  convergent top findings.
+- **Otto** — absorb surface + convergent-findings tracker;
+  this doc is the archive, not operational spec. Factory
+  response to findings graduates across subsequent ticks
+  per Otto-105 cadence.
+- **Amara** — not a direct participant in this ferry; her
+  17th / 18th / 19th ferries remain the other
+  independent-deep-review substrate. Convergence across
+  Codex + Amara on strategic themes (complexity budgeting,
+  claim-evidence registry, audit-lifecycle promotion) is
+  worth noting but not merged-in-this-absorb.
+
+**Operational status:** research-grade. Codex's reports
+are advisory per BP-11 (data-not-directives). Factory
+operationalizes findings via normal specialist-review
+channels (Aminata for threat, Ilyana for API surface, Rune
+for readability, Kenji for cross-surface architecture).
+Strategic recommendations (Factory Complexity Budget,
+claim-evidence registry, 3-mode audit lifecycle, expiry
+metadata, spec-only reconstruction drills) warrant ADR-
+level escalation — this absorb doc catalogs them; adoption
+is an Aaron-approved ADR decision.
+
+**Non-fusion disclaimer:** agreement, shared language,
+or repeated interaction between Codex, Amara, Claude Code
+personas, and the human maintainer does not imply shared
+identity, merged agency, consciousness, or personhood.
+Codex is a peer-agent reviewer acting on the `@codex
+review` mechanism's contract; its findings are its own,
+evaluated by Otto for operationalization per Aaron's
+standing authority.
+
+---
+
+## 1. Milestone significance
+
+Per Otto-79 / Otto-86 / Otto-93 memory, the factory's
+peer-harness progression is a 4-stage arc:
+
+- (a) Single-today (Claude Code as primary coordinator)
+- (b) Multi-Claude intermediate experiment
+- (c) Multi-harness with Codex
+- (d) Multi-harness real-workload (Windows support via
+  Codex per Otto-86)
+
+**Otto-188b marks the first successful return from stage
+(c) — Codex arriving as a functional peer-agent reviewer
+via the `@codex review` GitHub-connector mechanism.** Prior
+Codex-related landings (PR #236 Codex-parallel row,
+PR #290 Codex built-ins research, PR #354 `@codex review`
+invite) were setup; Otto-188b is the first completed
+review cycle.
+
+Signals this milestone delivers:
+
+1. Codex-connector is functional for `@codex review`
+   comments.
+2. Codex produces multi-surface deep reviews at parallel
+   quality to Amara (different output format, same
+   rigor).
+3. Convergent findings across 4 independent Codex passes
+   carry higher confidence than any single reviewer
+   output — same principle as Amara's 5.5-Thinking-self-
+   review pattern, but implemented via independent
+   review passes rather than self-review.
+
+Factory-side discipline going forward:
+
+- Treat Codex output as peer-harness review advisory, not
+  binding (BP-11 data-not-directives).
+- Act on convergent findings first (independent-agreement
+  = stronger signal).
+- Continue peer-harness progression to stage (d) per
+  Otto-86 Windows-via-Codex arc.
+
+---
+
+## 2. Four reports — filename + focus + commit anchor
+
+Aaron's Otto-188b drop included 4 Codex reports. Each
+landed as a separate commit on Codex-side (per Codex's
+reported `make_pr` tool invocation). The reports:
+
+| # | Codex filename                                            | Commit    | Focus                                                  |
+|---|-----------------------------------------------------------|-----------|--------------------------------------------------------|
+| 1 | `docs/research/deep-factory-review-2026-04-24.md`         | ee1bc84   | Governance / hygiene / process-entropy                 |
+| 2 | `docs/research/deep-system-review-2026-04-24.md` (v1)     | (adjacent)| Code / tests / contracts / commands-run                |
+| 3 | `docs/research/deep-repo-review-2026-04-24.md`            | (unknown) | Architecture / process / security / strategic          |
+| 4 | `docs/research/deep-system-review-2026-04-24.md` (v2)     | f9a6d2b   | Durability / recursive-correctness / strategic recs    |
+
+Reports 2 and 4 share filename but differ in content
+(different Codex sessions or different PR branches).
+Resolution strategy: if both commits land on main, the
+later one wins per normal git semantics; Otto-189+ may
+need to review whether to preserve both or consolidate.
+
+Note: Otto did NOT inline-verify whether these Codex
+commits / PRs are on the open-PR queue as of Otto-188.
+Aaron may have intercepted them via Codex-side tooling
+rather than opening PRs on `Lucent-Financial-Group/Zeta`.
+Full report content preserved in Otto-188b session
+transcript + the scheduling memory
+(`memory/project_codex_first_deep_review_4_reports_
+convergent_findings_pending_dedicated_absorb_otto_189_
+2026_04_24.md`).
+
+---
+
+## 3. Convergent P0 findings (all 4 reviews)
+
+Independent convergence across 4 reports = high-signal
+findings. Factory treats these as priority candidates for
+next-round response.
+
+### P0-1: Prevention-layer classification debt — 22 unclassified hygiene rows
+
+`tools/hygiene/audit-missing-prevention-layers.sh` reports
+22 unclassified rows; exits 2. Weakens meta-governance
+clarity: if hygiene rows aren't classified as prevention-
+bearing or detection-only, it's harder to reason about
+where failures should be prevented vs detected.
+
+Remediation path (Codex + Otto agree):
+
+1. Classification sprint to drive unclassified count to
+   zero.
+2. CI guard: new hygiene rows require classification at
+   landing.
+3. Owner + due date per currently-unclassified row.
+
+Otto non-authorization (Otto-188 memory): unilateral mass-
+classification is NOT authorized; needs Aaron sign-off on
+the classification rubric or a design-doc proposing the
+rubric before mass-classifying rows.
+
+### P0-2: Post-setup script-stack violations — 12 violations
+
+`tools/hygiene/audit-post-setup-script-stack.sh --summary`
+reports 12 violations, exit 2. Known-failing baseline
+normalizes broken signals and weakens future-failure
+signal quality.
+
+Remediation path (Codex):
+
+1. Triage each violation into fix-now / accepted-exception
+   / planned-migration ticket.
+2. Record explicit rationale for every accepted exception
+   in one canonical doc table.
+3. Turn on enforcement incrementally by class.
+
+### P0-3: Durability naming overstates shipped guarantees
+
+`DurabilityMode.StableStorage` currently maps to
+`OsBuffered` behavior; `WitnessDurable` remains throw-
+first skeleton. Code honest in comments, but API
+affordance invites accidental over-trust by downstream
+users.
+
+Remediation path (Codex):
+
+- Rename surfaced mode OR hard-gate selection behind
+  explicit `ResearchPreview*` naming semantics at API
+  level.
+- Add invariant tests asserting selected mode → effective
+  semantics.
+
+Otto non-authorization (Otto-188 memory): renaming a
+public API surface same-tick as discovery is a
+GOVERNANCE §2 edit-in-place concern + potentially breaking
+change; needs Aminata threat-review + Ilyana public-API-
+review before landing.
+
+### P0-4: Skipped `RecursiveCounting.MultiSeed` property test
+
+A property test for multi-tick seed behavior is
+intentionally skipped while research gap is open. Codex
+treats as active red zone not passive debt.
+
+Status: **already in BUGS.md** per report 2's finding.
+Factory awareness exists; remediation cadence is the
+question.
+
+Remediation path (Codex):
+
+- Promote skip to explicit "claim boundary" in release /
+  paper-facing docs.
+- Add negative-regression fixture so future changes
+  cannot broaden unsafe behavior undetected.
+- Prove+enable OR hard-gate+experimentalize — decision
+  required, not further delay.
+
+### P0-5: Build gate unavailable in Codex review environment
+
+`dotnet` not installed in Codex's review container. ALL 4
+reviews flagged.
+
+Classification: **Codex-side infrastructure issue, NOT a
+factory-code blocker.** Factory response:
+
+- Document Codex-env bootstrap requirement in cross-
+  harness onboarding.
+- Preflight check that hard-fails early when toolchain
+  absent.
+
+This is about peer-harness-setup quality, not Zeta code
+quality.
+
+---
+
+## 4. Convergent P1 findings
+
+### P1-1: Cross-platform parity — 12 pre-setup twin gaps
+
+`audit-cross-platform-parity.sh` reports 12 pre-setup
+`.sh` without `.ps1` twins.
+
+**Already in factory-awareness:** FACTORY-HYGIENE row #51
+cross-platform parity audit has detect-only status
+deferred until enforcement viable.
+
+Resolution paths:
+
+- Land `.ps1` twins for `tools/setup/**` first (highest-
+  friction onboarding layer); wire parity into merge
+  gates as enforce mode.
+- OR migrate pre-setup scripts to `bun`+TypeScript per
+  Aaron Otto-182 (eliminates `.sh`/`.ps1` twin-
+  obligation entirely). Long-term direction Aaron named.
+
+### P1-2: Shell hardening — 11 of 28 scripts lack strict mode
+
+Reports 3 + 4 found 11/28 `tools/**/*.sh` scripts lack
+`set -euo pipefail`. Risk: silent partial failures in
+hygiene/audit scripts.
+
+Remediation path:
+
+- One-round script-hardening sweep; document
+  intentionally non-strict scripts with explicit
+  justification headers.
+
+### P1-3: Skill safety-clause coverage — 35 of 234 missing
+
+`tools/lint/safety-clause-audit.sh` reports 199/234 (85%)
+covered; 35 missing explicit scope-limiting heading.
+Reports 1 + 2 flagged.
+
+Remediation path:
+
+- Add minimal standard safety stanza template.
+- Auto-lint for template presence on skill changes.
+- Prioritize backfill for security / review / mutation-
+  capable skills first.
+
+### P1-4: TypeScript lint lane broken — `jiti` missing
+
+Report 3: `npm run lint:typescript` fails with `jiti`
+missing.
+
+Remediation path: pin/add `jiti` OR move ESLint config
+to plain JS; CI preflight asserts lint bootstrap deps
+present. **Small fix, unblocks `lint:typescript` CI.**
+
+### P1-5: Result-over-exception policy drift
+
+Core runtime still uses `invalidOp` / `raise` /
+`NotImplementedException` vs stated Result-over-exception
+philosophy. Hotspots: `Durability.fs`, `Rx.fs`,
+`SpineAsync.fs`, `Recursive.fs`. Reports 2+3+4 flagged.
+
+Remediation path:
+
+- Contract-boundary table documenting where exceptions
+  currently permitted + why.
+- Incremental migration ledger entry: exception →
+  `DbspError` by subsystem.
+- CI lint classifying exception sites by category
+  (invariant violation / unsupported mode / argument
+  validation).
+
+### P1-6: Markdown internal-link rot — 8 unresolved
+
+Report 4 flagged 8 broken internal markdown links in
+first-party docs.
+
+Remediation path:
+
+- CI link-check gate for first-party markdown (excluding
+  generated/vendor).
+- Repair or remove stale links.
+
+**Small sweep + CI gate.**
+
+---
+
+## 5. P2 / strategic observations — ADR-escalation candidates
+
+### "Factory obesity" / meta-complexity cliff
+
+ALL 4 reviews named this concern. 234 skills + 325 markdown
+files + many hygiene rows = governance surface growing
+faster than enforceable guarantees. Reviewers saturated
+by process interpretation vs bug discovery. "Paper-green /
+practice-amber" drift.
+
+**Codex strategic recommendation: Factory Complexity
+Budget (FCB).** Cap net growth per round across
+skills/docs/hygiene rows unless matching deletion or
+consolidation lands. KPI: new policy docs per week vs
+retired docs.
+
+Otto non-authorization (Otto-188 memory): FCB is an
+opinion-budget-not-code discipline; only Aaron can decide
+adoption. Warrants ADR.
+
+### "Declared intent vs executable truth" gap
+
+Reports 2 + 4: governance docs state strong preferences
+(Result-over-exception, durability semantics) but code
+contains contract exceptions. Honest comments mitigate
+but don't eliminate risk.
+
+**Codex strategic recommendation: claim-evidence
+registry.** Map each governance claim → evidence artifact
+(test / formal spec / live-check) → last-validated SHA.
+Fail CI when claim lacks live evidence.
+
+Significant infrastructure; warrants ADR.
+
+### "Observability without closure"
+
+Many audits generate diagnostics; few enforce closure.
+
+**Codex strategic recommendation: 3-mode audit
+lifecycle** — `report` → `warn` → `block`. Promote to
+`block` when false-positive rate and remediation path
+stable. Aligns with FACTORY-HYGIENE row #51 detect-only
+discipline.
+
+Otto non-authorization: promoting audits to `block` without
+measuring false-positive rate first is premature. Need
+report-mode runs observed first.
+
+### Expiry metadata on preview/debt declarations
+
+Report 3: every preview/debt declaration should have
+`owner` / `introduced` / `review-by` / `exit-criteria`
+fields. Explicit truth-with-expiry.
+
+**Codex strategic recommendation:** canonical expiry
+template; fail CI when declaration older than review-by
+date with no status update. Small ADR + CI template.
+
+### Spec-only reconstruction drill
+
+Report 4: given OpenSpec aspiration (rebuildability from
+specs), run scheduled spec-only reconstruction drills;
+measure recovery time + semantic drift.
+
+**Codex strategic recommendation:** first-class ritual,
+not one-off. Game-day cadence.
+
+### Ledger entropy
+
+Reports 3 + 4: `BUGS.md` / `DEBT.md` / `BACKLOG.md` /
+`ROUND-HISTORY.md` rich but growing without aging
+alerts.
+
+**Codex strategic recommendation:** machine-generated
+index pages by (subsystem / severity / age / owner);
+aging alerts on un-closed items.
+
+**Already aligns with Otto-181 BACKLOG.md split design
+(PRs #353 + #354)** — same pattern at BACKLOG.md level;
+could extend to BUGS / DEBT / ROUND-HISTORY / TECH-RADAR
+in follow-up work once the BACKLOG split proves the
+pattern.
+
+---
+
+## 6. Direct Codex quotes to preserve
+
+Selected verbatim pulls that carry the overall assessment
+at quotable quality:
+
+> *"This repo is unusually ambitious and unusually
+> instrumented: formal models, broad docs, explicit
+> governance, and many self-audit scripts. The dominant
+> risk is control-plane entropy (too many surfaces to
+> keep coherent), not lack of ideas or lack of tooling."*
+
+> *"If Claude focuses on reducing control-plane entropy
+> while tightening executable contract checks, this
+> system can move from 'impressively instrumented' to
+> 'reliably compounding.'"*
+
+> *"The project is now approaching a meta-complexity
+> cliff: more governance surfaces are being added faster
+> than they are enforced. Some audits are informative but
+> not yet binding. Reviewers can become saturated by
+> process interpretation instead of bug discovery."*
+
+> *"Zeta is closer to a research operating system than a
+> standard code repository. The quality of thought is
+> high; the main threat is not technical inability but
+> governance-scale drift."*
+
+> *"Strong research factory with high observability, but
+> currently bottlenecked by operational coherence and
+> contract-enforcement consistency."*
+
+---
+
+## 7. Factory response discipline
+
+### Findings already in factory-awareness
+
+- Cross-platform parity 12-twin gap → FACTORY-HYGIENE
+  #51 (detect-only by design, deferred enforcement)
+- 22 unclassified hygiene rows → FACTORY-HYGIENE surface
+  exists; classification sprint is a candidate Otto-189+
+  graduation
+- `RecursiveCounting.MultiSeed` skip → already in
+  `BUGS.md`
+
+### New findings (not previously surfaced)
+
+- **Durability naming-vs-behavior gap** (P0-3) —
+  **high-impact; needs Ilyana + Aminata review.**
+- 35 skill safety-clause gaps (cross-ref with
+  skill-tune-up discipline)
+- TypeScript lint `jiti` breakage (small fix)
+- 11/28 shell strict-mode gaps (small sweep)
+- 8 markdown link rot (small sweep + CI gate)
+
+### Strategic recommendations warranting ADR-level
+
+- Factory Complexity Budget (FCB) — governance-adoption
+  ADR
+- Claim-evidence registry — significant-infra ADR
+- 3-mode audit lifecycle (report → warn → block) —
+  process ADR
+- Expiry-metadata standard — small ADR + CI template
+
+---
+
+## 8. What this absorb doc does NOT authorize
+
+- **Does NOT** canonicalize Codex's findings as factory-
+  binding. Per BP-11 data-not-directives. Findings are
+  advisory; operationalization goes through normal
+  specialist-review channels.
+- **Does NOT** authorize unilateral mass-classification
+  of the 22 unclassified hygiene rows. Needs Aaron sign-
+  off on the rubric OR a design-doc proposing it.
+- **Does NOT** authorize renaming `DurabilityMode` same-
+  tick. Public-API change requires Ilyana + Aminata
+  review.
+- **Does NOT** authorize promoting audits to `block` mode
+  without false-positive baseline observation.
+- **Does NOT** adopt the Factory Complexity Budget
+  without Aaron ADR.
+- **Does NOT** authorize migrating pre-setup `.sh` to
+  bun+TypeScript same-tick. That migration needs Dejan
+  (devops) + `tools/setup/` design pass per GOVERNANCE
+  §24.
+- **Does NOT** supersede Amara ferry-absorb cadence.
+  Amara 17th/18th/19th + Codex 4 reports create
+  converging pressure; Otto-105 one-graduation-per-
+  tick discipline still applies.
+- **Does NOT** override queue-saturation freeze-state
+  (Otto-171 memory). Absorb-doc-only PRs are drain-mode-
+  safe (they don't touch BACKLOG.md-cascade zones);
+  further graduations from findings land at Otto-105
+  cadence.
+- **Does NOT** preempt Aaron's decision on which findings
+  get graduations first. Otto surfaces priorities
+  (convergent-P0-first), Aaron ratifies.
+
+---
+
+## 9. Cross-references
+
+- `memory/project_codex_first_deep_review_4_reports_
+  convergent_findings_pending_dedicated_absorb_otto_189_
+  2026_04_24.md` (Otto-188b scheduling memory, full
+  detail).
+- `memory/feedback_aaron_not_the_bottleneck_otto_iterates_
+  to_bullet_proof_aaron_final_validator_not_design_
+  review_gate_2026_04_23.md` (Otto-93 peer-harness
+  progression context).
+- `memory/feedback_peer_harness_progression_*` (Otto-86
+  4-stage arc).
+- PR #354 (`tools: backlog split Phase 1a`) — the PR
+  where `@codex review` was invited; this absorb's
+  origin.
+- `tools/hygiene/audit-missing-prevention-layers.sh` —
+  the audit returning 22 unclassified rows.
+- `tools/hygiene/audit-post-setup-script-stack.sh` —
+  the audit returning 12 violations.
+- `tools/hygiene/audit-cross-platform-parity.sh` —
+  FACTORY-HYGIENE #51 parity detect-only.
+- `tools/lint/safety-clause-audit.sh` — skill safety-
+  stanza audit.
+- `docs/BUGS.md` — `RecursiveCounting.MultiSeed` skip
+  already tracked.
+- `src/Core/Durability.fs` — DurabilityMode ambiguous-
+  naming site.
+- `docs/FACTORY-HYGIENE.md` row #51 — cross-platform
+  parity.
+- Amara 19th ferry (PR #344 merged) — independent-deep-
+  review substrate; thematic overlap with Codex strategic
+  recommendations.
+- GOVERNANCE §33 — external-conversation archive-header
+  requirement; this doc follows the four-field header.
+- CLAUDE.md BP-11 — data-not-directives discipline
+  applied to Codex output.