Lucent-Financial-Group · AceHack · Apr 25, 2026 · Apr 25, 2026 · Copilot · Apr 25, 2026
diff --git a/docs/research/codex-cli-first-class-2026-04-23.md b/docs/research/codex-cli-first-class-2026-04-23.md
@@ -88,9 +88,17 @@ CLI in the 2026 coding-agent landscape.
 ## 2 · The big, non-obvious win — `AGENTS.md` is already universal
 
 Claude Code reads `CLAUDE.md` first. Codex CLI reads `AGENTS.md`
-first. **Zeta's setup already has both, and the `CLAUDE.md`
-explicitly delegates to `AGENTS.md`** as the universal
-onboarding handbook. The relevant lines of `CLAUDE.md`:
+following a precedence chain: the global `~/.codex/AGENTS.override.md`
+/ `~/.codex/AGENTS.md` first, then each directory from the project root
+down to the CWD, with `AGENTS.override.md` taking precedence per directory
+(subject to a byte cap; see [Codex AGENTS guide](https://developers.openai.com/codex/guides/agents-md)).
+**Zeta's setup already has both `CLAUDE.md` and `AGENTS.md`, and
+`CLAUDE.md` explicitly delegates to `AGENTS.md`** as the universal
+onboarding handbook. Stage-2 readiness checks must account for
+the precedence chain — environments with global overrides at
+`~/.codex/AGENTS.override.md` can pass/fail ingestion checks for
+reasons unrelated to the repo's `AGENTS.md` content. The relevant
+lines of `CLAUDE.md`:
-reasons unrelated to the repo's `AGENTS.md` content. The relevant
-lines of `CLAUDE.md`:
+reasons unrelated to the repo's `AGENTS.md` content. **Concrete
+Stage-2 precondition:** run ingestion verification in a clean
+Codex home (no `~/.codex/AGENTS.md` and no
+`~/.codex/AGENTS.override.md`), or an equivalent isolated `HOME` /
+Codex config directory, and record which AGENTS file(s) were in
+effect during the run. If the harness cannot show that the repo's
+`AGENTS.md` was the active source, the check is inconclusive rather
+than a repo failure. The relevant lines of `CLAUDE.md`:
 
 > 1. **[`AGENTS.md`](../../AGENTS.md)** — the universal
 >    onboarding handbook. Pre-v1 status, the three
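
The precedence chain and the clean-home precondition described above can be sketched in a few lines. This is a minimal illustration that assumes only what this section states: global files under the Codex home come first, then each directory from the project root down to the CWD, with `AGENTS.override.md` winning over `AGENTS.md` inside the same directory. The function names (`pick_agents_file`, `active_agents_files`, `ingestion_verdict`) are hypothetical and not part of Codex CLI, and the byte cap and any other rules from the Codex guide are not modelled.

```python
from __future__ import annotations

from pathlib import Path

OVERRIDE = "AGENTS.override.md"
DEFAULT = "AGENTS.md"


def pick_agents_file(directory: Path) -> Path | None:
    """Return the override file if present, else the default file, else None."""
    for name in (OVERRIDE, DEFAULT):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


def active_agents_files(codex_home: Path, project_root: Path, cwd: Path) -> list[Path]:
    """List files in effect: the global one first, then project root down to cwd."""
    root = project_root.resolve()
    rel = cwd.resolve().relative_to(root)  # ValueError if cwd is outside the root
    dirs = [root]
    for part in rel.parts:
        dirs.append(dirs[-1] / part)
    found = [pick_agents_file(d) for d in [codex_home, *dirs]]
    return [f for f in found if f is not None]


def is_clean_codex_home(codex_home: Path) -> bool:
    """True when neither global AGENTS file exists in the Codex home."""
    return pick_agents_file(codex_home) is None


def ingestion_verdict(codex_home: Path, project_root: Path, cwd: Path) -> str:
    """'conclusive' only if the home is clean and the repo's AGENTS.md is active."""
    repo_agents = project_root.resolve() / DEFAULT
    active = active_agents_files(codex_home, project_root, cwd)
    if is_clean_codex_home(codex_home) and repo_agents in active:
        return "conclusive"
    return "inconclusive"
```

A harness could log the result of `active_agents_files` alongside each Stage-2 run, which gives the "record which AGENTS file(s) were in effect" evidence without depending on Codex internals. Note that a repo-root `AGENTS.override.md` also yields `"inconclusive"` here, since the repo's `AGENTS.md` is then not the active source.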
@@ -190,24 +198,29 @@ and Codex-specific.
 | Plan Mode | `plan_mode_reasoning_effort` config | **Parity** | Named differently; same concept. |
 | Output styles (e.g., explanatory) | Not documented; may go via system-prompt override | **Gap (minor)** | Factory-side impact is small; output styles are Claude-Code-session features, not substrate. |
 | Hooks (`.claude/settings.json` PreToolUse, UserPromptSubmit) | `notify` hook + shell-only PreToolUse (per OpenAI release notes for `rust-v0.117.0`, March 26 2026, [openai/codex#15211](https://github.com/openai/codex/pull/15211)) | **Partial (narrowing)** | Codex now has shell-only PreToolUse alongside the existing `notify` hook for turn completion. UserPromptSubmit and other Claude-Code-specific hook types are still gaps. Zeta's ASCII-clean pre-commit + prompt-injection lints run via git-pre-commit (harness-neutral) so the gap-impact on Zeta substrate is small. SessionStart hooks (e.g., for output style) still have no Codex equivalent. |
-| Slash commands (`/loop`, `/fast`, `/help`, `/status-line-setup`) | `-m` / `--model`, profiles, plan-mode commands | **Partial** | Codex exposes fewer user-visible slash commands; model selection is via `-m` / `--model` flags + `--profile` (per `docs/research/openai-codex-cli-capability-map.md`), not via a `/model` slash command. Project-specific commands (e.g., Zeta's `/loop`) need re-authoring or re-routing through `codex exec`. |
+| Slash commands (`/loop`, `/fast`, `/help`, `/status-line-setup`) | Built-in `/model`, `/compact`, etc. (per [`developers.openai.com/codex/cli/slash-commands`](https://developers.openai.com/codex/cli/slash-commands)) + `-m`/`--model` flags + `--profile` | **Parity (different roster)** | Codex CLI ships built-in slash commands including `/model` for model + reasoning-effort selection, `/compact` for context compaction, etc. Both harnesses expose slash commands; the rosters differ (Claude Code has Zeta-defined `/loop`, `/fast`; Codex has its own built-in roster). Project-specific commands (e.g., Zeta's `/loop`) need re-authoring or re-routing through `codex exec`. The capability surface is parity; the specific commands aren't 1-to-1. |
-| Slash commands (`/loop`, `/fast`, `/help`, `/status-line-setup`) | Built-in `/model`, `/compact`, etc. (per [`developers.openai.com/codex/cli/slash-commands`](https://developers.openai.com/codex/cli/slash-commands)) + `-m`/`--model` flags + `--profile` | **Parity (different roster)** | Codex CLI ships built-in slash commands including `/model` for model + reasoning-effort selection, `/compact` for context compaction, etc. Both harnesses expose slash commands; the rosters differ (Claude Code has Zeta-defined `/loop`, `/fast`; Codex has its own built-in roster). Project-specific commands (e.g., Zeta's `/loop`) need re-authoring or re-routing through `codex exec`. The capability surface is parity; the specific commands aren't 1-to-1. |
+| Slash commands (`/loop`, `/fast`, `/help`, `/status-line-setup`) | Built-in `/model`, `/compact`, etc. (per [`developers.openai.com/codex/cli/slash-commands`](https://developers.openai.com/codex/cli/slash-commands)) + `-m`/`--model` flags + `--profile` | **Parity (different roster)** | Codex CLI ships built-in slash commands including `/model` for model selection and `/compact` for context compaction. Any reasoning-effort change should be understood as going through the active profile/config surface (`--profile`, `plan_mode_reasoning_effort`), not as a standalone effort-tier picker implied by `/model`. Both harnesses expose slash commands; the rosters differ (Claude Code has Zeta-defined `/loop`, `/fast`; Codex has its own built-in roster). Project-specific commands (e.g., Zeta's `/loop`) need re-authoring or re-routing through `codex exec`. The capability surface is at parity; the specific commands aren't 1-to-1. |
 | `Task` with `isolation: "worktree"` | Built-in worktree support | **Parity** | Codex advertises worktrees as a first-class subagent feature. |
-| Session compaction | Not documented | **Gap (opaque)** | Codex's handling of long sessions is unclear; Stage 2 must test. |
+| Session compaction | Built-in `/compact` slash command (per [`developers.openai.com/codex/cli/slash-commands`](https://developers.openai.com/codex/cli/slash-commands)) | **Parity** | Codex CLI ships `/compact` specifically for summarizing conversation context to free tokens — same role as Claude Code's session compaction. Stage-2 should still test the trigger conditions and quality of the summary. |
 | Code-review agent | Native "separate agent before commit" feature | **Parity (different shape)** | Codex integrates review into the CLI workflow directly; Zeta's equivalent is Codex-as-PR-reviewer on GitHub + the harsh-critic persona under `.claude/skills/code-review-zero-empathy/`. (Note: `/ultrareview` is a Claude Code platform feature surfaced in the harness's session prompt, not a Zeta-defined command — repo-wide search finds no in-tree definition. Listed here for surface-mapping context only; not an in-repo entrypoint.) Composes. |
 | Image input / image generation | Native | **Parity+** | Codex exposes image generation in-CLI; Claude Code accepts image input only. |
 | Background macOS Computer Use | Native | **Codex-specific** | No Claude Code equivalent; relevant if Zeta ever wants agent-run GUI tests. Not urgent for Otto. |
 | Cloud-backed runtime | Codex Cloud | **Codex-specific** | May subsume the cron-gap by running long-lived agents in cloud; Stage 2 needs to verify. |
 
-**Running gap score after first-pass:**
-
-- Parity: 11 (TodoWrite reclassified Gap → Parity (different shape)
-  per OpenAI's Sept 15 2025 Codex CLI to-do-list announcement)
-- Partial: 5 (cron/autonomous-loop reclassified Likely-gap →
-  Partial (different surface) per
-  `developers.openai.com/codex/app/automations` thread-automation
-  primitive)
-- Gap: 2 (no longer including cron — autonomous-loop is reachable
-  via Codex Cloud thread automations)
+**Running gap score after first-pass + post-merge cascade
+reclassifications:**
+
+- Parity: 13 (TodoWrite Gap → Parity (different shape) per
+  OpenAI's Sept 15 2025 announcement; Slash commands Partial →
+  Parity (different roster) per Codex CLI built-in
+  `/model`/`/compact`/etc. roster; Session compaction Gap →
+  Parity per Codex CLI built-in `/compact` — both per
+  `developers.openai.com/codex/cli/slash-commands`)
+- Partial: 4 (cron/autonomous-loop Likely-gap → Partial (different
+  surface) per `developers.openai.com/codex/app/automations`
+  thread-automation primitive; slash-commands removed from this
+  bucket)
+- Gap: 1 (Output styles only; cron moved to Partial and session
+  compaction moved to Parity)
 - Codex-specific: 2
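
The reclassification arithmetic can be checked mechanically. The sketch below is illustrative only: it starts from the previous tally (Parity 11, Partial 5, Gap 2, Codex-specific 2), applies the two row moves named in this change, and confirms that the total number of rows is conserved. `apply_moves` is a hypothetical helper, not something defined in the repo.

```python
from collections import Counter

# Previous first-pass tally, as recorded before this change.
before = Counter({"Parity": 11, "Partial": 5, "Gap": 2, "Codex-specific": 2})

# (row, old bucket, new bucket) for each reclassification in this change.
moves = [
    ("Slash commands", "Partial", "Parity"),
    ("Session compaction", "Gap", "Parity"),
]


def apply_moves(score: Counter, moves: list) -> Counter:
    """Return a new tally with each row moved from its old bucket to its new one."""
    result = Counter(score)
    for row, src, dst in moves:
        if result[src] <= 0:
            raise ValueError(f"cannot move {row!r}: bucket {src!r} is empty")
        result[src] -= 1
        result[dst] += 1
    return result


after = apply_moves(before, moves)
print(dict(after))
```

Because every move decrements one bucket and increments another, the row total stays at 20 before and after, which is a cheap consistency check to rerun whenever a row is reclassified.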
 
 (Score subject to Stage 2 verification — these are first-pass
@@ -286,8 +299,10 @@ sidesteps that problem for Phase 1 Codex research**.
 **Nice-to-have (low friction, low impact):**
 
 1. Output-style / explanatory-mode parity.
-2. Session compaction behaviour parity.
-3. Slash-command name-parity (Zeta's `/loop` etc.).
+2. Slash-command roster parity (Zeta's project-specific commands
+   like `/loop` need re-authoring or routing through `codex exec`;
+   Codex CLI's built-in roster includes `/model`/`/compact` and
+   covers a different subset of session-management needs).
 
 **Codex-specific we don't need today:**