Conversation
Three Codex P2 cascade catches on the freshly-merged Codex CLI parity matrix — version-currency continues to refine the doc: 1. **AGENTS discovery order (§2):** Codex CLI follows a precedence chain — global ~/.codex/AGENTS.override.md / ~/.codex/AGENTS.md first, then walks project root → CWD with AGENTS.override.md taking precedence per directory (and a byte cap), per developers.openai.com/codex/guides/agents-md. Stage-2 readiness checks must account for global overrides that can pass/fail ingestion checks for reasons unrelated to the repo's AGENTS.md content. 2. **Slash commands (parity-matrix row + §5):** Codex CLI ships built-in /model, /compact, etc. per developers.openai.com/codex/cli/slash-commands — the row was classified Partial citing "no /model slash command" but that command exists. Reclassify Partial → Parity (different roster). The slash-command capability is parity; the rosters differ (Zeta-defined /loop /fast vs Codex CLI built-ins). 3. **Session compaction (parity-matrix row):** Codex CLI ships /compact for context summarization — the row was classified Gap (opaque) treating it as undocumented. Reclassify Gap → Parity. Stage-2 should still test trigger conditions and summary quality. **Score deltas:** Parity 11 → 13, Partial 5 → 4, Gap 2 → 1. Same shape as the prior cron-reclassification — version-currency on a 'not documented' claim loses to verifying via current docs. The cascade pattern (Codex re-reviews on every commit including the version-currency commit itself) catches misclassifications the author missed at write-time. §5 gap analysis tightened: 'session compaction' line removed from nice-to-have (it's now Parity); slash-command line reframed from 'name-parity' to 'roster parity'.
There was a problem hiding this comment.
Pull request overview
Updates the Codex CLI parity research doc to reflect post-merge “cascade” corrections from current Codex documentation, improving the accuracy of the capability matrix and its derived score summary.
Changes:
- Expanded §2 to describe Codex CLI’s
AGENTS.mddiscovery precedence chain, including global override files. - Reclassified “Slash commands” and “Session compaction” rows to Parity based on Codex CLI built-in slash commands (e.g.,
/model,/compact). - Updated the running gap-score summary and §5 nice-to-have list to match the new classifications.
| reasons unrelated to the repo's `AGENTS.md` content. The relevant | ||
| lines of `CLAUDE.md`: |
There was a problem hiding this comment.
P1: This section warns that global ~/.codex/AGENTS.override.md can change ingestion, but it doesn’t say how Stage-2 checks should control for / detect that (e.g., instruct running in a clean home/CODEX home, or explicitly verifying which AGENTS file was loaded). Adding a concrete test precondition would make the readiness guidance actionable and less error-prone.
| reasons unrelated to the repo's `AGENTS.md` content. The relevant | |
| lines of `CLAUDE.md`: | |
| reasons unrelated to the repo's `AGENTS.md` content. **Concrete | |
| Stage-2 precondition:** run ingestion verification in a clean | |
| Codex home (no `~/.codex/AGENTS.md` and no | |
| `~/.codex/AGENTS.override.md`), or an equivalent isolated `HOME` / | |
| Codex config directory, and record which AGENTS file(s) were in | |
| effect during the run. If the harness cannot show that the repo's | |
| `AGENTS.md` was the active source, the check is inconclusive rather | |
| than a repo failure. The relevant lines of `CLAUDE.md`: |
| | Output styles (e.g., explanatory) | Not documented; may go via system-prompt override | **Gap (minor)** | Factory-side impact is small; output styles are Claude-Code-session features, not substrate. | | ||
| | Hooks (`.claude/settings.json` PreToolUse, UserPromptSubmit) | `notify` hook + shell-only PreToolUse (per OpenAI release notes for `rust-v0.117.0`, March 26 2026, [openai/codex#15211](https://github.com/openai/codex/pull/15211)) | **Partial (narrowing)** | Codex now has shell-only PreToolUse alongside the existing `notify` hook for turn completion. UserPromptSubmit and other Claude-Code-specific hook types are still gaps. Zeta's ASCII-clean pre-commit + prompt-injection lints run via git-pre-commit (harness-neutral) so the gap-impact on Zeta substrate is small. SessionStart hooks (e.g., for output style) still have no Codex equivalent. | | ||
| | Slash commands (`/loop`, `/fast`, `/help`, `/status-line-setup`) | `-m` / `--model`, profiles, plan-mode commands | **Partial** | Codex exposes fewer user-visible slash commands; model selection is via `-m` / `--model` flags + `--profile` (per `docs/research/openai-codex-cli-capability-map.md`), not via a `/model` slash command. Project-specific commands (e.g., Zeta's `/loop`) need re-authoring or re-routing through `codex exec`. | | ||
| | Slash commands (`/loop`, `/fast`, `/help`, `/status-line-setup`) | Built-in `/model`, `/compact`, etc. (per [`developers.openai.com/codex/cli/slash-commands`](https://developers.openai.com/codex/cli/slash-commands)) + `-m`/`--model` flags + `--profile` | **Parity (different roster)** | Codex CLI ships built-in slash commands including `/model` for model + reasoning-effort selection, `/compact` for context compaction, etc. Both harnesses expose slash commands; the rosters differ (Claude Code has Zeta-defined `/loop`, `/fast`; Codex has its own built-in roster). Project-specific commands (e.g., Zeta's `/loop`) need re-authoring or re-routing through `codex exec`. The capability surface is parity; the specific commands aren't 1-to-1. | |
There was a problem hiding this comment.
P1: The /model note claims “model + reasoning-effort selection”, which reads inconsistently with the earlier statement that Codex uses profiles (--profile) rather than a discrete effort-tier enumeration. Consider clarifying how /model interacts with profiles and/or plan_mode_reasoning_effort so the doc has a single, unambiguous story for “effort” selection.
| | Slash commands (`/loop`, `/fast`, `/help`, `/status-line-setup`) | Built-in `/model`, `/compact`, etc. (per [`developers.openai.com/codex/cli/slash-commands`](https://developers.openai.com/codex/cli/slash-commands)) + `-m`/`--model` flags + `--profile` | **Parity (different roster)** | Codex CLI ships built-in slash commands including `/model` for model + reasoning-effort selection, `/compact` for context compaction, etc. Both harnesses expose slash commands; the rosters differ (Claude Code has Zeta-defined `/loop`, `/fast`; Codex has its own built-in roster). Project-specific commands (e.g., Zeta's `/loop`) need re-authoring or re-routing through `codex exec`. The capability surface is parity; the specific commands aren't 1-to-1. | | |
| | Slash commands (`/loop`, `/fast`, `/help`, `/status-line-setup`) | Built-in `/model`, `/compact`, etc. (per [`developers.openai.com/codex/cli/slash-commands`](https://developers.openai.com/codex/cli/slash-commands)) + `-m`/`--model` flags + `--profile` | **Parity (different roster)** | Codex CLI ships built-in slash commands including `/model` for model selection and `/compact` for context compaction. Any reasoning-effort change should be understood through the active profile/config surface (`--profile`, `plan_mode_reasoning_effort`), not as a separate standalone effort-tier picker implied by `/model`. Both harnesses expose slash commands; the rosters differ (Claude Code has Zeta-defined `/loop`, `/fast`; Codex has its own built-in roster). Project-specific commands (e.g., Zeta's `/loop`) need re-authoring or re-routing through `codex exec`. The capability surface is parity; the specific commands aren't 1-to-1. | |
Summary
Three Codex P2 cascade catches on the freshly-merged Codex CLI parity matrix. Score deltas: Parity 11 → 13, Partial 5 → 4, Gap 2 → 1.
AGENTS discovery order (§2): Codex CLI follows a precedence chain — global `
/.codex/AGENTS.override.md` / `/.codex/AGENTS.md` first, then walks project root → CWD with `AGENTS.override.md` taking precedence per directory (and a byte cap). Per Codex AGENTS guide. Stage-2 readiness checks must account for global overrides.Slash commands (parity-matrix row + §5): Codex CLI ships built-in `/model`, `/compact`, etc. per
developers.openai.com/codex/cli/slash-commands. Reclassify Partial → Parity (different roster). Capability is parity; roster differs.Session compaction (parity-matrix row): Codex CLI ships `/compact` for context summarization. Reclassify Gap → Parity. Stage-2 should still test trigger conditions and summary quality.
Why this matters
Same shape as the prior cron-reclassification (#231 cascade Wave 4). Version-currency on a "not documented" claim loses to verifying via current docs. The reviewer-cascade pattern (Codex re-reviews on every commit including the version-currency commit itself) catches misclassifications the author missed at write-time. The doc converges on accuracy through iteration.
Test plan
🤖 Generated with Claude Code