hygiene(#231 follow-up): 3 Codex post-merge parity reclassifications #472

Merged
AceHack merged 1 commit into main from drain/231-followup-codex-parity-corrections
Apr 25, 2026

@AceHack AceHack commented Apr 25, 2026

Summary

Three Codex P2 cascade catches on the freshly-merged Codex CLI parity matrix. Score deltas: Parity 11 → 13, Partial 5 → 4, Gap 2 → 1.

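  1. AGENTS discovery order (§2): Codex CLI follows a precedence chain: global `~/.codex/AGENTS.override.md` / `~/.codex/AGENTS.md` first, then a walk from project root → CWD with `AGENTS.override.md` taking precedence in each directory (subject to a byte cap), per the Codex AGENTS guide (developers.openai.com/codex/guides/agents-md). Stage-2 readiness checks must account for global overrides, which can pass or fail ingestion checks for reasons unrelated to the repo's `AGENTS.md` content.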

  2. Slash commands (parity-matrix row + §5): Codex CLI ships built-in `/model`, `/compact`, etc. per developers.openai.com/codex/cli/slash-commands. Reclassify Partial → Parity (different roster). Capability is parity; roster differs.

  3. Session compaction (parity-matrix row): Codex CLI ships `/compact` for context summarization. Reclassify Gap → Parity. Stage-2 should still test trigger conditions and summary quality.
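The precedence chain in item 1 can be modeled with a short Python sketch. This is only an illustration of the order as described in this PR, not Codex CLI's implementation: the function names `pick_in_dir` and `discovery_chain` are hypothetical, the byte cap is deliberately not modeled, and the real behavior should be confirmed against the Codex AGENTS guide.

```python
from pathlib import Path
from typing import List, Optional

OVERRIDE = "AGENTS.override.md"
DEFAULT = "AGENTS.md"


def pick_in_dir(directory: Path) -> Optional[Path]:
    """Return the override file if present, else AGENTS.md, else None."""
    for name in (OVERRIDE, DEFAULT):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


def discovery_chain(codex_home: Path, project_root: Path, cwd: Path) -> List[Path]:
    """List instruction files in the described order: global first, then root -> CWD.

    Assumption: at most one file is taken per directory, with the override
    file winning. The byte cap mentioned in the PR is not modeled here.
    """
    root = project_root.resolve()
    # Raises ValueError if cwd is not inside project_root.
    rel = cwd.resolve().relative_to(root)

    dirs = [codex_home, root]
    current = root
    for part in rel.parts:
        current = current / part
        dirs.append(current)

    chain = []
    for directory in dirs:
        found = pick_in_dir(directory)
        if found is not None:
            chain.append(found)
    return chain
```

Calling `discovery_chain(Path.home() / ".codex", repo_root, Path.cwd())` would show whether a global file sits ahead of the repo's `AGENTS.md` in the chain, which is the situation a Stage-2 check needs to detect.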

Why this matters

Same shape as the prior cron-reclassification (#231 cascade Wave 4): a version-currency note on a "not documented" claim loses to verifying the claim against current docs. The reviewer-cascade pattern (Codex re-reviews every commit, including the version-currency commit itself) catches misclassifications the author missed at write-time. The doc converges on accuracy through iteration.

Test plan

  • All 3 reclassifications cite live docs URLs (verified accessible)
  • Score deltas reproducible from per-row classifications (Parity 13 + Partial 4 + Gap 1 + Codex-specific 2 = 20 entries)
  • §5 gap analysis tightened (session-compaction line removed; slash-command line reframed as roster parity)
  • §2 AGENTS discovery prose extended to mention precedence chain + global override surface
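The score deltas can be re-derived mechanically from the two row moves. The sketch below assumes the pre-change tally implied by the deltas above (Parity 11, Partial 5, Gap 2, with Codex-specific unchanged at 2); the helper name `apply_moves` is illustrative only.

```python
from collections import Counter

# Tally before this PR, as implied by the stated deltas.
before = Counter({"Parity": 11, "Partial": 5, "Gap": 2, "Codex-specific": 2})

# (row, old classification, new classification) for the two rows that moved.
# The AGENTS discovery change is a prose update in §2, not a row move.
moves = [
    ("Slash commands", "Partial", "Parity"),
    ("Session compaction", "Gap", "Parity"),
]


def apply_moves(tally, row_moves):
    """Return a new tally with each row moved from its old class to its new one."""
    after = Counter(tally)
    for _row, old, new in row_moves:
        after[old] -= 1
        after[new] += 1
    return after


after = apply_moves(before, moves)

assert after == Counter({"Parity": 13, "Partial": 4, "Gap": 1, "Codex-specific": 2})
assert sum(after.values()) == sum(before.values()) == 20
print(dict(after))
```

Because each move only shifts a row between classes, the total stays at 20 entries before and after.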

🤖 Generated with Claude Code

Three Codex P2 cascade catches on the freshly-merged Codex CLI
parity matrix — version-currency continues to refine the doc:

1. **AGENTS discovery order (§2):** Codex CLI follows a
   precedence chain — global ~/.codex/AGENTS.override.md /
   ~/.codex/AGENTS.md first, then walks project root → CWD with
   AGENTS.override.md taking precedence per directory (and a
   byte cap), per developers.openai.com/codex/guides/agents-md.
   Stage-2 readiness checks must account for global overrides
   that can pass/fail ingestion checks for reasons unrelated to
   the repo's AGENTS.md content.

2. **Slash commands (parity-matrix row + §5):** Codex CLI ships
   built-in /model, /compact, etc. per
   developers.openai.com/codex/cli/slash-commands — the row was
   classified Partial citing "no /model slash command" but that
   command exists. Reclassify Partial → Parity (different
   roster). The slash-command capability is parity; the rosters
   differ (Zeta-defined /loop /fast vs Codex CLI built-ins).

3. **Session compaction (parity-matrix row):** Codex CLI ships
   /compact for context summarization — the row was classified
   Gap (opaque) treating it as undocumented. Reclassify Gap →
   Parity. Stage-2 should still test trigger conditions and
   summary quality.

**Score deltas:** Parity 11 → 13, Partial 5 → 4, Gap 2 → 1.
Same shape as the prior cron-reclassification — version-currency
on a 'not documented' claim loses to verifying via current docs.
The cascade pattern (Codex re-reviews on every commit including
the version-currency commit itself) catches misclassifications
the author missed at write-time.

§5 gap analysis tightened: 'session compaction' line removed
from nice-to-have (it's now Parity); slash-command line reframed
from 'name-parity' to 'roster parity'.
@AceHack AceHack enabled auto-merge (squash) April 25, 2026 09:22
Copilot AI review requested due to automatic review settings April 25, 2026 09:22
@AceHack AceHack merged commit fbbbf77 into main Apr 25, 2026
15 checks passed
@AceHack AceHack deleted the drain/231-followup-codex-parity-corrections branch April 25, 2026 09:24
Copilot AI left a comment

Pull request overview

Updates the Codex CLI parity research doc to reflect post-merge “cascade” corrections from current Codex documentation, improving the accuracy of the capability matrix and its derived score summary.

Changes:

  • Expanded §2 to describe Codex CLI’s AGENTS.md discovery precedence chain, including global override files.
  • Reclassified “Slash commands” and “Session compaction” rows to Parity based on Codex CLI built-in slash commands (e.g., /model, /compact).
  • Updated the running gap-score summary and §5 nice-to-have list to match the new classifications.

Comment on lines +100 to +101
reasons unrelated to the repo's `AGENTS.md` content. The relevant
lines of `CLAUDE.md`:

Copilot AI Apr 25, 2026

P1: This section warns that global ~/.codex/AGENTS.override.md can change ingestion, but it doesn’t say how Stage-2 checks should control for / detect that (e.g., instruct running in a clean home/CODEX home, or explicitly verifying which AGENTS file was loaded). Adding a concrete test precondition would make the readiness guidance actionable and less error-prone.
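One way such a precondition could be made checkable is sketched below in Python. It is a sketch under assumptions: the helper names are hypothetical, the file names are taken from this PR, and whether Codex CLI needs only an isolated `HOME` or also a dedicated config-directory setting should be confirmed against the Codex docs. The sketch does not invoke Codex itself.

```python
import os
from pathlib import Path
from typing import Dict, List

# Global file names as described in this PR.
GLOBAL_NAMES = ("AGENTS.override.md", "AGENTS.md")


def global_agents_files(home: Path) -> List[Path]:
    """Return any global AGENTS files found under <home>/.codex."""
    codex_dir = home / ".codex"
    return [codex_dir / name for name in GLOBAL_NAMES if (codex_dir / name).is_file()]


def isolated_env(clean_home: Path) -> Dict[str, str]:
    """Copy the current environment with HOME pointed at a clean directory.

    Assumption: the harness under test locates its global config via HOME.
    """
    env = dict(os.environ)
    env["HOME"] = str(clean_home)
    return env


def precondition(home: Path) -> str:
    """Report 'ok' for a clean home, 'inconclusive' if global files could interfere."""
    return "inconclusive" if global_agents_files(home) else "ok"
```

A Stage-2 runner could create a temporary directory, confirm `precondition(tmp_home) == "ok"`, pass `isolated_env(tmp_home)` as the `env` argument of `subprocess.run` for its ingestion command, and log the result alongside the verdict.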

Suggested change
reasons unrelated to the repo's `AGENTS.md` content. The relevant
lines of `CLAUDE.md`:
reasons unrelated to the repo's `AGENTS.md` content. **Concrete
Stage-2 precondition:** run ingestion verification in a clean
Codex home (no `~/.codex/AGENTS.md` and no
`~/.codex/AGENTS.override.md`), or an equivalent isolated `HOME` /
Codex config directory, and record which AGENTS file(s) were in
effect during the run. If the harness cannot show that the repo's
`AGENTS.md` was the active source, the check is inconclusive rather
than a repo failure. The relevant lines of `CLAUDE.md`:

| Output styles (e.g., explanatory) | Not documented; may go via system-prompt override | **Gap (minor)** | Factory-side impact is small; output styles are Claude-Code-session features, not substrate. |
| Hooks (`.claude/settings.json` PreToolUse, UserPromptSubmit) | `notify` hook + shell-only PreToolUse (per OpenAI release notes for `rust-v0.117.0`, March 26 2026, [openai/codex#15211](https://github.com/openai/codex/pull/15211)) | **Partial (narrowing)** | Codex now has shell-only PreToolUse alongside the existing `notify` hook for turn completion. UserPromptSubmit and other Claude-Code-specific hook types are still gaps. Zeta's ASCII-clean pre-commit + prompt-injection lints run via git-pre-commit (harness-neutral) so the gap-impact on Zeta substrate is small. SessionStart hooks (e.g., for output style) still have no Codex equivalent. |
| Slash commands (`/loop`, `/fast`, `/help`, `/status-line-setup`) | `-m` / `--model`, profiles, plan-mode commands | **Partial** | Codex exposes fewer user-visible slash commands; model selection is via `-m` / `--model` flags + `--profile` (per `docs/research/openai-codex-cli-capability-map.md`), not via a `/model` slash command. Project-specific commands (e.g., Zeta's `/loop`) need re-authoring or re-routing through `codex exec`. |
| Slash commands (`/loop`, `/fast`, `/help`, `/status-line-setup`) | Built-in `/model`, `/compact`, etc. (per [`developers.openai.com/codex/cli/slash-commands`](https://developers.openai.com/codex/cli/slash-commands)) + `-m`/`--model` flags + `--profile` | **Parity (different roster)** | Codex CLI ships built-in slash commands including `/model` for model + reasoning-effort selection, `/compact` for context compaction, etc. Both harnesses expose slash commands; the rosters differ (Claude Code has Zeta-defined `/loop`, `/fast`; Codex has its own built-in roster). Project-specific commands (e.g., Zeta's `/loop`) need re-authoring or re-routing through `codex exec`. The capability surface is parity; the specific commands aren't 1-to-1. |

Copilot AI Apr 25, 2026

P1: The /model note claims “model + reasoning-effort selection”, which reads inconsistently with the earlier statement that Codex uses profiles (--profile) rather than a discrete effort-tier enumeration. Consider clarifying how /model interacts with profiles and/or plan_mode_reasoning_effort so the doc has a single, unambiguous story for “effort” selection.

Suggested change
| Slash commands (`/loop`, `/fast`, `/help`, `/status-line-setup`) | Built-in `/model`, `/compact`, etc. (per [`developers.openai.com/codex/cli/slash-commands`](https://developers.openai.com/codex/cli/slash-commands)) + `-m`/`--model` flags + `--profile` | **Parity (different roster)** | Codex CLI ships built-in slash commands including `/model` for model + reasoning-effort selection, `/compact` for context compaction, etc. Both harnesses expose slash commands; the rosters differ (Claude Code has Zeta-defined `/loop`, `/fast`; Codex has its own built-in roster). Project-specific commands (e.g., Zeta's `/loop`) need re-authoring or re-routing through `codex exec`. The capability surface is parity; the specific commands aren't 1-to-1. |
| Slash commands (`/loop`, `/fast`, `/help`, `/status-line-setup`) | Built-in `/model`, `/compact`, etc. (per [`developers.openai.com/codex/cli/slash-commands`](https://developers.openai.com/codex/cli/slash-commands)) + `-m`/`--model` flags + `--profile` | **Parity (different roster)** | Codex CLI ships built-in slash commands including `/model` for model selection and `/compact` for context compaction. Any reasoning-effort change should be understood through the active profile/config surface (`--profile`, `plan_mode_reasoning_effort`), not as a separate standalone effort-tier picker implied by `/model`. Both harnesses expose slash commands; the rosters differ (Claude Code has Zeta-defined `/loop`, `/fast`; Codex has its own built-in roster). Project-specific commands (e.g., Zeta's `/loop`) need re-authoring or re-routing through `codex exec`. The capability surface is parity; the specific commands aren't 1-to-1. |
