Studio: add Codex SDK as a chat provider with parallel-calls fan-out#5724
Studio: add Codex SDK as a chat provider with parallel-calls fan-out#5724danielhanchen wants to merge 48 commits into
Conversation
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fa809262d9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ) | ||
| return "".join(collected) | ||
|
|
||
| workers = [asyncio.create_task(_worker(i + 1)) for i in range(n)] |
There was a problem hiding this comment.
Cancel Codex fan-out tasks when stream is aborted
When a client disconnects or cancels a streaming request, this async generator can be closed before it reaches normal completion, but the worker tasks created for parallel fan-out are never canceled. Because _stream_codex_parallel starts up to 20 background tasks and a drain task here, an interrupted stream continues consuming local Codex capacity after the user is gone, which can starve subsequent requests and waste significant resources.
Useful? React with 👍 / 👎.
| try: | ||
| rc = await proc.wait() | ||
| except Exception: |
There was a problem hiding this comment.
Kill device-auth process on canceled login stream
If the /api/codex/login SSE connection is closed mid-flow (dialog closed, navigation, network drop), cancellation lands in this finally, but the code only awaits proc.wait() and never terminates the codex auth login --device-auth subprocess. That leaves orphaned login processes running server-side until they exit on their own, and the canceled request task can remain blocked waiting for that long-running process.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bc8134ebf1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| for msg in reversed(messages): | ||
| if msg.get("role") != "user": | ||
| continue | ||
| content = msg.get("content") | ||
| if isinstance(content, str): |
There was a problem hiding this comment.
Preserve prior turns when routing Codex requests
_last_user_prompt stops at the newest role=user message and returns only that one turn, but each request also creates a brand-new Codex thread (async_codex_cls() + thread_start) instead of reusing prior thread state. In follow-up questions, Codex therefore receives no assistant/user history and answers without conversation context, which breaks normal multi-turn chat behavior for this provider.
Useful? React with 👍 / 👎.
| } else if (!error) { | ||
| setError("Codex login did not complete -- see log for details."); |
There was a problem hiding this comment.
Preserve streamed error details in Codex login UI
The post-loop fallback checks !error from the callback closure, not the latest value set during the stream. When the backend sends an error event, setError(event.message) runs, but error here is still the stale pre-stream value, so the specific backend message is overwritten by the generic "did not complete" text. This hides actionable failure details from users during device-auth failures.
Useful? React with 👍 / 👎.
| yield "data: [DONE]\n\n" | ||
|
|
||
| return StreamingResponse( | ||
| _codex_stream(), |
Wires the OpenAI Codex CLI / Python SDK (codex_app_server) into Studio
as a new chat provider type. Hosts that don't have the CLI or the SDK
installed never see the entry; on logged-out hosts the provider config
dialog renders a device-auth Sign-in button that surfaces the
verification URL and streams CLI progress back over SSE.
Backend
- new core/inference/codex_availability.py probes the CLI + SDK and
reports {installed, logged_in, version, supported_models}; it never
imports codex_app_server at module top level so the rest of the
backend keeps starting cleanly on hosts that don't have the SDK.
- new core/inference/codex_provider.py wraps AsyncCodex and translates
Codex events into OpenAI chat-completion chunks. Supports the
thread.run_streaming path with a non-streaming fallback for older
SDK revs.
- parallel_calls > 1 fans the turn out across N tasks (capped at 20)
via asyncio.gather and emits codex_tab_open / codex_tab_chunk /
codex_tab_close tool-events per attempt plus a final codex_gather
synthesis event. A separate standalone Codex call produces the
unified answer.
- new routes/codex.py exposes GET /api/codex/status and POST
/api/codex/login. The login route shells out to
codex auth login --device-auth and streams events; the first event
carries the verification URL so the frontend can window.open it.
- ChatCompletionRequest gains a parallel_calls field bounded [1, 20]
by pydantic. The codex registry entry stays hidden by default; the
/api/codex/status probe is the authoritative gate.
- routes/inference.py dispatches provider_type=codex through the
local CLI/SDK pipeline instead of the standard HTTP client, with
graceful error surfacing for CodexUnavailableError.
Frontend
- new api/codex-api.ts exposes fetchCodexStatus() and an async
generator streamCodexDeviceLogin() that drives the SSE stream and
yields parsed events.
- new components/codex-parallel-tabs.tsx renders the tabbed parallel-
calls UI with a Synthesis tab highlighted once the codex_gather
event arrives. Pure reducer keeps the state transitions unit-
testable.
- new components/codex-login-button.tsx posts to /api/codex/login,
opens the verification URL in a new tab via window.open, and shows
the streamed CLI log as it lands.
- external-providers.ts exports CODEX_PROVIDER_TYPE,
CODEX_MAX_PARALLEL_CALLS, isCodexProviderType, and
clampCodexParallelCalls. Codex is marked text-only so the composer
hides image-attach affordances when selected.
Tests
- tests/test_codex_provider.py (14 cases) covers the availability
probe across the four install / login states, the streaming +
parallel-calls translation against a fake codex_app_server module
injected into sys.modules, the [1, 20] pydantic clamp, the
CodexUnavailableError surfacing path, and the parallel_calls=1
single-call shape (no tab tool-events).
for more information, see https://pre-commit.ci
test_health_response_reports_desktop_capability_fields builds a SimpleNamespace as a fake routes module so it can exercise main.health_check without standing the full app up. The stub listed every router name except codex_router, which lands in the main.py import block alongside the others as of this PR, so the import failed with 'cannot import name codex_router from <unknown module name>' on the Python 3.13 unit run. Add the codex_router slot to the stub.
bc8134e to
0a4309f
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0a4309fefd
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| setLogs([]); | ||
| setDeviceUrl(null); | ||
| const controller = new AbortController(); | ||
| abortRef.current?.abort(); |
There was a problem hiding this comment.
AbortController is only wired to startLogin, so closing the dialog or navigating away while busy leaves the /api/codex/login stream running until the CLI exits. In practice this keeps a server-side login process alive after the UI is gone and continues dispatching async updates to a detached component tree. Add unmount cleanup (and clear abortRef in finally) so in-flight login streams are canceled when the component is removed.
Useful? React with 👍 / 👎.
Followups on the post-merge review pass for the Codex SDK chat provider. Verified against codex-cli 0.133.0 + the upstream `openai/codex` Rust + Python sources, then pinned each fix with a regression test in `test_codex_provider.py` (24/24 passing). * Probe both `openai_codex` (canonical upstream Python package at `openai/codex/sdk/python`) and the legacy `codex_app_server` alias. Without this the availability probe always reported `sdk_importable: false` even when the SDK was installed, so the provider was permanently hidden. * Switch the device-auth and login-status invocations from `codex auth login --device-auth` / `codex auth status` to the real upstream subcommands `codex login --device-auth` and `codex login status`. The former path returns `unrecognized subcommand 'auth'` on a real CLI. * Strip ANSI control sequences before extracting the device URL (upstream wraps the URL in `\x1b[34m...\x1b[0m`) and tighten the pattern to the canonical `.../codex/device` shape. Also surface the one-time code as a `device_code` SSE event so the UI can show it alongside the URL. * Fix `_detect_logged_in` substring footgun: `"logged in" in combined` matched inside `"not logged in"`, flipping logged-out users to logged-in. Anchor on word boundaries with negative prefixes winning regardless of return code. * Cancel in-flight fan-out workers on SSE disconnect. Previously every parallel Codex turn ran to completion against a disconnected client and burned quota; now `_stream_codex_parallel` cancels its worker + drain tasks in a try/finally on `CancelledError`/`GeneratorExit`. * Tear down the device-login subprocess on disconnect via `start_new_session=True` + `os.killpg(SIGTERM)` (Unix) or `CREATE_NEW_PROCESS_GROUP` + `CTRL_BREAK_EVENT` (Windows), with a bounded `proc.wait()` and `proc.kill()` fallback. Previously `finally: await proc.wait()` blocked the SSE close path because `codex login --device-auth` only exits on user action. * Render the full conversation transcript in `_last_user_prompt` instead of returning only the most recent user message. The PR opens a fresh thread per request so prior assistant turns were dropped, degrading multi-turn chats to single-shot prompts. Single-turn input is unchanged. * Make `ChatCompletionRequest.parallel_calls` default to 1 (`int` with `ge=1, le=20`) instead of `Optional[int] = None`. The runtime already coerced `None` -> 1, but the schema now matches the documented `[1, 20]` range. * Replace the registry's hardcoded `default_models` (which contained `o3`, not in the upstream catalog) with the current `gpt-5.5 / 5.4 / 5.4-mini / 5.3-codex / 5.2` set from `codex-rs/models-manager/models.json`. * Stop echoing `str(exc)` in SSE error frames in both `routes/inference.py` and `routes/codex.py`. The Codex SDK can raise with local paths, env-var content, or traceback fragments (CodeQL `py/information-exposure-through-exception`). Surface a generic message + `exception_type` discriminator; log the full reason server-side via `logger.error(..., exc_type=..., error=...)`. Doc / comment updates throughout to refer to `codex login` / `openai_codex` rather than the older incorrect strings. Tested: pytest 24 cases in `test_codex_provider.py` (the original 14 + 10 new `TestCodexHardenedRegressions`) plus the rest of the Studio-backend test suite the PR touches (209 passing). Also verified live against Studio launched from this branch on a Blackwell B200 via `UNSLOTH_STUDIO_HOME=$WORKSPACE/temp/... ./install.sh --local` then a Playwright probe.
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
for more information, see https://pre-commit.ci
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
The OpenAI Codex Python SDK ships on PyPI as `openai-codex-app-server-sdk`, not `openai-codex` (which is the GitHub repo project name in pyproject.toml). The runtime binary ships separately as `openai-codex-cli-bin`. Both packages expose the import name `openai_codex`; the older docs reference `codex_app_server` so we keep probing both. Update the `CodexUnavailableError` message and the provider registry notes so a user hitting the unavailable path gets a copy-pasteable `pip install` command. No behaviour change.
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
Post-review pass driven by reviewer.py. The original PR shipped the backend codex provider, the registry entry (with `hidden:true`), the status API, and the `CodexParallelTabs` component, but the chat UI never surfaced the row, required an API key for the connection, and never sent `parallel_calls` over the wire. Also fixes a CodeQL leak in the parallel fan-out error path and adds the canonical streaming hook upstream actually exposes. Frontend * chat-providers-dialog.tsx now calls `/api/codex/status` alongside `/api/providers/registry`. When the host has Codex installed the Add connection dialog gains a synthetic Codex row (curated model list comes from `supported_models`) so the picker is reachable. * The Add / Edit connection guards now skip the API-key requirement for Codex the same way they do for the custom OpenAI-compat presets; the field itself is also hidden so the user is not asked for a key Studio will not use. * chat-adapter.ts now also exempts Codex from the "Missing API key" pre-flight, and emits `parallel_calls` on the outgoing request when the selected connection is Codex (clamped to [1, 20] by the shared helper, defaults to 1). * external-providers.ts adds `codexParallelCalls` to ExternalProviderConfig so future composer UI can persist the user's pick per connection. Backend * `_stream_thread_run` now tries `thread.turn(prompt).stream()` first, mirroring the canonical openai_codex API (`openai/codex/sdk/python/src/openai_codex/api.py`). The legacy `thread.run_streaming(prompt)` path is kept as a fallback and the buffered `await thread.run(prompt)` stays as the last resort. * `_stream_codex_parallel` no longer echoes `str(exc)` in the `codex_tab_error` SSE event. Per-tab failures now surface a generic "Codex tab failed" message plus an `exception_type` discriminator; `CodexUnavailableError` is the only exception whose text is forwarded verbatim because it is a user-actionable install hint with no sensitive content (CodeQL `py/information-exposure-through-exception`). Tests * New `TestCodexHardenedRegressions::test_parallel_tab_error_sanitised` injects a fake SDK that raises with a path-like message and asserts the SSE frames do not echo it. * New `TestCodexHardenedRegressions::test_thread_turn_stream_path_taken` verifies the canonical `thread.turn(prompt).stream()` hook is preferred over the legacy helper. All 26 codex_provider tests pass. Frontend `tsc --noEmit` clean.
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
for more information, see https://pre-commit.ci
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
Second reviewer.py pass surfaced three follow-ups missed in the earlier round. All caught by 12 parallel reviewers + cross-block audit; each fix is small but user-facing. * `testProvider` no longer pushes a Codex connection back to the edit form to "add an API key". Codex has no remote endpoint to ping, so the Test button now calls `/api/codex/status` directly: toasts success with the CLI version when installed+logged in, prompts to sign in when installed+logged out, and errors when the CLI or SDK is missing. * The Sign-in to Codex affordance is now actually mounted. When the selected provider is Codex and `/api/codex/status` reports `installed:true, logged_in:false`, the dialog renders the new `CodexLoginButton` above the (hidden) API key row. The button's `onLoggedIn` callback re-probes status so the UI flips to the ready state without a page reload. * The chat adapter now handles `codex_*` `_toolEvent` types instead of silently swallowing them. Per-tab chunks render inline with a `[Codex tab N/M]` header so users see each parallel attempt; `codex_gather` adds a `--- Synthesis ---` divider before the final unified content delta the backend also emits as plain text. This unblocks the existing fan-out path while a dedicated `CodexParallelTabs` UI is wired in a future change. Verified: 26/26 codex_provider tests pass; `tsc --noEmit` on studio/frontend completes clean.
…sloth into feat/codex-provider
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
…t login on unmount Third reviewer.py pass found three remaining sharp edges. Each fix is small and paired with a regression test where applicable. * Codex subprocess env is now scrubbed to a safe-list before spawn. Both `_run_cli` in codex_availability and the device-auth spawn in stream_codex_device_login switch from `env=os.environ.copy()` to `env=_codex_subprocess_env()`, which forwards only PATH / HOME / USER / Windows-equivalents / CODEX_HOME / OPENAI_API_KEY / OPENAI_BASE_URL. Other-provider secrets like HF_TOKEN, GH_TOKEN, WANDB_API_KEY, ANTHROPIC_API_KEY no longer reach the local codex binary, so a shimmed `codex` earlier on PATH cannot harvest them. * `_stream_thread_run` now tracks `emitted_any` and refuses to fall through to the buffered `await thread.run(prompt)` after either streaming helper has already yielded text. Previously a network glitch mid-stream re-executed the same Codex turn, which can duplicate file writes, shell commands, and other Codex side effects. The buffered path is now reserved for the zero-output case (no streaming helper resolved, or streaming returned empty). * `CodexLoginButton` now aborts the SSE reader on unmount via a useEffect cleanup that calls `abortRef.current?.abort()`. The underlying `codex login --device-auth` subprocess no longer keeps streaming (and holding a device-auth session) after the dialog closes. Two new pytest cases pin the behaviour: `test_codex_subprocess_env_scrubbed` sets HF/GH/WANDB/ANTHROPIC keys and asserts none reach the codex env while OPENAI_API_KEY / CODEX_HOME survive; and `test_partial_stream_failure_does_not_replay_turn` injects a fake `turn().stream()` that yields "partial output " then raises, and asserts `thread.run()` is never called. 28/28 codex_provider tests pass; `tsc --noEmit` clean.
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
for more information, see https://pre-commit.ci
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
…x preselect Two P1 fixes from the round 8 reviewer pass: 1. _stream_thread_run no longer replays a Codex turn that fired non-visible events before crashing. The replay guard only tracked `emitted_any` (visible text). A Codex turn that emitted, say, a command.delta or file.delta event first -- both filtered to "" by _coerce_text -- and THEN crashed would leave emitted_any=False and fall through to the buffered `thread.run(prompt)` fallback, re-executing the same turn and duplicating its side effects (shell commands, file writes, tool calls). This is exactly the case the guard was added to prevent in earlier rounds; the missing bit was tracking "the turn ran at all", not just "the turn yielded text". Fix: add a separate turn_started flag that flips True the moment we ask the SDK for a turn handle or observe any event from a streaming helper. When the buffered fallback is gated on turn_started instead of emitted_any, a partial-turn crash correctly stops without replaying. Regression test reproduces the bug against the pre-fix code (assertion catches the extra thread.run call) and locks the fix in. 2. openAddProvider now mirrors the providerType-change effect's Codex pre-check. The first-run UX fix from `26799d9a` pre-checked every Codex default model in the providerType-change effect, but openAddProvider() calls resetForm() (which clears selectedModelIds) and then only restores availableModels, not selectedModelIds. If the user closes the Add connection form and re-opens it while Codex is still the current providerType, the effect does not re-run, so the form opens with Codex defaults available but none selected -- the "Add at least one model ID" save guard then blocks the Save click. Fix: openAddProvider now seeds selectedModelIds with the full default-models list when the provider is Codex, matching the providerType-change effect so the two entry paths produce the same first-run state.
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
for more information, see https://pre-commit.ci
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
Round 8 reviewer pass surfaced two more P1 issues; one more commit lands both fixes.
Test counts after
Round 8 reviewer-finding summary:
|
Resolve three conflicts touched by main since the branch forked: - studio/backend/core/inference/external_provider.py: take main's rewrite of _anthropic_citation_key (extended dedup keys covering end-char/page/block indices plus search_result_index) and the new _anthropic_supports_fast_mode helper. The branch had only the earlier shorter citation_key form so accepting main wholesale here loses nothing Codex-related. - studio/backend/models/inference.py: keep BOTH the Codex parallel_calls field + _clamp_parallel_calls validator (from the branch) AND the new Anthropic fast_mode field (from main). They occupy different provider lanes. - studio/frontend/src/features/chat/api/chat-adapter.ts: fold the Codex per-tab rendering pass (renderFullContent) into main's new orderAssistantContent positioning so tools land before text, generated images after, AND any Codex tab text accumulated in earlier _toolEvent frames is preserved through the synthesis content delta (round 8 render-order fix). Backend codex tests: 60/60 passing. Anthropic citations / fast_mode tests: 96/96 passing. Frontend builds cleanly.
…sloth into feat/codex-provider
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
1. Codex SSE wrapper terminates on exact `data: [DONE]` only. The old substring check `if "[DONE]" in line` would flip sent_done True when a normal model response carried the literal text "[DONE]" in delta.content (for example an explanation of the OpenAI stream sentinel). The real terminator was then suppressed, leaving OpenAI-compatible clients that finalise on the explicit sentinel hung on stream close. Now compares the stripped line to the exact `data: [DONE]` form. 2. Legacy `thread.run_streaming` path no longer returns an empty reply on completion-only streams. If the SDK exposes `thread.run_streaming` but the stream emits ONLY item.completed / agentMessage events with no message deltas, the loop previously exited with emitted_any False and never reached the agent-message fallback. The request returned 200 with an empty assistant reply even though Codex produced a final answer. Mirror the canonical-path behavior: collect `_completed_agent_message_text` strings in a sidecar list and emit the last one when no deltas arrived. Match the canonical payload-extraction (`getattr(event, "payload", event)`) so the event-vs-payload SDK shape difference is handled the same way in both branches. 3. Parallel-calls fan-out propagates CodexUnavailableError so the route layer can return 503. When the SDK is not importable or the safety enums are missing without the dev opt-in, every worker raised the same CodexUnavailableError. The previous catch-all converted the error into a per-tab codex_tab_error event, the outer stream never raised, and clients saw a 200 with only tool events and an empty synthesis -- OpenAI-compatible consumers that ignore _toolEvent saw a successful empty reply. Now CodexUnavailableError re-raises out of the worker (no spurious per-tab error event), _await_workers re-raises it when EVERY worker hit the same setup failure, and the finally-block drain await propagates the exception out of the parallel function so the route's existing CodexUnavailableError handler can emit the right 503 SSE error frame. Per-tab runtime failures (model rejected, timeout, mid- stream SDK crash) still get swallowed into codex_tab_error events so a single bad model in the fan-out does not kill the others. Test counts: 63/63 passing (60 round 6-8 plus 3 new round 9 regression tests). Each new test was first run against a `git stash`-restored pre-fix tree to confirm it catches the bug, then run against the patched tree.
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
Merged main into the branch and resolved three conflicts: Then addressed three new P2 findings the Codex bot left on
Test count after the round 9 commit: 63/63 passing in |
for more information, see https://pre-commit.ci
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
End-to-end Codex screenshots for this PR, captured against the latest 1. Add the Codex connection (final probe, head
|
# Conflicts: # studio/frontend/src/features/chat/api/chat-adapter.ts
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
for more information, see https://pre-commit.ci
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
Two paired changes that finally make the Codex parallel-calls fan-out visible as actual clickable tabs in the chat surface, plus a credit- free spoof that lets the whole pipeline run in dev / CI without ever touching the upstream API. 1. Real tab UI (frontend). The chat-adapter used to render the per-worker outputs as inline `[Codex tab 1/N] ...` text blocks in the assistant message body, which collapsed into one big run-on block once more than a handful of tokens had streamed. Now each `codex_*` SSE event is folded into `codexParallelState` and re-published as the `args.state` of a single tool-call part with `toolName === "codex_parallel"`. The assistant-ui surface dispatches that to the new `CodexParallelToolUI` wrapper, which mounts the existing `CodexParallelTabs` component -- one tab per worker, one Synthesis tab, click to switch. The stable `toolCallId` keeps assistant-ui updating the SAME card across stream yields rather than spawning new cards. `renderCodexTabsBlock` now returns the empty string so the message body no longer contains the labelled-text fallback (kept the function name so the rest of the adapter's `renderFullContent` / pin-signature paths are untouched). 2. Credit-free Codex SDK spoof (backend). New `studio/backend/core/inference/codex_spoof.py` exposes a drop-in subset of the upstream `openai_codex` surface (`AsyncCodex`, `AppServerConfig`, `ApprovalMode.deny_all`, `SandboxMode.read_only`, thread with `turn().stream()` + `run_streaming()` + `run()`) and emits deterministic per-tab streaming events tagged with the worker index, so flipping between tabs in the UI shows visibly distinct text. Activated by `UNSLOTH_CODEX_SPOOF=1`; `_import_codex` installs the spoof into `sys.modules` under both `openai_codex` and `codex_app_server` and the rest of the provider keeps running unchanged. OFF by default; production is unaffected. Six new tests cover the spoof itself (module install, env-flag gating, delta + completion event shape, per-tab tagging, provider import path, safety-kwargs resolution against the spoof). 69/69 tests pass with and without the flag; TypeScript clean.
…sloth into feat/codex-provider
for more information, see https://pre-commit.ci
|
Two paired changes pushed on Merge conflict resolution ( Real Codex parallel-call tab UI + credit-free SDK spoof (
Six new tests cover the spoof itself (module install + spec, env-flag gating, delta + completion event shape, per-tab tagging, provider import path, safety-kwargs resolution). Full count: 69/69 tests pass with and without Playwright capture of the new tab UI driven by the spoof is up next. |
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
# Conflicts: # studio/backend/main.py # studio/backend/routes/__init__.py
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
When ``UNSLOTH_CODEX_SPOOF=1`` is exported (the credit-free dev / CI path the previous commit added), the in-process spoof IS the Codex SDK and a real ``codex`` CLI is irrelevant. The status endpoint at ``/api/codex/status`` used to gate ``installed`` on the real CLI + real SDK only, which made the frontend hide the Codex provider in the connections dropdown even when the spoof was active. Now both ``_sdk_importable`` and ``probe_codex_availability`` short-circuit on ``codex_spoof.is_spoof_enabled()`` so the provider becomes visible under the spoof. ``installed=True``, ``cli_path="<spoof>"``, ``logged_in=True``, ``version="spoof"`` -- a sentinel that lets devs read off "yes I am under the spoof" at a glance. Real production code path (no spoof flag) is unchanged: still gates on bool(cli_path) AND sdk_ok the same as round 6. Tests: added an autouse fixture in ``test_codex_provider.py`` that clears ``UNSLOTH_CODEX_SPOOF`` before every test so the existing availability / import gating tests are not polluted when a dev runs the suite with the flag exported. The spoof-targeted tests still call ``monkeypatch.setenv(...)`` to flip it back on inside their own scope. 69/69 pass with and without the env flag.
for more information, see https://pre-commit.ci
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
1 similar comment
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
Another round on Merge -- main moved forward (Gemini provider #5720 + MCP servers #5750). Resolved conflicts in Code -- the Codex availability probe now honours Demo screenshots (under Live UI flow against the spoofed Studio: Connection form opens with Codex pre-selected, default models pre-checked, parallel-calls input ready: Parallel calls bumped to 3, 5 models still selected: The actual Tab 2 -- visibly different worker text (this is the point of the per-tab tagging in the spoof, mirrors what real Codex parallel calls produce when each worker explores a different angle): Tab 3: Synthesis tab -- highlighted in green when ready, unified answer below the tab strip: Mid-stream view (some tabs still streaming): Error path -- per-tab failure isolated, other tabs still complete and synthesis still arrives: The demo screenshots use mock |























Summary
Wires the OpenAI Codex CLI / Python SDK (
codex_app_server) into Studio as a new chat provider. The provider is hidden on hosts without the CLI + SDK and exposes a device-auth Sign-in button when logged out. Aparallel_callsknob fans the turn out across N (up to 20) Codex tasks and synthesises a unified answer.codex_availability.py+codex_provider.py+routes/codex.py;ChatCompletionRequest.parallel_calls(1-20);provider_type=codexdispatches through the SDK instead of HTTP; all SDK imports are lazy viaimportlib.util.find_spec.api/codex-api.ts,components/codex-parallel-tabs.tsx(tabbed render with Synthesis highlight),components/codex-login-button.tsx(device-auth + log streaming +window.openof the verification URL);external-providers.tsexportsCODEX_PROVIDER_TYPE,CODEX_MAX_PARALLEL_CALLS,clampCodexParallelCalls, and marks codex text-only.test_codex_provider.pycover the availability probe across the four install/login states, the streaming + parallel-calls translation against a fakecodex_app_serverinjected intosys.modules, the[1, 20]pydantic clamp, theCodexUnavailableErrorsurfacing path, and theparallel_calls=1single-call shape.Test plan
pytest studio/backend/tests/test_codex_provider.py(14 passing)pytest studio/backend/tests/test_external_provider_usage_chunk.py studio/backend/tests/test_anthropic_messages.py studio/backend/tests/test_openai_tool_passthrough.py studio/backend/tests/test_inference_model_validation.py(173 passing total)npx tsc -b --pretty falseinstudio/frontend(clean)codex_app_serverinstalled (not available on the build host -- see "Limitations" below)Limitations
codex_app_serverPython SDK was not installable on the build host (PyPI returned no matching distribution as of this PR). All SDK interactions are exercised against a fake module injected intosys.modulesin tests. Once the SDK ships to PyPI, an integration test against a realAsyncCodexinstance should be added.parallel_callsUI pill in the composer (deliverable 6) is implemented as a typed clamp + types inexternal-providers.ts; surfacing it as an actual composer pill requires a follow-up edit inshared-composer.tsxthat integrates with the existingInferenceParamsplumbing. TheCodexParallelTabscomponent and the underlying state reducer are wired and ready to consume the backend events.hidden: trueregistry flag; surfacing a synthetic codex row gated on/api/codex/statusis a small follow-up inchat-providers-dialog.tsxconsumingfetchCodexStatus()from the new API module.