chore(deps): bump claude-agent-sdk to 0.2.121, codex-sdk to 0.125.0#1460
chore(deps): bump claude-agent-sdk to 0.2.121, codex-sdk to 0.125.0#1460
Conversation
Both SDKs were ~30 patch releases behind. Validation suite passes
(type-check, lint, format, tests across all 10 packages) without code
changes. The only sustained Claude SDK behavior change in the range —
v0.2.111's options.env overlay/replace flap, since reverted to overlay —
is a no-op for Archon, which already passes { ...process.env } as the
SDK env.
|
Caution Review failedPull request was closed or merged during review No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (2)
📝 WalkthroughWalkthroughDependency versions are updated across two package.json files. The root package.json upgrades Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~2 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
* refactor(workflows): trust the SDK for model validation Drops cross-provider model inference and hard-coded model allow-lists. The string a workflow author writes in `model:` is forwarded to the SDK unchanged; the SDK and its API decide whether the model exists. Provider identity is the only thing Archon validates at load time — typos like `provider: claud` are caught early; everything else fails at runtime through the SDK's normal error path. Why this matters: a recent run on Sasha showed `provider: claude` + `model: opus[1m]` getting silently routed to Codex (because Codex's isModelCompatible was defined as the complement of Claude's, so anything not literally `sonnet|opus|haiku` matched). Codex then rejected the model as a `⚠️ ` system warning and the node "completed" in 2.1 seconds with empty output, after which the workflow opened a hallucinated PR. Three stacked bugs and two amplifiers; this commit removes all five. Changes: - Delete model-validation.ts entirely (inferProviderFromModel and isModelCompatible are gone). Drop the matching field from ProviderRegistration and from the claude/codex/pi entries. - Replace the resolver in executor.ts and dag-executor.ts (both the per-node and per-loop paths) with a flat `node.provider ?? workflow.provider ?? config.assistant`. Model never influences provider selection; load-time validation is just isRegisteredProvider on the resolved provider id. - Remove the dag-node Zod superRefine that recomputed model-compat — load-time provider validation moved to loader.ts. - Codex provider: stream loop now matches Claude's contract. error events that aren't followed by turn.completed yield `result.isError: true` (subtype `codex_stream_incomplete`) so the dag-executor's existing isError path catches them. turn.failed becomes `codex_turn_failed` with the same shape. Iterator close without a terminal event is itself a fail-stop. MCP-client errors remain filtered (Codex retries those internally). - dag-executor: AI nodes that exit the streaming loop with empty assistant text and no structured output now fail with `dag.node_empty_output` instead of completing silently — the Sasha bug's final amplifier. Bash/script/approval nodes are unaffected. Tests: model-validation.test.ts and isPiModelCompatible block deleted; codex provider tests rewritten to assert the new fail-stop contract; dag-executor empty-output test flipped to assert failure; new tests cover (a) loader rejecting unknown provider, (b) loader accepting any model string with a known provider, (c) executor passing provider+model through without re-routing, (d) executor throwing on unknown provider, (e) Codex synthesizing fail-stop on iterator close. Two cost-tracking tests adjusted to yield non-empty assistant text since their intent was cost accumulation, not empty-output handling. bun run validate: green (check:bundled, type-check, lint --max-warnings 0, format:check, all packages' test suites — 0 fail). End-to-end smoke (.archon/workflows/test-workflows/): - e2e-deterministic: PASS (engine healthy) - e2e-codex-smoke: PASS (Codex sendQuery + structured output work) - e2e-claude-smoke: FAIL with `error: unknown option '--no-env-file'` — this is a regression from the SDK 0.2.121 bump (#1460), not from this redesign. The Claude provider source is unchanged on this branch. To be fixed separately. * fix(workflows): address review on #1463 Critical: - C1: empty-output guard now skips idle-timeout completions. The on-screen message says "completed via idle timeout"; flipping that to a failure contradicted the user-facing log. Added !nodeIdleTimedOut to the guard. - C2: per-node provider identity is now validated at YAML load time. Loader iterates dagNodes after parsing and rejects any unknown provider id with "Node 'X': unknown provider 'Y'. Registered: ...". The dag-executor's runtime check stays as defense-in-depth. Important: - I1: CHANGELOG entry under [Unreleased] > Changed describing the resolver redesign + an explicit migration line for workflows that relied on cross-provider model inference. - I2: restored the dropped mockLogger.error('turn_failed') assertion in the turn.failed-without-error-message test. - I3: empty-output test now also asserts store.failWorkflowRun was called, matching the parallel error_max_budget_usd test pattern. - I4: new test that proves a node yielding zero assistant text but a valid structuredOutput is treated as a successful completion (not caught by the empty-output guard). - I5: rewrote the post-loop comment in codex/provider.ts to be precise about which dag-executor branch catches the synthesized result chunk (the throwing msg.isError branch, distinct from the empty-output guard's { state: 'failed' } return). - I6: removed PR-era "redesign" / "Sasha workflow" references from three test-file comments. - I7: docs sweep for the deleted isModelCompatible field — six files updated (CLAUDE.md, two docs guides, quick-reference, contributing guide, architecture reference). Polish: - S3: dropped the dead sawTerminal flag in streamCodexEvents — both terminal branches `return`, so reaching the post-loop block always means no terminal fired. Pure simplification. - S4: dropped parsePiModelRef and PiModelRef from community/pi/index.ts exports. The parser is consumed only by Pi's provider.ts; making it package-internal narrows the public surface. - S6: new Codex test for the bare-stream-close case (zero events, iterator just ends) — locks in the default fallback message used when no captured non-MCP error is available. - S7: new dag-executor test for per-node unknown-provider at runtime. Bypasses the loader to exercise resolveNodeProviderAndModel's throw, asserts the node_failed event carries the "unknown provider 'claud'" detail (the workflow-level fail message is a generic summary). bun run validate green across all 10 packages. * fix(workflows): address CodeRabbit review on #1463 Two real issues from CodeRabbit's automated pass on db95e8a: 1. Empty-output fail-stop now applies to loop iterations too. The single-shot AI-node guard at executeNodeInternal only covered prompt/command nodes; executeLoopNode has its own streaming path, so a provider that closed cleanly with zero content could pause an interactive loop with a blank gate or burn the full max_iterations budget. Mirrors the contract of the single-shot guard: `fullOutput.trim() === '' && !iterationIdleTimedOut` fails the iteration with a `loop_iteration_failed` event carrying a clear error. Idle-timeout exits remain exempt for the same reason as single-shot nodes — the on-screen "completed via idle timeout" message would otherwise contradict the failure. 2. Unknown loop providers now throw instead of return-failed. The early-return path bypassed the layer dispatch's outer catch at line 2870, so loop nodes with an invalid per-node `provider:` field skipped the standard `node_failed` event, the user-facing message, and the pre-execution log entry. Throwing reuses the common failure path — same shape as resolveNodeProviderAndModel uses for non-loop nodes. Both align with CLAUDE.md's "fail fast, explicit errors, never silently swallow" principle. The third CodeRabbit finding (boundary violation for `@archon/providers` import in loader.ts) is consistent with existing precedent — `dag-executor.ts`, `executor.ts`, and `validator.ts` already import from the same path; the runtime contract (every entrypoint bootstraps the registry before parseWorkflow runs) is already enforced in tests and documented at `loader.test.ts:31`. bun run validate green across all 10 packages.
Summary
@anthropic-ai/claude-agent-sdk^0.2.89vs0.2.121;@openai/codex-sdk^0.116.0vs0.125.0). Behind-ness was surfaced while diagnosing a workflow-routing bug where Codex rejected a Claude model alias with a non-fatal warning.claude-opus-4-7[1m], current GPT-5.x variants) are validated server-side by the SDKs. Staying on old pins doesn't break anything per se, but it widens the gap before the next bump and slows iteration on follow-up provider work.package.json+packages/providers/package.json, plus the regeneratedbun.lock.options.envoverlay/replace flap (introduced in 0.2.113, reverted in 0.2.111-line — confirmed against the SDK CHANGELOG) is a no-op for Archon becausebuildSubprocessEnv()already passes{ ...process.env }.UX Journey
No user-facing behavior change. Workflow execution, platform adapters, and CLI surfaces are identical before and after.
Architecture Diagram
No architectural change — pure dependency version bump. Same modules, same edges, same exports.
Connection inventory:
@archon/providers(claude)@anthropic-ai/claude-agent-sdk@archon/providers(codex)@openai/codex-sdk@anthropic-ai/claude-agent-sdkLabel Snapshot
risk: lowsize: XSdependenciesproviders:claude,providers:codexChange Metadata
choremulti(root +@archon/providers)Linked Issue
Validation Evidence (required)
bun run validate # EXIT=0All five gates pass on the bumped SDKs:
check:bundled✓type-check(10 packages) ✓lint --max-warnings 0✓format:check✓0 fail) ✓Security Impact (required)
Patch-level upstream releases only — no new transitive dependencies of note. Vendor SDKs continue to talk to the same Anthropic / OpenAI endpoints with the same auth flow.
Compatibility / Migration
Existing workflow YAMLs,
.archon/config.yamlfiles, and stored sessions continue to work unchanged.Human Verification (required)
bun installresolves cleanly to 0.2.121 / 0.125.0 in the lockfile; fullbun run validatereturnsEXIT=0; the SDK API surface used bypackages/providers/src/claude/provider.tsandpackages/providers/src/codex/provider.ts(query,Options, message events) type-checks against the new.d.ts.options.envoverlay/replace flap, which is a no-op for Archon's{ ...process.env }usage. Codex SDK release notes 0.116 → 0.125 show no documented JS API changes.Side Effects / Blast Radius (required)
dag.node_sdk_error_resultlog event catches result-level errors; provider-level Pino logs (stream_error,turn_failed) catch streaming issues.Rollback Plan (required)
git revert <merge-sha>and re-runbun install. No DB or filesystem state is touched.query()calls or as a TypeScript build break onbun run type-check. Both are caught immediately in CI.Risks and Mitigations
packages/providers/src/claude/provider.test.ts,packages/providers/src/codex/provider.test.ts). Rollback is a one-line revert if a real-world issue surfaces.Summary by CodeRabbit