Skip to content

chore(deps): bump claude-agent-sdk to 0.2.121, codex-sdk to 0.125.0#1460

Merged
Wirasm merged 1 commit intodevfrom
chore/bump-sdks
Apr 28, 2026
Merged

chore(deps): bump claude-agent-sdk to 0.2.121, codex-sdk to 0.125.0#1460
Wirasm merged 1 commit intodevfrom
chore/bump-sdks

Conversation

@Wirasm
Copy link
Copy Markdown
Collaborator

@Wirasm Wirasm commented Apr 28, 2026

Summary

  • Problem: Both vendor SDKs were ~30 patch releases behind (@anthropic-ai/claude-agent-sdk ^0.2.89 vs 0.2.121; @openai/codex-sdk ^0.116.0 vs 0.125.0). Behind-ness was surfaced while diagnosing a workflow-routing bug where Codex rejected a Claude model alias with a non-fatal warning.
  • Why it matters: Newer model aliases / IDs (e.g. claude-opus-4-7[1m], current GPT-5.x variants) are validated server-side by the SDKs. Staying on old pins doesn't break anything per se, but it widens the gap before the next bump and slows iteration on follow-up provider work.
  • What changed: Three pin bumps in package.json + packages/providers/package.json, plus the regenerated bun.lock.
  • What did NOT change: Zero source code changes. No API surface used by Archon shifted. The Claude SDK's options.env overlay/replace flap (introduced in 0.2.113, reverted in 0.2.111-line — confirmed against the SDK CHANGELOG) is a no-op for Archon because buildSubprocessEnv() already passes { ...process.env }.

UX Journey

No user-facing behavior change. Workflow execution, platform adapters, and CLI surfaces are identical before and after.

Architecture Diagram

No architectural change — pure dependency version bump. Same modules, same edges, same exports.

Connection inventory:

From To Status Notes
@archon/providers (claude) @anthropic-ai/claude-agent-sdk unchanged version pin moved
@archon/providers (codex) @openai/codex-sdk unchanged version pin moved
root workspace @anthropic-ai/claude-agent-sdk unchanged version pin moved

Label Snapshot

  • Risk: risk: low
  • Size: size: XS
  • Scope: dependencies
  • Module: providers:claude, providers:codex

Change Metadata

  • Change type: chore
  • Primary scope: multi (root + @archon/providers)

Linked Issue

  • Closes #
  • Related #
  • Depends on # (if stacked)
  • Supersedes # (if replacing older PR)

Validation Evidence (required)

bun run validate
# EXIT=0

All five gates pass on the bumped SDKs:

  • check:bundled
  • type-check (10 packages) ✓
  • lint --max-warnings 0
  • format:check
  • tests (every package, every file 0 fail) ✓

Security Impact (required)

  • New permissions/capabilities? No
  • New external network calls? No
  • Secrets/tokens handling changed? No
  • File system access scope changed? No

Patch-level upstream releases only — no new transitive dependencies of note. Vendor SDKs continue to talk to the same Anthropic / OpenAI endpoints with the same auth flow.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Database migration needed? No

Existing workflow YAMLs, .archon/config.yaml files, and stored sessions continue to work unchanged.

Human Verification (required)

  • Verified scenarios: bun install resolves cleanly to 0.2.121 / 0.125.0 in the lockfile; full bun run validate returns EXIT=0; the SDK API surface used by packages/providers/src/claude/provider.ts and packages/providers/src/codex/provider.ts (query, Options, message events) type-checks against the new .d.ts.
  • Edge cases checked: Reviewed Claude SDK CHANGELOG for breaking changes between 0.2.89 and 0.2.121 — only options.env overlay/replace flap, which is a no-op for Archon's { ...process.env } usage. Codex SDK release notes 0.116 → 0.125 show no documented JS API changes.
  • What was not verified: Did not run a live workflow against either provider end-to-end (would require real API credits and a real Claude / Codex CLI on the harness). Type-check + tests provide enough coverage that a regression here would have surfaced.

Side Effects / Blast Radius (required)

  • Affected subsystems/workflows: Anything that runs a Claude or Codex node in a workflow, plus direct chat through either provider.
  • Potential unintended effects: A latent SDK regression we couldn't reproduce in unit tests could surface only at request time. Low likelihood given the patch-level cadence and the existing test coverage of the wrapper code.
  • Guardrails/monitoring for early detection: Existing dag.node_sdk_error_result log event catches result-level errors; provider-level Pino logs (stream_error, turn_failed) catch streaming issues.

Rollback Plan (required)

  • Fast rollback command/path: git revert <merge-sha> and re-run bun install. No DB or filesystem state is touched.
  • Feature flags or config toggles: None — pure dependency pin.
  • Observable failure symptoms: A user-visible regression would show up as either workflow nodes failing on query() calls or as a TypeScript build break on bun run type-check. Both are caught immediately in CI.

Risks and Mitigations

  • Risk: Undocumented SDK regression in the 32 patch / 9 minor releases we skip past.
    • Mitigation: Full validation suite passes, including the provider-level test files (packages/providers/src/claude/provider.test.ts, packages/providers/src/codex/provider.test.ts). Rollback is a one-line revert if a real-world issue surfaces.

Summary by CodeRabbit

  • Chores
    • Updated project dependencies to the latest compatible versions.

Both SDKs were ~30 patch releases behind. Validation suite passes
(type-check, lint, format, tests across all 10 packages) without code
changes. The only sustained Claude SDK behavior change in the range —
v0.2.111's options.env overlay/replace flap, since reverted to overlay —
is a no-op for Archon, which already passes { ...process.env } as the
SDK env.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 28, 2026

Caution

Review failed

Pull request was closed or merged during review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6265dd3a-69b3-490a-a2d5-667f8581727f

📥 Commits

Reviewing files that changed from the base of the PR and between 2220ffe and d141bb1.

⛔ Files ignored due to path filters (1)
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • package.json
  • packages/providers/package.json

📝 Walkthrough

Walkthrough

Dependency versions are updated across two package.json files. The root package.json upgrades @anthropic-ai/claude-agent-sdk to version 0.2.121, while packages/providers/package.json upgrades both @anthropic-ai/claude-agent-sdk to 0.2.121 and @openai/codex-sdk to 0.125.0.

Changes

Cohort / File(s) Summary
Dependency Updates
package.json, packages/providers/package.json
Bumped @anthropic-ai/claude-agent-sdk to ^0.2.121 in both files and upgraded @openai/codex-sdk to ^0.125.0 in providers package.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

🐰 Hopping through versions with glee,
Dependencies dance wild and free,
From point two to one-two-one,
The SDK upgrades are done!
Carrots and code, a perfect blend,
On which we can always depend! 🥕✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: bumping two SDK dependency versions. It directly matches the changeset which updates @anthropic-ai/claude-agent-sdk and @openai/codex-sdk in package files.
Description check ✅ Passed The description comprehensively covers all major template sections including summary, architecture, validation evidence, security impact, compatibility, human verification, side effects, and rollback plan with appropriate detail for a dependency bump.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch chore/bump-sdks

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Wirasm Wirasm merged commit 0afbeb3 into dev Apr 28, 2026
3 of 4 checks passed
@Wirasm Wirasm deleted the chore/bump-sdks branch April 28, 2026 08:25
Wirasm added a commit that referenced this pull request Apr 28, 2026
* refactor(workflows): trust the SDK for model validation

Drops cross-provider model inference and hard-coded model allow-lists.
The string a workflow author writes in `model:` is forwarded to the SDK
unchanged; the SDK and its API decide whether the model exists. Provider
identity is the only thing Archon validates at load time — typos like
`provider: claud` are caught early; everything else fails at runtime
through the SDK's normal error path.

Why this matters: a recent run on Sasha showed `provider: claude` +
`model: opus[1m]` getting silently routed to Codex (because Codex's
isModelCompatible was defined as the complement of Claude's, so anything
not literally `sonnet|opus|haiku` matched). Codex then rejected the model
as a `⚠️` system warning and the node "completed" in 2.1 seconds with
empty output, after which the workflow opened a hallucinated PR. Three
stacked bugs and two amplifiers; this commit removes all five.

Changes:

- Delete model-validation.ts entirely (inferProviderFromModel and
  isModelCompatible are gone). Drop the matching field from
  ProviderRegistration and from the claude/codex/pi entries.
- Replace the resolver in executor.ts and dag-executor.ts (both the
  per-node and per-loop paths) with a flat
  `node.provider ?? workflow.provider ?? config.assistant`. Model never
  influences provider selection; load-time validation is just
  isRegisteredProvider on the resolved provider id.
- Remove the dag-node Zod superRefine that recomputed model-compat —
  load-time provider validation moved to loader.ts.
- Codex provider: stream loop now matches Claude's contract. error
  events that aren't followed by turn.completed yield
  `result.isError: true` (subtype `codex_stream_incomplete`) so the
  dag-executor's existing isError path catches them. turn.failed
  becomes `codex_turn_failed` with the same shape. Iterator close
  without a terminal event is itself a fail-stop. MCP-client errors
  remain filtered (Codex retries those internally).
- dag-executor: AI nodes that exit the streaming loop with empty
  assistant text and no structured output now fail with
  `dag.node_empty_output` instead of completing silently — the Sasha
  bug's final amplifier. Bash/script/approval nodes are unaffected.

Tests: model-validation.test.ts and isPiModelCompatible block deleted;
codex provider tests rewritten to assert the new fail-stop contract;
dag-executor empty-output test flipped to assert failure; new tests
cover (a) loader rejecting unknown provider, (b) loader accepting any
model string with a known provider, (c) executor passing
provider+model through without re-routing, (d) executor throwing on
unknown provider, (e) Codex synthesizing fail-stop on iterator close.
Two cost-tracking tests adjusted to yield non-empty assistant text
since their intent was cost accumulation, not empty-output handling.

bun run validate: green (check:bundled, type-check, lint
--max-warnings 0, format:check, all packages' test suites — 0 fail).

End-to-end smoke (.archon/workflows/test-workflows/):
- e2e-deterministic: PASS (engine healthy)
- e2e-codex-smoke: PASS (Codex sendQuery + structured output work)
- e2e-claude-smoke: FAIL with `error: unknown option '--no-env-file'`
  — this is a regression from the SDK 0.2.121 bump (#1460), not from
  this redesign. The Claude provider source is unchanged on this
  branch. To be fixed separately.

* fix(workflows): address review on #1463

Critical:
- C1: empty-output guard now skips idle-timeout completions. The on-screen
  message says "completed via idle timeout"; flipping that to a failure
  contradicted the user-facing log. Added !nodeIdleTimedOut to the guard.
- C2: per-node provider identity is now validated at YAML load time.
  Loader iterates dagNodes after parsing and rejects any unknown
  provider id with "Node 'X': unknown provider 'Y'. Registered: ...".
  The dag-executor's runtime check stays as defense-in-depth.

Important:
- I1: CHANGELOG entry under [Unreleased] > Changed describing the
  resolver redesign + an explicit migration line for workflows that
  relied on cross-provider model inference.
- I2: restored the dropped mockLogger.error('turn_failed') assertion in
  the turn.failed-without-error-message test.
- I3: empty-output test now also asserts store.failWorkflowRun was
  called, matching the parallel error_max_budget_usd test pattern.
- I4: new test that proves a node yielding zero assistant text but a
  valid structuredOutput is treated as a successful completion (not
  caught by the empty-output guard).
- I5: rewrote the post-loop comment in codex/provider.ts to be precise
  about which dag-executor branch catches the synthesized result chunk
  (the throwing msg.isError branch, distinct from the empty-output
  guard's { state: 'failed' } return).
- I6: removed PR-era "redesign" / "Sasha workflow" references from
  three test-file comments.
- I7: docs sweep for the deleted isModelCompatible field — six files
  updated (CLAUDE.md, two docs guides, quick-reference, contributing
  guide, architecture reference).

Polish:
- S3: dropped the dead sawTerminal flag in streamCodexEvents — both
  terminal branches `return`, so reaching the post-loop block always
  means no terminal fired. Pure simplification.
- S4: dropped parsePiModelRef and PiModelRef from community/pi/index.ts
  exports. The parser is consumed only by Pi's provider.ts; making it
  package-internal narrows the public surface.
- S6: new Codex test for the bare-stream-close case (zero events,
  iterator just ends) — locks in the default fallback message used
  when no captured non-MCP error is available.
- S7: new dag-executor test for per-node unknown-provider at runtime.
  Bypasses the loader to exercise resolveNodeProviderAndModel's throw,
  asserts the node_failed event carries the "unknown provider 'claud'"
  detail (the workflow-level fail message is a generic summary).

bun run validate green across all 10 packages.

* fix(workflows): address CodeRabbit review on #1463

Two real issues from CodeRabbit's automated pass on db95e8a:

1. Empty-output fail-stop now applies to loop iterations too. The
   single-shot AI-node guard at executeNodeInternal only covered
   prompt/command nodes; executeLoopNode has its own streaming path,
   so a provider that closed cleanly with zero content could pause an
   interactive loop with a blank gate or burn the full max_iterations
   budget. Mirrors the contract of the single-shot guard:
   `fullOutput.trim() === '' && !iterationIdleTimedOut` fails the
   iteration with a `loop_iteration_failed` event carrying a clear
   error. Idle-timeout exits remain exempt for the same reason as
   single-shot nodes — the on-screen "completed via idle timeout"
   message would otherwise contradict the failure.

2. Unknown loop providers now throw instead of return-failed. The
   early-return path bypassed the layer dispatch's outer catch at
   line 2870, so loop nodes with an invalid per-node `provider:`
   field skipped the standard `node_failed` event, the user-facing
   message, and the pre-execution log entry. Throwing reuses the
   common failure path — same shape as resolveNodeProviderAndModel
   uses for non-loop nodes.

Both align with CLAUDE.md's "fail fast, explicit errors, never silently
swallow" principle. The third CodeRabbit finding (boundary violation
for `@archon/providers` import in loader.ts) is consistent with
existing precedent — `dag-executor.ts`, `executor.ts`, and
`validator.ts` already import from the same path; the runtime contract
(every entrypoint bootstraps the registry before parseWorkflow runs) is
already enforced in tests and documented at `loader.test.ts:31`.

bun run validate green across all 10 packages.
@Wirasm Wirasm mentioned this pull request Apr 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant