refactor: decompose provider sendQuery() into explicit helper boundaries by Wirasm · Pull Request #1162 · coleam00/Archon

Wirasm · 2026-04-13T07:01:59Z

Summary

Decomposes sendQuery() in both Claude and Codex providers from monolithic methods into focused internal helpers with clear ownership boundaries
Introduces structured ProviderWarning { code, message } type replacing raw string warnings for machine-readable provider translation warnings
Adds 43 focused unit tests covering the extracted helpers directly

Root Cause

sendQuery() was carrying 5+ mixed responsibilities (SDK option building, nodeConfig translation, warning collection, stream normalization, retry/error handling). Before Phase 2 adds provider extensibility, this boundary needs to be explicit so community providers don't need to cargo-cult a large mixed-responsibility method.

Changes

File	Change
`packages/providers/src/claude/provider.ts`	Extract `buildBaseClaudeOptions`, `buildToolCaptureHooks`, `streamClaudeMessages`, `classifyAndEnrichError`; `applyNodeConfig` now returns `ProviderWarning[]`
`packages/providers/src/codex/provider.ts`	Extract `buildTurnOptions`, `streamCodexEvents`, `classifyAndEnrichCodexError`
`packages/providers/src/claude/provider.test.ts`	Add 26 focused tests for `applyNodeConfig`, `streamClaudeMessages`, `classifyAndEnrichError`, `buildBaseClaudeOptions`
`packages/providers/src/codex/provider.test.ts`	Add 17 focused tests for `buildTurnOptions`, `streamCodexEvents`, `classifyAndEnrichCodexError`

Design Decisions

No new public API: All helpers are file-local or exported for testing only. The IAgentProvider.sendQuery() contract is unchanged.
No new abstractions: Helpers are plain functions, not a class hierarchy or framework.
Structured warnings: ProviderWarning { code, message } replaces raw strings. Providers still convert to system chunks at the boundary — callers don't need to change.
Platform policy stays outside: No env-leak/consent logic reintroduced into providers (per security: zero env leaks from managed repos into provider subprocesses #1135 boundary).

Testing

Type check passes (all 10 packages)
All 136 provider tests pass (77 Claude + 59 Codex)
43 new focused helper tests pass
Lint passes (zero warnings)
Format check passes
Full bun run validate passes

Validation

bun run validate

Issue

Fixes #1139

Summary by CodeRabbit

Refactor
- Improved streaming and event normalization across providers for consistent assistant, tool, result and system messages.
- Centralized options and structured warning handling so warnings are emitted as structured system messages.
Bug Fixes
- Improved abort/timeout handling and enriched error reporting that guides retry decisions.
- Avoids duplicate system warnings across retries; preserves timeout diagnostics.
Tests
- Added regression suites validating streaming, aborts, retry behavior, stderr inclusion, and structured-output handling.

…ies (#1139) sendQuery() in both Claude and Codex providers was a monolith mixing SDK option building, nodeConfig translation, stream normalization, and error classification. This makes it hard to safely extend for Phase 2 provider extensibility. Decompose both providers into focused internal helpers: Claude: - buildBaseClaudeOptions: SDK option construction - buildToolCaptureHooks: PostToolUse/PostToolUseFailure hook setup - applyNodeConfig: workflow nodeConfig → SDK translation + structured warnings - streamClaudeMessages: raw SDK event → MessageChunk normalization - classifyAndEnrichError: error classification with retry decisions Codex: - buildTurnOptions: per-turn option construction (output schema, abort) - streamCodexEvents: raw SDK event → MessageChunk normalization - classifyAndEnrichCodexError: error classification with retry decisions Also introduces ProviderWarning { code, message } replacing raw string warnings for machine-readable provider translation warnings. Adds 43 focused unit tests covering the extracted helpers directly. Fixes #1139

coderabbitai · 2026-04-13T07:02:15Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Refactors Claude and Codex providers by extracting option builders, stream normalizers, and error classifiers; converts nodeConfig warnings to structured ProviderWarning objects; centralizes streaming and retry/enrichment logic, and updates tests to exercise retry/streaming behaviors.

Changes

Cohort / File(s)	Summary
Claude provider `packages/providers/src/claude/provider.ts`	Introduces `ProviderWarning`, `ToolResultEntry`, `buildBaseClaudeOptions()`, `buildToolCaptureHooks()`, `streamClaudeMessages()`, `classifyAndEnrichError()`. `applyNodeConfig()` now returns `ProviderWarning[]`. `sendQuery()` reworked to use helpers, centralize options, normalize streaming, and centralize retry classification and stderr capture.
Codex provider `packages/providers/src/codex/provider.ts`	Adds `buildTurnOptions()`, `streamCodexEvents()`, `classifyAndEnrichCodexError()`. Moves event normalization, structured-output parsing, todo deduping, tool/result normalization, and abort/error handling into `streamCodexEvents()`; `sendQuery()` simplified to orchestrate helpers.
Tests `packages/providers/src/claude/provider.test.ts`, `packages/providers/src/codex/provider.test.ts`	Adds black-box tests for sendQuery decomposition: first-event timeout preservation, warning dedup across retries, abort semantics across retries, stderr inclusion on final errors, post-tool hook error tolerance, SDK result-is-error surfacing, and structured-output/retry behavior.
Behavioral surface `packages/providers/src/...`	Streaming now emits normalized `MessageChunk` kinds (`assistant`, `tool`, `tool_result`, `system`, `rate_limit`, `result`); nodeConfig warnings are structured and emitted as system chunks before streaming; error classification/enrichment centralizes retry decisions (rate_limit/crash vs auth/abort).

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Provider
    participant SDK
    participant ToolRunner

    Client->>Provider: sendQuery(prompt, options)
    Provider->>Provider: buildBaseOptions(...) / applyNodeConfig() -> collect ProviderWarning[]
    Provider->>SDK: invoke SDK with options (hooks, signal)
    SDK-->>Provider: stream events (assistant/tool/result/err)
    Provider->>Provider: stream* normalizer -> emit MessageChunks (system/assistant/tool/...)
    alt tool execution event
        Provider->>ToolRunner: run tool
        ToolRunner-->>Provider: tool result/events
        Provider->>SDK: (optionally) feed tool result back
    end
    Provider-->>Client: yield MessageChunk(s) and final result or enriched error

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

refactor: extract providers from @archon/core into @archon/providers #1137: Modifies the same provider implementations and earlier provider extraction/refactor work that this change builds upon.
fix: make env-integration test cross-platform (Windows CI) #1160: Adjusts Claude provider tests related to sendQuery; strongly related to test changes here.

Poem

🐰 I nibbled at streams, unhooked tangled threads,

Warnings now tidy, no more stringly-leds.
Helpers in rows, retries counted just right,
Tools speak clearly, errors wear light.
Hop, test, and merge — a carrot-coded night.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Linked Issues check	⚠️ Warning	The PR addresses all key objectives from issue `#1139`: decomposes sendQuery() into explicit helpers, introduces ProviderWarning for structured warnings, adds 43 focused unit tests, preserves public API contract, and keeps platform concerns external. Multi-agent review identified critical issues requiring fixes before merge (buildSubprocessEnv hoisting, abortSignal listener accumulation, PostToolUse error handling, enrichedError throwing).	Address critical issues flagged in multi-agent review: hoist buildSubprocessEnv before retry loop, fix abortSignal listener accumulation, add try/catch to PostToolUse hook, and throw enrichedError instead of lastError after retry exhaustion. Merge after fixes verified.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely describes the main refactoring change: decomposing sendQuery() into explicit helper boundaries, which matches the primary objective of this PR.
Description check	✅ Passed	The description comprehensively covers the root cause, changes, design decisions, testing validation, and references the fixed issue. It follows most template sections (Summary, Changes, Design Decisions, Testing, Validation, Issue) though lacks explicit UX Journey, Architecture Diagram, and some metadata sections.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to decomposing sendQuery() helpers and adding structured warnings; no unrelated refactorings or feature additions detected. Helper extraction, ProviderWarning introduction, and test additions align precisely with issue `#1139` objectives.
Docstring Coverage	✅ Passed	Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch refactor/issue-1139-sendquery-decomposition

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…s API

Wirasm · 2026-04-13T07:04:19Z

Self Code Review

Summary

Clean decomposition that preserves all existing observable behavior. 136 tests pass (93 existing + 43 new). One type visibility issue found and fixed.

Findings

Fixed

ToolResultEntry was not exported but used in the public buildBaseClaudeOptions signature — fixed in follow-up commit

Non-blocking observations

parseClaudeConfig moved outside the retry loop (was inside). Safe because requestOptions is immutable within the call, but worth noting.
streamCodexEvents uses Record<string, unknown> for the event parameter type (deliberate tradeoff for testability vs the SDK's typed events). Internal casts are present but not individually commented.
ProviderWarning is exported from claude/provider.ts but not re-exported from index.ts — intentionally internal for now, can be promoted to the package API when Phase 2 callers need it.

Checklist

Fix addresses root cause from investigation
Code follows codebase patterns
Tests cover the change (43 new focused helper tests)
No obvious bugs introduced
No platform adapter dependencies in providers
No env-leak/consent logic reintroduced

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/providers/src/claude/provider.ts`:
- Around line 743-754: The current classifyAndEnrichError function
unconditionally converts any aborted controller into a generic 'Query aborted'
Error, which loses the original first-event timeout diagnostic; change the logic
in classifyAndEnrichError to detect whether the abort was caused by the
first-message timeout (preserve and return the original timeout Error and its
classification) versus a genuine upstream cancellation (return the 'Query
aborted' Error and 'aborted' class). Locate the abort check around
controller.signal.aborted, allow the original timeout Error to be propagated or
attached (e.g., by inspecting stderrLines or a sentinel on the Error) instead of
overwriting it, and only produce the generic abort Error when the cancellation
is upstream.
- Around line 884-887: The nodeConfigWarnings system chunks are currently
emitted inside the retry loop causing duplicate warnings on retries; move the
emission so it runs only once per request by hoisting the loop that yields {
type: 'system', content: `⚠️ ${warning.message}` } outside the retry/attempt
loop or wrap it with a guard like if (attempt === 0). Locate the retry handling
code surrounding nodeConfigWarnings in provider.ts and ensure nodeConfigWarnings
is iterated and yielded before entering retries (or conditioned on attempt ===
0) so each warning is emitted only once per request.

In `@packages/providers/src/codex/provider.ts`:
- Around line 192-195: streamCodexEvents currently breaks on abort (using
abortSignal?.aborted) which yields a truncated stream without signalling
failure; change streamCodexEvents to throw a clear AbortError (or a custom Error
with message like "Query aborted") when abortSignal.aborted is true, and update
sendQuery to catch that specific error (thrown by streamCodexEvents) and convert
it into the expected "Query aborted" handling or rethrow as appropriate; apply
the same fix to the other occurrence around the code handling abort at lines
referenced (the second abort check) so callers never receive a silently
truncated stream.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d0deb3a9-0014-4c00-ad52-b69f1d584570

📥 Commits

Reviewing files that changed from the base of the PR and between 6a6740a and 8be4664.

📒 Files selected for processing (4)

packages/providers/src/claude/provider.test.ts
packages/providers/src/claude/provider.ts
packages/providers/src/codex/provider.test.ts
packages/providers/src/codex/provider.ts

Wirasm · 2026-04-13T07:07:37Z

PR Review Summary — Multi-Agent Analysis

Five specialized review agents analyzed this PR. Findings below.

Critical Issues (4 found)

Agent	Issue	Location
code-reviewer	`buildSubprocessEnv()` called per retry attempt — logs auth-mode and rebuilds `{ ...process.env }` up to 4× per `sendQuery` call. Should be hoisted before the retry loop.	`claude/provider.ts:77-87` (via `buildBaseClaudeOptions` at L856)
code-reviewer	`abortSignal` listener leaked per retry — each iteration registers a new `abort` listener on `requestOptions.abortSignal` via `{ once: true }`, but `once` only prevents double-fire, not cleanup if abort never fires. Accumulates dangling closures holding old controllers.	`claude/provider.ts:843-853`
silent-failure-hunter	`PostToolUse` hook has no try/catch — `JSON.stringify(toolResponse)` can throw on circular refs. Unlike `PostToolUseFailure` (which has try/catch), this hook will throw an unhandled rejection, silently dropping the tool result from the queue.	`claude/provider.ts:576-590`
silent-failure-hunter	`lastError` thrown at retry exhaustion instead of `enrichedError` — after retries are exhausted, both providers throw the raw un-enriched error (`lastError`), discarding the stderr context assembled by `classifyAndEnrichError`. Affects both Claude (L931) and Codex (L609).	`claude/provider.ts:931`, `codex/provider.ts:609`

Important Issues (6 found)

Agent	Issue	Location
code-reviewer	Exported helpers not re-exported from `index.ts` — ambiguous public API boundary. `ProviderWarning` and `ToolResultEntry` are referenced in exported function signatures but absent from package exports.	`claude/provider.ts` (13 exports), `codex/provider.ts` (3 exports)
code-reviewer	`CodexStreamState` is private but `streamCodexEvents` accepts it as an optional parameter — type leaks into callers who can't reference the type name.	`codex/provider.ts:173-175, 181-187`
silent-failure-hunter	`result` event with `is_error: true` yielded without any log entry — callers that don't inspect `isError` will never know an error occurred.	`claude/provider.ts:684-720`
silent-failure-hunter	Codex `error` events containing `"MCP client"` are logged but never surfaced to the user — contradicts "Fail Fast + Explicit Errors" mandate.	`codex/provider.ts:205-212`
pr-test-analyzer	`applyNodeConfig` hooks-merging logic completely untested (9/10 criticality) — the most complex branching path in the PR, where YAML hooks merge with the pre-existing `PostToolUse` capture hook. Silent breakage would corrupt the tool result queue.	`claude/provider.ts:371-391`
pr-test-analyzer	`applyNodeConfig` skills translation untested (8/10 criticality) — three branches (tools presence, allowedTools dedup, model propagation) gate the entire skills feature with zero test coverage.	`claude/provider.ts:421-444`

Suggestions (8 found)

Agent	Suggestion	Location
type-design-analyzer	Narrow `ProviderWarning.code` to a string literal union (`'mcp_env_vars_missing' \| 'mcp_haiku_tool_search'`) for compile-time enforcement and discoverability.	`claude/provider.ts:342`
type-design-analyzer	Consider deriving `ToolResultEntry` from `MessageChunk['tool_result']` via `Omit<Extract<...>, 'type'>` to prevent structural drift.	`claude/provider.ts:495-499`
type-design-analyzer	Add JSDoc to `streamCodexEvents` documenting that `streamState` is mutated in place (caller-provided state aliasing).	`codex/provider.ts:181-187`
silent-failure-hunter	Add `default` branch to Codex `switch (itemType)` for unrecognized `item.completed` types.	`codex/provider.ts:236-379`
silent-failure-hunter	Add `else` branch to Claude stream normalizer for unrecognized event types.	`claude/provider.ts:648-720`
silent-failure-hunter	Surface malformed MCP server configs as `ProviderWarning` instead of silent `continue`.	`claude/provider.ts:227-231`
silent-failure-hunter	Surface empty hooks map as `ProviderWarning` when `hooks:` config produces no SDK hooks.	`claude/provider.ts:337-349`
docs-impact	CLAUDE.md architecture description could mention `ProviderWarning { code, message }` as the structured warning type for provider translation warnings.	`CLAUDE.md:411`

Strengths

Decomposition is well-structured — each helper has a single clear responsibility
Stream normalization thoroughly tested for all event types including edge cases (drain-after-end, late tool results)
Error classification has complete unit test coverage across all classes + aborted controller
buildBaseClaudeOptions tests cover model priority, env merging, stderr capture, settingSources
ProviderWarning is a genuine improvement over raw strings — enables programmatic handling
ToolResultEntry is well-designed (7.75/10) with correct optionality for toolCallId
Retry behavior tested end-to-end with timing verification for both providers
No new public API surface — IAgentProvider.sendQuery() contract unchanged

Documentation Issues

CLAUDE.md L411 — architecture description of @archon/providers could mention ProviderWarning type (minor; no public API change)

Test Coverage Gaps (by criticality)

9/10 — applyNodeConfig hooks-merging with existing capture hook
8/10 — applyNodeConfig skills translation branches
7/10 — MCP warning paths (mcp_env_vars_missing, mcp_haiku_tool_search)
6/10 — PostToolUseFailure hook is_interrupt prefix branch
6/10 — classifyAndEnrichCodexError model_access sub-cases (direct unit test)

Type Design Scores

Type	Encapsulation	Expression	Usefulness	Enforcement	Overall
`ProviderWarning`	6	5	8	7	6.5
`ToolResultEntry`	8	7	8	8	7.75
`CodexStreamState`	9	4	7	5	6

Verdict: NEEDS FIXES

The decomposition itself is clean and the overall test coverage is solid. The 4 critical issues are concrete bugs/regressions introduced by the refactor:

Repeated env building + auth logging per retry (behavioral regression)
Accumulating abort listeners (memory leak)
Missing try/catch in PostToolUse hook (asymmetric error handling)
Enriched error discarded at retry exhaustion (diagnostic regression)

Recommended Actions

Fix 4 critical issues
Add tests for hooks-merging and skills translation in applyNodeConfig
Address important error-handling gaps (is_error logging, MCP error surfacing)
Consider type narrowing suggestions at your discretion
Re-run review after fixes

coderabbitai

🧹 Nitpick comments (1)

packages/providers/src/claude/provider.ts (1)

630-633: Add type parameter to AsyncGenerator for stricter type safety.

The events parameter lacks a type argument, defaulting to AsyncGenerator<unknown, any, any>. Adding an explicit type parameter improves clarity and aligns with the strict TypeScript configuration.

♻️ Suggested fix

 export async function* streamClaudeMessages(
-  events: AsyncGenerator,
+  events: AsyncGenerator<unknown>,
   toolResultQueue: ToolResultEntry[]
 ): AsyncGenerator<MessageChunk> {

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/providers/src/claude/provider.ts` around lines 630 - 633, The
parameter events in streamClaudeMessages is declared as AsyncGenerator without
type args; update its type to an explicit AsyncGenerator with the proper event
type and yield/return types used by this stream (e.g.,
AsyncGenerator<ClaudeEvent, void, unknown> or the specific event interface you
use elsewhere) so TypeScript enforces correct types; update the signature of
export async function* streamClaudeMessages(events: AsyncGenerator<...>,
toolResultQueue: ToolResultEntry[]): AsyncGenerator<MessageChunk> accordingly
and pick the actual event type name present in your codebase (e.g., ClaudeEvent
or similar) for the first generic and set the second/third generics to the
appropriate yield/return types.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@packages/providers/src/claude/provider.ts`:
- Around line 630-633: The parameter events in streamClaudeMessages is declared
as AsyncGenerator without type args; update its type to an explicit
AsyncGenerator with the proper event type and yield/return types used by this
stream (e.g., AsyncGenerator<ClaudeEvent, void, unknown> or the specific event
interface you use elsewhere) so TypeScript enforces correct types; update the
signature of export async function* streamClaudeMessages(events:
AsyncGenerator<...>, toolResultQueue: ToolResultEntry[]):
AsyncGenerator<MessageChunk> accordingly and pick the actual event type name
present in your codebase (e.g., ClaudeEvent or similar) for the first generic
and set the second/third generics to the appropriate yield/return types.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ae25bbb5-d1e4-45b4-ad67-60d4428ff4b2

📥 Commits

Reviewing files that changed from the base of the PR and between 8be4664 and aaf4923.

📒 Files selected for processing (1)

packages/providers/src/claude/provider.ts

…try state bug Review findings: 1. Internal helpers were exported and reachable through package.json subpath exports (./claude/provider, ./codex/provider), widening the public API. All new helpers are now file-local — the only public exports remain ClaudeProvider, CodexProvider, loadMcpConfig, buildSDKHooksFromYAML, withFirstMessageTimeout, getProcessUid. 2. Codex streamState (lastTodoListSignature) was shared across retry attempts, causing todo-list dedup to suppress output on retry. Now creates fresh state per attempt. Removed direct helper test imports — existing sendQuery e2e tests (51 Claude + 42 Codex) cover all behavior paths.

…lowing Fixes from CodeRabbit + multi-agent review: 1. classifyAndEnrichError preserves first-event timeout diagnostic instead of collapsing it into generic "Query aborted" (the timeout aborts the controller, but the original error carries the #1067 breadcrumb) 2. nodeConfigWarnings emitted once before retry loop, not per attempt 3. buildSubprocessEnv() called once before retry loop (was re-logging auth mode and rebuilding { ...process.env } per attempt) 4. Abort signal listener registered once with forwarding to current controller (was accumulating per-retry listeners) 5. PostToolUse hook wrapped in try/catch (JSON.stringify can throw on circular refs — was asymmetric with PostToolUseFailure which had it) 6. Codex streamCodexEvents throws on abort instead of silent break (callers were getting truncated stream with no result/error) 7. Both providers store enrichedError (not raw error) for retry exhaustion — preserves stderr context in final throw 8. Log is_error result events at error level in Claude stream normalizer

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

packages/providers/src/claude/provider.ts (1)
920-931: ⚠️ Potential issue | 🟠 Major

After retry exhaustion, throws raw error instead of enriched error.

Same issue as Codex provider: line 927 stores the raw err, but after the loop (line 931), lastError is thrown without enrichment. The diagnostic context from classifyAndEnrichError is lost.
🐛 Proposed fix
-        lastError = err;
+        lastError = enrichedError;
       }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/providers/src/claude/provider.ts` around lines 920 - 931, The final
throw after the retry loop in Claude's provider currently throws the raw
lastError (assigned from err) losing enrichment; update the loop to ensure
lastError stores the enrichedError returned by classifyAndEnrichError (or set
lastError = enrichedError before continuing) and change the final throw to throw
lastError (which will now be enriched) or fall back to enrichedError/new Error
if undefined; adjust references around shouldRetry, attempt,
MAX_SUBPROCESS_RETRIES, classifyAndEnrichError and enrichedError to ensure the
enriched diagnostic context is preserved when throwing after retries.
packages/providers/src/codex/provider.ts (1)
595-606: ⚠️ Potential issue | 🟠 Major

After retry exhaustion, throws raw error instead of enriched error.

When retries are exhausted via the loop completing (line 606), lastError is the raw error without enrichment. The enrichedError from classifyAndEnrichCodexError is discarded after each retry iteration. This loses diagnostic context that would help users debug failures.
🐛 Proposed fix
-        lastError = err;
+        lastError = enrichedError;
       }
     }
 
-    throw lastError ?? new Error('Codex query failed after retries');
+    throw lastError ?? new Error('Codex query failed after retries');
Store enrichedError instead of the raw err so the thrown error retains classification and context.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/providers/src/codex/provider.ts` around lines 595 - 606, The loop
currently assigns the raw err to lastError and throws that after retries, losing
enrichments from classifyAndEnrichCodexError; update the retry loop so that
after calling classifyAndEnrichCodexError you store the returned enrichedError
into lastError (instead of err) and throw lastError (enriched) when retries are
exhausted (respecting shouldRetry and MAX_SUBPROCESS_RETRIES and preserving
existing delay logic that uses retryBaseDelayMs and getLog()) so the final
thrown error contains the classification/context.

🧹 Nitpick comments (3)

packages/providers/src/claude/provider.ts (2)

630-721: Consider logging unrecognized top-level event types.

The helper handles assistant, system, rate_limit_event, and result events. Other event types are silently ignored. A debug log for unrecognized types would help catch SDK changes.

💡 Suggested addition after line 719

     } else if (event.type === 'result') {
       // ... existing result handling ...
+    } else {
+      getLog().debug({ eventType: event.type }, 'claude.unhandled_event_type');
     }
   }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/providers/src/claude/provider.ts` around lines 630 - 721,
streamClaudeMessages currently ignores any top-level event types other than
'assistant', 'system', 'rate_limit_event', and 'result'; add a debug log for
unexpected event types so SDK changes don't go unnoticed. After the existing
if/else chain in streamClaudeMessages, add an else branch that calls
getLog().debug({ eventType: event.type, raw: msg }, 'claude.unhandled_event')
(or similar) to surface unrecognized event types and their raw payloads;
reference the streamClaudeMessages function and the local variables event and
msg to locate where to insert this logging.

335-344: Consider narrowing code to a literal union.

Using string for code allows arbitrary values. A literal union would provide stronger typing and enable exhaustive handling.

💡 Example

 interface ProviderWarning {
-  code: string;
+  code: 'mcp_env_vars_missing' | 'mcp_haiku_tool_search';
   message: string;
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/providers/src/claude/provider.ts` around lines 335 - 344, The
ProviderWarning interface's code property is too permissive as string; define a
specific union type (e.g., `type ProviderWarningCode = 'quota' | 'rate_limit' |
'deprecated' | 'transient'` or whatever canonical codes your providers emit),
replace the `code: string` with `code: ProviderWarningCode`, and update any
usages (parsing/creation sites) to return one of those literals so callers can
exhaustively switch on ProviderWarning.code; keep the ProviderWarning name and
shape but tighten the `code` type and export the union type for reuse.

packages/providers/src/codex/provider.ts (1)

235-378: Add a default branch for unrecognized item types.

The switch handles known item types but silently ignores unknown ones. If the Codex SDK adds new item types, they'll be dropped without notice. Consider adding a default case with a debug log.

💡 Suggested addition

         break;
       }
+
+      default:
+        getLog().debug({ itemType, itemId: item.id }, 'item_completed_unknown_type');
+        break;
     }
   }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/providers/src/codex/provider.ts` around lines 235 - 378, The switch
on itemType in the generator (the block that handles 'agent_message',
'command_execution', etc.) lacks a default branch so unknown item types are
silently dropped; add a default case that logs a debug (or warn) with getLog(),
including the itemId and the unrecognized itemType (e.g., getLog().debug({
itemId: item.id, itemType }, 'unknown_item_type')) and optionally yield a
minimal system message or no-op if you want visibility to consumers; place this
default at the end of the switch to ensure any future/new item types are
surfaced in logs.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/providers/src/claude/provider.ts`:
- Around line 505-518: buildSubprocessEnv() is being called inside
buildBaseClaudeOptions, which runs on every retry and causes duplicate auth-mode
logs; compute the subprocess env once before entering the retry loop and pass
that prebuilt env into buildBaseClaudeOptions instead of calling
buildSubprocessEnv() repeatedly. Change buildBaseClaudeOptions to accept a
prepared env object (e.g. add parameter subprocessEnv: NodeJS.ProcessEnv), move
the merge logic that currently does requestOptions?.env ? {
...buildSubprocessEnv(), ...requestOptions.env } : buildSubprocessEnv() to the
code that constructs subprocessEnv once before retries, and update all callers
(including the retry loop) to pass subprocessEnv through so
buildBaseClaudeOptions no longer invokes buildSubprocessEnv().
- Around line 843-853: The abort listener added to requestOptions.abortSignal
inside the retry loop (the addEventListener call that calls controller.abort())
can accumulate across retries; move the registration out of the retry loop or
ensure the listener is removed after each attempt by capturing the listener
function and calling removeEventListener in the finally block (or register once
before retries and clean up after all retries). Concretely, reference the
requestOptions.abortSignal.addEventListener usage and the controller variable:
either attach the listener a single time outside the retry loop or, if kept
inside, store the callback in a const and call
requestOptions.abortSignal.removeEventListener(sameCallback) in the finally so
the orphaned listeners do not accumulate.
- Around line 574-591: The PostToolUse hook currently calls
JSON.stringify(toolResponse) without protection which can throw (e.g., circular
refs); wrap the serialization and queue push in a try/catch inside the async
hook (the same pattern used in PostToolUseFailure) so any serialization error is
caught, push a safe fallback (e.g., toString or a placeholder like
"[unserializable tool response]") into toolResultQueue with toolName/toolCallId
preserved, and still return { continue: true }; ensure errors are logged or
handled rather than allowing an unhandled rejection from the PostToolUse hook.

---

Outside diff comments:
In `@packages/providers/src/claude/provider.ts`:
- Around line 920-931: The final throw after the retry loop in Claude's provider
currently throws the raw lastError (assigned from err) losing enrichment; update
the loop to ensure lastError stores the enrichedError returned by
classifyAndEnrichError (or set lastError = enrichedError before continuing) and
change the final throw to throw lastError (which will now be enriched) or fall
back to enrichedError/new Error if undefined; adjust references around
shouldRetry, attempt, MAX_SUBPROCESS_RETRIES, classifyAndEnrichError and
enrichedError to ensure the enriched diagnostic context is preserved when
throwing after retries.

In `@packages/providers/src/codex/provider.ts`:
- Around line 595-606: The loop currently assigns the raw err to lastError and
throws that after retries, losing enrichments from classifyAndEnrichCodexError;
update the retry loop so that after calling classifyAndEnrichCodexError you
store the returned enrichedError into lastError (instead of err) and throw
lastError (enriched) when retries are exhausted (respecting shouldRetry and
MAX_SUBPROCESS_RETRIES and preserving existing delay logic that uses
retryBaseDelayMs and getLog()) so the final thrown error contains the
classification/context.

---

Nitpick comments:
In `@packages/providers/src/claude/provider.ts`:
- Around line 630-721: streamClaudeMessages currently ignores any top-level
event types other than 'assistant', 'system', 'rate_limit_event', and 'result';
add a debug log for unexpected event types so SDK changes don't go unnoticed.
After the existing if/else chain in streamClaudeMessages, add an else branch
that calls getLog().debug({ eventType: event.type, raw: msg },
'claude.unhandled_event') (or similar) to surface unrecognized event types and
their raw payloads; reference the streamClaudeMessages function and the local
variables event and msg to locate where to insert this logging.
- Around line 335-344: The ProviderWarning interface's code property is too
permissive as string; define a specific union type (e.g., `type
ProviderWarningCode = 'quota' | 'rate_limit' | 'deprecated' | 'transient'` or
whatever canonical codes your providers emit), replace the `code: string` with
`code: ProviderWarningCode`, and update any usages (parsing/creation sites) to
return one of those literals so callers can exhaustively switch on
ProviderWarning.code; keep the ProviderWarning name and shape but tighten the
`code` type and export the union type for reuse.

In `@packages/providers/src/codex/provider.ts`:
- Around line 235-378: The switch on itemType in the generator (the block that
handles 'agent_message', 'command_execution', etc.) lacks a default branch so
unknown item types are silently dropped; add a default case that logs a debug
(or warn) with getLog(), including the itemId and the unrecognized itemType
(e.g., getLog().debug({ itemId: item.id, itemType }, 'unknown_item_type')) and
optionally yield a minimal system message or no-op if you want visibility to
consumers; place this default at the end of the switch to ensure any future/new
item types are surfaced in logs.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1ba72908-eda5-4c7c-87fb-22cdb9866c15

📥 Commits

Reviewing files that changed from the base of the PR and between aaf4923 and 1f18357.

📒 Files selected for processing (2)

packages/providers/src/claude/provider.ts
packages/providers/src/codex/provider.ts

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

packages/providers/src/codex/provider.ts (1)

235-378: Consider adding a default branch for unrecognized item types.

The switch statement handles known Codex item types but silently ignores any new types the SDK may introduce. Adding a debug log for unhandled types would aid future debugging without breaking current behavior.

💡 Suggested addition after line 377

           break;
         }
+
+        default:
+          getLog().debug({ itemType, itemId: item.id }, 'item_completed_unhandled_type');
+          break;
       }
     }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/providers/src/codex/provider.ts` around lines 235 - 378, The switch
on itemType in the Codex handling block should include a default branch to
surface unrecognized item types; add a default: case at the end of the switch
that calls getLog().debug (or .warn if you prefer) with context like { itemId:
item.id, itemType } and a message key such as 'codex_unhandled_item_type' so
unknown/new SDK item types are logged rather than silently ignored (locate the
switch (itemType) block in provider.ts and add the default case after the
existing cases).

packages/providers/src/claude/provider.ts (1)

856-864: Cache the loaded MCP config to avoid repeated file I/O.

When nodeConfig.mcp is set, applyNodeConfig calls loadMcpConfig which performs file I/O (line 396). The current pattern calls it once to extract warnings (line 863), then again on each retry attempt inside the loop (line 904), resulting in N+1 disk reads for the same config file.

The straightforward fix is to load MCP once before the retry loop and reuse it:

Example approach

let cachedMcpServers: Record<string, unknown> | undefined;
let nodeConfigWarnings: ProviderWarning[] = [];
if (requestOptions?.nodeConfig?.mcp) {
  const { servers, serverNames, missingVars } = await loadMcpConfig(requestOptions.nodeConfig.mcp, cwd);
  cachedMcpServers = servers;
  // extract and yield warnings here
}

for (let attempt = 0; attempt <= MAX_SUBPROCESS_RETRIES; attempt++) {
  // ... 
  if (requestOptions?.nodeConfig) {
    await applyNodeConfig(options, requestOptions.nodeConfig, cwd, cachedMcpServers);
  }
}

Alternatively, split applyNodeConfig into separate concern functions.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/providers/src/claude/provider.ts` around lines 856 - 864, The code
currently calls applyNodeConfig twice causing repeated loadMcpConfig file I/O;
load the MCP config once before the retry loop and reuse it: when
requestOptions?.nodeConfig?.mcp is present call loadMcpConfig(cwd) once to
produce servers/serverNames/missingVars, extract and set nodeConfigWarnings
there, store the servers in a cachedMcpServers variable, then inside the retry
loop call applyNodeConfig with the cached servers (or modify applyNodeConfig to
accept an optional cachedMcpServers param) so subsequent attempts reuse the
in-memory MCP data instead of re-reading the file; ensure nodeConfigWarnings are
yielded before the first attempt and MAX_SUBPROCESS_RETRIES loop uses the cached
data.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/providers/src/codex/provider.ts`:
- Around line 359-374: mcpResult.content may be non-serializable so calling
JSON.stringify directly can throw and crash the stream; update the block around
mcpResult.content in provider.ts (the branch that sets toolOutput and yields {
type: 'tool_result', toolName: mcpToolName, toolOutput }) to wrap
JSON.stringify(mcpResult.content) in a try/catch, on error log the serialization
error with getLog().warn (include item.id, server, tool, mcpToolName and the
error), and set a safe fallback for toolOutput (e.g., a descriptive string or
truncated util.inspect-like output) so the yield always returns a defined
toolOutput.

---

Nitpick comments:
In `@packages/providers/src/claude/provider.ts`:
- Around line 856-864: The code currently calls applyNodeConfig twice causing
repeated loadMcpConfig file I/O; load the MCP config once before the retry loop
and reuse it: when requestOptions?.nodeConfig?.mcp is present call
loadMcpConfig(cwd) once to produce servers/serverNames/missingVars, extract and
set nodeConfigWarnings there, store the servers in a cachedMcpServers variable,
then inside the retry loop call applyNodeConfig with the cached servers (or
modify applyNodeConfig to accept an optional cachedMcpServers param) so
subsequent attempts reuse the in-memory MCP data instead of re-reading the file;
ensure nodeConfigWarnings are yielded before the first attempt and
MAX_SUBPROCESS_RETRIES loop uses the cached data.

In `@packages/providers/src/codex/provider.ts`:
- Around line 235-378: The switch on itemType in the Codex handling block should
include a default branch to surface unrecognized item types; add a default: case
at the end of the switch that calls getLog().debug (or .warn if you prefer) with
context like { itemId: item.id, itemType } and a message key such as
'codex_unhandled_item_type' so unknown/new SDK item types are logged rather than
silently ignored (locate the switch (itemType) block in provider.ts and add the
default case after the existing cases).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e025636f-c9e6-4558-8093-fde60d700a19

📥 Commits

Reviewing files that changed from the base of the PR and between 1f18357 and cc739ac.

📒 Files selected for processing (2)

packages/providers/src/claude/provider.ts
packages/providers/src/codex/provider.ts

coderabbitai · 2026-04-13T07:25:56Z

+            if (mcpResult?.content) {
+              if (Array.isArray(mcpResult.content)) {
+                toolOutput = JSON.stringify(mcpResult.content);
+              } else {
+                getLog().warn(
+                  {
+                    itemId: item.id,
+                    server,
+                    tool,
+                    resultType: typeof mcpResult.content,
+                  },
+                  'mcp_tool_call_unexpected_result_shape'
+                );
+              }
+            }
+            yield { type: 'tool_result', toolName: mcpToolName, toolOutput };


⚠️ Potential issue | 🟡 Minor

JSON.stringify can throw on non-serializable MCP content.

If mcpResult.content contains circular references, BigInt values, or other non-serializable data, JSON.stringify will throw and crash the stream. Consider wrapping in try/catch for robustness.

🛡️ Proposed fix

if (mcpResult?.content) { if (Array.isArray(mcpResult.content)) { - toolOutput = JSON.stringify(mcpResult.content); + try { + toolOutput = JSON.stringify(mcpResult.content); + } catch { + getLog().warn({ itemId: item.id, server, tool }, 'mcp_tool_result_not_serializable'); + toolOutput = '[unserializable MCP result]'; + } } else {

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/codex/provider.ts` around lines 359 - 374, mcpResult.content may be non-serializable so calling JSON.stringify directly can throw and crash the stream; update the block around mcpResult.content in provider.ts (the branch that sets toolOutput and yields { type: 'tool_result', toolName: mcpToolName, toolOutput }) to wrap JSON.stringify(mcpResult.content) in a try/catch, on error log the serialization error with getLog().warn (include item.id, server, tool, mcpToolName and the error), and set a safe fallback for toolOutput (e.g., a descriptive string or truncated util.inspect-like output) so the yield always returns a defined toolOutput.

Restore test coverage for the specific fixes from the decomposition review, exercised through sendQuery (black-box) since helpers are file-local: Claude (6 tests): - Timeout error preserved (not collapsed into "Query aborted") - nodeConfig warnings emitted once even when retries occur - Abort signal cancels across retries via single forwarding listener - Enriched error (with stderr) thrown at retry exhaustion - PostToolUse hook handles circular reference without crashing - is_error result events logged at error level Codex (3 tests): - Abort signal throws instead of silently truncating stream - Enriched error thrown at retry exhaustion - Todo-list dedup state resets between retry attempts

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/providers/src/claude/provider.test.ts`:
- Around line 1016-1046: The test currently aborts before a second retry starts
so it doesn't catch per-retry listener leaks; update the test for the
listener-leak fix by either (A) arranging mockQuery to allow at least two retry
attempts to begin before calling abort (e.g., make the first attempt delay long
enough so the retry logic attaches listeners multiple times) and then assert
callCount > 1 and that only one abort listener remains, or (B) spy on the
AbortSignal/EventTarget methods used by client.sendQuery (e.g., addEventListener
and removeEventListener on AbortController.signal) and assert they are
registered exactly once and removed exactly once across retries; reference
mockQuery, client.sendQuery, and AbortController.signal when making the changes.
- Around line 988-1014: The test "emits nodeConfig warnings only once even when
retries occur" currently uses nodeConfig: { effort: 'high' } which never
produces a ProviderWarning, so change the nodeConfig passed to client.sendQuery
inside that test to a configuration that is guaranteed to trigger a
ProviderWarning (e.g., the same dangerous/unsupported option used elsewhere in
tests that yields a ProviderWarning), then update the assertions to expect
exactly one system warning chunk across all retries (filter chunks by c.type ===
'system' and assert length === 1) while keeping the existing callCount assertion
to confirm retries; refer to the test name and the client.sendQuery(...) call
and the mockQuery generator to locate where to change the nodeConfig and
assertions.

In `@packages/providers/src/codex/provider.test.ts`:
- Around line 1194-1230: The test todo_list dedup state resets between retry
attempts currently only checks that "Task 1" appears at least once; update the
assertions in the test 'todo_list dedup state resets between retry attempts' to
explicitly assert the todo was emitted on both attempts by counting occurrences
from mockRunStreamed via the streaming output from client.sendQuery: filter
systemChunks (as currently done) and assert that systemChunks.filter(c =>
c.content.includes('Task 1')).length === 2 (or >=2 if you prefer permissive), so
the same todo appears on the initial attempt and again on the retry, ensuring
dedup state was reset; keep using mockRunStreamed and systemChunks to locate the
emitted events.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: acc34ce2-e11c-4450-a3d3-cea41f56d702

📥 Commits

Reviewing files that changed from the base of the PR and between cc739ac and 1dc7569.

📒 Files selected for processing (2)

packages/providers/src/claude/provider.test.ts
packages/providers/src/codex/provider.test.ts

coderabbitai · 2026-04-13T07:59:38Z

+  test('emits nodeConfig warnings only once even when retries occur', async () => {
+    let callCount = 0;
+    mockQuery.mockImplementation(async function* () {
+      callCount++;
+      if (callCount <= 2) {
+        throw new Error('process exited with code 1'); // crash → retried
+      }
+      yield {
+        type: 'assistant',
+        message: { content: [{ type: 'text', text: 'ok' }] },
+      };
+    });
+
+    const chunks = [];
+    for await (const chunk of client.sendQuery('test', '/workspace', undefined, {
+      nodeConfig: { effort: 'high' },
+    })) {
+      chunks.push(chunk);
+    }
+
+    // nodeConfig with effort doesn't produce warnings, but let's verify
+    // no system chunks are duplicated. Use a nodeConfig that doesn't warn.
+    // The point is: zero warning chunks means zero, not zero × 3 retries.
+    const systemChunks = chunks.filter(c => c.type === 'system');
+    expect(systemChunks).toHaveLength(0);
+    expect(callCount).toBe(3); // Confirms retries happened
+  }, 5_000);


⚠️ Potential issue | 🟠 Major

This never exercises a warning path.

nodeConfig: { effort: 'high' } does not produce a provider warning, so systemChunks.length === 0 doesn't prove warnings are emitted only once across retries. Please switch this to a config that is guaranteed to yield a ProviderWarning, then assert exactly one warning chunk over the full retry sequence.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/claude/provider.test.ts` around lines 988 - 1014, The test "emits nodeConfig warnings only once even when retries occur" currently uses nodeConfig: { effort: 'high' } which never produces a ProviderWarning, so change the nodeConfig passed to client.sendQuery inside that test to a configuration that is guaranteed to trigger a ProviderWarning (e.g., the same dangerous/unsupported option used elsewhere in tests that yields a ProviderWarning), then update the assertions to expect exactly one system warning chunk across all retries (filter chunks by c.type === 'system' and assert length === 1) while keeping the existing callCount assertion to confirm retries; refer to the test name and the client.sendQuery(...) call and the mockQuery generator to locate where to change the nodeConfig and assertions.

coderabbitai · 2026-04-13T07:59:38Z

+  test('abort signal cancels query across retries without listener leak', async () => {
+    const abortController = new AbortController();
+    let callCount = 0;
+
+    mockQuery.mockImplementation(async function* () {
+      callCount++;
+      if (callCount === 1) {
+        // First attempt crashes → triggers retry. Abort during the retry delay
+        // so the next iteration's abortSignal.aborted check catches it.
+        setTimeout(() => abortController.abort(), 0);
+        throw new Error('process exited with code 1');
+      }
+      // Should not reach here — abort fires before retry starts
+      yield {
+        type: 'assistant',
+        message: { content: [{ type: 'text', text: 'should not reach' }] },
+      };
+    });
+
+    const consumeGenerator = async (): Promise<void> => {
+      for await (const _ of client.sendQuery('test', '/workspace', undefined, {
+        abortSignal: abortController.signal,
+      })) {
+        // consume
+      }
+    };
+
+    await expect(consumeGenerator()).rejects.toThrow('Query aborted');
+    // Single abort listener registered (not per-retry)
+    expect(callCount).toBe(1);
+  }, 5_000);


⚠️ Potential issue | 🟠 Major

This doesn't actually verify the listener-leak fix.

The abort fires before a second attempt begins, so this test would still pass if a new abort listener were attached on every retry. If you want to lock in the leak fix, either let multiple retries start before aborting or spy on addEventListener/removeEventListener and assert a single registration lifecycle.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/claude/provider.test.ts` around lines 1016 - 1046, The test currently aborts before a second retry starts so it doesn't catch per-retry listener leaks; update the test for the listener-leak fix by either (A) arranging mockQuery to allow at least two retry attempts to begin before calling abort (e.g., make the first attempt delay long enough so the retry logic attaches listeners multiple times) and then assert callCount > 1 and that only one abort listener remains, or (B) spy on the AbortSignal/EventTarget methods used by client.sendQuery (e.g., addEventListener and removeEventListener on AbortController.signal) and assert they are registered exactly once and removed exactly once across retries; reference mockQuery, client.sendQuery, and AbortController.signal when making the changes.

coderabbitai · 2026-04-13T07:59:38Z

+  test('todo_list dedup state resets between retry attempts', async () => {
+    const todoItem = {
+      type: 'todo_list',
+      items: [{ text: 'Task 1', completed: false }],
+      id: 'todo-1',
+    };
+
+    let callCount = 0;
+    mockRunStreamed.mockImplementation(() => {
+      callCount++;
+      if (callCount === 1) {
+        return Promise.resolve({
+          events: (async function* () {
+            yield { type: 'item.completed', item: todoItem };
+            throw new Error('codex exec crashed');
+          })(),
+        });
+      }
+      // On retry, same todo should appear again (fresh state)
+      return Promise.resolve({
+        events: (async function* () {
+          yield { type: 'item.completed', item: todoItem };
+          yield { type: 'turn.completed', usage: defaultUsage };
+        })(),
+      });
+    });
+
+    const chunks = [];
+    for await (const chunk of client.sendQuery('test', '/workspace')) {
+      chunks.push(chunk);
+    }
+
+    // The todo should appear on the retry attempt (not suppressed by dedup from attempt 1)
+    const systemChunks = chunks.filter(c => c.type === 'system');
+    expect(systemChunks.length).toBeGreaterThanOrEqual(1);
+    expect(systemChunks.some(c => c.type === 'system' && c.content.includes('Task 1'))).toBe(true);
+  }, 5_000);


⚠️ Potential issue | 🟠 Major

Assert the retry emission explicitly.

This passes even if the todo is emitted only on attempt 1 and then suppressed on retry, because the current systemChunks checks are already satisfied by the pre-crash chunk. To lock in the dedup-reset fix, assert that the same todo appears twice across the two attempts.

Suggested assertion tightening

- const systemChunks = chunks.filter(c => c.type === 'system'); - expect(systemChunks.length).toBeGreaterThanOrEqual(1); - expect(systemChunks.some(c => c.type === 'system' && c.content.includes('Task 1'))).toBe(true); + const systemChunks = chunks.filter(c => c.type === 'system'); + expect(systemChunks).toHaveLength(2); + expect(systemChunks[0]).toMatchObject({ + type: 'system', + content: expect.stringContaining('Task 1'), + }); + expect(systemChunks[1]).toMatchObject({ + type: 'system', + content: expect.stringContaining('Task 1'), + });

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

test('todo_list dedup state resets between retry attempts', async () => {

const todoItem = {

type: 'todo_list',

items: [{ text: 'Task 1', completed: false }],

id: 'todo-1',

};

let callCount = 0;

mockRunStreamed.mockImplementation(() => {

callCount++;

if (callCount === 1) {

return Promise.resolve({

events: (async function* () {

yield { type: 'item.completed', item: todoItem };

throw new Error('codex exec crashed');

})(),

});

}

// On retry, same todo should appear again (fresh state)

return Promise.resolve({

events: (async function* () {

yield { type: 'item.completed', item: todoItem };

yield { type: 'turn.completed', usage: defaultUsage };

})(),

});

});

const chunks = [];

for await (const chunk of client.sendQuery('test', '/workspace')) {

chunks.push(chunk);

}

// The todo should appear on the retry attempt (not suppressed by dedup from attempt 1)

const systemChunks = chunks.filter(c => c.type === 'system');

expect(systemChunks.length).toBeGreaterThanOrEqual(1);

expect(systemChunks.some(c => c.type === 'system' && c.content.includes('Task 1'))).toBe(true);

}, 5_000);

test('todo_list dedup state resets between retry attempts', async () => {

const todoItem = {

type: 'todo_list',

items: [{ text: 'Task 1', completed: false }],

id: 'todo-1',

};

let callCount = 0;

mockRunStreamed.mockImplementation(() => {

callCount++;

if (callCount === 1) {

return Promise.resolve({

events: (async function* () {

yield { type: 'item.completed', item: todoItem };

throw new Error('codex exec crashed');

})(),

});

}

// On retry, same todo should appear again (fresh state)

return Promise.resolve({

events: (async function* () {

yield { type: 'item.completed', item: todoItem };

yield { type: 'turn.completed', usage: defaultUsage };

})(),

});

});

const chunks = [];

for await (const chunk of client.sendQuery('test', '/workspace')) {

chunks.push(chunk);

}

// The todo should appear on the retry attempt (not suppressed by dedup from attempt 1)

const systemChunks = chunks.filter(c => c.type === 'system');

expect(systemChunks).toHaveLength(2);

expect(systemChunks[0]).toMatchObject({

type: 'system',

content: expect.stringContaining('Task 1'),

});

expect(systemChunks[1]).toMatchObject({

type: 'system',

content: expect.stringContaining('Task 1'),

});

}, 5_000);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/codex/provider.test.ts` around lines 1194 - 1230, The test todo_list dedup state resets between retry attempts currently only checks that "Task 1" appears at least once; update the assertions in the test 'todo_list dedup state resets between retry attempts' to explicitly assert the todo was emitted on both attempts by counting occurrences from mockRunStreamed via the streaming output from client.sendQuery: filter systemChunks (as currently done) and assert that systemChunks.filter(c => c.content.includes('Task 1')).length === 2 (or >=2 if you prefer permissive), so the same todo appears on the initial attempt and again on the retry, ensuring dedup state was reset; keep using mockRunStreamed and systemChunks to locate the emitted events.

…ies (coleam00#1162) * refactor: decompose provider sendQuery() into explicit helper boundaries (coleam00#1139) sendQuery() in both Claude and Codex providers was a monolith mixing SDK option building, nodeConfig translation, stream normalization, and error classification. This makes it hard to safely extend for Phase 2 provider extensibility. Decompose both providers into focused internal helpers: Claude: - buildBaseClaudeOptions: SDK option construction - buildToolCaptureHooks: PostToolUse/PostToolUseFailure hook setup - applyNodeConfig: workflow nodeConfig → SDK translation + structured warnings - streamClaudeMessages: raw SDK event → MessageChunk normalization - classifyAndEnrichError: error classification with retry decisions Codex: - buildTurnOptions: per-turn option construction (output schema, abort) - streamCodexEvents: raw SDK event → MessageChunk normalization - classifyAndEnrichCodexError: error classification with retry decisions Also introduces ProviderWarning { code, message } replacing raw string warnings for machine-readable provider translation warnings. Adds 43 focused unit tests covering the extracted helpers directly. Fixes coleam00#1139 * fix: export ToolResultEntry type used in public buildBaseClaudeOptions API * fix: unexport internal helpers to prevent API surface leakage, fix retry state bug Review findings: 1. Internal helpers were exported and reachable through package.json subpath exports (./claude/provider, ./codex/provider), widening the public API. All new helpers are now file-local — the only public exports remain ClaudeProvider, CodexProvider, loadMcpConfig, buildSDKHooksFromYAML, withFirstMessageTimeout, getProcessUid. 2. Codex streamState (lastTodoListSignature) was shared across retry attempts, causing todo-list dedup to suppress output on retry. Now creates fresh state per attempt. Removed direct helper test imports — existing sendQuery e2e tests (51 Claude + 42 Codex) cover all behavior paths. * fix: address review findings — abort handling, retry bugs, error swallowing Fixes from CodeRabbit + multi-agent review: 1. classifyAndEnrichError preserves first-event timeout diagnostic instead of collapsing it into generic "Query aborted" (the timeout aborts the controller, but the original error carries the coleam00#1067 breadcrumb) 2. nodeConfigWarnings emitted once before retry loop, not per attempt 3. buildSubprocessEnv() called once before retry loop (was re-logging auth mode and rebuilding { ...process.env } per attempt) 4. Abort signal listener registered once with forwarding to current controller (was accumulating per-retry listeners) 5. PostToolUse hook wrapped in try/catch (JSON.stringify can throw on circular refs — was asymmetric with PostToolUseFailure which had it) 6. Codex streamCodexEvents throws on abort instead of silent break (callers were getting truncated stream with no result/error) 7. Both providers store enrichedError (not raw error) for retry exhaustion — preserves stderr context in final throw 8. Log is_error result events at error level in Claude stream normalizer * test: add black-box behavioral tests for sendQuery decomposition fixes Restore test coverage for the specific fixes from the decomposition review, exercised through sendQuery (black-box) since helpers are file-local: Claude (6 tests): - Timeout error preserved (not collapsed into "Query aborted") - nodeConfig warnings emitted once even when retries occur - Abort signal cancels across retries via single forwarding listener - Enriched error (with stderr) thrown at retry exhaustion - PostToolUse hook handles circular reference without crashing - is_error result events logged at error level Codex (3 tests): - Abort signal throws instead of silently truncating stream - Enriched error thrown at retry exhaustion - Todo-list dedup state resets between retry attempts

fix: export ToolResultEntry type used in public buildBaseClaudeOption…

aaf4923

…s API

coderabbitai Bot reviewed Apr 13, 2026

View reviewed changes

Comment thread packages/providers/src/claude/provider.ts Outdated

Comment thread packages/providers/src/claude/provider.ts Outdated

Comment thread packages/providers/src/codex/provider.ts

coderabbitai Bot reviewed Apr 13, 2026

View reviewed changes

Wirasm added 2 commits April 13, 2026 10:16

coderabbitai Bot reviewed Apr 13, 2026

View reviewed changes

Comment thread packages/providers/src/claude/provider.ts Outdated

Comment thread packages/providers/src/claude/provider.ts

Comment thread packages/providers/src/claude/provider.ts Outdated

coderabbitai Bot reviewed Apr 13, 2026

View reviewed changes

Wirasm merged commit 37aeadb into dev Apr 13, 2026
4 checks passed

Wirasm deleted the refactor/issue-1139-sendquery-decomposition branch April 13, 2026 08:24

This was referenced Apr 13, 2026

[codex] Implement managed execution env propagation #1178

Merged

feat: add optional Langfuse observability for Claude Agent SDK calls #1194

Closed

asmrtfm mentioned this pull request Apr 14, 2026

feat(providers): adapt Ollama support to @archon/providers package (#1137 rebase) #1201

Closed

coderabbitai Bot mentioned this pull request Apr 14, 2026

fix(providers): replace Claude SDK embed with explicit binary-path resolver #1217

Merged

coderabbitai Bot mentioned this pull request Apr 15, 2026

fix(providers): init options.hooks before merging node hooks #1235

Closed

6 tasks

coderabbitai Bot mentioned this pull request Apr 18, 2026

fix(workflows): fail loudly on SDK isError results in DAG and loop nodes #1291

Merged

7 tasks

coleam00 mentioned this pull request Apr 18, 2026

feat(orchestrator): auto-reset stale Claude SDK sessions + bare 'reset' command for Slack #1121

Open

5 tasks

This was referenced Apr 19, 2026

feat(workflows): inline sub-agent definitions on DAG nodes #1276

Merged

fix(providers/claude): initialize options.hooks before per-node hook merge #1308

Closed

alexsiri7 mentioned this pull request Apr 20, 2026

chore: sync dev → main (upstream commits + PRs #2 #4) alexsiri7/Archon#5

Merged

This was referenced Apr 28, 2026

fix(claude): stop passing --no-env-file to native binary in dev mode #1461

Merged

refactor(workflows): trust the SDK for model validation #1463

Merged

Conversation

Wirasm commented Apr 13, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root Cause

Changes

Design Decisions

Testing

Validation

Issue

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

Wirasm commented Apr 13, 2026

Self Code Review

Summary

Findings

Fixed

Non-blocking observations

Checklist

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Wirasm commented Apr 13, 2026

PR Review Summary — Multi-Agent Analysis

Critical Issues (4 found)

Important Issues (6 found)

Suggestions (8 found)

Strengths

Documentation Issues

Test Coverage Gaps (by criticality)

Type Design Scores

Verdict: NEEDS FIXES

Recommended Actions

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Wirasm commented Apr 13, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 13, 2026 •

edited

Loading