refactor: decompose provider sendQuery() into explicit helper boundaries#1162
refactor: decompose provider sendQuery() into explicit helper boundaries#1162
Conversation
…ies (#1139) sendQuery() in both Claude and Codex providers was a monolith mixing SDK option building, nodeConfig translation, stream normalization, and error classification. This makes it hard to safely extend for Phase 2 provider extensibility. Decompose both providers into focused internal helpers: Claude: - buildBaseClaudeOptions: SDK option construction - buildToolCaptureHooks: PostToolUse/PostToolUseFailure hook setup - applyNodeConfig: workflow nodeConfig → SDK translation + structured warnings - streamClaudeMessages: raw SDK event → MessageChunk normalization - classifyAndEnrichError: error classification with retry decisions Codex: - buildTurnOptions: per-turn option construction (output schema, abort) - streamCodexEvents: raw SDK event → MessageChunk normalization - classifyAndEnrichCodexError: error classification with retry decisions Also introduces ProviderWarning { code, message } replacing raw string warnings for machine-readable provider translation warnings. Adds 43 focused unit tests covering the extracted helpers directly. Fixes #1139
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughRefactors Claude and Codex providers by extracting option builders, stream normalizers, and error classifiers; converts nodeConfig warnings to structured ProviderWarning objects; centralizes streaming and retry/enrichment logic, and updates tests to exercise retry/streaming behaviors. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Provider
participant SDK
participant ToolRunner
Client->>Provider: sendQuery(prompt, options)
Provider->>Provider: buildBaseOptions(...) / applyNodeConfig() -> collect ProviderWarning[]
Provider->>SDK: invoke SDK with options (hooks, signal)
SDK-->>Provider: stream events (assistant/tool/result/err)
Provider->>Provider: stream* normalizer -> emit MessageChunks (system/assistant/tool/...)
alt tool execution event
Provider->>ToolRunner: run tool
ToolRunner-->>Provider: tool result/events
Provider->>SDK: (optionally) feed tool result back
end
Provider-->>Client: yield MessageChunk(s) and final result or enriched error
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Self Code ReviewSummaryClean decomposition that preserves all existing observable behavior. 136 tests pass (93 existing + 43 new). One type visibility issue found and fixed. FindingsFixed
Non-blocking observations
Checklist
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/providers/src/claude/provider.ts`:
- Around line 743-754: The current classifyAndEnrichError function
unconditionally converts any aborted controller into a generic 'Query aborted'
Error, which loses the original first-event timeout diagnostic; change the logic
in classifyAndEnrichError to detect whether the abort was caused by the
first-message timeout (preserve and return the original timeout Error and its
classification) versus a genuine upstream cancellation (return the 'Query
aborted' Error and 'aborted' class). Locate the abort check around
controller.signal.aborted, allow the original timeout Error to be propagated or
attached (e.g., by inspecting stderrLines or a sentinel on the Error) instead of
overwriting it, and only produce the generic abort Error when the cancellation
is upstream.
- Around line 884-887: The nodeConfigWarnings system chunks are currently
emitted inside the retry loop causing duplicate warnings on retries; move the
emission so it runs only once per request by hoisting the loop that yields {
type: 'system', content: `⚠️ ${warning.message}` } outside the retry/attempt
loop or wrap it with a guard like if (attempt === 0). Locate the retry handling
code surrounding nodeConfigWarnings in provider.ts and ensure nodeConfigWarnings
is iterated and yielded before entering retries (or conditioned on attempt ===
0) so each warning is emitted only once per request.
In `@packages/providers/src/codex/provider.ts`:
- Around line 192-195: streamCodexEvents currently breaks on abort (using
abortSignal?.aborted) which yields a truncated stream without signalling
failure; change streamCodexEvents to throw a clear AbortError (or a custom Error
with message like "Query aborted") when abortSignal.aborted is true, and update
sendQuery to catch that specific error (thrown by streamCodexEvents) and convert
it into the expected "Query aborted" handling or rethrow as appropriate; apply
the same fix to the other occurrence around the code handling abort at lines
referenced (the second abort check) so callers never receive a silently
truncated stream.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d0deb3a9-0014-4c00-ad52-b69f1d584570
📒 Files selected for processing (4)
packages/providers/src/claude/provider.test.tspackages/providers/src/claude/provider.tspackages/providers/src/codex/provider.test.tspackages/providers/src/codex/provider.ts
PR Review Summary — Multi-Agent AnalysisFive specialized review agents analyzed this PR. Findings below. Critical Issues (4 found)
Important Issues (6 found)
Suggestions (8 found)
Strengths
Documentation Issues
Test Coverage Gaps (by criticality)
Type Design Scores
Verdict: NEEDS FIXESThe decomposition itself is clean and the overall test coverage is solid. The 4 critical issues are concrete bugs/regressions introduced by the refactor:
Recommended Actions
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/providers/src/claude/provider.ts (1)
630-633: Add type parameter toAsyncGeneratorfor stricter type safety.The
eventsparameter lacks a type argument, defaulting toAsyncGenerator<unknown, any, any>. Adding an explicit type parameter improves clarity and aligns with the strict TypeScript configuration.♻️ Suggested fix
export async function* streamClaudeMessages( - events: AsyncGenerator, + events: AsyncGenerator<unknown>, toolResultQueue: ToolResultEntry[] ): AsyncGenerator<MessageChunk> {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/claude/provider.ts` around lines 630 - 633, The parameter events in streamClaudeMessages is declared as AsyncGenerator without type args; update its type to an explicit AsyncGenerator with the proper event type and yield/return types used by this stream (e.g., AsyncGenerator<ClaudeEvent, void, unknown> or the specific event interface you use elsewhere) so TypeScript enforces correct types; update the signature of export async function* streamClaudeMessages(events: AsyncGenerator<...>, toolResultQueue: ToolResultEntry[]): AsyncGenerator<MessageChunk> accordingly and pick the actual event type name present in your codebase (e.g., ClaudeEvent or similar) for the first generic and set the second/third generics to the appropriate yield/return types.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@packages/providers/src/claude/provider.ts`:
- Around line 630-633: The parameter events in streamClaudeMessages is declared
as AsyncGenerator without type args; update its type to an explicit
AsyncGenerator with the proper event type and yield/return types used by this
stream (e.g., AsyncGenerator<ClaudeEvent, void, unknown> or the specific event
interface you use elsewhere) so TypeScript enforces correct types; update the
signature of export async function* streamClaudeMessages(events:
AsyncGenerator<...>, toolResultQueue: ToolResultEntry[]):
AsyncGenerator<MessageChunk> accordingly and pick the actual event type name
present in your codebase (e.g., ClaudeEvent or similar) for the first generic
and set the second/third generics to the appropriate yield/return types.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ae25bbb5-d1e4-45b4-ad67-60d4428ff4b2
📒 Files selected for processing (1)
packages/providers/src/claude/provider.ts
…try state bug Review findings: 1. Internal helpers were exported and reachable through package.json subpath exports (./claude/provider, ./codex/provider), widening the public API. All new helpers are now file-local — the only public exports remain ClaudeProvider, CodexProvider, loadMcpConfig, buildSDKHooksFromYAML, withFirstMessageTimeout, getProcessUid. 2. Codex streamState (lastTodoListSignature) was shared across retry attempts, causing todo-list dedup to suppress output on retry. Now creates fresh state per attempt. Removed direct helper test imports — existing sendQuery e2e tests (51 Claude + 42 Codex) cover all behavior paths.
…lowing Fixes from CodeRabbit + multi-agent review: 1. classifyAndEnrichError preserves first-event timeout diagnostic instead of collapsing it into generic "Query aborted" (the timeout aborts the controller, but the original error carries the #1067 breadcrumb) 2. nodeConfigWarnings emitted once before retry loop, not per attempt 3. buildSubprocessEnv() called once before retry loop (was re-logging auth mode and rebuilding { ...process.env } per attempt) 4. Abort signal listener registered once with forwarding to current controller (was accumulating per-retry listeners) 5. PostToolUse hook wrapped in try/catch (JSON.stringify can throw on circular refs — was asymmetric with PostToolUseFailure which had it) 6. Codex streamCodexEvents throws on abort instead of silent break (callers were getting truncated stream with no result/error) 7. Both providers store enrichedError (not raw error) for retry exhaustion — preserves stderr context in final throw 8. Log is_error result events at error level in Claude stream normalizer
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
packages/providers/src/claude/provider.ts (1)
920-931:⚠️ Potential issue | 🟠 MajorAfter retry exhaustion, throws raw error instead of enriched error.
Same issue as Codex provider: line 927 stores the raw
err, but after the loop (line 931),lastErroris thrown without enrichment. The diagnostic context fromclassifyAndEnrichErroris lost.🐛 Proposed fix
- lastError = err; + lastError = enrichedError; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/claude/provider.ts` around lines 920 - 931, The final throw after the retry loop in Claude's provider currently throws the raw lastError (assigned from err) losing enrichment; update the loop to ensure lastError stores the enrichedError returned by classifyAndEnrichError (or set lastError = enrichedError before continuing) and change the final throw to throw lastError (which will now be enriched) or fall back to enrichedError/new Error if undefined; adjust references around shouldRetry, attempt, MAX_SUBPROCESS_RETRIES, classifyAndEnrichError and enrichedError to ensure the enriched diagnostic context is preserved when throwing after retries.packages/providers/src/codex/provider.ts (1)
595-606:⚠️ Potential issue | 🟠 MajorAfter retry exhaustion, throws raw error instead of enriched error.
When retries are exhausted via the loop completing (line 606),
lastErroris the raw error without enrichment. TheenrichedErrorfromclassifyAndEnrichCodexErroris discarded after each retry iteration. This loses diagnostic context that would help users debug failures.🐛 Proposed fix
- lastError = err; + lastError = enrichedError; } } - throw lastError ?? new Error('Codex query failed after retries'); + throw lastError ?? new Error('Codex query failed after retries');Store
enrichedErrorinstead of the rawerrso the thrown error retains classification and context.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/codex/provider.ts` around lines 595 - 606, The loop currently assigns the raw err to lastError and throws that after retries, losing enrichments from classifyAndEnrichCodexError; update the retry loop so that after calling classifyAndEnrichCodexError you store the returned enrichedError into lastError (instead of err) and throw lastError (enriched) when retries are exhausted (respecting shouldRetry and MAX_SUBPROCESS_RETRIES and preserving existing delay logic that uses retryBaseDelayMs and getLog()) so the final thrown error contains the classification/context.
🧹 Nitpick comments (3)
packages/providers/src/claude/provider.ts (2)
630-721: Consider logging unrecognized top-level event types.The helper handles
assistant,system,rate_limit_event, andresultevents. Other event types are silently ignored. A debug log for unrecognized types would help catch SDK changes.💡 Suggested addition after line 719
} else if (event.type === 'result') { // ... existing result handling ... + } else { + getLog().debug({ eventType: event.type }, 'claude.unhandled_event_type'); } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/claude/provider.ts` around lines 630 - 721, streamClaudeMessages currently ignores any top-level event types other than 'assistant', 'system', 'rate_limit_event', and 'result'; add a debug log for unexpected event types so SDK changes don't go unnoticed. After the existing if/else chain in streamClaudeMessages, add an else branch that calls getLog().debug({ eventType: event.type, raw: msg }, 'claude.unhandled_event') (or similar) to surface unrecognized event types and their raw payloads; reference the streamClaudeMessages function and the local variables event and msg to locate where to insert this logging.
335-344: Consider narrowingcodeto a literal union.Using
stringforcodeallows arbitrary values. A literal union would provide stronger typing and enable exhaustive handling.💡 Example
interface ProviderWarning { - code: string; + code: 'mcp_env_vars_missing' | 'mcp_haiku_tool_search'; message: string; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/claude/provider.ts` around lines 335 - 344, The ProviderWarning interface's code property is too permissive as string; define a specific union type (e.g., `type ProviderWarningCode = 'quota' | 'rate_limit' | 'deprecated' | 'transient'` or whatever canonical codes your providers emit), replace the `code: string` with `code: ProviderWarningCode`, and update any usages (parsing/creation sites) to return one of those literals so callers can exhaustively switch on ProviderWarning.code; keep the ProviderWarning name and shape but tighten the `code` type and export the union type for reuse.packages/providers/src/codex/provider.ts (1)
235-378: Add a default branch for unrecognized item types.The switch handles known item types but silently ignores unknown ones. If the Codex SDK adds new item types, they'll be dropped without notice. Consider adding a default case with a debug log.
💡 Suggested addition
break; } + + default: + getLog().debug({ itemType, itemId: item.id }, 'item_completed_unknown_type'); + break; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/codex/provider.ts` around lines 235 - 378, The switch on itemType in the generator (the block that handles 'agent_message', 'command_execution', etc.) lacks a default branch so unknown item types are silently dropped; add a default case that logs a debug (or warn) with getLog(), including the itemId and the unrecognized itemType (e.g., getLog().debug({ itemId: item.id, itemType }, 'unknown_item_type')) and optionally yield a minimal system message or no-op if you want visibility to consumers; place this default at the end of the switch to ensure any future/new item types are surfaced in logs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/providers/src/claude/provider.ts`:
- Around line 505-518: buildSubprocessEnv() is being called inside
buildBaseClaudeOptions, which runs on every retry and causes duplicate auth-mode
logs; compute the subprocess env once before entering the retry loop and pass
that prebuilt env into buildBaseClaudeOptions instead of calling
buildSubprocessEnv() repeatedly. Change buildBaseClaudeOptions to accept a
prepared env object (e.g. add parameter subprocessEnv: NodeJS.ProcessEnv), move
the merge logic that currently does requestOptions?.env ? {
...buildSubprocessEnv(), ...requestOptions.env } : buildSubprocessEnv() to the
code that constructs subprocessEnv once before retries, and update all callers
(including the retry loop) to pass subprocessEnv through so
buildBaseClaudeOptions no longer invokes buildSubprocessEnv().
- Around line 843-853: The abort listener added to requestOptions.abortSignal
inside the retry loop (the addEventListener call that calls controller.abort())
can accumulate across retries; move the registration out of the retry loop or
ensure the listener is removed after each attempt by capturing the listener
function and calling removeEventListener in the finally block (or register once
before retries and clean up after all retries). Concretely, reference the
requestOptions.abortSignal.addEventListener usage and the controller variable:
either attach the listener a single time outside the retry loop or, if kept
inside, store the callback in a const and call
requestOptions.abortSignal.removeEventListener(sameCallback) in the finally so
the orphaned listeners do not accumulate.
- Around line 574-591: The PostToolUse hook currently calls
JSON.stringify(toolResponse) without protection which can throw (e.g., circular
refs); wrap the serialization and queue push in a try/catch inside the async
hook (the same pattern used in PostToolUseFailure) so any serialization error is
caught, push a safe fallback (e.g., toString or a placeholder like
"[unserializable tool response]") into toolResultQueue with toolName/toolCallId
preserved, and still return { continue: true }; ensure errors are logged or
handled rather than allowing an unhandled rejection from the PostToolUse hook.
---
Outside diff comments:
In `@packages/providers/src/claude/provider.ts`:
- Around line 920-931: The final throw after the retry loop in Claude's provider
currently throws the raw lastError (assigned from err) losing enrichment; update
the loop to ensure lastError stores the enrichedError returned by
classifyAndEnrichError (or set lastError = enrichedError before continuing) and
change the final throw to throw lastError (which will now be enriched) or fall
back to enrichedError/new Error if undefined; adjust references around
shouldRetry, attempt, MAX_SUBPROCESS_RETRIES, classifyAndEnrichError and
enrichedError to ensure the enriched diagnostic context is preserved when
throwing after retries.
In `@packages/providers/src/codex/provider.ts`:
- Around line 595-606: The loop currently assigns the raw err to lastError and
throws that after retries, losing enrichments from classifyAndEnrichCodexError;
update the retry loop so that after calling classifyAndEnrichCodexError you
store the returned enrichedError into lastError (instead of err) and throw
lastError (enriched) when retries are exhausted (respecting shouldRetry and
MAX_SUBPROCESS_RETRIES and preserving existing delay logic that uses
retryBaseDelayMs and getLog()) so the final thrown error contains the
classification/context.
---
Nitpick comments:
In `@packages/providers/src/claude/provider.ts`:
- Around line 630-721: streamClaudeMessages currently ignores any top-level
event types other than 'assistant', 'system', 'rate_limit_event', and 'result';
add a debug log for unexpected event types so SDK changes don't go unnoticed.
After the existing if/else chain in streamClaudeMessages, add an else branch
that calls getLog().debug({ eventType: event.type, raw: msg },
'claude.unhandled_event') (or similar) to surface unrecognized event types and
their raw payloads; reference the streamClaudeMessages function and the local
variables event and msg to locate where to insert this logging.
- Around line 335-344: The ProviderWarning interface's code property is too
permissive as string; define a specific union type (e.g., `type
ProviderWarningCode = 'quota' | 'rate_limit' | 'deprecated' | 'transient'` or
whatever canonical codes your providers emit), replace the `code: string` with
`code: ProviderWarningCode`, and update any usages (parsing/creation sites) to
return one of those literals so callers can exhaustively switch on
ProviderWarning.code; keep the ProviderWarning name and shape but tighten the
`code` type and export the union type for reuse.
In `@packages/providers/src/codex/provider.ts`:
- Around line 235-378: The switch on itemType in the generator (the block that
handles 'agent_message', 'command_execution', etc.) lacks a default branch so
unknown item types are silently dropped; add a default case that logs a debug
(or warn) with getLog(), including the itemId and the unrecognized itemType
(e.g., getLog().debug({ itemId: item.id, itemType }, 'unknown_item_type')) and
optionally yield a minimal system message or no-op if you want visibility to
consumers; place this default at the end of the switch to ensure any future/new
item types are surfaced in logs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1ba72908-eda5-4c7c-87fb-22cdb9866c15
📒 Files selected for processing (2)
packages/providers/src/claude/provider.tspackages/providers/src/codex/provider.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
packages/providers/src/codex/provider.ts (1)
235-378: Consider adding a default branch for unrecognized item types.The switch statement handles known Codex item types but silently ignores any new types the SDK may introduce. Adding a debug log for unhandled types would aid future debugging without breaking current behavior.
💡 Suggested addition after line 377
break; } + + default: + getLog().debug({ itemType, itemId: item.id }, 'item_completed_unhandled_type'); + break; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/codex/provider.ts` around lines 235 - 378, The switch on itemType in the Codex handling block should include a default branch to surface unrecognized item types; add a default: case at the end of the switch that calls getLog().debug (or .warn if you prefer) with context like { itemId: item.id, itemType } and a message key such as 'codex_unhandled_item_type' so unknown/new SDK item types are logged rather than silently ignored (locate the switch (itemType) block in provider.ts and add the default case after the existing cases).packages/providers/src/claude/provider.ts (1)
856-864: Cache the loaded MCP config to avoid repeated file I/O.When
nodeConfig.mcpis set,applyNodeConfigcallsloadMcpConfigwhich performs file I/O (line 396). The current pattern calls it once to extract warnings (line 863), then again on each retry attempt inside the loop (line 904), resulting in N+1 disk reads for the same config file.The straightforward fix is to load MCP once before the retry loop and reuse it:
Example approach
let cachedMcpServers: Record<string, unknown> | undefined; let nodeConfigWarnings: ProviderWarning[] = []; if (requestOptions?.nodeConfig?.mcp) { const { servers, serverNames, missingVars } = await loadMcpConfig(requestOptions.nodeConfig.mcp, cwd); cachedMcpServers = servers; // extract and yield warnings here } for (let attempt = 0; attempt <= MAX_SUBPROCESS_RETRIES; attempt++) { // ... if (requestOptions?.nodeConfig) { await applyNodeConfig(options, requestOptions.nodeConfig, cwd, cachedMcpServers); } }Alternatively, split
applyNodeConfiginto separate concern functions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/claude/provider.ts` around lines 856 - 864, The code currently calls applyNodeConfig twice causing repeated loadMcpConfig file I/O; load the MCP config once before the retry loop and reuse it: when requestOptions?.nodeConfig?.mcp is present call loadMcpConfig(cwd) once to produce servers/serverNames/missingVars, extract and set nodeConfigWarnings there, store the servers in a cachedMcpServers variable, then inside the retry loop call applyNodeConfig with the cached servers (or modify applyNodeConfig to accept an optional cachedMcpServers param) so subsequent attempts reuse the in-memory MCP data instead of re-reading the file; ensure nodeConfigWarnings are yielded before the first attempt and MAX_SUBPROCESS_RETRIES loop uses the cached data.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/providers/src/codex/provider.ts`:
- Around line 359-374: mcpResult.content may be non-serializable so calling
JSON.stringify directly can throw and crash the stream; update the block around
mcpResult.content in provider.ts (the branch that sets toolOutput and yields {
type: 'tool_result', toolName: mcpToolName, toolOutput }) to wrap
JSON.stringify(mcpResult.content) in a try/catch, on error log the serialization
error with getLog().warn (include item.id, server, tool, mcpToolName and the
error), and set a safe fallback for toolOutput (e.g., a descriptive string or
truncated util.inspect-like output) so the yield always returns a defined
toolOutput.
---
Nitpick comments:
In `@packages/providers/src/claude/provider.ts`:
- Around line 856-864: The code currently calls applyNodeConfig twice causing
repeated loadMcpConfig file I/O; load the MCP config once before the retry loop
and reuse it: when requestOptions?.nodeConfig?.mcp is present call
loadMcpConfig(cwd) once to produce servers/serverNames/missingVars, extract and
set nodeConfigWarnings there, store the servers in a cachedMcpServers variable,
then inside the retry loop call applyNodeConfig with the cached servers (or
modify applyNodeConfig to accept an optional cachedMcpServers param) so
subsequent attempts reuse the in-memory MCP data instead of re-reading the file;
ensure nodeConfigWarnings are yielded before the first attempt and
MAX_SUBPROCESS_RETRIES loop uses the cached data.
In `@packages/providers/src/codex/provider.ts`:
- Around line 235-378: The switch on itemType in the Codex handling block should
include a default branch to surface unrecognized item types; add a default: case
at the end of the switch that calls getLog().debug (or .warn if you prefer) with
context like { itemId: item.id, itemType } and a message key such as
'codex_unhandled_item_type' so unknown/new SDK item types are logged rather than
silently ignored (locate the switch (itemType) block in provider.ts and add the
default case after the existing cases).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e025636f-c9e6-4558-8093-fde60d700a19
📒 Files selected for processing (2)
packages/providers/src/claude/provider.tspackages/providers/src/codex/provider.ts
| if (mcpResult?.content) { | ||
| if (Array.isArray(mcpResult.content)) { | ||
| toolOutput = JSON.stringify(mcpResult.content); | ||
| } else { | ||
| getLog().warn( | ||
| { | ||
| itemId: item.id, | ||
| server, | ||
| tool, | ||
| resultType: typeof mcpResult.content, | ||
| }, | ||
| 'mcp_tool_call_unexpected_result_shape' | ||
| ); | ||
| } | ||
| } | ||
| yield { type: 'tool_result', toolName: mcpToolName, toolOutput }; |
There was a problem hiding this comment.
JSON.stringify can throw on non-serializable MCP content.
If mcpResult.content contains circular references, BigInt values, or other non-serializable data, JSON.stringify will throw and crash the stream. Consider wrapping in try/catch for robustness.
🛡️ Proposed fix
if (mcpResult?.content) {
if (Array.isArray(mcpResult.content)) {
- toolOutput = JSON.stringify(mcpResult.content);
+ try {
+ toolOutput = JSON.stringify(mcpResult.content);
+ } catch {
+ getLog().warn({ itemId: item.id, server, tool }, 'mcp_tool_result_not_serializable');
+ toolOutput = '[unserializable MCP result]';
+ }
} else {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/providers/src/codex/provider.ts` around lines 359 - 374,
mcpResult.content may be non-serializable so calling JSON.stringify directly can
throw and crash the stream; update the block around mcpResult.content in
provider.ts (the branch that sets toolOutput and yields { type: 'tool_result',
toolName: mcpToolName, toolOutput }) to wrap JSON.stringify(mcpResult.content)
in a try/catch, on error log the serialization error with getLog().warn (include
item.id, server, tool, mcpToolName and the error), and set a safe fallback for
toolOutput (e.g., a descriptive string or truncated util.inspect-like output) so
the yield always returns a defined toolOutput.
Restore test coverage for the specific fixes from the decomposition review, exercised through sendQuery (black-box) since helpers are file-local: Claude (6 tests): - Timeout error preserved (not collapsed into "Query aborted") - nodeConfig warnings emitted once even when retries occur - Abort signal cancels across retries via single forwarding listener - Enriched error (with stderr) thrown at retry exhaustion - PostToolUse hook handles circular reference without crashing - is_error result events logged at error level Codex (3 tests): - Abort signal throws instead of silently truncating stream - Enriched error thrown at retry exhaustion - Todo-list dedup state resets between retry attempts
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/providers/src/claude/provider.test.ts`:
- Around line 1016-1046: The test currently aborts before a second retry starts
so it doesn't catch per-retry listener leaks; update the test for the
listener-leak fix by either (A) arranging mockQuery to allow at least two retry
attempts to begin before calling abort (e.g., make the first attempt delay long
enough so the retry logic attaches listeners multiple times) and then assert
callCount > 1 and that only one abort listener remains, or (B) spy on the
AbortSignal/EventTarget methods used by client.sendQuery (e.g., addEventListener
and removeEventListener on AbortController.signal) and assert they are
registered exactly once and removed exactly once across retries; reference
mockQuery, client.sendQuery, and AbortController.signal when making the changes.
- Around line 988-1014: The test "emits nodeConfig warnings only once even when
retries occur" currently uses nodeConfig: { effort: 'high' } which never
produces a ProviderWarning, so change the nodeConfig passed to client.sendQuery
inside that test to a configuration that is guaranteed to trigger a
ProviderWarning (e.g., the same dangerous/unsupported option used elsewhere in
tests that yields a ProviderWarning), then update the assertions to expect
exactly one system warning chunk across all retries (filter chunks by c.type ===
'system' and assert length === 1) while keeping the existing callCount assertion
to confirm retries; refer to the test name and the client.sendQuery(...) call
and the mockQuery generator to locate where to change the nodeConfig and
assertions.
In `@packages/providers/src/codex/provider.test.ts`:
- Around line 1194-1230: The test todo_list dedup state resets between retry
attempts currently only checks that "Task 1" appears at least once; update the
assertions in the test 'todo_list dedup state resets between retry attempts' to
explicitly assert the todo was emitted on both attempts by counting occurrences
from mockRunStreamed via the streaming output from client.sendQuery: filter
systemChunks (as currently done) and assert that systemChunks.filter(c =>
c.content.includes('Task 1')).length === 2 (or >=2 if you prefer permissive), so
the same todo appears on the initial attempt and again on the retry, ensuring
dedup state was reset; keep using mockRunStreamed and systemChunks to locate the
emitted events.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: acc34ce2-e11c-4450-a3d3-cea41f56d702
📒 Files selected for processing (2)
packages/providers/src/claude/provider.test.tspackages/providers/src/codex/provider.test.ts
| test('emits nodeConfig warnings only once even when retries occur', async () => { | ||
| let callCount = 0; | ||
| mockQuery.mockImplementation(async function* () { | ||
| callCount++; | ||
| if (callCount <= 2) { | ||
| throw new Error('process exited with code 1'); // crash → retried | ||
| } | ||
| yield { | ||
| type: 'assistant', | ||
| message: { content: [{ type: 'text', text: 'ok' }] }, | ||
| }; | ||
| }); | ||
|
|
||
| const chunks = []; | ||
| for await (const chunk of client.sendQuery('test', '/workspace', undefined, { | ||
| nodeConfig: { effort: 'high' }, | ||
| })) { | ||
| chunks.push(chunk); | ||
| } | ||
|
|
||
| // nodeConfig with effort doesn't produce warnings, but let's verify | ||
| // no system chunks are duplicated. Use a nodeConfig that doesn't warn. | ||
| // The point is: zero warning chunks means zero, not zero × 3 retries. | ||
| const systemChunks = chunks.filter(c => c.type === 'system'); | ||
| expect(systemChunks).toHaveLength(0); | ||
| expect(callCount).toBe(3); // Confirms retries happened | ||
| }, 5_000); |
There was a problem hiding this comment.
This never exercises a warning path.
nodeConfig: { effort: 'high' } does not produce a provider warning, so systemChunks.length === 0 doesn't prove warnings are emitted only once across retries. Please switch this to a config that is guaranteed to yield a ProviderWarning, then assert exactly one warning chunk over the full retry sequence.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/providers/src/claude/provider.test.ts` around lines 988 - 1014, The
test "emits nodeConfig warnings only once even when retries occur" currently
uses nodeConfig: { effort: 'high' } which never produces a ProviderWarning, so
change the nodeConfig passed to client.sendQuery inside that test to a
configuration that is guaranteed to trigger a ProviderWarning (e.g., the same
dangerous/unsupported option used elsewhere in tests that yields a
ProviderWarning), then update the assertions to expect exactly one system
warning chunk across all retries (filter chunks by c.type === 'system' and
assert length === 1) while keeping the existing callCount assertion to confirm
retries; refer to the test name and the client.sendQuery(...) call and the
mockQuery generator to locate where to change the nodeConfig and assertions.
| test('abort signal cancels query across retries without listener leak', async () => { | ||
| const abortController = new AbortController(); | ||
| let callCount = 0; | ||
|
|
||
| mockQuery.mockImplementation(async function* () { | ||
| callCount++; | ||
| if (callCount === 1) { | ||
| // First attempt crashes → triggers retry. Abort during the retry delay | ||
| // so the next iteration's abortSignal.aborted check catches it. | ||
| setTimeout(() => abortController.abort(), 0); | ||
| throw new Error('process exited with code 1'); | ||
| } | ||
| // Should not reach here — abort fires before retry starts | ||
| yield { | ||
| type: 'assistant', | ||
| message: { content: [{ type: 'text', text: 'should not reach' }] }, | ||
| }; | ||
| }); | ||
|
|
||
| const consumeGenerator = async (): Promise<void> => { | ||
| for await (const _ of client.sendQuery('test', '/workspace', undefined, { | ||
| abortSignal: abortController.signal, | ||
| })) { | ||
| // consume | ||
| } | ||
| }; | ||
|
|
||
| await expect(consumeGenerator()).rejects.toThrow('Query aborted'); | ||
| // Single abort listener registered (not per-retry) | ||
| expect(callCount).toBe(1); | ||
| }, 5_000); |
There was a problem hiding this comment.
This doesn't actually verify the listener-leak fix.
The abort fires before a second attempt begins, so this test would still pass if a new abort listener were attached on every retry. If you want to lock in the leak fix, either let multiple retries start before aborting or spy on addEventListener/removeEventListener and assert a single registration lifecycle.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/providers/src/claude/provider.test.ts` around lines 1016 - 1046, The
test currently aborts before a second retry starts so it doesn't catch per-retry
listener leaks; update the test for the listener-leak fix by either (A)
arranging mockQuery to allow at least two retry attempts to begin before calling
abort (e.g., make the first attempt delay long enough so the retry logic
attaches listeners multiple times) and then assert callCount > 1 and that only
one abort listener remains, or (B) spy on the AbortSignal/EventTarget methods
used by client.sendQuery (e.g., addEventListener and removeEventListener on
AbortController.signal) and assert they are registered exactly once and removed
exactly once across retries; reference mockQuery, client.sendQuery, and
AbortController.signal when making the changes.
| test('todo_list dedup state resets between retry attempts', async () => { | ||
| const todoItem = { | ||
| type: 'todo_list', | ||
| items: [{ text: 'Task 1', completed: false }], | ||
| id: 'todo-1', | ||
| }; | ||
|
|
||
| let callCount = 0; | ||
| mockRunStreamed.mockImplementation(() => { | ||
| callCount++; | ||
| if (callCount === 1) { | ||
| return Promise.resolve({ | ||
| events: (async function* () { | ||
| yield { type: 'item.completed', item: todoItem }; | ||
| throw new Error('codex exec crashed'); | ||
| })(), | ||
| }); | ||
| } | ||
| // On retry, same todo should appear again (fresh state) | ||
| return Promise.resolve({ | ||
| events: (async function* () { | ||
| yield { type: 'item.completed', item: todoItem }; | ||
| yield { type: 'turn.completed', usage: defaultUsage }; | ||
| })(), | ||
| }); | ||
| }); | ||
|
|
||
| const chunks = []; | ||
| for await (const chunk of client.sendQuery('test', '/workspace')) { | ||
| chunks.push(chunk); | ||
| } | ||
|
|
||
| // The todo should appear on the retry attempt (not suppressed by dedup from attempt 1) | ||
| const systemChunks = chunks.filter(c => c.type === 'system'); | ||
| expect(systemChunks.length).toBeGreaterThanOrEqual(1); | ||
| expect(systemChunks.some(c => c.type === 'system' && c.content.includes('Task 1'))).toBe(true); | ||
| }, 5_000); |
There was a problem hiding this comment.
Assert the retry emission explicitly.
This passes even if the todo is emitted only on attempt 1 and then suppressed on retry, because the current systemChunks checks are already satisfied by the pre-crash chunk. To lock in the dedup-reset fix, assert that the same todo appears twice across the two attempts.
Suggested assertion tightening
- const systemChunks = chunks.filter(c => c.type === 'system');
- expect(systemChunks.length).toBeGreaterThanOrEqual(1);
- expect(systemChunks.some(c => c.type === 'system' && c.content.includes('Task 1'))).toBe(true);
+ const systemChunks = chunks.filter(c => c.type === 'system');
+ expect(systemChunks).toHaveLength(2);
+ expect(systemChunks[0]).toMatchObject({
+ type: 'system',
+ content: expect.stringContaining('Task 1'),
+ });
+ expect(systemChunks[1]).toMatchObject({
+ type: 'system',
+ content: expect.stringContaining('Task 1'),
+ });📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| test('todo_list dedup state resets between retry attempts', async () => { | |
| const todoItem = { | |
| type: 'todo_list', | |
| items: [{ text: 'Task 1', completed: false }], | |
| id: 'todo-1', | |
| }; | |
| let callCount = 0; | |
| mockRunStreamed.mockImplementation(() => { | |
| callCount++; | |
| if (callCount === 1) { | |
| return Promise.resolve({ | |
| events: (async function* () { | |
| yield { type: 'item.completed', item: todoItem }; | |
| throw new Error('codex exec crashed'); | |
| })(), | |
| }); | |
| } | |
| // On retry, same todo should appear again (fresh state) | |
| return Promise.resolve({ | |
| events: (async function* () { | |
| yield { type: 'item.completed', item: todoItem }; | |
| yield { type: 'turn.completed', usage: defaultUsage }; | |
| })(), | |
| }); | |
| }); | |
| const chunks = []; | |
| for await (const chunk of client.sendQuery('test', '/workspace')) { | |
| chunks.push(chunk); | |
| } | |
| // The todo should appear on the retry attempt (not suppressed by dedup from attempt 1) | |
| const systemChunks = chunks.filter(c => c.type === 'system'); | |
| expect(systemChunks.length).toBeGreaterThanOrEqual(1); | |
| expect(systemChunks.some(c => c.type === 'system' && c.content.includes('Task 1'))).toBe(true); | |
| }, 5_000); | |
| test('todo_list dedup state resets between retry attempts', async () => { | |
| const todoItem = { | |
| type: 'todo_list', | |
| items: [{ text: 'Task 1', completed: false }], | |
| id: 'todo-1', | |
| }; | |
| let callCount = 0; | |
| mockRunStreamed.mockImplementation(() => { | |
| callCount++; | |
| if (callCount === 1) { | |
| return Promise.resolve({ | |
| events: (async function* () { | |
| yield { type: 'item.completed', item: todoItem }; | |
| throw new Error('codex exec crashed'); | |
| })(), | |
| }); | |
| } | |
| // On retry, same todo should appear again (fresh state) | |
| return Promise.resolve({ | |
| events: (async function* () { | |
| yield { type: 'item.completed', item: todoItem }; | |
| yield { type: 'turn.completed', usage: defaultUsage }; | |
| })(), | |
| }); | |
| }); | |
| const chunks = []; | |
| for await (const chunk of client.sendQuery('test', '/workspace')) { | |
| chunks.push(chunk); | |
| } | |
| // The todo should appear on the retry attempt (not suppressed by dedup from attempt 1) | |
| const systemChunks = chunks.filter(c => c.type === 'system'); | |
| expect(systemChunks).toHaveLength(2); | |
| expect(systemChunks[0]).toMatchObject({ | |
| type: 'system', | |
| content: expect.stringContaining('Task 1'), | |
| }); | |
| expect(systemChunks[1]).toMatchObject({ | |
| type: 'system', | |
| content: expect.stringContaining('Task 1'), | |
| }); | |
| }, 5_000); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/providers/src/codex/provider.test.ts` around lines 1194 - 1230, The
test todo_list dedup state resets between retry attempts currently only checks
that "Task 1" appears at least once; update the assertions in the test
'todo_list dedup state resets between retry attempts' to explicitly assert the
todo was emitted on both attempts by counting occurrences from mockRunStreamed
via the streaming output from client.sendQuery: filter systemChunks (as
currently done) and assert that systemChunks.filter(c =>
c.content.includes('Task 1')).length === 2 (or >=2 if you prefer permissive), so
the same todo appears on the initial attempt and again on the retry, ensuring
dedup state was reset; keep using mockRunStreamed and systemChunks to locate the
emitted events.
…ies (coleam00#1162) * refactor: decompose provider sendQuery() into explicit helper boundaries (coleam00#1139) sendQuery() in both Claude and Codex providers was a monolith mixing SDK option building, nodeConfig translation, stream normalization, and error classification. This makes it hard to safely extend for Phase 2 provider extensibility. Decompose both providers into focused internal helpers: Claude: - buildBaseClaudeOptions: SDK option construction - buildToolCaptureHooks: PostToolUse/PostToolUseFailure hook setup - applyNodeConfig: workflow nodeConfig → SDK translation + structured warnings - streamClaudeMessages: raw SDK event → MessageChunk normalization - classifyAndEnrichError: error classification with retry decisions Codex: - buildTurnOptions: per-turn option construction (output schema, abort) - streamCodexEvents: raw SDK event → MessageChunk normalization - classifyAndEnrichCodexError: error classification with retry decisions Also introduces ProviderWarning { code, message } replacing raw string warnings for machine-readable provider translation warnings. Adds 43 focused unit tests covering the extracted helpers directly. Fixes coleam00#1139 * fix: export ToolResultEntry type used in public buildBaseClaudeOptions API * fix: unexport internal helpers to prevent API surface leakage, fix retry state bug Review findings: 1. Internal helpers were exported and reachable through package.json subpath exports (./claude/provider, ./codex/provider), widening the public API. All new helpers are now file-local — the only public exports remain ClaudeProvider, CodexProvider, loadMcpConfig, buildSDKHooksFromYAML, withFirstMessageTimeout, getProcessUid. 2. Codex streamState (lastTodoListSignature) was shared across retry attempts, causing todo-list dedup to suppress output on retry. Now creates fresh state per attempt. Removed direct helper test imports — existing sendQuery e2e tests (51 Claude + 42 Codex) cover all behavior paths. * fix: address review findings — abort handling, retry bugs, error swallowing Fixes from CodeRabbit + multi-agent review: 1. classifyAndEnrichError preserves first-event timeout diagnostic instead of collapsing it into generic "Query aborted" (the timeout aborts the controller, but the original error carries the coleam00#1067 breadcrumb) 2. nodeConfigWarnings emitted once before retry loop, not per attempt 3. buildSubprocessEnv() called once before retry loop (was re-logging auth mode and rebuilding { ...process.env } per attempt) 4. Abort signal listener registered once with forwarding to current controller (was accumulating per-retry listeners) 5. PostToolUse hook wrapped in try/catch (JSON.stringify can throw on circular refs — was asymmetric with PostToolUseFailure which had it) 6. Codex streamCodexEvents throws on abort instead of silent break (callers were getting truncated stream with no result/error) 7. Both providers store enrichedError (not raw error) for retry exhaustion — preserves stderr context in final throw 8. Log is_error result events at error level in Claude stream normalizer * test: add black-box behavioral tests for sendQuery decomposition fixes Restore test coverage for the specific fixes from the decomposition review, exercised through sendQuery (black-box) since helpers are file-local: Claude (6 tests): - Timeout error preserved (not collapsed into "Query aborted") - nodeConfig warnings emitted once even when retries occur - Abort signal cancels across retries via single forwarding listener - Enriched error (with stderr) thrown at retry exhaustion - PostToolUse hook handles circular reference without crashing - is_error result events logged at error level Codex (3 tests): - Abort signal throws instead of silently truncating stream - Enriched error thrown at retry exhaustion - Todo-list dedup state resets between retry attempts
…ies (coleam00#1162) * refactor: decompose provider sendQuery() into explicit helper boundaries (coleam00#1139) sendQuery() in both Claude and Codex providers was a monolith mixing SDK option building, nodeConfig translation, stream normalization, and error classification. This makes it hard to safely extend for Phase 2 provider extensibility. Decompose both providers into focused internal helpers: Claude: - buildBaseClaudeOptions: SDK option construction - buildToolCaptureHooks: PostToolUse/PostToolUseFailure hook setup - applyNodeConfig: workflow nodeConfig → SDK translation + structured warnings - streamClaudeMessages: raw SDK event → MessageChunk normalization - classifyAndEnrichError: error classification with retry decisions Codex: - buildTurnOptions: per-turn option construction (output schema, abort) - streamCodexEvents: raw SDK event → MessageChunk normalization - classifyAndEnrichCodexError: error classification with retry decisions Also introduces ProviderWarning { code, message } replacing raw string warnings for machine-readable provider translation warnings. Adds 43 focused unit tests covering the extracted helpers directly. Fixes coleam00#1139 * fix: export ToolResultEntry type used in public buildBaseClaudeOptions API * fix: unexport internal helpers to prevent API surface leakage, fix retry state bug Review findings: 1. Internal helpers were exported and reachable through package.json subpath exports (./claude/provider, ./codex/provider), widening the public API. All new helpers are now file-local — the only public exports remain ClaudeProvider, CodexProvider, loadMcpConfig, buildSDKHooksFromYAML, withFirstMessageTimeout, getProcessUid. 2. Codex streamState (lastTodoListSignature) was shared across retry attempts, causing todo-list dedup to suppress output on retry. Now creates fresh state per attempt. Removed direct helper test imports — existing sendQuery e2e tests (51 Claude + 42 Codex) cover all behavior paths. * fix: address review findings — abort handling, retry bugs, error swallowing Fixes from CodeRabbit + multi-agent review: 1. classifyAndEnrichError preserves first-event timeout diagnostic instead of collapsing it into generic "Query aborted" (the timeout aborts the controller, but the original error carries the coleam00#1067 breadcrumb) 2. nodeConfigWarnings emitted once before retry loop, not per attempt 3. buildSubprocessEnv() called once before retry loop (was re-logging auth mode and rebuilding { ...process.env } per attempt) 4. Abort signal listener registered once with forwarding to current controller (was accumulating per-retry listeners) 5. PostToolUse hook wrapped in try/catch (JSON.stringify can throw on circular refs — was asymmetric with PostToolUseFailure which had it) 6. Codex streamCodexEvents throws on abort instead of silent break (callers were getting truncated stream with no result/error) 7. Both providers store enrichedError (not raw error) for retry exhaustion — preserves stderr context in final throw 8. Log is_error result events at error level in Claude stream normalizer * test: add black-box behavioral tests for sendQuery decomposition fixes Restore test coverage for the specific fixes from the decomposition review, exercised through sendQuery (black-box) since helpers are file-local: Claude (6 tests): - Timeout error preserved (not collapsed into "Query aborted") - nodeConfig warnings emitted once even when retries occur - Abort signal cancels across retries via single forwarding listener - Enriched error (with stderr) thrown at retry exhaustion - PostToolUse hook handles circular reference without crashing - is_error result events logged at error level Codex (3 tests): - Abort signal throws instead of silently truncating stream - Enriched error thrown at retry exhaustion - Todo-list dedup state resets between retry attempts
…ies (coleam00#1162) * refactor: decompose provider sendQuery() into explicit helper boundaries (coleam00#1139) sendQuery() in both Claude and Codex providers was a monolith mixing SDK option building, nodeConfig translation, stream normalization, and error classification. This makes it hard to safely extend for Phase 2 provider extensibility. Decompose both providers into focused internal helpers: Claude: - buildBaseClaudeOptions: SDK option construction - buildToolCaptureHooks: PostToolUse/PostToolUseFailure hook setup - applyNodeConfig: workflow nodeConfig → SDK translation + structured warnings - streamClaudeMessages: raw SDK event → MessageChunk normalization - classifyAndEnrichError: error classification with retry decisions Codex: - buildTurnOptions: per-turn option construction (output schema, abort) - streamCodexEvents: raw SDK event → MessageChunk normalization - classifyAndEnrichCodexError: error classification with retry decisions Also introduces ProviderWarning { code, message } replacing raw string warnings for machine-readable provider translation warnings. Adds 43 focused unit tests covering the extracted helpers directly. Fixes coleam00#1139 * fix: export ToolResultEntry type used in public buildBaseClaudeOptions API * fix: unexport internal helpers to prevent API surface leakage, fix retry state bug Review findings: 1. Internal helpers were exported and reachable through package.json subpath exports (./claude/provider, ./codex/provider), widening the public API. All new helpers are now file-local — the only public exports remain ClaudeProvider, CodexProvider, loadMcpConfig, buildSDKHooksFromYAML, withFirstMessageTimeout, getProcessUid. 2. Codex streamState (lastTodoListSignature) was shared across retry attempts, causing todo-list dedup to suppress output on retry. Now creates fresh state per attempt. Removed direct helper test imports — existing sendQuery e2e tests (51 Claude + 42 Codex) cover all behavior paths. * fix: address review findings — abort handling, retry bugs, error swallowing Fixes from CodeRabbit + multi-agent review: 1. classifyAndEnrichError preserves first-event timeout diagnostic instead of collapsing it into generic "Query aborted" (the timeout aborts the controller, but the original error carries the coleam00#1067 breadcrumb) 2. nodeConfigWarnings emitted once before retry loop, not per attempt 3. buildSubprocessEnv() called once before retry loop (was re-logging auth mode and rebuilding { ...process.env } per attempt) 4. Abort signal listener registered once with forwarding to current controller (was accumulating per-retry listeners) 5. PostToolUse hook wrapped in try/catch (JSON.stringify can throw on circular refs — was asymmetric with PostToolUseFailure which had it) 6. Codex streamCodexEvents throws on abort instead of silent break (callers were getting truncated stream with no result/error) 7. Both providers store enrichedError (not raw error) for retry exhaustion — preserves stderr context in final throw 8. Log is_error result events at error level in Claude stream normalizer * test: add black-box behavioral tests for sendQuery decomposition fixes Restore test coverage for the specific fixes from the decomposition review, exercised through sendQuery (black-box) since helpers are file-local: Claude (6 tests): - Timeout error preserved (not collapsed into "Query aborted") - nodeConfig warnings emitted once even when retries occur - Abort signal cancels across retries via single forwarding listener - Enriched error (with stderr) thrown at retry exhaustion - PostToolUse hook handles circular reference without crashing - is_error result events logged at error level Codex (3 tests): - Abort signal throws instead of silently truncating stream - Enriched error thrown at retry exhaustion - Todo-list dedup state resets between retry attempts
…ies (coleam00#1162) * refactor: decompose provider sendQuery() into explicit helper boundaries (coleam00#1139) sendQuery() in both Claude and Codex providers was a monolith mixing SDK option building, nodeConfig translation, stream normalization, and error classification. This makes it hard to safely extend for Phase 2 provider extensibility. Decompose both providers into focused internal helpers: Claude: - buildBaseClaudeOptions: SDK option construction - buildToolCaptureHooks: PostToolUse/PostToolUseFailure hook setup - applyNodeConfig: workflow nodeConfig → SDK translation + structured warnings - streamClaudeMessages: raw SDK event → MessageChunk normalization - classifyAndEnrichError: error classification with retry decisions Codex: - buildTurnOptions: per-turn option construction (output schema, abort) - streamCodexEvents: raw SDK event → MessageChunk normalization - classifyAndEnrichCodexError: error classification with retry decisions Also introduces ProviderWarning { code, message } replacing raw string warnings for machine-readable provider translation warnings. Adds 43 focused unit tests covering the extracted helpers directly. Fixes coleam00#1139 * fix: export ToolResultEntry type used in public buildBaseClaudeOptions API * fix: unexport internal helpers to prevent API surface leakage, fix retry state bug Review findings: 1. Internal helpers were exported and reachable through package.json subpath exports (./claude/provider, ./codex/provider), widening the public API. All new helpers are now file-local — the only public exports remain ClaudeProvider, CodexProvider, loadMcpConfig, buildSDKHooksFromYAML, withFirstMessageTimeout, getProcessUid. 2. Codex streamState (lastTodoListSignature) was shared across retry attempts, causing todo-list dedup to suppress output on retry. Now creates fresh state per attempt. Removed direct helper test imports — existing sendQuery e2e tests (51 Claude + 42 Codex) cover all behavior paths. * fix: address review findings — abort handling, retry bugs, error swallowing Fixes from CodeRabbit + multi-agent review: 1. classifyAndEnrichError preserves first-event timeout diagnostic instead of collapsing it into generic "Query aborted" (the timeout aborts the controller, but the original error carries the coleam00#1067 breadcrumb) 2. nodeConfigWarnings emitted once before retry loop, not per attempt 3. buildSubprocessEnv() called once before retry loop (was re-logging auth mode and rebuilding { ...process.env } per attempt) 4. Abort signal listener registered once with forwarding to current controller (was accumulating per-retry listeners) 5. PostToolUse hook wrapped in try/catch (JSON.stringify can throw on circular refs — was asymmetric with PostToolUseFailure which had it) 6. Codex streamCodexEvents throws on abort instead of silent break (callers were getting truncated stream with no result/error) 7. Both providers store enrichedError (not raw error) for retry exhaustion — preserves stderr context in final throw 8. Log is_error result events at error level in Claude stream normalizer * test: add black-box behavioral tests for sendQuery decomposition fixes Restore test coverage for the specific fixes from the decomposition review, exercised through sendQuery (black-box) since helpers are file-local: Claude (6 tests): - Timeout error preserved (not collapsed into "Query aborted") - nodeConfig warnings emitted once even when retries occur - Abort signal cancels across retries via single forwarding listener - Enriched error (with stderr) thrown at retry exhaustion - PostToolUse hook handles circular reference without crashing - is_error result events logged at error level Codex (3 tests): - Abort signal throws instead of silently truncating stream - Enriched error thrown at retry exhaustion - Todo-list dedup state resets between retry attempts
…ies (coleam00#1162) * refactor: decompose provider sendQuery() into explicit helper boundaries (coleam00#1139) sendQuery() in both Claude and Codex providers was a monolith mixing SDK option building, nodeConfig translation, stream normalization, and error classification. This makes it hard to safely extend for Phase 2 provider extensibility. Decompose both providers into focused internal helpers: Claude: - buildBaseClaudeOptions: SDK option construction - buildToolCaptureHooks: PostToolUse/PostToolUseFailure hook setup - applyNodeConfig: workflow nodeConfig → SDK translation + structured warnings - streamClaudeMessages: raw SDK event → MessageChunk normalization - classifyAndEnrichError: error classification with retry decisions Codex: - buildTurnOptions: per-turn option construction (output schema, abort) - streamCodexEvents: raw SDK event → MessageChunk normalization - classifyAndEnrichCodexError: error classification with retry decisions Also introduces ProviderWarning { code, message } replacing raw string warnings for machine-readable provider translation warnings. Adds 43 focused unit tests covering the extracted helpers directly. Fixes coleam00#1139 * fix: export ToolResultEntry type used in public buildBaseClaudeOptions API * fix: unexport internal helpers to prevent API surface leakage, fix retry state bug Review findings: 1. Internal helpers were exported and reachable through package.json subpath exports (./claude/provider, ./codex/provider), widening the public API. All new helpers are now file-local — the only public exports remain ClaudeProvider, CodexProvider, loadMcpConfig, buildSDKHooksFromYAML, withFirstMessageTimeout, getProcessUid. 2. Codex streamState (lastTodoListSignature) was shared across retry attempts, causing todo-list dedup to suppress output on retry. Now creates fresh state per attempt. Removed direct helper test imports — existing sendQuery e2e tests (51 Claude + 42 Codex) cover all behavior paths. * fix: address review findings — abort handling, retry bugs, error swallowing Fixes from CodeRabbit + multi-agent review: 1. classifyAndEnrichError preserves first-event timeout diagnostic instead of collapsing it into generic "Query aborted" (the timeout aborts the controller, but the original error carries the coleam00#1067 breadcrumb) 2. nodeConfigWarnings emitted once before retry loop, not per attempt 3. buildSubprocessEnv() called once before retry loop (was re-logging auth mode and rebuilding { ...process.env } per attempt) 4. Abort signal listener registered once with forwarding to current controller (was accumulating per-retry listeners) 5. PostToolUse hook wrapped in try/catch (JSON.stringify can throw on circular refs — was asymmetric with PostToolUseFailure which had it) 6. Codex streamCodexEvents throws on abort instead of silent break (callers were getting truncated stream with no result/error) 7. Both providers store enrichedError (not raw error) for retry exhaustion — preserves stderr context in final throw 8. Log is_error result events at error level in Claude stream normalizer * test: add black-box behavioral tests for sendQuery decomposition fixes Restore test coverage for the specific fixes from the decomposition review, exercised through sendQuery (black-box) since helpers are file-local: Claude (6 tests): - Timeout error preserved (not collapsed into "Query aborted") - nodeConfig warnings emitted once even when retries occur - Abort signal cancels across retries via single forwarding listener - Enriched error (with stderr) thrown at retry exhaustion - PostToolUse hook handles circular reference without crashing - is_error result events logged at error level Codex (3 tests): - Abort signal throws instead of silently truncating stream - Enriched error thrown at retry exhaustion - Todo-list dedup state resets between retry attempts
Summary
sendQuery()in both Claude and Codex providers from monolithic methods into focused internal helpers with clear ownership boundariesProviderWarning { code, message }type replacing raw string warnings for machine-readable provider translation warningsRoot Cause
sendQuery()was carrying 5+ mixed responsibilities (SDK option building, nodeConfig translation, warning collection, stream normalization, retry/error handling). Before Phase 2 adds provider extensibility, this boundary needs to be explicit so community providers don't need to cargo-cult a large mixed-responsibility method.Changes
packages/providers/src/claude/provider.tsbuildBaseClaudeOptions,buildToolCaptureHooks,streamClaudeMessages,classifyAndEnrichError;applyNodeConfignow returnsProviderWarning[]packages/providers/src/codex/provider.tsbuildTurnOptions,streamCodexEvents,classifyAndEnrichCodexErrorpackages/providers/src/claude/provider.test.tsapplyNodeConfig,streamClaudeMessages,classifyAndEnrichError,buildBaseClaudeOptionspackages/providers/src/codex/provider.test.tsbuildTurnOptions,streamCodexEvents,classifyAndEnrichCodexErrorDesign Decisions
IAgentProvider.sendQuery()contract is unchanged.ProviderWarning { code, message }replaces raw strings. Providers still convert tosystemchunks at the boundary — callers don't need to change.Testing
bun run validatepassesValidation
Issue
Fixes #1139
Summary by CodeRabbit
Refactor
Bug Fixes
Tests