fix(core): surface auth errors instead of silently dropping them#1089
fix(core): surface auth errors instead of silently dropping them#1089
Conversation
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThe pull request addresses a silent authentication failure bug in the orchestrator where OAuth token expiration errors were dropped before reaching users. Changes remove the session ID guard from result message handling, add explicit error detection and platform notification for authentication failures, and enhance error classification and formatting to detect OAuth refresh-token and Codex-specific authentication scenarios. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🔍 Comprehensive PR ReviewPR: #1089 — fix(core): surface auth errors instead of silently dropping them SummaryThis PR fixes a genuine silent-failure bug: auth errors surfaced via the SDK result chunk's Verdict:
🟠 High Issue — Must Fix Before MergeMissing log in both
|
| Title | Priority |
|---|---|
Add subtype-aware routing for isError result chunks (max_turns, etc.) |
P3 |
| Add OAuth retry-classification regression tests to claude/codex test files | P3 |
Reviewed by Archon comprehensive-pr-review workflow
Artifacts: ~/.archon/workspaces/coleam00/Archon/artifacts/runs/b1c43db8281e329539efaab1e5ad5a62/review/
Review Fix ReportAll blocking findings from the consolidated review have been addressed in commit 2555448. HIGH Fixed: Missing structured logging in
|
Archon PR Validation ReportVerdict: ✅ APPROVE SummaryAll five root causes of silent auth error drops are confirmed on Bug Confirmation
IssuesNo blocking issues found. Minor: Hardcoded error message in orchestrator Fix Quality: 5/5Validated by archon-validate-pr workflow |
When Claude OAuth refresh token is expired, the SDK yields a result chunk with is_error=true and no session_id. Both handleStreamMode and handleBatchMode guarded the result branch with `&& msg.sessionId`, silently dropping the error. Users saw no response at all. Changes: - Remove sessionId guard from result branches in orchestrator-agent.ts - Add isError early-exit that sends error message to user - Add 4 OAuth patterns to AUTH_PATTERNS in claude.ts and codex.ts - Add OAuth refresh-token handler to error-formatter.ts - Add tests for new error-formatter branches Fixes #1076 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…uth pattern
- Add getLog().warn({ conversationId, errorSubtype }, 'ai_result_error') in both
handleStreamMode and handleBatchMode isError branches so auth failures are
visible server-side instead of silently swallowed
- Remove 'access token' from AUTH_PATTERNS in claude.ts and codex.ts; the real
OAuth refresh error is already covered by 'refresh token' and 'could not be
refreshed', eliminating false-positive auth classification risk
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…er-specific messages The isError path in stream/batch mode used a hardcoded generic message, bypassing the classifyAndFormatError infrastructure. Now constructs a synthetic Error from errorSubtype and routes through the formatter. Error formatter updated with provider-specific auth detection: - Claude: OAuth token refresh, sign-in expired → guidance to run /login - Codex: 401 retry exhaustion → guidance to run codex login - General: tightened patterns (removed broad 'auth error' substring match) Also persists session ID before early-returning on isError. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
99a4c92 to
9c4d0fd
Compare
…eam00#1089) * fix: surface auth errors instead of silently dropping them (coleam00#1076) When Claude OAuth refresh token is expired, the SDK yields a result chunk with is_error=true and no session_id. Both handleStreamMode and handleBatchMode guarded the result branch with `&& msg.sessionId`, silently dropping the error. Users saw no response at all. Changes: - Remove sessionId guard from result branches in orchestrator-agent.ts - Add isError early-exit that sends error message to user - Add 4 OAuth patterns to AUTH_PATTERNS in claude.ts and codex.ts - Add OAuth refresh-token handler to error-formatter.ts - Add tests for new error-formatter branches Fixes coleam00#1076 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add structured logging to isError path and remove overly broad auth pattern - Add getLog().warn({ conversationId, errorSubtype }, 'ai_result_error') in both handleStreamMode and handleBatchMode isError branches so auth failures are visible server-side instead of silently swallowed - Remove 'access token' from AUTH_PATTERNS in claude.ts and codex.ts; the real OAuth refresh error is already covered by 'refresh token' and 'could not be refreshed', eliminating false-positive auth classification risk Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: route isError results through classifyAndFormatError with provider-specific messages The isError path in stream/batch mode used a hardcoded generic message, bypassing the classifyAndFormatError infrastructure. Now constructs a synthetic Error from errorSubtype and routes through the formatter. Error formatter updated with provider-specific auth detection: - Claude: OAuth token refresh, sign-in expired → guidance to run /login - Codex: 401 retry exhaustion → guidance to run codex login - General: tightened patterns (removed broad 'auth error' substring match) Also persists session ID before early-returning on isError. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eam00#1089) * fix: surface auth errors instead of silently dropping them (coleam00#1076) When Claude OAuth refresh token is expired, the SDK yields a result chunk with is_error=true and no session_id. Both handleStreamMode and handleBatchMode guarded the result branch with `&& msg.sessionId`, silently dropping the error. Users saw no response at all. Changes: - Remove sessionId guard from result branches in orchestrator-agent.ts - Add isError early-exit that sends error message to user - Add 4 OAuth patterns to AUTH_PATTERNS in claude.ts and codex.ts - Add OAuth refresh-token handler to error-formatter.ts - Add tests for new error-formatter branches Fixes coleam00#1076 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add structured logging to isError path and remove overly broad auth pattern - Add getLog().warn({ conversationId, errorSubtype }, 'ai_result_error') in both handleStreamMode and handleBatchMode isError branches so auth failures are visible server-side instead of silently swallowed - Remove 'access token' from AUTH_PATTERNS in claude.ts and codex.ts; the real OAuth refresh error is already covered by 'refresh token' and 'could not be refreshed', eliminating false-positive auth classification risk Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: route isError results through classifyAndFormatError with provider-specific messages The isError path in stream/batch mode used a hardcoded generic message, bypassing the classifyAndFormatError infrastructure. Now constructs a synthetic Error from errorSubtype and routes through the formatter. Error formatter updated with provider-specific auth detection: - Claude: OAuth token refresh, sign-in expired → guidance to run /login - Codex: 401 retry exhaustion → guidance to run codex login - General: tightened patterns (removed broad 'auth error' substring match) Also persists session ID before early-returning on isError. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
resultchunk withis_error: trueand nosession_id. BothhandleStreamModeandhandleBatchModeguarded the result branch with&& msg.sessionId, silently dropping the error chunk entirely — the user sees no response.&& msg.sessionIdguard from both result branches; addedisErrorearly-exit that sends a visible error message. Added 4 missing OAuth patterns toAUTH_PATTERNSin bothclaude.tsandcodex.tsto prevent unnecessary 3× retries. Extendederror-formatter.tsto produce actionable re-login guidance for OAuth refresh-token errors.turn.failedsession-ID loss issue is out of scope.UX Journey
Before
After
Architecture Diagram
Before
After
Connection inventory:
Label Snapshot
risk: lowsize: Scorecore:orchestrator,core:clients,core:utilsChange Metadata
bugcoreLinked Issue
Validation Evidence (required)
error-formatter.test.tscovering OAuth refresh-token and auth-error-prefix branchesSecurity Impact (required)
Compatibility / Migration
Human Verification (required)
bun run validate)Side Effects / Blast Radius (required)
isErrorguard only fires when the SDK explicitly signals an error; normal success results are unaffectedRollback Plan (required)
git revert 06b37a4eondevRisks and Mitigations
isErrorresult with a validsessionId(e.g. max-turns reached mid-session) causes early return that discards accumulated assistant contentnewSessionIdis captured before theisErrorcheck, so the session is preserved. The early return is correct — partial output without a completion signal is confusing.Summary by CodeRabbit