fix(core): add AWS env vars to subprocess allowlist for Bedrock auth by liorfranko · Pull Request #1060 · coleam00/Archon

liorfranko · 2026-04-10T18:48:45Z

Summary

AWS credentials (AWS_PROFILE, AWS_REGION, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_DEFAULT_REGION) were missing from the subprocess env allowlist
This caused 401 authentication failures when using Claude Code via AWS Bedrock, because the Claude subprocess couldn't see the AWS credentials it needs
The allowlist already included CLAUDE_CODE_USE_BEDROCK but not the AWS vars it depends on

Test plan

Verified fix locally: server chat with Bedrock auth works after adding AWS vars to allowlist
Verify no regression for direct Anthropic API users (no AWS vars present = no change)
Verify Vertex users are unaffected

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- AWS environment variables are now available during code execution.

AWS credentials (AWS_PROFILE, AWS_REGION, AWS_ACCESS_KEY_ID, etc.) were stripped from Claude Code subprocesses by the env allowlist, causing 401 auth failures when using Bedrock. The allowlist already included CLAUDE_CODE_USE_BEDROCK but not the AWS vars it depends on. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai · 2026-04-10T18:49:43Z

📝 Walkthrough

Walkthrough

The subprocess environment allowlist was extended to include six AWS-related environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_PROFILE, AWS_REGION, AWS_DEFAULT_REGION) to enable their propagation into Claude Code subprocesses.

Changes

Cohort / File(s)	Summary
Environment Allowlist Configuration `packages/core/src/utils/env-allowlist.ts`	Added six AWS environment variables to `SUBPROCESS_ENV_ALLOWLIST` to permit AWS credentials and configuration propagation to subprocesses.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

🐰 AWS keys hop into the list,
No Claude Code command shall be missed!
Credentials flow swift, secure, and bright,
AWS magic takes its flight! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The description addresses the core problem and solution but lacks several required sections from the template including architecture diagrams, validation evidence, security impact analysis, compatibility assessment, human verification details, and rollback plan.	Complete the required template sections: provide validation evidence (test results), document security impact analysis for credential handling, confirm backward compatibility, detail human verification steps performed, describe rollback plan, and include architecture diagrams.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately and specifically describes the main change: adding AWS environment variables to the subprocess allowlist for Bedrock authentication.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (1)

packages/core/src/utils/env-allowlist.ts (1)
37-43: Add a regression test for required Bedrock AWS keys in the allowlist.

Given this was a production auth break, a small unit test asserting these keys are preserved by buildCleanSubprocessEnv() would prevent accidental regressions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/utils/env-allowlist.ts` around lines 37 - 43, Add a unit
test that asserts buildCleanSubprocessEnv() preserves the required Bedrock AWS
keys in the allowlist: 'AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY',
'AWS_SESSION_TOKEN', 'AWS_PROFILE', 'AWS_REGION', and 'AWS_DEFAULT_REGION'. Mock
or supply an env object containing these keys with sample values, call
buildCleanSubprocessEnv(env) and assert each of the listed AWS keys exists and
keeps its value in the returned object; place the test alongside existing
env-related tests and use the same test helpers/assertion style as the repo.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@packages/core/src/utils/env-allowlist.ts`:
- Around line 37-43: Add a unit test that asserts buildCleanSubprocessEnv()
preserves the required Bedrock AWS keys in the allowlist: 'AWS_ACCESS_KEY_ID',
'AWS_SECRET_ACCESS_KEY', 'AWS_SESSION_TOKEN', 'AWS_PROFILE', 'AWS_REGION', and
'AWS_DEFAULT_REGION'. Mock or supply an env object containing these keys with
sample values, call buildCleanSubprocessEnv(env) and assert each of the listed
AWS keys exists and keeps its value in the returned object; place the test
alongside existing env-related tests and use the same test helpers/assertion
style as the repo.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b8fe2a73-d23f-40bc-95b1-4af04559d69f

📥 Commits

Reviewing files that changed from the base of the PR and between e8334b3 and 40aca38.

📒 Files selected for processing (1)

packages/core/src/utils/env-allowlist.ts

…only CWD The allowlist was wrong for a single-developer tool: - It blocked keys the user intentionally set in ~/.archon/.env (ANTHROPIC_API_KEY, AWS_*, CLAUDE_CONFIG_DIR, MiniMax vars, etc.) - It was bypassed by the SDK anyway (process.env leaks to subprocess regardless of the env option — see #1097) - It attracted a constant stream of PRs adding keys (#1060, #1093, #1099) New model: CWD .env keys are the only untrusted source. stripCwdEnv() at entry point handles that. Everything in ~/.archon/.env + shell env passes through to the subprocess. No filtering, no second-guessing. Changes: - Delete env-allowlist.ts and env-allowlist.test.ts - Simplify buildSubprocessEnv() to return { ...process.env } with auth-mode logging (no token stripping — user controls their config) - Replace 4 allowlist-based tests with 1 pass-through test - Remove env-allowlist.test.ts from core test batch - Update security.md and cli.md docs to reflect the new model The CLAUDECODE + CLAUDE_CODE_* marker strip and NODE_OPTIONS strip remain in stripCwdEnv() at entry point — those are process-level safety (not per-subprocess filtering) and are needed regardless.

Wirasm · 2026-04-12T09:01:01Z

Addressed by the allowlist removal in #1092 — SUBPROCESS_ENV_ALLOWLIST has been deleted. All user-configured env vars (including AWS Bedrock vars like AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, etc.) now pass through to the subprocess automatically via process.env. Once #1092 merges, this PR can be closed.

…t timeout (#1067, #1030, #1098, #1070) * fix: strip CWD .env leak, enable platform adapters in serve, add first-event timeout (#1067) Three bugs fixed: (1) Bun auto-loads CWD .env files before user code, leaking non-overlapping keys into the Archon process — new stripCwdEnv() boot import removes them before any module reads env. (2) archon serve hardcoded skipPlatformAdapters:true, preventing Slack/Telegram/Discord from starting. (3) Claude SDK query had no first-event timeout, causing silent 30-min hangs when the subprocess wedges — new withFirstMessageTimeout wrapper races the first event against a configurable deadline (default 60s). Changes: - Add @archon/paths/strip-cwd-env and strip-cwd-env-boot modules - Import boot module as first import in CLI entry point - Remove skipPlatformAdapters: true from serve.ts - Add withFirstMessageTimeout + diagnostics to ClaudeClient - Add CLAUDECODE=1 nested-session warning to CLI - Add 9 unit tests (6 strip-cwd-env + 3 timeout) Fixes #1067 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address review findings for PR #1092 Fixed: - Clear setTimeout timer in withFirstMessageTimeout finally block (HIGH-1) - Add strip-cwd-env-boot to server/src/index.ts for direct dev:server path (MEDIUM-1) - Warn to stderr on non-ENOENT errors in stripCwdEnv (MEDIUM-2) - Update stale configuration.md docs for new env-loading mechanism (HIGH-2) - Add ARCHON_CLAUDE_FIRST_EVENT_TIMEOUT_MS and ARCHON_SUPPRESS_NESTED_CLAUDE_WARNING env vars to docs - Add nested Claude Code hang troubleshooting entry - Fix boot module JSDoc: "CLI and server" → "CLI" only - Fix stripCwdEnv JSDoc: remove stale "override: true" reference - Update .claude/rules/cli.md startup behavior section - Update CLAUDE.md @archon/paths description with new exports Tests added: - Assert controller.signal.aborted on timeout - Handle generator that completes immediately without yielding - Strip distinct keys from different .env files Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * simplify: replace string sentinel with typed error class in withFirstMessageTimeout Replace the '__timeout__' string sentinel used to identify timeout rejections with a dedicated FirstEventTimeoutError class. instanceof checks are more explicit and robust than string comparison on error messages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: address review findings — dotenv version, docs, server warning, marker strip, tests 1. Align dotenv to ^17 (was ^16, rest of monorepo uses ^17.2.3) 2. Remove incorrect SUBPROCESS_ENV_ALLOWLIST claim from docs — the SDK bypasses the env option and uses process.env directly (#1097) 3. Add CLAUDECODE=1 warning to server entry point (was only in CLI) 4. Add diagnostic payload content test for withFirstMessageTimeout 5. Integrate #1097's finding: strip CLAUDECODE + CLAUDE_CODE_* session markers (except auth vars) + NODE_OPTIONS + VSCODE_INSPECTOR_OPTIONS from process.env at entry point. Pattern-matched on CLAUDE_CODE_* prefix rather than hardcoding 6 names, so future Claude Code markers are handled automatically. Auth vars (CLAUDE_CODE_OAUTH_TOKEN, CLAUDE_CODE_USE_BEDROCK, CLAUDE_CODE_USE_VERTEX) are preserved. Root cause per #1097: the Claude Agent SDK leaks process.env into the spawned child regardless of the explicit env option, so the only way to prevent the nested-session deadlock is to delete the markers from process.env at the entry point. Validation: bun run validate passes, 125 paths tests (6 new marker tests), 60 claude tests (1 new diagnostic test), DATABASE_URL leak verified stripped (target repo .env DATABASE_URL does not affect Archon DB selection). * refactor: remove SUBPROCESS_ENV_ALLOWLIST — trust user config, strip only CWD The allowlist was wrong for a single-developer tool: - It blocked keys the user intentionally set in ~/.archon/.env (ANTHROPIC_API_KEY, AWS_*, CLAUDE_CONFIG_DIR, MiniMax vars, etc.) - It was bypassed by the SDK anyway (process.env leaks to subprocess regardless of the env option — see #1097) - It attracted a constant stream of PRs adding keys (#1060, #1093, #1099) New model: CWD .env keys are the only untrusted source. stripCwdEnv() at entry point handles that. Everything in ~/.archon/.env + shell env passes through to the subprocess. No filtering, no second-guessing. Changes: - Delete env-allowlist.ts and env-allowlist.test.ts - Simplify buildSubprocessEnv() to return { ...process.env } with auth-mode logging (no token stripping — user controls their config) - Replace 4 allowlist-based tests with 1 pass-through test - Remove env-allowlist.test.ts from core test batch - Update security.md and cli.md docs to reflect the new model The CLAUDECODE + CLAUDE_CODE_* marker strip and NODE_OPTIONS strip remain in stripCwdEnv() at entry point — those are process-level safety (not per-subprocess filtering) and are needed regardless. * fix: restore override:true for archon env, add integration tests The integration tests caught a real issue: without override:true, the ~/.archon/.env load doesn't win over shell-inherited env vars. If the user's shell profile exports PORT=9999 and ~/.archon/.env has PORT=3000, the user expects Archon to use 3000. stripCwdEnv() handles CWD .env files (untrusted). override:true handles shell-inherited vars (trusted but less specific than ~/.archon/.env). Different concerns, both needed. Also adds 6 integration tests covering the full entry-point flow: 1. Global auth user with ANTHROPIC_API_KEY in CWD .env — stripped 2. OAuth token in archon env + random key in CWD — CWD stripped, archon kept 3. General leak test — nothing from CWD reaches subprocess 4. Same key in both CWD and archon — archon value wins 5. CLAUDECODE markers stripped even when not from CWD .env 6. CLAUDE_CODE_OAUTH_TOKEN survives marker strip * test: add DATABASE_URL leak scenarios to env integration tests * fix: move CLAUDECODE warning into stripCwdEnv, remove dead useGlobalAuth logic Review findings addressed: 1. CLAUDECODE warning was dead code — the boot import deleted CLAUDECODE from process.env before the warning check in cli.ts/server/index.ts could fire. Moved the warning into stripCwdEnv() itself, emitted BEFORE the deletion. Removed duplicate warning code from both entry points. 2. useGlobalAuth token stripping removed (intentional, not regression) — the old code stripped CLAUDE_CODE_OAUTH_TOKEN and CLAUDE_API_KEY when useGlobalAuth=true. Per design discussion: the user controls ~/.archon/.env and all keys they set are intentional. If they want global auth, they just don't set tokens. Simplified buildSubprocessEnv to log auth mode for diagnostics only, no filtering. 3. Docs "no override needed" corrected — cli.md and configuration.md now reflect the actual code (override: true). --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>

Wirasm · 2026-04-12T09:11:44Z

Resolved by #1092 (allowlist removed — all AWS Bedrock vars pass through automatically).

…t timeout (coleam00#1067, coleam00#1030, coleam00#1098, coleam00#1070) * fix: strip CWD .env leak, enable platform adapters in serve, add first-event timeout (coleam00#1067) Three bugs fixed: (1) Bun auto-loads CWD .env files before user code, leaking non-overlapping keys into the Archon process — new stripCwdEnv() boot import removes them before any module reads env. (2) archon serve hardcoded skipPlatformAdapters:true, preventing Slack/Telegram/Discord from starting. (3) Claude SDK query had no first-event timeout, causing silent 30-min hangs when the subprocess wedges — new withFirstMessageTimeout wrapper races the first event against a configurable deadline (default 60s). Changes: - Add @archon/paths/strip-cwd-env and strip-cwd-env-boot modules - Import boot module as first import in CLI entry point - Remove skipPlatformAdapters: true from serve.ts - Add withFirstMessageTimeout + diagnostics to ClaudeClient - Add CLAUDECODE=1 nested-session warning to CLI - Add 9 unit tests (6 strip-cwd-env + 3 timeout) Fixes coleam00#1067 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address review findings for PR coleam00#1092 Fixed: - Clear setTimeout timer in withFirstMessageTimeout finally block (HIGH-1) - Add strip-cwd-env-boot to server/src/index.ts for direct dev:server path (MEDIUM-1) - Warn to stderr on non-ENOENT errors in stripCwdEnv (MEDIUM-2) - Update stale configuration.md docs for new env-loading mechanism (HIGH-2) - Add ARCHON_CLAUDE_FIRST_EVENT_TIMEOUT_MS and ARCHON_SUPPRESS_NESTED_CLAUDE_WARNING env vars to docs - Add nested Claude Code hang troubleshooting entry - Fix boot module JSDoc: "CLI and server" → "CLI" only - Fix stripCwdEnv JSDoc: remove stale "override: true" reference - Update .claude/rules/cli.md startup behavior section - Update CLAUDE.md @archon/paths description with new exports Tests added: - Assert controller.signal.aborted on timeout - Handle generator that completes immediately without yielding - Strip distinct keys from different .env files Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * simplify: replace string sentinel with typed error class in withFirstMessageTimeout Replace the '__timeout__' string sentinel used to identify timeout rejections with a dedicated FirstEventTimeoutError class. instanceof checks are more explicit and robust than string comparison on error messages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: address review findings — dotenv version, docs, server warning, marker strip, tests 1. Align dotenv to ^17 (was ^16, rest of monorepo uses ^17.2.3) 2. Remove incorrect SUBPROCESS_ENV_ALLOWLIST claim from docs — the SDK bypasses the env option and uses process.env directly (coleam00#1097) 3. Add CLAUDECODE=1 warning to server entry point (was only in CLI) 4. Add diagnostic payload content test for withFirstMessageTimeout 5. Integrate coleam00#1097's finding: strip CLAUDECODE + CLAUDE_CODE_* session markers (except auth vars) + NODE_OPTIONS + VSCODE_INSPECTOR_OPTIONS from process.env at entry point. Pattern-matched on CLAUDE_CODE_* prefix rather than hardcoding 6 names, so future Claude Code markers are handled automatically. Auth vars (CLAUDE_CODE_OAUTH_TOKEN, CLAUDE_CODE_USE_BEDROCK, CLAUDE_CODE_USE_VERTEX) are preserved. Root cause per coleam00#1097: the Claude Agent SDK leaks process.env into the spawned child regardless of the explicit env option, so the only way to prevent the nested-session deadlock is to delete the markers from process.env at the entry point. Validation: bun run validate passes, 125 paths tests (6 new marker tests), 60 claude tests (1 new diagnostic test), DATABASE_URL leak verified stripped (target repo .env DATABASE_URL does not affect Archon DB selection). * refactor: remove SUBPROCESS_ENV_ALLOWLIST — trust user config, strip only CWD The allowlist was wrong for a single-developer tool: - It blocked keys the user intentionally set in ~/.archon/.env (ANTHROPIC_API_KEY, AWS_*, CLAUDE_CONFIG_DIR, MiniMax vars, etc.) - It was bypassed by the SDK anyway (process.env leaks to subprocess regardless of the env option — see coleam00#1097) - It attracted a constant stream of PRs adding keys (coleam00#1060, coleam00#1093, coleam00#1099) New model: CWD .env keys are the only untrusted source. stripCwdEnv() at entry point handles that. Everything in ~/.archon/.env + shell env passes through to the subprocess. No filtering, no second-guessing. Changes: - Delete env-allowlist.ts and env-allowlist.test.ts - Simplify buildSubprocessEnv() to return { ...process.env } with auth-mode logging (no token stripping — user controls their config) - Replace 4 allowlist-based tests with 1 pass-through test - Remove env-allowlist.test.ts from core test batch - Update security.md and cli.md docs to reflect the new model The CLAUDECODE + CLAUDE_CODE_* marker strip and NODE_OPTIONS strip remain in stripCwdEnv() at entry point — those are process-level safety (not per-subprocess filtering) and are needed regardless. * fix: restore override:true for archon env, add integration tests The integration tests caught a real issue: without override:true, the ~/.archon/.env load doesn't win over shell-inherited env vars. If the user's shell profile exports PORT=9999 and ~/.archon/.env has PORT=3000, the user expects Archon to use 3000. stripCwdEnv() handles CWD .env files (untrusted). override:true handles shell-inherited vars (trusted but less specific than ~/.archon/.env). Different concerns, both needed. Also adds 6 integration tests covering the full entry-point flow: 1. Global auth user with ANTHROPIC_API_KEY in CWD .env — stripped 2. OAuth token in archon env + random key in CWD — CWD stripped, archon kept 3. General leak test — nothing from CWD reaches subprocess 4. Same key in both CWD and archon — archon value wins 5. CLAUDECODE markers stripped even when not from CWD .env 6. CLAUDE_CODE_OAUTH_TOKEN survives marker strip * test: add DATABASE_URL leak scenarios to env integration tests * fix: move CLAUDECODE warning into stripCwdEnv, remove dead useGlobalAuth logic Review findings addressed: 1. CLAUDECODE warning was dead code — the boot import deleted CLAUDECODE from process.env before the warning check in cli.ts/server/index.ts could fire. Moved the warning into stripCwdEnv() itself, emitted BEFORE the deletion. Removed duplicate warning code from both entry points. 2. useGlobalAuth token stripping removed (intentional, not regression) — the old code stripped CLAUDE_CODE_OAUTH_TOKEN and CLAUDE_API_KEY when useGlobalAuth=true. Per design discussion: the user controls ~/.archon/.env and all keys they set are intentional. If they want global auth, they just don't set tokens. Simplified buildSubprocessEnv to log auth mode for diagnostics only, no filtering. 3. Docs "no override needed" corrected — cli.md and configuration.md now reflect the actual code (override: true). --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>

coderabbitai Bot reviewed Apr 10, 2026

View reviewed changes

Wirasm mentioned this pull request Apr 11, 2026

fix(claude): surface nested Claude Code hangs with timeout + diagnostics #1071

Closed

Wirasm mentioned this pull request Apr 12, 2026

fix: strip CWD .env leak, enable platform adapters in serve, add first-event timeout #1092

Merged

Wirasm closed this Apr 12, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(core): add AWS env vars to subprocess allowlist for Bedrock auth#1060

fix(core): add AWS env vars to subprocess allowlist for Bedrock auth#1060
liorfranko wants to merge 1 commit intocoleam00:devfrom
liorfranko:fix/bedrock-env-allowlist

liorfranko commented Apr 10, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 10, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Review ran into problems

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Wirasm commented Apr 12, 2026

Uh oh!

Wirasm commented Apr 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

liorfranko commented Apr 10, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Review ran into problems

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Wirasm commented Apr 12, 2026

Uh oh!

Wirasm commented Apr 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

liorfranko commented Apr 10, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 10, 2026 •

edited

Loading