
fix: remove CWD env stripping, load ~/.archon/.env with override for binary support #1045

Merged
Wirasm merged 2 commits into dev from fix/binary-env-loading
Apr 10, 2026

@Wirasm (Collaborator) commented Apr 10, 2026

Summary

  • Problem: Compiled binaries crash on archon serve because CWD .env stripping nukes all env vars, and the reload path (import.meta.dir) is baked to the CI runner's filesystem
  • Why it matters: archon serve is unusable for binary users — the server exits with no_ai_credentials before it can start
  • What changed: Removed CWD stripping (redundant — SUBPROCESS_ENV_ALLOWLIST + env-leak gate already protect), server loads ~/.archon/.env with override: true for all keys, skips import.meta.dir in binary mode, adds CLAUDE_USE_GLOBAL_AUTH defaulting
  • What did not change: SUBPROCESS_ENV_ALLOWLIST, buildSubprocessEnv(), env-leak gate, per-codebase DB env vars, Web UI env management — all untouched

UX Journey

Before

  User                        Binary                   Server
  ────                        ──────                   ──────
  runs archon serve ────────▶ Bun auto-loads CWD .env
                              strips ALL CWD keys
                              tries import.meta.dir .env ──▶ /Users/runner/... (ENOENT)
                              loads ~/.archon/.env ──────────▶ only DATABASE_URL applied
                              checks credentials ───────────▶ none found
                              process.exit(1) ✗

After

  User                        Binary                   Server
  ────                        ──────                   ──────
  runs archon serve ────────▶ Bun auto-loads CWD .env
                              [skips import.meta.dir — binary mode]
                              [loads ~/.archon/.env override: true] ──▶ all keys applied
                              [defaults CLAUDE_USE_GLOBAL_AUTH]
                              checks credentials ───────────▶ found ✓
                              starts server ✓

Architecture Diagram

Before

CWD .env ──strip──▶ process.env ──import.meta.dir──▶ repo .env ──▶ ~/.archon/.env (DATABASE_URL only)
                                                      ↑ BROKEN in binary

After

CWD .env ──(kept)──▶ process.env ──[dev only]──▶ repo .env
                         ↑
                    ~/.archon/.env (override: true, ALL keys)
                         ↑
                    CLAUDE_USE_GLOBAL_AUTH default

Connection inventory:

| From | To | Status | Notes |
| --- | --- | --- | --- |
| CWD .env stripping | process.env | removed | Redundant: allowlist + gate protect |
| import.meta.dir .env | process.env | modified | Skipped in binary mode |
| ~/.archon/.env | process.env | modified | Now loads all keys with override |
| CLAUDE_USE_GLOBAL_AUTH | process.env | new | Smart default in server (matches CLI) |
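
The "After" flow can be read as a short load plan. The sketch below is an illustrative model only, not the code in packages/server/src/index.ts: `planEnvLoad`, `LoadStep`, and the source labels are hypothetical names, and the step order follows the UX Journey above (repo .env first in dev mode, then ~/.archon/.env).

```typescript
// Illustrative model of the startup order described above; not Archon's actual code.
type EnvSource = "cwd" | "repo" | "global";

interface LoadStep {
  source: EnvSource;
  note: string;
}

// Returns the env sources the server would consult, in order.
function planEnvLoad(isBinary: boolean): LoadStep[] {
  const steps: LoadStep[] = [
    { source: "cwd", note: "auto-loaded by Bun, no longer stripped" },
  ];
  if (!isBinary) {
    // import.meta.dir is frozen at build time, so only dev mode uses it.
    steps.push({ source: "repo", note: "resolved via import.meta.dir, dev mode only" });
  }
  // The global file is applied last with override: true, so its keys win.
  steps.push({ source: "global", note: "~/.archon/.env, override: true, all keys" });
  return steps;
}

for (const isBinary of [true, false]) {
  const order = planEnvLoad(isBinary).map((step) => step.source).join(" -> ");
  console.log(`${isBinary ? "binary" : "dev"}: ${order}`);
}
```

In binary mode the plan has only two steps, which is why ~/.archon/.env has to carry every key the server needs.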

Label Snapshot

  • Risk: risk: medium
  • Size: size: S
  • Scope: server, cli
  • Module: server:env-loading, cli:env-loading

Change Metadata

  • Change type: bug
  • Primary scope: multi

Linked Issue

  • Related: binary archon serve crashes (discovered during v0.3.3 release testing)

Validation Evidence (required)

bun run type-check   # ✓ all 9 packages pass
bun run lint         # ✓ 0 errors, 0 warnings
bun run test         # ✓ all tests pass
  • Dev mode server tested: starts correctly, all adapters connect, API responds
  • Evidence: server logs show server_listening on port 3000

Security Impact (required)

  • New permissions/capabilities? No
  • New external network calls? No
  • Secrets/tokens handling changed? Yes
  • File system access scope changed? No
  • Security analysis: The CWD stripping was a redundant safety layer. The actual protection is SUBPROCESS_ENV_ALLOWLIST (blocks ANTHROPIC_API_KEY, ANTHROPIC_AUTH_TOKEN, OPENAI_API_KEY from reaching subprocesses) and the env-leak gate (scans target repos for sensitive keys before spawning). Both remain unchanged.
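
As a rough illustration of why an allowlist makes the CWD stripping redundant, here is a minimal sketch of allowlist filtering. It assumes nothing about the real `SUBPROCESS_ENV_ALLOWLIST` or `buildSubprocessEnv()`: `EXAMPLE_ALLOWLIST` and `filterEnvForSubprocess` are hypothetical names, and the allowlisted keys are placeholders.

```typescript
type Env = Record<string, string | undefined>;

// Placeholder contents for illustration; the real allowlist is not shown in this PR.
const EXAMPLE_ALLOWLIST: ReadonlySet<string> = new Set(["PATH", "HOME"]);

// Copies only allowlisted, defined keys into the env handed to a subprocess.
function filterEnvForSubprocess(
  env: Env,
  allowlist: ReadonlySet<string>,
): Record<string, string> {
  const filtered: Record<string, string> = {};
  for (const [key, value] of Object.entries(env)) {
    if (allowlist.has(key) && value !== undefined) {
      filtered[key] = value;
    }
  }
  return filtered;
}

const serverEnv: Env = {
  PATH: "/usr/bin",
  HOME: "/home/user",
  ANTHROPIC_API_KEY: "from-target-repo", // present in process.env, never forwarded
};
const subprocessEnv = filterEnvForSubprocess(serverEnv, EXAMPLE_ALLOWLIST);
```

Because the filter is applied at spawn time, a key picked up from the CWD .env can sit in the server's process.env without reaching the AI subprocess.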

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? Yes. ~/.archon/.env now applies all keys to the server (previously only DATABASE_URL). Users with conflicting keys in both the repo .env and ~/.archon/.env will now see ~/.archon/.env win (override: true).
  • Database migration needed? No
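
The precedence change can be shown with a small worked example. This is a sketch that models dotenv's `override` option with a pure function rather than calling dotenv itself. `applyEnvFile` and the sample values are hypothetical, and treating the repo .env as a non-overriding load is an assumption made for illustration.

```typescript
type Env = Record<string, string>;

// Without override, keys already present are kept; with override: true,
// values from the file replace them.
function applyEnvFile(current: Env, parsed: Env, override: boolean): Env {
  const next: Env = { ...current };
  for (const [key, value] of Object.entries(parsed)) {
    if (override || !(key in next)) {
      next[key] = value;
    }
  }
  return next;
}

// Repo .env is loaded first (dev mode), then ~/.archon/.env with override: true.
const afterRepo = applyEnvFile(
  {},
  { DATABASE_URL: "postgres://repo", LOG_LEVEL: "debug" },
  false,
);
const afterGlobal = applyEnvFile(
  afterRepo,
  { DATABASE_URL: "postgres://global" },
  true,
);
// afterGlobal.DATABASE_URL comes from the global file; LOG_LEVEL is untouched.
```

Keys that appear only in the repo .env are unaffected; only conflicting keys change value.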

Human Verification (required)

  • Dev mode: server starts, loads repo .env, all platform adapters connect
  • Binary mode: needs 0.3.4 release to verify (code-level analysis confirms fix)
  • Edge cases checked: server startup from CWD with .env, from clean /tmp dir

Side Effects / Blast Radius (required)

  • Affected subsystems: server startup, CLI startup
  • Potential unintended effects: Users who relied on CWD stripping to isolate their server from the CWD's .env will now see those vars in process.env (but NOT in subprocesses — allowlist still blocks them)
  • Guardrails: SUBPROCESS_ENV_ALLOWLIST unchanged, env-leak gate unchanged

Rollback Plan (required)

  • Fast rollback: git revert <commit> — single commit, clean revert
  • Feature flags: None needed
  • Observable failure symptoms: server crashes on startup with no_ai_credentials (same as current broken state)

Risks and Mitigations

  • Risk: Target repo env vars now persist in server's process.env (previously stripped)
    • Mitigation: They cannot reach AI subprocesses — SUBPROCESS_ENV_ALLOWLIST blocks all non-whitelisted keys, and buildSubprocessEnv() strips auth tokens when using global auth

Summary by CodeRabbit

  • Refactor

    • Unified environment loading so the global Archon config now takes precedence across CLI and server startups.
  • Security

    • Strengthened subprocess environment isolation: repository env vars remain in process.env but are blocked from reaching AI subprocesses via an allowlist/env-leak gate.
  • Bug Fixes

    • Auto-enables global Claude authentication when no explicit API credentials are present.
  • Documentation

    • Updated docs to reflect the new env precedence and subprocess isolation behavior.

…binary support

The CWD .env stripping was redundant — SUBPROCESS_ENV_ALLOWLIST already
blocks target repo credentials from reaching AI subprocesses, and the
env-leak gate scans target repos before spawning.

The stripping also broke compiled binaries: it nuked all env vars, then
tried to reload from a baked CI path that doesn't exist on user machines.

Changes:
- Remove CWD .env stripping from both server and CLI
- Server loads ~/.archon/.env with override: true (all keys, not just DATABASE_URL)
- Skip import.meta.dir .env loading in binary mode (path is frozen at build time)
- Add CLAUDE_USE_GLOBAL_AUTH defaulting to server (same as CLI)
- Fix envFile reference in no_ai_credentials error to show useful path
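
The CLAUDE_USE_GLOBAL_AUTH defaulting could look roughly like the sketch below. The credential variable names in `ASSUMED_CREDENTIAL_KEYS` are assumptions (the PR does not list which keys the server checks), `withGlobalAuthDefault` is a hypothetical helper, and leaving an explicitly set CLAUDE_USE_GLOBAL_AUTH untouched is also an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Assumed credential variable names, for illustration only.
const ASSUMED_CREDENTIAL_KEYS = ["CLAUDE_API_KEY", "CLAUDE_CODE_OAUTH_TOKEN"] as const;

// Returns a copy of env with CLAUDE_USE_GLOBAL_AUTH set to "true" when the
// flag is unset and none of the assumed credential keys has a value.
function withGlobalAuthDefault(env: Env): Env {
  if (env["CLAUDE_USE_GLOBAL_AUTH"] !== undefined) {
    return env; // assumption: an explicit setting is left as-is
  }
  const hasCredential = ASSUMED_CREDENTIAL_KEYS.some(
    (key) => (env[key] ?? "") !== "",
  );
  return hasCredential ? env : { ...env, CLAUDE_USE_GLOBAL_AUTH: "true" };
}
```

With this default in place, a user who has only signed in through global Claude auth would pass the credential check instead of hitting no_ai_credentials.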

coderabbitai Bot commented Apr 10, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1b06502a-fab3-4aec-9054-a9ec838d8054

📥 Commits

Reviewing files that changed from the base of the PR and between 4c01820 and bbdcc65.

📒 Files selected for processing (6)
  • .claude/rules/cli.md
  • packages/docs-web/src/content/docs/contributing/cli-internals.md
  • packages/docs-web/src/content/docs/reference/cli.md
  • packages/docs-web/src/content/docs/reference/configuration.md
  • packages/docs-web/src/content/docs/reference/security.md
  • packages/server/src/index.ts

Walkthrough

Removed logic that parsed and deleted CWD .env keys; standardized loading of ~/.archon/.env with override: true for both CLI and server, added conditional repo-root .env load in dev mode for the server, and introduced a CLAUDE_USE_GLOBAL_AUTH fallback when no AI credentials are present. Credential leakage is now handled via subprocess allowlist gates.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| CLI env handling: packages/cli/src/cli.ts | Deleted CWD .env parse-and-delete block; replaced with comment noting reliance on ~/.archon/.env (loaded with override: true) and downstream SUBPROCESS_ENV_ALLOWLIST / env-leak gating to prevent credential leaks. |
| Server env handling: packages/server/src/index.ts | Removed CWD .env stripping; added conditional monorepo repo-root .env load when not bundled (BUNDLED_IS_BINARY false); changed global env load to dotenv.config({ path: globalEnvPath, override: true }); set CLAUDE_USE_GLOBAL_AUTH='true' when no Claude credentials present; updated fatal logger envFile context. |
| Docs & rules: packages/docs-web/src/content/docs/..., packages/docs-web/src/content/docs/reference/..., packages/docs-web/src/content/docs/contributing/cli-internals.md, .claude/rules/cli.md | Documentation updated to remove claim of deleting repo DATABASE_URL, to state ~/.archon/.env is loaded with override: true, to describe subprocess env isolation via SUBPROCESS_ENV_ALLOWLIST, and to update best-practice guidance about repo .env use and symlink/copy instructions. |

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant CLI
    participant Server
    participant GlobalEnv as "~/.archon/.env"
    participant RepoEnv as "repo/.env (dev only)"
    participant EnvGate as "SUBPROCESS_ENV_ALLOWLIST"
    participant Subprocess as "AI subprocess (Claude/Code)"

    rect rgba(200,220,255,0.5)
    User->>CLI: run command
    CLI->>GlobalEnv: load with override:true
    CLI->>RepoEnv: (not parsed/stripped) repo env remains in process.env
    CLI->>EnvGate: enforce allowlist at subprocess spawn
    CLI->>Subprocess: spawn with filtered env
    end

    rect rgba(200,255,200,0.5)
    User->>Server: start server
    Server->>GlobalEnv: load with override:true
    Server->>RepoEnv: if BUNDLED_IS_BINARY==false -> load repo/.env
    Server->>EnvGate: ensure allowlist blocks sensitive keys for subprocesses
    Server->>Subprocess: spawn filtered AI subprocess
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through env files, nose a-sleuth,
No more yanking keys from the repo's booth.
Global dots now trump the scattered leaves,
An allowlist guards what the subprocess weaves.
Hooray — safe hops and clearer roots! 🌿

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the primary changes: removing CWD env stripping and loading ~/.archon/.env with override for binary support. |
| Description check | ✅ Passed | The description comprehensively follows the template structure with all major sections completed: Summary, UX Journey (before/after), Architecture Diagram, Change Metadata, Validation Evidence, Security Impact, Compatibility, Human Verification, Side Effects, and Rollback Plan. |


@Wirasm (Collaborator, Author) commented Apr 10, 2026

PR Review Summary

Critical Issues (0 found)

None.

Important Issues (3 found)

| Agent | Issue | Location |
| --- | --- | --- |
| silent-failure-hunter | config({ path: globalEnvPath, override: true }) result silently discarded, so parse errors are invisible. CLI checks this; server doesn't. In binary mode this is the only env source. | packages/server/src/index.ts:35 |
| silent-failure-hunter | let envPath: string \| undefined violates "TYPE SAFETY IS A CORE RULE"; it is always assigned in the branch where it's consumed | packages/server/src/index.ts:19 |
| docs + code-reviewer + comments | .claude/rules/cli.md step 1 still says "Deletes process.env.DATABASE_URL", a behavior removed by this PR | .claude/rules/cli.md:32 |

Suggestions (6 found)

| Agent | Suggestion | Location |
| --- | --- | --- |
| docs-impact | Security docs describe old CWD stripping behavior | docs-web/.../security.md:122 |
| docs-impact | Configuration docs describe old DATABASE_URL deletion | docs-web/.../configuration.md:299-313 |
| docs-impact | CLI docs startup sequence is stale | docs-web/.../cli.md:368-369 |
| docs-impact | CLI internals diagram says "Suppresses cwd .env" | docs-web/.../cli-internals.md:42 |
| comment-analyzer | Server comment uses ANTHROPIC_API_KEY as example but it's not in the allowlist; CLAUDE_API_KEY would be more accurate | packages/server/src/index.ts:8 |
| comment-analyzer | "Same logic as the CLI" is slightly inaccurate (nested vs flat if) | packages/server/src/index.ts:38-39 |

Strengths

  • Core approach is sound — removing redundant CWD stripping simplifies startup while dual-layer protection (allowlist + env-leak gate) remains untouched
  • Binary mode guard for import.meta.dir is well-reasoned with accurate comment
  • CLAUDE_USE_GLOBAL_AUTH smart default mirrors CLI behavior correctly
  • Net -17 lines — simplification, not complexity

Documentation Issues

  • 5 docs files reference the removed CWD stripping behavior (rules, docs-web security/config/cli/internals)
  • CLAUDE.md ~/.archon/ tree omits .env file (now primary config source for binaries)

Verdict

NEEDS FIXES — 1 blocking issue (silent dotenv parse error in server), 1 type safety fix, and stale docs.

Recommended Actions

  1. Fix: Check config() return in server's ~/.archon/.env load — match CLI's error handling pattern
  2. Fix: Tighten envPath typing (remove | undefined or assign in both branches)
  3. Fix: Update .claude/rules/cli.md startup sequence (directly contradicts the PR)
  4. Consider: Update docs-web pages that reference removed CWD stripping (can be a follow-up)

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/cli/src/cli.ts`:
- Around line 15-24: The CLI env-isolation test still deletes DATABASE_URL and
asserts the old behavior; update the test to stop removing DATABASE_URL and
instead verify the new contract: that globalEnvPath is loaded with config({
path: globalEnvPath, override: true }) (so ~/.archon/.env overrides CWD), and
that subprocess environment isolation relies on the SUBPROCESS_ENV_ALLOWLIST and
the env-leak gate scan before spawning (assert the allowlist is consulted and
the env-leak scanner is invoked or its result is respected); keep references to
the symbols globalEnvPath, config(... override: true), SUBPROCESS_ENV_ALLOWLIST
and the env-leak gate when adding assertions.

In `@packages/server/src/index.ts`:
- Around line 34-36: The current branch calling existsSync(globalEnvPath) then
config({ path: globalEnvPath, override: true }) ignores any errors from
dotenv.config; update the startup to check the result of config(...) (or catch
thrown errors) when globalEnvPath exists and, on parse failure, log the specific
error and abort startup (throw or process.exit(1)) so malformed ~/.archon/.env
is surfaced immediately; locate the call to config in
packages/server/src/index.ts and add explicit error handling around the
config(...) call referencing globalEnvPath, config, and existsSync.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2a3109b7-22a3-4015-9943-e422ee99f03a

📥 Commits

Reviewing files that changed from the base of the PR and between 0d5ec66 and 4c01820.

📒 Files selected for processing (2)
  • packages/cli/src/cli.ts
  • packages/server/src/index.ts

Comment thread packages/cli/src/cli.ts
Comment on lines +15 to 24
// Load .env from global Archon config (override: true so ~/.archon/.env
// always wins over any Bun-auto-loaded CWD vars).
//
// Credential safety: target repo .env keys that Bun auto-loads from CWD
// cannot leak into AI subprocesses — SUBPROCESS_ENV_ALLOWLIST blocks them.
// The env-leak gate provides a second layer by scanning target repos before
// spawning. No CWD stripping needed.
const globalEnvPath = resolve(process.env.HOME ?? '~', '.archon', '.env');
if (existsSync(globalEnvPath)) {
  const result = config({ path: globalEnvPath, override: true });
⚠️ Potential issue | 🟡 Minor

Update the CLI env-isolation test to match the new contract.

packages/cli/src/cli.test.ts:215-250 still describes the removed DATABASE_URL deletion behavior and manually does the delete itself, so it can keep passing without verifying anything in cli.ts. This change should be paired with test coverage for the new behavior instead: global ~/.archon/.env override semantics and reliance on the subprocess allowlist/env-leak gate.


Comment on lines 34 to +36
 if (existsSync(globalEnvPath)) {
-  const globalResult = config({ path: globalEnvPath, processEnv: {} });
-  if (globalResult.parsed?.DATABASE_URL) {
-    process.env.DATABASE_URL = globalResult.parsed.DATABASE_URL;
-  }
+  config({ path: globalEnvPath, override: true });
 }
⚠️ Potential issue | 🟠 Major

Handle ~/.archon/.env parse failures explicitly.

In bundled mode this file is now the only .env source, but this branch ignores dotenv.config() errors. A malformed global env file will fall through and later fail as no_ai_credentials or other config issues instead of surfacing the real startup problem. As per coding guidelines, "Prefer throwing early with a clear error for unsupported/unsafe states - never silently swallow errors or broaden permissions (Fail Fast + Explicit Errors)".

Suggested fix
 const globalEnvPath = resolve(process.env.HOME ?? '~', '.archon', '.env');
 if (existsSync(globalEnvPath)) {
-  config({ path: globalEnvPath, override: true });
+  const globalEnvResult = config({ path: globalEnvPath, override: true });
+  if (globalEnvResult.error) {
+    console.error(
+      `Failed to load .env from ${globalEnvPath}: ${globalEnvResult.error.message}`
+    );
+    process.exit(1);
+  }
 }

@Wirasm Wirasm merged commit 3519710 into dev Apr 10, 2026
3 of 4 checks passed
@Wirasm Wirasm mentioned this pull request Apr 10, 2026
@Wirasm Wirasm deleted the fix/binary-env-loading branch April 10, 2026 13:28
Tyone88 pushed a commit to Tyone88/Archon that referenced this pull request Apr 16, 2026
fix: remove CWD env stripping, load ~/.archon/.env with override for binary support
joaobmonteiro pushed a commit to joaobmonteiro/Archon that referenced this pull request Apr 26, 2026
fix: remove CWD env stripping, load ~/.archon/.env with override for binary support