diff --git a/CHANGELOG.md b/CHANGELOG.md
index 02776a0ca4..9239f9f1da 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,39 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [0.3.0] - 2026-04-08
+
+Env-leak gate hardening, SSE reliability fixes, smarter merge detection for isolation cleanup, build/version improvements, and deploy hardening.
+
+### Added
+
+- **Env-leak gate (target repo `.env` keys)**: scan the auto-loaded `.env` files for 7 sensitive keys (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, etc.) and refuse to register or spawn into a codebase whose `.env` would silently re-inject keys into Claude/Codex subprocesses. Default is fail-closed (`allow_env_keys = false`). Includes a per-codebase consent column, registration gate, pre-spawn check in both Claude and Codex clients, and a 422 API error with web UI checkbox (#1036).
+- **CLI `--allow-env-keys` flag** for `archon workflow run` — grant env-leak-gate consent during auto-registration without needing the Web UI. Audit-logged as `env_leak_consent_granted` with `actor: 'user-cli'` (#973, #983).
+- **Global `allow_target_repo_keys` flag** in `~/.archon/config.yaml` — bypass the env-leak gate for all codebases on this machine. Per-repo `.archon/config.yaml` `allow_target_repo_keys: false` re-enables the gate for that repo. The server emits `env_leak_gate_disabled` once per process per source the first time `loadConfig` resolves the bypass as active (#973, #983).
+- **`PATCH /api/codebases/:id`** endpoint to flip `allow_env_keys` on existing codebases without delete/re-add. Audit-logged at `warn` level on every grant and revoke, including a `scanStatus` field that distinguishes "scanned" from "scan failed" so audit reviewers can tell empty key lists apart (#973, #983).
+- **Settings → Projects per-row toggle** to grant or revoke env-key consent retroactively, with an "env keys allowed" badge and inline error feedback if the PATCH fails (#973, #983).
+- **Startup env-leak scan**: when `allow_target_repo_keys` is not set, the server emits one `startup_env_leak_gate_will_block` warn per registered codebase whose `.env` would block the next spawn. Skipped entirely when the global bypass is active (#973, #983).
+- **Squash-merge and PR-merge detection** for `isolation cleanup --merged`. Unions three signals (ancestry via `git branch --merged`, patch equivalence via `git cherry`, and PR state via `gh`) to safely clean up worktrees whose branches were squash-merged. Adds `--include-closed` flag to also remove worktrees whose PRs were closed without merging (#1027).
+- **Git commit hash in `archon version`** output. Read at runtime via `git rev-parse` in dev or from a build-time constant in compiled binaries; falls back to `unknown` (#1035).
+
+### Changed
+
+- **Env-leak gate error messages** are now context-aware: separate remediation copy for Web Add-Project, CLI auto-register, and pre-spawn-of-existing-codebase paths. Previously every error pointed at the Web UI checkbox even from the CLI (#973, #983).
+- **SSE event buffer TTL** raised from 3s to 60s and capacity from 50 to 500 events, fixing dropped `tool_result` events during the 5s reconnect grace window that left tool cards perpetually spinning. The cleanup timer now resets on each new event, so the buffer is held for TTL past the most recent event, not the first one. Buffer overflow and TTL expiration now log at `warn` level for observability (#1037).
+- **Binary build detection** moved from runtime env sniffing (`import.meta.dir` / `process.execPath`) to a build-time `BUNDLED_IS_BINARY` constant in `@archon/paths`. Logger uses `pino-pretty` as a destination stream on the main thread instead of a worker-thread transport, eliminating the `require.resolve('pino-pretty')` lookup that crashed inside Bun's `$bunfs` virtual filesystem in compiled binaries. The same code path runs in dev and binaries — no environment detection (#982).
+- **Cloud-init deployment script** hardened: dedicated `archon` user (docker group, no sudo) with SSH keys copied from the default cloud user, 2GB swapfile to prevent OOM during docker build on small VPSes, `ufw allow 443/tcp` and `443/udp` for HTTP/3 QUIC, fail-fast on network errors, and clearer setup-complete messaging (#981).
+
+### Fixed
+
+- **Env-leak gate worktree path lookup**: the pre-spawn consent check now falls back to `findCodebaseByPathPrefix()` when the exact path lookup misses, so workflow runs in `.../worktrees/feature-branch` correctly inherit consent from the source codebase (#1036).
+- **`EnvLeakError` FATAL classification** in the workflow executor now checks `error.name === 'EnvLeakError'` directly instead of pattern-matching the message, making it immune to message rewording (#1036).
+- **Scanner unreadable-file handling**: distinguishes `ENOENT` (skip) from `EACCES` and other errors so unreadable `.env` files surface as findings instead of silently bypassing the gate (#1036).
+
+### Security
+
+- The default `allow_env_keys` per codebase is `false` (fail-closed). Codebases with sensitive keys in their auto-loaded `.env` files (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, etc.) are blocked at the next workflow run. **Remediation paths** (any one): (1) remove the key from `.env`, (2) rename to `.env.secrets`, (3) toggle "Allow env keys" in Settings → Projects, (4) `archon workflow run --allow-env-keys ...`, (5) set `allow_target_repo_keys: true` in `~/.archon/config.yaml`. See `docs/reference/security.md` for full details (#1036, #973, #983).
+
+
 ## [0.2.12] - 2026-03-20
 
 Chat-first navigation redesign, DAG graph viewer, per-node MCP and skills, and extensive bug fixes across the web UI and workflow engine.
diff --git a/CLAUDE.md b/CLAUDE.md
index 9132a1a82a..fa00b0fb04 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -198,6 +198,10 @@ bun run cli workflow run implement --branch feature-auth "Add auth"
 # Opt out of isolation (run in live checkout)
 bun run cli workflow run quick-fix --no-worktree "Fix typo"
 
+# Grant env-leak-gate consent during auto-registration (for repos whose .env
+# contains sensitive keys). Audit-logged with actor: 'user-cli'.
+bun run cli workflow run plan --cwd /path/to/leaky/repo --allow-env-keys "..."
+
 # Show running workflows
 bun run cli workflow status
 
@@ -224,6 +228,9 @@ bun run cli isolation cleanup 14  # Custom days
 # Clean up environments with branches merged into main (also deletes remote branches)
 bun run cli isolation cleanup --merged
 
+# Also remove environments with closed (abandoned) PRs
+bun run cli isolation cleanup --merged --include-closed
+
 # Validate workflow definitions and their referenced resources
 bun run cli validate workflows              # All workflows
 bun run cli validate workflows my-workflow  # Single workflow
@@ -740,6 +747,12 @@ Pattern: Use `classifyIsolationError()` (from `@archon/isolation`) to map git er
 - `POST /api/workflows/runs/{runId}/abandon` - Abandon a non-terminal run (marks as cancelled)
 - `DELETE /api/workflows/runs/{runId}` - Delete a terminal workflow run and its events
 
+**Codebases:**
+- `GET /api/codebases` / `GET /api/codebases/:id` - List / fetch codebases
+- `POST /api/codebases` - Register a codebase (clone or local path); body accepts `allowEnvKeys` for the env-leak gate
+- `PATCH /api/codebases/:id` - Flip the `allow_env_keys` consent bit; body: `{ allowEnvKeys: boolean }`. Audit-logged at `warn` level on every grant/revoke (`env_leak_consent_granted` / `env_leak_consent_revoked`) with `codebaseId`, `path`, `files`, `keys`, `scanStatus`, `actor`
+- `DELETE /api/codebases/:id` - Delete a codebase and clean up resources
+
 **Artifact Files:**
 - `GET /api/artifacts/:runId/*` - Serve a workflow artifact file by run ID and relative path; returns `text/markdown` for `.md` files, `text/plain` otherwise; 400 on path traversal (`..`), 404 if run or file not found
 
diff --git a/bun.lock b/bun.lock
index d477f58e72..d7cbe31a1b 100644
--- a/bun.lock
+++ b/bun.lock
@@ -41,7 +41,7 @@
     },
     "packages/cli": {
       "name": "@archon/cli",
-      "version": "0.2.0",
+      "version": "0.2.13",
       "bin": {
         "archon": "./src/cli.ts",
       },
@@ -114,8 +114,6 @@
       "version": "0.2.0",
       "dependencies": {
         "pino": "^9",
-      },
-      "optionalDependencies": {
         "pino-pretty": "^13",
       },
       "peerDependencies": {
diff --git a/deploy/cloud-init.yml b/deploy/cloud-init.yml
index 9a689e98ad..9056665176 100644
--- a/deploy/cloud-init.yml
+++ b/deploy/cloud-init.yml
@@ -6,20 +6,30 @@
 #
 # Paste this into your VPS provider's "User Data" field when creating a server.
 # Tested on: Ubuntu 22.04+, Debian 12+
+# Works with any cloud-init compatible provider (DigitalOcean, Hetzner, Linode,
+# Vultr, AWS EC2, Hostinger, etc.)
 #
 # What this does:
 #   1. Installs Docker + Docker Compose plugin
 #   2. Opens firewall ports (SSH, HTTP, HTTPS)
-#   3. Clones the repo to /opt/archon
-#   4. Prepares .env and Caddyfile from examples
-#   5. Builds the Docker image (~5 min)
+#   3. Creates a 2GB swapfile (helps small VPS builds avoid OOM)
+#   4. Clones the repo to /opt/archon
+#   5. Prepares .env and Caddyfile from examples
+#   6. Creates a dedicated 'archon' user (docker group only, no sudo)
+#   7. Builds the Docker image (~5 min) as the archon user
 #
-# After the server boots (~5-8 min), SSH in and:
+# Note: On a VPS with <2GB RAM, the docker build step can OOM without swap.
+# Note: The 'archon' user has docker access but NOT sudo. For administrative
+#       tasks (updates, reboots), use the default cloud user or root.
+#
+# After the server boots (~5-8 min), SSH in as the archon user:
+#   ssh archon@your-server-ip
 #   1. Edit /opt/archon/.env — set your AI credentials, DOMAIN, DATABASE_URL
 #   2. cd /opt/archon && docker compose --profile with-db --profile cloud up -d
 #   3. Open https://your-domain.com
 #
 # IMPORTANT: Before starting, point your domain's DNS A record to this server's IP.
+# SSH keys from the default cloud user are copied to 'archon'.
 #
 
 package_update: true
@@ -30,29 +40,66 @@ packages:
   - git
   - ufw
 
+users:
+  - default
+  - name: archon
+    gecos: Archon Service User
+    shell: /bin/bash
+    lock_passwd: true
+
 runcmd:
+  # --- Swap (helps small VPS avoid OOM during docker build) ---
+  - |
+    if [ ! -f /swapfile ]; then
+      fallocate -l 2G /swapfile || dd if=/dev/zero of=/swapfile bs=1M count=2048
+      chmod 600 /swapfile
+      mkswap /swapfile
+      swapon /swapfile
+      echo '/swapfile none swap sw 0 0' >> /etc/fstab
+    fi
+
   # --- Docker ---
   - curl -fsSL https://get.docker.com | sh
-  - systemctl enable docker
-  - systemctl start docker
+  - usermod -aG docker archon
 
-  # --- Firewall ---
+  # --- Copy SSH keys from default user to archon (so login works immediately) ---
+  - |
+    DEFAULT_USER=$(getent passwd 1000 | cut -d: -f1)
+    if [ -n "$DEFAULT_USER" ] && [ -f /home/$DEFAULT_USER/.ssh/authorized_keys ]; then
+      mkdir -p /home/archon/.ssh
+      cp /home/$DEFAULT_USER/.ssh/authorized_keys /home/archon/.ssh/authorized_keys
+      chmod 700 /home/archon/.ssh
+      chmod 600 /home/archon/.ssh/authorized_keys
+      chown -R archon:archon /home/archon/.ssh
+    elif [ -f /root/.ssh/authorized_keys ]; then
+      mkdir -p /home/archon/.ssh
+      cp /root/.ssh/authorized_keys /home/archon/.ssh/authorized_keys
+      chmod 700 /home/archon/.ssh
+      chmod 600 /home/archon/.ssh/authorized_keys
+      chown -R archon:archon /home/archon/.ssh
+    fi
+
+  # --- Firewall (443/udp needed for HTTP/3 QUIC via Caddy) ---
   - ufw allow 22/tcp
   - ufw allow 80/tcp
-  - ufw allow 443
+  - ufw allow 443/tcp
+  - ufw allow 443/udp
   - ufw --force enable
 
-  # --- Clone and configure ---
-  - git clone https://github.com/coleam00/Archon.git /opt/archon
-  - cp /opt/archon/.env.example /opt/archon/.env
-  - cp /opt/archon/Caddyfile.example /opt/archon/Caddyfile
+  # --- Clone and configure (fail fast — single shell so set -e applies) ---
+  - |
+    set -e
+    git clone https://github.com/coleam00/Archon.git /opt/archon
+    cp /opt/archon/.env.example /opt/archon/.env
+    cp /opt/archon/Caddyfile.example /opt/archon/Caddyfile
+    chown -R archon:archon /opt/archon
 
-  # --- Pre-pull external images ---
-  - docker pull postgres:17-alpine
-  - docker pull caddy:2-alpine
+  # --- Pre-pull external images (as archon, via docker group) ---
+  - sudo -u archon docker pull postgres:17-alpine
+  - sudo -u archon docker pull caddy:2-alpine
 
-  # --- Build the app image ---
-  - cd /opt/archon && docker compose build
+  # --- Build the app image as archon ---
+  - sudo -u archon -H bash -c 'cd /opt/archon && docker compose build'
 
   # --- Signal completion ---
   - |
@@ -61,6 +108,13 @@ runcmd:
       Archon server setup complete!
       ============================================
 
+      Log in as the 'archon' user (not root):
+        ssh archon@your-server-ip
+
+      Note: the 'archon' user has docker access but no sudo. For system
+      maintenance (apt upgrade, reboots), log in as the default cloud user
+      or root.
+
       Next steps:
 
       1. Edit credentials and domain:
@@ -85,7 +139,7 @@ runcmd:
 
       Logs:    docker compose logs -f
       Health:  curl https://your-domain.com/api/health
-      Docs:    https://github.com/coleam00/Archon/blob/main/docs/docker.md
+      Docs:    https://archon.diy/deployment/docker/
 
       ============================================
       DONE
   - echo "[archon] Setup complete. Edit /opt/archon/.env and run docker compose up."
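
A quick post-boot smoke test for the hardened cloud-init above. This is a minimal sketch, not part of the patch; it assumes only tools the script itself installs or relies on (`cloud-init`, `swapon`, Docker) and reuses the `your-server-ip` placeholder from the file's own comments:

```bash
# From your workstation: log in as the service user created by cloud-init.
ssh archon@your-server-ip

# On the server: block until runcmd (including the docker build) has finished.
cloud-init status --wait

# Confirm the 2GB swapfile is active (the OOM guard for small VPSes).
swapon --show

# Confirm the docker group grant took effect; no sudo should be needed.
docker info >/dev/null && echo "docker OK"
```

Firewall checks (`ufw status`) are deliberately left to the default cloud user or root, since the `archon` user has no sudo.
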
diff --git a/migrations/000_combined.sql b/migrations/000_combined.sql
index 5ded648153..176963b40e 100644
--- a/migrations/000_combined.sql
+++ b/migrations/000_combined.sql
@@ -30,6 +30,7 @@ CREATE TABLE IF NOT EXISTS remote_agent_codebases (
   repository_url VARCHAR(500),
   default_cwd VARCHAR(500) NOT NULL,
   ai_assistant_type VARCHAR(20) DEFAULT 'claude',
+  allow_env_keys BOOLEAN NOT NULL DEFAULT FALSE,
   commands JSONB DEFAULT '{}'::jsonb,
   created_at TIMESTAMP DEFAULT NOW(),
   updated_at TIMESTAMP DEFAULT NOW()
@@ -307,3 +308,7 @@ ALTER TABLE remote_agent_conversations
 
 -- From migration 016: ended_reason on sessions
 ALTER TABLE remote_agent_sessions ADD COLUMN IF NOT EXISTS ended_reason TEXT;
+
+-- From migration 021: allow_env_keys on codebases
+ALTER TABLE remote_agent_codebases
+  ADD COLUMN IF NOT EXISTS allow_env_keys BOOLEAN NOT NULL DEFAULT FALSE;
diff --git a/migrations/021_add_allow_env_keys_to_codebases.sql b/migrations/021_add_allow_env_keys_to_codebases.sql
new file mode 100644
index 0000000000..a1f30d2eca
--- /dev/null
+++ b/migrations/021_add_allow_env_keys_to_codebases.sql
@@ -0,0 +1,4 @@
+-- Add per-codebase consent bit for subprocess .env key leakage
+-- DEFAULT FALSE = safe by default; user must explicitly opt in
+ALTER TABLE remote_agent_codebases
+  ADD COLUMN allow_env_keys BOOLEAN NOT NULL DEFAULT FALSE;
diff --git a/package.json b/package.json
index be30a22289..1e5d61ac30 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "archon",
-  "version": "0.2.13",
+  "version": "0.3.0",
   "private": true,
   "workspaces": [
     "packages/*"
diff --git a/packages/cli/src/cli.ts b/packages/cli/src/cli.ts
index a25ece0c0e..8852dbc657 100755
--- a/packages/cli/src/cli.ts
+++ b/packages/cli/src/cli.ts
@@ -127,6 +127,9 @@ Options:
   --json              Output machine-readable JSON (for workflow list)
   --workflow          Workflow to run for 'continue' (default: archon-assist)
   --no-context        Skip context injection for 'continue'
+  --allow-env-keys    Grant env-key consent during auto-registration
+                      (bypasses the env-leak gate for this codebase;
+                      logs an audit entry)
 
 Examples:
   archon chat "What does the orchestrator do?"
@@ -190,6 +193,7 @@ async function main(): Promise<void> {
       reason: { type: 'string' },
       workflow: { type: 'string' },
       'no-context': { type: 'boolean' },
+      'allow-env-keys': { type: 'boolean' },
     },
     allowPositionals: true,
     strict: false, // Allow unknown flags to pass through
@@ -211,6 +215,7 @@ async function main(): Promise<void> {
   const resumeFlag = values.resume as boolean | undefined;
   const spawnFlag = values.spawn as boolean | undefined;
   const jsonFlag = values.json as boolean | undefined;
+  const allowEnvKeysFlag = values['allow-env-keys'] as boolean | undefined;
 
   // Handle help flag
   if (values.help) {
@@ -323,6 +328,7 @@ async function main(): Promise<void> {
       fromBranch,
       noWorktree,
       resume: resumeFlag,
+      allowEnvKeys: allowEnvKeysFlag,
       quiet: values.quiet as boolean | undefined,
       verbose: values.verbose as boolean | undefined,
     };
@@ -459,7 +465,8 @@
       // Check for --merged flag in remaining args
       const mergedFlag = args.includes('--merged') || positionals.includes('--merged');
       if (mergedFlag) {
-        await isolationCleanupMergedCommand();
+        const includeClosed = args.includes('--include-closed');
+        await isolationCleanupMergedCommand({ includeClosed });
       } else {
        const days = parseInt(positionals[2] ?? '7', 10);
        await isolationCleanupCommand(days);
diff --git a/packages/cli/src/commands/bundled-version.ts b/packages/cli/src/commands/bundled-version.ts
deleted file mode 100644
index ac0b0b49a0..0000000000
--- a/packages/cli/src/commands/bundled-version.ts
+++ /dev/null
@@ -1,10 +0,0 @@
-/**
- * Bundled version for compiled binaries
- *
- * This file is updated by scripts/build-binaries.sh before compilation.
- * The version is read from package.json at build time and embedded here.
- *
- * For development, the version command reads directly from package.json instead.
- */
-
-export const BUNDLED_VERSION = '0.2.0';
diff --git a/packages/cli/src/commands/isolation.test.ts b/packages/cli/src/commands/isolation.test.ts
index cb81648dfc..81ca60651e 100644
--- a/packages/cli/src/commands/isolation.test.ts
+++ b/packages/cli/src/commands/isolation.test.ts
@@ -2,7 +2,7 @@
  * Tests for isolation complete command
  */
 import { describe, it, expect, beforeEach, afterEach, mock, spyOn } from 'bun:test';
-import { isolationCompleteCommand } from './isolation';
+import { isolationCompleteCommand, isolationCleanupMergedCommand } from './isolation';
 
 const mockLogger = {
   fatal: mock(() => undefined),
@@ -44,6 +44,27 @@ mock.module('@archon/core/services/cleanup-service', () => ({
   cleanupMergedWorktrees: mockCleanupMergedWorktrees,
 }));
 
+const mockListEnvironments = mock(() =>
+  Promise.resolve({
+    codebases: [
+      {
+        codebaseId: 'cb-1',
+        defaultCwd: '/test/repo',
+        repositoryUrl: 'https://github.com/owner/repo',
+        environments: [],
+      },
+    ],
+    totalEnvironments: 0,
+    ghostsReconciled: 0,
+  })
+);
+const mockCleanupMergedEnvironments = mock(() => Promise.resolve({ removed: [], skipped: [] }));
+
+mock.module('@archon/core/operations/isolation-operations', () => ({
+  listEnvironments: mockListEnvironments,
+  cleanupMergedEnvironments: mockCleanupMergedEnvironments,
+}));
+
 const mockHasUncommittedChanges = mock(() => Promise.resolve(false));
 // Default: gh returns empty PR array, git log returns empty string (no commits to report)
 const mockExecFileAsync = mock((cmd: string) =>
@@ -358,3 +379,32 @@ describe('isolationCompleteCommand', () => {
     expect(consoleLogSpy).toHaveBeenCalledWith('\nComplete: 1 completed, 1 failed, 1 not found');
   });
 });
+
+describe('isolationCleanupMergedCommand', () => {
+  let consoleLogSpy: ReturnType<typeof spyOn>;
+  let consoleErrorSpy: ReturnType<typeof spyOn>;
+
+  beforeEach(() => {
+    consoleLogSpy = spyOn(console, 'log').mockImplementation(() => {});
+    consoleErrorSpy = spyOn(console, 'error').mockImplementation(() => {});
+    mockCleanupMergedEnvironments.mockReset();
+    mockCleanupMergedEnvironments.mockResolvedValue({ removed: [], skipped: [] });
+  });
+
+  afterEach(() => {
+    consoleLogSpy.mockRestore();
+    consoleErrorSpy.mockRestore();
+  });
+
+  it('passes includeClosed=true when --include-closed flag is set', async () => {
+    await isolationCleanupMergedCommand({ includeClosed: true });
+    expect(mockCleanupMergedEnvironments).toHaveBeenCalledWith('cb-1', '/test/repo', {
+      includeClosed: true,
+    });
+  });
+
+  it('defaults to includeClosed=false', async () => {
+    await isolationCleanupMergedCommand();
+    expect(mockCleanupMergedEnvironments).toHaveBeenCalledWith('cb-1', '/test/repo', {});
+  });
+});
diff --git a/packages/cli/src/commands/isolation.ts b/packages/cli/src/commands/isolation.ts
index 69749da80b..6e44a0fb67 100644
--- a/packages/cli/src/commands/isolation.ts
+++ b/packages/cli/src/commands/isolation.ts
@@ -124,7 +124,9 @@ export async function isolationCleanupCommand(daysStale = 7): Promise<void> {
  * Cleanup merged isolation environments (branches merged into main)
  * Also deletes remote branches for merged environments
  */
-export async function isolationCleanupMergedCommand(): Promise<void> {
+export async function isolationCleanupMergedCommand(
+  options: { includeClosed?: boolean } = {}
+): Promise<void> {
   console.log('Finding environments with branches merged into main...');
 
   const { codebases } = await listEnvironments();
@@ -141,7 +143,11 @@ export async function isolationCleanupMergedCommand(): Promise<void> {
     try {
       console.log(`\nChecking ${codebase.repositoryUrl ?? codebase.defaultCwd}...`);
 
-      const result = await cleanupMergedEnvironments(codebase.codebaseId, codebase.defaultCwd);
+      const result = await cleanupMergedEnvironments(
+        codebase.codebaseId,
+        codebase.defaultCwd,
+        options
+      );
 
       for (const branch of result.removed) {
         console.log(`  Cleaned: ${branch}`);
diff --git a/packages/cli/src/commands/version.test.ts b/packages/cli/src/commands/version.test.ts
index 19d1635f43..765b732751 100644
--- a/packages/cli/src/commands/version.test.ts
+++ b/packages/cli/src/commands/version.test.ts
@@ -2,24 +2,28 @@
  * Tests for version command
  */
 import { describe, it, expect, beforeEach, afterEach, spyOn } from 'bun:test';
+import * as git from '@archon/git';
 import { versionCommand } from './version';
 
 describe('versionCommand', () => {
   let consoleSpy: ReturnType<typeof spyOn>;
+  let execSpy: ReturnType<typeof spyOn>;
 
   beforeEach(() => {
     consoleSpy = spyOn(console, 'log').mockImplementation(() => {});
+    execSpy = spyOn(git, 'execFileAsync').mockResolvedValue({ stdout: 'abc1234\n', stderr: '' });
   });
 
   afterEach(() => {
     consoleSpy.mockRestore();
+    execSpy.mockRestore();
   });
 
   it('should output version and system info', async () => {
     await versionCommand();
 
-    // Should have called console.log 4 times (version, platform, build, database)
-    expect(consoleSpy).toHaveBeenCalledTimes(4);
+    // Should have called console.log 5 times (version, platform, build, database, git commit)
+    expect(consoleSpy).toHaveBeenCalledTimes(5);
 
     // First call should contain "Archon CLI" and version
     const firstCall = consoleSpy.mock.calls[0][0] as string;
@@ -37,6 +41,20 @@ describe('versionCommand', () => {
     // Fourth call should contain database type
     const fourthCall = consoleSpy.mock.calls[3][0] as string;
     expect(fourthCall).toContain('Database:');
+
+    // Fifth call should contain git commit with the mocked SHA
+    const fifthCall = consoleSpy.mock.calls[4][0] as string;
+    expect(fifthCall).toMatch(/Git commit: ([0-9a-f]{7,}|unknown)/);
+    expect(fifthCall).toBe('  Git commit: abc1234');
+  });
+
+  it('should return unknown git commit when git is unavailable', async () => {
+    execSpy.mockRejectedValueOnce(new Error('not a git repository'));
+
+    await versionCommand();
+
+    const fifthCall = consoleSpy.mock.calls[4][0] as string;
+    expect(fifthCall).toBe('  Git commit: unknown');
   });
 
   it('should output correct format for version line', async () => {
diff --git a/packages/cli/src/commands/version.ts b/packages/cli/src/commands/version.ts
index 25762f7161..589a4bcd93 100644
--- a/packages/cli/src/commands/version.ts
+++ b/packages/cli/src/commands/version.ts
@@ -1,15 +1,23 @@
 /**
  * Version command - displays version info
  *
- * For compiled binaries, version is embedded via bundled-version.ts
- * For development (Bun), reads from package.json
+ * For compiled binaries, version and git commit are embedded via `@archon/paths`
+ * build-time constants (rewritten by `scripts/build-binaries.sh`).
+ * For development (Bun), reads from package.json and retrieves git commit at runtime.
  */
 import { readFile } from 'fs/promises';
 import { join, dirname } from 'path';
 import { fileURLToPath } from 'url';
+import { execFileAsync } from '@archon/git';
+import {
+  BUNDLED_GIT_COMMIT,
+  BUNDLED_IS_BINARY,
+  BUNDLED_VERSION,
+  createLogger,
+} from '@archon/paths';
 import { getDatabaseType } from '@archon/core';
-import { isBinaryBuild } from '@archon/workflows/defaults';
-import { BUNDLED_VERSION } from './bundled-version';
+
+const log = createLogger('cli:version');
 
 const SCRIPT_DIR = dirname(fileURLToPath(import.meta.url));
 
@@ -47,25 +55,46 @@ async function getDevVersion(): Promise<{ name: string; version: string }> {
   return { name: pkg.name, version: pkg.version };
 }
 
+/**
+ * Get the git commit hash at runtime (dev mode).
+ * Returns 'unknown' if git is unavailable or the command fails.
+ */
+async function getDevGitCommit(): Promise<string> {
+  try {
+    const { stdout } = await execFileAsync('git', ['rev-parse', '--short', 'HEAD'], {
+      timeout: 5000,
+    });
+    return stdout.trim();
+  } catch (err) {
+    // Non-blocking: git may not be installed or cwd may not be a git repo
+    log.debug({ err }, 'version.git_commit_lookup_failed');
+    return 'unknown';
+  }
+}
+
 export async function versionCommand(): Promise<void> {
   let version: string;
+  let gitCommit: string;
 
-  if (isBinaryBuild()) {
-    // Compiled binary: use embedded version
+  if (BUNDLED_IS_BINARY) {
+    // Compiled binary: use embedded version and commit
     version = BUNDLED_VERSION;
+    gitCommit = BUNDLED_GIT_COMMIT;
   } else {
-    // Development mode: read from package.json
+    // Development mode: read from package.json and git
     const devInfo = await getDevVersion();
     version = devInfo.version;
+    gitCommit = await getDevGitCommit();
   }
 
   const platform = process.platform;
   const arch = process.arch;
   const dbType = getDatabaseType();
-  const buildType = isBinaryBuild() ? 'binary' : 'source (bun)';
+  const buildType = BUNDLED_IS_BINARY ? 'binary' : 'source (bun)';
 
   console.log(`Archon CLI v${version}`);
   console.log(`  Platform: ${platform}-${arch}`);
   console.log(`  Build: ${buildType}`);
   console.log(`  Database: ${dbType}`);
+  console.log(`  Git commit: ${gitCommit}`);
 }
diff --git a/packages/cli/src/commands/workflow.ts b/packages/cli/src/commands/workflow.ts
index 61987521d9..99edcb0bfe 100644
--- a/packages/cli/src/commands/workflow.ts
+++ b/packages/cli/src/commands/workflow.ts
@@ -62,6 +62,8 @@ export interface WorkflowRunOptions {
   noWorktree?: boolean;
   resume?: boolean;
   codebaseId?: string; // Passed by resume/approve to skip path-based lookup
+  /** When true, skip the env-leak-gate during auto-registration. */
+  allowEnvKeys?: boolean;
   quiet?: boolean;
   verbose?: boolean;
 }
@@ -321,7 +323,7 @@ export async function workflowRunCommand(
       const repoRoot = await git.findRepoRoot(cwd);
       if (repoRoot) {
         try {
-          const result = await registerRepository(repoRoot);
+          const result = await registerRepository(repoRoot, options.allowEnvKeys, 'register-cli');
           codebase = await codebaseDb.getCodebase(result.codebaseId);
           if (!result.alreadyExisted) {
             getLog().info({ name: result.name }, 'cli.codebase_auto_registered');
diff --git a/packages/core/package.json b/packages/core/package.json
index 0c5f6b7810..44adcc4a7a 100644
--- a/packages/core/package.json
+++ b/packages/core/package.json
@@ -23,7 +23,7 @@
     "./state/*": "./src/state/*.ts"
   },
   "scripts": {
-    "test": "bun test src/clients/ && bun test src/handlers/command-handler.test.ts && bun test src/handlers/clone.test.ts && bun test src/db/adapters/postgres.test.ts && bun test src/db/adapters/sqlite.test.ts src/db/codebases.test.ts src/db/connection.test.ts src/db/conversations.test.ts src/db/env-vars.test.ts src/db/isolation-environments.test.ts src/db/messages.test.ts src/db/sessions.test.ts src/db/workflow-events.test.ts src/db/workflows.test.ts src/utils/defaults-copy.test.ts src/utils/worktree-sync.test.ts src/utils/conversation-lock.test.ts src/utils/credential-sanitizer.test.ts src/utils/port-allocation.test.ts src/utils/error.test.ts src/utils/error-formatter.test.ts src/utils/github-graphql.test.ts src/utils/env-allowlist.test.ts src/config/ src/state/ && bun test src/utils/path-validation.test.ts && bun test src/services/cleanup-service.test.ts && bun test src/services/title-generator.test.ts && bun test src/workflows/ && bun test src/operations/workflow-operations.test.ts && bun test src/operations/isolation-operations.test.ts && bun test src/orchestrator/orchestrator.test.ts && bun test src/orchestrator/orchestrator-agent.test.ts && bun test src/orchestrator/orchestrator-isolation.test.ts",
+    "test": "bun test src/clients/ && bun test src/handlers/command-handler.test.ts && bun test src/handlers/clone.test.ts && bun test src/db/adapters/postgres.test.ts && bun test src/db/adapters/sqlite.test.ts src/db/codebases.test.ts src/db/connection.test.ts src/db/conversations.test.ts src/db/env-vars.test.ts src/db/isolation-environments.test.ts src/db/messages.test.ts src/db/sessions.test.ts src/db/workflow-events.test.ts src/db/workflows.test.ts src/utils/defaults-copy.test.ts src/utils/worktree-sync.test.ts src/utils/conversation-lock.test.ts src/utils/credential-sanitizer.test.ts src/utils/port-allocation.test.ts src/utils/error.test.ts src/utils/error-formatter.test.ts src/utils/github-graphql.test.ts src/utils/env-allowlist.test.ts src/utils/env-leak-scanner.test.ts src/config/ src/state/ && bun test src/utils/path-validation.test.ts && bun test src/services/cleanup-service.test.ts && bun test src/services/title-generator.test.ts && bun test src/workflows/ && bun test src/operations/workflow-operations.test.ts && bun test src/operations/isolation-operations.test.ts && bun test src/orchestrator/orchestrator.test.ts && bun test src/orchestrator/orchestrator-agent.test.ts && bun test src/orchestrator/orchestrator-isolation.test.ts",
     "type-check": "bun x tsc --noEmit",
     "build": "echo 'No build needed - Bun runs TypeScript directly'"
   },
diff --git a/packages/core/src/clients/claude.test.ts b/packages/core/src/clients/claude.test.ts
index 41a9fbbd0a..fd99746e03 100644
--- a/packages/core/src/clients/claude.test.ts
+++ b/packages/core/src/clients/claude.test.ts
@@ -1,4 +1,4 @@
-import { describe, test, expect, mock, beforeEach, spyOn } from 'bun:test';
+import { describe, test, expect, mock, beforeEach, afterEach, spyOn } from 'bun:test';
 import { createMockLogger } from '../test/mocks/logger';
 
 const mockLogger = createMockLogger();
@@ -18,6 +18,9 @@ mock.module('@anthropic-ai/claude-agent-sdk', () => ({
 
 import { ClaudeClient } from './claude';
 import * as claudeModule from './claude';
+import * as codebaseDb from '../db/codebases';
+import * as envLeakScanner from '../utils/env-leak-scanner';
+import * as configLoader from '../config/config-loader';
 
 describe('ClaudeClient', () => {
   let client: ClaudeClient;
@@ -951,4 +954,121 @@ describe('ClaudeClient', () => {
       expect(chunks[0]).toEqual({ type: 'assistant', content: 'Real content' });
     });
   });
+
+  describe('pre-spawn env leak gate', () => {
+    let spyFindByDefaultCwd: ReturnType<typeof spyOn>;
+    let spyFindByPathPrefix: ReturnType<typeof spyOn>;
+    let spyScan: ReturnType<typeof spyOn>;
+
+    beforeEach(() => {
+      spyFindByDefaultCwd = spyOn(codebaseDb, 'findCodebaseByDefaultCwd').mockResolvedValue(null);
+      spyFindByPathPrefix = spyOn(codebaseDb, 'findCodebaseByPathPrefix').mockResolvedValue(null);
+      spyScan = spyOn(envLeakScanner, 'scanPathForSensitiveKeys').mockReturnValue({
+        path: '/workspace',
+        findings: [],
+      });
+      mockQuery.mockImplementation(async function* () {
+        yield { type: 'result', session_id: 'sid-gate' };
+      });
+    });
+
+    afterEach(() => {
+      spyFindByDefaultCwd.mockRestore();
+      spyFindByPathPrefix.mockRestore();
+      spyScan.mockRestore();
+    });
+
+    test('throws EnvLeakError when .env contains sensitive keys and codebase has no consent', async () => {
+      spyScan.mockReturnValueOnce({
+        path: '/workspace',
+        findings: [{ file: '.env', keys: ['ANTHROPIC_API_KEY'] }],
+      });
+
+      await expect(async () => {
+        for await (const _ of client.sendQuery('test', '/workspace')) {
+          // consume
+        }
+      }).toThrow('Cannot run workflow');
+    });
+
+    test('skips scan when codebase has allow_env_keys: true', async () => {
+      spyFindByDefaultCwd.mockResolvedValueOnce({
+        id: 'codebase-1',
+        allow_env_keys: true,
+        default_cwd: '/workspace',
+      });
+
+      const chunks = [];
+      for await (const chunk of client.sendQuery('test', '/workspace')) {
+        chunks.push(chunk);
+      }
+
+      expect(spyScan).not.toHaveBeenCalled();
+      expect(chunks).toHaveLength(1);
+    });
+
+    test('proceeds when cwd has no registered codebase and no sensitive keys', async () => {
+      const chunks = [];
+      for await (const chunk of client.sendQuery('test', '/workspace')) {
+        chunks.push(chunk);
+      }
+
+      expect(spyScan).toHaveBeenCalledTimes(1);
+      expect(chunks).toHaveLength(1);
+    });
+
+    test('skips scan when allowTargetRepoKeys is true in merged config', async () => {
+      const spyLoadConfig = spyOn(configLoader, 'loadConfig').mockResolvedValueOnce({
+        allowTargetRepoKeys: true,
+      } as Awaited<ReturnType<typeof configLoader.loadConfig>>);
+      // Even though scanner would return a finding, the config bypass must short-circuit
+      spyScan.mockReturnValueOnce({
+        path: '/workspace',
+        findings: [{ file: '.env', keys: ['ANTHROPIC_API_KEY'] }],
+      });
+
+      const chunks = [];
+      for await (const chunk of client.sendQuery('test', '/workspace')) {
+        chunks.push(chunk);
+      }
+
+      expect(spyScan).not.toHaveBeenCalled();
+      expect(chunks).toHaveLength(1);
+      spyLoadConfig.mockRestore();
+    });
+
+    test('falls back to scanner when loadConfig throws (fail-closed)', async () => {
+      const spyLoadConfig = spyOn(configLoader, 'loadConfig').mockRejectedValueOnce(
+        new Error('YAML parse error')
+      );
+      spyScan.mockReturnValueOnce({
+        path: '/workspace',
+        findings: [{ file: '.env', keys: ['ANTHROPIC_API_KEY'] }],
+      });
+
+      await expect(async () => {
+        for await (const _ of client.sendQuery('test', '/workspace')) {
+          // consume
+        }
+      }).toThrow('Cannot run workflow');
+      expect(spyScan).toHaveBeenCalled();
+      spyLoadConfig.mockRestore();
+    });
+
+    test('uses prefix lookup for worktree paths when exact match returns null', async () => {
+      spyFindByPathPrefix.mockResolvedValueOnce({
+        id: 'codebase-1',
+        allow_env_keys: true,
+        default_cwd: '/workspace/source',
+      });
+
+      const chunks = [];
+      for await (const chunk of client.sendQuery('test', '/workspace/worktrees/feature')) {
+        chunks.push(chunk);
+      }
+
+      expect(spyFindByPathPrefix).toHaveBeenCalledWith('/workspace/worktrees/feature');
+      expect(spyScan).not.toHaveBeenCalled();
+    });
+  });
 });
diff --git a/packages/core/src/clients/claude.ts b/packages/core/src/clients/claude.ts
index f70e230dcf..0a0624c69c 100644
--- a/packages/core/src/clients/claude.ts
+++ b/packages/core/src/clients/claude.ts
@@ -27,6 +27,9 @@ import {
 } from '../types';
 import { createLogger } from '@archon/paths';
 import { buildCleanSubprocessEnv } from '../utils/env-allowlist';
+import { scanPathForSensitiveKeys, EnvLeakError } from '../utils/env-leak-scanner';
+import * as codebaseDb from '../db/codebases';
+import { loadConfig } from '../config/config-loader';
 
 /** Lazy-initialized logger (deferred so test mocks can intercept createLogger) */
 let cachedLog: ReturnType<typeof createLogger> | undefined;
@@ -258,6 +261,31 @@ export class ClaudeClient implements IAssistantClient {
     resumeSessionId?: string,
     requestOptions?: AssistantRequestOptions
   ): AsyncGenerator {
+    // Pre-spawn: check for env key leak if codebase is not explicitly consented.
+    // Use prefix lookup so worktree paths (e.g. .../worktrees/feature-branch) still
+    // match the registered source cwd (e.g. .../source).
+    const codebase =
+      (await codebaseDb.findCodebaseByDefaultCwd(cwd)) ??
+      (await codebaseDb.findCodebaseByPathPrefix(cwd));
+    if (!codebase?.allow_env_keys) {
+      // Fail-closed: a config load failure (corrupt YAML, permission denied)
+      // must NOT silently bypass the gate. Catch, log, and treat as
+      // `allowTargetRepoKeys = false` so the scanner still runs.
+      let allowTargetRepoKeys = false;
+      try {
+        const merged = await loadConfig(cwd);
+        allowTargetRepoKeys = merged.allowTargetRepoKeys;
+      } catch (configErr) {
+        getLog().warn({ err: configErr, cwd }, 'env_leak_gate.config_load_failed_gate_enforced');
+      }
+      if (!allowTargetRepoKeys) {
+        const report = scanPathForSensitiveKeys(cwd);
+        if (report.findings.length > 0) {
+          throw new EnvLeakError(report, 'spawn-existing');
+        }
+      }
+    }
+
     // Note: If subprocess crashes mid-stream after yielding chunks, those chunks
     // are already consumed by the caller. Retry starts a fresh subprocess, so the
    // caller may receive partial output from the failed attempt followed by full
diff --git a/packages/core/src/clients/codex.test.ts b/packages/core/src/clients/codex.test.ts
index f7b9f52601..e29002cd0a 100644
--- a/packages/core/src/clients/codex.test.ts
+++ b/packages/core/src/clients/codex.test.ts
@@ -1,4 +1,4 @@
-import { describe, test, expect, mock, beforeEach } from 'bun:test';
+import { describe, test, expect, mock, beforeEach, afterEach, spyOn } from 'bun:test';
 import { createMockLogger } from '../test/mocks/logger';
 
 const mockLogger = createMockLogger();
@@ -40,6 +40,8 @@ mock.module('@openai/codex-sdk', () => ({
 }));
 
 import { CodexClient } from './codex';
+import * as codebaseDb from '../db/codebases';
+import * as envLeakScanner from '../utils/env-leak-scanner';
 
 describe('CodexClient', () => {
   let client: CodexClient;
@@ -1000,4 +1002,89 @@ describe('CodexClient', () => {
     });
   });
 
+  describe('pre-spawn env leak gate', () => {
+    let spyFindByDefaultCwd: ReturnType<typeof spyOn>;
+    let spyFindByPathPrefix: ReturnType<typeof spyOn>;
+    let spyScan: ReturnType<typeof spyOn>;
+
+    beforeEach(() => {
+      // Restore a working runStreamed default so retry-test bleed doesn't break gate tests
+      mockRunStreamed.mockResolvedValue({
+        events: (async function* () {
+          yield { type: 'turn.completed', usage: defaultUsage };
+        })(),
+      });
+      spyFindByDefaultCwd = spyOn(codebaseDb, 'findCodebaseByDefaultCwd').mockResolvedValue(null);
+      spyFindByPathPrefix = spyOn(codebaseDb, 'findCodebaseByPathPrefix').mockResolvedValue(null);
+      spyScan = spyOn(envLeakScanner, 'scanPathForSensitiveKeys').mockReturnValue({
+        path: '/workspace',
+        findings: [],
+      });
+    });
+
+    afterEach(() => {
+      spyFindByDefaultCwd.mockRestore();
+      spyFindByPathPrefix.mockRestore();
+      spyScan.mockRestore();
+    });
+
+    test('throws EnvLeakError when .env contains sensitive keys and codebase has no consent', async () => {
+      spyFindByDefaultCwd.mockResolvedValueOnce(null);
+      spyFindByPathPrefix.mockResolvedValueOnce(null);
+      spyScan.mockReturnValueOnce({
+        path: '/workspace',
+        findings: [{ file: '.env', keys: ['ANTHROPIC_API_KEY'] }],
+      });
+
+      const consumeGenerator = async (): Promise<void> => {
+        for await (const _ of client.sendQuery('test', '/workspace')) {
+          // consume
+        }
+      };
+
+      await expect(consumeGenerator()).rejects.toThrow('Cannot run workflow');
+    });
+
+    test('skips scan when codebase has allow_env_keys: true', async () => {
+      spyFindByDefaultCwd.mockResolvedValueOnce({
+        id: 'codebase-1',
+        allow_env_keys: true,
+        default_cwd: '/workspace',
+      });
+
+      const chunks = [];
+      for await (const chunk of client.sendQuery('test', '/workspace')) {
+        chunks.push(chunk);
+      }
+
+      expect(spyScan).not.toHaveBeenCalled();
+    });
+
+    test('proceeds when cwd has no registered codebase and no sensitive keys', async () => {
+      const chunks = [];
+      for await (const chunk of client.sendQuery('test', '/workspace')) {
+        chunks.push(chunk);
+      }
+
+      expect(spyScan).toHaveBeenCalledTimes(1);
+    });
+
+    test('uses prefix lookup for worktree paths when exact match returns null', async () => {
+      spyFindByDefaultCwd.mockResolvedValueOnce(null);
+      spyFindByPathPrefix.mockResolvedValueOnce({
+        id: 'codebase-1',
+        allow_env_keys: true,
+        default_cwd: '/workspace/source',
+      });
+
+      const chunks = [];
+      for await (const chunk of client.sendQuery('test', '/workspace/worktrees/feature')) {
+        chunks.push(chunk);
+      }
+
+      expect(spyFindByPathPrefix).toHaveBeenCalledWith('/workspace/worktrees/feature');
+      expect(spyScan).not.toHaveBeenCalled();
+    });
+  });
 });
diff --git a/packages/core/src/clients/codex.ts b/packages/core/src/clients/codex.ts
index 121bd73faa..a7c52731e1 100644
--- a/packages/core/src/clients/codex.ts
+++ b/packages/core/src/clients/codex.ts
@@ -18,6 +18,9 @@ import {
   type TokenUsage,
 } from '../types';
 import { createLogger } from '@archon/paths';
+import { scanPathForSensitiveKeys, EnvLeakError } from '../utils/env-leak-scanner';
+import * as codebaseDb from '../db/codebases';
+import { loadConfig } from '../config/config-loader';
 
 /** Lazy-initialized logger (deferred so test mocks can intercept createLogger) */
 let cachedLog: ReturnType<typeof createLogger> | undefined;
@@ -154,6 +157,29 @@ export class CodexClient implements IAssistantClient {
     resumeSessionId?: string,
     options?: AssistantRequestOptions
   ): AsyncGenerator {
+    // Pre-spawn: check for env key leak if codebase is not explicitly consented.
+    // Use prefix lookup so worktree paths (e.g. .../worktrees/feature-branch) still
+    // match the registered source cwd (e.g. .../source).
+    const codebase =
+      (await codebaseDb.findCodebaseByDefaultCwd(cwd)) ??
+      (await codebaseDb.findCodebaseByPathPrefix(cwd));
+    if (!codebase?.allow_env_keys) {
+      // Fail-closed: a config load failure must NOT silently bypass the gate.
+      let allowTargetRepoKeys = false;
+      try {
+        const merged = await loadConfig(cwd);
+        allowTargetRepoKeys = merged.allowTargetRepoKeys;
+      } catch (configErr) {
+        getLog().warn({ err: configErr, cwd }, 'env_leak_gate.config_load_failed_gate_enforced');
+      }
+      if (!allowTargetRepoKeys) {
+        const report = scanPathForSensitiveKeys(cwd);
+        if (report.findings.length > 0) {
+          throw new EnvLeakError(report, 'spawn-existing');
+        }
+      }
+    }
+
     const codex = getCodex();
     const threadOptions = buildThreadOptions(cwd, options);
 
diff --git a/packages/core/src/config/config-loader.ts b/packages/core/src/config/config-loader.ts
index f0f51ba0a4..8ee702c613 100644
--- a/packages/core/src/config/config-loader.ts
+++ b/packages/core/src/config/config-loader.ts
@@ -38,6 +38,24 @@ function getLog(): ReturnType<typeof createLogger> {
   return cachedLog;
 }
 
+/**
+ * Tracks which env-leak-gate-disabled sources have already warned in this
+ * process. `loadConfig()` is called once per pre-spawn check (per workflow
+ * step), so without this guard the warn would flood logs and break alert
+ * rate-limiting downstream.
+ */
+const envLeakGateDisabledWarnedSources = new Set<'global_config' | 'repo_config'>();
+function warnEnvLeakGateDisabledOnce(source: 'global_config' | 'repo_config'): void {
+  if (envLeakGateDisabledWarnedSources.has(source)) return;
+  envLeakGateDisabledWarnedSources.add(source);
+  getLog().warn({ source }, 'env_leak_gate_disabled');
+}
+
+// Test-only: reset the warn-once state so unit tests can re-trigger the log.
+export function resetEnvLeakGateWarnedSourcesForTests(): void {
+  envLeakGateDisabledWarnedSources.clear();
+}
+
 /**
  * Parse YAML using Bun's native YAML parser
  */
@@ -198,6 +216,7 @@ function getDefaults(): MergedConfig {
       loadDefaultCommands: true,
       loadDefaultWorkflows: true,
     },
+    allowTargetRepoKeys: false,
   };
 }
 
@@ -302,6 +321,12 @@ function mergeGlobalConfig(defaults: MergedConfig, global: GlobalConfig): MergedConfig {
     result.concurrency.maxConversations = global.concurrency.maxConversations;
   }
 
+  // Env-leak gate bypass (global)
+  if (global.allow_target_repo_keys === true) {
+    result.allowTargetRepoKeys = true;
+    warnEnvLeakGateDisabledOnce('global_config');
+  }
+
   return result;
 }
 
@@ -375,6 +400,14 @@ function mergeRepoConfig(merged: MergedConfig, repo: RepoConfig): MergedConfig {
     result.envVars = { ...result.envVars, ...repo.env };
   }
 
+  // Repo-level env-leak gate override (wins over global)
+  if (repo.allow_target_repo_keys !== undefined) {
+    result.allowTargetRepoKeys = repo.allow_target_repo_keys;
+    if (repo.allow_target_repo_keys) {
+      warnEnvLeakGateDisabledOnce('repo_config');
+    }
+  }
+
   return result;
 }
 
diff --git a/packages/core/src/config/config-types.ts b/packages/core/src/config/config-types.ts
index 6b49bfbcff..f3bbdf41cf 100644
--- a/packages/core/src/config/config-types.ts
+++ b/packages/core/src/config/config-types.ts
@@ -84,6 +84,20 @@ export interface GlobalConfig {
      */
     maxConversations?: number;
   };
+
+  /**
+   * Bypass the env-leak gate globally. When true, Archon will not refuse to
+   * register or spawn subprocesses for codebases whose auto-loaded .env files
+   * contain sensitive keys (ANTHROPIC_API_KEY, OPENAI_API_KEY, etc).
+   *
+   * WARNING: Weakens the env-leak gate. Keys in the target repo's .env will
+   * be auto-loaded by Bun subprocesses (Claude/Codex) and bypass Archon's
+   * env allowlist. Use only on trusted machines.
+   *
+   * YAML key: `allow_target_repo_keys`
+   * @default false
+   */
+  allow_target_repo_keys?: boolean;
 }
 
 /**
@@ -158,6 +172,12 @@ export interface RepoConfig {
    */
   env?: Record<string, string>;
 
+  /**
+   * Per-repo override for the env-leak gate bypass. Repo value wins over global.
+   * YAML key: `allow_target_repo_keys`
+   */
+  allow_target_repo_keys?: boolean;
+
   /**
    * Default commands/workflows configuration
    */
@@ -240,6 +260,14 @@ export interface MergedConfig {
    * Undefined when no env vars are configured.
    */
  envVars?: Record<string, string>;
+
+  /**
+   * Effective value of the env-leak gate bypass. When true, the env scanner
+   * is skipped during registration and pre-spawn. Repo-level override wins
+   * over global (explicit `false` at repo level re-enables the gate).
+   * @default false
+   */
+  allowTargetRepoKeys: boolean;
 }
 
 /**
diff --git a/packages/core/src/db/codebases.test.ts b/packages/core/src/db/codebases.test.ts
index 26c269a085..ec3c249d14 100644
--- a/packages/core/src/db/codebases.test.ts
+++ b/packages/core/src/db/codebases.test.ts
@@ -22,6 +22,7 @@ import {
   findCodebaseByDefaultCwd,
   findCodebaseByName,
   updateCodebase,
+  updateCodebaseAllowEnvKeys,
   deleteCodebase,
 } from './codebases';
 
@@ -36,6 +37,7 @@ describe('codebases', () => {
     repository_url: 'https://github.com/user/repo',
     default_cwd: '/workspace/test-project',
     ai_assistant_type: 'claude',
+    allow_env_keys: false,
     commands: { plan: { path: '.claude/commands/plan.md', description: 'Plan feature' } },
     created_at: new Date(),
     updated_at: new Date(),
@@ -54,8 +56,8 @@ describe('codebases', () => {
 
       expect(result).toEqual(mockCodebase);
       expect(mockQuery).toHaveBeenCalledWith(
-        'INSERT INTO remote_agent_codebases (name, repository_url, default_cwd, ai_assistant_type) VALUES ($1, $2, $3, $4) RETURNING *',
-        ['test-project', 'https://github.com/user/repo', '/workspace/test-project', 'claude']
+        'INSERT INTO remote_agent_codebases (name, repository_url, default_cwd, ai_assistant_type, allow_env_keys) VALUES ($1, $2, $3, $4, $5) RETURNING *',
+        ['test-project', 'https://github.com/user/repo', '/workspace/test-project', 'claude', false]
       );
     });
 
@@ -73,8 +75,8 @@ describe('codebases', () => {
 
       expect(result).toEqual(codebaseWithoutOptional);
       expect(mockQuery).toHaveBeenCalledWith(
-        'INSERT INTO remote_agent_codebases (name, repository_url, default_cwd, ai_assistant_type) VALUES ($1, $2, $3, $4) RETURNING *',
-        ['test-project', null, '/workspace/test-project', 'claude']
+        'INSERT INTO remote_agent_codebases (name, repository_url, default_cwd, ai_assistant_type, allow_env_keys) VALUES ($1, $2, $3, $4, $5) RETURNING *',
+        ['test-project', null, '/workspace/test-project', 'claude', false]
       );
     });
 
@@ -297,6 +299,7 @@ describe('codebases', () => {
       name: 'test-repo',
       default_cwd: '/workspace/test-repo',
       ai_assistant_type: 'claude',
+      allow_env_keys: false,
       repository_url: null,
      commands: {},
      created_at: new Date(),
@@ -396,6 +399,26 @@ describe('codebases', () => {
     });
   });
 
+  describe('updateCodebaseAllowEnvKeys', () => {
+    test('flips the consent bit', async () => {
+      mockQuery.mockResolvedValueOnce(createQueryResult([], 1));
+
+      await updateCodebaseAllowEnvKeys('codebase-123', true);
+
+      expect(mockQuery).toHaveBeenCalledWith(
+        'UPDATE remote_agent_codebases SET allow_env_keys = $1, updated_at = NOW() WHERE id = $2',
+        [true, 'codebase-123']
+      );
+    });
+
+    test('throws when codebase not found', async () => {
+      mockQuery.mockResolvedValueOnce(createQueryResult([], 0));
+      await expect(updateCodebaseAllowEnvKeys('missing', false)).rejects.toThrow(
+        'Codebase missing not found'
+      );
+    });
+  });
+
   describe('deleteCodebase', () => {
     test('should unlink sessions, conversations, and delete codebase', async () => {
       // First call: unlink sessions
diff --git a/packages/core/src/db/codebases.ts b/packages/core/src/db/codebases.ts
index f3847f301f..b9f45578b6 100644
--- a/packages/core/src/db/codebases.ts
+++ b/packages/core/src/db/codebases.ts
@@ -17,11 +17,13 @@ export async function createCodebase(data: {
   repository_url?: string;
   default_cwd: string;
   ai_assistant_type?: string;
+  allow_env_keys?: boolean;
 }): Promise<Codebase> {
   const assistantType = data.ai_assistant_type ?? 'claude';
+  const allowEnvKeys = data.allow_env_keys ?? false;
   const result = await pool.query(
-    'INSERT INTO remote_agent_codebases (name, repository_url, default_cwd, ai_assistant_type) VALUES ($1, $2, $3, $4) RETURNING *',
-    [data.name, data.repository_url ?? null, data.default_cwd, assistantType]
+    'INSERT INTO remote_agent_codebases (name, repository_url, default_cwd, ai_assistant_type, allow_env_keys) VALUES ($1, $2, $3, $4, $5) RETURNING *',
+    [data.name, data.repository_url ?? null, data.default_cwd, assistantType, allowEnvKeys]
   );
   if (!result.rows[0]) {
     throw new Error('Failed to create codebase: INSERT succeeded but no row returned');
@@ -96,6 +98,25 @@ export async function findCodebaseByDefaultCwd(defaultCwd: string): Promise<Codebase | null> {
+export async function findCodebaseByPathPrefix(
+  cwdPath: string
+): Promise<Codebase | null> {
+  const result = await pool.query(
+    `SELECT * FROM remote_agent_codebases
+     WHERE $1 LIKE default_cwd || '%'
+     ORDER BY length(default_cwd) DESC
+     LIMIT 1`,
+    [cwdPath]
+  );
+  return result.rows[0] || null;
+}
+
 export async function findCodebaseByName(name: string): Promise<Codebase | null> {
   const result = await pool.query(
     'SELECT * FROM remote_agent_codebases WHERE name = $1 ORDER BY created_at DESC LIMIT 1',
@@ -137,6 +158,21 @@ export async function updateCodebase(
   }
 }
 
+/**
+ * Flip the `allow_env_keys` consent bit for an existing codebase.
+ * Throws when the codebase does not exist.
+ */
+export async function updateCodebaseAllowEnvKeys(id: string, allowEnvKeys: boolean): Promise<void> {
+  const dialect = getDialect();
+  const result = await pool.query(
+    `UPDATE remote_agent_codebases SET allow_env_keys = $1, updated_at = ${dialect.now()} WHERE id = $2`,
+    [allowEnvKeys, id]
+  );
+  if ((result.rowCount ?? 0) === 0) {
+    throw new Error(`Codebase ${id} not found`);
+  }
+}
+
 export async function listCodebases(): Promise<Codebase[]> {
   const result = await pool.query(
     'SELECT * FROM remote_agent_codebases ORDER BY name ASC'
diff --git a/packages/core/src/handlers/clone.test.ts b/packages/core/src/handlers/clone.test.ts
index c913c1a78c..7f948cfb33 100644
--- a/packages/core/src/handlers/clone.test.ts
+++ b/packages/core/src/handlers/clone.test.ts
@@ -20,6 +20,7 @@ const mockCreateCodebase = mock(() =>
     repository_url: 'https://github.com/owner/repo',
     default_cwd: '/home/test/.archon/workspaces/owner/repo/source',
     ai_assistant_type: 'claude',
+    allow_env_keys: false,
     commands: {},
     created_at: new Date(),
     updated_at: new Date(),
@@ -66,6 +67,20 @@ mock.module('../utils/commands', () => ({
   findMarkdownFilesRecursive: mockFindMarkdownFilesRecursive,
 }));
 
+// ── env-leak-scanner mock ───────────────────────────────────────────────────
+class MockEnvLeakError extends Error {
+  constructor(public report: unknown) {
+    super('Cannot add codebase — /test/path contains keys that will leak into AI subprocesses');
+    this.name = 'EnvLeakError';
+  }
+}
+
+const mockScanPathForSensitiveKeys = mock(() => ({ path: '', findings: [] }));
+mock.module('../utils/env-leak-scanner', () => ({
+  scanPathForSensitiveKeys: mockScanPathForSensitiveKeys,
+  EnvLeakError: MockEnvLeakError,
+}));
+
 // ── Import module under test AFTER mocks are registered ────────────────────
 import { cloneRepository, registerRepository } from './clone';
 
@@ -103,6 +118,7 @@ function clearMocks(): void {
   mockFindCodebaseByName.mockReset();
   mockUpdateCodebase.mockReset();
   mockFindMarkdownFilesRecursive.mockReset();
+  mockScanPathForSensitiveKeys.mockReset();
   mockLogger.info.mockClear();
   mockLogger.debug.mockClear();
   mockLogger.warn.mockClear();
@@ -116,6 +132,7 @@ function clearMocks(): void {
   mockFindCodebaseByName.mockResolvedValue(null);
   mockUpdateCodebase.mockResolvedValue(undefined);
   mockFindMarkdownFilesRecursive.mockResolvedValue([]);
+  mockScanPathForSensitiveKeys.mockReturnValue({ path: '', findings: [] });
 }
 
 afterAll(() => {
@@ -140,6 +157,7 @@ function makeCodebase(
     repository_url: 'https://github.com/owner/repo',
     default_cwd: '/home/test/.archon/workspaces/owner/repo/source',
     ai_assistant_type: 'claude',
+    allow_env_keys: false,
     commands: {},
     created_at: new Date(),
     updated_at: new Date(),
@@ -930,4 +948,33 @@ describe('RegisterResult shape', () => {
     expect(result.alreadyExisted).toBe(true);
     expect(result.commandCount).toBe(0);
   });
+
+  describe('env leak gate', () => {
+    test('throws EnvLeakError when scanner finds sensitive keys and allowEnvKeys is false', async () => {
+      mockScanPathForSensitiveKeys.mockReturnValueOnce({
+        path: '/home/test/.archon/workspaces/owner/repo/source',
+        findings: [{ file: '.env', keys: ['ANTHROPIC_API_KEY'] }],
+      });
+
+      await expect(cloneRepository('https://github.com/owner/repo')).rejects.toThrow(
+        'Cannot add codebase'
+      );
+    });
+
+    test('does not throw when allowEnvKeys is true, even with scanner findings present', async () => {
+      mockCreateCodebase.mockResolvedValueOnce(makeCodebase() as ReturnType<typeof makeCodebase>);
+      // Scanner is still called for the audit-log payload (files/keys), but the
+      // gate must NOT throw — the per-call grant is the bypass.
+      mockScanPathForSensitiveKeys.mockReturnValueOnce({
+        path: '/home/test/.archon/workspaces/owner/repo/source',
+        findings: [{ file: '.env', keys: ['ANTHROPIC_API_KEY'] }],
+      });
+
+      const result = await cloneRepository('https://github.com/owner/repo', true);
+
+      expect(result.codebaseId).toBe('codebase-uuid-1');
+      // Scanner is called once — for the audit log, not as a gate
+      expect(mockScanPathForSensitiveKeys).toHaveBeenCalledTimes(1);
+    });
+  });
 });
diff --git a/packages/core/src/handlers/clone.ts b/packages/core/src/handlers/clone.ts
index fe7e4d9570..3dc96f499c 100644
--- a/packages/core/src/handlers/clone.ts
+++ b/packages/core/src/handlers/clone.ts
@@ -16,6 +16,12 @@ import {
   parseOwnerRepo,
 } from '@archon/paths';
 import { findMarkdownFilesRecursive } from '../utils/commands';
+import {
+  scanPathForSensitiveKeys,
+  EnvLeakError,
+  type LeakErrorContext,
+} from '../utils/env-leak-scanner';
+import { loadConfig } from '../config/config-loader';
 import { createLogger } from '@archon/paths';
 
 /** Lazy-initialized logger (deferred so test mocks can intercept createLogger) */
@@ -40,8 +46,53 @@ export interface RegisterResult {
 async function registerRepoAtPath(
   targetPath: string,
   name: string,
-  repositoryUrl: string | null
+  repositoryUrl: string | null,
+  allowEnvKeys = false,
+  context: LeakErrorContext = 'register-ui'
 ): Promise<RegisterResult> {
+  // Scan for sensitive keys in auto-loaded .env files before registering.
+  // Two bypass paths exist (in order of precedence):
+  //   1. Per-call `allowEnvKeys=true` (Web UI checkbox or CLI --allow-env-keys)
+  //   2. Config-level `allow_target_repo_keys: true` (global YAML)
+  // When the per-call bypass is used we still emit an audit-log entry so the
+  // grant has a permanent breadcrumb (parity with the PATCH route's
+  // `env_leak_consent_granted` log).
+  if (!allowEnvKeys) {
+    const merged = await loadConfig(targetPath);
+    if (!merged.allowTargetRepoKeys) {
+      const report = scanPathForSensitiveKeys(targetPath);
+      if (report.findings.length > 0) {
+        throw new EnvLeakError(report, context);
+      }
+    }
+  } else {
+    // Per-call grant — emit audit log mirroring the PATCH route shape so the
+    // CLI/UI add-with-consent paths leave the same breadcrumbs.
+    let files: string[] = [];
+    let keys: string[] = [];
+    let scanStatus: 'ok' | 'skipped' = 'ok';
+    try {
+      const report = scanPathForSensitiveKeys(targetPath);
+      files = report.findings.map(f => f.file);
+      keys = Array.from(new Set(report.findings.flatMap(f => f.keys)));
+    } catch (scanErr) {
+      scanStatus = 'skipped';
+      getLog().warn({ err: scanErr, path: targetPath }, 'env_leak_consent_scan_skipped');
+    }
+    const actor = context === 'register-cli' ? 'user-cli' : 'user-ui';
+    getLog().warn(
+      {
+        name,
+        path: targetPath,
+        files,
+        keys,
+        scanStatus,
+        actor,
+      },
+      'env_leak_consent_granted'
+    );
+  }
+
   // Auto-detect assistant type based on folder structure
   let suggestedAssistant = 'claude';
   const codexFolder = join(targetPath, '.codex');
@@ -122,6 +173,7 @@ async function registerRepoAtPath(
     repository_url: repositoryUrl ?? undefined,
     default_cwd: targetPath,
     ai_assistant_type: suggestedAssistant,
+    allow_env_keys: allowEnvKeys,
   });
 
   // Auto-load commands if found
@@ -190,11 +242,15 @@ function normalizeRepoUrl(rawUrl: string): {
  * Local paths (starting with /, ~, or .) are delegated to registerRepository
  * to avoid wrong owner/repo naming. See #383 for broader rethink.
  */
-export async function cloneRepository(repoUrl: string): Promise<RegisterResult> {
+export async function cloneRepository(
+  repoUrl: string,
+  allowEnvKeys?: boolean,
+  context: LeakErrorContext = 'register-ui'
+): Promise<RegisterResult> {
   // Local paths should be registered (symlink), not cloned (copied)
   if (repoUrl.startsWith('/') || repoUrl.startsWith('~') || repoUrl.startsWith('.')) {
     const resolvedPath = repoUrl.startsWith('~') ? expandTilde(repoUrl) : resolve(repoUrl);
-    return registerRepository(resolvedPath);
+    return registerRepository(resolvedPath, allowEnvKeys, context);
   }
 
   const { workingUrl, ownerName, repoName, targetPath } = normalizeRepoUrl(repoUrl);
@@ -275,7 +331,13 @@ export async function cloneRepository(repoUrl: string): Promise<RegisterResult>
   await execFileAsync('git', ['config', '--global', '--add', 'safe.directory', targetPath]);
   getLog().debug({ path: targetPath }, 'safe_directory_added');
 
-  const result = await registerRepoAtPath(targetPath, `${ownerName}/${repoName}`, workingUrl);
+  const result = await registerRepoAtPath(
+    targetPath,
+    `${ownerName}/${repoName}`,
+    workingUrl,
+    allowEnvKeys,
+    context
+  );
   getLog().info({ url: workingUrl, targetPath }, 'clone_completed');
   return result;
 }
@@ -283,7 +345,11 @@ export async function cloneRepository(repoUrl: string): Promise<RegisterResult>
 /**
  * Register an existing local repository in the database (no git clone).
 */
-export async function registerRepository(localPath: string): Promise<RegisterResult> {
+export async function registerRepository(
+  localPath: string,
+  allowEnvKeys?: boolean,
+  context: LeakErrorContext = 'register-ui'
+): Promise<RegisterResult> {
   // Validate path exists and is a git repo
   try {
     await execFileAsync('git', ['-C', localPath, 'rev-parse', '--git-dir']);
@@ -349,5 +415,5 @@ export async function registerRepository(localPath: string): Promise<RegisterResult> {
     repository_url: 'https://github.com/user/my-repo',
     default_cwd: '/workspace/my-repo',
     ai_assistant_type: 'claude',
+    allow_env_keys: false,
     commands: {},
     created_at: new Date(),
     updated_at: new Date(),
@@ -566,6 +567,7 @@ describe('CommandHandler', () => {
     repository_url: 'https://github.com/owner/repo',
     default_cwd: '/workspace/repo',
     ai_assistant_type: 'claude',
+    allow_env_keys: false,
     commands: {},
     created_at: new Date(),
     updated_at: new Date(),
@@ -604,6 +606,7 @@ describe('CommandHandler', () => {
     repository_url: 'https://github.com/owner/orphaned-repo',
     default_cwd: '/workspace/orphaned-repo',
     ai_assistant_type: 'claude',
+    allow_env_keys: false,
     commands: {},
     created_at: new Date(),
     updated_at: new Date(),
@@ -718,6 +721,7 @@ describe('CommandHandler', () => {
     repository_url: 'https://github.com/user/my-repo',
     default_cwd: '/workspace/my-repo',
     ai_assistant_type: 'claude',
+    allow_env_keys: false,
     commands: {},
     created_at: new Date(),
     updated_at: new Date(),
diff --git a/packages/core/src/index.ts b/packages/core/src/index.ts
index 90b1cefd80..e212eb10c9 100644
--- a/packages/core/src/index.ts
+++ b/packages/core/src/index.ts
@@ -145,6 +145,15 @@ export { toError } from './utils/error';
 // Credential sanitization
 export { sanitizeCredentials, sanitizeError } from './utils/credential-sanitizer';

+// Env leak scanner
+export {
+  EnvLeakError,
+  scanPathForSensitiveKeys,
+  formatLeakError,
+  type LeakReport,
+  type LeakErrorContext,
+} from './utils/env-leak-scanner';
+
 // GitHub GraphQL
 export { getLinkedIssueNumbers } from './utils/github-graphql';

diff --git a/packages/core/src/operations/isolation-operations.test.ts b/packages/core/src/operations/isolation-operations.test.ts
index ead59a0d2c..d564b61d07 100644
--- a/packages/core/src/operations/isolation-operations.test.ts
+++ b/packages/core/src/operations/isolation-operations.test.ts
@@ -190,7 +190,7 @@ describe('cleanupMergedEnvironments', () => {
     const result = await cleanupMergedEnvironments('cb-1', '/main');

-    expect(mockCleanupMerged).toHaveBeenCalledWith('cb-1', '/main');
+    expect(mockCleanupMerged).toHaveBeenCalledWith('cb-1', '/main', {});
     expect(result.removed).toBe(2);
   });

@@ -201,4 +201,12 @@
     expect(result.errors).toEqual(['branch-a: git error']);
   });
+
+  test('forwards includeClosed option to cleanupMergedWorktrees', async () => {
+    mockCleanupMerged.mockResolvedValueOnce({ removed: 1, errors: [] });
+
+    await cleanupMergedEnvironments('cb-1', '/main', { includeClosed: true });
+
+    expect(mockCleanupMerged).toHaveBeenCalledWith('cb-1', '/main', { includeClosed: true });
+  });
 });
diff --git a/packages/core/src/operations/isolation-operations.ts b/packages/core/src/operations/isolation-operations.ts
index c61990243a..08b49f601e 100644
--- a/packages/core/src/operations/isolation-operations.ts
+++ b/packages/core/src/operations/isolation-operations.ts
@@ -167,7 +167,8 @@
  */
 export async function cleanupMergedEnvironments(
   codebaseId: string,
-  mainPath: string
+  mainPath: string,
+  options: {
includeClosed?: boolean } = {}
 ): Promise<{ removed: number; errors: string[] }> {
-  return cleanupMergedWorktrees(codebaseId, mainPath);
+  return cleanupMergedWorktrees(codebaseId, mainPath, options);
 }
diff --git a/packages/core/src/orchestrator/orchestrator-agent.test.ts b/packages/core/src/orchestrator/orchestrator-agent.test.ts
index 0757929690..70080cc01a 100644
--- a/packages/core/src/orchestrator/orchestrator-agent.test.ts
+++ b/packages/core/src/orchestrator/orchestrator-agent.test.ts
@@ -181,6 +181,7 @@ function makeCodebase(name: string, id = `id-${name}`): Codebase {
     repository_url: null,
     default_cwd: `/repos/${name}`,
     ai_assistant_type: 'claude',
+    allow_env_keys: false,
     commands: {},
     created_at: new Date(),
     updated_at: new Date(),
@@ -804,6 +805,7 @@ function makeCodebaseForSync() {
     repository_url: 'https://github.com/test/repo',
     default_cwd: '/repos/test-repo',
     ai_assistant_type: 'claude',
+    allow_env_keys: false,
     commands: {},
     created_at: new Date(),
     updated_at: new Date(),
@@ -969,6 +971,7 @@ describe('workflow dispatch routing — interactive flag', () => {
     repository_url: null,
     default_cwd: '/repos/test-repo',
     ai_assistant_type: 'claude' as const,
+    allow_env_keys: false,
     commands: {},
     created_at: new Date(),
     updated_at: new Date(),
@@ -1069,6 +1072,7 @@ describe('natural-language approval routing', () => {
     repository_url: null,
     default_cwd: '/repos/test-repo',
     ai_assistant_type: 'claude' as const,
+    allow_env_keys: false,
     commands: {},
     created_at: new Date(),
     updated_at: new Date(),
diff --git a/packages/core/src/orchestrator/orchestrator-isolation.test.ts b/packages/core/src/orchestrator/orchestrator-isolation.test.ts
index 1f533b8ed5..f46930f02c 100644
--- a/packages/core/src/orchestrator/orchestrator-isolation.test.ts
+++ b/packages/core/src/orchestrator/orchestrator-isolation.test.ts
@@ -176,6 +176,7 @@ function makeCodebase(overrides?: Partial<Codebase>): Codebase {
     id: 'cb-1',
     name: 'test-repo',
     default_cwd: '/workspace/test-repo',
+    allow_env_keys: false,
     commands: {},
     created_at: new Date(),
     updated_at: new Date(),
diff --git a/packages/core/src/orchestrator/orchestrator.test.ts b/packages/core/src/orchestrator/orchestrator.test.ts
index f8f199a5de..d5e81038da 100644
--- a/packages/core/src/orchestrator/orchestrator.test.ts
+++ b/packages/core/src/orchestrator/orchestrator.test.ts
@@ -216,6 +216,7 @@ const mockCodebase: Codebase = {
   repository_url: 'https://github.com/user/repo',
   default_cwd: '/workspace/test-project',
   ai_assistant_type: 'claude',
+  allow_env_keys: false,
   commands: {},
   created_at: new Date(),
   updated_at: new Date(),
diff --git a/packages/core/src/services/cleanup-service.test.ts b/packages/core/src/services/cleanup-service.test.ts
index e4b2d7ec08..3d1b204d35 100644
--- a/packages/core/src/services/cleanup-service.test.ts
+++ b/packages/core/src/services/cleanup-service.test.ts
@@ -12,6 +12,7 @@ const mockHasUncommittedChanges = mock(() => Promise.resolve(false));
 const mockWorktreeExists = mock(() => Promise.resolve(false));
 const mockGetDefaultBranch = mock(() => Promise.resolve('main'));
 const mockIsBranchMerged = mock(() => Promise.resolve(false));
+const mockIsPatchEquivalent = mock(() => Promise.resolve(false));
 const mockGetLastCommitDate = mock(() => Promise.resolve(null as Date | null));
 mock.module('@archon/git', () => ({
   execFileAsync: mockExecFileAsync,
@@ -19,6 +20,7 @@ mock.module('@archon/git', () => ({
   worktreeExists: mockWorktreeExists,
   getDefaultBranch: mockGetDefaultBranch,
   isBranchMerged: mockIsBranchMerged,
+  isPatchEquivalent: mockIsPatchEquivalent,
   getLastCommitDate:
mockGetLastCommitDate, toRepoPath: (p: string) => p, toBranchName: (b: string) => b, @@ -40,10 +42,13 @@ mock.module('../isolation', () => ({ destroy: mockDestroy, }), })); +type PrStateValue = 'MERGED' | 'CLOSED' | 'OPEN' | 'NONE'; +const mockGetPrState = mock(() => Promise.resolve('NONE' as PrStateValue)); mock.module('@archon/isolation', () => ({ getIsolationProvider: () => ({ destroy: mockDestroy, }), + getPrState: mockGetPrState, })); // Mock isolation-environments DB @@ -824,6 +829,10 @@ describe('cleanupMergedWorktrees', () => { // Reset defaults mockGetDefaultBranch.mockResolvedValue('main'); mockIsBranchMerged.mockResolvedValue(false); + mockIsPatchEquivalent.mockReset(); + mockIsPatchEquivalent.mockResolvedValue(false); + mockGetPrState.mockReset(); + mockGetPrState.mockResolvedValue('NONE'); mockHasUncommittedChanges.mockResolvedValue(false); mockWorktreeExists.mockResolvedValue(false); }); @@ -915,6 +924,164 @@ describe('cleanupMergedWorktrees', () => { reason: 'still used by 2 conversation(s)', }); }); + + test('removes branch when git-cherry detects squash-merge', async () => { + mockListByCodebase.mockResolvedValueOnce([ + { + id: 'env-squash', + branch_name: 'squash-branch', + working_path: '/workspace/repo/worktrees/squash-branch', + status: 'active', + }, + ]); + mockIsBranchMerged.mockResolvedValueOnce(false); + mockIsPatchEquivalent.mockResolvedValueOnce(true); + mockGetConversationsUsingEnv.mockResolvedValueOnce([]); + mockGetById.mockResolvedValueOnce({ + id: 'env-squash', + working_path: '/workspace/repo/worktrees/squash-branch', + status: 'active', + }); + mockWorktreeExists.mockResolvedValueOnce(true); + + const result = await cleanupMergedWorktrees('codebase-1', '/workspace/repo'); + + expect(result.removed).toContain('squash-branch'); + }); + + test('removes branch when PR is MERGED', async () => { + mockListByCodebase.mockResolvedValueOnce([ + { + id: 'env-pr-merged', + branch_name: 'pr-merged-branch', + working_path: '/workspace/repo/worktrees/pr-merged-branch', + status: 'active', + }, + ]); + mockIsBranchMerged.mockResolvedValueOnce(false); + mockIsPatchEquivalent.mockResolvedValueOnce(false); + mockGetPrState.mockResolvedValueOnce('MERGED'); + mockGetConversationsUsingEnv.mockResolvedValueOnce([]); + mockGetById.mockResolvedValueOnce({ + id: 'env-pr-merged', + working_path: '/workspace/repo/worktrees/pr-merged-branch', + status: 'active', + }); + mockWorktreeExists.mockResolvedValueOnce(true); + + const result = await cleanupMergedWorktrees('codebase-1', '/workspace/repo'); + + expect(result.removed).toContain('pr-merged-branch'); + }); + + test('skips branch when PR is OPEN with clear reason', async () => { + mockListByCodebase.mockResolvedValueOnce([ + { + id: 'env-pr-open', + branch_name: 'pr-open-branch', + working_path: '/workspace/repo/worktrees/pr-open-branch', + status: 'active', + }, + ]); + mockIsBranchMerged.mockResolvedValueOnce(false); + mockIsPatchEquivalent.mockResolvedValueOnce(false); + mockGetPrState.mockResolvedValueOnce('OPEN'); + + const result = await cleanupMergedWorktrees('codebase-1', '/workspace/repo'); + + expect(result.removed).toHaveLength(0); + expect(result.skipped).toContainEqual({ + branchName: 'pr-open-branch', + reason: 'PR is open (active review)', + }); + }); + + test('skips branch when PR is CLOSED and includeClosed=false', async () => { + mockListByCodebase.mockResolvedValueOnce([ + { + id: 'env-pr-closed', + branch_name: 'pr-closed-branch', + working_path: '/workspace/repo/worktrees/pr-closed-branch', + status: 
'active', + }, + ]); + mockIsBranchMerged.mockResolvedValueOnce(false); + mockIsPatchEquivalent.mockResolvedValueOnce(false); + mockGetPrState.mockResolvedValueOnce('CLOSED'); + + const result = await cleanupMergedWorktrees('codebase-1', '/workspace/repo'); + + expect(result.removed).toHaveLength(0); + }); + + test('removes branch when PR is CLOSED and includeClosed=true', async () => { + mockListByCodebase.mockResolvedValueOnce([ + { + id: 'env-pr-closed-include', + branch_name: 'pr-closed-branch', + working_path: '/workspace/repo/worktrees/pr-closed-branch', + status: 'active', + }, + ]); + mockIsBranchMerged.mockResolvedValueOnce(false); + mockIsPatchEquivalent.mockResolvedValueOnce(false); + mockGetPrState.mockResolvedValueOnce('CLOSED'); + mockGetConversationsUsingEnv.mockResolvedValueOnce([]); + mockGetById.mockResolvedValueOnce({ + id: 'env-pr-closed-include', + working_path: '/workspace/repo/worktrees/pr-closed-branch', + status: 'active', + }); + mockWorktreeExists.mockResolvedValueOnce(true); + + const result = await cleanupMergedWorktrees('codebase-1', '/workspace/repo', { + includeClosed: true, + }); + + expect(result.removed).toContain('pr-closed-branch'); + }); + + test('skips branch when no PR and not merged', async () => { + mockListByCodebase.mockResolvedValueOnce([ + { + id: 'env-none', + branch_name: 'orphan-branch', + working_path: '/workspace/repo/worktrees/orphan-branch', + status: 'active', + }, + ]); + mockIsBranchMerged.mockResolvedValueOnce(false); + mockIsPatchEquivalent.mockResolvedValueOnce(false); + mockGetPrState.mockResolvedValueOnce('NONE'); + + const result = await cleanupMergedWorktrees('codebase-1', '/workspace/repo'); + + expect(result.removed).toHaveLength(0); + expect(result.skipped).toHaveLength(0); + }); + + test('skips branch when isPatchEquivalent throws unexpected error', async () => { + mockListByCodebase.mockResolvedValueOnce([ + { + id: 'env-error', + branch_name: 'error-branch', + working_path: '/workspace/repo/worktrees/error-branch', + status: 'active', + }, + ]); + mockIsBranchMerged.mockResolvedValueOnce(false); + mockIsPatchEquivalent.mockRejectedValueOnce(new Error('permission denied')); + + const result = await cleanupMergedWorktrees('codebase-1', '/workspace/repo'); + + expect(result.removed).toHaveLength(0); + expect(result.skipped).toContainEqual( + expect.objectContaining({ + branchName: 'error-branch', + reason: expect.stringContaining('merge check failed'), + }) + ); + }); }); describe('onConversationClosed', () => { diff --git a/packages/core/src/services/cleanup-service.ts b/packages/core/src/services/cleanup-service.ts index dc9ab69d67..50d9da0d2a 100644 --- a/packages/core/src/services/cleanup-service.ts +++ b/packages/core/src/services/cleanup-service.ts @@ -7,19 +7,20 @@ import * as conversationDb from '../db/conversations'; import * as sessionDb from '../db/sessions'; import { SessionNotFoundError } from '../db/sessions'; import * as codebaseDb from '../db/codebases'; -import { getIsolationProvider } from '@archon/isolation'; -import type { WorktreeStatusBreakdown } from '@archon/isolation'; +import { getIsolationProvider, getPrState } from '@archon/isolation'; +import type { WorktreeStatusBreakdown, PrState } from '@archon/isolation'; import { hasUncommittedChanges, worktreeExists, getDefaultBranch, isBranchMerged, + isPatchEquivalent, getLastCommitDate, toRepoPath, toWorktreePath, toBranchName, } from '@archon/git'; -import type { RepoPath } from '@archon/git'; +import type { RepoPath, BranchName } from '@archon/git'; 
 import { createLogger } from '@archon/paths';
 import type { IsolationEnvironmentRow } from '@archon/isolation';
 import { ConversationNotFoundError } from '../types';
@@ -500,24 +501,69 @@ export async function cleanupStaleWorktrees(
   return result;
 }

+/**
+ * Decide whether a branch is safe to remove using a union of signals:
+ *   (a) git ancestry — `git branch --merged` (catches fast-forward / merge-commit)
+ *   (b) git cherry — patch-equivalent commits (catches squash-merge)
+ *   (c) GitHub PR state via `gh` CLI — MERGED/CLOSED/OPEN
+ *
+ * Returns `{ safe, openPr }`. `openPr=true` only when the PR state is OPEN —
+ * callers use this to surface a clearer skip reason.
+ */
+async function isSafeToRemove(
+  repoPath: RepoPath,
+  branchName: BranchName,
+  mainBranch: BranchName,
+  prStateCache: Map<BranchName, PrState>,
+  includeClosed: boolean
+): Promise<{ safe: boolean; openPr: boolean }> {
+  // (a) Fast path — fast-forward / merge-commit ancestry
+  if (await isBranchMerged(repoPath, branchName, mainBranch)) {
+    return { safe: true, openPr: false };
+  }
+  // (b) Squash-merge detection via patch equivalence
+  if (await isPatchEquivalent(repoPath, branchName, mainBranch)) {
+    return { safe: true, openPr: false };
+  }
+  // (c) GitHub PR state
+  const prState = await getPrState(branchName, repoPath, prStateCache);
+  if (prState === 'MERGED') return { safe: true, openPr: false };
+  if (prState === 'CLOSED') return { safe: includeClosed, openPr: false };
+  if (prState === 'OPEN') return { safe: false, openPr: true };
+  return { safe: false, openPr: false };
+}
+
 /**
  * Clean up merged worktrees for a codebase
  * Respects uncommitted changes and conversation references
  */
 export async function cleanupMergedWorktrees(
   codebaseId: string,
-  mainRepoPath: string
+  mainRepoPath: string,
+  options: { includeClosed?: boolean } = {}
 ): Promise<CleanupOperationResult> {
   const result: CleanupOperationResult = { removed: [], skipped: [] };
   const environments = await isolationEnvDb.listByCodebase(codebaseId);
   const repoPath = toRepoPath(mainRepoPath);
   const mainBranch = await getDefaultBranch(repoPath);
+  const includeClosed = options.includeClosed ??
false;
+  const prStateCache = new Map<BranchName, PrState>();

   for (const env of environments) {
-    // Check if merged (skip env on unexpected errors)
-    let merged = false;
+    // Check if safe to remove via union of signals (skip env on unexpected errors)
+    let safe = false;
+    let openPr = false;
     try {
-      merged = await isBranchMerged(repoPath, toBranchName(env.branch_name), mainBranch);
+      const branchName = toBranchName(env.branch_name);
+      const decision = await isSafeToRemove(
+        repoPath,
+        branchName,
+        mainBranch,
+        prStateCache,
+        includeClosed
+      );
+      safe = decision.safe;
+      openPr = decision.openPr;
     } catch (error) {
       const err = error as Error;
       result.skipped.push({
@@ -526,7 +572,15 @@ export async function cleanupMergedWorktrees(
       });
       continue;
     }
-    if (!merged) continue;
+    if (!safe) {
+      if (openPr) {
+        result.skipped.push({
+          branchName: env.branch_name,
+          reason: 'PR is open (active review)',
+        });
+      }
+      continue;
+    }

     // Check for uncommitted changes or active conversation references
     const blocker = await getRemovalBlocker(env);
diff --git a/packages/core/src/types/index.ts b/packages/core/src/types/index.ts
index 3554c6211d..549891f35e 100644
--- a/packages/core/src/types/index.ts
+++ b/packages/core/src/types/index.ts
@@ -57,6 +57,7 @@ export interface Codebase {
   repository_url: string | null;
   default_cwd: string;
   ai_assistant_type: string;
+  allow_env_keys: boolean;
   commands: Record<string, string>;
   created_at: Date;
   updated_at: Date;
diff --git a/packages/core/src/utils/env-leak-scanner.test.ts b/packages/core/src/utils/env-leak-scanner.test.ts
new file mode 100644
index 0000000000..4d436bbc24
--- /dev/null
+++ b/packages/core/src/utils/env-leak-scanner.test.ts
@@ -0,0 +1,133 @@
+import { describe, it, expect, beforeEach, afterEach } from 'bun:test';
+import { writeFileSync, mkdirSync, rmSync } from 'fs';
+import { join } from 'path';
+import {
+  scanPathForSensitiveKeys,
+  EnvLeakError,
+  formatLeakError,
+  SENSITIVE_KEYS,
+  AUTOLOADED_FILES,
+} from './env-leak-scanner';
+
+describe('scanPathForSensitiveKeys', () => {
+  const tmpDir = '/tmp/archon-test-env-scan';
+
+  beforeEach(() => {
+    mkdirSync(tmpDir, { recursive: true });
+  });
+  afterEach(() => {
+    rmSync(tmpDir, { recursive: true, force: true });
+  });
+
+  it('returns empty findings for clean directory', () => {
+    const report = scanPathForSensitiveKeys(tmpDir);
+    expect(report.findings).toHaveLength(0);
+  });
+
+  it('returns empty findings for non-existent directory', () => {
+    const report = scanPathForSensitiveKeys('/tmp/archon-test-nonexistent-dir');
+    expect(report.findings).toHaveLength(0);
+  });
+
+  // Each sensitive key × each auto-loaded filename
+  for (const key of SENSITIVE_KEYS) {
+    for (const filename of AUTOLOADED_FILES) {
+      it(`detects ${key} in ${filename}`, () => {
+        writeFileSync(join(tmpDir, filename), `${key}=sk-test-value\nOTHER=safe\n`);
+        const report = scanPathForSensitiveKeys(tmpDir);
+        expect(report.findings).toHaveLength(1);
+        expect(report.findings[0].file).toBe(filename);
+        expect(report.findings[0].keys).toContain(key);
+        // Clean up for next iteration
+        rmSync(join(tmpDir, filename));
+      });
+    }
+  }
+
+  it('ignores commented-out keys', () => {
+    writeFileSync(join(tmpDir, '.env'), '# ANTHROPIC_API_KEY=value\n');
+    const report = scanPathForSensitiveKeys(tmpDir);
+    expect(report.findings).toHaveLength(0);
+  });
+
+  it('ignores lines without =', () => {
+    writeFileSync(join(tmpDir, '.env'), 'ANTHROPIC_API_KEY\n');
+    const report = scanPathForSensitiveKeys(tmpDir);
+    expect(report.findings).toHaveLength(0);
+  });
+
+  it('reports
multiple files with findings', () => { + writeFileSync(join(tmpDir, '.env'), 'ANTHROPIC_API_KEY=sk-1\n'); + writeFileSync(join(tmpDir, '.env.local'), 'OPENAI_API_KEY=sk-2\n'); + const report = scanPathForSensitiveKeys(tmpDir); + expect(report.findings).toHaveLength(2); + }); + + it('reports multiple keys in same file', () => { + writeFileSync(join(tmpDir, '.env'), 'ANTHROPIC_API_KEY=sk-1\nOPENAI_API_KEY=sk-2\n'); + const report = scanPathForSensitiveKeys(tmpDir); + expect(report.findings).toHaveLength(1); + expect(report.findings[0].keys).toHaveLength(2); + }); + + it('ignores non-autoloaded filenames', () => { + writeFileSync(join(tmpDir, '.env.secrets'), 'ANTHROPIC_API_KEY=sk-1\n'); + const report = scanPathForSensitiveKeys(tmpDir); + expect(report.findings).toHaveLength(0); + }); + + it('ignores safe keys', () => { + writeFileSync(join(tmpDir, '.env'), 'DATABASE_URL=postgres://localhost\nNODE_ENV=dev\n'); + const report = scanPathForSensitiveKeys(tmpDir); + expect(report.findings).toHaveLength(0); + }); +}); + +describe('EnvLeakError', () => { + it('is instanceof EnvLeakError and Error', () => { + const report = { path: '/tmp', findings: [{ file: '.env', keys: ['ANTHROPIC_API_KEY'] }] }; + const err = new EnvLeakError(report); + expect(err).toBeInstanceOf(Error); + expect(err).toBeInstanceOf(EnvLeakError); + expect(err.name).toBe('EnvLeakError'); + expect(err.message).toContain('ANTHROPIC_API_KEY'); + expect(err.report).toBe(report); + }); + + it('defaults context to register-ui and stores it on the error', () => { + const report = { path: '/x', findings: [{ file: '.env', keys: ['ANTHROPIC_API_KEY'] }] }; + const err = new EnvLeakError(report); + expect(err.context).toBe('register-ui'); + expect(err.message).toContain('Add Project'); + }); + + it('produces distinct remediation bodies per context', () => { + const report = { path: '/x', findings: [{ file: '.env', keys: ['ANTHROPIC_API_KEY'] }] }; + const ui = formatLeakError(report, 'register-ui'); + const cli = formatLeakError(report, 'register-cli'); + const spawn = formatLeakError(report, 'spawn-existing'); + expect(ui).toContain('Add Project'); + expect(cli).toContain('--allow-env-keys'); + expect(cli).toContain('allow_target_repo_keys'); + expect(spawn).toContain('Settings'); + expect(spawn).toContain('already-registered'); + // headers differ between register and spawn + expect(ui).toContain('Cannot add codebase'); + expect(spawn).toContain('Cannot run workflow'); + }); + + it('formats multiple findings', () => { + const report = { + path: '/test', + findings: [ + { file: '.env', keys: ['ANTHROPIC_API_KEY'] }, + { file: '.env.local', keys: ['OPENAI_API_KEY', 'GEMINI_API_KEY'] }, + ], + }; + const err = new EnvLeakError(report); + expect(err.message).toContain('.env'); + expect(err.message).toContain('.env.local'); + expect(err.message).toContain('OPENAI_API_KEY'); + expect(err.message).toContain('GEMINI_API_KEY'); + }); +}); diff --git a/packages/core/src/utils/env-leak-scanner.ts b/packages/core/src/utils/env-leak-scanner.ts new file mode 100644 index 0000000000..48edc2c6b7 --- /dev/null +++ b/packages/core/src/utils/env-leak-scanner.ts @@ -0,0 +1,155 @@ +import { readFileSync, existsSync } from 'fs'; +import { join } from 'path'; + +export const SENSITIVE_KEYS = new Set([ + 'ANTHROPIC_API_KEY', + 'ANTHROPIC_AUTH_TOKEN', + 'CLAUDE_API_KEY', + 'CLAUDE_CODE_OAUTH_TOKEN', + 'OPENAI_API_KEY', + 'CODEX_API_KEY', + 'GEMINI_API_KEY', +]); + +export const AUTOLOADED_FILES = [ + '.env', + '.env.local', + '.env.development', + 
'.env.production', + '.env.development.local', + '.env.production.local', +]; + +export interface LeakFinding { + file: string; + keys: string[]; +} + +export interface LeakReport { + path: string; + findings: LeakFinding[]; +} + +/** + * Context in which the env-leak error is being surfaced. Drives the remediation + * copy so users see guidance that matches how they hit the gate. + * + * - `register-ui`: Add-Project flow in the Web UI (checkbox is visible) + * - `register-cli`: CLI auto-register path (no Web UI) + * - `spawn-existing`: Pre-spawn check for an already-registered codebase + */ +export type LeakErrorContext = 'register-ui' | 'register-cli' | 'spawn-existing'; + +export class EnvLeakError extends Error { + public readonly context: LeakErrorContext; + constructor( + public readonly report: LeakReport, + context: LeakErrorContext = 'register-ui' + ) { + super(formatLeakError(report, context)); + this.name = 'EnvLeakError'; + this.context = context; + } +} + +/** + * Scan `dirPath` for auto-loaded .env files containing sensitive keys. + * Pure function — no side effects. + */ +export function scanPathForSensitiveKeys(dirPath: string): LeakReport { + const findings: LeakFinding[] = []; + + for (const filename of AUTOLOADED_FILES) { + const fullPath = join(dirPath, filename); + if (!existsSync(fullPath)) continue; + + let contents: string; + try { + contents = readFileSync(fullPath, 'utf8'); + } catch (err) { + // File exists but is unreadable — treat as a finding to avoid silently bypassing the gate + const code = (err as NodeJS.ErrnoException).code; + findings.push({ file: filename, keys: [`[unreadable — ${code ?? 'unknown error'}]`] }); + continue; + } + + const foundKeys: string[] = []; + for (const line of contents.split('\n')) { + const trimmed = line.trim(); + if (trimmed.startsWith('#') || !trimmed.includes('=')) continue; + const key = trimmed.split('=')[0].trim(); + if (SENSITIVE_KEYS.has(key)) { + foundKeys.push(key); + } + } + + if (foundKeys.length > 0) { + findings.push({ file: filename, keys: foundKeys }); + } + } + + return { path: dirPath, findings }; +} + +/** + * Exhaustive per-context consent remediation copy. Using `switch` with a + * `never` default means adding a new `LeakErrorContext` variant without + * handling it here is a compile error — important for a security-visible path. + */ +function consentCopy(context: LeakErrorContext): string { + switch (context) { + case 'register-cli': + return ` 3. Acknowledge the risk and allow this codebase to use its .env key: + Re-run the CLI command with --allow-env-keys, or set + 'allow_target_repo_keys: true' in ~/.archon/config.yaml to bypass this + gate globally.`; + case 'spawn-existing': + return ` 3. Acknowledge the risk for this already-registered codebase: + Open the Web UI (Settings → Projects), find this project, and toggle + "Allow env keys". Or set 'allow_target_repo_keys: true' in + ~/.archon/config.yaml to bypass this gate globally.`; + case 'register-ui': + return ` 3. Acknowledge the risk and allow this codebase to use its .env key: + Open the web UI (Settings → Projects → Add Project) and tick + "Allow env keys (I understand the risk)" when adding this project.`; + default: { + const exhaustive: never = context; + return exhaustive; + } + } +} + +export function formatLeakError( + report: LeakReport, + context: LeakErrorContext = 'register-ui' +): string { + const fileList = report.findings.map(f => ` ${f.file} — ${f.keys.join(', ')}`).join('\n'); + + const header = + context === 'spawn-existing' + ? 
`Cannot run workflow — ${report.path} contains keys that will leak into AI subprocesses`
+      : `Cannot add codebase — ${report.path} contains keys that will leak into AI subprocesses`;

+  const consent = consentCopy(context);
+
+  return `${header}
+
+  Found:
+${fileList}
+
+  Why this matters:
+    Bun subprocesses auto-load .env from their working directory. Archon cleans
+    its own environment, but Claude/Codex subprocesses running with cwd=<target repo>
+    will re-inject these keys at their own startup, bypassing Archon's allowlist.
+    This can bill the wrong API account silently.
+
+  Choose one:
+    1. Remove the key from this repo's .env (recommended):
+         grep -v '^ANTHROPIC_API_KEY=' .env > .env.tmp && mv .env.tmp .env
+
+    2. Rename to a non-auto-loaded file:
+         mv .env .env.secrets
+         # update your app to load it explicitly
+
+${consent}`;
+}
diff --git a/packages/docs-web/src/content/docs/book/isolation.md b/packages/docs-web/src/content/docs/book/isolation.md
index f88d44ecf6..a1bef0c019 100644
--- a/packages/docs-web/src/content/docs/book/isolation.md
+++ b/packages/docs-web/src/content/docs/book/isolation.md
@@ -90,6 +90,13 @@ archon isolation cleanup --merged

 This also deletes the remote branches — a clean sweep after a round of PRs.

+By default, branches with open or closed-without-merging PRs are skipped to avoid
+accidental deletion. To also clean up abandoned (CLOSED) PRs:
+
+```bash
+archon isolation cleanup --merged --include-closed
+```
+
 ### Completing a Branch Lifecycle

 When a PR is merged and you want to remove everything — the worktree, the local branch, and the remote branch — use `complete`:
diff --git a/packages/docs-web/src/content/docs/book/quick-reference.md b/packages/docs-web/src/content/docs/book/quick-reference.md
index 107e3e55e9..ede87c0dab 100644
--- a/packages/docs-web/src/content/docs/book/quick-reference.md
+++ b/packages/docs-web/src/content/docs/book/quick-reference.md
@@ -37,6 +37,7 @@ This chapter collects every CLI command, variable, and YAML option in one place.
| `archon isolation cleanup` | Remove stale worktrees (older than 7 days) | | `archon isolation cleanup ` | Remove stale worktrees older than N days | | `archon isolation cleanup --merged` | Remove worktrees whose branches merged into main | +| `archon isolation cleanup --merged --include-closed` | Also remove worktrees with closed (abandoned) PRs | ### `archon complete` diff --git a/packages/docs-web/src/content/docs/contributing/cli-internals.md b/packages/docs-web/src/content/docs/contributing/cli-internals.md index b2b4ad04ca..b644eb2246 100644 --- a/packages/docs-web/src/content/docs/contributing/cli-internals.md +++ b/packages/docs-web/src/content/docs/contributing/cli-internals.md @@ -269,6 +269,50 @@ packages/cli/ --- +## `isolation cleanup --merged` Flow + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ archon isolation cleanup --merged [--include-closed] │ +└─────────────────────────────────┬───────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ isolation.ts isolationCleanupMergedCommand({ includeClosed }) │ +│ For each codebase → cleanupMergedWorktrees(codebaseId, path) │ +└─────────────────────────────────┬───────────────────────────────┘ + │ + For each active environment + │ + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ isSafeToRemove() — three-signal union │ +│ (a) isBranchMerged() git ancestry (fast-forward/merge) │ +│ (b) isPatchEquivalent() git cherry (squash-merge) │ +│ (c) getPrState() gh CLI (MERGED/CLOSED/OPEN/NONE) │ +│ │ +│ OPEN → always skip │ +│ CLOSED → skip unless includeClosed=true │ +│ MERGED or any git-signal → proceed to remove │ +└─────────────────────────────────┬───────────────────────────────┘ + │ safe=true + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ Guard checks: no uncommitted changes, no active conversations │ +│ provider.destroy() → remove worktree + delete remote branch │ +└─────────────────────────────────────────────────────────────────┘ +``` + +Signals are evaluated in order — the first positive match short-circuits to avoid +unnecessary `gh` API calls. The `gh` CLI is a soft dependency: if missing or failing, +only git signals are used and the result degrades gracefully to `NONE`. + +**Code:** `packages/core/src/services/cleanup-service.ts` — `isSafeToRemove()`, `cleanupMergedWorktrees()` +**Code:** `packages/isolation/src/pr-state.ts` — `getPrState()` +**Code:** `packages/git/src/branch.ts` — `isPatchEquivalent()` + +--- + ## CLI Adapter Implements `IPlatformAdapter` for terminal output. 
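
A quick way to see what signal (b) in the `isolation cleanup --merged` flow above keys off is to run `git cherry` by hand. A minimal illustration — the branch name and commit hashes below are hypothetical; only the leading `-`/`+` markers matter:

```bash
# feature-x was squash-merged into main via a PR, then one extra commit was added.
$ git cherry main feature-x
- 1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b   # patch already in main (squash-merged)
+ 0b9a8f7e6d5c4b3a2f1e0d9c8b7a6f5e4d3c2b1a   # patch NOT in main (genuinely unmerged)

# isPatchEquivalent() returns true only when every line starts with '-'
# (or when there are no commits to compare at all).
```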
diff --git a/packages/docs-web/src/content/docs/getting-started/overview.md b/packages/docs-web/src/content/docs/getting-started/overview.md index de38637d3b..f1d58ae402 100644 --- a/packages/docs-web/src/content/docs/getting-started/overview.md +++ b/packages/docs-web/src/content/docs/getting-started/overview.md @@ -316,6 +316,7 @@ archon workflow run --cwd /path/to/repo "" | `archon isolation list` | List active worktrees | | `archon isolation cleanup [days]` | Remove stale environments | | `archon isolation cleanup --merged` | Remove merged branches | +| `archon isolation cleanup --merged --include-closed` | Also remove closed (abandoned) PR branches | | `archon complete ` | Complete branch lifecycle | | `archon validate workflows [name]` | Validate workflow definitions | | `archon validate commands [name]` | Validate command files | @@ -327,7 +328,8 @@ archon workflow run --cwd /path/to/repo "" archon isolation list # show active worktrees archon isolation cleanup # remove stale (>7 days) archon isolation cleanup 14 # custom staleness threshold -archon isolation cleanup --merged # remove merged branches (deletes remote too) +archon isolation cleanup --merged # remove merged branches (deletes remote too) +archon isolation cleanup --merged --include-closed # also remove closed/abandoned PR branches archon complete # complete branch lifecycle (worktree + branches) archon complete --force # skip uncommitted-changes check ``` diff --git a/packages/docs-web/src/content/docs/reference/api.md b/packages/docs-web/src/content/docs/reference/api.md index cdde8df067..0df9cbbfd0 100644 --- a/packages/docs-web/src/content/docs/reference/api.md +++ b/packages/docs-web/src/content/docs/reference/api.md @@ -138,6 +138,7 @@ Performs a soft delete -- the conversation is hidden but not destroyed. | GET | `/api/codebases` | List registered codebases | | GET | `/api/codebases/{id}` | Get a single codebase | | POST | `/api/codebases` | Register a codebase (clone or local path) | +| PATCH | `/api/codebases/{id}` | Update env-key consent (`allowEnvKeys`) | | DELETE | `/api/codebases/{id}` | Delete a codebase and clean up resources | | GET | `/api/codebases/{id}/environments` | List isolation environments for a codebase | @@ -165,6 +166,16 @@ curl -X POST http://localhost:3090/api/codebases \ -d '{"path": "/home/user/projects/my-repo"}' ``` +### Update Env-Key Consent + +Flip the env-leak-gate consent bit (`allow_env_keys`) on an existing codebase. Audit-logged on every grant and revoke as `env_leak_consent_granted` / `env_leak_consent_revoked` (warn-level) including `codebaseId`, `path`, scanned `files`, matched `keys`, `scanStatus`, and `actor`. 
+ +```bash +curl -X PATCH http://localhost:3090/api/codebases/{id} \ + -H "Content-Type: application/json" \ + -d '{"allowEnvKeys": true}' +``` + ### Delete a Codebase ```bash diff --git a/packages/docs-web/src/content/docs/reference/cli.md b/packages/docs-web/src/content/docs/reference/cli.md index 688276c28f..6148fb8bc6 100644 --- a/packages/docs-web/src/content/docs/reference/cli.md +++ b/packages/docs-web/src/content/docs/reference/cli.md @@ -122,6 +122,7 @@ Progress events (node start/complete/fail/skip, approval gates) are written to s | `--from `, `--from-branch ` | Override base branch (start-point for worktree) | | `--no-worktree` | Opt out of isolation -- run directly in live checkout | | `--resume` | Resume from last failed run at the working path (skips completed nodes) | +| `--allow-env-keys` | Grant env-leak-gate consent during auto-registration (bypasses the gate for this codebase). Audit-logged as `env_leak_consent_granted` with `actor: 'user-cli'`. See [security.md](/reference/security/#env-leak-gate-target-repo-env-keys). | | `--quiet`, `-q` | Suppress all progress output to stderr | | `--verbose`, `-v` | Also show tool-level events (tool name and duration) | @@ -246,8 +247,18 @@ archon isolation cleanup 14 # Remove environments with branches merged into main (also deletes remote branches) archon isolation cleanup --merged + +# Also remove environments whose PRs were closed without merging +archon isolation cleanup --merged --include-closed ``` +Merge detection uses three signals in order: git branch ancestry (fast-forward / merge commit), +patch equivalence (squash-merge via `git cherry`), and GitHub PR state via the `gh` CLI. +The `gh` CLI is optional — if absent, only git signals are used. + +By default, branches with a **CLOSED** PR are skipped. Pass `--include-closed` to clean +those up as well. Branches with an **OPEN** PR are always skipped. + ### `validate workflows [name]` Validate workflow YAML definitions and their referenced resources (command files, MCP configs, skill directories). diff --git a/packages/docs-web/src/content/docs/reference/configuration.md b/packages/docs-web/src/content/docs/reference/configuration.md index 78a9fbce77..e636957b23 100644 --- a/packages/docs-web/src/content/docs/reference/configuration.md +++ b/packages/docs-web/src/content/docs/reference/configuration.md @@ -82,6 +82,12 @@ paths: # Concurrency limits concurrency: maxConversations: 10 + +# Env-leak gate bypass (last resort — weakens a security control) +# allow_target_repo_keys: false # Set true to skip the env-leak-gate + # globally for all codebases on this machine. + # `env_leak_gate_disabled` is logged once per + # process per source. See security.md. ``` ## Repository Configuration @@ -128,6 +134,12 @@ defaults: # env: # MY_API_KEY: value # CUSTOM_ENDPOINT: https://... + +# Per-repo override for the env-leak-gate bypass. +# Set to `false` to re-enable the gate for THIS repo even when the global +# config has `allow_target_repo_keys: true`. Set to `true` to grant the +# bypass for THIS repo only. Wins over the global flag in either direction. 
+# allow_target_repo_keys: false ``` ### Claude settingSources diff --git a/packages/docs-web/src/content/docs/reference/security.md b/packages/docs-web/src/content/docs/reference/security.md index c834418e8e..14195a7374 100644 --- a/packages/docs-web/src/content/docs/reference/security.md +++ b/packages/docs-web/src/content/docs/reference/security.md @@ -122,6 +122,37 @@ The GitHub and Gitea adapters verify webhook signatures to ensure payloads origi - When running inside a target repository, Bun auto-loads that repo's `.env` before any Archon code runs. Both the CLI and server strip every key parsed from the CWD `.env` at startup, then load only `~/.archon/.env` (which always wins via `override: true`). This prevents target-repo secrets (e.g. `ANTHROPIC_API_KEY`, `DATABASE_URL`, `OPENAI_API_KEY`) from bleeding into Archon or its subprocesses. - Claude Code subprocesses receive only an explicit allowlist of env vars (system essentials, Claude auth, Archon runtime config, git identity, GitHub tokens). Per-codebase env vars configured via `codebase_env_vars` or `.archon/config.yaml` `env:` are merged on top of this filtered base. +### Env-leak gate (target repo `.env` keys) + +Archon scrubs its own environment, but **Bun auto-loads `.env` from the subprocess working directory** before any user code runs. That means a Claude or Codex subprocess started with `cwd=/path/to/target/repo` will re-inject any sensitive keys present in that repo's auto-loaded `.env` files — bypassing the allowlist above and silently billing the wrong API account. + +**What Archon scans:** auto-loaded filenames `.env`, `.env.local`, `.env.development`, `.env.production`, `.env.development.local`, `.env.production.local`. + +**Scanned keys:** `ANTHROPIC_API_KEY`, `ANTHROPIC_AUTH_TOKEN`, `CLAUDE_API_KEY`, `CLAUDE_CODE_OAUTH_TOKEN`, `OPENAI_API_KEY`, `CODEX_API_KEY`, `GEMINI_API_KEY`. + +:::caution +Renaming the file to `.env.local`, `.env.development`, etc. **does not work** — Bun auto-loads those too. Only `.env.secrets` (or any non-auto-loaded name) is safe. +::: + +**Where the gate runs:** + +| Failure point | When | What you see | +| --- | --- | --- | +| Registration (Web UI) | Adding a project via Settings → Add Project | 422 with the "Allow env keys" checkbox shown inline | +| Registration (CLI) | First `archon workflow run --cwd ` auto-registers | Error message points at `--allow-env-keys` and the global config flag | +| Pre-spawn | Existing codebase, before each Claude/Codex query | Error message points at Settings → Projects → "Allow env keys" toggle | + +**Primary remediation (recommended):** +1. Remove the key from the target repo's `.env`, or +2. Rename the file to `.env.secrets` and load it explicitly from your app code. + +**Secondary remediation (consent grants):** +- **Web UI:** Settings → Projects → click "Allow env keys" on the row. Revoke from the same place. Each grant/revoke writes a `warn`-level audit log (`env_leak_consent_granted` / `env_leak_consent_revoked`) including `codebaseId`, `path`, scanned `files`, matched `keys`, `scanStatus` (`'ok'` or `'skipped'`), and `actor`. +- **CLI:** `archon workflow run "your message" --cwd --allow-env-keys` grants consent during this run's auto-registration. The grant is persisted (the codebase row is created with `allow_env_keys = true`) and logged as `env_leak_consent_granted` with `actor: 'user-cli'`. +- **Global bypass:** set `allow_target_repo_keys: true` in `~/.archon/config.yaml` to disable the gate for all codebases on this machine. 
`env_leak_gate_disabled` is logged at most once per process per source (global vs. repo) the first time `loadConfig` resolves the bypass as active. A repo-level `.archon/config.yaml` with `allow_target_repo_keys: false` re-enables the gate for that repo.
+
+**Startup scan:** When `allow_target_repo_keys` is not set, the server scans every registered codebase with `allow_env_keys = false` and emits one `startup_env_leak_gate_will_block` warning per codebase **that has findings** (i.e. would actually be blocked). This gives you a chance to grant consent before hitting a fatal error mid-workflow. The scan is skipped entirely when the global bypass is active.
+
 **CORS:**

 - API routes use `WEB_UI_ORIGIN` to restrict CORS. The default is `*` (allow all), which is appropriate for local single-developer use. Set a specific origin when exposing the server publicly.
diff --git a/packages/git/src/branch.ts b/packages/git/src/branch.ts
index 6e59ee8499..7307da895c 100644
--- a/packages/git/src/branch.ts
+++ b/packages/git/src/branch.ts
@@ -220,6 +220,56 @@ export async function isBranchMerged(
   }
 }

+/**
+ * Check if a branch is patch-equivalent to the upstream (e.g. squash-merged).
+ *
+ * Uses `git cherry <upstream> <head>`, which lists branch commits not in upstream:
+ *   - `- <commit>` means the patch IS already in upstream (squash-merged / cherry-picked)
+ *   - `+ <commit>` means the patch is NOT in upstream (genuinely unmerged)
+ *
+ * Returns true if every reported commit is patch-equivalent (or if there are no
+ * commits to compare). Returns false if any commit is unmerged.
+ *
+ * Returns false for expected errors (branch/repo not found).
+ * Throws for unexpected errors (permission denied, corruption).
+ */
+export async function isPatchEquivalent(
+  repoPath: RepoPath,
+  branchName: BranchName,
+  baseBranch: BranchName
+): Promise<boolean> {
+  try {
+    const { stdout } = await execFileAsync(
+      'git',
+      ['-C', repoPath, 'cherry', baseBranch, branchName],
+      { timeout: 15000 }
+    );
+    const lines = stdout.split('\n').filter(line => line.trim());
+    if (lines.length === 0) return true;
+    return lines.every(line => line.startsWith('-'));
+  } catch (error) {
+    const err = error as Error & { code?: string; stderr?: string };
+    const errorText = `${err.message} ${err.stderr ?? ''}`.toLowerCase();
+
+    const isExpectedError =
+      errorText.includes('not a git repository') ||
+      errorText.includes('unknown revision') ||
+      errorText.includes('bad revision') ||
+      errorText.includes('no such file') ||
+      err.code === 'ENOENT';
+
+    if (isExpectedError) return false;
+
+    getLog().error(
+      { err: error, repoPath, branchName, baseBranch },
+      'branch.patch_equivalent_check_failed'
+    );
+    throw new Error(
+      `Failed to check if ${branchName} is patch-equivalent to ${baseBranch}: ${err.message}`
+    );
+  }
+}
+
 /**
  * Check if a ref is an ancestor of HEAD in the given working directory.
 *
diff --git a/packages/git/src/git.test.ts b/packages/git/src/git.test.ts
index 4dfecdcdc4..9c3287b04b 100644
--- a/packages/git/src/git.test.ts
+++ b/packages/git/src/git.test.ts
@@ -1064,6 +1064,54 @@ branch refs/heads/feature/auth
     });
   });

+  describe('isPatchEquivalent', () => {
+    let execSpy: Mock<typeof git.execFileAsync>;
+
+    beforeEach(() => {
+      execSpy = spyOn(git, 'execFileAsync');
+    });
+
+    afterEach(() => {
+      execSpy.mockRestore();
+    });
+
+    test('returns true when all cherry lines start with -', async () => {
+      execSpy.mockResolvedValue({ stdout: '- abc123\n- def456\n', stderr: '' });
+      const result = await git.isPatchEquivalent('/workspace/repo', 'feature', 'main');
+      expect(result).toBe(true);
+    });
+
+    test('returns false when any cherry line starts with +', async () => {
+      execSpy.mockResolvedValue({ stdout: '- abc123\n+ def456\n', stderr: '' });
+      const result = await git.isPatchEquivalent('/workspace/repo', 'feature', 'main');
+      expect(result).toBe(false);
+    });
+
+    test('returns true for empty cherry output', async () => {
+      execSpy.mockResolvedValue({ stdout: '', stderr: '' });
+      const result = await git.isPatchEquivalent('/workspace/repo', 'feature', 'main');
+      expect(result).toBe(true);
+    });
+
+    test('returns false on expected errors (not a git repo)', async () => {
+      execSpy.mockRejectedValue(new Error('fatal: not a git repository'));
+      const result = await git.isPatchEquivalent('/workspace/repo', 'feature', 'main');
+      expect(result).toBe(false);
+    });
+
+    test('throws on unexpected errors', async () => {
+      mockLogger.error.mockClear();
+      execSpy.mockRejectedValue(new Error('fatal: permission denied'));
+      await expect(git.isPatchEquivalent('/workspace/repo', 'feature', 'main')).rejects.toThrow(
+        'Failed to check if feature is patch-equivalent to main'
+      );
+      expect(mockLogger.error).toHaveBeenCalledWith(
+        expect.objectContaining({ branchName: 'feature', baseBranch: 'main' }),
+        'branch.patch_equivalent_check_failed'
+      );
+    });
+  });
+
   describe('getLastCommitDate', () => {
     let execSpy: Mock<typeof git.execFileAsync>;
diff --git a/packages/git/src/index.ts b/packages/git/src/index.ts
index 6f8b0f14d3..8cfdc865f7 100644
--- a/packages/git/src/index.ts
+++ b/packages/git/src/index.ts
@@ -33,6 +33,7 @@ export {
   hasUncommittedChanges,
   commitAllChanges,
   isBranchMerged,
+  isPatchEquivalent,
   isAncestorOf,
   getLastCommitDate,
 } from './branch';
diff --git a/packages/isolation/package.json b/packages/isolation/package.json
index a7b6483199..8da089dc64 100644
--- a/packages/isolation/package.json
+++ b/packages/isolation/package.json
@@ -8,7 +8,7 @@
     ".": "./src/index.ts"
   },
   "scripts": {
-    "test": "bun test src/resolver.test.ts && bun test src/providers/ && bun test src/errors.test.ts src/factory.test.ts src/worktree-copy.test.ts",
+    "test": "bun test src/resolver.test.ts && bun test src/providers/ && bun test src/errors.test.ts src/factory.test.ts src/worktree-copy.test.ts && bun test src/pr-state.test.ts",
     "type-check": "bun x tsc --noEmit"
   },
   "dependencies": {
diff --git a/packages/isolation/src/index.ts b/packages/isolation/src/index.ts
index 28864bd41c..191e0af9c6 100644
--- a/packages/isolation/src/index.ts
+++ b/packages/isolation/src/index.ts
@@ -48,6 +48,10 @@ export type { IsolationResolverDeps } from './resolver';
 // --- Provider ---
 export { WorktreeProvider } from './providers/worktree';

+// --- PR state lookup ---
+export { getPrState } from './pr-state';
+export type { PrState } from './pr-state';
+
 // --- Worktree copy utility ---
 export {
   copyWorktreeFiles,
diff --git a/packages/isolation/src/pr-state.test.ts
b/packages/isolation/src/pr-state.test.ts
new file mode 100644
index 0000000000..4dd67ab182
--- /dev/null
+++ b/packages/isolation/src/pr-state.test.ts
@@ -0,0 +1,112 @@
+import { describe, test, expect, beforeEach, mock } from 'bun:test';
+
+const mockLogger = {
+  fatal: mock(() => undefined),
+  error: mock(() => undefined),
+  warn: mock(() => undefined),
+  info: mock(() => undefined),
+  debug: mock(() => undefined),
+  trace: mock(() => undefined),
+  child: mock(() => mockLogger),
+};
+mock.module('@archon/paths', () => ({
+  createLogger: mock(() => mockLogger),
+}));
+
+interface ExecResult {
+  stdout: string;
+  stderr: string;
+}
+
+const mockExecFileAsync = mock(
+  (_cmd: string, _args: string[]): Promise<ExecResult> =>
+    Promise.resolve({ stdout: '', stderr: '' })
+);
+mock.module('@archon/git', () => ({
+  execFileAsync: mockExecFileAsync,
+  toRepoPath: (p: string) => p,
+  toBranchName: (b: string) => b,
+}));
+
+import { getPrState, type PrState } from './pr-state';
+
+const REPO = '/workspace/repo';
+const BRANCH = 'feature-branch';
+
+function setupGhResponse(remoteUrl: string, ghStdout: string | Error): void {
+  mockExecFileAsync.mockReset();
+  mockExecFileAsync.mockImplementation((cmd: string, _args: string[]) => {
+    if (cmd === 'git') return Promise.resolve({ stdout: remoteUrl, stderr: '' });
+    if (cmd === 'gh') {
+      if (ghStdout instanceof Error) return Promise.reject(ghStdout);
+      return Promise.resolve({ stdout: ghStdout, stderr: '' });
+    }
+    return Promise.resolve({ stdout: '', stderr: '' });
+  });
+}
+
+describe('getPrState', () => {
+  test('returns MERGED when gh reports MERGED', async () => {
+    setupGhResponse('https://github.com/owner/repo.git', '[{"state":"MERGED"}]');
+    const result = await getPrState(BRANCH, REPO);
+    expect(result).toBe('MERGED');
+  });
+
+  test('returns OPEN when gh reports OPEN', async () => {
+    setupGhResponse('https://github.com/owner/repo.git', '[{"state":"OPEN"}]');
+    const result = await getPrState(BRANCH, REPO);
+    expect(result).toBe('OPEN');
+  });
+
+  test('returns CLOSED when gh reports CLOSED', async () => {
+    setupGhResponse('https://github.com/owner/repo.git', '[{"state":"CLOSED"}]');
+    const result = await getPrState(BRANCH, REPO);
+    expect(result).toBe('CLOSED');
+  });
+
+  test('returns NONE when gh returns empty array (no PR)', async () => {
+    setupGhResponse('https://github.com/owner/repo.git', '[]');
+    const result = await getPrState(BRANCH, REPO);
+    expect(result).toBe('NONE');
+  });
+
+  test('returns NONE when gh CLI is not installed (ENOENT)', async () => {
+    const enoent = Object.assign(new Error('spawn gh ENOENT'), { code: 'ENOENT' });
+    setupGhResponse('https://github.com/owner/repo.git', enoent);
+    const result = await getPrState(BRANCH, REPO);
+    expect(result).toBe('NONE');
+  });
+
+  test('returns NONE for non-GitHub remote URL', async () => {
+    setupGhResponse('https://gitlab.com/owner/repo.git', '[{"state":"MERGED"}]');
+    const result = await getPrState(BRANCH, REPO);
+    expect(result).toBe('NONE');
+  });
+
+  test('uses cache on subsequent lookups for same branch', async () => {
+    setupGhResponse('https://github.com/owner/repo.git', '[{"state":"MERGED"}]');
+    const cache = new Map<string, PrState>();
+    const first = await getPrState(BRANCH, REPO, cache);
+    const callsAfterFirst = mockExecFileAsync.mock.calls.length;
+    const second = await getPrState(BRANCH, REPO, cache);
+    expect(first).toBe('MERGED');
+    expect(second).toBe('MERGED');
+    expect(mockExecFileAsync.mock.calls.length).toBe(callsAfterFirst);
+  });
+
+  test('returns NONE and warns on non-ENOENT
gh error (e.g. auth failure)', async () => {
+    const authError = Object.assign(new Error('gh: authentication required'), {
+      code: 'ERR_CMD_FAILED',
+    });
+    setupGhResponse('https://github.com/owner/repo.git', authError);
+    const result = await getPrState(BRANCH, REPO);
+    expect(result).toBe('NONE');
+    expect(mockLogger.warn).toHaveBeenCalled();
+  });
+
+  test('returns NONE when gh returns malformed JSON (e.g. auth error mixed with output)', async () => {
+    setupGhResponse('https://github.com/owner/repo.git', 'error: not logged into github.com\n[]');
+    const result = await getPrState(BRANCH, REPO);
+    expect(result).toBe('NONE');
+  });
+});
diff --git a/packages/isolation/src/pr-state.ts b/packages/isolation/src/pr-state.ts
new file mode 100644
index 0000000000..2f4e3c87c3
--- /dev/null
+++ b/packages/isolation/src/pr-state.ts
@@ -0,0 +1,91 @@
+/**
+ * PR state lookup via the `gh` CLI.
+ *
+ * Used by cleanup to detect squash-merged or closed PRs that git ancestry
+ * checks miss. The `gh` CLI is a soft dependency — if it's missing or fails,
+ * we return 'NONE' and let callers fall back to git-only signals.
+ */
+import { execFileAsync } from '@archon/git';
+import type { BranchName, RepoPath } from '@archon/git';
+import { createLogger } from '@archon/paths';
+
+/** Lazy-initialized logger (deferred so test mocks can intercept createLogger) */
+let cachedLog: ReturnType<typeof createLogger> | undefined;
+function getLog(): ReturnType<typeof createLogger> {
+  if (!cachedLog) cachedLog = createLogger('isolation');
+  return cachedLog;
+}
+
+export type PrState = 'MERGED' | 'CLOSED' | 'OPEN' | 'NONE';
+
+/**
+ * Look up the PR state for a branch in the GitHub remote.
+ *
+ * Returns:
+ *   - 'MERGED' / 'CLOSED' / 'OPEN' if a PR exists with that head branch
+ *   - 'NONE' if no PR exists, gh is unavailable, or the remote is not GitHub
+ *
+ * The optional `cache` map dedupes lookups within a single cleanup invocation.
+ */
+export async function getPrState(
+  branch: BranchName,
+  repoPath: RepoPath,
+  cache?: Map<BranchName, PrState>
+): Promise<PrState> {
+  const cached = cache?.get(branch);
+  if (cached !== undefined) {
+    return cached;
+  }
+
+  // Check whether the remote is on GitHub. Non-GitHub remotes are out of scope.
+ let remoteUrl = ''; + try { + const { stdout } = await execFileAsync('git', ['-C', repoPath, 'remote', 'get-url', 'origin'], { + timeout: 10000, + }); + remoteUrl = stdout.trim(); + } catch (error) { + getLog().debug( + { err: error as Error, repoPath, branch }, + 'isolation.pr_state_remote_lookup_failed' + ); + cache?.set(branch, 'NONE'); + return 'NONE'; + } + + if (!remoteUrl.toLowerCase().includes('github.com')) { + getLog().debug({ repoPath, branch, remoteUrl }, 'isolation.pr_state_github_only'); + cache?.set(branch, 'NONE'); + return 'NONE'; + } + + let result: PrState = 'NONE'; + let ghStdout = ''; + try { + const { stdout } = await execFileAsync( + 'gh', + ['pr', 'list', '--head', branch, '--state', 'all', '--json', 'state', '--limit', '1'], + { timeout: 15000, cwd: repoPath } + ); + ghStdout = stdout; + const parsed = JSON.parse(stdout) as { state?: string }[]; + const state = parsed[0]?.state; + if (state === 'MERGED' || state === 'CLOSED' || state === 'OPEN') { + result = state; + } + } catch (error) { + const err = error as NodeJS.ErrnoException; + const isNotInstalled = err.code === 'ENOENT' || err.message.includes('command not found'); + if (isNotInstalled) { + getLog().debug({ branch, repoPath }, 'isolation.pr_state_gh_not_installed'); + } else { + getLog().warn( + { err, branch, repoPath, ghStdout: ghStdout || undefined }, + 'isolation.pr_state_lookup_failed' + ); + } + } + + cache?.set(branch, result); + return result; +} diff --git a/packages/paths/package.json b/packages/paths/package.json index 8d52e684b9..fe958ad7f4 100644 --- a/packages/paths/package.json +++ b/packages/paths/package.json @@ -12,9 +12,7 @@ "type-check": "bun x tsc --noEmit" }, "dependencies": { - "pino": "^9" - }, - "optionalDependencies": { + "pino": "^9", "pino-pretty": "^13" }, "peerDependencies": { diff --git a/packages/paths/src/bundled-build.test.ts b/packages/paths/src/bundled-build.test.ts new file mode 100644 index 0000000000..9c1ade23c6 --- /dev/null +++ b/packages/paths/src/bundled-build.test.ts @@ -0,0 +1,19 @@ +import { describe, expect, it } from 'bun:test'; +import { BUNDLED_GIT_COMMIT, BUNDLED_IS_BINARY, BUNDLED_VERSION } from './bundled-build'; + +describe('bundled-build', () => { + // In dev/test mode the placeholders must be the dev defaults. + // `scripts/build-binaries.sh` rewrites this file only during binary + // compilation and restores it afterwards via an EXIT trap. + it('BUNDLED_IS_BINARY is false in dev mode', () => { + expect(BUNDLED_IS_BINARY).toBe(false); + }); + + it('BUNDLED_VERSION is the dev placeholder', () => { + expect(BUNDLED_VERSION).toBe('dev'); + }); + + it('BUNDLED_GIT_COMMIT is the dev placeholder', () => { + expect(BUNDLED_GIT_COMMIT).toBe('unknown'); + }); +}); diff --git a/packages/paths/src/bundled-build.ts b/packages/paths/src/bundled-build.ts new file mode 100644 index 0000000000..99a21b4eb1 --- /dev/null +++ b/packages/paths/src/bundled-build.ts @@ -0,0 +1,18 @@ +/** + * Build-time constants embedded into compiled binaries. + * + * In dev/test mode, the placeholders below are used and BUNDLED_IS_BINARY + * is `false`. Compiled binaries get this file overwritten by + * `scripts/build-binaries.sh` before `bun build --compile` is invoked, + * and restored afterwards via an EXIT trap. + * + * Lives in `@archon/paths` (the bottom of the dep graph) so any package + * can import these constants without creating dependency cycles. 
+ * + * See GitHub issue #979 for the rationale (replaces runtime detection + * heuristics that were brittle across Bun's ESM/CJS compile modes). + */ + +export const BUNDLED_IS_BINARY = false; +export const BUNDLED_VERSION = 'dev'; +export const BUNDLED_GIT_COMMIT = 'unknown'; diff --git a/packages/paths/src/index.ts b/packages/paths/src/index.ts index da33e99049..3c3fd89618 100644 --- a/packages/paths/src/index.ts +++ b/packages/paths/src/index.ts @@ -30,3 +30,6 @@ export { // Logger export { createLogger, setLogLevel, getLogLevel, rootLogger } from './logger'; export type { Logger } from './logger'; + +// Build-time constants (rewritten by scripts/build-binaries.sh) +export { BUNDLED_IS_BINARY, BUNDLED_VERSION, BUNDLED_GIT_COMMIT } from './bundled-build'; diff --git a/packages/paths/src/logger.ts b/packages/paths/src/logger.ts index 73a73b040d..414f36ee10 100644 --- a/packages/paths/src/logger.ts +++ b/packages/paths/src/logger.ts @@ -23,6 +23,7 @@ import pino from 'pino'; import type { Logger } from 'pino'; +import pretty from 'pino-pretty'; export type { Logger } from 'pino'; @@ -44,36 +45,48 @@ function getInitialLevel(): string { } /** - * Uses pino-pretty when stdout is a TTY and NODE_ENV !== 'production'; - * outputs newline-delimited JSON otherwise. + * Build the root Pino logger. + * + * Uses `pino-pretty` as a **destination stream** (not a worker-thread transport) + * when stdout is a TTY and NODE_ENV !== 'production'. Running pino-pretty as a + * destination stream keeps the formatter on the main thread, which avoids the + * `require.resolve('pino-pretty')` lookup that crashes inside Bun's `/$bunfs/` + * virtual filesystem in compiled binaries (see GitHub issue #960 / #979). + * + * The same code path runs in dev and compiled binaries — no environment + * detection required. */ -function buildLoggerOptions(): pino.LoggerOptions { +function buildLogger(): Logger { const level = getInitialLevel(); const usePretty = process.stdout.isTTY && process.env.NODE_ENV !== 'production'; if (usePretty) { - return { - level, - transport: { - target: 'pino-pretty', - options: { - colorize: true, - levelFirst: true, - translateTime: 'SYS:standard', - ignore: 'pid,hostname', - }, - }, - }; + try { + const stream = pretty({ + colorize: true, + levelFirst: true, + translateTime: 'SYS:standard', + ignore: 'pid,hostname', + }); + return pino({ level }, stream); + } catch (err) { + // pino-pretty failed to initialize (missing peer, broken TTY descriptor, + // or incompatible runtime). Fall back to plain JSON so logging keeps + // working instead of crashing the entire process at module import time. + console.warn( + `[logger] pino-pretty failed to initialize, falling back to JSON output: ${(err as Error).message}` + ); + } } - return { level }; + return pino({ level }); } /** * Root Pino logger instance. * Children inherit the root's level at creation time (not dynamically updated). */ -export const rootLogger: Logger = pino(buildLoggerOptions()); +export const rootLogger: Logger = buildLogger(); /** * Create a child logger with a module binding. 
diff --git a/packages/server/src/adapters/web/transport.test.ts b/packages/server/src/adapters/web/transport.test.ts index d0897eb5d1..5f6203c475 100644 --- a/packages/server/src/adapters/web/transport.test.ts +++ b/packages/server/src/adapters/web/transport.test.ts @@ -224,6 +224,28 @@ describe('SSETransport', () => { }); }); + describe('buffer eviction warn throttle', () => { + test('throttles repeated eviction warns to one per conversation per window', () => { + // EVENT_BUFFER_MAX is 500; push 600 events into a conversation with no + // active stream so all overflow events trigger an eviction. Even though + // ~100 evictions happen, we should see exactly one warn (throttled). + const transport = new SSETransport(); + mockLogger.warn.mockClear(); + + for (let i = 0; i < 600; i++) { + // emit() with no registered stream falls through to bufferEvent() + void transport.emit('conv-throttle', `{"i":${i}}`); + } + + const evictionWarns = mockLogger.warn.mock.calls.filter( + (call: unknown[]) => call[1] === 'transport.buffer_evicted_oldest' + ); + expect(evictionWarns.length).toBe(1); + + transport.stop(); + }); + }); + describe('start/stop', () => { test('start logs adapter_ready', () => { const transport = new SSETransport(); diff --git a/packages/server/src/adapters/web/transport.ts b/packages/server/src/adapters/web/transport.ts index 3c7f10442c..3a43332a11 100644 --- a/packages/server/src/adapters/web/transport.ts +++ b/packages/server/src/adapters/web/transport.ts @@ -16,11 +16,33 @@ export interface SSEWriter { /** Grace period (ms) before firing onCleanup after stream removal. */ const RECONNECT_GRACE_MS = 5_000; -/** Max time (ms) to hold buffered events waiting for a stream to connect. */ -const EVENT_BUFFER_TTL_MS = 3_000; +/** + * Max time (ms) to hold buffered events waiting for a stream to connect. + * + * Must be ≥ RECONNECT_GRACE_MS — otherwise events emitted during a reconnect + * window are dropped *before* the client has had a chance to reconnect, which + * manifests as perpetually-spinning tool cards when a `tool_result` happens to + * land in the gap. 60s covers typical EventSource auto-reconnect delays on + * flaky networks (mobile, VPN, laptop sleep) without meaningfully growing + * memory footprint — events are small JSON strings and the cap below bounds + * the worst case. + */ +const EVENT_BUFFER_TTL_MS = 60_000; /** Max events to buffer per conversation before oldest are dropped. */ -const EVENT_BUFFER_MAX = 50; +const EVENT_BUFFER_MAX = 500; + +/** Min interval (ms) between `transport.buffer_evicted_oldest` warns per conversation. */ +const EVICTION_WARN_THROTTLE_MS = 5_000; + +// Fail-fast invariant: buffer TTL must outlive the reconnect grace window, +// otherwise events emitted during a reconnect can be dropped before the +// client has had a chance to come back. See comment on EVENT_BUFFER_TTL_MS. 
+if (EVENT_BUFFER_TTL_MS < RECONNECT_GRACE_MS) {
+  throw new Error(
+    `EVENT_BUFFER_TTL_MS (${EVENT_BUFFER_TTL_MS}) must be >= RECONNECT_GRACE_MS (${RECONNECT_GRACE_MS})`
+  );
+}
 
 interface BufferedEvent {
   data: string;
@@ -33,6 +55,7 @@ export class SSETransport {
   private zombieReaperHandle: ReturnType<typeof setInterval> | null = null;
   private eventBuffer = new Map<string, BufferedEvent[]>();
   private bufferCleanupTimers = new Map<string, ReturnType<typeof setTimeout>>();
+  private lastEvictionWarnAt = new Map<string, number>();
 
   constructor(
     private onCleanup?: (conversationId: string) => void,
@@ -65,7 +88,17 @@ export class SSETransport {
     if (buffered && buffered.length > 0) {
       const now = Date.now();
       const valid = buffered.filter(e => now - e.timestamp < EVENT_BUFFER_TTL_MS);
+      const expired = buffered.length - valid.length;
       this.clearBuffer(conversationId);
+      if (expired > 0) {
+        // Events outlived the buffer TTL before the client reconnected.
+        // Symptom on the UI: stuck tool cards for any tool_result that was
+        // in the expired batch. If this fires in practice, bump TTL further.
+        getLog().warn(
+          { conversationId, expired, ttlMs: EVENT_BUFFER_TTL_MS },
+          'transport.buffer_ttl_expired'
+        );
+      }
       if (valid.length > 0) {
         getLog().debug({ conversationId, count: valid.length }, 'sse_buffer_replay');
         for (const event of valid) {
@@ -132,6 +165,7 @@ export class SSETransport {
     }
     this.cleanupTimers.clear();
     this.eventBuffer.clear();
+    this.lastEvictionWarnAt.clear();
     for (const timer of this.bufferCleanupTimers.values()) {
       clearTimeout(timer);
     }
@@ -198,22 +232,39 @@ export class SSETransport {
       this.eventBuffer.set(conversationId, buf);
     }
     buf.push({ data, timestamp: Date.now() });
-    // Cap buffer size — drop oldest if over limit
+    // Cap buffer size — drop oldest if over limit. Warn so we notice if
+    // this ever happens in practice: evicted events mean the UI will miss
+    // something when the client reconnects.
     if (buf.length > EVENT_BUFFER_MAX) {
       buf.shift();
+      // Throttle: a runaway producer could overflow by hundreds in a tight
+      // loop and flood logs. Warn at most once per EVICTION_WARN_THROTTLE_MS
+      // per conversation — enough to notice in practice without flooding.
+      const lastWarn = this.lastEvictionWarnAt.get(conversationId) ?? 0;
+      const now = Date.now();
+      if (now - lastWarn >= EVICTION_WARN_THROTTLE_MS) {
+        this.lastEvictionWarnAt.set(conversationId, now);
+        getLog().warn(
+          { conversationId, bufferMax: EVENT_BUFFER_MAX },
+          'transport.buffer_evicted_oldest'
+        );
+      }
     }
-    // Schedule auto-cleanup so buffers don't leak for conversations that never connect
-    if (!this.bufferCleanupTimers.has(conversationId)) {
-      const timer = setTimeout(() => {
-        this.clearBuffer(conversationId);
-      }, EVENT_BUFFER_TTL_MS + 500);
-      this.bufferCleanupTimers.set(conversationId, timer);
-    }
+    // Schedule auto-cleanup so buffers don't leak for conversations that never
+    // connect. Reset the timer on each new event so the buffer is held for
+    // TTL past the *most recent* event, not the first one.
+    const existingCleanup = this.bufferCleanupTimers.get(conversationId);
+    if (existingCleanup) clearTimeout(existingCleanup);
+    const timer = setTimeout(() => {
+      this.clearBuffer(conversationId);
+    }, EVENT_BUFFER_TTL_MS + 500);
+    this.bufferCleanupTimers.set(conversationId, timer);
     getLog().debug({ conversationId, buffered: buf.length }, 'sse_event_buffered');
   }
 
   private clearBuffer(conversationId: string): void {
     this.eventBuffer.delete(conversationId);
+    this.lastEvictionWarnAt.delete(conversationId);
     const timer = this.bufferCleanupTimers.get(conversationId);
     if (timer) {
       clearTimeout(timer);
diff --git a/packages/server/src/index.ts b/packages/server/src/index.ts
index d8463f573c..4d405b63ba 100644
--- a/packages/server/src/index.ts
+++ b/packages/server/src/index.ts
@@ -70,7 +70,9 @@ import {
   logConfig,
   getPort,
   createWorkflowStore,
+  scanPathForSensitiveKeys,
 } from '@archon/core';
+import * as codebaseDb from '@archon/core/db/codebases';
 import type { IPlatformAdapter } from '@archon/core';
 import { createLogger, logArchonPaths, validateAppDefaultsPaths } from '@archon/paths';
@@ -182,6 +184,58 @@ async function main(): Promise<void> {
     process.exit(1);
   }
 
+  // Load configuration early so the startup env-leak scan can honor the
+  // global bypass. Without this, users who set `allow_target_repo_keys: true`
+  // would get per-codebase warn spam on every boot even though the gate
+  // is intentionally disabled.
+  const config = await loadConfig();
+  logConfig(config);
+
+  // Startup env-leak scan: warn for codebases that would be blocked at next
+  // spawn by the env-leak-gate. Skipped entirely when the global bypass is
+  // active. Best-effort — failures are surfaced but never block startup.
+  if (config.allowTargetRepoKeys) {
+    getLog().info('startup_env_leak_scan_skipped — allow_target_repo_keys is true');
+  } else {
+    try {
+      const codebases = await codebaseDb.listCodebases();
+      for (const cb of codebases) {
+        if (cb.allow_env_keys) continue;
+        try {
+          const report = scanPathForSensitiveKeys(cb.default_cwd);
+          if (report.findings.length > 0) {
+            const files = report.findings.map(f => f.file);
+            const keys = Array.from(new Set(report.findings.flatMap(f => f.keys)));
+            getLog().warn(
+              {
+                codebaseId: cb.id,
+                name: cb.name,
+                path: cb.default_cwd,
+                files,
+                keys,
+              },
+              'startup_env_leak_gate_will_block'
+            );
+          }
+        } catch (scanErr) {
+          // Path may no longer exist (codebase moved/deleted on disk) —
+          // log at debug, do not abort the loop. This is the only quiet path.
+          getLog().debug(
+            { err: scanErr, codebaseId: cb.id, path: cb.default_cwd },
+            'startup_env_leak_scan_path_unavailable'
+          );
+        }
+      }
+    } catch (error) {
+      // listCodebases() failed — the entire startup safety net is silently
+      // absent. Surface at error level so operators see it.
+      getLog().error(
+        { err: error },
+        'startup_env_leak_scan_failed — startup env-leak warnings suppressed'
+      );
+    }
+  }
+
   // Start cleanup scheduler
   startCleanupScheduler();
@@ -198,10 +252,6 @@ async function main(): Promise<void> {
   // Validate app defaults paths (non-blocking, just logs warnings)
   await validateAppDefaultsPaths();
 
-  // Load and log configuration
-  const config = await loadConfig();
-  logConfig(config);
-
   // Initialize conversation lock manager
   const maxConcurrent = parseInt(process.env.MAX_CONCURRENT_CONVERSATIONS ??
'10'); const lockManager = new ConversationLockManager(maxConcurrent); diff --git a/packages/server/src/routes/api.codebases.test.ts b/packages/server/src/routes/api.codebases.test.ts index d06615968b..0265a359e1 100644 --- a/packages/server/src/routes/api.codebases.test.ts +++ b/packages/server/src/routes/api.codebases.test.ts @@ -48,6 +48,15 @@ mock.module('@archon/core', () => ({ this.name = 'ConversationNotFoundError'; } }, + scanPathForSensitiveKeys: mock((_p: string) => ({ path: _p, findings: [] })), + EnvLeakError: class EnvLeakError extends Error { + constructor(public report: { path: string; findings: { file: string; keys: string[] }[] }) { + super( + `Cannot add codebase — ${report.path} contains keys that will leak into AI subprocesses` + ); + this.name = 'EnvLeakError'; + } + }, getArchonWorkspacesPath: () => '/tmp/.archon/workspaces', generateAndSetTitle: mock(async () => {}), createLogger: () => ({ @@ -114,10 +123,12 @@ mock.module('@archon/core/db/conversations', () => ({ getConversationById: mock(async () => null), })); +const mockUpdateCodebaseAllowEnvKeys = mock(async (_id: string, _v: boolean) => {}); mock.module('@archon/core/db/codebases', () => ({ listCodebases: mockListCodebases, getCodebase: mockGetCodebase, deleteCodebase: mockDeleteCodebase, + updateCodebaseAllowEnvKeys: mockUpdateCodebaseAllowEnvKeys, })); mock.module('@archon/core/db/isolation-environments', () => ({ @@ -170,6 +181,7 @@ const MOCK_CODEBASE = { repository_url: 'https://github.com/user/repo', default_cwd: '/home/user/projects/my-project', ai_assistant_type: 'claude', + allow_env_keys: false, commands: {}, created_at: new Date().toISOString(), updated_at: new Date().toISOString(), @@ -387,7 +399,7 @@ describe('POST /api/codebases', () => { const body = (await response.json()) as { id: string }; expect(body.id).toBe('codebase-uuid-1'); - expect(mockCloneRepository).toHaveBeenCalledWith('https://github.com/user/repo'); + expect(mockCloneRepository).toHaveBeenCalledWith('https://github.com/user/repo', undefined); }); test('registers existing URL codebase with 200', async () => { @@ -424,7 +436,7 @@ describe('POST /api/codebases', () => { body: JSON.stringify({ path: '/home/user/my-repo' }), }); expect(response.status).toBe(201); - expect(mockRegisterRepository).toHaveBeenCalledWith('/home/user/my-repo'); + expect(mockRegisterRepository).toHaveBeenCalledWith('/home/user/my-repo', undefined); }); test('returns 400 when both url and path are provided', async () => { @@ -496,6 +508,101 @@ describe('POST /api/codebases', () => { const body = (await response.json()) as { error: string }; expect(body.error).toContain('authentication required'); }); + + test('returns 422 when cloneRepository throws EnvLeakError', async () => { + const { EnvLeakError } = await import('@archon/core'); + mockCloneRepository.mockImplementationOnce(async () => { + throw new EnvLeakError({ + path: '/repo/path', + findings: [{ file: '.env', keys: ['ANTHROPIC_API_KEY'] }], + }); + }); + + const app = makeApp(); + const response = await app.request('/api/codebases', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ url: 'https://github.com/user/repo' }), + }); + expect(response.status).toBe(422); + + const body = (await response.json()) as { error: string }; + expect(body.error).toContain('Cannot add codebase'); + }); + + test('passes allowEnvKeys=true to cloneRepository when body includes it', async () => { + mockCloneRepository.mockImplementationOnce(async () => ({ + codebaseId: 
'clone-uuid-2', + alreadyExisted: false, + })); + mockGetCodebase.mockImplementationOnce(async () => MOCK_CODEBASE); + + const app = makeApp(); + const response = await app.request('/api/codebases', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ url: 'https://github.com/user/repo', allowEnvKeys: true }), + }); + expect(response.status).toBe(201); + expect(mockCloneRepository).toHaveBeenCalledWith('https://github.com/user/repo', true); + }); +}); + +// --------------------------------------------------------------------------- +// Tests: PATCH /api/codebases/:id +// --------------------------------------------------------------------------- + +describe('PATCH /api/codebases/:id', () => { + beforeEach(() => { + mockGetCodebase.mockReset(); + mockUpdateCodebaseAllowEnvKeys.mockReset(); + }); + + test('grants consent and returns updated codebase', async () => { + mockGetCodebase + .mockImplementationOnce(async () => MOCK_CODEBASE) + .mockImplementationOnce(async () => ({ ...MOCK_CODEBASE, allow_env_keys: true })); + mockUpdateCodebaseAllowEnvKeys.mockImplementationOnce(async () => {}); + + const app = makeApp(); + const response = await app.request('/api/codebases/codebase-uuid-1', { + method: 'PATCH', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ allowEnvKeys: true }), + }); + expect(response.status).toBe(200); + const body = (await response.json()) as { allow_env_keys: boolean }; + expect(body.allow_env_keys).toBe(true); + expect(mockUpdateCodebaseAllowEnvKeys).toHaveBeenCalledWith('codebase-uuid-1', true); + }); + + test('revokes consent', async () => { + mockGetCodebase + .mockImplementationOnce(async () => ({ ...MOCK_CODEBASE, allow_env_keys: true })) + .mockImplementationOnce(async () => MOCK_CODEBASE); + mockUpdateCodebaseAllowEnvKeys.mockImplementationOnce(async () => {}); + + const app = makeApp(); + const response = await app.request('/api/codebases/codebase-uuid-1', { + method: 'PATCH', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ allowEnvKeys: false }), + }); + expect(response.status).toBe(200); + expect(mockUpdateCodebaseAllowEnvKeys).toHaveBeenCalledWith('codebase-uuid-1', false); + }); + + test('returns 404 when codebase not found', async () => { + mockGetCodebase.mockImplementationOnce(async () => null); + + const app = makeApp(); + const response = await app.request('/api/codebases/missing', { + method: 'PATCH', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ allowEnvKeys: true }), + }); + expect(response.status).toBe(404); + }); }); // --------------------------------------------------------------------------- diff --git a/packages/server/src/routes/api.ts b/packages/server/src/routes/api.ts index f47f2f6f52..81afc6db3d 100644 --- a/packages/server/src/routes/api.ts +++ b/packages/server/src/routes/api.ts @@ -27,6 +27,8 @@ import { registerRepository, ConversationNotFoundError, generateAndSetTitle, + EnvLeakError, + scanPathForSensitiveKeys, } from '@archon/core'; import { removeWorktree, toRepoPath, toWorktreePath } from '@archon/git'; import { @@ -103,6 +105,7 @@ import { codebaseSchema, codebaseIdParamsSchema, addCodebaseBodySchema, + updateCodebaseBodySchema, deleteCodebaseResponseSchema, codebaseEnvVarsResponseSchema, setEnvVarBodySchema, @@ -457,6 +460,28 @@ const addCodebaseRoute = createRoute({ }, }); +const updateCodebaseRoute = createRoute({ + method: 'patch', + path: '/api/codebases/{id}', + tags: ['Codebases'], + summary: 
'Update codebase consent flags (e.g. allow_env_keys)', + request: { + params: codebaseIdParamsSchema, + body: { + content: { 'application/json': { schema: updateCodebaseBodySchema } }, + required: true, + }, + }, + responses: { + 200: { + content: { 'application/json': { schema: codebaseSchema } }, + description: 'Updated codebase', + }, + 404: jsonError('Not found'), + 500: jsonError('Server error'), + }, +}); + const deleteCodebaseRoute = createRoute({ method: 'delete', path: '/api/codebases/{id}', @@ -816,7 +841,7 @@ export function registerApiRoutes( ): void { function apiError( c: Context, - status: 400 | 404 | 500, + status: 400 | 404 | 422 | 500, message: string, detail?: string ): Response { @@ -1482,8 +1507,8 @@ export function registerApiRoutes( try { // .refine() guarantees exactly one of url/path is present const result = body.url - ? await cloneRepository(body.url) - : await registerRepository(body.path ?? ''); + ? await cloneRepository(body.url, body.allowEnvKeys) + : await registerRepository(body.path ?? '', body.allowEnvKeys); // Fetch the full codebase record for a consistent response const codebase = await codebaseDb.getCodebase(result.codebaseId); @@ -1493,6 +1518,12 @@ export function registerApiRoutes( return c.json(codebase, result.alreadyExisted ? 200 : 201); } catch (error) { + if (error instanceof EnvLeakError) { + const path = body.url ?? body.path ?? ''; + const files = error.report.findings.map(f => f.file); + getLog().warn({ path, files }, 'add_codebase_env_leak_refused'); + return apiError(c, 422, error.message); + } getLog().error({ err: error }, 'add_codebase_failed'); return apiError( c, @@ -1502,6 +1533,71 @@ export function registerApiRoutes( } }); + // PATCH /api/codebases/:id - Update consent flags + registerOpenApiRoute(updateCodebaseRoute, async c => { + const id = c.req.param('id') ?? ''; + const body = getValidatedBody(c, updateCodebaseBodySchema); + try { + const codebase = await codebaseDb.getCodebase(id); + if (!codebase) { + return apiError(c, 404, 'Codebase not found'); + } + + // Capture scanner findings for the audit log (best-effort — path may be gone) + let files: string[] = []; + let keys: string[] = []; + let scanStatus: 'ok' | 'skipped' = 'ok'; + try { + const report = scanPathForSensitiveKeys(codebase.default_cwd); + files = report.findings.map(f => f.file); + keys = Array.from(new Set(report.findings.flatMap(f => f.keys))); + } catch (scanErr) { + scanStatus = 'skipped'; + getLog().warn( + { err: scanErr, codebaseId: id, path: codebase.default_cwd }, + 'env_leak_consent_scan_skipped' + ); + } + + await codebaseDb.updateCodebaseAllowEnvKeys(id, body.allowEnvKeys); + + // Audit log: emitted unconditionally on every grant/revoke. `scanStatus` + // distinguishes "scanned and these are the findings" from "could not + // scan, files/keys are empty for that reason" — important for later + // security review of the audit trail. + getLog().warn( + { + codebaseId: id, + name: codebase.name, + path: codebase.default_cwd, + files, + keys, + scanStatus, + actor: 'user-ui', + }, + body.allowEnvKeys ? 
'env_leak_consent_granted' : 'env_leak_consent_revoked' + ); + + const updated = await codebaseDb.getCodebase(id); + if (!updated) { + return apiError(c, 500, 'Codebase updated but not found'); + } + let commands = updated.commands; + if (typeof commands === 'string') { + try { + commands = JSON.parse(commands); + } catch (parseErr) { + getLog().error({ err: parseErr, codebaseId: id }, 'corrupted_commands_json'); + commands = {}; + } + } + return c.json({ ...updated, commands }); + } catch (error) { + getLog().error({ err: error, codebaseId: id }, 'update_codebase_failed'); + return apiError(c, 500, 'Failed to update codebase'); + } + }); + // DELETE /api/codebases/:id - Delete a project and clean up registerOpenApiRoute(deleteCodebaseRoute, async c => { const id = c.req.param('id') ?? ''; diff --git a/packages/server/src/routes/schemas/codebase.schemas.ts b/packages/server/src/routes/schemas/codebase.schemas.ts index d2880a6be1..e8a6dea887 100644 --- a/packages/server/src/routes/schemas/codebase.schemas.ts +++ b/packages/server/src/routes/schemas/codebase.schemas.ts @@ -16,6 +16,7 @@ export const codebaseSchema = z repository_url: z.string().nullable(), default_cwd: z.string(), ai_assistant_type: z.string(), + allow_env_keys: z.boolean(), commands: z.record(codebaseCommandSchema), created_at: z.string(), updated_at: z.string(), @@ -33,12 +34,20 @@ export const addCodebaseBodySchema = z .object({ url: z.string().min(1).optional(), path: z.string().min(1).optional(), + allowEnvKeys: z.boolean().optional(), }) .refine(b => (b.url !== undefined) !== (b.path !== undefined), { message: 'Provide either "url" or "path", not both and not neither', }) .openapi('AddCodebaseBody'); +/** PATCH /api/codebases/:id request body. */ +export const updateCodebaseBodySchema = z + .object({ + allowEnvKeys: z.boolean(), + }) + .openapi('UpdateCodebaseBody'); + /** DELETE /api/codebases/:id response. 
 */
 export const deleteCodebaseResponseSchema = z
   .object({ success: z.boolean() })
diff --git a/packages/web/src/lib/api.ts b/packages/web/src/lib/api.ts
index 3a46568cb6..f13034f274 100644
--- a/packages/web/src/lib/api.ts
+++ b/packages/web/src/lib/api.ts
@@ -38,6 +38,7 @@ export interface CodebaseResponse {
   repository_url: string | null;
   default_cwd: string;
   ai_assistant_type: string;
+  allow_env_keys: boolean;
   commands: Record<string, unknown>;
   created_at: string;
   updated_at: string;
@@ -157,7 +158,7 @@ export async function getCodebase(id: string): Promise<CodebaseResponse> {
 }
 
 export async function addCodebase(
-  input: { url: string } | { path: string }
+  input: { url: string; allowEnvKeys?: boolean } | { path: string; allowEnvKeys?: boolean }
 ): Promise<CodebaseResponse> {
   return fetchJSON<CodebaseResponse>('/api/codebases', {
     method: 'POST',
@@ -166,6 +167,17 @@
   });
 }
 
+export async function updateCodebase(
+  id: string,
+  input: { allowEnvKeys: boolean }
+): Promise<CodebaseResponse> {
+  return fetchJSON<CodebaseResponse>(`/api/codebases/${id}`, {
+    method: 'PATCH',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify(input),
+  });
+}
+
 export async function deleteCodebase(id: string): Promise<{ success: boolean }> {
   return fetchJSON<{ success: boolean }>(`/api/codebases/${id}`, { method: 'DELETE' });
 }
diff --git a/packages/web/src/routes/SettingsPage.tsx b/packages/web/src/routes/SettingsPage.tsx
index 0623150001..07a07690fc 100644
--- a/packages/web/src/routes/SettingsPage.tsx
+++ b/packages/web/src/routes/SettingsPage.tsx
@@ -10,6 +10,7 @@ import {
   getHealth,
   listCodebases,
   addCodebase,
+  updateCodebase,
   deleteCodebase,
   updateAssistantConfig,
   getCodebaseEnvVars,
@@ -250,11 +251,22 @@ function EnvVarsPanel({ codebaseId }: { codebaseId: string }): React.ReactElement {
   );
 }
 
+function isEnvLeakError(error: unknown): boolean {
+  return (
+    error instanceof Error &&
+    'status' in error &&
+    (error as Error & { status: number }).status === 422 &&
+    error.message.startsWith('Cannot add codebase')
+  );
+}
+
 function ProjectsSection(): React.ReactElement {
   const queryClient = useQueryClient();
   const [addPath, setAddPath] = useState('');
   const [showAdd, setShowAdd] = useState(false);
+  const [allowEnvKeys, setAllowEnvKeys] = useState(false);
   const [expandedEnvVars, setExpandedEnvVars] = useState<string | null>(null);
+  const [toggleError, setToggleError] = useState<string | null>(null);
 
   const { data: codebases } = useQuery({
     queryKey: ['codebases'],
@@ -262,11 +274,13 @@ function ProjectsSection(): React.ReactElement {
   });
 
   const addMutation = useMutation({
-    mutationFn: (path: string) => addCodebase({ path }),
+    mutationFn: ({ path, allowEnvKeys }: { path: string; allowEnvKeys?: boolean }) =>
+      addCodebase({ path, allowEnvKeys }),
     onSuccess: () => {
       void queryClient.invalidateQueries({ queryKey: ['codebases'] });
       setAddPath('');
       setShowAdd(false);
+      setAllowEnvKeys(false);
     },
   });
 
@@ -277,10 +291,24 @@ function ProjectsSection(): React.ReactElement {
     },
   });
 
+  const toggleEnvKeysMutation = useMutation({
+    mutationFn: ({ id, allowEnvKeys }: { id: string; allowEnvKeys: boolean }) =>
+      updateCodebase(id, { allowEnvKeys }),
+    onSuccess: () => {
+      setToggleError(null);
+      void queryClient.invalidateQueries({ queryKey: ['codebases'] });
+    },
+    onError: (err: Error) => {
+      // Without this the user clicks "Revoke env keys", confirms the
+      // destructive dialog, and gets no feedback if the PATCH fails.
+      setToggleError(err.message);
+    },
+  });
+
   function handleAddSubmit(e: React.FormEvent): void {
     e.preventDefault();
     if (addPath.trim()) {
-      addMutation.mutate(addPath.trim());
+      addMutation.mutate({ path: addPath.trim(), allowEnvKeys: allowEnvKeys || undefined });
     }
   }
@@ -290,6 +318,11 @@ function ProjectsSection(): React.ReactElement {
 
       <h2>Projects</h2>
 
+      {toggleError && (
+        <div>
+          Failed to update env-key consent: {toggleError}
+        </div>
+      )}
       {!codebases || codebases.length === 0 ? (
         <p>No projects registered.</p>
       ) : (
@@ -298,10 +331,40 @@ function ProjectsSection(): React.ReactElement {
           <div>
-            <div>{cb.name}</div>
+            <div>
+              <span>{cb.name}</span>
+              {cb.allow_env_keys && (
+                <span>
+                  env keys allowed
+                </span>
+              )}
+            </div>
             <div>{cb.default_cwd}</div>
         )}
 
diff --git a/packages/workflows/src/defaults/bundled-defaults.test.ts b/packages/workflows/src/defaults/bundled-defaults.test.ts
index 00124a4ee6..e1e1cb5a30 100644
--- a/packages/workflows/src/defaults/bundled-defaults.test.ts
+++ b/packages/workflows/src/defaults/bundled-defaults.test.ts
@@ -1,40 +1,13 @@
 import { describe, it, expect } from 'bun:test';
-import {
-  isBinaryBuild,
-  isBunVirtualFs,
-  BUNDLED_COMMANDS,
-  BUNDLED_WORKFLOWS,
-} from './bundled-defaults';
+import { isBinaryBuild, BUNDLED_COMMANDS, BUNDLED_WORKFLOWS } from './bundled-defaults';
 
 describe('bundled-defaults', () => {
-  describe('isBunVirtualFs', () => {
-    it('should detect Linux/macOS virtual filesystem paths', () => {
-      expect(isBunVirtualFs('/$bunfs/root/bundled-defaults')).toBe(true);
-      expect(isBunVirtualFs('/$bunfs/root/')).toBe(true);
-    });
-
-    it('should detect Windows virtual filesystem paths (backslash)', () => {
-      expect(isBunVirtualFs('B:\\~BUN\\root\\bundled-defaults')).toBe(true);
-      expect(isBunVirtualFs('B:\\~BUN\\root')).toBe(true);
-    });
-
-    it('should detect Windows virtual filesystem paths (forward slash)', () => {
-      expect(isBunVirtualFs('B:/~BUN/root/bundled-defaults')).toBe(true);
-      expect(isBunVirtualFs('B:/~BUN/root')).toBe(true);
-    });
-
-    it('should return false for real filesystem paths', () => {
-      expect(isBunVirtualFs('/home/user/project/src')).toBe(false);
-      expect(isBunVirtualFs('C:\\Users\\user\\project\\src')).toBe(false);
-      expect(isBunVirtualFs('/tmp/test')).toBe(false);
-    });
-  });
-
   describe('isBinaryBuild', () => {
-    it('should return false when running in test environment (not compiled)', () => {
-      // The true path requires an actual compiled binary (import.meta.dir points to
-      // Bun's virtual FS only inside compiled binaries). Coverage of the true branch
-      // relies on isBunVirtualFs tests above + manual binary smoke testing in CI.
+    it('should return false in dev/test mode', () => {
+      // `isBinaryBuild()` reads the build-time constant `BUNDLED_IS_BINARY` from
+      // `@archon/paths`. In dev/test mode it is `false`. It is only rewritten to
+      // `true` by `scripts/build-binaries.sh` before `bun build --compile`.
+      // Coverage of the `true` branch is via local binary smoke testing (see #979).
       expect(isBinaryBuild()).toBe(false);
     });
   });
diff --git a/packages/workflows/src/defaults/bundled-defaults.ts b/packages/workflows/src/defaults/bundled-defaults.ts
index 6caf6e7ae0..a921171b9e 100644
--- a/packages/workflows/src/defaults/bundled-defaults.ts
+++ b/packages/workflows/src/defaults/bundled-defaults.ts
@@ -8,6 +8,8 @@
  * Import syntax uses `with { type: 'text' }` to import file contents as strings.
  */
 
+import { BUNDLED_IS_BINARY } from '@archon/paths';
+
 // =============================================================================
 // Default Commands (21 total)
 // =============================================================================
@@ -102,23 +104,19 @@
 export const BUNDLED_WORKFLOWS: Record<string, string> = {
   'archon-workflow-builder': archonWorkflowBuilderWf,
 };
 
-/**
- * Check if a given module directory path belongs to a compiled Bun binary.
- *
- * Compiled Bun binaries use a virtual filesystem for bundled modules:
- * - Linux/macOS: `/$bunfs/root/`
- * - Windows: `B:\~BUN\root\` or `B:/~BUN/root/`
- */
-export function isBunVirtualFs(dir: string): boolean {
-  return dir.startsWith('/$bunfs/') || dir.startsWith('B:\\~BUN\\') || dir.startsWith('B:/~BUN/');
-}
-
 /**
  * Check if the current process is running as a compiled binary (not via Bun CLI).
* - * Note: `process.versions.bun` is still set in compiled binaries as of Bun 1.3.5, - * so we use the virtual filesystem path prefix for detection instead. + * Reads the build-time constant `BUNDLED_IS_BINARY` from `@archon/paths`. + * `scripts/build-binaries.sh` rewrites that file to set it to `true` before + * `bun build --compile` and restores it afterwards. See GitHub issue #979. + * + * Kept as a function (rather than a direct re-export of `BUNDLED_IS_BINARY`) + * so tests can use `spyOn(bundledDefaults, 'isBinaryBuild').mockReturnValue(...)` + * without resorting to `mock.module('@archon/paths', ...)` — which is + * process-global and irreversible in Bun and would pollute other test files. + * See `.claude/rules/dx-quirks.md` and `loader.test.ts` for context. */ export function isBinaryBuild(): boolean { - return isBunVirtualFs(import.meta.dir); + return BUNDLED_IS_BINARY; } diff --git a/packages/workflows/src/executor-shared.ts b/packages/workflows/src/executor-shared.ts index 890eb6a1f3..0ac1f4ba73 100644 --- a/packages/workflows/src/executor-shared.ts +++ b/packages/workflows/src/executor-shared.ts @@ -67,8 +67,13 @@ export function matchesPattern(message: string, patterns: string[]): boolean { * Classify an error to determine if it's transient (can retry) or fatal (should fail). * FATAL patterns take priority over TRANSIENT patterns to prevent an error message * containing both (e.g. "unauthorized: process exited with code 1") from being retried. + * + * First-party named error types are checked by name (immune to message rewording). */ export function classifyError(error: Error): ErrorType { + // Named first-party errors checked by name — immune to message rewording + if (error.name === 'EnvLeakError') return 'FATAL'; + const message = error.message.toLowerCase(); if (matchesPattern(message, FATAL_PATTERNS)) { diff --git a/scripts/build-binaries.sh b/scripts/build-binaries.sh index b62e084f5e..ddc0498394 100755 --- a/scripts/build-binaries.sh +++ b/scripts/build-binaries.sh @@ -6,22 +6,28 @@ set -euo pipefail # Get version from package.json or git tag VERSION="${VERSION:-$(grep '"version"' package.json | head -1 | cut -d'"' -f4)}" -echo "Building Archon CLI v${VERSION}" +GIT_COMMIT="${GIT_COMMIT:-$(git rev-parse --short HEAD 2>/dev/null || echo 'unknown')}" +echo "Building Archon CLI v${VERSION} (commit: ${GIT_COMMIT})" -# Update bundled version in source before compiling -BUNDLED_VERSION_FILE="packages/cli/src/commands/bundled-version.ts" -echo "Updating bundled version to ${VERSION}..." -cat > "$BUNDLED_VERSION_FILE" << EOF +# Update build-time constants in source before compiling. +# The file is restored via an EXIT trap so the dev tree is never left dirty, +# even if `bun build --compile` fails mid-way. See GitHub issue #979. +BUNDLED_BUILD_FILE="packages/paths/src/bundled-build.ts" +trap 'echo "Restoring ${BUNDLED_BUILD_FILE}..."; git checkout -- "${BUNDLED_BUILD_FILE}"' EXIT + +echo "Updating build-time constants (version=${VERSION}, is_binary=true)..." +cat > "$BUNDLED_BUILD_FILE" << EOF /** - * Bundled version for compiled binaries - * - * This file is updated by scripts/build-binaries.sh before compilation. - * The version is read from package.json at build time and embedded here. + * Build-time constants embedded into compiled binaries. * - * For development, the version command reads directly from package.json instead. + * This file is rewritten by scripts/build-binaries.sh before \`bun build --compile\` + * and restored afterwards via an EXIT trap. 
Do not edit these values by hand + * outside the build script — the dev defaults live in the committed copy. */ +export const BUNDLED_IS_BINARY = true; export const BUNDLED_VERSION = '${VERSION}'; +export const BUNDLED_GIT_COMMIT = '${GIT_COMMIT}'; EOF # Output directory
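
For reference, a sketch of how the rewritten constants can back a version command; `resolveVersionLine` is a hypothetical helper for illustration, not the actual `archon version` implementation:

```ts
import { execFileSync } from 'node:child_process';
import { BUNDLED_IS_BINARY, BUNDLED_VERSION, BUNDLED_GIT_COMMIT } from '@archon/paths';

// Hypothetical helper: compiled binaries trust the baked-in constants,
// while dev mode reads the commit live and falls back to 'unknown'.
function resolveVersionLine(): string {
  if (BUNDLED_IS_BINARY) {
    return `archon ${BUNDLED_VERSION} (${BUNDLED_GIT_COMMIT})`;
  }
  let commit = BUNDLED_GIT_COMMIT; // 'unknown' in the committed dev copy
  try {
    commit = execFileSync('git', ['rev-parse', '--short', 'HEAD'], {
      encoding: 'utf8',
    }).trim();
  } catch {
    // Not a git checkout: keep the 'unknown' placeholder.
  }
  return `archon ${BUNDLED_VERSION} (${commit})`;
}

console.log(resolveVersionLine());
```

Because the EXIT trap restores the committed copy of `bundled-build.ts`, the dev path above is the one that runs everywhere except inside a binary produced by this script.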