
test: reproduce #4285 — broken default agent commands for Claude and Codex #4290

Draft
github-actions[bot] wants to merge 1 commit into main from triage/issue-4285-25594086182

github-actions bot commented May 9, 2026

What the bug is

Per #4285, the default commands generated for Claude and Codex don't work without manual fixes on first run:

  • Claude: launches with --permission-mode acceptEdits, which auto-accepts file edits but still prompts before every shell command. Users expect "auto mode" (no per-action prompts) out of the box.
  • Codex: launches with --sandbox workspace-write --ask-for-approval never, producing "sandbox doesn't allow me to run terminal command" errors. The workspace-write sandbox blocks operations that need network or out-of-workspace access, and --ask-for-approval never removes Codex's ability to escalate, so it silently refuses instead of asking the user.

Every new user has to debug agent launch configs before they can do real work.

What code is affected and why

Defaults live in packages/shared/src/builtin-terminal-agents.ts (canonical) and are mirrored in packages/shared/src/host-agent-presets.ts. Both ship the flags that produce the broken first-run UX.

These flags were intentionally introduced in #3546 to favor safety, with #3615 backfilling YOLO defaults for canary users. Fixing this is therefore a UX-policy call (revert vs. tweak vs. keep), so this PR is left as repro-only.

Working alternatives if a fix is wanted:

  • Claude → --permission-mode bypassPermissions (or legacy --dangerously-skip-permissions)
  • Codex → --full-auto (workspace-write + on-failure approval) or --dangerously-bypass-approvals-and-sandbox for full YOLO
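
As a rough illustration of the alternatives above, the sketch below swaps the problematic flags inside a command string. Only the flags come from this description. The full command strings, the `alternativeFlags` map, and the `applyAlternative` helper are hypothetical, and the real shape of BUILTIN_TERMINAL_AGENT_COMMANDS may differ.

```typescript
// Hypothetical flag substitutions. Only the flag text is taken from the PR
// description; the map and helper names are assumptions for illustration.
const alternativeFlags: Record<string, { from: string; to: string }> = {
  claude: {
    from: "--permission-mode acceptEdits",
    to: "--permission-mode bypassPermissions",
  },
  codex: {
    from: "--sandbox workspace-write --ask-for-approval never",
    to: "--full-auto",
  },
};

// Returns the command with the problematic flags swapped for the alternative.
// Commands for unknown agents, or commands without the flags, pass through unchanged.
function applyAlternative(agent: string, command: string): string {
  const swap = alternativeFlags[agent];
  if (!swap || !command.includes(swap.from)) {
    return command;
  }
  return command.replace(swap.from, swap.to);
}
```

Whether any of these substitutions should ship is the policy question raised above; the sketch only shows what each option would change.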

What the test does and how it proves the bug

Adds packages/shared/src/builtin-terminal-agents.test.ts with two assertions against BUILTIN_TERMINAL_AGENT_COMMANDS:

  1. Claude default contains an auto-mode flag (--dangerously-skip-permissions or --permission-mode bypassPermissions).
  2. Codex default does not pair --sandbox workspace-write with --ask-for-approval never.

Both assertions currently fail, demonstrating the misconfiguration.

$ bun test packages/shared/src/builtin-terminal-agents.test.ts
(fail) claude default launches in auto mode
(fail) codex default does not combine workspace-write sandbox with `--ask-for-approval never`
 0 pass
 2 fail
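
The two assertions can be approximated with plain string predicates. This is a minimal sketch, not the contents of the test file: the function names and the whitespace normalization are assumptions, and the example commands in the usage note are hypothetical strings built from the flags named above.

```typescript
// Flags that count as "auto mode" for Claude, per the description above.
const CLAUDE_AUTO_MODE_FLAGS = [
  "--dangerously-skip-permissions",
  "--permission-mode bypassPermissions",
];

// Collapse runs of whitespace so flag matching does not depend on spacing.
// Note: the `--flag=value` form is not handled in this sketch.
function normalizeCommand(command: string): string {
  return command.trim().split(/\s+/).join(" ");
}

// True when the command contains one of the auto-mode flags.
function launchesClaudeInAutoMode(command: string): boolean {
  const normalized = normalizeCommand(command);
  return CLAUDE_AUTO_MODE_FLAGS.some((flag) => normalized.includes(flag));
}

// True when the command combines the workspace-write sandbox with
// `--ask-for-approval never`, the pairing reported as broken in #4285.
function pairsWorkspaceWriteWithNeverApproval(command: string): boolean {
  const normalized = normalizeCommand(command);
  return (
    normalized.includes("--sandbox workspace-write") &&
    normalized.includes("--ask-for-approval never")
  );
}
```

For a command such as `claude --permission-mode acceptEdits`, `launchesClaudeInAutoMode` returns false. For `codex --sandbox workspace-write --ask-for-approval never`, `pairsWorkspaceWriteWithNeverApproval` returns true. Both results line up with the two failures shown above.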

Refs #4285


Summary by cubic

Add failing tests to reproduce #4285: default terminal agent commands for Claude and Codex break first run. The test in packages/shared/src/builtin-terminal-agents.test.ts asserts Claude launches in auto mode and that Codex does not pair --sandbox workspace-write with --ask-for-approval never.

Written for commit aa0aeec. Summary will update on new commits.

…Codex

Adds a co-located test file capturing the two misconfigured defaults
reported in #4285:

- Claude default `--permission-mode acceptEdits` does not enable auto
  mode (the user expects no per-action prompts on first run).
- Codex default pairs `--sandbox workspace-write` with
  `--ask-for-approval never`, which produces "sandbox doesn't allow me
  to run terminal command" because codex cannot escalate when the
  sandbox blocks an operation.

Both tests fail against current defaults, demonstrating the issue.
Refs #4285