Skip to content

shell: wrap only component-leading ! when neutralizing glob metachars#31272

Merged
Jarred-Sumner merged 2 commits into
mainfrom
farm/887cf3c5/shell-glob-bang-run
May 23, 2026
Merged

shell: wrap only component-leading ! when neutralizing glob metachars#31272
Jarred-Sumner merged 2 commits into
mainfrom
farm/887cf3c5/shell-glob-bang-run

Conversation

@robobun

@robobun robobun commented May 23, 2026

Copy link
Copy Markdown
Collaborator

Problem

Since #31220, glob metacharacters arriving in a shell word via ${...} interpolation (or $var, command substitution, quoted text) are neutralized before the word reaches the glob walker. ! is neutralized by wrapping it in a one-branch brace group {!}.

The glob matcher keeps brace state in a BoundedArray<Brace, 10>, and sequential brace groups occupy the stack simultaneously (a group's entry is only popped after the recursion that matches the rest of the pattern returns). A run of N interpolated ! bytes therefore needs N brace-stack slots, and at 11+ the push fails and the whole pattern is reported as a non-match:

import { $ } from "bun";
await Bun.write("prefix!!!!!!!!!!!x.txt", "");
await $`echo prefix${"!".repeat(11)}*`;
// bun: no matches found: prefix!!!!!!!!!!!*   (the file exists)

Before #31220 a non-leading ! reached the matcher bare and matched literally, so this case is a regression.

Fix

Only a ! at the start of a path component can act as pattern syntax (the matcher's negation loop). Everywhere else the matcher already treats ! as a literal byte — it can never be the first character of a [...] class either, because every [ coming from data is itself neutralized to [[].

neutralize_glob_metachars now wraps ! as {!} only when it starts a component (the pattern is empty or the previously emitted byte is a native path separator, mirroring the split build_pattern_components performs) and emits every other ! bare. A run of interpolated ! now costs at most one brace-stack slot per component.

Verification

Two new cases in the existing "interpolated values cannot inject glob syntax" block in test/js/bun/shell/bunshell.test.ts: an 11-! interpolated run in the middle of a word and at the start of a word must still match files containing those bytes literally.

  • Without the fix both fail: bun: no matches found: prefix!!!!!!!!!!!* / bun: no matches found: !!!!!!!!!!!keep*
  • With the fix they pass, alongside the rest of the glob-expansion block, test/js/bun/shell/brace.test.ts, and test/js/bun/glob/{match,scan,path-length,proto}.test.ts.

Neutralizing every interpolated `!` as its own `{!}` brace group meant a
run of more than 10 consecutive `!` bytes exceeded the glob matcher's
bounded brace stack, so the whole word reported "no matches found"
instead of matching files containing those bytes literally.

Only a `!` at the start of a path component (the same split
build_pattern_components performs) can act as negation in the matcher;
every other `!` already matches literally, so emit it bare.
@robobun

robobun commented May 23, 2026

Copy link
Copy Markdown
Collaborator Author
Updated 7:22 AM PT - May 23rd, 2026

@robobun, your commit 9bfcb6c has 2 failures in Build #57251 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 31272

That installs a local version of the PR into your bun-31272 executable, so you can run:

bun-31272 --bun

@coderabbitai

coderabbitai Bot commented May 23, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4f460644-19de-4a47-91c3-07cbbd71cd95

📥 Commits

Reviewing files that changed from the base of the PR and between 1ed5084 and 9bfcb6c.

📒 Files selected for processing (1)
  • test/js/bun/shell/bunshell.test.ts

Walkthrough

The PR refines how the Bun shell's glob expansion handles the ! character to prevent unintended negation-pattern interpretation. The neutralize_glob_metachars function now wraps component-leading ! in a single-branch brace group ({!}) while passing other ! bytes through literally. Tests validate that long ! sequences remain literal in word and pattern-start positions.

Changes

Glob Negation Literal Handling

Layer / File(s) Summary
Component-leading negation handling
src/runtime/shell/states/Expansion.rs
neutralize_glob_metachars conditionally wraps ! as {!} only when it appears at the start of a path component (preceded by absence or native separator); other ! bytes are emitted literally. Documentation updated to reflect this component-boundary-aware behavior.
Glob negation test coverage
test/js/bun/shell/bunshell.test.ts
New test cases for long ! runs (11 consecutive characters) verify that ! sequences remain literal when inside words or at pattern start, confirming they do not trigger glob negation or brace parsing.

Possibly related PRs

  • oven-sh/bun#31220: Both PRs modify neutralize_glob_metachars in src/runtime/shell/states/Expansion.rs to handle literal ! in glob patterns and extend shell glob-expansion test coverage for ! sequences.
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: limiting brace-group wrapping of ! to component-leading positions when neutralizing glob metachars.
Description check ✅ Passed The description includes both required template sections: 'Problem' (instead of 'What does this PR do?') with comprehensive context, and 'Verification' (instead of 'How did you verify your code works?') with concrete test details.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/js/bun/shell/bunshell.test.ts`:
- Line 772: Replace the use of string.repeat in the test variable assignment
(variable bangRun) with Buffer.alloc to construct the repeated string: change
the expression from "!".repeat(11) to Buffer.alloc(11, "!").toString() in the
bunshell.test.ts test so the test uses Buffer.alloc(count, fill).toString() for
repetitive strings as required by the test suite guidelines.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: cc25088b-d0a9-44a0-af34-9d1a3dd269fa

📥 Commits

Reviewing files that changed from the base of the PR and between 3c92be8 and 1ed5084.

📒 Files selected for processing (2)
  • src/runtime/shell/states/Expansion.rs
  • test/js/bun/shell/bunshell.test.ts

Comment thread test/js/bun/shell/bunshell.test.ts Outdated

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — narrow, well-reasoned regression fix with tests; verified the matcher only treats ! specially at component-start or after [, both of which remain neutralized.

Extended reasoning...

Overview

This PR refines neutralize_glob_metachars in src/runtime/shell/states/Expansion.rs so that interpolated ! bytes are only wrapped as {!} when they begin a path component, and emitted bare otherwise. It adds two regression tests in test/js/bun/shell/bunshell.test.ts covering 11-! runs both mid-word and at word start. This fixes a regression from #31220 where long runs of interpolated ! overflowed the glob matcher's 10-slot bounded brace stack and produced spurious "no matches found" errors.

Security risks

Relaxing neutralization of an interpolated byte is security-adjacent (glob injection), so I traced where ! can act as syntax in the matcher:

  • The leading negation loop at src/glob/matcher.rs:143 only consumes ! at the very start of the pattern string. match() is invoked per path component, and build_pattern_components splits on bun_core::path_sep::is_sep_native (src/glob/GlobWalker.rs:2067) — exactly the predicate the new starts_component check uses against pattern.last(). Component-leading ! is still wrapped.
  • The negated-class case at src/glob/matcher.rs:281 requires a preceding bare [. Every [ arriving from data is wrapped to [[], and the only bare [ in the output pattern is the wrapper's own opening bracket, which is always immediately followed by the wrapped char — never a bare !.

So a bare non-leading ! provably cannot reach a syntax-active position in the matcher. No new injection surface is introduced.

Level of scrutiny

Moderate. Shell glob expansion is correctness-sensitive, but the change is ~15 lines of logic with a precise, verifiable invariant, a thorough code comment, and targeted tests. The is_sep_native predicate correctly mirrors the component splitter on both POSIX (/) and Windows (/ + \, where \ already falls through to the default pass-through arm).

Other factors

No CODEOWNERS cover these paths. The bug hunter found no issues. The PR description states the new tests fail without the fix and pass with it, and that the existing glob/brace test suites continue to pass. The fix is minimal and does not change behavior for any byte other than non-component-leading !.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic looks correct to me — I traced ! handling in matcher.rs and build_pattern_components and the component-leading check lines up — but since this relaxes escaping in the interpolated-data→glob-syntax boundary, it's worth a human confirming the invariant.

Extended reasoning...

Overview

This PR narrows neutralize_glob_metachars in src/runtime/shell/states/Expansion.rs so that an interpolated ! is wrapped as {!} only when it would land at the start of a path component in the emitted pattern (i.e. pattern is empty or its last byte is a native path separator); every other interpolated ! is now emitted bare. Two new TestBuilder cases in bunshell.test.ts cover an 11-! run mid-word and at word start. The CodeRabbit nit about Buffer.alloc was addressed in 9bfcb6c.

Security risks

The function being changed is the mechanism that prevents interpolated/data bytes from acting as glob pattern syntax, so weakening it incorrectly could let user-controlled data broaden a glob (e.g. flip a component into a negated match). I verified against src/glob/matcher.rs that ! is only special (a) in the leading negation loop at glob_index == 0 for the component, and (b) immediately after [ in a class — and every data-sourced [ is already wrapped as [[], so a bare data ! can never land in position (b). I also checked that / (and \\ on Windows) are always emitted bare by this function, so pattern.last() agrees with the is_sep_native split that build_pattern_components performs. I didn't find a hole, but this is exactly the kind of invariant a second reviewer should sanity-check.

Level of scrutiny

Medium-high. The diff itself is ~20 lines and mechanically simple, and it restores pre-#31220 behavior for non-leading !. But it sits on the shell's data/syntax boundary, and the correctness argument depends on a non-local invariant about where the glob matcher treats ! as syntax.

Other factors

No bugs from the bug-hunting pass, no CODEOWNERS for this path, the only review feedback (Buffer.alloc) is resolved, and the new tests directly exercise the regression (11 > brace-stack depth of 10). Deferring rather than approving solely because the change reduces escaping in a neutralization path.

@Jarred-Sumner Jarred-Sumner merged commit f161e03 into main May 23, 2026
45 of 55 checks passed
@Jarred-Sumner Jarred-Sumner deleted the farm/887cf3c5/shell-glob-bang-run branch May 23, 2026 08:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants