Skip to content

fs: use clonefile for symlink-free recursive fs.cp on macOS#32503

Open
cirospaciari wants to merge 1 commit into
mainfrom
claude/fs-cp-clonefile-fastpath
Open

fs: use clonefile for symlink-free recursive fs.cp on macOS#32503
cirospaciari wants to merge 1 commit into
mainfrom
claude/fs-cp-clonefile-fastpath

Conversation

@cirospaciari

Copy link
Copy Markdown
Member

Follow-up to #31830. That PR routed all recursive fs.cp/fs.cpSync/fs.promises.cp calls through the node-ported JS walker so relative-symlink rewriting and the ERR_FS_CP_* error codes match node, which meant recursive copies on macOS no longer used a single whole-tree clonefile(). This restores the fast path for the cases where it is indistinguishable from the walker.

Summary

  • Recursive fs.cp/cpSync/promises.cp go back to the native path (one clonefile() of the whole tree on macOS) when: the options are the defaults already required for the existing single-file fast path (no filter/dereference/preserveTimestamps/verbatimSymlinks/mode/errorOnExist, force), the destination does not exist (so node's merge semantics never come into play), and a metadata-only readdir scan of the source tree finds nothing but regular files and directories.
  • Anything else still uses the ported walker: symlinks (node rewrites relative targets against the source tree), FIFOs/sockets/devices (node-specific ERR_FS_CP_* errors), entries whose type the filesystem does not report, an existing destination, or a scan error. Non-macOS platforms are completely unchanged — the directory branch still returns ok: false there, and the scan call is dead-code-eliminated.
  • The decision lives in tryNativeFastPathSync/tryNativeFastPath in src/js/internal/fs/cp-sync.ts / cp.ts; node's path validation (checkPaths, checkParentPaths, EISDIR) still runs before the native handoff, and the bail-out path keeps reusing the already-computed stats via checked exactly as before.
  • Adds bench/fs-cp/cp.mjs: recursive copy of a 256-file tree, sync and promises, with and without a symlink in the tree (the symlink variant always exercises the walker), runnable under both bun and node.

Test plan

  • New tests in test/js/node/fs/cp.test.ts for both fs.cpSync and fs.promises.cp, asserting the behaviors the scan must protect: relative in-tree symlink targets are rewritten to absolute paths resolved against the source tree (verified against node v26.3.0), file and directory modes are preserved into a fresh destination, and a FIFO inside the tree is rejected with ERR_FS_CP_FIFO_PIPE. On macOS these run against trees that bail out of the fast path; the existing cp suite plus the ported node tests cover the trees that take it.
  • bun bd test test/js/node/fs/cp.test.ts — 46 pass, 0 fail (Linux debug build; Linux behavior is unchanged by this PR).
  • bench/fs-cp/cp.mjs runs under both node 26.3.0 and the debug build.
  • macOS CI green (the platform where the new branch is actually taken).

… is node-equivalent

Recursive fs.cp/cpSync/promises.cp always went through the JS walker, which
copies every file individually. On macOS the native path can clone the whole
tree with a single clonefile(), but it copies symlinks verbatim and clones
special files, while node rewrites relative symlink targets against the source
tree and raises ERR_FS_CP_SOCKET / ERR_FS_CP_FIFO_PIPE.

Route recursive copies to the native path only when the result is
indistinguishable from the walker: the destination does not exist (no merge
semantics) and a metadata-only readdir scan of the source tree finds nothing
but regular files and directories. Anything else - symlinks, FIFOs, sockets,
devices, entries with unknown types, or a scan error - keeps using the ported
walker. Non-macOS platforms are unchanged.

Adds a recursive fs.cp benchmark (bench/fs-cp) and tests covering the cases
the scan must bail on: relative in-tree symlink targets resolved against the
source tree, mode preservation into a fresh destination, and FIFO rejection.
@robobun

robobun commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator
Updated 8:14 PM PT - Jun 18th, 2026

@cirospaciari, your commit 7dd358d has 2 failures in Build #63431 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 32503

That installs a local version of the PR into your bun-32503 executable, so you can run:

bun-32503 --bun

@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

A macOS-only native fast path is added to both fs.cpSync and fs.promises.cp for recursive directory copies. A new tree scanner (treeContainsOnlyFilesAndDirs[Sync]) gates this path, requiring the destination to be absent and the entire source tree to contain only regular files and directories. New tests cover symlink rewriting, permission preservation, and FIFO rejection. A benchmark script is added.

Changes

Native fast path for recursive fs.cp on macOS

Layer / File(s) Summary
Tree scanner and native fast path gating (async and sync)
src/js/internal/fs/cp.ts, src/js/internal/fs/cp-sync.ts
Adds treeContainsOnlyFilesAndDirs (async) and treeContainsOnlyFilesAndDirsSync (sync), both stack-based walkers that return false on any non-file/non-directory entry or on readdir failure. Updates tryNativeFastPath and tryNativeFastPathSync to enable the native directory copy path only on macOS, when the destination does not exist, and when the tree scan passes; all other cases fall through to the ported walker.
Edge-case regression tests: symlinks, permissions, FIFO rejection
test/js/node/fs/cp.test.ts
Imports mkfifo and adds tests asserting relative-target symlinks are rewritten to destination-internal paths, permission modes are preserved in fresh destinations (non-Windows), and a FIFO in the source tree causes ERR_FS_CP_FIFO_PIPE.
Benchmark script for cpSync and fs.promises.cp
bench/fs-cp/cp.mjs
Adds a benchmark that builds a plain-files tree and a symlink-containing tree, registers four cases (cpSync/fs.promises.cp × each tree) with per-iteration destination resets, runs all benchmarks, and cleans up the temp root.
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: restoring macOS clonefile optimization for recursive fs.cp when there are no symlinks in the tree.
Description check ✅ Passed The description follows the template with both required sections filled out comprehensively, detailing what the PR does and providing thorough verification details including test results and benchmark information.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@bench/fs-cp/cp.mjs`:
- Around line 76-78: The temporary benchmark directory created in the root
variable is not cleaned up when the run() function fails, leaving behind
artifacts. Wrap the await run() call in a try/finally block, moving the
rmSync(root, { recursive: true, force: true }) statement into the finally block
to ensure cleanup occurs regardless of whether run() succeeds or rejects.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 471cc4a9-265a-40b7-a5b5-2b7c49624158

📥 Commits

Reviewing files that changed from the base of the PR and between dae2e87 and 7dd358d.

📒 Files selected for processing (4)
  • bench/fs-cp/cp.mjs
  • src/js/internal/fs/cp-sync.ts
  • src/js/internal/fs/cp.ts
  • test/js/node/fs/cp.test.ts

Comment thread bench/fs-cp/cp.mjs
Comment on lines +76 to +78
await run();

rmSync(root, { recursive: true, force: true });

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Clean up the benchmark root on failed runs.

If run() rejects, the temporary source/destination tree is left behind. Wrap the run in try/finally so failed benchmark iterations still clean up.

♻️ Proposed cleanup guard
-await run();
-
-rmSync(root, { recursive: true, force: true });
+try {
+  await run();
+} finally {
+  rmSync(root, { recursive: true, force: true });
+}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bench/fs-cp/cp.mjs` around lines 76 - 78, The temporary benchmark directory
created in the root variable is not cleaned up when the run() function fails,
leaving behind artifacts. Wrap the await run() call in a try/finally block,
moving the rmSync(root, { recursive: true, force: true }) statement into the
finally block to ensure cleanup occurs regardless of whether run() succeeds or
rejects.

Comment on lines +292 to +301
if (srcStat.isDirectory()) {
// On macOS the native path clones the whole tree with a single
// clonefile(). Only take it when the result is indistinguishable from
// node's walker: dest must not exist (no merge semantics) and the tree
// must contain only regular files and directories.
return {
ok: process.platform === "darwin" && !destStat && treeContainsOnlyFilesAndDirsSync(src),
checked,
};
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 When clonefile() fails at runtime (EXDEV cross-volume, ENOTSUP non-APFS, or ENOENT because dest's parent doesn't exist — the ok:true path skips checkParentDir), the native fallback creates every directory with mkdir_recursive_os_path(dest, args::Mkdir::DEFAULT_MODE, …) and never chmods it, so source directory modes are lost. Before this PR srcStat.isDirectory() always fell through to the JS walker (whose mkDirAndCopy → setDestMode preserves modes), so this is a regression for cross-volume / non-APFS / missing-parent recursive copies on macOS — exactly the cases the PR claims are "indistinguishable from node's walker". The same gap exists in the async path (src/js/internal/fs/cp.ts:166-175_cp_async_directory line 2067), and the new mode-preservation test won't catch it because tmpdir is same-volume APFS where clonefile() succeeds.

Extended reasoning...

What the bug is

tryNativeFastPathSync / tryNativeFastPath now return ok: true for a directory on macOS when dest is missing and the tree contains only regular files and directories. The dispatcher (src/js/node/fs.ts:947-949) then calls the native fs.cpSync directly, which lands in cp_sync_inner (src/runtime/node/node_fs.rs:8247). That function attempts clonefile() (line 8338-8341); on success, modes are preserved because clonefile is a metadata-faithful clone. But on any errno outside {ENAMETOOLONG, EROFS, EINVAL, EACCES, EPERM} — notably EXDEV (cross-volume), ENOTSUP (non-APFS target), ENOENT (dest parent missing), EEXIST — the _ => {} arm at line 8354 falls through to:

// node_fs.rs:8367
match self.mkdir_recursive_os_path(dest, args::Mkdir::DEFAULT_MODE, false) {

where DEFAULT_MODE is 0o777 (line 3467). The source directory's mode is never read and dest is never chmod'd. The loop at 8379-8448 then recurses into each subdirectory via cp_sync_inner (line 8413), which repeats the same mkdir-with-default-mode at every level. The async path is identical: _cp_async_directory at line 2067 calls mkdir_recursive_os_path(normdest, args::Mkdir::DEFAULT_MODE, false) after the clonefile fall-through at line 2025.

Why the gate doesn't prevent it

The new gate checks three things: process.platform === 'darwin', !destStat, and treeContainsOnlyFilesAndDirsSync(src). None of these tell you whether clonefile() will actually succeed. clonefile requires src and dest to be on the same APFS (or HFS+) volume; copying to an external USB drive, an SMB/NFS mount, or a FAT/exFAT volume fails with EXDEV or ENOTSUP regardless of the tree contents. Additionally, the ok: true branch bypasses checkParentDir (which the JS walker would run via cpSyncFn → checkParentDir → mkdirSync(destParent, { recursive: true })), so if dirname(dest) doesn't yet exist, clonefile() itself returns ENOENT and triggers the same fallback.

Step-by-step proof

On macOS, with /Volumes/USB an exFAT-formatted external drive:

  1. mkdir -m 0700 /tmp/src && mkdir -m 0750 /tmp/src/sub && touch /tmp/src/sub/f
  2. fs.cpSync('/tmp/src', '/Volumes/USB/dst', { recursive: true })
  3. tryNativeFastPathSync: darwin ✓, destStat is null ✓, tree is files+dirs only ✓ → ok: true
  4. Native cp_sync_inner runs clonefile('/tmp/src', '/Volumes/USB/dst', 0)EXDEV (cross-device)
  5. EXDEV matches _ => {} → fall through to mkdir_recursive_os_path('/Volumes/USB/dst', 0o777, false)dst created with 0o777 & ~umask (typically 0o755), not 0o700
  6. Recursion into sub: clonefile again fails EXDEV → mkdir_recursive_os_path('.../dst/sub', 0o777, false)0o755, not 0o750
  7. Result: statSync('/Volumes/USB/dst').mode & 0o777 === 0o755, statSync('/Volumes/USB/dst/sub').mode & 0o777 === 0o755

Node.js (and Bun before this PR) would have run mkDirAndCopy(srcMode, …) → setDestMode(dest, srcMode) and produced 0o700 / 0o750.

The ENOENT trigger needs no second volume: fs.cpSync('/tmp/src', '/tmp/does/not/exist/dst', { recursive: true }) on a single APFS volume hits the same path, because the dispatcher never creates /tmp/does/not/exist before calling native, clonefile() returns ENOENT, and the recursive mkdir creates both the parent chain and dst with 0o777.

Why this is a regression

Before this PR, the directory branch of tryNativeFastPathSync did not exist — srcStat.isDirectory() fell through to srcStat.isFile() && …ok: false, and the JS walker handled every directory copy with mkDirAndCopy → setDestMode(dest, srcMode). So on the immediately-preceding commit, all three scenarios above preserved directory modes. The PR description states the fast path is taken "when the result is indistinguishable from node's walker"; in these cases it observably isn't.

Why the new test doesn't catch it

The added "file and directory modes are preserved into a fresh destination" test runs entirely inside tempDirWithFiles (the process tmpdir), which on macOS CI is a single APFS volume. clonefile() succeeds there, so the fallback never runs and the test passes.

Suggested fix

Either (a) in the native fallback, stat the source directory and chmod(dest, src_mode & 0o7777) after mkdir_recursive_os_path at each recursion level (both cp_sync_inner ~line 8367 and _cp_async_directory ~line 2067), or (b) have the JS fast-path create dirname(dest) first (closes the ENOENT hole) and have the native entry point return a sentinel on top-level clonefile EXDEV/ENOTSUP so the JS layer can fall back to cpSyncFn/cpFn with the already-computed checked stats. Option (a) is simpler and also fixes Bun's pre-existing native cp for non-fs.cp callers.

Comment on lines +292 to +301
if (srcStat.isDirectory()) {
// On macOS the native path clones the whole tree with a single
// clonefile(). Only take it when the result is indistinguishable from
// node's walker: dest must not exist (no merge semantics) and the tree
// must contain only regular files and directories.
return {
ok: process.platform === "darwin" && !destStat && treeContainsOnlyFilesAndDirsSync(src),
checked,
};
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The whole-tree clonefile() fast path silently strips setuid/setgid bits from regular files in the cloned tree (per clonefile(2): "setuid and setgid bits are turned off in the mode bits for regular files"), whereas node's walker — and the immediately-preceding Bun release — restores them via chmod(dest, srcStat.mode) after each file copy. So a 04755 file inside a files-and-dirs-only tree now comes out 0755 on macOS, contradicting the "indistinguishable from node's walker" invariant this gate is meant to enforce. Same applies to the async gate at src/js/internal/fs/cp.ts:166-175; either bail out of the fast path when the scan sees (st_mode & (S_ISUID|S_ISGID)) on a file, or document this as an accepted divergence.

Extended reasoning...

What the bug is

Apple's clonefile(2) man page states that the cloned file's mode bits are taken from the source except that the setuid and setgid bits are turned off for regular files — this is unconditional kernel behaviour applied as a security measure regardless of caller privilege. Node's ported walker, by contrast, follows every copyFile() with setDestMode(dest, srcStat.mode)chmodSync(dest, srcStat.mode), where srcStat.mode is the full 12-bit st_mode from lstat and so includes S_ISUID/S_ISGID. The walker therefore preserves setuid/setgid bits; whole-tree clonefile() does not.

The code path

This PR's tryNativeFastPathSync (cp-sync.ts:292-301) and tryNativeFastPath (cp.ts:166-175) now return ok: true for a directory on macOS when destStat is null and treeContainsOnlyFilesAndDirsSync(src) passes. That hands the copy to the native path, which on macOS is the whole-tree clonefile() at src/runtime/node/node_fs.zig:6259 (sync cpSyncInner) / :827 (async _cpAsyncDirectory). On a zero return both sites immediately return .success / true (zig:6277, zig:844) — there is no subsequent per-file chmod walk to restore the stripped bits.

That the codebase already knows clonefile doesn't preserve full mode is corroborated by the per-file native path at node_fs.zig:6505-6507, which explicitly follows a successful single-file clonefile with Syscall.chmod(dest, stat_.mode).

Why the existing gate doesn't prevent it

treeContainsOnlyFilesAndDirsSync only inspects Dirent types from readdir(..., { withFileTypes: true }). A setuid regular file reports isFile() === true, so the scan happily approves it. The new "file and directory modes are preserved" test in cp.test.ts checks & 0o777 against 0o600/0o700 and would not detect a stripped 04xxx bit.

Step-by-step proof

  1. On macOS/APFS, create src/helper with mode 04755 inside a tree containing only regular files and directories (e.g. a vendored toolchain or an .app bundle's privileged helper).
  2. Call fs.cpSync('src', 'dest', { recursive: true }) with default options and no existing dest.
  3. tryNativeFastPathSync: srcStat.isDirectory() → true, process.platform === 'darwin' → true, !destStat → true, treeContainsOnlyFilesAndDirsSync('src') → true (helper is a regular file). Returns ok: true.
  4. Native cpSyncInner runs clonefile('src', 'dest', 0)0 on APFS → returns .success immediately.
  5. fs.statSync('dest/helper').mode & 0o7777 is 0o755, not 0o4755.
  6. Pre-PR (post-fs: port Node.js v26.3.0 fs tests and fix the gaps they surface — cp error semantics, watcher event delivery, watch ignore+AbortSignal, FileHandle pull/writer, glob port, opendir/Dir, mkdtempDisposable, rmdir-recursive end-of-life, mock.fn (+119 tests) #31830): tryNativeFastPathSync returned ok: false for all directories, the JS walker copied helper via copyFileSync then chmodSync(dest, srcStat.mode), and the result was 0o4755. So this is a behavioural regression vs. the immediately-preceding release as well as vs. Node.

Impact and relationship to other findings

This is distinct from the clonefile-fails fallback issue: that one concerns the manual recursion when clonefile() returns an error; this one fires precisely when clonefile() succeeds, which is the common APFS case the PR is optimizing for. Real-world impact is admittedly niche — setuid binaries inside fs.cp'd trees are uncommon (privileged helpers in .app bundles, sudo-like tools in vendored toolchains), and stripping setuid on copy is arguably the safer direction security-wise — but it does violate the PR's stated "indistinguishable from node's walker" invariant and is a silent divergence the readdir scan was specifically introduced to prevent.

How to fix

Either (a) have the scan lstat each regular file and bail to the walker when (st_mode & (S_ISUID | S_ISGID)) !== 0 (cheap relative to the copy itself, keeps the fast path for the overwhelmingly-common case), (b) walk the cloned tree post-success and chmod files whose source had those bits set, or (c) explicitly document this as an accepted divergence in the gate's comment so the "indistinguishable" claim isn't misleading.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants