fs: use clonefile for symlink-free recursive fs.cp on macOS#32503
fs: use clonefile for symlink-free recursive fs.cp on macOS#32503cirospaciari wants to merge 1 commit into
Conversation
… is node-equivalent Recursive fs.cp/cpSync/promises.cp always went through the JS walker, which copies every file individually. On macOS the native path can clone the whole tree with a single clonefile(), but it copies symlinks verbatim and clones special files, while node rewrites relative symlink targets against the source tree and raises ERR_FS_CP_SOCKET / ERR_FS_CP_FIFO_PIPE. Route recursive copies to the native path only when the result is indistinguishable from the walker: the destination does not exist (no merge semantics) and a metadata-only readdir scan of the source tree finds nothing but regular files and directories. Anything else - symlinks, FIFOs, sockets, devices, entries with unknown types, or a scan error - keeps using the ported walker. Non-macOS platforms are unchanged. Adds a recursive fs.cp benchmark (bench/fs-cp) and tests covering the cases the scan must bail on: relative in-tree symlink targets resolved against the source tree, mode preservation into a fresh destination, and FIFO rejection.
|
Updated 8:14 PM PT - Jun 18th, 2026
❌ @cirospaciari, your commit 7dd358d has 2 failures in
🧪 To try this PR locally: bunx bun-pr 32503That installs a local version of the PR into your bun-32503 --bun |
WalkthroughA macOS-only native fast path is added to both ChangesNative fast path for recursive fs.cp on macOS
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@bench/fs-cp/cp.mjs`:
- Around line 76-78: The temporary benchmark directory created in the root
variable is not cleaned up when the run() function fails, leaving behind
artifacts. Wrap the await run() call in a try/finally block, moving the
rmSync(root, { recursive: true, force: true }) statement into the finally block
to ensure cleanup occurs regardless of whether run() succeeds or rejects.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 471cc4a9-265a-40b7-a5b5-2b7c49624158
📒 Files selected for processing (4)
bench/fs-cp/cp.mjssrc/js/internal/fs/cp-sync.tssrc/js/internal/fs/cp.tstest/js/node/fs/cp.test.ts
| await run(); | ||
|
|
||
| rmSync(root, { recursive: true, force: true }); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Clean up the benchmark root on failed runs.
If run() rejects, the temporary source/destination tree is left behind. Wrap the run in try/finally so failed benchmark iterations still clean up.
♻️ Proposed cleanup guard
-await run();
-
-rmSync(root, { recursive: true, force: true });
+try {
+ await run();
+} finally {
+ rmSync(root, { recursive: true, force: true });
+}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@bench/fs-cp/cp.mjs` around lines 76 - 78, The temporary benchmark directory
created in the root variable is not cleaned up when the run() function fails,
leaving behind artifacts. Wrap the await run() call in a try/finally block,
moving the rmSync(root, { recursive: true, force: true }) statement into the
finally block to ensure cleanup occurs regardless of whether run() succeeds or
rejects.
| if (srcStat.isDirectory()) { | ||
| // On macOS the native path clones the whole tree with a single | ||
| // clonefile(). Only take it when the result is indistinguishable from | ||
| // node's walker: dest must not exist (no merge semantics) and the tree | ||
| // must contain only regular files and directories. | ||
| return { | ||
| ok: process.platform === "darwin" && !destStat && treeContainsOnlyFilesAndDirsSync(src), | ||
| checked, | ||
| }; | ||
| } |
There was a problem hiding this comment.
🔴 When clonefile() fails at runtime (EXDEV cross-volume, ENOTSUP non-APFS, or ENOENT because dest's parent doesn't exist — the ok:true path skips checkParentDir), the native fallback creates every directory with mkdir_recursive_os_path(dest, args::Mkdir::DEFAULT_MODE, …) and never chmods it, so source directory modes are lost. Before this PR srcStat.isDirectory() always fell through to the JS walker (whose mkDirAndCopy → setDestMode preserves modes), so this is a regression for cross-volume / non-APFS / missing-parent recursive copies on macOS — exactly the cases the PR claims are "indistinguishable from node's walker". The same gap exists in the async path (src/js/internal/fs/cp.ts:166-175 → _cp_async_directory line 2067), and the new mode-preservation test won't catch it because tmpdir is same-volume APFS where clonefile() succeeds.
Extended reasoning...
What the bug is
tryNativeFastPathSync / tryNativeFastPath now return ok: true for a directory on macOS when dest is missing and the tree contains only regular files and directories. The dispatcher (src/js/node/fs.ts:947-949) then calls the native fs.cpSync directly, which lands in cp_sync_inner (src/runtime/node/node_fs.rs:8247). That function attempts clonefile() (line 8338-8341); on success, modes are preserved because clonefile is a metadata-faithful clone. But on any errno outside {ENAMETOOLONG, EROFS, EINVAL, EACCES, EPERM} — notably EXDEV (cross-volume), ENOTSUP (non-APFS target), ENOENT (dest parent missing), EEXIST — the _ => {} arm at line 8354 falls through to:
// node_fs.rs:8367
match self.mkdir_recursive_os_path(dest, args::Mkdir::DEFAULT_MODE, false) {where DEFAULT_MODE is 0o777 (line 3467). The source directory's mode is never read and dest is never chmod'd. The loop at 8379-8448 then recurses into each subdirectory via cp_sync_inner (line 8413), which repeats the same mkdir-with-default-mode at every level. The async path is identical: _cp_async_directory at line 2067 calls mkdir_recursive_os_path(normdest, args::Mkdir::DEFAULT_MODE, false) after the clonefile fall-through at line 2025.
Why the gate doesn't prevent it
The new gate checks three things: process.platform === 'darwin', !destStat, and treeContainsOnlyFilesAndDirsSync(src). None of these tell you whether clonefile() will actually succeed. clonefile requires src and dest to be on the same APFS (or HFS+) volume; copying to an external USB drive, an SMB/NFS mount, or a FAT/exFAT volume fails with EXDEV or ENOTSUP regardless of the tree contents. Additionally, the ok: true branch bypasses checkParentDir (which the JS walker would run via cpSyncFn → checkParentDir → mkdirSync(destParent, { recursive: true })), so if dirname(dest) doesn't yet exist, clonefile() itself returns ENOENT and triggers the same fallback.
Step-by-step proof
On macOS, with /Volumes/USB an exFAT-formatted external drive:
mkdir -m 0700 /tmp/src && mkdir -m 0750 /tmp/src/sub && touch /tmp/src/sub/ffs.cpSync('/tmp/src', '/Volumes/USB/dst', { recursive: true })tryNativeFastPathSync: darwin ✓,destStatis null ✓, tree is files+dirs only ✓ →ok: true- Native
cp_sync_innerrunsclonefile('/tmp/src', '/Volumes/USB/dst', 0)→ EXDEV (cross-device) - EXDEV matches
_ => {}→ fall through tomkdir_recursive_os_path('/Volumes/USB/dst', 0o777, false)→dstcreated with0o777 & ~umask(typically0o755), not0o700 - Recursion into
sub:clonefileagain fails EXDEV →mkdir_recursive_os_path('.../dst/sub', 0o777, false)→0o755, not0o750 - Result:
statSync('/Volumes/USB/dst').mode & 0o777 === 0o755,statSync('/Volumes/USB/dst/sub').mode & 0o777 === 0o755
Node.js (and Bun before this PR) would have run mkDirAndCopy(srcMode, …) → setDestMode(dest, srcMode) and produced 0o700 / 0o750.
The ENOENT trigger needs no second volume: fs.cpSync('/tmp/src', '/tmp/does/not/exist/dst', { recursive: true }) on a single APFS volume hits the same path, because the dispatcher never creates /tmp/does/not/exist before calling native, clonefile() returns ENOENT, and the recursive mkdir creates both the parent chain and dst with 0o777.
Why this is a regression
Before this PR, the directory branch of tryNativeFastPathSync did not exist — srcStat.isDirectory() fell through to srcStat.isFile() && … → ok: false, and the JS walker handled every directory copy with mkDirAndCopy → setDestMode(dest, srcMode). So on the immediately-preceding commit, all three scenarios above preserved directory modes. The PR description states the fast path is taken "when the result is indistinguishable from node's walker"; in these cases it observably isn't.
Why the new test doesn't catch it
The added "file and directory modes are preserved into a fresh destination" test runs entirely inside tempDirWithFiles (the process tmpdir), which on macOS CI is a single APFS volume. clonefile() succeeds there, so the fallback never runs and the test passes.
Suggested fix
Either (a) in the native fallback, stat the source directory and chmod(dest, src_mode & 0o7777) after mkdir_recursive_os_path at each recursion level (both cp_sync_inner ~line 8367 and _cp_async_directory ~line 2067), or (b) have the JS fast-path create dirname(dest) first (closes the ENOENT hole) and have the native entry point return a sentinel on top-level clonefile EXDEV/ENOTSUP so the JS layer can fall back to cpSyncFn/cpFn with the already-computed checked stats. Option (a) is simpler and also fixes Bun's pre-existing native cp for non-fs.cp callers.
| if (srcStat.isDirectory()) { | ||
| // On macOS the native path clones the whole tree with a single | ||
| // clonefile(). Only take it when the result is indistinguishable from | ||
| // node's walker: dest must not exist (no merge semantics) and the tree | ||
| // must contain only regular files and directories. | ||
| return { | ||
| ok: process.platform === "darwin" && !destStat && treeContainsOnlyFilesAndDirsSync(src), | ||
| checked, | ||
| }; | ||
| } |
There was a problem hiding this comment.
🟡 The whole-tree clonefile() fast path silently strips setuid/setgid bits from regular files in the cloned tree (per clonefile(2): "setuid and setgid bits are turned off in the mode bits for regular files"), whereas node's walker — and the immediately-preceding Bun release — restores them via chmod(dest, srcStat.mode) after each file copy. So a 04755 file inside a files-and-dirs-only tree now comes out 0755 on macOS, contradicting the "indistinguishable from node's walker" invariant this gate is meant to enforce. Same applies to the async gate at src/js/internal/fs/cp.ts:166-175; either bail out of the fast path when the scan sees (st_mode & (S_ISUID|S_ISGID)) on a file, or document this as an accepted divergence.
Extended reasoning...
What the bug is
Apple's clonefile(2) man page states that the cloned file's mode bits are taken from the source except that the setuid and setgid bits are turned off for regular files — this is unconditional kernel behaviour applied as a security measure regardless of caller privilege. Node's ported walker, by contrast, follows every copyFile() with setDestMode(dest, srcStat.mode) → chmodSync(dest, srcStat.mode), where srcStat.mode is the full 12-bit st_mode from lstat and so includes S_ISUID/S_ISGID. The walker therefore preserves setuid/setgid bits; whole-tree clonefile() does not.
The code path
This PR's tryNativeFastPathSync (cp-sync.ts:292-301) and tryNativeFastPath (cp.ts:166-175) now return ok: true for a directory on macOS when destStat is null and treeContainsOnlyFilesAndDirsSync(src) passes. That hands the copy to the native path, which on macOS is the whole-tree clonefile() at src/runtime/node/node_fs.zig:6259 (sync cpSyncInner) / :827 (async _cpAsyncDirectory). On a zero return both sites immediately return .success / true (zig:6277, zig:844) — there is no subsequent per-file chmod walk to restore the stripped bits.
That the codebase already knows clonefile doesn't preserve full mode is corroborated by the per-file native path at node_fs.zig:6505-6507, which explicitly follows a successful single-file clonefile with Syscall.chmod(dest, stat_.mode).
Why the existing gate doesn't prevent it
treeContainsOnlyFilesAndDirsSync only inspects Dirent types from readdir(..., { withFileTypes: true }). A setuid regular file reports isFile() === true, so the scan happily approves it. The new "file and directory modes are preserved" test in cp.test.ts checks & 0o777 against 0o600/0o700 and would not detect a stripped 04xxx bit.
Step-by-step proof
- On macOS/APFS, create
src/helperwith mode04755inside a tree containing only regular files and directories (e.g. a vendored toolchain or an.appbundle's privileged helper). - Call
fs.cpSync('src', 'dest', { recursive: true })with default options and no existingdest. tryNativeFastPathSync:srcStat.isDirectory()→ true,process.platform === 'darwin'→ true,!destStat→ true,treeContainsOnlyFilesAndDirsSync('src')→ true (helperis a regular file). Returnsok: true.- Native
cpSyncInnerrunsclonefile('src', 'dest', 0)→0on APFS → returns.successimmediately. fs.statSync('dest/helper').mode & 0o7777is0o755, not0o4755.- Pre-PR (post-fs: port Node.js v26.3.0 fs tests and fix the gaps they surface — cp error semantics, watcher event delivery, watch ignore+AbortSignal, FileHandle pull/writer, glob port, opendir/Dir, mkdtempDisposable, rmdir-recursive end-of-life, mock.fn (+119 tests) #31830):
tryNativeFastPathSyncreturnedok: falsefor all directories, the JS walker copiedhelperviacopyFileSyncthenchmodSync(dest, srcStat.mode), and the result was0o4755. So this is a behavioural regression vs. the immediately-preceding release as well as vs. Node.
Impact and relationship to other findings
This is distinct from the clonefile-fails fallback issue: that one concerns the manual recursion when clonefile() returns an error; this one fires precisely when clonefile() succeeds, which is the common APFS case the PR is optimizing for. Real-world impact is admittedly niche — setuid binaries inside fs.cp'd trees are uncommon (privileged helpers in .app bundles, sudo-like tools in vendored toolchains), and stripping setuid on copy is arguably the safer direction security-wise — but it does violate the PR's stated "indistinguishable from node's walker" invariant and is a silent divergence the readdir scan was specifically introduced to prevent.
How to fix
Either (a) have the scan lstat each regular file and bail to the walker when (st_mode & (S_ISUID | S_ISGID)) !== 0 (cheap relative to the copy itself, keeps the fast path for the overwhelmingly-common case), (b) walk the cloned tree post-success and chmod files whose source had those bits set, or (c) explicitly document this as an accepted divergence in the gate's comment so the "indistinguishable" claim isn't misleading.
Follow-up to #31830. That PR routed all recursive
fs.cp/fs.cpSync/fs.promises.cpcalls through the node-ported JS walker so relative-symlink rewriting and theERR_FS_CP_*error codes match node, which meant recursive copies on macOS no longer used a single whole-treeclonefile(). This restores the fast path for the cases where it is indistinguishable from the walker.Summary
fs.cp/cpSync/promises.cpgo back to the native path (oneclonefile()of the whole tree on macOS) when: the options are the defaults already required for the existing single-file fast path (no filter/dereference/preserveTimestamps/verbatimSymlinks/mode/errorOnExist, force), the destination does not exist (so node's merge semantics never come into play), and a metadata-only readdir scan of the source tree finds nothing but regular files and directories.ERR_FS_CP_*errors), entries whose type the filesystem does not report, an existing destination, or a scan error. Non-macOS platforms are completely unchanged — the directory branch still returnsok: falsethere, and the scan call is dead-code-eliminated.tryNativeFastPathSync/tryNativeFastPathinsrc/js/internal/fs/cp-sync.ts/cp.ts; node's path validation (checkPaths,checkParentPaths, EISDIR) still runs before the native handoff, and the bail-out path keeps reusing the already-computed stats viacheckedexactly as before.bench/fs-cp/cp.mjs: recursive copy of a 256-file tree, sync and promises, with and without a symlink in the tree (the symlink variant always exercises the walker), runnable under both bun and node.Test plan
test/js/node/fs/cp.test.tsfor bothfs.cpSyncandfs.promises.cp, asserting the behaviors the scan must protect: relative in-tree symlink targets are rewritten to absolute paths resolved against the source tree (verified against node v26.3.0), file and directory modes are preserved into a fresh destination, and a FIFO inside the tree is rejected withERR_FS_CP_FIFO_PIPE. On macOS these run against trees that bail out of the fast path; the existing cp suite plus the ported node tests cover the trees that take it.bun bd test test/js/node/fs/cp.test.ts— 46 pass, 0 fail (Linux debug build; Linux behavior is unchanged by this PR).bench/fs-cp/cp.mjsruns under both node 26.3.0 and the debug build.