fs: use clonefile for symlink-free recursive fs.cp on macOS #32503
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,78 @@ | ||
| // Recursive fs.cp / fs.cpSync benchmark. | ||
| // | ||
| // bun cp.mjs | ||
| // node cp.mjs | ||
| // | ||
| // The "regular files only" trees are eligible for the whole-tree clonefile() | ||
| // fast path on macOS; the trees containing a symlink always go through the | ||
| // node-ported walker. | ||
| import { cpSync, mkdirSync, promises, rmSync, symlinkSync, writeFileSync } from "node:fs"; | ||
| import { tmpdir } from "node:os"; | ||
| import { join } from "node:path"; | ||
| import { bench, run } from "../runner.mjs"; | ||
| | ||
| const root = join(tmpdir(), `bench-fs-cp-${process.pid}`); | ||
| rmSync(root, { recursive: true, force: true }); | ||
| | ||
| const DIRS = 16; | ||
| const FILES_PER_DIR = 16; | ||
| const data = Buffer.alloc(4096, "a"); | ||
| | ||
| function makeTree(src, { withSymlink = false } = {}) { | ||
| for (let d = 0; d < DIRS; d++) { | ||
| const dir = join(src, `dir-${d}`); | ||
| mkdirSync(dir, { recursive: true }); | ||
| for (let f = 0; f < FILES_PER_DIR; f++) { | ||
| writeFileSync(join(dir, `file-${f}.txt`), data); | ||
| } | ||
| } | ||
| if (withSymlink) { | ||
| symlinkSync(join("dir-0", "file-0.txt"), join(src, "link")); | ||
| } | ||
| } | ||
| | ||
| const plainSrc = join(root, "plain-src"); | ||
| makeTree(plainSrc); | ||
| const symlinkSrc = join(root, "symlink-src"); | ||
| makeTree(symlinkSrc, { withSymlink: true }); | ||
| | ||
| const destRoot = join(root, "dest"); | ||
| mkdirSync(destRoot, { recursive: true }); | ||
| let destCount = 0; | ||
| | ||
| // Each copy goes to a brand-new destination (an existing destination switches | ||
| // fs.cp into its merge semantics, which is a different operation). The | ||
| // computed parameter clears out previously created destinations without | ||
| // counting towards the measured time. | ||
| function recursiveCopyBench(label, copyOne) { | ||
| bench(label, function* () { | ||
| yield { | ||
| [0]() { | ||
| rmSync(destRoot, { recursive: true, force: true }); | ||
| mkdirSync(destRoot, { recursive: true }); | ||
| return destRoot; | ||
| }, | ||
| bench(base) { | ||
| return copyOne(join(base, `d${destCount++}`)); | ||
| }, | ||
| }; | ||
| }); | ||
| } | ||
| | ||
| const totalFiles = DIRS * FILES_PER_DIR; | ||
| recursiveCopyBench(`cpSync recursive (${totalFiles} files, regular files only)`, dest => | ||
| cpSync(plainSrc, dest, { recursive: true }), | ||
| ); | ||
| recursiveCopyBench(`cpSync recursive (${totalFiles} files, tree contains a symlink)`, dest => | ||
| cpSync(symlinkSrc, dest, { recursive: true }), | ||
| ); | ||
| recursiveCopyBench(`fs.promises.cp recursive (${totalFiles} files, regular files only)`, dest => | ||
| promises.cp(plainSrc, dest, { recursive: true }), | ||
| ); | ||
| recursiveCopyBench(`fs.promises.cp recursive (${totalFiles} files, tree contains a symlink)`, dest => | ||
| promises.cp(symlinkSrc, dest, { recursive: true }), | ||
| ); | ||
| | ||
| await run(); | ||
| | ||
| rmSync(root, { recursive: true, force: true }); | ||
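The comment on `recursiveCopyBench` describes a setup step that resets the destination without counting towards the measured time. A minimal sketch of that idea without the runner, assuming nothing about the API of `../runner.mjs`: `timeWithUntimedSetup` is a hypothetical helper, and the one-file tree stands in for `makeTree()`.

```javascript
const { cpSync, existsSync, mkdirSync, mkdtempSync, rmSync, writeFileSync } = require("node:fs");
const { tmpdir } = require("node:os");
const { join } = require("node:path");
const { performance } = require("node:perf_hooks");

// Hypothetical helper, not part of ../runner.mjs: runs `setup` outside the
// timed window and measures only `body`.
function timeWithUntimedSetup(setup, body, samples = 3) {
  const durations = [];
  for (let i = 0; i < samples; i++) {
    const arg = setup(); // untimed: reset state for this sample
    const start = performance.now();
    body(arg); // timed: the operation under test
    durations.push(performance.now() - start);
  }
  return durations;
}

const root = mkdtempSync(join(tmpdir(), "cp-sketch-"));
const src = join(root, "src");
const destRoot = join(root, "dest");
mkdirSync(src);
writeFileSync(join(src, "file.txt"), "a"); // stand-in for makeTree()

let destCount = 0;
let durations;
try {
  durations = timeWithUntimedSetup(
    () => {
      // Clear earlier destinations so every copy targets a path that does
      // not exist yet (an existing destination would trigger a merge).
      rmSync(destRoot, { recursive: true, force: true });
      mkdirSync(destRoot, { recursive: true });
      return destRoot;
    },
    base => cpSync(src, join(base, `d${destCount++}`), { recursive: true }),
  );
} finally {
  rmSync(root, { recursive: true, force: true });
}
console.log(durations.length, "samples; temp tree removed:", !existsSync(root));
```

Keeping the reset in the untimed step means each sample measures a copy into a fresh destination only, which is the case the fast path targets.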
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -245,6 +245,35 @@ | ||
| return checkParentPathsSync(src, srcStat, destParent); | ||
| } | ||
| | ||
| // The native recursive copy (a single clonefile() on macOS) copies symlinks | ||
| // verbatim and clones special files, while node rewrites relative symlink | ||
| // targets against the source tree and raises ERR_FS_CP_SOCKET / | ||
| // ERR_FS_CP_FIFO_PIPE. It is therefore only node-equivalent for trees made of | ||
| // regular files and directories; anything else — including entries whose type | ||
| // the filesystem does not report — bails to the ported walker. Scan errors | ||
| // also bail so the walker surfaces them the way node would. | ||
| function treeContainsOnlyFilesAndDirsSync(root) { | ||
| const stack = [root]; | ||
| while (stack.length) { | ||
| const dir = stack.pop(); | ||
| let entries; | ||
| try { | ||
| entries = readdirSync(dir, { withFileTypes: true }); | ||
| } catch { | ||
| return false; | ||
| } | ||
| for (let i = 0; i < entries.length; i++) { | ||
| const entry = entries[i]; | ||
| if (entry.isDirectory()) { | ||
| stack.push(join(dir, entry.name)); | ||
| } else if (!entry.isFile()) { | ||
| return false; | ||
| } | ||
| } | ||
| } | ||
| return true; | ||
| } | ||
| | ||
| // node-correct validation before handing off to the native fast path | ||
| // (which performs the copy but does not implement node's cp error codes). | ||
| function tryNativeFastPathSync(src, dest, opts) { | ||
| @@ -260,10 +289,20 @@ | ||
| code: "EISDIR", | ||
| }); | ||
| } | ||
| // The native copy is only node-equivalent for regular-file -> regular-file | ||
| // (or missing dest). Symlinks (node resolves relative link targets), | ||
| // directories (may contain symlinks), and special files (node-specific | ||
| // error codes) must go through the ported implementation. | ||
| if (srcStat.isDirectory()) { | ||
| // On macOS the native path clones the whole tree with a single | ||
| // clonefile(). Only take it when the result is indistinguishable from | ||
| // node's walker: dest must not exist (no merge semantics) and the tree | ||
| // must contain only regular files and directories. | ||
| return { | ||
| ok: process.platform === "darwin" && !destStat && treeContainsOnlyFilesAndDirsSync(src), | ||
| checked, | ||
| }; | ||
| } | ||
| Check failure on line 301 in src/js/internal/fs/cp-sync.ts | ||
|
Comment on lines +292 to +301
Contributor
🔴 When […]
Extended reasoning...
What the bug is
// node_fs.rs:8367
match self.mkdir_recursive_os_path(dest, args::Mkdir::DEFAULT_MODE, false) {
where […]
Why the gate doesn't prevent it
The new gate checks three things: […]
Step-by-step proof
On macOS, with […]
Node.js (and Bun before this PR) would have run […] The ENOENT trigger needs no second volume: […]
Why this is a regression
Before this PR, the directory branch of […]
Why the new test doesn't catch it
The added "file and directory modes are preserved into a fresh destination" test runs entirely inside […]
Suggested fix
Either (a) in the native fallback, […]
Comment on lines +292 to +301
Contributor
🟡 The whole-tree […]
Extended reasoning...
What the bug is
Apple's […]
The code path
This PR's […] That the codebase already knows […]
Why the existing gate doesn't prevent it
[…]
Step-by-step proof
[…]
Impact and relationship to other findings
This is distinct from the clonefile-fails fallback issue: that one concerns the manual recursion when […]
How to fix
Either (a) have the scan […] |
||
| // The single-file native copy is only node-equivalent for regular-file -> | ||
| // regular-file (or missing dest). Symlinks (node resolves relative link | ||
| // targets) and special files (node-specific error codes) must go through | ||
| // the ported implementation. | ||
| return { ok: srcStat.isFile() && (!destStat || destStat.isFile()), checked }; | ||
| } | ||
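To make the directory gate concrete, here is a small standalone sketch of the scan applied to two temporary trees and one missing path. `containsOnlyFilesAndDirs` is an illustrative copy of the logic in `treeContainsOnlyFilesAndDirsSync`, not the function from `cp-sync.ts` itself, and the tree layout is made up for the example.

```javascript
const { mkdirSync, mkdtempSync, readdirSync, rmSync, symlinkSync, writeFileSync } = require("node:fs");
const { tmpdir } = require("node:os");
const { join } = require("node:path");

// Illustrative copy of the scan: true only when every entry under `root`
// is a regular file or a directory. Any read error also yields false.
function containsOnlyFilesAndDirs(root) {
  const stack = [root];
  while (stack.length) {
    const dir = stack.pop();
    let entries;
    try {
      entries = readdirSync(dir, { withFileTypes: true });
    } catch {
      return false; // unreadable or missing: leave it to the walker
    }
    for (const entry of entries) {
      if (entry.isDirectory()) {
        stack.push(join(dir, entry.name));
      } else if (!entry.isFile()) {
        return false; // symlink, socket, FIFO, device, or unknown type
      }
    }
  }
  return true;
}

const root = mkdtempSync(join(tmpdir(), "cp-gate-"));
let plainOk, symlinkOk, missingOk;
try {
  const plain = join(root, "plain");
  mkdirSync(join(plain, "dir-0"), { recursive: true });
  writeFileSync(join(plain, "dir-0", "file-0.txt"), "a");

  const withLink = join(root, "with-link");
  mkdirSync(join(withLink, "dir-0"), { recursive: true });
  writeFileSync(join(withLink, "dir-0", "file-0.txt"), "a");
  // Relative link target, as in the benchmark's makeTree().
  symlinkSync(join("dir-0", "file-0.txt"), join(withLink, "link"));

  plainOk = containsOnlyFilesAndDirs(plain);
  symlinkOk = containsOnlyFilesAndDirs(withLink);
  missingOk = containsOnlyFilesAndDirs(join(root, "missing"));
} finally {
  rmSync(root, { recursive: true, force: true });
}
console.log({ plainOk, symlinkOk, missingOk });
```

On a POSIX system the plain tree passes the scan, while the tree with a symlink and the missing path both fail it, so only the first would be eligible for the whole-tree fast path.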
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Clean up the benchmark root on failed runs.
If `run()` rejects, the temporary source/destination tree is left behind. Wrap the run in `try`/`finally` so failed benchmark iterations still clean up.
♻️ Proposed cleanup guard
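The reviewer's proposed diff is not shown here. The following is only a minimal sketch of such a guard, assuming `run()` returns a promise. The `run` defined below is a stand-in that always rejects so the cleanup can be observed, and the outer `catch` exists only so the sketch can report the failure.

```javascript
const { existsSync, mkdtempSync, rmSync } = require("node:fs");
const { tmpdir } = require("node:os");
const { join } = require("node:path");

// Stand-in for run() from ../runner.mjs (assumed to return a promise).
// It rejects to simulate a failed benchmark run.
async function run() {
  throw new Error("simulated benchmark failure");
}

async function main() {
  const root = mkdtempSync(join(tmpdir(), "bench-fs-cp-"));
  let failure = null;
  try {
    try {
      await run();
    } finally {
      // Runs whether run() resolves or rejects.
      rmSync(root, { recursive: true, force: true });
    }
  } catch (err) {
    failure = err; // caught only so the sketch can report it
  }
  return { cleanedUp: !existsSync(root), failure };
}

const outcome = main();
outcome.then(({ cleanedUp, failure }) => {
  console.log({ cleanedUp, failure: failure && failure.message });
});
```

In the benchmark itself the inner `try`/`finally` around `await run()` would be enough; the rejection can keep propagating so the process still exits with an error.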