-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: extract tarball to temp directory on Windows #2846
fix: extract tarball to temp directory on Windows #2846
Conversation
throw new Error(k + ' local checksum ' + contentShasums[k] + ' not match remote ' + expectShasums[k]) | ||
// copy over the files from the temp tarball extract directory to devDir | ||
if (tarExtractDir !== devDir) { | ||
await copyDirectory(tarExtractDir, devDir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we wouldn't want to rename
/ move
here instead of copy? This could still race condition if it's not atomic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we wouldn't want to
rename
/move
here instead of copy?
Since there are parallel installs happening, there's no way to atomically rename to devDir
. That directory is created (if it doesn't already exist) at the start of the install process to confirm permissions, so a direct rename of tarExtractDir
to devDir
would fail. Any rename process would need to first rename devDir
, which opens a race condition where devDir
disappears momentarily during an ongoing build, causing a failure. That was my original approach for fixing this issue, but I switched to the current implementation to close that race condition.
This could still race condition if it's not atomic?
The copyFile
approach stays similar to the existing implementation, with node-tar
overwriting the existing files when there are parallel installs going on. The main difference is that on Windows node-tar
always unlinks the destination file path before writing to it to avoid certain conditions (same path multiple times in an archive) which don't exist for this usage.
I've stress tested this change (on CI and locally with a real package) with 35 parallel installs and didn't trip over any race conditions in practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Since there are already 2 PRs addressing GitHub Action issues, we should make sure 6bb7cd6 is removed from here before landing it. Ideally, we'd first land a GitHub Actions fix, then rebase this PR, remove the mentioned commit and push changes.
I have no objection to these changes but I am not a Windows user so I lack the skills to be a reviewer. |
Hey, @dsanders11 can you prepare your branch for merging by removing 6bb7cd6 and potentially addressing #2846 (comment) and squashing some commits together (especially the ones addressing PR comments)? |
Signed-off-by: David Sanders <[email protected]>
Signed-off-by: David Sanders <[email protected]>
Signed-off-by: David Sanders <[email protected]>
2bb914b
to
9f9d7fb
Compare
@StefanStojanovic, done! Removed 6bb7cd6 and squashed it back down to the original three commits. For CI reference, 2bb914b was the last commit before removing 6bb7cd6, and it went green. |
* fix: check for errors while extracting downloaded tarball Signed-off-by: David Sanders <[email protected]> * test: parallel installs Signed-off-by: David Sanders <[email protected]> * fix: extract tarball to temp directory on Windows Signed-off-by: David Sanders <[email protected]> --------- Signed-off-by: David Sanders <[email protected]>
Checklist
npm install && npm test
passesDescription of change
Fixes #2482.
Fixes #2484.
Fixes #2584.
Fixes #2683.
Fixes #2751.
Fixes #2832.
NOTE: This PR contains a temporary initial commit which reverts a PR which causes CI workflow failures on
main
, so that the tests for this PR can run clean. It should be removed before merging this PR. #2837 is currently trying to fix that issue.node-tar
unlinks file paths before writing to them on Windows, which causes errors when there are parallel installs going on which are extracting the headers tarball todevDir
. This currently leads to random corruption of the cache atdevDir
with missing files and hanging files with<filename>.DELETE.<random>
names. See more in-depth discussion about the underlying issue here.This PR fixes the situation by:
node-tar
to watch for errors during extractionexponential-backoff
package in case of collision (EBUSY
) when copying the files todevDir
installVersion
to force clearing of existing cached versions which might be corrupt on WindowsThis is set up in commits which were pushed one at a time to this PR so that the CI workflows for each commit can be seen, confirming that the added tests fail until the final fix is applied.
Unfortunately the new tests, despite being simple, are a bit flaky, due to network requests to https://nodejs.org/download/release/. The tests also take significant longer (greater than 2x) on Windows with Node.js 18.x and 20.x - it's not clear why, but since it repros consistently with those versions, it seems likely to be some upstream regression in Node.js, possibly something to do with the
fs
module.