Corrupted unzipped files when using Extract #286

Closed
marcogrcr opened this issue Sep 12, 2023 · 15 comments

@marcogrcr

I'm trying to unzip a .zip file containing chromedriver using [email protected]. When I extract chromedriver_linux64.zip (downloaded from here) with unzipper, the extraction succeeds but the executable is corrupted. For comparison, the unzip CLI produces a working binary:

$ unzip chromedriver_linux64.zip

$ md5sum chromedriver
ee3dba5202ae87d7b79c22341030db49

$ ./chromedriver
Starting ChromeDriver 111.0.5563.64 (c710e93d5b63b7095afe8c2c17df34408078439d-refs/branch-heads/5563@{#995}) on port 9515
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.

Extracting the same archive with unzipper:

import { createReadStream } from "node:fs";
import { Extract } from "unzipper";

await createReadStream("chromedriver_linux64.zip")
      .pipe(Extract({ path: "/tmp" }))
      .promise();

Then, when I run:

$ md5sum /tmp/chromedriver
641c5ede222e09e76fe52bc1bc61c0a9

$ /tmp/chromedriver
zsh: permission denied: /tmp/chromedriver

$ chmod +x /tmp/chromedriver

$ /tmp/chromedriver
zsh: exec format error: /tmp/chromedriver

Since I wasn't sure if I was using the .promise() method correctly, I also tried using the equivalent of the README example, but got the same results:

import { createReadStream } from "node:fs";
import { Extract } from "unzipper";

// Same archive and destination as in the first snippet.
const driverPkg = "chromedriver_linux64.zip";
const destinationFolder = "/tmp";

await new Promise((resolve, reject) => {
  createReadStream(driverPkg)
    .pipe(Extract({ path: destinationFolder }))
    .on("close", () => resolve(void 0))
    .on("error", reject);
});

However, this works just fine with [email protected]:

import extractZip from "extract-zip";

await extractZip("chromedriver_linux64.zip", { dir: "/tmp" })

Then, when I run:

$ md5sum /tmp/chromedriver
ee3dba5202ae87d7b79c22341030db49

$ /tmp/chromedriver
Starting ChromeDriver 111.0.5563.64 (c710e93d5b63b7095afe8c2c17df34408078439d-refs/branch-heads/5563@{#995}) on port 9515
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
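
The comparison above comes down to checking an extracted file's MD5 against a known-good value. Here is a minimal sketch of doing that check from Node.js right after extraction. The helper names md5Hex and assertChecksum are hypothetical (they are not part of unzipper or extract-zip), and the snippet is written as CommonJS so it runs as a plain script.

```javascript
const { createHash } = require("node:crypto");
const { readFileSync } = require("node:fs");

// Hypothetical helper: hex-encoded MD5 digest of a Buffer or string.
function md5Hex(data) {
  return createHash("md5").update(data).digest("hex");
}

// Hypothetical helper: throws if the file's MD5 differs from the expected digest.
function assertChecksum(filePath, expectedMd5) {
  const actual = md5Hex(readFileSync(filePath));
  if (actual !== expectedMd5.toLowerCase()) {
    throw new Error(
      `Checksum mismatch for ${filePath}: expected ${expectedMd5}, got ${actual}`
    );
  }
}
```

Calling assertChecksum("/tmp/chromedriver", "ee3dba5202ae87d7b79c22341030db49") after extraction would fail fast with a clear message instead of surfacing later as an exec format error.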
@vincent-seibus

Hello,

I have the same issue: when I extract a zip on Debian Linux, the JavaScript files inside the archive come out modified.

@couturecraigj

+1

@Toliak

Toliak commented Sep 16, 2023

Same issue on Windows 11

This is related to #271.

@dgdosen

dgdosen commented Sep 16, 2023

Same issue when run on macOS.

@jarlah

jarlah commented Sep 27, 2023

Same problem here, in a Linux Docker image on OpenShift.

@PeterSelvaraj

I ran into this issue. After much research, it appears to be caused by changes in Node.js. I was on v18.18.0 when I hit it, and downgrading to v18.15.0 solved the problem. The issue exists in v18.16.0 and above.

@marcogrcr
Author

Wow @PeterSelvaraj, you nailed it. Well done, and thanks for the hard work! I can confirm this is the case on macOS 13.5.2.

I'm installing node using asdf and tested chromedriver_mac_arm64.zip:

% unzip chromedriver_mac_arm64.zip 
Archive:  chromedriver_mac_arm64.zip
  inflating: chromedriver            
  inflating: LICENSE.chromedriver

% md5 chromedriver
MD5 (chromedriver) = 5e4ba0b62a072ff281fe9fc769d8665a

Then run:

import { createReadStream, readFileSync } from "node:fs";
import { Extract } from "unzipper";
import { createHash } from "node:crypto";

// version
console.log("Node.js version:", process.versions.node);

// unzipper
await createReadStream("chromedriver_mac_arm64.zip")
      .pipe(Extract({ path: "." }))
      .promise();

// MD5
const buffer = readFileSync("chromedriver");
const md5 = createHash("md5").update(buffer).digest("hex");
console.log("MD5:", md5);

Then I changed the Node.js version by editing the nodejs X.X.X entry in .tool-versions and re-ran the script.

node@19 worked for some time and then stopped working; node@20 has never worked:

| Version range | Tested versions | MD5 | OK? |
| --- | --- | --- | --- |
| 18.13.0-18.15.0 | 18.13.0, 18.14.0, 18.14.1, 18.14.2, 18.15.0 | 5e4ba0b62a072ff281fe9fc769d8665a | Yes |
| 18.16.0-18.18.0 | 18.16.0, 18.16.1, 18.17.0, 18.17.1, 18.18.0 | 4b4e65d5c6b824db4898cb070b600c19 | No |
| 19.0.0-19.7.0 | 19.0.0, 19.0.1, 19.1.0, 19.2.0, 19.3.0, 19.4.0, 19.5.0, 19.6.0, 19.6.1, 19.7.0 | 5e4ba0b62a072ff281fe9fc769d8665a | Yes |
| 19.8.0-19.9.0 | 19.8.0, 19.8.1, 19.9.0 | 4b4e65d5c6b824db4898cb070b600c19 | No |
| 20.0.0-20.8.0 | 20.0.0, 20.1.0, 20.2.0, 20.3.0, 20.3.1, 20.4.0, 20.5.0, 20.5.1, 20.6.0, 20.6.1, 20.7.0, 20.8.0 | 4b4e65d5c6b824db4898cb070b600c19 | No |
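
For anyone who wants a runtime guard while stuck on an affected release, here is a minimal sketch that encodes only the ranges from the table above. The names TESTED_RANGES, compareVersions and extractionStatus are hypothetical, and versions outside the tested ranges are reported as "untested" rather than guessed.

```javascript
// Ranges copied from the table above: [min, max] inclusive, as tested in this thread.
const TESTED_RANGES = [
  { min: "18.13.0", max: "18.15.0", ok: true },
  { min: "18.16.0", max: "18.18.0", ok: false },
  { min: "19.0.0", max: "19.7.0", ok: true },
  { min: "19.8.0", max: "19.9.0", ok: false },
  { min: "20.0.0", max: "20.8.0", ok: false },
];

// Compare two "major.minor.patch" strings numerically; returns <0, 0, or >0.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// Hypothetical helper: "ok", "broken", or "untested" for a given Node.js version.
function extractionStatus(version) {
  const range = TESTED_RANGES.find(
    (r) => compareVersions(version, r.min) >= 0 && compareVersions(version, r.max) <= 0
  );
  if (!range) return "untested";
  return range.ok ? "ok" : "broken";
}

console.log(process.versions.node, "->", extractionStatus(process.versions.node));
```

The comparison is numeric per component on purpose: a plain string comparison would sort 18.2.0 after 18.13.0.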

@Fadorico

Fadorico commented Oct 17, 2023

I came upon this issue after quite a bit of investigation into some problems we were having in our app. I found the exact commit in Node.js that broke it: nodejs/node@654b747

I built Node.js from source and checked out individual commits to see where extraction starts breaking.
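
The search described above is essentially a bisection over an ordered list of commits. A minimal sketch of the idea follows, assuming every commit before the regression is good and every commit from it onward is bad. The names findFirstBad and isBad are hypothetical, and in practice git bisect automates the same procedure.

```javascript
// Hypothetical helper: index of the first bad item, or -1 if none is bad.
// Assumes items are ordered oldest to newest and that once an item is bad,
// all later items are bad too.
function findFirstBad(items, isBad) {
  let lo = 0;
  let hi = items.length; // invariant: the first bad index, if any, lies in [lo, hi]
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (isBad(items[mid])) {
      hi = mid; // the first bad item is at mid or earlier
    } else {
      lo = mid + 1; // the first bad item is after mid
    }
  }
  return lo < items.length ? lo : -1;
}

// Example using some of the 18.x releases tested earlier in this thread.
const versions = ["18.13.0", "18.14.0", "18.15.0", "18.16.0", "18.17.0", "18.18.0"];
const broken = new Set(["18.16.0", "18.17.0", "18.18.0"]);
console.log(versions[findFirstBad(versions, (v) => broken.has(v))]); // prints "18.16.0"
```

In a real run, isBad would build or install the candidate and compare the extracted file's checksum, which is the expensive step; bisection keeps the number of such checks logarithmic in the number of candidates.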

dvirtz added 7 commits to dvirtz/vscode-parquet-viewer that referenced this issue between Oct 27 and Oct 30, 2023
@sovcik

sovcik commented Dec 12, 2023

Hit the same issue when upgrading to Node v20. Replaced unzipper with yauzl. Works like a charm.

import fs from 'fs';
import path from 'path';
import yauzl from 'yauzl';

export async function unzipFile(archiveName: string, destFolder: string): Promise<void> {
  return new Promise((resolve, reject) => {
    yauzl.open(archiveName, { lazyEntries: true }, function (err, zipfile) {
      // Return after rejecting so we never touch an undefined zipfile.
      if (err) return reject(new Error(`Error unzipping ${archiveName}: ${err.message}`));
      zipfile.on('error', reject);
      zipfile.on('end', function () {
        resolve();
      });
      zipfile.on('entry', function (entry) {
        const destPath = path.join(destFolder, entry.fileName);
        if (/\/$/.test(entry.fileName)) {
          // Directory file names end with '/'.
          // Note that entries for directories themselves are optional.
          fs.mkdirSync(destPath, { recursive: true });
          zipfile.readEntry();
        } else {
          // File entry: its fileName implicitly requires its parent directories to exist.
          fs.mkdirSync(path.dirname(destPath), { recursive: true });
          zipfile.openReadStream(entry, function (err, readStream) {
            if (err) return reject(err);
            const writeFileStream = fs.createWriteStream(destPath);
            readStream.on('error', reject);
            writeFileStream.on('error', reject);
            // Read the next entry only once this file has been fully written and closed.
            writeFileStream.on('close', function () {
              zipfile.readEntry();
            });
            readStream.pipe(writeFileStream);
          });
        }
      });
      zipfile.readEntry();
    });
  });
}

@bubblegumsoldier

bubblegumsoldier commented Feb 21, 2024

Does anyone know if there is a fix on the way? Downgrading to Node 18.15 is not an option because the latest supported Electron version ships with 18.16, so node-unzipper basically cannot be used with any supported Electron app at the moment.

@dy-dx

dy-dx commented Mar 6, 2024

This does look like a bug in Node.js, which can cause the fstream dependency to write out of order.

Thanks @Fadorico for finding the change.
I've created a nodejs issue with a smaller test case: nodejs/node#51993

Unfortunately the fstream lib is abandoned, so the workaround for unzipper will have to be to switch off of fstream...

@sajeel45

I was facing the same issue of corrupted files when extracting a zip, so I switched to another package that works fine for me: extract-zip.
You can install it with:
npm install extract-zip
Here is a code snippet showing how to use it (from a Next.js route handler; file is the uploaded File and uploadDir is the target folder):
import { mkdir, writeFile } from "node:fs/promises";
import path from "node:path";
import extract from "extract-zip";
import { NextResponse } from "next/server";

// The function name is only illustrative. extract-zip expects uploadDir to be an absolute path.
async function uploadAndExtract(file, uploadDir) {
  try {
    // Create the folder if it doesn't exist
    await mkdir(uploadDir, { recursive: true });

    const filePath = path.join(uploadDir, file.name);
    // Convert ArrayBuffer to Buffer
    const buffer = Buffer.from(await file.arrayBuffer());
    await writeFile(filePath, buffer);

    // Extract the zip file
    await extract(filePath, { dir: uploadDir });

    return NextResponse.json({
      message: "File uploaded and extracted successfully",
      success: true,
    });
  } catch (error) {
    console.error("Error uploading file:", error);
    return NextResponse.json({
      message: "Failed to upload file",
      success: false,
    });
  }
}

@sajeel45

@bubblegumsoldier wrote: "Does anyone know if there is a fix on the way? Downgrading to Node 18.15 is not an option because the latest supported Electron version ships with 18.16, so node-unzipper basically cannot be used with any supported Electron app at the moment."

See my comment above: I mentioned another package that works fine with the latest Node version.

@dy-dx

dy-dx commented Mar 26, 2024

The issue has been fixed in the following Node.js versions:

But for anyone unable to upgrade, you'll need to switch to a different unzipper library until this gets addressed: #261

@ZJONSSON
Owner

ZJONSSON commented Jun 8, 2024

Thanks to @dy-dx for following up on the Node.js issue. Although it is not directly related to this bug (which is createWriteStream related), we have moved from the unmaintained fstream to fs-extra.

published as [email protected]
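
For readers curious what writing an entry to disk without fstream can look like, here is a minimal sketch using only Node.js core modules. It is not unzipper's actual implementation, and the helper name writeEntry is hypothetical. It creates the parent directories and then lets pipeline() from node:stream/promises handle completion and errors.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");
const { Readable } = require("node:stream");
const { pipeline } = require("node:stream/promises");

// Hypothetical helper: write one entry's readable stream to destPath.
async function writeEntry(entryStream, destPath) {
  // Create parent directories first; createWriteStream does not create them.
  await fs.promises.mkdir(path.dirname(destPath), { recursive: true });
  // pipeline() resolves after the write stream has finished and rejects
  // if either stream emits an error.
  await pipeline(entryStream, fs.createWriteStream(destPath));
}

// Usage sketch: an in-memory stream stands in for a zip entry.
const demoPath = path.join(os.tmpdir(), "unzip-demo", "hello.txt");
writeEntry(Readable.from([Buffer.from("hello "), Buffer.from("world")]), demoPath)
  .then(() => console.log("wrote", demoPath));
```

Awaiting each entry before starting the next keeps the code simple and makes it easy to verify a checksum per file afterwards.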

@ZJONSSON ZJONSSON closed this as completed Jun 8, 2024