Invalid or unsupported zip format. No END header found #268

Open
jerrygreen opened this issue Nov 24, 2018 · 22 comments

Comments

@jerrygreen

jerrygreen commented Nov 24, 2018

Error:

Jerrys-MacBook-Pro:client jerrygreen$ node zip.js 

/Users/jerrygreen/my_project/node_modules/adm-zip/zipFile.js:66
			throw Utils.Errors.INVALID_FORMAT;
			^
Invalid or unsupported zip format. No END header found

My (simple) code:

const AdmZip = require('adm-zip')
const zip = new AdmZip('./my_file.zip')

I'm using macOS 10.14.1.
Opening the file from Finder (with the default Archive Utility app) unzips it without any problems.

@AzureDoom

Did you ever find a fix for this?

@jerrygreen
Author

jerrygreen commented Feb 11, 2019

@AzureZhen I've found that macOS uses a utility called ditto to create and extract archives. It ships with macOS by default, so I simply used it for extraction, and it works perfectly.

@ghost

ghost commented May 24, 2019

@jerrygreen, I am facing the same issue. How did you fix it using ditto? Could you please attach sample code here?

Many thanks.

@jerrygreen
Author

@kangwen6663

ditto -xk /path/from /path/to
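
For anyone who wants to run that command from Node rather than from a shell, a minimal sketch using the built-in child_process module might look like the following. It only works on macOS, where ditto is installed, and the function names buildDittoArgs and extractWithDitto are made up for illustration.

```javascript
const { execFileSync } = require('child_process');

// Builds the argument list for `ditto -x -k <archive> <destination>`.
// -x extracts an archive, -k tells ditto the archive is in PKZip format.
function buildDittoArgs(zipPath, destDir) {
  return ['-x', '-k', zipPath, destDir];
}

// Extracts zipPath into destDir by shelling out to ditto (macOS only).
// Throws if ditto is missing or exits with a non-zero status.
function extractWithDitto(zipPath, destDir) {
  if (process.platform !== 'darwin') {
    throw new Error('ditto is only available on macOS');
  }
  execFileSync('ditto', buildDittoArgs(zipPath, destDir), { stdio: 'inherit' });
}
```

Calling extractWithDitto('/path/from', '/path/to') would then be equivalent to the command above. Using execFileSync with an argument array avoids shell quoting problems with paths that contain spaces.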

@5saviahv
Collaborator

This error is also thrown in cases where the file comment field exceeds its maximum size of 65k. I have seen it with some external signing schemes.
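
To check whether a given file has an END record at all, a small diagnostic along these lines can help. It is a sketch of the general ZIP layout, not adm-zip's actual code: the END (end of central directory) record starts with the signature PK\x05\x06, is at least 22 bytes long, and may be followed by a comment of up to 65,535 bytes, so only the tail of the file needs scanning. The names findEndRecord and describeZip are hypothetical.

```javascript
const fs = require('fs');

const END_SIGNATURE = 0x06054b50; // "PK\x05\x06" read as a little-endian uint32
const END_MIN_SIZE = 22;          // size of the END record without a comment
const MAX_COMMENT_SIZE = 0xffff;  // the comment length field is 16 bits wide

// Scans backwards from the end of the buffer for the END record.
// Returns { offset, commentLength } or null when no signature is found.
function findEndRecord(buffer) {
  const lowest = Math.max(0, buffer.length - END_MIN_SIZE - MAX_COMMENT_SIZE);
  for (let i = buffer.length - END_MIN_SIZE; i >= lowest; i--) {
    if (buffer.readUInt32LE(i) === END_SIGNATURE) {
      return { offset: i, commentLength: buffer.readUInt16LE(i + 20) };
    }
  }
  return null;
}

// Convenience wrapper: reports the file size and the END record, if any.
function describeZip(filePath) {
  const buffer = fs.readFileSync(filePath);
  return { size: buffer.length, endRecord: findEndRecord(buffer) };
}
```

A size of 0 or a null endRecord suggests the file on disk is empty, truncated, or not a zip at all, rather than a problem inside the library.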

@csalmeida

csalmeida commented Nov 20, 2020

Is anyone still experiencing this issue? I would appreciate a solution that does not depend on ditto, if possible.

const zipPath = `./temp/file.zip`;
const zip = new AdmZip(zipPath);
zip.extractAllTo(`./temp/`, true);

The terminal output is:

Error: Invalid or unsupported zip format. No END header found
    at readMainHeader (/Users/username/Projects/example-project/node_modules/adm-zip/zipFile.js:107:10)
    at new module.exports (/Users/username/Projects/example-project/node_modules/adm-zip/zipFile.js:19:3)
    at new module.exports (/Users/username/Projects/example-project/node_modules/adm-zip/adm-zip.js:20:11)
    at module.exports._installWordpress (/Users/username/Projects/example-project/generators/app/index.js:102:17)
    at module.exports.writing (/Users/username/Projects/example-project/generators/app/index.js:45:12)
    at Object.<anonymous> (/Users/username/Projects/example-project/node_modules/yeoman-generator/lib/index.js:399:25)
    at /Users/username/Projects/example-project/node_modules/run-async/index.js:49:25
    at new Promise (<anonymous>)
    at /Users/username/Projects/example-project/node_modules/run-async/index.js:26:19
    at /Users/username/Projects/example-project/node_modules/yeoman-generator/lib/index.js:400:11

@5saviahv
Collaborator

Are you able to extract or view your file with archive managers such as 7zip or WinRAR?

@csalmeida

@5saviahv Yes, the file seems to be okay and extracts with every tool I've tried:

  • macOS Big Sur's extract feature
  • Windows 10 extract feature
  • 7zip
  • WinRAR

I also only seem to get this error some of the time (using the same file), which makes it harder to understand the cause.

Thanks!

@5saviahv
Collaborator

Interesting. There may be many culprits, but from the way you describe it, it sounds like a race condition (two or more processes want to access the file at the same time).

  • Do you only read from the file, or do you also write to it?
  • Do you open the file multiple times, like this?
const zipPath = `./temp/file.zip`;
const zip1 = new AdmZip(zipPath);
const zip2 = new AdmZip(zipPath);
  • Do you use async functions?
  • On which OS does it fail?

@csalmeida

Thanks for getting back to me on this. A race condition could be the cause: after the file is extracted, the contents are copied, and then the original zip file and the extracted folder are removed.

Do you use async functions?

I am unsure but I have provided an example below.

It fails on which OS?

It fails when running the script on Node v15.2.1 or v14.15.1 running on macOS Big Sur 11.0.1 (20B29).

Here's the function I have:

_extractZip(projectName, fileZipName, copyPath=null) {
  const extractedFolder = `./${projectName}/${fileZipName.replace('.zip', '')}`;
  
  // Extracts contents of zip file.
  const extractPath = `./${projectName}/${fileZipName}`;
  const zip = new AdmZip(extractPath);
  zip.extractAllTo(`${projectName}/`, true);

  let extractError = null;
  
  // If a copy path is not provided files won't be moved.
  if (copyPath) {
    fse.copy(extractedFolder, copyPath, { overwrite: true }, err => {
    
      if (err)  {
        extractError = `
        Could not copy files to ./${copyPath}. \n
        ./${err}
        `
      } else {
        // Cleans up by removing extracted folder and zip.
        try {
          fs.rmdirSync(extractedFolder, { recursive: true });
        } catch (err) {
            extractError = `
            Could not remove extractedFolder. \n
            ./${err}
            `
        }
        
        // Remove zip file as it is no longer needed.
        try {
          fs.unlinkSync(extractPath);
        } catch (unlinkErr) {
          // Report the unlink error itself, not the (null) copy error.
          extractError = `
          Could not remove ./${extractPath}. \n
          ./${unlinkErr}
          `
        }
      }
    });
  } else {
    // Cleans up by removing extracted folder and zip.
    // Lets user know that program did not work as intended.
    try {
      fs.rmdirSync(extractedFolder, { recursive: true });
    } catch (err) {
        extractError = `
        Could not remove extractedFolder. \n
        ./${err}
        `
    }
    
    // Remove zip file as it is no longer needed.
    try {
      fs.unlinkSync(extractPath);
    } catch (unlinkErr) {
      // `err` is not defined in this scope, so use the caught error.
      extractError = `
      Could not remove ./${extractPath}. \n
      ./${unlinkErr}
      `
    }

    this.log(`${chalk.red('Error:')} Could not copy files (copyPath is not present).
    Zip file and extracted files were removed.`);
  }
}

Thanks again for looking into it. If this is a race condition, is there a way to access the file only once it is done extracting?

@5saviahv
Collaborator

The code seems OK. It should not cause any trouble.

@5saviahv
Collaborator

How big are your files? I mean, are any of them Zip64? Many archive managers switch to Zip64 if you use big files or have many files. Adm-zip can read Zip64 files, but it has a higher chance of failing.
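
One rough way to answer the Zip64 question for a given file is to look for the Zip64 locator, a 20-byte record with the signature PK\x06\x07 that Zip64 archives place immediately before the END record. The sketch below assumes the last PK\x05\x06 in the file is the real END record (a comment containing those bytes would fool it), and the function name looksLikeZip64 is hypothetical.

```javascript
const END_SIGNATURE = Buffer.from([0x50, 0x4b, 0x05, 0x06]); // "PK\x05\x06"
const ZIP64_LOCATOR_SIGNATURE = 0x07064b50; // "PK\x06\x07" as a little-endian uint32
const ZIP64_LOCATOR_SIZE = 20;

// Returns true when a Zip64 locator sits directly in front of the END record.
function looksLikeZip64(buffer) {
  const endOffset = buffer.lastIndexOf(END_SIGNATURE);
  // No END record at all, or not enough room in front of it for a locator.
  if (endOffset < ZIP64_LOCATOR_SIZE) return false;
  return buffer.readUInt32LE(endOffset - ZIP64_LOCATOR_SIZE) === ZIP64_LOCATOR_SIGNATURE;
}
```

It could be called as looksLikeZip64(fs.readFileSync('./temp/file.zip')), with fs being Node's built-in module.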

@csalmeida

@5saviahv thank you. For some reason I cannot reproduce the error lately. I am not sure whether it is Zip64, but one of the files I have used this function on is wordpress.zip.

Since this is so intermittent (on that same file), I am unsure what might be causing it, but I haven't been getting the error lately.
I really appreciate your help. I will comment here again with details if it returns. 🙏

@iget-master

This happened to me once, but when retrying (same code, same file) it works. Weird.

@nimmc

nimmc commented Sep 22, 2021

I have the same issue, and I think I can more or less reproduce it.

I download the file from AWS S3, unzip it with adm-zip, and then process it with cheerio.

The trick is that I need to leave my computer alone for 5-10 minutes and then run my code; it will sometimes (around 40% of the time) give the error "Invalid or unsupported zip format. No END header found".

Otherwise it works fine.
The file is an EPUB, and always the same file, so the file itself is usable.

Here is my code.

getfile();

async function getfile() {
  try {
    aws.config.update({
      accessKeyId: accessKeyId,
      secretAccessKey: secretAccesskey,
      region: 'us-east-2'
    });

    var s3 = new aws.S3();

    var params = {
      Bucket: 'original',
      Key: 'file.epub'
    };
    let readStream = s3.getObject(params).createReadStream();
    let writeStream = fs.createWriteStream(path.join(__dirname, params.Key));
    readStream.pipe(writeStream);
    readStream.on('end', () => {
      console.log("this ends")
      console.log("paramkey = ", params.Key)
      writeStream.end();
      epubToText(params.Key);
    })
  } catch (error) {
    console.log("error = ", error)
  }
}

async function epubToText(path2) {
  try {
    console.log("path2 = ", path2) // always already exists
    console.log("111111")
    let zip = new AdmZip(path2); // it stops here. In console it only logged "111111" and not "22222"

    console.log("22222")

    let $ = cheerio.load(zip.readAsText('META-INF/container.xml'), { xmlMode: true, decodeEntities: false });
    console.log("$ = ", $)

    let contentOpfPath = $("container rootfiles rootfile").attr("full-path");
    console.log("contentOpfPath = ", contentOpfPath)

    let contentOpfFolder = contentOpfPath.split("/")
    console.log("contentOpfFolder = ", contentOpfFolder)
  } catch (err) {
    console.log(err);
  }
}

I can't let this happen in production, though. The file must be processed and served to the customer.
The file is 6.5 MB.
I use Node 12.16.1 on Windows 7, and it also happens on my Mac running Big Sur.

@krisrefs

krisrefs commented Jan 13, 2022

I made a workaround for this, as I was fetching the zip file from an external source and then saving it locally to extract it.

A setTimeout solved my issue.

	const saveZIPFile = async () => {
		return new Promise((resolve) => {
			data.body.pipe(fs.createWriteStream(path.resolve(__dirname, `${project}.zip`)));

			data.body.on('end', () => {
				setTimeout(() => {
					resolve();
				}, 1000);
			});
		});
	};

	await saveZIPFile();

	var zip = new AdmZip(path.resolve(__dirname, `${project}.zip`));

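	// targetDir stands in for the destination directory, which the original snippet does not show.
	zip.extractAllTo(targetDir, true);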

@nhuethmayr

I had the same problem and it turned out to be an issue with how I download the file. I never waited for the download to complete before attempting to unzip it.

Solution: Properly await the download and only then start working with the ZIP file.
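
One way to "properly await the download" without a timer is the promise-based pipeline helper from Node's built-in stream/promises module (available from Node 15), which resolves only once the destination stream has finished writing. This is a minimal sketch: saveStreamToFile is a hypothetical helper name, and readable stands for whatever response stream the HTTP or S3 client returns.

```javascript
const fs = require('fs');
const { pipeline } = require('stream/promises');

// Writes a readable stream to filePath and resolves only after the write
// stream has finished. Rejects if either stream emits an error.
async function saveStreamToFile(readable, filePath) {
  await pipeline(readable, fs.createWriteStream(filePath));

  // An empty file cannot contain an END header, so fail early and clearly.
  const { size } = await fs.promises.stat(filePath);
  if (size === 0) {
    throw new Error(`Downloaded file is empty: ${filePath}`);
  }
  return filePath;
}
```

With this in place, the archive would be opened only after the await, for example with new AdmZip(await saveStreamToFile(readable, './file.zip')).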

@LuizAsFight

I had this problem, and it turned out the URL was returning HTTP status 302 (redirect) instead of 200 (success), so my zip file ended up with 0 bytes.

To fix that, I changed the code a bit:

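    const fs = require('fs');
    const https = require('https');
    const admZip = require('adm-zip');

    // Placeholder URL; https.get() only accepts https: URLs.
    const url = 'https://blablabla.zip';

    const zipFile = './blablabla.zip';
    const zipFileStream = fs.createWriteStream(zipFile);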

    function downloadFile(url, attempt = 1) {
      return new Promise((resolve, reject) => {
        https
          .get(url, (res) => {
            if (res.statusCode === 302 || res.statusCode === 301) {
              if (attempt > 5) {
                // prevent infinite loops if there's a redirect loop
                reject(new Error('Too many redirects'));
                return;
              }

              const newUrl = res.headers.location;
              console.log(`Redirecting to: ${newUrl}`);
              downloadFile(newUrl, attempt + 1).then(resolve, reject);
              return;
            }

            if (res.statusCode !== 200) {
              reject(new Error(`Unexpected status code: ${res.statusCode}`));
              return;
            }

            res.pipe(zipFileStream);

            zipFileStream.on('finish', () => {
              zipFileStream.close(resolve);
            });

            zipFileStream.on('error', (error) => {
              reject(error);
            });
          })
          .on('error', (error) => {
            reject(error);
          });
      });
    }

    await downloadFile(url);

    // eslint-disable-next-line no-console
    console.log('Download Completed extracting zip...');
    const zip = new admZip(zipFile); // eslint-disable-line new-cap
    zip.extractAllTo('./blablabla', true);
    // eslint-disable-next-line no-console
    console.log('zip extracted');

@panoply

panoply commented Jan 2, 2024

My issue here was that I passed .DS_Store when attempting to unzip. Ensure you're filtering out invalid file paths.
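
When archives are picked up from a directory listing, a filter like the one below keeps hidden files such as .DS_Store away from the unzip step. It is only a sketch: it assumes the archives of interest have a .zip extension, and listZipFiles is a hypothetical helper name.

```javascript
const fs = require('fs');
const path = require('path');

// Returns the names of regular files in dir that end in .zip,
// skipping dotfiles such as .DS_Store and macOS "._" resource forks.
function listZipFiles(dir) {
  return fs.readdirSync(dir, { withFileTypes: true })
    .filter((entry) => entry.isFile())
    .map((entry) => entry.name)
    .filter((name) => !name.startsWith('.'))
    .filter((name) => path.extname(name).toLowerCase() === '.zip')
    .sort();
}
```

Each returned name can then be joined with dir and passed on for extraction.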

@LukeSavefrogs

LukeSavefrogs commented Apr 5, 2024

I had the same problem downloading a file using axios and fs.createWriteStream.

I solved it by waiting for the writeStream's close event and then resolving the Promise:

import os from 'os'
import fs from 'fs';
import path from 'path';

import axios from 'axios';
import AdmZip from 'adm-zip';

async function download(url: string): Promise<string> {
    const outputFile = path.join(os.tmpdir(), 'archive.zip');
    const { data } = await axios.get(url, { responseType: 'stream' });
    
    // Pipe the data to a file
    const writeStream = fs.createWriteStream(outputFile);
    data.pipe(writeStream);

    // Return a promise and resolve when download finishes
    return new Promise((resolve, reject) => {
        data.on('error', () => {
            reject(new Error(`Failure while retrieving remote data (source: ${url})`));
        })

        writeStream.on('close', () => {
            resolve(outputFile);
        })
        writeStream.on('error', err => {
            reject(err);
        })
    })
}

async function extract(url: string, outputDir: string) {
    // 1. Download the zip file
    const file = await download(url);
    
    // 2. Extract the archive
    const zip = new AdmZip(file);
    zip.extractAllTo(outputDir, /*overwrite*/ true);

    return outputDir;
}

extract("https://example.com/archive.zip", path.join(os.tmpdir(), 'extracted')).catch((error) => {
    console.error(error);
});

@gsaukov

gsaukov commented Jul 23, 2024

In my case the fault was mine: it was the way I was passing the binary to curl.
I should have passed the file with a leading @ (@/Users/gsaukov/Downloads/index.zip), but I was passing the path /Users/gsaukov/Downloads/index.zip as a plain string, so curl sent the path text instead of the file contents.
The correct curl command (with @) is below:

curl -X PUT \
  -H 'Content-Type: application/zip' \
  -H 'accept: application/json' \
  --insecure \
  --data-binary @/Users/gsaukov/Downloads/index.zip \
  https://localhost:3000/artifact

@Ericfreespirit

Ericfreespirit commented Aug 14, 2024

I found the problem for me:
Wrong ❌:

        const outputFilePath = "./ExtractTextInfoFromPDF.zip";
        console.log(`Saving asset at ${outputFilePath}`);

        const writeStream = fs.createWriteStream(outputFilePath);
        streamAsset.readStream.pipe(writeStream);

        let zip = new AdmZip(outputFilePath); // Wrong  ❌
        let jsondata = zip.readAsText('structuredData.json');
        let data = JSON.parse(jsondata);
        console.log("data", data);
        data.elements.forEach((element: any) => {
            if (element.Path.endsWith('/H1')) {
                console.log(element.Text);
            }
        });

Right ✅:

        const outputFilePath = "./ExtractTextInfoFromPDF.zip";
        let zip = new AdmZip(outputFilePath); // Right ✅ !!!!
        console.log(`Saving asset at ${outputFilePath}`);

        const writeStream = fs.createWriteStream(outputFilePath);
        streamAsset.readStream.pipe(writeStream);

        let jsondata = zip.readAsText('structuredData.json');
        let data = JSON.parse(jsondata);
        console.log("data", data);
        data.elements.forEach((element: any) => {
            if (element.Path.endsWith('/H1')) {
                console.log(element.Text);
            }
        });

I think we have to create the new AdmZip() before writing to the file.

Maybe that's why I got INVALID_FORMAT in adm-zip/zipFile.js.
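
Another possible explanation, in line with earlier comments in this thread, is that pipe() returns before the file has been fully written, so the first snippet opens a zip that is still incomplete. One way to sidestep the file on disk entirely is to collect the stream into memory and hand the resulting Buffer to AdmZip, whose constructor accepts a Buffer as well as a path. The helper below is a sketch, readStreamToBuffer is a hypothetical name, and it assumes the archive is small enough to hold in memory.

```javascript
// Collects every chunk of a readable stream into a single Buffer.
// Resolves once the stream has ended; rejects if the stream errors.
async function readStreamToBuffer(readable) {
  const chunks = [];
  for await (const chunk of readable) {
    // Streams may yield strings or Buffers; normalise to Buffer.
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}
```

In the snippets above this could be used as const zip = new AdmZip(await readStreamToBuffer(streamAsset.readStream)); inside an async function.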
