Decompressed file content is messed up #32

Closed
assafavital opened this issue Oct 31, 2021 · 10 comments · Fixed by #36

Comments

@assafavital

assafavital commented Oct 31, 2021

I'm archiving a folder containing a .git/config file with fastzip on Linux, following the example shown in the README.

When extracting the archive, I expect .git/config to contain:

[core]
        repositoryformatversion = 0
        filemode = true
        bare = false
[remote "origin"]
        url = ...
[receive]
        denyNonFastForwards = false
        denyCurrentBranch = ignore
        denyDeleteCurrent = ignore

However, I get:

<..J.@^PE.._1.^CL..Z^H..>.(.%."..$;..Sfg+.{..>^....w,x0...3).yb.VO(.8A^KkSM^T0....JAS^MV.0....^E#+...fJ....^Dh..^^....N.....g.v]7........9....4w^O..5...^X.fR_.^[..^?I\.^Xy.qH..v.t~...Y.,?V.|...]^Q../b...^B....

This happens both with the Go SDK (using NewExtractor) and with the traditional Unix tar command.
Worth mentioning is that this only happens with the default Deflate compression method. Using method 0 (Store) works fine (no compression, of course, but that's expected).
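As a stopgap, forcing Store can be done with an archiver option. A minimal sketch, assuming fastzip exposes a WithArchiverMethod option (zip.Store is archive/zip's stored-method constant; w and chroot are placeholders):

// Workaround sketch: disable compression entirely by forcing the Store method.
archiver, err := fastzip.NewArchiver(w, chroot,
	fastzip.WithArchiverMethod(zip.Store))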

I'd appreciate any help with this issue.

EDIT: Compressing only the file itself (as opposed to the folder containing it) seems to work fine.

@assafavital (Author)

Attaching my code:

Zip:

import (
	"context"
	"os"
	"path/filepath"

	"github.com/pkg/errors"
	"github.com/saracen/fastzip"
)

func Zip(ctx context.Context, source, target string) error {
	outputFile, err := os.Create(target)
	if err != nil {
		return errors.WithStack(err)
	}
	defer outputFile.Close()

	archiver, err := fastzip.NewArchiver(outputFile, source)
	if err != nil {
		return errors.WithStack(err)
	}
	defer archiver.Close()

	// Collect every path under source; propagate any walk error instead of
	// silently dropping it.
	files := make(map[string]os.FileInfo)
	if err := filepath.Walk(source, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		files[path] = info
		return nil
	}); err != nil {
		return errors.WithStack(err)
	}
	return errors.Wrapf(archiver.Archive(ctx, files), "failed to archive %q", source)
}

Unzip:

func Unzip(ctx context.Context, source, target string) error {
	extractor, err := fastzip.NewExtractor(source, target)
	if err != nil {
		return errors.WithStack(err)
	}
	defer extractor.Close()

	return errors.Wrapf(extractor.Extract(ctx), "failed to extract %q", source)
}
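
For reference, a hypothetical call site for the two helpers above (ctx and the paths are illustrative):

// Archive /data/repo into /tmp/repo.zip, then restore it elsewhere.
if err := Zip(ctx, "/data/repo", "/tmp/repo.zip"); err != nil {
	return err
}
if err := Unzip(ctx, "/tmp/repo.zip", "/data/restored"); err != nil {
	return err
}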

@saracen (Owner)

saracen commented Nov 1, 2021

Hi @assafavital

Can you check if v0.1.5 of fastzip also has this same problem? It might be a regression with the latest versions.

Are you able to put together a zip file (using the zip utility, to avoid the issues mentioned here) whose contents, when compressed with fastzip, cause the problem?

Thank you for the report and help diagnosing this!

@assafavital (Author)

> Can you check if v0.1.5 of fastzip also has this same problem? It might be a regression with the latest versions.

This problem is present in v0.1.5 as well.
Small files turn out fine (since their compressed size would still be larger than the original), but larger files result in gibberish.

@saracen (Owner)

saracen commented Nov 8, 2021

@assafavital

Does this occur if you pass the option WithArchiverConcurrency(1) too?

I'd have expected there to be a checksum error if what you're decompressing isn't correct.
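
A minimal sketch of passing that option, which limits fastzip to a single compression worker:

// Serialize compression: one worker, no parallel staging.
archiver, err := fastzip.NewArchiver(outputFile, source,
	fastzip.WithArchiverConcurrency(1))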

@assafavital (Author)

> Does this occur if you pass the option WithArchiverConcurrency(1) too?
>
> I'd have expected there to be a checksum error if what you're decompressing isn't correct.

Works well with the suggested option.

@saracen (Owner)

saracen commented Nov 8, 2021

That does take away some of the benefits of using fastzip though.

In order to perform parallel zip compression, we compress several files to individual files on disk and then read back the compressed content of each to add to the archive serially. You're the only one who has reported a problem with this technique.

Can you tell me more about your OS, disk, etc.? If it occurs on cloud VMs, maybe I can replicate it.
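
An illustrative sketch of the staging technique described above (not fastzip's actual code): deflate each file to its own staging file in parallel, then append the pre-compressed bytes to the archive serially with CreateRaw (stdlib archive/zip, Go 1.17+), which writes them verbatim instead of recompressing.

import (
	"archive/zip"
	"compress/flate"
	"hash/crc32"
	"io"
	"os"
	"sync"
)

type staged struct {
	name       string
	tmp        *os.File
	crc        uint32
	size       uint64
	compressed uint64
}

// stageFile deflates one file to a temp file, recording the CRC32 and sizes
// that the zip header needs.
func stageFile(path string) (*staged, error) {
	src, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer src.Close()

	tmp, err := os.CreateTemp("", "stage-*")
	if err != nil {
		return nil, err
	}
	crc := crc32.NewIEEE()
	fw, err := flate.NewWriter(tmp, flate.DefaultCompression)
	if err != nil {
		return nil, err
	}
	n, err := io.Copy(fw, io.TeeReader(src, crc))
	if err != nil {
		return nil, err
	}
	if err := fw.Close(); err != nil {
		return nil, err
	}
	cn, err := tmp.Seek(0, io.SeekCurrent)
	if err != nil {
		return nil, err
	}
	return &staged{name: path, tmp: tmp, crc: crc.Sum32(),
		size: uint64(n), compressed: uint64(cn)}, nil
}

func archiveParallel(zw *zip.Writer, paths []string) error {
	results := make([]*staged, len(paths))
	errs := make([]error, len(paths))

	// Phase 1: compress every file concurrently to its own staging file.
	var wg sync.WaitGroup
	for i, path := range paths {
		wg.Add(1)
		go func(i int, path string) {
			defer wg.Done()
			results[i], errs[i] = stageFile(path)
		}(i, path)
	}
	wg.Wait()

	// Phase 2: serially splice each staged file into the zip, untouched.
	for i, s := range results {
		if errs[i] != nil {
			return errs[i]
		}
		w, err := zw.CreateRaw(&zip.FileHeader{
			Name:               s.name, // real code would use a slash-separated relative path
			Method:             zip.Deflate,
			CRC32:              s.crc,
			CompressedSize64:   s.compressed,
			UncompressedSize64: s.size,
		})
		if err != nil {
			return err
		}
		if _, err := s.tmp.Seek(0, io.SeekStart); err != nil {
			return err
		}
		if _, err := io.Copy(w, s.tmp); err != nil {
			return err
		}
		s.tmp.Close()
		os.Remove(s.tmp.Name())
	}
	return nil
}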

@assafavital (Author)

I'm using fastzip inside an alpine:3.14.2 container.
The directory I'm trying to archive is located in a PersistentVolumeClaim backed by AWS EBS.

I'll try some more ArchiverOptions and see what's working and what's not.

@bashgeek

bashgeek commented Jan 3, 2022

@saracen Just ran into this myself; it seems to happen if you use a more up-to-date version of klauspost/compress. With the v1.13.5 listed in go.mod it works fine; with the current v1.13.6 I get the same gibberish result as @assafavital. This can happen if you blindly run go get -u to pull in all recent versions.
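
Until a fixed release is tagged, one way to avoid the bad combination is to pin the dependency. A sketch of the relevant go.mod line, assuming v1.13.5 is the last good version as noted above:

require github.com/klauspost/compress v1.13.5 // pinned: v1.13.6 produces corrupt archives with fastzip

Running go get github.com/klauspost/compress@v1.13.5 will set this.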

@mcbailey

> @saracen Just ran into this myself; it seems to happen if you use a more up-to-date version of klauspost/compress. With the v1.13.5 listed in go.mod it works fine; with the current v1.13.6 I get the same gibberish result as @assafavital.

Thank you for this!

saracen added a commit that referenced this issue Feb 21, 2022
This also now calls the newer CreateRaw() method, as CreateHeaderRaw() is now
deprecated.

This fixes #32, as when CreateHeaderRaw() was deprecated, it started calling the
incorrect function that replaced it:
klauspost/compress#502

A test with a larger content body has been added, as it was able to detect this regression.

Fixes #32
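
To make the contrast concrete, a sketch of the two call paths (my reading of the commit message and klauspost/compress#502; zw, crc, clen, ulen, and compressedData are hypothetical):

// Buggy path: already-deflated bytes fed through CreateHeader get deflated a
// second time, so extraction yields compressed bytes instead of the file.
bad, _ := zw.CreateHeader(&zip.FileHeader{Name: "f", Method: zip.Deflate})
io.Copy(bad, compressedData)

// Fixed path: CreateRaw writes pre-compressed bytes verbatim; the caller must
// supply the CRC32 and sizes since no compression happens here.
ok, _ := zw.CreateRaw(&zip.FileHeader{
	Name:               "f",
	Method:             zip.Deflate,
	CRC32:              crc,
	CompressedSize64:   clen,
	UncompressedSize64: ulen,
})
io.Copy(ok, compressedData)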
@saracen (Owner)

saracen commented Feb 21, 2022

Thanks everybody for reporting this. Fix is now in the main branch and I'll tag a new release (v0.1.8) shortly.

saracen closed this as completed Feb 21, 2022