Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archive/zip: don't read data descriptor early #501

Merged
merged 1 commit into from
Feb 22, 2022

Conversation

saracen
Copy link
Contributor

@saracen saracen commented Feb 21, 2022

This is a patch originally applied to stdlib's archive/zip: https://go-review.googlesource.com/c/go/+/357489/

Your own compress/zip fork synced with stdlib after the problem was introduced, but before this commit, which reverted the change.

Go 1.17 introduced an unnecessary change to when a zip's data descriptor
is read for file entries, how it is parsed and how the crc32 field is
used.

Before Go 1.17, the data descriptor was read immediately after a file
entry's content. This continuous read is a pattern existing applications
have come to rely upon (for example, where reads at specific offsets
might be translated to HTTP range requests).

In Go 1.17, all data descriptors are immediately read upon opening the
file. This results in scattered and non-continuous reads of the archive,
and depending on the underlying reader, might have severe performance
implications. In addition, an additional object is now initialized for
each entry, but is mostly redundant.

Previously, the crc32 field in the data descriptor would return an error
if it did not match the central directory's entry. This check has
seemingly been unintentionally removed. If the central directory crc32
is invalid and a data descriptor is present, no error is returned.

This change reverts to the previous handling of data descriptors, before
CL 312310.

Fixes #48374
Fixes #49089

Change-Id: I5df2878c4fcc9e500064e7175f3ab9727c82f100
Reviewed-on: https://go-review.googlesource.com/c/go/+/357489
Run-TryBot: Ian Lance Taylor <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
Trust: Dmitri Shuralyov <[email protected]>
@klauspost
Copy link
Owner

@saracen Why is the test removed?

@saracen
Copy link
Contributor Author

saracen commented Feb 22, 2022

@saracen Why is the test removed?

@klauspost This test was added along with the modified readDataDescriptor that has now been reverted.

The test was removed, because like the modified readDataDescriptor, it was redundant. The data it unmarshaled into the dataDescriptor struct was not used. The CRC32 value looks like it was used, but served no purpose, as it wasn't then checked against (something the older implementation and the reverted commit do).

@klauspost
Copy link
Owner

Great! Thanks!

@klauspost klauspost merged commit 48fd9c2 into klauspost:master Feb 22, 2022
@klauspost
Copy link
Owner

I expect a release soon to include #503 before merging the zstd decoder changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants