The file is a valid gzip file, but contains invalid trailing data.
Example test case:

    package main

    import (
        "compress/gzip"
        "io"
        "os"
        "testing"
    )

    func TestValidGzipFileWithTrailingData(t *testing.T) {
        // Reproducer file. There are many examples of this.
        // https://github.com/udacity/self-driving-car-sim/blob/4b1f739ebda9ed4920fe895ee3677bd4ccb79218/Assets/Standard%20Assets/Environment/SpeedTree/Conifer/Conifer_Desktop.spm
        f, err := os.Open("/tmp/Conifer_Desktop.spm")
        if err != nil {
            t.Fatal(err)
        }
        defer f.Close()
        rc, err := gzip.NewReader(f)
        if err != nil {
            t.Fatal(err)
        }
        defer rc.Close()
        _, err = io.ReadAll(rc)
        if err != nil {
            t.Fatal(err)
        }
    }

    // === RUN   TestValidGzipFileWithTrailingData
    //     gzip_test.go:19: gzip: invalid header

Proposal Details
Context
When using the `compress/gzip` package to decompress gzip files, receiving a `gzip: invalid header` error can indicate two distinct possibilities:
1. The file is not a valid gzip file.
2. The file is a valid gzip file, but contains invalid trailing data.
Scenario 2 can be especially confusing because Go's implementation of `compress/gzip` rejects invalid trailing data, while many popular applications and languages do not. Hence, the ambiguity of this error can lead people to believe that Go is rejecting a valid gzip file.

    $ file -i Conifer_Desktop.spm
    Conifer_Desktop.spm: application/gzip; charset=binary
    $ gzip -S .spm -d /tmp/Conifer_Desktop.spm
    gzip: stdin: decompression OK, trailing garbage ignored.

Side-note: #47809 (comment) explains why this choice was made and how to handle invalid trailing data using `gzip.Reader.Multistream`.
Proposal Details
I propose adding a distinct error for when subsequent inputs in a data stream don't have valid headers (i.e., trailing garbage). This would provide clarity for users who may not realize that they need to disable `gzip.Reader.Multistream`. Additionally, it could be used to programmatically call `Multistream(false)` for specific files with trailing garbage.
I will leave the specific error message and behavior for future discussion, should there be interest in implementing this proposal.
Caveat: I am not familiar with the implementation of `gzip.Reader`. This change may not be possible or desirable due to technical limitations.