"A local file header is corrupt" error occurs while unpacking the ZIP archive #49580

Closed
Albeoris opened this issue Mar 13, 2021 · 13 comments · Fixed by #68106
Labels: area-System.IO.Compression, needs-further-triage (Issue has been initially triaged, but needs deeper consideration or reconsideration)

Comments

@Albeoris

Albeoris commented Mar 13, 2021

Description

In the process of working with a large number of .zip archives from various sources, I ran into a problem when unpacking some of them.

Configuration

  • Framework: .NET 5.0
  • OS: Windows 10
  • Architecture: x64

Regression?

No; the same problem also occurs on .NET Framework 4.7.2.

Other information

ZipArchiveEntry.cs :: IsOpenable

                // _compressedSize is (long) 4294967295 => ffffffff
                if (OffsetOfCompressedData + _compressedSize > _archive.ArchiveStream.Length)
                {
                    message = SR.LocalFileHeaderCorrupt;
                    return false;
                }

ZipBlocks.cs :: TryReadBlock

            bool uncompressedSizeInZip64 = uncompressedSizeSmall == ZipHelper.Mask32Bit; // true
            bool compressedSizeInZip64 = compressedSizeSmall == ZipHelper.Mask32Bit; // true
            bool relativeOffsetInZip64 = relativeOffsetOfLocalHeaderSmall == ZipHelper.Mask32Bit; // false
            bool diskNumberStartInZip64 = diskNumberStartSmall == ZipHelper.Mask16Bit; // false

ZipBlocks.cs :: TryGetZip64BlockFromGenericExtraField

                    zip64Block._size = extraField.Size;

                    ushort expectedSize = 0;

                    if (readUncompressedSize) expectedSize += 8; // true
                    if (readCompressedSize) expectedSize += 8; // true
                    if (readLocalHeaderOffset) expectedSize += 8; // false
                    if (readStartDiskNumber) expectedSize += 4;  // false

                    // expectedSize is 16
                    // zip64Block._size is 28
                    if (expectedSize != zip64Block._size)
                        return false;

                    // unreachable code 
                    if (readUncompressedSize) zip64Block._uncompressedSize = reader.ReadInt64();
                    if (readCompressedSize) zip64Block._compressedSize = reader.ReadInt64();
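
The size check in the snippet above can be sketched outside the runtime as follows. This is a minimal Python illustration (Python only for brevity; the real code is the C# shown above). The function name and parameters are hypothetical, and the input values are the ones noted in the comments above: both size fields set to the 32-bit mask, the offset and disk number not.

```python
MASK32 = 0xFFFFFFFF
MASK16 = 0xFFFF

def expected_zip64_size(uncompressed_small, compressed_small, offset_small, disk_small):
    """Return the zip64 block size the strict check expects.

    Each 32-bit (or 16-bit) header field that holds the all-ones placeholder
    contributes its 64-bit (or 32-bit) replacement to the expected size.
    """
    size = 0
    if uncompressed_small == MASK32:
        size += 8
    if compressed_small == MASK32:
        size += 8
    if offset_small == MASK32:
        size += 8
    if disk_small == MASK16:
        size += 4
    return size

# Values from the comments above: only the two size fields are placeholders.
expected = expected_zip64_size(MASK32, MASK32, 0, 0)
actual = 28  # zip64Block._size reported for this archive

print(expected, actual, expected == actual)  # 16 28 False
```

Because 16 is not equal to 28, the strict check returns early and the 64-bit sizes are never read, which leaves the 0xFFFFFFFF placeholder in _compressedSize.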

Here is the ZipInfo output for the archive in question.
The archive is intact and is opened correctly by all current archivers.

There is no zipfile comment.

End-of-central-directory record:
-------------------------------

  Zip archive file size:                      7414 (0000000000001CF6h)
  Actual end-cent-dir record offset:          7316 (0000000000001C94h)
  Expected end-cent-dir record offset:        7316 (0000000000001C94h)
  (based on the length of the central directory and its expected offset)

  This zipfile constitutes the sole disk of a single-part archive; its
  central directory contains 1 entry.
  The central directory is 151 (0000000000000097h) bytes long,
  and its (expected) offset in bytes from the beginning of the zipfile
  is 7165 (0000000000001BFDh).


Central directory entry #1:
---------------------------

  file.txt

  offset of local header from start of archive:   0
                                                  (0000000000000000h) bytes
  file system or operating system of origin:      MS-DOS, OS/2 or NT FAT
  version of encoding software:                   4.5
  minimum file system compatibility required:     MS-DOS, OS/2 or NT FAT
  minimum software version required to extract:   4.5
  compression method:                             deflated
  compression sub-type (deflation):               normal
  file security status:                           not encrypted
  extended local header:                          no
  file last modified on (DOS date/time):          2021 Mar 13 18:11:52
  32-bit CRC value (hex):                         1b0e1343
  compressed size:                                7042 bytes
  uncompressed size:                              93523 bytes
  length of filename:                             37 characters
  length of extra field:                          68 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  non-MSDOS external file attributes:             000000 hex
  MS-DOS file attributes (00 hex):                none

  The central-directory extra field contains:
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 28 data bytes.  The first
    20 are:   53 6d 01 00 00 00 00 00 82 1b 00 00 00 00 00 00 00 00 00 00.
  - A subfield with ID 0x000a (PKWARE Win32) and 32 data bytes.  The first
    20 are:   00 00 00 00 01 00 18 00 4b 75 22 0f 76 04 d7 01 4b 75 22 0f.

  There is no file comment.
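
As a quick cross-check of the ZipInfo output above, the first 16 of the 20 bytes it shows for the 0x0001 subfield can be decoded as two little-endian 64-bit integers. This is a small Python sketch, not part of the runtime; the variable names are illustrative. Only the bytes that ZipInfo printed are used, so the last 8 bytes of the 28-byte subfield are not decoded here.

```python
import struct

# The first 20 data bytes of the 0x0001 subfield, exactly as ZipInfo printed them.
shown = bytes.fromhex("53 6d 01 00 00 00 00 00 82 1b 00 00 00 00 00 00 00 00 00 00")

# The zip64 block stores the uncompressed size first, then the compressed size,
# each as an unsigned 64-bit little-endian integer.
uncompressed, compressed = struct.unpack_from("<QQ", shown, 0)

print(uncompressed, compressed)  # 93523 7042
```

These match the "uncompressed size" and "compressed size" lines reported by ZipInfo, so the first 16 bytes of the subfield carry the real sizes; the subfield is simply longer than those 16 bytes.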

A similar problem was mentioned earlier, but it was related to large files:
#1094

@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler (bot) added the "untriaged" label (New issue has not been triaged by the area owner) on Mar 13, 2021
@danmoseley
Member

Is it possible to share a zip that repros? Although it may be some time before we can take a look, perhaps you are comfortable debugging further.

@Albeoris
Author

Here you are: test.zip

@Albeoris changed the title from "A local file header is corrupt" error after upgrading to 3.0/3.1/5.0 to "A local file header is corrupt" error occurs while unpacking the ZIP archive on Mar 13, 2021
@carlossanlop removed the "untriaged" label (New issue has not been triaged by the area owner) on Mar 25, 2021
@carlossanlop added this to the Future milestone on Mar 25, 2021
@carlossanlop added the "needs-further-triage" label (Issue has been initially triaged, but needs deeper consideration or reconsideration) on Mar 25, 2021
@maikebing

maikebing commented May 17, 2021

It seems that the problem remains.
OS: CentOS Linux release 7.6.1810 (Core)
.NET: 5.0.621.22011

I'm trying to work around this problem using the following method:
https://github.com/dotnet/runtime/issues/1094#issuecomment-610260232

Most archives extract normally, but occasionally the following occurs:

Exception has been thrown by the target of an invocation.

I could not reproduce the problem locally on Windows.

I'll try to get further information.

@danmoseley
Member

Thanks for the info. @maikebing it might be interesting if you could break under a debugger and see what return code we're getting from where. I assume you are using x64?

To set expectations, it might be a while before we look at this on our side. Breaking in with a debugger at the point the exception is thrown might suggest whether it's .NET code or zlib. We are using the latest zlib (https://github.com/madler/zlib/releases/tag/v1.2.11); they haven't updated for a couple of years. If there is some other zlib-based tool (such as the platform 'unzip' command, perhaps), it would be interesting to know whether that repros the problem.

@0xced
Contributor

0xced commented Jun 30, 2021

And here are reproduction steps.

Issue49580.csproj

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFrameworks>net48;net5.0</TargetFrameworks>
    <LangVersion>9.0</LangVersion>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="System.IO.Compression" Version="4.3.0" />
  </ItemGroup>

</Project>

Program.cs

using System;
using System.IO;
using System.IO.Compression;
using System.Runtime.InteropServices;

// Demonstrates issue described on https://github.com/dotnet/runtime/issues/49580

try
{
    // Produced with `xxd -i test.zip` with file from https://github.com/dotnet/runtime/files/6135119/test.zip
    var zipData = new byte[]
    {
        0x50, 0x4b, 0x03, 0x04, 0x2d, 0x00, 0x00, 0x08, 0x08, 0x00, 0x17, 0x9b, 0x6d, 0x52, 0x0c, 0x7e, 0x7f, 0xd8, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
        0xff, 0xff, 0x08, 0x00, 0x38, 0x00, 0x66, 0x69, 0x6c, 0x65, 0x2e, 0x74, 0x78, 0x74, 0x01, 0x00, 0x10, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x18, 0x00, 0xa8, 0xb1,
        0xf6, 0x61, 0x25, 0x18, 0xd7, 0x01, 0xa8, 0xb1, 0xf6, 0x61, 0x25, 0x18, 0xd7, 0x01, 0xa8, 0xb1, 0xf6, 0x61, 0x25, 0x18, 0xd7, 0x01, 0x2b, 0x49,
        0x2d, 0x2e, 0x01, 0x00, 0x50, 0x4b, 0x01, 0x02, 0x2d, 0x00, 0x2d, 0x00, 0x00, 0x08, 0x08, 0x00, 0x17, 0x9b, 0x6d, 0x52, 0x0c, 0x7e, 0x7f, 0xd8,
        0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x08, 0x00, 0x44, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x66, 0x69, 0x6c, 0x65, 0x2e, 0x74, 0x78, 0x74, 0x01, 0x00, 0x1c, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x06, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x20, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x01, 0x00, 0x18, 0x00, 0xa8, 0xb1, 0xf6, 0x61, 0x25, 0x18, 0xd7, 0x01, 0xa8, 0xb1, 0xf6, 0x61, 0x25, 0x18, 0xd7, 0x01, 0xa8, 0xb1,
        0xf6, 0x61, 0x25, 0x18, 0xd7, 0x01, 0x50, 0x4b, 0x06, 0x06, 0x2c, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x2d, 0x00, 0x2d, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x7a, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x50, 0x4b, 0x06, 0x07, 0x00, 0x00, 0x00, 0x00, 0xde, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x50, 0x4b, 0x05, 0x06, 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 0x7a, 0x00,
        0x00, 0x00, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00
    };

    Console.WriteLine($"Extracting test.zip on {RuntimeInformation.OSDescription.Trim()} ({RuntimeInformation.FrameworkDescription})");

    using var archive = new ZipArchive(new MemoryStream(zipData));
    foreach (var entry in archive.Entries)
    {
        Console.WriteLine($"{entry} (CompressedLength: {entry.CompressedLength} -- Length: {entry.Length})");
        using var _ = entry.Open(); // throws System.IO.InvalidDataException (A local file header is corrupt)
    }
    return 0;
}
catch (Exception exception)
{
    Console.Error.WriteLine(exception);
    return 1;
}

Result on .NET Framework 4.8 (dotnet run -f net48):

Extracting test.zip on Microsoft Windows 10.0.18363 (.NET Framework 4.8.4250.0)
file.txt (CompressedLength: 4294967295 -- Length: 4294967295)
System.IO.InvalidDataException: A local file header is corrupt.
   at System.IO.Compression.ZipArchiveEntry.OpenInReadMode(Boolean checkOpenable)
   at <Program>$.<Main>$(String[] args) in Program.cs:line 35

Result on .NET 5 (dotnet run -f net5.0):

Extracting test.zip on Microsoft Windows 10.0.18363 (.NET 5.0.7)
file.txt (CompressedLength: 4294967295 -- Length: 4294967295)
System.IO.InvalidDataException: A local file header is corrupt.
   at System.IO.Compression.ZipArchiveEntry.OpenInReadMode(Boolean checkOpenable)
   at System.IO.Compression.ZipArchiveEntry.Open()
   at <Program>$.<Main>$(String[] args) in Program.cs:line 35
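
To illustrate that the entry's data is intact even though the 32-bit size fields hold the 0xFFFFFFFF placeholder, here is a minimal Python sketch (Python only for brevity; it is not the runtime's C# code). It parses the first 30 bytes of zipData above as the fixed part of the local file header, computes where the compressed data starts, and inflates the 6 bytes found at that offset. The 6-byte length is the compressed size stored in the zip64 extra field of the same array; the variable names are illustrative.

```python
import struct
import zlib

# First 30 bytes of zipData above: the fixed part of the local file header.
LOCAL_HEADER = bytes.fromhex(
    "504b0304 2d00 0008 0800 179b 6d52 0c7e7fd8 ffffffff ffffffff 0800 3800"
)
# The 6 bytes at offset 94 of zipData: a raw deflate stream.
PAYLOAD = bytes.fromhex("2b492d2e0100")

(signature, version, flags, method, mod_time, mod_date, header_crc,
 compressed_small, uncompressed_small, name_len, extra_len) = struct.unpack(
    "<4sHHHHHIIIHH", LOCAL_HEADER
)

# Compressed data follows the fixed header, the file name and the extra field.
data_offset = len(LOCAL_HEADER) + name_len + extra_len

# Negative wbits tells zlib to expect raw deflate data without a zlib header.
content = zlib.decompress(PAYLOAD, -15)

print(hex(compressed_small), hex(uncompressed_small))  # 0xffffffff 0xffffffff
print(data_offset)                                      # 94
print(content, zlib.crc32(content) == header_crc)       # b'test' True
```

The payload inflates to the 4-byte text "test" and its CRC matches the header, so the data itself is sound; only the size bookkeeping trips the check.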

@danmoseley
Member

danmoseley commented Jul 13, 2021

                if (OffsetOfCompressedData + _compressedSize > _archive.ArchiveStream.Length)
                {
                    message = SR.LocalFileHeaderCorrupt;
                    return false;
                }

(long)94 + (long)4294967295 > (long)320.

Note 320 is an int, because it is Length on a Stream. I don't know the code, but I suspect it is not intended to compare longs with ints.

@Albeoris
Author

Albeoris commented Jul 13, 2021

(long)94 + (long)4294967295 > (int)320.

4294967295 is 0xFFFFFFFF => -1

> Note 320 is an int, because it is Length on a Stream. I don't know the code, but I suspect it is not intended to compare longs with ints.

Stream.Length is a long.
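
The arithmetic discussed in the last two comments can be checked directly. This is a small Python sketch with illustrative variable names; Python integers do not overflow, so the sum behaves like the 64-bit addition in the C# check. It also shows the -1 reading of 0xFFFFFFFF mentioned above, which applies only if the same 32 bits are reinterpreted as a signed 32-bit value.

```python
import struct

offset_of_compressed_data = 94   # value from the comparison quoted above
compressed_size = 0xFFFFFFFF     # placeholder left in _compressedSize
stream_length = 320              # length of the repro archive

# As 64-bit values the sum is far beyond the end of the stream,
# so the check reports a corrupt local file header.
total = offset_of_compressed_data + compressed_size
print(total, total > stream_length)  # 4294967389 True

# Reinterpreting the same 32 bits as a signed 32-bit integer gives -1.
as_int32 = struct.unpack("<i", struct.pack("<I", compressed_size))[0]
print(as_int32)  # -1
```

Either way, the comparison fails because the placeholder was never replaced by the real 64-bit size, not because of the integer widths being compared.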

@ryanwilliams83

Please fix; this bug also manifests in PowerShell's Expand-Archive as "Unable to remove file" (or similar) errors.

@0xced
Contributor

0xced commented Sep 10, 2021

For the record, here's an example on how this issue can manifest in real life:

@vmachacek

please fix this

@danmoseley
Member

OK, apologies for not looking into this earlier, especially given the excellent description and ideal repro above.

0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x08, 0x00, 0x44, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00

So, as noted, the compressed and uncompressed sizes are both 0xFFFFFFFF. The file name length is 0x0008 and the extra field length is 0x0044. 'Disk number start' and 'external file attributes' are both 0x0.

The spec says that if these header fields are 0xFFFF / 0xFFFFFFFF respectively, the real values are in the extra field; it doesn't say that if they hold other values, the corresponding entries are absent from the extra field:

4.4.13 disk number start: (2 bytes)

  The number of the disk on which this file begins.  If an 
  archive is in ZIP64 format and the value in this field is 
  0xFFFF, the size will be in the corresponding 4 byte zip64 
  extended information extra field.

But the Zip64 extended field section seems completely explicit that they ONLY appear if they were 0xFFFF / 0xFFFFFFFF. Our code follows this, and identifies the unexpected length as a corruption.

4.5.3

 The order of the fields in the zip64 extended 
information record is fixed, but the fields MUST
 only appear if the corresponding Local or Central
 directory record field is set to 0xFFFF or 0xFFFFFFFF.

What do other implementations do?

The old WPF implementation mentioned in the past issue seems to make the same check and SharpCompress apparently does too.

However Python apparently skips the specified length, whether or not it uses the fields:
https://github.com/python/cpython/blob/main/Lib/zipfile.py#L514
as does Rust
https://github.com/zip-rs/zip/blob/master/src/read.rs#L794
and I think Go is not checking (not a Go speaker)
https://cs.opensource.google/go/go/+/master:src/archive/zip/reader.go;l=354;bpv=0;bpt=1

My guess is that Python and the others' implementations are most battle-tested and we should trust the length.
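
A lenient reader along those lines might look like the following. This is a minimal Python sketch of the idea only, not the code from any of the linked implementations and not the eventual .NET fix; the function name and its parameters are hypothetical. It reads just the fields flagged by the header placeholders, in the fixed order the spec gives, and ignores any remaining bytes in the block instead of treating them as corruption. The sample block is the 28-byte zip64 field from the central directory entry in the repro above.

```python
import struct

MASK32 = 0xFFFFFFFF
MASK16 = 0xFFFF

def read_zip64_lenient(data, uncompressed_small, compressed_small, offset_small, disk_small):
    """Read only the zip64 values whose header field is a placeholder.

    Trailing bytes beyond the flagged fields are skipped rather than rejected.
    """
    fields = (
        ("uncompressed_size", uncompressed_small, MASK32, "<Q"),
        ("compressed_size", compressed_small, MASK32, "<Q"),
        ("local_header_offset", offset_small, MASK32, "<Q"),
        ("disk_number_start", disk_small, MASK16, "<I"),
    )
    values = {}
    pos = 0
    for name, small, mask, fmt in fields:
        if small != mask:
            continue  # header already holds the real value
        width = struct.calcsize(fmt)
        if pos + width > len(data):
            raise ValueError("zip64 block too short for " + name)
        (values[name],) = struct.unpack_from(fmt, data, pos)
        pos += width
    return values

# 28 data bytes from the repro: sizes 4 and 6, followed by 12 zero bytes.
block = bytes.fromhex("0400000000000000" "0600000000000000" + "00" * 12)

sizes = read_zip64_lenient(block, MASK32, MASK32, 0, 0)
print(len(block), sizes)  # 28 {'uncompressed_size': 4, 'compressed_size': 6}
```

With this approach the entry resolves to a compressed size of 6 and an uncompressed size of 4, and a block that is genuinely too short still raises an error.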

@danmoseley
Member

danmoseley commented Apr 16, 2022

For this zip, 7zip shows a warning "Characteristics: Extra_ERROR Zip64_ERROR NTFS : UTF8". From looking at https://sourceforge.net/p/sevenzip/discussion/45797/thread/13e7d575/#83a1, this can occur when 7zip sees that the zip64 extended field contains values whose corresponding 32-bit header fields were not set to 0xFFFFFFFF. Evidently it still skips past those unexpected fields, since it reads the archive successfully.

@ghost added the "in-pr" label (There is an active PR which will close this issue when it is merged) on Apr 16, 2022
@ghost removed the "in-pr" label (There is an active PR which will close this issue when it is merged) on Apr 20, 2022
@ghost ghost locked as resolved and limited conversation to collaborators May 21, 2022