LLJ-ZIP

A ZIP parsing implementation for Java that stays closer to the spec.

Relevant ZIP information

The notes and structure outlines are the basis for most of LLJ-ZIP.

The JVM zip reader implementation is based on this comment:

This is a zip format reader for seekable files, that tolerates leading and trailing garbage, and tolerates having had internal offsets adjusted for leading garbage (as with Info-Zip's zip -A).

But that's not all it does. That's just what that one comment says. Some other fun quirks of the JVM zip parser:

The end central directory entry is found by scanning from the end of the file, rather than from the beginning.
The central directory values are authoritative. Names/values defined by the local file headers are ignored.
The file data of a local file header is not bounded by that header's compressed size field. Instead, the JVM uses the size declared by the central directory header.
Class names are allowed to end with a trailing /, which most tools interpret as a directory.
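
To illustrate the first quirk, here is a minimal sketch of locating the end of central directory record by scanning backwards from the end of the data. It is not LLJ-ZIP's or the JVM's actual code: the class and method names are hypothetical, and it only matches the record signature (bytes 50 4B 05 06) without the extra validation a real parser would do.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class, not part of LLJ-ZIP.
final class EocdScanSketch {
    /** End of central directory signature, read as a little-endian int. */
    static final int EOCD_SIGNATURE = 0x06054b50;
    /** Size of the record when its comment is empty. */
    static final int EOCD_MIN_SIZE = 22;

    /** Returns the offset of the last EOCD signature, or -1 if none is found. */
    static int findEocdFromEnd(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        // Start at the last position where a full record could still fit.
        for (int offset = data.length - EOCD_MIN_SIZE; offset >= 0; offset--) {
            if (buffer.getInt(offset) == EOCD_SIGNATURE) {
                return offset;
            }
        }
        return -1;
    }

    /** Builds an empty 22-byte record (no entries, no comment) for the demo. */
    static byte[] emptyEocd() {
        ByteBuffer record = ByteBuffer.allocate(EOCD_MIN_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        record.putInt(0, EOCD_SIGNATURE);
        return record.array();
    }

    /** Joins byte arrays in order. */
    static byte[] concat(byte[]... parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] part : parts) {
            out.write(part, 0, part.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Leading and trailing garbage around a single empty record.
        byte[] data = concat(new byte[]{9, 9, 9, 9, 9}, emptyEocd(), new byte[]{1, 2, 3});
        System.out.println("EOCD offset: " + findEocdFromEnd(data));
    }
}
```

Because the scan starts at the end, the last matching signature wins, and bytes placed before or after the archive do not prevent the record from being found.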

Additional features

Reads ZIP files using MemorySegment backed mapped files.
Highly configurable, offering 3 ZIP reading strategies out of the box (See ZipIO for convenience calls)
- Std / Forward scanning: Scans for EndOfCentralDirectory from the front of the file, like many other tools
- Naive: Scans only for LocalFileHeader values from the front of the file, the fastest implementation, but obviously naive
- JVM: Matches the behavior of the JVM's ZIP parser, including a number of odd edge cases. Useful for opening JAR files to mirror java -jar <path> behavior.
Inputs do not have to be on disk to be read; you can supply ZIP data in memory.
Tracks data in front of ZIP contents as ZipArchive.getPrefixData()
- Useful for cases like keeping track of the executable header of Jar2Exe archives.
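
As a rough illustration of the naive strategy and of prefix data, the sketch below scans forward for local file header signatures (bytes 50 4B 03 04) and treats everything before the first match as the prefix. This is an assumption-laden stand-in, not the library's implementation: the class and method names are hypothetical, and a plain signature scan can report false positives if the same four bytes occur inside file data.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration class, not part of LLJ-ZIP.
final class NaiveScanSketch {
    /** Local file header signature, read as a little-endian int. */
    static final int LOCAL_FILE_HEADER_SIGNATURE = 0x04034b50;

    /** Returns the offsets of every local file header signature, front to back. */
    static List<Integer> findLocalFileHeaders(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        List<Integer> offsets = new ArrayList<>();
        for (int offset = 0; offset + 4 <= data.length; offset++) {
            if (buffer.getInt(offset) == LOCAL_FILE_HEADER_SIGNATURE) {
                offsets.add(offset);
            }
        }
        return offsets;
    }

    /** Returns the number of bytes before the first header, or -1 if no header exists. */
    static int prefixLength(byte[] data) {
        List<Integer> offsets = findLocalFileHeaders(data);
        return offsets.isEmpty() ? -1 : offsets.get(0);
    }

    /** Builds a two-entry ZIP with the given bytes placed in front of it. */
    static byte[] zipWithPrefix(byte[] prefix) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        bytes.write(prefix, 0, prefix.length);
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : new String[]{"a.txt", "b.txt"}) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(name.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Stand-in for an executable stub placed in front of the archive.
        byte[] data = zipWithPrefix(new byte[]{'M', 'Z', 0, 0, 0, 0, 0});
        System.out.println("Header offsets: " + findLocalFileHeaders(data));
        System.out.println("Prefix length: " + prefixLength(data));
    }
}
```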

Usage

Maven dependency:

<dependency>
    <groupId>software.coley</groupId>
    <artifactId>lljzip</artifactId>
    <version>${zipVersion}</version> <!-- See release page for latest version -->
</dependency>

Gradle dependency:

implementation group: 'software.coley', name: 'lljzip', version: zipVersion
implementation "software.coley:lljzip:${zipVersion}"

Basic usage:

// ZipIO offers a number of different utility calls for using different ZipReader implementations
ZipArchive archive = ZipIO.readJvm(path);

// Local files have the actual file data/bytes.
// These entries mirror data also declared in central directory entries.
List<LocalFileHeader> localFiles = archive.getLocalFiles();
for (LocalFileHeader localFile : localFiles) {
    // Data model mirrors how a byte-buffer works.
    ByteData data = localFile.getFileData();
    
    // You can extract the data to raw byte[]
    byte[] decompressed = ZipCompressions.decompress(localFile);
    
    // Or do so with a specific decompressor implementation
    // (named differently so both calls compile in the same scope)
    byte[] inflated = localFile.decompress(DeflateDecompressor.INSTANCE);
}

// Typically used for authoritative definitions of properties.
// Some ZIP logic will ignore properties of 'LocalFileHeader' values and use these instead.
//  - Try using a hex editor to play around with this idea. Plenty of samples in the test cases to look at.
List<CentralDirectoryFileHeader> centralDirectories = archive.getCentralDirectories();

// Information about the archive and its contents.
EndOfCentralDirectory end = archive.getEnd();

For more detailed example usage see the tests.

How does each ZipReader implementation map to standard Java ZIP handling?

If you're looking to see which implementation models different ways of reading ZIP files in Java, here's a table for reference:

| Java closest equivalent | LL-Java-Zip |
|---|---|
| `ZipFile` | `JvmZipReader` / `ZipIO.readJvm(...)` |
| `ZipInputStream` | `ForwardScanZipReader` / `ZipIO.readStandard(...)` |
| N/A | `NaiveLocalFileZipReader` / `ZipIO.readNaive(...)` |

There is also AdaptingZipReader, a reader that delegates to ZipFile, but it should be used primarily for debugging.
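
For context on the Java side of this comparison, the sketch below lists entry names with both standard classes. It uses only java.util.zip and does not touch LLJ-ZIP; the class and method names are hypothetical. ZipFile needs a file it can seek in and reads the central directory, while ZipInputStream walks the stream sequentially from the first local file header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration class, not part of LLJ-ZIP.
final class JavaZipReadersSketch {
    /** Builds a small well-formed ZIP with two entries. */
    static byte[] sampleZip() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : new String[]{"a.txt", "b.txt"}) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(name.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    /** Streams through the data, reading one local file header after another. */
    static List<String> namesViaZipInputStream(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry entry = in.getNextEntry(); entry != null; entry = in.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    /** Writes the data to a temporary file, since ZipFile requires one, then lists its entries. */
    static List<String> namesViaZipFile(byte[] zipBytes) {
        try {
            Path temp = Files.createTempFile("zip-sketch", ".zip");
            try {
                Files.write(temp, zipBytes);
                List<String> names = new ArrayList<>();
                try (ZipFile zipFile = new ZipFile(temp.toFile())) {
                    for (ZipEntry entry : Collections.list(zipFile.entries())) {
                        names.add(entry.getName());
                    }
                }
                return names;
            } finally {
                Files.deleteIfExists(temp);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        byte[] zip = sampleZip();
        System.out.println("ZipInputStream: " + namesViaZipInputStream(zip));
        System.out.println("ZipFile: " + namesViaZipFile(zip));
    }
}
```

For a well-formed archive like this one, both approaches report the same entries; the differences only show up on unusual or tampered archives.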

Building

Due to some sun.misc.Unsafe hacks (for faster deflate performance), you will get compiler warnings when first opening the project in IntelliJ. You can resolve this by changing the compiler target.
