doc: update readme and comments
ybirader committed Sep 13, 2023
1 parent 47d75e0 commit 4123dd0
Showing 4 changed files with 56 additions and 33 deletions.
3 changes: 3 additions & 0 deletions Makefile
@@ -3,3 +3,6 @@ test:

test-short:
go test -short ./...

build:
go build -o ./cmd/cli/pzip ./cmd/cli
74 changes: 46 additions & 28 deletions README.md
@@ -1,27 +1,49 @@
![logo-5](https://github.com/ybirader/pzip/assets/68111562/0b3cee2c-1af0-4753-b088-8a488f8ff642)

# pzip
pzip, short for parallel-zip, is a blazing fast concurrent zip archiver.

## Features

- Archives files and directories into a valid zip archive, using DEFLATE
- Archives files and directories into a valid zip archive, using DEFLATE.
- Preserves modification times of files.
- Files are read and compressed concurrently.

### Installation
## Installation

To install pzip, run:

### macOS

`brew install pzip/tap/pzip`

### Debian, Ubuntu, Raspbian

To install pzip, run `brew install pzip` [TODO: Add brew package]
```
sudo apt update
sudo apt install pzip
```

### Go

You can also use pzip as a library by importing the go package:
Alternatively, if you have Go installed:
```
go install github.com/ybirader/pzip
```

### Usage
### Build from source

To build from source, we require Go 1.21 or newer.

1. Clone the repository by running `git clone "https://github.com/ybirader/pzip.git"`
2. Build by running `make build` or `cd cmd/cli && go build`

pzip's API has been designed to mimic the standard zip utility found on most *-nix systems.
## Usage

pzip's API is similar to that of the standard zip utility found on most *-nix systems.

```
pzip /path/to/compressed.zip path/to/file_or_directory
pzip /path/to/compressed.zip path/to/file_or_directory1 path/to/file_or_directory2 ... path/to/file_or_directoryN
```

Alternatively, pzip can be imported as a library
@@ -46,44 +68,40 @@ if err != nil {
}
```

The concurrency of the archiver can be configured using the corresponding flag:
```
pzip --concurrency 2 /path/to/compressed.zip path/to/file_or_directory1 path/to/file_or_directory2 ... path/to/file_or_directoryN
```
or by passing the `Concurrency` option:
```go
archiver, err := pzip.NewArchiver(archive, Concurrency(2))
```

### Benchmarks

We use Matt Mahoney's [sample directory](https://mattmahoney.net/dc/10gb.html) in our benchmark
pzip was benchmarked using Matt Mahoney's [sample directory](https://mattmahoney.net/dc/10gb.html).

Using the standard `zip` utility found on most *nix systems, we get the following time to archive:
Using the standard `zip` utility, we get the following time to archive:
```
real 14m31.809s
user 13m12.833s
sys 0m24.193s
```

The size of the resulting archive is 4.51 GB

Running the same benchmark with pzip, we find that:

```
goos: darwin
goarch: amd64
pkg: github.com/pzip/cmd/cli
cpu: Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz
BenchmarkPzip-8 1 81600764936 ns/op 7928 B/op 32 allocs/op
PASS
ok github.com/pzip/cmd/cli 83.847s
real 0m56.851s
user 3m32.619s
sys 1m25.040s
```

The size of the resulting zip was slightly larger at: 4.62 GB.

Overall, this is over 10x faster! And this is with no optimizations for memory etc.

Upcoming features:
## Contributing

- add flag to maintain unix file permissions i.e. mode of original file
- add support for symbolic links
- add flag to support skipping compression i.e. --skip-suffixes
- add ability to register different compressors
To contribute to pzip, first submit or comment in an issue to discuss your contribution, then open a pull request (PR).

## License

pzip is released under the [MIT License](https://opensource.org/license/mit/).
pzip is released under the [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) license.

2 changes: 0 additions & 2 deletions benchmarks.txt

This file was deleted.

10 changes: 7 additions & 3 deletions cli_test.go
@@ -12,7 +12,10 @@ import (
"github.com/pzip/internal/testutils"
)

const benchmarkRoot = "testdata/benchmark"
const (
benchmarkRoot = "testdata/benchmark"
benchmarkDir = "minibench" // modify this to match the file/directory you want to benchmark
)

func TestCLI(t *testing.T) {
t.Run("archives a directory and some files", func(t *testing.T) {
@@ -30,9 +33,10 @@ func TestCLI(t *testing.T) {
})
}

// BenchmarkCLI benchmarks the archiving of a file/directory, referenced by benchmarkDir in the benchmarkRoot directory
func BenchmarkCLI(b *testing.B) {
dirPath := filepath.Join(benchmarkRoot, "minibench")
archivePath := filepath.Join(benchmarkRoot, "minibench.zip")
dirPath := filepath.Join(benchmarkRoot, benchmarkDir)
archivePath := filepath.Join(benchmarkRoot, benchmarkDir+".zip")

cli := pzip.CLI{archivePath, []string{dirPath}, runtime.GOMAXPROCS(0)}
