Much faster standard PNG writing #55

Open
richgel999 opened this issue Dec 14, 2021 · 8 comments

Comments

@richgel999

richgel999 commented Dec 14, 2021

Using more modern zlib-compatible compression libraries, it's possible to greatly speed up PNG writing while still producing fully standard, reasonably compressed PNGs. The possible speed gains are large: I estimate around 2-15x vs. stb_image_write/lodepng/libpng. (Possibly even faster using libdeflate, a modern Deflate compressor, which I didn't have time to hook up.)

https://twitter.com/richgel999/status/1470696769470271493

Quick PNG writing bench, 2k 32bpp, ~7x faster vs. stbi_write_png:

Encoder              Time (secs)   Size (MB)
miniz.h lvl1g        0.173          9.99
lodepng+miniz f0     0.256          9.99
miniz lvl3g          0.498          8.91
miniz lvl5           0.596          8.86
stbi f0 lvl 5        0.895         11.41
stbi f0              0.944         11.39
lodepng+miniz        0.985          5.96
stbi                 1.247          8.15
lodepng              2.311          5.89
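
To illustrate why a fast Deflate setting still yields a fully standard file, here is a minimal sketch in Python using only the standard library's `zlib` binding (not miniz, and not the code behind the numbers above). It assumes 8-bit RGBA input and uses filter type 0 (None) on every scanline, which is presumably what "f0" denotes in the table. The function names `png_chunk` and `write_png_rgba` are hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(tag: bytes, payload: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 over type+data."""
    crc = zlib.crc32(tag + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + tag + payload + struct.pack(">I", crc)


def write_png_rgba(pixels: bytes, width: int, height: int, level: int = 1) -> bytes:
    """Encode 8-bit RGBA pixels as a standard PNG.

    Every scanline uses filter type 0 (None); `level` is the zlib
    compression level, where 1 trades file size for speed.
    """
    stride = width * 4
    assert len(pixels) == stride * height, "pixel buffer has the wrong size"
    # Each scanline is prefixed with its filter-type byte (0 = None).
    raw = b"".join(
        b"\x00" + pixels[y * stride:(y + 1) * stride] for y in range(height)
    )
    # IHDR: width, height, bit depth 8, color type 6 (RGBA),
    # compression 0, filter method 0, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)
    return (
        PNG_SIGNATURE
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", zlib.compress(raw, level))
        + png_chunk(b"IEND", b"")
    )
```

The compression level and the filter choice only change the bytes inside IDAT, so any conforming decoder reads the result regardless of how little effort the encoder spent.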

@richgel999
Author

richgel999 commented Dec 14, 2021

Additionally, using multiple IDAT chunks makes it possible to parallelize PNG encoding across multiple threads. Combining both techniques would result in enormous reductions in wall-clock time for writing PNGs.

The time it takes to read & write PNG is becoming a significant bottleneck in modern commercial environments that use 8K-16K or larger images/textures.
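
One way the multiple-IDAT idea can be sketched: split the filtered scanline data into strips, compress each strip independently as raw Deflate, end every strip except the last with a sync flush so it stops on a byte boundary, then concatenate the pieces behind a single zlib header and Adler-32 trailer. This is an illustrative sketch using Python's standard `zlib` module, not a description of how any particular encoder does it; `parallel_zlib_pieces` and `_deflate_strip` are hypothetical names, and the threads are there mainly to show the structure.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor


def _deflate_strip(job):
    """Compress one strip as raw Deflate (no zlib header or trailer)."""
    data, is_last, level = job
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)  # wbits=-15: raw Deflate
    body = comp.compress(data)
    # Z_SYNC_FLUSH ends on a byte boundary without marking the final block,
    # so the next strip's output can simply be appended. Only the last
    # strip finishes the stream.
    body += comp.flush(zlib.Z_FINISH if is_last else zlib.Z_SYNC_FLUSH)
    return body


def parallel_zlib_pieces(raw: bytes, strip_bytes: int, level: int = 1,
                         workers: int = 4) -> list:
    """Return byte pieces whose concatenation is one valid zlib stream.

    Each piece could be stored in its own IDAT chunk.
    """
    assert raw and strip_bytes > 0
    strips = [raw[i:i + strip_bytes] for i in range(0, len(raw), strip_bytes)]
    jobs = [(s, i == len(strips) - 1, level) for i, s in enumerate(strips)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pieces = list(pool.map(_deflate_strip, jobs))
    # One zlib header up front (0x78 0x01: 32K window, fastest-level hint)...
    pieces[0] = b"\x78\x01" + pieces[0]
    # ...and one Adler-32 checksum of the whole uncompressed data at the end.
    pieces[-1] += struct.pack(">I", zlib.adler32(raw) & 0xFFFFFFFF)
    return pieces
```

Because a decoder treats the concatenated IDAT payloads as one zlib stream, the output stays standard. The cost is that each strip starts with an empty match window, so the compression ratio drops slightly compared with a single-threaded pass over the same data.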

@DavidBuchanan314

Given that it results in fully standard PNGs, using this technique doesn't require any changes to the spec, right?

@richgel999
Author

richgel999 commented Dec 14, 2021

> Given that it results in fully standard PNGs, using this technique doesn't require any changes to the spec, right?

Correct. No spec changes.

There's already a Rust implementation of the threaded writing idea:
https://github.com/brion/mtpng

@richgel999
Author

I modified miniz (a zlib alternative) to implement pixel-wise LZ compression: all LZ matches are aligned to 4-byte boundaries, and literals are always output in groups of 4. Compression is ~16% faster than the unmodified miniz lvl1g result (.173 → .145 secs), or 8.6x faster than stbi_write_png:

miniz.h lvl1g + pixel-wise LZ: .145 secs, 10.38MB
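
As a rough illustration of the pixel-wise constraint described above (this is not the modified miniz code, which isn't shown in this thread), the toy parser below treats the input as 4-byte pixels, so every match distance and length is a multiple of 4 and literals always come as whole pixels. The function names, the token format, and the simple "most recent occurrence" match finder are all assumptions made for the sketch; the caps are chosen to stay within Deflate's limits of 258 bytes per match and 32768 bytes of distance.

```python
MAX_MATCH_PIXELS = 64    # 256 bytes, within Deflate's 258-byte match limit
MAX_DIST_PIXELS = 8192   # 32768 bytes, Deflate's maximum match distance


def pixelwise_lz_parse(data: bytes, min_match_pixels: int = 1) -> list:
    """Greedy LZ parse in which every token covers whole 4-byte pixels."""
    assert len(data) % 4 == 0, "expects 32bpp data"
    n = len(data) // 4
    pixels = [data[4 * i:4 * i + 4] for i in range(n)]
    last_seen = {}   # pixel value -> index of its most recent occurrence
    tokens = []
    i = 0
    while i < n:
        cand = last_seen.get(pixels[i])
        length = 0
        if cand is not None and i - cand <= MAX_DIST_PIXELS:
            # Extend the match one whole pixel at a time (overlap is allowed).
            while (length < MAX_MATCH_PIXELS and i + length < n
                   and pixels[cand + length] == pixels[i + length]):
                length += 1
        if length >= min_match_pixels:
            # Distance and length are stored in bytes, both multiples of 4.
            tokens.append(("match", (i - cand) * 4, length * 4))
            step = length
        else:
            tokens.append(("literal", pixels[i]))  # a group of 4 literal bytes
            step = 1
        for j in range(i, i + step):
            last_seen[pixels[j]] = j
        i += step
    return tokens


def pixelwise_lz_decode(tokens: list) -> bytes:
    """Inverse of pixelwise_lz_parse, used to check the round trip."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "literal":
            out += tok[1]
        else:
            _, dist, length = tok
            for _ in range(length):
                out.append(out[-dist])  # byte-wise copy handles overlap
    return bytes(out)
```

Working in whole pixels means the match finder examines a quarter as many positions as a byte-wise search would, which is one plausible source of the speedup; the price is that matches starting at unaligned offsets are never found.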

@richgel999
Author

Quick benchmark of various fast PNG writers vs. QOI:
image

@richgel999
Author

What we need is a new codec inside this red circle. Simpler and way faster than the existing filtering+Deflate method, and compresses better by 10-30%:
image

@randy408

Which tool was used to generate the graphic?

I have cleaned up and published the spec I had for parallel encoding/decoding; it's fairly flexible: https://github.com/libspng/png-restart-marker

@palemieux
Contributor

> What we need is a new codec inside this red circle. Simpler and way faster than the existing filtering+Deflate method, and compresses better by 10-30%:

I finally got around to setting up a framework for benchmarking lossless codecs, including JPEG 2000 Part 15 (HTJ2K) and JPEG XL. Initial results at:

https://www.lossless-benchmarks.com/

It is possible to achieve throughput comparable to QOI at lower file sizes.


6 participants