Zstd compression for sort files #541
|
What is the status on this one @tueda ? You have several experimental zstd branches. |
|
If I remember correctly, it should work. But I don't have a good benchmark showing the performance advantages of zstd compression. (This may just be because all my machines currently use SSDs, not HDDs.) Do you have good benchmark results for this? If you are happy with this implementation, I think it can be merged (but the "Fix previous" commit could be squashed into the previous one). I'll rebase it. |
|
I will take another look at some benchmarks. At least in #540 I claimed some improvements. |
|
zlibWrapper uses the quote form of the For some reason, with the macOS 14 image, Edit: OK, GCC is Clang on Mac, and the Clang manual says |
|
This also needs adding to the manual. Additionally, the current manual claims "On compress; |
|
I think this is a good benchmark. It builds a large expression (which still fits in ScratchSize) and then pointlessly reads and writes it many times, so a large fraction of the run time is spent in the compression routines. I run it with FORMTMP on a tmpfs, so the speed of the storage is taken out of the equation. Then I see: The peak sizes of the sort files are 741M, 2.5G, 701M and 741M, respectively. That Benchmark 1 == Benchmark 4 means that using gzip via the wrapper does not incur a penalty. Benchmark 2 is very fast since it does no compression at all, but if FORMTMP is on a real storage device this slows down a lot. So Zstd is both a bit faster and compresses a bit better in this test. Could you rebase the fix commit in your branch? Here is a suggestion for the manual, which you can cherry-pick if you are happy: jodavies@7071cce |
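The trade-off discussed here (time spent compressing versus bytes written) can be illustrated outside FORM. The following is a minimal Python sketch, not the benchmark from this PR: it uses only the standard-library `zlib` module (no zstd), and the synthetic `payload` and the helper name `measure` are assumptions made purely for illustration.

```python
import time
import zlib
from typing import Tuple


def measure(data: bytes, level: int) -> Tuple[int, float]:
    """Compress `data` in memory at the given zlib level.

    Returns (compressed size in bytes, elapsed seconds). Working purely in
    memory keeps storage speed out of the measurement, similar in spirit to
    putting FORMTMP on a tmpfs.
    """
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Round-trip check: the compressed stream must restore the input exactly.
    assert zlib.decompress(packed) == data
    return len(packed), elapsed


# Hypothetical stand-in for sort-file contents: repetitive, term-like text.
payload = b"".join(b"+%d*x^%d*y^%d" % (i % 97, i % 13, i % 7) for i in range(50_000))

for level in (0, 1, 6):  # 0 = stored (no compression), 6 = zlib's default level
    size, seconds = measure(payload, level)
    print(f"level {level}: {size} bytes in {seconds:.4f} s")
```

Level 0 stores the data uncompressed, so it plays the role of the "no compression" run: cheap in CPU time, but it writes the most bytes, which is where a real storage device would start to hurt.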
|
Thanks! The results look nice. Maybe it would be useful to put this benchmark in the repository? We could simply put it into some subdirectory. I have cherry-picked the patch for the manual and squashed the "Fix previous" commit into the previous one. |
|
I added the benchmark in check/benchmarks, and also a "mini" version in the usual tests. The idea is to ensure there is a test in there which creates sort files (very likely there is, but this guarantees it...). Under |
- Add zstd as a submodule at extern/zstd. Use only zstd/zlibWrapper. Clone during configuration if not already cloned.
- Add a new configure option --with-zstd (default: check). Define WITHZSTD and build zstd/zlibWrapper.
- Use the subdir-objects Automake option if appropriate.
The default is to use zstd, if FORM is compiled with the wrapper. Specifying "On compress,gzip;" will still use zlib, via the zstd wrapper.
Also fix "dubious ownership" in containers for Git operations.
|
Rebased (conflicts resolved). |
|
One thing: this works fine when loading tablebases which have been created with "zlib form", since the wrapper detects zlib data and uses the right decompression function. But it does not work the other way around: tablebases created with zstd enabled cannot be read by "zlib form". The easiest solution would be to disable and re-enable the wrapper around calls to compress in minos.c. |
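The asymmetry described above follows from the two formats having different leading bytes. As a rough illustration (a Python sketch, not the wrapper's actual code; the function name `detect_format` is hypothetical), a reader can tell the streams apart by the zstd frame magic number `0xFD2FB528` and the zlib header check from RFC 1950:

```python
import zlib

# zstd frame magic number 0xFD2FB528, stored little-endian at the start of a frame.
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"


def detect_format(buf: bytes) -> str:
    """Classify a buffer as 'zstd', 'zlib', or 'unknown' from its leading bytes."""
    if buf[:4] == ZSTD_MAGIC:
        return "zstd"
    if len(buf) >= 2:
        cmf, flg = buf[0], buf[1]
        # RFC 1950: the low nibble of CMF is 8 (deflate), and the 16-bit
        # header value CMF*256 + FLG is a multiple of 31.
        if (cmf & 0x0F) == 8 and ((cmf << 8) | flg) % 31 == 0:
            return "zlib"
    return "unknown"


print(detect_format(zlib.compress(b"tablebase data")))  # a genuine zlib stream
print(detect_format(ZSTD_MAGIC + b"\x00" * 8))          # only the magic bytes, not a valid frame
```

A binary that supports both formats can dispatch on this kind of check, whereas a zlib-only binary has no decoder for a zstd frame. That is why always writing tablebase data as zlib keeps the files portable in both directions.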
Maintain portability of tablebase files between FORM binaries compiled with and without zstd support.
This PR implements #540.