Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use lz4_flex? #85

Closed
PSeitz opened this issue Feb 15, 2022 · 12 comments
Closed

use lz4_flex? #85

PSeitz opened this issue Feb 15, 2022 · 12 comments
Labels

Comments

@PSeitz
Copy link

PSeitz commented Feb 15, 2022

I like your no unsafe approach, the LZ4 implementation of lz4_flex also uses foorbid(unsafe_code)
with default feature flags. If you switch to it, even more parts of code would be completely safe.

In the Readme I posted some benchmarks, the performance is a little bit slower than the unsafe version, but still decent.

@jorgecarleitao
Copy link
Owner

Thanks! Do you have benches against the current implementation we use, lz4?

@PSeitz
Copy link
Author

PSeitz commented Feb 15, 2022

Yes, in the Readme there is one comparison https://github.com/pseitz/lz4_flex#lz4_flex, but you can execute the benchmarks yourself with cargo bench. I noticed that the performance gap in the block format is smaller than in the frame format, but I think this can be improved further.

@jorgecarleitao
Copy link
Owner

Sorry, is the lzzz equivalent to the lz4 crate? Else, I can't find lz4 in the benches

@PSeitz
Copy link
Author

PSeitz commented Feb 15, 2022

Yes lzzz is equivalent to lz4, but exposes more options of the c99 reference implementation

@PSeitz
Copy link
Author

PSeitz commented Feb 21, 2022

Here are some numbers for a 10MB dickens book.
Feel free to open a PR, if you have a dataset which you'd like to be covered.

Compression

FrameCompress/lz4_flex_rust_indep/9991663
                        time:   [30.874 ms 30.999 ms 31.134 ms]
                        thrpt:  [306.06 MiB/s 307.39 MiB/s 308.64 MiB/s]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
FrameCompress/lz4_flex_rust_linked/9991663
                        time:   [32.109 ms 32.248 ms 32.410 ms]
                        thrpt:  [294.00 MiB/s 295.48 MiB/s 296.76 MiB/s]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high severe
FrameCompress/lz4_cpp_indep/9991663
                        time:   [23.752 ms 23.832 ms 23.915 ms]
                        thrpt:  [398.45 MiB/s 399.83 MiB/s 401.19 MiB/s]
FrameCompress/lz4_cpp_linked/9991663
                        time:   [20.294 ms 20.365 ms 20.440 ms]
                        thrpt:  [466.19 MiB/s 467.89 MiB/s 469.54 MiB/s]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
FrameCompress/snap/9991663
                        time:   [26.722 ms 26.803 ms 26.887 ms]
                        thrpt:  [354.41 MiB/s 355.51 MiB/s 356.59 MiB/s]

Decompression

FrameDecompress/lz4_flex_rust_indep/9991663
                        time:   [5.5075 ms 5.5453 ms 5.5846 ms]
                        thrpt:  [1.6663 GiB/s 1.6781 GiB/s 1.6896 GiB/s]
                 change:
                        time:   [+6.4035% +7.2769% +8.2478%] (p = 0.00 < 0.05)
                        thrpt:  [-7.6194% -6.7832% -6.0182%]
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
FrameDecompress/lz4_flex_rust_linked/9991663
                        time:   [4.5667 ms 4.6009 ms 4.6373 ms]
                        thrpt:  [2.0067 GiB/s 2.0225 GiB/s 2.0377 GiB/s]
Found 8 outliers among 100 measurements (8.00%)
  8 (8.00%) high mild
FrameDecompress/lz4_cpp_indep/9991663
                        time:   [4.6392 ms 4.6758 ms 4.7163 ms]
                        thrpt:  [1.9730 GiB/s 1.9901 GiB/s 2.0058 GiB/s]
Found 8 outliers among 100 measurements (8.00%)
  7 (7.00%) high mild
  1 (1.00%) high severe
FrameDecompress/lz4_cpp_linked/9991663
                        time:   [3.7491 ms 3.7899 ms 3.8332 ms]
                        thrpt:  [2.4276 GiB/s 2.4553 GiB/s 2.4820 GiB/s]

@jorgecarleitao
Copy link
Owner

Hey, sorry for the delay on this one. So, for my understanding, the tradeoff here is:

  • lz4-flex is 30% slower in compression than lz4 cpp
  • lz4-flex is ~20% slower in decompression than lz4 cpp
  • lz4-flex is native Rust, while lz4 cpp is not (e.g. for wasm)

are there other knobs that I am missing?

If not, I think it would be cool to at least offer a feature flag switching between both so that users can pick between highest performance vs wasm compatibility.

@PSeitz
Copy link
Author

PSeitz commented Mar 4, 2022

These benchmarks are against the safe version. The only knobs are to use the unsafe implementation, which is closer to the cpp implementation, due to removed bounds checks. It's very well fuzzy tested, but still unsafe code.
https://github.com/pseitz/lz4_flex#usage

The overall performance is still very high for the safe version, it's likely the bottleneck is elsewhere for most applications.

@kylebarron
Copy link
Contributor

If not, I think it would be cool to at least offer a feature flag switching between both so that users can pick between highest performance vs wasm compatibility.

+1. I'm trying to write wasm bindings for a parquet decoder (and exploring arrow vs arrow2) and from my explorations so far (apache/arrow-rs#180 (comment)), snap, brotli, zstd, and flate2 all work out of the box. lz4-flex compiles to wasm32-unknown-unknown, but lz4 does not. Having a feature flag to switch between the two implementations would be great.

@jorgecarleitao
Copy link
Owner

What others think, e.g. @Dandandan , @ritchie46, @danburkert , @jhorstmann , @houqp?

@ritchie46
Copy link
Collaborator

I am for feature gating this. Would be valuable to have in wasm targets indeed. Regarding the forbid unsafe. I am more for using unsafe when the situation justifies it. E.g. the programmer has more information than the compiler and thereby can elide bound checks etc. I rather have a used with caution and has debug_asserts tag. ;)

@houqp
Copy link

houqp commented Mar 8, 2022

100% agree with both points made by @ritchie46 👍

@jorgecarleitao
Copy link
Owner

Closed by #124. Thanks @kylebarron and everyone else for the discussions!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants