Switch compression algorithm from zstd to lz4 #2112

Merged
merged 2 commits into from
May 25, 2023

emilk
Member

@emilk emilk commented May 13, 2023

Advantages:

  • Faster compilation times
  • 3x faster encode and decode
  • Pure Rust crate
  • Can compress in wasm

Disadvantages:

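| example .rrd  | zstd          | lz4           |
| ------------- | ------------- | ------------- |
| api_demo      | 48 kB         | 93 kB         |
| car           | 120 kB        | 375 kB        |
| clock         | 28 kB         | 53 kB         |
| colmap        | 227 MB        | 241 MB        |
| deep_sdf      | 19 MB         | 20 MB         |
| dicom         | 39 MB         | 64 MB         |
| nyud          | 535 MB        | 634 MB        |
| plots         | 89 kB         | 163 kB        |
| raw_mesh      | 1.5 MB        | 5.6 MB        |
| text_logging  | 1.9 kB        | 3.0 kB        |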
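
To put the size cost in relative terms, here is a small Rust sketch that divides each lz4 size by the matching zstd size. The sizes are copied from the table above. The names `SIZES_KB` and `growth_factor` are made up for this illustration and are not part of the rerun codebase, and the MB values are converted assuming 1 MB = 1000 kB (the assumption does not affect the ratios).

```rust
/// File sizes in kB, copied from the table above: (example, zstd, lz4).
/// Assumption: 1 MB = 1000 kB. Only the per-row ratio matters here.
const SIZES_KB: [(&str, f64, f64); 10] = [
    ("api_demo", 48.0, 93.0),
    ("car", 120.0, 375.0),
    ("clock", 28.0, 53.0),
    ("colmap", 227_000.0, 241_000.0),
    ("deep_sdf", 19_000.0, 20_000.0),
    ("dicom", 39_000.0, 64_000.0),
    ("nyud", 535_000.0, 634_000.0),
    ("plots", 89.0, 163.0),
    ("raw_mesh", 1_500.0, 5_600.0),
    ("text_logging", 1.9, 3.0),
];

/// How many times larger the lz4 file is than the zstd file.
fn growth_factor(zstd_kb: f64, lz4_kb: f64) -> f64 {
    lz4_kb / zstd_kb
}

fn main() {
    // Print one line per example with the lz4/zstd size ratio.
    for (name, zstd, lz4) in SIZES_KB {
        println!("{name:<13} {:.2}x", growth_factor(zstd, lz4));
    }
}
```

By this measure the cost varies a lot between recordings: raw_mesh grows the most (about 3.7x) and car by about 3.1x, while the large colmap and deep_sdf recordings grow by only about 5-6%.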

I'm not sure what is the best trade-off here. I'm gonna check the compilation times too.

Note that these compilation times impact our Rust users, as well as our contributors.

Checklist

PR Build Summary: https://build.rerun.io/pr/2112

@emilk emilk added the dependencies concerning crates, pip packages etc label May 13, 2023
@emilk emilk mentioned this pull request May 13, 2023
@emilk emilk added 🧑‍💻 dev experience developer experience (excluding CI) 🚀 performance Optimization, memory use, etc labels May 13, 2023
@emilk
Member Author

emilk commented May 13, 2023

Compilation times

TL;DR: nothing won 😭

sccache --stop-server && rm -rf /Users/emilk/Library/Caches/Mozilla.sccache && sccache --start-server
cargo clean && cargo build --timings -p rerun

Before (main)

478 deps in 2m 12s

[Image: build timings chart]

After

(together with EmbarkStudios/puffin#135)
470 deps in 2m 12s

[Image: build timings chart, screenshot taken 2023-05-13]

@Wumpf
Member

Wumpf commented May 13, 2023

It's strange that re_log_types took so much longer on the lz4 run. It also seems like the whole build is hitting some bottleneck where it can't parallelize. It could be that this is still a win on a machine with fewer cores than yours, i.e. a strong improvement for many users. Might be worth doing a comparison run that is limited to 4 cores 🤔

@emilk emilk marked this pull request as ready for review May 25, 2023 14:35
@emilk
Member Author

emilk commented May 25, 2023

We've decided to prioritize CPU use right now over disk space. We may revisit this in the future, allowing users to pick different compression algorithms and levels.

@emilk emilk merged commit 7d1d8a4 into main May 25, 2023
@emilk emilk deleted the emilk/lz4 branch May 25, 2023 14:53
emilk added a commit that referenced this pull request May 25, 2023