Support for LZ4 compression. #74

Merged: 1 commit, Feb 10, 2014
Conversation

@alberts (Contributor) commented Feb 6, 2014

Support for LZ4 compression.

CPU: 4 * Intel(R) Core(TM) i7-3540M CPU @ 3.00GHz

Compression:
snappy: 4.928 micros/op 202905 ops/sec; 792.6 MB/s (output: 55.1%)
lz4: 5.973 micros/op 167427 ops/sec; 654.0 MB/s (output: 55.4%)
lz4hc: 65.445 micros/op 15279 ops/sec; 59.7 MB/s (output: 55.4%)

Decompression:
snappy: 0.837 micros/op 1194871 ops/sec; 4667.5 MB/s
lz4: 0.659 micros/op 1517871 ops/sec; 5929.2 MB/s
lz4hc: 0.667 micros/op 1498213 ops/sec; 5852.4 MB/s
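For readers unfamiliar with the lz4 API, here is a minimal sketch of the wrappers a change like this needs, written against the three-argument lz4 calls current at the time (modern lz4 spells these LZ4_compress_default / LZ4_compress_HC); the function names and error handling are illustrative, not the PR's exact code:

```cpp
// Hedged sketch of lz4/lz4hc wrappers in the style of the existing
// snappy ones, using the 2014-era three-argument lz4 API.
#include <lz4.h>
#include <lz4hc.h>
#include <string>

inline bool LZ4_Compress(const std::string& input, std::string* output,
                         bool high_compression) {
  // LZ4_compressBound gives the worst-case compressed size.
  output->resize(LZ4_compressBound(static_cast<int>(input.size())));
  int n = high_compression
              ? LZ4_compressHC(input.data(), &(*output)[0],
                               static_cast<int>(input.size()))
              : LZ4_compress(input.data(), &(*output)[0],
                             static_cast<int>(input.size()));
  if (n <= 0) return false;  // compression failed
  output->resize(n);
  return true;
}

inline bool LZ4_Uncompress(const char* input, size_t length, char* output,
                           int max_output_size) {
  // LZ4_decompress_safe bounds-checks the output buffer and returns the
  // decompressed size, or a negative value on malformed input.
  return LZ4_decompress_safe(input, output, static_cast<int>(length),
                             max_output_size) >= 0;
}
```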

@alberts (Contributor) commented Feb 6, 2014

The latest, greatest lz4 versions aren't packaged by any major Linux distribution, which is why I opted for a kind of in-tree build.

@igorcanadi (Collaborator) commented

Those are great results. :)

I would prefer using lz4 as a shared library instead of building it statically in the rocksdb tree. That's how we use the other compression libraries.
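Using a system-wide shared lz4 typically means probing for it at build time; a hedged sketch of such a check, in the style of a platform-detection script (the variable names are assumptions, not the exact rocksdb script):

```sh
# Hedged sketch: enable -DLZ4 only if a test program links against -llz4.
$CXX $CFLAGS -x c++ - -o /dev/null -llz4 2>/dev/null <<EOF
  #include <lz4.h>
  int main() { return LZ4_compressBound(1024); }
EOF
if [ "$?" = 0 ]; then
    COMMON_FLAGS="$COMMON_FLAGS -DLZ4"
    PLATFORM_LDFLAGS="$PLATFORM_LDFLAGS -llz4"
fi
```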

@alberts (Contributor) commented Feb 6, 2014

Okay, I'll make the changes and get back to you.

@alberts (Contributor) commented Feb 6, 2014

I need to figure out why LZ4 is so much slower when called from a shared library...

@alberts (Contributor) commented Feb 7, 2014

Found the problem:

https://code.google.com/p/lz4/issues/detail?id=114

Some numbers when compiling liblz4.so with -O3:

Compression: snappy
compress: 4.865 micros/op 205553 ops/sec; 802.9 MB/s (output: 55.1%)
uncompress: 0.861 micros/op 1162073 ops/sec; 4539.3 MB/s

Compression: lz4
compress: 5.952 micros/op 168009 ops/sec; 656.3 MB/s (output: 55.4%)
uncompress: 0.677 micros/op 1476476 ops/sec; 5767.5 MB/s

Compression: lz4hc
compress: 66.506 micros/op 15036 ops/sec; 58.7 MB/s (output: 55.4%)
uncompress: 0.648 micros/op 1543885 ops/sec; 6030.8 MB/s

Typical inputs should compress smaller with lz4hc than with lz4.

This pull request is ready to merge unless you have more comments.
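The linked lz4 issue comes down to the shared library having been built without optimization; a hedged sketch of an optimized shared build, assuming the upstream lz4.c / lz4hc.c sources:

```sh
# Hedged sketch: compile liblz4.so with -O3 so the shared build keeps up
# with the static one. Assumes upstream lz4.c / lz4hc.c are present.
gcc -O3 -fPIC -shared lz4.c lz4hc.c -o liblz4.so
```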

@@ -19,7 +19,8 @@
 #
 # -DLEVELDB_PLATFORM_POSIX if cstdatomic is present
 # -DLEVELDB_PLATFORM_NOATOMIC if it is not
 # -DSNAPPY if the Snappy library is present
+# -DLZ4 -DLZ4HC if the LZ4 library is present
@igorcanadi (Collaborator) commented on the diff:
Why do you need both defines? Is there a use case where only one will be present and rocksdb should work correctly/partially?

@alberts (Contributor) replied:
Probably not. I'll get rid of the LZ4HC one.
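A single define suffices because lz4.h and lz4hc.h ship in the same package; a hedged sketch of how the guard might look (illustrative names, not the exact rocksdb wrappers):

```cpp
// Hedged sketch: one -DLZ4 define gates both the fast and the
// high-compression entry points, since both headers come from
// the same library.
#ifdef LZ4
#include <lz4.h>
#include <lz4hc.h>
inline bool LZ4_Supported() { return true; }
#else
inline bool LZ4_Supported() { return false; }
#endif
```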

@liukai (Contributor) commented Feb 7, 2014

I just checked our internal third-party release libraries: we can access the lz4 library in our internal release.

I think that with this diff, other teams will be able to benefit from lz4 seamlessly in the next release.

-inline char* BZip2_Uncompress(const char* input_data, size_t input_length,
-                              int* decompress_size) {
+inline char *BZip2_Uncompress(const char *input_data, size_t input_length,
+                              int *decompress_size) {
@igorcanadi (Collaborator) commented on the diff:
Is this what make format did? We keep the asterisks by the type, not the variable name, so: int* decompress_size.

@alberts (Contributor) replied:
It would help me if you could run make format on the master branch before I retry this; make format doesn't seem to be idempotent.
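If make format is driven by clang-format, pinning the asterisk to the type is a configuration matter; a hedged sketch of the relevant .clang-format fragment (assuming clang-format is what the script invokes):

```yaml
# Hedged sketch: force "int* p" rather than "int *p", and stop
# clang-format from deriving alignment per file, which can make
# repeated runs disagree with each other.
BasedOnStyle: Google
DerivePointerAlignment: false
PointerAlignment: Left
```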

@igorcanadi (Collaborator) commented
Great PR, thanks! I have some comments. Can you also try running arc lint?

@alberts (Contributor) commented Feb 7, 2014

I think I addressed most of the issues.

arc lint found two issues, but it also seems my setup isn't quite right yet:

which: no checkCpp in (...)
which: no cpplint in (...)
[2014-02-07 21:16:56] ERROR 8: Undefined index: db/db_bench.cc at [/home/albert/rocksdb/linters/cpp_linter/FbcodeCppLinter.php:71]
  #0 FbcodeCppLinter::getCppLintOutput(db/db_bench.cc) called at [/home/albert/rocksdb/linters/cpp_linter/FbcodeCppLinter.php:50]
  #1 FbcodeCppLinter::lintPath(db/db_bench.cc) called at [/home/albert/arcanist/src/lint/engine/ArcanistLintEngine.php:297]
  #2 ArcanistLintEngine::run() called at [/home/albert/arcanist/src/workflow/ArcanistLintWorkflow.php:359]
  #3 ArcanistLintWorkflow::run() called at [/home/albert/arcanist/scripts/arcanist.php:321]

Could you add a page on the wiki on how contributors should set up arcanist and cpplint?

Finally, it's probably time to run make format on master again. Running make format a few times in a row makes its idempotence issues apparent.

@igorcanadi (Collaborator) commented
cc @liukai on make format

@liukai (Contributor) commented Feb 7, 2014

@alberts I think the lint issue showed up because you didn't run arc lint or arc diff in the rocksdb root dir (sorry, that's a known issue).

Similarly, make format only works on uncommitted files or your last committed diff. If you want to apply make format to all the lines you've touched so far, could you squash your diffs into one commit? That's also recommended at Facebook: we use git commit --amend to make sure one code review is associated with one git commit.
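The squash-then-format cycle described above looks roughly like this (a sketch; the base branch name is an assumption):

```sh
# Hedged sketch: collapse the branch to one commit, then format it.
git rebase -i master    # mark every commit after the first as "squash"
make format             # now covers all lines touched by the one commit
git add -u
git commit --amend      # fold the formatting fixes back into the commit
```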

@alberts (Contributor) commented Feb 8, 2014

I've rolled up the work into a single commit now.

igorcanadi added a commit referencing this pull request on Feb 10, 2014: Support for LZ4 compression.
@igorcanadi merged commit 8e634d3 into facebook:master on Feb 10, 2014.
facebook-github-bot pushed a commit referencing this pull request on Jan 28, 2020:

Summary:
Without the fix, ./db_compaction_test DBCompactionTest.LevelTtlCascadingCompactions passed 96 / 100 times. With the fix, all runs (tried 100, 1000, and 10000 repeats) succeed:
```
$ TEST_TMPDIR=/dev/shm ~/gtest-parallel/gtest-parallel ./db_compaction_test --gtest_filter=DBCompactionTest.LevelTtlCascadingCompactions --repeat=1000
[1000/1000] DBCompactionTest.LevelTtlCascadingCompactions (1895 ms)
```

Test Plan:
Build:
```
COMPILE_WITH_TSAN=1 make db_compaction_test -j100
```
Without the fix: a few runs out of 100 fail:
```
$ TEST_TMPDIR=/dev/shm KEEP_DB=1 ~/gtest-parallel/gtest-parallel ./db_compaction_test --gtest_filter=DBCompactionTest.LevelTtlCascadingCompactions --repeat=100
...
...
Note: Google Test filter = DBCompactionTest.LevelTtlCascadingCompactions
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from DBCompactionTest
[ RUN      ] DBCompactionTest.LevelTtlCascadingCompactions
db/db_compaction_test.cc:3687: Failure
Expected equality of these values:
  oldest_time
    Which is: 1580155869
  level_to_files[6][0].oldest_ancester_time
    Which is: 1580155870
DB is still at /dev/shm//db_compaction_test_6337001442947696266
[  FAILED  ] DBCompactionTest.LevelTtlCascadingCompactions (1432 ms)
[----------] 1 test from DBCompactionTest (1432 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1433 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] DBCompactionTest.LevelTtlCascadingCompactions

 1 FAILED TEST
[80/100] DBCompactionTest.LevelTtlCascadingCompactions returned/aborted with exit code 1 (1489 ms)
[100/100] DBCompactionTest.LevelTtlCascadingCompactions (1522 ms)
FAILED TESTS (4/100):
    1419 ms: ./db_compaction_test DBCompactionTest.LevelTtlCascadingCompactions (try #90)
    1434 ms: ./db_compaction_test DBCompactionTest.LevelTtlCascadingCompactions (try #84)
    1457 ms: ./db_compaction_test DBCompactionTest.LevelTtlCascadingCompactions (try #82)
    1489 ms: ./db_compaction_test DBCompactionTest.LevelTtlCascadingCompactions (try #74)
```

Differential Revision: D19587040

Pulled By: sagar0

fbshipit-source-id: 11191ae9940837643bff47ebe18b299b4be3d950
mauricebarnum pushed a commit to mauricebarnum/rocksdb referencing this pull request on May 13, 2020: Fix assert/data corruption when using persistent cache.
Nazgolze pushed a commit to Nazgolze/rocksdb-1 referencing this pull request on Sep 21, 2021.