Use pure-java Air-Compressor instead of JNI based libraries #5390
run cpp tests |
run java8 tests |
run integration tests |
run java8 tests |
@merlimat before merging this PR,
|
@rdhabalia There's a test to double-check that the format is the same as the current compression codecs: https://github.com/apache/pulsar/pull/5390/files#diff-5d40386eaa90a0ce27694830c4fa940cR41 |
I see. Instead of only keeping static output in the test, can we also keep the previous dependencies for a release and add real tests with different payload sizes? We can remove them in the next release, once we have proof of successful tests for a release. |
run java8 tests |
@merlimat there is genuine issue here.
|
e89a4f7 to 5d7a3f9
@rdhabalia Added tests to prove compress/decompress compatibility with current JNI implementations. |
@merlimat Could you please resolve the conflicts so that we can onboard this in the 2.6.0 release? |
/pulsarbot run-failure-checks |
…te-update
* 'website-update' of github.com:zeo1995/pulsar: (432 commits)
  Fixed ordering issue in KeyShared dispatcher when adding consumer (apache#7106)
  Fix Duplicated messages are sent to dead letter topic apache#6960 (apache#7021)
  [Issue 2793][Doc]--Update the TLS hostname verification for CPP and Python clients (apache#7162)
  [Doc]--set netty mex frame size (apache#7174)
  [Doc] Update for the maximum message size (apache#7171)
  Fixed KeyShared consumers getting stuck on delivery (apache#7105)
  [apache#6003][pulsar-functions] Possibility to add builtin Functions (apache#6895)
  [Issue 6921][pulsar-broker-common] Replaced "Paths.get(...).getParent()", because it's system dependent and uses '\' as path separator on Windows (apache#6992)
  Improve broker unit test CI (apache#7173)
  Fix typo in exception message (apache#7027)
  Support KeyValue Schema Use Null Key And Null Value (apache#7139)
  [Doc]--Update documents for support consumer priority level in failover mode (apache#7136)
  Add schema config to cpp and cgo docs. (apache#7137)
  [Doc]--Update for the maximum message size (apache#7160)
  [C++] Expose ZSTD and Snappy compression to C API (apache#7014)
  [pulsar-proxy] add proxyLogLevel into config file (apache#6948)
  Add multi-hosts example for bookkeeperMetadataServiceUri (apache#6998)
  support for termination of partitioned topic (apache#6126)
  Use pure-java Air-Compressor instead of JNI based libraries (apache#5390)
  [Issues 5709]remove the namespace checking (apache#5716)
  ...
# Conflicts:
#	site2/website/scripts/split-swagger-by-version.js
)
* Use pure-java Air-Compressor instead of JNI based libraries
* Fixed license files
* Fixed non-needed exclusion
* Added compat tests with JNI implementations
* Ensure direct buffer is used in the test
* Ensure direct bytebuf for both compression and decompression test
Co-authored-by: penghui <[email protected]>
…pache#5390)" This reverts commit b22b323.
Motivation
Right now we're using JNI based libraries to perform data compression. These libraries add overhead in terms of size (7 MB out of the 20 MB of the Pulsar-Client lib) and incur JNI call overhead, which is typically measurable when compressing many small payloads.
We can replace the compression libraries for LZ4, ZStd and Snappy with AirCompressor (https://github.com/airlift/aircompressor), which is a pure Java compression library used by Presto.
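Swapping one codec implementation for another is only safe if each side can read what the other writes, for every payload size. A minimal sketch of such a cross-implementation round-trip check is shown below. It is an illustration only: the `Codec` interface, `deflateCodec` and `crossCompatible` are hypothetical names, and the two codecs are stand-ins built on the JDK's `Deflater`/`Inflater` at different levels so the example runs without extra dependencies. It does not use AirCompressor or Pulsar's actual codec classes, and it is not the test added in this PR.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class CodecCompatSketch {

    /** Hypothetical codec abstraction for this sketch (not Pulsar's interface). */
    interface Codec {
        byte[] encode(byte[] raw);
        byte[] decode(byte[] compressed, int rawLength);
    }

    /** Stand-in codec: JDK Deflater at the given level, JDK Inflater to read. */
    static Codec deflateCodec(int level) {
        return new Codec() {
            @Override
            public byte[] encode(byte[] raw) {
                Deflater deflater = new Deflater(level);
                deflater.setInput(raw);
                deflater.finish();
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                while (!deflater.finished()) {
                    out.write(chunk, 0, deflater.deflate(chunk));
                }
                deflater.end();
                return out.toByteArray();
            }

            @Override
            public byte[] decode(byte[] compressed, int rawLength) {
                Inflater inflater = new Inflater();
                inflater.setInput(compressed);
                byte[] raw = new byte[rawLength];
                int off = 0;
                try {
                    while (off < rawLength && !inflater.finished()) {
                        int n = inflater.inflate(raw, off, rawLength - off);
                        if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                            break; // truncated or unexpected input: stop instead of spinning
                        }
                        off += n;
                    }
                } catch (DataFormatException e) {
                    throw new IllegalArgumentException("corrupt compressed data", e);
                } finally {
                    inflater.end();
                }
                return Arrays.copyOf(raw, off);
            }
        };
    }

    /** True if every (writer, reader) pair round-trips every payload size. */
    static boolean crossCompatible(Codec a, Codec b, int[] sizes) {
        Random random = new Random(42); // fixed seed keeps the check repeatable
        Codec[] codecs = {a, b};
        for (int size : sizes) {
            byte[] payload = new byte[size];
            random.nextBytes(payload);
            // Make the first half repetitive so the payload is partly compressible.
            Arrays.fill(payload, 0, size / 2, (byte) 'x');
            for (Codec writer : codecs) {
                byte[] compressed = writer.encode(payload);
                for (Codec reader : codecs) {
                    if (!Arrays.equals(payload, reader.decode(compressed, size))) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] sizes = {0, 1, 16, 1024, 64 * 1024, 1024 * 1024};
        boolean ok = crossCompatible(
                deflateCodec(Deflater.BEST_SPEED),
                deflateCodec(Deflater.BEST_COMPRESSION),
                sizes);
        System.out.println("cross-compatible: " + ok);
    }
}
```

In a real compatibility test the two `Codec` instances would wrap the existing JNI-based implementation and the pure Java one for the same algorithm, and the size list would cover the small payloads where per-call overhead matters as well as large ones.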
Microbenchmarks
Microbenchmark code is available at https://github.com/merlimat/compression-benchmark
The results are on-par with the JNI version in most cases.
Results:
https://docs.google.com/spreadsheets/d/18ntnyxiQY3VedYeywoum9JXV97f-G9RL7xD-T6th7tA/edit#gid=153785868
Compression
Decompression