Improve VariableWidthBlock deserialization #15883
radek-kondziolka
left a comment
LGTM but please run microbenchmarks
@raunaqmorarka - let me know if you have any questions when you get a chance to review this PR.
@radek-starburst could you please run this and post results from your system as well?
My results:
Avoids an unnecessary additional array allocation when deserializing VariableWidthBlocks, since we can use the final offsets array to temporarily store the serialized length values and transform them back into valid offsets in place.

| Benchmark | (nulls) | Mode | Before (ns/op) | After (ns/op) |
|---|---|---|---|---|
| BenchmarkBlockSerde.deserializeSliceDirect | 0 | avgt | 4.174 ± 0.455 | 3.700 ± 0.092 |
| BenchmarkBlockSerde.deserializeSliceDirect | .01 | avgt | 4.943 ± 0.340 | 4.575 ± 0.330 |
| BenchmarkBlockSerde.deserializeSliceDirect | .10 | avgt | 5.471 ± 0.329 | 5.252 ± 0.078 |
| BenchmarkBlockSerde.deserializeSliceDirect | .50 | avgt | 6.730 ± 0.815 | 3.592 ± 0.167 |
| BenchmarkBlockSerde.deserializeSliceDirect | .90 | avgt | 2.636 ± 0.265 | 2.646 ± 0.260 |
| BenchmarkBlockSerde.deserializeSliceDirect | .99 | avgt | 1.303 ± 0.068 | 1.708 ± 0.038 |
Description
Follows up from #15760 to further optimize `VariableWidthBlockEncoding#readBlock` to avoid allocating an unnecessary additional lengths array, since the final offsets array can temporarily store the serialized lengths of non-null positions and transform them back into valid offsets in place.

Avoiding the extra array allocation does not reduce CPU throughput. Throughput generally improves as well, as a result of being able to trigger CMOV optimizations and of a special path for the "no-nulls" case.
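For readers unfamiliar with the trick, the in-place transformation can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: the class `OffsetsInPlaceSketch`, the method `lengthsToOffsets`, and its parameters are hypothetical names, and the lengths are passed in as an array rather than read from a serialized input. It assumes that only the lengths of non-null positions are serialized and that a `null` `valueIsNull` array means no position is null.

```java
// Hypothetical sketch; names and signature are not taken from the Trino code base.
final class OffsetsInPlaceSketch
{
    private OffsetsInPlaceSketch() {}

    static int[] lengthsToOffsets(int positionCount, int[] nonNullLengths, boolean[] valueIsNull)
    {
        // The final offsets array has one more entry than there are positions.
        int[] offsets = new int[positionCount + 1];

        // Park the lengths at the END of the offsets array instead of in a
        // separate array. This works because nonNullLengths.length <= positionCount.
        int lengthIndex = offsets.length - nonNullLengths.length;
        System.arraycopy(nonNullLengths, 0, offsets, lengthIndex, nonNullLengths.length);

        if (valueIsNull == null) {
            // No-nulls path: lengths occupy offsets[1..positionCount], so a
            // running prefix sum converts them to offsets directly.
            for (int i = 1; i < offsets.length; i++) {
                offsets[i] += offsets[i - 1];
            }
            return offsets;
        }

        // Nulls path: the read cursor (lengthIndex) never falls behind the write
        // cursor (i), so each length is read before its slot is overwritten.
        int currentOffset = 0;
        for (int i = 1; i < offsets.length; i++) {
            // A null position contributes length 0; the ternary is the kind of
            // branch a JIT may compile to a conditional move.
            int length = valueIsNull[i - 1] ? 0 : offsets[lengthIndex++];
            currentOffset += length;
            offsets[i] = currentOffset;
        }
        return offsets;
    }
}
```

For example, with four positions, null flags `{false, true, true, false}`, and non-null lengths `{2, 3}`, the array starts as `[0, 0, 0, 2, 3]` and ends as `[0, 2, 2, 2, 5]`.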
Benchmarks:
Release notes
(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: