Improve partitioned output operator performance for strings #12798
lukasz-stec
left a comment
comments answered.
lukasz-stec
left a comment
comments addressed
Made it non-draft as @raunaqmorarka is reviewing it anyway.
I rebased on the master (there were some changes to the
I also updated the JMH results, as the absolute numbers changed significantly after rebasing on master and running the benchmarks on Java 11.0.15 instead of 11.0.11. The relative improvements are similar (slightly better).
raunaqmorarka
left a comment
minor comments, looks good overall
core/trino-main/src/main/java/io/trino/operator/output/SlicePositionsAppender.java
nit: Could be simplified to something like:
int currentStartOffset = startOffset + length;
for (int i = 0; i < count; i++) {
    offsets[positionCount + i + 1] = currentStartOffset;
    currentStartOffset += length;
}
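For readers without the diff at hand, here is a minimal, self-contained sketch of what the suggested loop does. The class name, the method name, and the assumption that `offsets[positionCount]` already holds the end offset of the last appended value (the `startOffset` in the snippet above) are illustrative and not taken from the PR.

```java
public class RleOffsetsSketch
{
    /**
     * Appends end offsets for `count` repetitions of a value that is `length` bytes long.
     * Assumes offsets[positionCount] is the end offset of the last value appended so far
     * and that the array has room for `count` more entries.
     */
    public static void appendRepeatedOffsets(int[] offsets, int positionCount, int length, int count)
    {
        int currentStartOffset = offsets[positionCount] + length;
        for (int i = 0; i < count; i++) {
            // End offset of the i-th repeated value.
            offsets[positionCount + i + 1] = currentStartOffset;
            currentStartOffset += length;
        }
    }

    public static void main(String[] args)
    {
        // One existing value of 4 bytes, then three repetitions of a 3-byte value.
        int[] offsets = new int[5];
        offsets[1] = 4;
        appendRepeatedOffsets(offsets, 1, 3, 3);
        System.out.println(java.util.Arrays.toString(offsets));
    }
}
```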
I missed this comment.
Applied now.
An alternative here is to copy from the Slice into the byte array only once using this method, then use System.arraycopy to copy bytes within the byte array, doubling the copied size at each step, so you need only about log_2(count) calls to System.arraycopy overall.
Not sure if it'll be faster though.
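A rough sketch of the doubling idea described in this comment, assuming the value is already available as a `byte[]` (in the PR it would come from a Slice). The class and method names are hypothetical, and this is not the code that was merged.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DuplicateBytesSketch
{
    /**
     * Fills target[offset, offset + value.length * count) with `count` copies of `value`.
     * Assumes value.length * count fits in an int and in the target array.
     */
    public static void duplicateBytes(byte[] value, byte[] target, int offset, int count)
    {
        int length = value.length;
        if (length == 0 || count <= 0) {
            return;
        }
        // Copy the value once from its source.
        System.arraycopy(value, 0, target, offset, length);
        int totalBytes = length * count;
        int copiedBytes = length;
        // Double the already-filled region while a full doubling still fits.
        while (copiedBytes <= totalBytes - copiedBytes) {
            System.arraycopy(target, offset, target, offset + copiedBytes, copiedBytes);
            copiedBytes *= 2;
        }
        // Copy whatever is left (possibly zero bytes) from the start of the filled region.
        System.arraycopy(target, offset, target, offset + copiedBytes, totalBytes - copiedBytes);
    }

    public static void main(String[] args)
    {
        byte[] value = "ab".getBytes(StandardCharsets.US_ASCII);
        byte[] target = new byte[2 + value.length * 5];
        duplicateBytes(value, target, 2, 5);
        System.out.println(Arrays.toString(target));
    }
}
```

After the first copy, every further copy reads from the target array itself, so the source is touched only once and the number of `System.arraycopy` calls grows logarithmically with `count`.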
This is a good idea.
First of all, it allows us to not rely on the Slice being backed by a byte array.
Second, I did a quick benchmark and this method is ridiculously fast on small to medium chunk sizes.
I suspect it won't be used that much in practice, but I included it in this PR given the results below.
Benchmark (length) Mode Cnt Score Error Units
BenchmarkDuplicateBytes.arrayCopy 2 avgt 20 4.101 ± 0.082 ns/op
BenchmarkDuplicateBytes.arrayCopy 16 avgt 20 5.088 ± 0.055 ns/op
BenchmarkDuplicateBytes.arrayCopy 35 avgt 20 4.389 ± 0.175 ns/op
BenchmarkDuplicateBytes.arrayCopy 32 avgt 20 4.333 ± 0.049 ns/op
BenchmarkDuplicateBytes.arrayCopy 64 avgt 20 4.900 ± 0.050 ns/op
BenchmarkDuplicateBytes.arrayCopy 80 avgt 20 5.609 ± 0.117 ns/op
BenchmarkDuplicateBytes.arrayCopy 128 avgt 20 5.318 ± 0.069 ns/op
BenchmarkDuplicateBytes.arrayCopy 200 avgt 20 6.087 ± 0.098 ns/op
BenchmarkDuplicateBytes.arrayCopy 256 avgt 20 6.971 ± 0.111 ns/op
BenchmarkDuplicateBytes.arrayCopy 512 avgt 20 12.175 ± 0.324 ns/op
BenchmarkDuplicateBytes.arrayCopy 1024 avgt 20 22.759 ± 0.300 ns/op
BenchmarkDuplicateBytes.arrayCopy 2048 avgt 20 55.769 ± 1.213 ns/op
BenchmarkDuplicateBytes.arrayCopy 100000 avgt 20 4931.827 ± 276.57 ns/op
BenchmarkDuplicateBytes.arrayCopy 1000000 avgt 20 70111.621 ± 922.907 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 2 avgt 20 0.230 ± 0.005 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 16 avgt 20 0.352 ± 0.015 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 35 avgt 20 0.533 ± 0.052 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 32 avgt 20 0.498 ± 0.007 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 64 avgt 20 0.881 ± 0.013 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 80 avgt 20 1.100 ± 0.015 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 128 avgt 20 1.518 ± 0.228 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 200 avgt 20 3.522 ± 0.069 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 256 avgt 20 4.472 ± 0.154 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 512 avgt 20 10.083 ± 0.206 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 1024 avgt 20 21.741 ± 0.349 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 2048 avgt 20 57.329 ± 0.854 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 100000 avgt 20 5243.664 ± 188.963 ns/op
BenchmarkDuplicateBytes.copyFromSelf2x 1000000 avgt 20 109400.739 ± 992.325 ns/op
CI hit #13107
skrzypo987
left a comment
Looks ok % comment about making getRawSlice method public
core/trino-main/src/test/java/io/trino/operator/output/TestPositionsAppender.java
Isn't this already covered by adaptation?
type tests with block adaptation
Ah, this new one. So no, it's not covered by normal tests, even with adaptation. As I mentioned here, I need to access SlicePositionsAppender directly because UnnestingPositionsAppender always copies the RLE value to a byte array.
That said, in the current implementation I don't rely on the byte array in the RLE block anymore, so I could drop this test.
lukasz-stec
left a comment
comments addressed
core/trino-main/src/test/java/io/trino/operator/output/TestPositionsAppender.java
rename test name, if it's not byte array, then what it is?
Why the test is important (please add short comment)
rename test name, if it's not byte array, then what it is?
The point of this test is that it is not a byte array, not that it is, e.g., a long array.
Comment added.
rename, NotByteArray -> WithSomeType
I dropped this test; it's no longer needed.
core/trino-main/src/test/java/io/trino/operator/output/TestPositionsAppender.java
everything should support supportsNullRle
ArrayBlockBuilder does not support it
ArrayBlockBuilder does not support it
Does it make sense to add support there? It seems odd that we have to work around that particular type here since all other types (including complex ones) support it.
moved changes from #13973 to the first commit here + adjusted the tests
testAppendSliceNotSupportedByByteArray -> testAppendSliceNotBackedByByteArray
conflict, please rebase
Produce RunLengthEncodedBlock in ArrayBlockBuilder when all values are null
The improvement comes from not allocating Slice per position and using System.arraycopy directly instead of Slice.getBytes.
Before
Benchmark (channelCount) (enableCompression) (nullRate) (partitionCount) (positionCount) (type) Mode Cnt Score Error Units
BenchmarkPartitionedOutputOperator.addPage 1 false 0 16 8192 VARCHAR avgt 20 2137.367 ± 166.834 ms/op
BenchmarkPartitionedOutputOperator.addPage 1 false 0.2 16 8192 VARCHAR avgt 20 1660.550 ± 24.274 ms/op
After
BenchmarkPartitionedOutputOperator.addPage 1 false 0 16 8192 VARCHAR avgt 20 1399.476 ± 181.301 ms/op
BenchmarkPartitionedOutputOperator.addPage 1 false 0.2 16 8192 VARCHAR avgt 20 1194.077 ± 160.496 ms/op
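As a quick sanity check on the before/after scores in this commit message, the relative reduction in time per operation can be computed directly. The class and method names below are illustrative; the scores are copied from the benchmark output, and their error margins are wide, so the percentages are only approximate.

```java
import java.util.Locale;

public class RelativeImprovement
{
    /** Relative reduction in time per operation, as a fraction of the "before" score. */
    public static double reduction(double before, double after)
    {
        return (before - after) / before;
    }

    public static void main(String[] args)
    {
        // Scores in ms/op for VARCHAR with nullRate 0 and nullRate 0.2.
        System.out.printf(Locale.ROOT, "nullRate 0:   %.1f%%%n", 100 * reduction(2137.367, 1399.476));
        System.out.printf(Locale.ROOT, "nullRate 0.2: %.1f%%%n", 100 * reduction(1660.550, 1194.077));
    }
}
```

This works out to roughly 34.5% and 28.1% respectively.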
rebased, checks in progress
Description
Improve performance of SlicePositionsAppender. The improvement comes from not allocating a Slice per position and from using System.arraycopy directly instead of Slice.getBytes.
performance improvement
core query engine (partitioned exchange) and SPI extension
Improve performance of remote partitioned exchange
Benchmarks
jmh
about 30% improvement
tpch/tpcds
Overall there is a small improvement. This is expected, as PartitionedOutputOperator is only around 5% of total CPU time and this PR only improves varchars. When looking at the PartitionedOutputOperator stats only, the improvement is ~10%.
poo-slice-orc-part-sf1K.pdf
Documentation
(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
( ) No release notes entries required.
(x) Release notes entries required with the following suggested text: