Optimize null representation in encoded VariableBlockWidthBlock#15760

Merged
raunaqmorarka merged 1 commit into trinodb:master from
radek-kondziolka:rk/try_more_optimal_null_representation_in_varchar_block
Jan 24, 2023

@radek-kondziolka radek-kondziolka commented Jan 18, 2023

Description

Currently, when a VariableWidthBlock is encoded, it writes an array of offsets for all positions, regardless of whether a position is null. Instead, we could write lengths only for non-null positions and compute the offsets from the array of lengths and the nullability array (the array that records whether each position is null).
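
To illustrate the idea above, here is a minimal sketch of deriving offsets from lengths plus the null array. It is not the code from this PR: the class and method names (NullAwareLengths, nonNullLengths, offsetsFromLengths) are hypothetical, and it assumes that a null position occupies zero bytes in the data slice.

```java
import java.util.Arrays;

// Hypothetical illustration only; names and layout are assumptions, not Trino's API.
final class NullAwareLengths
{
    private NullAwareLengths() {}

    // Encoder side: keep one length per non-null position, in position order.
    // offsets has isNull.length + 1 entries, as in an offsets-based block.
    static int[] nonNullLengths(int[] offsets, boolean[] isNull)
    {
        int[] lengths = new int[isNull.length];
        int count = 0;
        for (int position = 0; position < isNull.length; position++) {
            if (!isNull[position]) {
                lengths[count++] = offsets[position + 1] - offsets[position];
            }
        }
        return Arrays.copyOf(lengths, count);
    }

    // Decoder side: rebuild all offsets; a null position contributes zero bytes (assumption).
    static int[] offsetsFromLengths(int[] lengths, boolean[] isNull)
    {
        int[] offsets = new int[isNull.length + 1];
        int next = 0;
        for (int position = 0; position < isNull.length; position++) {
            int length = isNull[position] ? 0 : lengths[next++];
            offsets[position + 1] = offsets[position] + length;
        }
        return offsets;
    }

    public static void main(String[] args)
    {
        boolean[] isNull = {false, true, false, true};
        int[] offsets = {0, 3, 3, 5, 5};
        int[] lengths = nonNullLengths(offsets, isNull);
        System.out.println(Arrays.toString(lengths));
        System.out.println(Arrays.toString(offsetsFromLengths(lengths, isNull)));
    }
}
```

The decoder does a little extra work per position (a branch on the null flag and a running sum), which is one plausible reason deserialization could get slower while less data is written.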

This change should be tested by io.trino.spi.block.TestVariableWidthBlockEncoding.

The difference was tested on the following query, with different values of X to control the null frequency:

with cs as (select *,  case when rand(100) < X then null else '0' end nullek from catalog_sales cs)
SELECT  count_if(cs.nullek is null), count(*), cs.nullek FROM cs
RIGHT JOIN call_center cc ON cc.cc_call_center_sk =  cs.cs_call_center_sk
GROUP BY 3
ORDER BY 1, 2;

Results (cumulative size of data exchanged over the network, in GB):

           No nulls   50% nulls   99% nulls
baseline   28 GB      27 GB       26 GB
change     28 GB      24 GB       21 GB

Let's wait for the benchmark results.

Release notes

(*) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

@cla-bot cla-bot bot added the cla-signed label Jan 18, 2023
@radek-kondziolka radek-kondziolka marked this pull request as ready for review January 18, 2023 15:59

@lukasz-stec lukasz-stec left a comment

mostly lgtm, I would make sure that the tests cover cases with and without nulls

@radek-kondziolka radek-kondziolka force-pushed the rk/try_more_optimal_null_representation_in_varchar_block branch 2 times, most recently from 60d4498 to fff7d51 Compare January 20, 2023 10:03

radek-kondziolka commented Jan 20, 2023

micro benchmark results:

Benchmark                                   (nullChance)  Mode  Cnt  Score   Error  Units   Before
BenchmarkBlockSerde.deserializeSliceDirect             0  avgt   10  3.223 ± 0.068  ns/op   2.716 ± 0.150  ns/op
BenchmarkBlockSerde.deserializeSliceDirect           .01  avgt   10  4.499 ± 0.103  ns/op   3.725 ± 0.084  ns/op
BenchmarkBlockSerde.deserializeSliceDirect           .10  avgt   10  5.180 ± 0.159  ns/op   3.471 ± 0.075  ns/op
BenchmarkBlockSerde.deserializeSliceDirect           .50  avgt   10  5.819 ± 0.176  ns/op   2.678 ± 0.040  ns/op
BenchmarkBlockSerde.deserializeSliceDirect           .90  avgt   10  2.100 ± 0.050  ns/op   1.813 ± 0.017  ns/op   
BenchmarkBlockSerde.deserializeSliceDirect           .99  avgt   10  1.104 ± 0.019  ns/op   1.553 ± 0.024  ns/op
BenchmarkBlockSerde.serializeSliceDirect               0  avgt   10  2.324 ± 0.051  ns/op   5.436 ± 0.104  ns/op
BenchmarkBlockSerde.serializeSliceDirect             .01  avgt   10  3.360 ± 0.026  ns/op   5.900 ± 0.021  ns/op
BenchmarkBlockSerde.serializeSliceDirect             .10  avgt   10  4.127 ± 0.060  ns/op   6.453 ± 0.101  ns/op
BenchmarkBlockSerde.serializeSliceDirect             .50  avgt   10  2.870 ± 0.040  ns/op   4.948 ± 0.145  ns/op
BenchmarkBlockSerde.serializeSliceDirect             .90  avgt   10  2.740 ± 0.102  ns/op   4.954 ± 0.090  ns/op
BenchmarkBlockSerde.serializeSliceDirect             .99  avgt   10  1.951 ± 0.071  ns/op   4.315 ± 0.052  ns/op

Deserialization is slower, as expected, but serialization is much faster in relative terms. Overall, the change should be a net speedup.

macrobenchmark: (tpch/tpcds, orc, part, sf1000);

wall time tpch:  -1.49%
wall time tpcds: -1.05% 
cpu time tpch: -2.97% 
cpu time tpcds: -2.64%

macrobenchmark: (tpch/tpcds, orc, unpart, sf1000);

wall time tpch:  -2.37%
wall time tpcds: -1.68% 
cpu time tpch: -1.94% 
cpu time tpcds: -0.72%
network bytes tpch: 0
network bytes tpcds: -0.05%

@raunaqmorarka raunaqmorarka left a comment

Please add results of TPC benchmarks as well

@radek-kondziolka radek-kondziolka force-pushed the rk/try_more_optimal_null_representation_in_varchar_block branch from fff7d51 to 8581782 Compare January 20, 2023 12:19
@radek-kondziolka radek-kondziolka force-pushed the rk/try_more_optimal_null_representation_in_varchar_block branch 4 times, most recently from 8fbc6af to ac3ad6f Compare January 23, 2023 12:42
@radek-kondziolka radek-kondziolka force-pushed the rk/try_more_optimal_null_representation_in_varchar_block branch from ac3ad6f to b283d87 Compare January 23, 2023 16:44
Currently, the encoding of VariableWidthBlock writes offsets
(4 bytes per position) for every position, regardless of
whether the position is null.
Instead, it is sufficient to write the lengths of the non-null
positions and the null array; the offsets can be derived from those.
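
As a rough back-of-the-envelope check of the saving described above, the sketch below compares the bytes spent on the index section under both layouts. It is an illustration, not code from this commit: the class and method names are hypothetical, it assumes each length is written as a 4-byte int, and it ignores the null bitmap and the data bytes, which are taken to be the same in both layouts.

```java
// Hypothetical size accounting; names and the 4-byte length width are assumptions.
final class IndexSectionSize
{
    private IndexSectionSize() {}

    // Old layout: one 4-byte offset for every position.
    static long offsetsBytes(int positionCount)
    {
        return (long) positionCount * Integer.BYTES;
    }

    // New layout: one 4-byte length per non-null position only.
    static long lengthsBytes(int positionCount, int nullCount)
    {
        return (long) (positionCount - nullCount) * Integer.BYTES;
    }

    public static void main(String[] args)
    {
        int positionCount = 1000;
        for (int nullCount : new int[] {0, 500, 990}) {
            long before = offsetsBytes(positionCount);
            long after = lengthsBytes(positionCount, nullCount);
            System.out.printf("nulls=%d before=%dB after=%dB saved=%dB%n",
                    nullCount, before, after, before - after);
        }
    }
}
```

Under these assumptions the saving is 4 bytes per null position, so it grows linearly with the null fraction and is zero when a block has no nulls.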

Benchmark                                   (nullChance)  Mode  Cnt  Score   Error  Units   Before
BenchmarkBlockSerde.deserializeSliceDirect             0  avgt   10  3.223 ± 0.068  ns/op   2.716 ± 0.150  ns/op
BenchmarkBlockSerde.deserializeSliceDirect           .01  avgt   10  4.499 ± 0.103  ns/op   3.725 ± 0.084  ns/op
BenchmarkBlockSerde.deserializeSliceDirect           .10  avgt   10  5.180 ± 0.159  ns/op   3.471 ± 0.075  ns/op
BenchmarkBlockSerde.deserializeSliceDirect           .50  avgt   10  5.819 ± 0.176  ns/op   2.678 ± 0.040  ns/op
BenchmarkBlockSerde.deserializeSliceDirect           .90  avgt   10  2.100 ± 0.050  ns/op   1.813 ± 0.017  ns/op
BenchmarkBlockSerde.deserializeSliceDirect           .99  avgt   10  1.104 ± 0.019  ns/op   1.553 ± 0.024  ns/op
BenchmarkBlockSerde.serializeSliceDirect               0  avgt   10  2.324 ± 0.051  ns/op   5.436 ± 0.104  ns/op
BenchmarkBlockSerde.serializeSliceDirect             .01  avgt   10  3.360 ± 0.026  ns/op   5.900 ± 0.021  ns/op
BenchmarkBlockSerde.serializeSliceDirect             .10  avgt   10  4.127 ± 0.060  ns/op   6.453 ± 0.101  ns/op
BenchmarkBlockSerde.serializeSliceDirect             .50  avgt   10  2.870 ± 0.040  ns/op   4.948 ± 0.145  ns/op
BenchmarkBlockSerde.serializeSliceDirect             .90  avgt   10  2.740 ± 0.102  ns/op   4.954 ± 0.090  ns/op
BenchmarkBlockSerde.serializeSliceDirect             .99  avgt   10  1.951 ± 0.071  ns/op   4.315 ± 0.052  ns/op
@radek-kondziolka radek-kondziolka force-pushed the rk/try_more_optimal_null_representation_in_varchar_block branch from b283d87 to 22b1aaa Compare January 24, 2023 08:13
@raunaqmorarka raunaqmorarka merged commit ca1ab5f into trinodb:master Jan 24, 2023
@github-actions github-actions bot added this to the 406 milestone Jan 24, 2023