Reduce overheads related to DictionaryBlock#getSizeInBytes() by pettyjamesm · Pull Request #10970 · trinodb/trino

pettyjamesm · 2022-02-07T20:34:41Z

Description

Reduces the overhead of calculating DictionaryBlock#getSizeInBytes() by making two changes to the Block interface definition:

Adds a new argument selectedPositionCount to Block#getPositionsSizeInBytes(boolean[] positions, int selectedPositionCount). Callers of the previous method getPositionsSizeInBytes(boolean[] positions) can trivially calculate and pass the total count of selected positions. Without that count, the called blocks had to count the true values in the positions array themselves, sometimes repeatedly, as in the case of RowBlock.
Adds a new method Block#fixedSizeInBytesPerPosition() that allows blocks to describe whether their getSizeInBytes() can be calculated directly from the position count, i.e. the size does not vary with the specific positions selected.
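
A minimal sketch of how the two additions could fit together. This is an illustration under stated assumptions, not the code in this PR: `SketchBlock`, `IntSketchBlock`, and `DictionarySizeSketch` are hypothetical names, the 5-bytes-per-position layout is made up for the example, and the `OptionalInt` return type is an assumption based on the review discussion of an Optional-guarded value.

```java
import java.util.OptionalInt;

// Hypothetical, simplified stand-in for the Block SPI; not the real Trino interface.
interface SketchBlock {
    int getPositionCount();

    // Empty when the size depends on which positions are selected (e.g. variable-width values).
    OptionalInt fixedSizeInBytesPerPosition();

    // The caller passes the number of true entries so nested blocks need not recount them.
    long getPositionsSizeInBytes(boolean[] positions, int selectedPositionCount);
}

// Fixed-width block: assumes 4 bytes of value plus 1 byte of null flag per position.
final class IntSketchBlock implements SketchBlock {
    private static final int BYTES_PER_POSITION = Integer.BYTES + Byte.BYTES;
    private final int[] values;

    IntSketchBlock(int[] values) {
        this.values = values;
    }

    @Override
    public int getPositionCount() {
        return values.length;
    }

    @Override
    public OptionalInt fixedSizeInBytesPerPosition() {
        return OptionalInt.of(BYTES_PER_POSITION);
    }

    @Override
    public long getPositionsSizeInBytes(boolean[] positions, int selectedPositionCount) {
        // No scan of the positions array is needed: the count alone determines the size.
        return (long) BYTES_PER_POSITION * selectedPositionCount;
    }
}

final class DictionarySizeSketch {
    private DictionarySizeSketch() {}

    // Size of the dictionary entries referenced by ids, excluding the ids array itself.
    static long referencedDictionarySize(SketchBlock dictionary, int[] ids) {
        boolean[] used = new boolean[dictionary.getPositionCount()];
        int uniqueCount = 0;
        for (int id : ids) {
            if (!used[id]) {
                used[id] = true;
                uniqueCount++;
            }
        }
        OptionalInt fixedSize = dictionary.fixedSizeInBytesPerPosition();
        if (fixedSize.isPresent()) {
            // Shortcut: the size follows from the unique count alone.
            return (long) fixedSize.getAsInt() * uniqueCount;
        }
        // Otherwise delegate, handing over the count that was already computed.
        return dictionary.getPositionsSizeInBytes(used, uniqueCount);
    }
}
```

For a dictionary built from `new IntSketchBlock(new int[] {10, 20, 30, 40})` and ids `{0, 2, 2, 0, 3}`, three distinct entries are referenced, so the sketch returns 3 × 5 = 15 bytes without calling the dictionary's per-position size method.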

Also includes a change to eagerly populate the unique position count and size in bytes on the results of DictionaryBlock#getPositions when the result is not compact, since the selected positions array may have already been created and would otherwise have to be reconstructed in a subsequent call to DictionaryBlock#getSizeInBytes().
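
The eager population described above can be sketched roughly as follows. This is a simplified illustration, not the Trino implementation: `DictionaryViewSketch` is a hypothetical class, the dictionary is modeled as an array of per-entry byte sizes, and the sketch populates the cache unconditionally, whereas the PR does so only when the result is not compact.

```java
// Hypothetical sketch, not the Trino DictionaryBlock.
final class DictionaryViewSketch {
    private static final int UNKNOWN = -1;

    private final long[] entrySizes; // byte size of each dictionary entry (simplification)
    private final int[] ids;         // dictionary index referenced by each position
    private long sizeInBytes;        // cached size, or UNKNOWN
    private int uniqueIds;           // cached distinct-id count, or UNKNOWN
    int sizeScans;                   // number of full recomputations, for illustration only

    DictionaryViewSketch(long[] entrySizes, int[] ids) {
        this(entrySizes, ids, UNKNOWN, UNKNOWN);
    }

    private DictionaryViewSketch(long[] entrySizes, int[] ids, long sizeInBytes, int uniqueIds) {
        this.entrySizes = entrySizes;
        this.ids = ids;
        this.sizeInBytes = sizeInBytes;
        this.uniqueIds = uniqueIds;
    }

    DictionaryViewSketch getPositions(int[] positions) {
        int[] newIds = new int[positions.length];
        boolean[] used = new boolean[entrySizes.length];
        int unique = 0;
        long dictionaryBytes = 0;
        for (int i = 0; i < positions.length; i++) {
            int id = ids[positions[i]];
            newIds[i] = id;
            if (!used[id]) {
                used[id] = true;
                unique++;
                dictionaryBytes += entrySizes[id];
            }
        }
        // The used array already exists here, so record the results now instead of
        // leaving getSizeInBytes() to rebuild the same array later.
        long size = dictionaryBytes + (long) Integer.BYTES * newIds.length;
        return new DictionaryViewSketch(entrySizes, newIds, size, unique);
    }

    long getSizeInBytes() {
        if (sizeInBytes == UNKNOWN) {
            sizeScans++;
            boolean[] used = new boolean[entrySizes.length];
            int unique = 0;
            long dictionaryBytes = 0;
            for (int id : ids) {
                if (!used[id]) {
                    used[id] = true;
                    unique++;
                    dictionaryBytes += entrySizes[id];
                }
            }
            uniqueIds = unique;
            sizeInBytes = dictionaryBytes + (long) Integer.BYTES * ids.length;
        }
        return sizeInBytes;
    }

    int getUniqueIds() {
        getSizeInBytes(); // ensures the cached count is populated
        return uniqueIds;
    }
}
```

In this sketch, a view produced by `getPositions` answers `getSizeInBytes()` from the cached value (its `sizeScans` stays at 0), while a freshly constructed instance has to scan its ids once.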

General information

Is this change a fix, improvement, new feature, refactoring, or other?

This is an improvement that reduces the overhead for common scenarios involving DictionaryBlock#getSizeInBytes().

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

This change affects the SPI interface of blocks and refactors the implementations of DictionaryBlock#getPositions and Block#getPositionsSizeInBytes to leverage the new information available.

How would you describe this change to a non-technical end user or system administrator?

Describing this change specifically to a non-technical audience should not be necessary.

Related issues, pull requests, and links

Documentation

( ) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Section
* Fix some things. ({issue}`5678`)

pettyjamesm · 2022-02-08T19:25:52Z

Benchmark results

Before

Benchmark                                                (selectedPositions)  (valueType)  Mode  Cnt     Score     Error  Units
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes                  100      varchar  avgt    5   441.915 ±  39.146  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes                  100      integer  avgt    5   357.674 ±  12.478  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes                 1000      varchar  avgt    5   436.585 ±   5.135  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes                 1000      integer  avgt    5   374.206 ±   4.870  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes                10000      varchar  avgt    5   838.399 ±  17.185  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes                10000      integer  avgt    5   647.638 ±   5.097  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes               100000      varchar  avgt    5  2437.982 ± 119.726  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes               100000      integer  avgt    5  1754.578 ±  87.259  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                         100      varchar  avgt    5   383.916 ±   9.507  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                         100      integer  avgt    5   329.267 ±   7.083  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                        1000      varchar  avgt    5   420.201 ±   4.594  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                        1000      integer  avgt    5   361.897 ±  28.052  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                       10000      varchar  avgt    5   836.371 ±  20.350  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                       10000      integer  avgt    5   636.434 ±   4.285  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                      100000      varchar  avgt    5  2489.630 ±  44.489  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                      100000      integer  avgt    5  1806.359 ±  24.145  us/op

After

Benchmark                                                (selectedPositions)  (valueType)  Mode  Cnt     Score    Error  Units
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes                  100      varchar  avgt    5   253.769 ± 10.734  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes                  100      integer  avgt    5    56.284 ±  1.955  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes                 1000      varchar  avgt    5   311.319 ±  7.131  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes                 1000      integer  avgt    5    51.795 ±  0.427  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes                10000      varchar  avgt    5   772.982 ± 13.421  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes                10000      integer  avgt    5   137.741 ±  8.286  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes               100000      varchar  avgt    5  2216.938 ± 45.850  us/op
BenchmarkDictionaryBlock.getPositionsThenGetSizeInBytes               100000      integer  avgt    5   623.966 ± 14.182  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                         100      varchar  avgt    5   303.344 ± 14.147  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                         100      integer  avgt    5    91.252 ± 42.979  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                        1000      varchar  avgt    5   374.223 ± 15.776  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                        1000      integer  avgt    5    85.253 ±  4.161  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                       10000      varchar  avgt    5   778.172 ± 16.958  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                       10000      integer  avgt    5   213.559 ±  1.584  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                      100000      varchar  avgt    5  2374.654 ± 10.486  us/op
BenchmarkDictionaryBlock.getPositionsSizeInBytes                      100000      integer  avgt    5   725.355 ± 26.155  us/op

core/trino-spi/src/main/java/io/trino/spi/block/AbstractMapBlock.java

core/trino-spi/src/main/java/io/trino/spi/block/Block.java

pettyjamesm · 2022-02-14T14:47:05Z

@skrzypo987 think you'll have a chance to take a full review pass on this PR sometime soon?

skrzypo987 · 2022-02-15T08:08:06Z

@skrzypo987 think you'll have a chance to take a full review pass on this PR sometime soon?

I'll take a look today

skrzypo987

Seems OK from my perspective. If you want to get it merged soon, ping another maintainer, since @sopel39 is out this week.

core/trino-spi/src/main/java/io/trino/spi/block/Block.java

pettyjamesm · 2022-02-15T18:56:25Z

@dain maybe you have a minute to take a look and decide whether to merge it while Karol is out?

losipiuk · 2022-02-22T09:42:15Z

core/trino-spi/src/main/java/io/trino/spi/block/AbstractArrayBlock.java

nit: maybe just getSizeInBytesPerPosition?

I think the "fixed" adds extra information here and makes it clearer (to me) that this method is only for blocks where the size in bytes does not depend on the value at any given position, but that's just my opinion. If you feel strongly, I'm willing to change it.

The semantics were fairly natural to me even with the shorter name, given that a single number is returned and is guarded by Optional to handle the case when we cannot compute it. But I do not feel strongly. It may stay.

losipiuk · 2022-02-22T09:50:38Z

core/trino-spi/src/main/java/io/trino/spi/block/AbstractArrayBlock.java

If we are exploiting the fact that rawElementBlock is of a specific class, do we need to compute countSelectedPositionsFromOffsets? This argument is ignored by RunLengthEncodedBlock.getPositionsSizeInBytes anyway.

That's true. I left this in because I was planning on making RunLengthEncodedBlock#getPositionsSizeInBytes do something like return selectedPositions > 0 ? value.getSizeInBytes() : 0; but that broke a lot of tests and it didn't seem worth the headache, so I instead added tests that assert the opposite.

It doesn't quite feel right for AbstractArrayBlock to special case the handling of RunLengthEncodedBlock by calling RunLengthEncodedBlock#getSizeInBytes, but it certainly would reduce the overhead involved.

I'm open to doing whatever you think makes the most sense here.
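
To make the two behaviors under discussion concrete, here is a small sketch. `RleSizeSketch` is a hypothetical stand-in rather than the real RunLengthEncodedBlock, and it assumes the reported size is simply that of the single repeated value.

```java
// Hypothetical stand-in for a run-length encoded block: one value repeated for every position.
final class RleSizeSketch {
    private final long valueSizeInBytes;

    RleSizeSketch(long valueSizeInBytes) {
        this.valueSizeInBytes = valueSizeInBytes;
    }

    // Behavior kept according to the thread: the selected count does not affect the result.
    long getPositionsSizeInBytes(boolean[] positions, int selectedPositionCount) {
        return valueSizeInBytes;
    }

    // Alternative that was considered and dropped: report zero when nothing is selected.
    long getPositionsSizeInBytesIfAnySelected(boolean[] positions, int selectedPositionCount) {
        return selectedPositionCount > 0 ? valueSizeInBytes : 0;
    }
}
```

The two methods differ only when `selectedPositionCount` is 0, which is the case the added tests pin down.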

core/trino-spi/src/main/java/io/trino/spi/block/BlockUtil.java

core/trino-spi/src/main/java/io/trino/spi/block/DictionaryBlock.java

losipiuk · 2022-02-22T13:06:25Z

LGTM. But it is not the area I feel best with. @dain / @sopel39 want to take a look still?

Adds a positionCount argument to Block#getPositionsSizeInBytes and adds a new method, Block#fixedSizeInBytesPerPosition(), to reduce the overhead associated with calculating DictionaryBlock size in bytes when the underlying dictionary size in bytes can be calculated without specific information about which positions are referenced.

pettyjamesm · 2022-02-24T20:06:00Z

More complete benchmark results were easier to produce in the Presto equivalent PR, and links to those results can be found in my comment there.

losipiuk · 2022-02-25T12:56:31Z

@pettyjamesm The JMH benchmarks look nice! I still have a question. Did you have a chance to verify how much the change impacts end-to-end query execution? How much are we gaining in CPU/wall time over some verbose benchmark (e.g. tpc-h/ds), or over your production queries?
The change surely adds some complexity, and it would be nice to have a good justification for adding that.

pettyjamesm · 2022-02-25T16:00:44Z

Did you have a chance to verify how much the change impacts end-to-end query execution? How much are we gaining in CPU/wall time over some verbose benchmark (e.g. tpc-h/ds), or over your production queries?

I haven't had a chance to put these changes into our production environment, and in TPCDS / TPCH queries I wouldn't expect much of a measurable improvement at the macro level. The biggest gains are going to be in queries that make heavy use of RowBlock / ArrayBlock / DictionaryBlock combinations. I just tweaked the BenchmarkUnnestOperator implementation to call operatorContext.addProcessedInput(inputPage.getSizeInBytes(), inputPage.getPositionCount()), and to do the same for the output page. That results in roughly a 45% throughput improvement on my laptop (e.g. ~6.5 ops/second to ~10.0 ops/second).

losipiuk · 2022-03-01T02:11:41Z

I haven't had a chance to put these changes into our production environment, and in TPCDS / TPCH queries I wouldn't expect much of a measurable improvement at the macro level. The biggest gains are going to be in queries that make heavy use of RowBlock / ArrayBlock / DictionaryBlock combinations. I just tweaked the BenchmarkUnnestOperator implementation to call operatorContext.addProcessedInput(inputPage.getSizeInBytes(), inputPage.getPositionCount()), and to do the same for the output page. That results in roughly a 45% throughput improvement on my laptop (e.g. ~6.5 ops/second to ~10.0 ops/second).

Nice. Thanks.

cla-bot bot added the cla-signed label Feb 7, 2022

findepi requested a review from sopel39 February 8, 2022 09:52

pettyjamesm force-pushed the improve-dictionary-size-calculation branch from d2878b6 to 533f5f0 February 8, 2022 14:44

pettyjamesm marked this pull request as ready for review February 8, 2022 15:16

pettyjamesm force-pushed the improve-dictionary-size-calculation branch from 805d041 to d6d3860 February 8, 2022 19:26

sopel39 requested a review from skrzypo987 February 8, 2022 22:12

pettyjamesm force-pushed the improve-dictionary-size-calculation branch 2 times, most recently from 1c8e4cd to d56413d February 9, 2022 18:21

skrzypo987 reviewed Feb 10, 2022

core/trino-spi/src/main/java/io/trino/spi/block/AbstractMapBlock.java (outdated, resolved)

core/trino-spi/src/main/java/io/trino/spi/block/Block.java (outdated, resolved)

pettyjamesm force-pushed the improve-dictionary-size-calculation branch 3 times, most recently from 9064e23 to 105467f February 10, 2022 23:31

skrzypo987 approved these changes Feb 15, 2022

core/trino-spi/src/main/java/io/trino/spi/block/Block.java (outdated, resolved)

pettyjamesm requested a review from dain February 15, 2022 18:52

losipiuk reviewed Feb 22, 2022

core/trino-spi/src/main/java/io/trino/spi/block/BlockUtil.java (outdated, resolved)

losipiuk reviewed Feb 22, 2022

core/trino-spi/src/main/java/io/trino/spi/block/BlockUtil.java (outdated, resolved)

losipiuk reviewed Feb 22, 2022

core/trino-spi/src/main/java/io/trino/spi/block/DictionaryBlock.java (outdated, resolved)

losipiuk approved these changes Feb 22, 2022

pettyjamesm force-pushed the improve-dictionary-size-calculation branch 2 times, most recently from dcbbe4d to d992eb0 February 23, 2022 23:46

pettyjamesm added 3 commits February 24, 2022 09:45

Populate size and uniqueIds inside of DictionaryBlock#getPositions

3213303

Add more DictionaryBlock benchmarks

e42636b

Refactor and simplify all nested DictionaryBlock size methods

79ea564

pettyjamesm force-pushed the improve-dictionary-size-calculation branch from d992eb0 to 79ea564 February 24, 2022 14:45

pettyjamesm mentioned this pull request Feb 24, 2022

Reduce overheads related to DictionaryBlock#getSizeInBytes() prestodb/presto#17349

Merged

losipiuk merged commit 7b96c8d into trinodb:master Mar 1, 2022

pettyjamesm deleted the improve-dictionary-size-calculation branch March 1, 2022 02:36

github-actions bot added this to the 372 milestone Mar 1, 2022

mosabua mentioned this pull request Mar 1, 2022

Add Trino 372 release notes #11094

Merged

takezoe mentioned this pull request Mar 2, 2022

Add operator cache and coercion cache to reduce planning time treasure-data/presto#43

Merged
