@lukasz-stec lukasz-stec commented Jun 22, 2022

Description

Make AbstractRowBlock#copyPositions branchless to avoid branch misprediction penalties
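
As a rough illustration of the idea (not the code from this PR), the sketch below contrasts a branchy and a branchless way of copying selected positions while skipping null rows. The class name, method names, and the flat `rowIsNull` / `fieldOffsets` layout are assumptions made for the example. The branchless variant always writes the candidate offset and advances the output cursor by 0 or 1, which a JIT compiler can often compile to a conditional move instead of a jump, though that is not guaranteed.

```java
import java.util.Arrays;

// Illustrative sketch only; names and data layout are hypothetical, not Trino's.
final class BranchlessCopySketch {
    // Branchy variant: the if/else is hard to predict when nulls are scattered.
    static int[] copyBranchy(int[] positions, boolean[] rowIsNull, int[] fieldOffsets, boolean[] outIsNull) {
        int[] out = new int[positions.length];
        int count = 0;
        for (int i = 0; i < positions.length; i++) {
            int position = positions[i];
            if (rowIsNull[position]) {
                outIsNull[i] = true;
            }
            else {
                out[count++] = fieldOffsets[position];
            }
        }
        return Arrays.copyOf(out, count);
    }

    // Branchless variant: same result, but no data-dependent branch in the loop body.
    static int[] copyBranchless(int[] positions, boolean[] rowIsNull, int[] fieldOffsets, boolean[] outIsNull) {
        int[] out = new int[positions.length];
        int count = 0;
        for (int i = 0; i < positions.length; i++) {
            int position = positions[i];
            boolean isNull = rowIsNull[position];
            outIsNull[i] = isNull;
            // Unconditional write; if the row is null, the slot is overwritten
            // by the next non-null row or trimmed off at the end.
            out[count] = fieldOffsets[position];
            // Advance the cursor only for non-null rows.
            count += isNull ? 0 : 1;
        }
        return Arrays.copyOf(out, count);
    }

    public static void main(String[] args) {
        int[] positions = {0, 1, 2, 3, 1};
        boolean[] rowIsNull = {false, true, false, true};
        int[] fieldOffsets = {0, 1, 1, 2};
        boolean[] nullsBranchy = new boolean[positions.length];
        boolean[] nullsBranchless = new boolean[positions.length];
        int[] branchy = copyBranchy(positions, rowIsNull, fieldOffsets, nullsBranchy);
        int[] branchless = copyBranchless(positions, rowIsNull, fieldOffsets, nullsBranchless);
        // Both variants should agree on the copied offsets and the null flags.
        System.out.println(Arrays.equals(branchy, branchless) && Arrays.equals(nullsBranchy, nullsBranchless));
    }
}
```

The unconditional write is safe because `count` never exceeds `i`, so the index stays within the output array.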

JMH benchmarks (average time in us/op, lower is better)

Benchmark                             nullsAllowed  selectedPositions  selectedPositionsCount  type         baseline  branchless  branchless%
BenchmarkCopyPositions.copyPositions  FALSE         GROUPED            200                     ROW(BIGINT)  0.484     0.484       0
BenchmarkCopyPositions.copyPositions  FALSE         GROUPED            1000                    ROW(BIGINT)  2.082     2.083       0.04803073967
BenchmarkCopyPositions.copyPositions  FALSE         GROUPED            8000                    ROW(BIGINT)  18.698    18.893      1.042892288
BenchmarkCopyPositions.copyPositions  FALSE         SEQUENCE           200                     ROW(BIGINT)  0.484     0.492       1.652892562
BenchmarkCopyPositions.copyPositions  FALSE         SEQUENCE           1000                    ROW(BIGINT)  2.097     2.102       0.2384358608
BenchmarkCopyPositions.copyPositions  FALSE         SEQUENCE           8000                    ROW(BIGINT)  18.881    18.914      0.1747788782
BenchmarkCopyPositions.copyPositions  FALSE         RANDOM             200                     ROW(BIGINT)  0.489     0.492       0.6134969325
BenchmarkCopyPositions.copyPositions  FALSE         RANDOM             1000                    ROW(BIGINT)  2.181     2.145       -1.650618982
BenchmarkCopyPositions.copyPositions  FALSE         RANDOM             8000                    ROW(BIGINT)  21.12     20.017      -5.222537879
BenchmarkCopyPositions.copyPositions  TRUE          GROUPED            200                     ROW(BIGINT)  0.862     0.894       3.712296984
BenchmarkCopyPositions.copyPositions  TRUE          GROUPED            1000                    ROW(BIGINT)  3.968     4.005       0.9324596774
BenchmarkCopyPositions.copyPositions  TRUE          GROUPED            8000                    ROW(BIGINT)  41.909    27.061      -35.4291441
BenchmarkCopyPositions.copyPositions  TRUE          SEQUENCE           200                     ROW(BIGINT)  0.902     0.94        4.21286031
BenchmarkCopyPositions.copyPositions  TRUE          SEQUENCE           1000                    ROW(BIGINT)  4.389     4.174       -4.898610162
BenchmarkCopyPositions.copyPositions  TRUE          SEQUENCE           8000                    ROW(BIGINT)  41.143    28.406      -30.95787862
BenchmarkCopyPositions.copyPositions  TRUE          RANDOM             200                     ROW(BIGINT)  0.9       0.927       3
BenchmarkCopyPositions.copyPositions  TRUE          RANDOM             1000                    ROW(BIGINT)  4.556     4.17        -8.472344162
BenchmarkCopyPositions.copyPositions  TRUE          RANDOM             8000                    ROW(BIGINT)  41.507    29.262      -29.50104802
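
The `branchless%` column is consistent with the relative change of the branchless score against the baseline, so negative values mean the branchless version is faster. A minimal sketch of that arithmetic, with a hypothetical class and method name chosen for the example:

```java
// Illustrative helper; the class and method names are made up for this example.
final class PercentChangeSketch {
    // Relative change of `candidate` versus `baseline`, in percent.
    static double percentChange(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // nullsAllowed=TRUE, GROUPED, 8000 positions: 41.909 -> 27.061 us/op
        System.out.println(percentChange(41.909, 27.061));
        // nullsAllowed=TRUE, RANDOM, 200 positions: 0.9 -> 0.927 us/op
        System.out.println(percentChange(0.9, 0.927));
    }
}
```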

Is this change a fix, improvement, new feature, refactoring, or other?

improvement

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

core query engine

How would you describe this change to a non-technical end user or system administrator?

Improve read performance with ROW data types

Related issues, pull requests, and links

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
(x) Release notes entries required with the following suggested text:

# Section
* Improve read performance with ROW data types. ({issue}`12926`)

raunaqmorarka previously approved these changes Jun 22, 2022
@raunaqmorarka raunaqmorarka left a comment

minor comments

@lukasz-stec lukasz-stec force-pushed the ls/025-row-block-copy-pos-branchless branch from 7056be1 to b21dbda Compare June 23, 2022 11:17
@lukasz-stec lukasz-stec force-pushed the ls/025-row-block-copy-pos-branchless branch from b21dbda to dfa8df4 Compare June 23, 2022 11:17
@raunaqmorarka raunaqmorarka merged commit 9bd9dc3 into trinodb:master Jun 23, 2022
@github-actions github-actions bot added this to the 388 milestone Jun 23, 2022
martint commented Jun 23, 2022

@lukasz-stec, please include the error bars whenever showing benchmark results. It's important to understand the variance to see if the results are significant.

@findepi findepi deleted the ls/025-row-block-copy-pos-branchless branch June 23, 2022 15:34
findepi commented Jun 23, 2022

Congrats @raunaqmorarka on your first merge! 🎉

@lukasz-stec

> @lukasz-stec, please include the error bars whenever showing benchmark results. It's important to understand the variance to see if the results are significant.

Sure, full JMH results for future reference are below.

before
Benchmark                             (nullsAllowed)  (selectedPositions)  (selectedPositionsCount)       (type)  Mode  Cnt   Score   Error  Units
BenchmarkCopyPositions.copyPositions           false              GROUPED                       200  ROW(BIGINT)  avgt   20   0.484 ± 0.003  us/op
BenchmarkCopyPositions.copyPositions           false              GROUPED                      1000  ROW(BIGINT)  avgt   20   2.082 ± 0.015  us/op
BenchmarkCopyPositions.copyPositions           false              GROUPED                      8000  ROW(BIGINT)  avgt   20  18.698 ± 0.116  us/op
BenchmarkCopyPositions.copyPositions           false             SEQUENCE                       200  ROW(BIGINT)  avgt   20   0.484 ± 0.004  us/op
BenchmarkCopyPositions.copyPositions           false             SEQUENCE                      1000  ROW(BIGINT)  avgt   20   2.097 ± 0.016  us/op
BenchmarkCopyPositions.copyPositions           false             SEQUENCE                      8000  ROW(BIGINT)  avgt   20  18.881 ± 0.190  us/op
BenchmarkCopyPositions.copyPositions           false               RANDOM                       200  ROW(BIGINT)  avgt   20   0.489 ± 0.004  us/op
BenchmarkCopyPositions.copyPositions           false               RANDOM                      1000  ROW(BIGINT)  avgt   20   2.181 ± 0.091  us/op
BenchmarkCopyPositions.copyPositions           false               RANDOM                      8000  ROW(BIGINT)  avgt   20  21.120 ± 0.946  us/op
BenchmarkCopyPositions.copyPositions            true              GROUPED                       200  ROW(BIGINT)  avgt   20   0.862 ± 0.047  us/op
BenchmarkCopyPositions.copyPositions            true              GROUPED                      1000  ROW(BIGINT)  avgt   20   3.968 ± 0.281  us/op
BenchmarkCopyPositions.copyPositions            true              GROUPED                      8000  ROW(BIGINT)  avgt   20  41.909 ± 1.993  us/op
BenchmarkCopyPositions.copyPositions            true             SEQUENCE                       200  ROW(BIGINT)  avgt   20   0.902 ± 0.010  us/op
BenchmarkCopyPositions.copyPositions            true             SEQUENCE                      1000  ROW(BIGINT)  avgt   20   4.389 ± 0.073  us/op
BenchmarkCopyPositions.copyPositions            true             SEQUENCE                      8000  ROW(BIGINT)  avgt   20  41.143 ± 2.505  us/op
BenchmarkCopyPositions.copyPositions            true               RANDOM                       200  ROW(BIGINT)  avgt   20   0.900 ± 0.004  us/op
BenchmarkCopyPositions.copyPositions            true               RANDOM                      1000  ROW(BIGINT)  avgt   20   4.556 ± 0.144  us/op
BenchmarkCopyPositions.copyPositions            true               RANDOM                      8000  ROW(BIGINT)  avgt   20  41.507 ± 2.424  us/op

after
Benchmark                             (nullsAllowed)  (selectedPositions)  (selectedPositionsCount)       (type)  Mode  Cnt   Score   Error  Units
BenchmarkCopyPositions.copyPositions           false              GROUPED                       200  ROW(BIGINT)  avgt   20   0.484 ± 0.006  us/op
BenchmarkCopyPositions.copyPositions           false              GROUPED                      1000  ROW(BIGINT)  avgt   20   2.083 ± 0.014  us/op
BenchmarkCopyPositions.copyPositions           false              GROUPED                      8000  ROW(BIGINT)  avgt   20  18.893 ± 0.261  us/op
BenchmarkCopyPositions.copyPositions           false             SEQUENCE                       200  ROW(BIGINT)  avgt   20   0.492 ± 0.006  us/op
BenchmarkCopyPositions.copyPositions           false             SEQUENCE                      1000  ROW(BIGINT)  avgt   20   2.102 ± 0.008  us/op
BenchmarkCopyPositions.copyPositions           false             SEQUENCE                      8000  ROW(BIGINT)  avgt   20  18.914 ± 0.173  us/op
BenchmarkCopyPositions.copyPositions           false               RANDOM                       200  ROW(BIGINT)  avgt   20   0.492 ± 0.003  us/op
BenchmarkCopyPositions.copyPositions           false               RANDOM                      1000  ROW(BIGINT)  avgt   20   2.145 ± 0.022  us/op
BenchmarkCopyPositions.copyPositions           false               RANDOM                      8000  ROW(BIGINT)  avgt   20  20.017 ± 0.904  us/op
BenchmarkCopyPositions.copyPositions            true              GROUPED                       200  ROW(BIGINT)  avgt   20   0.894 ± 0.048  us/op
BenchmarkCopyPositions.copyPositions            true              GROUPED                      1000  ROW(BIGINT)  avgt   20   4.005 ± 0.214  us/op
BenchmarkCopyPositions.copyPositions            true              GROUPED                      8000  ROW(BIGINT)  avgt   20  27.061 ± 1.552  us/op
BenchmarkCopyPositions.copyPositions            true             SEQUENCE                       200  ROW(BIGINT)  avgt   20   0.940 ± 0.021  us/op
BenchmarkCopyPositions.copyPositions            true             SEQUENCE                      1000  ROW(BIGINT)  avgt   20   4.174 ± 0.111  us/op
BenchmarkCopyPositions.copyPositions            true             SEQUENCE                      8000  ROW(BIGINT)  avgt   20  28.406 ± 0.198  us/op
BenchmarkCopyPositions.copyPositions            true               RANDOM                       200  ROW(BIGINT)  avgt   20   0.927 ± 0.007  us/op
BenchmarkCopyPositions.copyPositions            true               RANDOM                      1000  ROW(BIGINT)  avgt   20   4.170 ± 0.131  us/op
BenchmarkCopyPositions.copyPositions            true               RANDOM                      8000  ROW(BIGINT)  avgt   20  29.262 ± 0.159  us/op
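
One simple and conservative way to read the error bars is to check whether the `Score ± Error` intervals of the two runs overlap. The sketch below does that for a few rows from the tables above; the class and method names are hypothetical. Non-overlapping intervals suggest a real difference, while overlapping ones mean the difference may be within noise. This is not a formal significance test.

```java
// Illustrative sketch; names are made up for this example.
final class IntervalOverlapSketch {
    // True when [scoreA - errorA, scoreA + errorA] and [scoreB - errorB, scoreB + errorB] intersect.
    static boolean intervalsOverlap(double scoreA, double errorA, double scoreB, double errorB) {
        double lowA = scoreA - errorA;
        double highA = scoreA + errorA;
        double lowB = scoreB - errorB;
        double highB = scoreB + errorB;
        return lowA <= highB && lowB <= highA;
    }

    public static void main(String[] args) {
        // nullsAllowed=true, GROUPED, 8000: 41.909 ± 1.993 before, 27.061 ± 1.552 after
        System.out.println(intervalsOverlap(41.909, 1.993, 27.061, 1.552));
        // nullsAllowed=true, GROUPED, 200: 0.862 ± 0.047 before, 0.894 ± 0.048 after
        System.out.println(intervalsOverlap(0.862, 0.047, 0.894, 0.048));
        // nullsAllowed=false, RANDOM, 8000: 21.120 ± 0.946 before, 20.017 ± 0.904 after
        System.out.println(intervalsOverlap(21.120, 0.946, 20.017, 0.904));
    }
}
```

By this check the 8000-position results with nulls allowed are clearly separated, while the small slowdowns at 200 positions with GROUPED selection and the small gain for RANDOM without nulls fall inside overlapping intervals.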
