Improve BenchmarkPartitionedOutputOperator #12234

Merged
sopel39 merged 7 commits into trinodb:master from starburstdata:ls/014-poo-benchmark-docs
May 5, 2022

@lukasz-stec lukasz-stec commented May 4, 2022

Description

Introduce a consistent TestType naming convention.
Add descriptions to the TestType cases.
Add more Dictionary test cases.
Add RowType with RLE field case.

Extracted (except for the docs commit) from #11289.

Is this change a fix, improvement, new feature, refactoring, or other?

benchmark refactoring + documentation

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

only BenchmarkPartitionedOutputOperator

How would you describe this change to a non-technical end user or system administrator?

Improved code readability.

Related issues, pull requests, and links

Documentation

( ) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label May 4, 2022
@lukasz-stec lukasz-stec requested a review from sopel39 May 4, 2022 07:20
@lukasz-stec lukasz-stec force-pushed the ls/014-poo-benchmark-docs branch from 2c46d6a to 6f9558c Compare May 4, 2022 11:02
@lukasz-stec lukasz-stec changed the title from "Document BenchmarkPartitionedOutputOperator cases" to "Improve BenchmarkPartitionedOutputOperator" May 4, 2022
Member

@sopel39 sopel39 left a comment

lgtm % nits (optional)

Member

nit: that should be a separate commit

Member Author

done

Member

nit: it would be better to use composition instead of overriding createPage. Right now you pass some arguments to TestType, but you also optionally override a method. With composition, you could reuse createRandomDictionaryPage rather than override createPage in every case that needs it. Composition is also cleaner and more consistent than inheritance.

Member Author

Makes sense. I refactored it to use a new PageGenerator interface.
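The composition approach discussed in this thread can be illustrated with a minimal, self-contained sketch. It does not reproduce the code in this PR: `SketchPage`, `SketchPageGenerator`, `SketchGenerators`, and the enum constants below are hypothetical stand-ins (a page is modelled as a plain `long[]` rather than a Trino `Page`), and the real `PageGenerator` signature may differ. The point is only that each test case receives its generator as a constructor argument, so a dictionary generator can be shared between cases instead of being re-implemented in per-constant overrides.

```java
import java.util.Random;

// Hypothetical stand-in for a page: a single column of long values.
final class SketchPage
{
    final long[] values;

    SketchPage(long[] values)
    {
        this.values = values;
    }
}

// Hypothetical generator interface; the PageGenerator added in the PR may look different.
@FunctionalInterface
interface SketchPageGenerator
{
    SketchPage createPage(int positionCount, Random random);
}

final class SketchGenerators
{
    private SketchGenerators() {}

    // Every position gets an independent random value.
    static SketchPage randomPage(int positionCount, Random random)
    {
        long[] values = new long[positionCount];
        for (int i = 0; i < positionCount; i++) {
            values[i] = random.nextLong();
        }
        return new SketchPage(values);
    }

    // Returns a reusable generator that draws every position from a small dictionary.
    static SketchPageGenerator dictionary(int dictionarySize)
    {
        return (positionCount, random) -> {
            long[] dictionary = new long[dictionarySize];
            for (int i = 0; i < dictionarySize; i++) {
                dictionary[i] = random.nextLong();
            }
            long[] values = new long[positionCount];
            for (int i = 0; i < positionCount; i++) {
                values[i] = dictionary[random.nextInt(dictionarySize)];
            }
            return new SketchPage(values);
        };
    }
}

// Each case is composed from a description and a generator; no constant overrides a method.
enum SketchTestType
{
    BIGINT("random long values", SketchGenerators::randomPage),
    DICTIONARY_BIGINT_SMALL("long values drawn from a 2-entry dictionary", SketchGenerators.dictionary(2)),
    DICTIONARY_BIGINT_LARGE("long values drawn from a 64-entry dictionary", SketchGenerators.dictionary(64));

    private final String description;
    private final SketchPageGenerator generator;

    SketchTestType(String description, SketchPageGenerator generator)
    {
        this.description = description;
        this.generator = generator;
    }

    String getDescription()
    {
        return description;
    }

    SketchPage createPage(int positionCount, long seed)
    {
        return generator.createPage(positionCount, new Random(seed));
    }
}
```

Adding another dictionary case in this sketch is a one-line change (`SketchGenerators.dictionary(n)` with a different size), which is the reuse benefit the review comment points at.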

Member

This should go into the commit that adds the row-wise benchmark. BTW: do you know if it works?

Member Author

@lukasz-stec lukasz-stec May 4, 2022

There is no explicit row-wise benchmark. It has to be set up manually by choosing positionCount ~= partitionCount.

And yes, row-wise code path pollution works; it has a visible influence on benchmark results:

arm

```
### without row-wise pollution
Benchmark                                   (channelCount)  (enableCompression)  (nullRate)  (partitionCount)  (positionCount)  (type)  Mode  Cnt    Score   Error  Units
BenchmarkPartitionedOutputOperator.addPage               2                false           0               512              512  BIGINT  avgt   10  276.219 ± 3.432  ms/op

### with row-wise pollution
Benchmark                                   (channelCount)  (enableCompression)  (nullRate)  (partitionCount)  (positionCount)  (type)  Mode  Cnt    Score   Error  Units
BenchmarkPartitionedOutputOperator.addPage               2                false           0               512              512  BIGINT  avgt   10  352.991 ± 9.117  ms/op
```
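To put the two scores above in proportion, here is a small arithmetic sketch. It is not part of the PR; the class and method names are made up for illustration, and it simply treats each `Score ± Error` pair as an interval. With the numbers reported above, the polluted run comes out roughly 28% slower, and the two intervals do not overlap.

```java
import java.util.Locale;

// Hypothetical helper for comparing two "Score ± Error" results from the table above.
final class SlowdownSketch
{
    private SlowdownSketch() {}

    // Relative slowdown of the polluted run versus the baseline (0.25 means 25% slower).
    static double relativeSlowdown(double baselineMs, double pollutedMs)
    {
        return pollutedMs / baselineMs - 1.0;
    }

    // True when [scoreA - errorA, scoreA + errorA] and [scoreB - errorB, scoreB + errorB] intersect.
    static boolean intervalsOverlap(double scoreA, double errorA, double scoreB, double errorB)
    {
        return scoreA - errorA <= scoreB + errorB && scoreB - errorB <= scoreA + errorA;
    }

    public static void main(String[] args)
    {
        // Values copied from the benchmark output in this comment (ms/op).
        double slowdown = relativeSlowdown(276.219, 352.991);
        System.out.println(String.format(Locale.ROOT, "slowdown: %.1f%%", slowdown * 100));
        System.out.println("intervals overlap: " + intervalsOverlap(276.219, 3.432, 352.991, 9.117));
    }
}
```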

Use PageGenerator composition inside TestType instead of overriding createPage method. It makes the intent clearer, especially if the PageGenerators are re-used.

Extend BenchmarkPartitionedOutputOperator.pollute with row-wise code path pollution

Introduce a consistent TestType naming convention.
Add descriptions to the TestType cases.

@lukasz-stec lukasz-stec force-pushed the ls/014-poo-benchmark-docs branch from 6f9558c to 8e75abd Compare May 4, 2022 13:58
Member Author

@lukasz-stec lukasz-stec left a comment

As suggested, I refactored the benchmark TestType to generate pages via composition instead of overriding the createPage method.

@sopel39 sopel39 merged commit f626817 into trinodb:master May 5, 2022
@github-actions github-actions bot added this to the 380 milestone May 5, 2022
2 participants