Pre-allocate definition & repetition levels; perf improvement#11675
theosib-amazon wants to merge 2 commits into trinodb:master
Testing shows a 5% to 15% performance improvement on many queries when capacity is pre-reserved for definitionLevels and repetitionLevels.
Pre-allocate capacity for definition & repetition levels

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please submit the signed CLA to cla@trino.io. For more information, see https://github.com/trinodb/cla.

I submitted my CLA a week ago; hopefully that will be acted upon soon.

raunaqmorarka left a comment:
Please get rid of the merge commit.

@cla-bot check

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please submit the signed CLA to cla@trino.io. For more information, see https://github.com/trinodb/cla.

The cla-bot has been summoned, and re-checked this pull request!

@martint Have you received a CLA from @theosib-amazon?

Should I resubmit my CLA some other way?

@cla-bot check

The cla-bot has been summoned, and re-checked this pull request!

Merged as 086d56f. Thanks!

Description
I profiled Trino while running TPC-DS queries against a Parquet source and noticed that many of them were wasting time on array copies while growing the capacity of the definitionLevels and repetitionLevels integer array lists. Pre-reserving capacity sped up many queries, some by as much as 15%.
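
To illustrate why pre-reserving helps, here is a minimal, self-contained sketch. It is not the code from this PR: `IntList`, `countCopies`, and the initial capacity of 16 are hypothetical stand-ins for the reader's integer array lists, and the doubling growth policy is an assumption. The sketch only counts how many times the backing array is reallocated and copied while a list is filled, with and without reserving capacity up front.

```java
import java.util.Arrays;

final class LevelsPreallocationSketch {
    /** Hypothetical growable int list standing in for definitionLevels / repetitionLevels. */
    static final class IntList {
        private int[] data;
        private int size;
        private int copies; // how many times the backing array was reallocated and copied

        IntList(int initialCapacity) {
            data = new int[initialCapacity];
        }

        /** Grows the backing array (at least doubling it) if it cannot hold `capacity` values. */
        void ensureCapacity(int capacity) {
            if (capacity > data.length) {
                data = Arrays.copyOf(data, Math.max(capacity, data.length * 2));
                copies++;
            }
        }

        void add(int value) {
            ensureCapacity(size + 1);
            data[size++] = value;
        }

        int size() {
            return size;
        }

        int copies() {
            return copies;
        }
    }

    /** Returns the number of array copies that happen while appending valueCount levels. */
    static int countCopies(int valueCount, boolean preAllocate) {
        IntList definitionLevels = new IntList(16);
        if (preAllocate) {
            // Reserve room for every value once, before the fill loop starts.
            definitionLevels.ensureCapacity(valueCount);
        }
        int copiesBeforeFill = definitionLevels.copies();
        for (int i = 0; i < valueCount; i++) {
            definitionLevels.add(i % 2);
        }
        return definitionLevels.copies() - copiesBeforeFill;
    }

    public static void main(String[] args) {
        int valueCount = 100_000;
        System.out.println("copies while filling, no pre-allocation: " + countCopies(valueCount, false));
        System.out.println("copies while filling, pre-allocated: " + countCopies(valueCount, true));
    }
}
```

With pre-allocation the fill loop never reallocates, so the cost is one up-front allocation instead of a series of progressively larger copies. How much this matters in practice depends on how many values each page or batch holds.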
Improvement
Change to the Parquet reader component
Speed up Parquet column reading by pre-reserving capacity in some of the containers.
Related issues, pull requests, and links
Performance optimization
Documentation
(x) No documentation is needed.
Release notes
( ) No release notes entries required.
(x) Release notes entries required with the following suggested text: