Avoid reading Iceberg delete files when not needed#13395
Merged
findepi merged 2 commits into trinodb:master on Aug 8, 2022
homar
approved these changes
Jul 29, 2022
findepi
approved these changes
Aug 2, 2022
Member
Author
Switched the approach here to just wrap the DeleteFilter reading in a Supplier. I think that reads better.
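For readers following along, the idea of deferring the delete-file read behind a `Supplier` can be sketched roughly as below. This is only an illustration: `LazyDeleteFilterSketch`, `memoize`, `readDeletedPositions`, and `countLiveRows` are hypothetical names and not the code in this PR, and the "delete file" here is just a hard-coded list.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch; names are illustrative and not taken from the Trino code base.
final class LazyDeleteFilterSketch
{
    // Counts how many times the (simulated) delete files are read.
    static final AtomicInteger deleteFileReads = new AtomicInteger();

    // Wraps a supplier so the delegate runs at most once, on first use.
    static <T> Supplier<T> memoize(Supplier<T> delegate)
    {
        return new Supplier<T>()
        {
            private boolean loaded;
            private T value;

            @Override
            public synchronized T get()
            {
                if (!loaded) {
                    value = delegate.get();
                    loaded = true;
                }
                return value;
            }
        };
    }

    // Stand-in for the expensive step of opening and reading delete files.
    static List<Long> readDeletedPositions()
    {
        deleteFileReads.incrementAndGet();
        return List.of(3L, 7L);
    }

    // Returns the number of surviving rows. When the split was already pruned,
    // the supplier is never invoked, so no delete file is read.
    static long countLiveRows(boolean splitPruned, long rowCount, Supplier<List<Long>> deletes)
    {
        if (splitPruned) {
            return 0;
        }
        long deleted = deletes.get().stream()
                .filter(position -> position < rowCount)
                .count();
        return rowCount - deleted;
    }

    public static void main(String[] args)
    {
        Supplier<List<Long>> deletes = memoize(LazyDeleteFilterSketch::readDeletedPositions);
        System.out.println(countLiveRows(true, 10, deletes));
        System.out.println(countLiveRows(false, 10, deletes));
        System.out.println(deleteFileReads.get());
    }
}
```

The point of the wrapper is that callers decide whether to pay for the read by calling `get()`; a pruned split simply never calls it.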
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSourceProvider.java
Outdated
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/delete/DeleteFile.java
Outdated
Member
.array()
do we need to make a defensive copy of these?
Member
Author
Probably. Added a call to clone
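To illustrate why the copy matters: `ByteBuffer.array()` returns the buffer's backing array itself, so anything that later mutates that array also changes a value that was stored from it. A minimal sketch follows; `DefensiveCopySketch`, `shared`, and `copied` are hypothetical names, not the `DeleteFile` code under review.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the aliasing problem; not the code from this PR.
final class DefensiveCopySketch
{
    // Returns the backing array itself, so the caller shares state with the buffer.
    // Note: array() ignores position/limit and throws for read-only or direct buffers.
    static byte[] shared(ByteBuffer buffer)
    {
        return buffer.array();
    }

    // Returns an independent copy of the backing array.
    static byte[] copied(ByteBuffer buffer)
    {
        return buffer.array().clone();
    }

    public static void main(String[] args)
    {
        byte[] original = {1, 2, 3};
        ByteBuffer buffer = ByteBuffer.wrap(original);

        byte[] alias = shared(buffer);
        byte[] copy = copied(buffer);

        // Mutating the original is visible through the alias but not the copy.
        original[0] = 9;
        System.out.println(alias[0]);
        System.out.println(copy[0]);
    }
}
```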
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/delete/DeleteFile.java
Outdated
d3b7369 to 43369ad
Member
Author
Applied comments in fixup commit, thanks @findepi
b676542 to 716b527
findepi
approved these changes
Aug 2, 2022
Member
squashed
Member
@alexjo2144 can you please rebase?
Parquet only. Skip reading the delete files associated with a data file if the deletes are not relevant. This can happen when the statistics from the data file already show the split can be skipped. Additionally, this can happen when the line numbers read by the split are known and can be used to filter positional deletes.
716b527 to 6df8a8b
Description
Parquet only.
Skip reading the delete files associated with a data file if the deletes are
not relevant. This can happen when the statistics from the data file already
show the split can be skipped. Additionally, this can happen when the line
numbers read by the split are known and can be used to filter positional
deletes.
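As a rough illustration of the second case, once the row range covered by a split is known, positional deletes that fall entirely outside that range cannot affect it. The sketch below assumes each positional delete file can be summarized by the smallest and largest row position it deletes; `PositionDeleteRangeSketch`, `DeleteFileBounds`, and `relevantDeleteFiles` are hypothetical names, and the actual filtering in this PR may be done differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the bounds-based summary is an assumption, not the PR's code.
final class PositionDeleteRangeSketch
{
    // Assumed summary of a positional delete file: the smallest and largest
    // row position (inclusive) that it deletes in the data file.
    static final class DeleteFileBounds
    {
        final String path;
        final long minPosition;
        final long maxPosition;

        DeleteFileBounds(String path, long minPosition, long maxPosition)
        {
            this.path = path;
            this.minPosition = minPosition;
            this.maxPosition = maxPosition;
        }
    }

    // Keeps only delete files whose position range overlaps the rows
    // [splitStartRow, splitEndRowExclusive) read by the split.
    static List<DeleteFileBounds> relevantDeleteFiles(
            List<DeleteFileBounds> candidates,
            long splitStartRow,
            long splitEndRowExclusive)
    {
        List<DeleteFileBounds> relevant = new ArrayList<>();
        for (DeleteFileBounds candidate : candidates) {
            boolean overlaps = candidate.maxPosition >= splitStartRow
                    && candidate.minPosition < splitEndRowExclusive;
            if (overlaps) {
                relevant.add(candidate);
            }
        }
        return relevant;
    }

    public static void main(String[] args)
    {
        List<DeleteFileBounds> candidates = List.of(
                new DeleteFileBounds("delete-a", 0, 99),
                new DeleteFileBounds("delete-b", 100, 199),
                new DeleteFileBounds("delete-c", 150, 400));
        for (DeleteFileBounds file : relevantDeleteFiles(candidates, 100, 200)) {
            System.out.println(file.path);
        }
    }
}
```

If the returned list is empty, the split can be read without opening any positional delete file at all.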
Performance improvement
Iceberg connector
Minimize I/O operations
Related issues, pull requests, and links
#13219
Documentation
(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
(x) No release notes entries required.
( ) Release notes entries required with the following suggested text: