Improve Iceberg partition predicate enforcement #13239
Force-pushed from 4b74206 to e27be8b
erichwang left a comment
seems reasonable to me
call me ignorant, but under what conditions would table.getSnapshotId() be empty?
Good question. It's only ever empty during new table creation
add a message so that -- if it throws -- we know what assumption was violated
snapshotId = table.getSnapshotId().orElseThrow(() -> new IllegalStateException("no snapshot id"))
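The suggestion above can be sketched in isolation. This is only an illustration of the `Optional.orElseThrow` pattern with a descriptive message; the helper name `requireSnapshotId`, the `tableName` parameter, and the message wording are assumptions for the example, not code from this PR.

```java
import java.util.Optional;

final class SnapshotIdExample {
    // Hypothetical helper: unwraps an optional snapshot id and, if it is absent,
    // fails with a message that names the assumption that was violated.
    static long requireSnapshotId(Optional<Long> snapshotId, String tableName) {
        return snapshotId.orElseThrow(() -> new IllegalStateException(
                "Table " + tableName + " has no snapshot id; this is only expected during table creation"));
    }

    public static void main(String[] args) {
        System.out.println(requireSnapshotId(Optional.of(42L), "orders"));
        try {
            requireSnapshotId(Optional.empty(), "new_table");
        }
        catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Compared with a bare `get()` or `orElseThrow()`, the failure then points at the broken assumption rather than surfacing as a generic `NoSuchElementException`.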
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java
findepi left a comment
LGTM except for the call to allManifests().
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java
i think calling allManifests can be expensive, right?
can we prune them out using the predicates we already know + the new ones?
Getting this list doesn't require reading all of the individual Manifests, just the ManifestList, so it's one Avro file to read
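To illustrate why the manifest list alone is enough here: each entry in it already records which partition spec its manifest was written with (in Iceberg this is exposed as `ManifestFile.partitionSpecId()`), so the set of specs a snapshot uses can be collected without opening any individual manifest. The sketch below uses a hypothetical stand-in type, `ManifestEntry`, instead of the real Iceberg classes, so it shows the idea only, not the connector's code.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class ManifestListSketch {
    // Hypothetical stand-in for one entry of a snapshot's manifest list.
    static final class ManifestEntry {
        final String path;
        final int partitionSpecId;

        ManifestEntry(String path, int partitionSpecId) {
            this.path = path;
            this.partitionSpecId = partitionSpecId;
        }
    }

    // Collects the distinct partition spec ids referenced by the manifest list.
    // Only the list entries are inspected; no manifest contents are read.
    static Set<Integer> specIdsInUse(List<ManifestEntry> manifestList) {
        Set<Integer> ids = new TreeSet<>();
        for (ManifestEntry entry : manifestList) {
            ids.add(entry.partitionSpecId);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<ManifestEntry> manifestList = List.of(
                new ManifestEntry("m1.avro", 1),
                new ManifestEntry("m2.avro", 1),
                new ManifestEntry("m3.avro", 2));
        System.out.println(specIdsInUse(manifestList));
    }
}
```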
@findepi AC, thanks. Had to make one more small change to
Force-pushed from 42996c4 to a14e665
There's an implicit change here: we can now enforce all predicates on empty tables. That broke this test, because the table is empty and the test did not expect a predicate to be enforced.
But doesn't that mean Iceberg will now fail on valid SQL?
Spoke with Konrad offline about this, not a problem
Only partition specs which are used by the Snapshot being queried are relevant when determining if partitions can be used to enforce a Predicate.
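A rough sketch of that rule, including the empty-table case raised in the thread above. It is a simplified model and not the connector's implementation: each partition spec is reduced to the set of columns on which it can fully enforce a predicate, and the names `canEnforce`, `enforceableColumnsBySpec`, and `specIdsInUse` are hypothetical.

```java
import java.util.Map;
import java.util.Set;

final class PartitionEnforcementSketch {
    // A predicate on `column` is treated as enforceable only if every partition spec
    // used by the queried snapshot can enforce it. Specs that exist in table metadata
    // but are not used by the snapshot are ignored.
    static boolean canEnforce(
            String column,
            Map<Integer, Set<String>> enforceableColumnsBySpec,
            Set<Integer> specIdsInUse) {
        // allMatch over an empty set is true, so a snapshot with no data files
        // (no specs in use) lets every predicate be enforced.
        return specIdsInUse.stream()
                .allMatch(id -> enforceableColumnsBySpec.getOrDefault(id, Set.of()).contains(column));
    }

    public static void main(String[] args) {
        // Assumed history: spec 0 partitioned by region only; spec 1 added day.
        Map<Integer, Set<String>> specs = Map.of(
                0, Set.of("region"),
                1, Set.of("region", "day"));

        System.out.println(canEnforce("day", specs, Set.of(1)));
        System.out.println(canEnforce("day", specs, Set.of(0, 1)));
        System.out.println(canEnforce("day", specs, Set.of()));
    }
}
```

In this model, a predicate on `day` is enforceable when the snapshot only contains files written with spec 1, is not enforceable when files from spec 0 are still present, and is trivially enforceable for an empty table.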
Force-pushed from a14e665 to 6c54592
Squashed
Description
Only partition specs which are used by the Snapshot being queried are relevant when determining if partitions can be used to enforce a Predicate.
Improvement
Iceberg connector
Improve filtering on partition columns in Iceberg
Related issues, pull requests, and links
Relates to: #12795
Documentation
(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
(x) No release notes entries required.
( ) Release notes entries required with the following suggested text: