Subsume non-identity partition predicate in Iceberg#12795
Subsume non-identity partition predicate in Iceberg#12795findepi merged 4 commits intotrinodb:masterfrom
Conversation
|
Still a draft, so just FYI @alexjo2144 @losipiuk @ebyhr @erichwang @phd3 @findinpath @homar |
9f5c58d to
6316cc7
Compare
|
CI #12726 |
6316cc7 to
89ad178
Compare
03f49a9 to
997709f
Compare
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java
Outdated
Show resolved
Hide resolved
86b162b to
e609691
Compare
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java
Outdated
Show resolved
Hide resolved
f4275ea to
50b834d
Compare
a0d139d to
09f133c
Compare
09f133c to
69a2446
Compare
6bb399c to
7e23ddc
Compare
|
This is ready to review. @alexjo2144 @losipiuk @ebyhr @hashhar @erichwang @martint PTAL |
alexjo2144
left a comment
There was a problem hiding this comment.
Neat way of implementing this. I want to give it another read
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java
Outdated
Show resolved
Hide resolved
erichwang
left a comment
There was a problem hiding this comment.
Mostly nits here, but the logic looks sound.
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/TrinoTypes.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/TrinoTypes.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
if you follow the java Math library naming conventions, this would be called: previousBefore or nextDown.
There was a problem hiding this comment.
I didn't mean to follow the java Math library here.
Which particular methods are you referring to?
("nextDown" still sounds odd to me)
anyway, i think this should be provided by the Type itself, so let's discuss best naming in #12797
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/TrinoTypes.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/PartitionTransforms.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/TrinoTypes.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
7e23ddc to
15998a5
Compare
|
AC |
|
Build is green. Rebasing after #12865 merged. |
Before the change, Iceberg connector accepted all predicates expressible as `TupleDomain` on primitive columns and they were used to filter data files during split generation. However, only predicates defined on identity partitioning columns were subsumed into connector. This commit extends Iceberg capabilities to subsume predicates on partitioning columns. Besides subsuming predicates on identity partitioning columns, it also subsumes predicates if they align with partitioning boundaries. For example, for `truncate(col, 2)` (round to 100s) partitioning, predicates `col >= 1200` OR `col > 1199` are subsumed, while `col > 1200` or `col > 1250` are not. This change is especially important for Iceberg OPTIMIZE table procedure, which requires the `WHERE` condition to be fully subsumed into the connector. It is also helpful for `DELETE`, as it allows to do metadata-only delete in more cases, where we don't really needing to do a row-level delete.
15998a5 to
4b54f26
Compare
Before the change, Iceberg connector accepted all predicates expressible
as
TupleDomainon primitive columns and they were used to filter datafiles during split generation. However, only predicates defined on
identity partitioning columns were subsumed into connector.
This commit extends Iceberg capabilities to subsume predicates on
partitioning columns. Besides subsuming predicates on identity
partitioning columns, it also subsumes predicates if they align with
partitioning boundaries. For example, for
truncate(col, 2)(round to100s) partitioning, predicates
col >= 1200ORcol > 1199aresubsumed, while
col > 1200orcol > 1250are not.This change is especially important for Iceberg OPTIMIZE table
procedure, which requires the
WHEREcondition to be fully subsumedinto the connector. It is also helpful for
DELETE, as it allows to dometadata-only delete in more cases, where we don't really needing to do
a row-level delete.
Fixes #7905
For #12362