Support Iceberg OPTIMIZE with WHERE casting timestamp_tz column to a date#12918
...trino-main/src/main/java/io/trino/sql/planner/iterative/rule/PushPredicateIntoTableScan.java
For the future: I wonder if we can communicate a constraint on a type from the connector to the engine so the standard optimizer can handle that.
Yes, we could do that as part of io.trino.spi.connector.ColumnMetadata.
Or, we could use a dedicated type: #2273
A type may be better because this limitation -- storing point in time only, without time zone -- isn't really specific to Iceberg.
9be3f52 to 412e967
412e967 to 1ad18f7
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/ConstraintExtractor.java
When `IcebergMetadata.applyFilter` is invoked with a `Constraint` that has no useful summary (`TupleDomain`) and carries only an expression or a functional predicate, we can short-circuit the method execution.
Also improve wording.
`unwrapTimestampToDateCast` assumes the target type is `DATE` (and is invoked only when it is), so passing `targetType` is redundant.
The check dates back to the time when `TupleDomain` was the only information passed to the connector in `ConnectorMetadata.applyFilter`. Its purpose was to ensure the connector does not erroneously return new column constraints (`Domain` objects) in `ConstraintApplicationResult.remainingFilter`. With expression-based pushdown, the check is no longer valid. A connector may be able to translate `Constraint.expression` (or part thereof) into a `Domain` / `TupleDomain` and then enforce it, or return such a simplified representation as a remaining `TupleDomain` (`ConstraintApplicationResult.remainingFilter`).
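As a rough illustration of what translating a date-cast predicate into a range constraint could look like, here is a minimal sketch using only `java.time`. It is not the code in `ConstraintExtractor`; the class `DateCastUnwrapSketch`, the `InstantRange` type, and both method names are hypothetical. It assumes the column holds UTC-normalized points in time, so that the cast to date is evaluated at UTC.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical illustration only; these names do not exist in Trino.
final class DateCastUnwrapSketch {
    /** Half-open range [lowInclusive, highExclusive); a null bound means unbounded. */
    static final class InstantRange {
        final Instant lowInclusive;
        final Instant highExclusive;

        InstantRange(Instant lowInclusive, Instant highExclusive) {
            this.lowInclusive = lowInclusive;
            this.highExclusive = highExclusive;
        }

        boolean contains(Instant value) {
            return (lowInclusive == null || !value.isBefore(lowInclusive))
                    && (highExclusive == null || value.isBefore(highExclusive));
        }
    }

    // Assumption: dates are derived at UTC, because values are stored as points in time.
    static Instant startOfDayUtc(LocalDate date) {
        return date.atStartOfDay(ZoneOffset.UTC).toInstant();
    }

    // CAST(ts AS date) > d   is equivalent to   ts >= start of (d + 1 day)
    static InstantRange castToDateGreaterThan(LocalDate d) {
        return new InstantRange(startOfDayUtc(d.plusDays(1)), null);
    }

    // CAST(ts AS date) = d   is equivalent to   start of d <= ts < start of (d + 1 day)
    static InstantRange castToDateEquals(LocalDate d) {
        return new InstantRange(startOfDayUtc(d), startOfDayUtc(d.plusDays(1)));
    }

    public static void main(String[] args) {
        InstantRange range = castToDateGreaterThan(LocalDate.of(2022, 6, 20));
        System.out.println(range.contains(Instant.parse("2022-06-20T23:59:59Z")));
        System.out.println(range.contains(Instant.parse("2022-06-21T00:00:00Z")));
    }
}
```

Once the predicate is expressed as a range over the column itself, it has the same shape as an ordinary column domain, which is what makes it usable for pruning.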
1ad18f7 to d6f2da1
(just rebased to resolve a conflict with #12911)
@findepi Just a quick question:
-- regardless of session zone
ALTER TABLE iceberg_table EXECUTE optimize WHERE CAST(c_timestamp_tz AS date) > a_date_constant
This wouldn't work for Iceberg time partitions since c_timestamp_tz could be in any time zone, right? I thought we had discussed making it look like:
-- regardless of session zone
ALTER TABLE iceberg_table EXECUTE optimize WHERE CAST(c_timestamp_tz AT TIME ZONE 'UTC' AS date) > a_date_constant
To force the same time zone? Side note: that being said, it looks like if we just do
in Iceberg,
Yep, this is already supported, but may be less intuitive to users. Fortunately, it's or, not xor. People can choose whichever they prefer.
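The time-zone concern raised in the question above can be illustrated with a small `java.time` sketch. It assumes that a cast to date takes the date in the value's own zone; the class and method names are hypothetical and not part of Trino or Iceberg. The same instant yields different dates depending on the zone it is viewed in, whereas a value already normalized to UTC yields the same date either way.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical illustration only; these names do not exist in Trino or Iceberg.
final class ZoneDependentDateSketch {
    // Date as seen in the value's own zone (assumed behavior of a plain cast to date).
    static LocalDate dateInOwnZone(ZonedDateTime value) {
        return value.toLocalDate();
    }

    // Date after moving the same instant to UTC (analogous to AT TIME ZONE 'UTC').
    static LocalDate dateAtUtc(ZonedDateTime value) {
        return value.withZoneSameInstant(ZoneOffset.UTC).toLocalDate();
    }

    public static void main(String[] args) {
        // 23:30 at UTC-07:00 is 06:30 on the next day at UTC.
        ZonedDateTime local = ZonedDateTime.of(2022, 6, 20, 23, 30, 0, 0, ZoneOffset.ofHours(-7));
        System.out.println(dateInOwnZone(local)); // 2022-06-20
        System.out.println(dateAtUtc(local));     // 2022-06-21

        // For a value that is already at UTC, both views agree.
        ZonedDateTime utc = local.withZoneSameInstant(ZoneOffset.UTC);
        System.out.println(dateInOwnZone(utc).equals(dateAtUtc(utc))); // true
    }
}
```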
I think we should document this so users know how to leverage this...
Yes, definitely prefer
Follow-up to #12795
Among other things, that PR added support for
This PR adds support for
Further enhances #7905
Fixes #12362