Push down predicate involving timestamp_tz in Delta Lake#19874
Push down predicate involving timestamp_tz in Delta Lake#19874kasiafi merged 5 commits intotrinodb:masterfrom
Conversation
86b1abf to
71003a0
Compare
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/ConstraintExtractor.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/ConstraintExtractor.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/UtcConstraintExtractor.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/UtcConstraintExtractor.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/QueryAssertions.java
Outdated
Show resolved
Hide resolved
lib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/filter/UtcConstraintExtractor.java
Outdated
Show resolved
Hide resolved
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-delta-lake/src/test/java/io/trino/plugin/deltalake/TestUtcConstraintExtractor.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestConstraintExtractor.java
Outdated
Show resolved
Hide resolved
Rename ConstraintExtractor to UtcConstraintExtractor
25e9cee to
869594b
Compare
| * {@link TimestampWithTimeZoneType}. Such unwrap would not be monotonic. Within Iceberg, we know | ||
| * Test equivalent of {@code io.trino.sql.planner.iterative.rule.UnwrapCastInComparison} for {@link TimestampWithTimeZoneType}. | ||
| * {@code UnwrapCastInComparison} handles {@link DateType} and {@link TimestampType}, but cannot handle | ||
| * {@link TimestampWithTimeZoneType}. Such unwrap would not be monotonic. If we know |
There was a problem hiding this comment.
| * {@link TimestampWithTimeZoneType}. Such unwrap would not be monotonic. If we know | |
| * {@link TimestampWithTimeZoneType}. Such unwrap would not be monotonic. We know |
There was a problem hiding this comment.
same for the other occurences
There was a problem hiding this comment.
If the "if" was removed, it would suggest that this is always true that the time zone is UTC. It can be only guaranteed per connector, and the optimization only applies to those connectors.
There was a problem hiding this comment.
Reduce code redundancy with the method testExtractTimestampTzMicrosDateComparison
There was a problem hiding this comment.
When writing this, I experimented with extracting common code for similar test methods. I eventually settled for more repetition, as it improves readability a lot.
There was a problem hiding this comment.
optional: This assertion can be added as well.
assertThat(query("SELECT * FROM " + tableName + " WHERE date_trunc('hour', part) >= TIMESTAMP '2005-09-10 00:00:00.001 +07:00'"))
.isFullyPushedDown();
There was a problem hiding this comment.
Added a similar test case, thanks!
869594b to
3964149
Compare
|
@kasiafi and others... do we need to add a pushdown section to the performance section in the Delta Lake connector docs .. some here https://trino.io/docs/current/connector/delta-lake.html#performance Similar to how we do it for JDBC connectors like https://trino.io/docs/current/connector/postgresql.html#pushdown |
I tend to say no - the improvemenst covered by this PR (which btw exist for Iceberg connector as well) are goodies that should just be there - they are "obvious" from user perspective. While we were working on these topics on Iceberg, we've published a blog post about them https://trino.io/blog/2023/04/11/date-predicates.html |
💯 user docs is not the place to brag about various optimization techniques we are using. |
|
Okay .. thanks @findepi and @findinpath |
Description
Certain expressions involving TIMESTAMP WITH TIME ZONE can be optimized in Delta Lake connector based on the fact that columns of type TIMESTAMP WITH TIME ZONE are represented using the UTC time zone:
date_trunc('...', timeztamp_tz_column) <comparison> constantCAST(timestamp_tz_column AS DATE) <comparison> constantyear(timestamp_tz_column) <comparison> constantThe optimization is based on similar optimization in the Iceberg connector.
Fixes #18664
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: