Perform range optimization for BETWEEN predicate on date_trunc and temporal casts #14390
findinpath wants to merge 4 commits into trinodb:master
CI hit #11140
…expression
This change allows the engine to infer that, for instance,
given t::timestamp(6)
date_trunc('day', t) BETWEEN TIMESTAMP '2022-01-01 00:00:00' AND TIMESTAMP '2022-01-02 00:00:00'
can be rewritten as
t BETWEEN TIMESTAMP '2022-01-01 00:00:00' AND TIMESTAMP '2022-01-02 23:59:59.999999'
The change applies for the temporal types:
- date
- timestamp
- timestamp with time zone
Range predicate BetweenPredicate can be transformed into a `TupleDomain`
and thus help with predicate pushdown.
Range-based `TupleDomain` representation is critical for connectors
which have min/max-based metadata (like Iceberg manifest lists, which
play a key role in partition pruning, or Iceberg data files), as ranges allow
for intersection tests, something that is hard
to do in a generic manner for `ConnectorExpression`.
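The upper bound in the example above can be reproduced with a small sketch. It is not the Trino implementation: `DateTruncRangeSketch` and `dayRangeEndInclusive` are hypothetical names, `java.time.LocalDateTime` stands in for `timestamp(6)`, and only the `'day'` unit is covered.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not Trino code: models timestamp(6) with LocalDateTime.
class DateTruncRangeSketch
{
    // Largest timestamp(6) value t for which date_trunc('day', t) <= high.
    static LocalDateTime dayRangeEndInclusive(LocalDateTime high)
    {
        return high.truncatedTo(ChronoUnit.DAYS) // start of the day containing `high`
                .plusDays(1)                     // start of the following day (exclusive end)
                .minus(1, ChronoUnit.MICROS);    // step back one microsecond (precision 6)
    }
}
```

For `high = 2022-01-02 00:00:00` this gives `2022-01-02 23:59:59.999999`, the upper bound shown in the rewritten predicate. The lower bound stays unchanged in this sketch because the example literal is already aligned to a day boundary.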
This change allows the engine to infer that, for instance,
given t::timestamp(6)
cast(t as date) BETWEEN DATE '2022-01-01' AND DATE '2022-01-02'
can be rewritten as
t BETWEEN TIMESTAMP '2022-01-01 00:00:00' AND TIMESTAMP '2022-01-02 23:59:59.999999'
The change applies for the temporal types:
- date
- timestamp
- timestamp with time zone
Range predicate BetweenPredicate can be transformed into a `TupleDomain`
and thus help with predicate pushdown.
Range-based `TupleDomain` representation is critical for connectors
which have min/max-based metadata (like Iceberg manifest lists, which
play a key role in partition pruning, or Iceberg data files), as ranges allow
for intersection tests, something that is hard
to do in a generic manner for `ConnectorExpression`.
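The cast case follows the same pattern. Below is a minimal sketch, assuming `timestamp(6)` is modeled with `java.time.LocalDateTime` and `date` with `java.time.LocalDate`; the class and method names are hypothetical and do not come from the PR.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not Trino code: bounds on t equivalent to
// cast(t as date) BETWEEN low AND high, for t of type timestamp(6).
class CastToDateRangeSketch
{
    // Smallest t whose date is >= low: midnight at the start of `low`.
    static LocalDateTime rangeStart(LocalDate low)
    {
        return low.atStartOfDay();
    }

    // Largest t whose date is <= high: one microsecond before the next midnight.
    static LocalDateTime rangeEndInclusive(LocalDate high)
    {
        return high.plusDays(1).atStartOfDay().minus(1, ChronoUnit.MICROS);
    }
}
```

With `low = 2022-01-01` and `high = 2022-01-02` the bounds are `2022-01-01 00:00:00` and `2022-01-02 23:59:59.999999`, matching the rewritten predicate in the example.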
  verify(longTimestamp.getPicosOfMicro() == 0, "Unexpected picos in %s, value not rounded to %s", rangeStart, rangeUnit);
- long endInclusiveMicros = (long) calculateRangeEndInclusive(longTimestamp.getEpochMicros(), createTimestampType(6), rangeUnit);
- return new LongTimestamp(endInclusiveMicros, toIntExact(PICOSECONDS_PER_MICROSECOND - scaleFactor(timestampType.getPrecision(), 12)));
+ long endInclusiveMicros = (long) calculateRangeEndInclusive(longTimestamp.getEpochMicros(), createTimestampType(TimestampType.MAX_SHORT_PRECISION), rangeUnit);
the variable name is "endInclusiveMicros"
the code used 6, and it's known that 10^(-6) s is a microsecond.
after the change the code uses TimestampType.MAX_SHORT_PRECISION. it's not obvious that this is correct (is short precision actually microseconds?). thus, this change actually decreases readability
- long endInclusiveMicros = (long) calculateRangeEndInclusive(longTimestamp.getEpochMicros(), createTimestampType(6), rangeUnit);
- return new LongTimestamp(endInclusiveMicros, toIntExact(PICOSECONDS_PER_MICROSECOND - scaleFactor(timestampType.getPrecision(), 12)));
+ long endInclusiveMicros = (long) calculateRangeEndInclusive(longTimestamp.getEpochMicros(), createTimestampType(TimestampType.MAX_SHORT_PRECISION), rangeUnit);
+ return new LongTimestamp(endInclusiveMicros, toIntExact(PICOSECONDS_PER_MICROSECOND - scaleFactor(timestampType.getPrecision(), TimestampType.MAX_PRECISION)));
similar here. the use of PICOSECONDS_PER_MICROSECOND mandates that we know we're dealing with picoseconds, i.e. 10^(-12) s, so it matched the corresponding 12 on this line
after the change, we invoke a "max precision" constant, but we still rely on it having an actual value of 12
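One way to read the two review comments is that the unit should be visible at the call site. The sketch below illustrates the arithmetic with constants named after the unit they encode. Everything here is hypothetical: `scaleFactor` is a stand-in written for this example (assumed to return 10^(to - from), as the call in the diff suggests), and the constant names are not Trino's.

```java
// Hypothetical sketch, not Trino code: the sub-microsecond part of an inclusive
// range end for a long timestamp (precision 7 to 12).
class TimestampPrecisionSketch
{
    static final int MICROSECOND_PRECISION = 6;   // 10^-6 s
    static final int PICOSECOND_PRECISION = 12;   // 10^-12 s
    static final long PICOSECONDS_PER_MICROSECOND = 1_000_000;

    // Stand-in helper: 10^(toPrecision - fromPrecision).
    static long scaleFactor(int fromPrecision, int toPrecision)
    {
        if (fromPrecision > toPrecision) {
            throw new IllegalArgumentException("fromPrecision must not exceed toPrecision");
        }
        long result = 1;
        for (int i = fromPrecision; i < toPrecision; i++) {
            result *= 10;
        }
        return result;
    }

    // Largest picos-of-micro value representable at the given precision,
    // e.g. precision 9 (nanoseconds) gives 999_000 picoseconds.
    static int lastPicosOfMicro(int precision)
    {
        if (precision <= MICROSECOND_PRECISION || precision > PICOSECOND_PRECISION) {
            throw new IllegalArgumentException("expected a precision between 7 and 12");
        }
        return Math.toIntExact(PICOSECONDS_PER_MICROSECOND - scaleFactor(precision, PICOSECOND_PRECISION));
    }
}
```

The subtraction only makes sense when the second argument really is 12, which is the dependency the reviewer points out: a constant named for "max precision" hides the fact that the surrounding code is doing picosecond arithmetic.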
core/trino-main/src/main/java/io/trino/sql/planner/iterative/rule/UnwrapCastInComparison.java
@findinpath let's have unwrapping of CASTs and date_trunc as separate PRs.
Description
This change allows the engine to infer that, for instance,
given t::timestamp(6)
date_trunc('day', t) BETWEEN TIMESTAMP '2022-01-01 00:00:00' AND TIMESTAMP '2022-01-02 00:00:00'
or
cast(t as date) BETWEEN DATE '2022-01-01' AND DATE '2022-01-02'
can be rewritten as
t BETWEEN TIMESTAMP '2022-01-01 00:00:00' AND TIMESTAMP '2022-01-02 23:59:59.999999'
The change applies for the temporal types:
- date
- timestamp
- timestamp with time zone
Range predicate BetweenPredicate can be transformed into a `TupleDomain`
and thus help with predicate pushdown.
Range-based `TupleDomain` representation is critical for connectors
which have min/max-based metadata (like Iceberg manifest lists, which
play a key role in partition pruning, or Iceberg data files), as ranges allow
for intersection tests, something that is hard
to do in a generic manner for `ConnectorExpression`.
Fixes #14293
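The intersection test mentioned in the description can be illustrated with a minimal sketch. It assumes a file or partition exposes inclusive min/max values for a column and that the predicate has already been reduced to an inclusive range; values are plain `long`s (for example epoch microseconds). `MinMaxPruningSketch` is a hypothetical name and does not represent the Iceberg or Trino API.

```java
// Hypothetical sketch, not Iceberg or Trino code: decide whether a file with
// known min/max statistics can contain rows matching a range predicate.
class MinMaxPruningSketch
{
    // Two closed ranges overlap iff each one starts no later than the other ends.
    static boolean mayContainMatches(long fileMin, long fileMax, long predicateLow, long predicateHigh)
    {
        return fileMin <= predicateHigh && predicateLow <= fileMax;
    }
}
```

A file for which `mayContainMatches` returns `false` can be skipped without being read. A predicate that still wraps the column in `date_trunc` or a cast cannot be compared against the column's min/max values this directly.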
Non-technical explanation
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: