Rewrite date_trunc in BETWEEN predicate#14451

Closed
findinpath wants to merge 1 commit into trinodb:master from findinpath:rewrite-date-trunc-in-between-predicate

Conversation

@findinpath
Contributor

@findinpath findinpath commented Oct 4, 2022

Description

This change allows the engine to infer that, for instance,
given t::timestamp(6)

    date_trunc('day', t) BETWEEN TIMESTAMP '2022-01-01 00:00:00' AND TIMESTAMP '2022-01-02 00:00:00'

can be rewritten as

    t >= TIMESTAMP '2022-01-01 00:00:00' AND t <= TIMESTAMP '2022-01-02 23:59:59.999999'
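
As a rough illustration of the bound arithmetic behind this rewrite, here is a minimal Java sketch for the 'day' unit and timestamp(6) only. The class and method names (DateTruncBetweenSketch, lowerBound, upperBound) are hypothetical and not part of Trino, and java.time.LocalDateTime merely stands in for the engine's timestamp representation.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical sketch, not Trino code: derives inclusive bounds on t that are
// equivalent to  date_trunc('day', t) BETWEEN low AND high  for timestamp(6).
final class DateTruncBetweenSketch {
    // date_trunc('day', t) >= low holds exactly when t is at or after the
    // first day boundary that is not before low.
    static LocalDateTime lowerBound(LocalDateTime low) {
        LocalDateTime dayStart = low.truncatedTo(ChronoUnit.DAYS);
        return dayStart.equals(low) ? dayStart : dayStart.plusDays(1);
    }

    // date_trunc('day', t) <= high holds exactly when t falls on or before the
    // day containing high, i.e. up to that day's last microsecond.
    static LocalDateTime upperBound(LocalDateTime high) {
        return high.truncatedTo(ChronoUnit.DAYS)
                .plusDays(1)
                .minus(1, ChronoUnit.MICROS);
    }

    public static void main(String[] args) {
        LocalDateTime low = LocalDateTime.parse("2022-01-01T00:00:00");
        LocalDateTime high = LocalDateTime.parse("2022-01-02T00:00:00");
        System.out.println("t >= " + lowerBound(low) + " AND t <= " + upperBound(high));
    }
}
```

If the computed lower bound ends up after the upper bound (for example when both literals fall strictly inside the same day), no value of t can satisfy the original predicate.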

The change applies to the following temporal types:

  • date
  • timestamp
  • timestamp with time zone

The BetweenPredicate range predicate can be transformed into a TupleDomain
and thus help with predicate pushdown.
A range-based TupleDomain representation is critical for connectors
that have min/max-based metadata (such as Iceberg manifest lists, which
play a key role in partition pruning, or Iceberg data files), because ranges
allow for intersection tests, something that is hard
to do in a generic manner for a ConnectorExpression.
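
To make the intersection test concrete, the following Java sketch shows how a closed range on t could be checked against per-file min/max values. It is an assumption-laden illustration: FileStats, overlaps and filesToRead are hypothetical names, and the code does not use Trino's TupleDomain or Iceberg's metadata APIs.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Trino or Iceberg code: prunes files whose
// [min, max] statistics cannot intersect the predicate range [low, high].
final class MinMaxPruningSketch {
    // Stand-in for min/max metadata kept per data file or manifest entry.
    static final class FileStats {
        final String path;
        final LocalDateTime min;
        final LocalDateTime max;

        FileStats(String path, LocalDateTime min, LocalDateTime max) {
            this.path = path;
            this.min = min;
            this.max = max;
        }
    }

    // Two closed ranges intersect when each one starts no later than the other ends.
    static boolean overlaps(LocalDateTime low, LocalDateTime high,
                            LocalDateTime min, LocalDateTime max) {
        return !low.isAfter(max) && !high.isBefore(min);
    }

    // Keeps only the files that may contain rows with low <= t <= high.
    static List<String> filesToRead(List<FileStats> files,
                                    LocalDateTime low, LocalDateTime high) {
        List<String> result = new ArrayList<>();
        for (FileStats file : files) {
            if (overlaps(low, high, file.min, file.max)) {
                result.add(file.path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<FileStats> files = List.of(
                new FileStats("a.parquet", LocalDateTime.parse("2021-12-30T00:00:00"), LocalDateTime.parse("2021-12-31T23:59:59")),
                new FileStats("b.parquet", LocalDateTime.parse("2022-01-01T10:00:00"), LocalDateTime.parse("2022-01-03T00:00:00")),
                new FileStats("c.parquet", LocalDateTime.parse("2022-01-03T00:00:00"), LocalDateTime.parse("2022-01-05T00:00:00")));
        System.out.println(filesToRead(files,
                LocalDateTime.parse("2022-01-01T00:00:00"),
                LocalDateTime.parse("2022-01-02T23:59:59.999999")));
    }
}
```

A predicate left in the form date_trunc('day', t) BETWEEN ... offers no such range on t itself, which is why the rewrite matters for this kind of check.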

Fixes #14293

This is a spin-off from #14390

Non-technical explanation

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Main
* Improve partition and data pruning when comparing the output of the `date_trunc` function with ranges

@cla-bot cla-bot bot added the cla-signed label Oct 4, 2022
@findinpath findinpath changed the title from "Rewrite date trunc in between predicate" to "Rewrite date_trunc in BETWEEN predicate" Oct 4, 2022
This change allows the engine to infer that, for instance,
given t::timestamp(6)

    date_trunc('day', t) BETWEEN TIMESTAMP '2022-01-01 00:00:00' AND TIMESTAMP '2022-01-02 00:00:00'

can be rewritten as

    t >= TIMESTAMP '2022-01-01 00:00:00' AND t <= TIMESTAMP '2022-01-02 23:59:59.999999'

The change applies to the following temporal types:
- date
- timestamp
- timestamp with time zone

The `BetweenPredicate` range predicate can be transformed into a `TupleDomain`
and thus help with predicate pushdown.
A range-based `TupleDomain` representation is critical for connectors
that have min/max-based metadata (such as Iceberg manifest lists, which
play a key role in partition pruning, or Iceberg data files), because ranges
allow for intersection tests, something that is hard
to do in a generic manner for a `ConnectorExpression`.
@findinpath findinpath force-pushed the rewrite-date-trunc-in-between-predicate branch from 30cd2d6 to f6d4578 October 4, 2022 21:55
@findinpath findinpath requested a review from findepi October 4, 2022 21:57
@findinpath findinpath marked this pull request as draft August 11, 2023 13:59
@mosabua
Member

mosabua commented Jan 12, 2024

@findinpath is this still in progress/valid?

@findinpath
Contributor Author

This would be a follow-up after #14648

@github-actions

This pull request has gone a while without any activity. Tagging for triage help: @mosabua

@github-actions github-actions bot added the stale label Feb 11, 2025
@github-actions

github-actions bot commented Mar 5, 2025

Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.

@github-actions github-actions bot closed this Mar 5, 2025
Development

Successfully merging this pull request may close these issues.

date_trunc range optimization should apply also for BETWEEN predicate
