Skip to content

Connector changes for time travel BEFORE clause.#22851

Merged
tdcmeehan merged 1 commit intoprestodb:masterfrom
gupteaj:tt_before_connector
Jun 5, 2024
Merged

Connector changes for time travel BEFORE clause.#22851
tdcmeehan merged 1 commit intoprestodb:masterfrom
gupteaj:tt_before_connector

Conversation

@gupteaj
Copy link
Contributor

@gupteaj gupteaj commented May 28, 2024

Presto issue : #21971

Note - This PR has iceberg connector changes, test and doc update. Syntax change for "BEFORE" keyword in presto engine is already merged.

This feature will allow iceberg connector to query historical data using BEFORE syntax on a table.
Time travel version option will read bigint snapshot id value for the table. Time travel timestamp option will read
timestamp-with-time-zone value for the table.

Examples

// Return table data for tab1 before snapshot id 8772871542276440693
select * from tab1 FOR SYSTEM_VERSION BEFORE 8772871542276440693;

// Return table data for tab1 before '2023-08-17 13:29:46.822 America/Los_Angeles'
select * from tab1 FOR SYSTEM_TIME BEFORE TIMESTAMP '2023-08-17 13:29:46.822 America/Los_Angeles';

Iceberg connector

  • change getSnapshotIdAsOfTime() method to getSnapshotIdTimeOperator(), which will return snapshot id based on EQUAL or LESS_THAN operator
  • update getSnapshotIdForTableVersion() to handle BEFORE cases
  • add getVersionOperator() in connector table version

Motivation and Context

Snowflake provides a BEFORE clause for time travel queries. When used in a query, this clause is specified in the FROM clause immediately after the table name. It determines the point in the past from which historical data is requested for the object. The AT keyword includes changes made by a statement or transaction with a timestamp equal to the specified parameter, while the BEFORE keyword refers to a point immediately preceding the specified parameter.

IBM Netezza supports BEFORE clause similar to Snowflake.

Impact

A new table level option to return specific snapshot for iceberg connector

Test Plan

  • Simple cases with FOR VERSION BEFORE and FOR TIMESTAMP BEFORE options on a table
  • Alias SYSTEM_TIME and SYSTEM_VERSION cases
  • Multiple table join query with VERSION and TIMESTAMP syntax
  • Multiple table joins, CTE, Create table as cases
  • Union, Intersect, Except, Subquery and view cases
  • Error handling cases for FOR VERSION BEFORE and FOR TIMESTAMP BEFORE options

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.


== RELEASE NOTES ==

Iceberg Connector Changes
* Add time travel ``BEFORE`` syntax for Iceberg tables to return historical data :pr:`22851`
* Improve time travel ``VERSION (SYSTEM_VERSION)`` syntax to include snapshot id using bigint data type :pr:`22851`
* Improve time travel ``TIMESTAMP (SYSTEM_TIME)`` syntax to include timestamp-with-time-zone data type :pr:`22851`

steveburnett
steveburnett previously approved these changes May 28, 2024
Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (docs)

Pull branch, local docs build, everything looks good. Thanks!

@steveburnett
Copy link
Contributor

Suggest revising the release note entry following the Order of changes and Section naming topics in the Release Notes Guidelines.

== RELEASE NOTES ==

Iceberg Connector Changes
* Add time travel ``BEFORE`` syntax for Iceberg tables to return historical data. 
* Improve time travel ``VERSION (SYSTEM_VERSION)`` syntax to include snapshot id using bigint data type.
* Improve time travel ``TIMESTAMP (SYSTEM_TIME)`` syntax to include timestamp-with-time-zone data type. 

If the examples are valuable, suggest adding them to the Iceberg Connector documentation.

@tdcmeehan tdcmeehan mentioned this pull request May 30, 2024
6 tasks
@steveburnett
Copy link
Contributor

I overlooked "add PR # to the release note entry" in my previous comment, apologies for that!

== RELEASE NOTES ==

Iceberg Connector Changes
* Add time travel ``BEFORE`` syntax for Iceberg tables to return historical data :pr:`22851`
* Improve time travel ``VERSION (SYSTEM_VERSION)`` syntax to include snapshot id using bigint data type :pr:`22851`
* Improve time travel ``TIMESTAMP (SYSTEM_TIME)`` syntax to include timestamp-with-time-zone data type :pr:`22851`

@gupteaj gupteaj marked this pull request as ready for review May 31, 2024 15:35
@gupteaj gupteaj requested a review from presto-oss May 31, 2024 15:35
Copy link
Member

@hantangwangd hantangwangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work, change looks good to me. Only one little thing in test.

Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (docs)

Pull updated branch, review new local docs build, everything looks good. Thanks!

Copy link
Member

@hantangwangd hantangwangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@gupteaj gupteaj force-pushed the tt_before_connector branch from b3e95f8 to 734fc60 Compare June 4, 2024 17:44
Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @gupteaj

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants