
Add time travel syntax support for iceberg tables (VERSION and TIMESTAMP).#20991

Merged
tdcmeehan merged 1 commit into prestodb:master from gupteaj:time_travel_20495
Nov 17, 2023

@gupteaj
Contributor

@gupteaj gupteaj commented Sep 28, 2023

Description

Presto issue: #20495

Note: This PR contains the syntax and engine-side changes. A separate PR will be created for the Iceberg connector changes, tests, and documentation.

This feature allows the Iceberg connector to query historical data using AS OF syntax on a table.
The VERSION option takes a bigint snapshot ID for the table. The TIMESTAMP option takes a timestamp-with-time-zone value and reads the table as of that point in time.
Examples:

select * from tab1 FOR SYSTEM_VERSION AS OF 8772871542276440693;
select * from tab1 FOR SYSTEM_TIME AS OF TIMESTAMP '2023-08-17 13:29:46.822 America/Los_Angeles';

Note: TIMESTAMP and VERSION are aliases for SYSTEM_TIME and SYSTEM_VERSION, respectively.

select * from tab1 FOR VERSION AS OF 8772871542276440693;
select * from tab1 FOR TIMESTAMP AS OF TIMESTAMP '2023-08-17 13:29:46.822 America/Los_Angeles';

Design description:
Parser

  • Support point-in-time queries with AS OF syntax on a table reference:
select * from tab1 FOR VERSION AS OF <bigint type>
select * from tab1 FOR TIMESTAMP AS OF <timestamp with time zone type>
  • In the parser tree, create a new class (extending Expression) to hold the time travel expression
  • Add SqlFormatter and AstBuilder code for the time travel expression
  • Change the parser's table tree node to include the time travel expression
  • Add a new file for the TableVersionExpression class, which holds the TableVersionType and the AS OF expression
  • Add SYSTEM_TIME as an alias for TIMESTAMP and SYSTEM_VERSION as an alias for VERSION
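
The parser items above can be pictured with a small, self-contained sketch. This is not the code from this PR: `Expression`, `RawExpression`, `typeForKeyword`, and `toSql` are hypothetical simplifications, and only the names `TableVersionExpression` and `TableVersionType` come from the description. Normalizing the alias keywords to TIMESTAMP and VERSION when formatting is also an assumption of the sketch.

```java
import java.util.Locale;
import java.util.Objects;

public class TimeTravelAstSketch {
    // The two time travel flavors named in the PR description.
    enum TableVersionType { TIMESTAMP, VERSION }

    // Hypothetical stand-in for the parser's Expression base class.
    interface Expression {
        String toSql();
    }

    // Holds an AS OF operand as raw SQL text to keep the sketch small.
    static final class RawExpression implements Expression {
        private final String sql;

        RawExpression(String sql) {
            this.sql = Objects.requireNonNull(sql, "sql is null");
        }

        @Override
        public String toSql() {
            return sql;
        }
    }

    // Simplified analogue of TableVersionExpression: a version type plus the AS OF expression.
    static final class TableVersionExpression implements Expression {
        private final TableVersionType type;
        private final Expression asOf;

        TableVersionExpression(TableVersionType type, Expression asOf) {
            this.type = Objects.requireNonNull(type, "type is null");
            this.asOf = Objects.requireNonNull(asOf, "asOf is null");
        }

        TableVersionType getType() {
            return type;
        }

        @Override
        public String toSql() {
            // Assumption: the formatter emits the canonical keyword, not the alias.
            return "FOR " + type + " AS OF " + asOf.toSql();
        }
    }

    // Maps a keyword to its version type; SYSTEM_TIME and SYSTEM_VERSION are aliases.
    static TableVersionType typeForKeyword(String keyword) {
        switch (keyword.toUpperCase(Locale.ROOT)) {
            case "TIMESTAMP":
            case "SYSTEM_TIME":
                return TableVersionType.TIMESTAMP;
            case "VERSION":
            case "SYSTEM_VERSION":
                return TableVersionType.VERSION;
            default:
                throw new IllegalArgumentException("Unknown time travel keyword: " + keyword);
        }
    }

    public static void main(String[] args) {
        TableVersionExpression version = new TableVersionExpression(
                typeForKeyword("SYSTEM_VERSION"),
                new RawExpression("8772871542276440693"));
        System.out.println("SELECT * FROM tab1 " + version.toSql());
    }
}
```

Keeping the version type and the AS OF expression together in one node lets the formatter and the analyzer handle both flavors through a single code path.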

Semantic Analyzer

  • Changed visitTable() to process the table version expression and type
  • In the statement analyzer, added a new function, processTableVersion(), that combines all semantic error handling and time travel expression evaluation
  • The TIMESTAMP clause accepts any expression that returns the timestamp-with-time-zone type
  • Added a new getTableHandle function to extract the table version expression and type
  • Return the table snapshot ID based on Metadata functions
  • Added visitTableVersion() in ExpressionFormatter

Metadata

  • Added a getHandleVersion() method that returns an optional table handle
  • Changed getTableHandle() to pass the tableVersionType and the tableVersionExpression object to the connector and Iceberg code
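
As a rough illustration of how a version could be handed to a connector, the sketch below uses a hypothetical `ConnectorMetadataSketch` interface with `String` table handles. It is not the Presto SPI or this PR's signatures. Only the error message text is taken from the output reported later in this conversation, and the use of `UnsupportedOperationException` is an assumption.

```java
import java.util.Objects;
import java.util.Optional;

public class TableVersionMetadataSketch {
    enum TableVersionType { TIMESTAMP, VERSION }

    // Hypothetical value object: the version type plus the already-evaluated AS OF value.
    static final class TableVersion {
        final TableVersionType type;
        final Object value;

        TableVersion(TableVersionType type, Object value) {
            this.type = Objects.requireNonNull(type, "type is null");
            this.value = Objects.requireNonNull(value, "value is null");
        }
    }

    // Hypothetical connector-facing interface; handles are plain strings in this sketch.
    interface ConnectorMetadataSketch {
        String getTableHandle(String tableName);

        // Connectors that do not override this overload reject time travel queries.
        default String getTableHandle(String tableName, Optional<TableVersion> tableVersion) {
            if (tableVersion.isPresent()) {
                // Exception type is an assumption; the message matches the reported error.
                throw new UnsupportedOperationException(
                        "This connector does not support table version AS OF expression");
            }
            return getTableHandle(tableName);
        }
    }

    public static void main(String[] args) {
        ConnectorMetadataSketch plainConnector = name -> "handle:" + name;
        System.out.println(plainConnector.getTableHandle("nodes", Optional.empty()));
        try {
            plainConnector.getTableHandle(
                    "nodes",
                    Optional.of(new TableVersion(TableVersionType.VERSION, 8772871542276440693L)));
        }
        catch (UnsupportedOperationException e) {
            System.out.println("Query failed: " + e.getMessage());
        }
    }
}
```

With a default method that rejects a present version, existing connectors need no changes, and a connector that supports time travel overrides the two-argument overload.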

Limitations:

  • The VERSION syntax accepts only a bigint value or an expression/cast that returns bigint
  • The TIMESTAMP syntax accepts only a timestamp-with-time-zone value or an expression/cast that returns timestamp-with-time-zone
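
The limitations above amount to a type check on the AS OF expression. The following is a minimal sketch of that check, assuming types are represented by their names as plain strings. `checkAsOfType` and the wording of its error message are hypothetical and are not taken from processTableVersion().

```java
public class TableVersionCheckSketch {
    enum TableVersionType { TIMESTAMP, VERSION }

    // Simplified type names; a real analyzer would compare type objects, not strings.
    static final String BIGINT = "bigint";
    static final String TIMESTAMP_WITH_TIME_ZONE = "timestamp with time zone";

    // Rejects an AS OF expression whose type does not match the version type.
    static void checkAsOfType(TableVersionType versionType, String expressionType) {
        String expected = versionType == TableVersionType.VERSION
                ? BIGINT
                : TIMESTAMP_WITH_TIME_ZONE;
        if (!expected.equals(expressionType)) {
            // Hypothetical message wording, for illustration only.
            throw new IllegalArgumentException(String.format(
                    "Type %s is invalid for FOR %s AS OF; expected %s",
                    expressionType, versionType, expected));
        }
    }

    public static void main(String[] args) {
        // Accepted: bigint with VERSION, timestamp with time zone with TIMESTAMP.
        checkAsOfType(TableVersionType.VERSION, BIGINT);
        checkAsOfType(TableVersionType.TIMESTAMP, TIMESTAMP_WITH_TIME_ZONE);
        try {
            checkAsOfType(TableVersionType.VERSION, "varchar");
        }
        catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the check looks at the expression's resulting type rather than its syntactic form, a cast or function call that returns the right type passes in the same way a literal does.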

Motivation and Context

Right now, we only support time travel via system tables and system procedures.

The current support does not include a convenient way to select a particular snapshot or the ability to select a snapshot as of a particular timestamp. Using system tables and system procedures to do this is not convenient, and it also is not safe per-user, since the system table alters the table for everyone, not just the current user.

See the Trino implementation, which has syntax-level support for going back to a particular snapshot via FOR VERSION AS OF, and for going back to a point in time, without first fetching a snapshot ID, via FOR TIMESTAMP AS OF. We should consider a similar feature to make time travel more convenient.

Impact

  • A new table-level option to return a specific snapshot for the Iceberg connector

Test Plan

  • Manual testing
  • A new Iceberg PR will have time travel tests

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

Note: The Iceberg PR will contain the release notes and documentation section.

Contributor

@tdcmeehan tdcmeehan left a comment

What happens if I try to use a connector that doesn't support time travel? Is it an unsupported operation exception?

@gupteaj
Contributor Author

gupteaj commented Oct 20, 2023

What happens if I try to use a connector that doesn't support time travel? Is it an unsupported operation exception?

It will give an error.

presto:runtime> select * from nodes FOR VERSION AS OF 8772871542276440693;
Query 20231018_184346_00038_jrkns failed: This connector does not support table version AS OF expression

presto:runtime> select * from nodes  FOR TIMESTAMP AS OF TIMESTAMP '2023-08-17 13:29:46.822 America/Los_Angeles';
Query 20231018_184533_00039_jrkns failed: This connector does not support table version AS OF expression

@gupteaj gupteaj requested a review from tdcmeehan October 20, 2023 16:43
@gupteaj gupteaj force-pushed the time_travel_20495 branch 3 times, most recently from 14d69a2 to 0ff1b7e Compare October 25, 2023 00:42
Contributor

@steveburnett steveburnett left a comment

Thanks for the documentation! I made some suggestions to simplify the text, and have a question about TIMESTAMP (value) and CURRENT_TIMESTAMP for you to consider.

Let me know what you think, please.

Contributor

@steveburnett steveburnett left a comment

Thanks, looks great! I made a single suggestion in the new text, but everything else is good.

Contributor

@steveburnett steveburnett left a comment

LGTM! (docs)

@gupteaj gupteaj requested a review from hantangwangd October 31, 2023 20:43
@gupteaj gupteaj changed the title Add time travel support for iceberg tables (VERSION and TIMESTAMP). Add time travel syntax support for iceberg tables (VERSION and TIMESTAMP). Nov 3, 2023
@tdcmeehan tdcmeehan marked this pull request as ready for review November 3, 2023 19:46
@tdcmeehan tdcmeehan requested a review from a team as a code owner November 3, 2023 19:46
@tdcmeehan tdcmeehan requested a review from presto-oss November 3, 2023 19:46
@github-actions

github-actions bot commented Nov 3, 2023

Codenotify: Notifying subscribers in CODENOTIFY files for diff e8e0798...ddba425.

Notify File(s)
@aditi-pandit presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4
@elharo presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4
@kaikalur presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4
@rschlussel presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4

@gupteaj gupteaj marked this pull request as draft November 3, 2023 19:49
@gupteaj gupteaj requested a review from tdcmeehan November 3, 2023 21:19
@gupteaj gupteaj marked this pull request as ready for review November 3, 2023 21:26
Member

@hantangwangd hantangwangd left a comment

A few minor problems and a comment for discussion.

@tdcmeehan tdcmeehan self-assigned this Nov 7, 2023
Member

@hantangwangd hantangwangd left a comment

Would you mind adding a test case in which SQL with time travel syntax is parsed into a statement and formatted back?

Member

@hantangwangd hantangwangd left a comment

Thanks, LGTM!

Contributor

@ZacBlanco ZacBlanco left a comment

Two minor questions. Otherwise PR looks good!

@gupteaj
Contributor Author

gupteaj commented Nov 9, 2023

Would you mind adding a test case in which SQL with time travel syntax is parsed into a statement and formatted back?

Added test cases in TestSqlParser

@aditi-pandit
Contributor

Confirming that Prestissimo eval is not immediately affected by this change.

@tdcmeehan : Confirming this PR doesn't affect Prestissimo at this point.

@tdcmeehan tdcmeehan linked an issue Nov 17, 2023 that may be closed by this pull request
@tdcmeehan tdcmeehan merged commit 8242a40 into prestodb:master Nov 17, 2023
@wanglinsong wanglinsong mentioned this pull request Feb 12, 2024
64 tasks
Development

Successfully merging this pull request may close these issues.

Keywords in Time Travel syntax for Presto

6 participants