From 1fbf385ca538e4806e49f0879491994c56b51ea6 Mon Sep 17 00:00:00 2001
From: Amogh Jahagirdar
Date: Thu, 12 Jan 2023 13:00:45 -0800
Subject: [PATCH] Docs: Add information on how to read from branches and tags in Spark query docs

---
 docs/spark-queries.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/docs/spark-queries.md b/docs/spark-queries.md
index 08ca4c2e19a3..9c7614ffc25f 100644
--- a/docs/spark-queries.md
+++ b/docs/spark-queries.md
@@ -126,6 +126,8 @@ To select a specific table snapshot or the snapshot at some time in the DataFram
 
 * `snapshot-id` selects a specific table snapshot
 * `as-of-timestamp` selects the current snapshot at a timestamp, in milliseconds
+* `branch` selects the head snapshot of the specified branch. Note that currently branch cannot be combined with as-of-timestamp.
+* `tag` selects the snapshot associated with the specified tag
 
 ```scala
 // time travel to October 26, 1986 at 01:21:00
@@ -143,6 +145,22 @@ spark.read
     .load("path/to/table")
 ```
 
+```scala
+// time travel to tag historical-snapshot
+spark.read
+    .option(SparkReadOptions.TAG, "historical-snapshot")
+    .format("iceberg")
+    .load("path/to/table")
+```
+
+```scala
+// time travel to the head snapshot of audit-branch
+spark.read
+    .option(SparkReadOptions.BRANCH, "audit-branch")
+    .format("iceberg")
+    .load("path/to/table")
+```
+
 {{< hint info >}}
 Spark 3.0 and earlier versions do not support using `option` with `table` in DataFrameReader commands. All options will be silently ignored. Do not use `table` when attempting to time-travel or use other options. See [SPARK-32592](https://issues.apache.org/jira/browse/SPARK-32592).
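
Note on the new `branch` bullet: the patch states that `branch` cannot currently be combined with `as-of-timestamp`. A caller that assembles read options dynamically may want to catch that combination before handing them to Spark. The following is a minimal sketch of such a guard, not part of Iceberg or of this patch. `TimeTravelOptions` and `validate` are hypothetical names, and the option keys are taken from the bullets in the diff.

```scala
// Hypothetical helper (not an Iceberg API): checks a set of time-travel
// read options against the restriction described in the docs change.
object TimeTravelOptions {
  // Option keys as they are named in the documentation bullets.
  val SnapshotId = "snapshot-id"
  val AsOfTimestamp = "as-of-timestamp"
  val Branch = "branch"
  val Tag = "tag"

  // Returns the options unchanged when they are acceptable, or an error
  // message when `branch` and `as-of-timestamp` are both present.
  def validate(options: Map[String, String]): Either[String, Map[String, String]] =
    if (options.contains(Branch) && options.contains(AsOfTimestamp))
      Left(s"'$Branch' cannot be combined with '$AsOfTimestamp'")
    else
      Right(options)
}
```

A validated map could then be passed to the reader with `spark.read.options(validated).format("iceberg").load("path/to/table")`. The sketch checks only the one restriction the patch mentions and makes no assumption about other option combinations.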