Merged
8 changes: 5 additions & 3 deletions api/src/main/java/org/apache/iceberg/Snapshot.java
@@ -185,12 +185,14 @@ default Long firstRowId() {
}

/**
- * The total number of newly added rows in this snapshot. It should be the summation of {@link
- * ManifestFile#ADDED_ROWS_COUNT} for every manifest added in this snapshot.
+ * The upper bound of number of rows with assigned row IDs in this snapshot. It can be used safely
+ * to increment the table's `next-row-id` during a commit. It can be more than the number of rows
+ * added in this snapshot and include some existing rows.
*
* <p>This field is optional but is required when the table version supports row lineage.
*
- * @return the total number of new rows in this snapshot or null if the value was not stored.
+ * @return the upper bound of number of rows with assigned row IDs in this snapshot or null if the
+ * value was not stored.
*/
default Long addedRows() {
return null;
6 changes: 5 additions & 1 deletion format/spec.md
@@ -754,9 +754,9 @@ A snapshot consists of the following fields:
| _optional_ | _required_ | _required_ | **`summary`** | A string map that summarizes the snapshot changes, including `operation` as a _required_ field (see below) |
| _optional_ | _optional_ | _optional_ | **`schema-id`** | ID of the table's current schema when the snapshot was created |
| | | _required_ | **`first-row-id`** | The first `_row_id` assigned to the first row in the first data file in the first manifest, see [Row Lineage](#row-lineage) |
+| | | _required_ | **`added-rows`** | The upper bound of the number of rows with assigned row IDs, see [Row Lineage](#row-lineage) |
| | | _optional_ | **`key-id`** | ID of the encryption key that encrypts the manifest list key metadata |


The snapshot summary's `operation` field is used by some operations, like snapshot expiration, to skip processing certain snapshots. Possible `operation` values are:

* `append` -- Only data files were added and no files were removed.
@@ -782,6 +782,10 @@ A snapshot's `first-row-id` is assigned to the table's current `next-row-id` on

The snapshot's `first-row-id` is the starting `first_row_id` assigned to manifests in the snapshot's manifest list.

+The snapshot's `added-rows` captures the upper bound of the number of rows with assigned row IDs.
+It can be used safely to increment the table's `next-row-id` during a commit.
+It can be more than the number of rows added in this snapshot and include some existing rows,
+see [Row Lineage Example](#row-lineage-example).
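
Reviewer's sketch (not part of the diff): the paragraph above says `added-rows` is an upper bound that is safe to add to `next-row-id`. The Java below is a minimal illustration of that arithmetic only. `RowIdSketch` and `advance` are hypothetical names, not Iceberg APIs, and the row counts (100 new rows plus 25 existing rows that received IDs) are made-up values chosen to show an over-count.

```java
public class RowIdSketch {
  /**
   * Hypothetical helper: returns the table's next-row-id after a commit,
   * given the snapshot's added-rows upper bound.
   */
  static long advance(long nextRowId, long addedRows) {
    if (addedRows < 0) {
      throw new IllegalArgumentException("added-rows must be non-negative: " + addedRows);
    }
    // addExact fails loudly on overflow instead of wrapping around
    return Math.addExact(nextRowId, addedRows);
  }

  public static void main(String[] args) {
    long nextRowId = 1000L;

    // First commit: the snapshot's first-row-id is the table's current next-row-id.
    long firstSnapshotFirstRowId = nextRowId;
    // Assumed counts: 100 new rows plus 25 existing rows that were assigned IDs,
    // so added-rows (125) exceeds the number of rows actually added (100).
    nextRowId = advance(nextRowId, 100L + 25L);

    // Second commit starts where the first one's reserved range ended.
    long secondSnapshotFirstRowId = nextRowId;
    nextRowId = advance(nextRowId, 40L);

    // prints "1000 1125 1165"
    System.out.println(
        firstSnapshotFirstRowId + " " + secondSnapshotFirstRowId + " " + nextRowId);
  }
}
```

Because each snapshot reserves `[first-row-id, first-row-id + added-rows)`, over-counting only leaves unused IDs in the range; it never causes two snapshots to hand out the same row ID.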

### Manifest Lists
