New global state is breaking the Alter table scenarios #2670

@ajantha-bhat

After #2626, the following scenario fails.

On the main branch:
CREATE TABLE nessie.db1.T1 (col_a STRING, col_b STRING, col_c STRING) USING iceberg;
INSERT INTO nessie.db1.T1 VALUES ('foo', 'bar', 'baz');

Create branches b1 and b2 from main, then switch to branch b1:

ALTER TABLE nessie.db1.T1 ADD COLUMN add_a STRING
INSERT INTO nessie.db1.T1 VALUES ('a', 'b', 'c', 'add')

Switch to branch b2:
ALTER TABLE nessie.db1.T1 ADD COLUMN add_b BIGINT

Iceberg throws a validation exception with the error message "Cannot set invalid snapshot log: latest entry is not the current snapshot".

Analysis:

An ALTER TABLE operation runs as a simple transaction in Iceberg. While committing it, Iceberg cleans up the snapshotLog in TableMetadata. During this cleanup it expects the current snapshot ID to be the head (latest entry) of the snapshot log.
See BaseTransaction::commitSimpleTransaction:
this.current.removeSnapshotLogEntries(this.intermediateSnapshotIds)

With our global state change, however, the insert on b1 moves the head of the snapshot history to the insert's snapshot ID, while the ALTER TABLE on b2 expects the head to be b2's current snapshot ID. [Note: in Iceberg, an ALTER TABLE operation does not generate a new snapshot ID; only a new schema ID is generated.]
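
To make the failure concrete, here is a minimal, self-contained sketch of the rule behind the error message. It is a simplified model, not Iceberg's actual code: `LogEntry`, `SnapshotLogCheck`, and the snapshot IDs and timestamps are hypothetical names and values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one snapshot-log entry (not Iceberg's HistoryEntry).
final class LogEntry {
    final long timestampMillis;
    final long snapshotId;

    LogEntry(long timestampMillis, long snapshotId) {
        this.timestampMillis = timestampMillis;
        this.snapshotId = snapshotId;
    }
}

final class SnapshotLogCheck {
    // Simplified model of the rule: the newest log entry must point at the current snapshot.
    static void validate(List<LogEntry> snapshotLog, long currentSnapshotId) {
        if (snapshotLog.isEmpty()) {
            return;
        }
        LogEntry last = snapshotLog.get(snapshotLog.size() - 1);
        if (last.snapshotId != currentSnapshotId) {
            throw new IllegalArgumentException(
                "Cannot set invalid snapshot log: latest entry is not the current snapshot");
        }
    }

    static boolean isValid(List<LogEntry> snapshotLog, long currentSnapshotId) {
        try {
            validate(snapshotLog, currentSnapshotId);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // Shared (global) log: the insert on b1 appended snapshot 200,
    // but b2's current snapshot is still 100, so the check fails.
    static boolean sharedLogAcceptedOnB2() {
        List<LogEntry> sharedLog = new ArrayList<>();
        sharedLog.add(new LogEntry(1000L, 100L)); // insert on main
        sharedLog.add(new LogEntry(2000L, 200L)); // insert on b1
        return isValid(sharedLog, 100L);
    }

    // Per-branch log: b2 only sees the insert it inherited from main.
    static boolean perBranchLogAcceptedOnB2() {
        List<LogEntry> b2Log = new ArrayList<>();
        b2Log.add(new LogEntry(1000L, 100L)); // insert on main
        return isValid(b2Log, 100L);
    }
}
```

In this model, `sharedLogAcceptedOnB2()` returns false and `perBranchLogAcceptedOnB2()` returns true, which matches the behavior described above: the check only passes when b2 sees its own history.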

Solution
Keep a List<HistoryEntry> snapshotLog in the on-reference state of Iceberg, so that an operation in progress on b2 uses only b2's snapshot history.
Is there a better solution?

New solution:
Thinking more about this: instead of keeping the snapshot history in the Iceberg on-reference state, we can compute it on the fly by querying Nessie, since Nessie already has that information. This way each branch uses its own history, and no Iceberg functionality is broken, including metadata tables.

Each time a metadata refresh happens (we already have a hook that patches the on-reference state onto the table metadata pointer), we can query Nessie, collect the snapshot history by walking through the commits, and patch it onto the TableMetadata.
The snapshot history needs snapshot timestamps, so the on-reference state may have to carry the snapshot timestamp along with the snapshot ID. We cannot use the commit timestamp here: some commits do not change the snapshot, and the commit time is not exactly the same as the snapshot time, so issues may arise in concurrent scenarios.
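
The idea of rebuilding the history from a branch's commits could look roughly like the sketch below. It assumes the on-reference state carries both the snapshot ID and the snapshot timestamp, and that the commits of one branch are supplied oldest first. `OnRefState`, `HistoryItem`, and `BranchHistory` are hypothetical names, not Nessie or Iceberg APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-commit table state: assumes the on-reference state
// stores the snapshot timestamp next to the snapshot ID.
final class OnRefState {
    final long snapshotId;
    final long snapshotTimestampMillis;

    OnRefState(long snapshotId, long snapshotTimestampMillis) {
        this.snapshotId = snapshotId;
        this.snapshotTimestampMillis = snapshotTimestampMillis;
    }
}

// Hypothetical stand-in for one entry of the rebuilt snapshot log.
final class HistoryItem {
    final long timestampMillis;
    final long snapshotId;

    HistoryItem(long timestampMillis, long snapshotId) {
        this.timestampMillis = timestampMillis;
        this.snapshotId = snapshotId;
    }
}

final class BranchHistory {
    // Rebuilds the snapshot log of one branch from its commits (oldest first).
    // Commits that keep the same snapshot (for example ALTER TABLE) add no entry.
    static List<HistoryItem> rebuild(List<OnRefState> commitsOldestFirst) {
        List<HistoryItem> log = new ArrayList<>();
        Long previousSnapshotId = null;
        for (OnRefState state : commitsOldestFirst) {
            if (previousSnapshotId != null && previousSnapshotId == state.snapshotId) {
                continue;
            }
            // Use the snapshot timestamp, not the commit timestamp.
            log.add(new HistoryItem(state.snapshotTimestampMillis, state.snapshotId));
            previousSnapshotId = state.snapshotId;
        }
        return log;
    }

    // Branch b2 in the scenario: insert inherited from main, then ALTER TABLE on b2.
    static List<HistoryItem> exampleB2() {
        List<OnRefState> commits = new ArrayList<>();
        commits.add(new OnRefState(100L, 1000L)); // insert on main
        commits.add(new OnRefState(100L, 1000L)); // ALTER TABLE on b2, same snapshot
        return rebuild(commits);
    }

    // Branch b1 in the scenario: insert from main, ALTER TABLE, then insert on b1.
    static List<HistoryItem> exampleB1() {
        List<OnRefState> commits = new ArrayList<>();
        commits.add(new OnRefState(100L, 1000L)); // insert on main
        commits.add(new OnRefState(100L, 1000L)); // ALTER TABLE on b1, same snapshot
        commits.add(new OnRefState(200L, 2000L)); // insert on b1
        return rebuild(commits);
    }
}
```

With these inputs, b2's rebuilt log ends at snapshot 100 and b1's ends at snapshot 200, so each branch's log head matches that branch's current snapshot. How the commits are actually fetched from Nessie, and in which order, is left open here.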

Also, if Iceberg's native snapshot/branching feature provides an interface for keeping snapshot history per reference, we should use it instead of changing our code.
