@RussellSpitzer
Member

Instead of using a cache to preserve the state from before the
expireSnapshots command, we preserve the table metadata via a
StaticTable reference. This reference doesn't change when the
Snapshots are expired and allows us to look up all the files
referenced by the prior version of the table without holding
everything in memory.

Requires #1342

Previously the only way to expire snapshots was through a single-machine table
operation with RemoveSnapshots. In this patch we add a new Spark Action which
does the same work, but does so in a scalable way. Instead of using the old
logic for analyzing files to remove, we use the Metadata Table representations
of the table both before and after Snapshot Expiration to determine unneeded
files.
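
The before/after comparison described above amounts to a set difference: any file referenced by the table before expiration but not afterwards is a candidate for deletion. The sketch below shows only that logic on in-memory collections. The class `ExpiredFileDiff`, the method `filesToDelete`, and the file names are hypothetical and are not part of this patch or of the Iceberg API; the action itself works on Metadata Table data in Spark rather than on Java sets.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not an Iceberg class.
class ExpiredFileDiff {

  // Returns every path referenced before expiration that is no longer
  // referenced afterwards, keeping the order of the "before" listing.
  static Set<String> filesToDelete(Iterable<String> referencedBefore,
                                   Iterable<String> referencedAfter) {
    Set<String> orphaned = new LinkedHashSet<>();
    referencedBefore.forEach(orphaned::add);
    referencedAfter.forEach(orphaned::remove);
    return orphaned;
  }

  public static void main(String[] args) {
    // Made-up file names standing in for manifest lists, manifests, and data files.
    List<String> before = List.of(
        "manifest-list-1.avro", "manifest-a.avro", "data-1.parquet", "data-2.parquet");
    List<String> after = List.of(
        "manifest-list-2.avro", "manifest-a.avro", "data-2.parquet");

    // prints [manifest-list-1.avro, data-1.parquet]
    System.out.println(filesToDelete(before, after));
  }
}
```

Files that appear only in the "after" listing (here `manifest-list-2.avro`) are ignored, since only files that dropped out of the table are of interest.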
Lazy construction of Expire Snapshots Action.
Processing of deletes using Local Iterator
Ignoring Versioning Files
Adding ExecutorService Option Like RemoveSnapshots
Move getManifestLists / getOtherManifestPaths to Core Module
Fixup of doc typos
Refactoring of Tests; all tests use only table.operations, no Spark Writes
All tests now check file deletions
Renaming of class methods to fit style
Removal of TableUtils, Functions moved back into BaseAction
ExpireSnapshotsAction defaults to single threaded deleter
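
Several of the notes above concern how deletes are processed: paths are consumed from a local iterator, an ExecutorService may be supplied, and the default is a single-threaded deleter. A minimal sketch of that pattern follows, assuming a caller-provided delete function. `FileDeleter`, `deleteAll`, and the parameter names are hypothetical and do not reflect the actual method signatures in this patch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; not an Iceberg class.
class FileDeleter {

  // Applies deleteFunc to each path from the iterator and returns how many
  // deletes completed. A null executor means deletes run on the calling thread.
  static int deleteAll(Iterator<String> paths,
                       Consumer<String> deleteFunc,
                       ExecutorService executor) {
    int deleted = 0;
    if (executor == null) {
      // Default: single-threaded, one path at a time.
      while (paths.hasNext()) {
        deleteFunc.accept(paths.next());
        deleted++;
      }
      return deleted;
    }

    // Optional: hand each delete to the supplied pool, then wait for all of them.
    List<Future<?>> pending = new ArrayList<>();
    while (paths.hasNext()) {
      String path = paths.next();
      pending.add(executor.submit(() -> deleteFunc.accept(path)));
    }
    for (Future<?> future : pending) {
      try {
        future.get();
        deleted++;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while deleting files", e);
      } catch (ExecutionException e) {
        throw new RuntimeException("Delete failed", e.getCause());
      }
    }
    return deleted;
  }
}
```

Passing `null` for the executor gives the single-threaded behavior; a caller that wants parallel deletes supplies its own pool and remains responsible for shutting it down.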
@RussellSpitzer RussellSpitzer deleted the ExpireSnapshotsAction branch August 14, 2020 18:19
parthchandra pushed a commit to parthchandra/iceberg that referenced this pull request Oct 22, 2025
…n) (apache#1343)

* API, Spark 3.5: Action to compute table stats (apache#10288)

(cherry picked from commit 2f6e7e6)

* Spark 3.4: Action to compute table stats (apache#11106)

(cherry picked from commit 5582b0c)

* Spark 3.4: Add utility to load table state reliably (apache#11115)

(cherry picked from commit d5b21d8)

* Cherry-pick data-sketches lib version change from
apache@cbe391d#diff-697f70cdd88ba88fe77eebda60c7e143f6ad1286bca75017421e93ad84fb87df

---------

Co-authored-by: Karuppayya <[email protected]>
Co-authored-by: Hongyue/Steve Zhang <[email protected]>