-
Notifications
You must be signed in to change notification settings - Fork 3k
Core: Support serializable and snapshot isolation for ReplacePartitions #2925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@rdblue if you could look when you have time, thanks |
core/src/main/java/org/apache/iceberg/BaseReplacePartitions.java
Outdated
Show resolved
Hide resolved
57af2be to
2406ab9
Compare
|
Updated, and also turned off serialzable isolation by default (can turn it on as well) |
core/src/main/java/org/apache/iceberg/BaseReplacePartitions.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
Outdated
Show resolved
Hide resolved
core/src/test/java/org/apache/iceberg/TestReplacePartitions.java
Outdated
Show resolved
Hide resolved
core/src/test/java/org/apache/iceberg/TestReplacePartitions.java
Outdated
Show resolved
Hide resolved
core/src/test/java/org/apache/iceberg/TestReplacePartitions.java
Outdated
Show resolved
Hide resolved
szehon-ho
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rdblue thanks a lot for taking a look. I rewrote it based on your suggestion (from using Expression to using a filter of partitionSet::contains). Also added a toString method to PartitionSet to make the user-facing message.
2d4ff91 to
dce6568
Compare
core/src/main/java/org/apache/iceberg/BaseReplacePartitions.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
Outdated
Show resolved
Hide resolved
spark3/src/main/java/org/apache/iceberg/spark/source/SparkWrite.java
Outdated
Show resolved
Hide resolved
spark3/src/main/java/org/apache/iceberg/spark/source/SparkWrite.java
Outdated
Show resolved
Hide resolved
spark3/src/main/java/org/apache/iceberg/spark/source/SparkWrite.java
Outdated
Show resolved
Hide resolved
b512dae to
88112c4
Compare
|
@rdblue thanks for the suggestions. I put validate-from-snapshot as a SparkWriteOption, and added a Spark level test for it based on it, if you want to take another look. |
e2c0b19 to
df7d8aa
Compare
…ot and add unit test
-Refactor common logic to methods in MergingSnapshotProducer::validate methods -Change serialization level from table properties to spark-write-conf -Other small fixes
cc7043f to
f6d2c97
Compare
f6d2c97 to
4dbc301
Compare
|
@aokolnychyi thanks for the review, addressed the comments and implemented snapshot isolation + tests based on the general agreement. Will also change the name of the pr to reflect |
|
Thanks, @szehon-ho! This is awesome work! Thanks others for reviewing too. |
This commit bumps the CDP Build number to 27992803 and versions to 7.2.16.0-77. The new version includes apache/iceberg#2925, an Iceberg library improvement to support serializable and snapshot isolation for ReplacePartitions. This is a prereuquisite to safely execute INSERT OVERWRITE entire partitions. Testing: - Ran a build and verified that the downloads succeed. Change-Id: I9fcef254a5f845540e09e0d8d2111f305fb2fea2 Reviewed-on: http://gerrit.cloudera.org:8080/18613 Reviewed-by: Tamas Mate <[email protected]> Tested-by: Tamas Mate <[email protected]>
…ns (apache#2925) (cherry picked from commit 01c62cb)
Uh oh!
There was an error while loading. Please reload this page.