Core: Remove deprecated method from BaseMetadataTable #9298
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -105,6 +105,8 @@ private String metadataFileLocation(Table table) { | |
| if (table instanceof HasTableOperations) { | ||
| TableOperations ops = ((HasTableOperations) table).operations(); | ||
| return ops.current().metadataFileLocation(); | ||
| } else if (table instanceof BaseMetadataTable) { | ||
| return ((BaseMetadataTable) table).table().operations().current().metadataFileLocation(); | ||
|
Member
Author
This is needed now since the metadata table won't enter above check of |
||
| } else { | ||
| return null; | ||
| } | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -28,6 +28,8 @@ | |||||||
| import java.util.stream.Collectors; | ||||||||
| import java.util.stream.Stream; | ||||||||
| import org.apache.hadoop.fs.Path; | ||||||||
| import org.apache.iceberg.BaseMetadataTable; | ||||||||
| import org.apache.iceberg.HasTableOperations; | ||||||||
| import org.apache.iceberg.NullOrder; | ||||||||
| import org.apache.iceberg.PartitionField; | ||||||||
| import org.apache.iceberg.PartitionSpec; | ||||||||
|
|
@@ -948,6 +950,17 @@ public static org.apache.spark.sql.catalyst.TableIdentifier toV1TableIdentifier( | |||||||
| return org.apache.spark.sql.catalyst.TableIdentifier.apply(table, database); | ||||||||
| } | ||||||||
|
|
||||||||
| static String baseTableUUID(org.apache.iceberg.Table table) { | ||||||||
| if (table instanceof HasTableOperations) { | ||||||||
|
Contributor
why not just call
Member
Author
Because iceberg/core/src/main/java/org/apache/iceberg/BaseMetadataTable.java Lines 204 to 206 in d56dd63
Member
Author
I think that while adding the UUID interface we concluded that we should not use the base table's UUID.
Contributor
Yeah, the argument there is that the metadata table can be considered a separate table and should therefore have its own unique identifier, distinct from the base table's. But I think @nastra's point still stands: even if it's different than the base table UUID, why does that matter here? I think we just want the table.uuid(), right? Or do we need the UUID of the metadata table's underlying table?
Member
Author
I tried using table.uuid(), and many test cases failed because the scan task of a metadata table expects the UUID of the base table, not the metadata table.
Does it make sense to return the base table's UUID for the metadata table? (That is, change the behaviour from #8800?)
Member
Author
Done. Rebased.
Member
Author
Hmm, looks like Either we need to change that logic or return base table uuid. I will dig deeper next week.
Contributor
Sure, I'd check out that logic further and we can see what the right behavior is here. I still think the change that was made in #9310 is definitely the right fix from an API perspective (even if we decide not to use that API here). The main issue solved there was that, semantically, the metadata table UUID should be the same for the same reference. In other words, imo I would not change the UUID API semantics to fit whatever the caching logic relies on. If we need the base table UUID for the caching logic, then maybe
Member
Author
It looks like metadata table scan tasks are tightly coupled to the main table's UUID across multiple classes. Hence, I reverted the use of table.uuid(). So, this PR can go ahead.
Member
Author
@nastra: Thoughts? |
||||||||
| TableOperations ops = ((HasTableOperations) table).operations(); | ||||||||
| return ops.current().uuid(); | ||||||||
| } else if (table instanceof BaseMetadataTable) { | ||||||||
| return ((BaseMetadataTable) table).table().operations().current().uuid(); | ||||||||
|
Member
call
Member
Author
This function can be called for the main table or a metadata table. So, the variable name is table. Agree that So, I think we can leave it as it is for now (out of scope for this PR). |
||||||||
| } else { | ||||||||
| throw new UnsupportedOperationException("Cannot retrieve UUID for table " + table.name()); | ||||||||
| } | ||||||||
| } | ||||||||
|
|
||||||||
| private static class DescribeSortOrderVisitor implements SortOrderVisitor<String> { | ||||||||
| private static final DescribeSortOrderVisitor INSTANCE = new DescribeSortOrderVisitor(); | ||||||||
|
|
||||||||
|
|
||||||||
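The dispatch discussed in the thread above can be sketched in isolation. This is a minimal illustration only: every type below (`Table`, `TableOperations`, `HasTableOperations`, `BaseTable`, `MetadataTable`, `UuidLookup`) is a hypothetical stand-in that mirrors the shape of the names in the diff, not the real Iceberg classes, and `currentUuid()` is an assumed shorthand for `ops.current().uuid()`. It shows why a metadata table needs its own branch: it does not implement `HasTableOperations`, so the lookup has to unwrap the underlying base table.

```java
// Hypothetical stand-ins that mirror the shape of the types in the diff.
// These are NOT the real Iceberg interfaces.
interface Table {
  String name();
}

interface TableOperations {
  // Assumed shorthand for ops.current().uuid() in the diff.
  String currentUuid();
}

interface HasTableOperations {
  TableOperations operations();
}

// A base table exposes its operations directly.
final class BaseTable implements Table, HasTableOperations {
  private final String name;
  private final String uuid;

  BaseTable(String name, String uuid) {
    this.name = name;
    this.uuid = uuid;
  }

  @Override
  public String name() {
    return name;
  }

  @Override
  public TableOperations operations() {
    return () -> uuid;
  }
}

// A metadata table wraps a base table and does not expose operations itself,
// so it never matches the HasTableOperations check.
final class MetadataTable implements Table {
  private final BaseTable base;

  MetadataTable(BaseTable base) {
    this.base = base;
  }

  @Override
  public String name() {
    return base.name() + ".snapshots";
  }

  BaseTable table() {
    return base;
  }
}

final class UuidLookup {
  // Returns the UUID of the base table, unwrapping a metadata table if needed.
  static String baseTableUUID(Table table) {
    if (table instanceof HasTableOperations) {
      return ((HasTableOperations) table).operations().currentUuid();
    } else if (table instanceof MetadataTable) {
      return ((MetadataTable) table).table().operations().currentUuid();
    } else {
      throw new UnsupportedOperationException(
          "Cannot retrieve UUID for table " + table.name());
    }
  }
}
```

In this sketch, a base table and a metadata table built on top of it resolve to the same UUID, which matches the behaviour the scan-task logic is described as expecting; any other `Table` implementation falls through to the exception.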