@ajantha-bhat
Member

BaseFileRewriteCoordinator and ScanTaskSetManager call table.uuid() on the metadata table multiple times when writing to and reading from the cache. The UUID therefore has to be the same on every call for a given table.

This issue was discovered while working on #9298.
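
To illustrate the failure mode described above, here is a minimal, self-contained sketch. `UnstableTable` and `TaskSetCache` are hypothetical stand-ins invented for this example; they are not the Iceberg classes, and the key format is an assumption. The point is only that a cache keyed on a UUID that changes per call can never be read back.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class UnstableUuidCacheDemo {

  // Hypothetical stand-in for a metadata table whose uuid() changes on every call.
  static class UnstableTable {
    UUID uuid() {
      return UUID.randomUUID();
    }
  }

  // Hypothetical cache keyed on the table UUID plus a set ID (assumed key format).
  static class TaskSetCache {
    private final Map<String, String> tasks = new ConcurrentHashMap<>();

    void stage(UnstableTable table, String setId, String value) {
      // uuid() is called once here to build the key ...
      tasks.put(table.uuid() + ":" + setId, value);
    }

    String fetch(UnstableTable table, String setId) {
      // ... and again here, producing a different key for the same table.
      return tasks.get(table.uuid() + ":" + setId);
    }
  }

  public static void main(String[] args) {
    UnstableTable table = new UnstableTable();
    TaskSetCache cache = new TaskSetCache();
    cache.stage(table, "set-1", "tasks");
    // The lookup key differs from the staging key, so the entry is not found.
    System.out.println("found: " + (cache.fetch(table, "set-1") != null));
  }
}
```

With a fresh random UUID per call, the staged entry is effectively unreachable, which matches the symptom of set and get operations disagreeing on the key.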

Contributor

@amogh-jahagirdar left a comment

Thanks for fixing this, @ajantha-bhat. Just a nit.

    .as("UUID should be consistent on multiple calls")
    .isEqualTo(manifestsTable.uuid());
Assertions.assertThat(manifestsTable.uuid())
    .as("Metadata table UUID should be different from main table UUID")
Contributor

Nit: "should be different from the base table's UUID"


@Override
public UUID uuid() {
  return UUID.randomUUID();
Contributor

@amogh-jahagirdar commented on Dec 16, 2023

Wow, I can't believe I wrote it like this originally :). Good fix. Yes, we do want a random UUID, but it should stay the same across calls on the same metadata table.

The only caveat, if we extend this logic further, is that the same metadata table loaded through different references won't return the same UUID. For example, "all manifests" for t1 may have a different UUID than "all manifests" for t1 in a different query.

I don't really see a way around that without persisting the UUID somewhere (for normal tables, it is stored in the table metadata), but considering that metadata tables are really more logical tables, I think that's OK. If we wanted that strict definition for metadata tables, we would have to throw an unsupported exception until we persisted those details somewhere.

For now, though, I think this change is good as is. It looks like the UUID is used as part of caching in your next change, in which case this fix should solve that.
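
As a rough sketch of the behavior discussed above (the merged diff is not shown in this thread, so this is an assumption about the shape of the fix, not the actual Iceberg code): generating the UUID once per instance makes repeated calls agree, while two separately created instances for the same logical table still get different UUIDs. `StableTable` is a hypothetical class used only for illustration.

```java
import java.util.UUID;

public class StableUuidDemo {

  // Hypothetical stand-in for a metadata table; not the actual Iceberg class.
  static class StableTable {
    // Generated once at construction, so every uuid() call returns the same value.
    private final UUID uuid = UUID.randomUUID();

    UUID uuid() {
      return uuid;
    }
  }

  public static void main(String[] args) {
    StableTable first = new StableTable();
    StableTable second = new StableTable();

    // Repeated calls on one instance agree, so UUID-keyed caching works.
    System.out.println("same instance equal: " + first.uuid().equals(first.uuid()));

    // Separate instances differ, because nothing persists the UUID between them.
    System.out.println("different instances equal: " + first.uuid().equals(second.uuid()));
  }
}
```

This mirrors the trade-off in the comment: stability within one instance is enough for caching, but identity across instances or queries would need the UUID to be persisted somewhere.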

@amogh-jahagirdar
Contributor

@ajantha-bhat the same unrelated flaky Flink test that #9309 is fixing is failing here. I'm going to close and re-open this PR to retrigger CI.

@amogh-jahagirdar merged commit 24578a2 into apache:main on Dec 16, 2023
lisirrx pushed a commit to lisirrx/iceberg that referenced this pull request Jan 4, 2024
geruh pushed a commit to geruh/iceberg that referenced this pull request Jan 26, 2024
devangjhabakh pushed a commit to cdouglas/iceberg that referenced this pull request Apr 22, 2024