-
Notifications
You must be signed in to change notification settings - Fork 3k
Nessie: Fix NPE while accessing refreshed table in nessie catalog #4509
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
||
| createBranch(branch2, catalog.currentHash(), branch1); | ||
| // commit on tableIdentifier1 on branch2 | ||
| NessieCatalog catalogBranch1 = initCatalog(branch2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just change to catalog = initCatalog(branch2)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed the pattern from base testcases in this file.
Changed this on newly added testcase, base testcases can be done in a follow up PR as it will confuse the reviewers.
| NessieCatalog catalog = initCatalog(branch1); | ||
| String metadataCommit1 = addRow(catalog, tableIdentifier1, "initial-data", | ||
| ImmutableMap.of("id0", 4L)); | ||
| catalog.refresh(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those catalog.refresh() calls can be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is just a double check that even refresh is not gonna fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refresh() shouldn't be used outside of the internals of the catalog imo (I have a PR that cleans that up). Also it's not apparent that this is the reason why refresh() is being used here, so I woudl just remove it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. Removed.
|
|
||
| // commit on tableIdentifier1 on branch1 | ||
| NessieCatalog catalog = initCatalog(branch1); | ||
| String metadataCommit1 = addRow(catalog, tableIdentifier1, "initial-data", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we rename this (and others) to something like metadataLocationOfCommit1. Initially it wasn't clear that this refers to the metadata location
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed on newly added testcase, base testcases can be done in a follow up PR as it will confuse the reviewers.
| catalog = initCatalog(branch1); | ||
| catalog.refresh(); | ||
| // load tableIdentifier1 on branch1 | ||
| BaseTable table = (BaseTable)catalog.loadTable(tableIdentifier1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: formatting
75da1a7 to
819e751
Compare
| catalog = initCatalog(branch2); | ||
| String metadataLocationOfCommit2 = addRow(catalog, tableIdentifier1, "some-more-data", | ||
| ImmutableMap.of("id0", 42L)); | ||
| Assertions.assertThat(metadataLocationOfCommit2).isNotEqualTo(metadataLocationOfCommit1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe change to .isNotNull().isNotEqualTo(metadataLocationOfCommit1). same for the other check further below
|
@szehon-ho / @openinx : Can you please help in merging this ? |
openinx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me .
|
@openinx : Thanks for merging. |
- upgrade to iceberg 0.13.1 - upgrade to flink 1.13.6 - update notebooks to new sql grammar (see projectnessie/nessie#2273) - adjust notebooks to new nessie cli - adjust notebooks to new nessie python api - adjust notebooks to no longer claim branches have the same hash after merging - avoid silent exception handling - apply workaround for delta demo (for projectnessie/nessie#3552) - apply workaround for hive demo (for apache/iceberg#4509)
- upgrade to iceberg 0.13.1 - upgrade to flink 1.13.6 - update notebooks to new sql grammar (see projectnessie/nessie#2273) - adjust notebooks to new nessie cli - adjust notebooks to new nessie python api - adjust notebooks to no longer claim branches have the same hash after merging - avoid silent exception handling - apply workaround for delta demo (for projectnessie/nessie#3552) - apply workaround for hive demo (for apache/iceberg#4509)
) (#4840) Co-authored-by: Ajantha Bhat <[email protected]>
- upgrade to iceberg 0.13.1 - upgrade to flink 1.13.6 - update notebooks to new sql grammar (see projectnessie/nessie#2273) - adjust notebooks to new nessie cli - adjust notebooks to new nessie python api - adjust notebooks to no longer claim branches have the same hash after merging - avoid silent exception handling - apply workaround for delta demo (for projectnessie/nessie#3552) - apply workaround for hive demo (for apache/iceberg#4509) closes #260
what is null ?
TableMetadata.metadataFileLocationis nullScenario:
Reason:
NessieTableOperation.refreshFromMetadataLocation()callsloadTableMetadata()whereTableMetadata.buildFrom()creates anTableMetadatawith nullmetadataFileLocation(refer here)and Nessie didn't set the
metadataFileLocationHow basic scenarios are working?
Nessie itself holds
metadataFileLocationand uses for all the Nessie operations. But when Iceberg table in nessie catalog is used for iceberg specific functionality like read.snapshotstable, Iceberg needsmetadataFileLocationwhich was not filled.