[Iceberg] Add Iceberg metadata table $ref #23503
agrawalreetika merged 1 commit into prestodb:master
steveburnett left a comment:
Thanks for the doc! A single tiny nit about punctuation; everything else looks good.
tdcmeehan left a comment:
Nice change! Just a couple of nits.
        return new FixedPageSource(buildPages(tableMetadata, icebergTable));
    }

    private static boolean checkNonNull(Object object, PageListBuilder pagesBuilder)
It seems to only be used for bigint values, so I'd just call this something like appendLongValue and append null if it's null, otherwise append the long value.
Actually, I just noticed that value.minSnapshotsToKeep() is an Integer. I can add the helper as appendLongValue and use it here as well, or I can keep this method as it was and append the value with pagesBuilder.appendInteger. Let me know which you prefer.
I'm just wondering if we can encapsulate the setting of values, either null or bigint, into one method. So perhaps just appendValue?
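The single-method idea in this thread could be sketched roughly as below. This is only an illustration under assumptions: RowSketch is a hypothetical stand-in that records values in a list, not the PR's PageListBuilder, and the appendValue signature is a guess at what "either null or bigint" might look like. Accepting Number lets one method cover both the Long fields and the Integer returned by minSnapshotsToKeep().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the PR's PageListBuilder; it only records values.
final class RowSketch
{
    private final List<Object> values = new ArrayList<>();

    // Appends null for a missing value, otherwise widens the number to a
    // long so it fits a bigint column. Number covers Long and Integer alike.
    void appendValue(Number value)
    {
        if (value == null) {
            values.add(null);
        }
        else {
            values.add(value.longValue());
        }
    }

    // Exposes the recorded values so the behaviour can be inspected.
    List<Object> values()
    {
        return values;
    }
}
```

With a helper of this shape, each call site no longer needs its own null check or its own Integer-to-long conversion.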
.. code-block:: text

     name | type | snapshot_id | max_reference_age_in_ms | min_snapshots_to_keep | max_snapshot_age_in_ms
It would be nice to include a non-main branch as well.
a0c4498 to 15f3194
hantangwangd left a comment:
Thanks for adding this metadata table; the change looks good to me. Should we add a test case to show that there will always be a main branch by default?
15f3194 to 1a0eea7
Thanks for your review, @hantangwangd. I have added this; please review.
steveburnett left a comment:
LGTM! (docs)
Pulled the updated branch and ran a new local doc build; looks good. Thanks!
1a0eea7 to 54eb31b
    @Override
    public Distribution getDistribution()
    {
        return Distribution.SINGLE_COORDINATOR;

        icebergTable.refs().forEach((key, value) -> {
            pagesBuilder.beginRow();
            pagesBuilder.appendVarchar(key);
            pagesBuilder.appendVarchar(value.isTag() ? "TAG" : "BRANCH");
What do you think about doing String.valueOf(value.type())? This way it handles updates to the SnapshotRefType enum.
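A small sketch of why the enum-based form may age better than the ternary. The RefTypeSketch enum here is a hypothetical stand-in for Iceberg's SnapshotRefType, and the ARCHIVE constant is invented purely to show what happens when the enum grows; neither is taken from the PR.

```java
final class RefTypeLabelSketch
{
    // Hypothetical stand-in for SnapshotRefType; ARCHIVE is an invented
    // constant used only to illustrate a future addition to the enum.
    enum RefTypeSketch { BRANCH, TAG, ARCHIVE }

    // Ternary form: anything that is not a tag is reported as a branch,
    // so a newly added ref type would be mislabeled.
    static String viaTernary(RefTypeSketch type)
    {
        return type == RefTypeSketch.TAG ? "TAG" : "BRANCH";
    }

    // Enum form: String.valueOf calls toString(), which defaults to the
    // constant's name, so new constants are labeled without code changes.
    static String viaEnum(RefTypeSketch type)
    {
        return String.valueOf(type);
    }
}
```

Both forms agree for BRANCH and TAG; they differ only for a constant the ternary does not know about.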
        // Check main branch entry
        assertQuery("SELECT count(*) FROM test_schema.\"test_table$refs\"", "VALUES 1");
        assertQuery("SELECT name FROM test_schema.\"test_table$refs\"", "VALUES 'main'");
I think it would also be prudent to add assertions on values in the row of the table matching the SnapshotRef object.
You could add or refactor the loadTable method from IcebergDistributedTestBase to get a reference to the Iceberg table, then re-create the values and run assertQuery
Added tests in IcebergDistributedTestBase, which I can reuse in my subsequent PR that adds support for querying tags and branches. Please review.
54eb31b to e5d2c74
hantangwangd left a comment:
Thanks for the added test case. Overall it looks good to me; one little nit.
        assertEquals(icebergTable.refs().size(), 2);
        icebergTable.manageSnapshots().createTag("testTag", icebergTable.currentSnapshot().snapshotId()).commit();

        assertEquals(icebergTable.refs().size(), 3);
        assertQuery("SELECT count(*) FROM \"test_table_references$refs\"", "VALUES 3");
Maybe we can add some more detailed assertions, as follows:
        assertQuery("SELECT * from \"test_table_references$refs\" where name = 'main' and type = 'BRANCH'",
                format("VALUES('%s', '%s', %s, %s, %s, %s)",
                        "main",
                        "BRANCH",
                        icebergTable.refs().get("main").snapshotId(),
                        icebergTable.refs().get("main").maxRefAgeMs(),
                        icebergTable.refs().get("main").minSnapshotsToKeep(),
                        icebergTable.refs().get("main").maxSnapshotAgeMs()));
        assertQuery("SELECT * from \"test_table_references$refs\" where type = 'TAG'",
                format("VALUES('%s', '%s', %s, %s, %s, %s)",
                        "testTag",
                        "TAG",
                        icebergTable.refs().get("testTag").snapshotId(),
                        icebergTable.refs().get("testTag").maxRefAgeMs(),
                        icebergTable.refs().get("testTag").minSnapshotsToKeep(),
                        icebergTable.refs().get("testTag").maxSnapshotAgeMs()));
Sure, I have added detailed assertions for testTag and testBranch instead of the main branch. Please let me know if that's fine.
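The repeated format(...) calls in the suggested assertions could be folded into one helper, sketched below under assumptions. ExpectedRefRowSketch and valuesClause are hypothetical names, and the parameters are passed in directly so the sketch runs without Iceberg; in the real test they would come from the SnapshotRef looked up in icebergTable.refs(). Unset retention fields are modeled as null boxed values, which %s renders as the text null.

```java
final class ExpectedRefRowSketch
{
    // Builds the expected VALUES clause for one row of the refs table.
    // Nullable fields use boxed types; String.format prints a null
    // argument for %s as "null", which reads as SQL NULL in the clause.
    static String valuesClause(
            String name,
            String type,
            long snapshotId,
            Long maxRefAgeMs,
            Integer minSnapshotsToKeep,
            Long maxSnapshotAgeMs)
    {
        return String.format(
                "VALUES('%s', '%s', %s, %s, %s, %s)",
                name,
                type,
                snapshotId,
                maxRefAgeMs,
                minSnapshotsToKeep,
                maxSnapshotAgeMs);
    }
}
```

A test could then pass the result straight to assertQuery as the expected side, once per ref, instead of repeating the format string.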
e5d2c74 to 9fac986
Description
Add Iceberg metadata table $ref
Motivation and Context
Add the Iceberg metadata table $ref, which provides details about the tags and branches on an Iceberg table: https://iceberg.apache.org/docs/nightly/branching
Impact
Iceberg Connector
Provides details about the tags and branches on an Iceberg table: https://iceberg.apache.org/docs/nightly/branching
Test Plan
Contributor checklist
Release Notes