-
Notifications
You must be signed in to change notification settings - Fork 3k
BigQuery: Add table validity check for BigQueryMetastoreCatalog #14113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BigQuery: Add table validity check for BigQueryMetastoreCatalog #14113
Conversation
| } | ||
|
|
||
| @Test | ||
| public void testLoadMetadataTableIsCalled() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I follow why this is being tested here, can you elaborate please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to test that loading a metadata table will actually happen with the fix in place, since that's technically a valid case of specifying an invalid table name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am bit confused too, if the goal is that that the metadata table loading works correctly post this fix, shouldnt we enable this test instead ?
iceberg/bigquery/src/test/java/org/apache/iceberg/gcp/bigquery/TestBigQueryCatalog.java
Lines 147 to 149 in a73a5c0
| @Disabled("BigQuery Metastore does not support multi layer namespaces") | |
| @Test | |
| public void testLoadMetadataTable() {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will move the test case there!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason why can't re-use the same test ? as if just enable this test from the base class it test if we can read the Files metadata table ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I agree, we should be able to remove this test method entirely and rely on the testing behavior defined in CatalogTests. The reason this method existed here is so that it could be disabled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, heres the test with the same name in CatalogTests.java
iceberg/core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java
Lines 929 to 950 in a73a5c0
| @Test | |
| public void testLoadMetadataTable() { | |
| C catalog = catalog(); | |
| TableIdentifier tableIdent = TableIdentifier.of("ns", "tbl"); | |
| TableIdentifier metaIdent = TableIdentifier.of("ns", "tbl", "files"); | |
| if (requiresNamespaceCreate()) { | |
| catalog.createNamespace(tableIdent.namespace()); | |
| } | |
| catalog.buildTable(tableIdent, SCHEMA).create(); | |
| Table table = catalog.loadTable(metaIdent); | |
| assertThat(table).isNotNull(); | |
| assertThat(table).isInstanceOf(FilesTable.class); | |
| // check that the table metadata can be refreshed | |
| table.refresh(); | |
| assertThat(table.name()).isEqualTo(catalog.name() + "." + metaIdent); | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good thank you.
nastra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the fix itself LGTM but I left a question for the test around metadata loading
|
/cc @talatuyarer |
kevinjqliu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @thomaschow :)
## Summary - Iceberg 1.10's BQMS has a bug that's fixed in: apache/iceberg#14113. Since this isn't included in the 1.10 release we'll build the jar locally and include it in `cloud_gcp` module for now. ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- av pr metadata This information is embedded by the av CLI when creating PRs to track the status of stacks when using Aviator. Please do not delete or edit this section of the PR. ``` {"parent":"main","parentHead":"","trunk":"main"} ``` --> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Bundled an updated Iceberg BigQuery integration JAR on the classpath by default. - Bug Fixes - Improved identifier handling when namespaces are empty, reducing errors in BigQuery catalog operations. - Tests - Added end-to-end checks for partition parsing, namespace compatibility, table reachability, and partition spec validation in BigQuery catalog flows. - Chores - Removed the previous external dependency and stopped preloading additional Dataproc JARs by default. - Style - Minor string formatting cleanup with no behavioral changes. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: thomaschow <[email protected]>
bigquery/src/test/java/org/apache/iceberg/gcp/bigquery/TestBigQueryCatalog.java
Outdated
Show resolved
Hide resolved
…QueryCatalog.java
bigquery/src/test/java/org/apache/iceberg/gcp/bigquery/TestBigQueryCatalog.java
Outdated
Show resolved
Hide resolved
bigquery/src/test/java/org/apache/iceberg/gcp/bigquery/TestBigQueryCatalog.java
Show resolved
Hide resolved
…QueryCatalog.java
|
|
||
| @Test | ||
| public void testIsValidIdentifierWithInvalidMultiLevelNamespace() { | ||
| TableIdentifier invalidIdentifier = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: all of these can be inlined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, formatting creates multiple lines for a few of the more verbose cases.
bigquery/src/test/java/org/apache/iceberg/gcp/bigquery/TestBigQueryCatalog.java
Outdated
Show resolved
Hide resolved
bigquery/src/test/java/org/apache/iceberg/gcp/bigquery/TestBigQueryCatalog.java
Outdated
Show resolved
Hide resolved
bigquery/src/test/java/org/apache/iceberg/gcp/bigquery/TestBigQueryCatalog.java
Outdated
Show resolved
Hide resolved
|
Thanks @thomaschow and thanks everyone for the review! |
This PR fixes a table validity check for the BigQueryMetastoreCatalog introduced in #12808. We need this validity check in order to access metadata tables using the syntax documented in: https://iceberg.apache.org/docs/1.10.0/docs/hive/?h=metadata#querying-metadata-tables. For example inspecting metadata tables. This is similar to the HiveCatalog implementation. Without this check Iceberg will attempt to load a regular table and fail in the BigQueryMetastoreCatalog implementation because the namespace has more than one level.
Added tests as well.