Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion .github/labeler.yml
Original file line number Diff line number Diff line change
Expand Up @@ -159,7 +159,9 @@ GCP:
- changed-files:
- any-glob-to-any-file: [
'gcp/**/*',
'gcp-bundle/**/*'
'gcp-bundle/**/*',
'bigquery/**/*',
'bigquery-bundle/**/*'
]

DELL:
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,129 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.iceberg.gcp.bigquery;

import com.google.api.services.bigquery.model.Dataset;
import com.google.api.services.bigquery.model.DatasetList.Datasets;
import com.google.api.services.bigquery.model.DatasetReference;
import com.google.api.services.bigquery.model.Table;
import com.google.api.services.bigquery.model.TableList.Tables;
import com.google.api.services.bigquery.model.TableReference;
import java.util.List;
import java.util.Map;
import java.util.Set;

/**
* A client of Google BigQuery Metastore functions over the BigQuery service. Uses the Google
* BigQuery API.
*/
public interface BigQueryClient {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there an interface for BigQueryClient? My assumption was that there would be a test implementation, but BigQueryClientImplTest uses a BigQueryClientImpl and injects a mock Bigquery client via the package-private constructor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the catalog tests use a mock BigQueryClient, while the tests for BigQueryClient use a mock Bigquery. Is this necessary? It seems to me that this sets up an unnecessary test surface. For instance, to delete a table, the BigQueryClientImplTest validates that client.deleteTable calls a mocked Delete.executeUnparsed method once. The dropTable test then just tests for side-effects (directory is present, directory is not present).

I don't think that this structure is necessary because the drop table test could validate that the underlying Delete.executeUnparsed is called. This interface seems unnecssary.

After looking at the test structure here, I think the bigger problem is that these tests don't really ensure behavior and will break if the implementation changes. Using the dropTable test as an example again, it first loads the table and then calls the client to delete it. An arguably better implementation would simply call deleteTable and catch the NoSuchTableException and return false. If you made that change, the mock tests would break. Another problem is that deleteTable also loads the table by calling getTable, which is hidden by the structure of the tests.

I think that this should be tested like other implementations and needs to use CatalogTests. That will test the behavior. You could potentially do that using a Bigquery mock, but I think the way mocks are used in this PR is very specific to the implementation, so I would prefer just having a fake in-memory implementation of the Bigquery interface instead.

I would keep in mind that the goal of these tests isn't to test that dropTable is translated into getTable followed by deleteTable. If tests are validating a list of forwarded calls (at two different levels) then you aren't ensuring that the behavior of the actual catalog is correct.


/**
* Creates and returns a new dataset.
*
* @param dataset the dataset to create
*/
Dataset createDataset(Dataset dataset);

/**
* Returns a dataset.
*
* @param datasetReference full dataset reference
*/
Dataset getDataset(DatasetReference datasetReference);

/**
* Deletes a dataset.
*
* @param datasetReference full dataset reference
*/
void deleteDataset(DatasetReference datasetReference);

/**
* Updates parameters of a dataset or adds them if did not exist, leaving already-existing ones
* intact.
*
* @param datasetReference full dataset reference
* @param parameters metadata parameters to add
* @return dataset after patch
*/
Dataset setDatasetParameters(DatasetReference datasetReference, Map<String, String> parameters);

/**
* Removes given set of keys of parameters of a dataset. Ignores keys that do not exist already.
*
* @param datasetReference full dataset reference
* @param parameters metadata parameter keys to remove
* @return dataset after patch
*/
Dataset removeDatasetParameters(DatasetReference datasetReference, Set<String> parameters);

/**
* Lists datasets under a given project
*
* @param projectId the identifier of the project to list datasets under
*/
List<Datasets> listDatasets(String projectId);

/**
* Creates and returns a new table.
*
* @param table body of the table to create
*/
Table createTable(Table table);

/**
* Returns a table.
*
* @param tableReference full table reference
*/
Table getTable(TableReference tableReference);

/**
* Updates the catalog table options of an Iceberg table and returns the updated table.
*
* @param tableReference full table reference
* @param table to patch
*/
Table patchTable(TableReference tableReference, Table table);

/**
* Renames a table.
*
* @param tableToRename full table reference
* @param newTableId new table identifier
*/
Table renameTable(TableReference tableToRename, String newTableId);

/**
* Deletes a table.
*
* @param tableReference full table reference
*/
void deleteTable(TableReference tableReference);

/**
* Returns all tables in a database.
*
* @param datasetReference full dataset reference
* @param filterUnsupportedTables if true, fetches every item on the list to verify it is
* supported, in order to filter the list from unsupported Iceberg tables
*/
List<Tables> listTables(DatasetReference datasetReference, boolean filterUnsupportedTables);
}
Loading