AWS: support registerTable in GlueCatalog #4099
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -22,6 +22,7 @@ | |
| import java.io.Closeable; | ||
| import java.io.IOException; | ||
| import java.util.List; | ||
| import java.util.Locale; | ||
| import java.util.Map; | ||
| import java.util.Set; | ||
| import java.util.stream.Collectors; | ||
|
|
@@ -48,6 +49,7 @@ | |
| import org.apache.iceberg.io.FileIO; | ||
| import org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting; | ||
| import org.apache.iceberg.relocated.com.google.common.base.Preconditions; | ||
| import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap; | ||
| import org.apache.iceberg.relocated.com.google.common.collect.Lists; | ||
| import org.apache.iceberg.relocated.com.google.common.collect.Maps; | ||
| import org.apache.iceberg.util.LockManagers; | ||
|
|
@@ -431,6 +433,39 @@ protected boolean isValidIdentifier(TableIdentifier tableIdentifier) { | |
| IcebergToGlueConverter.isValidTableName(tableIdentifier.name()); | ||
| } | ||
|
|
||
| @Override | ||
| public org.apache.iceberg.Table registerTable(TableIdentifier identifier, String metadataFileLocation) { | ||
|
Contributor
Is it possible to do this more generically? Something like:
public Table registerTable(TableIdentifier identifier, String metadataFileLocation) {
  TableOperations ops = newTableOps(identifier);
  if (ops.current() != null) {
    throw new AlreadyExistsException("Table already exists: %s", identifier);
  }
  FileIO io = ops.io();
  TableMetadata metadata = TableMetadataParser.read(io, metadataFileLocation);
  try {
    // use temporary ops to pick up the table metadata settings
    ops.temp(metadata).commit(null, metadata);
  } catch (CommitFailedException ignored) {
    throw new AlreadyExistsException("Table was created concurrently: %s", identifier);
  }
  return new BaseTable(ops, fullTableName(name(), identifier));
}
That will rewrite the metadata file rather than using it directly, but it seems like it would work in most cases.
Contributor
@jackye1995, what do you think about this suggestion? |
||
| Preconditions.checkArgument(isValidIdentifier(identifier), | ||
|
Contributor
Author
These can likely be generalized to the base metastore class. I will do that after we have some other implementations, to see how much common code we can extract.
Contributor
+1, it looks like we are missing preconditions on metadataFileLocation in HiveCatalog. Adding them at BaseMetaStoreClass will unify this. |
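To make the idea of shared preconditions concrete, here is a minimal sketch of what a common validation helper could look like. The class and method names (`RegisterTableChecks`, `validate`) are hypothetical and not part of Iceberg, and the sketch uses plain `IllegalArgumentException` instead of the relocated Guava `Preconditions` so that it runs on its own. It only mirrors the two checks in this diff.

```java
/** Hypothetical helper, not part of Iceberg: shared argument checks for registerTable. */
final class RegisterTableChecks {
  private RegisterTableChecks() {}

  /**
   * Mirrors the two checks in the diff: the identifier must be valid and the
   * metadata file location must be non-null and non-empty.
   * The caller passes in the result of its own isValidIdentifier check.
   */
  static void validate(boolean identifierIsValid, String identifier, String metadataFileLocation) {
    if (!identifierIsValid) {
      throw new IllegalArgumentException("Table identifier to register is invalid: " + identifier);
    }
    if (metadataFileLocation == null || metadataFileLocation.isEmpty()) {
      throw new IllegalArgumentException("Cannot register an empty metadata file location as a table");
    }
  }
}
```

A catalog implementation could call such a helper first and then perform its own store-specific registration, which is roughly what generalizing to the base class would achieve.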
||
| "Table identifier to register is invalid: " + identifier); | ||
| Preconditions.checkArgument(metadataFileLocation != null && !metadataFileLocation.isEmpty(), | ||
| "Cannot register an empty metadata file location as a table"); | ||
|
|
||
| Map<String, String> tableParameters = ImmutableMap.of( | ||
| BaseMetastoreTableOperations.TABLE_TYPE_PROP, | ||
| BaseMetastoreTableOperations.ICEBERG_TABLE_TYPE_VALUE.toLowerCase(Locale.ENGLISH), | ||
| BaseMetastoreTableOperations.METADATA_LOCATION_PROP, | ||
| metadataFileLocation); | ||
|
|
||
| TableInput tableInput = TableInput.builder() | ||
| .name(IcebergToGlueConverter.getTableName(identifier)) | ||
| .tableType(GlueTableOperations.GLUE_EXTERNAL_TABLE_TYPE) | ||
| .parameters(tableParameters) | ||
| .build(); | ||
|
|
||
| try { | ||
| glue.createTable(CreateTableRequest.builder() | ||
| .databaseName(IcebergToGlueConverter.getDatabaseName(identifier)) | ||
| .tableInput(tableInput) | ||
|
Contributor
Should we save |
||
| .build()); | ||
| } catch (software.amazon.awssdk.services.glue.model.AlreadyExistsException e) { | ||
|
Contributor
Do we need the full class path here? Is there another AlreadyExistsException?
Contributor
Author
Yes, this is rethrown as Iceberg's AlreadyExistsException. |
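For readers following the naming clash, the pattern can be sketched in isolation. Everything below is a stand-in: `GlueModel` and `IcebergExceptions` are hypothetical holder classes that imitate two packages each declaring an `AlreadyExistsException`, and `createTable` is an illustrative wrapper, not the catalog's actual method. The point is only that the catch clause has to qualify the service exception so that the unqualified name can refer to the one being thrown.

```java
/** Sketch of translating one AlreadyExistsException into another with the same simple name. */
final class ExceptionTranslationSketch {
  private ExceptionTranslationSketch() {}

  /** Stand-in for the AWS SDK Glue model package (hypothetical). */
  static final class GlueModel {
    static final class AlreadyExistsException extends RuntimeException {
      AlreadyExistsException(String message) {
        super(message);
      }
    }
  }

  /** Stand-in for Iceberg's exceptions package (hypothetical). */
  static final class IcebergExceptions {
    static final class AlreadyExistsException extends RuntimeException {
      AlreadyExistsException(Throwable cause, String message, Object... args) {
        super(String.format(message, args), cause);
      }
    }
  }

  /** Runs the service call and rethrows the service exception as the Iceberg-style one. */
  static void createTable(String identifier, Runnable glueCall) {
    try {
      glueCall.run();
    } catch (GlueModel.AlreadyExistsException e) {
      // The qualified name in the catch clause avoids ambiguity with the exception thrown below.
      throw new IcebergExceptions.AlreadyExistsException(e, "Table %s already exists in Glue", identifier);
    }
  }
}
```

The original cause is kept on the rethrown exception, so the service-side details remain available to callers.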
||
| throw new AlreadyExistsException(e, "Table %s already exists in Glue", identifier); | ||
| } catch (EntityNotFoundException e) { | ||
| throw new NoSuchNamespaceException(e, "Namespace %s is not found in Glue", identifier.namespace()); | ||
| } | ||
|
|
||
| return loadTable(identifier); | ||
| } | ||
|
|
||
| @Override | ||
| public String name() { | ||
| return catalogName; | ||
|
|
||
I think we can avoid the refresh calls.