Iceberg files distributed #4840
Conversation
@electrum when you get a chance, can you please review this?
electrum left a comment
Still reviewing. I'm wondering if there is a way we can reuse more of the Iceberg library code for system tables so that we don't have to reimplement everything.
public FilesTable(Schema schema, TypeManager typeManager)
{
return new FixedPageSource(buildPages(tableMetadata, session, icebergTable, snapshotId));
ImmutableList.Builder<IcebergColumnHandle> columnHandleBuilder = new ImmutableList.Builder<>();
This can chain:

ImmutableList.Builder<IcebergColumnHandle> columnHandleBuilder = ImmutableList.builder()
    .add(FILE_PATH)
    .add(FILE_FORMAT)
    ...

columnHandleBuilder.add(KEY_METADATA);
columnHandleBuilder.add(SPLIT_OFFSETS);
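For readers without the Guava context: the fluent form works because the builder's add() returns the builder itself. A rough stdlib-only analogy of the same chaining pattern (Stream.Builder behaves the same way; the column names below are placeholders, not the connector's actual handles):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class BuilderChaining {
    public static void main(String[] args) {
        // Chained add() calls, analogous to the Guava ImmutableList.builder()
        // suggestion above; each add() returns the builder for the next call.
        List<String> columnHandles = Stream.<String>builder()
                .add("file_path")
                .add("file_format")
                .add("record_count")
                .build()
                .collect(Collectors.toList());
        System.out.println(columnHandles); // [file_path, file_format, record_count]
    }
}
```

Guava's ImmutableList.Builder follows the same convention, which is what makes the reviewer's chained version possible.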
List<Field> fields = Lists.newArrayList(
List<Field> boundFields = Lists.newArrayList(

columnBuilder.build()
    .forEach(column -> {
        final Type type = toPrestoType(column.type(), typeManager);
We don't add final to local variables.
new Field(Optional.of(VALUE_COUNTS), BIGINT),
new Field(Optional.of(NULL_VALUE_COUNTS), BIGINT));

final ImmutableList.Builder<Types.NestedField> columnBuilder = new ImmutableList.Builder<>();
This could be done as a stream:
schema.columns().stream()
.flatMap(column -> handleNestedType(column, Optional.empty()))
.map(column -> {
...
return new IcebergColumnHandle(...);
})
.forEach(columnHandleBuilder::add);
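As a self-contained sketch of that flatten-then-map pipeline: the Column record and flatten method below are placeholders standing in for Iceberg's Types.NestedField and the handleNestedType helper named above, not the real connector types.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ColumnFlattening {
    // Placeholder for a schema column that may contain nested children.
    record Column(String name, List<Column> children) {}

    // Recursively expand a column into itself plus all nested descendants,
    // mirroring the flatMap(column -> handleNestedType(...)) step above.
    static Stream<Column> flatten(Column column) {
        return Stream.concat(
                Stream.of(column),
                column.children().stream().flatMap(ColumnFlattening::flatten));
    }

    public static void main(String[] args) {
        List<Column> schema = List.of(
                new Column("id", List.of()),
                new Column("address", List.of(
                        new Column("street", List.of()),
                        new Column("zip", List.of()))));

        List<String> handles = schema.stream()
                .flatMap(ColumnFlattening::flatten)
                .map(Column::name) // stand-in for building an IcebergColumnHandle
                .collect(Collectors.toList());

        System.out.println(handles); // [id, address, street, zip]
    }
}
```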
{
    SchemaTableName table = tableHandle.getSchemaTableName();
    TableIdentifier tableIdentifier = tableHandle.toTableIdentifier();
    if (MetadataTableType.from(tableIdentifier.name()) != null && tableIdentifier.namespace().levels().length == 2) {
Why do we need to check the namespace levels length? When would it not be 2?
public static Table getIcebergTable(HiveMetastore metastore, HdfsEnvironment hdfsEnvironment, ConnectorSession session, IcebergTableHandle tableHandle)
{
    SchemaTableName table = tableHandle.getSchemaTableName();
This could move inside the if block
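A minimal illustration of the scope-narrowing the reviewer is asking for; the method and variable names here are invented for the example, not the connector's code:

```java
public class ScopeNarrowing {
    // A local that is only needed inside one branch reads better when it is
    // declared inside that branch, keeping the enclosing method's scope clean.
    static String describe(boolean isMetadataTable, String rawName) {
        if (isMetadataTable) {
            String table = rawName.toLowerCase(); // declared where it's used
            return "metadata:" + table;
        }
        return "base:" + rawName;
    }

    public static void main(String[] args) {
        System.out.println(describe(true, "FILES"));   // metadata:files
        System.out.println(describe(false, "orders")); // base:orders
    }
}
```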
private static Table loadMetadataTable(MetadataTableType type, SchemaTableName table, HiveMetastore metastore, HdfsEnvironment hdfsEnvironment, HdfsContext hdfsContext, HiveIdentity identity)
{
    if (type != null) {
We don't need to check since this method isn't called with null. It's the caller's responsibility to do the check.
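That convention (validate at the call site, let the private helper assume its inputs) is commonly expressed with Objects.requireNonNull. A hedged sketch with placeholder names, not the connector's actual methods:

```java
import java.util.Objects;

public class CallerChecks {
    enum MetadataTableType { FILES, PARTITIONS }

    // Private helper assumes a non-null type; no defensive re-check inside.
    private static String loadMetadataTable(MetadataTableType type) {
        return "loaded " + type;
    }

    // The caller is responsible for the null check before delegating.
    static String load(MetadataTableType type) {
        Objects.requireNonNull(type, "type is null");
        return loadMetadataTable(type);
    }

    public static void main(String[] args) {
        System.out.println(load(MetadataTableType.FILES)); // loaded FILES
    }
}
```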
if (type != null) {
    TableOperations ops = new HiveTableOperations(metastore, hdfsEnvironment, hdfsContext, identity, table.getSchemaName(), table.getTableName());
    if (ops.current() == null) {
        throw new NoSuchTableException("Table does not exist: " + table);
This should throw Presto TableNotFoundException
case PARTITIONS:
    return new PartitionsTable(ops, baseTable);
default:
    throw new NoSuchTableException("Unknown metadata table type: %s for %s", type, table);
This is a bug, so do:

throw new VerifyException(format("Unknown metadata table type [%s] for table: %s", type, table));
👋 @Parth-Brahmbhatt - this PR is inactive and doesn't seem to be under development. If you'd like to continue work on this at any point in the future, feel free to re-open.
No description provided.