Support creating tables with column comment in Delta Lake #12455
Please split this into two PRs.
CI hit #12471.
Verified Spark compatibility locally.
...elta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/DeltaLakeSchemaSupport.java
private final Optional<String> comment;

@Deprecated
public DeltaLakeColumnHandle(
Can you please remove the usages of this constructor from the trino-delta-lake module? I still found 2 usages of it.
Actually, there are many more than 2 usages. That is why I marked it @Deprecated instead of migrating them. I can migrate them all if that's not a burden for reviewers.
DeltaLakeColumnHandle (DLCH) is used as a key in maps, and the comment is part of its equality.
Unfortunately, we need to update all usages appropriately; otherwise we cannot reason about the correctness of the codebase.
I would very much prefer not adding the comment here, to the table handle.
It bit us a few times in Iceberg, so I would prefer to see whether we can transport the comment to ColumnMetadata some other way.
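To illustrate the concern, here is a minimal sketch, not the actual Trino code. `CommentedColumnHandle` is a hypothetical stand-in for DeltaLakeColumnHandle with only three fields, and the map of statistics is made up for the example. It assumes the comment takes part in `equals` and `hashCode`, as described above, and shows a lookup missing when one call site supplies the comment and another does not.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical stand-in for DeltaLakeColumnHandle; the real class has more fields.
final class CommentedColumnHandle
{
    private final String name;
    private final String type;
    private final Optional<String> comment;

    CommentedColumnHandle(String name, String type, Optional<String> comment)
    {
        this.name = Objects.requireNonNull(name, "name is null");
        this.type = Objects.requireNonNull(type, "type is null");
        this.comment = Objects.requireNonNull(comment, "comment is null");
    }

    @Override
    public boolean equals(Object other)
    {
        if (this == other) {
            return true;
        }
        if (!(other instanceof CommentedColumnHandle)) {
            return false;
        }
        CommentedColumnHandle that = (CommentedColumnHandle) other;
        // The comment takes part in equality, which is the source of the problem.
        return name.equals(that.name) && type.equals(that.type) && comment.equals(that.comment);
    }

    @Override
    public int hashCode()
    {
        return Objects.hash(name, type, comment);
    }
}

final class ColumnHandleEqualityDemo
{
    private ColumnHandleEqualityDemo() {}

    // Returns true if a handle built without a comment finds the entry that was
    // stored under a handle for the same column built with a comment.
    static boolean lookupSucceeds()
    {
        Map<CommentedColumnHandle, String> statistics = new HashMap<>();
        statistics.put(new CommentedColumnHandle("id", "bigint", Optional.of("primary key")), "min=1, max=9");

        // A call site that was not migrated passes no comment.
        CommentedColumnHandle fromOldCallSite = new CommentedColumnHandle("id", "bigint", Optional.empty());
        return statistics.containsKey(fromOldCallSite);
    }

    public static void main(String[] args)
    {
        System.out.println(lookupSucceeds()); // prints "false"
    }
}
```

Both handles describe the same column, yet the lookup misses, which is why every call site would have to be migrated consistently if the comment stayed on the handle.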
plugin/trino-accumulo/src/test/java/io/trino/plugin/accumulo/TestAccumuloConnectorTest.java
...roduct-tests/src/main/java/io/trino/tests/product/deltalake/TestDeltaLakeOssCreateTable.java
@ebyhr please squash and ping me for a review.
@findepi Squashed commits.
private static Optional<String> getComment(JsonNode node)
{
    return Optional.ofNullable(node.get("metadata"))
Why is node.get("metadata") nullable?
    assertUpdate("DROP TABLE " + tableName);
}

@Test
"Deny creating tables with column comment if unsupported" looks solid and is only loosely related to Delta.
Let's move it to a separate PR.
@findepi Updated not to touch |
String location = metastore.getTableLocation(tableHandle.getSchemaTableName(), session);
List<ColumnMetadata> columns = getColumns(tableHandle.getMetadataEntry()).stream()
        .map(DeltaLakeMetadata::getColumnMetadata)
        .map(column -> getColumnMetadata(column, getColumnComments(tableHandle.getMetadataEntry())))
getColumnComments(tableHandle.getMetadataEntry()) is evaluated once for every column; it should be evaluated only once.
return metastore.getMetadata(metastore.getSnapshot(table, session), session).stream().map(metadata -> {
    List<ColumnMetadata> columnMetadata = getColumns(metadata).stream()
            .map(DeltaLakeMetadata::getColumnMetadata)
            .map(column -> getColumnMetadata(column, getColumnComments(metadata)))
Same here: getColumnComments(metadata) is evaluated once for every column; it should be evaluated only once.
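A minimal sketch of the suggested change, under stated assumptions: `parseColumnComments` is a hypothetical stand-in for getColumnComments (the real method takes a MetadataEntry and parses the serialized schema), and the counter exists only to make the number of calls visible. The first method calls the parser inside the lambda, the second hoists the call out of the stream.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

final class HoistColumnCommentsDemo
{
    private HoistColumnCommentsDemo() {}

    // Counts how often the hypothetical schema parse runs.
    static final AtomicInteger parseCount = new AtomicInteger();

    // Hypothetical stand-in for getColumnComments(...): pretend this parses schema JSON.
    static Map<String, String> parseColumnComments()
    {
        parseCount.incrementAndGet();
        return Map.of("id", "primary key");
    }

    // Before: the parse is invoked inside the lambda, so it runs once per column.
    static List<String> describePerColumn(List<String> columns)
    {
        return columns.stream()
                .map(column -> column + ": " + parseColumnComments().getOrDefault(column, ""))
                .collect(Collectors.toList());
    }

    // After: the parse runs once and the lambda captures the result.
    static List<String> describeHoisted(List<String> columns)
    {
        Map<String, String> comments = parseColumnComments();
        return columns.stream()
                .map(column -> column + ": " + comments.getOrDefault(column, ""))
                .collect(Collectors.toList());
    }
}
```

Both methods return the same list; for three columns the first parses three times and the second parses once.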
        .build();
}

public static Map<String, Optional<String>> getColumnComments(MetadataEntry metadataEntry)
Is there a difference between a missing entry and an entry that is Optional.empty()?
Maybe we could just have a map of non-null comments, passed around as Map<String, String>?
We would lose the ability to validate that the map contains entries for all columns, but you don't do this anyway (given columnComments.getOrDefault...).
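A minimal sketch of that suggestion, assuming a missing entry and Optional.empty() mean the same thing. The class and method names are hypothetical and not taken from the PR. Empty entries are dropped, so "no comment" is represented only by absence, and callers wrap the lookup in an Optional where they need one.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

final class ColumnCommentsMapDemo
{
    private ColumnCommentsMapDemo() {}

    // Drops empty entries so that "no comment" is represented only by absence.
    static Map<String, String> nonEmptyComments(Map<String, Optional<String>> comments)
    {
        Map<String, String> result = new LinkedHashMap<>();
        comments.forEach((column, comment) -> comment.ifPresent(value -> result.put(column, value)));
        return result;
    }

    // Callers that want an Optional can still get one at the point of use.
    static Optional<String> commentFor(Map<String, String> comments, String column)
    {
        return Optional.ofNullable(comments.get(column));
    }
}
```

The trade-off is the one noted above: with this shape, a column that has no comment cannot be told apart from a column that is missing from the map.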
        .orElseThrow(() -> new IllegalStateException("Serialized schema not found in transaction log for " + metadataEntry.getName()));
}

private static Map<String, Optional<String>> getColumnComment(String json)
getColumnComment -> getColumnComments (plural)
    }
}

private static Map.Entry<String, Optional<String>> columnComment(JsonNode node)
@findepi Addressed comments.
}

@Test(groups = {DELTA_LAKE_DATABRICKS, PROFILE_SPECIFIC_TESTS})
public void testCreateTableWithColumnComment()
Do we have a test that goes the other way, creating the comment in Delta and reading from Trino?
        handle.getMetadataEntry().getId(),
        columnsBuilder.build(),
        partitionColumns,
        ImmutableMap.of(),
Can you add a test that creates a table with comments and then adds a column? Make sure this doesn't wipe the existing comments.
        randomUUID().toString(),
        handle.getInputColumns(),
        handle.getPartitionedBy(),
        ImmutableMap.of(),
Should this match what we do in createTable?
I think it's not needed, as column comments are unsupported in CTAS at the syntax level.
Yep, you're totally right. Thanks
alexjo2144 left a comment
Do you want to support ALTER TABLE ADD COLUMN with a comment here, or create a separate PR?
I want to use a separate PR for ADD COLUMN with a comment.
Rebased on upstream to resolve conflicts.
Description
Support creating tables with column comment in Delta Lake
Documentation
( ) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
( ) No release notes entries required.
(x) Release notes entries required with the following suggested text: