Support converting column stats on row type to json in Delta Lake #14314
CI hit #14391 at
Rather than getChildren I think you want to convert the rowBlock to a ColumnarRow
The argument is SingleRowBlock which is unsupported in ColumnarRow#toColumnarRow.
That's surprising, toColumnarRow checks that the input is an instance of AbstractRowBlock, which SingleRowBlock extends. Seems like it should work.
Where does the error come from?
> toColumnarRow checks that the input is an instance of AbstractRowBlock, which SingleRowBlock extends. Seems like it should work.
SingleRowBlock extends AbstractSingleRowBlock, not AbstractRowBlock.
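To make the mismatch concrete, here is a minimal sketch of the hierarchy as described in this thread. The nested types are stand-ins, not the real Trino SPI classes: only the inheritance relationships are modelled, the `RowBlock extends AbstractRowBlock` link is an assumption added for contrast, and `acceptedByToColumnarRow` is a hypothetical name for the kind of instanceof guard mentioned above.

```java
public final class RowBlockHierarchySketch
{
    private RowBlockHierarchySketch() {}

    // Stand-ins for the SPI types named in the thread; bodies are intentionally empty.
    interface Block {}

    abstract static class AbstractRowBlock
            implements Block {}

    abstract static class AbstractSingleRowBlock
            implements Block {}

    // Assumption for contrast: a regular row block sits under AbstractRowBlock.
    static final class RowBlock
            extends AbstractRowBlock {}

    // As stated in the thread: SingleRowBlock extends AbstractSingleRowBlock, not AbstractRowBlock.
    static final class SingleRowBlock
            extends AbstractSingleRowBlock {}

    // Hypothetical guard mimicking an "instance of AbstractRowBlock" check.
    static boolean acceptedByToColumnarRow(Block block)
    {
        return block instanceof AbstractRowBlock;
    }

    public static void main(String[] args)
    {
        System.out.println(acceptedByToColumnarRow(new RowBlock()));
        System.out.println(acceptedByToColumnarRow(new SingleRowBlock()));
    }
}
```

With the two abstract classes being siblings rather than parent and child, the guard rejects the single-row block, which matches the error reported here.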
.../src/main/java/io/trino/plugin/deltalake/transactionlog/DeltaLakeParquetStatisticsUtils.java
CI hit #14391
The assertions for addFileEntries.get(0) and addFileEntries.get(1) are not relevant. The stats already existed there before running the test.
No, they're relevant. Those two assertions fail if we don't copy the statistics.
import static io.trino.plugin.hive.HiveTestUtils.HDFS_ENVIRONMENT;
import static io.trino.testing.TestingConnectorSession.SESSION;

public final class TestDeltaLakeUtils
{
    private TestDeltaLakeUtils() {}

    public static List<AddFileEntry> getAddFileEntries(SchemaTableName table, String tableLocation)
The table has no impact on the result of this method, so you can remove this parameter and use e.g. new SchemaTableName("dummy_schema_placeholder", "dummy_table_placeholder") below
include the key sets in the message
also, would be nice to add a comment why this is expected. it's not obvious to me
btw instead of this check here, i'd rather have a non-null check on type after Type type = columnTypeMapping.get(value.getKey()); line
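A rough sketch of what the two suggestions above could look like together: a non-null check right after the lookup, with both key sets included in the failure message. Everything here is illustrative; `StatsTypeLookupSketch` and `describeStats` are hypothetical names, and column types are plain strings instead of the connector's `Type` objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public final class StatsTypeLookupSketch
{
    private StatsTypeLookupSketch() {}

    // Hypothetical helper: pairs each stats entry with its column type and
    // fails fast, naming both key sets, when a column has no known type.
    static Map<String, String> describeStats(Map<String, Object> stats, Map<String, String> columnTypeMapping)
    {
        Map<String, String> described = new LinkedHashMap<>();
        for (Map.Entry<String, Object> value : stats.entrySet()) {
            String type = columnTypeMapping.get(value.getKey());
            if (type == null) {
                throw new IllegalStateException(String.format(
                        "No type found for column '%s'; stats columns: %s, known columns: %s",
                        value.getKey(),
                        stats.keySet(),
                        columnTypeMapping.keySet()));
            }
            described.put(value.getKey(), type + ":" + value.getValue());
        }
        return described;
    }
}
```

Checking per entry keeps the failure next to the lookup that caused it, while the message still shows how the two key sets differ.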
a "verify ..." should verify, i.e. ensure something is true
as a follow-up we could rename this to eg skipUnlessInsertsSupported
I will send a follow-up PR.
for the test, do we need transaction JSON files before the checkpoint (0 and 1)?
Those files aren't required. Removed.
do the getAddFileEntries come from a new snapshot that we just created, or from previous snapshot + transaction log files?
i think the intention is that we create transaction 4 and a checkpoint, so let's verify that happened
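One possible way to verify that, sketched under assumptions: the table lives on a local file system, the checkpoint is a single-part file, and the log follows the usual Delta Lake naming (a 20-digit zero-padded version under `_delta_log`). `DeltaLogFilesSketch` and its methods are hypothetical helpers, not part of this PR.

```java
import java.nio.file.Files;
import java.nio.file.Path;

public final class DeltaLogFilesSketch
{
    private DeltaLogFilesSketch() {}

    // Transaction log entry name for a version, e.g. version 4.
    static String transactionFileName(long version)
    {
        return String.format("%020d.json", version);
    }

    // Single-part checkpoint name for a version (multi-part checkpoints are not handled here).
    static String checkpointFileName(long version)
    {
        return String.format("%020d.checkpoint.parquet", version);
    }

    // True only when both the transaction file and the checkpoint exist for the version.
    static boolean hasTransactionAndCheckpoint(Path tableLocation, long version)
    {
        Path log = tableLocation.resolve("_delta_log");
        return Files.isRegularFile(log.resolve(transactionFileName(version)))
                && Files.isRegularFile(log.resolve(checkpointFileName(version)));
    }
}
```

A test could assert `hasTransactionAndCheckpoint(tableLocation, 4)` before reading the add file entries, so a failure points at the missing checkpoint rather than at the statistics.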
Additionally, remove a redundant argument.
Description
Fixes #13996
Release notes
(x) This is not user-visible or docs only and no release notes are required.