Add support for partition pruning in Delta checkpoint iterator #19588
ebyhr merged 1 commit into trinodb:master
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeMetadata.java
a4106b2 to 7c9ac69
plugin/trino-delta-lake/src/test/java/io/trino/plugin/deltalake/TestingDeltaLakeUtils.java
Shouldn't we have only 1 entry here?
This probably relates to https://github.com/trinodb/trino/pull/19588/files/7c9ac692875bdb08827aa1dc9f7beac63a9874d4#r1383331077
We should also have a check that a reduced number of entries has actually been read from the Parquet file:
assertThat(checkpointEntryIterator.getCompletedPositions().orElseThrow()).isEqualTo(....);
In buildAddEntry, check whether the partitionValues / partitionValues_parsed match the partitionConstraint and return null if they do not.
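A minimal sketch of that suggestion, using plain Java collections rather than Trino's own types. The names `PartitionFilterSketch`, `AddEntry`, and `matches`, as well as the map-based constraint, are hypothetical stand-ins for whatever the real `buildAddEntry` and `partitionConstraint` look like; only the "return null when not matching" behaviour comes from the comment above.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class PartitionFilterSketch
{
    // Hypothetical stand-in for an add entry: a file path plus its partition values.
    record AddEntry(String path, Map<String, Optional<String>> partitionValues) {}

    // Hypothetical constraint: column name -> allowed values.
    // A column that is absent from the constraint map is treated as unconstrained.
    static boolean matches(Map<String, Set<String>> constraint, Map<String, Optional<String>> partitionValues)
    {
        for (Map.Entry<String, Set<String>> column : constraint.entrySet()) {
            Optional<String> value = partitionValues.getOrDefault(column.getKey(), Optional.empty());
            // In this sketch a missing (null) partition value never satisfies an explicit value list.
            if (value.isEmpty() || !column.getValue().contains(value.get())) {
                return false;
            }
        }
        return true;
    }

    // Returns null for entries pruned by the constraint, as the review comment proposes.
    static AddEntry buildAddEntry(String path, Map<String, Optional<String>> partitionValues, Map<String, Set<String>> constraint)
    {
        if (!matches(constraint, partitionValues)) {
            return null;
        }
        return new AddEntry(path, partitionValues);
    }
}
```

With a constraint of `part IN ('a')`, an entry whose `part` value is `a` is built and one whose value is `b` comes back as null, so the caller can skip it without creating a transaction log entry.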
...lake/src/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointWriter.java
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeConfig.java
...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointEntryIterator.java
7c9ac69 to 9adb767
Just rebased on master.
Perhaps remove && !partitionConstraint.isAll()
I think the new code path should eventually replace the old cache-based approach, so we can use isCheckpointPartitionFilterEnabled as an algorithm-selecting toggle.
9adb767 to e73edbc
CI hit #19602
Do we need to check this for every position? It seems we should know this per file, based on the Parquet file metadata (maybe it's possible to use io.trino.plugin.hive.ReaderPageSource#getReaderColumns).
Agree with using Parquet metadata, though getReaderColumns returns an empty list in this case. Sent another PR: #19727.
While this may help in reducing the number of DeltaLakeTransactionLogEntry, doing the filtering after materialising all channels on each position of a page means that we can't benefit from lazy loading of blocks.
Ideally we should filter directly on the relevant block channels and skip to next position without decoding the remaining channels when the predicate does not match. But this can be looked at as a follow-up.
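The follow-up idea can be sketched roughly as below. This is an illustration under stated assumptions, not the Trino page or block API: `LazyChannelFilterSketch`, `readMatching`, and the two functional parameters are hypothetical, with the predicate standing in for a check against the partition channel only and the decoder standing in for materialising the remaining channels.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;
import java.util.function.IntPredicate;

class LazyChannelFilterSketch
{
    // partitionMatches inspects only the (cheap) partition channel at a position.
    // decodeRemainingChannels is the expensive step and runs only for matching positions.
    static <T> List<T> readMatching(int positionCount, IntPredicate partitionMatches, IntFunction<T> decodeRemainingChannels)
    {
        List<T> entries = new ArrayList<>();
        for (int position = 0; position < positionCount; position++) {
            if (!partitionMatches.test(position)) {
                // Skip to the next position without touching the other channels.
                continue;
            }
            entries.add(decodeRemainingChannels.apply(position));
        }
        return entries;
    }
}
```

The point of the ordering is that the decoder is never invoked for pruned positions, which is what would let lazily loaded blocks stay unloaded when no position in a page matches.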
Correct.
The partition matching check should be done directly in io.trino.plugin.deltalake.transactionlog.checkpoint.CheckpointEntryIterator#buildAddEntry
If we know that we have the field partitionValues_parsed (see https://github.com/trinodb/trino/pull/19588/files#r1389691135), maybe we should do this check right away after doing
optional: One word concerning using entry.getAdd().getCanonicalPartitionValues().
We have the partitionValues_parsed at hand. We could avoid deserializing the stringified partition values and use the "parsed" values directly. OTOH, we don't actually use the parsed partition values anywhere else. Did you intentionally refrain from reading the parsed partition values in favor of the stringified partition values?
...-delta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/TransactionLogAccess.java
...st/java/io/trino/plugin/deltalake/transactionlog/checkpoint/TestCheckpointEntryIterator.java
...st/java/io/trino/plugin/deltalake/transactionlog/checkpoint/TestCheckpointEntryIterator.java
...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointEntryIterator.java
e73edbc to c0494e2
Why TODO? Why not do it right away?
I just wanted to focus on the SELECT path in this PR. Going to handle it in this PR.
...-delta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/TransactionLogAccess.java
...-delta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/TransactionLogAccess.java
...-delta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/TransactionLogAccess.java
The callers (e.g. the split source) will likely repeat this work, so it's partially wasted.
Still useful because this allows us to materialize a shorter list.
I think this wouldn't be needed here if we could return a Stream/Iterator instead of a List.
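To illustrate the Stream-versus-List remark, here is a small sketch with hypothetical names (`StreamingEntriesSketch`, `activeFiles`); entries are plain strings here, not the real add file entries. Returning a lazy stream lets the caller chain its own predicate, so no intermediate list is built between the two filters.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Stream;

class StreamingEntriesSketch
{
    // Returns a lazy view: nothing is filtered or copied until the caller consumes the stream.
    static Stream<String> activeFiles(List<String> checkpointEntries, Predicate<String> partitionFilter)
    {
        return checkpointEntries.stream().filter(partitionFilter);
    }
}
```

A caller such as a split source could then apply further filtering with another `.filter(...)` on the returned stream and collect only once at the end.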
require... or use...?
We want to use the partitionvalues_parsed field if it is present, but we don't require that it exists (we don't fail when it doesn't), right?
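The "use, don't require" semantics could look roughly like this. It is a sketch for a single integer partition column; `ParsedPartitionValuesSketch` and `partitionValue` are hypothetical names, and the stringified value is assumed to be non-null.

```java
import java.util.Optional;

class ParsedPartitionValuesSketch
{
    // Prefer the already-typed value when the checkpoint carries it;
    // otherwise fall back to parsing the stringified partition value instead of failing.
    static long partitionValue(Optional<Long> parsed, String stringified)
    {
        return parsed.orElseGet(() -> Long.parseLong(stringified));
    }
}
```

Because `orElseGet` is lazy, the string is only parsed when the parsed field is absent.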
The set of partitioning columns may change in the meantime, probably only through the CREATE OR REPLACE TABLE operation. In such a case, we shouldn't need to read the old checkpoint file at all, but I don't know whether this is the case.
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeConfig.java
plugin/trino-delta-lake/src/test/java/io/trino/plugin/deltalake/TestDeltaLakeConfig.java
Could you please add test coverage to TestDeltaLakeFileOperations with the checkpoint_filtering_enabled session property enabled, to make the consequences of this change more transparent?
97cf6e7 to 7f87123
Release notes
(x) Release notes are required, with the following suggested text: