Construct AddFileEntry instance only if necessary while reading the Delta Lake checkpoint by findinpath · Pull Request #19795 · trinodb/trino

findinpath · 2023-11-17T11:31:15Z

Description

When checkpoint filtering is applied and the partition constraints do not match
the partition values of an entry, avoid constructing the AddFileEntry instance for that entry.

Split the loading of the add entries from the Parquet checkpoint into two channels:

one channel contains the partitionValues information
the other channel contains everything else related to add

When building the add entry, first check the partition constraint against the partition values and only then load the add block, to avoid spending unnecessary resources on deserialization.
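The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `LazyAddEntrySketch`, `AddEntry`, `Position`, `buildAddEntry` and `position` are hypothetical, simplified stand-ins, and the deferred decoding is modelled with a plain `Supplier` rather than Trino blocks.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical, simplified stand-ins; these are not Trino's actual classes.
final class LazyAddEntrySketch
{
    // Simplified add entry: only path, size and partition values.
    record AddEntry(String path, long size, Map<String, String> partitionValues) {}

    // One checkpoint position: the cheap partition values plus a deferred decoder for the rest of "add".
    record Position(Map<String, String> partitionValues, Supplier<AddEntry> addDecoder) {}

    private LazyAddEntrySketch() {}

    // Check the partition constraint first; decode the remaining add fields only on a match.
    static Optional<AddEntry> buildAddEntry(Position position, Predicate<Map<String, String>> partitionConstraint)
    {
        if (!partitionConstraint.test(position.partitionValues())) {
            return Optional.empty(); // pruned: the add decoder is never invoked
        }
        return Optional.of(position.addDecoder().get());
    }

    // Builds a position whose decoder counts how often it is invoked.
    static Position position(String path, Map<String, String> partitionValues, int[] decodeCount)
    {
        return new Position(partitionValues, () -> {
            decodeCount[0]++;
            return new AddEntry(path, 0L, partitionValues);
        });
    }

    public static void main(String[] args)
    {
        int[] decodeCount = {0};
        List<Position> positions = List.of(
                position("part-0.parquet", Map.of("day", "2023-11-17"), decodeCount),
                position("part-1.parquet", Map.of("day", "2023-11-18"), decodeCount));

        Predicate<Map<String, String>> constraint = values -> "2023-11-17".equals(values.get("day"));
        for (Position position : positions) {
            buildAddEntry(position, constraint).ifPresent(entry -> System.out.println(entry.path()));
        }
        System.out.println("decoded entries: " + decodeCount[0]);
    }
}
```

In this sketch only the position whose partition values satisfy the constraint has its decoder invoked; the other position is skipped before any add-related deserialization happens.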

I tested this feature with a multi-part checkpoint file (25 parts of around 12MB each, ~300MB in total) stored in a local MinIO and got the following results:

ADD retrieval of all entries
number of add entries: Optional[1235155]
checkpoint iterator completed positions: Optional[1235157]
checkpoint iterator completed bytes: Optional[323227977]
Elapsed Time in milliseconds: 17866

ADD partition pruning without current changes
number of add entries: Optional[18578]
checkpoint iterator completed positions: Optional[49290]
checkpoint iterator completed bytes: Optional[14099674]
Elapsed Time in milliseconds: 1056

ADD partition pruning with current changes
number of add entries: Optional[18578]
checkpoint iterator completed positions: Optional[49290]
checkpoint iterator completed bytes: Optional[14099674]
Elapsed Time in milliseconds: 701

As the numbers above show, this change brings no noticeable improvement in terms of IO.
That's because the Parquet page is loaded anyway once it contains at least one entry matching the partition predicate.
This is why, with partition pruning applied, the checkpoint iterator still reads the same number of bytes as the baseline.

However, the elapsed time decreases with this change because less deserialization is performed.

I also tested with a more permissive filter and didn't see an improvement bigger than ~0.5s in elapsed time between the baseline and the current changes.

ADD partition pruning without current changes
number of add entries: Optional[210575]
checkpoint iterator completed positions: Optional[248524]
checkpoint iterator completed bytes: Optional[66021553]
Elapsed Time in milliseconds: 3992

ADD partition pruning with current changes
number of add entries: Optional[210575]
checkpoint iterator completed positions: Optional[248524]
checkpoint iterator completed bytes: Optional[66021553]
Elapsed Time in milliseconds: 3532
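
For reference, the relative savings implied by the elapsed times above can be worked out with a small helper. This is only arithmetic on the reported single-run numbers, so it says nothing about variance; `ElapsedTimeComparison` and `reductionPercent` are hypothetical names used for illustration.

```java
final class ElapsedTimeComparison
{
    private ElapsedTimeComparison() {}

    // Relative reduction in percent when going from baselineMillis to changedMillis.
    static double reductionPercent(long baselineMillis, long changedMillis)
    {
        return 100.0 * (baselineMillis - changedMillis) / baselineMillis;
    }

    public static void main(String[] args)
    {
        // Selective filter (18578 add entries): 1056 ms without the change, 701 ms with it
        System.out.printf("selective filter: %d ms saved, %.1f%%%n", 1056 - 701, reductionPercent(1056, 701));
        // More permissive filter (210575 add entries): 3992 ms without the change, 3532 ms with it
        System.out.printf("permissive filter: %d ms saved, %.1f%%%n", 3992 - 3532, reductionPercent(3992, 3532));
    }
}
```

That is 355 ms (roughly 34%) saved for the selective filter and 460 ms (roughly 12%) for the more permissive one, which matches the ~0.5s upper bound mentioned above.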

Additional context and related issues

This change builds on top of #19588

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Delta
* Improve query planning performance on Delta Lake tables. ({issue}`19795`)

...in/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/AddFileEntry.java

...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointEntryIterator.java

...in/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/AddFileEntry.java

findinpath · 2023-11-21T14:23:52Z

...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointEntryIterator.java

That's not enough - while debugging buildAddEntry I see that none of the blocks are lazy.

Even if we use lazy blocks, the block behind the lazy block is the monolithic structure corresponding to the add entry.
If we want to actually avoid reading from parquet add related fields for the entries which are not relevant, we need to refactor the way we are reading the checkpoint.

Even if we use lazy blocks, the block behind the lazy block is the monolithic structure

yes

If we want to actually avoid reading from parquet add related fields for the entries which are not relevant,

i don't think the parquet reader supports that, or can support that, given how values are encoded in Parquet.

i don't think the parquet reader supports that,

I'm pointing here towards using a similar method as for dereference pushdown.

findepi · 2023-11-24T10:52:10Z

Does this technically conflict with #19848 ?

findinpath · 2023-11-24T11:40:03Z

Does this technically conflict with #19848 ?

No, it shouldn't.

The stats to be read with #19848 are built statically.
This change is mostly about splitting the reading of the add entries from Parquet into two separate channels, specifically:

partitionValues
everything else

findepi

"Avoid early to construct the AddFileEntry"

...in/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/AddFileEntry.java

...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointEntryIterator.java

findepi

"Create CheckPointFieldExtractor instance only if necessary"

findepi · 2023-11-24T11:59:27Z

...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointEntryIterator.java

requireNonNull becomes redundant

findepi · 2023-11-24T12:46:19Z

...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointEntryIterator.java

|| addPartitionValuesBlock.isNull(pagePosition) wasn't here, right?
why is it being added?

The isNull check for addPartitionValuesBlock is being added because the add entry is now built from 2 blocks (instead of 1 initially) and they need to be consistent.
I'm slightly changing the logic, though. Thank you for raising this.
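
A minimal sketch of the consistency check discussed in this thread, assuming a hypothetical `NullableBlock` interface in place of Trino's `Block`. It illustrates the `||` condition quoted in the review comment, not the logic that was eventually merged (which, per the reply above, changed slightly); `TwoBlockNullCheckSketch`, `hasAddEntry` and `fromNullFlags` are illustrative names.

```java
final class TwoBlockNullCheckSketch
{
    // Hypothetical minimal view of a block; this is not Trino's Block interface.
    interface NullableBlock
    {
        boolean isNull(int position);
    }

    private TwoBlockNullCheckSketch() {}

    // A position yields an add entry only if both the add block and the
    // partitionValues block are non-null at that position.
    static boolean hasAddEntry(NullableBlock addBlock, NullableBlock addPartitionValuesBlock, int pagePosition)
    {
        return !(addBlock.isNull(pagePosition) || addPartitionValuesBlock.isNull(pagePosition));
    }

    // Test helper: a block backed by one null flag per position.
    static NullableBlock fromNullFlags(boolean... nullFlags)
    {
        return position -> nullFlags[position];
    }

    public static void main(String[] args)
    {
        NullableBlock addBlock = fromNullFlags(false, true, false);
        NullableBlock addPartitionValuesBlock = fromNullFlags(false, true, true);
        for (int pagePosition = 0; pagePosition < 3; pagePosition++) {
            System.out.println(pagePosition + " -> " + hasAddEntry(addBlock, addPartitionValuesBlock, pagePosition));
        }
    }
}
```

The third position in `main` shows the inconsistent case the check guards against: the add block is present but its partition values are not.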

findepi · 2023-11-24T12:47:00Z

...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointEntryIterator.java

Is this useful, especially considering Block.toString?

or to be removed?

Are you suggesting that we eventually remove all the debug statements from the CheckpointEntryIterator class?
Potential follow-up?

We could log the field count of RowBlock once per Page somewhere; logging it for every position in a Page looks unnecessary.

...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointEntryIterator.java

...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointSchemaManager.java

...in/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/AddFileEntry.java

...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointEntryIterator.java

findinpath · 2023-11-27T17:19:51Z

Rebased on master to handle conflicts with #19848

findepi · 2023-11-28T11:58:59Z

Rebased on master to handle conflicts with #19848

that's why i asked #19795 (comment) :)

findinpath · 2023-12-06T11:47:48Z

Rebased on master to address code conflicts.

...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointEntryIterator.java

raunaqmorarka · 2023-12-08T02:14:22Z

...in/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/AddFileEntry.java

Please reword the commit message to

Construct AddFileEntry lazily

When checkpoint filtering is applied and there are partition constraints which do not match the partition values of the entry, avoid eagerly constructing `AddFileEntry`.

Modified to

Construct AddFileEntry instance only if necessary

When checkpoint filtering is applied and there are partition constraints which do not match the partition values of the entry, avoid eagerly to construct `AddFileEntry` instances.

...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointEntryIterator.java

When checkpoint filtering is applied and there are partition constraints which do not match the partition values of the entry, avoid eagerly to construct `AddFileEntry` instances.

In case of performing checkpoint filtering in Delta Lake, avoid reading from Parquet pages loaded in memory the `add` entries which don't match the partition predicate.

cla-bot bot added the cla-signed label Nov 17, 2023

findinpath requested a review from ebyhr November 17, 2023 11:31

findinpath self-assigned this Nov 17, 2023

github-actions bot added the delta-lake Delta Lake connector label Nov 17, 2023

findinpath added the delta-lake Delta Lake connector label Nov 17, 2023

ebyhr reviewed Nov 20, 2023

...in/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/AddFileEntry.java (outdated)

findinpath force-pushed the findinpath/add-file-entry branch from 1a41795 to a8e51ec Compare November 20, 2023 10:44

ebyhr approved these changes Nov 20, 2023

findinpath requested review from findepi and raunaqmorarka November 21, 2023 07:07

raunaqmorarka reviewed Nov 21, 2023

findepi reviewed Nov 21, 2023

...in/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/AddFileEntry.java (outdated)

findinpath force-pushed the findinpath/add-file-entry branch from a8e51ec to efd2637 Compare November 21, 2023 11:48

findinpath mentioned this pull request Nov 21, 2023

Prune unused stats columns when reading Delta checkpoint #19848

Merged

findinpath commented Nov 21, 2023

findinpath marked this pull request as draft November 22, 2023 09:10

findinpath force-pushed the findinpath/add-file-entry branch from efd2637 to e1ce7b8 Compare November 22, 2023 11:56

findinpath marked this pull request as ready for review November 22, 2023 11:56

findinpath force-pushed the findinpath/add-file-entry branch 2 times, most recently from ee830b6 to 17fac3f Compare November 23, 2023 05:54

findepi reviewed Nov 24, 2023

...in/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/AddFileEntry.java (outdated)

...c/main/java/io/trino/plugin/deltalake/transactionlog/checkpoint/CheckpointEntryIterator.java (outdated)

findepi reviewed Nov 24, 2023

findinpath force-pushed the findinpath/add-file-entry branch from 17fac3f to b52e93f Compare November 24, 2023 15:18

findepi reviewed Nov 24, 2023

findinpath force-pushed the findinpath/add-file-entry branch 2 times, most recently from 9dda8f5 to 1e3db18 Compare November 27, 2023 10:02

raunaqmorarka reviewed Nov 27, 2023

findinpath force-pushed the findinpath/add-file-entry branch from 1e3db18 to 8dea018 Compare November 27, 2023 17:06

findinpath force-pushed the findinpath/add-file-entry branch from 8dea018 to 29fab3e Compare November 27, 2023 17:19

findinpath force-pushed the findinpath/add-file-entry branch 2 times, most recently from 6f080ce to 206ad69 Compare November 28, 2023 10:28

findinpath force-pushed the findinpath/add-file-entry branch from 206ad69 to ef818db Compare November 28, 2023 12:30

findinpath force-pushed the findinpath/add-file-entry branch from ef818db to f1f645e Compare December 6, 2023 11:47

findinpath requested a review from raunaqmorarka December 7, 2023 19:04

raunaqmorarka changed the title ~~Avoid early to construct the AddFileEntry while reading the Delta Lake checkpoint~~ Avoid eagerly constructing AddFileEntry while reading the Delta Lake checkpoint Dec 8, 2023

raunaqmorarka approved these changes Dec 8, 2023

findinpath added 8 commits December 8, 2023 09:05

Rename internally used interface to CheckpointFieldExtractor

e8c43c7

Compute deletionVectorsEnabled only once per checkpoint file

e72696d

Construct AddFileEntry instance only if necessary

c1c01a4

When checkpoint filtering is applied and there are partition constraints which do not match the partition values of the entry, avoid eagerly to construct `AddFileEntry` instances.

Create CheckpointFieldExtractor instance only if necessary

3f63dc3

Support building a DeltaLakeTransactionLogEntry from multiple blocks

7bd1f20

Extract buildAddEntry method logic to AddFileEntryExtractor class

18507c0

Avoid materializing from the checkpoint irrelevant add entries

6da1aac

In case of performing checkpoint filtering in Delta Lake, avoid reading from Parquet pages loaded in memory the `add` entries which don't match the partition predicate.

Compute Parquet types for add entries only once per checkpoint file

2039b5c

findinpath force-pushed the findinpath/add-file-entry branch from f1f645e to 2039b5c Compare December 8, 2023 08:34

findinpath changed the title ~~Avoid eagerly constructing AddFileEntry while reading the Delta Lake checkpoint~~ Construct AddFileEntry instance only if necessary while reading the Delta Lake checkpoint Dec 8, 2023

findinpath requested a review from raunaqmorarka December 8, 2023 08:39

raunaqmorarka approved these changes Dec 8, 2023

raunaqmorarka merged commit b554d65 into trinodb:master Dec 9, 2023

github-actions bot added this to the 435 milestone Dec 9, 2023

mosabua mentioned this pull request Dec 11, 2023

Add Trino 435 release notes #20000

Merged

raunaqmorarka mentioned this pull request Dec 5, 2024

Add SourcePage interface for delayed materialization of ConnectorSourceData #24011

Merged

4 participants

findinpath commented Nov 17, 2023 • edited by raunaqmorarka

findinpath Nov 21, 2023 • edited