Skip to content

Convert the retrieval of the active data files to streaming#20054

Merged
ebyhr merged 5 commits intotrinodb:masterfrom
findinpath:findinpath/delta-stream-add-entries
Jan 16, 2024
Merged

Convert the retrieval of the active data files to streaming#20054
ebyhr merged 5 commits intotrinodb:masterfrom
findinpath:findinpath/delta-stream-add-entries

Conversation

@findinpath
Copy link
Copy Markdown
Contributor

@findinpath findinpath commented Dec 7, 2023

Description

Instead of waiting to get in bulk the collection of all
the active data files of a table relevant to a query,
provide a stream of active add entries so that the
split manager can start providing to the engine splits right away.

The active add entries stream is closeable in order to ensure that
the underlying checkpoint parquet page source gets closed
even when not all the entries from the checkpoint are consumed.

These changes should provide visible improvements in terms of reduced memory requirements when dealing with Delta Lake tables having many active add files, because as of now these files will not be held anymore in memory within DeltaLakeSplitManager due to the fact that the add files are processed as a consequence of the current changes now one by one.

Potential downside The checkpoint entry iterator will keep open the parquet page source corresponding to the checkpoint (multipart) file until all the add entries are processed by the DeltaLakeSplitSource.

Additional context and related issues

The access to the active add stream needs to be done within a try-with-resources statement to make sure that the checkpoint iterator does indeed release the opened resources.

The key output of these changes is that in the DeltaLakeSplitManager & DeltaLakeSplitSource don't block anymore until all the add entries relevant to the query are collected, but instead the splits are provided gradually (in parallel with the query execution).

This change should particularly speed up the planning of the queries involving Delta Lake tables where the table statistics are not required because the split generation can start right away.

Fixes #20088

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Delta Lake
* Speed up the beginning of the query execution for queries which don't involve table statitstics. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Dec 7, 2023
@findinpath findinpath self-assigned this Dec 7, 2023
@findinpath findinpath added the delta-lake Delta Lake connector label Dec 7, 2023
Copy link
Copy Markdown
Member

@alexjo2144 alexjo2144 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to consider standardizing all the code here on Iterators rather than Streams. Going back and forth has some overhead and generally I think we prefer iterators for performance sensitive things.

@findinpath
Copy link
Copy Markdown
Contributor Author

findinpath commented Dec 7, 2023

Do we have a potential memory leak with this change?

The pageSource is opened, but if the entries are not consumed to the very last one, the pageSource is not being closed.

Do we maybe need a concept similar to https://github.com/apache/iceberg/blob/263b530502e5597b19b6b5e282917af8eede7600/api/src/main/java/org/apache/iceberg/io/CloseableIterator.java#L29 ?

@jkylling
Copy link
Copy Markdown
Contributor

jkylling commented Dec 8, 2023

Do we have a potential memory leak with this change?

The pageSource is opened, but if the entries are not consumed to the very last one, the pageSource is not being closed.

Do we maybe need a concept similar to https://github.com/apache/iceberg/blob/263b530502e5597b19b6b5e282917af8eede7600/api/src/main/java/org/apache/iceberg/io/CloseableIterator.java#L29 ?

We should. One possible alternative could be to use that streams are AutoCloseable, and expose the CheckpointEntryIterator as a Stream, hook the closing of the stream up to the closing of the page source, and always consume the stream in a try-with statement.

@findinpath findinpath force-pushed the findinpath/delta-stream-add-entries branch from fa485f7 to 417003e Compare December 9, 2023 22:29
@findinpath findinpath marked this pull request as draft December 10, 2023 00:26
@findinpath findinpath force-pushed the findinpath/delta-stream-add-entries branch from 417003e to 1e33c29 Compare December 10, 2023 07:54
@findinpath findinpath marked this pull request as ready for review December 10, 2023 08:02
@findinpath findinpath marked this pull request as draft December 10, 2023 08:10
@findinpath findinpath force-pushed the findinpath/delta-stream-add-entries branch from 1e33c29 to 055c7bd Compare December 10, 2023 09:59
@findinpath findinpath marked this pull request as ready for review December 10, 2023 12:10
@findinpath findinpath marked this pull request as draft December 11, 2023 11:50
@findinpath findinpath force-pushed the findinpath/delta-stream-add-entries branch from 055c7bd to 704400b Compare December 11, 2023 14:57
@findinpath findinpath requested a review from findepi December 11, 2023 14:59
@findinpath findinpath force-pushed the findinpath/delta-stream-add-entries branch 4 times, most recently from 0f02eb6 to 7b29295 Compare December 11, 2023 15:10
@findinpath findinpath marked this pull request as ready for review December 11, 2023 19:33
Copy link
Copy Markdown
Member

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Rewrite data file filtering logic for OPTIMIZE statements"

@findinpath findinpath force-pushed the findinpath/delta-stream-add-entries branch 2 times, most recently from 71b50e4 to 8a2edbf Compare December 14, 2023 14:08
@findinpath
Copy link
Copy Markdown
Contributor Author

Rebased on master to address minor code conflicts

@findinpath findinpath force-pushed the findinpath/delta-stream-add-entries branch from 8a2edbf to 4f8c6ef Compare December 14, 2023 15:29
@findinpath findinpath force-pushed the findinpath/delta-stream-add-entries branch from 4f8c6ef to 20fe80a Compare December 14, 2023 15:52
@findepi
Copy link
Copy Markdown
Member

findepi commented Dec 15, 2023

#20121 merged, please rebase.

@findinpath findinpath force-pushed the findinpath/delta-stream-add-entries branch 2 times, most recently from 63bfb21 to 624ee09 Compare December 15, 2023 09:25
Copy link
Copy Markdown
Member

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Add close logic for the Delta Lake splits"

-- add some color to the commit description

Copy link
Copy Markdown
Member

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Process lazily multi-part checkpoint files"

@findinpath findinpath force-pushed the findinpath/delta-stream-add-entries branch from 624ee09 to 5ff2514 Compare December 15, 2023 10:31
Copy link
Copy Markdown
Contributor

@jkylling jkylling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could there be a potential resource leak here (which might not be introduced as part of this PR)? We call TrinoInputFile.newInput() within ParquetPageSourceFactory.createPageSource as part of the constructor of CheckpointEntryIterator. If some later part of the constructor fails, we likely won't get to close the file.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We call TrinoInputFile.newInput() ...

I'm assuming you are refferring to

this.input = file.newInput();
}
@Override
public void close()
throws IOException
{
input.close();
}

If some later part of the constructor fails, we likely won't get to close the file.

There is actually exception handling in place for this specific scenario. See

catch (Exception e) {
try {
if (dataSource != null) {
dataSource.close();
}
}

I'm adding though exception handling in the CheckpointEntryIterator constructor to make sure that the page source gets closed in case that there are eventual exceptions happening after initializing the pageSource field.

@findinpath findinpath requested a review from findepi December 18, 2023 13:53
Instead of waiting to get in bulk the collection of all
the active data files of a table relevant to a query,
provide a stream of active add entries so that the
split manager can start providing to the engine splits right away.

The active add entries stream is closeable in order to ensure that
the underlying checkpoint parquet page source gets closed
even when not all the entries from the checkpoint are consumed.
@findinpath findinpath force-pushed the findinpath/delta-stream-add-entries branch from 5ff2514 to ce75821 Compare January 2, 2024 09:03
@ebyhr
Copy link
Copy Markdown
Member

ebyhr commented Jan 12, 2024

Looks good to me. How about using @MustBeClosed annotation to enforce try-with-resources? The follow-up PR is fine though.

@findinpath
Copy link
Copy Markdown
Contributor Author

How about using @MustBeClosed annotation to enforce try-with-resources?

I tried adding @MustBeClosed but I stumbled in DeltaLakeSplitManager because I need to pass further the stream to DeltaLakeSplitSource.

@ebyhr ebyhr merged commit 5649d74 into trinodb:master Jan 16, 2024
@github-actions github-actions bot added this to the 437 milestone Jan 16, 2024
@findepi
Copy link
Copy Markdown
Member

findepi commented Jan 31, 2024

can @MustBeClosed resource be further returned if it's still @MustBeClosed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed delta-lake Delta Lake connector

Development

Successfully merging this pull request may close these issues.

OPTIMIZE of Delta tables copies small files unchanged

6 participants