Improve performance for Equality Delete files in Iceberg connector #18397

Closed
jasonf20 wants to merge 3 commits into trinodb:master from jasonf20:iceberg/equality-delete-optimizations

Conversation

@jasonf20
Member

@jasonf20 jasonf20 commented Jul 25, 2023

Description

The goal of this PR is to improve the performance of queries that contain Equality Delete files.

Before this commit equality delete files were re-loaded for every split that needed them. Now, each delete file is loaded once per execution node.

In addition, the equality deletes are stored in a single map (per delete schema) with the data sequence number in which the row was deleted. This allows us to merge rows that were deleted multiple times (common in upsert use cases) into a single entry instead of holding an entry per delete file.
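The merged map can be pictured with a minimal sketch. This is not the PR's `DeleteManager`; the class and method names below are hypothetical, and it assumes the Iceberg rule that an equality delete applies only to data files whose data sequence number is strictly lower than the delete's. Because only the highest delete sequence number per key matters for that check, repeated deletes of the same key collapse into one entry.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch (not the PR's DeleteManager): one merged map per delete schema.
final class MergedEqualityDeletes {
    // Key: values of the equality columns of a deleted row.
    // Value: highest data sequence number of any delete file that deleted this key.
    private final Map<List<Object>, Long> deleteSequenceByKey = new HashMap<>();

    // Record one row from a delete file; repeated deletes of the same key
    // collapse into a single entry that keeps the highest sequence number.
    void addDelete(List<Object> key, long deleteSequenceNumber) {
        deleteSequenceByKey.merge(key, deleteSequenceNumber, Long::max);
    }

    // Assumed rule: a delete applies only to data files whose data sequence
    // number is strictly lower than the delete's data sequence number.
    boolean isDeleted(List<Object> key, long dataFileSequenceNumber) {
        Long deleteSequenceNumber = deleteSequenceByKey.get(key);
        return deleteSequenceNumber != null && dataFileSequenceNumber < deleteSequenceNumber;
    }

    int size() {
        return deleteSequenceByKey.size();
    }

    public static void main(String[] args) {
        MergedEqualityDeletes deletes = new MergedEqualityDeletes();
        deletes.addDelete(List.of(42L), 5); // first upsert of id 42
        deletes.addDelete(List.of(42L), 9); // later upsert of the same id
        System.out.println("entries: " + deletes.size());
        System.out.println("row written at seq 8 deleted: " + deletes.isDeleted(List.of(42L), 8));
    }
}
```

Keeping only the maximum is enough here because a row is deleted if any delete with a higher sequence number exists, which is equivalent to the highest recorded delete being higher.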

Additional context and related issues

Fixes #18396

Changes in this commit:

  • Create a DeleteManager class to manage the above logic
  • Read every delete file only once
  • Merge deletes into single map
  • Opportunistic delete file load parallelization

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Iceberg
* Improved performance and memory usage when [Equality Delete](https://iceberg.apache.org/spec/#equality-delete-files) files are used ({issue}`18396`)

@cla-bot

cla-bot bot commented Jul 25, 2023

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@github-actions github-actions bot added the iceberg Iceberg connector label Jul 25, 2023
@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch from eb8be8c to 2de67b7 Compare July 26, 2023 08:07
@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch from 2de67b7 to 7d32723 Compare July 26, 2023 15:04
@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch from 7d32723 to 65c40fd Compare August 3, 2023 15:53

@yoniiny

yoniiny commented Aug 3, 2023

@findinpath This PR is the fix for the equality deletes issue we discussed. We have another PR coming that should further improve performance in situations where there are a lot of deletes (>~200M).

@findinpath
Contributor

findinpath commented Aug 4, 2023

@jasonf20 does your contribution overlap with #17115 ?

Note that the above-mentioned PR includes scaffolding for testing. Please use it here as well to ensure the validity of your changes.

Read every delete file only once

Is this something we can verify as part of this PR (before & after your change) ?
Have a look at https://github.com/trinodb/trino/blob/18b47f0223c7d7cb74fbb8c16f9b3871ad79690b/plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergMetadataFileOperations.java

@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch from 65c40fd to 0472932 Compare August 6, 2023 09:43
@cla-bot cla-bot bot added the cla-signed label Aug 6, 2023
@jasonf20
Member Author

jasonf20 commented Aug 6, 2023

Hi @findinpath

It does seem like this PR overlaps with #17115. This PR should do the same map compaction that is done in that PR. However, it includes more optimizations on top of that including:

  • Reading delete files only once
    • This adds some complexity since the shared state must eventually be cleaned up. This is done with a WeakReference in this PR.
    • Also the data sequence number needs to be taken into account since we share the deletes map across splits
  • Optimistically loading delete files in parallel (despite reading each file only once, we will allow different splits to load the files in a different order so that multiple files are loaded at once)
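To illustrate the load-once and out-of-order loading ideas in the bullets above, here is a rough sketch under stated assumptions. It is not the code in this PR: `LoadOnceDeleteFiles`, `deletesFor`, and the `reader` function are hypothetical names, delete rows are modeled as plain strings, and the `WeakReference`-based cleanup and the data sequence number handling are left out. Each split first claims and loads any unclaimed files, starting at a split-specific offset, and only then waits for files that other splits are loading.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: shared state that reads each delete file at most once.
final class LoadOnceDeleteFiles {
    private final ConcurrentHashMap<String, CompletableFuture<List<String>>> loaded = new ConcurrentHashMap<>();
    private final Function<String, List<String>> reader; // stand-in for the real file reader
    private final AtomicInteger reads = new AtomicInteger();

    LoadOnceDeleteFiles(Function<String, List<String>> reader) {
        this.reader = reader;
    }

    List<String> deletesFor(List<String> paths, int splitIndex) {
        // Pass 1: claim and load unclaimed files. Each split starts at a
        // different offset so concurrent splits tend to load different files.
        for (int i = 0; i < paths.size(); i++) {
            String path = paths.get(Math.floorMod(i + splitIndex, paths.size()));
            CompletableFuture<List<String>> claim = new CompletableFuture<>();
            if (loaded.putIfAbsent(path, claim) == null) {
                try {
                    reads.incrementAndGet();
                    claim.complete(reader.apply(path));
                }
                catch (RuntimeException e) {
                    claim.completeExceptionally(e);
                }
            }
        }
        // Pass 2: collect results, waiting only for files another split is still loading.
        List<String> result = new ArrayList<>();
        for (String path : paths) {
            result.addAll(loaded.get(path).join());
        }
        return result;
    }

    int readCount() {
        return reads.get();
    }

    public static void main(String[] args) {
        LoadOnceDeleteFiles cache = new LoadOnceDeleteFiles(path -> List.of("row from " + path));
        List<String> files = List.of("d1", "d2", "d3");
        cache.deletesFor(files, 0);
        cache.deletesFor(files, 1);
        System.out.println("file reads: " + cache.readCount());
    }
}
```

The `putIfAbsent` claim is what guarantees a single read per file in this sketch; the offset only changes which split ends up doing the read.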

I think that it should suffice to merge only one of these PRs. If the other one is merged I will need to rebase and add the above improvements on top of the previous PR.

I have added two tests based on the templates/suggestion I saw in the other PR.

I was able to use the MetadataFileOperations spec as a guideline for adding a test that validates the delete file is loaded only once. Added as a new commit.

@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch 3 times, most recently from 89ee06a to 16d859a Compare August 7, 2023 07:42
@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch from 16d859a to f230f95 Compare August 7, 2023 09:51
@findinpath
Contributor

@jasonf20 let's concentrate first on landing PR #17115, which comes with fewer changes.
There are a few corner cases that are not fully obvious which need to be covered in detail before landing these changes. Feel free to contribute with suggestions on the above mentioned PR to accelerate its landing.

@jasonf20
Member Author

@findinpath Happy to assist with the merging of #17115. Is there anything specifically there? Looking at the PR it seems most action items have been handled.

Once that's merged I'll rebase on top of that with the remaining changes.

Keep in mind that as it stands the other PR isn't enough for the use case of a large table with constant upserts to actually work since each split loads all the delete files sequentially.

@findinpath
Contributor

findinpath commented Aug 10, 2023

Happy to assist with the merging of #17115.
Is there anything specifically there?

We stumbled on a situation related to using nested fields for equality deletes. The PR will be adding proper handling for such situations.
It seems that, at the moment, Trino is not capable of reading such delete files.
In any case, adding support in Trino for dealing with nested fields in equality deletes is not in the scope of the above mentioned PR.

@findinpath
Contributor

@jasonf20 #17115 has been merged.
Feel free to continue the work on this PR.

@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch 3 times, most recently from 2afae2c to 1d5e8da Compare October 5, 2023 15:02
@yoniiny

yoniiny commented Oct 5, 2023

@findinpath This PR is ready for review

@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch from 1d5e8da to ff5861e Compare October 9, 2023 15:30
@jasonf20 jasonf20 requested a review from findinpath October 23, 2023 07:37
@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch 2 times, most recently from 45f388f to a6b9841 Compare October 30, 2023 13:32
@sopel39
Member

sopel39 commented Feb 16, 2024

@jasonf20 IIUC this PR is caching deletion files. However, it seems that these can be cached per split rather than per query.
In that case ConnectorPageSourceProviderFactory seems like too narrow scope.

@jasonf20
Member Author

@jasonf20 IIUC this PR is caching deletion files. However, it seems that these can be cached per split rather than per query. In that case ConnectorPageSourceProviderFactory seems like too narrow scope.

The delete files are already read only once per split. The issue is that caching per task/query is required; otherwise the delete files are read in each split, leading to unusable performance. This approach was designed with @pettyjamesm as the cleanest way to cache at the right level.

@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch 4 times, most recently from 02122ea to d3d0597 Compare February 17, 2024 00:16
@sopel39
Member

sopel39 commented Feb 19, 2024

@jasonf20 Would it make sense to cache deletion files across queries too?

@jasonf20
Member Author

jasonf20 commented Feb 19, 2024

@sopel39

It could be useful for performance when querying the same tables but the same can be said for data files. Perhaps a generic cache of this sort would be implemented in Trino at some point but it's probably a much more involved task, considering memory usage and such.

For this PR specifically it would be interesting to consider caching at the query level (this PR does it at the Task Level). But for most queries caching at the task level should produce the same performance. We discussed query level caches and decided it can probably be built on top of this Task Level caching with reference counting or something like that in the future if needed. So the first step is to solve this at this level which should have the largest impact and potentially expand the caching scope later.

The main motivation here is to make queries on tables with equality deletes complete in a reasonable amount of time. The current implementation doesn't really work with more than a couple of equality delete files.

@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch 3 times, most recently from 165bad5 to 5158db7 Compare February 19, 2024 20:37
@jasonf20
Member Author

jasonf20 commented Feb 19, 2024

@dain @pettyjamesm Should the PageSourceProviderFactory be closeable? It acts as a singleton now (like PageSourceProvider did before). Should we instead make PageSourceProvider closeable? If so when would be the right place to close it? I don't think OperatorFactory::noMoreOperators is the right place since IIUC that is called before the splits are read.

I don't think the use case here really has anything to close, it should all be cleaned up by GC, but if we are creating a stateful instance we might want it as part of the interface.

@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch from 5158db7 to 0bef97e Compare February 19, 2024 23:38
…ovider

This abstraction enables connectors to implement page source provider
factories that provide stateful instances of PageSourceProvider at the
task level.

This allows for things like task level caching across splits for
example.
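
As a rough illustration of the abstraction this commit message describes, the sketch below uses simplified, hypothetical interfaces. They are stand-ins and not Trino's SPI; the names `SketchPageSourceProvider`, `SketchPageSourceProviderFactory`, `TaskScopedProvider`, and `TaskCachingFactory` are assumptions, and page sources are modeled as strings. The point is only that the factory hands out one stateful provider per task, so a cache held by the provider is shared by the splits of that task and by nothing else.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified stand-ins; these are NOT Trino's SPI interfaces.
interface SketchPageSourceProvider {
    String createPageSource(String dataFile, String deleteFile);
}

interface SketchPageSourceProviderFactory {
    SketchPageSourceProvider createPageSourceProvider();
}

// One instance per task: its cache is shared by every split of that task.
final class TaskScopedProvider implements SketchPageSourceProvider {
    private final Map<String, String> deleteCache = new ConcurrentHashMap<>();
    private final AtomicInteger deleteFileLoads = new AtomicInteger();

    @Override
    public String createPageSource(String dataFile, String deleteFile) {
        // Load the delete file only if no earlier split in this task has done so.
        String deletes = deleteCache.computeIfAbsent(deleteFile, file -> {
            deleteFileLoads.incrementAndGet();
            return "deletes(" + file + ")";
        });
        return dataFile + " filtered by " + deletes;
    }

    int deleteFileLoads() {
        return deleteFileLoads.get();
    }
}

// The factory itself is stateless; each call yields a fresh task-level provider.
final class TaskCachingFactory implements SketchPageSourceProviderFactory {
    @Override
    public TaskScopedProvider createPageSourceProvider() {
        return new TaskScopedProvider();
    }

    public static void main(String[] args) {
        TaskScopedProvider task = new TaskCachingFactory().createPageSourceProvider();
        System.out.println(task.createPageSource("data-1", "eq-delete-1"));
        System.out.println(task.createPageSource("data-2", "eq-delete-1"));
        System.out.println("delete file loads in task: " + task.deleteFileLoads());
    }
}
```

In this sketch the cached state becomes unreachable once the task drops its provider, so nothing needs to be closed explicitly.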
@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch from 0bef97e to b28035e Compare February 19, 2024 23:42
The goal of this commit is to improve the performance of queries that
contain Equality Delete files.

Before this commit equality delete files were re-loaded for every split
that needed them. Now, each delete file is loaded once per execution
node.

In addition, the equality deletes are stored in a single map (per delete
schema) with the data sequence number in which the row was deleted. This
allows us to merge rows that were deleted multiple times (common in
upsert use cases) into a single entry instead of holding an entry per
delete file.

Changes in this commit:
* Create a DeleteManager class to manage the above logic
* Read every delete file only once
* Merge deletes into single map
* Opportunistic delete file load parallelization

State Management:
The `IcebergPageSourceProviderFactory` creates an
`IcebergPageSourceProvider` per task and allows
for reusing delete file information within the task.
The `DeleteManager` is created only once, and the equality deletes it loads
for the task are cached until the end of the task.

Tasks are per partition, and typically most equality delete files in a
partition apply to most data files, so caching all the equality delete files
is efficient.
Refactored some existing tests to extract shared code to util classes.
This allows adding a new test suite for testing data file access
operations, in addition to the existing suite testing metadata file
access operations.
@jasonf20 jasonf20 force-pushed the iceberg/equality-delete-optimizations branch from b28035e to a8725ec Compare February 20, 2024 01:47
@github-actions

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

@github-actions github-actions bot added the stale label Mar 12, 2024
@yoniiny

yoniiny commented Mar 28, 2024

@findinpath we'd love to get this PR merged. Are there any outstanding action items here?

@github-actions github-actions bot removed the stale label Mar 28, 2024
@findepi findepi changed the title IcebergPlugin: Performance improvements for Equality Delete files Improve performance for Equality Delete files in Iceberg connector Mar 28, 2024
@dain dain mentioned this pull request Apr 8, 2024
dain

This comment was marked as outdated.

@colebow colebow requested review from dain and pettyjamesm April 26, 2024 19:53
@mosabua
Member

mosabua commented May 1, 2024

Replaced by #21441

@mosabua mosabua closed this May 1, 2024

Labels

cla-signed hive Hive connector iceberg Iceberg connector

Development

Successfully merging this pull request may close these issues.

Iceberg: Inefficient Equality Delete file handling

8 participants