-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Reduce memory usage of remove_orphan_files procedure #25847
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce memory usage of remove_orphan_files procedure #25847
Conversation
b5f7198 to
6cce0e7
Compare
6cce0e7 to
a7c53f8
Compare
raunaqmorarka
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comments, lgtm
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this move to ManifestUtils ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note the org.apache.iceberg package.
The intention of this class was to utilize the package private method ManifestLists.read.
We want the behavior of BaseSnapshot.allManifests() without caching, but iceberg does not support that. My workaround works fine for v2 iceberg tables, but I'm concerned it might not work with v1 iceberg tables. See:
https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/BaseSnapshot.java#L174-L186
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to restrict the fix to V2 iceberg tables while we look for a solution for V1 tables in https://apache-iceberg.slack.com/archives/C03LG1D563F/p1747921709647279 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@grantatspothero i've pushed changes here to restrict the fix to v2 tables, PTAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not something we have to do this PR, but can we start moving these types of utilities out into a separate utility class? We have lots of utilities throughout with similar/close functionality repeated, and I imagine most occasional contributors (like myself) don't even know they exist because they are placed in the calling class as a private method.
plugin/trino-iceberg/src/main/java/org/apache/iceberg/ManifestUtils.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
c715bb8 to
347f370
Compare
|
This pull request has gone a while without any activity. Ask for help on #core-dev on Trino slack. |
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
|
This pull request has gone a while without any activity. Ask for help on #core-dev on Trino slack. |
347f370 to
a279f4e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR reduces memory usage during Iceberg's remove orphan files procedure by avoiding duplicate caching of manifest files when processing tables with many snapshots. The fix introduces a new method to load manifests directly from manifest lists instead of going through snapshot objects that cache manifests independently.
- Introduces a new
ManifestUtilsclass with a memory-efficient method to read manifest files - Modifies the orphan file removal procedure to use direct manifest list reading for better memory efficiency
- Maintains backward compatibility for V1 tables with embedded manifest lists
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| ManifestUtils.java | New utility class providing memory-efficient manifest file reading |
| IcebergMetadata.java | Updated orphan file removal logic to use the new manifest reading approach |
Comments suppressed due to low confidence (1)
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java:2265
- [nitpick] The variable name
allManifestsis somewhat ambiguous. Consider renaming it tomanifestFilesormanifeststo be more concise and clear.
List<ManifestFile> allManifests;
plugin/trino-iceberg/src/main/java/org/apache/iceberg/ManifestUtils.java
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not something we have to do this PR, but can we start moving these types of utilities out into a separate utility class? We have lots of utilities throughout with similar/close functionality repeated, and I imagine most occasional contributors (like myself) don't even know they exist because they are placed in the calling class as a private method.
plugin/trino-iceberg/src/main/java/org/apache/iceberg/ManifestUtils.java
Outdated
Show resolved
Hide resolved
a279f4e to
d436fc9
Compare
d436fc9 to
a5b0e00
Compare
| validMetadataFileNames.add(fileName(snapshot.manifestListLocation())); | ||
| String manifestListLocation = snapshot.manifestListLocation(); | ||
| List<ManifestFile> allManifests; | ||
| if (manifestListLocation != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think we have tests using v1 iceberg format for remove orphan files. We should add at least one as a sanity check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a special case of V1 tables where write.manifest-lists.enabled is set to false. However, the iceberg library removed the ability to write embedded manifest list a few years ago (apache/iceberg#5773). So we don't have a way to produce such data in the tests.
It might be possible to generate such data from an old version and test on that pre-generated dataset, but I think that is overkill at the moment. The fallback logic looks safe enough to me.
grantatspothero
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine as it fixes a real issue but I'm not happy with adding separate code paths for different iceberg spec versions.
For context I tried asking iceberg maintainers for advice on what to do here:
https://apache-iceberg.slack.com/archives/C03LG1D563F/p1747921709647279
Suggestion was to implement a per table metadata cache of manifest files. I spent a few hours doing this and then had to pivot to other things. If someone is familiar with the avro reader iceberg uses would love to pair with them.
I don't see the point of doing extra work for V1 tables, why would anyone still be using those ? |
This would break remove orphan files for v1 iceberg tables, the trino docs say |
I'm not talking about dropping support, I'm talking about doing extra work to have the same code path or same fix for V1. |
Description
Reduces memory usage of coordinator during iceberg remove orphan files procedure for tables with large numbers of snapshots.
See comment for description of fix:
Additional context and related issues
To show decreased memory usage:
BaseSnapshotobject. Root cause was incorrect attribution of memory usage of the List, becauseBaseSnapshot#cacheManifestscallsManifestLists.readwhich returns a LinkedList. The heapdump memory attribution logic was not correctly attributing all the objects in the LinkedList toBaseSnapshot, only the first node.table.snapshots()in a debugger to confirm the manifest caching was not occurring.(before change)


(after change)
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: