Allow using local staging path in iceberg for sorted writes #26172
Force-pushed from 606ea70 to 9280531
Pull Request Overview
Adds support for a configurable temporary staging directory when writing sorted files in Iceberg, improving performance on object stores.
- Introduces `temporaryStagingDirectoryEnabled` and `temporaryStagingDirectoryPath` in `IcebergConfig` and exposes them as session properties.
- Updates `IcebergSessionProperties` and `IcebergPageSink` to select between the staging directory and default temp path per session.
- Expands tests and documentation to cover the new staging directory behavior.
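The per-session selection described in the second bullet can be sketched roughly as follows. This is a minimal illustration, not the code in `IcebergPageSink`: the class `StagingPathSketch`, the method `chooseTempDirectory`, and the `.tmp-sort` fallback directory name are all hypothetical and exist only for this example.

```java
import java.util.Optional;

// Hypothetical sketch: none of these names come from the pull request.
final class StagingPathSketch
{
    private StagingPathSketch() {}

    // Returns the directory where temporary sort files would be written.
    // If staging is enabled and a path is configured, that path is used;
    // otherwise the sketch falls back to a directory under the table location.
    static String chooseTempDirectory(boolean stagingEnabled, Optional<String> stagingPath, String tableLocation)
    {
        if (stagingEnabled && stagingPath.isPresent()) {
            return stagingPath.get();
        }
        // Assumed fallback layout; the real default temp path may differ.
        return stripTrailingSlash(tableLocation) + "/.tmp-sort";
    }

    private static String stripTrailingSlash(String location)
    {
        return location.endsWith("/") ? location.substring(0, location.length() - 1) : location;
    }

    public static void main(String[] args)
    {
        System.out.println(chooseTempDirectory(true, Optional.of("/mnt/fast-disk/staging"), "s3://bucket/table"));
        System.out.println(chooseTempDirectory(false, Optional.of("/mnt/fast-disk/staging"), "s3://bucket/table/"));
    }
}
```

Keeping the decision in one small function makes it easy to test both branches without touching a real file system.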
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergConfig.java | Add new config properties and their setters/getters |
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSessionProperties.java | Register new session properties and provide accessors |
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSink.java | Implement logic to choose staging directory or default temp path |
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSinkProvider.java | Remove unused SortingFileWriterConfig injection |
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/*.java | Update tests to include SortingFileWriterConfig and new session props |
docs/src/main/sphinx/connector/iceberg.md | Document the new temporary staging directory properties |
Comments suppressed due to low confidence (2)
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergConfig.java:90
- [nitpick] The field name includes the 'is' prefix which duplicates the boolean getter style. Consider renaming it to `temporaryStagingDirectoryEnabled` to avoid confusion between field and getter naming.
private boolean isTemporaryStagingDirectoryEnabled;
docs/src/main/sphinx/connector/iceberg.md:220
- [nitpick] The list formatting uses `* -` which differs from surrounding items. Align these entries with the existing Sphinx list style for consistency and readability.
* - `iceberg.temporary-staging-directory-enabled`
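The first nitpick above, about the `is` prefix on the field, can be illustrated with a small stand-in class. This is a simplified sketch under the assumption of a plain Java bean; `StagingConfigSketch` is a hypothetical name and is not the actual `IcebergConfig`, which carries additional annotations and properties.

```java
// Hypothetical, simplified stand-in for a config class; not the actual IcebergConfig.
final class StagingConfigSketch
{
    // Field named without the "is" prefix, as the nitpick suggests.
    private boolean temporaryStagingDirectoryEnabled;

    // The boolean getter keeps the conventional "is" prefix.
    public boolean isTemporaryStagingDirectoryEnabled()
    {
        return temporaryStagingDirectoryEnabled;
    }

    // Fluent setter, returning this so calls can be chained.
    public StagingConfigSketch setTemporaryStagingDirectoryEnabled(boolean enabled)
    {
        this.temporaryStagingDirectoryEnabled = enabled;
        return this;
    }

    public static void main(String[] args)
    {
        StagingConfigSketch config = new StagingConfigSketch().setTemporaryStagingDirectoryEnabled(true);
        System.out.println(config.isTemporaryStagingDirectoryEnabled());
    }
}
```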
Force-pushed from 1c37cdf to 80d1253
This pull request has gone a while without any activity. Ask for help on #core-dev on Trino Slack.
Force-pushed from 6207475 to 5964057
Pull Request Overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.
Added iceberg.sorted-writing.local-staging-path config property Co-Authored-By: Raunaq Morarka <[email protected]>
Force-pushed from 5964057 to bacde5b
Description
This change adds support for a temporary staging directory for writes to sorted tables. Such writes use this path to stage temporary files during the sort. When the feature is disabled, the target storage is used for staging instead, which can be inefficient when writing to object stores such as S3.
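To make the motivation concrete, here is a minimal sketch of a sort that stages intermediate sorted runs in a local directory and then merges them. It is an illustration of the general technique only, not Trino's sorted writer: `LocalStagingSortSketch`, `sortWithStaging`, and the line-per-row text format are assumptions made for the example, and rows are assumed not to contain line breaks.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch of staging sorted runs on local disk; not Trino code.
final class LocalStagingSortSketch
{
    private LocalStagingSortSketch() {}

    // Sorts rows in chunks, stages each sorted run as a file in stagingDir,
    // then merges the runs into one sorted list. Staged files are always deleted.
    static List<String> sortWithStaging(List<String> rows, int chunkSize, Path stagingDir)
            throws IOException
    {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Path> runs = new ArrayList<>();
        try {
            for (int start = 0; start < rows.size(); start += chunkSize) {
                int end = Math.min(start + chunkSize, rows.size());
                List<String> chunk = new ArrayList<>(rows.subList(start, end));
                Collections.sort(chunk);
                Path run = Files.createTempFile(stagingDir, "sort-run-", ".tmp");
                runs.add(run);
                Files.write(run, chunk); // one row per line, UTF-8
            }
            return mergeRuns(runs);
        }
        finally {
            for (Path run : runs) {
                Files.deleteIfExists(run);
            }
        }
    }

    // K-way merge: the queue holds the current line of each run plus the run's index.
    private static List<String> mergeRuns(List<Path> runs)
            throws IOException
    {
        List<BufferedReader> readers = new ArrayList<>();
        Comparator<Map.Entry<String, Integer>> byLine = Map.Entry.comparingByKey();
        PriorityQueue<Map.Entry<String, Integer>> queue = new PriorityQueue<>(byLine);
        List<String> merged = new ArrayList<>();
        try {
            for (Path run : runs) {
                BufferedReader reader = Files.newBufferedReader(run);
                readers.add(reader);
                String line = reader.readLine();
                if (line != null) {
                    queue.add(Map.entry(line, readers.size() - 1));
                }
            }
            while (!queue.isEmpty()) {
                Map.Entry<String, Integer> head = queue.poll();
                merged.add(head.getKey());
                String next = readers.get(head.getValue()).readLine();
                if (next != null) {
                    queue.add(Map.entry(next, head.getValue()));
                }
            }
            return merged;
        }
        finally {
            for (BufferedReader reader : readers) {
                reader.close();
            }
        }
    }

    public static void main(String[] args)
            throws IOException
    {
        Path stagingDir = Files.createTempDirectory("sorted-write-staging");
        System.out.println(sortWithStaging(List.of("d", "a", "c", "b", "e"), 2, stagingDir));
        Files.deleteIfExists(stagingDir);
    }
}
```

Every run is written once and read back once during the merge. When the staging directory is on local disk, that traffic stays off the object store and only the final sorted output needs to be written to the target storage.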
Additional context and related issues
Fixes #24376
Similar to functionality added for Hive in #3434
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: