@agoyal-sfdc commented Jul 10, 2025

Description

This change adds support for using a temporary staging directory during write operations involving sorted tables. Writes to sorted tables use this path to stage temporary files during the sorting operation. When disabled, the target storage is used for staging while writing sorted tables, which can be inefficient when writing to object stores like S3.
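
The choice described above can be illustrated with a minimal sketch. It is not the code from this PR: the class, method names, and the `sort-tmp-` directory prefix are hypothetical, and only the property name `iceberg.sorted-writing.local-staging-path` comes from the proposed release note. The sketch assumes that an unset or blank property means "stage next to the target files".

```java
import java.util.Optional;
import java.util.Properties;

// Hypothetical illustration only; none of these names exist in Trino.
public class SortedWriteStagingSketch
{
    // Property name taken from the release note text in this PR.
    static final String LOCAL_STAGING_PATH_PROPERTY = "iceberg.sorted-writing.local-staging-path";

    // Returns the configured local staging path, or empty when the property is unset or blank.
    static Optional<String> localStagingPath(Properties catalogProperties)
    {
        return Optional.ofNullable(catalogProperties.getProperty(LOCAL_STAGING_PATH_PROPERTY))
                .map(String::trim)
                .filter(value -> !value.isEmpty());
    }

    // Picks the directory where temporary sort files for one writer would be staged:
    // the local staging path when configured, otherwise the target storage location.
    static String stagingLocation(Optional<String> localStagingPath, String targetDirectory, String writerId)
    {
        String tempDirectoryName = "sort-tmp-" + writerId; // assumed naming scheme
        String base = localStagingPath.orElse(targetDirectory);
        return stripTrailingSlash(base) + "/" + tempDirectoryName;
    }

    private static String stripTrailingSlash(String value)
    {
        return value.endsWith("/") ? value.substring(0, value.length() - 1) : value;
    }

    public static void main(String[] args)
    {
        Properties catalogProperties = new Properties();
        String target = "s3://example-bucket/warehouse/orders/data"; // assumed table location

        // Property not set: temporary files are staged on the target storage.
        System.out.println(stagingLocation(localStagingPath(catalogProperties), target, "writer-1"));

        // Property set: temporary files are staged on the local path instead.
        catalogProperties.setProperty(LOCAL_STAGING_PATH_PROPERTY, "/tmp/trino-iceberg-staging");
        System.out.println(stagingLocation(localStagingPath(catalogProperties), target, "writer-1"));
    }
}
```

Staging locally keeps the many small intermediate sort files off the object store, so only the final sorted data files need to be written to S3.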

Additional context and related issues

Fixes #24376
Similar to functionality added for Hive in #3434

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Iceberg
* Support using a local staging path for improved performance of writes to sorted tables. This can be enabled using the catalog configuration property `iceberg.sorted-writing.local-staging-path`. ({issue}`24376`)

@Copilot Copilot AI left a comment

Pull Request Overview

Adds support for a configurable temporary staging directory when writing sorted files in Iceberg, improving performance on object stores.

  • Introduces temporaryStagingDirectoryEnabled and temporaryStagingDirectoryPath in IcebergConfig and exposes them as session properties.
  • Updates IcebergSessionProperties and IcebergPageSink to select between the staging directory and default temp path per session.
  • Expands tests and documentation to cover the new staging directory behavior.

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergConfig.java` | Add new config properties and their setters/getters |
| `plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSessionProperties.java` | Register new session properties and provide accessors |
| `plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSink.java` | Implement logic to choose staging directory or default temp path |
| `plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSinkProvider.java` | Remove unused SortingFileWriterConfig injection |
| `plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/*.java` | Update tests to include SortingFileWriterConfig and new session props |
| `docs/src/main/sphinx/connector/iceberg.md` | Document the new temporary staging directory properties |

Comments suppressed due to low confidence (2)

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergConfig.java:90

  • [nitpick] The field name includes the 'is' prefix which duplicates the boolean getter style. Consider renaming it to temporaryStagingDirectoryEnabled to avoid confusion between field and getter naming.
    private boolean isTemporaryStagingDirectoryEnabled;

docs/src/main/sphinx/connector/iceberg.md:220

  • [nitpick] The list formatting uses * - which differs from surrounding items. Align these entries with the existing Sphinx list style for consistency and readability.
* - `iceberg.temporary-staging-directory-enabled`

@agoyal-sfdc agoyal-sfdc force-pushed the iceberg_sort_temp_dir branch 2 times, most recently from 1c37cdf to 80d1253 July 18, 2025 06:59
@github-actions

This pull request has gone a while without any activity. Ask for help on #core-dev on Trino slack.

@github-actions github-actions bot added stale and removed stale labels Aug 26, 2025
@github-actions

This pull request has gone a while without any activity. Ask for help on #core-dev on Trino slack.

@github-actions github-actions bot added the stale label Sep 17, 2025
@raunaqmorarka raunaqmorarka force-pushed the iceberg_sort_temp_dir branch 2 times, most recently from 6207475 to 5964057 October 1, 2025 09:09
@raunaqmorarka raunaqmorarka changed the title from "Allow using temporary staging path in Iceberg for writing sorted files" to "Allow using local staging path in iceberg for sorted writes" Oct 1, 2025
@raunaqmorarka raunaqmorarka requested a review from Copilot October 1, 2025 09:49
@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.


Added iceberg.sorted-writing.local-staging-path config property

Co-Authored-By: Raunaq Morarka <[email protected]>
@raunaqmorarka raunaqmorarka force-pushed the iceberg_sort_temp_dir branch from 5964057 to bacde5b October 1, 2025 11:02
@raunaqmorarka raunaqmorarka merged commit 7bf13a6 into trinodb:master Oct 1, 2025
46 of 47 checks passed
@github-actions github-actions bot added this to the 478 milestone Oct 1, 2025
Development

Successfully merging this pull request may close these issues.

Large insert into sorted Iceberg table fails

3 participants