[Iceberg] Enable affinity scheduling on file sections #24598

ZacBlanco · 2025-02-20T06:26:35Z

Description

This change moves the affinity scheduling file section size
configuration from HiveClientConfig and HiveSessionProperties
to HiveCommonClientConfig and HiveCommonSessionProperties so
that the Iceberg connector can benefit from this scheduling
strategy when tables have a small number of files but a large
number of splits.

Motivation and Context

On tables with a small number of large files, queries may perform poorly because split scheduling is skewed across nodes. This is more likely when only a few distinct values are hashed to determine the preferred nodes for scheduling. Changing the identifier used to select the preferred nodes increases the probability that splits are scheduled more evenly across the cluster.
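To make the idea concrete, here is a minimal Java sketch, assuming the identifier is built from the file path plus the index of a fixed-size section containing the split's start offset. The class, method names, and file path are hypothetical, and the modular hash is only a stand-in for the scheduler's real node selection; this is not the code in this PR.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; names and hashing are assumptions, not Presto code.
final class AffinitySectionSketch
{
    private AffinitySectionSketch() {}

    // Identifier to hash: the file path plus the index of the section
    // (of sectionSizeBytes) that contains the split's start offset.
    static String sectionIdentifier(String path, long splitStart, long sectionSizeBytes)
    {
        if (sectionSizeBytes <= 0) {
            throw new IllegalArgumentException("sectionSizeBytes must be positive");
        }
        return path + "#" + (splitStart / sectionSizeBytes);
    }

    // Stand-in for preferred-node selection: simple modular hashing.
    static int preferredNode(String identifier, int nodeCount)
    {
        return Math.floorMod(identifier.hashCode(), nodeCount);
    }

    // Counts how many splits of a single file land on each node.
    static Map<Integer, Integer> splitsPerNode(
            String path, long fileSize, long splitSize, long sectionSize, int nodeCount)
    {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (long start = 0; start < fileSize; start += splitSize) {
            int node = preferredNode(sectionIdentifier(path, start, sectionSize), nodeCount);
            counts.merge(node, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args)
    {
        long mb = 1024L * 1024L;
        String path = "s3://bucket/table/data.parquet"; // hypothetical path
        // Section as large as the file: every split shares one identifier.
        System.out.println(splitsPerNode(path, 4096 * mb, 64 * mb, 4096 * mb, 8));
        // Smaller sections: more distinct identifiers are available to hash.
        System.out.println(splitsPerNode(path, 4096 * mb, 64 * mb, 256 * mb, 8));
    }
}
```

When the section is as large as the file, all splits hash to the same identifier and therefore prefer the same node. Smaller sections yield more identifiers, which gives the hash more chances to spread splits across nodes, although the exact distribution depends on the hash function.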

Impact

Hive-specific configuration moved to common configuration.

Test Plan

Added a unit test to verify that the number of unique identifiers changes as the file section size scales up.
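The property being tested can be illustrated with a short Java sketch. It is not the test added in TestIcebergSplitManager; the class and method names are hypothetical, and it assumes the identifier for a split is the index of the section that contains the split's start offset.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the checked property; not the PR's unit test.
final class UniqueSectionIdCheck
{
    private UniqueSectionIdCheck() {}

    // Number of distinct section indexes produced by the splits of one file.
    static int uniqueIdentifiers(long fileSize, long splitSize, long sectionSize)
    {
        Set<Long> ids = new HashSet<>();
        for (long start = 0; start < fileSize; start += splitSize) {
            ids.add(start / sectionSize);
        }
        return ids.size();
    }

    public static void main(String[] args)
    {
        // Sizes are in arbitrary units; a 1024-unit file read in 64-unit splits.
        for (long sectionSize : new long[] {64, 256, 1024}) {
            System.out.println(sectionSize + " -> " + uniqueIdentifiers(1024, 64, sectionSize));
        }
    }
}
```

Under these assumptions, a larger section size yields fewer unique identifiers for the same file, so the count moves in the opposite direction to the section size.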

Contributor checklist

Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
Documented new properties (with its default value), SQL syntax, functions, or other functionality.
If release notes are required, they follow the release notes guidelines.
Adequate tests were added if applicable.
CI passed.

Release Notes

== RELEASE NOTES ==

Iceberg Connector Changes
* Add support for the ``hive.affinity-scheduling-file-section-size`` configuration property and ``affinity_scheduling_file_section_size`` session property.

steveburnett

LGTM! (docs)

Pull branch, local doc build, looks good. Thanks!

yingsu00

Mostly looks good. One minor correction on "splits not being scheduled to enough nodes": the splits were not necessarily scheduled to too few nodes; in general, the distribution was more skewed than with Hive, even when the splits were scheduled to the same number of nodes. Scheduling to fewer nodes happened non-deterministically when I ran the queries multiple times. More than half of the time the splits were scheduled to all nodes, but even in those cases the load was not as balanced as with Hive.

...rg/src/main/java/com/facebook/presto/iceberg/equalitydeletes/EqualityDeletesSplitSource.java

ZacBlanco · 2025-02-20T23:28:12Z

Thanks for the feedback, @yingsu00. I updated the PR description to be a bit more accurate.

presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergSplitSource.java

presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergSplitManager.java

hantangwangd

Thanks for the change, lgtm. A couple of little nits.

hantangwangd · 2025-02-24T14:32:03Z

presto-docs/src/main/sphinx/connector/iceberg.rst

                                                      Set to 0 to use the value in each Iceberg table's
                                                      ``read.split.target-size`` property.
+``iceberg.affinity_scheduling_file_section_size``     When the ``node_selection_strategy`` or
+                                                      ``hive.node-selection-strategy`` property is set to ``SOFT_AFFINITY``,


Should the property's name be iceberg.node-selection-strategy?

The way we register the config, I believe it is still hive.node-selection-strategy. The config comes from HiveCommonClientConfig.java, which is bound in HiveCommonModule.java. The injector doesn't register a prefix with the config, so it uses the same name as in the *Config class, which is hive.node-selection-strategy.

presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergSplit.java

hantangwangd · 2025-02-25T00:04:32Z

Oh yes, you are right. Perhaps in the future we should consider binding separate prefixes to the configs in presto-hive-common in each lakehouse connector's own Module.

prestodb-ci added the from:IBM PR from IBM label Feb 20, 2025

ZacBlanco changed the title ~~[Iceberg] Enable affinity scheduling file sections~~ [Iceberg] Enable affinity scheduling on file sections Feb 20, 2025

ZacBlanco force-pushed the upstream-iceberg-split-hashing branch 2 times, most recently from ac6fce7 to 6dfb1f2 Compare February 20, 2025 06:36

aaneja self-requested a review February 20, 2025 10:38

ZacBlanco force-pushed the upstream-iceberg-split-hashing branch 3 times, most recently from bc804bc to 484408b Compare February 20, 2025 20:41

ZacBlanco marked this pull request as ready for review February 20, 2025 21:42

ZacBlanco requested review from a team, elharo, hantangwangd and steveburnett as code owners February 20, 2025 21:42

ZacBlanco requested a review from presto-oss February 20, 2025 21:42

steveburnett previously approved these changes Feb 20, 2025

yingsu00 reviewed Feb 20, 2025

...rg/src/main/java/com/facebook/presto/iceberg/equalitydeletes/EqualityDeletesSplitSource.java

ZacBlanco force-pushed the upstream-iceberg-split-hashing branch from 484408b to fa2b10e Compare February 20, 2025 23:26

ZacBlanco dismissed steveburnett’s stale review via 7cf8789 February 20, 2025 23:33

ZacBlanco force-pushed the upstream-iceberg-split-hashing branch from fa2b10e to 7cf8789 Compare February 20, 2025 23:33

yingsu00 previously approved these changes Feb 21, 2025

aaneja previously approved these changes Feb 21, 2025

presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergSplitSource.java (outdated)

presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergSplitManager.java

ZacBlanco dismissed stale reviews from aaneja and yingsu00 via d4ae7ad February 21, 2025 17:09

ZacBlanco force-pushed the upstream-iceberg-split-hashing branch from 7cf8789 to d4ae7ad Compare February 21, 2025 17:09

hantangwangd reviewed Feb 24, 2025

ZacBlanco force-pushed the upstream-iceberg-split-hashing branch from d4ae7ad to 973a860 Compare February 24, 2025 18:48

ZacBlanco force-pushed the upstream-iceberg-split-hashing branch from 973a860 to 5f8f14e Compare February 24, 2025 19:31

hantangwangd approved these changes Feb 25, 2025

yingsu00 approved these changes Feb 25, 2025

yingsu00 merged commit 0927c8f into prestodb:master Feb 25, 2025
54 checks passed

This was referenced Mar 10, 2025

Add release notes for 0.292 unix280/presto#5

Closed

Add release notes for 0.292 unix280/presto#6

Closed

prestodb-ci mentioned this pull request Mar 28, 2025

Add release notes for 0.292 #24825

Merged

30 tasks

unidevel mentioned this pull request Apr 25, 2025

Add release notes for 0.292 unix280/presto#23

Closed

30 tasks

This was referenced May 6, 2025

Add release notes for 0.292 unix280/presto#28

Closed

Add release notes for 0.292 unix280/presto#29

Closed

aaneja mentioned this pull request Nov 4, 2025

Optimizer improvements for TPCDS #24276

Open
