Skip to content

Conversation

@ZacBlanco
Copy link
Contributor

@ZacBlanco ZacBlanco commented Feb 20, 2025

Description

This change moves the affinity scheduling file section size
configuration from HiveClientConfig and HiveSessionProperties
to HiveCommonClientConfig and HiveCommonSessionProperties so
that the iceberg connector can benefit from this scheduling
strategy when tables have a small number of files but a large
number of splits.

Motivation and Context

On tables with a small number of large files, queries may perform poorly due to the distribution in split scheduling being skewed. This is more likely to occur when there is a limited number of values being hashed to determine the preferred nodes to schedule to. By changing the identifier used for selecting the preferred nodes we increase the probability that the splits are scheduled more evenly across the cluster.

Impact

  • Hive-specific configuration moved to common configuration.

Test Plan

Added a unit test to verify that the number of unique identifiers changes as we scale up the file section size

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==

Iceberg Connector Changes
* Add support for the ``hive.affinity-scheduling-file-section-size`` configuration property and ``affinity_scheduling_file_section_size`` session property.

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Feb 20, 2025
@ZacBlanco ZacBlanco changed the title [Iceberg] Enable affinity scheduling file sections [Iceberg] Enable affinity scheduling on file sections Feb 20, 2025
@ZacBlanco ZacBlanco force-pushed the upstream-iceberg-split-hashing branch 2 times, most recently from ac6fce7 to 6dfb1f2 Compare February 20, 2025 06:36
@aaneja aaneja self-requested a review February 20, 2025 10:38
@ZacBlanco ZacBlanco force-pushed the upstream-iceberg-split-hashing branch 3 times, most recently from bc804bc to 484408b Compare February 20, 2025 20:41
@ZacBlanco ZacBlanco marked this pull request as ready for review February 20, 2025 21:42
@ZacBlanco ZacBlanco requested a review from presto-oss February 20, 2025 21:42
steveburnett
steveburnett previously approved these changes Feb 20, 2025
Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (docs)

Pull branch, local doc build, looks good. Thanks!

Copy link
Contributor

@yingsu00 yingsu00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good. One minor correction: "splits not being scheduled to enough nodes" : It's not necessarily they were not scheduled to enough nodes, but in general it had more skew than Hive, even when the splits were scheduled to the same number of nodes. Scheduling to less nodes happened non-determistically when I ran the queries multiple times. More than half times they did were scheduled to all nodes, but even in such cases the load was not as balanced as Hive.

@ZacBlanco ZacBlanco force-pushed the upstream-iceberg-split-hashing branch from 484408b to fa2b10e Compare February 20, 2025 23:26
@ZacBlanco
Copy link
Contributor Author

Thanks for the feedback @yingsu00 - I updated the PR description to be a bit more accurate

yingsu00
yingsu00 previously approved these changes Feb 21, 2025
aaneja
aaneja previously approved these changes Feb 21, 2025
@ZacBlanco ZacBlanco dismissed stale reviews from aaneja and yingsu00 via d4ae7ad February 21, 2025 17:09
@ZacBlanco ZacBlanco force-pushed the upstream-iceberg-split-hashing branch from 7cf8789 to d4ae7ad Compare February 21, 2025 17:09
Copy link
Member

@hantangwangd hantangwangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change, lgtm. A couple of little nits.

Set to 0 to use the value in each Iceberg table's
``read.split.target-size`` property.
``iceberg.affinity_scheduling_file_section_size`` When the ``node_selection_strategy`` or
``hive.node-selection-strategy`` property is set to ``SOFT_AFFINITY``,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the property's name be iceberg.node-selection-strategy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way we register the config, I believe it is still hive.node-selection-strategy. The config comes from HiveCommonClientConfig.java which is bound in HiveCommonModule.java. The injector doesn't register a prefix with the config, so it uses the same value as in the *Config class which is hive.node-selection-strategy

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, you are right. Perhaps in future we should consider binding separate prefixes to the configs in presto-hive-common in each lake house connector's own Module.

@ZacBlanco ZacBlanco force-pushed the upstream-iceberg-split-hashing branch from d4ae7ad to 973a860 Compare February 24, 2025 18:48
This change moves the affinity scheduling file section size
configuration from HiveClientConfig and HiveSessionProperties
to HiveCommonClientConfig and HiveCommonSessionProperties so
that the iceberg connector can benefit from this scheduling
strategy when tables have a small number of files but a large
number of splits.
@ZacBlanco ZacBlanco force-pushed the upstream-iceberg-split-hashing branch from 973a860 to 5f8f14e Compare February 24, 2025 19:31
Set to 0 to use the value in each Iceberg table's
``read.split.target-size`` property.
``iceberg.affinity_scheduling_file_section_size`` When the ``node_selection_strategy`` or
``hive.node-selection-strategy`` property is set to ``SOFT_AFFINITY``,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, you are right. Perhaps in future we should consider binding separate prefixes to the configs in presto-hive-common in each lake house connector's own Module.

@yingsu00 yingsu00 merged commit 0927c8f into prestodb:master Feb 25, 2025
54 checks passed
@prestodb-ci prestodb-ci mentioned this pull request Mar 28, 2025
30 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:IBM PR from IBM

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants