Skip to content

Make number of preferred nodes configurable#22562

Merged
arhimondr merged 1 commit intoprestodb:masterfrom
arhimondr:configure-preferred-nodes
Apr 26, 2024
Merged

Make number of preferred nodes configurable#22562
arhimondr merged 1 commit intoprestodb:masterfrom
arhimondr:configure-preferred-nodes

Conversation

@arhimondr
Copy link
Member

@arhimondr arhimondr commented Apr 18, 2024

Description

Allow changing number of preferred nodes when soft affinity scheduling is enabled

Motivation and Context

The default number of 2 may not always be sufficient. We've seen cases when there are too many fallbacks to "random" nodes.

Test Plan

CI

== RELEASE NOTES ==

General Changes
* Add `node-scheduler.max-preferred-nodes` configuration property to allow changing number of preferred nodes when soft affinity scheduling is enabled. :pr:`22562`

Copy link
Member

@hantangwangd hantangwangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me. Just a little question for discussing. Besides, should we add docs and release not for the config parameter node-scheduler.max-preferred-nodes?

Comment on lines -166 to -169
// Using all nodes for soft affinity scheduling with modular hashing because otherwise temporarily down nodes would trigger too much rehashing
if (nodeSelectionHashStrategy == MODULAR_HASHING) {
nodeProvider = new ModularHashingNodeProvider(nodeMap.getAllNodes());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional to use nodeMap.allNodes to build ModularHashingNodeProvider for all split nodeSelectionStrategies including HARD_AFFINITY and NO_PREFERENCE besides SOFT_AFFINITY? As I understand the whole change would affect them as well, is that OK?

Copy link
Member Author

@arhimondr arhimondr Apr 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question.

nodeProvider is only used when there is an affinity preference (HARD_AFFINITY, SOFT_AFFINITY). When there is no affinity preference (NO_PREFERENCE) the nodeProvider is not used and random nodes are selected via randomNodeSelection.pickNodes(split).

For SOFT_AFFINITY when the MODULAR_HASHING strategy is selected nothing changes.

The only place when a potential change is possible is when HARD_AFFINITY is enabled, however I don't think it is the case in practice.

When HARD_AFFINITY is enabled the connector should return a set of "hard" allocated nodes. Meaning that a split is only allowed to be scheduled on that very specific set of nodes. Hence in theory a connector should never consult a nodeProvider when HARD_AFFINITY is required, as the nodeProvider does not guarantee any specific nodes to be selected. Based on what I checked it's always the case. All connectors that call NodeProvider#get only do so when SOFT_AFFINITY is preferred:

However I agree, the interface today makes it rather obscure. I guess a better interface could've been something like ConnectorSplit#getSoftAffinityNodeSelectionKey. We might want to consider a follow up refactor changing the interface to make it more explicit and easier to follow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the detailed message. Now I get that in practice, the situation is indeed as you said. And I agree with you that a follow-up refactor to make the interface more explicit would be great. LGTM, thanks!

@arhimondr arhimondr force-pushed the configure-preferred-nodes branch from ae67568 to 70e9dce Compare April 23, 2024 17:03
@steveburnett
Copy link
Contributor

Overall looks good to me. Just a little question for discussing. Besides, should we add docs and release not for the config parameter node-scheduler.max-preferred-nodes?

Yes, please, to both. Here's a possible draft for the release note entry:

== RELEASE NOTES ==

General Changes
* Add `node-scheduler.max-preferred-nodes` configuration property to allow changing number of preferred nodes when soft affinity scheduling is enabled. :pr:`22562`

@arhimondr
Copy link
Member Author

Thank you @steveburnett . Let me also work on a change to the docs.

import static java.util.Objects.requireNonNull;

public class ConsistentHashingNodeProvider
implements NodeProvider
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we removing this ? so that we can plugin the nodeCount preference dynamically .. seems hacky

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually not sure if NodeProvider interface is necessary. This interface seems a little counterintuitive to me. A NodeProvide is passed to the split just to be called with a key. I wonder if a better interface would be to have something like ConnectorSplit#getSoftAffinityNodeSelectionKey that returns a String (or an object) similar to what is done for fragment level caching with ConnectorSplit#getSplitIdentifier. Then a scheduler can make a decision on it's own of how many preferred nodes are required. Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing what this PR changes fundamentally, it moves the number of preferred nodes selection from a connector to scheduler. Previously it was a connector responsibility to request a certain number of preferred nodes. Now the scheduler will decide.

From a practicality standpoint it feels like it's easier to make this selection on coordinator (a single property for every connector). But I wonder if there was some deeper thought process behind this that motivated to implement it this way.

@NikhilCollooru can you think of a situation when we think it would actually be preferred for a connector to chose of how many soft affinity nodes to chose for a certain split?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i dont see a usecase where you configure the node count based on connector.
so moving the logic to scheduler seems the better way to do it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Let me follow up with a refactor PR.

default:
throw new IllegalArgumentException(format("Unknown NodeSelectionHashStrategy: %s", nodeSelectionHashStrategy));
if (consistentHashingNodeProvider.isPresent()) {
return (key) -> consistentHashingNodeProvider.get().get(key, nodeCount);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels little hacky ..to plugin the node count dynamically.

Copy link
Contributor

@NikhilCollooru NikhilCollooru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the comment i made earlier, the changes look okay to me.

@arhimondr arhimondr merged commit 12f1ae6 into prestodb:master Apr 26, 2024
@arhimondr arhimondr deleted the configure-preferred-nodes branch April 26, 2024 15:14
@wanglinsong wanglinsong mentioned this pull request Jun 25, 2024
36 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants