Make number of preferred nodes configurable#22562
Conversation
2d9dba1 to
ae67568
Compare
hantangwangd
left a comment
There was a problem hiding this comment.
Overall looks good to me. Just a little question for discussing. Besides, should we add docs and release not for the config parameter node-scheduler.max-preferred-nodes?
| // Using all nodes for soft affinity scheduling with modular hashing because otherwise temporarily down nodes would trigger too much rehashing | ||
| if (nodeSelectionHashStrategy == MODULAR_HASHING) { | ||
| nodeProvider = new ModularHashingNodeProvider(nodeMap.getAllNodes()); | ||
| } |
There was a problem hiding this comment.
Is it intentional to use nodeMap.allNodes to build ModularHashingNodeProvider for all split nodeSelectionStrategies including HARD_AFFINITY and NO_PREFERENCE besides SOFT_AFFINITY? As I understand the whole change would affect them as well, is that OK?
There was a problem hiding this comment.
Great question.
nodeProvider is only used when there is an affinity preference (HARD_AFFINITY, SOFT_AFFINITY). When there is no affinity preference (NO_PREFERENCE) the nodeProvider is not used and random nodes are selected via randomNodeSelection.pickNodes(split).
For SOFT_AFFINITY when the MODULAR_HASHING strategy is selected nothing changes.
The only place when a potential change is possible is when HARD_AFFINITY is enabled, however I don't think it is the case in practice.
When HARD_AFFINITY is enabled the connector should return a set of "hard" allocated nodes. Meaning that a split is only allowed to be scheduled on that very specific set of nodes. Hence in theory a connector should never consult a nodeProvider when HARD_AFFINITY is required, as the nodeProvider does not guarantee any specific nodes to be selected. Based on what I checked it's always the case. All connectors that call NodeProvider#get only do so when SOFT_AFFINITY is preferred:
However I agree, the interface today makes it rather obscure. I guess a better interface could've been something like ConnectorSplit#getSoftAffinityNodeSelectionKey. We might want to consider a follow up refactor changing the interface to make it more explicit and easier to follow.
There was a problem hiding this comment.
Thank you for the detailed message. Now I get that in practice, the situation is indeed as you said. And I agree with you that a follow-up refactor to make the interface more explicit would be great. LGTM, thanks!
ae67568 to
70e9dce
Compare
Yes, please, to both. Here's a possible draft for the release note entry: |
|
Thank you @steveburnett . Let me also work on a change to the docs. |
| import static java.util.Objects.requireNonNull; | ||
|
|
||
| public class ConsistentHashingNodeProvider | ||
| implements NodeProvider |
There was a problem hiding this comment.
why are we removing this ? so that we can plugin the nodeCount preference dynamically .. seems hacky
There was a problem hiding this comment.
I'm actually not sure if NodeProvider interface is necessary. This interface seems a little counterintuitive to me. A NodeProvide is passed to the split just to be called with a key. I wonder if a better interface would be to have something like ConnectorSplit#getSoftAffinityNodeSelectionKey that returns a String (or an object) similar to what is done for fragment level caching with ConnectorSplit#getSplitIdentifier. Then a scheduler can make a decision on it's own of how many preferred nodes are required. Thoughts?
There was a problem hiding this comment.
Another thing what this PR changes fundamentally, it moves the number of preferred nodes selection from a connector to scheduler. Previously it was a connector responsibility to request a certain number of preferred nodes. Now the scheduler will decide.
From a practicality standpoint it feels like it's easier to make this selection on coordinator (a single property for every connector). But I wonder if there was some deeper thought process behind this that motivated to implement it this way.
@NikhilCollooru can you think of a situation when we think it would actually be preferred for a connector to chose of how many soft affinity nodes to chose for a certain split?
There was a problem hiding this comment.
yeah i dont see a usecase where you configure the node count based on connector.
so moving the logic to scheduler seems the better way to do it.
There was a problem hiding this comment.
Sounds good. Let me follow up with a refactor PR.
| default: | ||
| throw new IllegalArgumentException(format("Unknown NodeSelectionHashStrategy: %s", nodeSelectionHashStrategy)); | ||
| if (consistentHashingNodeProvider.isPresent()) { | ||
| return (key) -> consistentHashingNodeProvider.get().get(key, nodeCount); |
There was a problem hiding this comment.
This feels little hacky ..to plugin the node count dynamically.
NikhilCollooru
left a comment
There was a problem hiding this comment.
Apart from the comment i made earlier, the changes look okay to me.
Description
Allow changing number of preferred nodes when soft affinity scheduling is enabled
Motivation and Context
The default number of
2may not always be sufficient. We've seen cases when there are too many fallbacks to "random" nodes.Test Plan
CI