Overall Decision for Deciders prioritizes THROTTLE#140237
Overall Decision for Deciders prioritizes THROTTLE#140237nicktindall merged 12 commits intoelastic:mainfrom
Conversation
Fixing a bug where AllocationDeciders could summarize AllocationDecider responses as NOT_PREFERRED, which allows shard movement, when an AllocationDecider responded THROTTLE. Relates ES-13903
|
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) |
|
Hi @DiannaHohensee, I've created a changelog YAML for you. |
|
Henning pointed out that I've missed the Type#min comparison usage -- so I'll add a fix for that momentarily. I should add test coverage, too. |
…ator/reconciler, as well as unit test the method directly.
| ); | ||
| } | ||
|
|
||
| public static class NotPreferredPlugin extends Plugin implements ClusterPlugin { |
There was a problem hiding this comment.
Nit: seeing as this does throttle and not preferred, perhaps the name should be changed to reflect that?
| if (newDecision.type().isWorseForTheSameNode(worstDecision.type())) { | ||
| worstDecision = newDecision; | ||
| if (worstDecision.type() == Decision.Type.NO) { | ||
| traceNoDecisions(decider, newDecision, logMessageCreator); |
There was a problem hiding this comment.
In the interest of keeping the change to a minimum, can we leave the variable naming as-is? It just makes it easier to see what changed that way IMO. I don't think there is a signficant difference in readability between "worst" and "mostNegative", or "decision" and "newDecision" to warrant the additional delta (thinking about people doing archaeology later)
| public static Type min(Type a, Type b) { | ||
| return a.compareTo(b) < 0 ? a : b; | ||
| return a.isWorseForTheSameNode(b) ? a : b; | ||
| } |
There was a problem hiding this comment.
Perhaps we should get rid of min altogether, higherThan and min suggest there is a natural order, when clearly there isn't.
| case YES -> { | ||
| yield true; | ||
| } | ||
| }; |
There was a problem hiding this comment.
Can we declare these orders like
sameNodeOrder = [NO, THROTTLE, NOT_PREFERRED, YES]and use the indices to do the comparison?
There was a problem hiding this comment.
Attempted in 76c2b99
I understand this might be considered slightly harder to read, but it does follow the Comparable#compareTo convention so should be something Java people are familiar with. I think I prefer it to isWorseThan... and isBetterThan... but we could always add them as convenience methods if we think that makes it nicer?
server/src/main/java/org/elasticsearch/cluster/routing/allocation/decider/Decision.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/routing/allocation/decider/Decision.java
Show resolved
Hide resolved
| final Decision.Type canAllocateOrRebalance = allocationDecision.type() == Type.THROTTLE | ||
| || rebalanceDecision.type() == Type.THROTTLE ? Type.THROTTLE : Type.YES; |
There was a problem hiding this comment.
Can we capture this as a method on Type? Or should this use compareToBetweenDecisions since it's technically still decision aggreation for a single node?
There was a problem hiding this comment.
I'd prefer not to use compareToBetweenDecisions because it sort-of implies that the order used by that method is significant, here we are dealing only with THROTTLE and YES so I think it's good to make it clear that it only cares about that.
I think we could add it to Type but perhaps as a subsequent PR to prevent holding things up any further.
| for (AllocationDecider decider : deciders) { | ||
| var decision = deciderAction.apply(decider); | ||
| if (mostNegativeDecision.type().higherThan(decision.type())) { | ||
| if (mostNegativeDecision.type().compareToBetweenDecisions(decision.type()) > 0) { |
There was a problem hiding this comment.
For my understanding, this is a behaviour change to the existing code, right? It currently takes a THROTTLE over NOT_PREFERRED. But it will be the opposite after this change?
Essentially, the two places where compareToBetweenDecisions is used indicates behaviour change compared to the existing code? It is effectively reverting to the behaviour before #137228. Is that the target here? Although it does not align with the PR description which says
Fixing a bug where AllocationDeciders could summarize
AllocationDecider responses as NOT_PREFERRED, which allows shard
movement, when an AllocationDecider responded THROTTLE.
I am a bit confused ...
There was a problem hiding this comment.
It is effectively reverting to the behaviour before #137228
Yep, that's the goal. It turns out there are two prioritisation orders, so a single order can't define them here we make them explicit
There was a problem hiding this comment.
Yes, that is the target, to revert the between decision comparison to prior to that PR but keep the node-level comparison from the PR.
With the change of this PR, we will end up with THROTTLE overruling NOT-PREFERRED, like it used to. Notice that here the most negative decision we've seen so far is "higher" than the current decision and we thus pick the decision moving forward. NOT-PREFERRED.compareToBetweenDecisions(THROTTLE) is > 0, making us keep THROTTLE. (sorry for spelling this out, trying to save a round of interaction).
There was a problem hiding this comment.
Ah ok thanks both for explaining, especially the explict call out from Henning! I got it now.
...terTest/java/org/elasticsearch/cluster/routing/allocation/allocator/NonDesiredBalanceIT.java
Outdated
Show resolved
Hide resolved
…routing/allocation/allocator/NonDesiredBalanceIT.java Co-authored-by: Yang Wang <ywangd@gmail.com>
nicktindall
left a comment
There was a problem hiding this comment.
I'm happy with the PR as-is, will implement any required tidy up as a follow up
| protected Settings nodeSettings(int nodeOrdinal, Settings otherSettings) { | ||
| return Settings.builder() | ||
| .put(super.nodeSettings(nodeOrdinal, otherSettings)) | ||
| .put(ClusterModule.SHARDS_ALLOCATOR_TYPE_SETTING.getKey(), ClusterModule.BALANCED_ALLOCATOR) | ||
| .build(); | ||
| } |
There was a problem hiding this comment.
Nit: Since we are starting node manually for eadch test. I think we can merge the two test classes and start node in each test method with its own node settings. That seems less clutter to me.
There was a problem hiding this comment.
Yeah definitely some de-duplication due here. I'll add it to the follow-up
💚 Backport successful
|
Fixing a bug where AllocationDeciders could summarize AllocationDecider responses as NOT_PREFERRED, which allows shard movement, when an AllocationDecider responded THROTTLE. Relates ES-13903 Co-authored-by: Nick Tindall <nick.tindall@elastic.co> Co-authored-by: Henning Andersen <33268011+henningandersen@users.noreply.github.com> Co-authored-by: Yang Wang <ywangd@gmail.com>
* upstream/main: (191 commits) Overall Decision for Deciders prioritizes THROTTLE (elastic#140237) Apply group by all logic not only to top-level aggregates (elastic#140248) [ES|QL] Refactor MV_UNION and MV_INTERSECTION to use shared set operation helper (elastic#139982) Avoid reading entire bloom filter file on reader open (elastic#139374) Mark bloom filter files for random access (elastic#139375) Ensure that the buffer used for ES93BloomFilterStoredFieldsFormat is zeroed (elastic#139034) Add busy assertion to avoid race condition for testStalledShardMigrationProperlyDetected (elastic#140230) Remove line number check for testTransitiveFindsDeepCallChain (elastic#140228) Allow a slight difference in rescored docs (elastic#139931) Mute org.elasticsearch.xpack.inference.integration.AuthorizationTaskExecutorIT testCreatesEisChatCompletion_DoesNotRemoveEndpointWhenNoLongerAuthorized elastic#138480 Start exchange sink fetchers concurrently (elastic#140196) Allow allocation to replacement target node on vacate completion (elastic#140150) Ignore JNA cleaner threads in SecureHdfsRepositoryAnalysisRestIT (elastic#139925) DeterministicQueue refactor and enhancement (elastic#140151) Always error out if CCS expression shows up when CCS is not supported (elastic#139009) Use IllegalArgumentException over RepositoryException for readonly-repository checks (elastic#140200) Guard promql capabilities in AnalyzerTests (elastic#140232) [Inference API] Fix flaky AuthorizationTaskExecutorIT tests (elastic#139978) Cleaning up exitable vector value impls (elastic#140190) [Inference API] Fix auth exception listener not called bug (elastic#139966) ...
|
fwiw if this is the causing PR, assuming will be muted soon, but failing quite frequently on main if i'm not mistaken: ~/repos/elasticsearch $ ./gradlew ":server:test" --tests "org.elasticsearch.cluster.routing.allocation.decider.AllocationDecidersTests.testCheckAllDecidersBeforeReturningNotPreferred" -Dtests.iters=10000
...
2> REPRODUCE WITH: ./gradlew ":server:test" --tests "org.elasticsearch.cluster.routing.allocation.decider.AllocationDecidersTests.testCheckAllDecidersBeforeReturningNotPreferred {seed=[85009956502A5DC7:200A75E136AA6C09]}" -Dtests.seed=85009956502A5DC7 -Dtests.locale=ca-FR -Dtests.timezone=Indian/Chagos -Druntime.java=25
2> java.lang.AssertionError:
Expected: <NOT_PREFERRED()>
but: was <THROTTLE()>
at __randomizedtesting.SeedInfo.seed([85009956502A5DC7:200A75E136AA6C09]:0)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
at org.elasticsearch.test.ESTestCase.assertThat(ESTestCase.java:2853)
at org.elasticsearch.cluster.routing.allocation.decider.AllocationDecidersTests.lambda$verifyDecidersCall$21(AllocationDecidersTests.java:194)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at org.elasticsearch.cluster.routing.allocation.decider.AllocationDecidersTests.verifyDecidersCall(AllocationDecidersTests.java:182)
...
10000 tests completed, 3210 failed |
THROTTLE to take precedence over NOT_PREFERRED. This behaviour was changed in elastic#140237 but this randomized test was not updated. Fixed here.
THROTTLE to take precedence over NOT_PREFERRED. This behaviour was changed in #140237 but this randomized test was not updated. Fixed here.
THROTTLE to take precedence over NOT_PREFERRED. This behaviour was changed in elastic#140237 but this randomized test was not updated. Fixed here.
Fixing a bug where AllocationDeciders could summarize AllocationDecider responses as NOT_PREFERRED, which allows shard movement, when an AllocationDecider responded THROTTLE. Relates ES-13903 Co-authored-by: Nick Tindall <nick.tindall@elastic.co> Co-authored-by: Henning Andersen <33268011+henningandersen@users.noreply.github.com> Co-authored-by: Yang Wang <ywangd@gmail.com>
THROTTLE to take precedence over NOT_PREFERRED. This behaviour was changed in elastic#140237 but this randomized test was not updated. Fixed here.
Fixing a bug where AllocationDeciders could summarize AllocationDecider responses as NOT_PREFERRED, which allows shard movement, when an AllocationDecider responded THROTTLE. Relates ES-13903 Co-authored-by: Nick Tindall <nick.tindall@elastic.co> Co-authored-by: Henning Andersen <33268011+henningandersen@users.noreply.github.com> Co-authored-by: Yang Wang <ywangd@gmail.com>
Fixing a bug where AllocationDeciders could summarize
AllocationDecider responses as NOT_PREFERRED, which allows shard
movement, when an AllocationDecider responded THROTTLE.
Relates ES-13903