-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Description
Elasticsearch version (bin/elasticsearch --version): commit c3ab6f7
Plugins installed: []
JVM version (java -version):
openjdk version "12.0.2" 2019-07-16
OpenJDK Runtime Environment AdoptOpenJDK (build 12.0.2+10)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 12.0.2+10, mixed mode)
OS version (uname -a if on a Unix-like system):
Darwin Roberts-MacBook-Pro.local 19.6.0 Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64 x86_64
Description of the problem including expected versus actual behavior:
The implementation of the C3 algorithm uses 1/serviceTime in place of serviceTime.
i.e.
elasticsearch/server/src/main/java/org/elasticsearch/node/ResponseCollectorService.java
Line 176 in fd3d7e3
| double muBarS = serviceTime / FACTOR; |
should be
double muBarS = FACTOR / serviceTime;
The confusion derives from the fact that the paper in question uses the notation mu_s to denote the service rate, which is the inverse of the service-time.
This results in incorrect choices of the fastest node.
Steps to reproduce:
For example, if you apply this change to tests:
diff --git a/server/src/test/java/org/elasticsearch/cluster/routing/OperationRoutingTests.java b/server/src/test/java/org/elasticsearch/cluster/routing/OperationRoutingTests.java
index c05d0a551fe..f2bf47e49b2 100644
--- a/server/src/test/java/org/elasticsearch/cluster/routing/OperationRoutingTests.java
+++ b/server/src/test/java/org/elasticsearch/cluster/routing/OperationRoutingTests.java
@@ -571,8 +571,8 @@ public class OperationRoutingTests extends ESTestCase{
assertThat(searchedShards.size(), equalTo(3));
// Now let's start adding node metrics, since that will affect which node is chosen
- collector.addNodeStatistics("node_0", 2, TimeValue.timeValueMillis(200).nanos(), TimeValue.timeValueMillis(150).nanos());
- collector.addNodeStatistics("node_1", 1, TimeValue.timeValueMillis(100).nanos(), TimeValue.timeValueMillis(50).nanos());
+ collector.addNodeStatistics("node_0", 1, TimeValue.timeValueMillis(200).nanos(), TimeValue.timeValueMillis(150).nanos());
+ collector.addNodeStatistics("node_1", 1, TimeValue.timeValueMillis(200).nanos(), TimeValue.timeValueMillis(50).nanos());
collector.addNodeStatistics("node_2", 1, TimeValue.timeValueMillis(200).nanos(), TimeValue.timeValueMillis(200).nanos());
outstandingRequests.put("node_0", 1L);
outstandingRequests.put("node_1", 1L);
You can see that with the same number of queued requests and the same client response time for each node, the algorithm should choose the node with the lowest service time (i.e. node 1), however the test fails because the current implementation chooses node_2 (the slowest).
Provide logs (if relevant):