
ARS implementation inverts the priority of servicetime #65838

@robertclancy

Description


Elasticsearch version (bin/elasticsearch --version): commit c3ab6f7

Plugins installed: []

JVM version (java -version):

openjdk version "12.0.2" 2019-07-16
OpenJDK Runtime Environment AdoptOpenJDK (build 12.0.2+10)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 12.0.2+10, mixed mode)

OS version (uname -a if on a Unix-like system):

Darwin Roberts-MacBook-Pro.local 19.6.0 Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64 x86_64

Description of the problem including expected versus actual behavior:

The implementation of the C3 algorithm uses 1/serviceTime in place of serviceTime.

i.e. the line that currently computes muBarS

should be

double muBarS = FACTOR / serviceTime;

The confusion stems from the fact that the paper in question uses the notation mu_s to denote the service rate, which is the inverse of the service time.

This results in incorrect choices of the fastest node.
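To make the inversion concrete, here is a minimal, self-contained sketch of the C3 rank formula from the paper, rank = responseTime - 1/muBar + qHat^b / muBar, with muBar as the service rate and b = 3 (the default queue adjustment factor). The class and parameter values below are illustrative, not the actual Elasticsearch implementation:

```java
// Minimal sketch of the C3 rank formula (illustrative, not the actual
// Elasticsearch code). muBar must be the service *rate* (1 / serviceTime);
// if it were set proportional to serviceTime instead, slower nodes would
// receive lower (better) ranks, which is the inversion this issue describes.
public class C3RankSketch {
    static double rank(double responseTime, double serviceTime,
                       long outstandingRequests, double queueSize) {
        double qHat = 1 + outstandingRequests + queueSize; // estimated queue size
        double muBar = 1.0 / serviceTime;                  // service rate (the fix)
        return responseTime - 1.0 / muBar + Math.pow(qHat, 3) / muBar;
    }

    public static void main(String[] args) {
        // Same response time and queue pressure; only the service time differs.
        double fastNode = rank(200, 50, 1, 1);  // like node_1: low service time
        double slowNode = rank(200, 200, 1, 1); // like node_2: high service time
        // With muBar as a rate, the faster node gets the lower (better) rank.
        System.out.println(fastNode < slowNode); // prints "true"
    }
}
```

Replacing the muBar line with a value proportional to serviceTime flips the comparison, matching the failing test below.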

Steps to reproduce:

For example, apply this change to the tests:

diff --git a/server/src/test/java/org/elasticsearch/cluster/routing/OperationRoutingTests.java b/server/src/test/java/org/elasticsearch/cluster/routing/OperationRoutingTests.java
index c05d0a551fe..f2bf47e49b2 100644
--- a/server/src/test/java/org/elasticsearch/cluster/routing/OperationRoutingTests.java
+++ b/server/src/test/java/org/elasticsearch/cluster/routing/OperationRoutingTests.java
@@ -571,8 +571,8 @@ public class OperationRoutingTests extends ESTestCase{
         assertThat(searchedShards.size(), equalTo(3));
 
         // Now let's start adding node metrics, since that will affect which node is chosen
-        collector.addNodeStatistics("node_0", 2, TimeValue.timeValueMillis(200).nanos(), TimeValue.timeValueMillis(150).nanos());
-        collector.addNodeStatistics("node_1", 1, TimeValue.timeValueMillis(100).nanos(), TimeValue.timeValueMillis(50).nanos());
+        collector.addNodeStatistics("node_0", 1, TimeValue.timeValueMillis(200).nanos(), TimeValue.timeValueMillis(150).nanos());
+        collector.addNodeStatistics("node_1", 1, TimeValue.timeValueMillis(200).nanos(), TimeValue.timeValueMillis(50).nanos());
         collector.addNodeStatistics("node_2", 1, TimeValue.timeValueMillis(200).nanos(), TimeValue.timeValueMillis(200).nanos());
         outstandingRequests.put("node_0", 1L);
         outstandingRequests.put("node_1", 1L);

You can see that with the same number of queued requests and the same client response time for each node, the algorithm should choose the node with the lowest service time (i.e. node_1); however, the test fails because the current implementation chooses node_2 (the slowest).



Labels

:Search/Search (Search-related issues that do not fall into other categories), >bug, Team:Search (Meta label for search team)
