Skip to content

Conversation

@idegtiarenko
Copy link
Contributor

No description provided.

@idegtiarenko idegtiarenko added :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. labels Sep 14, 2022
this.totalBytes = totalBytes;
this.path = path;
this.totalBytes = totalBytes;
this.freeBytes = freeBytes;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reordered to match constructor args order. Will port it to the main branch separately later.

// relocation
increaseTargetDiskUsage(shard.relocatingNodeId(), size);
decreaseSourceDiskUsage(shard.currentNodeId(), size);
// TODO update shard data path?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

routingToDataPath is keyed by a ShardRouting. If we want to update it as well then we need to also pass the state of the shard after it was initialized here.

final Function<ShardRouting, Long> shardSizeFunctionCopy = MockInternalClusterInfoService.this.shardSizeFunction;
if (shardSizeFunctionCopy == null) {
return super.getShardSize(shardRouting);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This faked only a getter call rather then entire underlying map that is used for a calculations in ClusterInfoSimulator. Updating entire ShardStats in adjustShardStats similar to adjustNodesStats. This should fix failures in MockDiskUsagesIT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit is ported to the master branch: https://github.com/elastic/elasticsearch/pull/90092/files to minimize the diff

@idegtiarenko
Copy link
Contributor Author

@elasticmachine please run elasticsearch-ci/bwc

@idegtiarenko idegtiarenko marked this pull request as ready for review September 15, 2022 12:24
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

…sage

# Conflicts:
#	server/src/main/java/org/elasticsearch/cluster/ClusterInfo.java
#	server/src/main/java/org/elasticsearch/cluster/DiskUsage.java
Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I left some comments about tests.

}

@Override
public boolean equals(Object o) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use EqualsHashCodeTestUtils.checkEqualsAndHashCode to test this in ClusterInfoTests. Alternatively add a dedicated AbstractWireSerializingTestCase derivative for ClusterInfo which would cover equals and hashcode as well as replacing ClusterInfoTests#testSerialization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the change to equals/hashCode and their tests be ported to main separately?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that'd be good.

final var changes = routingAllocation.changes();
final var knownNodeIds = routingAllocation.nodes().stream().map(DiscoveryNode::getId).collect(Collectors.toSet());
final var unassignedPrimaries = new HashSet<ShardId>();
final var clusterInfoSimulator = new ClusterInfoSimulator(routingAllocation.clusterInfo());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we enhance DesiredBalanceComputerTests to verify we're accounting for disk usage in all the right places?

…sage

# Conflicts:
#	server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I don't think we're really testing that the DesiredBalanceComputer is using the ClusterInfoSimulator correctly. Specifically, DesiredBalanceComputerTests all pass even if I remove all mention of the simulator:

diff --git a/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java b/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
index 50bde0b40dd..f1a40911ba1 100644
--- a/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
+++ b/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
@@ -10,7 +10,6 @@ package org.elasticsearch.cluster.routing.allocation.allocator;

 import org.apache.logging.log4j.LogManager;
 import org.apache.logging.log4j.Logger;
-import org.elasticsearch.cluster.ClusterInfoSimulator;
 import org.elasticsearch.cluster.node.DiscoveryNode;
 import org.elasticsearch.cluster.routing.RoutingNodes;
 import org.elasticsearch.cluster.routing.ShardRouting;
@@ -57,7 +56,6 @@ public class DesiredBalanceComputer {
         final var routingNodes = routingAllocation.routingNodes();
         final var ignoredShards = new HashSet<>(desiredBalanceInput.ignoredShards());
         final var changes = routingAllocation.changes();
-        final var clusterInfoSimulator = new ClusterInfoSimulator(routingAllocation.clusterInfo());

         if (routingNodes.isEmpty()) {
             return new DesiredBalance(desiredBalanceInput.index(), Map.of());
@@ -67,7 +65,6 @@ public class DesiredBalanceComputer {
         for (final var routingNode : routingNodes) {
             for (final var shardRouting : routingNode) {
                 if (shardRouting.initializing()) {
-                    clusterInfoSimulator.simulate(shardRouting);
                     routingNodes.startShard(logger, shardRouting, changes, 0L);
                 }
             }
@@ -128,7 +125,6 @@ public class DesiredBalanceComputer {
                 assert shardRouting.started();
                 if (targetNodesIterator.hasNext()) {
                     ShardRouting shardToRelocate = routingNodes.relocateShard(shardRouting, targetNodesIterator.next(), 0L, changes).v2();
-                    clusterInfoSimulator.simulate(shardToRelocate);
                     routingNodes.startShard(logger, shardToRelocate, changes, 0L);
                 } else {
                     break;
@@ -154,7 +150,6 @@ public class DesiredBalanceComputer {
                 if (nodeIds != null && nodeIds.isEmpty() == false) {
                     final String nodeId = nodeIds.removeFirst();
                     ShardRouting shardToInitialized = unassignedPrimaryIterator.initialize(nodeId, null, 0L, changes);
-                    clusterInfoSimulator.simulate(shardToInitialized);
                     routingNodes.startShard(logger, shardToInitialized, changes, 0L);
                 }
             }
@@ -168,7 +163,6 @@ public class DesiredBalanceComputer {
                 if (nodeIds != null && nodeIds.isEmpty() == false) {
                     final String nodeId = nodeIds.removeFirst();
                     ShardRouting shardToInitialize = unassignedReplicaIterator.initialize(nodeId, null, 0L, changes);
-                    clusterInfoSimulator.simulate(shardToInitialize);
                     routingNodes.startShard(logger, shardToInitialize, changes, 0L);
                 }
             }
@@ -212,7 +206,6 @@ public class DesiredBalanceComputer {
                 // TODO test that we reset ignored shards properly
             }

-            routingAllocation.setSimulatedClusterInfo(clusterInfoSimulator.getClusterInfo());
             logger.trace("running delegate allocator");
             delegateAllocator.allocate(routingAllocation);
             assert routingNodes.unassigned().size() == 0; // any unassigned shards should now be ignored
@@ -222,7 +215,6 @@ public class DesiredBalanceComputer {
                 for (final var shardRouting : routingNode) {
                     if (shardRouting.initializing()) {
                         hasChanges = true;
-                        clusterInfoSimulator.simulate(shardRouting);
                         routingNodes.startShard(logger, shardRouting, changes, 0L);
                         logger.trace("starting shard {}", shardRouting);
                     }

@idegtiarenko
Copy link
Contributor Author

@elasticsearchmachine please run elasticsearch-ci/bwc

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I would like a follow-up to extend the tests to verify the usage of the clusterInfoSimulator in the preparatory steps of the DesiredBalanceComputer. Specifically, the tests all still pass even with these lines removed:

diff --git a/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java b/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
index 50bde0b40dd..e3784be6551 100644
--- a/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
+++ b/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
@@ -67,7 +67,6 @@ public class DesiredBalanceComputer {
         for (final var routingNode : routingNodes) {
             for (final var shardRouting : routingNode) {
                 if (shardRouting.initializing()) {
-                    clusterInfoSimulator.simulate(shardRouting);
                     routingNodes.startShard(logger, shardRouting, changes, 0L);
                 }
             }
@@ -128,7 +127,6 @@ public class DesiredBalanceComputer {
                 assert shardRouting.started();
                 if (targetNodesIterator.hasNext()) {
                     ShardRouting shardToRelocate = routingNodes.relocateShard(shardRouting, targetNodesIterator.next(), 0L, changes).v2();
-                    clusterInfoSimulator.simulate(shardToRelocate);
                     routingNodes.startShard(logger, shardToRelocate, changes, 0L);
                 } else {
                     break;
@@ -154,7 +152,6 @@ public class DesiredBalanceComputer {
                 if (nodeIds != null && nodeIds.isEmpty() == false) {
                     final String nodeId = nodeIds.removeFirst();
                     ShardRouting shardToInitialized = unassignedPrimaryIterator.initialize(nodeId, null, 0L, changes);
-                    clusterInfoSimulator.simulate(shardToInitialized);
                     routingNodes.startShard(logger, shardToInitialized, changes, 0L);
                 }
             }
@@ -168,7 +165,6 @@ public class DesiredBalanceComputer {
                 if (nodeIds != null && nodeIds.isEmpty() == false) {
                     final String nodeId = nodeIds.removeFirst();
                     ShardRouting shardToInitialize = unassignedReplicaIterator.initialize(nodeId, null, 0L, changes);
-                    clusterInfoSimulator.simulate(shardToInitialize);
                     routingNodes.startShard(logger, shardToInitialize, changes, 0L);
                 }
             }

…sage

# Conflicts:
#	server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
#	server/src/test/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputerTests.java
@idegtiarenko
Copy link
Contributor Author

@elasticsearchmachine please run elasticsearch-ci/part-1

@idegtiarenko idegtiarenko merged commit 50cdbe5 into elastic:feature/desired-balance-allocator Oct 5, 2022
@idegtiarenko idegtiarenko deleted the simulate_disk_usage branch October 5, 2022 10:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants