-
Notifications
You must be signed in to change notification settings - Fork 25.6k
simulate disk usage during balance calculation #90061
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
simulate disk usage during balance calculation #90061
Conversation
| this.totalBytes = totalBytes; | ||
| this.path = path; | ||
| this.totalBytes = totalBytes; | ||
| this.freeBytes = freeBytes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reordered to match constructor args order. Will port it to the main branch separately later.
| // relocation | ||
| increaseTargetDiskUsage(shard.relocatingNodeId(), size); | ||
| decreaseSourceDiskUsage(shard.currentNodeId(), size); | ||
| // TODO update shard data path? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
routingToDataPath is keyed by a ShardRouting. If we want to update it as well then we need to also pass the state of the shard after it was initialized here.
| final Function<ShardRouting, Long> shardSizeFunctionCopy = MockInternalClusterInfoService.this.shardSizeFunction; | ||
| if (shardSizeFunctionCopy == null) { | ||
| return super.getShardSize(shardRouting); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This faked only a getter call rather then entire underlying map that is used for a calculations in ClusterInfoSimulator. Updating entire ShardStats in adjustShardStats similar to adjustNodesStats. This should fix failures in MockDiskUsagesIT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This commit is ported to the master branch: https://github.com/elastic/elasticsearch/pull/90092/files to minimize the diff
|
@elasticmachine please run elasticsearch-ci/bwc |
|
Pinging @elastic/es-distributed (Team:Distributed) |
…sage # Conflicts: # server/src/main/java/org/elasticsearch/cluster/ClusterInfo.java # server/src/main/java/org/elasticsearch/cluster/DiskUsage.java
DaveCTurner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I left some comments about tests.
server/src/main/java/org/elasticsearch/cluster/ClusterInfo.java
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| @Override | ||
| public boolean equals(Object o) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use EqualsHashCodeTestUtils.checkEqualsAndHashCode to test this in ClusterInfoTests. Alternatively add a dedicated AbstractWireSerializingTestCase derivative for ClusterInfo which would cover equals and hashcode as well as replacing ClusterInfoTests#testSerialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the change to equals/hashCode and their tests be ported to main separately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that'd be good.
| final var changes = routingAllocation.changes(); | ||
| final var knownNodeIds = routingAllocation.nodes().stream().map(DiscoveryNode::getId).collect(Collectors.toSet()); | ||
| final var unassignedPrimaries = new HashSet<ShardId>(); | ||
| final var clusterInfoSimulator = new ClusterInfoSimulator(routingAllocation.clusterInfo()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we enhance DesiredBalanceComputerTests to verify we're accounting for disk usage in all the right places?
…sage # Conflicts: # server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
DaveCTurner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I don't think we're really testing that the DesiredBalanceComputer is using the ClusterInfoSimulator correctly. Specifically, DesiredBalanceComputerTests all pass even if I remove all mention of the simulator:
diff --git a/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java b/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
index 50bde0b40dd..f1a40911ba1 100644
--- a/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
+++ b/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
@@ -10,7 +10,6 @@ package org.elasticsearch.cluster.routing.allocation.allocator;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
-import org.elasticsearch.cluster.ClusterInfoSimulator;
import org.elasticsearch.cluster.node.DiscoveryNode;
import org.elasticsearch.cluster.routing.RoutingNodes;
import org.elasticsearch.cluster.routing.ShardRouting;
@@ -57,7 +56,6 @@ public class DesiredBalanceComputer {
final var routingNodes = routingAllocation.routingNodes();
final var ignoredShards = new HashSet<>(desiredBalanceInput.ignoredShards());
final var changes = routingAllocation.changes();
- final var clusterInfoSimulator = new ClusterInfoSimulator(routingAllocation.clusterInfo());
if (routingNodes.isEmpty()) {
return new DesiredBalance(desiredBalanceInput.index(), Map.of());
@@ -67,7 +65,6 @@ public class DesiredBalanceComputer {
for (final var routingNode : routingNodes) {
for (final var shardRouting : routingNode) {
if (shardRouting.initializing()) {
- clusterInfoSimulator.simulate(shardRouting);
routingNodes.startShard(logger, shardRouting, changes, 0L);
}
}
@@ -128,7 +125,6 @@ public class DesiredBalanceComputer {
assert shardRouting.started();
if (targetNodesIterator.hasNext()) {
ShardRouting shardToRelocate = routingNodes.relocateShard(shardRouting, targetNodesIterator.next(), 0L, changes).v2();
- clusterInfoSimulator.simulate(shardToRelocate);
routingNodes.startShard(logger, shardToRelocate, changes, 0L);
} else {
break;
@@ -154,7 +150,6 @@ public class DesiredBalanceComputer {
if (nodeIds != null && nodeIds.isEmpty() == false) {
final String nodeId = nodeIds.removeFirst();
ShardRouting shardToInitialized = unassignedPrimaryIterator.initialize(nodeId, null, 0L, changes);
- clusterInfoSimulator.simulate(shardToInitialized);
routingNodes.startShard(logger, shardToInitialized, changes, 0L);
}
}
@@ -168,7 +163,6 @@ public class DesiredBalanceComputer {
if (nodeIds != null && nodeIds.isEmpty() == false) {
final String nodeId = nodeIds.removeFirst();
ShardRouting shardToInitialize = unassignedReplicaIterator.initialize(nodeId, null, 0L, changes);
- clusterInfoSimulator.simulate(shardToInitialize);
routingNodes.startShard(logger, shardToInitialize, changes, 0L);
}
}
@@ -212,7 +206,6 @@ public class DesiredBalanceComputer {
// TODO test that we reset ignored shards properly
}
- routingAllocation.setSimulatedClusterInfo(clusterInfoSimulator.getClusterInfo());
logger.trace("running delegate allocator");
delegateAllocator.allocate(routingAllocation);
assert routingNodes.unassigned().size() == 0; // any unassigned shards should now be ignored
@@ -222,7 +215,6 @@ public class DesiredBalanceComputer {
for (final var shardRouting : routingNode) {
if (shardRouting.initializing()) {
hasChanges = true;
- clusterInfoSimulator.simulate(shardRouting);
routingNodes.startShard(logger, shardRouting, changes, 0L);
logger.trace("starting shard {}", shardRouting);
}
server/src/main/java/org/elasticsearch/cluster/ClusterInfo.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/ClusterInfo.java
Outdated
Show resolved
Hide resolved
Co-authored-by: David Turner <[email protected]>
Co-authored-by: David Turner <[email protected]>
|
@elasticsearchmachine please run elasticsearch-ci/bwc |
DaveCTurner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I would like a follow-up to extend the tests to verify the usage of the clusterInfoSimulator in the preparatory steps of the DesiredBalanceComputer. Specifically, the tests all still pass even with these lines removed:
diff --git a/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java b/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
index 50bde0b40dd..e3784be6551 100644
--- a/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
+++ b/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
@@ -67,7 +67,6 @@ public class DesiredBalanceComputer {
for (final var routingNode : routingNodes) {
for (final var shardRouting : routingNode) {
if (shardRouting.initializing()) {
- clusterInfoSimulator.simulate(shardRouting);
routingNodes.startShard(logger, shardRouting, changes, 0L);
}
}
@@ -128,7 +127,6 @@ public class DesiredBalanceComputer {
assert shardRouting.started();
if (targetNodesIterator.hasNext()) {
ShardRouting shardToRelocate = routingNodes.relocateShard(shardRouting, targetNodesIterator.next(), 0L, changes).v2();
- clusterInfoSimulator.simulate(shardToRelocate);
routingNodes.startShard(logger, shardToRelocate, changes, 0L);
} else {
break;
@@ -154,7 +152,6 @@ public class DesiredBalanceComputer {
if (nodeIds != null && nodeIds.isEmpty() == false) {
final String nodeId = nodeIds.removeFirst();
ShardRouting shardToInitialized = unassignedPrimaryIterator.initialize(nodeId, null, 0L, changes);
- clusterInfoSimulator.simulate(shardToInitialized);
routingNodes.startShard(logger, shardToInitialized, changes, 0L);
}
}
@@ -168,7 +165,6 @@ public class DesiredBalanceComputer {
if (nodeIds != null && nodeIds.isEmpty() == false) {
final String nodeId = nodeIds.removeFirst();
ShardRouting shardToInitialize = unassignedReplicaIterator.initialize(nodeId, null, 0L, changes);
- clusterInfoSimulator.simulate(shardToInitialize);
routingNodes.startShard(logger, shardToInitialize, changes, 0L);
}
}…sage # Conflicts: # server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java # server/src/test/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputerTests.java
|
@elasticsearchmachine please run elasticsearch-ci/part-1 |
No description provided.