HDDS-4927. Determine over and under utilized datanodes in Container Balancer. #2230
Conversation
linyiqun left a comment
@siddhantsangwan, some minor comments from me. Please have a look.
Before initializing the iteration, can we also clear the related node lists, like overUtilizedNodes/underUtilizedNodes? It would be more understandable to make the list-clearing logic part of the balance method rather than clearing the lists outside of it.
Yes, thanks for pointing this out.
I don't fully get this. Why are underUtilizedNodes added to the source list rather than the target nodes, as the original logic did?
Consider the case where there are no overUtilizedNodes in the cluster, only underUtilizedNodes and nodes whose utilization is within the limits. Then the underUtilizedNodes need to be balanced and become the source nodes to which data will be moved.
So here the term 'source nodes' is used for nodes that need to be balanced. Target nodes will be chosen from the list of over-utilized, above-average, under-utilized, or below-average nodes as necessary. Do you have another approach in mind?
So here the term 'source nodes' has been used for nodes that need to be balanced.
Okay, so the meaning of source node is a little different from before.
Yes. However, the meaning of source nodes might change according to the algorithm for moving containers. The term will be settled once the exact algorithm is finalized.
@siddhantsangwan The PR shows commits from HDDS-4925 as well. Can you please take a look?
GlenGeng-awx left a comment
Thanks @siddhantsangwan for the work. We are looking forward to this feature.
NIT: merge the two info logs. In a multi-threaded context, there might be intervening logs between lines 114 and 115.
Okay.
NIT: rename nodes to nodeUsageInfos; nodes is misleading here.
Yes, changing that.
Why catch NPE here?
ArithmeticException means nodes is empty, which leads to a divide by zero. How about skipping this iteration if nodes is empty? Say:
nodes = nodeManager.getMostOrLeastUsedDatanodes(true);
if (nodes.isEmpty()) {
  return true;
}
The balancer should not work if SCM hasn't heard from any datanodes.
Makes sense. But shouldn't we return false then?
I think we will also need to handle safe mode. Balancer should not operate when SCM is in safe mode.
NIT: merge lines 166 and 167 together.
What if clusterAvgUtilisation is less than the threshold, e.g., for an empty cluster? Does a negative lowerLimit make sense here?
A negative lower limit and an upper limit greater than 1 will not lead to errors (as checked by the unit test). But in that case we could return false early, since the cluster is balanced.
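For illustration, a minimal sketch of that early return, assuming the limits are simply the cluster average plus/minus the threshold (the class and method names here are hypothetical, not taken from the patch):

class BalancingPrecheck {
  // Sketch only. If the window [lowerLimit, upperLimit] around the cluster
  // average already covers every possible utilization value, no datanode can
  // be classified as over- or under-utilized, so the iteration can stop early.
  static boolean shouldContinueBalancing(double clusterAvgUtilization,
      double threshold) {
    double lowerLimit = clusterAvgUtilization - threshold;
    double upperLimit = clusterAvgUtilization + threshold;
    // A negative lowerLimit rules out under-utilized nodes; an upperLimit
    // above 1 rules out over-utilized nodes. Both together mean "balanced".
    return !(lowerLimit < 0.0 && upperLimit > 1.0);
  }
}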
Question: say we have a 10-DN cluster where all of the DNs are at 95% usage, and then one empty DN is added to rebalance the cluster. Given a threshold of 10%, it seems the balancer will not work in this case, since those 10 DNs will not reach the upperLimit. Have we considered corner cases like this?
In that case, the newly added DN will be under-utilized. Balancer will recognize this and then try to move data into it from the other 10 DNs, which are now above-average utilized. However, the exact algorithm for choosing a particular source DN and the containers to move is still in progress.
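Working through the numbers of this example, assuming the same average-plus/minus-threshold limits as in the sketch above (a standalone illustrative snippet, not code from the patch):

public class CornerCaseMath {
  public static void main(String[] args) {
    // 10 DNs at 95% utilization plus one empty DN, threshold = 10%.
    double avgUtilization = (10 * 0.95 + 0.0) / 11;   // ≈ 0.8636
    double upperLimit = avgUtilization + 0.10;        // ≈ 0.9636
    double lowerLimit = avgUtilization - 0.10;        // ≈ 0.7636
    System.out.printf("avg=%.4f lower=%.4f upper=%.4f%n",
        avgUtilization, lowerLimit, upperLimit);
    // The ten full DNs (0.95) stay below upperLimit, so none is over-utilized,
    // but the new DN (0.0) falls below lowerLimit and is picked as under-utilized.
  }
}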
NIT: better to use a lock-free variable to avoid contention, and print an error instead of throwing RuntimeException.
private final AtomicBoolean balancerRunning = new AtomicBoolean(false);

public void start(ContainerBalancerConfiguration balancerConfiguration) {
  if (!balancerRunning.compareAndSet(false, true)) {
    LOG.error("Container Balancer is already running.");
    return;
  }
  // ...
}
NIT: containsNode is called multiple times; it may be simpler and quicker to change listToSearch to a hash set and do an existence check. For example, declare overUtilizedNodes and underUtilizedNodes as hash sets.
That's a great suggestion. The current implementation assumes that the order of nodes in terms of utilization is important, so the balancer can focus on the most over- or under-utilized nodes first. That's why a sorted list is being used.
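If both the utilization ordering and cheap membership checks turn out to matter, one possible compromise is sketched below. NodeUsage is a stand-in for the real DatanodeUsageInfo type, and none of the names come from the patch:

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

// Sketch only: NodeUsage stands in for DatanodeUsageInfo.
class NodeUsage {
  final UUID id;
  final double utilization;

  NodeUsage(UUID id, double utilization) {
    this.id = id;
    this.utilization = utilization;
  }
}

class OverUtilizedTracker {
  // Keeps a list for "most over-utilized first" iteration order...
  private final List<NodeUsage> nodes = new ArrayList<>();
  // ...plus a set of ids for O(1) membership checks instead of list scans.
  private final Set<UUID> ids = new HashSet<>();

  void add(NodeUsage node) {
    nodes.add(node);
    ids.add(node.id);
  }

  // Called once after all nodes are added; most over-utilized first.
  List<NodeUsage> inUtilizationOrder() {
    nodes.sort((a, b) -> Double.compare(b.utilization, a.utilization));
    return nodes;
  }

  // Replaces repeated containsNode() scans over the list.
  boolean contains(UUID id) {
    return ids.contains(id);
  }
}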
The comparator can return 0 for two different datanodes with the same utilisation. We will need to handle that case.
Good point.
You can use CollectionUtils.intersection() and CollectionUtils.union() to simplify the code.
…itialising iteration.
lokeshj1703 left a comment
@siddhantsangwan Thanks for working on the PR! I have added a few comments inline.
Changed the term
…verage. Other changes include preventing Container Balancer from operating while SCM is in safe mode. Code clean up.
this.clusterRemaining = 0L;
this.overUtilizedNodes = new ArrayList<>();
this.underUtilizedNodes = new ArrayList<>();
Maybe it is better to put these initialization operations in the constructor, so the start function only does the start work.
lokeshj1703 left a comment
@siddhantsangwan Thanks for updating the PR! I have added a few more minor comments based on recent changes.
# Conflicts:
#   hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/MockNodeManager.java
Thanks for the reviews. I have addressed the comments.
lokeshj1703 left a comment
@siddhantsangwan Thanks for updating the PR! The changes look good to me. +1.
Can you create another jira for #2230 (comment), or handle it in the next jira?
I have handled that case in the
If, hypothetically, you have 5 nodes with the same utilisation, then containsNode might not return true with the current change, because binary search can return the index of any node with the same utilisation. This logic would return false since the returned datanode might have a different id. The comparator would need to handle the case of same utilisation.
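One way to handle such ties, sketched with the NodeUsage stand-in from the earlier sketch (not the actual comparator in the patch), is to break them on a stable unique key such as the datanode UUID, so that binary search and equality agree:

import java.util.Comparator;

class NodeUsageComparators {
  // Order primarily by utilization, then break ties by UUID so that two
  // different datanodes with the same utilization never compare as equal,
  // and binary search can find the exact node it is looking for.
  static final Comparator<NodeUsage> BY_UTILIZATION_THEN_ID =
      Comparator.comparingDouble((NodeUsage n) -> n.utilization)
          .thenComparing((NodeUsage n) -> n.id);
}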
Yes, this case can be handled in another Jira. Thanks for pointing this out.
@JacksonYao287 and @linyiqun, can you please review the changes? Any comments are welcome.
GlenGeng-awx left a comment
Thanks @siddhantsangwan for the work. Just some inline comments.
@Metric(about = "The amount of Giga Bytes that have been moved to achieve " +
    "balance.")
private LongMetric gigaBytesMoved;
NIT: gigaBytesMoved to dataSizeBalancedGB
@Metric(about = "Number of containers that Container Balancer has moved" +
    " until now.")
private LongMetric numContainersMoved;
NIT: numContainersMoved to movedContainerNum
@Metric(about = "The total number of datanodes that need to be balanced.")
private LongMetric totalNumDatanodesToBalance;
NIT: totalNumDatanodesToBalance to datanodeNumToBalance
@Metric(about = "Number of datanodes that Container Balancer has balanced " +
    "until now.")
private LongMetric numDatanodesBalanced;
NIT: numDatanodesBalanced to datanodeNumBalanced
@Metric(about = "Utilisation value of the current maximum utilised datanode.")
private double maxUtilizedDatanodeRatio;
NIT: maxUtilizedDatanodeRatio to maxDatanodeUtilizedRatio
@Metric(about = "The total amount of used space in GigaBytes that needs to " +
    "be balanced.")
private LongMetric totalSizeToBalanceGB;
NIT: totalSizeToBalanceGB to dataSizeToBalanceGB
public void stop() {
  LOG.info("Stopping Container Balancer...");
  balancerRunning.set(false);
NIT: remove line 319; one info line is sufficient here, since no actual work needs to be done.
 * @param datanodeUsageInfo DatanodeUsageInfo to calculate utilization for
 * @return Utilization value
 */
private double calculateUtilization(DatanodeUsageInfo datanodeUsageInfo) {
NIT: make it a static helper function.
try {
  return clusterUsed / (double) clusterCapacity;
} catch (ArithmeticException e) {
NIT: better not to handle ArithmeticException; instead, check nodes.size() != 0 at the entrance of the function.
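A rough sketch of that alternative, validating inputs up front instead of catching ArithmeticException. The parameters stand in for values the balancer would already have aggregated from the datanode usage reports; nothing here is taken from the actual patch:

import java.util.OptionalDouble;

class ClusterUtilization {
  // Sketch of the suggestion above: skip the iteration when SCM has not heard
  // from any datanodes (or they report zero capacity), rather than dividing
  // and catching the resulting exception.
  static OptionalDouble avgUtilization(int nodeCount, long clusterUsed,
      long clusterCapacity) {
    if (nodeCount == 0 || clusterCapacity == 0) {
      // Nothing to balance in this iteration.
      return OptionalDouble.empty();
    }
    return OptionalDouble.of(clusterUsed / (double) clusterCapacity);
  }
}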
Updated PR. Please take a look.
LGTM. We can merge this PR for now, and start our future development based on it.
 */
public boolean start(ContainerBalancerConfiguration balancerConfiguration) {
Maybe it is better to move these configuration initialization operations to the constructor, so start just does the start work without any parameter.
This was initially decided in order to support configuration changes via restart. An admin could restart the balancer, and the balancer was supposed to load the new configuration values.
@JacksonYao287 if there are other suggestions for this function, we can take them up in #2278. If it is OK, I will commit this PR.
If we put Container Balancer inside SCM, the configuration is just the one loaded by SCM. So the configuration cannot be reloaded unless SCM is restarted.
According to our design decision, params will be passed from the command line, which will be implemented by #2278; the conf-related code will therefore be removed in the future.
I see. We will not need the configuration param in that case. I see that change is made in #2278.
@siddhantsangwan Thanks for the contribution! @linyiqun @GlenGeng @JacksonYao287 Thanks for the reviews! I have committed the PR to master branch.
What changes were proposed in this pull request?
ContainerBalancer will identify over-utilized, under-utilized, and within-threshold utilized datanodes at the start of each iteration. Based on this, it will determine whether balancing should continue. Unit tests cover this functionality.
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-4927
How was this patch tested?
Added unit test TestContainerBalancer.
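For illustration, the classification step described above might look roughly like the following sketch. It reuses the NodeUsage stand-in and the average-plus/minus-threshold limits from the earlier sketches; none of the names come from the actual patch:

import java.util.ArrayList;
import java.util.List;

class NodeClassifier {
  final List<NodeUsage> overUtilizedNodes = new ArrayList<>();
  final List<NodeUsage> underUtilizedNodes = new ArrayList<>();
  final List<NodeUsage> withinThresholdNodes = new ArrayList<>();

  // Classify each datanode against the window [lowerLimit, upperLimit]
  // (cluster average minus/plus threshold). Balancing should continue only
  // if at least one node falls outside the window.
  boolean classify(List<NodeUsage> nodes, double lowerLimit, double upperLimit) {
    for (NodeUsage node : nodes) {
      if (node.utilization > upperLimit) {
        overUtilizedNodes.add(node);
      } else if (node.utilization < lowerLimit) {
        underUtilizedNodes.add(node);
      } else {
        withinThresholdNodes.add(node);
      }
    }
    return !overUtilizedNodes.isEmpty() || !underUtilizedNodes.isEmpty();
  }
}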