Conversation

@sodonnel
Contributor

What changes were proposed in this pull request?

In EC, a container is considered missing (unrecoverable) if it has lost enough replicas that offline reconstruction is no longer possible. If any of the remaining replicas of such a container are on a datanode that is being decommissioned, decommissioning will not complete: all containers on that node must be restored to proper replication before it can finish, but the existing code will never copy a replica of an unrecoverable container to a different node.

There are three parts to fixing this problem:

1. In ECReplicationCheckHandler, we deliberately skip adding "unrecoverable" containers to the under-replicated queue, as we previously believed there was no point in adding them: they cannot be recovered anyway. However, this decommission issue is specific to EC, so we should allow an unrecoverable container onto the under-replicated queue if it has decommissioning or maintenance indexes (a sketch of this check follows the list).

2. In ECUnderReplicationHandler, we need to check that the decommissioning indexes are copied correctly, even if the container is otherwise unrecoverable.

3. In DatanodeAdminMonitorImpl, inside the method checkContainersReplicatedOnNode, we use a call to ECContainerReplicaCount.isSufficientlyReplicated() to decide whether the container is replicated OK. Even if we address 1 and 2 above, this is still a problem, as the container is unrecoverable. For EC containers, the decommission monitor needs a different check: that the replica on the host being drained is also available on another IN_SERVICE host. From a decommission point of view, we don't care whether the entire EC container is sufficiently replicated; we only care that the replica on the current host has a copy elsewhere.
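
For illustration, the ECReplicationCheckHandler change in part 1 might look roughly like the sketch below. This is a minimal sketch, not the merged patch: decommissioningOnlyIndexes appears in a diff fragment later in this conversation, but maintenanceOnlyIndexes and the method shape here are assumptions.

// Sketch only: decides whether an EC container should be queued as
// under-replicated. Simplified relative to the real handler.
boolean shouldQueueUnderReplicated(ECContainerReplicaCount replicaCount) {
  if (!replicaCount.isUnrecoverable()) {
    // Recoverable containers follow the normal sufficiency check.
    return !replicaCount.isSufficientlyReplicated(true);
  }
  // Previously, unrecoverable containers were never queued, as offline
  // reconstruction cannot rebuild them. Queue them anyway if indexes remain
  // on decommissioning or maintenance nodes, so those replicas can be
  // copied and the nodes can finish draining.
  return replicaCount.decommissioningOnlyIndexes(true).size() > 0
      || replicaCount.maintenanceOnlyIndexes(true).size() > 0;
}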

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7666

How was this patch tested?

New unit tests added.

Comment on lines 436 to 460
public boolean isSufficientlyReplicatedForOffline(DatanodeDetails datanode) {
  boolean sufficientlyReplicated = isSufficientlyReplicated(false);
  if (sufficientlyReplicated) {
    return true;
  }
  // If it is not sufficiently replicated (ie the container does not have
  // all its replicas), then we need to check if the replica that is on this
  // node is available on another ONLINE node, ie in the healthy set. This
  // means we avoid blocking decommission or maintenance caused by
  // un-recoverable EC containers.
  ContainerReplica thisReplica = null;
  for (ContainerReplica r : replicas) {
    if (r.getDatanodeDetails().equals(datanode)) {
      thisReplica = r;
      break;
    }
  }
  if (thisReplica == null) {
    // From the set of replicas, none are on the passed datanode.
    // This should not happen in practice, but if it does we cannot indicate
    // the container is sufficiently replicated.
    return false;
  }
  return healthyIndexes.containsKey(thisReplica.getReplicaIndex());
}
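
For context, part 3 of the description would have the decommission monitor call this new method instead of the plain sufficiency check. A minimal sketch of that call site follows; the loop shape, the getReplicaCount helper, and the containersOnNode / dn / underReplicatedContainers names are assumptions, and the real checkContainersReplicatedOnNode carries more state:

// Sketch only: containers failing this check keep the datanode in the
// DECOMMISSIONING workflow until their replicas are copied elsewhere.
for (ContainerID id : containersOnNode) {
  ECContainerReplicaCount replicaCount = getReplicaCount(id); // assumed helper
  if (!replicaCount.isSufficientlyReplicatedForOffline(dn)) {
    // The replica on this draining node has no copy on an IN_SERVICE node
    // yet, so decommission cannot complete for this datanode.
    underReplicatedContainers.add(id);
  }
}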
Contributor

This returns true even if called for the datanode which has the online replica. Should we guard against that?

Contributor Author

The intention is for this method to be passed a decommissioning / maintenance node. If that is the case, its replicas will not be in the healthy set, but the code does not enforce that.

Perhaps we could also check that there is a replica index for the passed-in node in the decommission / maintenance indexes, and return false if not. I will add that in.
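
For illustration, the extra guard described above might be added to the end of isSufficientlyReplicatedForOffline, something like the sketch below. The decommissionIndexes and maintenanceIndexes maps are assumed to exist alongside healthyIndexes; the merged code may differ:

// Sketch of the proposed guard: only report sufficiently replicated if the
// passed datanode actually holds a decommissioning or maintenance index.
// Calling this for a node holding an online (healthy) replica returns false.
int index = thisReplica.getReplicaIndex();
if (!decommissionIndexes.containsKey(index)
    && !maintenanceIndexes.containsKey(index)) {
  return false;
}
return healthyIndexes.containsKey(index);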

container, remainingRedundancy, dueToDecommission,
replicaCount.isSufficientlyReplicated(true),
replicaCount.isUnrecoverable());
if (replicaCount.decommissioningOnlyIndexes(true).size() > 0
Contributor

@siddhantsangwan Jan 4, 2023

These methods will also count indexes that are on DECOMMISSIONED or IN_MAINTENANCE datanodes. I guess we only want to add replicas on datanodes that haven't yet entered these states to the under-replicated queue.

Contributor Author

For it to have reached DECOMMISSIONED or IN_MAINTENANCE, it should already have made a new copy elsewhere. The flag set here only comes into play later, and only for an unrecoverable container. If that many replicas are missing, we should have another copy of any decommission or maintenance replicas anyway, and if we do not, it would be OK for the system to create them. So I think it's fine for this to consider both decommissioning / entering_maintenance and decommissioned / in_maintenance.

@sodonnel merged commit 10811c5 into apache:master Jan 5, 2023
errose28 added a commit to errose28/ozone that referenced this pull request Jan 9, 2023
* master: (176 commits)
  HDDS-7726. EC: Enhance datanode reconstruction log message (apache#4155)
  HDDS-7739. EC: Increase the information in the RM sending command log message (apache#4153)
  HDDS-7652. Volume Quota not enforced during write when bucket quota is not set (apache#4124)
  HDDS-7628. Intermittent failure in TestOzoneContainerWithTLS (apache#4142)
  HDDS-7695. EC metrics related to replication commands don't add up (apache#4152)
  HDDS-7729. EC: ECContainerReplicaCount should handle pending delete of unhealthy replicas (apache#4146)
  HDDS-7738. SCM terminates when adding container to a closed pipeline (apache#4154)
  HDDS-7243. Remove RequestFeatureValidator from echoRPC method which supports only ValidationCondition.OLDER_CLIENT_REQUESTS (apache#4051)
  HDDS-7708. No check for certificate duration config scenarios. (apache#4149)
  HDDS-7727. EC: SCM unregistered event handler for DatanodeCommandCountUpdated (apache#4147)
  HDDS-7606. Add SCM HA support in intellij run (apache#4058)
  HDDS-7666. EC: Unrecoverable EC containers with some remaining replicas may block decommissioning (apache#4118)
  HDDS-7339. Implement Certificate renewal task for services (apache#3982)
  HDDS-7696. MisReplicationHandler does not consider QUASI_CLOSED replicas as sources (apache#4144)
  HDDS-7714. Docker cluster ozone-om-ha fails during docker-compose up (apache#4137)
  HDDS-7716. Log read requests rejected with permission denied in OM audit (apache#4136)
  HDDS-7588. Intermittent failure in TestObjectStoreWithLegacyFS#testFlatKeyStructureWithOBS (apache#4040)
  HDDS-7633. Compile error with Java 11: package com.sun.jmx.mbeanserver is not visible (apache#4077)
  HDDS-7648. Add a servername tag in UGI metrics. (apache#4094)
  HDDS-7564. Update Ozone version after 1.3.0 release (apache#4115)
  ...
jojochuang pushed a commit to jojochuang/ozone that referenced this pull request Feb 6, 2023
HDDS-7666. EC: Unrecoverable EC containers with some remaining replicas may block decommissioning (apache#4118)

(cherry picked from commit 10811c5)
Change-Id: I4f5cbdcda9f7492f8a4f3b332b8f2f600a2b34e5