HDDS-6704. EC: ReplicationManager - create version of ContainerReplicaCounts applicable to EC #3405

sodonnel · 2022-05-11T13:58:49Z

What changes were proposed in this pull request?

This change adds a class and tests intended to be used for checking if an EC container is over or under replicated, while also considering decommission and maintenance. Given the container details and replicas, there are a series of methods used to check for over and under replication. There are also methods to return the under or over replicated indexes. The intention of this class, is to be used by Replication manager and any other parts of the code that need to make decisions based on the replicated state of EC containers.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-6704

How was this patch tested?

New unit tests

…aCounts applicable to EC

umamaheswararao

@sodonnel thanks for the patch. I have dropped few minors/nits. Thanks

umamaheswararao · 2022-05-11T23:44:21Z

...s/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ECContainerReplicaCount.java

+   * in maintenance. Note it is possible for an index to be
+   * decommissioning, healthy and in maintenance, if there are multiple copies
+   * of it.
+   * @return


Nit: forgot to mention what we are returning?

Fixed this.

umamaheswararao · 2022-05-12T00:06:36Z

...s/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ECContainerReplicaCount.java

+   * Return true if there are insufficient replicas to recover this container.
+   * Ie, less than EC Datanum containers are present.
+   * @return True if the container cannot be recovered, false otherwise.
+   */


this method is canRecover?

Probably unrecoverable makes sense, rather than missing

umamaheswararao · 2022-05-12T01:26:23Z

...s/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ECContainerReplicaCount.java

+  private final Map<Integer, Integer> decommissionIndexes = new HashMap<>();
+  private final Map<Integer, Integer> maintenanceIndexes = new HashMap<>();
+
+  public ECContainerReplicaCount(ContainerInfo containerInfo,


IndexesPendingAdd means that we already schedule with some of previous iteration?

Yes, the command to create them has been scheduled and may not have completed yet. Or it may fail / timeout.

umamaheswararao · 2022-05-12T01:29:37Z

...m/src/test/java/org/apache/hadoop/hdds/scm/container/states/TestECContainerReplicaCount.java

+/**
+ * Tests for EcContainerReplicaCounts.
+ */
+public class TestECContainerReplicaCount {


Looks like pendingAdd list related tests not covered?

Yea I have missed this. They are only used in missingNonMaintenanceIndexes(), so I will add a test there.

JacksonYao287

thanks @sodonnel for the patch , i have few comment , please take a look.

JacksonYao287 · 2022-05-12T00:46:32Z

...s/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ECContainerReplicaCount.java

+   * @param index The replica index to check.
+   * @Throws IllegalArgumentException if the index is out of bounds.
+   */
+  private void ensureIndexWithinBounds(Integer index) {


NIT: we could add an info to indicate in which set the index is out of bound.
Eg.

ensureIndexWithinBounds(Interger index, String setName) { if (index < 1 || index > repConfig.getRequiredNodes()) { throw new IllegalArgumentException("Replica Index in {} must be between " + " 1 and " + repConfig.getRequiredNodes(), setName); }

Yes, good idea. I also added the containerID.

JacksonYao287 · 2022-05-12T01:46:55Z

...s/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ECContainerReplicaCount.java

+    // Now we have a list of missing. Remove any pending add as they should
+    // eventually recover.
+    for (Integer i : pendingAdd) {
+      missing.remove(i);
+    }


pendingAdd may fail. should we remove them here?

No, the intention in this case, is to get a list of indexes we must schedule new copies for, so we assume the adds will complete. From the java doc above:

and any pending add are assume to be created and omitted them from the returned list. This list can be used to determine which replicas must be recovered in a group, assuming the inflight replicas pending add complete successfully.

JacksonYao287 · 2022-05-12T02:13:35Z

...s/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ECContainerReplicaCount.java

+   * @return
+   */
+  private int getMaxMaintenance() {
+    return Math.max(0, repConfig.getParity() - remainingMaintenanceRedundancy);


if i do not missunderstand , remainingMaintenanceRedundancy is just the maximum number of replicas that are allowed to exist only on a maintenance node. should we just return remainingMaintenanceRedundancy ?

remainingMaintenanceRedundancy is how many replicas we can still lose when some are in maintenance. For EC-3-2, and remainingRedundancy of 1, we can have 1 in maintenance without replicating.

For 6-3, and remainingRedundancy of 1, we can have 2 in maintenance without replicating.

In both these cases, one more replica could fail without affecting data availability.

It is also possible for someone to set an impossible remainingRedundancy, eg 3 for 3-2.

If we have 3 for 3-2, then this method returns zero, indicating we must make a new copy of all maintenance replicas.

got it , thanks

JacksonYao287 · 2022-05-12T03:10:55Z

...s/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ECContainerReplicaCount.java

+   * will have the pending deletes already removed.
+   * @return List of indexes which are over-replicated.
+   */
+  public List<Integer> overReplicatedIndexes() {


can we return a Map<Integer , Integer> to indicate the index and the associated extra number?

We should keep this idea is mind when we write the code to use this method. I am not yet sure how the code to deal with each over-replicated index will behave. Eg whether it should query the replicas again and work out which to remove or just use the output from this class.

I am guessing that code will need to get the replicas for a container, filter them for "IN_SERVICE" + Replica_Index and then schedule a delete of all but 1 of them, so knowing the total count is not going to help it.

we could keep the code like this for now. when we implement EC container check and know how we will use it exactly, we can refine it if necessary

umamaheswararao

LGTM from my side as this is currently evolving stage and not yet connected to existing code flows yet. Let's refine if needed when we use this class.

JacksonYao287

thank @sodonnel for the patch and @umamaheswararao for the review! LGTM +1
I have commit this patch to master branch , and we can refine it if necessary when implementing EC container health check

* master: (96 commits) HDDS-6738. Migrate tests with rules in hdds-server-framework to JUnit5 (apache#3415) HDDS-6650. S3MultipartUpload support update bucket usedNamespace. (apache#3404) HDDS-6491. Support FSO keys in getExpiredOpenKeys (apache#3226) HDDS-6596. EC: Support ListBlock from CoordinatorDN (apache#3410) HDDS-6737. Migrate parameterized tests in hdds-server-framework to JUnit5 (apache#3414) HDDS-6660: EC: Add the DN side Reconstruction Handler class. (apache#3399) HDDS-6750. Migrate simple tests in hdds-server-scm to JUnit5 (apache#3417) HDDS-6749. SCM includes itself as peer in addSCM request (apache#3413) HDDS-6657. Improve Ozone integrated Ranger configuration instructions (apache#3365) HDDS-6742. Audit operation category mismatch (apache#3407) HDDS-6748. Intermittent timeout in TestECBlockReconstructedInputStream#testReadDataWithUnbuffer (apache#3416) HDDS-6731. Migrate simple tests in hdds-server-framework to JUnit5 (apache#3412) HDDS-5919. In kubernetes OM HA has circular dependency on service availability (apache#3185) HDDS-6730. Migrate tests in hdds-tools to JUnit5 (apache#3402) HDDS-6630. Explicitly remove node after being chosen (apache#3332) HDDS-6560. Add general Grafana dashboard (apache#3285) HDDS-6704. EC: ReplicationManager - create version of ContainerReplicaCounts applicable to EC (apache#3405) HDDS-6680. Pre-Finalize behaviour for Bucket Layout Feature. (apache#3377) HDDS-6619. Add freon command to run r/w mix workload using ObjectStore APIs (apache#3383) HDDS-6734. ozone admin pipeline list CLI is not backward compatible (apache#3406) ... Conflicts: hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMMetadataStore.java hadoop-hdds/interface-server/src/main/proto/SCMRatisProtocol.proto hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMDBDefinition.java hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMMetadataStoreImpl.java hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/StorageContainerManager.java

HDDS-6704. EC: ReplicationManager - create version of ContainerReplic…

94285fa

…aCounts applicable to EC

umamaheswararao reviewed May 12, 2022

View reviewed changes

JacksonYao287 reviewed May 12, 2022

View reviewed changes

Address review comments

4aa2201

umamaheswararao approved these changes May 12, 2022

View reviewed changes

JacksonYao287 approved these changes May 13, 2022

View reviewed changes

JacksonYao287 merged commit 86ffd94 into apache:master May 13, 2022

HDDS-6704. EC: ReplicationManager - create version of ContainerReplicaCounts applicable to EC #3405

HDDS-6704. EC: ReplicationManager - create version of ContainerReplicaCounts applicable to EC #3405

Uh oh!

Conversation

sodonnel commented May 11, 2022

What changes were proposed in this pull request?

What is the link to the Apache JIRA

How was this patch tested?

Uh oh!

umamaheswararao left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

JacksonYao287 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sodonnel May 12, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

umamaheswararao left a comment

Choose a reason for hiding this comment

Uh oh!

JacksonYao287 left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

sodonnel May 12, 2022 •

edited

Loading