HDDS-6957. EC: ReplicationManager - priortise under replicated containers #3574
Conversation
umamaheswararao left a comment
Thanks @sodonnel for working on this patch. I have dropped a few comments. PTAL!
// For under replicated containers, the best remaining redundancy we can
// have is 3 for EC-10-4, 2 for EC-6-3, 1 for EC-3-2 and 2 for Ratis.
// A container which is under-replicated due to decommission will have one
// more, ie 4, 3, 2, 3 respectively. Ideally we want to sort decommission
"one more" means the weight, right?
Under-replicated due to decommission is not missing any replicas - so its remaining redundancy is still the same as if it was not under-replicated at all.
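To make this concrete, here is a minimal sketch (the class and method below are hypothetical, not the actual ECContainerReplicaCount code) of how remaining redundancy differs between a genuinely lost replica index and one that is only sitting on a decommissioning node:

// Illustration only: "remaining redundancy" is how many more replicas can
// be lost before the container data becomes unreadable.
public final class RemainingRedundancySketch {

  // data/parity describe the EC group (e.g. 10 and 4 for EC-10-4);
  // healthyIndexes counts distinct indexes with an available replica;
  // decommissionOnly is true when the only "missing" replicas are still
  // readable on decommissioning nodes.
  static int remainingRedundancy(int data, int parity, int healthyIndexes,
      boolean decommissionOnly) {
    if (decommissionOnly) {
      // Nothing is actually lost yet, so redundancy stays at the maximum.
      return parity;
    }
    // Each genuinely missing index reduces redundancy by one.
    return healthyIndexes - data;
  }

  public static void main(String[] args) {
    // EC-10-4 with one index truly missing: 13 healthy -> redundancy 3.
    System.out.println(remainingRedundancy(10, 4, 13, false));
    // EC-10-4 under-replicated only due to decommission: still 4.
    System.out.println(remainingRedundancy(10, 4, 14, true));
  }
}

So for EC-10-4 the two cases come out as 3 versus 4, which is the "one more" the code comment above is describing.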
// decommission only under-replicated containers to a floor of 5 so they
// sort after an under-replicated container with 3 remaining replicas (
// EC-10-4) and plus one retry.
private static final int DECOMMISSION_REDUNDANCY = 5;
Basically, if the idea is to give decommission items lower priority than under-replication, won't that cause decommission to take a very long time if there are a lot of under-replicated items in the cluster?
Yea, but that is how it should be. Decommission is less important than repairing containers which are at risk of data loss.
I am not sure we should block decom tasks. Decommission tasks (replicate commands) are lighter weight compared to reconstruction tasks. If the cluster has too many reconstruction tasks (maybe due to a rack being down or similar), decommission may take a very long time. I just looked at HDFS, and it looks like there is no separate queue for decom there either. Probably let's move ahead with the current plan and revisit based on how decom goes in practice. I am wondering whether there may be complaints about decom taking a long time in practice.
...m/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestReplicationManager.java (outdated comment, resolved)
replicationManager.processContainer(underRep1, underRep, overRep,
    repReport);
replicationManager.processContainer(underRep0, underRep, overRep,
    repReport);
Can we use process all?
...m/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestReplicationManager.java (resolved)
...r-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ReplicationManager.java (outdated comment, resolved)
umamaheswararao left a comment
LGTM
* master: (46 commits)
  HDDS-6901. Configure HDDS volume reserved as percentage of the volume space. (apache#3532)
  HDDS-6978. EC: Cleanup RECOVERING container on DN restarts (apache#3585)
  HDDS-6982. EC: Attempt to cleanup the RECOVERING container when reconstruction failed at coordinator. (apache#3583)
  HDDS-6968. Addendum: [Multi-Tenant] Fix USER_MISMATCH error even on correct user. (apache#3578)
  HDDS-6794. EC: Analyze and add putBlock even on non writing node in the case of partial single stripe. (apache#3514)
  HDDS-6900. Propagate TimeoutException for all SCM HA Ratis calls. (apache#3564)
  HDDS-6938. handle NPE when removing prefixAcl (apache#3568)
  HDDS-6960. EC: Implement the Over-replication Handler (apache#3572)
  HDDS-6979. Remove unused plexus dependency declaration (apache#3579)
  HDDS-6957. EC: ReplicationManager - priortise under replicated containers (apache#3574)
  HDDS-6723. Close Rocks objects properly in OzoneManager (apache#3400)
  HDDS-6942. Ozone Buckets/Objects created via S3 should not allow group access (apache#3553)
  HDDS-6965. Increase timeout for basic check (apache#3563)
  HDDS-6969. Add link to compose directory in smoketest README (apache#3567)
  HDDS-6970. EC: Ensure DatanodeAdminMonitor can handle EC containers during decommission (apache#3573)
  HDDS-6977. EC: Remove references to ContainerReplicaPendingOps in TestECContainerReplicaCount (apache#3575)
  HDDS-6217. Cleanup XceiverClientGrpc TODOs, and document how the client works and should be used. (apache#3012)
  HDDS-6773. Cleanup TestRDBTableStore (apache#3434) - fix checkstyle
  HDDS-6773. Cleanup TestRDBTableStore (apache#3434)
  HDDS-6676. KeyValueContainerData#getProtoBufMessage() should set block count (apache#3371)
  ...

Conflicts:
  hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/upgrade/SCMUpgradeFinalizer.java
HDDS-6957. EC: ReplicationManager - priortise under replicated containers (apache#3574)
(cherry picked from commit 03cd7c4)
Change-Id: I524bb79b44ead9432fb752ccb82ae0b6e168e1a5
What changes were proposed in this pull request?
After the under / over replicated containers are collected in HDDS-6699, they need to be prioritised and placed on a queue for the next stage of RM to pick up and process.
This change adds a priority queue for the under replicated containers, where they are prioritised by their remaining redundancy and made available for processing by the next stage of the replication process.
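For illustration, here is a simplified sketch of the ordering this introduces (the class below is a stand-in, not the real ReplicationManager queue or its health-result types): containers are ordered by remaining redundancy, and results that are under-replicated only due to decommission are floored to 5 so they sort last.

import java.util.Comparator;
import java.util.PriorityQueue;

// A sketch only; the real logic lives in ReplicationManager and its handlers.
public class UnderReplicatedQueueSketch {

  // Simplified stand-in for an under-replicated container health result.
  static class UnderReplicated {
    final String containerId;
    final int remainingRedundancy;
    final boolean dueToDecommissionOnly;

    UnderReplicated(String containerId, int remainingRedundancy,
        boolean dueToDecommissionOnly) {
      this.containerId = containerId;
      this.remainingRedundancy = remainingRedundancy;
      this.dueToDecommissionOnly = dueToDecommissionOnly;
    }

    // Decommission-only results are floored to 5 so they sort after any
    // container that is genuinely missing data (the worst EC case is 3
    // remaining replicas for EC-10-4, plus one for a retry).
    int weight() {
      return dueToDecommissionOnly ? 5 : remainingRedundancy;
    }
  }

  public static void main(String[] args) {
    PriorityQueue<UnderReplicated> queue = new PriorityQueue<>(
        Comparator.comparingInt(UnderReplicated::weight));

    queue.add(new UnderReplicated("c1", 3, false)); // EC-10-4, one index lost
    queue.add(new UnderReplicated("c2", 4, true));  // decommission only
    queue.add(new UnderReplicated("c3", 0, false)); // one replica from data loss

    // Prints c3, then c1, then c2: the most at-risk containers come off first.
    while (!queue.isEmpty()) {
      System.out.println(queue.poll().containerId);
    }
  }
}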
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-6957
How was this patch tested?
New unit tests