-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-6704. EC: ReplicationManager - create version of ContainerReplicaCounts applicable to EC #3405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…aCounts applicable to EC
umamaheswararao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sodonnel thanks for the patch. I have dropped few minors/nits. Thanks
| * in maintenance. Note it is possible for an index to be | ||
| * decommissioning, healthy and in maintenance, if there are multiple copies | ||
| * of it. | ||
| * @return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: forgot to mention what we are returning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed this.
| * Return true if there are insufficient replicas to recover this container. | ||
| * Ie, less than EC Datanum containers are present. | ||
| * @return True if the container cannot be recovered, false otherwise. | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method is canRecover?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably unrecoverable makes sense, rather than missing
| private final Map<Integer, Integer> decommissionIndexes = new HashMap<>(); | ||
| private final Map<Integer, Integer> maintenanceIndexes = new HashMap<>(); | ||
|
|
||
| public ECContainerReplicaCount(ContainerInfo containerInfo, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IndexesPendingAdd means that we already schedule with some of previous iteration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the command to create them has been scheduled and may not have completed yet. Or it may fail / timeout.
| /** | ||
| * Tests for EcContainerReplicaCounts. | ||
| */ | ||
| public class TestECContainerReplicaCount { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like pendingAdd list related tests not covered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea I have missed this. They are only used in missingNonMaintenanceIndexes(), so I will add a test there.
JacksonYao287
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @sodonnel for the patch , i have few comment , please take a look.
| * @param index The replica index to check. | ||
| * @Throws IllegalArgumentException if the index is out of bounds. | ||
| */ | ||
| private void ensureIndexWithinBounds(Integer index) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: we could add an info to indicate in which set the index is out of bound.
Eg.
ensureIndexWithinBounds(Interger index, String setName) {
if (index < 1 || index > repConfig.getRequiredNodes()) {
throw new IllegalArgumentException("Replica Index in {} must be between "
+ " 1 and " + repConfig.getRequiredNodes(), setName);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good idea. I also added the containerID.
| // Now we have a list of missing. Remove any pending add as they should | ||
| // eventually recover. | ||
| for (Integer i : pendingAdd) { | ||
| missing.remove(i); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pendingAdd may fail. should we remove them here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the intention in this case, is to get a list of indexes we must schedule new copies for, so we assume the adds will complete. From the java doc above:
and any pending add are assume to be created and omitted them from the returned list. This list can be used to determine which replicas must be recovered in a group, assuming the inflight replicas pending add complete successfully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it
| * @return | ||
| */ | ||
| private int getMaxMaintenance() { | ||
| return Math.max(0, repConfig.getParity() - remainingMaintenanceRedundancy); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if i do not missunderstand , remainingMaintenanceRedundancy is just the maximum number of replicas that are allowed to exist only on a maintenance node. should we just return remainingMaintenanceRedundancy ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remainingMaintenanceRedundancy is how many replicas we can still lose when some are in maintenance. For EC-3-2, and remainingRedundancy of 1, we can have 1 in maintenance without replicating.
For 6-3, and remainingRedundancy of 1, we can have 2 in maintenance without replicating.
In both these cases, one more replica could fail without affecting data availability.
It is also possible for someone to set an impossible remainingRedundancy, eg 3 for 3-2.
If we have 3 for 3-2, then this method returns zero, indicating we must make a new copy of all maintenance replicas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it , thanks
| * will have the pending deletes already removed. | ||
| * @return List of indexes which are over-replicated. | ||
| */ | ||
| public List<Integer> overReplicatedIndexes() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we return a Map<Integer , Integer> to indicate the index and the associated extra number?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should keep this idea is mind when we write the code to use this method. I am not yet sure how the code to deal with each over-replicated index will behave. Eg whether it should query the replicas again and work out which to remove or just use the output from this class.
I am guessing that code will need to get the replicas for a container, filter them for "IN_SERVICE" + Replica_Index and then schedule a delete of all but 1 of them, so knowing the total count is not going to help it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could keep the code like this for now. when we implement EC container check and know how we will use it exactly, we can refine it if necessary
umamaheswararao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM from my side as this is currently evolving stage and not yet connected to existing code flows yet. Let's refine if needed when we use this class.
JacksonYao287
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank @sodonnel for the patch and @umamaheswararao for the review! LGTM +1
I have commit this patch to master branch , and we can refine it if necessary when implementing EC container health check
* master: (96 commits) HDDS-6738. Migrate tests with rules in hdds-server-framework to JUnit5 (apache#3415) HDDS-6650. S3MultipartUpload support update bucket usedNamespace. (apache#3404) HDDS-6491. Support FSO keys in getExpiredOpenKeys (apache#3226) HDDS-6596. EC: Support ListBlock from CoordinatorDN (apache#3410) HDDS-6737. Migrate parameterized tests in hdds-server-framework to JUnit5 (apache#3414) HDDS-6660: EC: Add the DN side Reconstruction Handler class. (apache#3399) HDDS-6750. Migrate simple tests in hdds-server-scm to JUnit5 (apache#3417) HDDS-6749. SCM includes itself as peer in addSCM request (apache#3413) HDDS-6657. Improve Ozone integrated Ranger configuration instructions (apache#3365) HDDS-6742. Audit operation category mismatch (apache#3407) HDDS-6748. Intermittent timeout in TestECBlockReconstructedInputStream#testReadDataWithUnbuffer (apache#3416) HDDS-6731. Migrate simple tests in hdds-server-framework to JUnit5 (apache#3412) HDDS-5919. In kubernetes OM HA has circular dependency on service availability (apache#3185) HDDS-6730. Migrate tests in hdds-tools to JUnit5 (apache#3402) HDDS-6630. Explicitly remove node after being chosen (apache#3332) HDDS-6560. Add general Grafana dashboard (apache#3285) HDDS-6704. EC: ReplicationManager - create version of ContainerReplicaCounts applicable to EC (apache#3405) HDDS-6680. Pre-Finalize behaviour for Bucket Layout Feature. (apache#3377) HDDS-6619. Add freon command to run r/w mix workload using ObjectStore APIs (apache#3383) HDDS-6734. ozone admin pipeline list CLI is not backward compatible (apache#3406) ... Conflicts: hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMMetadataStore.java hadoop-hdds/interface-server/src/main/proto/SCMRatisProtocol.proto hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMDBDefinition.java hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMMetadataStoreImpl.java hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/StorageContainerManager.java
What changes were proposed in this pull request?
This change adds a class and tests intended to be used for checking if an EC container is over or under replicated, while also considering decommission and maintenance. Given the container details and replicas, there are a series of methods used to check for over and under replication. There are also methods to return the under or over replicated indexes. The intention of this class, is to be used by Replication manager and any other parts of the code that need to make decisions based on the replicated state of EC containers.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-6704
How was this patch tested?
New unit tests