
Conversation

@sodonnel (Contributor)

What changes were proposed in this pull request?

In its current form the ReplicationManager does not consider decommission or maintenance states when checking whether replicas are sufficiently replicated. With the introduction of maintenance states, it needs to take decommission and maintenance into account when deciding whether blocks are over- or under-replicated.

It also needs to provide an API that allows the decommission manager to check whether blocks are over- or under-replicated, so the decommission manager can decide whether a node has completed decommission and maintenance.

The key part of this change is a new class called ContainerReplicaCount: the logic to determine whether a container is "sufficiently replicated" or over-replicated, and the delta of replicas required, is extracted into this class. This gives a standalone, testable unit that can be used inside the ReplicationManager, and the same object can be returned from the ReplicationManager to another class that needs to make replication decisions (i.e. the datanode admin manager).
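To illustrate the intended consumer, here is a rough sketch of how the datanode admin manager could use the new API (getContainerReplicaCount is an assumed accessor name here; the exact signatures may differ in the final patch):

  // Hypothetical consumer-side sketch; getContainerReplicaCount is an assumed
  // accessor on ReplicationManager, not necessarily its final name.
  boolean canCompleteAdminTransition(ReplicationManager rm, ContainerID id) {
    ContainerReplicaCount counts = rm.getContainerReplicaCount(id);
    // Decommission/maintenance may only complete for this container once it
    // is sufficiently replicated elsewhere.
    return counts.isSufficientlyReplicated();
  }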

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-2459

How was this patch tested?

Additional unit tests have been added to test the new functionality.

@sodonnel force-pushed the HDDS-2459-rep-manager-dn-reports branch from 61ea716 to 5a9bc68 on November 22, 2019 at 22:37
… locally. Will revisit it in a later patch
@adoroszlai (Contributor)

@sodonnel Consider disabling the failing unit test using @Ignore instead of commenting it out. This way you can prevent the situation where some refactoring makes the commented code not compile when you want to enable it.

@elek (Member) commented Nov 27, 2019

> @sodonnel Consider disabling the failing unit test using @Ignore instead of commenting it out. This way you can prevent the situation where some refactoring makes the commented code not compile when you want to enable it.

+1. And please create a Jira with information about the failure (log, stdout) and reference it in the @Ignore.
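For example, a minimal JUnit 4 sketch (the class/method names and the Jira ID are placeholders):

  import org.junit.Ignore;
  import org.junit.Test;

  public class TestReplicationManagerSketch {

    // The test keeps compiling on every build, but is skipped and points at
    // the tracking Jira instead of being commented out.
    @Ignore("Times out on GitHub Actions; see HDDS-XXXX for the failure details.")
    @Test
    public void testThatCurrentlyFailsInCI() {
      // original test body unchanged
    }
  }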

@elek (Member) left a comment

Thank you very much for this patch @sodonnel. Overall it looks good to me. I tested it and it works as proposed in the design doc.

My biggest question is the usage of the NodeManager vs. using the state of the containers. AFAIK @anuengineer had a different proposal.

* and zero indicates the container has replicationFactor healthy
* replicas
*/
public int additionalReplicaNeeded() {
Member

I agree with @anuengineer, who suggested separating the calculation of the required replicas from the retry logic (the calculation with in-flights). It would make the logic more similar to the original proposal and would make it possible to handle inFlightDel / inFlightAdd in different ways (for example, in case of over-replication the inFlightAdd can be stopped...).

For example:

  public int additionalReplicaNeeded() {
    int missing = calculateMissingReplicas();
    if (missing <= 0) {
      return missing + inFlightDel;
    } else {
      return missing + inFlightDel - inFlightAdd;
    }
  }

Where calculateMissingReplicas is the existing additionalReplicaNeeded logic.
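As a sketch of the pure part (no in-flight terms); healthyCount and maintenanceCount are assumed field names, repFactor and minHealthyForMaintenance are the ones used elsewhere in the class:

  // Sketch only: the pure replica arithmetic, with no in-flight adjustments.
  private int calculateMissingReplicas() {
    int delta = repFactor - healthyCount;
    if (delta < 0) {
      return delta; // over replicated
    }
    // Replicas on maintenance nodes reduce what needs to be re-replicated...
    delta = Math.max(0, delta - maintenanceCount);
    // ...but never below the healthy minimum required during maintenance.
    return Math.max(delta, minHealthyForMaintenance - healthyCount);
  }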

Contributor Author

I discussed this with @elek offline. I have refactored this code and we believe it is simpler and easier to understand now.

// calculating the new containers needed.
delta = Math.max(0, delta - maintenanceCount);
// Check we have enough healthy replicas
minHealthyForMaintenance = Math.min(repFactor, minHealthyForMaintenance);
Member

This line can be moved to the constructor to simplify the logic here.

Contributor Author

True, I missed this. I have made this change.
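Roughly, the change just applies the bound once at construction time (a sketch; parameter names are inferred and the other field assignments are elided):

  ContainerReplicaCount(Set<ContainerReplica> replicas, int inFlightAdd,
      int inFlightDel, int repFactor, int minHealthyForMaintenance) {
    // ... other fields assigned as before ...
    // A container never needs more healthy replicas than its rep factor, so
    // bound the maintenance minimum once here instead of in every caller.
    this.minHealthyForMaintenance =
        Math.min(repFactor, minHealthyForMaintenance);
  }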

* @return True if the container is sufficiently replicated and False
* otherwise.
*/
public boolean isSufficientlyReplicated() {
Member

This (and the isOverReplicated function) seems to be the same logic as we have in additionalReplicaNeeded, just in a simplified version. It can be harder to maintain the logic in two places. Why don't we calculate the number of missing replicas and use that number to decide whether to handle under/over-replication?

Contributor Author

There is a subtle difference in these methods in that they only consider inflight deletes, while additionalReplicaNeeded() considers inflight add and delete in some cases.

isSufficientlyReplicated() is intended to be used by the decommission monitor, so it can make a decision on whether a container is correctly replicated to allow decommission or maintenance to proceed. Therefore it assumes an inflight del will complete (worst case) but ignores inflight adds (assuming they will fail, again worst case, until they actually complete).

However, based on the refactor above we can reuse the logic to calculate the missing replicas, and then apply the inflight deletes and simplify these methods. The next commit will have that change.
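As a sketch of the shape that refactor takes (not the final code), reusing the shared missing-replica calculation and applying only the in-flight deletes:

  // Sketch: project the replica count forward assuming scheduled deletes
  // succeed and scheduled adds do not (the worst case for decommission).
  public boolean isSufficientlyReplicated() {
    return calculateMissingReplicas() + inFlightDel <= 0;
  }

  public boolean isOverReplicated() {
    // Still a surplus even after every scheduled delete completes.
    return calculateMissingReplicas() + inFlightDel < 0;
  }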

/**
* Used to look up the health of a node or the node's operational state.
*/
private final NodeManager nodeManager;
Member

@anuengineer / @nandakumar131 had a different proposal offline (and in #160): to manage the state only in the containers. In that case it's not required to call the NodeManager.

I am not sure about the current state of that proposal.

Contributor Author

The use of the node manager is not to see the state of the containers, but to check whether a node is alive before attempting to use it as the source of a replication. E.g. the containers can be decommissioned (which is their end state) but the host is either healthy, stale or dead. We don't want to use a stale or dead node as a source of the replication, hence we need to use the node manager to get that health state. The proposal by @anuengineer / @nandakumar131 is in place here: the logic expects the DNs to change the container state via a container report to decommissioned / maintenance for the rest of the logic to work.
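For example, something along these lines (a sketch; getNodeState(DatanodeDetails) is the NodeManager lookup assumed here):

  // Sketch: only replicas on HEALTHY nodes are eligible replication sources,
  // regardless of their decommission/maintenance operational state.
  List<DatanodeDetails> eligibleSources = replicas.stream()
      .map(ContainerReplica::getDatanodeDetails)
      .filter(dn -> nodeManager.getNodeState(dn) == NodeState.HEALTHY)
      .collect(Collectors.toList());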

Member

As far as I understood, the proposal is to update the state of the containers by another component based on the node state, and to use only the container state here (instead of checking the state via the node manager).

I discussed it with @anuengineer. Let's go forward with this approach and later we can improve this part.

public class TestContainerReplicaCount {

private OzoneConfiguration conf;
private List<DatanodeDetails> dns;
Member

Might be unused.

Contributor Author

Yes, this test class went through a few changes and those variables are no longer used. I have removed them.


@Test
public void testThreeHealthyReplica() {
registerNodes(CLOSED, CLOSED, CLOSED);
Member

Is there any reason to store the state at the class level here? It would be more readable (for me) to keep all the required state in local variables:

Set<ContainerReplica> replica =  registerNodes(CLOSED, CLOSED, CLOSED);
... rcount = new ContainerReplicaCount(replica, 0, 0, 3, 2);
validate(rcount, true, 0, false);

Contributor Author

I have changed this in the most recent commit and all the instance variables are now gone.
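For reference, a sketch of what one of the refactored tests now looks like with everything in locals (the meaning of the integer constructor arguments is inferred from the discussion above):

  @Test
  public void testThreeHealthyReplica() {
    Set<ContainerReplica> replica = registerNodes(CLOSED, CLOSED, CLOSED);
    // replica set, inFlightAdd, inFlightDel, repFactor, minHealthyForMaintenance
    ContainerReplicaCount rcount =
        new ContainerReplicaCount(replica, 0, 0, 3, 2);
    validate(rcount, true, 0, false);
  }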

@sodonnel (Contributor, Author)

> @sodonnel Consider disabling the failing unit test using @Ignore instead of commenting it out. This way you can prevent the situation where some refactoring makes the commented code not compile when you want to enable it.

> +1. And please create a Jira with information about the failure (log, stdout) and reference it in the @Ignore.

I will change this to @Ignore; however, I have not been able to find the cause of the problem. It just times out every time when running via GitHub Actions, but did not before the switch to GH Actions, and it works locally in under 2 seconds every time. I am going to revisit the approach to this test as a wider effort, or dig into the failure later. I have raised HDDS-2631 so I do not forget this.

@elek changed the title from "HDDS-2459 - Change the ReplicationManager to consider decommission and maintenance states" to "HDDS-2459. Change the ReplicationManager to consider decommission and maintenance states" on Nov 28, 2019
@elek (Member) commented Nov 28, 2019

> I will change this to @Ignore; however, I have not been able to find the cause of the problem.

Sure, just link the failing GitHub Actions unit test run, and download the logs and upload them to the Jira (if meaningful; in case of a timeout they can be empty). I am not interested in the real root cause, but we need a definition of the problem, including assertion errors, exceptions and log output, so we can check it later.

@elek closed this on Nov 28, 2019