@runzhiwang runzhiwang commented Sep 1, 2020

What changes were proposed in this pull request?

What's the problem?
image

What is the cause of the failed unit test?

When the cluster is created with 9 nodes, the following code tries to shut down a follower of the pipeline. However, the pipeline exists on only 3 of those nodes, while the code checks all 9, so the pipeline cannot be found on the other 6 nodes.

    for (HddsDatanodeService dn : cluster.getHddsDatanodes()) {
      // shutdown the ratis follower
      if (ContainerTestHelper.isRatisFollower(dn, pipeline)) {
        cluster.shutdownHddsDatanode(dn.getDatanodeDetails());
        break;
      }
    }
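The idea behind the fix can be sketched roughly as follows: narrow the cluster's node list down to the members of the pipeline first, and only then ask which node is a follower. This is a self-contained illustration, not the actual patch. `Node`, `Pipeline`, and `findFollower` are hypothetical stand-ins rather than Ozone classes, and it is an assumption that the "pipeline not found" condition surfaces as an exception.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

public class FollowerSelectionSketch {

  /** Hypothetical stand-in for a datanode; not the real HddsDatanodeService. */
  static final class Node {
    final String id;

    Node(String id) {
      this.id = id;
    }
  }

  /** Hypothetical stand-in for a Ratis pipeline: member ids plus the leader id. */
  static final class Pipeline {
    final Set<String> memberIds;
    final String leaderId;

    Pipeline(Set<String> memberIds, String leaderId) {
      this.memberIds = memberIds;
      this.leaderId = leaderId;
    }

    boolean contains(Node node) {
      return memberIds.contains(node.id);
    }

    /** Assumed failure mode: asking a non-member about the pipeline throws. */
    boolean isFollower(Node node) {
      if (!contains(node)) {
        throw new IllegalStateException("pipeline not found on " + node.id);
      }
      return !leaderId.equals(node.id);
    }
  }

  /** Restricts the search to pipeline members before checking roles. */
  static Optional<Node> findFollower(List<Node> clusterNodes, Pipeline pipeline) {
    List<Node> nodesInPipeline = new ArrayList<>();
    for (Node node : clusterNodes) {
      if (pipeline.contains(node)) {
        nodesInPipeline.add(node);
      }
    }
    for (Node node : nodesInPipeline) {
      if (pipeline.isFollower(node)) {
        return Optional.of(node);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // 9 nodes in the cluster, but the pipeline lives on only 3 of them.
    List<Node> cluster = new ArrayList<>();
    for (int i = 0; i < 9; i++) {
      cluster.add(new Node("dn" + i));
    }
    Pipeline pipeline =
        new Pipeline(new HashSet<>(Arrays.asList("dn3", "dn4", "dn5")), "dn3");
    findFollower(cluster, pipeline)
        .ifPresent(node -> System.out.println("shut down " + node.id));
  }
}
```

In this sketch the six nodes outside the pipeline are never asked about their role, so the "pipeline not found" path is not reached.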

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-4176

How was this patch tested?

Existing unit tests.

@runzhiwang
Contributor Author

@elek Could you help review this patch? Thank you very much.

Contributor

This logic looks right to me.

Can you bump up the number of datanodes so that the tests run against a larger cluster: https://github.com/apache/hadoop-ozone/blob/34ee8311b0d0a37878fe1fd2e5d8c1b91aa8cc8f/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/Test2WayCommitInRatis.java#L109

That seems like it would cover your change.

Contributor Author

Contributor

Oops. You are right. I was looking at the wrong file.

Contributor

@adoroszlai adoroszlai left a comment

Thanks @runzhiwang for fixing this long-standing intermittent failure.

Contributor

Nit: nodesInPipeline may be a better name.

Contributor Author

@adoroszlai Thanks for the review. I have updated the patch.

@runzhiwang runzhiwang force-pushed the test2WayCommitForTimeoutException branch from e9d0ddc to 391ef3d Compare September 1, 2020 23:46
@adoroszlai adoroszlai merged commit 9cef3f6 into apache:master Sep 2, 2020
@adoroszlai
Contributor

Thanks @runzhiwang for the fix and @amaliujia for the review.

rakeshadr pushed a commit to rakeshadr/hadoop-ozone that referenced this pull request Sep 3, 2020