-
Notifications
You must be signed in to change notification settings - Fork 9.2k
HDDS-1205. Refactor ReplicationManager to handle QUASI_CLOSED contain… #620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
💔 -1 overall
This message was automatically generated. |
arp7
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @nandakumar131 . The code is very beautifully laid out and this patch was a pleasure to review. A few comments.
...-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/SCMContainerManager.java
Outdated
Show resolved
Hide resolved
...p-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
Outdated
Show resolved
Hide resolved
...p-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
Outdated
Show resolved
Hide resolved
...p-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
Outdated
Show resolved
Hide resolved
...p-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not deleting the 'obsolete' replicas here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, we don't do any replica deletion here.
...p-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also need to handle the case where state == CLOSED but we have some quasi-closed replicas, so we need to do force-close.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That will be handled by handleInconsistentContainer method
...p-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
Outdated
Show resolved
Hide resolved
...p-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
Outdated
Show resolved
Hide resolved
|
🎊 +1 overall
This message was automatically generated. |
arp7
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 LGTM, with one minor comment about handling InterruptedException.
Thanks for working through this so patiently @nandakumar131!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this offline and I think the agreement was to fix the locking in container report processor.
...p-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
Outdated
Show resolved
Hide resolved
|
💔 -1 overall
This message was automatically generated. |
arp7
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two minor comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't make the method synchronized. If so then anyone who invokes processContainersNow will block indefinitely while the whole iteration completes. Instead just wrap the wait(interval) inside a synchronized(this) block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also you will have to suppress the findbugs warning about the naked notify call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't call terminate if running is false. In that case, just exit the thread. See RedundancyMonitor#run in HDFS BlockManager.java.
|
+1 LGTM. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
Not sure why yetus is failing, the branch is up to date with trunk. |
|
+1 from me. Not sure why Yetus is failing either. |
…ers. Contributed by Nanda kumar.
|
Squashed and rebased on top of Apache trunk to get Jenkins results. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
… apply configuration overrides Our integration test framework uses configuration overrides (i.e. jobs.*) to override the user system configuration set in the code (e.g. KafkaSystemDescriptor) to test systems. However, the StreamManager we created before calling ExecutionPlanner.plan() does not apply the overrides and causes failure in tests since the system was not correctly set to in-memory systems by configuration overrides. Author: Yi Pan (Data Infrastructure) <[email protected]> Reviewers: Prateek Maheshwari <[email protected]>, Boris S <[email protected]>, Xinyu Liu <[email protected]>, Shanthoosh Venkataraman <[email protected]> Closes apache#620 from nickpan47/SAMZA-1836 and squashes the following commits: a376b888 [Yi Pan (Data Infrastructure)] SAMZA-1836: StreamManager created before ExecutionPlanner should also apply the configuration overrides 35f2c0b7 [Yi Pan (Data Infrastructure)] SAMZA-1836: fixing two unit tests that should use InMemorySystemFactory instead of KafkaSystemDescriptor
…ers. Contributed by Nanda kumar.
Change-Id: I578ed5c196bb4d20754607cb9f4ddee7e0599077