HDDS-7396. Force close non-RATIS containers in ReplicationManager #3877
// Add CLOSING container replicas with index [1, closing]
for (int i = 1; i <= closing; i++) {
  containerReplicas.add(ReplicationTestUtil.createContainerReplica(
      containerInfo.containerID(), i,
Minor point, but just in case it causes problems in the future, we should fix it.
Ratis replicas should always have index = 0. EC replicas should always have indexes >= 1.
Based on the force flag, you could set the index to force == true ? i : 0 in the two loops.
Thanks, I missed that.
I have set it to ReplicationType == EC ? i : 0, so the index is still 0 if STAND_ALONE or CHAINED is passed.
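For reference, the index rule discussed above can be sketched in isolation. This is only an illustration, not the code from the PR: the `ReplicaIndexSketch` class, its nested `ReplicationType` enum, and the `replicaIndexFor` helper are hypothetical stand-ins, and the real test uses Ozone's own types.

```java
public class ReplicaIndexSketch {
    // Hypothetical stand-in for Ozone's replication type enum (assumption).
    enum ReplicationType { RATIS, STAND_ALONE, CHAINED, EC }

    // EC replicas keep the loop counter i (always >= 1) as their index;
    // every other replication type gets index 0.
    static int replicaIndexFor(ReplicationType type, int i) {
        return type == ReplicationType.EC ? i : 0;
    }

    public static void main(String[] args) {
        for (ReplicationType type : ReplicationType.values()) {
            for (int i = 1; i <= 3; i++) {
                System.out.println(type + " replica " + i
                    + " -> index " + replicaIndexFor(type, i));
            }
        }
    }
}
```

Keying on the replication type rather than on the force flag means STAND_ALONE and CHAINED replicas also end up with index 0, which is the case the reply above calls out.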
sodonnel
left a comment
Change looks good. I just have one minor comment on the index used in the test. If we fix it we should be good to commit.
public void testOpenOrClosingReplicasAreClosed() {
@ParameterizedTest
@MethodSource("replicationConfigs")
public void testOpenOrClosingReplicasAreClosed(ReplicationConfig repConfig) {
Parameterizing this is a good idea. If I understand correctly, this test will now run for both EC and RATIS replication configs. If so, we can delete the next test testOpenOrClosingRatisReplicasAreClosed().
Yeah, it's covered by the parameterized test.
Thanks for pointing this out.
siddhantsangwan
left a comment
@kaijchen Thanks for working on this. Looks good! Pending CI.
sodonnel
left a comment
LGTM when the CI is green.
JacksonYao287
left a comment
thanks @kaijchen for the work, and @sodonnel @siddhantsangwan for the review! LGTM
Thanks @sodonnel @siddhantsangwan @JacksonYao287 for the review.
* master: (718 commits)
  HDDS-7342. Move encryption-related code from MultipartCryptoKeyInputStream to OzoneCryptoInputStream (apache#3852)
  HDDS-7413. Fix logging while marking container state unhealthy (apache#3887)
  Revert "HDDS-7253. Fix exception when '/' in key name (apache#3774)"
  HDDS-7396. Force close non-RATIS containers in ReplicationManager (apache#3877)
  HDDS-7121. Support namespace summaries (du, dist & counts) for legacy FS buckets (apache#3746)
  HDDS-7258. Cleanup the allocated but uncommitted blocks (apache#3778)
  HDDS-7381. Cleanup of VolumeManagerImpl (apache#3873)
  HDDS-7253. Fix exception when '/' in key name (apache#3774)
  HDDS-7182. Add property to control RocksDB max open files (apache#3843)
  HDDS-7284. JVM crash for rocksdb for read/write after close (apache#3801)
  HDDS-7368. [Multi-Tenant] Add Volume Existence check in preExecute for OMTenantCreateRequest (apache#3869)
  HDDS-7403. README Security Improvement (apache#3879)
  HDDS-7199. Implement new mix workload Read/Write Freon command (apache#3872)
  HDDS-7248. Recon: Expand the container status page to show all unhealthy container states (apache#3837)
  HDDS-7141. Recon: Improve Disk Usage Page (apache#3789)
  HDDS-7369. Fix wrong order of command arguments in Nonrolling-Upgrade.md (apache#3866)
  HDDS-6210. EC: Add EC metrics (apache#3851)
  HDDS-7355. non-primordial scm fail to get signed cert from primordial SCM when converting an unsecure cluster to secure (apache#3859)
  HDDS-7356. Update SCM-HA.zh.md to match the English version (apache#3861)
  HDDS-6930. SCM,OM,RECON should not print ERROR and exit with code 1 on successful shutdown (apache#3848)
  ...

Conflicts:
  hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/LegacyReplicationManager.java
  hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestLegacyReplicationManager.java
What changes were proposed in this pull request?
Force close non-RATIS containers in ReplicationManager.
ozone/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/CloseContainerCommandHandler.java
Lines 106 to 113 in 965d31c
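The referenced lines are not reproduced here. As a rough sketch of the rule in the PR title only, and assuming the force flag depends on nothing but the replication type, the decision might look like the following. The `ForceCloseSketch` class, its enum, and `shouldForceClose` are hypothetical names, not Ozone APIs, and the actual ReplicationManager and CloseContainerCommandHandler logic may take more into account.

```java
public class ForceCloseSketch {
    // Hypothetical stand-in for Ozone's replication type enum (assumption).
    enum ReplicationType { RATIS, STAND_ALONE, CHAINED, EC }

    // Assumption: a close command is sent with force = true for every
    // replication type except RATIS, as the PR title suggests.
    static boolean shouldForceClose(ReplicationType type) {
        return type != ReplicationType.RATIS;
    }

    public static void main(String[] args) {
        for (ReplicationType type : ReplicationType.values()) {
            System.out.println(type + " -> force=" + shouldForceClose(type));
        }
    }
}
```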
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-7396
How was this patch tested?
TestClosingContainerHandler#testOpenOrClosingReplicasAreClosed