HDDS-6749. SCM includes itself as peer in addSCM request #3413
swagle
left a comment
+1 LGTM
smengcl
left a comment
lgtm. Thanks @adoroszlai for the patch
lokeshj1703
left a comment
Thanks @adoroszlai for working on this! The changes look good to me. +1.
Thanks @lokeshj1703, @smengcl, @swagle for the review.
* master: (96 commits)
  HDDS-6738. Migrate tests with rules in hdds-server-framework to JUnit5 (apache#3415)
  HDDS-6650. S3MultipartUpload support update bucket usedNamespace. (apache#3404)
  HDDS-6491. Support FSO keys in getExpiredOpenKeys (apache#3226)
  HDDS-6596. EC: Support ListBlock from CoordinatorDN (apache#3410)
  HDDS-6737. Migrate parameterized tests in hdds-server-framework to JUnit5 (apache#3414)
  HDDS-6660: EC: Add the DN side Reconstruction Handler class. (apache#3399)
  HDDS-6750. Migrate simple tests in hdds-server-scm to JUnit5 (apache#3417)
  HDDS-6749. SCM includes itself as peer in addSCM request (apache#3413)
  HDDS-6657. Improve Ozone integrated Ranger configuration instructions (apache#3365)
  HDDS-6742. Audit operation category mismatch (apache#3407)
  HDDS-6748. Intermittent timeout in TestECBlockReconstructedInputStream#testReadDataWithUnbuffer (apache#3416)
  HDDS-6731. Migrate simple tests in hdds-server-framework to JUnit5 (apache#3412)
  HDDS-5919. In kubernetes OM HA has circular dependency on service availability (apache#3185)
  HDDS-6730. Migrate tests in hdds-tools to JUnit5 (apache#3402)
  HDDS-6630. Explicitly remove node after being chosen (apache#3332)
  HDDS-6560. Add general Grafana dashboard (apache#3285)
  HDDS-6704. EC: ReplicationManager - create version of ContainerReplicaCounts applicable to EC (apache#3405)
  HDDS-6680. Pre-Finalize behaviour for Bucket Layout Feature. (apache#3377)
  HDDS-6619. Add freon command to run r/w mix workload using ObjectStore APIs (apache#3383)
  HDDS-6734. ozone admin pipeline list CLI is not backward compatible (apache#3406)
  ...
Conflicts:
  hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMMetadataStore.java
  hadoop-hdds/interface-server/src/main/proto/SCMRatisProtocol.proto
  hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMDBDefinition.java
  hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMMetadataStoreImpl.java
  hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/StorageContainerManager.java
What changes were proposed in this pull request?
During startup, SCM sends an `addSCM` request to its peers. It tries to exclude itself from the target list:
ozone/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/HAUtils.java
Lines 119 to 121 in 4046150
but `removeSelfId` does not work. The bug is that it sets the generic `ozone.scm.nodes` property, leaving the SCM service-specific config key `ozone.scm.nodes.<service>` unchanged with 3 nodes.
https://issues.apache.org/jira/browse/HDDS-6749
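The key mismatch described above can be illustrated with a small standalone sketch. This is not the Ozone code: a plain `Map` stands in for the configuration object, and the class name, the helper names (`peers`, `removeSelfBuggy`, `removeSelfFixed`), the service ID `scmservice`, and the node IDs are all hypothetical. It also assumes that the peer lookup prefers the service-specific key over the generic one, and the "fixed" variant is only one plausible way to address the problem, not necessarily what this patch does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in only; none of these names come from Ozone itself.
public class RemoveSelfIdSketch {
  public static final String NODES_KEY = "ozone.scm.nodes";

  // Service-specific key of the form ozone.scm.nodes.<service>.
  public static String nodesKeyFor(String serviceId) {
    return NODES_KEY + "." + serviceId;
  }

  // Assumed lookup order: service-specific key first, generic key as fallback.
  public static List<String> peers(Map<String, String> conf, String serviceId) {
    String value = conf.getOrDefault(nodesKeyFor(serviceId),
        conf.getOrDefault(NODES_KEY, ""));
    List<String> result = new ArrayList<>();
    for (String id : value.split(",")) {
      if (!id.trim().isEmpty()) {
        result.add(id.trim());
      }
    }
    return result;
  }

  private static String withoutSelf(List<String> nodes, String selfId) {
    List<String> copy = new ArrayList<>(nodes);
    copy.remove(selfId);
    return String.join(",", copy);
  }

  // Mirrors the described bug: the filtered list is written to the generic
  // key, so the service-specific key still lists every node.
  public static Map<String, String> removeSelfBuggy(
      Map<String, String> conf, String serviceId, String selfId) {
    Map<String, String> copy = new HashMap<>(conf);
    copy.put(NODES_KEY, withoutSelf(peers(conf, serviceId), selfId));
    return copy;
  }

  // One plausible correction: write the filtered list to the same
  // service-specific key that the lookup reads.
  public static Map<String, String> removeSelfFixed(
      Map<String, String> conf, String serviceId, String selfId) {
    Map<String, String> copy = new HashMap<>(conf);
    copy.put(nodesKeyFor(serviceId), withoutSelf(peers(conf, serviceId), selfId));
    return copy;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(nodesKeyFor("scmservice"), "scm1,scm2,scm3");

    // Buggy variant: scm1 still sees itself among the peers.
    System.out.println(peers(removeSelfBuggy(conf, "scmservice", "scm1"), "scmservice"));
    // prints [scm1, scm2, scm3]

    // Fixed variant: only the other two nodes remain.
    System.out.println(peers(removeSelfFixed(conf, "scmservice", "scm1"), "scmservice"));
    // prints [scm2, scm3]
  }
}
```

In this sketch, both variants return a modified copy rather than mutating the input, so the original configuration keeps all three nodes for any other consumer.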
How was this patch tested?
Added unit test.
Also added a log message showing the list of peers the fail-over proxy is created with. From the HA acceptance test:
https://github.com/adoroszlai/hadoop-ozone/actions/runs/2322096338