HDDS-8009. OM HA metrics should be unregistered if leader is not known #4300
What changes were proposed in this pull request?
Patch #4140 added OM HA metrics, along with logic in OM to unregister them if the leader is unknown. However, the unregistration happens in the wrong if branch: it should be done when the leader itself is null, not when its id is null (the id is required to be non-null, so that branch can never be taken).
Related discussion: #4140 (comment)
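The branch mix-up described above can be illustrated with a minimal sketch. This is not the actual OzoneManager code: the class LeaderMetricsSketch, the Peer type, and the register/unregister helpers are hypothetical stand-ins, and a plain static field plays the role of the metrics registration.

```java
import java.util.Objects;

// Hypothetical stand-ins for illustration; not the real Ozone or Ratis classes.
final class LeaderMetricsSketch {

  /** Minimal stand-in for a peer whose id is required to be non-null. */
  static final class Peer {
    private final String id;

    Peer(String id) {
      this.id = Objects.requireNonNull(id, "id");
    }

    String getId() {
      return id;
    }
  }

  /** Stand-in for the metrics source: null means "not registered". */
  static String registeredLeaderId = null;

  static void register(String leaderId) {
    registeredLeaderId = leaderId;
  }

  static void unregister() {
    registeredLeaderId = null;
  }

  /** Before: unregistration hangs off the id check, which can never fail. */
  static void updateBefore(Peer leader) {
    if (leader != null) {
      String leaderId = leader.getId();
      if (leaderId != null) {
        register(leaderId);
      } else {
        unregister(); // unreachable, because the id is never null
      }
    }
    // leader == null: nothing happens, so stale metrics stay registered
  }

  /** After: unregistration happens when the leader itself is unknown. */
  static void updateAfter(Peer leader) {
    if (leader != null) {
      register(leader.getId());
    } else {
      unregister();
    }
  }
}
```

In this sketch, calling updateBefore(null) after a leader was registered leaves the old registration in place, while updateAfter(null) clears it.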
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8009
How was this patch tested?
This patch was tested manually in docker clusters under
/hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozone-ha and
/hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozone.

Also, after taking another look at the method OzoneManager.updatePeerList(), under which we register OMHAMetrics, I can see that this method is only called after a leader has been elected. In any scenario where there is no leader, this method won't even get called, so we won't have to worry about the RatisServer leader being null. It should be safe to remove the leader check altogether.

OzoneManager.updatePeerList() is used only in OzoneManagerStateMachine.notifyConfigurationChanged(..). If we search the logs, we can verify that it only gets called after a new leader election:

2023-02-22 15:05:04,865 [om2@group-D66704EFC61C-StateMachineUpdater] INFO ratis.OzoneManagerStateMachine: Received Configuration change notification from Ratis. New Peer list: