HDDS-7210. Missing open containers show up as "Closing" on the container report. #4207
Changes from all commits
c730b61
02c37e1
fe3d513
bab561c
a66748d
5103c80
a47c397
62defa5
a6c4450
e96f141
2040a96
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -428,6 +428,7 @@ protected void processContainer(ContainerInfo container, | |
| * we have to resend close container command to the datanodes. | ||
| */ | ||
| if (state == LifeCycleState.CLOSING) { | ||
| setHealthStateForClosing(replicas, container, report); | ||
| for (ContainerReplica replica: replicas) { | ||
| if (replica.getState() != State.UNHEALTHY) { | ||
| sendCloseCommand( | ||
|
|
@@ -1613,6 +1614,18 @@ private boolean isOpenContainerHealthy( | |
| .allMatch(r -> compareState(state, r.getState())); | ||
| } | ||
|
|
||
| private void setHealthStateForClosing(Set<ContainerReplica> replicas, | ||
| ContainerInfo container, | ||
| ReplicationManagerReport report) { | ||
| if (replicas.size() == 0) { | ||
| report.incrementAndSample(HealthState.MISSING, container.containerID()); | ||
| report.incrementAndSample(HealthState.UNDER_REPLICATED, | ||
| container.containerID()); | ||
| report.incrementAndSample(HealthState.MIS_REPLICATED, | ||
| container.containerID()); | ||
|
Comment on lines +1622 to +1625
Contributor
Why is the container categorized in all of these states? Isn't
Contributor
Author
In theory, MISSING could be enough, but we wanted the same behavior as in Recon. This is why we set all the states.
Contributor
This is also consistent with the replication manager detecting and reporting "
Contributor
Yeah, I see that we're also setting the container as mis-replicated, under-replicated, and missing further down in that method for closed containers. It looks like we've diverged over time from the original intention of the replication manager report, which specifies that each container should be in only one state at a time. I think we need to decide what will best help with debugging. For example, if a container is missing, it's naturally also mis-replicated and under-replicated. We can choose to count it only once as missing, or we can count it in all three categories, but that needs to be done consistently everywhere. The new RM does not count a missing container as mis-replicated, but it does count it as under-replicated in |
||
| } | ||
| } | ||
|
|
||
| public boolean isContainerReplicatingOrDeleting(ContainerID containerID) { | ||
| return inflightReplication.containsKey(containerID) || | ||
| inflightDeletion.containsKey(containerID); | ||
|
|
||
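To make the trade-off raised in the review thread concrete, here is a minimal, self-contained sketch of the two counting options for a CLOSING container that has no replicas: counting it in all three categories (what this change does) versus counting it once as MISSING. Every name in it (`ClosingHealthSketch`, `CountingPolicy`, `Report`, `recordClosing`) is hypothetical and exists only for illustration. It is not the Ozone `ReplicationManagerReport` API and does not model sampling of container IDs.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the report discussed above; not Ozone code.
class ClosingHealthSketch {

  enum HealthState { MISSING, UNDER_REPLICATED, MIS_REPLICATED }

  /** Assumed policy switch: how a CLOSING container with zero replicas is counted. */
  enum CountingPolicy { ALL_THREE, MISSING_ONLY }

  /** Minimal per-state counter. */
  static final class Report {
    private final Map<HealthState, Integer> counts =
        new EnumMap<>(HealthState.class);

    void increment(HealthState state) {
      counts.merge(state, 1, Integer::sum);
    }

    int get(HealthState state) {
      return counts.getOrDefault(state, 0);
    }
  }

  /** Records health states for a CLOSING container given its replica count. */
  static void recordClosing(int replicaCount, CountingPolicy policy,
      Report report) {
    if (replicaCount != 0) {
      return; // replicas exist, so nothing is recorded by this sketch
    }
    report.increment(HealthState.MISSING);
    if (policy == CountingPolicy.ALL_THREE) {
      report.increment(HealthState.UNDER_REPLICATED);
      report.increment(HealthState.MIS_REPLICATED);
    }
  }

  public static void main(String[] args) {
    Report allThree = new Report();
    recordClosing(0, CountingPolicy.ALL_THREE, allThree);
    Report missingOnly = new Report();
    recordClosing(0, CountingPolicy.MISSING_ONLY, missingOnly);
    for (HealthState s : HealthState.values()) {
      System.out.println(s + ": allThree=" + allThree.get(s)
          + " missingOnly=" + missingOnly.get(s));
    }
  }
}
```

Under `ALL_THREE`, the per-state counts no longer sum to the number of affected containers, so a reader of the report cannot treat the categories as disjoint. Under `MISSING_ONLY` they can. Whichever option is chosen, the thread's point is that it should be applied consistently across container states.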