-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-8326. Container Level Info - OM DB Insights. #4509
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
sumitagrawl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devmadhuu Thanks for working over this, please check few minor comments.
| pipelines.add(pipelineManager.getPipeline(pipelineID)); | ||
| } | ||
| } catch (ContainerNotFoundException e) { | ||
| throw new RuntimeException(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For containerNotFound exception, we can log and continue as parallel can be removed from SCM. No need throw exeption
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sumitagrawl - pls have re-look, Changes done and pushed.
| } catch (ContainerNotFoundException e) { | ||
| throw new RuntimeException(e); | ||
| } catch (PipelineNotFoundException e) { | ||
| throw new RuntimeException(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For pipelineNotFound exception, pipeline may have been closed, so we can ignore this with debug log, and continue reporting this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sumitagrawl - pls have re-look, Changes done and pushed.
|
|
||
| @GET | ||
| @Path("insights/containermismatch") | ||
| public Response getContainerInsights() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
method name can be changed, getContainerMismatchInsights
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sumitagrawl - pls have re-look, Changes done and pushed.
| } | ||
|
|
||
| @GET | ||
| @Path("insights/containermismatch") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
path can be /containers/mismatch which aligns with current container APIs. In API "insights" is not required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes done and pushed
| try { | ||
| Map<Long, ContainerMetadata> omContainers = | ||
| reconContainerMetadataManager.getContainers(-1, -1); | ||
| List<Long> scmContainers = containerManager.getContainers().stream() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assume this scmContainers are which are is not in DELETED state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is SCM container list irrespective of their state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1-In this case all SCM Containers when compared with OM Containers , will show all Deleted State containers as discrepancy i.e missing in OM
2-Also OM will never report any container "available in OM but missing in SCM" as deleted container is still in SCM list.
We need to have only Non-Deleted State SCM containers.
@sumitagrawl can you help check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devmadhuu @krishnaasawa1 Yes, we need consider non-deleted state of containers. This is like a dataloss only if OM have referred container but that container is not present / deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OM Containers
@sumitagrawl @krishnaasawa1 first point is fixed where "Missing in OM and present in SCM" case.
| .collect( | ||
| Collectors.toList()); | ||
|
|
||
| notSCMContainers.forEach(nonSCMContainer -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
notSCMContainers List will be empty in normal scenario and only if some thing went wrong will have it populated.
Here we need to check notSCMContainers size is not zero then only do further steps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
foreach will not execute if list is empty, so no check required.
| .collect(Collectors.toList()); | ||
|
|
||
| List<Pipeline> pipelines = new ArrayList<>(); | ||
| nonOMContainers.forEach(nonOMContainerId -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here again need a check if nonOMContainers empty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
foreach will not execute if list is empty, so no check required.
sumitagrawl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devmadhuu Thanks working over this, LGTM
sumitagrawl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devmadhuu Given one minor comment
| HddsProtos.LifeCycleState.DELETED)) | ||
| .map(containerInfo -> containerInfo.getContainerID()).collect( | ||
| Collectors.toList()); | ||
| List<Long> scmNonDeletedContainers = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scmNonDeletedContainers and scmAllContainers are same, duplicate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scmNonDeletedContainers and scmAllContainers are same, duplicate
Its fixed. Pls re-review.
| // Filter list of container Ids which are present in OM but not in SCM. | ||
| List<Map.Entry<Long, ContainerMetadata>> notSCMContainers = | ||
| omContainers.entrySet().stream().filter(containerMetadataEntry -> | ||
| !(scmAllContainers.contains(containerMetadataEntry.getKey()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still looks incorrect. If a container has got Deleted in SCM and still present in OM, it will never get flagged as scmAllContainers has Deleted SCM containers as well. so no discrepancy will ever get reported.
you need to have scmNonDeletedContainers here as well instead of scmAllContainers. What we are trying to figure out is container is still referred in OM but in SCM that container has got deleted.
@devmadhuu @sumitagrawl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still looks incorrect. If a container has got Deleted in SCM and still present in OM, it will never get flagged as scmAllContainers has Deleted SCM containers as well. so no discrepancy will ever get reported. you need to have scmNonDeletedContainers here as well instead of scmAllContainers. What we are trying to figure out is container is still referred in OM but in SCM that container has got deleted. @devmadhuu @sumitagrawl
IMO, we want to identify mismatch of container referred in OM but present in SCM, now why the container is in deleted state in SCM, that is different issue. But any container which cannot be queried for metadata by OM to SCM is a data loss situation. @sumitagrawl - I think your 1st point is what we discussed. Let us know what discussion happened with @errose28
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devmadhuu We should not count deleted SCM container for comparing to OM also, deleted container is just a bookmark, to avoid re-created same container at DN due to issue with sync at DN or Old DN.
This needs to be removed in comparison here tool.
sumitagrawl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devmadhuu Need avoid deleted container in comparision.
| // Filter list of container Ids which are present in OM but not in SCM. | ||
| List<Map.Entry<Long, ContainerMetadata>> notSCMContainers = | ||
| omContainers.entrySet().stream().filter(containerMetadataEntry -> | ||
| !(scmAllContainers.contains(containerMetadataEntry.getKey()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devmadhuu We should not count deleted SCM container for comparing to OM also, deleted container is just a bookmark, to avoid re-created same container at DN due to issue with sync at DN or Old DN.
This needs to be removed in comparison here tool.
Updated the patch, pls review |
|
|
||
| @JsonProperty("scmContainerState") | ||
| @JsonInclude(JsonInclude.Include.NON_EMPTY) | ||
| private long scmContainerState; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like scmContainerState and omContainerState are not used in anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like scmContainerState and omContainerState are not used in anywhere.
Thanks for pointing out Sammi, I have removed them, earlier I was thinking to use, but I don't have a mechanism to for states of mismatched containers. Pls review again.
Container Level Info (Tab1): A new API will present the container level info showing mismatch between SCM and OM. Some containers are deleted in SCM but seem to be referred to by files/keys in OM, So recon needs to detect if any container is missing or not from OM perspective. This is a data loss situation. This API will be consumed by Recon new tab/page. This API will only identify and list such containers along with number of keys. Recon UI can later query "api/v1/containers//keys" to get details of keys mapped to such containers.
Rest API Response for "api/v1/containers/mismatch"
Here in this tab, below list of containers will be shown along with their key mappings.
So this PR covers identifying and listing of following containers:
https://issues.apache.org/jira/browse/HDDS-8326
How was this patch tested?
This patch was tested using Junit test cases.