Skip to content

Conversation

@devmadhuu
Copy link
Contributor

@devmadhuu devmadhuu commented Mar 31, 2023

Container Level Info (Tab1): A new API will present the container level info showing mismatch between SCM and OM. Some containers are deleted in SCM but seem to be referred to by files/keys in OM, So recon needs to detect if any container is missing or not from OM perspective. This is a data loss situation. This API will be consumed by Recon new tab/page. This API will only identify and list such containers along with number of keys. Recon UI can later query "api/v1/containers//keys" to get details of keys mapped to such containers.

Rest API Response for "api/v1/containers/mismatch"

[
  {
  "containerId" : 1,
  "numberOfKeys" : 3,
  "pipelines" : [ 
        pipelineId: 1423ghjds832403232,
	pipelineId: 32vds94943fsdh4443,
	pipelineId: 32vds94943fsdhs443
   ],
  "existsAt" : "OM"
}
]

Here in this tab, below list of containers will be shown along with their key mappings.

image

So this PR covers identifying and listing of following containers:

  1. Containers are present in OM, but not in SCM.
  2. Containers are present in SCM, but not in OM.

https://issues.apache.org/jira/browse/HDDS-8326

How was this patch tested?

This patch was tested using Junit test cases.

@devmadhuu
Copy link
Contributor Author

@errose28, @smengcl - pls review.

@devmadhuu devmadhuu marked this pull request as draft March 31, 2023 14:35
@devmadhuu devmadhuu marked this pull request as ready for review March 31, 2023 16:25
@devmadhuu devmadhuu marked this pull request as draft April 3, 2023 04:05
@devmadhuu devmadhuu marked this pull request as ready for review April 3, 2023 05:45
@kerneltime kerneltime requested a review from ChenSammi April 3, 2023 16:24
Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devmadhuu Thanks for working over this, please check few minor comments.

pipelines.add(pipelineManager.getPipeline(pipelineID));
}
} catch (ContainerNotFoundException e) {
throw new RuntimeException(e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For containerNotFound exception, we can log and continue as parallel can be removed from SCM. No need throw exeption

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sumitagrawl - pls have re-look, Changes done and pushed.

} catch (ContainerNotFoundException e) {
throw new RuntimeException(e);
} catch (PipelineNotFoundException e) {
throw new RuntimeException(e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For pipelineNotFound exception, pipeline may have been closed, so we can ignore this with debug log, and continue reporting this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sumitagrawl - pls have re-look, Changes done and pushed.


@GET
@Path("insights/containermismatch")
public Response getContainerInsights() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

method name can be changed, getContainerMismatchInsights

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sumitagrawl - pls have re-look, Changes done and pushed.

}

@GET
@Path("insights/containermismatch")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

path can be /containers/mismatch which aligns with current container APIs. In API "insights" is not required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes done and pushed

try {
Map<Long, ContainerMetadata> omContainers =
reconContainerMetadataManager.getContainers(-1, -1);
List<Long> scmContainers = containerManager.getContainers().stream()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assume this scmContainers are which are is not in DELETED state?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is SCM container list irrespective of their state.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1-In this case all SCM Containers when compared with OM Containers , will show all Deleted State containers as discrepancy i.e missing in OM
2-Also OM will never report any container "available in OM but missing in SCM" as deleted container is still in SCM list.
We need to have only Non-Deleted State SCM containers.
@sumitagrawl can you help check

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devmadhuu @krishnaasawa1 Yes, we need consider non-deleted state of containers. This is like a dataloss only if OM have referred container but that container is not present / deleted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OM Containers
@sumitagrawl @krishnaasawa1 first point is fixed where "Missing in OM and present in SCM" case.

.collect(
Collectors.toList());

notSCMContainers.forEach(nonSCMContainer -> {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notSCMContainers List will be empty in normal scenario and only if some thing went wrong will have it populated.
Here we need to check notSCMContainers size is not zero then only do further steps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foreach will not execute if list is empty, so no check required.

.collect(Collectors.toList());

List<Pipeline> pipelines = new ArrayList<>();
nonOMContainers.forEach(nonOMContainerId -> {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here again need a check if nonOMContainers empty

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foreach will not execute if list is empty, so no check required.

Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devmadhuu Thanks working over this, LGTM

Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devmadhuu Given one minor comment

HddsProtos.LifeCycleState.DELETED))
.map(containerInfo -> containerInfo.getContainerID()).collect(
Collectors.toList());
List<Long> scmNonDeletedContainers =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scmNonDeletedContainers and scmAllContainers are same, duplicate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scmNonDeletedContainers and scmAllContainers are same, duplicate

Its fixed. Pls re-review.

// Filter list of container Ids which are present in OM but not in SCM.
List<Map.Entry<Long, ContainerMetadata>> notSCMContainers =
omContainers.entrySet().stream().filter(containerMetadataEntry ->
!(scmAllContainers.contains(containerMetadataEntry.getKey())))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still looks incorrect. If a container has got Deleted in SCM and still present in OM, it will never get flagged as scmAllContainers has Deleted SCM containers as well. so no discrepancy will ever get reported.
you need to have scmNonDeletedContainers here as well instead of scmAllContainers. What we are trying to figure out is container is still referred in OM but in SCM that container has got deleted.
@devmadhuu @sumitagrawl

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still looks incorrect. If a container has got Deleted in SCM and still present in OM, it will never get flagged as scmAllContainers has Deleted SCM containers as well. so no discrepancy will ever get reported. you need to have scmNonDeletedContainers here as well instead of scmAllContainers. What we are trying to figure out is container is still referred in OM but in SCM that container has got deleted. @devmadhuu @sumitagrawl

IMO, we want to identify mismatch of container referred in OM but present in SCM, now why the container is in deleted state in SCM, that is different issue. But any container which cannot be queried for metadata by OM to SCM is a data loss situation. @sumitagrawl - I think your 1st point is what we discussed. Let us know what discussion happened with @errose28

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devmadhuu We should not count deleted SCM container for comparing to OM also, deleted container is just a bookmark, to avoid re-created same container at DN due to issue with sync at DN or Old DN.

This needs to be removed in comparison here tool.

Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devmadhuu Need avoid deleted container in comparision.

// Filter list of container Ids which are present in OM but not in SCM.
List<Map.Entry<Long, ContainerMetadata>> notSCMContainers =
omContainers.entrySet().stream().filter(containerMetadataEntry ->
!(scmAllContainers.contains(containerMetadataEntry.getKey())))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devmadhuu We should not count deleted SCM container for comparing to OM also, deleted container is just a bookmark, to avoid re-created same container at DN due to issue with sync at DN or Old DN.

This needs to be removed in comparison here tool.

@devmadhuu
Copy link
Contributor Author

  • Containers are present in OM, but not in SCM.
  • Containers are present in SCM, but not in OM.

Updated the patch, pls review


@JsonProperty("scmContainerState")
@JsonInclude(JsonInclude.Include.NON_EMPTY)
private long scmContainerState;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like scmContainerState and omContainerState are not used in anywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like scmContainerState and omContainerState are not used in anywhere.

Thanks for pointing out Sammi, I have removed them, earlier I was thinking to use, but I don't have a mechanism to for states of mismatched containers. Pls review again.

@ChenSammi ChenSammi merged commit e8fb174 into apache:master May 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants