Skip to content

Conversation

@dombizita
Copy link
Contributor

What changes were proposed in this pull request?

#6506 introduced a new container replica field, dataChecksum, which will tell us a specific container replica's checksum. In this patch I'm adding the functionality in Recon to behave based on this value.

Recon already processes the container reports from the datanodes and based on these the ContainerHealthTask does certain things. It'll scan through the containers and mark containers unhealthy if needed (e.g. missing, under replicated, etc.). I'm introducing a new unhealthy container state in Recon, called REPLICA_MISMATCH. A container will be marked REPLICA_MISMATCH by Recon if at least one of their replicas data checksum is different from the others.

To be able to tell which replica has which data checksum value, I added the dataChecksum field to ContainerReplicaHistoryProto as an optional field (just like in #4443 the state field). In the ContainerEndpoint when we are requesting the /unhealthy containers, we will be able to get the information for each replica (here).

This patch only does the backend changes, in the Recon UI we'll add the new table in HDDS-12395.

As I'm adding a new unhealthy container state in Recon, it'll introduce a new state in that column in the DB (where Recon maintains the unhealthy containers). With the help of the Recon Schema Versioning I added a new upgrade action, UnhealthyContainerReplicaMismatchAction, to handle this.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11887

How was this patch tested?

Updated and added new unit tests: https://github.com/dombizita/ozone/actions/runs/13454086954

Change-Id: Ic58b5b195c36799104370490c61a7e7dd8466eb7
Change-Id: Iad0013190331886714ba1643bc671cdc0bd27242
Change-Id: I4ebc33680cc17f8f0be4c7f39ccdb0fa6cde00fb
Change-Id: Ie6f7e6d9f869057be371f3466621f5ea6ef03f68
Change-Id: I542a4badc17f21229f6586446e7fbb7b82de3518
Change-Id: Ie7d09f16121a5cf633db23f2c02f4e360f664ca6
Change-Id: Ib8b3412c8080663d176028eeffda1d71f8988aef
Change-Id: I1b33049e75f4a40a7ab84b604625ab93775ffc6a
Change-Id: I75ef680eb643eddc37b33141ffd50f5c33c555b7
Change-Id: I516826fa8bdfeb2ea557081c8b4983c94ba4260e
Change-Id: Iaed2368bcd737aaacaac6a2038bb15fed5aff674
Change-Id: I41ab7a2d5e210fb958bf151383c4c2991950606f
Change-Id: I4f27e5275ccee3150e976e859e964f25bd196172
Change-Id: I4036472a8e2a3c360b19fbee1a5c05ee3065b38a
@adoroszlai adoroszlai changed the title HDDS-11887. Ozone Recon - Identify container replicas difference based on content checksums HDDS-11887. Recon - Identify container replicas difference based on content checksums Feb 21, 2025
Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dombizita for the patch. Few comments. Pls check.

required int64 lastSeenTime = 3;
required int64 bcsId = 4;
optional string state = 5;
optional int64 dataChecksum = 6;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding this new field in proto, will there be any decoding issues for existing data in replica_history table in upgrade case when try to load ? Has this been tested ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this as an optional field, so that is saving us from some problems regarding compatibility (similarly how it was done in #4443). But I'm not sure if we should handle the case when there is no dataChecksum and we call this

public static ContainerReplicaHistory fromProto(
ContainerReplicaHistoryProto proto) {
return new ContainerReplicaHistory(UUID.fromString(proto.getUuid()),
proto.getFirstSeenTime(), proto.getLastSeenTime(), proto.getBcsId(),
proto.getState(), proto.getDataChecksum());
}

If we should, than the state field is also missing this and we should do it for that as well.

Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, just one comment in addition to Devesh's review above.

@dombizita
Copy link
Contributor Author

#7942 (comment) made me realise that with introducing the REPLICA_MISMATCH state we'll be able to have containers that are in two unhealthy state at the same time, e.g. a container can be over replicated and it could have a replica checksum mismatch. I think this couldn't happen before with these states

public enum UnHealthyContainerStates {
MISSING,
EMPTY_MISSING,
UNDER_REPLICATED,
OVER_REPLICATED,
MIS_REPLICATED,
ALL_REPLICAS_BAD,
NEGATIVE_SIZE // Added new state to track containers with negative sizes
}

What do you think, how should we handle this? It should be in both states and it should show up in both unhealthy container tables, right? We have stats for each container state count and the overall count, how should we do that? Count it to both states, but we shouldn't count them twice to the overall count?
I'll look over my patch and check if I should change anything based on this, as I think this will bring new scenarios into the picture, how the unhealthy containers are handled in Recon. Let me know if I miss something.

@devmadhuu
Copy link
Contributor

#7942 (comment) made me realise that with introducing the REPLICA_MISMATCH state we'll be able to have containers that are in two unhealthy state at the same time, e.g. a container can be over replicated and it could have a replica checksum mismatch. I think this couldn't happen before with these states

public enum UnHealthyContainerStates {
MISSING,
EMPTY_MISSING,
UNDER_REPLICATED,
OVER_REPLICATED,
MIS_REPLICATED,
ALL_REPLICAS_BAD,
NEGATIVE_SIZE // Added new state to track containers with negative sizes
}

What do you think, how should we handle this? It should be in both states and it should show up in both unhealthy container tables, right? We have stats for each container state count and the overall count, how should we do that? Count it to both states, but we shouldn't count them twice to the overall count?
I'll look over my patch and check if I should change anything based on this, as I think this will bring new scenarios into the picture, how the unhealthy containers are handled in Recon. Let me know if I miss something.

As far as I know, I think , we add the record per container per state in UNHEALTHY_CONTAINERS table. Can you pls check this and this code ?

Change-Id: I32a5d519087badeeadb6e69fd9dba21cd3170b6e
@dombizita
Copy link
Contributor Author

As far as I know, I think , we add the record per container per state in UNHEALTHY_CONTAINERS table. Can you pls check this and this code ?

Thanks for linking these classes @devmadhuu. Based on these I think we should be good storing the REPLICA_MISMATCH containers in the unhealthy container table, as we already do.

@dombizita dombizita requested review from devmadhuu and removed request for ArafatKhan2198 March 10, 2025 13:43
Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dombizita for improving the patch. Changes largely LGTM +1.

Copy link
Contributor

@ArafatKhan2198 ArafatKhan2198 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @dombizita on working on this, based on the patch review and our´ initial offline discussion everything looks good!

@kerneltime kerneltime merged commit f47b407 into apache:HDDS-10239-container-reconciliation Apr 7, 2025
43 checks passed
errose28 added a commit to errose28/ozone that referenced this pull request Apr 11, 2025
…anner-builds-mt

* HDDS-10239-container-reconciliation: (646 commits)
  HDDS-11763. Implement container repair logic within datanodes. (apache#7474)
  HDDS-11887. Recon - Identify container replicas difference based on content checksums (apache#7942)
  HDDS-12397. Persist putBlock for closed container. (apache#7943)
  HDDS-12361. Mark testGetSnapshotDiffReportJob as flaky
  HDDS-12375. Random object created and used only once (apache#7933)
  HDDS-12335. Fix ozone admin namespace summary to give complete output (apache#7908)
  HDDS-12364. Require Override annotation for overridden methods (apache#7923)
  HDDS-11530. Support listMultipartUploads max uploads and markers (apache#7817)
  HDDS-11867. Remove code paths for non-Ratis SCM. (apache#7911)
  HDDS-12363. Add import options to IntelliJ IDEA style settings (apache#7921)
  HDDS-12215. Mark testContainerStateMachineRestartWithDNChangePipeline as flaky
  HDDS-12331. BlockOutputStream.failedServers is not thread-safe (apache#7885)
  HDDS-12188. Move server-only upgrade classes from hdds-common to hdds-server-framework (apache#7903)
  HDDS-12362. Remove temporary checkstyle suppression file (apache#7920)
  HDDS-12343. Fix spotbugs warnings in Recon (apache#7902)
  HDDS-12286. Fix license headers and imports for ozone-tools (apache#7919)
  HDDS-12284. Fix license headers and imports for ozone-s3-secret-store (apache#7917)
  HDDS-12275. Fix license headers and imports for ozone-integration-test (apache#7904)
  HDDS-12164. Rename and deprecate DFSConfigKeysLegacy config keys (apache#7803)
  HDDS-12283. Fix license headers and imports for ozone-recon-codegen (apache#7916)
  ...

Conflicts:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerChecksumTreeManager.java
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerMerkleTreeWriter.java
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainerCheck.java
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/checksum/ContainerMerkleTreeTestUtils.java
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/checksum/TestContainerMerkleTreeWriter.java
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainerCheck.java
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestBackgroundContainerDataScanner.java
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOnDemandContainerDataScanner.java
hadoop-ozone/dist/src/main/smoketest/admincli/container.robot
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/dn/scanner/TestBackgroundContainerDataScannerIntegration.java
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/dn/scanner/TestBackgroundContainerMetadataScannerIntegration.java
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/dn/scanner/TestOnDemandContainerDataScannerIntegration.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants