-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-12980. Add unit test framework for reconciliation. #8402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-12980. Add unit test framework for reconciliation. #8402
Conversation
aswinshakil
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Posting the intermediate review here. Still need to review tests and the KeyValueHandler reconciliation logic
...e/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerChecksumTreeManager.java
Outdated
Show resolved
Hide resolved
| + data.getContainerID(), ex); | ||
| } | ||
| // Set in-memory data checksum. | ||
| data.setDataChecksum(checksumInfo.getContainerMerkleTree().getDataChecksum()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now this is also being used by markBlockAsDeleted in case of failure in that code flow we also want set the checksum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed it from here because markBlocksAsDeleted does not affect the merkle tree and should not change the data checksum.
...vice/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerMerkleTreeWriter.java
Show resolved
Hide resolved
...vice/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerMerkleTreeWriter.java
Show resolved
Hide resolved
...vice/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerMerkleTreeWriter.java
Outdated
Show resolved
Hide resolved
...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
Show resolved
Hide resolved
...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
Show resolved
Hide resolved
aswinshakil
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this @errose28. I have posted some comments here.
...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
Outdated
Show resolved
Hide resolved
...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
Outdated
Show resolved
Hide resolved
...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
Outdated
Show resolved
Hide resolved
...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
Outdated
Show resolved
Hide resolved
...vice/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerMerkleTreeWriter.java
Show resolved
Hide resolved
|
@aswinshakil All comments should be addressed in the latest commits. I just need to circle back for some checkstyle failures. |
aswinshakil
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch @errose28. The changes look good to me. Let me trigger the CI.
aswinshakil
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test failures are not related to this change. LGTM+1
a355664
into
apache:HDDS-10239-container-reconciliation
|
Thanks for the review @aswinshakil. We can do a reverse merge of master into the feature branch to bring in #8443 which should resolve the test issues for future PRs. |
…anner-builds-mt * HDDS-10239-container-reconciliation: HDDS-12980. Add unit test framework for reconciliation. (apache#8402) hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerChecksumTreeManager.java hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerMerkleTreeWriter.java hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/ContainerController.java hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/checksum/ContainerMerkleTreeTestUtils.java hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/checksum/TestContainerMerkleTreeWriter.java hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestContainerReconciliationWithMockDatanodes.java hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/dn/checksum/TestContainerCommandReconciliation.java
What changes were proposed in this pull request?
Create a unit test framework that writes container data locally, and mocks an ozone client to operate directly on those containers. This will allow us to test multiple failure combinations across replicas using container scan and repair while running much faster than a miniozone integration test.
The major components of this change are:
TestKeyValueHandlerinto its own test class for unit testing scan and repair across replicas.As part of the review, it will be helpful to pull #7490, which also incorporates this change, and run the tests there where they will pass with the scanner changes. Inspect the new log messages to make sure that they help follow what reconciliation is doing in each of these situations. The 10 missing blocks + 5 corrupt chunks case is one of the most involved tests to inspect the output from.
What is the link to the Apache JIRA
HDDS-12980
How was this patch tested?
The tests depend on HDDS-10374/#7490 to run. For now the tests are ignored to split up the review of HDDS-10374/#7490. I have also pushed the test to that change and they are passing. Once this PR is merged we can start running the tests in HDDS-10374/#7490.