Skip to content

Conversation

@hemantk-12
Copy link
Contributor

@hemantk-12 hemantk-12 commented May 8, 2024

What changes were proposed in this pull request?

After applying RATIS-2045, we saw lots of integration test failures as described in the HDDS-10696.

It is because of how MiniOzoneHACluster is set up. It doesn't set value for config ozone.om.db.dirs which OMDBCheckpointServlet is trying to access here.
So when ozone.om.db.dirs is not set, ServerUtils#getDBPath returns ozone.metadata.dirs. Because of that snapshotDir path: is getting evaluated to/Users/iamgroot/ozone-ws/ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-e919072d-914a-4de6-9bfc-5b610697b58a/ozone-meta/db.snapshots, and the test is trying to access that doesn't exist causing NoSuchFileException.
In the actual snapshotDir path is /Users/iamgroot/ozone-ws/ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-1146c9db-bf52-4c4b-94c9-131053141484/omNode-1/db.snapshots which is inside the ozone data directory.

To fix the issue, I changed the way we are trying to get the snapshotDir. So rather than creating one, it will get from RDBStore after the change dir should exist by default after snapshot feature.

Also removed Flaky annotation from the test because jira: HDDS-7880 has been resolved after RATIS-1960.

What is the link to the Apache JIRA

HDDS-10696

How was this patch tested?

Applied the patch on top of the changes where we saw the failure. And verify by running the full workflow and flaky-test-workflow 5x5 time.

Set<Path> snapshotPaths = waitForSnapshotDirs(checkpoint);
Path snapshotDir = Paths.get(OMStorage.getOmDbDir(getConf()).toString(),
OM_SNAPSHOT_DIR);
Path snapshotDir = Paths.get(getSnapshotsParentDir()).getParent();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hemantk-12 , Why the parentDir of the getSnapshotsParentDir is the snapshotDir? Is it a typo?

Copy link
Contributor Author

@hemantk-12 hemantk-12 May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is the naming issue in getSnapshotsParentDir.
I renamed and changed the function to return snapshotDir path directly.
I also added a comment explaining it.

Please take another look.

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hemantk-12 , thanks for clarifying it.

+1 the change looks good.

@adoroszlai adoroszlai changed the title HDDS-10696. Fixed ozone integration test fails because of empty snapshot installation HDDS-10696. Fix test failure caused by empty snapshot installation May 9, 2024
@adoroszlai adoroszlai merged commit 3f1a7ed into apache:master May 9, 2024
@adoroszlai
Copy link
Contributor

Thanks @hemantk-12 for the patch, @szetszwo for the review.

jojochuang pushed a commit to jojochuang/ozone that referenced this pull request May 29, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 18, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants