HDDS-10300. InstallSnapshot may fail if OM metadata dir and OM DB dir are in different local storage partitions. #6226
Galsza
left a comment
Thank you for working on this. I've left 2 minor comments, please fix them
Please write a test case where we assert for the new functionality. (New file is in a different partition than the old file)
...p-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/OmSnapshotUtils.java
hemantk-12
left a comment
Thanks @aswinshakil for the quick fix.
Left some comments.
...p-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/OmSnapshotUtils.java
hemantk-12
left a comment
LGTM.
Regarding @Galsza's comment about adding a test: it would be great if it can be added as part of this change; otherwise, please create a follow-up task.
swamirishi
left a comment
@aswinshakil Thanks for the patch. It looks good to me overall, apart from a few nitpicky changes. A test case for this function would also be great.
| if (isSamePartition(fullFromPath.getParent(), fullToPath.getParent())) { | ||
| Files.createLink(fullToPath, fullFromPath); | ||
| } else { | ||
| Files.copy(fullFromPath, fullToPath, StandardCopyOption.REPLACE_EXISTING); |
Is REPLACE_EXISTING correct here? Can a situation occur where we are re-copying files? We shouldn't inadvertently delete data.
Can we do some kind of checksum verification if fullToPath already exists? For example:
- copy to tmp file
- Verify checksum if the files exist
- if checksums don't match raise exception.
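The three steps suggested above could be sketched roughly as follows. This is only an illustration, not code from the patch: the class name `VerifiedCopy`, the method `copyVerified`, and the choice of SHA-256 are all assumptions, and the real OM bootstrap path does no such verification.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, not part of OmSnapshotUtils.
final class VerifiedCopy {

  // Streams the file through a SHA-256 digest (algorithm choice is an assumption).
  static byte[] sha256(Path file) throws IOException {
    MessageDigest digest;
    try {
      digest = MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
    try (InputStream in = new DigestInputStream(Files.newInputStream(file), digest)) {
      byte[] buf = new byte[8192];
      while (in.read(buf) != -1) {
        // Reading drives the digest; the bytes themselves are not needed.
      }
    }
    return digest.digest();
  }

  // Copies to a temp file first; if the target already exists, it is kept
  // only when its checksum matches, otherwise an exception is raised.
  static void copyVerified(Path from, Path to) throws IOException {
    Path tmp = Files.createTempFile(to.getParent(), "copy-", ".tmp");
    try {
      Files.copy(from, tmp, StandardCopyOption.REPLACE_EXISTING);
      if (Files.exists(to)) {
        if (!Arrays.equals(sha256(tmp), sha256(to))) {
          throw new IOException("Checksum mismatch for existing file: " + to);
        }
        return; // Existing file is identical; nothing to replace.
      }
      Files.move(tmp, to);
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```

With this shape, an unexpected pre-existing target is never silently overwritten: it is either confirmed identical or reported.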
The file shouldn't exist at all; I'm doing a replace just as a precaution. I'm following the normal OM bootstrap process, and we don't do a checksum check there either.
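For readers following the thread, here is a minimal standalone sketch of the link-or-copy fallback under discussion. The `isSamePartition` body is an assumption (the patch's actual implementation is not shown here); it compares `FileStore` objects via `Files.getFileStore`. The sketch also deliberately omits `REPLACE_EXISTING`, so an unexpected existing target fails loudly instead of being overwritten, which is a variation on the patch rather than what it does.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical standalone class; names mirror the diff only for readability.
final class LinkOrCopy {

  // Assumed implementation: two directories share a partition when they
  // resolve to the same FileStore.
  static boolean isSamePartition(Path a, Path b) throws IOException {
    return Files.getFileStore(a).equals(Files.getFileStore(b));
  }

  static void linkOrCopy(Path from, Path to) throws IOException {
    if (isSamePartition(from.getParent(), to.getParent())) {
      // Hard links only work within one file system.
      Files.createLink(to, from);
    } else {
      // Without REPLACE_EXISTING this throws FileAlreadyExistsException
      // if the target is already present.
      Files.copy(from, to);
    }
  }
}
```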
| assertFalse(f1Link.exists()); | ||
|
|
||
| OmSnapshotUtils.linkFiles(tree1, tree2); | ||
| OmSnapshotUtils.linkFilesOrCopy(tree1, tree2); |
Please add a similar test case for the copy case, where the two files are in different partitions. You can achieve this by mocking the layer.
Mocking the isSamePartition function should help. As far as I know, you cannot mock the Files class through Mockito, but PowerMockito can do it.
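One way to exercise the copy branch without static mocking is to pass the partition check in as a parameter. The sketch below is hypothetical: the three-argument `linkFilesOrCopy` overload and the class name are assumptions for illustration, not the signature in OmSnapshotUtils.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.BiPredicate;

// Hypothetical testable variant; the real method takes no predicate.
final class LinkOrCopyTestable {

  static void linkFilesOrCopy(Path from, Path to,
      BiPredicate<Path, Path> samePartition) throws IOException {
    if (samePartition.test(from.getParent(), to.getParent())) {
      Files.createLink(to, from);
    } else {
      Files.copy(from, to);
    }
  }
}
```

A test can then force the "different partition" case with `(a, b) -> false` and assert that the target has the same bytes as the source while `Files.isSameFile(source, target)` is false, which distinguishes a copy from a hard link.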
|
Copying seems like a bad solution since we will lose efficiency of snapshots space and diff. Additionally there is no indication to the user that this will happen if they set up their configurations in this way, and the state machines, although functionally the same, will be different across OM replicas when only one node installs a snapshot. Why not use either of these simpler options:
|
|
@errose28 Thanks for taking a look at it. For the first approach, moving and then creating a hard link would be double the work. Instead, why not move to the correct directory and avoid the hard linking part altogether? I agree with the second point; I'm not sure why the snapshot download directory is on a different disk. Ideally, both should be on the same disk.
|
@errose28 regarding point#2
@aswinshakil, @swamirishi and I had a discussion about it and we don't see any problem if candidate dir just resides inside I looked into code and config for the snapshot dir is ozone.om.ratis.snapshot.dir and if it is not set, it falls back to If we can get to an agreement that |
|
I was talking about this with @aswinshakil and there is another problem we should consider: the size of the Ratis snapshot might be far greater than the size of the final OM DB. This is because the Ratis snapshot has "inflated" all the hardlinks, so if there are many filesystem snapshots on dense buckets, the OM DB device needs an undetermined amount of space greater than the current DB in order to install the ratis snapshot. If this space is not available and we remove the configs, there is no good way out of the situation. We already don't have great handling to make sure there is always room to install a ratis snapshot, but the hardlink inflation makes the problem worse. One idea to mitigate the problem:
|
|
Discussed with @hemantk-12 and @swamirishi offline, Right now This can happen in the same partition because we would be doing the same thing if we downloaded it in a different partition and moved that into OM DB's partition. Instead, download the checkpoint(candidate) in the OM DB partition. We need to do a few things, deprecate configs |
|
Regarding this previous comment, it turns out that the snapshot install only copies one copy of each SST file needed, and then builds the hardlinks from there. That means the Ratis snapshot will not be bigger than the OM DB and we should be able to put it right on the main DB disk without any more space concerns than what we already have. I'm +1 for deprecating the configs. |
|
@errose28 @aswinshakil Hi, the issue we encountered is similar, and we would like to know if this issue will continue to progress according to the discussion above? |
|
Hi @weimingdiit based on the discussion here I think we would like to simplify the snapshot install configuration by deprecating |
What changes were proposed in this pull request?
When we hard link files across different disks or partitions, we hit an "Invalid cross-device link" error, because hard links cannot be created between different disks. In this patch, instead of creating a hard link, we move the file to the target directory.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-10300
How was this patch tested?
Existing Tests