HDDS-13770. SstBackup Directory can have orphan files after bootstrap not present in compaction log table (#9169)
Conversation
The previous implementation had a race condition: a compaction could occur after a checkpoint was created but before SST files were linked for backup. This could leave orphan SST files in the backup directory that were not tracked in the compaction log. This commit fixes the issue by reading the list of SST files to be backed up directly from the `compactionLogTable` within the database checkpoint, which ensures that only the SST files that are part of the consistent checkpoint are included in the backup.
Changes:
- Added a new public method `getCompactionLogSstFiles` to `RocksDBCheckpointDiffer` to read the compaction log from a checkpoint.
- Modified `OMDBCheckpointServletInodeBasedXfer.createAndPrepareCheckpoint` to use the new method and link only the necessary SST files.
- Eliminated the redundant copy of the compaction log directory.
Change-Id: I82a2bb3b28bb191adfdc9c95a691042f3c9e38b8
Change-Id: Iab58267cd61795f7b8ed12febfed2064d3b1d251
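The core of the fix can be sketched as follows. This is a minimal, self-contained illustration, not the actual Ozone code: `LogEntry` is a hypothetical stand-in for `CompactionLogEntry`, and the real `getCompactionLogSstFiles` additionally opens the checkpoint's RocksDB in read-only mode and iterates the `compactionLogTable` column family to obtain these entries.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CompactionLogSstFiles {

  // Hypothetical stand-in for CompactionLogEntry: each compaction records
  // the names of its input and output SST files.
  static final class LogEntry {
    final List<String> inputFiles;
    final List<String> outputFiles;

    LogEntry(List<String> inputFiles, List<String> outputFiles) {
      this.inputFiles = inputFiles;
      this.outputFiles = outputFiles;
    }
  }

  // Collect every SST file referenced by the compaction log. A Set is used
  // so a file appearing in several entries (e.g. as output of one compaction
  // and input of the next) is only recorded once.
  static Set<String> getCompactionLogSstFiles(List<LogEntry> entries) {
    Set<String> sstFiles = new LinkedHashSet<>();
    for (LogEntry entry : entries) {
      sstFiles.addAll(entry.inputFiles);
      sstFiles.addAll(entry.outputFiles);
    }
    return sstFiles;
  }

  public static void main(String[] args) {
    List<LogEntry> log = List.of(
        new LogEntry(List.of("000010.sst", "000011.sst"), List.of("000012.sst")),
        new LogEntry(List.of("000012.sst"), List.of("000013.sst")));
    System.out.println(getCompactionLogSstFiles(log));
  }
}
```

Because the file list is taken from the checkpoint itself rather than from the live backup directory, a compaction that races with checkpoint creation cannot introduce files that the compaction log table does not know about.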
Pull Request Overview
This PR fixes an issue where the SstBackup directory could contain orphan files after bootstrap that are not present in the compaction log table. The solution opens snapshot RocksDB in read-only mode to get the accurate list of SST files from the compaction log table and uses that as the reference to create hard links.
Key changes:
- Modified checkpoint creation to read SST file list directly from the compaction log table in the checkpoint
- Added a new method to extract SST files from a checkpoint's compaction log in read-only mode
- Replaced directory-wide file copying with selective hard link creation based on compaction log entries
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| OMDBCheckpointServletInodeBasedXfer.java | Changed checkpoint preparation to selectively link only SST files present in the compaction log instead of linking all files from backup directory |
| TestOMDbCheckpointServlet.java | Added blank line (formatting change only) |
| RocksDBCheckpointDiffer.java | Added new method getCompactionLogSstFiles to read SST file list from checkpoint's compaction log table in read-only mode |
| "directory {}", sstFile, getSstBackupDir()); | ||
| } | ||
| } | ||
| } catch (Exception e) { |
Catching generic Exception is too broad. Catch specific exceptions like RocksDBException or IOException to handle different failure scenarios appropriately and avoid masking unexpected errors.
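The suggested narrowing can be illustrated with a small stdlib-only sketch (the method name is hypothetical; in the real code the specific types would be RocksDBException and IOException):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class NarrowCatchDemo {

  // Catch only the exception types the operation can actually throw, so
  // unexpected RuntimeExceptions (NullPointerException etc.) still propagate
  // instead of being silently swallowed by a catch (Exception e) block.
  static boolean tryReadSstFile(Path sstFile) {
    try {
      Files.readAllBytes(sstFile);
      return true;
    } catch (IOException e) { // expected failure: missing or unreadable file
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(tryReadSstFile(Path.of("does-not-exist.sst")));
  }
}
```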
  logEntry.getInputFileInfoList().forEach(f ->
      sstFiles.add(f.getFileName()));
  logEntry.getOutputFileInfoList().forEach(f ->
      sstFiles.add(f.getFileName()));
Using List.add() in a loop can be inefficient if the list grows large. Consider using a Set for sstFiles to automatically handle duplicates and improve lookup performance, or use addAll() if the file info lists can be converted to filename collections.
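The reviewer's suggestion can be sketched as follows (names are hypothetical stand-ins): collecting the filenames into a Set via `addAll` both deduplicates entries that appear as input of one compaction and output of another, and makes any later membership check constant-time.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SstFileSetDemo {

  // addAll() into a Set replaces the per-element add() loop and silently
  // drops filenames that occur in more than one compaction log entry.
  static Set<String> collect(List<String> inputFiles, List<String> outputFiles) {
    Set<String> sstFiles = new HashSet<>();
    sstFiles.addAll(inputFiles);
    sstFiles.addAll(outputFiles);
    return sstFiles;
  }

  public static void main(String[] args) {
    Set<String> files =
        collect(List.of("000020.sst"), List.of("000020.sst", "000021.sst"));
    System.out.println(files.size());                 // duplicate collapsed
    System.out.println(files.contains("000021.sst")); // O(1) membership test
  }
}
```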
swamirishi left a comment
Thanks for the patch @jojochuang. I have posted comments inline.
  for (String sstFile : sstFiles) {
    Path sstFileToLink = getSstBackupDir().resolve(sstFile);
    if (Files.exists(sstFileToLink)) {
      Files.createLink(tmpSstBackupDir.resolve(sstFile), sstFileToLink);
We don't need to create SST file links; we can directly read the file and write it to the tar archive stream.
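The idea in this comment — skip the intermediate hard link and stream the file's bytes straight into the archive output — can be sketched with the standard library. Here a plain `OutputStream` backed by a byte array stands in for the tar archive stream; the real code would write a tar entry header before the file contents.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class StreamToArchiveDemo {

  // Instead of Files.createLink(tmpDir.resolve(name), source) followed by
  // archiving the link, copy the source file's bytes directly into the
  // archive stream. No temporary link is created or cleaned up afterwards.
  static long writeFileToArchive(Path source, OutputStream archiveOut)
      throws IOException {
    return Files.copy(source, archiveOut); // returns the number of bytes copied
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("demo", ".sst");
    Files.write(tmp, new byte[] {1, 2, 3, 4});
    ByteArrayOutputStream archive = new ByteArrayOutputStream(); // tar stand-in
    long copied = writeFileToArchive(tmp, archive);
    System.out.println(copied);
    Files.delete(tmp);
  }
}
```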
Change-Id: Ia23af3fea433c58ef9342d62cdfc90ef8678ae99
swamirishi left a comment
@jojochuang Thanks for addressing the review comments. I just have a few more; once those are done, it should be good to merge.
Change-Id: I7a32ce946963aec07167d7e7e6036a749af5b612
@jojochuang there are a couple more review comments.
Change-Id: I172095c5bd264cbe5327a2115c08d9070331acc0
  while (iterator.hasNext()) {
    CompactionLogEntry logEntry = iterator.next().getValue();
    logEntry.getInputFileInfoList().forEach(f ->
        sstFiles.add(Paths.get(f.getFileName())));
The path would only have the filename here, right? i.e. something like "x.sst". So when the file stream is passed to the writeDBToArchive method, I believe the absolute path wouldn't be available, so the file won't be copied to the tarball.
Thanks @sadanand48 good catch. Can you check again now?
Yes, looks good now. As Swami suggested, I will test this part in #9132; I need to make changes there.
Change-Id: I52fa1333d85958fc6ed4da4596b97a249bc111e6
Change-Id: Icb404ab9f70624d16661a74095819433d2217401
Change-Id: I72a8373eb17f9f4cdc8597d0c93a3707e2ecd399
LGTM @jojochuang @sadanand48. We should add some unit test cases, which are clearly missing for this race condition. I was thinking we could create a mocked RDBStore, create dummy files, and track all the files being written via a mocked static function linkAndIncludeFile(File file, String entryName, ArchiveOutputStream archiveOutput, Path tmpDir).
Let us do this as part of #9132.
…ne/om/OMDBCheckpointServletInodeBasedXfer.java Co-authored-by: Swaminathan Balachandran <swamirishi.sb@gmail.com>
sadanand48 left a comment
Thanks for helping with this patch @jojochuang, LGTM.
Merged. Thanks @swamirishi @sadanand48
What changes were proposed in this pull request?
HDDS-13770. SstBackup Directory can have orphan files after bootstrap not present in compaction log table
Please describe your PR in detail:
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-13770
How was this patch tested?
Added unit tests.