Summary of Changes
Hello @ChenSammi, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly refactors the KeyLifecycleService within Ozone Manager to improve how lifecycle rules are applied to directories, particularly in File System Optimized (FSO) buckets. The changes introduce a more sophisticated, stack-based recursive evaluation mechanism, allowing for accurate and efficient processing of deeply nested directories. Additionally, the trash functionality has been refined to properly handle the movement of directories, ensuring that lifecycle policies are consistently applied across both keys and directories. New metrics are also included to provide better visibility into directory rename operations.
Code Review
This pull request significantly refactors the KeyLifecycleService to improve the handling of lifecycle rules for FSO buckets, particularly for nested directories and prefix-based rules. It introduces a more robust depth-first search mechanism using a stack to traverse directory structures, which should prevent out-of-memory issues with deep hierarchies. The changes also add support for moving directories to trash, a previously missing feature. The new logic appears more correct and is well-supported by extensive new tests. My review includes a couple of performance suggestions to optimize collection lookups within the new directory traversal logic.
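To illustrate the stack-based depth-first traversal mentioned above, here is a minimal sketch. It is not the code from this pull request: `DirNode` and `StackDfsSketch` are hypothetical stand-ins, the tree lives in memory rather than in Ozone Manager tables, and the walk is a simple pre-order visit with no lifecycle rule evaluation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a directory entry; not an Ozone class.
final class DirNode {
  final String name;
  final List<DirNode> children = new ArrayList<>();

  DirNode(String name) {
    this.name = name;
  }

  // Appends a child and returns this node so calls can be chained.
  DirNode add(DirNode child) {
    children.add(child);
    return this;
  }
}

final class StackDfsSketch {
  // Visits every directory depth-first using an explicit stack instead of
  // recursive method calls, so nesting depth does not grow the call stack.
  static List<String> walk(DirNode root) {
    List<String> visited = new ArrayList<>();
    Deque<DirNode> stack = new ArrayDeque<>();
    stack.push(root);
    while (!stack.isEmpty()) {
      DirNode dir = stack.pop();
      visited.add(dir.name);
      // Push children in reverse so the first child is popped first.
      for (int i = dir.children.size() - 1; i >= 0; i--) {
        stack.push(dir.children.get(i));
      }
    }
    return visited;
  }

  public static void main(String[] args) {
    DirNode root = new DirNode("bucket")
        .add(new DirNode("a").add(new DirNode("a1")))
        .add(new DirNode("b"));
    System.out.println(walk(root)); // prints [bucket, a, a1, b]
  }
}
```

A real implementation would likely also need per-directory state on the stack (for example, counts of expired children) so that a parent can be evaluated after its subtree has been processed; that bookkeeping is omitted here.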
return;
}

List<Long> deletedDirList = new ArrayList<>();
For better performance, consider using a HashSet<Long> instead of an ArrayList<Long> for deletedDirList (perhaps renaming it to deletedDirSet). The current implementation uses contains() and remove(Object) on an ArrayList on lines 537-540, which have O(n) time complexity. With a HashSet, these operations would be O(1) on average, which can be a significant improvement if the number of deleted directories is large.
You would also need to update its usage on lines 537-540 and 624. For example, lines 537-540 could be changed to:
for (OmDirectoryInfo subDir : subDirSummary.getSubDirList()) {
  if (deletedDirSet.remove(subDir.getObjectID())) {
    deletedDirCount++;
  }
}
Suggested change:
- List<Long> deletedDirList = new ArrayList<>();
+ Set<Long> deletedDirSet = new HashSet<>();
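The suggestion relies on `Set.remove` returning `true` only when the element was present, which lets a single call replace a `contains()` check followed by `remove(Object)`. The sketch below shows that pattern in isolation; `DeletedDirSetSketch`, `countAndRemoveDeleted`, and the sample IDs are hypothetical and stand in for the service's own fields and `OmDirectoryInfo` objects.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DeletedDirSetSketch {
  // Counts how many of the given sub-directory IDs were recorded as deleted,
  // removing each matched ID from the set as it goes.
  static int countAndRemoveDeleted(Set<Long> deletedDirSet, List<Long> subDirIds) {
    int deletedDirCount = 0;
    for (Long id : subDirIds) {
      // remove() returns true only if the ID was in the set, so no separate
      // contains() lookup is needed.
      if (deletedDirSet.remove(id)) {
        deletedDirCount++;
      }
    }
    return deletedDirCount;
  }

  public static void main(String[] args) {
    Set<Long> deletedDirSet = new HashSet<>(Arrays.asList(101L, 102L, 103L));
    int count = countAndRemoveDeleted(deletedDirSet, Arrays.asList(102L, 103L, 999L));
    System.out.println(count + " " + deletedDirSet); // prints 2 [101]
  }
}
```

One difference to keep in mind when switching collections: a `Set` silently collapses duplicate IDs, whereas a `List` keeps them. If the same object ID could be added more than once, the resulting count would change.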
// and fromKey is also in table
long numKeysUnderDir = 0;
long numKeysExpired = 0;
List<String> deletedKeyList = new ArrayList();
For better performance, consider using a HashSet<String> for deletedKeyList (perhaps renaming it to deletedKeySet). The contains() check on line 590 has O(n) complexity for an ArrayList, which could be slow if there are many deleted keys in the cache. A HashSet would provide O(1) average time complexity for this check.
You would also need to update its usage on lines 563 and 590.
Suggested change:
- List<String> deletedKeyList = new ArrayList();
+ Set<String> deletedKeySet = new HashSet<>();
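As a rough illustration of the membership check described above, the following sketch filters a batch of keys against a `HashSet` of deleted keys. `DeletedKeySetSketch`, `filterLive`, and the key names are hypothetical; the actual check on line 590 of the patch is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DeletedKeySetSketch {
  // Returns the keys that are not marked as deleted, preserving input order.
  static List<String> filterLive(List<String> keys, Set<String> deletedKeySet) {
    List<String> live = new ArrayList<>();
    for (String key : keys) {
      // HashSet.contains is O(1) on average; ArrayList.contains scans the list.
      if (!deletedKeySet.contains(key)) {
        live.add(key);
      }
    }
    return live;
  }

  public static void main(String[] args) {
    Set<String> deletedKeySet = new HashSet<>(Arrays.asList("dir/k1", "dir/k3"));
    List<String> keys = Arrays.asList("dir/k1", "dir/k2", "dir/k3", "dir/k4");
    System.out.println(filterLive(keys, deletedKeySet)); // prints [dir/k2, dir/k4]
  }
}
```

With n keys to check and m deleted keys, the list-based version performs on the order of n × m comparisons, while the set-based version performs about n hash lookups.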
This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.
Thank you for your contribution. This PR is being closed due to inactivity. If needed, feel free to reopen it.
What changes were proposed in this pull request?
Provide a one-liner summary of the changes in the PR Title field above.
It should be in the form of
HDDS-1234. Short summary of the change.
Please describe your PR in detail:
perspective not just for the reviewer.
the Jira's description if the jira is well defined.
issue investigation, github discussion, etc.
Examples of well-written pull requests:
What is the link to the Apache JIRA
Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull
request which starts with the corresponding JIRA issue number. (e.g. HDDS-XXXX. Fix a typo in YYY.)
(Please replace this section with the link to the Apache JIRA)
How was this patch tested?
(Please explain how this patch was tested. Ex: unit tests, manual tests, workflow run on the fork git repo.)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this.)