HDDS-8877. Intermittent failure in TestOzoneFileSystem#testListStatusOnKeyNameContainDelimiter #5093
jojochuang
left a comment
Overall looks fine. Unit tests passed, so I am +1. Some nits that are more cosmetic in nature.
Re-triggered the test to check the acceptance tests.
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
  if (key.startsWith(targetKey)) {
    if (!Objects.equals(key, targetKey)
-       && !isKeyDeleted(key, keyTable)) {
+       && keyTable.isExist(key)) {
@chungen0126 Thanks for working on this patch. However, instead of double-checking here, let the deleted key flow out of this method to the caller, where it can be filtered. I think the issue is that the cache is cleared after the delete operation, so isKeyDeleted does not report what we expect. Please check my patch here. We may not need this extra call to "keyTable.isExist(key)".
@devmadhuu Thanks for your review. Here is an issue reporting that creating a deleteSet leads to memory bloat. I'm not sure whether that is still a concern.
I wonder whether it is fine to add deleted keys to the TreeMap. Once that is resolved by the PR, I will remove my changes to listing keys from the DB.
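For readers following this thread, the idea of keeping deleted keys in the merged map and filtering them in the caller can be sketched roughly as below. This is only an illustration under stated assumptions: `TombstoneMerge`, `merge`, and `liveKeys` are hypothetical names, an empty `Optional` stands in for a "deleted in cache" marker, and none of it is taken from the actual Ozone code or from the other patch mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical sketch, not Ozone code: merge DB and cache views, filter deletions late. */
final class TombstoneMerge {

  private TombstoneMerge() { }

  /**
   * Cache entries override DB entries. An empty Optional is a tombstone,
   * meaning the key was deleted in the cache but may still be in the DB.
   */
  static TreeMap<String, Optional<String>> merge(
      Map<String, String> dbEntries,
      Map<String, Optional<String>> cacheEntries) {
    TreeMap<String, Optional<String>> merged = new TreeMap<>();
    dbEntries.forEach((key, value) -> merged.put(key, Optional.of(value)));
    merged.putAll(cacheEntries);  // tombstones replace stale DB values
    return merged;
  }

  /** The caller drops tombstones in one pass, with no extra table lookup per key. */
  static List<String> liveKeys(TreeMap<String, Optional<String>> merged) {
    List<String> live = new ArrayList<>();
    merged.forEach((key, value) -> {
      if (value.isPresent()) {
        live.add(key);
      }
    });
    return live;
  }

  public static void main(String[] args) {
    Map<String, String> db = Map.of("a/k1", "v1", "a/k2", "v2");
    Map<String, Optional<String>> cache = Map.of(
        "a/k2", Optional.<String>empty(),   // deleted, not yet flushed to DB
        "a/k3", Optional.of("v3"));         // created, not yet flushed to DB
    System.out.println(liveKeys(merge(db, cache)));  // prints [a/k1, a/k3]
  }
}
```

The trade-off raised in the thread still applies to a sketch like this: tombstones occupy memory in the merged map until the caller filters them out.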
Thanks @chungen0126 @devmadhuu.
Should we consider this PR superseded by #5244, since the latter fixes the same issue in a cleaner way?
I am not sure whether calling the isExist() API to check each key's existence until the target key is found will impact performance, as the cache is always faster. I applied the patch and ran multiple iterations of each CI job for the single test class TestOzoneFileSystem, and the time taken to complete the test increased slightly. Multiple test runs in CI are also still failing.
  String entryKey = entry.getKey();
  if (entryKey.startsWith(prefixKey)) {
-   if (!KeyManagerImpl.isKeyDeleted(entryKey, table)) {
+   if (table.isExist(entryKey)) {
getNextKey() is called on every hasNext() call, so here again we iterate over the table and check each key's existence. Is that acceptable from a performance point of view? One more point, per the isExist API doc:
A lock on the key / bucket needs to be acquired before invoking this API.
@chungen0126 If you can confirm #5244 fixes the same bug, we can close it. Thank you.
I guess this is a different problem, as it has also happened after the other fix was merged:
I think the other issue is also handled in PR #5252, but that one is still open.
Is this good to merge? If not, could we mark
Unfortunately these filesystem tests are implemented using JUnit4, but the Also, the test itself is probably OK, fix is proposed for production code.
This PR fixes the error in listing status from the cache, not the race condition in listStatus. The bug addressed in PR #5252 might be a different one. In master, listing status from the cache only skips the keys with a slash. To fix it, I create fakeDirs in some conditions when enableFileSystemPaths is false. When enableFileSystemPaths is false and I create a key with slashes called "dir1/dir2/key1", there will be a key called "dir1/dir2/key1" in the table. I also make the code clearer in
The fix makes sense to me. I'm curious why it is intermittent though.
@chungen0126 @jojochuang It seems #5399 also fixes the same problem and more. Can you please review that one?
Thanks a lot @chungen0126 for the patch. The bug is now fixed in HDDS-9347, so I'm closing this one. However, to give credit to you for your work on this PR, I have added you as co-author of 84fb0b4. Also thanks @devmadhuu, @jojochuang for the review.
What changes were proposed in this pull request?
When fileSystemPaths isn't enabled, listing status from the cache skips keys that contain a slash. To fix it, we need to create fakeDirs.
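To make the fakeDirs idea concrete, here is a minimal sketch of how immediate children, including synthetic directory entries, could be derived from flat key names such as "dir1/dir2/key1". It is an assumption-laden illustration only: `FakeDirListing` and `listImmediateChildren` are hypothetical names, and the real change in KeyManagerImpl may work quite differently.

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical helper, not part of Ozone: derives listing entries from flat key names. */
final class FakeDirListing {

  private FakeDirListing() { }

  /**
   * Returns the immediate children of dirPrefix ("" for the bucket root,
   * otherwise a prefix ending in "/"). A key with further path components
   * below the prefix contributes a synthetic directory entry ending in "/".
   */
  static SortedSet<String> listImmediateChildren(
      Iterable<String> keys, String dirPrefix) {
    SortedSet<String> children = new TreeSet<>();
    for (String key : keys) {
      if (!key.startsWith(dirPrefix) || key.equals(dirPrefix)) {
        continue;  // outside the listed directory, or the directory itself
      }
      int slash = key.indexOf('/', dirPrefix.length());
      if (slash < 0) {
        children.add(key);                          // file directly under the prefix
      } else {
        children.add(key.substring(0, slash + 1));  // fake directory entry
      }
    }
    return children;
  }

  public static void main(String[] args) {
    List<String> keys = List.of("dir1/dir2/key1", "dir1/key2", "key3");
    System.out.println(listImmediateChildren(keys, ""));            // [dir1/, key3]
    System.out.println(listImmediateChildren(keys, "dir1/"));       // [dir1/dir2/, dir1/key2]
    System.out.println(listImmediateChildren(keys, "dir1/dir2/"));  // [dir1/dir2/key1]
  }
}
```

With a helper along these lines, a key that contains slashes shows up as a directory entry at each intermediate level instead of being skipped.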
What is the link to the Apache JIRA
HDDS-8877
How was this patch tested?
All 10 * 10 runs of TestOzoneFileSystem succeeded.
https://github.com/chungen0126/ozone/actions/runs/6081955837