-
Notifications
You must be signed in to change notification settings - Fork 590
HDDS-9967. Intermittent failure in TestOzoneRpcClientAbstract.testListSnapshot. #5970
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@sumitagrawl @adoroszlai Pls review. |
hemantk-12
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @devmadhuu for the patch and finding out the root cause.
Overall looks good to me, left some inline comments.
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ListIterator.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ListIterator.java
Outdated
Show resolved
Hide resolved
sumitagrawl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devmadhuu thanks for working over this. Have few comments.
Additionally, can avoid returning null value in next() with some optimization.
swamirishi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch @devmadhuu Overall the changes look good to me I had some nitpicky comment in the one place, check and see if it makes sense.
| return cacheKeyMap.containsKey(key); | ||
| } | ||
|
|
||
| private void getNextKey() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: instead of creating a new function, we can just remove null values when initializing the cacheCreatedKeyIterator.
cacheCreatedKeyIter = cacheKeyMap.entrySet().stream().filter(e -> e.getValue() != null).iterator();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can not remove null value in cache iterator initialization as this is requried while checking db entry with cache if element is deleted or not (as deletion is marked with null value)
Optimization is done not to return null value while checking hasNext() or next(), so caller need not do check with null value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cacheCreatedKeyIter = cacheKeyMap.entrySet().stream().filter(e -> e.getValue() != null).iterator();
This should not remove any element from the map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @swamirishi is right. doesKeyExistInCache checks if key exist in cacheKeyMap while cacheCreatedKeyIter can have filtered result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this filtering can be done while initializing iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hemantk-12 @swamirishi @sumitagrawl for review. This has been handled as per suggestion. Kindly re-review. Once changes finalized. I'll re-run the repeated CI run.
hemantk-12
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me.
| private final String startKey; | ||
| private final String tableName; | ||
|
|
||
| private HeapEntry currentKey; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: currentEntry and getCurrentEntry()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| CacheIter cacheIter = new CacheIter<>(iteratorId, table.getName(), | ||
| table.cacheIterator(), startKey, prefixKey); | ||
| Predicate<String> isKeyExistInCache = cacheIter::isKeyExistInCache; | ||
| Predicate<String> doesKeyExistInCache = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: JFYI, now char limit is 120. It can be fit in one line. Configure/update your IDE with latest ozone-style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok done.
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ListIterator.java
Outdated
Show resolved
Hide resolved
sumitagrawl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devmadhuu Overall looks good, have one minor comment
sumitagrawl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devmadhuu LGTM +1
|
@swamirishi @hemantk-12 Kindly re-review. |
| private void getNextKey() throws IOException { | ||
| while (cacheCreatedKeyIter.hasNext() && currentKey == null) { | ||
| Map.Entry<String, Value> entry = cacheCreatedKeyIter.next(); | ||
| if (null == entry.getValue()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sumitagrawl @devmadhuu I am surely missing something here. Correct me if I am wrong, we are doing a continue here as per this statement which will skip the null value here. So all I am asking is when we are initializing cacheCreatedKeyIter can we skip null values? The key would be still there in the cacheKeyMap. It will make this code much simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sumitagrawl @devmadhuu I am surely missing something here. Correct me if I am wrong, we are doing a continue here as per this statement which will skip the null value here. So all I am asking is when we are initializing cacheCreatedKeyIter can we skip null values? The key would be still there in the cacheKeyMap. It will make this code much simpler.
cacheKeyMap: This is used for 2 porpose,
- check if latest update available in cache compared to DB,
- Value if not null: latest value from cache
- Value if null: represent data is deleted and
dbIteratorcheck for this also
Now if we remove null value while initialize, then how DBIterator knows if keys are deleted? So this is required if key is deleted to skip from db iterator and hence we need this information.
- retrieve data in getNextKey() to caller.
Here, null value is not required to be returned while iterator, so we can skip as above.
@swamirishi I think this will help to understand why we can not remove null value during initialization. else if any key deleted, this can not be know via dbIterator (as may not yet flushed to db with delete) and may give deleted key details to user.
hemantk-12
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM+1.
swamirishi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a minor nitpicky comment on the iterator initialization. You can make the changes if you feel it is good.
| return cacheKeyMap.containsKey(key); | ||
| } | ||
|
|
||
| private void getNextKey() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cacheCreatedKeyIter = cacheKeyMap.entrySet().stream().filter(e -> e.getValue() != null).iterator();
This should not remove any element from the map.
I think above explaination gives why null value key is requried and help to avoid returning key which is deleted in cache. |
|
@sumitagrawl @swamirishi Pls re-review. |
Thanks @swamirishi for your review. Changes done as suggested. Kindly re-review. |
sumitagrawl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
hemantk-12
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @devmadhuu for the patch and incorporating the review comments.
LGTM.
swamirishi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devmadhuu Thanks for the changes LGTM
|
Thanks @devmadhuu for the patch, @hemantk-12, @sumitagrawl, @swamirishi for the review. |
…tSnapshot. (apache#5970) (cherry picked from commit fa98426) (cherry picked from commit 42e8540f3f3fcfadfbc4e66579fa2465a2eb3725)
…ract.testListSnapshot. (apache#5970) (cherry picked from commit fa98426) Change-Id: Id59434644bfb9bbbeb2515dd4ecdd1bdad90fef7
What changes were proposed in this pull request?
This PR fixes the Intermittent failure in
TestOzoneRpcClientAbstract.testListSnapshot.TestOzoneRpcClientAbstract.testListSnapshotcreates 20 snapshots usingcreateSnapshotAPI and asserts the count of same usinglistSnapshotAPI. This assertion of count fails intermittently.listSnapshotAPI uses theorg.apache.hadoop.ozone.om.ListIterator.MinHeapIteratorwhich internally uses both CacheIterator and DBIterator and DBIterator had the logic of checking if rocks DB key is present in cache inorg.apache.hadoop.ozone.om.ListIterator.DbTableIter#getNextKey, this checks the cache from table cache which may be intermittently flushed which makes the addition of duplicate entry inorg.apache.hadoop.ozone.om.ListIterator.MinHeapIterator. So to fix this, we should use the pre-loaded keys inorg.apache.hadoop.ozone.om.ListIterator.CacheIter#cacheKeyMapinorg.apache.hadoop.ozone.om.ListIterator.CacheIter.What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-9967
How was this patch tested?
This patch is tested by repeated CI runs around 500 iterations and ZERO failure reported for this test case. Here is the green CI link.