HDDS-2214. TestSCMContainerPlacementRackAware has an intermittent fai… #6
What changes were proposed in this pull request?
Fixing an intermittent unit test.
What is the problem
For example from the nightly build:
The problem is in testNoFallback:
Let's say we have 11 nodes (from the test parameter) and we would like to choose 5 nodes (hard-coded in the test).
Because the first two replicas are chosen from the same rack and all the others from different racks, this is not possible with only three racks, so we expect a failure.
But we also assert that the success count is at least 3. This holds only if the first two replicas are placed on rack1 (5 nodes) or rack2 (5 nodes). If the first replica is placed on rack3 (one node), the placement fails immediately:
Lucky case when we have success count > 3
The specific case when we have success count == 1, because the second replica cannot be placed on rack3 (this is when the test fails)
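The two cases above can be reproduced with a small stand-alone model. This is only a sketch of the rule as described in this PR (first two replicas on the same rack, every further replica on a rack not used yet, no fallback); it is not the actual SCMContainerPlacementRackAware code, and the class and method names are hypothetical.

```java
public class RackPlacementSketch {

    /**
     * Simplified model (an assumption, not the real placement policy):
     * counts how many replicas can be placed before the first failure.
     *
     * @param rackSizes number of nodes on each rack
     * @param firstRack index of the rack that receives the first replica
     * @param replicas  number of replicas requested
     */
    static int successCount(int[] rackSizes, int firstRack, int replicas) {
        if (replicas < 1 || rackSizes[firstRack] < 1) {
            return 0;
        }
        int placed = 1; // first replica lands on firstRack
        if (replicas == 1) {
            return placed;
        }
        // The second replica needs another node on the same rack.
        if (rackSizes[firstRack] < 2) {
            return placed;
        }
        placed++;
        boolean[] used = new boolean[rackSizes.length];
        used[firstRack] = true;
        // Every further replica needs a rack that has not been used yet.
        while (placed < replicas) {
            int next = -1;
            for (int r = 0; r < rackSizes.length; r++) {
                if (!used[r] && rackSizes[r] > 0) {
                    next = r;
                    break;
                }
            }
            if (next < 0) {
                break; // no unused rack left and no fallback allowed
            }
            used[next] = true;
            placed++;
        }
        return placed;
    }

    public static void main(String[] args) {
        int[] racks = {5, 5, 1}; // rack1, rack2, rack3 as in the 11-node case
        for (int first = 0; first < racks.length; first++) {
            System.out.println("first replica on rack" + (first + 1)
                + " -> success count " + successCount(racks, first, 5));
        }
    }
}
```

In this model, starting on rack1 or rack2 gives a success count of 4, while starting on rack3 gives 1, which is below the asserted minimum of 3.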
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-2289
How was this patch tested?
With IntelliJ you can execute the unit test multiple times (1000x) or until the next failure. Execute it with and without the patch. Without the patch I usually hit the problem within the first 100 executions.
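As a rough sanity check on the "first 100 executions" observation, the failure rate can be estimated. The sketch below assumes the first replica is picked uniformly at random among the 11 nodes and that the test fails exactly when that node is the single node on rack3; both are assumptions made for illustration, and the class and method names are hypothetical.

```java
public class FlakinessEstimate {

    /**
     * Probability of seeing at least one failure within the given number
     * of independent runs, assuming each run fails with probability
     * nodesOnSmallRack / totalNodes (an assumed, simplified model).
     */
    static double failureWithin(int totalNodes, int nodesOnSmallRack, int runs) {
        double perRun = (double) nodesOnSmallRack / totalNodes;
        return 1.0 - Math.pow(1.0 - perRun, runs);
    }

    public static void main(String[] args) {
        // 11 nodes in total, one of them on rack3.
        System.out.printf("after 10 runs:  %.4f%n", failureWithin(11, 1, 10));
        System.out.printf("after 100 runs: %.4f%n", failureWithin(11, 1, 100));
    }
}
```

Under these assumptions a single run fails about once in 11 executions, so a failure within 100 runs is very likely, which matches the behaviour described above.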