Shuffle statistical tests sometimes fail ... #291
iohk-bors bot added a commit that referenced this issue on May 21, 2019

289: show feature availability in API specification r=KtorZ a=KtorZ

# Issue Number

N/A

# Overview

- [ ] I have removed the "priority" from the specification and added a "status" indicating the availability of a particular feature in the API.

# Comments

As requested by Chris. Makes it easier for externals to know what's available or not. A preview:

![image](https://user-images.githubusercontent.com/5680256/58074137-bdc26f00-7ba4-11e9-95b5-33ff723e0f2a.png)
![image](https://user-images.githubusercontent.com/5680256/58074150-c6b34080-7ba4-11e9-83a5-943877e14fd7.png)
![image](https://user-images.githubusercontent.com/5680256/58074161-cfa41200-7ba4-11e9-8d83-0948b129e856.png)
![image](https://user-images.githubusercontent.com/5680256/58074177-dd599780-7ba4-11e9-867e-5047012b1f93.png)
![image](https://user-images.githubusercontent.com/5680256/58074188-e5193c00-7ba4-11e9-845c-cacfd34a9fc9.png)

295: fix shuffle tests in CoinSelectionSpec once and for all r=KtorZ a=KtorZ

# Issue Number

#291

# Overview

- [ ] I have removed the precondition about the list length in favor of a correct generator

# Comments

Turns out that the problem wasn't so much that the confidence interval was too strict, as I thought in the past, but simply that the precondition was too hard to satisfy. Indeed, QuickCheck generates empty lists quite often (more than 10% of the generated values, actually), and this caused `checkCoverage` to give up very early despite the coverage being okay. I've switched to using a generator of `NonEmptyList` instead of the precondition and re-run both statistical tests a thousand times:

```
replicateM_ 1000 $ quickCheck (checkCoverageWith lowerConfidence prop_shuffleNotDeterministic)
```

without observing any failure. So I've got quite some confidence that this is now fixed.

Co-authored-by: KtorZ <[email protected]>
Context
We use statistical tests to verify that our shuffle function does a fair job at... shuffling:
https://github.com/input-output-hk/cardano-wallet/blob/master/lib/core/test/unit/Cardano/Wallet/Primitive/CoinSelectionSpec.hs#L77-L83
However, QuickCheck sometimes fails to verify the property and simply gives up early, trying to shuffle a list that isn't much shuffled.
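To see what kind of lists QuickCheck actually feeds to such a property, a small diagnostic can help. The following is only a sketch, not code from `CoinSelectionSpec`: the property name `prop_listLengthDistribution` is hypothetical, and the element type `Int` is an assumption. It always passes and merely reports how the generated lists split between empty, singleton, and longer lists.

```haskell
import Test.QuickCheck

-- | Hypothetical diagnostic property (not part of the wallet test suite).
-- It always holds; only the distribution printed by QuickCheck matters.
-- Lists classified as "empty" or "singleton" are the ones that a
-- precondition such as `length xs > 1 ==> ...` would discard.
prop_listLengthDistribution :: [Int] -> Property
prop_listLengthDistribution xs =
    classify (null xs) "empty" $
    classify (length xs == 1) "singleton" $
    classify (length xs > 1) "non-trivial" $
    property True
```

Running `quickCheck prop_listLengthDistribution` in GHCi prints the percentage of test cases in each class, which gives a rough idea of how many cases a length precondition would throw away.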
Steps to Reproduce
Expected behavior
The test should NEVER fail
Actual behavior
It fails eventually, once in a while
Resolution Plan
Try constraining the length of the list to a minimum. The properties are formulated with "non-trivial" lists in mind, so using them on an empty list or a singleton doesn't really help our cause here.
Removed the precondition on the list length `length xs > 1 ==> ...`; it appears that QuickCheck generates quite a few empty lists, which made `checkCoverage` really hard to satisfy in practice, as it gives up after only a few cases. Instead, using a proper generator to generate a `NonEmptyList` solved the issue nicely.
PR
develop
QA
Re-ran the tests locally a thousand times each without observing any failure (failures were occurring more than once every 50 runs before that). So I believe this is now fixed.
Also no failure in CI about this since the fix was merged.
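The approach from the resolution plan, together with the repeated run used for QA, can be sketched roughly as follows. This is an illustration under assumptions, not the actual test: the property body is invented, QuickCheck's own `shuffle :: [a] -> Gen [a]` stands in for the wallet's shuffle function, the 50% coverage threshold is arbitrary, and `runRepeatedly` is a hypothetical helper. It also uses plain `checkCoverage` rather than `checkCoverageWith` with a custom confidence.

```haskell
import Control.Monad (replicateM_)
import Test.QuickCheck

-- | Hypothetical property (assumption, not the one in CoinSelectionSpec):
-- shuffling the same list twice should usually yield different orderings.
-- Taking a 'NonEmptyList' as input replaces a `==>` precondition, so no
-- generated test case is discarded. Elements are tagged with their index
-- so that all of them are distinct.
prop_shuffleNotDeterministic :: NonEmptyList Int -> Property
prop_shuffleNotDeterministic (NonEmpty xs) =
    forAll (shuffle tagged) $ \ys ->
    forAll (shuffle tagged) $ \zs ->
    cover 50 (ys /= zs) "shuffles differ" True
  where
    tagged = zip [0 :: Int ..] xs

-- | Hypothetical helper: repeat the coverage check a given number of times.
runRepeatedly :: Int -> IO ()
runRepeatedly n =
    replicateM_ n $ quickCheck (checkCoverage prop_shuffleNotDeterministic)
```

Calling `runRepeatedly 1000` mirrors the thousand local runs mentioned above. Singleton lists still reach the property in this sketch and can never differ after shuffling, which is why the coverage requirement is expressed as a percentage rather than as a hard assertion.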