Shuffle statistical tests sometimes fail ... #291

KtorZ · 2019-05-21T07:41:30Z

Release	Operating System	Cause
next	Windows & OSX & Linux	Code

Context

We use statistical tests to verify that our shuffle function does a fair job of shuffling:

https://github.com/input-output-hk/cardano-wallet/blob/master/lib/core/test/unit/Cardano/Wallet/Primitive/CoinSelectionSpec.hs#L77-L83

However, QuickCheck sometimes fails to verify the property and simply gives up early, after trying to shuffle lists that are too short to be shuffled meaningfully.

Steps to Reproduce

Run the test many times

Expected behavior

The test should NEVER fail

Actual behavior

It fails eventually, once in a while

Resolution Plan

~~Try constraining the length of the list to a minimum. The properties are formulated with "non-trivial" lists in mind, so using it on empty list or singleton doesn't really help our cause here.~~
Removed the precondition on the list length (`length xs > 1 ==> ...`). It appears that QuickCheck generates quite a few empty lists, which made `checkCoverage` really hard to satisfy in practice, as it gives up after only a few cases. Using a proper generator that produces `NonEmptyList` values instead solved the issue nicely.
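For illustration, here is a minimal sketch of the two property shapes discussed above. It is not the code from `CoinSelectionSpec.hs`: the property names are hypothetical, QuickCheck's own `shuffle` generator stands in for the wallet's shuffle function, and the checked condition (shuffling preserves the elements) is a simplified placeholder for the real statistical property. The first property only reports how often plain `[Int]` generation yields empty or singleton lists.

```haskell
module Main where

import Data.List (sort)
import Test.QuickCheck
    ( NonEmptyList (..), Property, forAll, label, quickCheck, shuffle
    , (===), (==>) )

-- Hypothetical helper: name the size class of a generated list.
bucket :: Int -> String
bucket 0 = "empty"
bucket 1 = "singleton"
bucket _ = "two or more"

-- Reports how the default list generator distributes over size classes.
prop_lengthBuckets :: [Int] -> Property
prop_lengthBuckets xs = label (bucket (length xs)) True

-- Before (sketch): a precondition discards every list that is too short,
-- so those generated cases are thrown away instead of being tested.
prop_withPrecondition :: [Int] -> Property
prop_withPrecondition xs =
    length xs > 1 ==> forAll (shuffle xs) $ \ys ->
        sort ys === sort xs

-- After (sketch): the NonEmptyList generator never produces an empty list,
-- so no test case is discarded. Note that it still allows singletons.
prop_withGenerator :: NonEmptyList Int -> Property
prop_withGenerator (NonEmpty xs) =
    forAll (shuffle xs) $ \ys ->
        sort ys === sort xs

main :: IO ()
main = do
    quickCheck prop_lengthBuckets
    quickCheck prop_withPrecondition
    quickCheck prop_withGenerator
```

Running `main` prints the size-class distribution for the first property and, for the second, the number of discarded cases, which gives a rough feel for how much of the test budget the precondition consumes.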

PR

Number	Base
#295	`develop`

QA

Re-ran the tests locally a thousand times each without observing any failure (failures were occurring more than once every 50 runs before that), so I believe this is now fixed.
There have also been no CI failures related to this since the fix was merged.


289: show feature availability in API specification r=KtorZ a=KtorZ

# Issue Number

N/A

# Overview

- [ ] I have removed the "priority" from the specification and added a "status" indicating the availability of a particular feature in the API.

# Comments

As requested by Chris. Makes it easier for externals to know what's available or not. A preview:

![image](https://user-images.githubusercontent.com/5680256/58074137-bdc26f00-7ba4-11e9-95b5-33ff723e0f2a.png)
![image](https://user-images.githubusercontent.com/5680256/58074150-c6b34080-7ba4-11e9-83a5-943877e14fd7.png)
![image](https://user-images.githubusercontent.com/5680256/58074161-cfa41200-7ba4-11e9-8d83-0948b129e856.png)
![image](https://user-images.githubusercontent.com/5680256/58074177-dd599780-7ba4-11e9-867e-5047012b1f93.png)
![image](https://user-images.githubusercontent.com/5680256/58074188-e5193c00-7ba4-11e9-845c-cacfd34a9fc9.png)

295: fix shuffle tests in CoinSelectionSpec once and for all r=KtorZ a=KtorZ

# Issue Number

#291

# Overview

- [ ] I have removed the precondition about the list length in favor of a correct generator

# Comments

It turns out that the problem wasn't so much that the confidence interval was too strict, as I thought in the past, but simply that the precondition was too hard to satisfy. Indeed, QuickCheck generates empty lists quite often (more than 10% of the generated values, actually), and this caused `checkCoverage` to give up very early despite the coverage being okay. I've switched to using a generator of `NonEmptyList` instead of the precondition and re-run both statistical tests a thousand times:

```
replicateM_ 1000 $ quickCheck (checkCoverageWith lowerConfidence prop_shuffleNotDeterministic)
```

without observing any failure. So I've got quite some confidence that this is now fixed.

Co-authored-by: KtorZ <[email protected]>

piotr-iohk · 2019-05-22T11:12:19Z

👍

KtorZ added the BUG label May 21, 2019

KtorZ self-assigned this May 21, 2019

KtorZ mentioned this issue May 21, 2019

fix shuffle tests in CoinSelectionSpec once and for all #295

Merged

1 task

KtorZ added this to the Bugs - Sprint 19-20 milestone May 22, 2019

piotr-iohk closed this as completed May 22, 2019
