Non-deterministic test failure #556

Open
janosg opened this issue Apr 5, 2024 · 3 comments
Labels
testing: Writing and verifying tests (unit or otherwise)

Comments

@janosg (Collaborator) commented Apr 5, 2024

The test case test_shapley_batch_size[1-PermutationSampler-beta_coefficient_w-5-test_game0] sometimes fails due to a precision problem. The test output is:

x: array([0.171717, 0.23    , 0.272727, 0.646465, 0.686869])
y: array([0.171717, 0.222222, 0.272727, 0.646465, 0.68    ])

The failure seems to occur randomly (on the same machine) with a relatively low probability.

@mdbenito (Collaborator) commented Apr 5, 2024

Yes, the tests for Monte Carlo (MC) methods are flaky. At best we have (eps, delta) bounds on sample complexity, and those assume deterministic utilities. So tests will fail from time to time.

We have a fixture that allows a fraction of test runs to fail, but we are not really using it. One reason is that the fixture's guarantee is of the form "results are within eps precision, with probability 1-delta", so you need to run the test a bunch of times, which is obviously costly. Another is that even this is a probabilistic statement, so we would only reduce the number of failures, never eliminate them.
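
For illustration, a minimal sketch of the idea behind such a fixture: repeat the stochastic check and tolerate a delta fraction of misses. All names here (`check_with_confidence`, `run_once`, `n_runs`) are hypothetical, not the actual fixture API:

```python
import numpy as np


def check_with_confidence(run_once, n_runs=20, eps=0.05, delta=0.1):
    """run_once() returns (estimate, exact); tolerate a delta fraction of misses."""
    failures = 0
    for _ in range(n_runs):
        estimate, exact = run_once()
        # A run "fails" if any component deviates from the exact value by more than eps.
        if np.max(np.abs(np.asarray(estimate) - np.asarray(exact))) > eps:
            failures += 1
    assert failures <= delta * n_runs, (
        f"{failures}/{n_runs} runs exceeded eps={eps}, above the allowed delta={delta}"
    )
```

This makes the cost explicit: the guarantee only holds over many repetitions of the estimator.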

I guess a more pragmatic approach would be to allow one retry, ignoring the sample complexity bounds altogether. This would catch 99% of the failures and be good enough for regression tests.
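
As a sketch of that retry approach, assuming the pytest-rerunfailures plugin is installed (the `flaky` marker is that plugin's documented API; the toy test body is only illustrative):

```python
import random

import pytest


@pytest.mark.flaky(reruns=1)  # rerun once before reporting a failure
def test_monte_carlo_estimate():
    # Toy stand-in for a stochastic estimator; a real test would compare
    # Monte Carlo Shapley values against exact ones.
    estimate = 0.5 + random.gauss(0.0, 0.02)
    assert abs(estimate - 0.5) < 0.05
```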

@janosg added the testing label Apr 8, 2024
@jakobkruse1 (Collaborator) commented

This is really important if you ask me. This test is not the only one failing randomly; there are a few others as well. We need to fix this: it is really annoying to have to rerun failing tests. I agree with @mdbenito that we should add a retry fixture to all tests that involve some kind of randomness.

@mdbenito (Collaborator) commented

Note that we have a tolerate fixture already that can be used for this right away. (Related: #539)
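
For readers without the codebase at hand, here is a hypothetical, self-contained sketch of how such a `tolerate` helper could look and be used; the project's actual fixture may have a different interface:

```python
import random

import pytest


@pytest.fixture
def tolerate():
    # Hypothetical implementation: rerun a stochastic check and fail only
    # if it misses more than max_failures times.
    def _tolerate(check, runs=3, max_failures=1):
        failures = 0
        for _ in range(runs):
            try:
                check()
            except AssertionError:
                failures += 1
        assert failures <= max_failures, f"{failures}/{runs} runs failed"

    return _tolerate


def test_noisy_estimate(tolerate):
    def check():
        # Toy stochastic assertion standing in for a value comparison.
        assert abs(random.gauss(0.0, 0.02)) < 0.05

    tolerate(check, runs=3, max_failures=1)
```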
