Estimating mutation score by sampling #490

Closed
tomato42 opened this issue Nov 2, 2019 · 3 comments · Fixed by #491

Comments

@tomato42
Contributor

tomato42 commented Nov 2, 2019

I was thinking about issue #484, and I think there is a bit of a problem with it: what if the test coverage changes but the code does not? How can we tell whether the changes to test coverage are OK when no application code was changed?

Maybe cosmic-ray could execute a random selection of the tests and calculate a confidence interval for the result?

The formula I found uses:
p: proportion of mutants that survived
q: proportion of mutants that were killed (i.e. 1 - p)
n: number of mutants tested
N: total number of mutants
z: z-score (scaling factor for the given confidence level: 1.65 for 90%, 1.96 for 95%, 2.58 for 99%)
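
The z-scores listed above can be derived from the confidence level instead of being looked up. A minimal sketch using Python's standard `statistics.NormalDist`; the helper name `z_score` is made up for illustration:

```python
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided z-score for a confidence level given as a fraction (e.g. 0.95)."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # A two-sided interval leaves (1 - confidence) / 2 of the probability in each tail.
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_score(level):.3f}")
```

The 90% value comes out as roughly 1.645, which is commonly rounded to 1.65 as in the list above.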

So if I have 8000 mutants, test 40 of them at random, find that 20% of those survived, and want a 95% confidence interval for that 20%, I calculate:
sqrt((p * q)/n) * z * (1 - sqrt(n/N)) = sqrt(0.2 * 0.8 / 40) * 1.96 * (1 - sqrt(40/8000)) = 0.115

so by testing only 40 mutants, I know that the real survival rate for this test suite is 20% ± 11.5% (95% confidence)
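
The arithmetic above can be checked with a few lines of Python. This is only a sketch: `margin_from_post` implements the formula exactly as written in this comment, while `margin_textbook_fpc` is an addition for comparison that swaps in the finite population correction usually given in statistics textbooks, sqrt((N - n) / (N - 1)). Both function names are made up for illustration.

```python
import math


def margin_from_post(p: float, n: int, N: int, z: float) -> float:
    """Margin of error using the formula as written in this comment."""
    q = 1.0 - p
    return math.sqrt(p * q / n) * z * (1.0 - math.sqrt(n / N))


def margin_textbook_fpc(p: float, n: int, N: int, z: float) -> float:
    """Same standard error, but with the textbook finite population correction.

    Not part of the original comment; shown only for comparison.
    """
    q = 1.0 - p
    return math.sqrt(p * q / n) * z * math.sqrt((N - n) / (N - 1))


# 8000 mutants in total, 40 tested, 20% survived, 95% confidence.
print(round(margin_from_post(0.2, 40, 8000, 1.96), 3))  # 0.115
print(margin_textbook_fpc(0.2, 40, 8000, 1.96))
```

For these numbers the textbook correction gives a slightly wider interval than the formula in this comment.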

The nice thing is that if execution selected the tests at random, the estimate could simply be a switch to cr-report, based on total jobs vs. complete jobs and the already-calculated survival rate.
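
As a rough illustration of what such a switch might compute, the interval can be derived from just three counts. This is not cr-report's actual code; `Estimate` and `estimate_survival` are hypothetical names, and the formula is the one quoted in this comment.

```python
import math
from typing import NamedTuple


class Estimate(NamedTuple):
    survival_rate: float
    low: float
    high: float


def estimate_survival(total_jobs: int, complete_jobs: int, survived: int,
                      z: float = 1.96) -> Estimate:
    """Estimate the overall survival rate from a partial, randomly ordered run."""
    if not 0 < complete_jobs <= total_jobs:
        raise ValueError("need 0 < complete_jobs <= total_jobs")
    if not 0 <= survived <= complete_jobs:
        raise ValueError("need 0 <= survived <= complete_jobs")
    p = survived / complete_jobs
    # Formula from this comment; the last factor shrinks to 0 once every job is done.
    margin = (math.sqrt(p * (1.0 - p) / complete_jobs) * z
              * (1.0 - math.sqrt(complete_jobs / total_jobs)))
    # Clamp so the reported interval stays within [0, 1].
    return Estimate(p, max(0.0, p - margin), min(1.0, p + margin))


est = estimate_survival(total_jobs=8000, complete_jobs=40, survived=8)
print(f"{est.survival_rate:.1%} (95% CI: {est.low:.1%} to {est.high:.1%})")
```

The estimate is only meaningful if the completed jobs are a random sample of all jobs.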

@abingham
Contributor

abingham commented Nov 2, 2019 via email

@tomato42
Contributor Author

tomato42 commented Nov 2, 2019

I updated the question later: if cosmic-ray exec executed the test cases in random order, then cr-report could simply take the results from the DB and calculate the confidence interval (probably with a switch).

So it wouldn't be a new interceptor, but rather the ability to estimate results from a partial run (like in CI, where you can run the tests for 20-30 minutes and make do with what you've got).
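
To make the partial-run idea concrete, here is a small sketch of running jobs in random order under a time budget. It is not how cosmic-ray is implemented; `run_partial`, `run_job` and the `clock` parameter are hypothetical and exist only for illustration. Because the order is shuffled first, whatever finishes before the deadline is a random sample of all jobs.

```python
import random
import time
from typing import Callable, Dict, Iterable, Optional


def run_partial(job_ids: Iterable[int],
                run_job: Callable[[int], bool],
                budget_seconds: float,
                rng: Optional[random.Random] = None,
                clock: Callable[[], float] = time.monotonic) -> Dict[int, bool]:
    """Run jobs in random order until the time budget is used up.

    run_job(job_id) is assumed to return True if the mutant survived.
    """
    if rng is None:
        rng = random.Random()
    order = list(job_ids)
    rng.shuffle(order)  # random order, so any prefix is a random sample
    deadline = clock() + budget_seconds
    results: Dict[int, bool] = {}
    for job_id in order:
        if clock() >= deadline:
            break  # out of time: keep what we have
        results[job_id] = run_job(job_id)
    return results


if __name__ == "__main__":
    # Stand-in job runner: pretend every fifth mutant survives.
    outcomes = run_partial(range(8000), lambda job_id: job_id % 5 == 0,
                           budget_seconds=0.01)
    print(len(outcomes), "jobs finished,", sum(outcomes.values()), "survived")
```

The number of completed jobs, the number of survivors among them, and the total job count are then all that is needed to compute the interval.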

@tomato42
Contributor Author

tomato42 commented Nov 2, 2019

I've proposed a PR to implement it.

Here's one lecture that goes into error estimation: http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Confidence_Intervals/BS704_Confidence_Intervals_print.html

tomato42 added a commit to tomato42/cosmic-ray that referenced this issue Dec 29, 2023
for the estimation of the survival rate to be representative, the
sample must be random, so execute the tasks in random order

see sixty-north#490 and sixty-north#491
abingham pushed a commit that referenced this issue Jan 11, 2024
for the estimation of the survival rate to be representative, the
sample must be random, so execute the tasks in random order

see #490 and #491