For each PR, our CI runs the criterion benchmarks on main and on the PR, compares the results, and posts a comment with the timing changes. Unfortunately, GitHub Actions runner performance is rather noisy, so e.g. in this change, which doesn't touch the linter at all, we still see seemingly large changes to linter performance.
criterion, the benchmarking framework we use, integrates statistical methods: for example, it computes p-values that tell you whether a change is statistically significant (vs. just noise), and it plots each run to allow for manual inspection. It would be great if we could extend CI to show more information, e.g. whether criterion considers the difference significant. I don't know if that is possible with the way GitHub handles artifacts, but exporting the criterion plots and linking them in the PR comment would also be helpful.
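To make the idea concrete, here is a rough sketch of how a CI script could label each benchmark before posting the comment. It is only an illustration under assumptions: I'm assuming criterion leaves a `change/estimates.json` per benchmark whose `mean` entry holds a `confidence_interval` with `lower_bound` and `upper_bound` for the relative change (worth checking against the criterion version we use). The function name `classify_change` and the 1% `noise_threshold` default are made up for this example. It also uses the confidence interval rather than the p-value, since I'm not sure the p-value is written to disk.

```python
import json


def classify_change(estimates, noise_threshold=0.01):
    """Label a relative change in mean time from a criterion-style estimates dict.

    Assumed layout (not verified against every criterion version):
    estimates["mean"]["confidence_interval"] has "lower_bound" and
    "upper_bound", expressed as fractions (0.05 means +5%).
    """
    ci = estimates["mean"]["confidence_interval"]
    lower, upper = ci["lower_bound"], ci["upper_bound"]

    # If the interval contains zero, the data is compatible with "no change".
    if lower <= 0.0 <= upper:
        return "no significant change"
    # The whole interval is below -threshold: the PR is faster.
    if upper < -noise_threshold:
        return "improved"
    # The whole interval is above +threshold: the PR is slower.
    if lower > noise_threshold:
        return "regressed"
    # Nonzero, but too small to distinguish from runner noise.
    return "within noise threshold"


# Hypothetical file contents, hand-written for illustration only.
sample = json.loads(
    '{"mean": {"confidence_interval": '
    '{"confidence_level": 0.95, "lower_bound": -0.03, "upper_bound": 0.05}, '
    '"point_estimate": 0.01}}'
)
print(classify_change(sample))
```

The PR comment could then show this label next to each timing change, so a reader can tell a noisy runner apart from a real regression at a glance.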