chore(ci): require clear hyperfine regressions by risu729 · Pull Request #9874 · jdx/mise

risu729 · 2026-05-15T05:48:37Z

Summary

Keep the hyperfine workflow on the Namespace runner, but stop failing on measurements whose relative uncertainty still overlaps the 10% regression threshold.
Continue failing clear regressions, and continue reporting outlier/noisy results in the PR comment and annotations.
Parse the hyperfine reference/current commands once so custom MISE_ALT comparisons use the same improvement detection path.

Investigation

The hyperfine workflow became much noisier after the Namespace runner migration in #9561 on 2026-05-03 15:44 UTC.

Completed runs (excluding cancelled and action-required ones) from 2026-04-25 through 2026-05-14:

before #9561 (chore(ci): use namespace runners for ci jobs): 10 failures / 320 runs = 3.1%
after #9561, before #9635 (chore(ci): make perf script robust to runner noise): 22 failures / 137 runs = 16.1%
after #9635, before #9629 (chore(ci): skip hyperfine comments without permission): 2 failures / 38 runs = 5.3%
after #9629: 69 failures / 484 runs = 14.3%

Failed-log classification over the same window:

80 perf-gate failures
20 build/code failures
2 runner SIGKILL failures
1 follow-on comment failure after an earlier failure

After #9847, there were still 3 perf-gate failures in 27 completed runs. The current representative failure is not a deterministic benchmark regression: run 25871537929 failed on mise ls with 1.16 ± 0.17, so the measured range overlaps the 10% threshold. Run 25871443106 similarly failed on a 13% mise ls result without a hyperfine outlier warning. This PR keeps real regressions failing, but treats these statistically unclear cases as inconclusive instead of red CI.
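As a rough worked check of the first run above, subtracting the reported uncertainty from the ratio gives a lower bound that sits below the 1.10 threshold. This is only a sketch: the variable names are illustrative and not taken from the workflow, and the values are scaled by 100 so that plain integer arithmetic suffices.

```shell
# Illustrative check for the reported `mise ls` result of 1.16 ± 0.17.
# All names here are hypothetical; values are scaled by 100.
ratio=116        # 1.16x relative to the reference
uncertainty=17   # ± 0.17
threshold=110    # 10% regression threshold, i.e. 1.10x

lower_bound=$((ratio - uncertainty))   # 116 - 17 = 99, i.e. 0.99x

if [ "$lower_bound" -le "$threshold" ]; then
  verdict="inconclusive"
else
  verdict="clear regression"
fi
echo "lower bound: ${lower_bound} -> ${verdict}"
# prints: lower bound: 99 -> inconclusive
```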

Validation

sed -n '72,148p' .github/workflows/hyperfine.yml | sed 's/^ //' | sed 's/${{ steps\.versions\.outputs\.release }}/2026.5.7/g' | bash -n
mise x actionlint -- actionlint .github/workflows/hyperfine.yml
git diff --check --cached

This PR was generated by an AI coding assistant.

gemini-code-assist · 2026-05-15T05:48:43Z

Note

Gemini is unable to generate a review for this pull request due to the file types involved not being currently supported.

greptile-apps · 2026-05-15T05:52:45Z

Greptile Summary

This PR refines the hyperfine CI gate so that a benchmark run is treated as inconclusive rather than red when its measured regression no longer exceeds the 10% threshold once the relative uncertainty is subtracted, while statistically clear regressions still fail. It also fixes a latent bug where the improvement-detection grep was hard-coded to the release version string and would have silently misfired for custom MISE_ALT comparisons.

Inconclusive guard: after extracting uncertainty from the hyperfine markdown row and scaling it with the new scale_decimal helper, the script checks variance_scaled − uncertainty_scaled ≤ threshold_scaled; only when this check fails does failed=true get set, preserving the existing noisy-result path above it.
Improvement detection fix: replaces grep -q "mise-<release>.*±.*±" out.md with grep -Fq "| \`$reference_cmd\` |" on the already-extracted relative_row, so the check works correctly whether MISE_ALT is set or not.
Robustness additions: missing or unparseable hyperfine relative rows now emit a ::warning annotation and a continue, preventing silent undefined-variable arithmetic.
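The inconclusive guard described above could look roughly like the following bash sketch. The function bodies are assumptions made for illustration: only the names scale_decimal, variance_scaled, uncertainty_scaled and threshold_scaled come from the summary, the scale factor of 100 is a guess, and is_clear_regression is a hypothetical wrapper that does not appear in the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical re-implementation; the real scale_decimal may differ.
# Converts a decimal string such as "1.16" to an integer scaled by 100 (116).
scale_decimal() {
  local value="$1" whole frac
  whole="${value%%.*}"
  if [[ "$value" == *.* ]]; then
    frac="${value#*.}"
  else
    frac=""
  fi
  frac="${frac}00"     # pad so there are at least two fractional digits
  frac="${frac:0:2}"   # keep exactly two digits (truncates, does not round)
  # 10# forces base 10 so digits such as "08" are not parsed as octal.
  echo $((10#${whole:-0} * 100 + 10#$frac))
}

# Hypothetical wrapper: succeeds only when the regression stays above the
# threshold even after the uncertainty has been subtracted.
is_clear_regression() {
  local variance_scaled uncertainty_scaled threshold_scaled
  variance_scaled="$(scale_decimal "$1")"
  uncertainty_scaled="$(scale_decimal "$2")"
  threshold_scaled="$(scale_decimal "$3")"
  ((variance_scaled - uncertainty_scaled > threshold_scaled))
}

if is_clear_regression 1.16 0.17 1.10; then
  echo "1.16 ± 0.17: clear regression"
else
  echo "1.16 ± 0.17: inconclusive"
fi
```

With these assumed definitions, 1.16 ± 0.17 against a 1.10 threshold gives 116 − 17 = 99, which is not above 110, so the result is reported as inconclusive; a result such as 1.50 ± 0.05 would still count as a clear regression.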

Confidence Score: 5/5

The change is purely additive CI logic that loosens a noisy benchmark gate; it cannot break builds or affect the mise binary itself.

All arithmetic paths are guarded by the empty-string checks at lines 119–124 before scale_decimal is ever called, so invalid input cannot reach the base-10 expansion. The NF-relative awk field extraction is correct regardless of how many whitespace-delimited tokens the command name occupies. The improvement-detection grep uses -F (fixed string) so special characters in version strings are not misinterpreted. The ordering of the noisy branch before the new uncertainty branch is intentional and correct. No production code changes are included.
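To illustrate the two parsing points above, here is a small sketch of NF-relative field extraction and a fixed-string match. The row and the reference command are invented for the example (shaped like a row from hyperfine's markdown export), and the exact awk and grep invocations in the workflow may differ.

```shell
# Made-up row in the shape of hyperfine's markdown export.
relative_row='| `mise-2026.5.7 ls` | 23.2 ± 3.1 | 19.8 | 31.0 | 1.16 ± 0.17 |'
reference_cmd='mise-2026.5.7 ls'

# Count fields from the end of the row, so extra whitespace-delimited tokens
# in the command name (which only shift fields at the start) do not matter.
ratio="$(printf '%s\n' "$relative_row" | awk '{ print $(NF-3) }')"
uncertainty="$(printf '%s\n' "$relative_row" | awk '{ print $(NF-1) }')"

# -F matches a fixed string, so the dots in the version number are literal
# characters rather than regex wildcards.
if printf '%s\n' "$relative_row" | grep -Fq "| \`$reference_cmd\` |"; then
  is_reference="yes"
else
  is_reference="no"
fi

echo "ratio=$ratio uncertainty=$uncertainty reference_row=$is_reference"
# prints: ratio=1.16 uncertainty=0.17 reference_row=yes
```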

No files require special attention.

Important Files Changed

Filename	Overview
.github/workflows/hyperfine.yml	Adds uncertainty-aware inconclusive detection to the hyperfine benchmark gate, extracts `scale_decimal` helper, unifies reference/current command construction, and adds graceful handling for missing or malformed hyperfine output rows.

chore(ci): require clear hyperfine regressions

89d3e91

risu729 marked this pull request as ready for review May 15, 2026 19:28

jdx merged commit 9fd7611 into jdx:main May 17, 2026
33 checks passed

mise-en-dev mentioned this pull request May 17, 2026

chore: release 2026.5.11 #9931

Merged

risu729 deleted the chore/hyperfine-clear-regressions branch May 17, 2026 15:30

jdx merged 1 commit into jdx:main from risu729:chore/hyperfine-clear-regressions
