chore(ci): require clear hyperfine regressions#9874

Merged
jdx merged 1 commit into jdx:main from risu729:chore/hyperfine-clear-regressions
May 17, 2026
Conversation

risu729 commented May 15, 2026

Contributor

Summary

  • keep the hyperfine workflow on the Namespace runner, but stop failing on measurements whose relative uncertainty still overlaps the 10% regression threshold
  • continue failing clear regressions, and continue reporting outlier/noisy results in the PR comment and annotations
  • parse the hyperfine reference/current commands once so custom MISE_ALT comparisons use the same improvement detection path

Investigation

The hyperfine workflow became much noisier after the Namespace runner migration in #9561 on 2026-05-03 15:44 UTC.

Completed, non-cancelled/action-required runs from 2026-04-25 through 2026-05-14:

Failed-log classification over the same window:

  • 80 perf-gate failures
  • 20 build/code failures
  • 2 runner SIGKILL failures
  • 1 follow-on comment failure after an earlier failure

After #9847, there were still 3 perf-gate failures in 27 completed runs. The current representative failure is not a deterministic benchmark regression: run 25871537929 failed on mise ls with 1.16 ± 0.17, so the measured range overlaps the 10% threshold. Run 25871443106 similarly failed on a 13% mise ls result without a hyperfine outlier warning. This PR keeps real regressions failing, but treats these statistically unclear cases as inconclusive instead of red CI.
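The overlap in run 25871537929 can be checked by hand. The following is only an illustrative sketch, not the workflow's code: the variable names are made up here, and it assumes the 10% threshold corresponds to a relative value of 1.10.

```shell
#!/usr/bin/env bash
# Illustrative check of the 1.16 ± 0.17 result against a 10% threshold.
# Variable names are hypothetical; the workflow itself may compute this differently.
relative=1.16      # hyperfine "Relative" value for mise ls
uncertainty=0.17   # the ± part of the same cell
threshold=1.10     # assumed encoding of the 10% regression threshold

verdict=$(awk -v r="$relative" -v u="$uncertainty" -v t="$threshold" 'BEGIN {
  lower = r - u
  upper = r + u
  printf "range %.2f..%.2f: ", lower, upper
  # If even the optimistic end of the range is at or below the threshold,
  # the measurement cannot rule out "no regression".
  if (lower <= t) print "inconclusive"; else print "clear regression"
}')
echo "$verdict"
```

The lower end of the range (0.99) sits below 1.10, which is why this run is treated as inconclusive rather than as a regression.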

Validation

  • sed -n '72,148p' .github/workflows/hyperfine.yml | sed 's/^ //' | sed 's/${{ steps\.versions\.outputs\.release }}/2026.5.7/g' | bash -n
  • mise x actionlint -- actionlint .github/workflows/hyperfine.yml
  • git diff --check --cached

This PR was generated by an AI coding assistant.

@gemini-code-assist

Contributor

Note

Gemini is unable to generate a review for this pull request due to the file types involved not being currently supported.

greptile-apps Bot commented May 15, 2026

Contributor

Greptile Summary

This PR relaxes the hyperfine CI gate: a benchmark run is now treated as inconclusive rather than red when its measured regression, minus the relative uncertainty, does not exceed the 10% threshold, while statistically clear regressions still fail. It also fixes a latent bug where the improvement-detection grep was hard-coded to the release version string and would have silently misfired for custom MISE_ALT comparisons.

  • Inconclusive guard: after extracting uncertainty from the hyperfine markdown row and scaling it with the new scale_decimal helper, the script checks variance_scaled − uncertainty_scaled ≤ threshold_scaled; only when this check fails does failed=true get set, preserving the existing noisy-result path above it.
  • Improvement detection fix: replaces grep -q "mise-<release>.*±.*±" out.md with grep -Fq "| \`$reference_cmd\` |" on the already-extracted relative_row, so the check works correctly whether MISE_ALT is set or not.
  • Robustness additions: missing or unparseable hyperfine relative rows now emit a ::warning annotation and a continue, preventing silent undefined-variable arithmetic.
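The inconclusive guard described in the list above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code from hyperfine.yml: the body of scale_decimal, the scale factor of 100, and the is_clear_regression wrapper are all hypothetical, and only the name scale_decimal and the shape of the comparison come from the summary.

```shell
#!/usr/bin/env bash
# Hypothetical reimplementation: turn a decimal string such as "1.16" into an
# integer scaled by 100 so that bash integer arithmetic can compare values.
scale_decimal() {
  local value=$1 whole frac
  # Reject empty or non-numeric input before any arithmetic happens.
  [[ $value =~ ^[0-9]+(\.[0-9]+)?$ ]] || return 1
  whole=${value%%.*}
  if [[ $value == *.* ]]; then frac=${value#*.}; else frac=""; fi
  frac="${frac}00"      # pad so that "1.5" becomes 150
  frac=${frac:0:2}      # keep two decimal places (truncating, not rounding)
  # 10# forces base 10 so that digits like "08" are not parsed as octal.
  echo $((10#$whole * 100 + 10#$frac))
}

# Hypothetical wrapper: succeeds only when the regression is still above the
# threshold after the uncertainty has been subtracted.
is_clear_regression() {
  local r u t
  r=$(scale_decimal "$1") || return 2
  u=$(scale_decimal "$2") || return 2
  t=$(scale_decimal "$3") || return 2
  ((r - u > t))
}

if is_clear_regression 1.16 0.17 1.10; then
  echo "clear regression: fail the job"
else
  echo "inconclusive: report it, but do not fail"
fi
```

Working in scaled integers avoids a dependency on bc or awk inside the gate; the trade-off in this sketch is that digits beyond the second decimal place are dropped.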

Confidence Score: 5/5

The change is purely additive CI logic that loosens a noisy benchmark gate; it cannot break builds or affect the mise binary itself.

All arithmetic paths are guarded by the empty-string checks at lines 119–124 before scale_decimal is ever called, so invalid input cannot reach the base-10 expansion. The NF-relative awk field extraction is correct regardless of how many whitespace-delimited tokens the command name occupies. The improvement-detection grep uses -F (fixed string) so special characters in version strings are not misinterpreted. The ordering of the noisy branch before the new uncertainty branch is intentional and correct. No production code changes are included.
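To make the NF-relative extraction and the fixed-string grep concrete, here is a small sketch run against a made-up table in the layout hyperfine's markdown export uses. The command names, timings, and the way relative_row is selected are assumptions for illustration; only the variable names relative_row and reference_cmd and the grep -Fq pattern come from the summary above.

```shell
#!/usr/bin/env bash
# Made-up sample in the layout of hyperfine's markdown export.
out_md=$(cat <<'EOF'
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `mise-ref ls` | 20.1 ± 0.4 | 19.5 | 21.2 | 1.00 |
| `mise ls` | 23.3 ± 3.2 | 20.0 | 31.8 | 1.16 ± 0.17 |
EOF
)
reference_cmd="mise-ref ls"   # hypothetical reference command

# Assumption: the slower command is the only row with two "±" signs
# (one in the Mean column, one in the Relative column).
relative_row=$(printf '%s\n' "$out_md" | grep -E '±.*±' | head -n 1)

# Count fields from the end of the row, so spaces inside the command name
# cannot shift the columns: "... | 1.16 ± 0.17 |" ends with NF-3, NF-2, NF-1, NF.
relative=$(printf '%s\n' "$relative_row" | awk '{ print $(NF-3) }')
uncertainty=$(printf '%s\n' "$relative_row" | awk '{ print $(NF-1) }')

# -F treats the pattern as a literal string, so dots in a version number
# are not interpreted as regex wildcards.
if printf '%s\n' "$relative_row" | grep -Fq "| \`$reference_cmd\` |"; then
  improvement=true    # the reference is the slower row
else
  improvement=false
fi
echo "relative=$relative uncertainty=$uncertainty improvement=$improvement"
```

In this sample the slower row belongs to the current command, so the run is a candidate regression rather than an improvement.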

No files require special attention.

Important Files Changed

Filename: .github/workflows/hyperfine.yml
Overview: Adds uncertainty-aware inconclusive detection to the hyperfine benchmark gate, extracts the scale_decimal helper, unifies reference/current command construction, and adds graceful handling for missing or malformed hyperfine output rows.

Reviews (1): Last reviewed commit: "chore(ci): require clear hyperfine regre..."

@risu729 risu729 marked this pull request as ready for review May 15, 2026 19:28
@jdx jdx merged commit 9fd7611 into jdx:main May 17, 2026
33 checks passed
@risu729 risu729 deleted the chore/hyperfine-clear-regressions branch May 17, 2026 15:30