feat(github): add matching and matching_regex asset options#10325
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (11)
🚧 Files skipped from review as they are similar to previous changes (7)
📝 WalkthroughWalkthroughAdds substring ChangesMulti-binary asset selection filtering
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
docs/dev-tools/backends/forgejo.md (1)
15-15:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winRemove the duplicated
bin=option in the example.The inline command repeats the same key twice, which makes the example misleading and may break copy/paste usage.
🛠️ Suggested fix
-$ mise use -g forgejo:forgejo/runner[api_url=https://code.forgejo.org/api/v1,bin=forgejo-runner,bin=forgejo-runner] +$ mise use -g forgejo:forgejo/runner[api_url=https://code.forgejo.org/api/v1,bin=forgejo-runner]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/dev-tools/backends/forgejo.md` at line 15, The example command duplicates the bin= option in the inline example (`$ mise use -g forgejo:forgejo/runner[api_url=...,bin=forgejo-runner,bin=forgejo-runner]`); remove the repeated `bin=forgejo-runner` so the bracketed options only include each key once (e.g., keep `api_url=...` and a single `bin=forgejo-runner`) to make the example correct and safe to copy/paste.docs/dev-tools/backends/github.md (1)
61-183:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix the new anchor fragments in the docs.
The new intra-page links use hyphenated fragments like
#asset-patternand#matching-regex, but the corresponding headings are named with underscores (asset_pattern,matching_regex). markdownlint already flagged these, and they likely won’t resolve in the rendered docs. Please update the fragments consistently in GitHub, GitLab, and Forgejo.🔧 Example fix
-[`matching_regex`](`#matching-regex`) +[`matching_regex`](`#matching_regex`)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/dev-tools/backends/github.md` around lines 61 - 183, The intra-page anchor fragments currently use hyphens (e.g. `#asset-pattern`, `#matching-regex`) but the corresponding headings are named with underscores (asset_pattern, matching_regex); update the documentation so anchors and headings match consistently: either rename the headings to hyphenated forms (asset-pattern, matching-regex) or change all fragment links to use underscores (`#asset_pattern`, `#matching_regex`), and apply the same consistent fix for other headings mentioned (e.g. version_prefix -> version-prefix if you choose hyphens) across GitHub, GitLab, and Forgejo renderings to ensure links resolve.Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/backend/github.rs`:
- Around line 516-521: The pre-validation call to
asset_matcher::validate_matching_regex runs even when install is using the
asset_pattern branch, causing installs to fail on ignored/invalid regex; change
UnifiedGitBackend::install_version_ (the function containing the current call to
asset_matcher::validate_matching_regex) to only validate matching_regex when
asset_pattern is not present (e.g., check opts.asset_pattern() / equivalent and
skip validate_matching_regex if an asset_pattern is supplied), and add a
regression test that exercises "asset_pattern + matching_regex=invalid" to
assert the install succeeds (or at least does not hard-fail) when asset_pattern
takes precedence.
---
Outside diff comments:
In `@docs/dev-tools/backends/forgejo.md`:
- Line 15: The example command duplicates the bin= option in the inline example
(`$ mise use -g
forgejo:forgejo/runner[api_url=...,bin=forgejo-runner,bin=forgejo-runner]`);
remove the repeated `bin=forgejo-runner` so the bracketed options only include
each key once (e.g., keep `api_url=...` and a single `bin=forgejo-runner`) to
make the example correct and safe to copy/paste.
In `@docs/dev-tools/backends/github.md`:
- Around line 61-183: The intra-page anchor fragments currently use hyphens
(e.g. `#asset-pattern`, `#matching-regex`) but the corresponding headings are named
with underscores (asset_pattern, matching_regex); update the documentation so
anchors and headings match consistently: either rename the headings to
hyphenated forms (asset-pattern, matching-regex) or change all fragment links to
use underscores (`#asset_pattern`, `#matching_regex`), and apply the same consistent
fix for other headings mentioned (e.g. version_prefix -> version-prefix if you
choose hyphens) across GitHub, GitLab, and Forgejo renderings to ensure links
resolve.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: f36a5bb0-11bd-40be-a681-96827d370da2
📒 Files selected for processing (10)
docs/dev-tools/backends/forgejo.mddocs/dev-tools/backends/github.mddocs/dev-tools/backends/gitlab.mddocs/dev-tools/backends/ubi.mde2e/backend/test_forgejo_matchinge2e/backend/test_github_matchinge2e/backend/test_github_tool_alias_matchinge2e/backend/test_gitlab_matchingsrc/backend/asset_matcher.rssrc/backend/github.rs
Greptile SummaryThis PR adds
Confidence Score: 4/5Safe to merge; the filter logic, precedence rules, and fail-closed error paths are well-exercised by unit and e2e tests. The previously-flagged resolve_lock_info bug is correctly fixed. The remaining provenance-picker tie-break does not affect binary selection — only which of two identically-scored provenance files is used for SLSA verification — and the scenario requires two per-binary provenance files that both survive the matching filter, which is an uncommon release layout. src/backend/asset_matcher.rs — the pick_best_provenance stable-sort + first() tie-break is the one spot that could produce a non-deterministic result when two filtered provenance files score identically. Important Files Changed
Reviews (3): Last reviewed commit: "Merge branch 'main' into feat/github-mat..." | Re-trigger Greptile |
d4ec110 to
9fe184b
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
docs/dev-tools/backends/ubi.md (1)
6-9: ⚡ Quick winCapitalize "GitHub" in prose references.
Lines 6 and 8 contain prose references to "github backend" that should follow standard English capitalization as "GitHub backend". Backend identifiers like
github:owner/reposhould remain lowercase, but narrative text should use the official "GitHub" capitalization.📝 Proposed fix
-The github backend offers several advantages over ubi including provenance verification, download progress reports, and fewer dependencies. To migrate, replace `ubi:owner/repo` with `github:owner/repo` in your configuration files. The [`matching`](/dev-tools/backends/github.html#matching) and [`matching_regex`](/dev-tools/backends/github.html#matching-regex) options carry over. One behavioral difference is worth noting: ubi applies the substring `matching` only as a tiebreaker among assets that already match your OS/arch, and skips it when a single asset matches the platform. The github backend applies `matching` as a pre-filter before autodetection, so for multi-binary releases you get the binary your filter names, or a clear error naming the filter if it isn't published for your platform. +The GitHub backend offers several advantages over ubi including provenance verification, download progress reports, and fewer dependencies. To migrate, replace `ubi:owner/repo` with `github:owner/repo` in your configuration files. The [`matching`](/dev-tools/backends/github.html#matching) and [`matching_regex`](/dev-tools/backends/github.html#matching-regex) options carry over. One behavioral difference is worth noting: ubi applies the substring `matching` only as a tiebreaker among assets that already match your OS/arch, and skips it when a single asset matches the platform. The GitHub backend applies `matching` as a pre-filter before autodetection, so for multi-binary releases you get the binary your filter names, or a clear error naming the filter if it isn't published for your platform. -One migration gotcha: ubi folds `matching` into the install path, so you can install several binaries from one repo via separate `matching` values on the same `ubi:owner/repo` string. The github backend keeps the install path keyed by tool name + version only, so two `github:owner/repo` entries with different `matching` values resolve to the **same** directory and the second overwrites the first. If you rely on that ubi pattern, give each binary its own [`tool_alias`](/dev-tools/backends/github.html#multiple-assets-from-the-same-release) on github so each gets its own install directory. +One migration gotcha: ubi folds `matching` into the install path, so you can install several binaries from one repo via separate `matching` values on the same `ubi:owner/repo` string. The GitHub backend keeps the install path keyed by tool name + version only, so two `github:owner/repo` entries with different `matching` values resolve to the **same** directory and the second overwrites the first. If you rely on that ubi pattern, give each binary its own [`tool_alias`](/dev-tools/backends/github.html#multiple-assets-from-the-same-release) on GitHub so each gets its own install directory.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/dev-tools/backends/ubi.md` around lines 6 - 9, Update prose references of "github backend" to use the proper brand capitalization "GitHub backend" while leaving inline code/identifiers like `github:owner/repo`, `matching`, `matching_regex`, and `tool_alias` exactly as-is (lowercase/code spans). Specifically, find the narrative mentions of "github backend" and change them to "GitHub backend" (do not alter code-style or backtick-wrapped tokens such as `github:owner/repo`).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@docs/dev-tools/backends/ubi.md`:
- Around line 6-9: Update prose references of "github backend" to use the proper
brand capitalization "GitHub backend" while leaving inline code/identifiers like
`github:owner/repo`, `matching`, `matching_regex`, and `tool_alias` exactly
as-is (lowercase/code spans). Specifically, find the narrative mentions of
"github backend" and change them to "GitHub backend" (do not alter code-style or
backtick-wrapped tokens such as `github:owner/repo`).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: a0889ebd-75b1-4a70-82ea-a4fa920e9ee6
📒 Files selected for processing (11)
docs/dev-tools/backends/forgejo.mddocs/dev-tools/backends/github.mddocs/dev-tools/backends/gitlab.mddocs/dev-tools/backends/ubi.mde2e/backend/test_forgejo_matchinge2e/backend/test_github_matchinge2e/backend/test_github_matching_locke2e/backend/test_github_tool_alias_matchinge2e/backend/test_gitlab_matchingsrc/backend/asset_matcher.rssrc/backend/github.rs
🚧 Files skipped from review as they are similar to previous changes (5)
- e2e/backend/test_gitlab_matching
- e2e/backend/test_github_matching
- e2e/backend/test_github_tool_alias_matching
- src/backend/github.rs
- src/backend/asset_matcher.rs
Port ubi's `matching` (substring) and `matching_regex` (regex) asset filters to the github/gitlab/forgejo backends, applied as a pre-filter before platform autodetection so one config stays portable across OS/arch. Disambiguates multi-binary releases (e.g. oxc -> oxlint+oxfmt, bazelbuild/buildtools -> buildifier+buildozer) that the shortest-name tiebreak and repo-name preference can't portably select. The filter also scopes verification so a multi-binary release can't verify one binary against another's digest: it narrows SLSA provenance selection (with a fallback to the full set when the filter excludes every provenance file, preserving shared goreleaser-style attestations), and an invalid matching_regex is rejected up front rather than silently falling back to autodetection. matching/matching_regex round-trip to the lockfile and are install-time keys so a stale cached filter never overrides mise.toml.
9fe184b to
16ceffb
Compare
|
Heads up: the failing Root cause: Since this branch is rebased onto current The fix (update the completion-test expectations for usage 3.5, or pin Comment posted by an AI coding assistant. |
|
Root cause found for the
Concretely, completion lines went from #10313 ("require usage 3.5") bumped the requirement and regenerated Reproduces on a clean Likely fix: update the expected strings in those two test files to the new double-tab format ( Comment posted by an AI coding assistant. |
feat(github): add
matchingandmatching_regexasset optionsImplements the proposal in #10318. This is a step toward
github-backend parity with ubi so ubi can eventually be retired (#7464).Summary
matching(substring) andmatching_regex(regex) asset filters to the github / gitlab / forgejo backends, ported from ubi.asset_pattern, which replaces autodetection and forces you to template every platform's asset name by hand.matching/matching_regexround-trip to the lockfile and are install-time keys, so a stale cached filter never silently overridesmise.toml.Motivation
Some repos ship multiple binaries as separate per-platform assets in one release, none named after the repo, so neither the shortest-name tiebreak (#9358) nor the repo-name preference (#10008) can portably pick the one you want:
oxc-project/oxcshipsoxlint+oxfmtbazelbuild/buildtoolsshipsbuildifier+buildozer+unused_depsgrpc-ecosystem/grpc-gatewayshipsprotoc-gen-grpc-gateway+protoc-gen-openapiv2asset_patterncan target one, but it discards autodetection so the config stops being portable.matching/matching_regexnarrow the candidates and let autodetection still pick the right OS/arch, so the same config works everywhere.Verification Consistency (Checksums + SLSA Provenance)
The filter also scopes verification, so a multi-binary release can't verify one binary against another's digest:
pick_best_provenancescored provenance files by platform only, over all assets, so a release with per-binary*.intoto.jsonlfiles could attach the wrong one (the tiebreak is a stable sort with no name fallback, so the wrong pick was positional). This PR threadsmatching/matching_regexinto the picker and all threepick_best_provenancecall sites, with one exception: if the filter excludes every provenance file, it falls back to the full set. That keeps a single shared provenance file (goreleaser'smultiple.intoto.jsonl), which attests every artifact but carries no binary name, in play instead of silently skipping verification.The two "empty result" cases are handled differently so a bad pattern can't turn into a silently skipped verification:
matching_regex: hard error on the binary path (the error names the pattern); the provenance picker returnsNoneinstead of falling back.The regex is compiled once and cached on the picker, and both checks read that same cached result, so whether the pattern is valid can't differ between the binary and provenance paths. The install path that skips binary selection (when the release URL is already in the lockfile) validates the pattern up front via
asset_matcher::validate_matching_regex— otherwise a bad pattern could reach the picker, getNone, and be read as "no provenance found," silently skipping SLSA.(SLSA provenance gap caught in review on #10318; thanks @jdx.)
Fail-closed on an invalid
matching_regexA syntactically invalid
matching_regexis a hard error wherever the filter actually applies, so it can't silently degrade into the wrong asset or an empty result:mise lock(cross-platform generation) is best-effort — a platform with no matching asset is skipped, not fatal. Butresolve_lock_infocaught the invalid-regex error the same way and returned an emptyPlatformInfo, so the tool was written to the lockfile with its options but no url/checksum and miscounted as a successful entry. It now validates up front and fails closed, so the platform is skipped (nothing written) instead.asset_patternbeing unset for the target. Whenasset_patternis set it replaces the filter entirely, so an invalidmatching_regexis silently ignored, not validated — mise doesn't error on an option it won't act on.Backend Scope
github,gitlab, andforgejoshare the implementation (commonasset_matcher+ three backend-specificresolve_*_asset_url_for_targetfunctions reading through sharedGitBackendOptionsaccessors). Provenance/attestation is GitHub-only: gitlab/forgejo bail beforepick_best_provenance, so there's no provenance path to change there.test_matching_plumbing_parity_across_git_backendslocks the three paths together so they can't drift.Tool options are a flattened free-form map, so the rest of the CLI (parse, resolve,
outdated/upgrade, doctor, schema) handles them generically. The only per-backend sites that hardcode option names for asset selection (install_time_option_keys,lockfile_options, theasset_patternprecedence guards) are exactly the three this PR touches.Install-Path Keying
matching/matching_regexare lockfile + install-time keys but not part of the install directory (kept as tool name + version). The supported way to install multiple binaries from one repo istool_alias: each alias is a distinct tool with its own install dir. There's one drawback: reusing the baregithub:owner/repostring with two differentmatchingvalues resolves to the same dir. That case is called out with a warning in the backend docs.Testing
Two layers: fast unit tests for the asset-selection and provenance-picker logic (no network, driven by fixtures from real release JSON), plus e2e tests that install from real multi-binary releases across all three backends.
asset_matcher, 3 ingithub.rs, including 6 for the SLSA provenance picker). Fixtures from real releases (oxc, bazel buildtools, grpc-gateway) cover happy path, cross-platform portability, AND semantics, autodetection still applied within the narrowed set, invalid-regex hard error, over-narrow fallback, shared-prefix ordering regression, lockfile round-trip, and the malformed-regex-returns-Noneprovenance guard.test_github_matchingandtest_github_tool_alias_matchingrun against real multi-binary releases (oxc@apps_v1.69.0,bazelbuild/buildtools@7.1.2). Each asserts the failure reason via an expected substring, so a test can't pass on an unrelated failure. The tool-alias test exercises the SLSA fix end-to-end and the lockfile round-trip.test_github_matchingalso assertsasset_patternprecedence over an invalidmatching_regex(the install fails on the pattern path, not regex compilation — proving an ignored regex isn't validated), andtest_github_matching_lockassertsmise lockskips a tool whosematching_regexis invalid instead of writing a url-less lockfile entry.test_gitlab_matchingandtest_forgejo_matchingexercise the gitlab/forgejo resolve paths against their real fixtures. (forgejo is committed disabled like the existingtest_forgejo, for known Forgejo API flakiness; the unit parity test covers it meanwhile.)Docs
New
matching/matching_regexsections in the github backend docs (with theasset_patternprecedence rule and verification-consistency note), shared-options updates to gitlab/forgejo docs, and a migration note in the ubi docs.Behavior & Semantics
A few decisions we made on purpose:
matchingonly as a tiebreaker after OS/arch filtering; we apply it before. This keeps one config portable across platforms and avoids the wrong-shared-prefix-binary problem from ubi #137 / mise #6611.matchingandmatching_regexare AND, andmatchingis case-sensitive.matching_regexsupports inline(?i)if you want case-insensitive; if both options are set, an asset has to satisfy both.asset_patternwins. When it's set,matching/matching_regexare ignored for both the binary and its provenance — silently, so even an invalidmatching_regexis not validated or reported. That keeps a contradictory config from pointing provenance at a different binary thanasset_patternpicked.(Kept this short — happy to add more on any of it, like the full ubi comparison or why these options are top-level instead of per-platform.)
This PR was generated by an AI coding assistant.
Summary by CodeRabbit
New Features
matching(substring) andmatching_regex(regex) options for GitHub, GitLab, and Forgejo to narrow release asset candidates while preserving platform/OS/arch autodetection.Documentation
asset_patternwins), case/validation notes, migration guidance, and a warning about install-path/keying requiring distincttool_aliasto avoid overwrites.Bug Fixes
Tests