diff --git a/docs/dev-tools/backends/forgejo.md b/docs/dev-tools/backends/forgejo.md index 9e538660b7..84fdd32db4 100644 --- a/docs/dev-tools/backends/forgejo.md +++ b/docs/dev-tools/backends/forgejo.md @@ -139,7 +139,7 @@ mise install forgejo:user/repo ``` ::: tip -The autodetection logic is implemented in [`src/backend/asset_matcher.rs`](https://github.com/jdx/mise/blob/main/src/backend/asset_matcher.rs), which is shared by the Forgejo, GitHub and GitLab backends. +The autodetection logic is implemented in [`src/backend/asset_matcher.rs`](https://github.com/jdx/mise/blob/main/src/backend/asset_matcher.rs), which is shared by the GitHub, GitLab, and Forgejo backends. ::: ### `asset_pattern` @@ -151,6 +151,51 @@ Specifies the pattern to match against release asset names. This is useful when "forgejo:user/repo" = { version = "latest", asset_pattern = "tool_*_linux_x64.tar.gz" } ``` +### `matching` + +Narrows asset selection to names containing the given substring, **while keeping platform autodetection**. Unlike [`asset_pattern`](#asset_pattern) (which replaces autodetection entirely), `matching` only refines the candidate set — autodetection still chooses the correct OS/arch from the narrowed list, so a single config stays portable across platforms. + +This is the option to reach for when a repository ships **multiple binaries as separate per-platform assets** and autodetection can't tell which one you want. + +```toml +[tools] +# When a release ships several binaries per platform (e.g. `mytool-cli` and +# `mytool-server`), matching picks one on every OS/arch without hardcoding a +# platform-specific asset_pattern. +"forgejo:user/repo" = { version = "latest", matching = "mytool-cli" } +``` + +Tool options can also be passed inline on the command line using `[key=value]` syntax: + +```sh +mise use "forgejo:user/repo[matching=mytool-cli]" +``` + +`matching` is a case-sensitive substring test, so a value that is also a substring of another asset's name (e.g. 
`matching = "tool"` when both `tool-*` and `tool-extras-*` are published) won't uniquely select your binary. Use [`matching_regex`](#matching_regex) with an anchor when you need a precise match. + +If [`asset_pattern`](#asset_pattern) is also set, it takes precedence and `matching`/`matching_regex` are ignored — `asset_pattern` replaces autodetection entirely, so there is no candidate set left for them to narrow. They are ignored silently: when `asset_pattern` is set, a `matching_regex` is never consulted and an invalid one is not reported, since mise does not error on a superseded option. + +### `matching_regex` + +Like [`matching`](#matching), but the asset name must match the given regular expression. Use this when a substring isn't selective enough. The match is case-sensitive; use an inline `(?i)` flag for case-insensitive matching. + +```toml +[tools] +"forgejo:user/repo" = { version = "latest", matching_regex = "^mytool-cli-" } +``` + +If both `matching` and `matching_regex` are set, an asset must satisfy **both** (logical AND) +to remain a candidate. + +::: warning +`matching`/`matching_regex` are **not** part of the install path — it is keyed by the tool +name (`user/repo`, or a `tool_alias`) and version. To install two binaries from the same +release, give each its own [`tool_alias`](/dev-tools/backends/github.html#multiple-assets-from-the-same-release) +so they get distinct install directories; reusing the same `forgejo:user/repo` string with +different `matching` values resolves to the same directory and the second install overwrites +the first. +::: + ### `version_prefix` Specifies a custom version prefix for release tags. By default, mise handles the common `v` prefix (e.g., `v1.0.0`), but some repositories use different prefixes like `release-`, `version-`, or no prefix at all. 
diff --git a/docs/dev-tools/backends/github.md b/docs/dev-tools/backends/github.md index 5a0fad239f..18c5133acd 100644 --- a/docs/dev-tools/backends/github.md +++ b/docs/dev-tools/backends/github.md @@ -44,7 +44,7 @@ mise install github:user/repo ``` ::: tip -The autodetection logic is implemented in [`src/backend/asset_matcher.rs`](https://github.com/jdx/mise/blob/main/src/backend/asset_matcher.rs), which is shared by both the GitHub and GitLab backends. +The autodetection logic is implemented in [`src/backend/asset_matcher.rs`](https://github.com/jdx/mise/blob/main/src/backend/asset_matcher.rs), which is shared by the GitHub, GitLab, and Forgejo backends. ::: ### `asset_pattern` @@ -56,6 +56,45 @@ Specifies the pattern to match against release asset names. This is useful when "github:cli/cli" = { version = "latest", asset_pattern = "gh_*_linux_x64.tar.gz" } ``` +### `matching` + +Narrows asset selection to names containing the given substring, **while keeping platform autodetection**. Unlike [`asset_pattern`](#asset_pattern) (which replaces autodetection entirely), `matching` only refines the candidate set — autodetection still chooses the correct OS/arch from the narrowed list, so a single config stays portable across platforms. + +This is the option to reach for when a repository ships **multiple binaries as separate per-platform assets** and autodetection can't tell which one you want (see [Multiple Assets from the Same Release](#multiple-assets-from-the-same-release)). + +```toml +[tools] +# oxc-project/oxc ships both oxlint and oxfmt per platform; matching picks oxlint +# on every OS/arch without hardcoding a platform-specific asset_pattern. +# `apps_v1.69.0` is the literal release tag; the assets are per-platform +# archives, and rename_exe renames the extracted `oxlint-` binary to `oxlint`. 
+"github:oxc-project/oxc" = { version = "apps_v1.69.0", matching = "oxlint", rename_exe = "oxlint" } +``` + +Tool options can also be passed inline on the command line using `[key=value]` syntax: + +```sh +mise use "github:oxc-project/oxc[matching=oxlint,rename_exe=oxlint]@apps_v1.69.0" +``` + +`matching` is a case-sensitive substring test, so a value that is also a substring of another asset's name (e.g. `matching = "tool"` when both `tool-*` and `tool-extras-*` are published) won't uniquely select your binary. Use [`matching_regex`](#matching_regex) with an anchor when you need a precise match. + +If [`asset_pattern`](#asset_pattern) is also set, it takes precedence and `matching`/`matching_regex` are ignored — `asset_pattern` replaces autodetection entirely, so there is no candidate set left for them to narrow. They are ignored silently: when `asset_pattern` is set, a `matching_regex` is never consulted and an invalid one is not reported, since mise does not error on a superseded option. + +The filter also scopes verification: checksums are looked up for the selected asset, and SLSA provenance discovery is narrowed the same way, so a multi-binary release can't verify one binary against another's provenance. A single shared provenance file that attests every artifact in the release (e.g. `multiple.intoto.jsonl`) is still used as a fallback when no per-binary provenance matches. + +### `matching_regex` + +Like [`matching`](#matching), but the asset name must match the given regular expression. Use this when a substring isn't selective enough. The match is case-sensitive; use an inline `(?i)` flag for case-insensitive matching. + +```toml +[tools] +"github:oxc-project/oxc" = { version = "apps_v1.69.0", matching_regex = "^oxlint-", rename_exe = "oxlint" } +``` + +If both `matching` and `matching_regex` are set, an asset must satisfy **both** (logical AND) +to remain a candidate. + ### `version_prefix` Specifies a custom version prefix for release tags. 
By default, mise handles the common `v` prefix (e.g., `v1.0.0`), but some repositories use different prefixes like `release-`, `version-`, or no prefix at all. @@ -97,14 +136,55 @@ macos-arm64 = { asset_pattern = "gh_*_macOS_arm64.tar.gz" } The GitHub backend installs one release asset for each tool. If a repository publishes multiple binaries as separate assets in the same release, define one tool alias per -binary and point each alias at the same `github:owner/repo` backend. Then configure -each aliased tool with its own `asset_pattern`. +binary and point each alias at the same `github:owner/repo` backend, then narrow each +alias to its binary. + +Prefer [`matching`](#matching) (or [`matching_regex`](#matching_regex)): it narrows the +candidate set while **keeping platform autodetection**, so one config works on every +OS/arch. This is the right choice when the per-platform asset names can't be templated +portably (e.g. Rust target-triples like `oxlint-aarch64-apple-darwin.tar.gz`). + +The example below installs both `oxlint` and `oxfmt` from the single +`oxc-project/oxc` release. Note that each `matching` value must be specific enough to +select **only** the intended binary — if one binary's name were a substring of the +other's, use [`matching_regex`](#matching_regex) with an anchor (e.g. `"^oxlint-"`) +instead (see the [`matching`](#matching) caveat). ```toml [tool_alias] -tool-a = "github:owner/repo" -tool-b = "github:owner/repo" +oxlint = "github:oxc-project/oxc" +oxfmt = "github:oxc-project/oxc" + +[tools.oxlint] +version = "apps_v1.69.0" +matching = "oxlint" +rename_exe = "oxlint" + +[tools.oxfmt] +version = "apps_v1.69.0" +matching = "oxfmt" +rename_exe = "oxfmt" +``` + +::: warning +A distinct alias per binary is **required**, not just tidy. `matching`/`matching_regex` are +**not** part of the install path — it is keyed by the tool name (the alias, or `owner/repo` +when unaliased) and version. 
Installing the same `github:owner/repo` backend string twice +with different `matching` values (for example `mise use "github:owner/repo[matching=tool-a]"` +followed by `mise use "github:owner/repo[matching=tool-b]"`) resolves to the **same** +directory, so the second install overwrites the first. Giving each binary its own alias gives +each its own install directory, so they coexist. +::: +If the binary isn't named the way you want to invoke it, add +[`rename_exe`](#rename_exe) (renames the executable extracted from an archive) or +[`bin`](#bin) (selects/renames the binary, including a single bare non-archive binary). + +Use [`asset_pattern`](#asset_pattern) instead only when you need full manual control and +can name the asset portably (it replaces autodetection, so any `{{os}}`/`{{arch}}` +templating must cover every platform you target): + +```toml [tools.tool-a] version = "latest" asset_pattern = "tool-a-*" diff --git a/docs/dev-tools/backends/gitlab.md b/docs/dev-tools/backends/gitlab.md index 15b19434d4..7f628b58cd 100644 --- a/docs/dev-tools/backends/gitlab.md +++ b/docs/dev-tools/backends/gitlab.md @@ -134,7 +134,7 @@ mise install gitlab:user/repo ``` ::: tip -The autodetection logic is implemented in [`src/backend/asset_matcher.rs`](https://github.com/jdx/mise/blob/main/src/backend/asset_matcher.rs), which is shared by both the GitHub and GitLab backends. +The autodetection logic is implemented in [`src/backend/asset_matcher.rs`](https://github.com/jdx/mise/blob/main/src/backend/asset_matcher.rs), which is shared by the GitHub, GitLab, and Forgejo backends. ::: ### `asset_pattern` @@ -147,6 +147,51 @@ version = "latest" asset_pattern = "gitlab-runner-linux-x64" ``` +### `matching` + +Narrows asset selection to names containing the given substring, **while keeping platform autodetection**. 
Unlike [`asset_pattern`](#asset_pattern) (which replaces autodetection entirely), `matching` only refines the candidate set — autodetection still chooses the correct OS/arch from the narrowed list, so a single config stays portable across platforms. + +This is the option to reach for when a repository ships **multiple binaries as separate per-platform assets** and autodetection can't tell which one you want. + +```toml +[tools] +# When a release ships several binaries per platform (e.g. `mytool-cli` and +# `mytool-server`), matching picks one on every OS/arch without hardcoding a +# platform-specific asset_pattern. +"gitlab:owner/repo" = { version = "latest", matching = "mytool-cli" } +``` + +Tool options can also be passed inline on the command line using `[key=value]` syntax: + +```sh +mise use "gitlab:owner/repo[matching=mytool-cli]" +``` + +`matching` is a case-sensitive substring test, so a value that is also a substring of another asset's name (e.g. `matching = "tool"` when both `tool-*` and `tool-extras-*` are published) won't uniquely select your binary. Use [`matching_regex`](#matching_regex) with an anchor when you need a precise match. + +If [`asset_pattern`](#asset_pattern) is also set, it takes precedence and `matching`/`matching_regex` are ignored — `asset_pattern` replaces autodetection entirely, so there is no candidate set left for them to narrow. They are ignored silently: when `asset_pattern` is set, a `matching_regex` is never consulted and an invalid one is not reported, since mise does not error on a superseded option. + +### `matching_regex` + +Like [`matching`](#matching), but the asset name must match the given regular expression. Use this when a substring isn't selective enough. The match is case-sensitive; use an inline `(?i)` flag for case-insensitive matching. 
+ +```toml +[tools] +"gitlab:owner/repo" = { version = "latest", matching_regex = "^mytool-cli-" } +``` + +If both `matching` and `matching_regex` are set, an asset must satisfy **both** (logical AND) +to remain a candidate. + +::: warning +`matching`/`matching_regex` are **not** part of the install path — it is keyed by the tool +name (`owner/repo`, or a `tool_alias`) and version. To install two binaries from the same +release, give each its own [`tool_alias`](/dev-tools/backends/github.html#multiple-assets-from-the-same-release) +so they get distinct install directories; reusing the same `gitlab:owner/repo` string with +different `matching` values resolves to the same directory and the second install overwrites +the first. +::: + ### `version_prefix` Specifies a custom version prefix for release tags. By default, mise handles the common `v` prefix (e.g., `v1.0.0`), but some repositories use different prefixes like `release-`, `version-`, or no prefix at all. diff --git a/docs/dev-tools/backends/ubi.md b/docs/dev-tools/backends/ubi.md index 195bf7bba2..5ac7e14b2b 100644 --- a/docs/dev-tools/backends/ubi.md +++ b/docs/dev-tools/backends/ubi.md @@ -1,9 +1,11 @@ # Ubi Backend ::: warning -The ubi backend is **deprecated**. Please use the [github backend](/dev-tools/backends/github) instead. +The ubi backend is **deprecated**. Please use the [GitHub backend](/dev-tools/backends/github) instead. -The github backend offers several advantages over ubi including provenance verification, download progress reports, and fewer dependencies. To migrate, replace `ubi:owner/repo` with `github:owner/repo` in your configuration files. +The GitHub backend offers several advantages over ubi including provenance verification, download progress reports, and fewer dependencies. To migrate, replace `ubi:owner/repo` with `github:owner/repo` in your configuration files. 
The [`matching`](/dev-tools/backends/github.html#matching) and [`matching_regex`](/dev-tools/backends/github.html#matching_regex) options carry over. One behavioral difference is worth noting: ubi applies the substring `matching` only as a tiebreaker among assets that already match your OS/arch, and skips it when a single asset matches the platform. The GitHub backend applies `matching` as a pre-filter before autodetection, so for multi-binary releases you get the binary your filter names, or a clear error naming the filter if it isn't published for your platform. + +One migration gotcha: ubi folds `matching` into the install path, so you can install several binaries from one repo via separate `matching` values on the same `ubi:owner/repo` string. The GitHub backend keeps the install path keyed by tool name + version only, so two `github:owner/repo` entries with different `matching` values resolve to the **same** directory and the second overwrites the first. If you rely on that ubi pattern, give each binary its own [`tool_alias`](/dev-tools/backends/github.html#multiple-assets-from-the-same-release) on GitHub so each gets its own install directory. ::: You may install GitHub Releases and URL packages directly using [ubi](https://github.com/houseabsolute/ubi) backend. ubi is directly compiled into diff --git a/e2e/backend/test_forgejo_matching b/e2e/backend/test_forgejo_matching new file mode 100644 index 0000000000..e94a634a66 --- /dev/null +++ b/e2e/backend/test_forgejo_matching @@ -0,0 +1,52 @@ +#!/usr/bin/env bash + +# Disabled: Forgejo tests are flaky due to intermittent API issues (same reason +# e2e/backend/test_forgejo is disabled). Kept here, ready to enable, so the forgejo +# `matching`/`matching_regex` path has e2e parity with gitlab the moment the forgejo +# API is reliable enough to re-enable. 
Until then, the forgejo plumbing is covered by +# the unit test backend::github::tests::test_matching_plumbing_parity_across_git_backends +# (which exercises the Forgejo backend type's option accessors, lockfile serialization, +# and install-time-key inheritance) and by the passing gitlab e2e, which shares the +# same AssetMatcher. +exit 0 + +# Test the `matching` / `matching_regex` options on the FORGEJO backend. +# +# Mirrors e2e/backend/test_gitlab_matching: forgejo:roele/mise-test-fixtures v1.0.0 +# ships THREE assets (fd-8.7.0.tar.gz, hello-world-1.0.0.tar.gz, +# hello-world-2.0.0.tar.gz), none carrying OS/arch tokens, so the shortest-name +# tiebreak picks fd-8.7.0.tar.gz by default. `matching` narrows the candidate set +# BEFORE autodetection, proving the filter changes selection on the SEPARATE +# resolve_forgejo_asset_url_for_target path. + +export MISE_EXPERIMENTAL=1 + +# A matching value that excludes every asset must FAIL asset selection on the +# forgejo path with an error naming the filter — proving `matching` is parsed and +# threaded into resolve_forgejo_asset_url_for_target, not silently ignored. Run +# BEFORE any successful install (install dir is keyed by repo+version, not by the +# matching value, so an installed version would make `mise install` a no-op). +assert_fail 'mise install "forgejo:roele/mise-test-fixtures[matching=does-not-exist]@1.0.0"' "filtered by matching" + +# A syntactically invalid matching_regex (unclosed group) must be a hard error on +# the forgejo path, not silently ignored. +assert_fail 'mise install "forgejo:roele/mise-test-fixtures[matching_regex=hello(]@1.0.0"' "invalid matching_regex" + +# matching=hello-world narrows to the two hello-world-* assets; the tiebreak picks +# hello-world-1.0.0.tar.gz (NOT the default fd-8.7.0.tar.gz), and it runs. 
+cat <<EOF >mise.toml +[tools] +"forgejo:roele/mise-test-fixtures" = { version = "1.0.0", matching = "hello-world", bin_path = "hello-world-1.0.0/bin", postinstall = "chmod +x \$MISE_TOOL_INSTALL_PATH/hello-world-1.0.0/bin/hello-world" } +EOF +mise install +assert_contains "mise x -- hello-world" "hello world" +mise uninstall forgejo:roele/mise-test-fixtures + +# matching_regex selects on the forgejo path too: ^hello-world-1 matches only +# hello-world-1.0.0.tar.gz. +cat <<EOF >mise.toml +[tools] +"forgejo:roele/mise-test-fixtures" = { version = "1.0.0", matching_regex = "^hello-world-1", bin_path = "hello-world-1.0.0/bin", postinstall = "chmod +x \$MISE_TOOL_INSTALL_PATH/hello-world-1.0.0/bin/hello-world" } +EOF +mise install +assert_contains "mise x -- hello-world" "hello world" diff --git a/e2e/backend/test_github_matching b/e2e/backend/test_github_matching new file mode 100644 index 0000000000..de1640d578 --- /dev/null +++ b/e2e/backend/test_github_matching @@ -0,0 +1,64 @@ +#!/usr/bin/env bash + +# Test the `matching` / `matching_regex` options for the github backend. +# +# oxc-project/oxc ships BOTH oxlint and oxfmt as separate per-platform assets in +# a single release. Plain autodetection picks oxfmt (shortest-name tiebreak), and +# neither binary is named after the repo so the repo-name preference gives no +# signal. `matching` narrows to the intended binary WHILE keeping platform +# autodetection — so the same config is portable across OS/arch (unlike +# asset_pattern, which discards autodetection). Ported from the ubi backend. +# +# Note: the oxc assets are per-platform archives named with the platform triple +# (e.g. oxlint-aarch64-apple-darwin.tar.gz / .zip); each extracts to a triple-named +# binary, so `rename_exe = oxlint` exposes it on PATH as plain `oxlint`. 
+ +export MISE_EXPERIMENTAL=1 + +# A `matching` value that excludes every asset must fail asset selection with an +# error that names the filter (proves the option is parsed + threaded, and that +# the failure isn't a misleading "no asset for this platform"). Run this BEFORE +# any successful install, since asset selection only happens at download time — +# an already-installed version would make `mise install` a no-op. The expected +# substring asserts the failure is the filter-excluded path, not an unrelated +# error (network, bad tag, etc.). +assert_fail 'mise install "github:oxc-project/oxc[matching=does-not-exist]@apps_v1.69.0"' "filtered by matching" + +# Same for matching_regex. +assert_fail 'mise install "github:oxc-project/oxc[matching_regex=^does-not-exist]@apps_v1.69.0"' "filtered by matching_regex" + +# A syntactically invalid matching_regex (unclosed group) must be a hard error, +# NOT silently ignored — otherwise selection would fall back to autodetection and +# install the wrong binary (oxfmt) without telling the user their filter was bad. +# The substring asserts it failed specifically on regex compilation. +assert_fail 'mise install "github:oxc-project/oxc[matching_regex=oxlint(]@apps_v1.69.0"' "invalid matching_regex" + +# `asset_pattern` takes precedence over `matching`/`matching_regex` (it replaces +# autodetection, leaving no candidate set to narrow). Proven platform-independently: +# a bogus asset_pattern that matches NO asset, combined with a valid matching=oxlint, +# must FAIL — if matching were (incorrectly) honored here it would rescue the install +# by selecting oxlint. The substring asserts the failure is the asset_pattern path +# ("No matching asset found for pattern"), not the matching path ("filtered by ..."), +# which proves asset_pattern wins and matching is ignored. 
+assert_fail 'mise install "github:oxc-project/oxc[asset_pattern=this-matches-no-asset,matching=oxlint]@apps_v1.69.0"' "No matching asset found for pattern" + +# Same precedence check for an *invalid* matching_regex. When asset_pattern is set, +# matching_regex is ignored everywhere (binary selection AND provenance), so a +# syntactically invalid regex must NOT hard-fail the install — asset_pattern wins, +# and an ignored field shouldn't be validated. This bogus asset_pattern matches no +# asset, so the failure must be the asset_pattern path ("No matching asset found +# for pattern"), NOT regex compilation ("invalid matching_regex"). If the up-front +# matching_regex validation runs regardless of asset_pattern, this fails on the +# wrong (regex) error. +assert_fail 'mise install "github:oxc-project/oxc[asset_pattern=this-matches-no-asset,matching_regex=oxlint(]@apps_v1.69.0"' "No matching asset found for pattern" + +# matching=oxlint selects the oxlint asset for THIS platform (autodetection still +# chooses the correct OS/arch), not oxfmt — verified end-to-end by running it. +assert_contains 'mise x "github:oxc-project/oxc[matching=oxlint,rename_exe=oxlint]@apps_v1.69.0" -- oxlint --version' "1.69.0" + +# Second real repo to prove the behavior isn't oxc-specific: bazelbuild/buildtools +# ships buildifier, buildozer and unused_deps as separate per-platform bare +# binaries. Plain autodetection picks buildozer (shortest-name tiebreak), so +# matching=buildifier is required to select buildifier. This is the same repo the +# ubi backend covers in e2e/cli/test_upgrade — exercised here for the github backend. 
+assert_contains 'mise x "github:bazelbuild/buildtools[matching=buildifier,rename_exe=buildifier]@7.1.2" -- buildifier --version' "buildifier version: 7.1.2" diff --git a/e2e/backend/test_github_matching_lock b/e2e/backend/test_github_matching_lock new file mode 100644 index 0000000000..0938445dca --- /dev/null +++ b/e2e/backend/test_github_matching_lock @@ -0,0 +1,32 @@ +#!/usr/bin/env bash + +# Regression: `mise lock` must not write a polluting, url-less entry for a tool +# whose `matching_regex` is invalid. +# +# `mise lock` is best-effort (since #7113): a platform it can't resolve is skipped, +# not fatal. But `resolve_lock_info` caught the invalid-regex hard error and +# returned an empty `PlatformInfo`, so the tool was written to the lockfile with its +# options but no resolved url/checksum — and miscounted as a successful platform +# entry ("Updated 1 (0 skipped)"). Failing closed in `resolve_lock_info` instead +# makes the orchestration skip the platform (nothing written), consistent with the +# best-effort model. The install path already validates up front, so this only +# affects the cross-platform lock path. + +export MISE_EXPERIMENTAL=1 +export MISE_LOCKFILE=1 + +cat <<EOF >mise.toml +[tools] +"github:oxc-project/oxc" = { version = "apps_v1.69.0", matching_regex = "oxlint(" } +EOF + +touch mise.lock + +# Best-effort: locking a platform that can't resolve is a skip, not a hard error, +# so the command itself succeeds. +mise lock --platform linux-x64 + +# With the bug, the lockfile gained a `github:oxc-project/oxc` entry carrying the +# options but no platform data (url/checksum). Failing closed skips the tool, so no +# entry is written — assert the unresolved tool did not get a (broken) lock entry. 
+assert_not_contains "cat mise.lock" "oxc-project/oxc" diff --git a/e2e/backend/test_github_tool_alias_matching b/e2e/backend/test_github_tool_alias_matching new file mode 100644 index 0000000000..4285a09b01 --- /dev/null +++ b/e2e/backend/test_github_tool_alias_matching @@ -0,0 +1,91 @@ +#!/usr/bin/env bash + +# Test `matching` combined with `tool_alias` to install MULTIPLE binaries from a +# single multi-binary release, each selected portably. +# +# Two real multi-binary repos are exercised so the pattern isn't tied to one repo: +# +# - oxc-project/oxc ships oxlint and oxfmt as separate per-platform binaries in +# one release. apps_v1.69.0 bundles oxlint 1.69.0 and oxfmt 0.54.0: different +# names, different internal versions, one repo and one release tag. Its assets +# are per-platform archives named with the platform triple (e.g. +# oxlint-aarch64-apple-darwin.tar.gz / .zip), each extracting to a triple-named +# binary, so `rename_exe` exposes the plain name on PATH. +# - bazelbuild/buildtools ships buildifier, buildozer and unused_deps as separate +# per-platform bare binaries (e.g. buildifier-darwin-arm64), all at version +# 7.1.2. This repo is already a CI download dependency via e2e/cli/test_upgrade. +# +# `tool_alias` maps a name to the backend, and `matching` selects each binary while +# keeping platform autodetection, so one config stays portable across OS/arch. This +# is the portable replacement for the per-platform `asset_pattern` workaround +# (#9358, #9074). + +export MISE_EXPERIMENTAL=1 +# Enable the lockfile so we can assert `matching` round-trips through a real +# install (stronger than the unit test, which inserts the option directly). 
+export MISE_LOCKFILE=1 + +cat <<'EOF' >mise.toml +[tool_alias] +oxlint = "github:oxc-project/oxc" +oxfmt = "github:oxc-project/oxc" +buildifier = "github:bazelbuild/buildtools" +buildozer = "github:bazelbuild/buildtools" + +# version is the release tag (apps_v1.69.0); it ships oxlint 1.69.0 and oxfmt 0.54.0 +[tools.oxlint] +version = "apps_v1.69.0" +matching = "oxlint" +rename_exe = "oxlint" + +[tools.oxfmt] +version = "apps_v1.69.0" +matching = "oxfmt" +rename_exe = "oxfmt" + +# buildifier and buildozer both ship in the single 7.1.2 release +[tools.buildifier] +version = "7.1.2" +matching = "buildifier" +rename_exe = "buildifier" + +[tools.buildozer] +version = "7.1.2" +matching = "buildozer" +rename_exe = "buildozer" +EOF + +touch mise.lock + +mise install + +# Each alias selected its own binary from the same release and reports its own +# version, even though the two aliases of each repo pin a shared release tag. +assert_contains "mise x oxlint -- oxlint --version" "1.69.0" +assert_contains "mise x oxfmt -- oxfmt --version" "0.54.0" +assert_contains "mise x buildifier -- buildifier --version" "buildifier version: 7.1.2" +assert_contains "mise x buildozer -- buildozer --version" "buildozer version: 7.1.2" + +# Distinct aliases resolving to the same backend/version install as distinct +# tools, rather than being deduplicated to a single install. Assert the +# version-keyed subdir (the release tag for oxc, 7.1.2 for buildtools) so this +# proves a real per-alias install, not just an empty top-level dir. 
+assert "test -d ~/.local/share/mise/installs/oxlint/apps_v1.69.0" +assert "test -d ~/.local/share/mise/installs/oxfmt/apps_v1.69.0" +assert "test -d ~/.local/share/mise/installs/buildifier/7.1.2" +assert "test -d ~/.local/share/mise/installs/buildozer/7.1.2" + +# `matching` round-trips through a real install into the lockfile, so a relock on +# another OS reproduces the same per-alias asset selection (each option is written +# as `matching = ""`, the same way `asset_pattern` is — see +# e2e/backend/test_github_url_tracking). +assert_contains "cat mise.lock" 'matching = "oxlint"' +assert_contains "cat mise.lock" 'matching = "oxfmt"' +assert_contains "cat mise.lock" 'matching = "buildifier"' +assert_contains "cat mise.lock" 'matching = "buildozer"' + +# Reinstalling from the lockfile reproduces each binary (re-resolution path). +mise uninstall oxlint oxfmt buildifier buildozer +mise install +assert_contains "mise x oxlint -- oxlint --version" "1.69.0" +assert_contains "mise x buildifier -- buildifier --version" "buildifier version: 7.1.2" diff --git a/e2e/backend/test_gitlab_matching b/e2e/backend/test_gitlab_matching new file mode 100644 index 0000000000..1a9b43ea58 --- /dev/null +++ b/e2e/backend/test_gitlab_matching @@ -0,0 +1,54 @@ +#!/usr/bin/env bash + +# Test the `matching` / `matching_regex` options on the GITLAB backend. +# +# The github/gitlab/forgejo backends share one AssetMatcher, but each threads the +# filter through its OWN resolve_*_asset_url_for_target function. This exercises +# the gitlab path against the real GitLab release API so it can't silently drift +# from github (the shared option plumbing is also covered by the unit test +# backend::github::tests::test_matching_plumbing_parity_across_git_backends). 
+# +# gitlab:jdxcode/mise-test-fixtures v1.0.0 ships THREE assets: +# fd-8.7.0.tar.gz, hello-world-1.0.0.tar.gz, hello-world-2.0.0.tar.gz +# None carry OS/arch tokens, so platform autodetection scores them equally and the +# shortest-name tiebreak picks fd-8.7.0.tar.gz by default. `matching` narrows the +# candidate set BEFORE autodetection, so matching=hello-world selects a hello-world +# asset instead — proving the filter actually changes selection on the gitlab path. + +export MISE_EXPERIMENTAL=1 + +# A matching value that excludes every asset must FAIL asset selection on the +# gitlab path with an error that names the filter — proving `matching` is parsed +# and threaded into resolve_gitlab_asset_url_for_target rather than silently +# ignored (if it were ignored, autodetection would succeed and pick fd). The +# expected substring asserts it failed on the filter-excluded path, not an +# unrelated error (network, bad tag). Run BEFORE any successful install, since an +# already-installed version makes `mise install` a no-op (the install dir is keyed +# by repo+version, NOT by the matching value). +assert_fail 'mise install "gitlab:jdxcode/mise-test-fixtures[matching=does-not-exist]@1.0.0"' "filtered by matching" + +# A syntactically invalid matching_regex (unclosed group) must be a hard error on +# the gitlab path, NOT silently ignored (which would fall back to autodetection and +# install fd). The substring asserts it failed specifically on regex compilation. +assert_fail 'mise install "gitlab:jdxcode/mise-test-fixtures[matching_regex=hello(]@1.0.0"' "invalid matching_regex" + +# matching=hello-world narrows the three assets to the two hello-world-* ones; the +# shortest-name/lexicographic tiebreak then picks hello-world-1.0.0.tar.gz (NOT the +# default fd-8.7.0.tar.gz). bin_path + postinstall match that asset's layout, so +# the installed binary runs end-to-end. 
+cat <<EOF >mise.toml +[tools] +"gitlab:jdxcode/mise-test-fixtures" = { version = "1.0.0", matching = "hello-world", bin_path = "hello-world-1.0.0/bin", postinstall = "chmod +x \$MISE_TOOL_INSTALL_PATH/hello-world-1.0.0/bin/hello-world" } +EOF +mise install +assert_contains "mise x -- hello-world" "hello world" +mise uninstall gitlab:jdxcode/mise-test-fixtures + +# matching_regex selects on the gitlab path too: ^hello-world-1 matches only +# hello-world-1.0.0.tar.gz, so the same binary installs and runs. +cat <<EOF >mise.toml +[tools] +"gitlab:jdxcode/mise-test-fixtures" = { version = "1.0.0", matching_regex = "^hello-world-1", bin_path = "hello-world-1.0.0/bin", postinstall = "chmod +x \$MISE_TOOL_INSTALL_PATH/hello-world-1.0.0/bin/hello-world" } +EOF +mise install +assert_contains "mise x -- hello-world" "hello world" diff --git a/src/backend/asset_matcher.rs b/src/backend/asset_matcher.rs index f87b4b09f6..7014926528 100644 --- a/src/backend/asset_matcher.rs +++ b/src/backend/asset_matcher.rs @@ -175,6 +175,16 @@ pub struct AssetPicker { target_libc: String, no_app: bool, preferred_name: Option<String>, + /// Substring that an asset name must contain to remain a candidate. + /// Applied as a pre-filter before platform scoring (ubi's `matching`). + matching: Option<String>, + /// Regex an asset name must match to remain a candidate (ubi's `matching_regex`), + /// compiled once when the picker is built. `Some(Ok)` is a valid pattern; + /// `Some(Err(msg))` records that the pattern was set but failed to compile (the + /// string is a ready-to-surface error message). Caching the compile here makes + /// regex validity a local property of the picker rather than something that + /// depends on call ordering between binary and provenance selection.
+ matching_regex: Option<Result<Regex, String>>, } impl AssetPicker { @@ -198,6 +208,8 @@ impl AssetPicker { target_libc, no_app: false, preferred_name: None, + matching: None, + matching_regex: None, } } @@ -216,6 +228,73 @@ impl AssetPicker { self } + /// Narrow candidates to assets whose name contains `matching`, before + /// platform autodetection runs. Ports ubi's `matching` to keep a portable, + /// autodetecting config for repos that ship multiple binaries per platform. + pub fn with_matching(mut self, matching: impl Into<String>) -> Self { + let matching = matching.into(); + if !matching.is_empty() { + self.matching = Some(matching); + } + self + } + + /// Narrow candidates to assets whose name matches `matching_regex`, before + /// platform autodetection runs. Ports ubi's `matching_regex`. Empty is a no-op. + /// + /// The pattern is compiled here, once, and the result is cached on the picker. + /// An invalid pattern is retained as `Some(Err(msg))` rather than dropped, so + /// it can be surfaced as a hard error on the binary path and fails closed on + /// the provenance path — never silently degrading to "no filter". + pub fn with_matching_regex(mut self, matching_regex: impl Into<String>) -> Self { + let matching_regex = matching_regex.into(); + if !matching_regex.is_empty() { + let compiled = Regex::new(&matching_regex) + .map_err(|e| format!("invalid matching_regex \"{matching_regex}\": {e}")); + self.matching_regex = Some(compiled); + } + self + } + + /// The compile error message when `matching_regex` was set but failed to + /// compile, else `None`. Single source of truth for "is the cached pattern + /// invalid?" so the binary choke point ([`AssetMatcher::match_by_auto_detection`], + /// which hard-errors) and the provenance guard ([`Self::pick_best_provenance`], + /// which returns `None`) decide it the same way and can't drift apart.
+ fn matching_regex_error(&self) -> Option<&str> { + match &self.matching_regex { + Some(Err(msg)) => Some(msg.as_str()), + _ => None, + } + } + + /// Apply the `matching` / `matching_regex` pre-filter to the candidate set. + /// + /// Returns the assets that pass the filter; when neither option is set this + /// is the full list unchanged (so the no-matching path is byte-for-byte the + /// previous behavior). The regex was compiled once when the picker was built, + /// so this uses the cached result and never recompiles. An invalid pattern + /// (`Some(Err)`) fails closed — it matches *nothing* rather than degrading to + /// "no filter" — so a misconfiguration surfaces as "no asset found" instead + /// of silently installing whatever plain autodetection would have picked. On + /// the binary path that empty result is turned into a hard error upstream in + /// [`AssetMatcher::match_by_auto_detection`]. + fn apply_matching_filter<'a>(&self, assets: &'a [String]) -> Vec<&'a String> { + assets + .iter() + .filter(|asset| match &self.matching { + Some(m) => asset.contains(m.as_str()), + None => true, + }) + .filter(|asset| match &self.matching_regex { + Some(Ok(re)) => re.is_match(asset), + // Invalid pattern: fail closed (matches nothing). + Some(Err(_)) => false, + None => true, + }) + .collect() + } + /// Picks the best asset from available options. /// /// When multiple assets tie on score, prefers the shortest name. This handles @@ -224,7 +303,21 @@ impl AssetPicker { /// canonical binary's name is almost always the shortest. /// See: https://github.com/jdx/mise/discussions/9358 pub fn pick_best_asset(&self, assets: &[String]) -> Option<String> { - let scored_assets = self.score_all_assets(assets); + // Narrow by `matching`/`matching_regex` before scoring. When neither is + // set, score the assets directly — no filtering, no intermediate clone — + // so the no-matching path is allocation-identical to the pre-feature + // behavior.
Only when a filter is active do we materialize the narrowed + // candidate set. + let scored_assets = if self.matching.is_none() && self.matching_regex.is_none() { + self.score_all_assets(assets) + } else { + let candidates: Vec<String> = self + .apply_matching_filter(assets) + .into_iter() + .cloned() + .collect(); + self.score_all_assets(&candidates) + }; scored_assets .into_iter() .filter(|(score, asset)| *score > 0 && !self.has_arch_mismatch(asset)) @@ -255,8 +348,53 @@ impl AssetPicker { return None; } + // Narrow by `matching`/`matching_regex` so a multi-binary release's + // per-binary provenance files don't cross-verify (e.g. attaching oxfmt's + // provenance to an oxlint install). Mirrors the pre-filter the binary + // picker applies, keeping the provenance aligned with the selected tool. + // + // When neither filter is set, score the provenance files directly — no + // intermediate clone — mirroring the binary picker's no-op short-circuit + // (`pick_best_asset`). `owned_provenance` is function-scoped so the narrowed + // `candidates` can borrow it past the `if`. + let owned_provenance: Vec<String>; + let candidates: Vec<&String> = if self.matching.is_none() && self.matching_regex.is_none() { + provenance_assets + } else { + // A malformed `matching_regex` is a different case from a valid filter + // that excludes everything (handled by the fallback below). We can't + // trust a garbage pattern to narrow anything, so refuse to pick rather + // than fall back to the full set and risk attaching the wrong binary's + // provenance. Production never reaches here with a bad pattern: the + // autodetection path validates it up front and hard-errors first; the + // `asset_pattern` path suppresses `matching` entirely for provenance + // (`matching_for_provenance`); and the install path that reuses a cached + // lockfile URL — which skips binary selection — validates it explicitly + // via [`validate_matching_regex`] before any verification runs.
So a bad + // pattern is never threaded into this picker. This guard purely backstops + // a future caller that builds a provenance picker without any of those + // protections. + if self.matching_regex_error().is_some() { + return None; + } + + // Fall back to the full provenance set when the filter excludes + // everything: a single shared provenance file (e.g. goreleaser's + // `multiple.intoto.jsonl`) attests every artifact in the release but + // doesn't carry the binary name, so it would be filtered out. Dropping + // it would silently skip verification — a downgrade — so we keep it + // instead and let cryptographic verification decide. + owned_provenance = provenance_assets.into_iter().cloned().collect(); + let filtered = self.apply_matching_filter(&owned_provenance); + if filtered.is_empty() { + owned_provenance.iter().collect() + } else { + filtered + } + }; + // Score by platform match only (no format/build penalties) - let mut scored: Vec<(i32, &String)> = provenance_assets + let mut scored: Vec<(i32, &String)> = candidates .into_iter() .map(|asset| { let score = self.score_os_match(asset) + self.score_arch_match(asset); @@ -569,6 +707,26 @@ static CHECKSUM_PATTERNS: LazyLock> = LazyLock::new(|| { ] }); +/// Validate a `matching_regex` option string, returning a hard error that names +/// the pattern if it fails to compile (an empty/`None` value is a no-op). +/// +/// Binary selection already surfaces an invalid pattern via +/// [`AssetMatcher::match_by_auto_detection`], but the github backend's install +/// path can reuse a cached lockfile URL and skip binary selection entirely +/// (`install_version_`). That branch must still reject a bad pattern up front — +/// otherwise the invalid regex reaches [`AssetPicker::pick_best_provenance`], +/// which returns `None` and is read downstream as "no provenance", silently +/// skipping SLSA verification. This reuses the picker's cached-compile and error +/// message so every path decides "is the pattern valid?" 
identically. +pub fn validate_matching_regex(matching_regex: Option<&str>) -> Result<()> { + let picker = AssetPicker::with_libc(String::new(), String::new(), None) + .with_matching_regex(matching_regex.unwrap_or_default()); + if let Some(msg) = picker.matching_regex_error() { + return Err(eyre::eyre!("{msg}")); + } + Ok(()) +} + /// Represents a matched asset with metadata #[derive(Debug, Clone)] pub struct MatchedAsset { @@ -589,6 +747,10 @@ pub struct AssetMatcher { no_app: bool, /// Preferred primary executable/tool name for asset selection preferred_name: Option<String>, + /// Substring an asset name must contain (ubi's `matching`) + matching: Option<String>, + /// Regex an asset name must match (ubi's `matching_regex`) + matching_regex: Option<String>, } impl AssetMatcher { @@ -620,6 +782,32 @@ impl AssetMatcher { self } + /// Narrow candidates to assets whose name contains `matching` before + /// platform autodetection (ubi's `matching`). Empty is a no-op. Mirrors + /// [`Self::with_preferred_name`]'s signature so the optional string fields + /// are configured the same way. + pub fn with_matching(mut self, matching: impl Into<String>) -> Self { + let matching = matching.into(); + if !matching.is_empty() { + self.matching = Some(matching); + } + self + } + + /// Narrow candidates to assets matching `matching_regex` before platform + /// autodetection (ubi's `matching_regex`). Empty is a no-op. + /// + /// This stores the *unparsed* pattern by design: the compile-once cache lives + /// on [`AssetPicker`] (built in [`Self::create_picker`]), so validity is a + /// local property of the picker rather than of this builder.
+ pub fn with_matching_regex(mut self, matching_regex: impl Into<String>) -> Self { + let matching_regex = matching_regex.into(); + if !matching_regex.is_empty() { + self.matching_regex = Some(matching_regex); + } + self + } + /// Pick the best matching asset from a list of names pub fn pick_from(&self, assets: &[String]) -> Result<MatchedAsset> { self.match_by_auto_detection(assets) @@ -666,7 +854,9 @@ impl AssetMatcher { Some( AssetPicker::with_libc(os.clone(), arch.clone(), self.target_libc.clone()) .with_no_app(self.no_app) - .with_preferred_name(self.preferred_name.clone().unwrap_or_default()), + .with_preferred_name(self.preferred_name.clone().unwrap_or_default()) + .with_matching(self.matching.clone().unwrap_or_default()) + .with_matching_regex(self.matching_regex.clone().unwrap_or_default()), ) } @@ -675,13 +865,40 @@ impl AssetMatcher { .create_picker() .ok_or_else(|| eyre::eyre!("Target OS and arch must be set for auto-detection"))?; + // Reject an invalid `matching_regex` as a hard error that names it, rather + // than letting it silently drop to plain autodetection and install the + // wrong asset. The picker compiled the pattern once when it was built and + // cached the result, so this just surfaces that error. This is the single + // Result-returning choke point all binary-asset selection funnels through. + if let Some(msg) = picker.matching_regex_error() { + return Err(eyre::eyre!("{msg}")); + } + let best = picker.pick_best_asset(assets).ok_or_else(|| { let os = self.target_os.as_deref().unwrap_or("unknown"); let arch = self.target_arch.as_deref().unwrap_or("unknown"); + // When a matching filter is set, surface it — otherwise an empty + // filter result reads as "no asset for this platform", hiding that + // the user's own `matching`/`matching_regex` excluded everything. + // Report every active filter so a user who set both isn't told only + // half of what narrowed the candidate set.
+ let mut active_filters = Vec::new(); + if let Some(m) = &self.matching { + active_filters.push(format!("matching=\"{m}\"")); + } + if let Some(re) = &self.matching_regex { + active_filters.push(format!("matching_regex=\"{re}\"")); + } + let filter_note = if active_filters.is_empty() { + String::new() + } else { + format!("\nNote: filtered by {}", active_filters.join(", ")) + }; eyre::eyre!( - "No matching asset found for platform {}-{}\nAvailable assets:\n{}", + "No matching asset found for platform {}-{}{}\nAvailable assets:\n{}", os, arch, + filter_note, assets.join("\n") ) })?; @@ -1261,6 +1478,566 @@ abc123def456abc123def456abc123def456abc123def456abc123def456abcd tool-1.0.0-dar assert_eq!(picked, "opengrep_osx_arm64"); } + /// Multi-binary release set used by the `matching` tests below. + /// + /// `oxc-project/oxc` ships both `oxlint` and `oxfmt` as separate per-platform + /// assets in a single release. Neither is named after the repo (`oxc`). + fn oxc_assets() -> Vec<String> { + vec![ + "oxlint-aarch64-apple-darwin.tar.gz".to_string(), + "oxfmt-aarch64-apple-darwin.tar.gz".to_string(), + "oxlint-x86_64-unknown-linux-gnu.tar.gz".to_string(), + "oxfmt-x86_64-unknown-linux-gnu.tar.gz".to_string(), + "oxlint-i686-pc-windows-msvc.zip".to_string(), + "oxfmt-i686-pc-windows-msvc.zip".to_string(), + ] + } + + #[test] + fn test_multi_binary_release_without_matching_is_ambiguous() { + // Demonstrates the gap that `matching` closes, using ONLY existing APIs + // (runs against current `main` with no new production code), and guards + // the unchanged no-matching path — `matching` must stay purely additive. + // + // `oxc-project/oxc` ships `oxlint` and `oxfmt` as separate per-platform + // assets. Neither is named after the repo, so no existing signal can + // portably select `oxlint`: + let assets = vec![ + "oxlint-aarch64-apple-darwin.tar.gz".to_string(), + "oxfmt-aarch64-apple-darwin.tar.gz".to_string(), + ]; + + // 1.
Plain autodetection falls back to the #9358 shortest-name tiebreak + // and picks `oxfmt` (5 chars) over `oxlint` (6) — the wrong binary. + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "oxfmt-aarch64-apple-darwin.tar.gz" + ); + + // 2. The #10008 repo-name preference can't rescue it either: the github + // backend passes preferred_name = the repo's last path segment + // (`oxc`), but neither asset starts with `oxc`, so there is no boost + // and `oxfmt` still wins. This is exactly the missing signal that + // `matching` supplies (see the `matching` tests below). + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_preferred_name("oxc"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "oxfmt-aarch64-apple-darwin.tar.gz" + ); + } + + #[test] + fn test_matching_narrows_multi_binary_release_to_named_binary() { + // `matching=oxlint` supplies the signal autodetection lacks, while + // keeping platform autodetection (ubi's `matching`, ported to github). + let assets = oxc_assets(); + + // macOS arm64 -> the darwin oxlint asset. + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("oxlint"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "oxlint-aarch64-apple-darwin.tar.gz" + ); + + // The SAME config is portable: linux x64 -> the linux oxlint asset. + // (`asset_pattern` can't do this — it discards platform autodetection.) + let picker = AssetPicker::with_libc("linux".to_string(), "x86_64".to_string(), None) + .with_matching("oxlint"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "oxlint-x86_64-unknown-linux-gnu.tar.gz" + ); + } + + #[test] + fn test_matching_selects_the_other_binary_from_the_same_release() { + // Complements the oxlint test above: the SAME oxc release also ships + // oxfmt, and `matching=oxfmt` selects it independently. 
This is what lets + // a `tool_alias` config install both oxlint and oxfmt from one repo, each + // picked portably (see e2e/backend/test_github_tool_alias_matching). + let assets = oxc_assets(); + + // macOS arm64 -> the darwin oxfmt asset. + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("oxfmt"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "oxfmt-aarch64-apple-darwin.tar.gz" + ); + + // Portable across platforms: linux x64 -> the linux oxfmt asset. + let picker = AssetPicker::with_libc("linux".to_string(), "x86_64".to_string(), None) + .with_matching("oxfmt"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "oxfmt-x86_64-unknown-linux-gnu.tar.gz" + ); + } + + #[test] + fn test_matching_regex_narrows_multi_binary_release() { + let assets = oxc_assets(); + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching_regex("^oxlint-"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "oxlint-aarch64-apple-darwin.tar.gz" + ); + } + + #[test] + fn test_matching_still_respects_platform_autodetection() { + // `matching` NARROWS — it does not override platform autodetection the + // way `asset_pattern` does. With `matching=oxlint` on a macOS target but + // only a *windows* oxlint asset surviving the filter, the result is + // None (no asset for this OS/arch) — NOT the wrong-OS asset. + let assets = vec![ + "oxlint-i686-pc-windows-msvc.zip".to_string(), + "oxfmt-aarch64-apple-darwin.tar.gz".to_string(), + ]; + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("oxlint"); + assert_eq!(picker.pick_best_asset(&assets), None); + } + + #[test] + fn test_matching_filtering_out_all_assets_returns_none() { + // If `matching` excludes every asset there is nothing to install; + // callers turn this None into an error naming the matching filter. 
+ let assets = vec!["oxfmt-aarch64-apple-darwin.tar.gz".to_string()]; + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("oxlint"); + assert_eq!(picker.pick_best_asset(&assets), None); + } + + #[test] + fn test_asset_matcher_with_matching_threads_through_to_picker() { + // Covers the high-level builder path the github backend actually uses: + // AssetMatcher::new().for_target(..).with_matching(..).pick_from(..). + use crate::platform::Platform; + + let target = PlatformTarget::new(Platform::parse("linux-x64").unwrap()); + let assets = oxc_assets(); + + // matching threads through AssetMatcher -> AssetPicker; the linux oxlint + // asset is chosen (autodetection still picks the OS/arch). + let picked = AssetMatcher::new() + .for_target(&target) + .with_matching("oxlint") + .pick_from(&assets) + .unwrap() + .name; + assert_eq!(picked, "oxlint-x86_64-unknown-linux-gnu.tar.gz"); + + // Empty matching is a no-op (the github backend passes + // opts.matching().unwrap_or_default(), so an unset option arrives here + // as ""); the same set is ambiguous and the shortest-name tiebreak picks + // oxfmt, proving the no-matching path is unchanged. + let picked = AssetMatcher::new() + .for_target(&target) + .with_matching("") + .pick_from(&assets) + .unwrap() + .name; + assert_eq!(picked, "oxfmt-x86_64-unknown-linux-gnu.tar.gz"); + } + + #[test] + fn test_asset_matcher_empty_matching_regex_is_noop() { + // Twin of the empty-`matching` no-op above, for `matching_regex`. The + // github backend passes opts.matching_regex().unwrap_or_default(), so an + // unset option arrives as "" and must be a no-op (not a filter that + // excludes everything). The set is then ambiguous and the shortest-name + // tiebreak picks oxfmt — identical to the no-filter path. 
+ use crate::platform::Platform; + + let target = PlatformTarget::new(Platform::parse("linux-x64").unwrap()); + let assets = oxc_assets(); + + let picked = AssetMatcher::new() + .for_target(&target) + .with_matching_regex("") + .pick_from(&assets) + .unwrap() + .name; + assert_eq!(picked, "oxfmt-x86_64-unknown-linux-gnu.tar.gz"); + } + + #[test] + fn test_matching_does_not_fall_back_to_sibling_when_named_binary_missing_for_platform() { + // The decisive safety property: when `matching` names a binary that is + // NOT published for this platform, the result is None — it must NOT fall + // back to a *sibling* binary that IS published here. Here oxlint ships for + // linux and windows but not macOS, while oxfmt ships for macOS; a macOS + // target with matching=oxlint must yield None, never the macOS oxfmt. + let assets = vec![ + "oxlint-x86_64-unknown-linux-gnu.tar.gz".to_string(), + "oxlint-i686-pc-windows-msvc.zip".to_string(), + "oxfmt-aarch64-apple-darwin.tar.gz".to_string(), + ]; + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("oxlint"); + assert_eq!(picker.pick_best_asset(&assets), None); + } + + #[test] + fn test_matching_on_windows_target() { + // The matching tests above target macOS/linux; cover a Windows target too + // (matching is platform-string-driven, so this guards the windows arm). + // The oxc fixture ships i686-pc-windows-msvc assets for both binaries; + // matching=oxlint selects the windows oxlint asset, not oxfmt. + let assets = oxc_assets(); + let picker = AssetPicker::with_libc("windows".to_string(), "x86".to_string(), None) + .with_matching("oxlint"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "oxlint-i686-pc-windows-msvc.zip" + ); + } + + #[test] + fn test_invalid_matching_regex_is_a_hard_error() { + // A syntactically invalid `matching_regex` must be a HARD ERROR that + // names the bad pattern — not silently ignored. 
Silently ignoring it + // would fall back to plain autodetection and install the WRONG binary + // (here: oxfmt instead of the intended oxlint) with no signal to the + // user. This matches ubi, which rejects an invalid pattern up front. + use crate::platform::Platform; + + let target = PlatformTarget::new(Platform::parse("linux-x64").unwrap()); + let assets = oxc_assets(); + + // `oxlint(` is invalid (unclosed group). Same bad pattern the e2e uses. + let err = AssetMatcher::new() + .for_target(&target) + .with_matching_regex("oxlint(") + .pick_from(&assets) + .unwrap_err(); + let msg = err.to_string(); + assert!( + msg.contains("matching_regex") && msg.contains("oxlint("), + "error must name the option and the bad pattern, got: {msg}" + ); + } + + #[test] + fn test_validate_matching_regex_rejects_bad_pattern_without_a_picker() { + // The github install path can reuse a cached lockfile URL and skip binary + // selection — the path that normally hard-errors on a bad pattern. That + // branch instead calls `validate_matching_regex` up front so an invalid + // pattern still fails closed (rather than reaching the provenance picker, + // returning `None`, and silently skipping SLSA verification). This guards + // that the standalone validator names the option + the bad pattern, and + // that valid/empty/None patterns are a no-op. + let err = validate_matching_regex(Some("oxlint(")).unwrap_err(); + let msg = err.to_string(); + assert!( + msg.contains("matching_regex") && msg.contains("oxlint("), + "error must name the option and the bad pattern, got: {msg}" + ); + + assert!(validate_matching_regex(Some("^oxlint")).is_ok()); + assert!(validate_matching_regex(Some("")).is_ok()); + assert!(validate_matching_regex(None).is_ok()); + } + + #[test] + fn test_matching_is_a_literal_substring_not_a_regex() { + // `matching` is a plain substring test (str::contains), so regex + // metacharacters in the value are LITERAL. 
`matching="a.c"` selects only + // the asset whose name literally contains "a.c"; the `.` is a dot, not a + // wildcard. The decoy "abc-..." matches `a.c` *as a regex* and is the + // shorter name (so the shortest-name tiebreak would prefer it), so if + // `matching` were ever treated as a regex this assertion would pick the + // wrong asset. Use `matching_regex` when you want pattern semantics. + let assets = vec![ + "mytool-a.c-x86_64-unknown-linux-gnu.tar.gz".to_string(), + "abc-x86_64-unknown-linux-gnu.tar.gz".to_string(), + ]; + let picker = AssetPicker::with_libc("linux".to_string(), "x86_64".to_string(), None) + .with_matching("a.c"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "mytool-a.c-x86_64-unknown-linux-gnu.tar.gz" + ); + } + + #[test] + fn test_matching_and_matching_regex_combine_as_and() { + // matching and matching_regex set TOGETHER on the same picker are ANDed: + // an asset must satisfy both to survive the pre-filter. This is the only + // test that chains both on one picker — the other multi-filter tests use + // separate pickers, so they'd still pass if the two filters were ever + // accidentally ORed in apply_matching_filter. + let assets = oxc_assets(); + + // matching="ox" admits both oxlint and oxfmt; the regex narrows to + // oxlint. The survivor is the intersection: the darwin oxlint asset. + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("ox") + .with_matching_regex("^oxlint-"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "oxlint-aarch64-apple-darwin.tar.gz" + ); + + // Contradictory filters (substring wants oxfmt, regex wants oxlint) + // intersect to nothing -> None, not a fall-back to either filter alone. 
+ let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("oxfmt") + .with_matching_regex("^oxlint-"); + assert_eq!(picker.pick_best_asset(&assets), None); + } + + #[test] + fn test_matching_substring_leaks_into_longer_sibling_name() { + // `matching` uses substring `contains`, so a value that is a prefix of + // another binary's name admits BOTH — it does not uniquely select. This + // documents that footgun and shows the `matching_regex` escape hatch. + let assets = vec![ + "tool-a-x86_64-unknown-linux-gnu.tar.gz".to_string(), + "tool-ab-x86_64-unknown-linux-gnu.tar.gz".to_string(), + ]; + + // "tool-a" is a substring of BOTH names, so both survive the pre-filter + // and the shortest-name tiebreak decides. A user who actually wanted the + // longer-named sibling would silently get the wrong one. + let picker = AssetPicker::with_libc("linux".to_string(), "x86_64".to_string(), None) + .with_matching("tool-a"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "tool-a-x86_64-unknown-linux-gnu.tar.gz" + ); + + // An anchored `matching_regex` disambiguates: only tool-ab matches. + let picker = AssetPicker::with_libc("linux".to_string(), "x86_64".to_string(), None) + .with_matching_regex("^tool-ab-"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "tool-ab-x86_64-unknown-linux-gnu.tar.gz" + ); + } + + #[test] + fn test_direct_picker_invalid_regex_fails_closed() { + // The picker caches the compiled matching_regex. A direct AssetPicker + // built with a bad pattern must fail CLOSED: an invalid regex matches + // nothing (-> None), never degrading to "no filter" and silently + // installing the autodetected asset. (The AssetMatcher path turns this + // into the hard error covered by test_invalid_matching_regex_is_a_hard_error; + // the provenance path returns None via + // test_pick_best_provenance_invalid_regex_returns_none_not_fallback.) 
+ let assets = oxc_assets(); + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching_regex("oxlint("); + assert_eq!(picker.pick_best_asset(&assets), None); + } + + /// Real release set for bazelbuild/buildtools v7.1.2 — three bare binaries + /// per platform. This is the case the ubi backend covers via e2e + /// (`e2e/cli/test_upgrade`: `ubi:bazelbuild/buildtools[matching=buildifier]`); + /// ported here so the github backend has the same coverage at the unit level. + fn bazel_buildtools_assets() -> Vec<String> { + vec![ + "buildifier-darwin-amd64".to_string(), + "buildifier-darwin-arm64".to_string(), + "buildifier-linux-amd64".to_string(), + "buildifier-linux-arm64".to_string(), + "buildifier-windows-amd64.exe".to_string(), + "buildozer-darwin-amd64".to_string(), + "buildozer-darwin-arm64".to_string(), + "buildozer-linux-amd64".to_string(), + "buildozer-linux-arm64".to_string(), + "buildozer-windows-amd64.exe".to_string(), + "unused_deps-darwin-amd64".to_string(), + "unused_deps-darwin-arm64".to_string(), + "unused_deps-linux-amd64".to_string(), + "unused_deps-linux-arm64".to_string(), + "unused_deps-windows-amd64.exe".to_string(), + ] + } + + #[test] + fn test_matching_selects_buildifier_from_bazel_buildtools() { + // Mirrors the ubi e2e: `matching=buildifier` selects buildifier from a + // multi-binary release, while platform autodetection still chooses the + // correct OS/arch — so one config is portable across platforms. + let assets = bazel_buildtools_assets(); + + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("buildifier"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "buildifier-darwin-arm64" + ); + + let picker = AssetPicker::with_libc("linux".to_string(), "x86_64".to_string(), None) + .with_matching("buildifier"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "buildifier-linux-amd64" + ); + + // matching_regex works the same way.
+ let picker = AssetPicker::with_libc("linux".to_string(), "aarch64".to_string(), None) + .with_matching_regex("^buildifier-"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "buildifier-linux-arm64" + ); + } + + #[test] + fn test_bazel_buildtools_without_matching_picks_shortest_not_buildifier() { + // Documents why `matching` is needed for this repo: with three binaries + // per platform and none named after the repo (`buildtools`), the #9358 + // shortest-name tiebreak picks `buildozer` (shorter than `buildifier`), + // so a user wanting buildifier has no portable signal without `matching`. + let assets = bazel_buildtools_assets(); + let picker = AssetPicker::with_libc("linux".to_string(), "x86_64".to_string(), None); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "buildozer-linux-amd64" + ); + } + + /// Real release set for grpc-ecosystem/grpc-gateway v2.27.3 — two binaries + /// per platform that SHARE the `protoc-gen-` prefix. This is the shape behind + /// the wrong-artifact bug ubi hit (ubi #137 / mise discussion #6611), where + /// `--matching protoc-gen-openapiv2` selected the wrong binary because ubi + /// applied `matching` *after* arch filtering. Ported here as a regression + /// guard for the github backend's pre-filter ordering. 
+ fn grpc_gateway_assets() -> Vec<String> { + vec![ + "protoc-gen-grpc-gateway-v2.27.3-darwin-arm64".to_string(), + "protoc-gen-grpc-gateway-v2.27.3-darwin-x86_64".to_string(), + "protoc-gen-grpc-gateway-v2.27.3-linux-arm64".to_string(), + "protoc-gen-grpc-gateway-v2.27.3-linux-x86_64".to_string(), + "protoc-gen-grpc-gateway-v2.27.3-windows-x86_64.exe".to_string(), + "protoc-gen-openapiv2-v2.27.3-darwin-arm64".to_string(), + "protoc-gen-openapiv2-v2.27.3-darwin-x86_64".to_string(), + "protoc-gen-openapiv2-v2.27.3-linux-arm64".to_string(), + "protoc-gen-openapiv2-v2.27.3-linux-x86_64".to_string(), + "protoc-gen-openapiv2-v2.27.3-windows-x86_64.exe".to_string(), + ] + } + + #[test] + fn test_matching_overrides_shortest_name_tiebreak_for_shared_prefix() { + // Regression for the wrong-artifact class of bug (ubi #137 / mise #6611). + // grpc-gateway ships protoc-gen-grpc-gateway and protoc-gen-openapiv2, + // sharing the `protoc-gen-` prefix. The decisive case: `matching` must be + // able to select protoc-gen-grpc-gateway — the LONGER name, which the + // #9358 shortest-name tiebreak would never pick on its own. This proves + // the pre-filter genuinely overrides autodetection's tiebreak rather than + // coinciding with it, this time for binaries that share a prefix (the + // oxc/bazel fixtures cover the same override with distinct prefixes). + let assets = grpc_gateway_assets(); + + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("protoc-gen-grpc-gateway"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "protoc-gen-grpc-gateway-v2.27.3-darwin-arm64" + ); + + // The same config selects protoc-gen-openapiv2 portably across platforms.
+ let picker = AssetPicker::with_libc("linux".to_string(), "x86_64".to_string(), None) + .with_matching("protoc-gen-openapiv2"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "protoc-gen-openapiv2-v2.27.3-linux-x86_64" + ); + + // `contains` is substring, but the prefix is shared safely: the openapiv2 + // matching string does NOT appear in the grpc-gateway asset name, so the + // filter is unambiguous despite the common `protoc-gen-` prefix. + let picker = AssetPicker::with_libc("macos".to_string(), "x86_64".to_string(), None) + .with_matching_regex("^protoc-gen-grpc-gateway-"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "protoc-gen-grpc-gateway-v2.27.3-darwin-x86_64" + ); + } + + #[test] + fn test_grpc_gateway_without_matching_falls_to_tiebreak() { + // Documents why `matching` is required for this repo: without it, both + // binaries score equally for the platform and the shortest-name tiebreak + // decides — so a user wanting the longer-named protoc-gen-grpc-gateway has + // no portable signal. (ubi picked grpc-gateway here via a different + // tiebreak; the point is identical — without `matching` the choice isn't + // the user's to make.) + let assets = grpc_gateway_assets(); + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "protoc-gen-openapiv2-v2.27.3-darwin-arm64" + ); + } + + #[test] + fn test_matching_is_case_sensitive_with_regex_escape_hatch() { + // Characterization for ubi #83 (open: "match executable names + // case-insensitively"). The github backend's `matching` is case-SENSITIVE + // (ubi parity — it uses substring `contains`). Lock that in, and document + // that `matching_regex` with the `(?i)` inline flag is the escape hatch + // for users who need case-insensitive selection. 
+ let assets = vec![ + "OxLint-aarch64-apple-darwin.tar.gz".to_string(), + "oxfmt-aarch64-apple-darwin.tar.gz".to_string(), + ]; + + // Wrong case excludes the intended asset -> None (case-sensitive). + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("oxlint"); + assert_eq!(picker.pick_best_asset(&assets), None); + + // `(?i)` makes the regex case-insensitive and selects it. + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching_regex("(?i)^oxlint-"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "OxLint-aarch64-apple-darwin.tar.gz" + ); + } + + #[test] + fn test_matching_and_case_insensitive_regex_each_apply_independently() { + // When both options are set they AND, and each keeps its own case rule: + // `matching` stays case-SENSITIVE even when `matching_regex` opts into + // case-insensitivity via `(?i)`. So a case-insensitive regex does NOT + // loosen the substring test — an asset must satisfy both as written. + let assets = vec![ + "OxLint-aarch64-apple-darwin.tar.gz".to_string(), + "oxlint-aarch64-apple-darwin.tar.gz".to_string(), + ]; + + // `(?i)^oxlint-` matches both casings, but case-sensitive `matching=oxlint` + // still excludes the capitalized one -> only the lowercase asset survives. + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("oxlint") + .with_matching_regex("(?i)^oxlint-"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "oxlint-aarch64-apple-darwin.tar.gz" + ); + + // Flip the `matching` case: case-sensitive `matching=OxLint` selects the + // capitalized asset even though the regex matches both, proving the + // substring test keeps its own (sensitive) case rule inside the AND. 
+ let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("OxLint") + .with_matching_regex("(?i)^oxlint-"); + assert_eq!( + picker.pick_best_asset(&assets).unwrap(), + "OxLint-aarch64-apple-darwin.tar.gz" + ); + } + #[test] fn test_manylinux_and_musllinux_assets_are_linux_with_libc() { let assets = vec![ @@ -1716,6 +2493,147 @@ abc123def456abc123def456abc123def456abc123def456abc123def456abcd tool-darwin.ta ); } + #[test] + fn test_pick_best_provenance_respects_matching() { + // A multi-binary release that ships a SEPARATE provenance file per binary + // per platform. Both darwin provenance files score identically on + // platform, and pick_best_provenance breaks ties by stable input order + // (no shortest-name tiebreak), so the FIRST one wins — here oxfmt. For an + // oxlint install that attaches oxfmt's provenance, verifying the wrong + // digest. `matching` must narrow provenance the same way it narrows the + // binary so the provenance follows the selected tool. + let assets = vec![ + // oxfmt deliberately first so the unfiltered pick is the WRONG one. + "oxfmt-aarch64-apple-darwin.intoto.jsonl".to_string(), + "oxlint-aarch64-apple-darwin.intoto.jsonl".to_string(), + ]; + + // Without matching: positional tiebreak picks oxfmt's provenance. + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None); + assert_eq!( + picker.pick_best_provenance(&assets).unwrap(), + "oxfmt-aarch64-apple-darwin.intoto.jsonl" + ); + + // matching=oxlint selects oxlint's provenance despite oxfmt being first. + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("oxlint"); + assert_eq!( + picker.pick_best_provenance(&assets).unwrap(), + "oxlint-aarch64-apple-darwin.intoto.jsonl" + ); + + // matching=oxfmt selects oxfmt's, independently — proves it narrows to the + // named binary rather than just preferring a fixed one. 
+ let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("oxfmt"); + assert_eq!( + picker.pick_best_provenance(&assets).unwrap(), + "oxfmt-aarch64-apple-darwin.intoto.jsonl" + ); + } + + #[test] + fn test_pick_best_provenance_respects_matching_regex() { + // Same as above but driven by matching_regex, since both options thread + // into the picker and both must narrow provenance. + let assets = vec![ + "oxfmt-x86_64-unknown-linux-gnu.intoto.jsonl".to_string(), + "oxlint-x86_64-unknown-linux-gnu.intoto.jsonl".to_string(), + ]; + let picker = AssetPicker::with_libc("linux".to_string(), "x86_64".to_string(), None) + .with_matching_regex("^oxlint-"); + assert_eq!( + picker.pick_best_provenance(&assets).unwrap(), + "oxlint-x86_64-unknown-linux-gnu.intoto.jsonl" + ); + } + + #[test] + fn test_pick_best_provenance_matching_keeps_platform_autodetection() { + // matching narrows to the binary; platform autodetection still chooses the + // right OS/arch among that binary's per-platform provenance files. So a + // portable `matching=oxlint` config picks the linux oxlint provenance on a + // linux target — not oxlint's darwin provenance, and not oxfmt's anything. + let assets = vec![ + "oxlint-aarch64-apple-darwin.intoto.jsonl".to_string(), + "oxlint-x86_64-unknown-linux-gnu.intoto.jsonl".to_string(), + "oxfmt-x86_64-unknown-linux-gnu.intoto.jsonl".to_string(), + ]; + let picker = AssetPicker::with_libc("linux".to_string(), "x86_64".to_string(), None) + .with_matching("oxlint"); + assert_eq!( + picker.pick_best_provenance(&assets).unwrap(), + "oxlint-x86_64-unknown-linux-gnu.intoto.jsonl" + ); + } + + #[test] + fn test_pick_best_provenance_matching_falls_back_to_shared_file() { + // goreleaser-style: ONE provenance file attests every artifact in the + // release (its subject digest list covers oxlint too). 
Its name doesn't + // contain the binary name, so the matching filter would exclude it — but + // with no per-binary provenance to fall back to, dropping it would lose + // verification entirely. The shared file must still be returned. + let assets = vec![ + "oxlint-aarch64-apple-darwin.tar.gz".to_string(), + "oxfmt-aarch64-apple-darwin.tar.gz".to_string(), + "multiple.intoto.jsonl".to_string(), + ]; + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("oxlint"); + assert_eq!( + picker.pick_best_provenance(&assets).unwrap(), + "multiple.intoto.jsonl" + ); + } + + #[test] + fn test_pick_best_provenance_matching_excludes_all_real_provenance_falls_back() { + // Per-binary provenance exists but matching excludes ALL of it (e.g. a + // typo'd or over-narrow filter). Rather than report "no provenance" and + // silently skip verification — a downgrade — fall back to the full + // provenance set so verification still runs (and fails loudly if the + // digest doesn't match), mirroring how the binary path errors rather than + // silently degrading. + let assets = vec![ + "oxfmt-aarch64-apple-darwin.intoto.jsonl".to_string(), + "oxlint-aarch64-apple-darwin.intoto.jsonl".to_string(), + ]; + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching("does-not-exist"); + // Falls back to platform scoring over all provenance (positional tiebreak). + assert_eq!( + picker.pick_best_provenance(&assets).unwrap(), + "oxfmt-aarch64-apple-darwin.intoto.jsonl" + ); + } + + #[test] + fn test_pick_best_provenance_invalid_regex_returns_none_not_fallback() { + // Defense-in-depth: an INVALID matching_regex must NOT fall back to the + // full provenance set (which could attach the wrong binary's provenance). 
+ // This is deliberately DIFFERENT from a VALID but over-narrow filter (see + // test_pick_best_provenance_matching_excludes_all_real_provenance_falls_back), + // which DOES fall back so verification still runs and fails loudly. A + // malformed pattern can't be trusted to narrow anything, so we refuse to + // pick rather than guess at a provenance file. + // + // In production this is unreachable — binary selection validates the regex + // up front and hard-errors first (test_invalid_matching_regex_is_a_hard_error) + // — so this guards against a future refactor that reaches a provenance + // picker without first resolving (and validating) the binary. The compiled + // regex is cached on the picker, so validity is a local property of the + // picker rather than something that depends on call ordering. + let assets = vec![ + "oxfmt-aarch64-apple-darwin.intoto.jsonl".to_string(), + "oxlint-aarch64-apple-darwin.intoto.jsonl".to_string(), + ]; + let picker = AssetPicker::with_libc("macos".to_string(), "aarch64".to_string(), None) + .with_matching_regex("oxlint("); // invalid: unclosed group + assert_eq!(picker.pick_best_provenance(&assets), None); + } + #[test] fn test_vsix_vs_gz() { let picker = AssetPicker::with_libc("macos".to_string(), "x86_64".to_string(), None); diff --git a/src/backend/github.rs b/src/backend/github.rs index 956178ddc8..c960981c5a 100644 --- a/src/backend/github.rs +++ b/src/backend/github.rs @@ -107,6 +107,42 @@ impl<'a> GitBackendOptions<'a> { }) } + /// Substring an asset name must contain to remain a candidate, applied as a + /// pre-filter before platform autodetection (ported from the ubi backend). + fn matching(&self) -> Option<&'a str> { + self.values.str("matching") + } + + /// Regex an asset name must match to remain a candidate, applied as a + /// pre-filter before platform autodetection (ported from the ubi backend). 
+ fn matching_regex(&self) -> Option<&'a str> { + self.values.str("matching_regex") + } + + /// `matching`/`matching_regex` for *provenance* selection, suppressed when + /// `asset_pattern` is set for this target. + /// + /// `asset_pattern` replaces autodetection and selects the binary directly, + /// ignoring the matching pre-filter (see the asset-selection call sites). Its + /// provenance must therefore align with that asset by platform alone: + /// re-applying `matching` here could narrow provenance to a *different* binary + /// than `asset_pattern` picked. Suppressing it also keeps an invalid + /// `matching_regex` — which is never validated on the asset_pattern path, + /// since that path skips `match_by_auto_detection` — from reaching the + /// provenance picker and silently returning no provenance (a verification + /// downgrade). When `asset_pattern` is unset this is just `matching`/ + /// `matching_regex` unchanged. + fn matching_for_provenance( + &self, + target: &PlatformTarget, + ) -> (Option<&'a str>, Option<&'a str>) { + if self.asset_pattern_for_target(target).is_some() { + (None, None) + } else { + (self.matching(), self.matching_regex()) + } + } + fn lockfile_options(&self, target: &PlatformTarget) -> BTreeMap<String, String> { let mut result = BTreeMap::new(); if self.api_url() != self.default_api_url { @@ -124,6 +160,12 @@ impl<'a> GitBackendOptions<'a> { if self.no_app_for_target(target) { result.insert("no_app".to_string(), "true".to_string()); } + if let Some(value) = self.matching() { + result.insert("matching".to_string(), value.to_string()); + } + if let Some(value) = self.matching_regex() { + result.insert("matching_regex".to_string(), value.to_string()); + } result } } @@ -179,6 +221,8 @@ pub fn install_time_option_keys() -> Vec<String> { "url".into(), "version_prefix".into(), "no_app".into(), + "matching".into(), + "matching_regex".into(), ] } @@ -469,6 +513,28 @@ impl Backend for UnifiedGitBackend { let opts = self.options(&raw_opts); let api_url = 
opts.api_url(); + // Validate `matching_regex` up front, before the cached-URL branch below. + // Reusing a cached lockfile URL skips binary selection (the path that + // normally hard-errors on a bad pattern), so without this an invalid + // regex would reach the provenance picker, return `None`, and silently + // skip SLSA verification rather than failing closed. + // + // Skip this when `asset_pattern` is set: it supersedes `matching`/ + // `matching_regex` for both binary selection and provenance (see + // `matching_for_provenance` and `resolve_*_asset_url_for_target`), so the + // regex is never consulted and an invalid one is irrelevant. mise doesn't + // hard-fail on options it won't act on — `url` short-circuits before + // `asset_pattern` is ever templated (resolve_asset_url_for_target), an + // ignored hook `shell` is dropped with a warning (src/hooks.rs), and + // unknown tool-option keys are silently ignored. Validating here would be + // the lone exception that rejects a superseded option. + if opts + .asset_pattern_for_target(&PlatformTarget::from_current()) + .is_none() + { + asset_matcher::validate_matching_regex(opts.matching_regex())?; + } + // Check if URL already exists in lockfile platforms first let platform_key = self.get_platform_key(); @@ -538,6 +604,19 @@ impl Backend for UnifiedGitBackend { let opts = self.options(&raw_opts); let api_url = opts.api_url(); + // Fail closed on an invalid `matching_regex` instead of writing an empty + // entry. The `Err` arm below intentionally swallows resolution failures so a + // platform with no matching asset is skipped rather than failing the whole + // (best-effort) lock — but `resolve_asset_url_for_target` returns the same + // `Err` for an invalid regex, which would then be caught and written as a + // url-less `PlatformInfo::default()`. Returning `Err` here makes the lock + // orchestration skip the platform (no entry written) instead. 
Gated on the + // same `asset_pattern` precedence as everywhere else, so an ignored regex is + // never validated. + if opts.asset_pattern_for_target(target).is_none() { + asset_matcher::validate_matching_regex(opts.matching_regex())?; + } + // Resolve asset for the target platform let asset = self .resolve_asset_url_for_target(tv, &opts, &repo, &api_url, target) @@ -734,11 +813,18 @@ impl UnifiedGitBackend { // when a matching provenance file exists for the target platform. if settings.slsa && settings.github.slsa { let asset_names: Vec<String> = release.assets.iter().map(|a| a.name.clone()).collect(); + // Narrow provenance the same way the binary is narrowed, so a + // multi-binary release's per-binary provenance files don't + // cross-verify the wrong digest. Suppressed when `asset_pattern` is + // set (it selects the binary, ignoring `matching`). + let (matching, matching_regex) = opts.matching_for_provenance(target); let picker = AssetPicker::with_libc( target.os_name().to_string(), target.arch_name().to_string(), target.qualifier().map(|s| s.to_string()), - ); + ) + .with_matching(matching.unwrap_or_default()) + .with_matching_regex(matching_regex.unwrap_or_default()); if let Some(provenance_name) = picker.pick_best_provenance(&asset_names) { let url = release .assets @@ -852,11 +938,16 @@ impl UnifiedGitBackend { let asset_names: Vec<String> = release.assets.iter().map(|a| a.name.clone()).collect(); let current_platform = PlatformTarget::from_current(); + // Keep provenance aligned with the matching-selected binary, unless + // `asset_pattern` is set (it selects the binary, ignoring `matching`). 
+ let (matching, matching_regex) = opts.matching_for_provenance(&current_platform); let picker = AssetPicker::with_libc( current_platform.os_name().to_string(), current_platform.arch_name().to_string(), current_platform.qualifier().map(|s| s.to_string()), - ); + ) + .with_matching(matching.unwrap_or_default()) + .with_matching_regex(matching_regex.unwrap_or_default()); if let Some(provenance_name) = picker.pick_best_provenance(&asset_names) { let provenance_asset = release @@ -1253,7 +1344,10 @@ impl UnifiedGitBackend { .map(|a| Asset::new(&a.name, &a.browser_download_url)) .collect(); - // Try explicit pattern first + // Try explicit pattern first. `asset_pattern` replaces autodetection + // entirely, so it intentionally takes precedence over and ignores + // `matching`/`matching_regex` (there is no autodetected candidate set left + // to narrow). Do NOT thread the matching filter into this branch. if let Some(pattern) = opts.asset_pattern_for_target(target) { // Template the pattern for the target platform let templated_pattern = template_string_for_target(&pattern, tv, target); @@ -1289,6 +1383,8 @@ impl UnifiedGitBackend { .for_target(target) .with_no_app(opts.no_app_for_target(target)) .with_preferred_name(self.preferred_asset_name()) + .with_matching(opts.matching().unwrap_or_default()) + .with_matching_regex(opts.matching_regex().unwrap_or_default()) .pick_from(&available_assets)? .name; let asset = self @@ -1343,7 +1439,10 @@ impl UnifiedGitBackend { .map(|a| Asset::new(&a.name, &a.direct_asset_url)) .collect(); - // Try explicit pattern first + // Try explicit pattern first. `asset_pattern` replaces autodetection + // entirely, so it intentionally takes precedence over and ignores + // `matching`/`matching_regex` (there is no autodetected candidate set left + // to narrow). Do NOT thread the matching filter into this branch. 
if let Some(pattern) = opts.asset_pattern_for_target(target) { // Template the pattern for the target platform let templated_pattern = template_string_for_target(&pattern, tv, target); @@ -1376,6 +1475,8 @@ impl UnifiedGitBackend { .for_target(target) .with_no_app(opts.no_app_for_target(target)) .with_preferred_name(self.preferred_asset_name()) + .with_matching(opts.matching().unwrap_or_default()) + .with_matching_regex(opts.matching_regex().unwrap_or_default()) .pick_from(&available_assets)? .name; let asset = self @@ -1430,7 +1531,10 @@ impl UnifiedGitBackend { ) }; - // Try explicit pattern first + // Try explicit pattern first. `asset_pattern` replaces autodetection + // entirely, so it intentionally takes precedence over and ignores + // `matching`/`matching_regex` (there is no autodetected candidate set left + // to narrow). Do NOT thread the matching filter into this branch. if let Some(pattern) = opts.asset_pattern_for_target(target) { // Template the pattern for the target platform let templated_pattern = template_string_for_target(&pattern, tv, target); @@ -1463,6 +1567,8 @@ impl UnifiedGitBackend { .for_target(target) .with_no_app(opts.no_app_for_target(target)) .with_preferred_name(self.preferred_asset_name()) + .with_matching(opts.matching().unwrap_or_default()) + .with_matching_regex(opts.matching_regex().unwrap_or_default()) .pick_from(&available_assets)? .name; let asset = self @@ -1975,11 +2081,17 @@ impl UnifiedGitBackend { // Find the best provenance asset for the current platform let asset_names: Vec<String> = release.assets.iter().map(|a| a.name.clone()).collect(); let current_platform = PlatformTarget::from_current(); + // Keep provenance aligned with the matching-selected binary at install + // time, matching the selection used at lock time. Suppressed when + // `asset_pattern` is set (it selects the binary, ignoring `matching`). 
+ let (matching, matching_regex) = opts.matching_for_provenance(&current_platform); let picker = AssetPicker::with_libc( current_platform.os_name().to_string(), current_platform.arch_name().to_string(), current_platform.qualifier().map(|s| s.to_string()), - ); + ) + .with_matching(matching.unwrap_or_default()) + .with_matching_regex(matching_regex.unwrap_or_default()); let provenance_name = match picker.pick_best_provenance(&asset_names) { Some(name) => name, @@ -2163,6 +2275,20 @@ mod tests { )) } + fn create_test_gitlab_backend() -> UnifiedGitBackend { + UnifiedGitBackend::from_arg(BackendArg::new( + "gitlab:test/repo".to_string(), + Some("gitlab:test/repo".to_string()), + )) + } + + fn create_test_forgejo_backend() -> UnifiedGitBackend { + UnifiedGitBackend::from_arg(BackendArg::new( + "forgejo:test/repo".to_string(), + Some("forgejo:test/repo".to_string()), + )) + } + #[test] fn test_pick_by_pattern_basic() { // Single-match cases that the old `matches_pattern` test covered. @@ -2278,6 +2404,18 @@ mod tests { assert_eq!(backend.strip_version_prefix("1.0.0", &opts), "1.0.0"); } + #[test] + fn test_matching_options_are_install_time_keys() { + // `matching`/`matching_regex` must be install-time-only keys so a stale + // cached filter from a prior install can't silently override what's in + // mise.toml now. They are deliberately NOT folded into the install path + // (that stays keyed by tool name + version) — `tool_alias` is the way to + // install multiple binaries from one repo into distinct dirs. 
+ let keys = install_time_option_keys(); + assert!(keys.contains(&"matching".to_string())); + assert!(keys.contains(&"matching_regex".to_string())); + } + #[test] fn test_lockfile_options_use_target_artifact_inputs() { let backend = create_test_backend(); @@ -2290,6 +2428,17 @@ mod tests { "version_prefix".to_string(), toml::Value::String("release-".to_string()), ); + // matching/matching_regex are top-level (not per-platform) options; they + // must round-trip into lockfile_options for every target so a relock on + // another OS reproduces the same asset selection. + opts.opts.insert( + "matching".to_string(), + toml::Value::String("tool".to_string()), + ); + opts.opts.insert( + "matching_regex".to_string(), + toml::Value::String("^tool-".to_string()), + ); let mut platforms = toml::Table::new(); let mut linux = toml::Table::new(); linux.insert( @@ -2321,6 +2470,8 @@ mod tests { "asset_pattern".to_string(), "tool-*-linux.tar.gz".to_string() ), + ("matching".to_string(), "tool".to_string()), + ("matching_regex".to_string(), "^tool-".to_string()), ("version_prefix".to_string(), "release-".to_string()), ]) ); @@ -2335,12 +2486,142 @@ mod tests { "asset_pattern".to_string(), "tool-*-windows.zip".to_string() ), + ("matching".to_string(), "tool".to_string()), + ("matching_regex".to_string(), "^tool-".to_string()), ("no_app".to_string(), "true".to_string()), ("version_prefix".to_string(), "release-".to_string()), ]) ); } + #[test] + fn test_matching_for_provenance_suppressed_when_asset_pattern_set() { + // `asset_pattern` selects the binary directly and ignores `matching`, so + // provenance must NOT be narrowed by `matching` on that path — otherwise a + // self-contradictory config could attach a *different* binary's provenance + // than `asset_pattern` picked, and an invalid `matching_regex` (never + // validated on the asset_pattern path) could silently skip verification. 
+ let backend = create_test_backend(); + let mut opts = ToolVersionOptions::default(); + opts.opts.insert( + "matching".to_string(), + toml::Value::String("oxlint".to_string()), + ); + opts.opts.insert( + "matching_regex".to_string(), + toml::Value::String("^oxlint-".to_string()), + ); + // asset_pattern set for linux only, not for macos. + let mut platforms = toml::Table::new(); + let mut linux = toml::Table::new(); + linux.insert( + "asset_pattern".to_string(), + toml::Value::String("oxlint-*-linux.tar.gz".to_string()), + ); + platforms.insert("linux-x64".to_string(), toml::Value::Table(linux)); + opts.opts + .insert("platforms".to_string(), toml::Value::Table(platforms)); + + let linux = PlatformTarget::new(crate::platform::Platform::parse("linux-x64").unwrap()); + let macos = PlatformTarget::new(crate::platform::Platform::parse("macos-arm64").unwrap()); + + // asset_pattern set for this target -> matching suppressed for provenance. + assert_eq!( + backend.options(&opts).matching_for_provenance(&linux), + (None, None) + ); + // No asset_pattern for this target -> matching flows through to provenance. + assert_eq!( + backend.options(&opts).matching_for_provenance(&macos), + (Some("oxlint"), Some("^oxlint-")) + ); + } + + #[test] + fn test_matching_plumbing_parity_across_git_backends() { + // The github/gitlab/forgejo backends share one option struct + // (`GitBackendOptions`) and one `AssetMatcher`, but each has its OWN + // `resolve_*_asset_url_for_target` function that threads + // `matching`/`matching_regex` separately (copy-paste identical today, with + // only backend-specific asset/digest plumbing differing). This test guards + // the shared seams those three paths depend on from drifting per backend + // type: the option accessors, lockfile serialization, and install-time-key + // inheritance must behave identically for all three. 
The resolve functions + // themselves are covered end-to-end for github by + // e2e/backend/test_github_matching, and the matcher all three feed is + // covered by the asset_matcher unit tests. + let mut opts = ToolVersionOptions::default(); + opts.opts.insert( + "matching".to_string(), + toml::Value::String("oxlint".to_string()), + ); + opts.opts.insert( + "matching_regex".to_string(), + toml::Value::String("^oxlint-".to_string()), + ); + let target = PlatformTarget::new(crate::platform::Platform::parse("linux-x64").unwrap()); + + // Guard that the helpers really build distinct backend types, so the loop + // below genuinely exercises gitlab/forgejo and isn't three githubs. + assert!(create_test_gitlab_backend().is_gitlab()); + assert!(create_test_forgejo_backend().is_forgejo()); + + for backend in [ + create_test_backend(), + create_test_gitlab_backend(), + create_test_forgejo_backend(), + ] { + let backend_type = backend.ba.backend_type(); + let resolved = backend.options(&opts); + + // Accessors the three resolve_*_asset_url_for_target functions read. + assert_eq!( + resolved.matching(), + Some("oxlint"), + "matching() must be readable for {backend_type:?}" + ); + assert_eq!( + resolved.matching_regex(), + Some("^oxlint-"), + "matching_regex() must be readable for {backend_type:?}" + ); + + // Both keys must round-trip to the lockfile for every git backend so a + // relock on another platform reproduces the same asset selection. 
+ let lf = resolved.lockfile_options(&target); + assert_eq!( + lf.get("matching").map(String::as_str), + Some("oxlint"), + "matching must round-trip to lockfile for {backend_type:?}" + ); + assert_eq!( + lf.get("matching_regex").map(String::as_str), + Some("^oxlint-"), + "matching_regex must round-trip to lockfile for {backend_type:?}" + ); + + // Cache-keying: a stale cached filter must never silently override + // mise.toml, so both keys must be install-time keys for every type + // (gitlab/forgejo inherit github's list via the routing in mod.rs). + let itk = crate::backend::install_time_option_keys_for_type(&backend_type); + assert!( + itk.contains(&"matching".to_string()) + && itk.contains(&"matching_regex".to_string()), + "matching/matching_regex must be install-time keys for {backend_type:?}" + ); + // ...including the per-platform `platforms..matching` form the + // stale-cache check uses. + assert!( + crate::backend::is_install_time_option_key_for_type(&backend_type, "matching") + && crate::backend::is_install_time_option_key_for_type( + &backend_type, + "platforms.linux-x64.matching" + ), + "is_install_time_option_key_for_type must report matching for {backend_type:?}" + ); + } + } + #[test] fn test_find_asset_case_insensitive() { let backend = create_test_backend();