fix(conda): dedup repodata by archive identifier instead of URL #9831
The previous URL-based dedup in flatten_repodata wasn't sufficient: rattler-solve's resolvo solver detects duplicates by DistArchiveIdentifier (name-version-build + archive type), so when conda-forge serves the same archive under multiple URLs (e.g. distinct CDN paths), URL dedup keeps both and the solver rejects them with "encountered duplicate records for <filename>". Dedup by r.identifier instead — the exact key the solver uses — so collisions can no longer slip through. .conda vs .tar.bz2 variants stay distinct because their archive_type differs. Fixes the imagemagick install failure reported in #9829. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
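The difference between the two dedup keys can be sketched as follows. This is a minimal illustration, not the mise or rattler code: `Record`, `Identifier`, `ArchiveType`, both helper functions, and the example URLs are hypothetical stand-ins for `RepoDataRecord` and `DistArchiveIdentifier`.

```rust
use std::collections::HashSet;

// Hypothetical stand-ins for rattler's types; the real ones carry more fields.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum ArchiveType {
    Conda,
    TarBz2,
}

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Identifier {
    name: String,
    version: String,
    build: String,
    archive_type: ArchiveType,
}

#[derive(Clone, Debug)]
struct Record {
    identifier: Identifier,
    url: String,
}

// Old behaviour: two URLs for the same archive both survive.
fn dedup_by_url(records: Vec<Record>) -> Vec<Record> {
    let mut seen = HashSet::new();
    records
        .into_iter()
        .filter(|r| seen.insert(r.url.clone()))
        .collect()
}

// New behaviour: the key matches what the solver compares, so only the
// first record per (name, version, build, archive type) survives.
fn dedup_by_identifier(records: Vec<Record>) -> Vec<Record> {
    let mut seen = HashSet::new();
    records
        .into_iter()
        .filter(|r| seen.insert(r.identifier.clone()))
        .collect()
}

// Same .tar.bz2 archive under two made-up URLs, plus a .conda variant.
fn sample_records() -> Vec<Record> {
    let id = |archive_type| Identifier {
        name: "adwaita-icon-theme".to_string(),
        version: "40.1.1".to_string(),
        build: "ha770c72_1".to_string(),
        archive_type,
    };
    vec![
        Record { identifier: id(ArchiveType::TarBz2), url: "https://cdn-a.example.invalid/pkg.tar.bz2".to_string() },
        Record { identifier: id(ArchiveType::TarBz2), url: "https://cdn-b.example.invalid/pkg.tar.bz2".to_string() },
        Record { identifier: id(ArchiveType::Conda), url: "https://cdn-a.example.invalid/pkg.conda".to_string() },
    ]
}
```

With the sample input, URL-based dedup keeps all three records (two of which the solver would treat as duplicates), while identifier-based dedup keeps one `.tar.bz2` record and the `.conda` variant.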
Greptile Summary
Fixes a conda solve failure where
Confidence Score: 5/5
Safe to merge: the change is a single-file, narrowly scoped fix to conda repodata deduplication with no observable behavior change for non-duplicate inputs. The fix correctly targets the exact key the solver uses to detect duplicates (
No files require special attention.
Important Files Changed
Code Review
This pull request updates the deduplication logic in flatten_repodata to use archive identifiers instead of URLs, ensuring consistency with how rattler-solve detects duplicates. Feedback suggests optimizing the implementation by filtering references before cloning to improve performance and reduce unnecessary memory allocations.
Filter on references and clone only the records that survive dedup, instead of cloning every record up front. Per gemini-code-assist on #9831. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
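A rough sketch of why filtering on references first helps. The `Record` type, the clone counter, and both function names are assumptions made for illustration; the identifier is simplified to a string here.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times a Record is cloned, purely for demonstration.
static CLONES: AtomicUsize = AtomicUsize::new(0);

// Hypothetical, simplified record: the identifier is a plain string.
#[derive(Debug)]
struct Record {
    identifier: String,
    url: String,
}

impl Clone for Record {
    fn clone(&self) -> Self {
        CLONES.fetch_add(1, Ordering::Relaxed);
        Record {
            identifier: self.identifier.clone(),
            url: self.url.clone(),
        }
    }
}

fn record(identifier: &str, url: &str) -> Record {
    Record { identifier: identifier.to_string(), url: url.to_string() }
}

// Clone every record up front, then drop the duplicates.
fn dedup_clone_first(records: &[Record]) -> Vec<Record> {
    let mut seen = HashSet::new();
    let mut kept = Vec::new();
    for owned in records.iter().cloned() {
        if seen.insert(owned.identifier.clone()) {
            kept.push(owned);
        }
    }
    kept
}

// Decide on borrowed data, and clone only the records that survive.
fn dedup_filter_first(records: &[Record]) -> Vec<Record> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut kept = Vec::new();
    for record in records {
        if seen.insert(record.identifier.as_str()) {
            kept.push(record.clone());
        }
    }
    kept
}
```

Both functions return the same records; the second one clones only the survivors and borrows the keys instead of copying them, which matters when a channel's repodata holds many large records.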
Extract the dedup loop into a private helper so it's testable without constructing a gateway RepoData (whose records field is pub(crate)). Adds two regression tests:
- Same name-version-build served under different URLs collapses to one record (the #9829 repro).
- .conda and .tar.bz2 variants of the same package are preserved so the solver's archive-type preference still applies.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
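The two regression cases could look roughly like this. Only the helper name `dedup_records_by_identifier` appears in the PR; its signature here, the `Record`/`Identifier`/`ArchiveType` types, the `record` constructor, and the URLs are assumptions chosen so the sketch compiles without the rattler crates.

```rust
use std::collections::HashSet;

// Hypothetical stand-ins for rattler's archive type and identifier.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum ArchiveType {
    Conda,
    TarBz2,
}

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Identifier {
    name_version_build: String,
    archive_type: ArchiveType,
}

#[derive(Clone, Debug, PartialEq)]
struct Record {
    identifier: Identifier,
    url: String,
}

// Assumed signature: takes borrowed records, returns owned survivors.
fn dedup_records_by_identifier<'a>(
    records: impl IntoIterator<Item = &'a Record>,
) -> Vec<Record> {
    let mut seen: HashSet<&Identifier> = HashSet::new();
    let mut kept = Vec::new();
    for record in records {
        if seen.insert(&record.identifier) {
            kept.push(record.clone());
        }
    }
    kept
}

// Test fixture constructor (hypothetical).
fn record(nvb: &str, archive_type: ArchiveType, url: &str) -> Record {
    Record {
        identifier: Identifier { name_version_build: nvb.to_string(), archive_type },
        url: url.to_string(),
    }
}

#[test]
fn same_identifier_under_different_urls_collapses() {
    let records = vec![
        record("adwaita-icon-theme-40.1.1-ha770c72_1", ArchiveType::TarBz2, "https://a.example.invalid/x.tar.bz2"),
        record("adwaita-icon-theme-40.1.1-ha770c72_1", ArchiveType::TarBz2, "https://b.example.invalid/x.tar.bz2"),
    ];
    assert_eq!(dedup_records_by_identifier(&records).len(), 1);
}

#[test]
fn conda_and_tar_bz2_variants_are_preserved() {
    let records = vec![
        record("adwaita-icon-theme-40.1.1-ha770c72_1", ArchiveType::TarBz2, "https://a.example.invalid/x.tar.bz2"),
        record("adwaita-icon-theme-40.1.1-ha770c72_1", ArchiveType::Conda, "https://a.example.invalid/x.conda"),
    ];
    assert_eq!(dedup_records_by_identifier(&records).len(), 2);
}
```

Taking any iterator of borrowed records keeps the helper independent of the container that owns them, which is what makes it testable without a gateway `RepoData`.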
Hyperfine Performance
mise x -- echo
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `mise-2026.5.6 x -- echo` | 58.1 ± 11.2 | 31.1 | 81.3 | 1.00 |
| `mise x -- echo` | 59.9 ± 10.5 | 31.0 | 81.3 | 1.03 ± 0.27 |
mise env
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `mise-2026.5.6 env` | 56.3 ± 10.3 | 29.9 | 72.0 | 1.00 |
| `mise env` | 57.0 ± 10.3 | 26.6 | 76.7 | 1.01 ± 0.26 |
mise hook-env
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `mise-2026.5.6 hook-env` | 60.2 ± 11.8 | 28.3 | 79.1 | 1.00 |
| `mise hook-env` | 61.0 ± 10.5 | 32.3 | 84.5 | 1.01 ± 0.26 |
mise ls
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `mise-2026.5.6 ls` | 49.6 ± 8.8 | 26.0 | 68.2 | 1.00 |
| `mise ls` | 49.9 ± 8.6 | 21.2 | 66.8 | 1.01 ± 0.25 |
xtasks/test/perf
| Command | mise-2026.5.6 | mise | Variance |
|---|---|---|---|
| install (cached) | 290ms | 271ms | +7% |
| ls (cached) | 190ms | 189ms | +0% |
| bin-paths (cached) | 206ms | 207ms | +0% |
| task-ls (cached) | 706ms | 682ms | +3% |
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
### 🐛 Bug Fixes

- **(backend)** use runtime paths for backend bin dirs by @risu729 in [#9606](#9606)
- **(ci)** preserve vendor/aqua-registry/ in PPA publish workflow by @jdx in [#9782](#9782)
- **(ci)** set UTF-8 locale in e2e Docker image by @jdx in [#9820](#9820)
- **(ci)** pass UTF-8 locale through to e2e tests by @jdx in [#9823](#9823)
- **(conda)** dedup repodata by archive identifier instead of URL by @jdx in [#9831](#9831)
- **(github)** use default shell for credential command by @risu729 in [#9664](#9664)
- **(settings)** distinguish unset known settings from unknown ones by @jdx in [#9818](#9818)
- **(upgrade)** remove completed progress jobs to prevent duplicate output by @jdx in [#9779](#9779)
- **(vfox)** resolve GitHub token lazily inside Lua plugins by @jdx in [#9816](#9816)

### 🚜 Refactor

- **(config)** separate core and backend tool options by @risu729 in [#9753](#9753)
- **(schema)** reuse env directive property schemas by @risu729 in [#9651](#9651)

### 📚 Documentation

- **(aliases)** fix Aliased Versions example and drop stale asdf callout by @jdx in [#9830](#9830)

### ⚡ Performance

- **(aqua)** use phf for baked registry lookups by @risu729 in [#9763](#9763)
- **(task)** cache per-file content hashes for source_freshness_hash_contents by @jdx in [#9819](#9819)

### 🧪 Testing

- **(e2e)** pin aube to known-good version in npm package_manager test by @jdx in [#9794](#9794)

### 📦 Registry

- replace unsupported exe options by @risu729 in [#9587](#9587)
- update pi by @garysassano in [#9792](#9792)

### Chore

- **(ci)** use non-large runners for release builds by @jdx in [#9786](#9786)
- **(ci)** compare registry PRs from fork point by @risu729 in [#9643](#9643)
- **(ci)** make build-copr.sh the single source of truth for COPR chroots by @jdx in [#9788](#9788)
- **(ci)** use crates.io trusted publishing in release-plz by @jdx in [#9793](#9793)
- **(ci)** remove autofix.ci workflow by @jdx in [#9801](#9801)
- **(ci)** restore -large runner for Linux release builds by @jdx in [#9815](#9815)
- **(ci)** add zizmor workflow for github actions security analysis by @jdx in [#9804](#9804)
- **(ci)** assert mise run render produces no diff by @jdx in [#9803](#9803)
- **(copr)** publish EL9 builds via centos-stream+epel-next-9 chroot by @jdx in [#9787](#9787)

### Ci

- remove pull_request_target workflow by @jdx in [#9799](#9799)
- remove caching from publishing workflows by @jdx in [#9800](#9800)

### Security

- reject shell metacharacters in version strings and CI inputs by @jdx in [#9814](#9814)

## 📦 Aqua Registry Updates

### New Packages (11)

- [`Code-Hex/Neo-cowsay`](https://github.com/Code-Hex/Neo-cowsay)
- [`SonarSource/sonarqube-cli`](https://github.com/SonarSource/sonarqube-cli)
- [`earendil-works/pi`](https://github.com/earendil-works/pi)
- [`hylo-lang/hylo-new`](https://github.com/hylo-lang/hylo-new)
- [`jfernandez/bpftop`](https://github.com/jfernandez/bpftop)
- [`modem-dev/hunk`](https://github.com/modem-dev/hunk)
- [`npm/cli`](https://github.com/npm/cli)
- [`racket/racket/minimal`](https://github.com/racket/racket)
- [`slackapi/slack-cli`](https://github.com/slackapi/slack-cli)
- [`vectordotdev/vector`](https://github.com/vectordotdev/vector)
- [`wasilibs/go-yamllint`](https://github.com/wasilibs/go-yamllint)

### Updated Packages (10)

- [`DataDog/pup`](https://github.com/DataDog/pup)
- [`aquasecurity/trivy`](https://github.com/aquasecurity/trivy)
- [`astral-sh/uv`](https://github.com/astral-sh/uv)
- [`caarlos0/svu`](https://github.com/caarlos0/svu)
- [`cargo-bins/cargo-binstall`](https://github.com/cargo-bins/cargo-binstall)
- [`foundry-rs/foundry`](https://github.com/foundry-rs/foundry)
- [`gastownhall/beads`](https://github.com/gastownhall/beads)
- [`gruntwork-io/terragrunt`](https://github.com/gruntwork-io/terragrunt)
- [`pnpm/pnpm`](https://github.com/pnpm/pnpm)
- [`santosr2/TerraTidy`](https://github.com/santosr2/TerraTidy)
Works great.
…9831)

## Summary

Fixes [jdx#9829](jdx#9829): `mise use -g imagemagick` (and other tools pulling `adwaita-icon-theme` transitively) fails with:

```
conda solve failed: encountered duplicate records for adwaita-icon-theme-40.1.1-ha770c72_1.tar.bz2
```

[PR jdx#8337](jdx#8337) previously addressed this class of error by deduplicating records by URL in `flatten_repodata`, but that wasn't sufficient. rattler-solve's resolvo solver detects duplicates by `DistArchiveIdentifier` (`name-version-build` + archive type), not by URL (see `rattler_solve-6.0.2/src/resolvo/mod.rs:414`). When conda-forge serves the same archive under multiple URLs (distinct CDN paths / aliasing), URL-based dedup keeps both, so the solver still rejects them.

This PR switches the dedup key to `r.identifier`, the exact key rattler-solve uses, so colliding records can no longer reach the solver. `.conda` vs `.tar.bz2` variants of the same package remain distinct (their `archive_type` differs), preserving the solver's existing archive-type preference logic.

## Test plan

- [x] `cargo check -p mise` compiles
- [ ] User-reported reproducer: `mise use -g imagemagick` on Linux64

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Changes conda solver input deduplication to drop records by `DistArchiveIdentifier`, which can affect which package variants reach the solver and thus impact dependency resolution. Scope is small and covered by new regression tests, but it touches install/solve behavior.
>
> **Overview**
> Fixes conda solve failures caused by duplicate package records being returned under different URLs by deduplicating solver inputs on `RepoDataRecord.identifier` (archive identifier) instead of URL.
>
> Refactors repodata flattening to reuse a new `dedup_records_by_identifier` helper, and adds unit tests to ensure identical identifiers across different URLs collapse while distinct `.conda` vs `.tar.bz2` variants are preserved.
>
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 01fc485. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Fixes #9829: `mise use -g imagemagick` (and other tools pulling `adwaita-icon-theme` transitively) fails with `conda solve failed: encountered duplicate records for adwaita-icon-theme-40.1.1-ha770c72_1.tar.bz2`.
PR #8337 previously addressed this class of error by deduplicating records by URL in `flatten_repodata`, but that wasn't sufficient. rattler-solve's resolvo solver detects duplicates by `DistArchiveIdentifier` (`name-version-build` + archive type), not by URL (see `rattler_solve-6.0.2/src/resolvo/mod.rs:414`). When conda-forge serves the same archive under multiple URLs (distinct CDN paths / aliasing), URL-based dedup keeps both, so the solver still rejects them.
This PR switches the dedup key to `r.identifier`, the exact key rattler-solve uses, so colliding records can no longer reach the solver. `.conda` vs `.tar.bz2` variants of the same package remain distinct (their `archive_type` differs), preserving the solver's existing archive-type preference logic.
Test plan
- [x] `cargo check -p mise` compiles
- [ ] User-reported reproducer: `mise use -g imagemagick` on Linux64
Note
Medium Risk
Changes conda solver input deduplication to drop records by `DistArchiveIdentifier`, which can affect which package variants reach the solver and thus impact dependency resolution. Scope is small and covered by new regression tests, but it touches install/solve behavior.
Overview
Fixes conda solve failures caused by duplicate package records being returned under different URLs by deduplicating solver inputs on `RepoDataRecord.identifier` (archive identifier) instead of URL.
Refactors repodata flattening to reuse a new `dedup_records_by_identifier` helper, and adds unit tests to ensure identical identifiers across different URLs collapse while distinct `.conda` vs `.tar.bz2` variants are preserved.
Reviewed by Cursor Bugbot for commit 01fc485. Bugbot is set up for automated code reviews on this repo.