perf: cache repeated path canonicalization #10068
Code Review
This pull request introduces a cached path canonicalization mechanism (canonicalize_cached and canonicalize_or_self) in src/file.rs to optimize repeated filesystem path resolution operations. It refactors direct .canonicalize() calls to use these cached alternatives across several modules, including backend operations, CLI activation, doctor checks, environment hooks, settings, shims, and task helpers. I have no feedback to provide as there are no review comments to assess.
d562d22 to cdd7e2c
Greptile Summary
This PR introduces a process-local
Confidence Score: 5/5
Safe to merge; the change is a well-scoped performance optimisation that also incidentally corrects a subtle empty-path sentinel comparison in the shim fallback path.
All call sites were reviewed. The cache is correctly bounded to successful, absolute-path canonicalisations; failure paths are never cached so paths created later in the same process are handled correctly. The Mutex-double-lock pattern matches the existing
No files require special attention
Important Files Changed
Reviews (1): Last reviewed commit: "perf: cache repeated path canonicalizati..."
Hyperfine Performance
mise x -- echo
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `mise-2026.5.15 x -- echo` | 20.4 ± 1.4 | 17.8 | 27.5 | 1.00 |
| `mise x -- echo` | 21.2 ± 2.6 | 18.3 | 40.6 | 1.04 ± 0.15 |
mise env
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `mise-2026.5.15 env` | 19.7 ± 1.4 | 17.2 | 27.7 | 1.03 ± 0.10 |
| `mise env` | 19.1 ± 1.3 | 17.0 | 28.4 | 1.00 |
mise hook-env
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `mise-2026.5.15 hook-env` | 20.1 ± 1.3 | 17.9 | 26.8 | 1.00 |
| `mise hook-env` | 20.3 ± 1.7 | 18.1 | 29.1 | 1.01 ± 0.11 |
mise ls
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `mise-2026.5.15 ls` | 17.9 ± 1.4 | 14.7 | 24.4 | 1.06 ± 0.11 |
| `mise ls` | 16.9 ± 1.2 | 14.9 | 22.4 | 1.00 |
xtasks/test/perf
| Command | mise-2026.5.15 | mise | Variance |
|---|---|---|---|
| install (cached) | 136ms | 138ms | -1% |
| ls (cached) | 61ms | 61ms | +0% |
| bin-paths (cached) | 68ms | 67ms | +1% |
| task-ls (cached) | 125ms | 127ms | -1% |
Bot + maintainer reviews on jdx#10019 surfaced:

- (gemini, HIGH) The dual-dir filter logic was duplicated between `dependency_env` and `which_shim`, and the file.rs helpers (`path_env_without_shims`, `strip_shims_from_path`, `which_no_shims`) still only filtered the user shims dir, leaving every caller of those helpers vulnerable to the same recursion through the system shims dir this PR is fixing for `dependency_env`. Also flagged that `paths_eq`/`replace_path` won't match symlinked roots (e.g. `/usr/local/share` → `/private/usr/local/share` on macOS), unlike `which_shim`'s canonicalize approach.
- (jdx) Symlink-aware shim-dir checks need memoization: `dependency_env` is called per backend resolution and otherwise hits the filesystem on every PATH entry. jdx#10068 added `canonicalize_cached` / `canonicalize_or_self` for exactly this case.
- (greptile, P2) The regression test pinned `node = "99.0.0"`, a non-existent version that could fail at version-resolve and mask a fork-bomb regression. `test_go_shim_recursion` uses `go = "1.23.3"` (a real-but-uninstalled version) for the same reason.
- (greptile/gemini) The test's final `assert_contains "echo \"$output\"" '"node"'` embeds $output inside double quotes, so bash consumes the JSON's double-quote characters when `assert_contains` re-evals the command; the check for literal `"node"` then always fails. The sibling `test_go_shim_recursion` uses single quotes (`echo '$output'`) for this reason.

Changes:

- Add `file::is_mise_shims_dir(&Path) -> bool` that recognises both the user dir (`dirs::SHIMS`) and the system dir (`MISE_SYSTEM_DATA_DIR/shims`), using `canonicalize_or_self` from jdx#10068 so symlinked roots match without re-hitting the filesystem on repeated PATH checks.
- Route all four call sites (`dependency_env`, `path_env_without_shims`, `strip_shims_from_path`, `which_no_shims`) through the new helper. Eliminates the divergence the file.rs helpers carried.
- Test: pin `node = "22.0.0"`, switch to `output="$(... || true)"`, and use single-quoted `echo '$output'` so JSON double quotes survive the re-eval inside `assert_contains`.

Sandbox: 093f945e-81a3-4d19-bea5-ee6ff164ef62
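The shims-dir check described in the commit message above could look roughly like the sketch below. This is an illustration, not the code from jdx#10019: the helper takes the shims directories as a parameter instead of reading `dirs::SHIMS` and `MISE_SYSTEM_DATA_DIR`, the names `is_shims_dir` and `strip_shims` are hypothetical, and `canonicalize_or_self` here is an uncached stand-in for the helper added in this PR.

```rust
use std::env;
use std::ffi::{OsStr, OsString};
use std::path::{Path, PathBuf};

// Uncached stand-in for the `canonicalize_or_self` helper from this PR:
// resolve symlinks when possible, otherwise return the path unchanged.
fn canonicalize_or_self(path: &Path) -> PathBuf {
    path.canonicalize().unwrap_or_else(|_| path.to_path_buf())
}

/// Hypothetical helper: true if `candidate` resolves to any of `shims_dirs`
/// (for example the user shims dir and the system shims dir).
fn is_shims_dir(candidate: &Path, shims_dirs: &[PathBuf]) -> bool {
    let candidate = canonicalize_or_self(candidate);
    shims_dirs
        .iter()
        .any(|dir| canonicalize_or_self(dir) == candidate)
}

/// Hypothetical helper: drop every shims entry from a PATH-style value.
/// Falls back to the input if the remaining entries cannot be re-joined.
fn strip_shims(path_var: &OsStr, shims_dirs: &[PathBuf]) -> OsString {
    let kept = env::split_paths(path_var).filter(|p| !is_shims_dir(p, shims_dirs));
    env::join_paths(kept).unwrap_or_else(|_| path_var.to_os_string())
}
```

Comparing canonicalized forms on both sides is what lets a symlinked root match its target; routing every caller through one predicate keeps the user-dir and system-dir handling from drifting apart.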
Summary
Context
This is intended as a small prereq for #10019 so shim-directory comparisons can keep symlink-aware canonicalization without repeatedly hitting the filesystem for stable PATH/root entries.
Tests
Note
Low Risk
Localized performance refactor with intentional non-caching of failures; behavior should match prior symlink-aware comparisons for stable paths.
Overview
Adds process-local caching for successful absolute-path `canonicalize` results via new `canonicalize_cached` and `canonicalize_or_self` helpers in `file.rs`. Failed canonicalizations are not cached so paths that appear later in the same run still resolve correctly.

Call sites that repeatedly compare stable directories and PATH entries now use the helpers instead of calling `canonicalize` every time: activation (shim removal and “already in PATH” checks), `hook-env` PATH deduplication, `doctor` path-order warnings, shim system fallback lookup, install symlink root checks, trusted config paths, and task cache keys.

Relative paths still canonicalize on each use without entering the cache. Other code paths that need fresh resolution (trust mutations, lockfiles, templates, etc.) are unchanged.
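
The caching rules in this overview can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `src/file.rs`: the static name `CANONICALIZE_CACHE`, the `HashMap` keyed by the input path, and the exact signatures are guesses, and `LazyLock` requires Rust 1.80 or newer.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::{LazyLock, Mutex};

// Process-local cache: absolute input path -> canonical path.
// Hypothetical name and layout; the real static may differ.
static CANONICALIZE_CACHE: LazyLock<Mutex<HashMap<PathBuf, PathBuf>>> =
    LazyLock::new(|| Mutex::new(HashMap::new()));

/// Canonicalize `path`, caching only successful results for absolute paths.
pub fn canonicalize_cached(path: &Path) -> io::Result<PathBuf> {
    // Relative paths depend on the current directory, so they bypass the cache.
    if !path.is_absolute() {
        return path.canonicalize();
    }
    // First lock: look up, then release the guard before touching the filesystem.
    let cached = CANONICALIZE_CACHE.lock().unwrap().get(path).cloned();
    if let Some(hit) = cached {
        return Ok(hit);
    }
    // On failure `?` returns early, so errors never enter the cache and a
    // path created later in the same process can still resolve.
    let resolved = path.canonicalize()?;
    // Second lock: store the successful result.
    CANONICALIZE_CACHE
        .lock()
        .unwrap()
        .insert(path.to_path_buf(), resolved.clone());
    Ok(resolved)
}

/// Like `canonicalize_cached`, but fall back to the input path on error.
pub fn canonicalize_or_self(path: &Path) -> PathBuf {
    canonicalize_cached(path).unwrap_or_else(|_| path.to_path_buf())
}
```

Releasing the lock around the filesystem call means two threads may occasionally resolve the same path twice, but neither blocks the other on I/O and both store the same value.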
Reviewed by Cursor Bugbot for commit cdd7e2c. Bugbot is set up for automated code reviews on this repo.