Skip to content

perf: cache repeated path canonicalization#10068

Merged
jdx merged 1 commit into
mainfrom
codex/cache-canonicalize
May 25, 2026
Merged

perf: cache repeated path canonicalization#10068
jdx merged 1 commit into
mainfrom
codex/cache-canonicalize

Conversation

@jdx

@jdx jdx commented May 25, 2026

Copy link
Copy Markdown
Owner

Summary

  • add a process-local helper for cached successful path canonicalization
  • use it in repeated PATH/root comparison paths such as shim lookup, activation, hook-env diffing, and install root checks
  • leave freshness-sensitive canonicalization sites such as trust mutation, template filters, lockfiles, and OCI packaging uncached

Context

This is intended as a small prereq for #10019 so shim-directory comparisons can keep symlink-aware canonicalization without repeatedly hitting the filesystem for stable PATH/root entries.

Tests

  • cargo check --bin mise
  • cargo clippy --bin mise -- -D warnings
  • cargo test --bin mise file::

Note

Low Risk
Localized performance refactor with intentional non-caching of failures; behavior should match prior symlink-aware comparisons for stable paths.

Overview
Adds process-local caching for successful absolute-path canonicalize results via new canonicalize_cached and canonicalize_or_self helpers in file.rs. Failed canonicalizations are not cached so paths that appear later in the same run still resolve correctly.

Call sites that repeatedly compare stable directories and PATH entries now use the helpers instead of calling canonicalize every time—activation (shim removal and “already in PATH” checks), hook-env PATH deduplication, doctor path-order warnings, shim system fallback lookup, install symlink root checks, trusted config paths, and task cache keys.

Relative paths still canonicalize on each use without entering the cache. Other code paths that need fresh resolution (trust mutations, lockfiles, templates, etc.) are unchanged.

Reviewed by Cursor Bugbot for commit cdd7e2c. Bugbot is set up for automated code reviews on this repo. Configure here.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a cached path canonicalization mechanism (canonicalize_cached and canonicalize_or_self) in src/file.rs to optimize repeated filesystem path resolution operations. It refactors direct .canonicalize() calls to use these cached alternatives across several modules, including backend operations, CLI activation, doctor checks, environment hooks, settings, shims, and task helpers. I have no feedback to provide as there are no review comments to assess.

@jdx jdx force-pushed the codex/cache-canonicalize branch from d562d22 to cdd7e2c Compare May 25, 2026 16:50
@greptile-apps

greptile-apps Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR introduces a process-local canonicalize_cached helper in src/file.rs that memoises successful fs::canonicalize calls in a Mutex<HashMap>, then threads it through the hot paths in shim lookup, shell activation, hook-env diffing, install-root checks, and task helpers to eliminate redundant filesystem round-trips for stable PATH/root entries.

  • src/file.rs: New canonicalize_cached (caches only successes; skips cache for relative paths) and canonicalize_or_self convenience wrapper, following the same double-lock Mutex pattern already used by the existing which cache.
  • src/shims.rs: In addition to adopting the cache, the refactor also silently fixes a pre-existing bug — the old code used PathBuf::new() (empty path) as the sentinel for "directory not found" in user_shims/sys_shims, which meant a PATH entry whose fs::canonicalize also failed (returning PathBuf::new() via unwrap_or_default) would incorrectly match and be skipped; the new Option<PathBuf> approach makes the absence case explicit.
  • All other call sites (activate.rs, hook_env.rs, doctor/mod.rs, backend/mod.rs, settings.rs, task_helpers.rs): mechanical substitution with no logic changes.

Confidence Score: 5/5

Safe to merge; the change is a well-scoped performance optimisation that also incidentally corrects a subtle empty-path sentinel comparison in the shim fallback path.

All call sites were reviewed. The cache is correctly bounded to successful, absolute-path canonicalisations; failure paths are never cached so paths created later in the same process are handled correctly. The Mutex-double-lock pattern matches the existing which cache in the same file. The shims.rs refactor moves from a fragile PathBuf::new() sentinel to typed Option<PathBuf>, making the logic more correct. No freshness-sensitive sites (trust mutation, lockfile, OCI) were touched.

No files require special attention; src/shims.rs is worth a quick read to confirm the sentinel-fix intent, but the change is clearly correct.

Important Files Changed

Filename Overview
src/file.rs Adds canonicalize_cached (process-local Mutex+HashMap cache for successful canonicalizations) and the convenience wrapper canonicalize_or_self; follows the same double-lock pattern used by the existing which cache.
src/shims.rs Replaces fs::canonicalize + unwrap_or_default with canonicalize_cached; also silently fixes a pre-existing bug where a failed PATH-entry canonicalization could produce PathBuf::new() matching user_shims/sys_shims when those also defaulted to empty paths.
src/cli/activate.rs Switches path comparisons in shim-removal and PATH checks to use canonicalize_cached/canonicalize_or_self; logic is functionally equivalent and benefits from caching.
src/cli/hook_env.rs PATH-diffing logic switches four p.canonicalize().ok() calls to canonicalize_cached; semantics unchanged.
src/cli/doctor/mod.rs Replaces inline lambda with canonicalize_or_self and uses canonicalize_cached as a method-value for and_then; no logic changes.
src/backend/mod.rs Install-root comparison uses canonicalize_cached; uses ? via Option to preserve the "if canonicalize fails, don't skip" logic from the original code.
src/config/settings.rs Trusted-path iterator switches to canonicalize_cached; semantics are equivalent for stable config directory roots.
src/task/task_helpers.rs Thin wrapper canonicalize_path now delegates to canonicalize_or_self, adding process-level caching for task-path cache keys.

Reviews (1): Last reviewed commit: "perf: cache repeated path canonicalizati..." | Re-trigger Greptile

@github-actions

Copy link
Copy Markdown

Hyperfine Performance

mise x -- echo

Command Mean [ms] Min [ms] Max [ms] Relative
mise-2026.5.15 x -- echo 20.4 ± 1.4 17.8 27.5 1.00
mise x -- echo 21.2 ± 2.6 18.3 40.6 1.04 ± 0.15

mise env

Command Mean [ms] Min [ms] Max [ms] Relative
mise-2026.5.15 env 19.7 ± 1.4 17.2 27.7 1.03 ± 0.10
mise env 19.1 ± 1.3 17.0 28.4 1.00

mise hook-env

Command Mean [ms] Min [ms] Max [ms] Relative
mise-2026.5.15 hook-env 20.1 ± 1.3 17.9 26.8 1.00
mise hook-env 20.3 ± 1.7 18.1 29.1 1.01 ± 0.11

mise ls

Command Mean [ms] Min [ms] Max [ms] Relative
mise-2026.5.15 ls 17.9 ± 1.4 14.7 24.4 1.06 ± 0.11
mise ls 16.9 ± 1.2 14.9 22.4 1.00

xtasks/test/perf

Command mise-2026.5.15 mise Variance
install (cached) 136ms 138ms -1%
ls (cached) 61ms 61ms +0%
bin-paths (cached) 68ms 67ms +1%
task-ls (cached) 125ms 127ms -1%

@jdx jdx merged commit 14b3107 into main May 25, 2026
33 checks passed
@jdx jdx deleted the codex/cache-canonicalize branch May 25, 2026 17:22
andrewjamesbrown added a commit to andrewjamesbrown/mise that referenced this pull request May 25, 2026
Bot + maintainer reviews on jdx#10019 surfaced:

- (gemini, HIGH) The dual-dir filter logic was duplicated between
  `dependency_env` and `which_shim`, and the file.rs helpers
  (`path_env_without_shims`, `strip_shims_from_path`, `which_no_shims`)
  still only filtered the user shims dir — leaving every caller of those
  helpers vulnerable to the same recursion through the system shims dir
  this PR is fixing for `dependency_env`. Also flagged that
  `paths_eq`/`replace_path` won't match symlinked roots
  (e.g. `/usr/local/share` → `/private/usr/local/share` on macOS),
  unlike `which_shim`'s canonicalize approach.

- (jdx) Symlink-aware shim-dir checks need memoization — `dependency_env`
  is called per backend resolution and otherwise hits the filesystem on
  every PATH entry. jdx#10068 added `canonicalize_cached` /
  `canonicalize_or_self` for exactly this case.

- (greptile, P2) The regression test pinned `node = "99.0.0"`, a
  non-existent version that could fail at version-resolve and mask a
  fork-bomb regression. `test_go_shim_recursion` uses
  `go = "1.23.3"` — a real-but-uninstalled version — for the same
  reason.

- (greptile/gemini) The test's final
  `assert_contains "echo \"$output\"" '"node"'` embeds $output inside
  double quotes, so bash consumes the JSON's double-quote characters
  when `assert_contains` re-evals the command; the check for literal
  `"node"` then always fails. The sibling `test_go_shim_recursion` uses
  single quotes (`echo '$output'`) for this reason.

Changes:

- Add `file::is_mise_shims_dir(&Path) -> bool` that recognises both the
  user dir (`dirs::SHIMS`) and the system dir
  (`MISE_SYSTEM_DATA_DIR/shims`), using `canonicalize_or_self` from
  jdx#10068 so symlinked roots match without re-hitting the filesystem on
  repeated PATH checks.
- Route all four call sites — `dependency_env`,
  `path_env_without_shims`, `strip_shims_from_path`, `which_no_shims` —
  through the new helper. Eliminates the divergence the file.rs helpers
  carried.
- Test: pin `node = "22.0.0"`, switch to `output="$(... || true)"`,
  and use single-quoted `echo '$output'` so JSON double quotes survive
  the re-eval inside `assert_contains`.

Sandbox: 093f945e-81a3-4d19-bea5-ee6ff164ef62
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant