refactor(file): rename TarFormat and split archive extraction #10241
risu729 wants to merge 20 commits into
Greptile Summary
This PR refactors archive handling in mise by renaming
Confidence Score: 5/5
Safe to merge; all edge cases touched by the refactor are handled with proper bail! returns, and the explicitly documented behavioral changes are intentional and clearly scoped. The refactor is well-structured: every previously panicking unimplemented!() site is replaced with bail!(), the legacy Raw→gzip-tar fallback is preserved exactly where documented, and callers are systematically updated. No silently swallowed errors or incorrect routing were found across the changed paths. src/cli/generate/tool_stub.rs has a minor redundant format detection call (a pre-existing pattern made more visible by the refactor), but it is harmless.
Reviews (13): Last reviewed commit: "refactor(file): remove TarOptions in fav..."
This PR currently has merge conflicts. If this continues for 7 days, it will be closed automatically. This is warning day 1 of 7. Please update the PR when you have a chance. Feel free to reopen or create a new PR if it is closed and you'd like to continue working on it. This comment was generated by an automated workflow.
262da3d to 9b660a1
This PR currently has failing checks. If this continues for 7 days, it will be closed automatically. This is warning day 1 of 7. Please update the PR when you have a chance. Feel free to reopen or create a new PR if it is closed and you'd like to continue working on it. This comment was generated by an automated workflow.
cargo deny fails on the new unmaintained advisory for proc-macro-error2, a transitive dependency with no safe upgrade path yet.
TarFormat::from_ext handles tgz/tbz/txz aliases at extraction time.
Call TarFormat::from_ext and unarchive directly from the aqua backend.
- rename TarFormat to ArchiveFormat; from_ext returns Option
- replace unarchive/ArchiveOptions with extract_archive/ExtractOptions
- route aqua compressed assets through decompress_file directly
Move tar.br, lz4, sz, rar and aliases into ArchiveFormat and fail via unimplemented! in open_tar and related extraction paths instead of a separate aqua-specific format list.
Use ArchiveFormat::from_ext on JavaMetadata.file_type when present and fall back to filename detection only when metadata omits the format.
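A minimal sketch of that metadata-first detection, for illustration only. The two-variant enum, the free functions `from_ext` and `from_file_name`, and the `detect` helper are simplified stand-ins, not the actual mise code; in particular, the sketch assumes that a present but unrecognized `file_type` does not fall back to the filename.

```rust
// Simplified stand-in for the PR's `ArchiveFormat` (assumption: only two variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArchiveFormat {
    TarGz,
    Zip,
}

// Hypothetical extension parser: `None` for anything unrecognized.
fn from_ext(ext: &str) -> Option<ArchiveFormat> {
    match ext {
        "tar.gz" | "tgz" => Some(ArchiveFormat::TarGz),
        "zip" => Some(ArchiveFormat::Zip),
        _ => None,
    }
}

// Hypothetical filename-based detection: checks known suffixes in order.
fn from_file_name(name: &str) -> Option<ArchiveFormat> {
    for ext in ["tar.gz", "tgz", "zip"] {
        if name.ends_with(&format!(".{ext}")) {
            return from_ext(ext);
        }
    }
    None
}

// Prefer the metadata field; consult the filename only when the field is absent.
fn detect(file_type: Option<&str>, file_name: &str) -> Option<ArchiveFormat> {
    match file_type {
        Some(ft) => from_ext(ft),
        None => from_file_name(file_name),
    }
}
```

With this shape, metadata that says `zip` wins even if the download URL ends in `.tar.gz`, and the filename is only inspected when no `file_type` was provided.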
3c6a7fa to d9d5b2c
Replace unimplemented! with bail! for tar.br, lz4, sz, and rar so encountering these formats surfaces a CLI error instead of panicking.
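To illustrate the difference, here is a rough sketch of returning an error for a recognized but unsupported format. It uses a plain `Result<(), String>` in place of the `bail!` macro so it needs no external crate; the enum and the `check_supported` function are hypothetical and not taken from the PR.

```rust
// Simplified stand-in for the PR's `ArchiveFormat` (assumption: three variants).
#[derive(Debug, Clone, Copy)]
enum ArchiveFormat {
    TarGz,
    TarBr,
    Rar,
}

// Hypothetical guard in front of an extraction path.
fn check_supported(kind: ArchiveFormat) -> Result<(), String> {
    match kind {
        ArchiveFormat::TarGz => Ok(()),
        // With `unimplemented!()` this arm would panic and abort the process;
        // returning `Err` lets the caller report a normal CLI error instead.
        ArchiveFormat::TarBr | ArchiveFormat::Rar => {
            Err(format!("unsupported archive format: {kind:?}"))
        }
    }
}
```

The caller can then propagate the error with `?` and print it like any other failure.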
Satisfy clippy::needless_update after all ExtractOptions fields were set explicitly in extract_archive.
Summary
- Restrict `file::untar` to tar extraction only
- Rename `TarFormat` to `ArchiveFormat`; add `extract_archive` and `ExtractOptions`
- Add `decompress_file` for single-file compression (gz/xz/zst/bz2)
- Remove alias canonicalization from `aqua-registry`; `ArchiveFormat::from_ext` handles aliases at extraction time
- Route aqua compressed assets through `decompress_file` to the registry bin path

Stacked on #10269 (`chore/ignore-proc-macro-error2-advisory`). Depends on #10224.

API changes
| Before | After |
| --- | --- |
| `TarFormat` | `ArchiveFormat` |
| `TarFormat::from_ext` → fallback `Raw` | `ArchiveFormat::from_ext` → `Option<_>` |
| `unarchive(dest, ArchiveOptions { single_file_dest, ... })` | `extract_archive(dest, ExtractOptions)` for archives; `decompress_file(output_file, ...)` for single compression |
| `ArchiveOptions` | `ExtractOptions` (`strip_components`, `pr`, `preserve_mtime` only) |

- `extract_archive` handles tar archives, `zip`/`vsix`, and `7z` (Windows only). It does not accept compressed single-file formats.
- `decompress_file` always writes to an explicit output file path (parent dirs are created).

Removal of implicit tar.gz fallback (`Raw`)
This is the largest behavioral change in the PR.
Previously, unknown or unrecognized format strings silently became `TarFormat::Raw`, and `Raw` was treated as gzip-compressed tar in several places:

- `TarFormat::from_ext`: any unrecognized registry/option string parsed to `Raw` instead of failing.
- `open_tar`: `Raw` (and `Gz`) opened with `GzDecoder`, assuming a tar.gz payload.
- `untar`: also handled zip, 7z, and single-file `gz`/`xz`/`zst`/`bz2` inline; a `Raw` format string could end up on the tar path and get gzip-decoded.
- `src/backend/aqua.rs`: built `TarOptions::new(TarFormat::from_ext(format))` for every package, so unknown aqua formats became `Raw` tar opts. Additionally, `format.starts_with("tar")` forced `untar` for any `tar.*` string even when `from_ext` did not recognize it (e.g. `tar.br` → `Raw` opts but still extracted via the `starts_with("tar")` branch).

This PR removes that silent defaulting in most paths. Formats must be recognized explicitly; unsupported ones fail instead of guessing tar.gz.
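The contrast between the two behaviors can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the code from `src/file.rs`: the enum has only a few variants, and `registry_format` and `user_format` are hypothetical helper names.

```rust
// Simplified stand-in for the PR's `ArchiveFormat`; the real enum has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArchiveFormat {
    TarGz,
    TarXz,
    Zip,
    Raw,
}

impl ArchiveFormat {
    // Returns `None` for anything unrecognized instead of defaulting to `Raw`.
    fn from_ext(ext: &str) -> Option<Self> {
        match ext {
            "tar.gz" | "tgz" => Some(Self::TarGz),
            "tar.xz" | "txz" => Some(Self::TarXz),
            "zip" => Some(Self::Zip),
            _ => None,
        }
    }
}

// Registry-driven path (hypothetical helper): an unknown format is an error, not a guess.
fn registry_format(ext: &str) -> Result<ArchiveFormat, String> {
    ArchiveFormat::from_ext(ext).ok_or_else(|| format!("unsupported format: {ext}"))
}

// User-configured path (hypothetical helper): the caller opts in to the
// legacy `Raw` default explicitly at the call site.
fn user_format(ext: &str) -> ArchiveFormat {
    ArchiveFormat::from_ext(ext).unwrap_or(ArchiveFormat::Raw)
}
```

Because the default now lives at the call site, a reviewer can see which paths still fall back to `Raw` by searching for `unwrap_or`.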
Affected code and behavior
| Location | Before | After | Notes |
| --- | --- | --- | --- |
| `ArchiveFormat::from_ext` (`src/file.rs`) | `parse().unwrap_or(Raw)` | `parse().ok()` → `None` for unknown strings | |
| `open_tar` (`src/file.rs`) | `TarGz \| Gz \| Raw` → `GzDecoder` | `TarGz \| Raw` → `GzDecoder`; `Gz` bails as non-tar | `Raw` → gzip tar kept for legacy callers that still pass `Raw` into `untar` |
| `untar` (`src/file.rs`) | […] | […] `extract_archive` / `decompress_file` | |
| `src/backend/aqua.rs` install | `TarOptions::new(from_ext(format))` + `format.starts_with("tar")` catch-all | `ArchiveFormat::from_ext(format)` on `pkg.format(…)`; explicit `raw`/`dmg`/`pkg` branches; only `GithubArchive` uses `unwrap_or(TarGz)` | […] `tar.gz`; other types fail when `from_ext` cannot parse the registry format |
| `src/backend/aqua.rs` SLSA fallback | `from_ext(pkg.format(…))` → unknown became `Raw`, then failed `is_archive()` | `from_ext(pkg.format(…)).ok_or(...)`: unknown registry format errors immediately | […] `Raw` |
| `src/backend/http.rs` | […] `untar`; archives via `untar` | `decompress_file` / `extract_archive` | |
| `src/backend/static_helpers.rs` | `format` option → `from_ext` → `Raw` | `format` option → `from_ext(...).unwrap_or(Raw)` | `github:`/static installs |
| `src/backend/github.rs` SLSA fallback | `format` option → `from_ext` → `Raw` | `from_ext(...).unwrap_or(Raw)` | `format` option only |
| `src/plugins/core/python.rs` | `untar` with `from_file_name` | `extract_archive` with `from_file_name` | `Raw` → gzip tar via `open_tar` |
| `src/plugins/core/java.rs` | […] `file_type` | `from_ext(file_type)` then `from_file_name` fallback | `file_type` honoured again |
| `src/plugins/core/{zig,erlang,go,node,ruby,swift}.rs` | `untar` with explicit `TarGz` (etc.) at call site | […] | |
| `src/cli/generate/tool_stub.rs` | `unarchive` | `extract_archive` + `from_file_name` | |
| `crates/aqua-registry/src/types.rs` | `tgz`/`txz`/`tbz` → long forms | `from_ext` parses aliases | […] `ArchiveFormat` |

Intentional behavior changes
- Aqua packages (other than `GithubArchive`): when `pkg.format(v, os, arch)` returns a string that `ArchiveFormat::from_ext` cannot parse (and it is not handled by the explicit `raw`/`dmg`/`pkg`/`GithubContent` branches), install now fails with `bail!("unsupported format: …")` instead of attempting gzip-tar extraction. That string comes from `AquaPackage::format()`: either the registry YAML `format:` field, or `detect_format()` on the resolved asset/url filename when `format:` is empty.
- Removed the `format.starts_with("tar")` catch-all; `tar.br`/`tar.lz4`/`rar` etc. are recognized enum variants and fail via `bail!` (originally `unimplemented!()`) instead of being misread as gzip tar.
- […] `Raw`.
- `from_ext` generally: no silent `Raw` unless a caller explicitly opts in with `.unwrap_or(Raw)`.

Explicit defaults that remain
- `GithubArchive` packages (`src/backend/aqua.rs`): `ArchiveFormat::from_ext(format).unwrap_or(TarGz)`, same as before when the aqua registry omits the format for GitHub archive installs.
- `Raw` in `open_tar`: still gzip-decoded (marked with a TODO in code). Used when callers detect no extension via `from_file_name` (e.g. python builds).
- `format` install option (static_helpers, github SLSA): unknown value still maps to `Raw` via `.unwrap_or(Raw)`. This is an opt-in legacy path for explicit user configuration, not registry auto-detection.

Why format alias canonicalization is no longer needed
Previously, `crates/aqua-registry/src/types.rs` normalized short archive suffixes before passing a format string to mise:

- `.tgz` / explicit `format: tgz` → `tar.gz`
- `.txz` / `txz` → `tar.xz`
- `.tbz` / `.tbz2` → `tar.bz2`

That existed because extraction used string-based routing and needed one canonical spelling per archive family.
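As a rough illustration of that canonicalization, and of the alternative of letting the parser accept the short spellings itself, consider the sketch below. The function names `canonicalize` and `parse_with_aliases` and the `Family` enum are hypothetical and exist only for this example.

```rust
// Canonicalization approach (sketch): rewrite short suffixes to one spelling
// before the format string reaches extraction.
fn canonicalize(format: &str) -> &str {
    match format {
        "tgz" => "tar.gz",
        "txz" => "tar.xz",
        "tbz" | "tbz2" => "tar.bz2",
        other => other,
    }
}

// Hypothetical archive families used only by this example.
#[derive(Debug, PartialEq, Eq)]
enum Family {
    TarGz,
    TarXz,
    TarBz2,
}

// Alias-aware approach (sketch): the parser accepts every spelling directly,
// so no rewriting step is needed upstream.
fn parse_with_aliases(format: &str) -> Option<Family> {
    match format {
        "tar.gz" | "tgz" => Some(Family::TarGz),
        "tar.xz" | "txz" => Some(Family::TarXz),
        "tar.bz2" | "tbz" | "tbz2" => Some(Family::TarBz2),
        _ => None,
    }
}
```

Both routes resolve `tgz` to the same family; the alias-aware parser simply moves the knowledge of short spellings into one place.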
This PR uses `ArchiveFormat::from_ext`, which accepts aliases directly (`tgz`, `tbz`, `txz`, `tzst`, `vsix`, …). aqua-registry can return literal formats from filenames/registry and extraction still works. This matches aqua upstream (`RemoveExtFromAsset` returns `tgz`, not `tar.gz`). `dmg`, `pkg`, and `raw` remain special-cased in `src/backend/aqua.rs`.

Aqua extraction routing
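| Asset format | Handler | Destination |
| --- | --- | --- |
| Archives (tar variants, zip, 7z) | `extract_archive` | `install_path` |
| Single-file compression (gz/xz/zst/bz2) | `decompress_file` | `first_bin_path` from registry |
| raw | `copy` | `first_bin_path` |
| dmg / pkg | `un_dmg` / `un_pkg` | `install_path` |

Caller format support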
| Caller | API | Notes |
| --- | --- | --- |
| `src/backend/aqua.rs` | `extract_archive` or `decompress_file`; raw/dmg/pkg branches | |
| `src/backend/http.rs` | `decompress_file` or `extract_archive` | |
| `src/backend/static_helpers.rs` | `decompress_file` or `extract_archive` | |
| `src/backend/spm.rs` | `unzip` | `zip` only |
| `src/cli/generate/tool_stub.rs` | `extract_archive` | |
| `src/plugins/core/java.rs` | `extract_archive` | `file_type` metadata, then filename |
| `src/plugins/core/python.rs` | `extract_archive` | `Raw` → gzip tar (legacy) |
| `src/plugins/core/zig.rs` | `extract_archive` | |
| `file::untar` callers (`erlang`, `go`, `node`, `ruby`, `swift`) | `untar` | |

Tests
- `cargo test test_archive_format`
- `cargo test test_decompress`
- `cargo test test_extract_archive`
- `cargo test test_untar_rejects_single_file_compression`
- `cargo check`