Only install torchao ROCm stub on Windows ROCm #704
The torchao meta_path stub was installed whenever `import torchao` failed, including on Linux where torchao simply is not installed. transformers' is_torchao_available() then read torchao.__version__, got a sentinel class, and crashed in packaging.version.parse() with "'_ROCmSentinelMeta' object is not iterable", breaking `import unsloth_zoo`. Gate the stub install on actual Windows ROCm (sys.platform == "win32" plus a HIP/ROCm torch build) so every other platform keeps transformers' own torchao handling. The Windows ROCm behavior is unchanged.
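The hijack described above can be reproduced with a minimal sketch. The names here (`STUB_NAME`, `Sentinel`, `StubLoader`, `StubFinder`, `can_import`, `demo`) are hypothetical and are not the code in this PR; the sketch only assumes that a finder placed on `sys.meta_path` answers for a package that is not installed.

```python
import importlib
import importlib.abc
import importlib.machinery
import sys

STUB_NAME = "demo_absent_pkg"  # hypothetical stand-in for an uninstalled torchao


class Sentinel:
    """Placeholder handed back for attributes of the stub module."""


class StubLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None  # fall back to the default module object

    def exec_module(self, module):
        # The stub exposes a class where callers expect a version string.
        module.__version__ = Sentinel

        def _module_getattr(name):
            if name.startswith("__"):
                raise AttributeError(name)
            return Sentinel

        module.__getattr__ = _module_getattr


class StubFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path=None, target=None):
        if fullname == STUB_NAME:
            return importlib.machinery.ModuleSpec(fullname, StubLoader())
        return None


def can_import(name):
    sys.modules.pop(name, None)
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


def demo():
    before = can_import(STUB_NAME)  # not installed, so the import fails
    finder = StubFinder()
    sys.meta_path.insert(0, finder)
    try:
        after = can_import(STUB_NAME)  # the stub now answers the import
        version = sys.modules[STUB_NAME].__version__
        return before, after, isinstance(version, str)
    finally:
        # Undo the side effects so the demo leaves the interpreter clean.
        sys.meta_path.remove(finder)
        sys.modules.pop(STUB_NAME, None)
```

Once the finder is installed, "is the package importable?" stops being a reliable availability check, and any caller that goes on to parse `__version__` receives a class rather than a string.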
Code Review
This pull request restricts the installation of the torchao ROCm stub to Windows + ROCm (HIP) PyTorch environments to prevent packaging errors on other platforms. Feedback was provided to simplify the platform check by utilizing the globally imported torch module instead of importing it locally with an alias.
import torch as _torch_rocm_probe
_is_windows_rocm = bool(
    getattr(getattr(_torch_rocm_probe, "version", None), "hip", None)
    or "rocm" in getattr(_torch_rocm_probe, "__version__", "").lower()
)
del _torch_rocm_probe
Since torch is already imported globally at the top of this file (on line 42), there is no need to import it again as a local alias (_torch_rocm_probe) and delete it afterwards. You can directly reference the globally imported torch module.
_is_windows_rocm = bool(
    getattr(getattr(torch, "version", None), "hip", None)
    or "rocm" in getattr(torch, "__version__", "").lower()
)
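For illustration, the probe above can be combined with the `sys.platform == "win32"` condition from the PR description into one testable function. This is a sketch, not the patched code: `is_windows_rocm` and its parameters are hypothetical, and the version strings in the usage note are made up.

```python
import sys
from types import SimpleNamespace


def is_windows_rocm(torch_module, platform=sys.platform):
    """Return True only for a HIP/ROCm torch build running on Windows."""
    if platform != "win32":
        return False
    # torch.version.hip is set on ROCm builds and None otherwise.
    hip = getattr(getattr(torch_module, "version", None), "hip", None)
    version = getattr(torch_module, "__version__", "") or ""
    return bool(hip or "rocm" in str(version).lower())


# Fake torch modules for the usage note; the version strings are illustrative.
fake_rocm_torch = SimpleNamespace(
    version=SimpleNamespace(hip="6.2"), __version__="2.5.0+rocm6.2"
)
fake_cuda_torch = SimpleNamespace(
    version=SimpleNamespace(hip=None), __version__="2.5.0+cu121"
)
```

With these fakes, `is_windows_rocm(fake_rocm_torch, "win32")` is true, while `is_windows_rocm(fake_rocm_torch, "linux")` and `is_windows_rocm(fake_cuda_torch, "win32")` are both false. Taking the platform as a parameter keeps the gate testable without patching `sys.platform`.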
Problem
`import unsloth_zoo` crashes on any platform where `torchao` is not installed (the common case on Linux CI and most Linux/macOS setups) with "'_ROCmSentinelMeta' object is not iterable", raised from `packaging.version.parse()`. This is currently failing CI on multiple branches/PRs across both repos.
Root cause
The torchao Windows-ROCm stub in `temporary_patches/utils.py` (added in #703) was installed whenever `import torchao` raised.
On a normal Linux machine `import torchao` raises simply because torchao is not installed. The guard then installs the meta-path finder anyway, so every later `import torchao` returns a sentinel stub. transformers' `is_torchao_available()` reads `torchao.__version__`, gets a `_ROCmSentinelMeta` class instead of a version string, and `packaging.version.parse()` crashes on it.
The stub is only ever needed on Windows ROCm, where `import torchao` crashes on the incomplete `torch.distributed` C-extension stack. Everywhere else a failing `import torchao` just means "not installed", which transformers already handles correctly.
Fix
Gate the stub install on an actual Windows ROCm build before touching `sys.meta_path`.
The Windows ROCm path is unchanged. Every other platform keeps transformers' own torchao handling, so `import unsloth_zoo` no longer crashes when torchao is absent.
Verification
A scope-faithful simulation execs the real stub classes and the real patched gate under faked `sys.platform`/`torch`/`torchao` conditions:
`'_ROCmSentinelMeta' object is not iterable`
`import torchao` fails, new gate
`isinstance(x, torchao.dtypes.AffineQuantizedTensor)` returns False with no TypeError; deep `torchao.a.b.c` resolves
`python -m py_compile` passes on the changed file.
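The `isinstance` and deep-attribute behavior checked above can be illustrated with a small metaclass sketch. `SentinelMeta`, `Sentinel`, and `iteration_error` are hypothetical names, not the real `_ROCmSentinelMeta` implementation. The assumption is that the stub returns a class for every attribute, so it stays a valid second argument to `isinstance`.

```python
class SentinelMeta(type):
    """Metaclass for a placeholder class that absorbs attribute chains."""

    def __getattr__(cls, name):
        # Called only for attributes the class does not already have.
        if name.startswith("__") and name.endswith("__"):
            raise AttributeError(name)
        return cls  # a.b.c keeps resolving to the same placeholder

    def __instancecheck__(cls, instance):
        return False  # nothing is ever an instance of the placeholder


class Sentinel(metaclass=SentinelMeta):
    pass


def iteration_error():
    """Return the message raised when the placeholder class is iterated."""
    try:
        iter(Sentinel)
    except TypeError as exc:
        return str(exc)
    return None
```

In this sketch `Sentinel.a.b.c is Sentinel` holds and `isinstance(object(), Sentinel.dtypes.AffineQuantizedTensor)` returns `False` without raising. Iterating the class raises a `TypeError` that names the metaclass, which has the same shape as the message quoted in this PR.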