fix(rocm): cap RDNA2 to rocm6.2 torch and fix inference BF16 on RDNA2 by LeoBorcherding · Pull Request #3 · LeoBorcherding/unsloth

LeoBorcherding · 2026-06-03T04:20:56Z

Summary

This PR fixes two bugs that affect RDNA2 GPUs (gfx1030-gfx1036, e.g. RX 6600/6700/6800/6900) when ROCm 7.x is installed:

Bug 1 — Installer puts a dev/nightly PyTorch build in the Studio venv

When ROCm 7.x is detected, the installer selects the rocm7.x PyTorch index, which serves dev builds (the version string contains a git hash, e.g. 2.10.0+rocm7.2.0.gitb6ee5fde). On RDNA2 hardware these builds segfault during unsloth's import/patching phase. In the reported case, the system-wide pip install had a stable 2.7.1+rocm6.2.4, but the Studio venv received the broken dev build.
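For illustration, a dev build of this kind can be recognized from its version string alone. The sketch below is not taken from the PR; `looks_like_dev_build` is a hypothetical helper, and it assumes the git hash always appears as a trailing `.git<hex>` component after the `+`, as in the example above.

```python
import re

# Hypothetical pattern: a local version segment ("+...") ending in ".git<hex>".
# Assumes the hash is 7 to 40 lowercase hex characters.
_GIT_SUFFIX = re.compile(r"\+.*\.git[0-9a-f]{7,40}$")


def looks_like_dev_build(version: str) -> bool:
    """Return True if the version string carries a trailing git-hash suffix."""
    return bool(_GIT_SUFFIX.search(version.strip()))


if __name__ == "__main__":
    for v in ("2.10.0+rocm7.2.0.gitb6ee5fde", "2.7.1+rocm6.2.4"):
        print(v, looks_like_dev_build(v))
```

A check like this could be used to decide whether an existing venv needs its torch replaced, though the PR itself simply reinstalls with --force-reinstall.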

Fix: detect the runtime GPU gfx code at install time. If it's RDNA2, cap the torch install to the rocm6.2 index (torch 2.7.x) regardless of system ROCm version. Applied in both install.sh and install_python_stack.py, with --force-reinstall so existing broken builds get replaced.
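The capping rule can be sketched roughly as follows. This is a minimal illustration, not the code in install.sh or install_python_stack.py: the function names are hypothetical, the index is represented by a short tag such as "rocm6.2" rather than a full URL, and how the gfx code is obtained from the system is left to the caller (here it is pulled out of arbitrary tool output with a regex, which is an assumption).

```python
import re
from typing import List

# Tag of the index that RDNA2 is capped to, per the PR description.
RDNA2_CAP_TAG = "rocm6.2"


def extract_gfx_codes(tool_output: str) -> List[str]:
    """Collect gfx tokens (e.g. 'gfx1030') from some tool's text output.

    Assumption: the detection tool prints the architecture as a 'gfx...' token.
    """
    return re.findall(r"gfx[0-9a-f]+", tool_output.lower())


def is_rdna2(gfx: str) -> bool:
    """RDNA2 covers gfx1030 through gfx1036, per the PR description."""
    return re.fullmatch(r"gfx103[0-6]", gfx.strip().lower()) is not None


def pick_index_tag(gfx: str, detected_tag: str) -> str:
    """Cap RDNA2 to the rocm6.2 index regardless of the system ROCm version."""
    return RDNA2_CAP_TAG if is_rdna2(gfx) else detected_tag


if __name__ == "__main__":
    sample = "Name: gfx1032\nMarketing Name: example"  # made-up tool output
    for gfx in extract_gfx_codes(sample):
        print(gfx, "->", pick_index_tag(gfx, "rocm7.2"))
```

Keeping the decision in a pure function of the gfx code and the detected tag makes it straightforward to apply the same rule from both the shell and the Python installer.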

Bug 2 — Inference subprocess crashes with LLVM ERROR on RDNA2

The training path fix from unslothai#5301 only covered trainer.py. The inference subprocess (InferenceBackend.load_model()) still passed dtype=None to FastLanguageModel/FastVisionModel.from_pretrained(), which auto-selected bfloat16 on RDNA2, triggering:

LLVM ERROR: Cannot select: intrinsic %llvm.amdgcn.fdot2.bf16.bf16

Fix: apply the same is_bfloat16_supported() guard from trainer.py to InferenceBackend.load_model(), forcing float16 on hardware that doesn't support bfloat16.
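The shape of such a guard might look like the sketch below. It is an assumption-laden illustration rather than the code in inference.py: dtypes are represented as plain strings so the example runs without torch, the bfloat16 check is injected as a callable standing in for is_bfloat16_supported(), and the handling of an explicit "bfloat16" request is a guess, since the PR description only covers the dtype=None case.

```python
from typing import Callable, Optional


def resolve_inference_dtype(
    requested: Optional[str],
    bf16_supported: Callable[[], bool],
) -> Optional[str]:
    """Choose the dtype to pass to from_pretrained().

    None means "let the loader auto-select", which is only safe when
    bfloat16 works, because auto-selection may pick bfloat16.
    """
    if bf16_supported():
        return requested
    # Hardware without bfloat16: never leave the choice to auto-selection.
    if requested is None or requested == "bfloat16":
        return "float16"
    return requested


if __name__ == "__main__":
    # Stand-in for is_bfloat16_supported() on an RDNA2 card.
    print(resolve_inference_dtype(None, lambda: False))
```

In real code the returned value would be a torch dtype such as torch.float16 rather than a string.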

Files changed

install.sh — RDNA2 gfx detection + rocm6.2 cap in the shell installer
studio/install_python_stack.py — same cap in the Python stack installer (unsloth studio update)
studio/backend/core/inference/inference.py — float16 fallback for inference on RDNA2

Closes unslothai#5337

Two bugs affecting RDNA2 (gfx1030-gfx1036, e.g. RX 6600) with ROCm 7.x:

1. Installer selects the rocm7.x PyTorch index when ROCm 7.x is detected, landing dev/nightly builds (e.g. 2.10.0+rocm7.2.0.gitXXXXXXXX) in the Studio venv. These builds segfault during unsloth import on RDNA2.

Fix: detect RDNA2 gfx code at install time and cap to the rocm6.2 index (torch 2.7.x) in both install.sh and install_python_stack.py, replacing any existing broken dev build via --force-reinstall.

2. Inference subprocess passed dtype=None to FastLanguageModel/FastVisionModel which auto-selected bfloat16 on RDNA2, crashing with:

LLVM ERROR: Cannot select: intrinsic %llvm.amdgcn.fdot2.bf16.bf16

Fix: apply the same is_bfloat16_supported() guard from trainer.py to InferenceBackend.load_model(), forcing float16 on RDNA2 for inference.

Fixes: unslothai#5337

LeoBorcherding closed this Jun 3, 2026

LeoBorcherding wants to merge 1 commit into main from fix/llvm-error-issue-5337
