Skip to content

unsloth support AMD radeon cards#4256

Open
danielhanchen wants to merge 1 commit into
mainfrom
dh/recover-3740-amd-radeon
Open

unsloth support AMD radeon cards#4256
danielhanchen wants to merge 1 commit into
mainfrom
dh/recover-3740-amd-radeon

Conversation

@danielhanchen
Copy link
Copy Markdown
Member

@danielhanchen danielhanchen commented Mar 12, 2026

Replacement for #3740 due to Studio rebasing

Partial recovery note: this replacement preserves only the final visible PR payload that could be safely recovered after Studio rebasing. It does not fully recreate the broader AMD Radeon support work originally described in #3740, such as the wider training, testing, and script coverage.

Original PR description follows.

  1. Modify envs build for amd radeon cards.
  2. Supports single-GPU QLoRA fine-tuning for the Llama-3.1-8B-Instruct model;
  3. Supports multi-GPU QLoRA fine-tuning for the Qwen3-30B-A3B MoE model;
  4. Add scripts to supports Attention operator testing, comparing the accuracy and performance of three implementations: torch, flash-attention, and sdpa;
  5. Add scripts to supports MoE operator testing, including:
    Gating operator accuracy and performance testing;
    SparseMoe-FFN operator accuracy and performance testing;
    6.Attempted FP8 precision, which failed on both NVIDIA and AMD GPUs with consistent error messages. Preliminary investigation indicates it's an issue with Unsloth compatibility.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aa005cbb4e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread unsloth/_build_support.py
Comment on lines +306 to +307
_BOOTSTRAP_COMPLETED = True
_perform_rocm_bootstrap(runtime)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Mark ROCm bootstrap complete only after success

_ensure_rocm_bootstrap sets _BOOTSTRAP_COMPLETED = True before running _perform_rocm_bootstrap(runtime), so any failure during git/pip install leaves the flag permanently true and all later calls in the same process incorrectly skip retrying bootstrap. This can leave ROCm dependencies missing while subsequent install steps proceed as if bootstrap already succeeded.

Useful? React with 👍 / 👎.

Comment thread unsloth/_build_support.py
)

with _pushd(Path("bitsandbytes")):
subprocess.check_call(["git", "checkout", "main"])
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Pin bitsandbytes to 0.48.2 instead of main

This block announces and records a 0.48.2 install, but it actually checks out main, which makes builds non-reproducible and can silently pick commits that are incompatible with the rest of the ROCm bootstrap flow. Checking out the specific 0.48.2 tag/commit is needed to match the declared version and avoid drift.

Useful? React with 👍 / 👎.

Comment thread unsloth/_build_support.py
pass

if not rocm_arch:
rocm_arch = "gfx1201"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Respect BNB_ROCM_ARCH before forcing gfx1201 fallback

When rocminfo is unavailable, the code hardcodes rocm_arch = "gfx1201" and later passes that value to CMake, so non-Radeon ROCm systems (for example gfx90a/gfx942) can compile bitsandbytes for the wrong architecture and fail at runtime. Because the fallback is forced, a user-provided BNB_ROCM_ARCH cannot effectively override this path.

Useful? React with 👍 / 👎.

@danielhanchen danielhanchen force-pushed the dh/recover-3740-amd-radeon branch from aa005cb to c5df65d Compare March 12, 2026 23:10
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c5df65d199

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread unsloth/_build_support.py
Comment on lines +690 to +691
class RocmExtraInstallCommand(install): # type: ignore[misc]
def run(self):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Register ROCm bootstrap hooks in setuptools config

This commit defines compute_version_string and RocmExtraInstallCommand, but nothing wires either into the package build/install flow (no cmdclass registration and no version hook references these functions), so the ROCm bootstrap path is never invoked during normal installs. In this commit, these symbols are only defined in _build_support.py, which means the new AMD support logic is effectively dead code.

Useful? React with 👍 / 👎.

Comment thread unsloth/_build_support.py
Comment on lines +360 to +362
refreshed = detect_runtime()
if not refreshed.has_torch or not refreshed.has_hip:
raise RuntimeError(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Validate ROCm torch support without forced-runtime flags

The post-install guard in _ensure_rocm_torch trusts refreshed.has_hip, but detect_runtime can set that flag from UNSLOTH_FORCE_RUNTIME=hip even when the installed torch wheel has no HIP backend. In the bootstrap flow where forcing HIP is required when torch is initially missing, an incorrect UNSLOTH_BOOTSTRAP_TORCH_ARGS can install a non-ROCm torch and still pass this check, causing later ROCm-dependent builds to fail.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants