fix(ROCm): restrict is_rdna() to ROCm-officially-supported GPUs by GoldenGrapeGentleman · Pull Request #4136 · unslothai/unsloth

GoldenGrapeGentleman · 2026-03-02T06:57:38Z

Problem

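The current arch.startswith("gfx1") implementation in is_rdna() over-matches hardware that is not in the ROCm Linux support matrix:

- RDNA1 (gfx10xx) and RDNA2 (gfx103x): not ROCm supported
- gfx1102 (RX 7600), gfx1103 (Phoenix APU): not in ROCm support matrix
- gfx1150/1151/1152 (RDNA3.5 APUs): not in ROCm support matrix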

Fix

Replace with an explicit whitelist of the 4 officially ROCm-supported RDNA gfx targets, mirroring the existing is_cdna() style:

gfx1100 / gfx1101  — RDNA3 discrete (RX 7900/7800/7700 series, PRO W7900/W7800/W7700)
gfx1200 / gfx1201  — RDNA4 discrete (RX 9060/9070 series, AI PRO R9700)
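
To make the difference concrete, here is a minimal sketch that runs both predicates over plain arch strings. It does not query Triton; the helper names matches_prefix and matches_whitelist, the constant name, and the sample list are illustrative assumptions, not code from unsloth.

```python
# Illustrative only: compares the old prefix match with the new whitelist
# on plain strings instead of asking triton for the real target.
ROCM_SUPPORTED_RDNA = ("gfx1100", "gfx1101", "gfx1200", "gfx1201")


def matches_prefix(arch: str) -> bool:
    """Old behaviour: any target whose name starts with 'gfx1'."""
    return arch.startswith("gfx1")


def matches_whitelist(arch: str) -> bool:
    """New behaviour: only the four targets listed in this PR."""
    return arch in ROCM_SUPPORTED_RDNA


# Targets named in this PR, plus gfx942 (a CDNA3 part) as a control.
SAMPLES = ["gfx1030", "gfx1100", "gfx1102", "gfx1103", "gfx1151", "gfx1201", "gfx942"]

# Targets the prefix check accepts but the whitelist rejects.
over_matched = [a for a in SAMPLES if matches_prefix(a) and not matches_whitelist(a)]
print(over_matched)  # ['gfx1030', 'gfx1102', 'gfx1103', 'gfx1151']
```

The control entry shows that neither predicate picks up CDNA targets, whose names start with gfx9.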

Tested on AMD Radeon PRO W7900 (gfx1100, ROCm 7.1).

cc @danielhanchen

gemini-code-assist · 2026-03-02T06:57:49Z

Summary of Changes

This pull request refines the detection logic for RDNA GPUs within the is_rdna() function to align strictly with AMD's official ROCm support matrix. The change addresses a problem where the previous implementation could over-match unsupported hardware, risking the application of unverified Triton kernel tunings. By implementing an explicit whitelist, the system now ensures that Triton kernels are only applied to genuinely supported RDNA architectures, enhancing stability and correctness.

Highlights

ROCm GPU Detection: Modified the is_rdna() function to explicitly whitelist only ROCm-officially-supported RDNA GPUs (gfx1100, gfx1101, gfx1200, gfx1201). This prevents the application of unverified Triton kernel tuning to unsupported hardware.

Changelog

unsloth/kernels/utils.py
- Updated the is_rdna function to use an explicit whitelist for ROCm-supported RDNA architectures.
- Revised the docstring for is_rdna to reflect the new detection scope.

gemini-code-assist

Code Review

This pull request correctly refines the is_rdna() function to whitelist only ROCm-officially-supported RDNA GPUs, preventing the use of unverified kernels on unsupported hardware. The change is concise and aligns well with the existing is_cdna() implementation. I have one minor suggestion to improve the readability of the new function body.

gemini-code-assist · 2026-03-02T06:58:56Z

+    return is_hip() and triton.runtime.driver.active.get_current_target().arch in (
+        "gfx1100",
+        "gfx1101",
+        "gfx1200",
+        "gfx1201",
+    )


For improved readability and to avoid a very long line of code, consider extracting the architecture string into a local variable. This makes the code's intent clearer and is more robust to different code formatters, while being consistent with the structure of the previous implementation.

Suggested change

-    return is_hip() and triton.runtime.driver.active.get_current_target().arch in (
-        "gfx1100",
-        "gfx1101",
-        "gfx1200",
-        "gfx1201",
-    )
+    arch = triton.runtime.driver.active.get_current_target().arch
+    return is_hip() and arch in (
+        "gfx1100",
+        "gfx1101",
+        "gfx1200",
+        "gfx1201",
+    )

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b82d0972c4

chatgpt-codex-connector · 2026-03-02T06:59:28Z

+    return is_hip() and triton.runtime.driver.active.get_current_target().arch in (
+        "gfx1100",
+        "gfx1101",
+        "gfx1200",
+        "gfx1201",


Split RDNA support check from RDNA hardware detection

is_rdna() now encodes a ROCm support whitelist, but it is also used as a general RDNA feature gate; in unsloth/models/loader.py (Gemma3 path around lines 1134-1141), the NaN workaround is conditioned on is_rdna() and explicitly calls out affected gfx1102/gfx115x devices. With this whitelist, those GPUs now return False, so the compile-disable workaround is skipped and those users can regress to NaN outputs. A separate predicate for "ROCm-officially-supported RDNA" should be introduced for tuning decisions, while preserving RDNA-family detection for correctness workarounds.

Datta0 · 2026-03-02T07:24:00Z

-    arch = triton.runtime.driver.active.get_current_target().arch
-    return arch.startswith("gfx1") and not is_cdna()
+    """Detect ROCm-supported RDNA consumer/workstation GPUs (RDNA3, RDNA4)."""
+    return is_hip() and triton.runtime.driver.active.get_current_target().arch in (


NIT: What if triton isn't installed?

GoldenGrapeGentleman

Hi @Datta0, good catch in spirit, but we're actually safe here on two levels! 😄

1. triton is already imported unconditionally at the module top level (lines 16 and 56 of utils.py), so if triton isn't installed, the whole module fails to import long before anyone calls is_rdna().

2. Even at runtime, is_hip() and ... short-circuits, so on a non-ROCm machine (where triton might not ship the HIP driver) we never touch triton.runtime.driver at all.
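
The short-circuit argument can be checked in isolation. The sketch below swaps in stand-in functions for is_hip() and the Triton target lookup (both stand-ins and the names inline_check and hoisted_check are assumptions for illustration) and counts how often the lookup runs. It also shows that hoisting the lookup into a local variable, as in the review suggestion earlier in this thread, evaluates it before is_hip() is consulted. Whether the real lookup would actually fail on a non-ROCm machine is not established here.

```python
WHITELIST = ("gfx1100", "gfx1101", "gfx1200", "gfx1201")
calls = []  # records every time the (stand-in) driver lookup runs


def is_hip() -> bool:
    # Stand-in: pretend we are on a non-ROCm machine.
    return False


def current_arch() -> str:
    # Stand-in for triton.runtime.driver.active.get_current_target().arch.
    calls.append("current_arch")
    return "gfx1100"


def inline_check() -> bool:
    # Shape used in this PR: the right-hand side is skipped when is_hip() is False.
    return is_hip() and current_arch() in WHITELIST


def hoisted_check() -> bool:
    # Shape with a local variable: the lookup runs before is_hip() is evaluated.
    arch = current_arch()
    return is_hip() and arch in WHITELIST


inline_result = inline_check()
calls_after_inline = len(calls)
hoisted_result = hoisted_check()
calls_after_hoisted = len(calls)
print(calls_after_inline, calls_after_hoisted)  # 0 1
```

Both shapes return False here; they differ only in whether the driver is touched on the way.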

Current arch.startswith("gfx1") incorrectly matches:
- RDNA1 (gfx10xx) and RDNA2 (gfx103x): not ROCm supported
- gfx1102 (RX 7600), gfx1103 (Phoenix APU): not in ROCm support matrix
- gfx1150/1151/1152 (RDNA3.5 APUs): not in ROCm support matrix

Replace with explicit whitelist aligned to the ROCm Linux support matrix:
https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html

gfx1100 - RDNA3 discrete (RX 7900 series, PRO W7900/W7800)
gfx1101 - RDNA3 discrete (RX 7800/7700 series, PRO W7700)
gfx1200 - RDNA4 discrete (RX 9060 series)
gfx1201 - RDNA4 discrete (RX 9070 series, AI PRO R9700)

Mirrors the existing is_cdna() pattern. Avoids silently applying unverified Triton kernel tuning to unsupported hardware.

Datta0 · 2026-03-03T08:24:45Z

btw @GoldenGrapeGentleman, I see your other PR #4139.
Do you want to combine these two into one? Both are small changes, and merging them would simplify review.

danielhanchen

The whitelist approach makes sense for performance tuning, but after PR #4139 reverts the only performance use of is_rdna(), the function's sole remaining caller is the Gemma3 NaN correctness workaround in loader.py:1135-1142:

# ROCm/HIP: Gemma3 compiled forward produces NaN on RDNA GPUs
# (gfx1100, gfx1101, gfx1102, gfx1150, gfx1151, etc.).
from unsloth.kernels.utils import is_rdna
if is_rdna():
    os.environ["UNSLOTH_COMPILE_DISABLE"] = "partial"

That comment explicitly lists gfx1102 (RX 7600), gfx1150, gfx1151 (Strix Halo) as affected -- all excluded by this whitelist. Issue #3385 was filed from a gfx1151 system. This PR would cause the NaN workaround to be skipped on the very hardware it was designed to protect.

For correctness workarounds, broad matching (startswith("gfx1")) is the safer approach. I'd recommend closing this after #4139 merges. If a narrow "officially supported RDNA" check is needed later for perf tuning, it should be a separate function (e.g. is_rdna_supported()) so the correctness path stays broad.
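
A rough sketch of that split, written against arch strings so it runs without a GPU or Triton. The review only floats the name is_rdna_supported(); the function names, the plan() helper, and the dictionary keys below are hypothetical and not part of unsloth.

```python
# Sketch only: two separate predicates, one broad and one narrow.
ROCM_SUPPORTED_RDNA = frozenset({"gfx1100", "gfx1101", "gfx1200", "gfx1201"})


def is_rdna_arch(arch: str) -> bool:
    """Broad RDNA-family match, intended for correctness workarounds."""
    return arch.startswith("gfx1")


def is_rdna_supported_arch(arch: str) -> bool:
    """Narrow match against the four targets whitelisted in this PR."""
    return arch in ROCM_SUPPORTED_RDNA


def plan(arch: str) -> dict:
    """Hypothetical helper: which code paths each predicate would enable."""
    return {
        "nan_workaround": is_rdna_arch(arch),         # Gemma3 compile-disable path
        "perf_tuning": is_rdna_supported_arch(arch),  # any future tuned kernels
    }


for arch in ("gfx1100", "gfx1102", "gfx1151", "gfx942"):
    print(arch, plan(arch))
```

Under this arrangement gfx1102 and gfx1151 keep the NaN workaround while staying out of any tuning path, which is the regression a single whitelisted is_rdna() would introduce.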

danielhanchen · 2026-03-03T08:43:05Z

Better to merge #4139 first -- it's a clean revert, ready now, and has zero regression risk.

#4136 needs reconsideration: after #4139 lands, the only remaining is_rdna() caller is the Gemma3 NaN correctness workaround in loader.py. The whitelist here excludes gfx1102, gfx1150, gfx1151 -- the exact hardware that workaround protects (issue #3385 was filed from gfx1151). Combining would hold up the good revert while the whitelist issue gets sorted out.

GoldenGrapeGentleman requested review from Datta0 and danielhanchen as code owners March 2, 2026 06:57

gemini-code-assist Bot reviewed Mar 2, 2026

chatgpt-codex-connector Bot reviewed Mar 2, 2026

Datta0 reviewed Mar 2, 2026

GoldenGrapeGentleman mentioned this pull request Mar 2, 2026

perf(ROCm): add is_rdna() detection and optimize CE loss for RDNA GPUs #4123

Merged

Merge branch 'main' into fix/is-rdna-rocm-official-list

e27a6f6

danielhanchen requested changes Mar 3, 2026

danielhanchen merged commit f737858 into unslothai:main Mar 3, 2026
1 check passed

GoldenGrapeGentleman mentioned this pull request Mar 18, 2026

feat(ROCm): expand is_rdna() to cover RDNA2, RDNA3, RDNA3.5, RDNA4 #4428

Closed

danielhanchen merged 2 commits into unslothai:main from GoldenGrapeGentleman:fix/is-rdna-rocm-official-list
