fix(studio): add ROCm DeviceType and correct AMD GPU detection in hardware.py by GoldenGrapeGentleman · Pull Request #4449 · unslothai/unsloth

GoldenGrapeGentleman · 2026-03-19T02:40:39Z

Problem

ROCm implements the CUDA API via HIP, so torch.cuda.is_available() returns True on AMD GPUs. Without this fix, detect_hardware() always reports DeviceType.CUDA on AMD hosts, causing incorrect backend labels in the Studio UI.

Additionally, get_physical_gpu_count() relies solely on nvidia-smi, so it reports 1 GPU on AMD systems regardless of how many are installed.

Changes

DeviceType.ROCM = "rocm" added after CUDA in the enum
detect_hardware(): check torch.version.hip to distinguish NVIDIA CUDA from AMD ROCm
clear_gpu_cache(): extend to DeviceType.ROCM (same torch.cuda API via HIP)
get_gpu_memory_info(): extend to DeviceType.ROCM
get_gpu_utilization(): allow ROCm through; VRAM backfill via torch.cuda works on ROCm
get_package_versions(): add rocm key (torch.version.hip)
get_physical_gpu_count(): add rocm-smi fallback for AMD hosts
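
The detection change listed above can be sketched as follows. This is a reduced illustration and not the actual hardware.py: the enum here has only three members, and `classify_backend` and `detect_hardware_sketch` are hypothetical names. It relies on `torch.version.hip` being a version string on ROCm builds of PyTorch and None on CUDA builds.

```python
from enum import Enum
from typing import Optional


class DeviceType(Enum):
    # Reduced enum for illustration; the real one may have more members.
    CUDA = "cuda"
    ROCM = "rocm"
    CPU = "cpu"


def classify_backend(cuda_available: bool, hip_version: Optional[str]) -> DeviceType:
    """Pure decision logic, kept separate so it can be tested without a GPU."""
    if not cuda_available:
        return DeviceType.CPU
    # torch.cuda.is_available() is True on both vendors, so the HIP version
    # string is what tells an AMD ROCm build apart from an NVIDIA CUDA build.
    return DeviceType.ROCM if hip_version else DeviceType.CUDA


def detect_hardware_sketch() -> DeviceType:
    try:
        import torch
    except ImportError:
        return DeviceType.CPU
    hip_version = getattr(torch.version, "hip", None)
    return classify_backend(torch.cuda.is_available(), hip_version)
```

Functions such as clear_gpu_cache() can then treat CUDA and ROCM alike, since both go through the same torch.cuda calls.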

Testing

Verified on 8×AMD MI355X (ROCm 7.1, 288 GB HBM3e per GPU):

detect_hardware() → DeviceType.ROCM ✅
get_gpu_memory_info() total_gb=288.0 ✅
get_package_versions() rocm='7.1.25...' cuda=None ✅
clear_gpu_cache() no exception ✅

Co-authored-by: billishyahao bill.he@amd.com



chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 62c406ae33


chatgpt-codex-connector · 2026-03-19T02:45:15Z

+                DEVICE = DeviceType.CUDA
+                print(f"Hardware detected: CUDA — {device_name}")
+            else:
+                DEVICE = DeviceType.ROCM


Keep PyTorch device string on ROCm as cuda

On ROCm hosts this branch returns DeviceType.ROCM, whose .value is "rocm", but downstream inference code uses get_device().value as a PyTorch device string (InferenceBackend.__init__ sets self.device, then .to(self.device) is called in generation paths). PyTorch HIP still uses the CUDA device namespace, so "rocm" is not a valid target for Tensor.to(...), which can break model load/inference specifically in ROCm environments.
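
One way to address this, sketched under assumptions: keep DeviceType.ROCM as the label shown in the UI, but map it to "cuda" wherever a PyTorch device string is needed. The helper name `torch_device_string` and the reduced enum are hypothetical and not taken from the repository.

```python
from enum import Enum


class DeviceType(Enum):
    # Reduced enum for illustration; the real one may have more members.
    CUDA = "cuda"
    ROCM = "rocm"
    CPU = "cpu"


# ROCm builds of PyTorch expose AMD GPUs under the "cuda" device namespace,
# so the label used for display and the string passed to Tensor.to() differ.
_TORCH_DEVICE = {
    DeviceType.CUDA: "cuda",
    DeviceType.ROCM: "cuda",
    DeviceType.CPU: "cpu",
}


def torch_device_string(device_type: DeviceType) -> str:
    """Hypothetical helper: translate a backend label into a torch device string."""
    return _TORCH_DEVICE[device_type]
```

Call sites that currently pass get_device().value to .to(...) would use the mapped string instead, while the Studio UI keeps showing the "rocm" label.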


GoldenGrapeGentleman requested a review from rolandtannous as a code owner March 19, 2026 02:40

GoldenGrapeGentleman force-pushed the fix/hardware-rocm-device branch from 62c406a to 8adbdde Compare March 19, 2026 02:43

[pre-commit.ci] auto fixes from pre-commit.com hooks

03696dc

for more information, see https://pre-commit.ci

chatgpt-codex-connector Bot reviewed Mar 19, 2026


Merge branch 'main' into fix/hardware-rocm-device

411daf0

Merge branch 'main' into fix/hardware-rocm-device

4651334

danielhanchen mentioned this pull request Mar 31, 2026

Add AMD ROCm/HIP support across installer and hardware detection #4720

Merged

7 tasks

danielhanchen closed this in #4720 Apr 10, 2026

GoldenGrapeGentleman wants to merge 4 commits into unslothai:main from GoldenGrapeGentleman:fix/hardware-rocm-device
