[ROCm][Critical] Fix the GDN import bug by tjtanaa · Pull Request #43486 · vllm-project/vllm

tjtanaa · 2026-05-23T15:15:34Z

Purpose

#41126 changed the import path of the GDN module, which causes an import error. The error is critical because it crashes the server when VLLM_ROCM_USE_AITER=1 is set.

(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]     self.aot_compiled_fn = self.aot_compile(*args, **kwargs)
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]   File "/usr/local/lib/python3.12/dist-packages/vllm/compilation/wrapper.py", line 167, in aot_compile
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]     return self._compiled_callable.aot_compile((args, kwargs))
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]   File "/usr/local/lib/python3.12/dist-packages/torch/_dynamo/eval_frame.py", line 832, in aot_compile
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]     return aot_compile_fullgraph(
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]            ^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]   File "/usr/local/lib/python3.12/dist-packages/torch/_dynamo/aot_compile.py", line 239, in aot_compile_fullgraph
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]     compiled_fn = backend(
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]                   ^^^^^^^^
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]   File "/usr/local/lib/python3.12/dist-packages/torch/__init__.py", line 2509, in __call__
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]     return self.compiler_fn(model_, inputs_, **self.kwargs)
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]   File "/usr/lib/python3.12/contextlib.py", line 81, in inner
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]     return func(*args, **kwds)
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]            ^^^^^^^^^^^^^^^^^^^
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]   File "/usr/local/lib/python3.12/dist-packages/vllm/compilation/backends.py", line 1163, in __call__
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]     self.configure_post_pass()
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]   File "/usr/local/lib/python3.12/dist-packages/vllm/compilation/backends.py", line 948, in configure_post_pass
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]     self.pass_manager.configure(self.vllm_config)
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]   File "/usr/local/lib/python3.12/dist-packages/vllm/compilation/passes/pass_manager.py", line 163, in configure
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]     RocmAiterRMSNormQuantFusionPass(config),
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]   File "/usr/local/lib/python3.12/dist-packages/vllm/compilation/passes/inductor_pass.py", line 139, in fn_new
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]     result = fn(*args, **kwargs)
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]              ^^^^^^^^^^^^^^^^^^^
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]   File "/usr/local/lib/python3.12/dist-packages/vllm/compilation/passes/fusion/rocm_aiter_fusion.py", line 563, in __init__
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165]     from vllm.model_executor.layers.mamba.gdn_linear_attn import (
(EngineCore pid=2818) ERROR 05-23 13:50:24 [core.py:1165] ModuleNotFoundError: No module named 'vllm.model_executor.layers.mamba.gdn_linear_attn'
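
The failure above comes from a deferred import inside a pass constructor, so it only surfaces at compile time. A quick way to check which module path an installed build exposes is sketched below. It is a diagnostic sketch only, not part of this PR: `module_resolves` is a hypothetical helper, and the two paths are the old one from the traceback and the new location reported in the code review for this PR. Note that `importlib.util.find_spec` imports the parent packages of a dotted path, so running this against vllm may take a moment.

```python
import importlib.util


def module_resolves(dotted_path: str) -> bool:
    """Return True if `dotted_path` can be located (the leaf module is not executed)."""
    try:
        return importlib.util.find_spec(dotted_path) is not None
    except ModuleNotFoundError:
        # Raised when a parent package in the dotted path does not exist.
        return False


if __name__ == "__main__":
    for path in (
        "vllm.model_executor.layers.mamba.gdn_linear_attn",  # old path (traceback)
        "vllm.model_executor.layers.mamba.gdn.base",         # new path (code review)
    ):
        status = "found" if module_resolves(path) else "missing"
        print(f"{path}: {status}")
```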

Test Plan

Fix the unit test: pytest -svvvv tests/compile/passes/test_fusion.py::test_aiter_fusion_rmsnorm_gated_quant
Verify that the following command runs and that the model produces correct end-to-end accuracy:

#!/bin/bash

rm -rf ~/.cache/vllm
export VLLM_ROCM_USE_AITER=1

vllm serve Qwen/Qwen3-Next-80B-A3B-Instruct-FP8 \
  --compilation-config '{"cudagraph_mode":"FULL_AND_PIECEWISE","custom_ops":["-rms_norm","-silu_and_mul","+quant_fp8"],"pass_config":{"fuse_norm_quant":true}}'
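
The inline JSON passed to --compilation-config is easy to mis-quote by hand. As a sketch (not part of this PR), the same argument can be generated from a Python dict; every key and value below is copied from the command above, and only the variable names are made up for illustration.

```python
import json
import shlex

# Values copied from the serve command above.
compilation_config = {
    "cudagraph_mode": "FULL_AND_PIECEWISE",
    "custom_ops": ["-rms_norm", "-silu_and_mul", "+quant_fp8"],
    "pass_config": {"fuse_norm_quant": True},
}

argv = [
    "vllm",
    "serve",
    "Qwen/Qwen3-Next-80B-A3B-Instruct-FP8",
    "--compilation-config",
    # Compact separators reproduce the whitespace-free JSON used in the script.
    json.dumps(compilation_config, separators=(",", ":")),
]

# shlex.join quotes the JSON argument so the line can be pasted into a shell.
print("VLLM_ROCM_USE_AITER=1 " + shlex.join(argv))
```

Building the argument this way keeps the config reviewable as data and leaves the shell quoting to shlex.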

Test Result

pytest -svvvv tests/compile/passes/test_fusion.py::test_aiter_fusion_rmsnorm_gated_quant

======================= 2 passed, 21 warnings in 11.54s ========================

Accuracy of Qwen/Qwen3-Next-80B-A3B-Instruct-FP8

local-completions ({'model': 'Qwen/Qwen3-Next-80B-A3B-Instruct-FP8', 'base_url': 'http://0.0.0.0:8000/v1/completions', 'num_concurrent': 256, 'max_retries': 10, 'max_gen_toks': 2048, 'max_length': 1048576, 'timeout': 60000}), gen_kwargs: ({}), limit: None, num_fewshot: 8, batch_size: auto
|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     8|exact_match|↑  |0.9507|±  |0.0060|
|     |       |strict-match    |     8|exact_match|↑  |0.9431|±  |0.0064|
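
To read the ± Stderr column, one rough approach is to turn each row into an approximate 95% interval. The sketch below assumes a normal approximation (multiplier 1.96), which is my assumption and not something the PR states; the values are copied from the table above. The PR does not report a pre-regression baseline, so this only indicates the spread of the reported scores.

```python
# (filter, exact_match, stderr) rows copied from the gsm8k table above.
rows = [
    ("flexible-extract", 0.9507, 0.0060),
    ("strict-match", 0.9431, 0.0064),
]

Z_95 = 1.96  # assumed normal-approximation multiplier for a two-sided 95% interval


def interval(value: float, stderr: float, z: float = Z_95) -> tuple[float, float]:
    """Return (low, high) bounds of value +/- z * stderr."""
    return (value - z * stderr, value + z * stderr)


for name, value, stderr in rows:
    low, high = interval(value, stderr)
    print(f"{name}: {value:.4f} (approx. 95% interval {low:.4f} to {high:.4f})")
```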

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

tjtanaa · 2026-05-23T15:17:34Z

@tpopp Please check whether your optimization in #40710 still works after the Mamba refactoring in #41126.

This PR is needed as a quick, critical fix because it affects all models when AITER is used.

gemini-code-assist

Code Review

This pull request updates the import path for GatedDeltaNetAttention to its new location in vllm.model_executor.layers.mamba.gdn.base within the ROCm Aiter fusion pass and the fusion test suite. It also adds a type ignore annotation to handle type-checking issues during layer discovery. I have no feedback to provide as the existing review comments were purely explanatory and did not identify any issues.
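
According to the review above, the PR simply points the import at the new module. For downstream code that has to run against builds from both before and after the refactor, one possible pattern is to try the candidate paths in order. This is a hypothetical sketch, not what the PR does: `import_attr` is an invented helper, and the commented usage assumes the old module also exported GatedDeltaNetAttention.

```python
import importlib
from typing import Any, Iterable


def import_attr(candidates: Iterable[str], attr: str) -> Any:
    """Return `attr` from the first module path in `candidates` that provides it."""
    tried = []
    for path in candidates:
        try:
            module = importlib.import_module(path)
        except ModuleNotFoundError as exc:
            missing = exc.name or ""
            # Skip only if the missing module is `path` itself or one of its
            # parents; a missing third-party dependency should still surface.
            if path == missing or path.startswith(missing + "."):
                tried.append(path)
                continue
            raise
        if hasattr(module, attr):
            return getattr(module, attr)
        tried.append(path)
    raise ImportError(f"{attr!r} not found in any of: {tried}")


# Hypothetical usage against vllm (new path first, old path as fallback):
# GatedDeltaNetAttention = import_attr(
#     [
#         "vllm.model_executor.layers.mamba.gdn.base",
#         "vllm.model_executor.layers.mamba.gdn_linear_attn",
#     ],
#     "GatedDeltaNetAttention",
# )

# Runnable demonstration using only the standard library.
ordered_dict_cls = import_attr(["no_such_module_xyz", "collections"], "OrderedDict")
print(ordered_dict_cls.__name__)
```

Inside vllm itself a single direct import, as in this PR, is simpler and fails loudly if the module moves again.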

fix the import bug

6c312e8

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

mergify Bot added the rocm (Related to AMD ROCm) and bug (Something isn't working) labels May 23, 2026

github-project-automation Bot added this to AMD May 23, 2026

github-project-automation Bot moved this to Todo in AMD May 23, 2026

tjtanaa added the ready (ONLY add when PR is ready to merge/full CI is needed) label May 23, 2026

tjtanaa requested a review from DarkLight1337 May 23, 2026 15:20

gemini-code-assist Bot reviewed May 23, 2026

tjtanaa mentioned this pull request May 23, 2026

[ROCm] [DSv4] [Perf] Support DeepSeek v4 MTP #43385

Merged

4 tasks

tjtanaa requested a review from dllehr-amd May 23, 2026 15:31

DarkLight1337 approved these changes May 23, 2026

DarkLight1337 enabled auto-merge (squash) May 23, 2026 15:37

ZJY0516 approved these changes May 23, 2026

DarkLight1337 merged commit 46f95b2 into vllm-project:main May 23, 2026
65 checks passed

github-project-automation Bot moved this from Todo to Done in AMD May 23, 2026

lrioxh pushed a commit to lrioxh/vllm-dev that referenced this pull request May 24, 2026

[ROCm][Critical] Fix the GDN import bug (vllm-project#43486)

12a8626

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

h1t35h pushed a commit to h1t35h/vllm that referenced this pull request May 26, 2026

[ROCm][Critical] Fix the GDN import bug (vllm-project#43486)

e286b3a

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

Liuweixiong0118 pushed a commit to Liuweixiong0118/vllm that referenced this pull request Jun 1, 2026

[ROCm][Critical] Fix the GDN import bug (vllm-project#43486)

c5ce46e

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> Signed-off-by: Liuweixiong0118 <lwx34158427@gmail.com>

DarkLight1337 merged 1 commit into vllm-project:main from EmbeddedLLM:bugfixrocmgdn

3 participants