[Bug] Fix AttributeError: 'Qwen3VLMoeConfig' object has no attribute 'intermediate_size'#30567

Closed
yewentao256 wants to merge 1 commit into main from wentao-fix-qwen3vl-launch-bug

Member

@yewentao256 yewentao256 commented Dec 12, 2025

Purpose

export MODEL="Qwen/Qwen3-VL-235B-A22B-Thinking-FP8"
vllm serve $MODEL -tp 4 --port 9256 --enable-expert-parallel

(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]     compiled_fn = compiler_fn(gm, example_inputs)
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]   File "/home/wentao/.venv/lib/python3.12/site-packages/torch/_dynamo/repro/after_dynamo.py", line 156, in __call__
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]     compiled_gm = compiler_fn(gm, example_inputs)
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]   File "/home/wentao/.venv/lib/python3.12/site-packages/torch/__init__.py", line 2437, in __call__
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]     return self.compiler_fn(model_, inputs_, **self.kwargs)
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]   File "/home/wentao/vllm-source/vllm/compilation/backends.py", line 704, in __call__
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]     self.configure_post_pass()
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]   File "/home/wentao/vllm-source/vllm/compilation/backends.py", line 552, in configure_post_pass
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]     self.pass_manager.configure(self.vllm_config)
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]   File "/home/wentao/vllm-source/vllm/compilation/pass_manager.py", line 118, in configure
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]     self.passes += [RMSNormQuantFusionPass(config)]
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]   File "/home/wentao/vllm-source/vllm/compilation/inductor_pass.py", line 134, in fn_new
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]     result = fn(*args, **kwargs)
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]              ^^^^^^^^^^^^^^^^^^^
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]   File "/home/wentao/vllm-source/vllm/compilation/fusion.py", line 495, in __init__
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]     FusedAddRMSNormGroupQuantPattern(
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]   File "/home/wentao/vllm-source/vllm/compilation/fusion.py", line 270, in __init__
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]     super().__init__(epsilon, key)
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]   File "/home/wentao/vllm-source/vllm/compilation/fusion.py", line 130, in __init__
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]     config.model_config.hf_config.intermediate_size,
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]   File "/home/wentao/.venv/lib/python3.12/site-packages/transformers/configuration_utils.py", line 207, in __getattribute__
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]     return super().__getattribute__(key)
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822] torch._dynamo.exc.BackendCompilerFailed: backend='<vllm.compilation.backends.VllmBackend object at 0x7ede706738c0>' raised:
(Worker_TP3_EP3 pid=1427826) ERROR 12-12 09:15:20 [multiproc_executor.py:822] AttributeError: 'Qwen3VLMoeConfig' object has no attribute 'intermediate_size'

This PR fixes the issue. With the fix applied, the server starts up normally:

(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /load, Methods: GET
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /v1/models, Methods: GET
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /version, Methods: GET
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /v1/responses, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /v1/responses/{response_id}, Methods: GET
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /v1/responses/{response_id}/cancel, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /v1/messages, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /v1/chat/completions, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /v1/completions, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /v1/audio/transcriptions, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /v1/audio/translations, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /ping, Methods: GET
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /ping, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /invocations, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /classify, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /v1/embeddings, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /score, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /v1/score, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /rerank, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /v1/rerank, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /v2/rerank, Methods: POST
(APIServer pid=1577619) INFO 12-12 10:26:49 [launcher.py:46] Route: /pooling, Methods: POST
(APIServer pid=1577619) INFO:     Started server process [1577619]
(APIServer pid=1577619) INFO:     Waiting for application startup.
(APIServer pid=1577619) INFO:     Application startup complete.

…mediate_size'

Signed-off-by: yewentao256 <zhyanwentao@126.com>
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

@mergify mergify bot added the qwen Related to Qwen models label Dec 12, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request fixes an AttributeError for Qwen3VLMoeConfig by safely accessing intermediate_size and hidden_size from the model configuration. The change correctly looks for these attributes in text_config for multimodal models. I've added a suggestion to make the code more robust by handling cases where model_config itself might be None, which can occur in certain testing environments.
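The fallback lookup described in this review can be sketched as follows. This is a minimal illustration, not the actual patch: `resolve_config_attr` and `get_fusion_sizes` are hypothetical helper names, the configs are plain stand-in objects rather than real Hugging Face config classes, and the size values are made up.

```python
from types import SimpleNamespace


def resolve_config_attr(hf_config, name, default=None):
    """Return `name` from hf_config, falling back to hf_config.text_config.

    Hypothetical helper: multimodal configs may keep language-model sizes
    on a nested `text_config` instead of on the top-level object.
    """
    value = getattr(hf_config, name, None)
    if value is not None:
        return value
    text_config = getattr(hf_config, "text_config", None)
    if text_config is not None:
        return getattr(text_config, name, default)
    return default


def get_fusion_sizes(model_config):
    """Return (hidden_size, intermediate_size), tolerating model_config=None."""
    if model_config is None:
        return None, None
    hf_config = model_config.hf_config
    return (
        resolve_config_attr(hf_config, "hidden_size"),
        resolve_config_attr(hf_config, "intermediate_size"),
    )


# Stand-in configs with illustrative values (assumptions, not real model sizes).
text_only = SimpleNamespace(hidden_size=4096, intermediate_size=12288)
multimodal = SimpleNamespace(
    text_config=SimpleNamespace(hidden_size=4096, intermediate_size=1536)
)

print(get_fusion_sizes(SimpleNamespace(hf_config=text_only)))
print(get_fusion_sizes(SimpleNamespace(hf_config=multimodal)))
print(get_fusion_sizes(None))
```

Direct attribute access on the top-level config raises `AttributeError` for the multimodal stand-in, whereas the `getattr`-based lookup returns the nested value, or `None` when the attribute is absent at both levels.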

@yewentao256 yewentao256 added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 12, 2025
@cjackal
Contributor

cjackal commented Dec 12, 2025

I think #30244 also fixes the same VLM kernel fusion issue.

Member

@DarkLight1337 DarkLight1337 left a comment

Is this PR still needed now that #30244 has been merged?

@mergify

mergify bot commented Dec 15, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @yewentao256.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Dec 15, 2025
Member Author

@yewentao256 yewentao256 left a comment

@cjackal @DarkLight1337 Thanks for letting me know. This PR is no longer needed.

@yewentao256
Member Author

Closing in favor of #30244.

Labels

needs-rebase
qwen (Related to Qwen models)
ready (ONLY add when PR is ready to merge/full CI is needed)
