[ROCm][Bugfix] Fix ROCm runtime failure due to missing symbol #38750

Merged
vllm-bot merged 3 commits into vllm-project:main from ROCm:rocm_fix on Apr 2, 2026

@gshtras
Collaborator

@gshtras gshtras commented Apr 1, 2026

Follow up for #32996

Failed to import from vllm._C with ImportError('/projects/ROCm/vllm_upstream/vllm/_C.abi3.so: undefined symbol: _Z28silu_and_mul_per_block_quantRN2at6TensorERKS0_S1_lSt8optionalIS0_Eb')

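The source file that defines silu_and_mul_per_block_quant is not built for ROCm, so the symbol is left undefined in the shared library and the import of vllm._C fails.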
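To read the mangled name in the error above, it can help to demangle it. The following is a minimal sketch, assuming a GCC or Clang toolchain that provides `<cxxabi.h>`; the helper name `demangle` is hypothetical and not part of vLLM.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: demangle an Itanium C++ ABI symbol name.
// Returns the input unchanged if the name cannot be demangled.
std::string demangle(const char* mangled) {
  assert(mangled != nullptr);
  int status = 0;
  // With a null output buffer, __cxa_demangle allocates the result with
  // malloc, so the caller must release it with free.
  char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || readable == nullptr) {
    return mangled;
  }
  std::string result(readable);
  std::free(readable);
  return result;
}
```

Passing the symbol from the ImportError to `demangle` should give a readable signature for `silu_and_mul_per_block_quant` with `at::Tensor` arguments, which confirms that the loader is looking for a C++ function that was declared and bound but never compiled into the ROCm build.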

Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
@mergify mergify bot added the rocm (Related to AMD ROCm) and bug (Something isn't working) labels Apr 1, 2026
@github-project-automation github-project-automation bot moved this to Todo in AMD Apr 1, 2026
@ProExpertProg ProExpertProg enabled auto-merge (squash) April 1, 2026 20:42
@github-actions github-actions bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label Apr 1, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request restricts the silu_and_mul_per_block_quant operation to non-ROCm environments by adding preprocessor guards in the header and binding files. A review comment correctly identifies that the macro IS_ROCM used in csrc/ops.h is inconsistent with the USE_ROCM macro used in the rest of the project, which would prevent the function from being properly hidden during ROCm builds.
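The guard pattern described in this review can be sketched as follows. This is a simplified illustration, not the actual vLLM code: `cuda_only_op`, `portable_op`, `register_ops`, and `call_op` are hypothetical names, the registry is a plain `std::map` standing in for the real op bindings, and `USE_ROCM` is defined in the file only to simulate a ROCm build.

```cpp
#include <cassert>
#include <map>
#include <string>

// Simulate a ROCm build. In a real build this macro would come from the
// build system (for example a -DUSE_ROCM compiler flag), not the source.
#define USE_ROCM 1

using OpFn = int (*)(int);

// "Header" side: the CUDA-only op exists only when not building for ROCm.
#ifndef USE_ROCM
int cuda_only_op(int x) { return 2 * x; }
#endif

int portable_op(int x) { return x + 1; }

// "Bindings" side: registration needs the same guard. Otherwise it would
// reference a function that is never compiled, which leaves an undefined
// symbol behind.
std::map<std::string, OpFn> register_ops() {
  std::map<std::string, OpFn> ops;
  ops["portable_op"] = &portable_op;
#ifndef USE_ROCM
  ops["cuda_only_op"] = &cuda_only_op;
#endif
  return ops;
}

// A guard on a macro the build never defines (IS_ROCM here) always takes
// the "not ROCm" branch, so it hides nothing on a ROCm build.
#ifndef IS_ROCM
constexpr bool kMisnamedGuardHidesOp = false;
#else
constexpr bool kMisnamedGuardHidesOp = true;
#endif

// Look up a registered op by name and call it.
int call_op(const std::string& name, int x) {
  const auto ops = register_ops();
  const auto it = ops.find(name);
  assert(it != ops.end() && "op is not registered in this build");
  return it->second(x);
}
```

With `USE_ROCM` defined, only `portable_op` is registered, while the `IS_ROCM` guard still evaluates as if the build were not ROCm. That mismatch is the inconsistency the review comment points out.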

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Gregory Shtrasberg <156009573+gshtras@users.noreply.github.com>
@AndreasKaratzas
Collaborator

AndreasKaratzas commented Apr 1, 2026

Both of these CI errors look unrelated. One is a type mismatch at the test level, and the other is a TRT-related error.

Update: found one of them: #37831 (comment)

@chaunceyjiang
Collaborator

please rebase the main

@AndreasKaratzas
Collaborator

> please rebase the main

Sure but is there any reason for that? Failures are not related. Why rebase and waste resources?

@tjtanaa
Collaborator

tjtanaa commented Apr 2, 2026

> > please rebase the main
>
> Sure but is there any reason for that? Failures are not related. Why rebase and waste resources?

We can cancel AMD CI if we want to.

Usually, it is helpful to rebase the PR to see whether the tests pass, in case we missed something. Moreover, some of us reviewers might not be experts on the topic but would still like to help merge critical PRs.

I am not sure if that's the intention, but that's the case for some other reviewers.

Some errors could also be hidden and not surface with the correct error messages.

So, depending on the importance of the test cases, we sometimes cannot force-merge PRs and have to wait until the tests are resolved. This is to ensure that we don't keep introducing code that could make triaging harder. (This is on a per-case basis; I don't know if there is any clear policy.)

@vllm-bot vllm-bot merged commit 3aab680 into vllm-project:main Apr 2, 2026
134 of 140 checks passed
@github-project-automation github-project-automation bot moved this from Todo to Done in AMD Apr 2, 2026