[ROCm][CI] Fix accuracy for llama-nemotron-vl pooling tests#37613

Merged
DarkLight1337 merged 5 commits into vllm-project:main from ROCm:akaratza_fix_multi_mod_pooling on Mar 20, 2026
Conversation

@AndreasKaratzas
Collaborator

Follow-up for:

Fixes a small accuracy diff, caused by differences between the HF and vLLM attention backends on ROCm, in the mi250_1 Multi-Modal Models (Extended Pooling) test group.

Motivation: https://buildkite.com/vllm/amd-ci/builds/6701/steps/canvas?sid=019d07a7-1a1e-445a-8480-1feaf029a19d&tab=output

cc @kenroche

Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
@AndreasKaratzas added the rocm (Related to AMD ROCm) and ready (ONLY add when PR is ready to merge/full CI is needed) labels on Mar 19, 2026
@AndreasKaratzas
Collaborator Author

Testing on MI250 to see if the issue is resolved (added the rocm and ready labels).

@github-project-automation bot moved this to Todo in AMD on Mar 19, 2026
@mergify bot added the llama (Related to Llama models) and multi-modality (Related to multi-modality, #4194) labels on Mar 19, 2026
Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request addresses an accuracy issue for llama-nemotron-vl pooling tests on ROCm by generalizing the patch to force SDPA for vision encoders. The changes refactor patch_hf_vision_attn_for_rocm to support more model architectures and apply this patch in the relevant tests. Additionally, the relative tolerance for test assertions is increased for ROCm to account for numerical differences. My feedback includes a suggestion to improve the robustness of the patching logic to prevent potential errors with different model structures in the future.
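The review also notes that the relative tolerance for test assertions is increased on ROCm. A minimal sketch of that pattern, where the helper names and the 1e-2/1e-3 values are illustrative assumptions rather than the PR's actual code:

```python
import math

# Hypothetical helper: use a looser relative tolerance on ROCm, where the
# HF and vLLM attention backends can diverge slightly in numerics.
def pick_rtol(on_rocm: bool) -> float:
    return 1e-2 if on_rocm else 1e-3

def embeddings_close(a, b, rtol):
    # Element-wise relative comparison of two pooled-embedding vectors.
    return all(math.isclose(x, y, rel_tol=rtol) for x, y in zip(a, b))

hf_out = [0.1234, -0.5678]
vllm_out = [0.1236, -0.5679]
assert embeddings_close(hf_out, vllm_out, pick_rtol(on_rocm=True))
```

The point of the design is that the tolerance is widened only on the platform where the backend mismatch exists, so the stricter default still guards the other CI lanes.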

Comment on lines +81 to +83
if hasattr(inner, "vision_embedding"):
    vit = inner.vision_embedding[0]
    for layer in vit.encoder.layers:
        if hasattr(layer, "self_attn"):
            layer.self_attn.vision_config._attn_implementation = "sdpa"
    _patch_encoder_layers(vit.encoder)
Contributor


Severity: high

The current implementation assumes that inner.vision_embedding is a non-empty list and that its first element has an encoder attribute. This could lead to IndexError or AttributeError if a model has a vision_embedding attribute with a different structure. To make this patch more robust and prevent future test failures, it's better to add checks for the list's existence and content, as well as for the presence of the encoder attribute.

Suggested change

From:

if hasattr(inner, "vision_embedding"):
    vit = inner.vision_embedding[0]
    for layer in vit.encoder.layers:
        if hasattr(layer, "self_attn"):
            layer.self_attn.vision_config._attn_implementation = "sdpa"
    _patch_encoder_layers(vit.encoder)

To:

if hasattr(inner, "vision_embedding") and inner.vision_embedding:
    vit = inner.vision_embedding[0]
    if hasattr(vit, "encoder"):
        _patch_encoder_layers(vit.encoder)

Collaborator Author

Done :)

Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
@AndreasKaratzas marked this pull request as ready for review on March 20, 2026 15:14
@AndreasKaratzas
Collaborator Author

Signed-off-by: Andreas Karatzas <akaratza@amd.com>
@DarkLight1337 enabled auto-merge (squash) on March 20, 2026 15:42
@DarkLight1337 merged commit fb4e8bf into vllm-project:main on Mar 20, 2026
20 of 22 checks passed
@github-project-automation bot moved this from Todo to Done in AMD on Mar 20, 2026
@AndreasKaratzas deleted the akaratza_fix_multi_mod_pooling branch on March 20, 2026 17:21
chooper26 pushed a commit to intellistream/vllm-hust that referenced this pull request Mar 21, 2026

Labels

llama: Related to Llama models
multi-modality: Related to multi-modality (#4194)
ready: ONLY add when PR is ready to merge/full CI is needed
rocm: Related to AMD ROCm

Projects

Status: Done


2 participants