[ROCm][CI] Mark gemma3 as large GPU test to avoid OOM on MI250 #37610
DarkLight1337 merged 5 commits into vllm-project:main
Testing MI250 to see if issue is resolved (added |
Code Review
This pull request addresses an Out-Of-Memory error on MI250 for gemma3 tests under ROCm by skipping the Scaled Dot-Product Attention (SDP) override. The change is simple and effective. I've added one suggestion to improve code maintainability by documenting the reason for this skip directly in the code.
3e44e63 to 879b58b
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
700d164 to cdcea97
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
I have just added a large GPU mark, for ROCm only, here. This skips the test when the platform is MI250 and resolves the OOM there.
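The idea behind a platform-specific large-GPU mark can be sketched as follows. This is a minimal illustration, not the code from this PR: `DeviceInfo`, `should_skip_large_gpu_test`, and the 80 GiB threshold are hypothetical names and values chosen for the example, and the 64 GiB figure assumes a single MI250 GCD as seen by the runtime.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass(frozen=True)
class DeviceInfo:
    """Hypothetical stand-in for what a test harness would query at runtime."""
    is_rocm: bool
    total_memory_bytes: int


def should_skip_large_gpu_test(device: DeviceInfo, min_gib: int,
                               rocm_only: bool = True) -> bool:
    """Return True when a test marked as 'large GPU' should be skipped.

    With rocm_only=True the memory requirement applies only on ROCm,
    so other platforms keep running the test unchanged.
    """
    if rocm_only and not device.is_rocm:
        return False
    return device.total_memory_bytes < min_gib * GIB


# Assumed device sizes, for illustration only.
mi250_gcd = DeviceInfo(is_rocm=True, total_memory_bytes=64 * GIB)
big_rocm_gpu = DeviceInfo(is_rocm=True, total_memory_bytes=192 * GIB)
small_cuda_gpu = DeviceInfo(is_rocm=False, total_memory_bytes=24 * GIB)

# 80 GiB is a hypothetical threshold, not the value used in the PR.
skip_on_mi250 = should_skip_large_gpu_test(mi250_gcd, min_gib=80)
skip_on_big_rocm = should_skip_large_gpu_test(big_rocm_gpu, min_gib=80)
skip_on_cuda = should_skip_large_gpu_test(small_cuda_gpu, min_gib=80)
```

In a pytest suite, a boolean like this would typically be passed as the condition to `pytest.mark.skipif(condition, reason=...)` and attached to the gemma3 test case, so the skip is decided at collection time.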
Test group has been confirmed green: https://buildkite.com/vllm/amd-ci/builds/6743/steps/canvas?sid=019d0d33-9d6c-4d3e-a0b7-bd741edf4239&tab=output
Actually, how can a 4B model cause OOM?
I think the profiling stage generates a tensor that is large enough to cause the OOM. It happens during the SDPA stage.
Hmm ok, maybe you should investigate this further, as it's quite unexpected. Let's get the CI to pass first, though.
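As a rough sanity check on how a 4B model could still run out of memory, the size of one materialized attention-score tensor can be estimated. The shapes below are illustrative assumptions, not values taken from the failing CI run; the sketch only shows that the score tensor grows quadratically with sequence length, independently of parameter count.

```python
GIB = 1024 ** 3


def attention_scores_bytes(batch: int, num_heads: int, q_len: int,
                           kv_len: int, bytes_per_elem: int = 2) -> int:
    """Bytes needed to hold a full [batch, heads, q_len, kv_len] score tensor.

    bytes_per_elem=2 corresponds to fp16/bf16 scores.
    """
    return batch * num_heads * q_len * kv_len * bytes_per_elem


# Assumed shapes: 1 sequence, 8 heads, 32768 tokens, 16-bit scores.
size_32k = attention_scores_bytes(batch=1, num_heads=8,
                                  q_len=32768, kv_len=32768)
print(f"{size_32k / GIB:.1f} GiB")  # 16.0 GiB

# Doubling the sequence length quadruples the tensor.
size_64k = attention_scores_bytes(batch=1, num_heads=8,
                                  q_len=65536, kv_len=65536)
print(f"{size_64k / GIB:.1f} GiB")  # 64.0 GiB
```

Under these assumed shapes, a single score tensor at 65536 tokens would already match the 64 GiB of one MI250 GCD, before weights and other activations are counted. Fused attention kernels generally avoid materializing this tensor, so whether this is the actual cause depends on which SDPA path the profiling run takes, which the thread leaves open.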
…project#37610) Signed-off-by: sagformas <sagformas@epdcenter.es>
Follow-up for:
Fixes OOM in mi250_1: Multi-Modal Models (Standard) 2: qwen3 + gemma
Motivation: https://buildkite.com/vllm/amd-ci/builds/6701/steps/canvas?sid=019d07a7-1a19-4174-b4a1-c9bbfff0c164&tab=output
@kenroche