[CI/Build] Fix AMD CI: test_cpu_gpu.py #27388
Merged
yeqcharlotte merged 3 commits into vllm-project:main on Oct 23, 2025
Conversation
Signed-off-by: zhewenli <zhewenli@meta.com>
tests/v1/kv_offload/test_cpu_gpu.py
Outdated
```python
BACKENDS_TO_TEST = [FlashAttentionBackend]

try:
    if current_platform.is_cuda():
```
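The diff above is truncated. As a rough sketch, the conditional-import pattern it points at might look like the following; the module paths and the `except` handling here are assumptions for illustration, not the merged code:

```python
# Sketch only: gate optional, CUDA-only backends behind a platform check so
# the import never executes on ROCm. Module paths are assumptions.
from vllm.platforms import current_platform
from vllm.v1.attention.backends.flash_attn import FlashAttentionBackend

BACKENDS_TO_TEST = [FlashAttentionBackend]

try:
    if current_platform.is_cuda():
        from vllm.v1.attention.backends.flashinfer import FlashInferBackend
        BACKENDS_TO_TEST.append(FlashInferBackend)
except ImportError:
    # FlashInfer may simply not be installed; keep the default backend list.
    pass
```

Note that the review comment below asks to keep the previous behavior for all non-ROCm platforms, i.e. to gate on something like `if not current_platform.is_rocm():` rather than `is_cuda()`.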
DarkLight1337 (Member)
Let's use the previous logic for all non-ROCm platforms, to avoid changing the logic prior to this PR
zhewenli (Collaborator, Author)
@DarkLight1337 sounds good, updated. Since I am working on checking AMD CI in general, I wonder if we want to use current_platform to gate CUDA-specific kernels/features? The context is that I found many tests assume they will be running on a CUDA platform, so these tests fail on AMD.
For example, ROCm uses TritonAttentionImpl by default, which doesn't support encoder models (like openai/whisper).
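As a minimal sketch of such a gate (the test name and model choice are hypothetical; `current_platform.is_cuda()` is the existing vLLM helper):

```python
import pytest

from vllm.platforms import current_platform

# Hypothetical test: skipped entirely on non-CUDA platforms, since the
# ROCm default attention backend cannot serve encoder-decoder models.
@pytest.mark.skipif(
    not current_platform.is_cuda(),
    reason="encoder models (e.g. openai/whisper) need a CUDA attention backend",
)
def test_whisper_encoder_decoder():
    ...
```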
yeqcharlotte approved these changes on Oct 23, 2025
usberkeley pushed a commit to usberkeley/vllm that referenced this pull request on Oct 23, 2025
Signed-off-by: zhewenli <zhewenli@meta.com>
845473182 pushed a commit to raindaywhu/vllm that referenced this pull request on Oct 24, 2025
…o step_forward

* 'step_forward' of https://github.com/raindaywhu/vllm: (148 commits)
  [Model] Add MoE support for NemotronH (vllm-project#25863)
  [Metrics] [KVConnector] Add connector prefix cache hit rate stats (vllm-project#26245)
  [CI] Reorganize entrypoints tests (vllm-project#27403)
  add SLA information into comparison graph for vLLM Benchmark Suite (vllm-project#25525)
  [CI/Build] Fix AMD CI: test_cpu_gpu.py (vllm-project#27388)
  [Bugfix] Fix args settings for guided decoding args (vllm-project#27375)
  [CI/Build] Fix Prithvi plugin test (vllm-project#27393)
  [Chore] Remove duplicate `has_` functions in vllm.utils (vllm-project#27372)
  [Model] Add num_cached_tokens for PoolingRequestOutput (vllm-project#27378)
  [V1][spec decode] return logprobs for spec decoding (vllm-project#26060)
  [CORE] Support Prefix Caching with Prompt Embeds (vllm-project#27219)
  [Bugfix][Core] running queue index leakage exception (vllm-project#26754)
  [Bugfix] Fix incorrect kv cache metrics in grafana.json (vllm-project#27133)
  [Bugfix] Fix SLA tuner initialization (vllm-project#27355)
  [Bugfix] Fix deepseek-ocr multi-image inference and add `merge_by_field_config=True` with tensor schema support (vllm-project#27361)
  [MLA] Bump FlashMLA (vllm-project#27354)
  [Chore] Separate out system utilities from vllm.utils (vllm-project#27201)
  [BugFix] bugfix for Flash Attention MLA with full cuda graph IMA following pr-25490 (vllm-project#27128)
  [Feature] publisher default set zmq in kv_event config (vllm-project#26915)
  [Prefix Cache] Use LoRA name for consistent KV-cache block hashing (vllm-project#27211)
  ...
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request on Oct 26, 2025
Signed-off-by: zhewenli <zhewenli@meta.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
ilmarkov pushed a commit to neuralmagic/vllm that referenced this pull request on Nov 7, 2025
Signed-off-by: zhewenli <zhewenli@meta.com>
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request on Nov 10, 2025
Signed-off-by: zhewenli <zhewenli@meta.com>
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request on Nov 29, 2025
Signed-off-by: zhewenli <zhewenli@meta.com>
Purpose
test_cpu_gpu.py was added in #21448. It tests several attention backends, but some backends are not compatible with every platform; for example, FlashInferBackend is not currently supported on ROCm. This PR refactors the test to conditionally import backends, following the approach used in test_attention_backends.py.
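For context, a hedged sketch of how such a conditionally built backend list is typically consumed by a parametrized test (the test name and stub list here are hypothetical):

```python
import pytest

# Stands in for the conditionally built list from the snippet above, so this
# sketch is self-contained; the real list holds backend classes, not strings.
BACKENDS_TO_TEST = ["FlashAttentionBackend"]

# Parametrize over whatever backends survived the conditional imports, so
# ROCm exercises only its supported subset.
@pytest.mark.parametrize("backend", BACKENDS_TO_TEST)
def test_cpu_gpu_kv_offload(backend):
    # The real test would exercise CPU<->GPU KV-cache offload with `backend`.
    assert backend is not None
```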
Test Plan