[CI][ROCm] Fix NIXL tests on ROCm #31728

Merged
DarkLight1337 merged 2 commits into vllm-project:main from NickLucche:nixl-split-ci-tests2
Jan 8, 2026

@NickLucche
Collaborator

Follow up to #31491 for AMD mirror pipeline + timeout update from latest nightly run.

Signed-off-by: NickLucche <nlucches@redhat.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request updates the CI configuration for NIXL tests, primarily for ROCm. The changes involve renaming a test script to a more generic version and adjusting test timeouts. The script consolidation simplifies the test execution logic, and the timeout adjustments are based on recent test runs, which should improve CI stability. The changes are straightforward and appear correct.

@mergify mergify bot added the ci/build, rocm (Related to AMD ROCm), and kv-connector labels Jan 5, 2026
Signed-off-by: NickLucche <nlucches@redhat.com>
@NickLucche NickLucche added the ready (ONLY add when PR is ready to merge/full CI is needed) label Jan 5, 2026
@tjtanaa tjtanaa added ready-run-all-tests Trigger CI with all tests for wide-ranging PRs and removed ready-run-all-tests Trigger CI with all tests for wide-ranging PRs labels Jan 7, 2026
@tjtanaa
Collaborator

tjtanaa commented Jan 7, 2026

The amdproduction marker does not trigger the unit test related to this PR, so I have retriggered the AMD CI to include the test group that this PR is meant to fix. Not all test groups are passing, so use this run only as a reference for checking whether this PR fixes that specific test group.

@DarkLight1337
Member

The test failed with RuntimeError: NIXL is not available

@tjtanaa
Collaborator

tjtanaa commented Jan 7, 2026

@DarkLight1337 Most likely rocm/vllm-dev:base has not yet picked up the change from this PR (0f35429). We will need @gshtras to trigger the internal pipeline to update rocm/vllm-dev:base. So for now, as long as Nick has validated it, I think it should be fine.

@NickLucche have you run the test group locally on ROCm?

@NickLucche
Collaborator Author

NickLucche commented Jan 8, 2026

@tjtanaa @DarkLight1337 The test runs fine locally with a rixl build. It fails after some time, but that is unrelated to the filename I am fixing here.
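
For readers following the `RuntimeError: NIXL is not available` failure discussed above, here is a minimal sketch of how one might check, inside a given image, whether a transfer backend is importable before running the test group. It is illustrative only: the module names `nixl` and `rixl` are assumptions based on the wording in this thread, and `first_available` and `require_backend` are hypothetical helpers, not vLLM APIs.

```python
import importlib.util


def first_available(candidates):
    """Return the first importable top-level module name in candidates, or None."""
    for name in candidates:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


def require_backend(candidates=("nixl", "rixl")):
    """Raise a RuntimeError when none of the assumed module names is installed."""
    found = first_available(candidates)
    if found is None:
        # Message mirrors the error text quoted in this thread.
        raise RuntimeError("NIXL is not available")
    return found
```

Calling `require_backend()` in the CI image and in a local build would show whether the two environments differ in what is installed, which is the situation described for rocm/vllm-dev:base.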

@DarkLight1337 DarkLight1337 merged commit 83e1c76 into vllm-project:main Jan 8, 2026
18 of 20 checks passed
yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026
Signed-off-by: NickLucche <nlucches@redhat.com>
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: NickLucche <nlucches@redhat.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: NickLucche <nlucches@redhat.com>

Labels

ci/build, kv-connector, ready (ONLY add when PR is ready to merge/full CI is needed), rocm (Related to AMD ROCm)
