
Buildkite hardware ci xpu test #1340

Merged
gcanlin merged 25 commits into vllm-project:main from pi314ever:buildkite-hardware-ci-xpu-test
Mar 14, 2026

Conversation

@pi314ever
Contributor

@pi314ever pi314ever commented Feb 11, 2026

PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS (AT THE BOTTOM) HAVE BEEN CONSIDERED.

Purpose

Adds hardware CI for Intel XPU via buildkite. Relies on #1162 for XPU docker.

Test Plan

BUILDKITE_COMMIT=test VLLM_VERSION=v0.17.0 ./.buildkite/scripts/hardware_ci/run-xpu-test.sh
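
For anyone who wants to launch the same command from a Python wrapper, here is a minimal sketch. The helper names `build_xpu_test_invocation` and `run_xpu_test` are hypothetical and not part of this PR; only the two environment variables and the script path are taken from the command above.

```python
import os
import subprocess

# Script path copied from the test plan; assumed to be relative to the repo root.
SCRIPT = "./.buildkite/scripts/hardware_ci/run-xpu-test.sh"


def build_xpu_test_invocation(commit="test", vllm_version="v0.17.0", base_env=None):
    """Return (argv, env) mirroring the shell one-liner in the test plan.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["BUILDKITE_COMMIT"] = commit
    env["VLLM_VERSION"] = vllm_version
    return [SCRIPT], env


def run_xpu_test(**kwargs):
    """Run the script with the prepared environment and return its exit code."""
    argv, env = build_xpu_test_invocation(**kwargs)
    return subprocess.run(argv, env=env, check=False).returncode
```

Building the environment separately from running the script makes it possible to inspect or log the settings without needing XPU hardware.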

Test Result

All tests pass on 8xB60.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)

@pi314ever pi314ever force-pushed the buildkite-hardware-ci-xpu-test branch 3 times, most recently from 596b8f4 to 22f7ab4 Compare February 18, 2026 18:47
@pi314ever pi314ever mentioned this pull request Feb 20, 2026
1 task
@pi314ever pi314ever force-pushed the buildkite-hardware-ci-xpu-test branch from cab95fa to b071308 Compare February 20, 2026 20:48
@pi314ever pi314ever marked this pull request as ready for review February 20, 2026 20:48

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b0713083a9

Comment thread .buildkite/scripts/hardware_ci/run-xpu-test.sh Outdated
Collaborator

@lishunyang12 lishunyang12 left a comment

Straightforward CI script addition -- a few nits below, mostly around hardening and a stale comment.

Comment thread .buildkite/scripts/hardware_ci/run-xpu-test.sh Outdated
Comment thread .buildkite/scripts/hardware_ci/run-xpu-test.sh Outdated
Comment thread .buildkite/scripts/hardware_ci/run-xpu-test.sh
Comment thread .buildkite/scripts/hardware_ci/run-xpu-test.sh
Comment thread .buildkite/scripts/hardware_ci/run-xpu-test.sh Outdated
Comment thread .buildkite/scripts/hardware_ci/run-xpu-test.sh
@hsliuustc0106
Collaborator

@vllm-omni-reviewer

Collaborator

@lishunyang12 lishunyang12 left a comment

Looks good now, thanks for the quick turnaround on the fixes.

@pi314ever pi314ever force-pushed the buildkite-hardware-ci-xpu-test branch 2 times, most recently from 1e78648 to 18a24ca Compare March 5, 2026 17:43
@pi314ever
Contributor Author

Recent changes are for more feature-based testing (testing for features that have passed for XPU on B60 nodes) as opposed to sweeping all available tests. For future PRs that enable certain features, tests should be added as they are enabled on XPU.

Contributor

@xuechendi xuechendi left a comment

Thanks, @pi314ever, looks good to me.
@hsliuustc0106, @gcanlin, could you take a look? Thanks so much.

@gcanlin
Collaborator

gcanlin commented Mar 7, 2026

> Thanks, @pi314ever, looks good to me. @hsliuustc0106, @gcanlin, may you take a look, thanks so much

We're upgrading to v0.17.0. Please wait until #1639 is merged, then rebase this PR onto v0.17.0. Feel free to ping me again when the rebase is ready :)

@pi314ever pi314ever force-pushed the buildkite-hardware-ci-xpu-test branch from febc254 to dabdb8f Compare March 9, 2026 16:24
@pi314ever
Contributor Author

@gcanlin Rebased!

Comment thread tests/e2e/offline_inference/test_qwen2_5_omni.py Outdated
Collaborator

@gcanlin gcanlin left a comment

LGTM. Just one non-blocking thought: I notice that we're introducing a lot of per-platform config dispatch in tests. Not sure whether we have a way to clean it up.

@xuechendi
Contributor

xuechendi commented Mar 10, 2026

> LGTM. Just one non-blocking thought: I notice that we're introducing many config dispatch for platform in tests. Not sure whether we have the way to clean up them.

Thanks for the review. Let's track this as an AR and follow up in a separate PR.

I think we can at least reuse the existing ones in platforms/xpu/stage_configs, so that all of these configs get tested as well?

We will also need to provide multiple folders under platforms/${HW}/stage_configs/ for different products, since they have different device_memory_size and will therefore need different configs.
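
One way the per-product folder idea could look in test code is sketched below. This is an illustration only: the function `resolve_stage_config`, the product subfolder naming, and the fallback order are assumptions; only the `platforms/<hw>/stage_configs/` directory comes from the comment above.

```python
from pathlib import Path


def resolve_stage_config(root, hw, name, product=None):
    """Pick a stage config for a platform, preferring a product-specific file.

    Hypothetical layout (only platforms/<hw>/stage_configs/ is from the thread):
      platforms/<hw>/stage_configs/<product>/<name>   product-specific
      platforms/<hw>/stage_configs/<name>             platform default
    """
    base = Path(root) / "platforms" / hw / "stage_configs"
    candidates = []
    if product is not None:
        candidates.append(base / product / name)
    candidates.append(base / name)
    # Return the first candidate that exists on disk.
    for path in candidates:
        if path.is_file():
            return path
    raise FileNotFoundError(
        f"no stage config {name!r} for hw={hw!r}, product={product!r}"
    )
```

With a single lookup like this, tests would ask for a config by name and no longer branch on the platform themselves, which is the cleanup the review comment asks about.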

@gcanlin gcanlin added the ready label to trigger buildkite CI label Mar 10, 2026
Comment thread .buildkite/scripts/hardware_ci/run-xpu-test.sh Outdated
Comment thread .buildkite/scripts/hardware_ci/run-xpu-test.sh Outdated
@pi314ever pi314ever force-pushed the buildkite-hardware-ci-xpu-test branch 2 times, most recently from 507fade to 9a25375 Compare March 11, 2026 22:10
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
@pi314ever pi314ever force-pushed the buildkite-hardware-ci-xpu-test branch from 3603e37 to 79bb1ad Compare March 13, 2026 16:43
@pi314ever
Contributor Author

I have fixed all of the conflicts. Here are the latest test results:

core_model and xpu and B60:
[screenshot of test results]

advanced_model and xpu and B60:
[screenshot of test results]

Docker build time: 15s with cache, ~8 min without (vLLM base docker build included)
Total test time: 10 min (7 min core_model, 2 min advanced_model, 1 min aggregate overhead)
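
The labels above ("core_model and xpu and B60") read like pytest `-m` marker expressions. Assuming that is what they are, the sketch below shows how such an expression narrows the test set. It is a simplified stand-in for pytest's own matching (it handles only names joined by "and"), and the test names and marker sets are made up for illustration.

```python
def selected_by(expression, test_markers):
    """Return True if a test's markers satisfy an and-only marker expression.

    Simplified stand-in for pytest's -m matching: handles only marker names
    joined by "and" (no "or", "not", or parentheses).
    """
    required = {part.strip() for part in expression.split(" and ")}
    return required <= set(test_markers)


# Hypothetical tests and their markers, for illustration only.
tests = {
    "test_a": {"core_model", "xpu", "B60"},
    "test_b": {"advanced_model", "xpu", "B60"},
    "test_c": {"core_model", "cuda"},
}

# Keep only the tests that carry every marker named in the expression.
core = [
    name
    for name, marks in tests.items()
    if selected_by("core_model and xpu and B60", marks)
]
```

Under this reading, a test runs on the B60 node only once it carries all three markers, which fits the feature-based approach described earlier in the thread.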

@xuechendi
Contributor

@congw729 @gcanlin @hsliuustc0106
We have validated the script on the prepared CI node; it takes <10 min to complete the entire test run.
Please take a look and see if it is OK.

@gcanlin gcanlin merged commit c107f0b into vllm-project:main Mar 14, 2026
7 checks passed
@hsliuustc0106
Collaborator

Do we currently have access to the hardware to run the pipeline?

wtomin pushed a commit to wtomin/vllm-omni that referenced this pull request Mar 16, 2026
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
lengrongfu pushed a commit to lengrongfu/vllm-omni that referenced this pull request Mar 16, 2026
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
tangbinh pushed a commit to tangbinh/vllm-omni that referenced this pull request Mar 18, 2026
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
yiliu30 pushed a commit to yiliu30/vllm-omni-fork that referenced this pull request Mar 20, 2026
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>

Signed-off-by: yiliu30 <yi4.liu@intel.com>
clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>

Labels

ready label to trigger buildkite CI


7 participants