🚧 CI Blocked: The main CI workflow was not started for the following reason:
Pull request overview
This PR adds support for UCX testing in the CI pipeline by introducing a new test function and making the UCX transport layer selection configurable based on the buffer device type.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/unit_tests/run_accuracy_test.sh | Makes UCX_TLS configurable, using gaudi-specific settings when buffer device is HPU, and restores proper git root detection |
| tests/full_tests/ci_gsm8k_tests.sh | Adds new test function for PD disaggregate through NIXL UCX |
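The transport selection described for `run_accuracy_test.sh` can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the helper name `select_ucx_tls`, the `BUFFER_DEVICE` variable, and the `gaudi_gdr` value used for the HPU case are assumptions made for the example. Only the `tcp` default appears in the diff under review.

```shell
#!/bin/sh
# Hypothetical helper: choose the UCX transport list from the buffer device type.
# "gaudi_gdr" is an assumed stand-in for the Gaudi-specific setting;
# "tcp" is the default shown in the reviewed diff.
select_ucx_tls() {
    case "$1" in
        hpu) echo "gaudi_gdr" ;;
        *)   echo "tcp" ;;
    esac
}

# BUFFER_DEVICE is a hypothetical variable name; it falls back to "cpu" when unset.
UCX_TLS="$(select_ucx_tls "${BUFFER_DEVICE:-cpu}")"
export UCX_TLS
echo "UCX_TLS=$UCX_TLS"
```

Keeping the choice in one function means the test script only has to pass the buffer device once, and the non-HPU path keeps the plain `tcp` default.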
Force-pushed: 156e4c2 to 5480568
Since intel-staging/ucx#1 is not merged yet, I am converting this PR to a draft to avoid repeated CI runs.
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
export VLLM_NIXL_DEVICE_TO_DEVICE=true
UCX_TLS="tcp"
if [ "$VLLM_NIXL_BACKEND" == "UCX" ]; then
    export VLLM_NIXL_DEVICE_TO_DEVICE=false
We will also test UCX + CPU with non_gaudi_gdr, so please don't make
VLLM_NIXL_DEVICE_TO_DEVICE=true the only option.
Fixed: added back VLLM_NIXL_DEVICE_TO_DEVICE=false for the UCX + CPU configuration.
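One way to read the outcome of this thread is sketched below: device-to-device transfer stays enabled by default and is switched off only for the UCX + CPU combination. This is an interpretation of the exchange, not the merged code. The helper `select_device_to_device` and the `BUFFER_DEVICE` variable are hypothetical names. `VLLM_NIXL_BACKEND` and `VLLM_NIXL_DEVICE_TO_DEVICE` come from the diff.

```shell
#!/bin/sh
# Hypothetical helper: decide VLLM_NIXL_DEVICE_TO_DEVICE from the NIXL backend
# and the buffer device. Assumption: only UCX with a CPU buffer disables it.
select_device_to_device() {
    backend="$1"
    buffer_device="$2"
    if [ "$backend" = "UCX" ] && [ "$buffer_device" = "cpu" ]; then
        echo "false"
    else
        echo "true"
    fi
}

# BUFFER_DEVICE is a hypothetical variable name; both inputs fall back to
# illustrative defaults when unset.
VLLM_NIXL_DEVICE_TO_DEVICE="$(select_device_to_device "${VLLM_NIXL_BACKEND:-UCX}" "${BUFFER_DEVICE:-cpu}")"
export VLLM_NIXL_DEVICE_TO_DEVICE
echo "VLLM_NIXL_DEVICE_TO_DEVICE=$VLLM_NIXL_DEVICE_TO_DEVICE"
```

The sketch uses `=` inside `[ ... ]` instead of `==`, since `=` is the POSIX string comparison and also works in bash.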
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
✅ CI Passed: All checks passed successfully against the following vllm commit:
Adds pd nixl UCX test to CI. This is dependent on intel-staging/ucx#1 --------- Signed-off-by: Daniel Huang <daniel1.huang@intel.com> Signed-off-by: Jin, Youzhi <youzhi.jin@intel.com>
Adds pd nixl UCX test to CI. This is dependent on intel-staging/ucx#1 --------- Signed-off-by: Daniel Huang <daniel1.huang@intel.com> Signed-off-by: slokesha <slokeshappa@habana.ai>
Adds pd nixl UCX test to CI. This is dependent on intel-staging/ucx#1 --------- Signed-off-by: Daniel Huang <daniel1.huang@intel.com>