[XPU][P/D] Add XPU support in NixlConnector #22436
vllm-bot merged 4 commits into vllm-project:main
Code Review
This pull request introduces XPU support for the NixlConnector, which is a great enhancement. The changes are mostly correct and follow the existing patterns in the codebase. However, I've found a critical bug in the tensor indexing logic for KV block copying in xpu_model_runner.py, which would lead to incorrect behavior. Additionally, the new test script for accuracy verification has a process cleanup mechanism that could be made more robust and safer for shared environments. I've provided suggestions to fix these issues.
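One way to make the cleanup safer on shared machines is to track only the processes the test itself started and terminate exactly those, rather than matching by name. The test in this PR is a shell script and its cleanup code is not shown here, so the following is only a Python sketch of the idea; `ProcessGroup` is a hypothetical helper, not part of vLLM.

```python
import subprocess
import sys


class ProcessGroup:
    """Tracks only the child processes launched through this object (hypothetical helper)."""

    def __init__(self):
        self._procs = []

    def start(self, argv):
        # Launch a child and remember its handle so cleanup never touches other users' processes.
        proc = subprocess.Popen(argv)
        self._procs.append(proc)
        return proc

    def cleanup(self, timeout=5.0):
        # First ask every still-running child to exit.
        for proc in self._procs:
            if proc.poll() is None:
                proc.terminate()
        # Then wait, and force-kill any child that ignores the request.
        for proc in self._procs:
            try:
                proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
        self._procs.clear()


if __name__ == "__main__":
    group = ProcessGroup()
    # Stand-in for a long-running server process.
    group.start([sys.executable, "-c", "import time; time.sleep(60)"])
    try:
        pass  # run the accuracy checks here
    finally:
        group.cleanup()
```

Because only recorded handles are signalled, unrelated jobs on the same host are left alone even if they run the same binary.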
tests/v1/kv_connector/nixl_integration/run_xpu_disagg_accuracy_test.sh
This pull request has merge conflicts that must be resolved before it can be merged.
jikunshang left a comment
Overall LGTM. Needs one more committer to approve.
Hi @njhill, could you help review this PR?
Please fix the pre-commit checks.
Signed-off-by: zhenwei <zhenwei.liu@intel.com>
Head branch was pushed to by a user without write access
Force-pushed: 35de37b to a39caed
Fixed, thanks.
Signed-off-by: zhenwei <zhenwei.liu@intel.com> Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Support PD disaggregation on XPU based on #18293
Utilizing CPU memory as a buffer and performing point-to-point transmission via NIXL
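As a rough illustration of how the CPU staging buffer might be selected, the sketch below builds a `--kv-transfer-config` argument for a vLLM server. The field names (`kv_connector`, `kv_role`, `kv_buffer_device`) and the `kv_both` role are assumed to match vLLM's KV transfer configuration and should be checked against the vLLM version in use; `build_kv_transfer_arg` is a hypothetical helper.

```python
import json
import shlex


def build_kv_transfer_arg(buffer_device="cpu"):
    """Return a shell-safe --kv-transfer-config argument (hypothetical helper)."""
    # Assumed field names; verify against the KV transfer config of your vLLM version.
    config = {
        "kv_connector": "NixlConnector",
        "kv_role": "kv_both",
        "kv_buffer_device": buffer_device,  # host memory used as the staging buffer
    }
    # shlex.quote keeps the JSON intact when pasted into a shell command line.
    return "--kv-transfer-config " + shlex.quote(json.dumps(config))


if __name__ == "__main__":
    print(build_kv_transfer_arg())
```

The same argument would be passed to both the prefill and the decode instance, each running on its own device.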
Limitation
Example
Verify accuracy
bash tests/v1/kv_connector/nixl_integration/run_xpu_acc_test.sh
Test performance
Scenario: 1K input tokens, 256 output tokens, request-rate=1
1P1D vs vLLM (TP=2) vs vLLM (TP=1):
