🚧 CI Blocked: The main CI workflow was not started for the following reason:
Pull request overview
This PR adds documentation for the heterogeneous PD (Prefill-Decode) disaggregation feature, which allows splitting model execution across CUDA prefill nodes and Gaudi decode nodes. The documentation covers setup requirements, installation procedures, service configuration, and verification steps.
Key Changes:
- Added comprehensive setup guide for CUDA+Gaudi multi-node systems
- Documented installation of NIXL with UCX support
- Provided launch configurations for prefill, decode, and proxy services
c5511f6 to 6ff44c9
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
7e52936 to 5acc621
Adds docs for the #711 feature. I'm not sure where the best place for the docs is, so the placement is flexible.
This docker relies on the following upstream PRs being merged:
Prefill (CUDA) -> Decode (Gaudi): #vllm-project/vllm#30275
Prefill (Gaudi) -> Decode (CUDA): #vllm-project/vllm#30448