[NIXL][OOT platform] support nixl_connector with oot platform and other nixl_backend#25121
Conversation
There was a problem hiding this comment.
Code Review
This pull request extends the nixl_connector to support Out-Of-Tree (OOT) platforms and allows configuring the NIXL backend via an environment variable. The changes are well-structured, following the existing platform interface pattern in vLLM by adding new methods to the Platform class. This makes the integration clean and extensible. I have one suggestion to improve the robustness of handling the new environment variable.
|
@robertgshaw2-redhat @NickLucche please help to review this PR, thanks |
| self.nixl_memory_type) | ||
| logger.debug("Registering descs: %s", caches_data) | ||
| self.nixl_wrapper.register_memory(descs) | ||
| self.nixl_wrapper.register_memory(descs, backends=[self.nixl_backend]) |
There was a problem hiding this comment.
I think using --kv-transfer-config '{"kv_connector":"NixlConnector","kv_role":"kv_both", "kv_connector_extra_config":{"backend":"ucx"}}' might be more appropriate.
NickLucche
left a comment
There was a problem hiding this comment.
Hey @xuechendi , thanks a lot for contributing!
This looks clean, but I also agree with @chaunceyjiang it would be even cleaner to put extra configs in kv_connector_extra_config":{"backend":"ucx"}} instead of adding more env vars.
Also, it would be quite useful to the community if you could add a line or two to the docs highlighting your change, as well as a simple unit test to test_nixl_connector.py checking memory_type/kv_buffer_type are as intended (we have more changes lined up to it in review).
Thank you!
|
@NickLucche, for adding a new UT in test_nixl_connector.py, I added one "test_kv_buffer_to_nixl_memory_types", which is initialized with a FakePlatform, may you check if this is what you expected? |
cec89f3 to
c76b975
Compare
| @@ -63,6 +64,8 @@ | |||
| "tpu": ("cpu", ), | |||
| "xpu": ("cpu", ), | |||
| } | |||
| # support for oot platform by providing mapping in current_platform | |||
| _NIXL_SUPPORTED_DEVICE.update(current_platform.get_nixl_supported_devices()) | |||
There was a problem hiding this comment.
The variable _NIXL_SUPPORTED_DEVICE appears to be unused.
There was a problem hiding this comment.
|
This pull request has merge conflicts that must be resolved before it can be |
c76b975 to
2b365f3
Compare
|
@NickLucche @chaunceyjiang , may you help to check again, or please approve if the PR is looking good to you, Thanks |
NickLucche
left a comment
There was a problem hiding this comment.
LGTM, thanks @xuechendi !
Head branch was pushed to by a user without write access
6578346 to
200ae4c
Compare
Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
ae425a2 to
6e21b38
Compare
Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
Head branch was pushed to by a user without write access
6e21b38 to
5a78f48
Compare
Head branch was pushed to by a user without write access
…er nixl_backend (vllm-project#25121) Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
…er nixl_backend (vllm-project#25121) Signed-off-by: Chendi Xue <Chendi.Xue@intel.com> Signed-off-by: charlifu <charlifu@amd.com>
…er nixl_backend (#25121) Signed-off-by: Chendi Xue <Chendi.Xue@intel.com> Signed-off-by: yewentao256 <zhyanwentao@126.com>
…er nixl_backend (vllm-project#25121) Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
…er nixl_backend (vllm-project#25121) Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
…er nixl_backend (vllm-project#25121) Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
|
Documentation preview: https://vllm--25121.org.readthedocs.build/en/25121/ |
Purpose
Originally, nixl_connector only supports in-tree platforms and UCX as backend. In this PR, we want to extend ths nixl_connector support to OOT platform as well. Also, want to introduce the option to use different NIXL_backend instead of UCX for vllm
Test Plan
use existing test
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.