[test/doc] make NixlConnector example more clear #24249

Merged
NickLucche merged 8 commits into vllm-project:main from panpan0000:nixl-test
Sep 23, 2025

@panpan0000
Contributor

@panpan0000 panpan0000 commented Sep 4, 2025

Purpose

NixlConnector lacks tutorial documentation, and the only reference,
per https://github.com/vllm-project/vllm/blob/main/docs/features/disagg_prefill.md?plain=1#L26
is the test code.

So it's better to make this code clearer.


  • (1) make the consumer and producer roles distinguishable
  • (2) tell users that NCCL_* environment variables no longer apply to NixlConnector: UCX replaces NCCL, so UCX_* variables should be used instead.

For example, UCX_TLS and UCX_NET_DEVICES are the way to configure the underlying communication device or transport; NCCL_IB_HCA and NCCL_SOCKET_IFNAME do not apply.

So in my PR, UCX_NET_DEVICES=all is just a hint to make people aware of that.


If you'd like, I can add two more tips:

  • add the VLLM_NIXL_SIDE_CHANNEL_HOST variable, which is helpful when P and D are on different machines
  • remove VLLM_NIXL_SIDE_CHANNEL_* for the decoder, since it's only needed for the prefiller.

I'm glad to follow up
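
For readers following along, the environment setup described above can be sketched in Python. This is only an illustration under stated assumptions: the helper `nixl_env` and the host and port values are hypothetical placeholders. `UCX_NET_DEVICES` and `VLLM_NIXL_SIDE_CHANNEL_HOST` are the variables named in this description, and `VLLM_NIXL_SIDE_CHANNEL_PORT` is its companion vLLM variable. Any NCCL_* variables are left untouched, since they may still matter elsewhere (for example, for tensor-parallel collectives); they just do not steer the UCX transport.

```python
import os


def nixl_env(base_env, side_channel_host, side_channel_port, ucx_net_devices="all"):
    """Return a copy of base_env with UCX and NIXL side-channel settings added.

    Hypothetical helper for illustration; it is not part of vLLM.
    """
    env = dict(base_env)
    # UCX_* variables (not NCCL_IB_HCA / NCCL_SOCKET_IFNAME) configure the UCX transport.
    env["UCX_NET_DEVICES"] = ucx_net_devices
    # Side-channel address for the NIXL handshake; the values passed in are placeholders.
    env["VLLM_NIXL_SIDE_CHANNEL_HOST"] = side_channel_host
    env["VLLM_NIXL_SIDE_CHANNEL_PORT"] = str(side_channel_port)
    return env


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address; replace it with the prefiller's host.
    prefill_env = nixl_env(os.environ, "192.0.2.10", 5559)
    for key in ("UCX_NET_DEVICES", "VLLM_NIXL_SIDE_CHANNEL_HOST", "VLLM_NIXL_SIDE_CHANNEL_PORT"):
        print(f"{key}={prefill_env[key]}")
```

The resulting dictionary could be passed as `env=` to `subprocess.Popen` when launching an instance; whether the decoder needs the side-channel variables at all is the open question raised in the second tip.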

@mergify mergify Bot added the v1 label Sep 4, 2025
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request improves the clarity of the NixlConnector integration test script. The changes correctly distinguish between kv_producer and kv_consumer roles, which is more explicit than the previous kv_both. Additionally, the inclusion of the UCX_NET_DEVICES=all environment variable serves as a useful hint for users transitioning from NCCL to UCX for communication configuration. The formatting of the command construction has also been improved for better readability. The changes are sound and achieve the stated goal of making the example clearer.

@panpan0000
Contributor Author

@chaunceyjiang

Collaborator

@chaunceyjiang chaunceyjiang left a comment

Thanks~

#21800 (comment)
It seems there’s still a lack of tutorials on NixlConnector’s xPyD. Could you help add one?

Collaborator

@NickLucche NickLucche left a comment

Thanks for the PR @panpan0000 !

I agree with you that the lack of documentation for the NixlConnector can be frustrating, but I don't think editing this script is the best path to clarity.

I would be happier to see a more specific docs page on the topic, with a list of references, e.g. nixl unit tests, llm-d guides (https://github.com/llm-d/llm-d/tree/dev/guides/pd-disaggregation), and basic nixl setup.
I can help with some Justfiles to get started too.

(1) we can tell between consumer and producer role
(2) tell user NCCL_* environment variables are no longer applicable to NixlConnector, but UCX replaces NCCL, so UCX_* variable should be used instead.

1 - Being able to specify kv_both here is a "feature, not a bug": the connector makes no assumption about its role and provides symmetric functionality.
2 - This isn't generally true, as nixl supports backends other than ucx, although ucx is indeed the main transport library.
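
To make point 1 concrete, here is a small sketch of where the role appears in the `--kv-transfer-config` JSON passed to `vllm serve`. The helper names, the model name, and the port are placeholders chosen for illustration; only `kv_connector`, `kv_role`, and the three role values come from vLLM's KV transfer configuration. The sketch only builds the command line and does not start a server.

```python
import json

# Role values accepted by vLLM's KV transfer configuration.
VALID_ROLES = ("kv_both", "kv_producer", "kv_consumer")


def kv_transfer_config(role="kv_both"):
    """Build the JSON string for --kv-transfer-config (hypothetical helper)."""
    if role not in VALID_ROLES:
        raise ValueError(f"unknown kv_role: {role!r}")
    return json.dumps({"kv_connector": "NixlConnector", "kv_role": role})


def serve_command(model, port, role="kv_both"):
    """Return the argv list for one vLLM instance (hypothetical helper)."""
    return [
        "vllm", "serve", model,
        "--port", str(port),
        "--kv-transfer-config", kv_transfer_config(role),
    ]


if __name__ == "__main__":
    # Model name and ports are placeholders; both instances use the symmetric role.
    for port in (8100, 8200):
        print(" ".join(serve_command("Qwen/Qwen3-0.6B", port)))
```

With `kv_both`, the prefill and decode instances share an identical connector configuration, which is the symmetry described in point 1; passing `kv_producer` or `kv_consumer` instead only makes the intended role explicit to a reader of the script.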

@panpan0000 panpan0000 requested a review from hmellor as a code owner September 17, 2025 09:15
@mergify mergify Bot added the documentation Improvements or additions to documentation label Sep 17, 2025
@panpan0000
Contributor Author

Thanks @NickLucche, the doc has been added per @chaunceyjiang's suggestion. Could you please review it again? Thanks.

@panpan0000 panpan0000 changed the title [test] make NixlConnector example more clear [test/doc] make NixlConnector example more clear Sep 17, 2025
Comment thread docs/features/nixl_connector_usage.md Outdated
Comment thread docs/features/nixl_connector_usage.md Outdated
@panpan0000 panpan0000 force-pushed the nixl-test branch 3 times, most recently from c89986a to 2af5028 Compare September 17, 2025 11:42
@panpan0000
Contributor Author

Thank you @hmellor , all fixed

Collaborator

@NickLucche NickLucche left a comment

Thank you for the doc page!!
Left some comments

Comment thread docs/features/nixl_connector_usage.md
Comment thread docs/features/nixl_connector_usage.md Outdated
Comment thread docs/features/nixl_connector_usage.md Outdated
Comment thread docs/features/nixl_connector_usage.md Outdated
Comment thread docs/features/nixl_connector_usage.md Outdated
Comment thread docs/features/nixl_connector_usage.md Outdated
Comment thread tests/v1/kv_connector/nixl_integration/run_accuracy_test.sh
@Alan-D-Chen

Thank you very much for the work of all the experts. I have tried to understand your work, but I find it somewhat difficult. SGLang and Dynamo have already provided PD-disaggregated inference tutorials that are very user-friendly for readers. Perhaps these can bring better inspiration to everyone.

https://docs.sglang.ai/advanced_features/pd_disaggregation.html
https://github.com/ai-dynamo/dynamo/blob/v0.3.2/examples/sglang/multinode-examples.md

@NickLucche
Collaborator

Great point @Alan-D-Chen !

panpan0000 and others added 8 commits September 23, 2025 10:36
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Peter Pan <peter.pan@daocloud.io>
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Signed-off-by: Nicolò Lucchesi<nicolo.lucchesi@gmail.com>
@panpan0000
Contributor Author

Thank you for your time again, @NickLucche , your comments are all fixed :-)

Collaborator

@NickLucche NickLucche left a comment

lgtm, thanks for contributing @panpan0000 !

@github-project-automation github-project-automation Bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Sep 23, 2025
@NickLucche NickLucche enabled auto-merge (squash) September 23, 2025 13:36
@github-actions github-actions Bot added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 23, 2025
@NickLucche NickLucche merged commit da5e7e4 into vllm-project:main Sep 23, 2025
29 checks passed
@Alan-D-Chen

Alan-D-Chen commented Nov 21, 2025

Great point @Alan-D-Chen !

Many thanks to the kind developers. 🏵️🏵️🏵️🏵️🏵️🏵️
So, where can we find a good user manual for the Prefill/Decode (PD) disaggregated inference feature of the useful vLLM framework? Or perhaps I'm thinking about this wrong, and what I really need is more of an adventurous spirit?
