
[Misc] Delay EPLB Nixl import until needed #41805

Merged
vllm-bot merged 1 commit into vllm-project:main from NickLucche:fix-nixl-import-eplb
May 7, 2026

Conversation

@NickLucche
Collaborator

Starting a simple colocated vLLM instance with a dense model (EP obviously disabled) still emits these unrelated log lines, which are printed whenever nixl is imported:

> vllm serve mistralai/Mistral-Medium-3.5-128B -tp 4 --port 8004 --max-model-len 32k
(APIServer pid=377662) INFO 05-06 09:34:02 [utils.py:299]
(APIServer pid=377662) INFO 05-06 09:34:02 [utils.py:299]        █     █     █▄   ▄█
(APIServer pid=377662) INFO 05-06 09:34:02 [utils.py:299]  ▄▄ ▄█ █     █     █ ▀▄▀ █  version 0.20.2rc1.dev69+g66d1cc0c7
(APIServer pid=377662) INFO 05-06 09:34:02 [utils.py:299]   █▄█▀ █     █     █     █  model   mistralai/Mistral-Medium-3.5-128B
(APIServer pid=377662) INFO 05-06 09:34:02 [utils.py:299]    ▀▀  ▀▀▀▀▀ ▀▀▀▀▀ ▀     ▀
(APIServer pid=377662) INFO 05-06 09:34:02 [utils.py:299]
 [ . . . ]
==>(APIServer pid=377662) INFO 05-06 09:34:04 [nixl_utils.py:20] Setting UCX_RCACHE_MAX_UNRELEASED to '1024' to avoid a rare memory leak in UCX when using NIXL.
==>(APIServer pid=377662) INFO 05-06 09:34:04 [nixl_utils.py:32] NIXL is available

(APIServer pid=377662) INFO 05-06 09:34:04 [scheduler.py:239] Chunked prefill is enabled with max_num_batched_tokens=8192.

This PR simply delays importing nixl until it is needed, that is, when the user has requested EPLB with the nixl backend:

> vllm serve mistralai/Mistral-Medium-3.5-128B -tp 4 --port 8004 --max-model-len 32k

(APIServer pid=388945) INFO 05-06 09:55:28 [utils.py:299]
(APIServer pid=388945) INFO 05-06 09:55:28 [utils.py:299]        █     █     █▄   ▄█
(APIServer pid=388945) INFO 05-06 09:55:28 [utils.py:299]  ▄▄ ▄█ █     █     █ ▀▄▀ █  version 0.20.2rc1.dev69+g66d1cc0c7
(APIServer pid=388945) INFO 05-06 09:55:28 [utils.py:299]   █▄█▀ █     █     █     █  model   mistralai/Mistral-Medium-3.5-128B
(APIServer pid=388945) INFO 05-06 09:55:28 [utils.py:299]    ▀▀  ▀▀▀▀▀ ▀▀▀▀▀ ▀     ▀
(APIServer pid=388945) INFO 05-06 09:55:28 [utils.py:299]
[ . . .]
(APIServer pid=388945) INFO 05-06 09:55:30 [model.py:563] Resolved architecture: PixtralForConditionalGeneration
(APIServer pid=388945) INFO 05-06 09:55:30 [model.py:1692] Using max model len 32000
(APIServer pid=388945) INFO 05-06 09:55:31 [scheduler.py:239] Chunked prefill is enabled with max_num_batched_tokens=8192.
(APIServer pid=388945) INFO 05-06 09:55:31 [vllm.py:844] Asynchronous scheduling is enabled.
(APIServer pid=388945) INFO 05-06 09:55:31 [kernel.py:212] Final IR op priority after setting platform defaults: IrOpPriorityConfig(rms_norm=['native'], fused_add_rms_norm=['native'])

cc @ilmarkov

Signed-off-by: NickLucche <nlucches@redhat.com>

@claude claude Bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@mergify mergify Bot added the kv-connector label May 6, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request refactors vllm/distributed/eplb/eplb_communicator.py by moving the imports of NixlWrapper and nixl_agent_config from the module level into the has_nixl function and the communicator's init method. This change facilitates lazy loading of these dependencies. As there were no review comments provided, I have no additional feedback to offer.
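For readers unfamiliar with the pattern, here is a minimal sketch of deferring an import from module level into the functions that need it. It is not the code from this PR: `noisy_backend`, `Wrapper`, `has_backend`, and `Communicator` are hypothetical stand-ins, and the fake module is written to a temporary directory only so the example runs without nixl installed.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a dependency whose import has side effects
# (in the PR the real dependency is nixl; this fake module is an assumption).
os.environ.pop("NOISY_BACKEND_LOADED", None)  # start from a clean baseline
_tmp_dir = tempfile.mkdtemp()
Path(_tmp_dir, "noisy_backend.py").write_text(
    "import os\n"
    "os.environ['NOISY_BACKEND_LOADED'] = '1'  # import-time side effect\n"
    "class Wrapper:\n"
    "    pass\n"
)
sys.path.insert(0, _tmp_dir)


def has_backend() -> bool:
    """Report whether the backend is importable.

    The import is attempted here, on demand, rather than when this
    module itself is loaded.
    """
    try:
        importlib.import_module("noisy_backend")
    except ImportError:
        return False
    return True


class Communicator:
    """Hypothetical communicator that needs the backend only when built."""

    def __init__(self) -> None:
        # Deferred import: the side effects fire only if this class is
        # actually instantiated, i.e. when the backend was requested.
        from noisy_backend import Wrapper

        self.wrapper = Wrapper()
```

Loading a module written this way leaves `noisy_backend` out of `sys.modules` until `has_backend()` or `Communicator()` is called, which is why a run that never selects the backend would not see its import-time logs.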

@ilmarkov
Contributor

ilmarkov commented May 6, 2026

I think there is already a PR, #41392, that does similar things.

@NickLucche
Collaborator Author

@yewentao256 #41392 looks maybe a bit overly complicated. Are you sure we don't just need lazy imports for EPLB?

Member

@yewentao256 yewentao256 left a comment


LGTM, thanks for the work!

I think both PRs can be landed, as they do different things.
This PR fixes the issue for now, and the previous PR will avoid similar issues in the future.
@NickLucche Would appreciate your approval on the previous PR as well.

@yewentao256 yewentao256 added the ready label ("ONLY add when PR is ready to merge/full CI is needed") May 6, 2026
@ZhanqiuHu
Contributor

LGTM! This is helpful. I think later on if needed we could also wrap the imports in nixl_utils as util functions (e.g. get_nixl_wrapper(), get_nixl_agent_config()) with @lru_cache or something to avoid triggering unnecessary side effects?
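A rough sketch of what such a cached accessor could look like, assuming a generic helper rather than the `get_nixl_wrapper()` / `get_nixl_agent_config()` functions named above (which do not exist in this PR). `get_optional_module` is a hypothetical name, and the standard-library `json` module merely stands in for the optional dependency.

```python
import importlib
from functools import lru_cache
from types import ModuleType
from typing import Optional


@lru_cache(maxsize=None)
def get_optional_module(name: str) -> Optional[ModuleType]:
    """Import `name` on first use and cache the outcome.

    Hypothetical helper: the import (and any side effects it has) runs
    at most once per module name, and only when a caller asks for it.
    A failed import is cached as None too, so it is not retried.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Usage: nothing is imported until the first call.
backend = get_optional_module("json")  # stand-in for the real dependency
if backend is not None:
    encoded = backend.dumps({"backend": "available"})
```

One trade-off of caching the failure case is that installing the dependency mid-process is not picked up without calling `get_optional_module.cache_clear()`.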

@vllm-bot vllm-bot merged commit 9d6500b into vllm-project:main May 7, 2026
63 of 65 checks passed
alexagriffith pushed a commit to alexagriffith/vllm that referenced this pull request May 7, 2026
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: alexagriffith <agriffith96@gmail.com>
libinta pushed a commit to libinta/vllm that referenced this pull request May 8, 2026
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Libin Tang <libin.tang@intel.com>

Labels

kv-connector, ready


6 participants