lazy load vllm.utils.serial_utils import tensor2base64 to avoid break. #30094

Closed
QiliangCui wants to merge 1 commit into vllm-project:main from QiliangCui:dev1204
Conversation

@QiliangCui
Contributor

@QiliangCui QiliangCui commented Dec 4, 2025

Purpose

Fix the vLLM-on-TPU loading issue.

After PR #29970, vLLM on TPU fails at load time with:

(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843] EngineCore failed to start.
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843] Traceback (most recent call last):
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "/workspace/vllm/vllm/v1/engine/core.py", line 834, in run_engine_core
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]     engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "/workspace/vllm/vllm/v1/engine/core.py", line 610, in __init__
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]     super().__init__(
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "/workspace/vllm/vllm/v1/engine/core.py", line 102, in __init__
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]     self.model_executor = executor_class(vllm_config)
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "/workspace/vllm/vllm/v1/executor/abstract.py", line 101, in __init__
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]     self._init_executor()
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "/workspace/vllm/vllm/v1/executor/uniproc_executor.py", line 46, in _init_executor
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]     self.driver_worker.init_worker(all_kwargs=[kwargs])
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "/workspace/vllm/vllm/v1/worker/worker_base.py", line 255, in init_worker
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]     worker_class = resolve_obj_by_qualname(
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]                    ^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "/workspace/vllm/vllm/utils/import_utils.py", line 122, in resolve_obj_by_qualname
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]     module = importlib.import_module(module_name)
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "/usr/local/lib/python3.12/importlib/__init__.py", line 90, in import_module
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]     return _bootstrap._gcd_import(name[level:], package, level)
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "<frozen importlib._bootstrap_external>", line 999, in exec_module
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]   File "/workspace/vllm/vllm/v1/worker/tpu_worker.py", line 41, in <module>
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843]     import torch_xla.core.xla_model as xm
(EngineCore_DP0 pid=309) ERROR 12-04 07:44:22 [core.py:843] ModuleNotFoundError: No module named 'torch_xla'

Lazy loading the vllm.utils.serial_utils import addresses this: the import chain that eventually pulls in torch_xla is no longer triggered at module load time.
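The pattern can be sketched as follows. This is a minimal, self-contained illustration, not the actual vLLM diff: `heavy_serializer` and `to_base64` are stand-ins for `vllm.utils.serial_utils` and `tensor2base64`, and the stand-in module is registered by hand so the sketch runs anywhere.

```python
import base64
import sys
import types


def encode_base64(data: bytes) -> str:
    # Deferred import: the (stand-in) heavy module is resolved on the
    # first call, not when this file is loaded. In vLLM, the fix moves
    # `from vllm.utils.serial_utils import tensor2base64` from module
    # level into the method that uses it, the same way.
    import heavy_serializer  # stand-in for vllm.utils.serial_utils
    return heavy_serializer.to_base64(data)


# Register a stand-in for the heavy dependency so this sketch is runnable
# without vLLM or torch_xla installed.
heavy = types.ModuleType("heavy_serializer")
heavy.to_base64 = lambda b: base64.b64encode(b).decode()
sys.modules["heavy_serializer"] = heavy

print(encode_base64(b"ok"))  # prints "b2s="
```

Because the import happens inside the function body, merely importing the module that defines `encode_base64` no longer executes the dependency chain; the chain runs only when encoding is actually requested.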

Test Plan

  1. Wait for the CI/CD tests.

  2. Manually load vLLM on TPU with:

vllm serve \
  --model=Qwen/Qwen2.5-7B-Instruct   \
  --download_dir /mnt/disks/persist \
  --tensor-parallel-size=1   \
  --swap-space=16   \
  --enable-chunked-prefill   \
  --max-model-len=128

With the fix, the model loads successfully.

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

…ing tpu.

Signed-off-by: Qiliang Cui <derrhein@gmail.com>
@mergify mergify bot added the multi-modality Related to multi-modality (#4194) label Dec 4, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a module loading issue by lazy-loading the tensor2base64 utility. The change correctly moves the import statement from the module level into the encode_base64 method where it is used. This is a standard and appropriate approach to resolve import-related problems, preventing an undesirable import chain from being triggered at application startup. The implementation is sound and effectively resolves the issue described in the pull request.

@DarkLight1337
Member

Sorry for breaking this

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) December 5, 2025 04:44
@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 5, 2025
@QiliangCui
Contributor Author

Thank you @DarkLight1337! No problem! We will add some tests in the vLLM main branch so we will know if a change impacts TPU.

Jun from the TPU team merged a fix in the TPU branch (vllm-project/tpu-inference#1251), so I don't need to update this for now.

@QiliangCui QiliangCui closed this Dec 5, 2025
auto-merge was automatically disabled December 5, 2025 15:31

Pull request was closed
