Description
Your current environment
@KuntaiDu could you please take a look?
[Bug]: When using disagg_example_p2p_nccl_xpyd.sh, only one of the prefill (P) and decode (D) instances starts successfully each time; the other instance fails with the error below. Could you help take a look?
INFO 09-10 21:20:09 [__init__.py:1152] Found nccl from library libnccl.so.2
Traceback (most recent call last):
File "/opt/ac2/bin/vllm", line 8, in <module>
sys.exit(main())
^^^^^^
File "/opt/ac2/lib/python3.12/site-packages/vllm/entrypoints/cli/main.py", line 65, in main
args.dispatch_function(args)
File "/opt/ac2/lib/python3.12/site-packages/vllm/entrypoints/cli/serve.py", line 55, in cmd
uvloop.run(run_server(args))
File "/opt/ac2/lib/python3.12/site-packages/uvloop/__init__.py", line 109, in run
return __asyncio.run(
^^^^^^^^^^^^^^
File "/opt/ac2/lib/python3.12/asyncio/runners.py", line 194, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/opt/ac2/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
File "/opt/ac2/lib/python3.12/site-packages/uvloop/__init__.py", line 61, in wrapper
return await main
^^^^^^^^^^
File "/opt/ac2/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 1431, in run_server
await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
File "/opt/ac2/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 1451, in run_server_worker
async with build_async_engine_client(args, client_config) as engine_client:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ac2/lib/python3.12/contextlib.py", line 210, in __aenter__
return await anext(self.gen)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/ac2/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 158, in build_async_engine_client
async with build_async_engine_client_from_engine_args(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ac2/lib/python3.12/contextlib.py", line 210, in __aenter__
return await anext(self.gen)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/ac2/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 194, in build_async_engine_client_from_engine_args
async_llm = AsyncLLM.from_vllm_config(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ac2/lib/python3.12/site-packages/vllm/v1/engine/async_llm.py", line 162, in from_vllm_config
return cls(
^^^^
File "/opt/ac2/lib/python3.12/site-packages/vllm/v1/engine/async_llm.py", line 124, in __init__
self.engine_core = EngineCoreClient.make_async_mp_client(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ac2/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 96, in make_async_mp_client
return AsyncMPClient(*client_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ac2/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 666, in __init__
super().__init__(
File "/opt/ac2/lib/python3.12/site-packages/vllm/v1/engine/core_client.py", line 403, in __init__
with launch_core_engines(vllm_config, executor_class,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ac2/lib/python3.12/contextlib.py", line 144, in __exit__
next(self.gen)
File "/opt/ac2/lib/python3.12/site-packages/vllm/v1/engine/utils.py", line 434, in launch_core_engines
wait_for_engine_startup(
File "/opt/ac2/lib/python3.12/site-packages/vllm/v1/engine/utils.py", line 484, in wait_for_engine_startup
raise RuntimeError("Engine core initialization failed. "
RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
🐛 Describe the bug
no
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.