I installed ochat into a new venv with pip3 install ochat.
Then I ran the server with python -m ochat.serving.openai_api_server --model openchat/openchat-3.6-8b-20240522 --model-type openchat_3.6
However, when I try the following curl request:
curl http://localhost:18888/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "openchat_3.6", "messages": [{"role": "user", "content": "You are a large language model named OpenChat. Write a poem to describe yourself"}] }'
I get an Internal Server Error. The server prints this:
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/home/vojta/ochat/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/home/vojta/ochat/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
return await self.app(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
await super().__call__(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/applications.py", line 123, in __call__
await self.middleware_stack(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
raise exc
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
await self.app(scope, receive, _send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in __call__
await self.app(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/routing.py", line 756, in __call__
await self.middleware_stack(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/routing.py", line 776, in app
await route.handle(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/routing.py", line 297, in handle
await self.app(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/routing.py", line 77, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/routing.py", line 72, in app
response = await func(request)
File "/home/vojta/ochat/lib/python3.10/site-packages/fastapi/routing.py", line 278, in app
raw_response = await run_endpoint_function(
File "/home/vojta/ochat/lib/python3.10/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
return await dependant.call(**values)
File "/home/vojta/ochat/lib/python3.10/site-packages/ochat/serving/openai_api_server.py", line 188, in create_chat_completion
result_generator = engine.generate(prompt=None,
TypeError: AsyncLLMEngine.generate() got an unexpected keyword argument 'prompt'
Any idea where the problem might be?
Running on an RTX 3090:
Mon Jun 10 11:41:19 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3090        On  | 00000000:4C:00.0 Off |                  N/A |
|  0%   42C    P8              18W / 350W |  21469MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     10265      C   python                                    21452MiB |
+---------------------------------------------------------------------------------------+
I had a similar issue. It looks like it's caused by a change in the vLLM AsyncLLMEngine.generate() API between v0.4.2 and v0.4.3 (the prompt keyword argument was deprecated in favor of inputs). Downgrading vLLM fixed the issue for me.
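For reference, the downgrade is just pinning the version in the same venv with pip3 install "vllm==0.4.2" and restarting the server. Alternatively, here is a minimal sketch of how the failing call in ochat/serving/openai_api_server.py could be adapted to the vLLM >= 0.4.3 API; the names input_ids, sampling_params, and request_id are assumptions based on the traceback, not the exact ochat source:

```python
# Sketch only (untested): on vLLM >= 0.4.3, AsyncLLMEngine.generate() takes a
# single `inputs` argument instead of the old `prompt=` / `prompt_token_ids=`
# pair. A dict with "prompt_token_ids" is accepted as a TokensPrompt.
# `input_ids`, `sampling_params`, and `request_id` are assumed local names.
result_generator = engine.generate(
    inputs={"prompt_token_ids": input_ids},
    sampling_params=sampling_params,
    request_id=request_id,
)
```

Until the call site is updated upstream, pinning vllm==0.4.2 is probably the safer option, since other parts of ochat may also rely on the pre-0.4.3 interface.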