
TypeError: AsyncLLMEngine.generate() got an unexpected keyword argument 'prompt' #225

Open
zombak79 opened this issue Jun 10, 2024 · 1 comment

zombak79 commented Jun 10, 2024

I installed ochat into a new venv with pip3 install ochat.

Then I ran the server with python -m ochat.serving.openai_api_server --model openchat/openchat-3.6-8b-20240522 --model-type openchat_3.6

However, when I try curl http://localhost:18888/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "openchat_3.6", "messages": [{"role": "user", "content": "You are a large language model named OpenChat. Write a poem to describe yourself"}] }'

I get an Internal Server Error and the server prints this:

ERROR:    Exception in ASGI application
Traceback (most recent call last):
File "/home/vojta/ochat/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
  result = await app(  # type: ignore[func-returns-value]
File "/home/vojta/ochat/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
  return await self.app(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
  await super().__call__(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/applications.py", line 123, in __call__
  await self.middleware_stack(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
  raise exc
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
  await self.app(scope, receive, _send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in __call__
  await self.app(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
  await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
  raise exc
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
  await app(scope, receive, sender)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/routing.py", line 756, in __call__
  await self.middleware_stack(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/routing.py", line 776, in app
  await route.handle(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/routing.py", line 297, in handle
  await self.app(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/routing.py", line 77, in app
  await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
  raise exc
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
  await app(scope, receive, sender)
File "/home/vojta/ochat/lib/python3.10/site-packages/starlette/routing.py", line 72, in app
  response = await func(request)
File "/home/vojta/ochat/lib/python3.10/site-packages/fastapi/routing.py", line 278, in app
  raw_response = await run_endpoint_function(
File "/home/vojta/ochat/lib/python3.10/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
  return await dependant.call(**values)
File "/home/vojta/ochat/lib/python3.10/site-packages/ochat/serving/openai_api_server.py", line 188, in create_chat_completion
  result_generator = engine.generate(prompt=None,
TypeError: AsyncLLMEngine.generate() got an unexpected keyword argument 'prompt'

Any idea where the problem might be?


Running on an RTX 3090:

Mon Jun 10 11:41:19 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3090        On  | 00000000:4C:00.0 Off |                  N/A |
|  0%   42C    P8              18W / 350W |  21469MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     10265      C   python                                    21452MiB |
+---------------------------------------------------------------------------------------+
@jlewis200

I had a similar issue. It looks like it's due to a change in the vLLM AsyncLLMEngine.generate() API from v0.4.2 to v0.4.3 (the prompt keyword argument was replaced by inputs). Downgrading vLLM fixed the issue for me.
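
For anyone who would rather keep the newer vLLM than downgrade, here is a minimal sketch (not ochat's actual code; engine, input_ids, sampling_params and request_id are placeholders for whatever ochat builds around openai_api_server.py line 188) of how the call can be written to work on both sides of the 0.4.3 change:

from vllm import SamplingParams
from vllm.engine.async_llm_engine import AsyncLLMEngine


def start_generation(engine: AsyncLLMEngine,
                     input_ids: list,
                     sampling_params: SamplingParams,
                     request_id: str):
    try:
        # vLLM >= 0.4.3: a single `inputs` argument; a dict with
        # "prompt_token_ids" replaces the old prompt/prompt_token_ids pair.
        return engine.generate(
            inputs={"prompt_token_ids": input_ids},
            sampling_params=sampling_params,
            request_id=request_id,
        )
    except TypeError:
        # vLLM <= 0.4.2: prompt text and token ids are separate keyword args.
        return engine.generate(
            prompt=None,
            sampling_params=sampling_params,
            request_id=request_id,
            prompt_token_ids=input_ids,
        )

Otherwise, pinning vLLM below the breaking release (e.g. pip install "vllm<0.4.3") avoids touching the code at all.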
