Add vllm serve to wrap vllm.entrypoints.openai.api_server #4167

simon-mo wants to merge 5 commits into vllm-project:main

Conversation
```python
TIMEOUT_KEEP_ALIVE = 5  # seconds

engine: AsyncLLMEngine = None
```
shouldn't this be `Optional`?
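For reference, a minimal sketch of that suggestion, assuming `engine` is the module-level global from the snippet above; the `Optional` annotation and the import path are part of the suggestion, not necessarily the merged code:

```python
from typing import Optional

from vllm.engine.async_llm_engine import AsyncLLMEngine

TIMEOUT_KEEP_ALIVE = 5  # seconds

# Annotating the global as Optional makes the "not initialized yet" state
# explicit to type checkers; the engine is only constructed once the server
# actually starts.
engine: Optional[AsyncLLMEngine] = None
```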
```python
if __name__ == "__main__":
    # NOTE(simon):
    # This section should be in sync with vllm/scripts.py for CLI entrypoints.
```
any way to add a simple regression test for this?
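One possible shape for such a test, sketched under the assumption that the package is installed with the `vllm` console script; the file name and the `--help` check are hypothetical and not part of this PR:

```python
# tests/test_cli_entrypoints.py (hypothetical)
import subprocess
import sys

import pytest


@pytest.mark.parametrize("cmd", [
    [sys.executable, "-m", "vllm.entrypoints.openai.api_server", "--help"],
    ["vllm", "serve", "--help"],
])
def test_cli_help_exits_cleanly(cmd):
    # Both entrypoints should parse --help without crashing, which catches
    # drift between api_server's __main__ block and vllm/scripts.py.
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=120)
    assert result.returncode == 0, result.stderr
```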
```python
        usage="vllm serve <model_tag> [options]")
    make_arg_parser(serve_parser)
    # Override the `--model` optional argument, make it positional.
    serve_parser.add_argument("model", type=str, help="The model tag to serve")
```
What happens if someone runs `vllm serve --model ...`?
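A minimal repro of the question using plain `argparse` rather than the real parser returned by `make_arg_parser()`, so the actual vllm behavior may differ; the default value shown is illustrative:

```python
import argparse

parser = argparse.ArgumentParser(prog="vllm serve")
# The pre-existing optional flag (default shown here is illustrative).
parser.add_argument("--model", type=str, default="facebook/opt-125m")
# The new positional argument writing to the same dest.
parser.add_argument("model", type=str, help="The model tag to serve")

# `vllm serve --model foo` fails because the positional is still required:
#   error: the following arguments are required: model
# `vllm serve foo --model bar` parses, and the value processed last wins:
print(parser.parse_args(["foo", "--model", "bar"]).model)  # -> "bar"
```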
```python
    serve_parser.set_defaults(func=run_server)

    args = parser.parse_args()
    if hasattr(args, "func"):
```
This part of the code is confusing. Could you add a comment to explain what it does?
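For what it's worth, a self-contained sketch of that dispatch pattern with the kind of comment being requested; the `run_server` body and the `else` branch printing help are assumptions for illustration, not quotes from the diff:

```python
import argparse


def run_server(args: argparse.Namespace) -> None:
    print(f"serving {args.model} ...")  # placeholder for the real server launch


parser = argparse.ArgumentParser(prog="vllm")
subparsers = parser.add_subparsers()
serve_parser = subparsers.add_parser("serve")
serve_parser.add_argument("model", type=str, help="The model tag to serve")
serve_parser.set_defaults(func=run_server)

args = parser.parse_args()
# Each subparser registers its handler through set_defaults(func=...),
# e.g. serve_parser.set_defaults(func=run_server) above. If a subcommand
# was given, args.func exists and we dispatch to it; if the user ran bare
# `vllm` with no subcommand, fall back to printing the top-level help.
if hasattr(args, "func"):
    args.func(args)
else:
    parser.print_help()
```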
```bash
vllm serve ... \
    --chat-template ./path-to-chat-template.jinja
```
Based on #4709, the :prog: value under CLI args (line 111) should be updated to vllm serve.
Is there any update on this? Having the command simplifies installation with the wonderful pipx tool, which manages virtual environments automatically:

```bash
pipx install vllm
# now you can use the command
vllm serve --help
```

In the current state you cannot use it this way.
Would be nice if #4794 is also made available via CLI (perhaps ...).
Please refer to #5090 for the complete new CLI.

Easier to type. It will now be ...