Add vllm serve to wrap vllm.entrypoints.openai.api_server #4167

simon-mo wants to merge 5 commits into vllm-project:main

Conversation
```python
TIMEOUT_KEEP_ALIVE = 5  # seconds

engine: AsyncLLMEngine = None
```
shouldn't this be `Optional`?
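For reference, a minimal sketch of that suggestion, assuming `engine` is the module-level global from the snippet above; the `Optional` annotation and the import path are part of the suggestion, not necessarily the merged code:

```python
from typing import Optional

from vllm.engine.async_llm_engine import AsyncLLMEngine

TIMEOUT_KEEP_ALIVE = 5  # seconds

# Annotating the global as Optional makes the "not initialized yet" state
# explicit to type checkers; the engine is only constructed once the server
# actually starts.
engine: Optional[AsyncLLMEngine] = None
```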
```python
if __name__ == "__main__":
    # NOTE(simon):
    # This section should be in sync with vllm/scripts.py for CLI entrypoints.
```
any way to add a simple regression test for this?
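One possible shape for such a test, sketched under the assumption that the package is installed with the `vllm` console script; the file name and the `--help` check are hypothetical and not part of this PR:

```python
# tests/test_cli_entrypoints.py (hypothetical)
import subprocess
import sys

import pytest


@pytest.mark.parametrize("cmd", [
    [sys.executable, "-m", "vllm.entrypoints.openai.api_server", "--help"],
    ["vllm", "serve", "--help"],
])
def test_cli_help_exits_cleanly(cmd):
    # Both entrypoints should parse --help without crashing, which catches
    # drift between api_server's __main__ block and vllm/scripts.py.
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=120)
    assert result.returncode == 0, result.stderr
```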
```python
        usage="vllm serve <model_tag> [options]")
    make_arg_parser(serve_parser)
    # Override the `--model` optional argument, make it positional.
    serve_parser.add_argument("model", type=str, help="The model tag to serve")
```
What happens if someone runs `vllm serve --model ...`?
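A minimal repro of the question using plain `argparse` rather than the real parser returned by `make_arg_parser()`, so the actual vllm behavior may differ; the default value shown is illustrative:

```python
import argparse

parser = argparse.ArgumentParser(prog="vllm serve")
# The pre-existing optional flag (default shown here is illustrative).
parser.add_argument("--model", type=str, default="facebook/opt-125m")
# The new positional argument writing to the same dest.
parser.add_argument("model", type=str, help="The model tag to serve")

# `vllm serve --model foo` fails because the positional is still required:
#   error: the following arguments are required: model
# `vllm serve foo --model bar` parses, and the value processed last wins:
print(parser.parse_args(["foo", "--model", "bar"]).model)  # -> "bar"
```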
```python
    serve_parser.set_defaults(func=run_server)

    args = parser.parse_args()
    if hasattr(args, "func"):
```
This part of the code is confusing. Could you add a comment to explain what it does?
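For what it's worth, a self-contained sketch of that dispatch pattern with the kind of comment being requested; the `run_server` body and the `else` branch printing help are assumptions for illustration, not quotes from the diff:

```python
import argparse


def run_server(args: argparse.Namespace) -> None:
    print(f"serving {args.model} ...")  # placeholder for the real server launch


parser = argparse.ArgumentParser(prog="vllm")
subparsers = parser.add_subparsers()
serve_parser = subparsers.add_parser("serve")
serve_parser.add_argument("model", type=str, help="The model tag to serve")
serve_parser.set_defaults(func=run_server)

args = parser.parse_args()
# Each subparser registers its handler through set_defaults(func=...),
# e.g. serve_parser.set_defaults(func=run_server) above. If a subcommand
# was given, args.func exists and we dispatch to it; if the user ran bare
# `vllm` with no subcommand, fall back to printing the top-level help.
if hasattr(args, "func"):
    args.func(args)
else:
    parser.print_help()
```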
```bash
vllm serve ... \
    --chat-template ./path-to-chat-template.jinja
```
Based on #4709, the :prog: value under CLI args (line 111) should be updated to vllm serve.
Is there any update on this? Having the command simplifies installation with the wonderful pipx tool, which manages virtual environments automatically:

```bash
pipx install vllm
# now you can use the command
vllm serve --help
```

In the current state you cannot use it this way.
Would be nice if #4794 is also made available via CLI (perhaps ...).
Please refer to #5090 for the complete new CLI.

Easier to type. It will now be ...