[Feature]: benchmark_serving.py should support --logprobs #8193

Closed
1 task done
afeldman-nm opened this issue Sep 5, 2024 · 0 comments · Fixed by #8191

afeldman-nm commented Sep 5, 2024

🚀 The feature, motivation and pitch

The OpenAI completions API (and by extension vLLM's OpenAI-compatible completions endpoint) lets each request configure the number of logprobs-per-token to return via the logprobs argument. However, benchmarks/benchmark_serving.py neither sets the logprobs argument on the requests it generates nor exposes a --logprobs CLI argument. This matters because it is desirable to benchmark the impact of different logprobs settings on vLLM performance.

The proposal is twofold: (1) benchmarks/benchmark_serving.py should support a --logprobs CLI argument, and (2) that argument's value should set the logprobs field of the completion requests generated during benchmarking.
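For illustration, a minimal sketch of what the change could look like. The --logprobs flag and the logprobs request field follow the OpenAI completions API; the parser setup, payload construction, and model name below are simplified stand-ins, not the actual benchmark_serving.py code:

import argparse

parser = argparse.ArgumentParser(
    description="Benchmark the online serving throughput.")
parser.add_argument(
    "--logprobs",
    type=int,
    default=None,
    help="Number of logprobs-per-token to request; omitted from the "
    "request payload when None, matching current behavior.",
)
args = parser.parse_args()

# When building each completions request, forward the CLI value:
payload = {
    "model": "meta-llama/Llama-2-7b-hf",  # placeholder model name
    "prompt": "Hello, my name is",
    "max_tokens": 16,
    "logprobs": args.logprobs,  # None -> server default (no logprobs)
}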

Alternatives

In tests/utils.py, the function

completions_with_server_args(
    prompts: List[str],
    model_name: str,
    server_cli_args: List[str],
    num_logprobs: Optional[int],
    max_wait_seconds: int = 240,
)

shows how to configure OpenAI API requests with the logprobs argument set.
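As a standalone illustration (not the helper above), the same effect can be achieved with the openai client against a running vLLM OpenAI-compatible server; the base URL, API key, and model name here are assumptions:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed vLLM server address
    api_key="EMPTY",  # vLLM does not require a real key by default
)

completion = client.completions.create(
    model="meta-llama/Llama-2-7b-hf",  # placeholder model name
    prompt="Hello, my name is",
    max_tokens=16,
    logprobs=5,  # request the top-5 logprobs for each generated token
)
print(completion.choices[0].logprobs)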

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.