Adding the option to avoid displaying tqdm bars at inference with vllm #1004

Merged (2 commits), Jun 24, 2024
outlines/models/vllm.py: 10 changes (8 additions, 2 deletions)
@@ -30,6 +30,7 @@ def generate(
     sampling_parameters: SamplingParameters,
     *,
     sampling_params: Optional["SamplingParams"] = None,
+    use_tqdm: bool = True,
 ):
     """Generate text using vLLM.

@@ -47,11 +48,13 @@ def generate(
         An instance of `SamplingParameters`, a dataclass that contains
         the name of the sampler to use and related parameters as available
         in Outlines.
-    samplng_params
+    sampling_params
         An instance of `vllm.sampling_params.SamplingParams`. The values
         passed via this dataclass supersede the values of the parameters
         in `generation_parameters` and `sampling_parameters`. See the
         vLLM documentation for more details: https://docs.vllm.ai/en/latest/dev/sampling_params.html.
+    use_tqdm
+        Whether to display a progress bar during inference.

     Returns
     -------
@@ -103,7 +106,10 @@ def generate(
         sampling_params.use_beam_search = True

     results = self.model.generate(
-        prompts, sampling_params=sampling_params, lora_request=self.lora_request
+        prompts,
+        sampling_params=sampling_params,
+        lora_request=self.lora_request,
+        use_tqdm=use_tqdm,
     )
     results = [[sample.text for sample in batch.outputs] for batch in results]
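
For reviewers who want to see the flag from the caller's side, here is a minimal usage sketch. It assumes the Outlines generator forwards extra keyword arguments (as it already does for `sampling_params`) down to this `generate` method; the model name and prompt below are placeholders.

# Sketch only: assumes keyword arguments such as use_tqdm are forwarded from
# the generator call to VLLM.generate; model name and prompt are placeholders.
import outlines
from vllm import SamplingParams

model = outlines.models.vllm("facebook/opt-125m")
generator = outlines.generate.text(model)

# use_tqdm=False is passed through to vllm.LLM.generate, which suppresses
# the progress bar during inference.
answer = generator(
    "Summarize the weather in one sentence.",
    sampling_params=SamplingParams(max_tokens=32),
    use_tqdm=False,
)
print(answer)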
