[Feat] Add audio benchmarking support /v1/audio/transcriptions#99
Closed
b8zhong wants to merge 1 commit into CentML:main from bzhng-development:add-audio-benchmarks
Conversation
Commits: small cleanup · remove unused · small cleanup · small cleanup · small cleanup · small cleanup · Small refactor.
Author
Maybe cc @benchislett, thanks in advance 👍
Contributor
Thanks @b8zhong, sorry for the delayed review!
benchislett
reviewed
Jun 5, 2025
calculate_metrics(output["inputs"], output["outputs"], output["time"], tokenizer, output["stream"])
simplified_inputs = None
if args.backend == "openai-audio":
    simplified_inputs = [(req["prompt"], req["prompt_len"], req["output_len"]) for req in prepared_requests_data]
Contributor
This if/else looks the same in both branches
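The reviewer's point can be sketched as follows; the function and field names here are hypothetical, loosely following the diff, and only illustrate why an if/else whose branches are identical can be dropped:

```python
# Sketch of the review point: when both branches of an if/else compute the
# same expression, the conditional adds nothing. Names are hypothetical.

def simplify_with_branch(prepared_requests_data, backend):
    # Before: branches on backend, but both paths build the same tuples.
    if backend == "openai-audio":
        return [(req["prompt"], req["prompt_len"], req["output_len"])
                for req in prepared_requests_data]
    else:
        return [(req["prompt"], req["prompt_len"], req["output_len"])
                for req in prepared_requests_data]

def simplify(prepared_requests_data):
    # After: the branch is dropped entirely.
    return [(req["prompt"], req["prompt_len"], req["output_len"])
            for req in prepared_requests_data]
```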
benchislett
reviewed
Jun 5, 2025
if args.output_file:
    filename = args.output_file
if args.num_of_imgs_per_req:
    w, h = args.img_ratios_per_req[idx]
Contributor
was this code moved, or intentionally removed? If the latter, for what reason?
Contributor
Hi @b8zhong, thanks a lot for the contribution. Currently we don't plan to support audio models for benchmarking, so adding support is a bit premature; we will reopen this PR when audio support is added to our inference engine.
Author
No problem, thanks Xin + Benjamin for reviewing anyway 👍
Add Audio Transcription Benchmarking
vLLM has supported Whisper since vllm-project/vllm#12909, and TensorRT-LLM has a Whisper example at https://github.com/NVIDIA/TensorRT-LLM/tree/release/0.19/examples/whisper (I haven't personally tried the latter).
Support ASR model benchmarking via `/v1/audio/transcriptions`.

Changes:
- New `openai-audio` backend.
- `ASRDataset` class for loading/preparing ASR samples from Hugging Face datasets (e.g., LibriSpeech, Common Voice, AMI), including temporary file management. Mostly lifted from vLLM.
- New CLI args (`--audio-dataset-name`, etc.) for ASR data configuration.
- `RequestFuncInput`, `main.py`, and `Client.py` updated to integrate the audio pipeline.
- `librosa`, `soundfile`, `datasets` dependencies added. We can move these to an extra `[audio]` if necessary as well.

Signed-off-by: Brayden Zhong b8zhong@uwaterloo.ca
Co-authored-by: @vincentzed
Example:
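As an illustrative sketch (not the PR's own example), a benchmarking client can assemble the multipart body for `/v1/audio/transcriptions` using only the standard library. The `file` and `model` field names follow the OpenAI audio API; the filename, model value, and helper name are placeholders:

```python
import io
import uuid

def build_transcription_request(audio_bytes: bytes, model: str):
    """Build headers and a multipart/form-data body for a hypothetical
    POST to /v1/audio/transcriptions.

    The "file" and "model" field names follow the OpenAI audio API;
    the filename and model string are illustrative.
    """
    boundary = uuid.uuid4().hex
    body = io.BytesIO()

    def part(headers: str, payload: bytes):
        # Each part: boundary line, part headers, blank line, payload.
        body.write(f"--{boundary}\r\n{headers}\r\n\r\n".encode())
        body.write(payload)
        body.write(b"\r\n")

    part('Content-Disposition: form-data; name="model"', model.encode())
    part('Content-Disposition: form-data; name="file"; filename="sample.wav"\r\n'
         'Content-Type: audio/wav', audio_bytes)
    body.write(f"--{boundary}--\r\n".encode())  # closing boundary

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return headers, body.getvalue()
```

The returned headers and body can then be sent with any HTTP client (e.g. `urllib.request` or `aiohttp`) to a server exposing the endpoint.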