Fix quantization in Whisper model export #26353

jiafatom · 2025-10-18T18:55:39Z

Description

Fix quantization in Whisper model export

Motivation and Context

As titled.

onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py

github-actions

You can commit the suggested changes from lintrunner.

onnxruntime/python/tools/transformers/models/whisper/whisper_chain.py

kunal-vaishnavi · 2025-10-18T23:16:35Z

Since the big models CI pipeline was deleted, we should verify that these commands all still work.

jiafatom · 2025-10-19T15:14:34Z

I verified that many cases work. The remaining issue is the --model_impl openai case.
When we run
python3 -m models.whisper.convert_to_onnx -m openai/whisper-tiny --model_impl openai --output wtiny-fp32-cpu-oai --precision fp32 --provider cpu --overwrite --use_external_data_format --optimize_onnx --no_beam_search_op --output_cross_qk
we see
TypeError: scaled_dot_product_attention(): argument 'is_causal' must be bool, not Tensor

However, the openai-whisper bug is already mentioned in the script:

       # Run forward pass
       # NOTE: There is a bug with openai-whisper==20240930 with the introduction of SDPA.
       # In the Whisper codebase, the following line
       #
       # is_causal = mask is not None and n_ctx > 1
       #
       # has been added where `mask` is a torch tensor. The right-hand side evaluates to `tensor(True/False)`
       # but `is_causal` only accepts the boolean value. The fix is to apply `.item()` after the right-hand
       # side has been evaluated. In other words, the line should be
       #
       # is_causal = (mask is not None and n_ctx > 1).item()
       #
       # instead.
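
The behavior described in that comment can be reproduced without installing torch or openai-whisper. The following is a minimal sketch, not the Whisper code itself: `FakeTensorFlag` and `as_python_bool` are hypothetical names introduced here, and the stand-in class only mimics the `.item()` method of a zero-dimensional boolean tensor. It shows that Python's `and` returns its right operand unchanged, so the flag stays tensor-like until `.item()` is applied.

```python
class FakeTensorFlag:
    """Hypothetical stand-in for a zero-dimensional boolean tensor."""

    def __init__(self, value):
        self._value = bool(value)

    def item(self):
        # Mirrors the tensor method that returns a plain Python scalar.
        return self._value


def as_python_bool(flag):
    """Return a plain bool, unwrapping tensor-like flags with .item()."""
    if isinstance(flag, bool):
        return flag
    return bool(flag.item())


mask = object()                      # any non-None mask
n_ctx_check = FakeTensorFlag(True)   # plays the role of a tensor-valued `n_ctx > 1`

# `True and x` evaluates to `x`, so the result is still tensor-like, not a bool.
is_causal = mask is not None and n_ctx_check
still_tensor_like = not isinstance(is_causal, bool)

# Applying .item() (via the helper) yields the bool that `is_causal` requires.
is_causal = as_python_bool(is_causal)
```

The helper also passes plain bools through, which matters when `mask` is `None`: the expression then short-circuits to `False`, and calling `.item()` on it directly would raise an `AttributeError`.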

kunal-vaishnavi · 2025-10-19T19:14:20Z

For the --model_impl openai case, can you verify it works with the version of openai-whisper in requirements.txt?

onnxruntime/onnxruntime/python/tools/transformers/models/whisper/requirements.txt

Line 3 in 7cc28b0

openai-whisper==20240927

Separately, an internal customer mentioned there were export issues with torch==2.9.0 while export succeeded with torch==2.7.0.

onnxruntime/onnxruntime/python/tools/transformers/models/whisper/requirements.txt

Line 1 in 7cc28b0

torch>=2.7.0

Let's pin the torch version in requirements.txt to 2.7.0.
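
A quick way to confirm that an environment matches these pins before running the export is sketched below. It assumes the torch pin proposed above is adopted as 2.7.0. `EXPECTED_PINS` and `find_pin_mismatches` are hypothetical names used only for illustration, and the only library call is the standard `importlib.metadata.version`.

```python
from importlib import metadata

# Versions discussed in this thread (assumption: torch ends up pinned to 2.7.0).
EXPECTED_PINS = {"openai-whisper": "20240927", "torch": "2.7.0"}


def find_pin_mismatches(expected, get_version=metadata.version):
    """Return {package: (wanted, installed)} for every package that differs."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build suffixes such as "+cu126" when comparing.
        base = installed.split("+")[0] if installed else None
        if base != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_pin_mismatches(EXPECTED_PINS).items():
        print(f"{name}: expected {wanted}, found {installed}")
```

The `get_version` parameter exists only so the check can be exercised with a fake lookup; in normal use the default queries the installed distributions.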

jiafatom · 2025-10-19T19:41:43Z

python3 -m models.whisper.convert_to_onnx -m openai/whisper-tiny --model_impl openai --output wtiny-fp32-cpu-oai --precision fp32 --provider cpu --overwrite --use_external_data_format --optimize_onnx --no_beam_search_op --output_cross_qk

Tested with openai-whisper==20240927 and it works!
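
Since the thread asks for the export commands to be re-verified by hand, the command above could be wrapped in a small script. This is only a sketch: `build_export_command` and `run_export` are hypothetical helpers, every flag is copied from the command above, and it assumes the script is launched from the directory where `models.whisper.convert_to_onnx` resolves as a module.

```python
import shlex
import subprocess
import sys


def build_export_command(model_impl, output_dir):
    """Build the argv list for the fp32 CPU export shown in this thread."""
    return [
        sys.executable, "-m", "models.whisper.convert_to_onnx",
        "-m", "openai/whisper-tiny",
        "--model_impl", model_impl,
        "--output", output_dir,
        "--precision", "fp32",
        "--provider", "cpu",
        "--overwrite",
        "--use_external_data_format",
        "--optimize_onnx",
        "--no_beam_search_op",
        "--output_cross_qk",
    ]


def run_export(model_impl, output_dir):
    """Run the export and return its exit code (0 means success)."""
    cmd = build_export_command(model_impl, output_dir)
    print("Running:", shlex.join(cmd))
    return subprocess.run(cmd, check=False).returncode
```

Calling `run_export("openai", "wtiny-fp32-cpu-oai")` would repeat the case tested here; a nonzero return code signals a failed export.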


jiafatom requested a review from kunal-vaishnavi October 18, 2025 18:55

github-advanced-security bot found potential problems Oct 18, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (fixed)

github-actions bot reviewed Oct 18, 2025

github-advanced-security bot found potential problems Oct 18, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (fixed)
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (fixed)

jiafatom force-pushed the whisper_quant branch from aa04494 to f8b42d3 October 18, 2025 19:49

github-actions bot reviewed Oct 18, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (outdated, resolved)

jiafatom force-pushed the whisper_quant branch from f8b42d3 to 9fdea3a October 18, 2025 19:59

Fix quantization in Whisper model export
0431c56

jiafatom force-pushed the whisper_quant branch from 9fdea3a to 0431c56 October 18, 2025 22:37

github-actions bot reviewed Oct 18, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (resolved)
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (outdated, resolved)

Update onnxruntime/python/tools/transformers/models/whisper/convert_t…
f8670fa
…o_onnx.py Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

kunal-vaishnavi reviewed Oct 18, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (outdated, resolved)

kunal-vaishnavi reviewed Oct 18, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (outdated, resolved)

kunal-vaishnavi reviewed Oct 18, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (resolved)

kunal-vaishnavi reviewed Oct 18, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (outdated, resolved)

kunal-vaishnavi reviewed Oct 18, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (outdated, resolved)

kunal-vaishnavi reviewed Oct 18, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (outdated, resolved)

jiafatom force-pushed the whisper_quant branch from 93e6193 to 9d873de October 19, 2025 15:13

kunal-vaishnavi reviewed Oct 19, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (outdated, resolved)

kunal-vaishnavi reviewed Oct 19, 2025
onnxruntime/python/tools/transformers/models/whisper/README.md (outdated, resolved)

kunal-vaishnavi reviewed Oct 19, 2025
onnxruntime/python/tools/transformers/models/whisper/README.md (outdated, resolved)

kunal-vaishnavi reviewed Oct 19, 2025
onnxruntime/python/tools/transformers/models/whisper/README.md (outdated, resolved)

jiafatom force-pushed the whisper_quant branch from 9d873de to d792d5e October 19, 2025 19:42

github-advanced-security bot found potential problems Oct 19, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (fixed)

github-actions bot reviewed Oct 19, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (outdated, resolved)

github-advanced-security bot found potential problems Oct 19, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (fixed)

jiafatom force-pushed the whisper_quant branch from d792d5e to a5e5248 October 19, 2025 19:49

kunal-vaishnavi reviewed Oct 20, 2025
onnxruntime/python/tools/transformers/models/whisper/convert_to_onnx.py (resolved)

jiafatom force-pushed the whisper_quant branch from a5e5248 to a2f8ded October 20, 2025 01:22

kunal-vaishnavi previously approved these changes Oct 20, 2025

Update onnxruntime/python/tools/transformers/models/whisper/convert_t…
1a38cbe
…o_onnx.py Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

jiafatom dismissed kunal-vaishnavi’s stale review via 1a38cbe October 20, 2025 02:24

jiafatom force-pushed the whisper_quant branch from a2f8ded to 1a38cbe October 20, 2025 02:24

github-actions bot reviewed Oct 20, 2025
onnxruntime/test/python/transformers/test_generation.py (resolved)
onnxruntime/test/python/transformers/test_generation.py (resolved)

jiafatom and others added 2 commits October 19, 2025 19:32

Update onnxruntime/test/python/transformers/test_generation.py
1191ccf
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update onnxruntime/test/python/transformers/test_generation.py
ff42202
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

kunal-vaishnavi reviewed Oct 20, 2025
onnxruntime/test/python/transformers/test_generation.py (resolved)

kunal-vaishnavi approved these changes Oct 20, 2025

kunal-vaishnavi merged commit ed7847c into main Oct 20, 2025
95 of 100 checks passed

kunal-vaishnavi deleted the whisper_quant branch October 20, 2025 18:24
