
Issue with converting Whisper model to ONNX #1040

Open

AvivSham opened this issue Nov 19, 2024 · 6 comments

Labels: bug (Something isn't working)

System Info

Created a new environment from the following requirements file:

transformers[torch]==4.46.1
onnxruntime==1.19.2
optimum==1.23.3
onnx==1.16.2
onnxconverter-common==1.14.0
tqdm==4.66.5
onnxslim==0.1.36
--extra-index-url https://pypi.ngc.nvidia.com
onnx_graphsurgeon==0.3.27

System info:
Mac M2
Converting on the CPU device
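
For completeness, the environment was set up roughly as follows (assuming the list above is saved as requirements.txt):

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt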

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description

We are attempting to convert whisper-small using the HF model openai/whisper-small by executing the command specified in the README file.
python -m scripts.convert --quantize --model_id openai/whisper-small

We get the following trace:

TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
  elif len(self.key_cache[layer_idx]) == 0:  # fills previously skipped layers; checking for tensor causes errors
Found different candidate ONNX initializers (likely duplicate) for the tied weights:
	model.decoder.embed_tokens.weight: {'model.decoder.embed_tokens.weight'}
	proj_out.weight: {'onnx::MatMul_3259'}
Found different candidate ONNX initializers (likely duplicate) for the tied weights:
	model.decoder.embed_tokens.weight: {'model.decoder.embed_tokens.weight'}
	proj_out.weight: {'onnx::MatMul_2910'}
		-[x] values not close enough, max diff: 0.024361729621887207 (atol: 0.001)
		-[x] values not close enough, max diff: 6.988886833190918 (atol: 0.001)
		-[x] values not close enough, max diff: 5.208465576171875 (atol: 0.001)
		-[x] values not close enough, max diff: 1.9965003728866577 (atol: 0.001)
		-[x] values not close enough, max diff: 1.4132819175720215 (atol: 0.001)
		-[x] values not close enough, max diff: 0.8667690753936768 (atol: 0.001)
		-[x] values not close enough, max diff: 3.7726752758026123 (atol: 0.001)
		-[x] values not close enough, max diff: 2.159898519515991 (atol: 0.001)
		-[x] values not close enough, max diff: 12.425561904907227 (atol: 0.001)
		-[x] values not close enough, max diff: 1.2728543281555176 (atol: 0.001)
		-[x] values not close enough, max diff: 6.912049770355225 (atol: 0.001)
		-[x] values not close enough, max diff: 1.0248034000396729 (atol: 0.001)
		-[x] values not close enough, max diff: 7.5350022315979 (atol: 0.001)
		-[x] values not close enough, max diff: 1.6307682991027832 (atol: 0.001)
		-[x] values not close enough, max diff: 7.0035505294799805 (atol: 0.001)
		-[x] values not close enough, max diff: 0.8978527784347534 (atol: 0.001)
		-[x] values not close enough, max diff: 5.2730207443237305 (atol: 0.001)
		-[x] values not close enough, max diff: 1.0290248394012451 (atol: 0.001)
		-[x] values not close enough, max diff: 5.59857177734375 (atol: 0.001)
		-[x] values not close enough, max diff: 1.0392111539840698 (atol: 0.001)
		-[x] values not close enough, max diff: 4.692121505737305 (atol: 0.001)
		-[x] values not close enough, max diff: 1.080666184425354 (atol: 0.001)
		-[x] values not close enough, max diff: 2.687824249267578 (atol: 0.001)
		-[x] values not close enough, max diff: 1.6337403059005737 (atol: 0.001)
		-[x] values not close enough, max diff: 2.598097801208496 (atol: 0.001)
		-[x] values not close enough, max diff: 1.6576173305511475 (atol: 0.001)
Validation for the model models/openai/whisper-small/encoder_model.onnx raised: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 0.001:
- last_hidden_state: max diff = 0.024361729621887207
The ONNX export succeeded with the warning: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 0.001:
- logits: max diff = 6.988886833190918
- present.0.decoder.key: max diff = 5.208465576171875
- present.0.decoder.value: max diff = 1.9965003728866577
- present.1.decoder.key: max diff = 1.4132819175720215
- present.1.decoder.value: max diff = 0.8667690753936768
- present.2.decoder.key: max diff = 3.7726752758026123
- present.2.decoder.value: max diff = 2.159898519515991
- present.3.decoder.key: max diff = 12.425561904907227
- present.3.decoder.value: max diff = 1.2728543281555176
- present.4.decoder.key: max diff = 6.912049770355225
- present.4.decoder.value: max diff = 1.0248034000396729
- present.5.decoder.key: max diff = 7.5350022315979
- present.5.decoder.value: max diff = 1.6307682991027832
- present.6.decoder.key: max diff = 7.0035505294799805
- present.6.decoder.value: max diff = 0.8978527784347534
- present.7.decoder.key: max diff = 5.2730207443237305
- present.7.decoder.value: max diff = 1.0290248394012451
- present.8.decoder.key: max diff = 5.59857177734375
- present.8.decoder.value: max diff = 1.0392111539840698
- present.9.decoder.key: max diff = 4.692121505737305
- present.9.decoder.value: max diff = 1.080666184425354
- present.10.decoder.key: max diff = 2.687824249267578
- present.10.decoder.value: max diff = 1.6337403059005737
- present.11.decoder.key: max diff = 2.598097801208496
- present.11.decoder.value: max diff = 1.6576173305511475.
 The exported model was saved at: models/openai/whisper-small

None of the outputs meets the default tolerance, and for most of them the difference is more than three orders of magnitude above it.
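
For anyone who wants to double-check this outside the exporter's own validation, a minimal comparison of the PyTorch encoder against the exported encoder_model.onnx could look like the sketch below (the ONNX input name input_features is assumed from the default export):

import numpy as np
import torch
import onnxruntime as ort
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model_id = "openai/whisper-small"
model = WhisperForConditionalGeneration.from_pretrained(model_id).eval()
processor = WhisperProcessor.from_pretrained(model_id)

# 30 s of silence at 16 kHz -> log-mel features of shape (1, 80, 3000)
audio = np.zeros(16000 * 30, dtype=np.float32)
features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features

# Reference output from the PyTorch encoder
with torch.no_grad():
    ref = model.model.encoder(features).last_hidden_state.numpy()

# Output from the exported (pre-quantization) ONNX encoder
sess = ort.InferenceSession("models/openai/whisper-small/encoder_model.onnx")
onnx_out = sess.run(None, {"input_features": features.numpy()})[0]

print("max abs diff:", np.abs(ref - onnx_out).max())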
@xenova can you please help with this?

Thanks,

Reproduction

just run:
python -m scripts.convert --quantize --model_id openai/whisper-small

@AvivSham AvivSham added the bug Something isn't working label Nov 19, 2024
@AvivSham AvivSham changed the title Issue with converting whisper model to ONNX Issue with converting Whisper model to ONNX Nov 19, 2024
xenova (Collaborator) commented Nov 21, 2024

Thanks @AvivSham, I am able to reproduce the issue. The same thing happens with other Whisper variants. @echarlaix this looks to be an issue with Optimum, as I'm able to reproduce it with optimum-cli. 👀 Any idea what's up?
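
For reference, the exact command isn't shown here; a typical optimum-cli export invocation for this model would be something like:

optimum-cli export onnx --model openai/whisper-small whisper-small-onnx/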

AvivSham (Author):

bumping...

xenova (Collaborator) commented Nov 28, 2024

@AvivSham In the meantime, can you try downgrading transformers to 4.38.2 (i.e., "transformers_version": "4.38.2")?
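
For example, assuming a pip-based setup:

pip install "transformers[torch]==4.38.2"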

AvivSham (Author) commented Dec 2, 2024

We downgraded transformers to 4.38.2, and still none of the model versions (small/medium/large) meets the threshold. The trace looks much the same for all versions:

Found different candidate ONNX initializers (likely duplicate) for the tied weights:
        model.decoder.embed_tokens.weight: {'model.decoder.embed_tokens.weight'}
        proj_out.weight: {'onnx::MatMul_8717'}
Found different candidate ONNX initializers (likely duplicate) for the tied weights:
        model.decoder.embed_tokens.weight: {'model.decoder.embed_tokens.weight'}
        proj_out.weight: {'onnx::MatMul_7406'}
                -[x] values not close enough, max diff: 0.006899833679199219 (atol: 0.001)
The ONNX export succeeded with the warning: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 0.001:
- last_hidden_state: max diff = 0.006899833679199219.
 The exported model was saved at: models/openai/whisper-medium

However, we see two differences:

  1. The validation is less informative (or perhaps it's just the logging): only a single output is checked against the atol threshold.
  2. The difference from the threshold is much smaller.

@xenova

xenova (Collaborator) commented Dec 2, 2024

@AvivSham Those differences are negligible, and the model will produce similar results to the Python version!

Looks like we need to investigate what broke in a recent update to transformers. cc @echarlaix

AvivSham (Author) commented Dec 2, 2024

@echarlaix @xenova thanks!

Do you know why the logs are less informative? I recall from Optimum that layers meeting the threshold are also printed with a checkmark; here the log covers only a single layer.

I also have a follow-up question, which I also asked here: #917 (comment).
When we try to use the converted model with Whisper Web, we get a cache_position-related error; it affects others as well (#917 (comment)).
This issue does not reproduce in a Python environment, as you can see in the code snippet we added (see the first comment link).
