
whisper decode fail #1848

Open
lzhin opened this issue Dec 26, 2024 · 3 comments
lzhin commented Dec 26, 2024

The error log is:
lib/python3.8/site-packages/whisper/model.py", line 124, in qkv_attention
a = scaled_dot_product_attention(
RuntimeError: The size of tensor a (90) must match the size of tensor b (9) at non-singleton dimension 0

The run command is:
CUDA_VISIBLE_DEVICES=3 python3 ./whisper/decode.py \
  --exp-dir whisper/exp_large_v2 \
  --model-name large-v2 \
  --epoch 9 --avg 1 \
  --manifest-dir data/fbank \
  --beam-size 10 --max-duration 50

The error does not occur when --beam-size is set to 1.
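For context: 90 = 9 × 10, which is consistent with a batch of 9 utterances expanded by beam size 10 on one side of the cross-attention, while the cached keys/values still have batch dimension 9. This is a minimal standalone sketch (hypothetical shapes, not the actual decode.py code) that reproduces the same RuntimeError and shows how expanding the smaller tensors along dim 0 makes the shapes agree:

```python
import torch
import torch.nn.functional as F

batch, beam, heads, t_q, t_k, d = 9, 10, 4, 5, 100, 64

# Queries already expanded to batch * beam hypotheses (dim 0 = 90),
# but the cached cross-attention keys/values still have dim 0 = 9.
q = torch.randn(batch * beam, heads, t_q, d)
k = torch.randn(batch, heads, t_k, d)
v = torch.randn(batch, heads, t_k, d)

try:
    F.scaled_dot_product_attention(q, k, v)
except RuntimeError as e:
    # RuntimeError: The size of tensor a (90) must match the size of
    # tensor b (9) at non-singleton dimension 0
    print(e)

# Repeating the cached tensors per beam hypothesis aligns dim 0.
k = k.repeat_interleave(beam, dim=0)
v = v.repeat_interleave(beam, dim=0)
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([90, 4, 5, 64])
```

This only illustrates the shape arithmetic behind the error message; where exactly the beam expansion is missed in the whisper/icefall decode path would need the full logs to pin down.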

@csukuangfj (Collaborator)

Could you post the complete logs?

lzhin (Author) commented Dec 26, 2024

Traceback (most recent call last):
File "./whisper/decode.py", line 507, in <module>
main()
File "my_env/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "./whisper/decode.py", line 492, in main
results_dict = decode_dataset(
File "./whisper/decode.py", line 324, in decode_dataset
hyps_dict = decode_one_batch(
File "./whisper/decode.py", line 283, in decode_one_batch
results = model.decode(feature, params.decoding_options)
File "my_env/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "my_env/lib/python3.8/site-packages/whisper/decoding.py", line 824, in decode
result = DecodingTask(model, options).run(mel)
File "my_env/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "my_env/lib/python3.8/site-packages/whisper/decoding.py", line 737, in run
tokens, sum_logprobs, no_speech_probs = self._main_loop(audio_features, tokens)
File "my_env/lib/python3.8/site-packages/whisper/decoding.py", line 687, in _main_loop
logits = self.inference.logits(tokens, audio_features)
File "my_env/lib/python3.8/site-packages/whisper/decoding.py", line 163, in logits
return self.model.decoder(tokens, audio_features, kv_cache=self.kv_cache)
File "my_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "my_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "my_env/lib/python3.8/site-packages/whisper/model.py", line 242, in forward
x = block(x, xa, mask=self.mask, kv_cache=kv_cache)
File "my_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "my_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "my_env/lib/python3.8/site-packages/whisper/model.py", line 169, in forward
x = x + self.cross_attn(self.cross_attn_ln(x), xa, kv_cache=kv_cache)[0]
File "my_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "my_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "my_env/lib/python3.8/site-packages/whisper/model.py", line 111, in forward
wv, qk = self.qkv_attention(q, k, v, mask)
File "my_env/lib/python3.8/site-packages/whisper/model.py", line 124, in qkv_attention
a = scaled_dot_product_attention(
RuntimeError: The size of tensor a (90) must match the size of tensor b (9) at non-singleton dimension 0

@csukuangfj (Collaborator)

@yuekaizhang Could you have a look?
