
Error on using self converted models #271

Open
OutisLi opened this issue Dec 8, 2024 · 3 comments


OutisLi commented Dec 8, 2024

I converted the BELLE-2/Belle-whisper-large-v3-zh model from PyTorch to CoreML using whisperkittools (whisperkit-generate-model). After successfully converting it to the CoreML format, I used the CLI to transcribe, but it got stuck on initializing the models. My device is a Mac mini with M4 Pro. The output is below:
"whisperkit-cli" transcribe --language zh --audio-path "/Users/tiancheng/Downloads/20min.mp3" --model-path "/Users/tiancheng/AI_Models/whisperkit-coreml/Belle-whisper-large-v3-turbo-zh_667MB" --concurrent-worker-count 0 --report-path "/Users/tiancheng/Downloads/results" --report
Error: Tokenizer is unavailable

I copied the original tokenizer.json to the model folder, but it still gets stuck on initializing.

atiorh (Contributor) commented Dec 9, 2024

@OutisLi Are you on an M1 device by any chance? *turbo* models are incompatible with M1 devices but you can still generate non-turbo models with --audio-encoder-sdpa-implementation Cat while executing whisperkit-generate-model. The device compatibility map is published here and our TestFlight app demonstrates how to leverage this file (or a similar file) in your app.
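
For reference, a non-turbo conversion along those lines would look roughly like this (a sketch combining the model version from this thread with the flag above; the output directory is a placeholder):

whisperkit-generate-model --model-version BELLE-2/Belle-whisper-large-v3-zh --output-dir <output-dir> --audio-encoder-sdpa-implementation Cat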

OutisLi (Author) commented Dec 9, 2024

> @OutisLi Are you on an M1 device by any chance? *turbo* models are incompatible with M1 devices but you can still generate non-turbo models with --audio-encoder-sdpa-implementation Cat while executing whisperkit-generate-model. The device compatibility map is published here and our TestFlight app demonstrates how to leverage this file (or a similar file) in your app.

I first tried this model on a MacBook Pro with M1 Pro and got this error. Then I used a Mac mini with M4 Pro, and the non-quantized turbo model could finally run. The conversion command is:
whisperkit-generate-model --model-version BELLE-2/Belle-whisper-large-v3-turbo-zh --output-dir /Users/outisli/Downloads --generate-quantized-variants --generate-decoder-context-prefill-data

However, when I ran this converted model with the CLI through subprocess.run in Python, it took a long time to initialize the model every time I ran it, even after the first run.
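
For context, the Python invocation is essentially the following (a sketch using the paths from this thread; note that each subprocess.run call spawns a fresh whisperkit-cli process, so model initialization is repeated on every call):

import subprocess

# Sketch of the CLI call described in this thread; adjust paths for your machine.
# Every run starts a new whisperkit-cli process, so model loading
# (and, on first use, CoreML compilation) happens again each time.
subprocess.run([
    "whisperkit-cli", "transcribe",
    "--language", "zh",
    "--audio-path", "/Users/tiancheng/Downloads/20min.mp3",
    "--model-path", "/Users/tiancheng/AI_Models/whisperkit-coreml/Belle-whisper-large-v3-turbo-zh_667MB",
    "--report-path", "/Users/tiancheng/Downloads/results",
    "--report",
], check=True)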
Meanwhile, --generate-quantized-variants generates a 520MB model, but the result is: Transcription of 20min.mp3:

.com. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .com. . . . . . . .。 . . . . . I don't.

while the full turbo model seems to transcribe correctly.

atiorh (Contributor) commented Dec 9, 2024

The default quantization recipe may not work on every model out of the box; e.g., 520MB is pretty aggressive for large-v3-turbo. I recommend tuning the compression parameters to get closer to 620MB for this particular model (based on our experience). Feel free to drop by our Discord for help: https://discord.gg/G5F5GZGecC
