
Error on using self converted models #271

Open
OutisLi opened this issue Dec 8, 2024 · 3 comments


OutisLi commented Dec 8, 2024

I converted the BELLE-2/Belle-whisper-large-v3-zh model from PyTorch to CoreML using whisperkittools (whisperkit-generate-model). After successfully converting it to the CoreML format, I used the CLI to transcribe, but it got stuck on initializing the models. My device is a Mac mini with M4 Pro. The output is below:
"whisperkit-cli" transcribe --language zh --audio-path "/Users/tiancheng/Downloads/20min.mp3" --model-path "/Users/tiancheng/AI_Models/whisperkit-coreml/Belle-whisper-large-v3-turbo-zh_667MB" --concurrent-worker-count 0 --report-path "/Users/tiancheng/Downloads/results" --report
Error: Tokenizer is unavailable

I copied the original tokenizer.json to the model folder, but it still gets stuck on initializing.

atiorh (Contributor) commented Dec 9, 2024

@OutisLi Are you on an M1 device by any chance? *turbo* models are incompatible with M1 devices but you can still generate non-turbo models with --audio-encoder-sdpa-implementation Cat while executing whisperkit-generate-model. The device compatibility map is published here and our TestFlight app demonstrates how to leverage this file (or a similar file) in your app.
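
For reference, a non-turbo conversion along those lines would look roughly like this (a sketch combining the model version from this thread with the flag above; the output directory is a placeholder):

whisperkit-generate-model --model-version BELLE-2/Belle-whisper-large-v3-zh --output-dir <output-dir> --audio-encoder-sdpa-implementation Cat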

OutisLi (Author) commented Dec 9, 2024

> @OutisLi Are you on an M1 device by any chance? *turbo* models are incompatible with M1 devices but you can still generate non-turbo models with --audio-encoder-sdpa-implementation Cat while executing whisperkit-generate-model. The device compatibility map is published here and our TestFlight app demonstrates how to leverage this file (or a similar file) in your app.

I first tried this model on a MacBook Pro with M1 Pro and got this error. Then I used a Mac mini with M4 Pro, and the non-quantized turbo model could finally run. The conversion command is:
whisperkit-generate-model --model-version BELLE-2/Belle-whisper-large-v3-turbo-zh --output-dir /Users/outisli/Downloads --generate-quantized-variants --generate-decoder-context-prefill-data

However, when I ran this converted model with the CLI through subprocess.run in Python, it took a long time to initialize the model every time I ran it, even after the first run.
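
For context, the Python invocation is essentially the following (a sketch using the paths from this thread; note that each subprocess.run call spawns a fresh whisperkit-cli process, so model initialization is repeated on every call):

import subprocess

# Sketch of the CLI call described in this thread; adjust paths for your machine.
# Every run starts a new whisperkit-cli process, so model loading
# (and, on first use, CoreML compilation) happens again each time.
subprocess.run([
    "whisperkit-cli", "transcribe",
    "--language", "zh",
    "--audio-path", "/Users/tiancheng/Downloads/20min.mp3",
    "--model-path", "/Users/tiancheng/AI_Models/whisperkit-coreml/Belle-whisper-large-v3-turbo-zh_667MB",
    "--report-path", "/Users/tiancheng/Downloads/results",
    "--report",
], check=True)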
Meanwhile, --generate-quantized-variants generates a 520MB model, but the result is: Transcription of 20min.mp3:

.com. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .com. . . . . . . .。 . . . . . I don't.

while the full turbo model seems to transcribe correctly.

atiorh (Contributor) commented Dec 9, 2024

The default quantization recipe may not work on every model out of the box; e.g., 520MB is pretty aggressive for large-v3-turbo. I recommend tuning the compression parameters to get closer to 620MB for this particular model (based on our experience). Feel free to drop by our Discord for help: https://discord.gg/G5F5GZGecC
