Unable to load model (or very very slow) #268

Open
HugoDellinger opened this issue Dec 3, 2024 · 5 comments
Labels
documentation Improvements or additions to documentation

Comments


HugoDellinger commented Dec 3, 2024

Hi,

I'm loading models on iOS. However, when I try to load one on my development iPhone (iOS 17, iPhone 13 Pro), it doesn't work. It seems to get stuck on "Loading audio encoder". I'm loading openai_whisper-large-v3-v20240930_turbo.

I'm also getting:
ANE model load has failed for on-device compiled macho. Must re-compile the E5 bundle. @ GetANEFModel
E5RT: ANE model load has failed for on-device compiled macho. Must re-compile the E5 bundle. (13)
E5RT encountered an STL exception. msg = MILCompilerForANE error: failed to compile ANE model using ANEF. Error=Couldn’t communicate with a helper application..
E5RT: MILCompilerForANE error: failed to compile ANE model using ANEF. Error=Couldn’t communicate with a helper application. (11)
[Espresso::handle_ex_plan] exception=ANECF error: failed to load ANE model file:///private/var/mobile/Containers/Data/Application/4D70A0C9-753D-4885-B517-1553E5A1F338/Documents/huggingface/models/argmaxinc/whisperkit-coreml/openai_whisper-large-v3-v20240930_turbo/AudioEncoder.mlmodelc/model.mil Error=createProgramInstanceForModel:modelToken:qos:isPreCompiled:enablePowerSaving:skipPreparePhase:statsMask:memoryPoolID:enableLateLatch:modelIdentityStr:owningPid:cacheUrlIdentifier:aotCacheUrlIdentifier:error:: Program load failure (0x20004)

What is the expected loading time on an iPhone 13 Pro? Is there any build configuration I'm missing?

Thanks,

Hugo

@ZachNagengast
Contributor

Some models are simply too large to load on an iPhone 13 Pro, and this is one of them. Luckily, we've put a lot of effort into quantizing and optimizing many of the biggest models to work on the smallest devices, and we created a list of the models supported on each device. It is directly accessible in WhisperKit via this handy function, and the example app also shows how it can be used here.

This method pulls from https://huggingface.co/argmaxinc/whisperkit-coreml/blob/main/config.json#L40-L57, which we have benchmarked extensively to verify performance and functionality. The results are viewable in this Hugging Face space: https://huggingface.co/spaces/argmaxinc/whisperkit-benchmarks. This will help you choose the best accuracy/performance tradeoff for all the phones we support. Hope that helps!
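
For readers who want to see the idea behind a per-device model list, here is a minimal sketch of a prefix-based lookup. It is not WhisperKit's implementation. The JSON field names (`device_support`, `identifiers`, `models`, `default`, `supported`), the sample entries, and the helper names `models_for_device` and `is_supported` are all assumptions made for illustration. Check the real config.json linked above for the actual schema and contents. The only hardware fact relied on is that the iPhone 13 Pro reports the identifier `iPhone14,2`.

```python
import json

# Hypothetical excerpt shaped like a device-support table. The schema and the
# model lists below are illustrative assumptions, NOT copied from the real
# config.json in argmaxinc/whisperkit-coreml.
SAMPLE_CONFIG = json.loads("""
{
  "device_support": [
    {
      "identifiers": ["iPhone14"],
      "models": {
        "default": "openai_whisper-base",
        "supported": [
          "openai_whisper-tiny",
          "openai_whisper-base",
          "openai_whisper-large-v3-v20240930_turbo_632MB"
        ]
      }
    },
    {
      "identifiers": ["Mac"],
      "models": {
        "default": "openai_whisper-large-v3-v20240930_turbo",
        "supported": [
          "openai_whisper-tiny",
          "openai_whisper-base",
          "openai_whisper-large-v3-v20240930_turbo_632MB",
          "openai_whisper-large-v3-v20240930_turbo"
        ]
      }
    }
  ]
}
""")


def models_for_device(config, device_identifier):
    """Return (default, supported) from the first entry whose identifier
    prefix matches the device, or (None, []) if no entry matches."""
    for entry in config["device_support"]:
        if any(device_identifier.startswith(p) for p in entry["identifiers"]):
            models = entry["models"]
            return models["default"], list(models["supported"])
    return None, []


def is_supported(config, device_identifier, model_name):
    """True if model_name is listed as supported for this device."""
    _, supported = models_for_device(config, device_identifier)
    return model_name in supported


if __name__ == "__main__":
    # "iPhone14,2" is the hardware identifier of the iPhone 13 Pro.
    default, supported = models_for_device(SAMPLE_CONFIG, "iPhone14,2")
    print("default:", default)
    print("supported:", supported)
```

Checking `is_supported(...)` before starting a download is one way to fall back to the default model instead of waiting on a model the device is unlikely to load.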

ZachNagengast added the documentation label Dec 3, 2024

neo773 commented Jan 4, 2025

I have the same issue. I'm trying to use the distil-whisper_distil-large-v3 model, and I'm on an M1 Max Mac with 32 GB of RAM, which should be fairly quick.


sunflsks commented Jan 9, 2025

I think that for the larger models, the first-time load of the audio encoder will take quite a while no matter how powerful the processor is, because the system still has to compile the model for the ANE, which seems to be a single-threaded process (from what I can tell so far).


neo773 commented Jan 9, 2025

I’m pretty sure it’s stuck indefinitely. I left it running, hoping it would load, but it never does.


sunflsks commented Jan 10, 2025

To be fair, it took almost 10 minutes for openai_whisper-large-v3-v20240930_turbo_632MB to load on my M1 Pro (and far less time to load openai-whisper-tiny), so considering you're loading the full model, it might take far longer. But I'm also not familiar with the internals of the ANE, so who knows 🤷

(Side note: I checked in powermetrics, and it appears the ANE compiler service is not single-threaded. However, it runs solely on the processor's efficiency cores, which explains a lot! Quite silly in my opinion, but I guess it makes sense for the majority of use cases. Oh well.)

4 participants