Unable to run llama-3.2-1b with exo #459
Comments
Are you using tinygrad?
Yes. That's a Linux machine, so TinygradDynamicShardInferenceEngine is picked up.
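A minimal sketch of what that platform-based engine pick might look like (the function and return values are hypothetical illustrations of the behavior described above, not exo's actual code):

    import platform

    def pick_inference_engine() -> str:
        # MLX requires Apple Silicon macOS; everything else falls back to tinygrad
        if platform.system() == "Darwin" and platform.machine() == "arm64":
            return "mlx"
        return "tinygrad"

    print(pick_inference_engine())  # "tinygrad" on a Linux machine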
You can't run Llama 3.2 with tinygrad yet; currently it only supports Llama 3.1 and Llama 3.
What about exo --inference-engine mlx? I get errors using both on a Linux server. Does MLX have specific system requirements (e.g. a GPU installed)?
Uhh, MLX is for Apple only: Apple Silicon, not even Intel Macs. MLX supports every model, and those models are quantized in 4-bit. The only models currently available for Linux are Llama 3 and 3.1 (8B and 70B) in fp32.
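A minimal compatibility check reflecting those constraints (the helper and the model identifiers are assumptions for illustration, not exo's actual API; check exo's docs for real model names):

    # Models tinygrad can currently run, per the comment above (assumed identifiers)
    TINYGRAD_MODELS = {"llama-3-8b", "llama-3-70b", "llama-3.1-8b", "llama-3.1-70b"}

    def check_compat(engine: str, model: str) -> None:
        if engine == "mlx":
            return  # MLX supports every model, but only on Apple Silicon
        if engine == "tinygrad" and model not in TINYGRAD_MODELS:
            raise ValueError(f"{engine} cannot run {model} yet; try Llama 3 or 3.1")

    try:
        check_compat("tinygrad", "llama-3.2-1b")
    except ValueError as e:
        print(e)  # tinygrad cannot run llama-3.2-1b yet; try Llama 3 or 3.1

This is why the original report fails: on Linux the tinygrad engine is selected, and llama-3.2-1b is not in its supported set.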
It tries to load and never completes.
Final