Export LLMs with Optimum docs #15062
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15062. Note: links to docs will display an error until the docs builds have been completed.

❌ 11 New Failures, 1 Unrelated Failure as of commit b0a0276 with merge base 5246168. The unrelated failure was likely due to flakiness present on trunk.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
[Optimum ExecuTorch](https://github.com/huggingface/optimum-executorch) provides a streamlined way to export Hugging Face transformer models to ExecuTorch format. It offers seamless integration with the Hugging Face ecosystem, making it easy to export models directly from the Hugging Face Hub.
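For a concrete starting point, here is a minimal sketch of an export through the `optimum-cli` entry point; the model ID, task, recipe, and output directory are illustrative assumptions rather than the only supported values:

```bash
# Sketch: export a Hugging Face Hub model to an ExecuTorch .pte program.
# The model ID and recipe below are illustrative, not requirements.
optimum-cli export executorch \
  --model "HuggingFaceTB/SmolLM2-135M" \
  --task text-generation \
  --recipe xnnpack \
  --output_dir ./smollm2_executorch
```

If the export succeeds, the output directory should contain a `.pte` program that the ExecuTorch runtime can load.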
## Overview
Please add here, or somewhere obvious, that optimum-executorch is undergoing active development, so proceed with caution.
I think it is stable enough not to need this; additionally, I don't see the benefit of saying it.
Optimum ExecuTorch supports a much wider variety of model architectures than ExecuTorch's native `export_llm` API. While `export_llm` focuses on a limited set of highly optimized models (Llama, Qwen, Phi, and SmolLM) with advanced features like SpinQuant and attention sink, Optimum ExecuTorch can export diverse architectures including Gemma, Mistral, GPT-2, BERT, T5, Whisper, and many others.
Voxtral?
It would also be good to mention that `export_llm` supports backends other than XNNPACK.
Both have limited non-CPU support at the moment, though.
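Picking up on the comparison in this thread, a hypothetical sketch of a native `export_llm` invocation with an explicit backend toggle; the module path, config file, and hydra-style overrides are assumptions that may differ between versions:

```bash
# Hypothetical sketch of the native export_llm flow. The config file is a
# placeholder and the override names are assumptions; check the export_llm
# docs for the exact options your version accepts.
python -m executorch.extension.llm.export.export_llm \
  --config my_model_config.yaml \
  +base.model_class="llama3_2" \
  +backend.xnnpack.enabled=True
```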
@kimishpatel can you take a look?
```
python install_dev.py
```
This installs `executorch`, `torch`, `torchao`, `transformers`, and other dependencies from nightly builds or source.
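A quick way to sanity-check the environment afterwards is to import the packages the installer touched; this is only a sketch, and it assumes the usual `__version__` attributes are present:

```bash
# Sketch: verify the freshly installed packages import cleanly and
# report versions (assumes standard __version__ attributes).
python -c "import executorch, torchao, torch, transformers; \
print('torch', torch.__version__, '| transformers', transformers.__version__)"
```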
I don't like that it re-installs `executorch`, `torch`, and `torchao`. Can we provide an option to build in no-isolation mode?
Will make a separate PR.
After verifying your model works correctly, deploy it to device:
- [Running with C++](run-with-c-plus-plus.md) - Run exported models using ExecuTorch's C++ runtime
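As a rough sketch of that deployment step, assuming the example llama runner has been built from the ExecuTorch repo (the binary location and flag names follow the examples and may vary by version):

```bash
# Sketch: run the exported .pte with the example C++ llama runner.
# Binary path and flags are assumptions based on the ExecuTorch examples.
cmake-out/examples/models/llama/llama_main \
  --model_path=model.pte \
  --tokenizer_path=tokenizer.model \
  --prompt="Hello, my name is"
```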
I think you can do .html
Wdym?