[REIMPLEMENTATION] Vision encoder decoder ONNX conversion #19476
Conversation
This reverts commit 3080bb4.
…Seq Config. Fixing issues. Trying to get this working. More fixes. Solving more errors. Debugging. Formatting code.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Hi @WaterKnight1998, thanks for your PR! Indeed, PR #19254 splits the model into separate encoder / decoder pieces, which differs from other seq2seq models that are currently implemented as a single ONNX graph. The main reason is the following: with a single exported graph the encoder is re-executed at every generation step, whereas exporting the encoder and decoder separately lets the encoder run only once per image.
Do you happen to have a latency benchmark for the two approaches? You can find more information in our documentation. cc @mht-sharma re the original implementation
I will try to create it, but I hadn't taken into account what you mention. It makes sense to just do a single pass in the encoder.
I am looking forward to this implementation. I am thinking of running the Donut model in production and it would be very cool; right now the inference times are pretty bad. I have an open PR for ONNX conversion: #19401
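For context, here is a minimal sketch of the kind of latency benchmark being discussed, run directly with ONNX Runtime. The file names (`encoder_model.onnx`, `decoder_model.onnx`, `model.onnx`) and the input/output names are assumptions for illustration, not artifacts produced by this PR:

```python
# Hypothetical benchmark sketch: compare one step of a split encoder/decoder
# export against one step of a single-graph export. File names and
# input/output names are assumptions, not outputs of this PR.
import time

import numpy as np
import onnxruntime as ort


def mean_latency_ms(session, feeds, runs=20):
    """Average wall-clock latency of session.run over `runs` calls (after one warm-up)."""
    session.run(None, feeds)  # warm-up
    start = time.perf_counter()
    for _ in range(runs):
        session.run(None, feeds)
    return (time.perf_counter() - start) / runs * 1000


pixel_values = np.random.randn(1, 3, 224, 224).astype(np.float32)
decoder_input_ids = np.ones((1, 1), dtype=np.int64)

# Split export: the encoder runs once per image, the decoder once per token.
encoder = ort.InferenceSession("encoder_model.onnx")
decoder = ort.InferenceSession("decoder_model.onnx")
encoder_hidden_states = encoder.run(None, {"pixel_values": pixel_values})[0]
enc_ms = mean_latency_ms(encoder, {"pixel_values": pixel_values})
dec_ms = mean_latency_ms(
    decoder,
    {"input_ids": decoder_input_ids, "encoder_hidden_states": encoder_hidden_states},
)

# Single-graph export: encoder and decoder are executed together on every call.
full = ort.InferenceSession("model.onnx")
full_ms = mean_latency_ms(
    full, {"pixel_values": pixel_values, "decoder_input_ids": decoder_input_ids}
)

print(f"encoder: {enc_ms:.1f} ms | decoder step: {dec_ms:.1f} ms | full-graph step: {full_ms:.1f} ms")
```

The split export amortizes the encoder cost over all generated tokens, which is the trade-off behind exporting separate encoder/decoder graphs.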
@lewtun is this the pipeline that you mentioned: https://github.com/huggingface/optimum/blob/996f209147a466c7ecf5bfb29c9fd2e9831ea3a7/optimum/onnxruntime/modeling_seq2seq.py#L154?
Yes @WaterKnight1998, in this implementation the encoder and decoder parts are exported separately and inference is performed using ONNX Runtime (ORT).
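For illustration, a minimal usage sketch of that Optimum pipeline with a text seq2seq checkpoint (the checkpoint name and prompt are placeholders; newer Optimum releases use `export=True` instead of `from_transformers=True`):

```python
# Illustrative sketch of the Optimum ONNX Runtime seq2seq pipeline referenced
# above: the checkpoint is exported to separate encoder/decoder ONNX graphs
# and generation runs through ONNX Runtime. Checkpoint and prompt are placeholders.
from optimum.onnxruntime import ORTModelForSeq2SeqLM
from transformers import AutoTokenizer

model_id = "t5-small"  # any seq2seq checkpoint supported by the ONNX exporter
tokenizer = AutoTokenizer.from_pretrained(model_id)

# from_transformers=True performs the ONNX export (encoder + decoder graphs)
# and wraps the resulting graphs in ONNX Runtime inference sessions.
model = ORTModelForSeq2SeqLM.from_pretrained(model_id, from_transformers=True)

inputs = tokenizer("translate English to German: Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```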
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
What does this PR do?
This PR is a reimplementation of the Vision Encoder Decoder ONNX conversion as a Seq2Seq model, as the documentation explains: encoder-decoder models inherit from OnnxSeq2SeqConfigWithPast.
PR #19254 didn't follow these classes. There are several examples in the repo of how to use them, for instance: https://github.com/huggingface/transformers/blob/v4.22.2/src/transformers/models/mbart/configuration_mbart.py
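As a rough illustration of that pattern, here is a sketch modeled on the linked MBart configuration, not the actual code in this PR; the class name, input names, and dynamic-axis labels are assumptions:

```python
# Illustrative sketch of the OnnxSeq2SeqConfigWithPast pattern described above,
# adapted from the linked MBart configuration. The class name, input names, and
# dynamic-axis labels are assumptions, not the code added in this PR.
from collections import OrderedDict
from typing import Mapping

from transformers.onnx import OnnxSeq2SeqConfigWithPast


class VisionEncoderDecoderOnnxConfig(OnnxSeq2SeqConfigWithPast):
    @property
    def inputs(self) -> Mapping[str, Mapping[int, str]]:
        # The vision encoder consumes pixel values instead of token ids;
        # the text decoder keeps the usual decoder inputs.
        common_inputs = OrderedDict(
            [
                ("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
                ("decoder_input_ids", {0: "batch", 1: "decoder_sequence"}),
                ("decoder_attention_mask", {0: "batch", 1: "decoder_sequence"}),
            ]
        )
        if self.use_past:
            # With cached past key values the decoder only sees the last token.
            common_inputs["decoder_input_ids"] = {0: "batch"}
            common_inputs["decoder_attention_mask"] = {
                0: "batch",
                1: "past_decoder_sequence + sequence",
            }
            self.fill_with_past_key_values_(common_inputs, direction="inputs")
        return common_inputs
```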
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
If you know how to use git blame, that is the easiest way, otherwise, here is a rough guide of who to tag.
Please tag fewer than 3 people.
@chainyo for OnnxConfigs
@lewtun & @sgugger for approving PR: #19254