Conversation

@WaterKnight1998

What does this PR do?

This PR reimplements the VisionEncoderDecoder ONNX conversion as a seq2seq model, as the documentation explains: encoder-decoder models inherit from OnnxSeq2SeqConfigWithPast.

PR #19254 didn't follow these classes. There are several examples in the repo of how to use them, e.g.: https://github.com/huggingface/transformers/blob/v4.22.2/src/transformers/models/mbart/configuration_mbart.py
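
For illustration, a minimal sketch of what such a config could look like, modeled on the MBart example above (the class name and dynamic-axis labels are illustrative, not the exact code in this PR):

```python
from collections import OrderedDict
from typing import Mapping

from transformers.onnx import OnnxSeq2SeqConfigWithPast


class VisionEncoderDecoderOnnxConfig(OnnxSeq2SeqConfigWithPast):
    @property
    def inputs(self) -> Mapping[str, Mapping[int, str]]:
        # The encoder consumes pixel values instead of token ids
        common_inputs = OrderedDict(
            [
                ("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
                ("decoder_input_ids", {0: "batch", 1: "decoder_sequence"}),
            ]
        )
        if self.use_past:
            # With cached key/values, only the newest decoder token is fed in
            common_inputs["decoder_input_ids"] = {0: "batch"}
            self.fill_with_past_key_values_(common_inputs, direction="inputs")
        return common_inputs
```

The `use_past` branch is what lets the exporter produce a graph that accepts cached key-value pairs, as the other seq2seq configs do.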

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

If you know how to use git blame, that is the easiest way, otherwise, here is a rough guide of who to tag.
Please tag fewer than 3 people.

@chainyo for OnnxConfigs

@lewtun & @sgugger for approving PR: #19254

mht-sharma and others added commits October 10, 2022 21:55

  • …Seq Config.
  • Fixing issue.
  • trying to get this working.
  • Trying to fix.
  • Fixing issues.
  • More fixes.
  • Fixing error.
  • Solving more errors.
  • Trying to fix.
  • Fix
  • More fixes.
  • Debugging.
  • sdgsg
  • Formatting code.
  • aaaa
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

lewtun (Member) commented Oct 11, 2022

Hi @WaterKnight1998, thanks for your PR!

Indeed, PR #19254 splits the model into separate encoder / decoder pieces, which differs from other seq2seq models that are currently implemented as a single ONNX graph. The main reasons are the following:

  • To speed up the decoding process, it is more efficient to have a single pass through the encoder, followed by N passes through the decoder.
  • To support the caching of past key-value pairs, it is more efficient to have a separate decoder graph.

Do you happen to have a latency benchmark for the VisionEncoderDecoder export that compares your PR vs the current implementation? I think this would be the axis on which we'd consider incorporating your changes, but I would be surprised if a single graph can beat the decomposed ones.
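
To make the decomposition concrete, here is a rough sketch of the generation loop it enables (the file and tensor names are hypothetical, and the past key-value cache is omitted for brevity):

```python
import numpy as np
import onnxruntime as ort

# Hypothetical file names for the two exported graphs
encoder = ort.InferenceSession("encoder.onnx")
decoder = ort.InferenceSession("decoder.onnx")

def greedy_generate(pixel_values, bos_id, eos_id, max_length=32):
    # Single pass through the encoder
    encoder_hidden_states = encoder.run(None, {"pixel_values": pixel_values})[0]

    generated = [bos_id]
    for _ in range(max_length):
        # One decoder pass per generated token, reusing the cached encoder output
        logits = decoder.run(
            None,
            {
                "input_ids": np.array([generated], dtype=np.int64),
                "encoder_hidden_states": encoder_hidden_states,
            },
        )[0]
        next_token = int(logits[0, -1].argmax())
        generated.append(next_token)
        if next_token == eos_id:
            break
    return generated
```

With the cache enabled, each decoder pass would feed back the past key-values and only process the newest token, which is where the latency win over a single combined graph comes from.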

You can find more information in our optimum library, where we'll be implementing the pipeline for inference: https://huggingface.co/docs/optimum/onnxruntime/modeling_ort#export-and-inference-of-sequencetosequence-models

cc @mht-sharma re the original implementation

WaterKnight1998 (Author) commented Oct 11, 2022

> Hi @WaterKnight1998, thanks for your PR!
>
> Indeed, PR #19254 splits the model into separate encoder / decoder pieces, which differs from other seq2seq models that are currently implemented as a single ONNX graph. The main reasons are the following:
>
>   • To speed up the decoding process, it is more efficient to have a single pass through the encoder, followed by N passes through the decoder.
>   • To support the caching of past key-value pairs, it is more efficient to have a separate decoder graph.
>
> Do you happen to have a latency benchmark for the VisionEncoderDecoder export that compares your PR vs the current implementation? I think this would be the axis on which we'd consider incorporating your changes, but I would be surprised if a single graph can beat the decomposed ones.

I will try to create one, but I hadn't taken into account what you mention. It makes sense to do just a single pass through the encoder.
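
Something along these lines is what I have in mind for measuring per-step latency (the file and input names are hypothetical):

```python
import time

import numpy as np
import onnxruntime as ort

def mean_latency_ms(session, feeds, runs=100):
    session.run(None, feeds)  # warm-up so session initialization is not measured
    start = time.perf_counter()
    for _ in range(runs):
        session.run(None, feeds)
    return (time.perf_counter() - start) / runs * 1000

# Hypothetical single-graph export from this PR
single = ort.InferenceSession("vision_encoder_decoder.onnx")
feeds = {
    "pixel_values": np.random.rand(1, 3, 224, 224).astype(np.float32),
    "decoder_input_ids": np.zeros((1, 1), dtype=np.int64),
}
print(f"single graph: {mean_latency_ms(single, feeds):.2f} ms / step")
```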

> You can find more information in our optimum library, where we'll be implementing the pipeline for inference: https://huggingface.co/docs/optimum/onnxruntime/modeling_ort#export-and-inference-of-sequencetosequence-models

I am looking forward to this implementation. I am thinking of running the Donut model in production, and it would be very useful there, since the inference times are currently pretty bad. I have an open PR for its ONNX conversion: #19401

WaterKnight1998 (Author) commented Oct 11, 2022

@lewtun is this the pipeline you mentioned: https://github.com/huggingface/optimum/blob/996f209147a466c7ecf5bfb29c9fd2e9831ea3a7/optimum/onnxruntime/modeling_seq2seq.py#L154?

mht-sharma (Contributor) commented

> @lewtun is this the pipeline you mentioned: https://github.com/huggingface/optimum/blob/996f209147a466c7ecf5bfb29c9fd2e9831ea3a7/optimum/onnxruntime/modeling_seq2seq.py#L154?

Yes @WaterKnight1998, in this implementation the encoder and decoder parts are exported separately and inference is performed using ONNX Runtime (ORT).
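
For example, a minimal usage sketch (the checkpoint is just an example, and `from_transformers=True` reflects the optimum API at the time):

```python
from optimum.onnxruntime import ORTModelForSeq2SeqLM
from transformers import AutoTokenizer

# Exports the encoder and decoder to ONNX under the hood, then runs them with ORT
model = ORTModelForSeq2SeqLM.from_pretrained("t5-small", from_transformers=True)
tokenizer = AutoTokenizer.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: hello world", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```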

github-actions (bot) commented

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions bot closed this Nov 18, 2022
WaterKnight1998 (Author) commented

@lewtun @sgugger please reopen it

lewtun reopened this Nov 24, 2022
github-actions (bot) commented

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions bot closed this Dec 26, 2022