Conversation

@lewtun (Member) commented Dec 1, 2021

What does this PR do?

This PR adds support for exporting MarianMT models in the ONNX format. The underlying logic builds on the awesome refactor / feature enhancement that @michaelbenayoun has implemented in #14358 & #14700 - we should rebase this branch on master once that PR is merged to simplify the diff in this PR. (Done)

Currently, this PR supports ONNX exports for the following "tasks" (i.e. uses):

  • default, default-with-past => equivalent to exporting a pretrained MarianModel
  • seq2seq-lm, seq2seq-lm-with-past => equivalent to exporting a pretrained MarianMTModel
  • causal-lm, causal-lm-with-past => equivalent to exporting a pretrained MarianForCausalLM

Note that in each case, the end user will have to implement their own generate() method with the ONNX model - see this BART example for what's involved.
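
For illustration, here is a rough sketch of what a minimal greedy-decoding loop over a seq2seq-lm export could look like. It is not a drop-in replacement for generate(); the output name "logits", the use of config.decoder_start_token_id / config.eos_token_id, and the model path are assumptions based on the BART-style export used here.

import numpy as np
import onnxruntime as ort
from transformers import AutoConfig, AutoTokenizer

model_ckpt = "Helsinki-NLP/opus-mt-en-de"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
config = AutoConfig.from_pretrained(model_ckpt)
# Path produced by the export command in the Usage section below (assumed layout)
session = ort.InferenceSession(f"onnx/{model_ckpt}-seq2seq-lm/model.onnx")

encoder_inputs = tokenizer(
    "Studies have been shown that owning a dog is good for you", return_tensors="np"
)
# Start decoding from the configured decoder start token
decoder_input_ids = np.array([[config.decoder_start_token_id]], dtype=np.int64)

for _ in range(64):  # crude maximum generation length
    logits = session.run(
        ["logits"],  # assumed output name for the seq2seq-lm feature
        {
            "input_ids": encoder_inputs["input_ids"].astype(np.int64),
            "attention_mask": encoder_inputs["attention_mask"].astype(np.int64),
            "decoder_input_ids": decoder_input_ids,
            "decoder_attention_mask": np.ones_like(decoder_input_ids),
        },
    )[0]
    # Greedily pick the most likely next token and append it to the decoder input
    next_token = logits[:, -1, :].argmax(axis=-1).reshape(-1, 1)
    decoder_input_ids = np.concatenate([decoder_input_ids, next_token], axis=-1)
    if next_token.item() == config.eos_token_id:
        break

print(tokenizer.batch_decode(decoder_input_ids, skip_special_tokens=True))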

I've also checked locally that the "slow" tests pass with:

RUN_SLOW=1 pytest tests/test_onnx_v2.py -k "marian" -rp

Usage

Here's a quick example to show how this works:

import onnxruntime as ort
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from transformers.models.marian import MarianOnnxConfig

model_ckpt = "Helsinki-NLP/opus-mt-en-de"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
ref_model = AutoModelForSeq2SeqLM.from_pretrained(model_ckpt)
# Export model
feature = "seq2seq-lm"
onnx_path = f"onnx/{model_ckpt}-{feature}/"
# Run this from a Jupyter notebook
!python -m transformers.onnx --model={model_ckpt} --atol=1e-4 --feature={feature} {onnx_path}
# Test export with inputs
batch_size = 4
encoder_inputs = tokenizer(
    ["Studies have been shown that owning a dog is good for you"] * batch_size,
    return_tensors="np",
)
decoder_inputs = tokenizer(
    ["Studien haben gezeigt dass es hilfreich ist einen Hund zu besitzen"]
    * batch_size,
    return_tensors="np",
)
all_inputs = {
    "input_ids": encoder_inputs["input_ids"],
    "attention_mask": encoder_inputs["attention_mask"],
    "decoder_input_ids": decoder_inputs["input_ids"],
    "decoder_attention_mask": decoder_inputs["attention_mask"],
}
# Generate ONNX outputs
ort_session = ort.InferenceSession(f"{onnx_path}model.onnx")
onnx_config = MarianOnnxConfig(ref_model.config, task=feature)
onnx_named_outputs = list(onnx_config.outputs.keys())
onnx_outputs = ort_session.run(onnx_named_outputs, all_inputs)
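
To double-check the export beyond the CLI's built-in --atol validation, one can for example compare these ONNX outputs against the reference PyTorch model. A minimal sketch, reusing the objects defined above and assuming the first named output is the logits tensor:

import numpy as np
import torch

# Reference forward pass with the PyTorch model on the same inputs
with torch.no_grad():
    ref_outputs = ref_model(**{k: torch.from_numpy(v) for k, v in all_inputs.items()})

# onnx_outputs is ordered like onnx_named_outputs; "logits" is assumed to come first
np.testing.assert_allclose(ref_outputs.logits.numpy(), onnx_outputs[0], atol=1e-4)
print("ONNX and PyTorch logits agree to within atol=1e-4")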

TODO

  • Extend support for language modelling head
  • Investigate range of numerical tolerance between raw and ONNX models for a range of checkpoints
  • Ensure that ONNX models are compatible with ONNX Runtime
  • Verify whether past key values are supported

Closes #13823, #13854

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@lewtun (Member Author) commented Dec 8, 2021

There seems to be some sort of race condition happening in run_tests_torch:

_____________________________ ERROR collecting gw1 _____________________________
Different tests were collected between gw0 and gw1. The difference is:
--- gw0

+++ gw1

This issue has similar problems - perhaps a solution lies there.

- GPT Neo
- LayoutLM
- Longformer
- Marian
Member Author:

I'm not sure whether .rst files are still allowed with the new .mdx doc - does this need updating / changing?

Member:

Letting @LysandreJik answer this one.

Member Author:

I saw that Sylvain recently converted all the RST files to MDX, so I'll rebase and this file should disappear :)

@lewtun lewtun marked this pull request as ready for review December 22, 2021 14:11
self._setup_normalizer()

- def num_special_tokens_to_add(self, **unused):
+ def num_special_tokens_to_add(self, *args, **kwargs):
Member Author:

This change is required to accommodate the use of positional arguments like tokenizer.num_special_tokens_to_add(is_pair) in _generate_dummy_inputs_for_sequence_classification_and_question_answering().

I'm not sure why we had **unused in the first place, but the change also seems more conventional IMO.
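
As a quick illustration of why the positional call breaks with the old signature, here is a hypothetical minimal sketch (these toy classes are not the real tokenizer):

# Hypothetical illustration of the signature change; not the real MarianTokenizer.
class OldTokenizer:
    def num_special_tokens_to_add(self, **unused):
        return 1

class NewTokenizer:
    def num_special_tokens_to_add(self, *args, **kwargs):
        return 1

is_pair = False
# OldTokenizer().num_special_tokens_to_add(is_pair)  # TypeError: unexpected positional argument
print(NewTokenizer().num_special_tokens_to_add(is_pair))  # works with the positional call
                                                           # used by the dummy-input helper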

]
return common_inputs

def _generate_dummy_inputs_for_sequence_classification_and_question_answering(
Member Author:

Technically, Marian doesn't have heads for sequence classification or question answering and this function is here due to the copy-paste from the BART config.

If you think this is confusing, I can remove this function and refactor the other dummy generation functions accordingly.

Member:

I think it can be done, you'll just have to remove the # Copied from comment at the top of the class declaration.

Member:

You could remove the # Copied from comment which is at the top of the class declaration and add it only to methods (see the sketch below). It supports methods as well as classes.
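
For reference, the method-level form of that comment looks roughly like the sketch below; the base class and method signature are assumptions meant only to show where the comment goes, and the real config in this PR is the source of truth:

from transformers.onnx import OnnxSeq2SeqConfigWithPast

class MarianOnnxConfig(OnnxSeq2SeqConfigWithPast):
    # Copied from transformers.models.bart.configuration_bart.BartOnnxConfig.generate_dummy_inputs with Bart->Marian
    def generate_dummy_inputs(self, tokenizer, batch_size=-1, seq_length=-1, is_pair=False, framework=None):
        ...  # sketch only: the copied method body goes here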

)


# Copied from transformers.models.bart.configuration_bart.BartOnnxConfig with Bart->Marian
Member Author:

Since the Marian model is copied from BART (see modeling_marian.py), I adopted a similar approach for the ONNX config.

Member:

Yes, nice!

@LysandreJik (Member) left a comment:
Looks good, thank you @lewtun!

@LysandreJik (Member):
Feel free to merge once you have taken care of the docs and the # Copied from statements :)

]
return common_inputs

def _generate_dummy_inputs_for_encoder_and_decoder(
@lewtun (Member Author) commented Dec 23, 2021:

I renamed this function from _generate_dummy_inputs_for_sequence_classification_and_question_answering() to something that more closely reflects its usage in the other dummy input functions.

As noted earlier, Marian models don't have sequence classification or question answering heads, so this change is aimed at minimizing confusion for those inspecting the source code.

@lewtun (Member Author) commented Dec 23, 2021

Thanks for the reviews @LysandreJik and @michaelbenayoun 🙏 !

I've fixed the docs by rebasing on master and added the # Copied from snippets to the functions (I did not know about that trick!)

Will merge once all the tests pass :)

@chaodreaming:

Decoding the outputs does not give the correct result. How do you get the translation result?
