Conversation

@lewtun lewtun commented Mar 10, 2022

What does this PR do?

This PR fixes:

  • a bug introduced in Add ONNX export for ViT #15658, where a preprocessor and a tokenizer were being passed together to the generate_dummy_inputs() function during the ONNX export.
  • an oversight in the refactoring of the ONNX config for M2M-100

It also removes problematic TensorFlow integration tests for models whose TensorFlow implementation doesn't have parity with the PyTorch one (e.g. camembert-base is missing the causal LM head in TensorFlow). I'll address those issues in separate PRs, since doing so involves touching the TensorFlow modeling files.

With these fixes, all slow ONNX tests now pass in all environments (torch only, tensorflow only, and both torch and tensorflow):

RUN_SLOW=1 python -m pytest tests/onnx/test_onnx_v2.py
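For context, the core of the fix is roughly this pattern: accept a single `preprocessor` and let the deprecated `tokenizer` argument override it, so that `generate_dummy_inputs()` no longer receives both objects. This is a minimal sketch with a simplified `export` signature, not the exact diff:

```python
import warnings


def export(config, preprocessor=None, tokenizer=None, framework="pt"):
    # Simplified signature: the real export function takes more arguments
    # (model, opset, output path, etc.).
    if tokenizer is not None:
        warnings.warn(
            "The `tokenizer` argument is deprecated and will be removed in version 5 of "
            "Transformers. Use `preprocessor` instead.",
            FutureWarning,
        )
        # Use the tokenizer as the single preprocessor instead of forwarding both.
        preprocessor = tokenizer
    # Only `preprocessor` is passed on, so generate_dummy_inputs() no longer
    # receives duplicate arguments.
    return config.generate_dummy_inputs(preprocessor, framework=framework)
```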

cc @michaelbenayoun

@lewtun lewtun changed the title Fix duplicate arguments passed to dummy inputs in ONNX export [WIP] Fix duplicate arguments passed to dummy inputs in ONNX export Mar 10, 2022

HuggingFaceDocBuilderDev commented Mar 10, 2022

The documentation is not available anymore as the PR was closed or merged.

@lewtun lewtun changed the title [WIP] Fix duplicate arguments passed to dummy inputs in ONNX export Fix duplicate arguments passed to dummy inputs in ONNX export Mar 10, 2022
@lewtun lewtun requested review from LysandreJik and sgugger March 10, 2022 13:56
@LysandreJik LysandreJik left a comment

Looks good! Left two comments that should be applied 4 times each :)

`Tuple[List[str], List[str]]`: A tuple with an ordered list of the model's inputs, and the named inputs from
the ONNX configuration.
"""
from ..tokenization_utils_base import PreTrainedTokenizerBase

I think this can be a top-level import
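What the suggestion amounts to, sketched with a hypothetical helper (the name `is_tokenizer` and the absolute import path are illustrative; inside the library the import is the relative one shown above):

```python
# Import once at module level instead of inside the function body ...
from transformers.tokenization_utils_base import PreTrainedTokenizerBase


def is_tokenizer(preprocessor):
    # ... so no local "from ..tokenization_utils_base import ..." is needed
    # each time the function runs.
    return isinstance(preprocessor, PreTrainedTokenizerBase)
```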

"The `tokenizer` argument is deprecated and will be removed in version 5 of Transformers. Use `preprocessor` instead.",
FutureWarning,
)
logger.warning("Overwriting the `preprocessor` argument with `tokenizer` to generate dummy inputs.")

Maybe this can be an info as it's more additional information and not really an error


(warnings get displayed by default; info is only displayed when users ask for more verbose output)
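For reference, a small sketch of the behavior being described, using the transformers logging utilities (the default verbosity is WARNING):

```python
from transformers.utils import logging

logger = logging.get_logger("transformers.onnx")

logger.warning("Visible by default.")            # WARNING and above are always shown
logger.info("Hidden at the default verbosity.")  # INFO is suppressed by default

logging.set_verbosity_info()                     # the user asks for more detail
logger.info("Now this informational message is shown as well.")
```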

import onnx
import tf2onnx

from ..tokenization_utils_base import PreTrainedTokenizerBase

Same comment about top-level

"The `tokenizer` argument is deprecated and will be removed in version 5 of Transformers. Use `preprocessor` instead.",
FutureWarning,
)
logger.warning("Overwriting the `preprocessor` argument with `tokenizer` to generate dummy inputs.")

Same comment about logging level

@sgugger sgugger left a comment


Good for me with Lysandre's comments! Thanks for working on this!

@lewtun lewtun merged commit 6b09328 into master Mar 10, 2022
@lewtun lewtun deleted the fix-onnx-dummies branch March 10, 2022 19:19