Conversation

lewtun (Member) commented Oct 10, 2022

What does this PR do?

This PR tweaks the swin modeling code to enable dynamic batch sizes in the ONNX export. With this fix, the ONNX slow tests for this model now pass, as do the slow tests for the original PyTorch model:

This passes: `RUN_SLOW=1 pytest -x -sv tests/models/swin/test_modeling_swin.py`
This also passes: `RUN_SLOW=1 pytest -x -sv tests/onnx/test_onnx_v2.py -k "swin"`
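
For reference, the failure mode here is the usual tracing pitfall: computing the batch size as a Python int during the forward pass bakes the dummy input's batch size into the exported graph as a constant. Below is a minimal sketch of the dynamic-friendly pattern, in the style of Swin's `window_reverse` (illustrative only, not the exact diff in this PR):

```python
import torch

def window_reverse_dynamic(windows, window_size, height, width):
    # windows: (num_windows * batch, window_size, window_size, channels).
    # Computing the batch size as a Python int, e.g.
    #   batch_size = int(windows.shape[0] / (height * width / window_size**2))
    # freezes the dummy input's batch size into the ONNX graph. Letting
    # view() infer the leading dimension with -1 keeps that axis symbolic.
    num_channels = windows.shape[-1]
    x = windows.view(-1, height // window_size, width // window_size,
                     window_size, window_size, num_channels)
    x = x.permute(0, 1, 3, 2, 4, 5).contiguous()
    return x.view(-1, height, width, num_channels)
```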

Since this change also impacts other models, I've checked that the modeling slow tests pass for:

  • maskformer
  • donut_swin
  • swin_v2

Related to #17476

@lewtun lewtun requested review from sgugger and ydshieh October 10, 2022 20:07
HuggingFaceDocBuilderDev commented Oct 10, 2022

The documentation is not available anymore as the PR was closed or merged.

ydshieh (Collaborator) left a comment

Hi @lewtun

Thank you for the fix ❤️ !

Is the issue caused by the changes in #19255? More precisely, from this newly added code:
https://github.com/dwyatte/transformers/blob/949683675d83cc38620106626822279cd45b076b/src/transformers/onnx/convert.py#L368

The error shows `Outputs values doesn't match between reference model and ONNX exported model` - it must have been non-trivial to figure out that this was coming from shapes! How were you able to find out 💯? Is there some tool we can use to check things (tensor values/shapes) when running ONNX inference?

ydshieh (Collaborator) commented Oct 11, 2022

(a bit off-topic, but still a related question)

Looking at the issue and the fix provided in this PR, I was expecting a lot of the same errors from hard-coded batch sizes elsewhere. However, when I checked bert, I found:

`input_shape = input_ids.size()`

and

`batch_size, seq_length = input_shape`

but the ONNX tests still pass for bert. Are `batch_size` and `seq_length` not hard-coded here? Just wondering if @lewtun already has some insight regarding this.

lewtun (Member, Author) commented Oct 11, 2022

> Is the issue caused by the changes in #19255? More precisely, from this newly added code: https://github.com/dwyatte/transformers/blob/949683675d83cc38620106626822279cd45b076b/src/transformers/onnx/convert.py#L368
>
> The error shows `Outputs values doesn't match between reference model and ONNX exported model` - it must have been non-trivial to figure out that this was coming from shapes! How were you able to find out 💯? Is there some tool we can use to check things (tensor values/shapes) when running ONNX inference?

Yes, this issue was surfaced by #19255, which implemented a stronger validation test on exported ONNX models. Basically, it generates the ONNX graph using dummy data with one batch size `b`, and then validates the forward pass with a different batch size `b'`.
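
Here is a minimal sketch of that validation idea, assuming a toy model with a single tensor input (all names and shapes are illustrative, not the actual test code):

```python
import numpy as np
import onnxruntime as ort
import torch

def check_dynamic_batch(model, export_bs=2, check_bs=3, seq=8, hidden=16):
    # Export the graph with one batch size b...
    dummy = torch.randn(export_bs, seq, hidden)
    torch.onnx.export(model, dummy, "model.onnx",
                      input_names=["input"], output_names=["output"],
                      dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}})
    # ...then compare PyTorch and ONNX Runtime outputs at a different b'.
    test = torch.randn(check_bs, seq, hidden)
    reference = model(test).detach().numpy()
    session = ort.InferenceSession("model.onnx")
    onnx_output = session.run(None, {"input": test.numpy()})[0]
    # Fails if the original batch size leaked into the graph as a constant.
    np.testing.assert_allclose(reference, onnx_output, atol=1e-5)

check_dynamic_batch(torch.nn.Linear(16, 16))
```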

The reason it can be non-trivial to figure out why the PyTorch and ONNX models disagree after an export is that ONNX traces a graph based on dummy data, and this tracing can be incorrect if there is data-dependent control flow (Swin in particular has a lot of these if/else statements). Currently, the best tool I know of is to visualise the graph with Netron and manually inspect it for discrepancies.
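
To make that failure mode concrete, here is a hypothetical example of the kind of branch that tracing silently specialises:

```python
import torch

class PadIfOdd(torch.nn.Module):
    def forward(self, x):
        # This condition is evaluated once, as a plain Python bool, on the
        # dummy input's concrete shape; only the branch taken at trace time
        # ends up in the exported graph, so inputs that should take the
        # other branch are handled incorrectly at inference time.
        if x.shape[-1] % 2 != 0:
            x = torch.nn.functional.pad(x, (0, 1))
        return x
```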

> Looking at the issue and the fix provided in this PR, I was expecting a lot of the same errors from hard-coded batch sizes elsewhere. However, when I checked bert:

I think in those cases we don't hit a problem because `batch_size` is only used to create the attention mask when none is provided. Since our dummy input provides an attention mask, that branch of the graph is never traced, AFAICT.
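
For illustration, the relevant pattern in bert looks roughly like this (paraphrased, not the exact source):

```python
# batch_size and seq_length only feed the default mask. The ONNX dummy
# inputs always include attention_mask, so this branch is never taken
# during tracing and the concrete sizes never reach the graph.
if attention_mask is None:
    attention_mask = torch.ones((batch_size, seq_length), device=device)
```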

sgugger (Collaborator) left a comment

LGTM, thanks for working on this!

@lewtun lewtun merged commit b651efe into main Oct 11, 2022
@lewtun lewtun deleted the onnx-fix-swin branch October 11, 2022 13:21