Fix torchscript tests for GPT-NeoX #18012
Conversation
The documentation is not available anymore as the PR was closed or merged.
sgugger left a comment
LGTM, thanks for fixing!
```diff
             beta=1.0,
-            alpha=(1.0 / self.norm_factor),
+            alpha=(torch.tensor(1.0, dtype=self.norm_factor.dtype, device=self.norm_factor.device) / self.norm_factor),
+            # alpha=(1.0 / self.norm_factor),
```
Should be cleaned up.
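For context, here is a minimal sketch of how the patched expression is used. The shapes, variable names, and surrounding `baddbmm` call are illustrative, not the exact `_attn` code: the point is that `alpha` is built from a tensor with `norm_factor`'s dtype and device, so the scaling stays consistent with the module's dtype instead of relying on Python-float/tensor type promotion.

```python
import torch

# Illustrative shapes; the real _attn also handles masking and head reshaping.
batch_heads, q_len, k_len, head_size = 2, 4, 4, 64
query = torch.randn(batch_heads, q_len, head_size)
key = torch.randn(batch_heads, k_len, head_size)

# norm_factor is kept as a tensor (sqrt of the head size), as in the modeling code.
norm_factor = torch.sqrt(torch.tensor(head_size, dtype=torch.float32))

attn_scores = torch.zeros(batch_heads, q_len, k_len, dtype=query.dtype, device=query.device)
attn_scores = torch.baddbmm(
    attn_scores,
    query,
    key.transpose(1, 2),
    beta=1.0,
    # Built with norm_factor's dtype/device rather than as the Python float 1.0,
    # so the division never promotes to a mismatched dtype when the model is cast.
    alpha=(torch.tensor(1.0, dtype=norm_factor.dtype, device=norm_factor.device) / norm_factor),
)
```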
patrickvonplaten left a comment
LGTM!
However, could we add the failing test for reference or do we need to add a new test here?
I updated the PR description to include the current failing test. Regarding new tests, I don't think any are necessary, as we just build the necessary tensors in `__init__` (however, let me know if you have ideas for new test cases!)
Perfect, thanks!
LysandreJik left a comment
LGTM
* fix dtype issue in _attn
* fix RotaryEmbedding
* fix RotaryEmbedding 2
* clean up

Co-authored-by: ydshieh <[email protected]>
What does this PR do?
Fix torchscript tests for GPT-NeoX. The main issue comes from the fact that the current `RotaryEmbedding` changes the model structure in `forward`. This PR creates the necessary embeddings in `__init__`, which basically makes the (embedding) cache mechanism useless. Furthermore, the attribute names seem a bit confusing now. We could probably add an attribute (e.g. `init_sin_cos_cache_seq_len`) to the config with a value `<= max_position_embeddings`, but I think that's way too much. Not certain if it is worth it; however, with a PR opened, we have a reference.
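To illustrate the idea, here is a hedged sketch with assumed names, not the exact code in this PR: the sin/cos tables are built once in `__init__` up to `max_position_embeddings`, so `forward` only slices existing buffers and `torch.jit.trace` sees a static module structure.

```python
import torch
from torch import nn

class RotaryEmbeddingSketch(nn.Module):
    """Sketch of a rotary embedding whose cache is created in __init__, not in forward."""

    def __init__(self, dim, max_position_embeddings, base=10000):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        t = torch.arange(max_position_embeddings, dtype=inv_freq.dtype)
        freqs = torch.einsum("i,j->ij", t, inv_freq)
        emb = torch.cat((freqs, freqs), dim=-1)
        # Precomputed once here; nothing is (re)built inside forward.
        self.register_buffer("cos_cached", emb.cos()[None, None, :, :], persistent=False)
        self.register_buffer("sin_cached", emb.sin()[None, None, :, :], persistent=False)

    def forward(self, x, seq_len):
        # Only slicing happens at runtime, which keeps the module traceable.
        return (
            self.cos_cached[:, :, :seq_len, ...].to(x.dtype),
            self.sin_cached[:, :, :seq_len, ...].to(x.dtype),
        )
```

The trade-off mentioned above is that the lazy cache growth is gone: sequences longer than `max_position_embeddings` would need the tables to be rebuilt.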
The current failing test is:
https://github.com/huggingface/transformers/runs/7216768053?check_suite_focus=true
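For anyone who wants to reproduce the failure locally, here is a rough sketch along the lines of the common torchscript tests; the config values below are arbitrary small ones (assumptions, not what the CI uses):

```python
import torch
from transformers import GPTNeoXConfig, GPTNeoXModel

config = GPTNeoXConfig(
    vocab_size=1024,
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=128,
    max_position_embeddings=128,
    torchscript=True,  # make the model return tuples so it can be traced
)
model = GPTNeoXModel(config).eval()
input_ids = torch.randint(0, config.vocab_size, (1, 16))

# Before this PR, torch.jit.trace failed because RotaryEmbedding built new tensors
# inside forward; with the fix, tracing goes through.
traced = torch.jit.trace(model, (input_ids,))
```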