TF: Finalize `unpack_inputs`-related changes #16499

gante · 2022-03-30T16:48:13Z

What does this PR do?

Please read this before diving into the changes :) This PR finalizes the changes related to the unpack_inputs decorator and is slightly more complex than the other PRs.

Changes:

Removes **kwargs from most call methods in our TF models:
1. This argument was not documented and, after adding the decorator, was no longer used;
2. Before, it was used exclusively as an input to input_processing, to handle some special cases (which are now handled inside the decorator);
3. The exception is the encoder_decoder models (see below);
4. Removing it means that no hidden parameters are passed to the models anymore. As expected, a few tests were passing unused parameters and had to be corrected. I've added comments throughout the PR to elaborate on these cases.
Replaces the use of input_processing by the decorator in the encoder_decoder models:
1. This was not a 1:1 change, like in the other models -- input_processing was being called before the encoder and the decoder were called, which was redundant (the encoder/decoder now have the decorator, which also calls the function);
2. However, it was also being used for its side effects, i.e. to set some variables (like use_cache), which is equivalent to adding the decorator on the encoder_decoder model;
3. Because these encoder_decoder models must use kwargs, as the encoder/decoder might have a myriad of arguments, the decorator was updated to allow arbitrary kwargs on models that expect them. This brings us back to 1. -- no other models have kwargs now.
Icing on the cake -- input_processing is now only used in the decorator, so I made the function protected :) This means we can start modernizing it without the fear of it being used in other places.
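
To illustrate the idea for readers new to the decorator, here is a minimal sketch in plain Python (no TensorFlow). It is not the code in modeling_tf_utils.py: `unpack_inputs_sketch`, `ToyModel`, and the `config` attribute are hypothetical names, and the only behavior shown is binding the call arguments to named parameters and falling back to the config value when `use_cache` is left as None.

```python
import functools
import inspect
from types import SimpleNamespace


def unpack_inputs_sketch(func):
    """Hypothetical, simplified stand-in for the real decorator."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        # Bind positional and keyword arguments to the parameter names of call().
        bound = signature.bind(self, *args, **kwargs)
        bound.apply_defaults()
        arguments = dict(bound.arguments)
        arguments.pop("self")
        # Side effect that used to come from calling input_processing directly:
        # unset booleans fall back to the value stored in the config.
        if arguments.get("use_cache", 0) is None:
            arguments["use_cache"] = self.config.use_cache
        return func(self, **arguments)

    return wrapper


class ToyModel:
    # Hypothetical config object; real models carry a PretrainedConfig.
    config = SimpleNamespace(use_cache=True)

    @unpack_inputs_sketch
    def call(self, input_ids=None, use_cache=None):
        return {"input_ids": input_ids, "use_cache": use_cache}


model = ToyModel()
print(model.call([1, 2, 3]))
print(model.call(input_ids=[1], use_cache=False))
```

Because the sketch binds against the declared signature, an argument that call() does not declare raises a TypeError from `signature.bind`, which mirrors the "no more hidden parameters" point above.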


HuggingFaceDocBuilderDev · 2022-03-30T17:00:17Z

The documentation is not available anymore as the PR was closed or merged.

gante · 2022-04-01T10:55:24Z

src/transformers/modeling_tf_utils.py

+        if "output_attentions" in kwargs:
+            final_booleans["output_attentions"] = (
+                kwargs["output_attentions"] if kwargs["output_attentions"] is not None else config.output_attentions
+            )


The previous version was passing down final_booleans["output_attentions"]=False in pure conv models, which would set the output_attentions argument to False. The new version results in no argument, which is the desired behavior.

Can you add a comment that "output_attentions" will be in kwargs, with a value of None if unset? That change made me pause for a couple of minutes.
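
The behavior discussed in this thread can be sketched in isolation. This is a simplified illustration, not the library function: `resolve_booleans` is a hypothetical name, and the config is a stand-in object. The point is that the key is only resolved when it is present in kwargs (with None meaning "unset"), so models without attention get no `output_attentions` entry at all.

```python
from types import SimpleNamespace


def resolve_booleans(config, **kwargs):
    """Hypothetical helper mirroring the quoted diff for one flag."""
    final_booleans = {}
    # "output_attentions" is in kwargs whenever call() declares it, with a
    # value of None if the caller left it unset. Pure conv models do not
    # declare it, so the key is absent and nothing is passed down.
    if "output_attentions" in kwargs:
        value = kwargs["output_attentions"]
        final_booleans["output_attentions"] = (
            value if value is not None else config.output_attentions
        )
    return final_booleans


config = SimpleNamespace(output_attentions=True)
print(resolve_booleans(config))                           # no key at all
print(resolve_booleans(config, output_attentions=None))   # falls back to config
print(resolve_booleans(config, output_attentions=False))  # explicit value wins
```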

gante · 2022-04-01T10:56:42Z

src/transformers/modeling_tf_utils.py

+    if has_kwargs:
+        output["kwargs"] = kwargs.pop("kwargs_call", {})
+    else:
+        if len(kwargs["kwargs_call"]) > 0:
+            raise ValueError(
+                f"The following keyword arguments are not supported by this model: {list(kwargs['kwargs_call'].keys())}."
+            )
+        kwargs.pop("kwargs_call")


encoder_decoder models want the kwargs, all other models will discard them (and throw an error if they are not empty)
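
A standalone sketch of that branch, for illustration only: `split_call_kwargs` is a hypothetical wrapper around the logic in the hunk above, and the argument names in the usage lines are made up.

```python
def split_call_kwargs(kwargs, has_kwargs):
    """Hypothetical helper: forward extra call kwargs or reject them."""
    output = {}
    extra = kwargs.pop("kwargs_call", {})
    if has_kwargs:
        # Models whose call() declares **kwargs (the encoder_decoder models)
        # receive the extra arguments untouched.
        output["kwargs"] = extra
    elif len(extra) > 0:
        # Every other model refuses arguments it does not declare.
        raise ValueError(
            f"The following keyword arguments are not supported by this model: {list(extra.keys())}."
        )
    return output


# Forwarded to a model that expects **kwargs.
print(split_call_kwargs({"kwargs_call": {"decoder_foo": 1}}, has_kwargs=True))

# Rejected by a model that does not.
try:
    split_call_kwargs({"kwargs_call": {"typo_arg": 1}}, has_kwargs=False)
except ValueError as err:
    print(err)
```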

gante · 2022-04-01T11:01:17Z

src/transformers/models/t5/modeling_tf_t5.py

+    @property
+    def dummy_inputs(self):
+        return {"input_ids": tf.constant(DUMMY_INPUTS)}
+


This class, TFT5EncoderModel, was inheriting the dummy_inputs used by all other TF T5 classes. However, unlike those classes, its call() does not accept the decoder_xxx arguments that are present in the shared dummy_inputs. Naturally, with stricter checking, this caused tests to fail (better yet, the model failed at load time).

The changes here correct this. The serving function also had to be overwritten, for the same reasons.
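
The mismatch can be reproduced without TensorFlow. In this sketch the two toy classes and `unexpected_dummy_inputs` are hypothetical; it only compares the keys of a shared dummy-input dict against the parameters each call() declares.

```python
import inspect


def unexpected_dummy_inputs(call_fn, dummy_inputs):
    """Return the dummy-input keys that call_fn does not declare."""
    accepted = set(inspect.signature(call_fn).parameters) - {"self"}
    return sorted(set(dummy_inputs) - accepted)


class ToyEncoderDecoder:
    def call(self, input_ids=None, decoder_input_ids=None):
        return input_ids, decoder_input_ids


class ToyEncoderOnly:
    def call(self, input_ids=None):
        return input_ids


# Dummy inputs shared by the family, including a decoder-side entry.
shared_dummy_inputs = {"input_ids": [[1, 2]], "decoder_input_ids": [[1, 2]]}

print(unexpected_dummy_inputs(ToyEncoderDecoder.call, shared_dummy_inputs))
print(unexpected_dummy_inputs(ToyEncoderOnly.call, shared_dummy_inputs))
```

With strict argument checking, any non-empty result here corresponds to a call that would be rejected, which is why the encoder-only class needs its own dummy_inputs.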

gante · 2022-04-01T11:04:07Z

tests/t5/test_modeling_tf_t5.py

+    # This test is run in `TFT5EncoderOnlyModelTest`, where the main layer has the same inputs as the model
+    @unittest.skip(reason="The inputs of the Main Layer are different.")
+    def test_keras_save_load(self):
+        pass


Related to the dummy_inputs comment above. This test uses the TFT5MainLayer class, which has the same inputs as TFT5EncoderModel. All classes in this Tester use the other input format.

This test still happens below, in the Tester for TFT5EncoderModel.

gante · 2022-04-01T11:06:20Z

tests/test_modeling_tf_common.py

+                # Not all models accept "labels" in the forward pass (yet :) )
+                return_labels=True if "labels" in inspect.signature(model_class.call).parameters.keys() else False,


Some models, like TFElectraForPreTraining, do not have a labels argument, unlike their PT counterparts. There are 5 instances like this, all XXXForPreTraining (not to be confused with XXXPreTrainedModel, the base classes that models inherit from).

Without this correction, those models would fail due to the nonexistent labels argument.
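
For reference, the signature check used in the test can be tried on toy classes. `accepts_labels` and both classes below are hypothetical; only the `inspect.signature(...).parameters` lookup comes from the diff.

```python
import inspect


def accepts_labels(model_class):
    """True when the class's call() declares a `labels` parameter."""
    return "labels" in inspect.signature(model_class.call).parameters


class ToyForSequenceClassification:
    def call(self, input_ids=None, labels=None):
        return input_ids, labels


class ToyForPreTraining:
    # No `labels` parameter, like the 5 TF pretraining heads mentioned above.
    def call(self, input_ids=None):
        return input_ids


for model_class in (ToyForSequenceClassification, ToyForPreTraining):
    # Only request labels from the input helper when the model can take them.
    print(model_class.__name__, accepts_labels(model_class))
```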

gante · 2022-04-01T11:06:42Z

tests/test_modeling_tf_common.py


        for model_class in self.all_model_classes:
            inputs_dict["output_attentions"] = True
-            inputs_dict["use_cache"] = False


Not being used at all

gante · 2022-04-01T11:07:21Z

tests/test_modeling_tf_common.py

+            # Not all models accept "labels" in the forward pass (yet :) )
+            if "labels" in inspect.signature(model.call).parameters.keys():


Reordered the tests so as to place all label-dependent tests under this if. Essentially the same label issue as above.

Rocketknight1

This is great! I'm a big fan of pushing input_processing into a protected class, and the kwargs changes make it much clearer what's going on in all of our models. Along with the other unpack_inputs changes, this makes all of our individual model files a lot less confusing for newcomers to the library.

sgugger

Thanks for cleaning all those kwargs up!

sgugger · 2022-04-04T13:55:27Z

src/transformers/modeling_tf_utils.py

+        if "output_attentions" in kwargs:
+            final_booleans["output_attentions"] = (
+                kwargs["output_attentions"] if kwargs["output_attentions"] is not None else config.output_attentions
+            )


Can you add a comment that "output_attentions" will be in kwargs, with a value of None if unset? That change made me pause for a couple of minutes.


gante added 7 commits March 30, 2022 16:27

Add unpack_inputs to remaining models

558ee00

tmp commit

01c4539

remove breakpoint

801fd1d

add unpack_inputs

821baa7

removed kwargs up to clip

3e912f7

removed kwags up to longformer

a68901b

remove unused kwargs in remaining models; inputs_processing is now pr…

9bc6e76

…otected

gante added 5 commits March 31, 2022 09:26

forgot to pop empty kwargs

87b7ab5

handle models without attention

ac1f936

derp

93c2a65

fix t5 tests

df4da79

Fix equivalence tests

364284c

gante commented Apr 1, 2022


gante requested review from Rocketknight1 and sgugger April 1, 2022 11:07

Rocketknight1 approved these changes Apr 1, 2022


Merge branch 'main' into encoder_decoder_unpack_inputs

11e8ca4

sgugger approved these changes Apr 4, 2022


PR comments

a8d859d

gante merged commit dad5ca8 into huggingface:main Apr 4, 2022

stevhliu pushed a commit to stevhliu/transformers that referenced this pull request Apr 5, 2022

TF: Finalize unpack_inputs-related changes (huggingface#16499)

7d64881

* Add unpack_inputs to remaining models * removed kwargs to `call()` in TF models * fix TF T5 tests

gante deleted the encoder_decoder_unpack_inputs branch April 11, 2022 21:48
