add doctests to TF ViT by johko · Pull Request #16462 · huggingface/transformers

johko · 2022-03-28T19:59:49Z

What does this PR do?

Add doctests for the TF version of ViT

Fixes # (issue)
#16292

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@patrickvonplaten

johko · 2022-03-28T20:00:49Z

src/transformers/models/vit/modeling_tf_vit.py

        Returns:

        Examples:
-


I wasn't sure whether there is supposed to be an empty line here or not; to me it looked better without it.

Let's just keep the empty line. It is done this way in other places.

johko · 2022-03-28T20:01:40Z

src/transformers/models/vit/modeling_tf_vit.py

        >>> last_hidden_states = outputs.last_hidden_state
+        >>> list(last_hidden_states.shape)
+        [1, 197, 768]
        ```"""


I also wasn't sure about the style here: keep ``` and """ on the same line, or split them across two lines?

It is more or less a style choice. In the model files, we usually keep it as ```""".
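As a side note on the expected output in the diff above, here is a minimal sketch of why the doctest expects `[1, 197, 768]` and how a doctest checks it. It assumes a 224x224 input split into 16x16 patches plus one [CLS] token, and the ViT-base hidden size of 768; `vit_sequence_length` is a hypothetical helper written for illustration, not part of transformers.

```python
import doctest


def vit_sequence_length(image_size: int = 224, patch_size: int = 16) -> int:
    """Tokens seen by a ViT encoder: one per image patch plus one [CLS] token.

    >>> vit_sequence_length(224, 16)
    197
    >>> [1, vit_sequence_length(), 768]
    [1, 197, 768]
    """
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + 1


# Parse the docstring and run its examples, comparing actual vs. expected output.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    vit_sequence_length.__doc__,
    {"vit_sequence_length": vit_sequence_length},
    "vit_sequence_length",
    None,
    0,
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)  # 0 failures out of 2 examples
```

The same mechanism applies to the model docstring: the line after `>>> list(last_hidden_states.shape)` is compared against the actual output, so the doctest fails if the shape ever changes.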

ydshieh · 2022-03-28T20:28:39Z

Hi, @johko

Thank you for this PR! Really appreciated.

I just realized that we have PT_VISION_BASE_MODEL_SAMPLE, but not TF_VISION_BASE_MODEL_SAMPLE.

Ideally, we would like to reuse the code samples in doc.py. I will need to discuss this with the team before we make a decision.

I will come back to you once we have a decision. Sorry for the inconvenience!

ydshieh · 2022-03-29T12:56:48Z

Hi, @johko

After discussing with the team, we think it would be better to add the following to doc.py:

TF_VISION_BASE_MODEL_SAMPLE
TF_VISION_SEQ_CLASS_SAMPLE

(as already done in PyTorch side).

I will open a PR to add these, and keep you updated when that PR is merged. Thank you.

ydshieh · 2022-03-29T17:05:17Z

@johko

TF_VISION_BASE_MODEL_SAMPLE and TF_VISION_SEQ_CLASS_SAMPLE were added in #16477, which is already merged into main.

Once you pull upstream's main into your local clone's main (or master) and rebase or merge it into your working branch, all that remains is to use @add_code_sample_docstrings together with expected_output, checkpoint, etc. You can take this as a reference:

https://github.com/huggingface/transformers/pull/16363/files#diff-5707805d290617078f996faf1138de197fa813f78c0aa5ea497e73b5228f1103

Regarding the checkpoint to use:

TFViTModel: "google/vit-base-patch16-224-in21k"
TFViTForImageClassification: "google/vit-base-patch16-224"

I manually checked the code sample. Let me know if you encounter any issue.
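To illustrate the idea of reusing one shared sample across models, here is a rough, self-contained sketch. It is not the transformers implementation: `TOY_VISION_BASE_MODEL_SAMPLE`, `toy_add_code_sample`, and `ToyTFViTModel` are hypothetical stand-ins, and the real `add_code_sample_docstrings` takes more arguments than shown here. Only the checkpoint name and the expected shape are taken from this thread.

```python
# Hypothetical stand-in for a shared sample template (the real ones live in doc.py).
TOY_VISION_BASE_MODEL_SAMPLE = """
    Example:

    >>> model = {model_class}.from_pretrained("{checkpoint}")
    >>> list(outputs.last_hidden_state.shape)
    {expected_output}
"""


def toy_add_code_sample(model_class, checkpoint, expected_output):
    """Return a decorator that appends the filled-in sample to a docstring."""

    def decorator(fn):
        sample = TOY_VISION_BASE_MODEL_SAMPLE.format(
            model_class=model_class,
            checkpoint=checkpoint,
            expected_output=expected_output,
        )
        fn.__doc__ = (fn.__doc__ or "") + sample
        return fn

    return decorator


class ToyTFViTModel:
    @toy_add_code_sample(
        model_class="TFViTModel",
        checkpoint="google/vit-base-patch16-224-in21k",
        expected_output=[1, 197, 768],
    )
    def call(self, pixel_values=None):
        """Run the forward pass (placeholder body for this sketch)."""
        return None


print(ToyTFViTModel.call.__doc__)
```

The point of the design is that the example text is written once and each model only supplies its checkpoint and expected output, so the doctest stays consistent across model files.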

johko · 2022-03-29T19:39:31Z

@ydshieh
Thanks for the explanation. I'll look into it and implement the changes within the next few days.

johko · 2022-04-01T22:26:03Z

@ydshieh
Ah, sorry, I messed up the history now. I'm still getting used to the VS Code Git plugin.
But I have now added the add_code_sample_docstrings decorator to modeling_tf_vit.py.

ydshieh · 2022-04-04T12:27:56Z

@johko

Thank you for the update!

Regarding the commit history, could you try to fix it? I have never had to deal with this situation so far, but here are a few threads I found that are potentially relevant:

Please note that these links are merely an attempt to fix the current situation; I cannot guarantee that the methods they mention will work without any problems.

In any case, since the real change is very small, you can always update the master/main branch of your local clone, and create a new branch + add the changes there, then open a new PR.

Thank you for your understanding!

johko · 2022-04-04T13:08:22Z

Thank you, I'll look into it and try to fix it. In the end, as you mentioned, I can always just make a new branch from my local master.

johko · 2022-04-04T13:31:45Z

@ydshieh I'll close this PR and create a new one with a clean history. It seems VS Code used git sync, which messes up the history after a rebase.

ydshieh · 2022-04-04T15:15:26Z

No problem, @johko . Thank you for the effort!

Johannes Kolbe added 2 commits March 28, 2022 21:54

add doctests to TF ViT

56b9b2c

add doctests to TF ViT

fbd0461

Johannes Kolbe and others added 3 commits March 28, 2022 22:34

add empty lines back in

f9d231f

Merge branch 'doc-test-vit-tf' of https://github.com/johko/transformers…

946e4b4

… into doc-test-vit-tf

Fix blenderbot conversion script (huggingface#16472)

8529562

arnaudstiegler and others added 10 commits March 29, 2022 16:19

[MNLI example] Prevent overwriting matched with mismatched metrics (h…

5216607

…uggingface#16475) * Prevent overwriting matched with mismatched metrics * Fix style

Remove duplicate mLuke (huggingface#16460)

45abb37

* Remove duplicate mLuke * 🖍 apply feedback

Fix example test and test_fetcher for examples (huggingface#16478)

b62ac4d

fix wrong variable name (huggingface#16467)

3015d12

Add TF vision model code samples (huggingface#16477)

6358a4c

* add code samples Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

[doc] Fix missing trainer import (huggingface#16469)

875e07a

Add type hints for UniSpeech (huggingface#16399)

0540d1b

* Add type hints for UniSpeech * Added type hints for UniSpeechSat * Added type hints for Wave2Vec2 (PT) * Added type hints for models dependent of wave2vec

gante and others added 5 commits March 29, 2022 18:17

TF: properly handle kwargs in encoder_decoder architectures (huggingf…

7a9ef81

…ace#16465) * properly handle kwargs in encoder_decoder architectures * make fixup

added typehints for RAG pytorch models (huggingface#16416)

781af73

ydshieh and others added 4 commits March 29, 2022 22:12

Raise diff tolerance value for TFViTMAEModelTest (huggingface#16483)

2b48323

* Raise diff tolerance value Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Do not initialize torch.distributed process group if one is already…

277d49a

… initailized (huggingface#16487) * Do not initialize torch process group twice * Apply suggestions from code review

TF GPT-J Type hints and TF decorator (huggingface#16488)

ffd19ee

* Type hints and TF decorator added * Type hints and TF decorator added * make style Co-authored-by: matt <rocketknight1@gmail.com>

Nit: MCSCOCO -> MS COCO (huggingface#16481)

147c816

SimplyJuanjo and others added 23 commits March 31, 2022 07:43

fixed a typo (huggingface#16508)

05b4c32

added type hints to xglm pytorch (huggingface#16500)

b808d8a

* added type hints to xglm pytorch * Update src/transformers/models/xglm/modeling_xglm.py * Update src/transformers/models/xglm/modeling_xglm.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

Fix syntax error in generate docstrings (huggingface#16516)

e4b2348

[research] link to the XTREME-S paper (huggingface#16519)

5807054

* [research] link to the XTREME-S paper * Update examples/research_projects/xtreme-s/README.md Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

Fixed a typo in legacy seq2seq_trainer.py (huggingface#16531)

bfeff6c

Add ONNX export for BeiT (huggingface#16498)

9de70f2

* Add beit onnx conversion support * Updated docs * Added cross reference to ViT ONNX config

call on_train_end when trial is pruned (huggingface#16536)

483a945

Type hints added (huggingface#16529)

afc5a1e

Fix Bart type hints (huggingface#16297)

59a9c83

* Add type hints to PLBart PyTorch * Remove pending merge conflicts * Fix PLBart Type Hints * Add changes from review

Add VisualBert type hints (huggingface#16544)

9947dd0

Remove MBart subclass of XLMRoberta in tokenzier docs (huggingface#16546

823dbf8

) * Remove MBart subclass of XLMRoberta in tokenzier * Fix style * Copy docs from MBart50 tokenizer

Use random_attention_mask for TF tests (huggingface#16517)

2199382

* use random_attention_mask for TF tests * Fix for TFCLIP test (for now). Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Improve code example (huggingface#16450)

61ee26a

Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>

Pin tokenizers version <0.13 (huggingface#16539)

53a4d6b

* Pin tokenizers version <0.13 * Style

Add code samples for TF speech models (huggingface#16494)

60d27b1

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

add doctests to TF ViT

9ac49ab

add empty lines back in

f4f0f6c

use add_code_sample_docstrings decorator

92e0b2d

merged

1cb3323

johko closed this Apr 4, 2022

johko mentioned this pull request Apr 6, 2022

add vit tf doctest with @add_code_sample_docstrings #16636

Merged

5 tasks
