Synchronize with HF by tileintel · Pull Request #23 · abhiwand/transformers

tileintel · 2023-03-08T09:09:45Z

What does this PR do?

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

…in a future version of pytorch" (#20211) * rounding_mode = "floor" instead of // to prevent behavioral change * add other TODO * use `torch_int_div` from pytorch_utils * same for tests * fix copies * style * use relative imports when needed * Co-authored-by: sgugger <sylvain.gugger@gmail.com>
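The "behavioral change" this commit guards against comes from the two common ways of rounding an integer quotient: floor division rounds toward negative infinity, while truncating division rounds toward zero. PyTorch exposes both through `torch.div(..., rounding_mode=...)`. The following is a minimal pure-Python sketch of the difference; the helper names `floor_div` and `trunc_div` are hypothetical and are not the `torch_int_div` helper mentioned in the commit.

```python
import math


def floor_div(a: int, b: int) -> int:
    """Round the quotient toward negative infinity (Python's // semantics)."""
    return a // b


def trunc_div(a: int, b: int) -> int:
    """Round the quotient toward zero (C-style integer division)."""
    return math.trunc(a / b)


if __name__ == "__main__":
    # The two modes agree for positive operands and differ for negative ones.
    for a, b in [(7, 2), (-7, 2)]:
        print(f"{a} / {b}: floor={floor_div(a, b)}, trunc={trunc_div(a, b)}")
```

For index arithmetic on non-negative values the two modes agree, which is why an explicit floor mode can replace `//` without changing results there.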

…shape) (#21860) * Change the .view call to .reshape * Change the .view call to .reshape to all the copies from bart attention * Fix copies and style * Fix copies and style * Fix copies and style * Fix copies and style * Fix copies and style * Revert unnecessary changes * Revert unnecessary changes * Revert unnecessary changes * Revert unnecessary changes

* Temporary commit to stash everything so far * Temporary commit to stash everything so far * stash commit * Refactor from_pretrained * Fix final test, make fixup * Update dummies * Add model to TEST_FILES_WITH_NO_COMMON_TESTS * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Add TFVisionTextDualEncoder to utils/documentation_tests.txt * make fixup --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

Adds the ALIGN model to transformers. ALIGN is introduced in "Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision" by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.

* add `zero_mean_unit_var_norm` function * normalize before MEL computation * fixup * add simple test * quality * Update tests/models/whisper/test_feature_extraction_whisper.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fixup * use attention masks if padding was applied * Update based on review Co-authored-by: bofeng huang <bofenghuang7@gmail.com> --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: bofeng huang <bofenghuang7@gmail.com>
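The normalization named in this commit rescales a waveform to zero mean and unit variance, and the "use attention masks if padding was applied" step suggests the statistics should come only from real samples. Below is a minimal pure-Python sketch of that idea, not the library's implementation. The function name `zero_mean_unit_var`, the `eps` value, and the choice to write `0.0` into padded positions are all assumptions made for illustration.

```python
import math
from typing import List, Optional


def zero_mean_unit_var(
    values: List[float],
    mask: Optional[List[int]] = None,
    eps: float = 1e-7,  # assumed small constant to avoid division by zero
) -> List[float]:
    """Normalize `values` using statistics from unmasked positions only."""
    if mask is None:
        mask = [1] * len(values)
    valid = [v for v, m in zip(values, mask) if m]
    mean = sum(valid) / len(valid)
    var = sum((v - mean) ** 2 for v in valid) / len(valid)
    scale = math.sqrt(var + eps)
    # Assumption: padded positions are set to 0.0 after normalization.
    return [(v - mean) / scale if m else 0.0 for v, m in zip(values, mask)]


if __name__ == "__main__":
    # The last sample is padding, so it does not influence mean or variance.
    print(zero_mean_unit_var([1.0, 2.0, 3.0, 0.0], mask=[1, 1, 1, 0]))
```

Without the mask, padded zeros would pull the mean toward zero and inflate the variance of short clips in a batch.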

* Initial commit * stash commit * Add model checkpointing and pushing * Fix model name inference * Update README * Update README * Remove a couple of Torch references * Update copyright date * make fixup * Update PushToHubCallback args! * Remove the torch summary * Add strategy.scope

* update expected values for jukebox * update expected values for jukebox * update expected values for jukebox * update expected values for jukebox * update expected values for jukebox --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* First draft * Fix to_dict * Improve conversion script * Update config * Remove timm dependency * Fix dummies * Fix typo, add integration test * Upload 101 model as well * Remove timm dummies * Fix style --------- Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* added informer to gitignore * added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding informeConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * added informer to gitignore * WIP informer2020 * added checking that instantiate works * added config using gluonTS by kashif * WIP config * adding informeConfig. need to remove FeatureEmbedder * done InformerConfig, but need to change the names * Done informer model init. working on enc-dec * added things to address, after reading again enc-dec in the paper * done modeling - checking initialization work * moved enc-dec init to InformerEncoder/Decoder init * added 'init_std' to config, now model init works! * WIP conversion script, and added code sources * WIP conversion script: loading original informer pth works * WIP conversion script: change defaults in the config * WIP conversion script: supporting Informer input embedding * WIP conversion script: added parameters for the informer embed * WIP conversion script: change dim_feedforward=2048 * WIP conversion script: remove unused args for loading checkpoint * just cleaning up * DataEmbedding removed, after thinking with Kashif * working on forward pass * WIP forward pass: trying to establish working batch for forward pass * cleaning and finalizing * adding HF names and docs * init after cleaning works * WIP in tests * added docs for the informer specific args * fix style * undo change * cleaning informer, now need to work only enc-dec * initial enc-dec classes * added encoder and decoder * added todo * add todos for conv_layers * added decoder docs from vanilla * added encoder docs from vanilla * remove encoder decoder from the original informer * removed AttentionLayer from the 
original paper * removed TriangularCausalMask, same as decoder_attention_mask * initial sparse attention * use conv_layers * fixed test_config test * fix parenthesis when itearting zip(layers, conv_layers) * error found in prob attention, added sizes as comments * fix sizes * added proposal for q_reduce indexing, and remove unused * WIP ProbMask, and changed factor=2 for testing * remove unused libs for this PR for creating the env * fix checking the attn_weights.size() after bmm * Q_reduce: changed from torch.gather to simple slicing * WIP calculate final attn_output * finish adding v_aggregated, attn_output ready * changed tgt_len to u in attention_mask, need to fix the size error * comment attention_mask for encoder, and fix if cond for v_agg * added ProbMask support (wip), removed old original code * finished ProbMask 😃 * Revert "remove unused libs for this PR for creating the env" This reverts commit 11a081e. * fixes * make style * fix initial tests * fix more tests * dry * make style * remove unused files * style * added integration tests * fix num_static_real_features * fix header * remove unused function * fix example * fix docs * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/modeling_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/informer/configuration_informer.py Co-authored-by: NielsRogge 
<48327001+NielsRogge@users.noreply.github.com> * fixes for reviewer * use prediction_length from model * fix style * fixed informer.mdx * added to index * updated readme * undo * make fix-copies * typo * fix copy * added Informer to toctree * in order * fixed comments * remove unneeded new lines in docs * make static real and cat optional * fix use of distil conv layers * fixed integration test * added checkpoint for convlayer * make fix-copies * updated from time series model * make fix-copies * copy decoder * fix unit tests * updated scaling config * fix integration tests * IGNORE_NON_TESTED * IGNORE_NON_AUTO_CONFIGURED * IGNORE_NON_AUTO_CONFIGURED * updated check configs * fix formatting * undo change from time series * prediction_length should not be None * aliign with the blog: prettify ProbSparse and change attention_factor to sampling_factor * make style * make fix-copies * niels CR: update contributed by * niels CR: update configuration_informer.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * niels CR: update kashif -> huggingface Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * niels CR: `sampling_factor` only relevant when `attention_type`=prob * make style * fixed U_part: added multiplication by `L_Q` * fixed bug: remove `is not None` from `if config.distil` * fixed test: `decoder_seq_length` to `encoder_seq_length` in cross_attentions check * fix integration tests * updated model hub * do not shift as in training * undo * fix make-copies * make fix-copies * added `if prediction_length is None` * changed `ProbSparseAttention` to `InformerProbSparseAttention` * changed `V_sum` -> `v_mean_dim_time` * changed `ConvLayer` to `InformerConvLayer` and fixed `super()` * TimeSeriesTansformer->Informer in decoder's Copied from * more descriptive in ProbSparse * make style * fix coped from * Revert "added `if prediction_length is None`" This reverts commit b4cbddf. 
* fixed indent * use InformerSinusoidalPositionalEmbedding * make fix-style * fix from #21860 * fix name * make fix-copies * use time series utils * fix dec num_heads * docstring * added time series util doc * _import_structure * formatting * changes from review * make style * fix docs * fix doc * removed NegativeLogLikelihood --------- Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

HuggingFaceDocBuilderDev · 2023-03-08T09:25:56Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

sgugger and others added 30 commits February 28, 2023 16:24

Fix flaky test for log level (#21776)

b29e2dc

* Fix flaky test for log level * Fix other flaky test

[ConvBert] Fix #21523 (#21849)

b599b19

* fix reshaping Fixes #21523 * add test * styling * last fixes * Update src/transformers/models/convbert/modeling_convbert.py * code quality

Flax beam search fix (#21857)

5e6cd51

Fix gradient checkpointing bug Bart (#21866)

72e9ca7

Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>

[deepspeed] check whether model is NLP one instead of counting on inp…

f71873c

…ut type (#21800) * trying to figure out whether model is NLP * drop my changes and apply easier fix * trying to handle all int input types * fix logic --------- Co-authored-by: Stas Bekman <stas@stason.org>

Italian translation of community.mdx (#21871)

619d831

Italian translation of community.mdx gh-17459

[Blip] Fix blip doctest (#21868)

72787c5

fix blip doctest

Removed BLIP mention from the troubleshooting guide (#21872)

9c1d598

removed BLIP mention from the troubleshooting guide

update FSDP and add XLA-FSDP documentation (#21812)

571dd69

* update FSDP and add XLA-FSDP documentation * resolving comments * minor update * fix xla-fsdp docs

[doc] deepspeed tests (#21859)

3eba1dd

Add a utility file to get information from test files (#21856)

53735d7

* Add a utility file to get information from test files --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Add check for different embedding types in examples (#21881)

1d3a1cc

* Add check for different embedding types in examples * Correctly update summarization example

Make loading of pretrained gpt2 faster by avoiding initialization of …

45e1109

…Conv1D's weights (#21879) apply normal_ after assigning weight as nn.Parameter to avoid unnecessary initialization computation

Fix Gradient checkpointing bug BigBird (#21882)

4edfd2d

Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>

Fix WhisperModelTest (#21883)

36ee128

* force on the same device * fix tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Fix test_load_default_pipelines_pt for ClapModel (#21886)

89359e4

* fix tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

fix checkpoint (#21874)

43299c6

[Refactor] Relative imports wherever we can (#21880)

633e5e8

* initial commit * update * second batch * style * fix imports * fix relative import on pipeline

[ZAC] fix ci daily (#21893)

c256bc6

add correct revision after model was overwritten

Use PyAV instead of Decord in examples (#21572)

3412f59

* Use PyAV instead of Decord * Get frame indices * Fix number of frames * Update src/transformers/models/videomae/image_processing_videomae.py * Fix up * Fix copies * Update timesformer doctests * Update docstrings

Add inputs_embeds functionality when generating with BioGPT (#21889)

edbb37f

* initial commit to add inputs_embeds to generation * formatting

[T5 doc] Fix confusing documentation about d_kv (#21896)

b48c7f7

* Confusing documentation in T5 * Fix confusing documentation in T5 configuration file

fix typo in Bart's attention (#21898)

648d0de

[GPT-J] add deprecation warning (#21869)

fb76994

* add deprecation warning * remove pos ids from args docstring * fix failing test

fsdp bf16 enable autocast (#21847)

b6f47b5

ydshieh and others added 28 commits March 6, 2023 09:15

Update expected values in XLMProphetNetModelIntegrationTest (#21957)

fcf8134

update values Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

[CI] Fix ci (#21940)

bc33fbf

* fix `get_proposal_pos_embed` * fix order * style * zero shot simplify test * add approximate values for zero shot audio classification

Disable DDP for neuron (#21953)

0bb1729

Disable DDP for neuron Co-authored-by: EC2 Default User <ec2-user@ip-172-31-42-72.us-west-2.compute.internal>

Fix bert issue (#21963)

934d0b8

Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>

[Generate] Fix gradient_checkpointing and use_cache bug for BLOOM (#2…

f3c75f8

…1956) Step 1 - Change use_cache fix

Add missing parameter definition in layoutlm config (#21960)

64d95c4

Four parameters in the `LayoutLM` config were missing definitions; added their definitions (copied from BertConfig).

Use larger atol in torch.allclose for some tests (#21966)

9474abd

Use larger atol Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
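`torch.allclose` treats two values as close when `|a - b| <= atol + rtol * |b|`, with defaults `rtol=1e-05` and `atol=1e-08`, so raising `atol` loosens the check for values near zero where the relative term contributes little. Here is a small pure-Python sketch of that scalar criterion; the helper name `is_close` and the sample numbers are made up for illustration and do not come from the tests touched by this commit.

```python
def is_close(a: float, b: float, rtol: float = 1e-5, atol: float = 1e-8) -> bool:
    """Scalar form of the closeness test: |a - b| <= atol + rtol * |b|."""
    return abs(a - b) <= atol + rtol * abs(b)


if __name__ == "__main__":
    expected, actual = 0.12350, 0.12345  # illustrative values only
    # The difference is about 5e-5, which the default tolerances reject.
    print(is_close(actual, expected))
    # A larger absolute tolerance accepts the same pair.
    print(is_close(actual, expected, atol=1e-4))
```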

Update expected values for test_xglm_sample (#21975)

f2a2616

update expected values for xglm Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Fix gradient checkpointing bug in BigBird Pegasus (#21976)

4f84ded

Fix gradient checkpointing bug in Blenderbot Small (#21977)

451263b

Fix gradient checkpointing bug in BlipText (#21978)

4a545d1

Make Format

Fix gradient checkpointing bug in Codegen (#21979)

de496ef

Fix gradient checkpointing bug in ESM (#21980)

0ce5236

docs: improve clarity for language modeling (#21952)

31e3c6c

* docs: improve clarity for clm/mlm * docs: remove incorrect explanation * docs: remove incorrect explanation --------- Co-authored-by: pdhall99 <pdhall99>

Add check before int casting for PIL conversion (#21969)

4063fd9

* Add check before int casting for PIL conversion * Line length * Tidier logic

Fix MinNewTokensLengthLogitsProcessor when used with a list of eos to…

eec46b4

…kens (#21959) * Fix MinNewTokensLengthLogitsProcessor when used with a list of eos tokens * fix docs * Empty commit * formatting
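The fix concerns enforcing a minimum number of new tokens when there is more than one end-of-sequence id: every EOS id has to be suppressed until the minimum is reached, not just a single one. A minimal sketch of that behavior follows, assuming scores are a plain list indexed by token id. The function `suppress_eos` and its arguments are hypothetical and are not the `MinNewTokensLengthLogitsProcessor` API.

```python
from typing import List, Union


def suppress_eos(
    scores: List[float],
    new_tokens: int,
    min_new_tokens: int,
    eos_token_id: Union[int, List[int]],
) -> List[float]:
    """Return scores with every EOS id masked out until enough tokens exist."""
    # Accept either a single id or a list of ids.
    eos_ids = [eos_token_id] if isinstance(eos_token_id, int) else list(eos_token_id)
    if new_tokens >= min_new_tokens:
        return list(scores)
    masked = list(scores)
    for token_id in eos_ids:
        masked[token_id] = float("-inf")  # this token can no longer be selected
    return masked


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.3, 0.4]
    print(suppress_eos(scores, new_tokens=1, min_new_tokens=3, eos_token_id=[1, 3]))
```

Normalizing the single-id case to a list keeps one code path for both kinds of input.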

[DETR, YOLOS] Fix device bug (#21974)

95408e9

* Fix integration test * Add test * Add test

Remove unneeded casts to bool (#21983)

10bcbca

Remove cast to Bool

Update notification_service.py (#21992)

99c5c60

* better check * better check --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Skip test_multi_gpu_data_parallel_forward for some model tests (#21991)

9402788

skip test_multi_gpu_data_parallel_forward for some model tests Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

[Whisper] Add model for audio classification (#21754)

7c39318

* [Whisper] Add model for audio classification * make fix-copies * add to docs * add docstring * empty returns * add code example * switch to fleurs * stick everything on one line

Stop requiring Torch for our TF examples! (#21997)

d128f2f

* Stop requiring Torch for our TF examples! * Slight tweak to logging in the example itself

[TF] Fix creating a PR while pushing in TF framework (#21968)

2156662

* add create pr arg * style * add test * fixup * update test * last nit fix typo * add `is_pt_tf_cross_test` marker for the tests

Update tiny model creation script and some others files (#22006)

b338414

* Update 1 * Update 2 * Update 3 * Update 4 * Update 5 * Update 6 * Update 7 * Update 8 * Update 9 * Update 10 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

tileintel merged commit c9984da into abhiwand:main Mar 8, 2023

tileintel merged 92 commits into abhiwand:main from huggingface:main

20 participants