Skip to content

Synchronize with HF#23

Merged
tileintel merged 92 commits into
abhiwand:mainfrom
huggingface:main
Mar 8, 2023
Merged

Synchronize with HF#23
tileintel merged 92 commits into
abhiwand:mainfrom
huggingface:main

Conversation

@tileintel
Copy link
Copy Markdown
Collaborator

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

sgugger and others added 30 commits February 28, 2023 16:24
* Fix flaky test for log level

* Fix other flaky test
…in a future version of pytorch" (#20211)

* rounding_mode = "floor"  instead of // to prevent behavioral change

* add other TODO

* use `torch_int_div` from pytrch_utils

* same for tests

* fix copies

* style

* use relative imports when needed

* Co-authored-by: sgugger <sylvain.gugger@gmail.com>
* fix reshaping
Fixes #21523

* add test

* styling

* last fixes

* Update src/transformers/models/convbert/modeling_convbert.py

* code quallity
Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>
…ut type (#21800)

* trying to figure out whether model is NLP

* drop my changes and apply easier fix

* trying to handle all int input types

* fix logic

---------

Co-authored-by: Stas Bekman <stas@stason.org>
…shape) (#21860)

* Change the .view call to .reshape

* Change the .view call to .reshape to all the copies from bart attention

* Fix copies and style

* Fix copies and style

* Fix copies and style

* Fix copies and style

* Fix copies and style

* Revert unneccessary changes

* Revert unneccessary changes

* Revert unneccessary changes

* Revert unneccessary changes
Italian translation of community.mdx gh-17459
removed BLIP mention from the troubleshooting guide
* update FSDP and add XLA-FSDP documentation

* resolving comments

* minor update

* fix xla-fsdp docs
* Add an utility file to get information from test files

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Add check for different embedding types in examples

* Correctly update summarization example
…Conv1D's weights (#21879)

apply normal_ after assigning weight as nn.Parameter to avoid unnecessary initialization computation
* Temporary commit to stash everything so far

* Temporary commit to stash everything so far

* stash commit

* Refactor from_pretrained

* Fix final test, make fixup

* Update dummies

* Add model to TEST_FILES_WITH_NO_COMMON_TESTS

* Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Add TFVisionTextDualEncoder to utils/documentation_tests.txt

* make fixup

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Adds the ALIGN model to transformers. ALIGN is introduced in "Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision" by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.
Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>
* force on the same device

* fix tests

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* fix tests

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* initial commit

* update

* second batch

* style

* fix imports

* fix relative import on pipeline
add correct revision after model was overwritten
* Use PyAV instead of Decord

* Get frame indices

* Fix number of frames

* Update src/transformers/models/videomae/image_processing_videomae.py

* Fix up

* Fix copies

* Update timesformer doctests

* Update docstrings
* initial commit to add inputs_embeds to generation

* formatting
* Confusing documentation in T5

* Fix onfusing documentation in T5 configuration file
* add `zero_mean_unit_var_norm` function

* normalize before MEL computation

* fixup

* add simple test

* quality

* Update tests/models/whisper/test_feature_extraction_whisper.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* fixup

* use attention masks if padding was applied

* Update based on review

Co-authored-by: bofeng huang <bofenghuang7@gmail.com>

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: bofeng huang <bofenghuang7@gmail.com>
* add deprecation warning

* remove pos ids from args docstirng

* fix failing test
ydshieh and others added 28 commits March 6, 2023 09:15
update values

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* fix `get_proposal_pos_embed`

* fix order

* style

* zero shot simplify test

* add approximate values for zero shot audio classification
Disable DDp for neuron

Co-authored-by: EC2 Default User <ec2-user@ip-172-31-42-72.us-west-2.compute.internal>
Co-authored-by: saswatmeher <saswatmeher@cse.iitb.ac.in>
Four parameters in `LayoutLM` config were missing definitions, Added their definition (copied from BertConfig).
Use larger atol

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Initial commit

* stash commit

* Add model checkpointing and pushing

* Fix model name inference

* Update README

* Update README

* Remove a couple of Torch references

* Update copyright date

* make fixup

* Update PushToHubCallback args!

* Remove the torch summary

* Add strategy.scope
update expected values for xglm

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* docs: improve clarity for clm/mlm

* docs: remove incorrect explanation

* docs: remove incorrect explanation

---------

Co-authored-by: pdhall99 <pdhall99>
* update expected values for jukebox

* update expected values for jukebox

* update expected values for jukebox

* update expected values for jukebox

* update expected values for jukebox

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Add check before int casting for PIL conversion

* Line length

* Tidier logic
…kens (#21959)

* Fix MinNewTokensLengthLogitsProcessor when used with a list of eos tokens

* fix docs

* Empty commit

* formatting
* Fix integration test

* Add test

* Add test
* better check

* better check

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
)

skip test_multi_gpu_data_parallel_forward for some model tests

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* [Whisper] Add model for audio classification

* make fix-copies

* add to docs

* add docstring

* empty returns

* add code example

* switch to fleurs

* stick everything on one line
* Stop requiring Torch for our TF examples!

* Slight tweak to logging in the example itself
* add create pr arg

* style

* add test

* ficup

* update test

* last nit fix typo

* add `is_pt_tf_cross_test` marker for the tsts
* First draft

* Fix to_dict

* Improve conversion script

* Update config

* Remove timm dependency

* Fix dummies

* Fix typo, add integration test

* Upload 101 model as well

* Remove timm dummies

* Fix style

---------

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* added informer to gitignore

* added informer to gitignore

* WIP informer2020

* added checking that instantiate works

* added config using gluonTS by kashif

* WIP config

* adding informeConfig. need to remove FeatureEmbedder

* done InformerConfig, but need to change the names

* Done informer model init. working on enc-dec

* added things to address, after reading again enc-dec in the paper

* done modeling - checking initialization work

* added informer to gitignore

* WIP informer2020

* added checking that instantiate works

* added config using gluonTS by kashif

* WIP config

* adding informeConfig. need to remove FeatureEmbedder

* done InformerConfig, but need to change the names

* Done informer model init. working on enc-dec

* added things to address, after reading again enc-dec in the paper

* done modeling - checking initialization work

* moved enc-dec init to InformerEncoder/Decoder init

* added 'init_std' to config, now model init works!

* WIP conversion script, and added code sources

* WIP conversion script: loading original informer pth works

* WIP conversion script: change defaults in the config

* WIP conversion script: supporting Informer input embedding

* WIP conversion script: added parameters for the informer embed

* WIP conversion script: change dim_feedforward=2048

* WIP conversion script: remove unused args for loading checkpoint

* just cleaning up

* DataEmbedding removed, after thinking with Kashif

* working on forward pass

* WIP forward pass: trying to establish working batch for forward pass

* cleaning and finalizing

* adding HF names and docs

* init after cleaning works

* WIP in tests

* added docs for the informer specific args

* fix style

* undo change

* cleaning informer, now need to work only enc-dec

* initial enc-dec classes

* added encoder and decoder

* added todo

* add todos for conv_layers

* added decoder docs from vanilla

* added encoder docs from vanilla

* remove encoder decoder from the original informer

* removed AttentionLayer from the original paper

* removed TriangularCausalMask, same as decoder_attention_mask

* initial sparse attention

* use conv_layers

* fixed test_config test

* fix parenthesis when itearting zip(layers, conv_layers)

* error found in prob attention, added sizes as comments

* fix sizes

* added proposal for q_reduce indexing, and remove unused

* WIP ProbMask, and changed factor=2 for testing

* remove unused libs for this PR for creating the env

* fix checking the attn_weights.size() after bmm

* Q_reduce: changed from torch.gather to simple slicing

* WIP calculate final attn_output

* finish adding v_aggregated, attn_output ready

* changed tgt_len to u in attention_mask, need to fix the size error

* comment attention_mask for encoder, and fix if cond for v_agg

* added ProbMask support (wip), removed old original code

* finished ProbMask 😃

* Revert "remove unused libs for this PR for creating the env"

This reverts commit 11a081e.

* fixes

* make style

* fix initial tests

* fix more tests

* dry

* make style

* remove unused files

* style

* added integration tests

* fix num_static_real_features

* fix header

* remove unused function

* fix example

* fix docs

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/modeling_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixes for reviewer

* use prediction_length from model

* fix style

* fixed informer.mdx

* added to index

* updated readme

* undo

* make fix-copies

* typo

* fix copy

* added Informer to toctree

* in order

* fixed comments

* remove unneeded new lines in docs

* make static real and cat optional

* fix use of distil conv layers

* fixed integration test

* added checkpoint for convlayer

* make fix-copies

* updated from time series model

* make fix-copies

* copy decoder

* fix unit tests

* updated scaling config

* fix integration tests

* IGNORE_NON_TESTED

* IGNORE_NON_AUTO_CONFIGURED

* IGNORE_NON_AUTO_CONFIGURED

* updated check configs

* fix formatting

* undo change from time series

* prediction_length should not be None

* aliign with the blog: prettify ProbSparse and change attention_factor  to sampling_factor

* make style

* make fix-copies

* niels CR: update contributed by

* niels CR: update configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* niels CR: update kashif -> huggingface

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* niels CR: `sampling_factor` only relevant when `attention_type`=prob

* make style

* fixed U_part: added multiplication by `L_Q`

* fixed bug: remove `is not None` from `if config.distil`

* fixed test: `decoder_seq_length` to `encoder_seq_length` in cross_attentions check

* fix integration tests

* updated model hub

* do not shift as in training

* undo

* fix make-copies

* make fix-copies

* added `if prediction_length is None`

* changed `ProbSparseAttention` to `InformerProbSparseAttention`

* changed `V_sum` -> `v_mean_dim_time`

* changed `ConvLayer` to `InformerConvLayer` and fixed `super()`

* TimeSeriesTansformer->Informer in decoder's Copied from

* more descriptive in ProbSparse

* make style

* fix coped from

* Revert "added `if prediction_length is None`"

This reverts commit b4cbddf.

* fixed indent

* use InformerSinusoidalPositionalEmbedding

* make fix-style

* fix from #21860

* fix name

* make fix-copies

* use time series utils

* fix dec num_heads

* docstring

* added time series util doc

* _import_structure

* formatting

* changes from review

* make style

* fix docs

* fix doc

* removed NegativeLogLikelihood

---------

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update 1

* Update 2

* Update 3

* Update 4

* Update 5

* Update 6

* Update 7

* Update 8

* Update 9

* Update 10

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
@tileintel tileintel merged commit c9984da into abhiwand:main Mar 8, 2023
@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.