
Type cast before normalize videomae#9

Closed
amyeroberts wants to merge 335 commits into main from type-cast-before-normalize-videomae

Conversation

@amyeroberts
Owner

What does this PR do?

Modifies the feature extractor call so that the inputs to BatchFeature are always numpy arrays. This ensures the type of the returned pixel_values matches the type requested with return_tensors. The conversion to numpy arrays happens before normalization, so rescaling is applied consistently to the inputs.

This solution means the return_tensors=None default can stay and that the behaviour is as expected for any combination of the do_xxx flag values.
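
Below is a minimal, illustrative sketch of the pattern, not the actual feature extractor internals; the helper name and normalization constants are placeholders:

```python
import numpy as np
from PIL import Image

IMAGE_MEAN, IMAGE_STD = 0.5, 0.5  # placeholder normalization constants

def extract(frames, do_resize=True, do_normalize=True):
    if do_resize:
        frames = [frame.resize((224, 224)) for frame in frames]
    # Cast to numpy *before* normalization: every code path now hands
    # numpy arrays to BatchFeature, whatever the do_xxx flags are.
    arrays = [np.asarray(frame, dtype=np.float32) / 255.0 for frame in frames]
    if do_normalize:
        arrays = [(arr - IMAGE_MEAN) / IMAGE_STD for arr in arrays]
    # Always numpy here, so conversion to the type requested with
    # return_tensors (or None, the default) is well defined.
    return arrays

pixel_values = extract([Image.new("RGB", (320, 240)) for _ in range(4)])
```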

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

zphang and others added 30 commits June 30, 2022 08:47
)

* Fix GPT-NeoX-20B past handling, swap attention computation to hopefully avoid NaN, update docs

* 20B tests
* doc: Unify training arg type annotations

* wip: extracting enum type from Union

* blackening
)

* trigger test failure

* upload revision poc

* Update src/transformers/pipelines/base.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* up

* add test

* correct some stuff

* Update src/transformers/pipelines/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* correct require flag

Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* sharded conversion; add flag to control max hidden error

* better hidden name matching

* Add test: load TF from PT shards

* fix test (PT data must be local)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Add ONNX support for LayoutLMv3

* Update docstrings

* Update empty description in docstring

* Fix imports and type hints
* feat: add pipeline registry abstraction

- added `PipelineRegistry` abstraction
- updates `add_new_pipeline.mdx` (english docs) to reflect the api addition
- migrate `check_task` and `get_supported_tasks` from
  transformers/pipelines/__init__.py to
  transformers/pipelines/base.py#PipelineRegistry.{check_task,get_supported_tasks}

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* fix: update with upstream/main

chore: Apply suggestions from sgugger's code review

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* chore: PR updates

- revert src/transformers/dependency_versions_table.py from upstream/main
- updates pipeline registry to use global variables

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* tests: add tests for pipeline registry

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* tests: add test for output warning.

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* chore: fmt and cleanup unused imports

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* fix: change imports to top of the file and address comments

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* skip some gpt_neox tests that require 80G RAM

* remove tests

* fix quality

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* fixing fsdp autowrap functionality

* update version and quality

* update torch version to latest stable version
* add onnx support for BLOOM

* use TYPE_CHECKING for type annotations

* fix past_shape for bloom (different from gpt2)

* use logical_or instead of `+` for onnx support

* bigger `atol_for_validation` for larger bloom models

* copied -> taken because it's no longer an exact copy

* remove "copied from" comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* [Flax] Add remat (gradient checkpointing)

* fix variable naming in test

* flip: checkpoint using a method

* fix naming

* fix class naming

* apply PVP's suggestions from code review

* make fix-copies

* fix big-bird, electra, roberta

* cookie-cutter

* fix flax big-bird

* move test to common
* Copy inputs to train and test step before modifying them, as modifying them in place breaks things

* Add XLA tests, fix our loss functions to be XLA-compatible

* make fixup

* Update loss computation test to expect vector of per-sample losses

* Patch loss for TFLED

* Patch loss for TFAlbert

* Add a tf_legacy_loss config flag that enables old loss functions

* Stop using config.get() because it's not a dict

* Skip loss computation test for RAG because its loss is very strange and I'm afraid to rewrite it

* make fixup

* Add XLA-compatible RAG loss

* Fix dtype of loss mask for TFAlbert

* Fix test for XLNet too because it overrides the default one

* make fixup

* Fix config test

* No more depending on GPU NaN behaviour

* Add test, avoid potential zero division

* Fix test item assignment

* Fix loss computation masking test

* make fixup

* Fix dtype bugs
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
…17987)

* Shifting labels for causal LM when using label smoother

When training a causal LM, the loss is computed within the model's forward() function and the labels are shifted internally. However, if label smoothing is applied, the loss is computed in the trainer's compute_loss function and the labels are not shifted. This misaligns the labels with their corresponding inputs. This commit resolves that misalignment.

Resolves huggingface#17960

On branch shift_labels_for_causalLM
Changes to be committed:
	modified:   src/transformers/trainer.py
	modified:   src/transformers/trainer_pt_utils.py
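
A minimal sketch of the shift described above (illustrative, not the exact trainer code):

```python
import torch
import torch.nn.functional as F

def shifted_causal_lm_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # Logits at position t predict the token at position t + 1, so shift
    # the labels left by one before aligning them with the logits.
    shift_logits = logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )
```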

* Update trainer.py

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
huggingface#17988)

* Exclude Databricks from notebook env only if the runtime is below 11.0

* Dummy commit to trigger CI

* Empty commit to trigger CI

* Empty commit to trigger CI

* Empty commit to trigger CI

* Empty commit to trigger CI

* Empty commit to trigger CI

* Empty commit to trigger CI

* Empty commit to trigger CI
* Rough TF conversion outline

* Tidy up

* Fix padding differences between layers

* Add back embedder - whoops

* Match test file to main

* Match upstream test file

* Correctly pass and assign image_size parameter

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Add in MainLayer

* Correctly name layer

* Tidy up AdaptivePooler

* Small tidy-up

More accurate type hints and remove whitespaces

* Change AdaptiveAvgPool

Use the AdaptiveAvgPool implementation by @Rocketknight1, which correctly pools when the input shape is not evenly divisible by the output shape (a sketch follows after this commit message), c.f. https://github.com/huggingface/transformers/pull/17554/files/9e26607e22aa8d069c86b50196656012ff0ce62a#r900109509

Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
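
A numpy sketch of adaptive average pooling over one dimension, illustrating the uneven-bin case the commit above refers to (bin edges follow torch.nn.AdaptiveAvgPool1d; this is not the PR's TF code):

```python
import numpy as np

def adaptive_avg_pool_1d(x: np.ndarray, output_size: int) -> np.ndarray:
    input_size = x.shape[-1]
    out = np.empty(x.shape[:-1] + (output_size,), dtype=x.dtype)
    for i in range(output_size):
        # Bin edges are floor(i * in / out) and ceil((i + 1) * in / out),
        # so bins may overlap when input_size % output_size != 0.
        start = (i * input_size) // output_size
        end = -(-((i + 1) * input_size) // output_size)  # ceil division
        out[..., i] = x[..., start:end].mean(axis=-1)
    return out

print(adaptive_avg_pool_1d(np.arange(10.0), 3))  # [1.5 4.5 7.5]
```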

* Use updated AdaptiveAvgPool

Co-authored-by: matt <rocketknight1@gmail.com>

* Make AdaptiveAvgPool compatible with CPU

* Remove image_size from configuration

* Fixup

* Tensorflow -> TensorFlow

* Fix pt references in tests

* Apply suggestions from code review - grammar and wording

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add TFResNet to doc tests

* PR comments - GlobalAveragePooling and clearer comments

* Remove unused import

* Add in keepdims argument

* Add num_channels check

* grammar fix: by -> of

Co-authored-by: matt <rocketknight1@gmail.com>

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Remove transposes - keep NHWC throughout forward pass

* Fixup look sharp

* Add missing layer names

* Final tidy up - remove from_pt now that weights are on the hub

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
…ace#17501)

* Refactor to inherit from nn.Module instead of nn.ModuleList

* Fix typo

* Empty to trigger CI re-run

Blender Bot tests are failing (should be unrelated to this PR; they pass locally). I don't have sufficient permissions to re-run the CI workflow (fully or from failed).
* Return scalar losses instead of per-sample means

* Make loss shape (1,) instead of scalar

* Allow scalar losses in test_loss_computation

* Allow scalar losses in test_loss_computation

* Allow scalar losses in test_loss_computation

* Remove XLA loss function for RAG
LSinev and others added 21 commits August 3, 2022 13:37
Comparisons like
version.parse(torch.__version__) > version.parse("1.6")
are True for torch==1.6.0+cu101 or torch==1.6.0+cpu.

version.parse(version.parse(torch.__version__).base_version) is preferred (and available in pytorch_utils.py).
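
A minimal sketch of the preferred check (assuming the packaging library, which transformers already depends on):

```python
from packaging import version
import torch

# base_version strips local identifiers such as "+cu101" or "+cpu",
# so torch==1.6.0+cpu compares as 1.6.0 rather than as greater than 1.6.
torch_version = version.parse(version.parse(torch.__version__).base_version)
is_torch_greater_than_1_6 = torch_version > version.parse("1.6")
```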
* Cleanup some code

* Improve signatures

* Try to reduce the number of reshape/copies

* I don't think we actually need the layer_num scaling trick

* No need for duplication

* Try to fix beam_search

* Fix beam search

* Removing layer num normalization seems to be breaking

* Not sure self.layer_number normalization actually matters

* Try and be backward compatible

* Try to fix beam_search

* Revert attempt to be backward compatible

* Improve documentation on past_key_values format

* Optimize the device allocation in case of hidden_states in multiple devices

* No need to manually cast the values to a specific device

* Rename with long version of variables

* Improve type hinting

* Add comment that explains that some methods return views

* Actually I think the attention casting only makes sense when we use torch.float16

* We don't actually need layer_number to be passed anymore

* Fix FX test

* Bypass torch.baddbmm

* Apply suggestions from code review

* Add comment about support for TorchScript v1.11

* fix ONNX support for bloom (huggingface#18456)

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
…te for tf serving (huggingface#18372)

* change shape to support dynamic batch input in tf.generate

* add tests

Co-authored-by: nlpcatcode <nlpcodecat@gmail.com>
…e#18457)

* Enable HFTracer to trace with custom dummy inputs instead of pre-computed ones

* Add HFTracer.trace docstring, and make it possible to handle callable and torch.nn.Module in general

* Remove pdb comment

* Apply suggestions
* swag_no_trainer updated to use gather_for_metrics

* Removed unused variable samples_seen
* First draft

* Add VideoMAEForVideoClassification

* Improve conversion script

* Add VideoMAEForPreTraining

* Add VideoMAEFeatureExtractor

* Improve VideoMAEFeatureExtractor

* Improve docs

* Add first draft of model tests

* Improve VideoMAEForPreTraining

* Fix base_model_prefix

* Make model take pixel_values of shape (B, T, C, H, W)

* Add loss computation of VideoMAEForPreTraining

* Improve tests

* Improve model tests

* Make all tests pass

* Add VideoMAE to main README

* Add tests for VideoMAEFeatureExtractor

* Add integration test

* Improve conversion script

* Rename patch embedding class

* Remove VideoMAELayer from init

* Update design of patch embeddings

* Improve comments

* Improve conversion script

* Improve conversion script

* Add conversion of pretrained model

* Add loss verification of pretrained model

* Add loss verification of unnormalized targets

* Add integration test for pretraining model

* Apply suggestions from code review

* Fix bug to make feature extractor resize only shorter edge

* Address more comments

* Improve normalization of videos

* Add doc examples

* Move constants to dedicated script

* Remove scripts

* Transfer checkpoints, fix docs

* Update script

* Update image mean and std

* Fix doc tests

* Set return_tensors to NumPy by default

* Revert the previous change

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
…ce#18459)

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
…ace#18474)

* swag_no_trainer updated to use gather_for_metrics

* Removed unused variable samples_seen

* updated examples with gather_for_metrics
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
…e pipeline (huggingface#18392)

* Adding a better error message when the model is improperly configured within transformers.

* Update src/transformers/pipelines/__init__.py

* Black version.

* Overriding task aliases so that tokenizer+feature_extractor values are correct.

* Fixing task aliases by overriding their names early

* X.

* Fixing feature-extraction.

* black again.

* Normalizing `translation` too.

* Fixing last few corner cases.

translation needs to use its non-normalized name (translation_XX_to_YY), so that the task_specific_params are correctly overloaded. This can be removed and cleaned up in a later PR.

`speech-encode-decoder` actually REQUIRES a `tokenizer` to be passed manually, so the error needs to be discarded when the `tokenizer` is already there.

* doc-builder fix.

* Fixing the real issue.

* Removing dead code.

* Do not import the actual config classes.
…ble weight (huggingface#18226)

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>
…#18352)

* Refactor `TFSwinLayer` to increase serving compatibility

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>

* Fix missed parameters while refactoring

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>

* Fix window_reverse to calculate batch size

Signed-off-by: Seunghwan Hong <harrydrippin@gmail.com>
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
This is necessary to allow casting our images / videos to numpy arrays within the feature extractors' call. We want to do this to make sure the behaviour is as expected when do_xxx flags are False. If some transformations aren't applied, the output type could otherwise be unexpected, e.g. a list of PIL images instead of numpy arrays.
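
A hedged usage sketch of the guarantee (the flag names follow the do_xxx convention above and may not match the extractor's exact signature):

```python
import numpy as np
from PIL import Image
from transformers import VideoMAEFeatureExtractor

feature_extractor = VideoMAEFeatureExtractor()
video = [Image.new("RGB", (320, 240)) for _ in range(16)]

# Even with transformations switched off, inputs are cast to numpy arrays
# before being wrapped in BatchFeature, so the output type stays predictable.
outputs = feature_extractor(video, do_resize=False, do_normalize=False, return_tensors=None)
assert isinstance(outputs["pixel_values"][0], np.ndarray)
```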
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Aug 5, 2022

The documentation is not available anymore as the PR was closed or merged.

@amyeroberts
Owner Author

@alaradirik @NielsRogge This is the example of the refactor to cast to numpy arrays before normalization that covers the most changes across the models. Almost all of the changes to the models' feature extractors are the same. Once this has been reviewed and approved, I'll merge all model changes into a single branch for a final review before merging.

@amyeroberts amyeroberts changed the base branch from type-cast-before-normalize-update-methods to main August 19, 2022 10:53
amyeroberts pushed a commit that referenced this pull request Mar 15, 2024
* Cohere Model Release (#1)

Cohere Model Release

* Remove unnecessary files and code (#2)

Some cleanup

* Delete cohere-model directory (#3)

* Make Fix (#5)

* Pr fixes (#6)

* fixes for pr

* pr fixes for the format

* pr fixes for the format

* src/transformers/models/auto/tokenization_auto.py

* Tokenizer test (#8)

* tokenizer test

* format fix

* Adding Docs and other minor changes (#7)

* Add modeling tests (#9)

* Smol Fix (#11)

* tokenization tests are fixed

* format fixes

* fix pr doc tests

* fix pr doc tests

* fix pr doc tests

* fix pr style check

* small changes in cohere.md

* FIX: Address final comments for transformers integration (#13)

* fix modeling final nits and add proper test file

* for now leave empty tests

* add integration test

* push new test

* fix modeling cohere (#14)

* Update chat templates to use the new API (#15)

---------

Co-authored-by: ahmetustun <ahmetustun89@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>