* doc: Unify training arg type annotations * wip: extracting enum type from Union * blackening
…) * trigger test failure * upload revision poc * Update src/transformers/pipelines/base.py Co-authored-by: Julien Chaumond <julien@huggingface.co> * up * add test * correct some stuff * Update src/transformers/pipelines/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct require flag Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* sharded conversion; add flag to control max hidden error * better hidden name matching * Add test: load TF from PT shards * fix test (PT data must be local)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
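A short usage sketch of what this enables (the model class and local path are illustrative assumptions):

```python
# Load a TF model from a *sharded* PyTorch checkpoint; the local directory is
# assumed to contain pytorch_model.bin.index.json plus its shard files.
from transformers import TFBertModel

tf_model = TFBertModel.from_pretrained("./local-sharded-pt-checkpoint", from_pt=True)
```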
* Add ONNX support for LayoutLMv3 * Update docstrings * Update empty description in docstring * Fix imports and type hints
* feat: add pipeline registry abstraction
- added `PipelineRegistry` abstraction
- updates `add_new_pipeline.mdx` (english docs) to reflect the api addition
- migrate `check_task` and `get_supported_tasks` from transformers/pipelines/__init__.py to transformers/pipelines/base.py#PipelineRegistry.{check_task,get_supported_tasks} (a minimal sketch follows below)
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
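A minimal sketch of what such a registry could look like (field names and signatures here are assumptions, not the exact code that landed in transformers/pipelines/base.py):

```python
from typing import Any, Dict, List, Tuple


class PipelineRegistry:
    def __init__(self, supported_tasks: Dict[str, Any], task_aliases: Dict[str, str]):
        self.supported_tasks = supported_tasks
        self.task_aliases = task_aliases

    def get_supported_tasks(self) -> List[str]:
        # Aliases count as supported task names too.
        return sorted(list(self.supported_tasks) + list(self.task_aliases))

    def check_task(self, task: str) -> Tuple[str, Dict[str, Any]]:
        # Resolve aliases first, mirroring the old check_task helper
        # from pipelines/__init__.py.
        task = self.task_aliases.get(task, task)
        if task not in self.supported_tasks:
            raise KeyError(
                f"Unknown task {task}, available tasks are {self.get_supported_tasks()}"
            )
        return task, self.supported_tasks[task]
```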
* fix: update with upstream/main
chore: Apply suggestions from sgugger's code review
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* chore: PR updates
- revert src/transformers/dependency_versions_table.py from upstream/main
- updates pipeline registry to use global variables
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* tests: add tests for pipeline registry
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* tests: add test for output warning.
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* chore: fmt and cleanup unused imports
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* fix: change imports to top of the file and address comments
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* skip some gpt_neox tests that require 80G RAM * remove tests * fix quality Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* fixing fsdp autowrap functionality * update version and quality * update torch version to latest stable version
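A hedged usage sketch (the option string follows the documented `fsdp` values for TrainingArguments; treat exact spellings as assumptions):

```python
# Hypothetical FSDP auto-wrap setup; launch with torchrun and a recent
# stable torch, as the version bump above requires.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    fsdp="full_shard auto_wrap",  # shard fully and auto-wrap submodules
)
```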
* add onnx support for BLOOM
* use TYPE_CHECKING for type annotations
* fix past_shape for bloom (different from gpt2)
* use logical_or instead of `+` for onnx support (a small sketch follows below)
* bigger `atol_for_validation` for larger bloom models
* copied -> taken because it's no longer an exact copy
* remove "copied from" comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
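The `logical_or` change above swaps bool-tensor addition for an explicit ONNX-friendly op; a tiny illustration:

```python
import torch

a = torch.tensor([True, False, True])
b = torch.tensor([False, False, True])

# `a + b` on bool tensors relies on implicit casting that ONNX export can
# mishandle; torch.logical_or maps to a well-defined ONNX operator.
combined = torch.logical_or(a, b)
```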
* [Flax] Add remat (gradient checkpointing) * fix variable naming in test * flip: checkpoint using a method * fix naming * fix class naming * apply PVP's suggestions from code review * make fix-copies * fix big-bird, electra, roberta * cookie-cutter * fix flax big-bird * move test to common
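A hedged sketch of the method-style API the notes describe ("flip: checkpoint using a method"); the method name is an assumption about the final PR:

```python
# Assumed API: turn on jax.remat-based gradient checkpointing on a Flax model,
# recomputing activations in the backward pass to save memory.
from transformers import FlaxBertModel

model = FlaxBertModel.from_pretrained("bert-base-uncased")
model.enable_gradient_checkpointing()
```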
* Copy inputs to train and test step before modifying them, as this breaks things
* Add XLA tests, fix our loss functions to be XLA-compatible (a sketch of the masking pattern follows below)
* make fixup
* Update loss computation test to expect vector of per-sample losses
* Patch loss for TFLED
* Patch loss for TFAlbert
* Add a tf_legacy_loss config flag that enables old loss functions
* Stop using config.get() because it's not a dict
* Skip loss computation test for RAG because its loss is very strange and I'm afraid to rewrite it
* make fixup
* Add XLA-compatible RAG loss
* Fix dtype of loss mask for TFAlbert
* Fix test for XLNet too because it overrides the default one
* make fixup
* Fix config test
* No more depending on GPU NaN behaviour
* Add test, avoid potential zero division
* Fix test item assignment
* Fix loss computation masking test
* make fixup
* Fix dtype bugs
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
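A minimal stand-in for the XLA-friendly masking pattern these fixes rely on (not the exact hf_compute_loss code): boolean-mask indexing creates dynamic shapes that XLA cannot compile, so ignored positions are zeroed out instead and shapes stay static.

```python
import tensorflow as tf

def xla_friendly_masked_loss(labels: tf.Tensor, logits: tf.Tensor) -> tf.Tensor:
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction=tf.keras.losses.Reduction.NONE
    )
    # -100 marks ignored positions; clamp them to a valid class index here,
    # then remove their contribution with the mask below.
    unmasked_loss = loss_fn(tf.nn.relu(labels), logits)
    mask = tf.cast(labels != -100, dtype=unmasked_loss.dtype)
    masked_loss = unmasked_loss * mask
    # Mean over active positions only; zero division is guarded upstream.
    return tf.reduce_sum(masked_loss) / tf.reduce_sum(mask)
```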
…17987) * Shifting labels for causal LM when using label smoother

When training a causal LM, the loss is computed within the model's forward() function and the labels are shifted internally. However, if label smoothing is applied, the loss is computed in the trainer's compute_loss function and the labels are not shifted, which misaligns the labels with their corresponding inputs. This commit resolves that misalignment. Resolves huggingface#17960

Changes committed: modified: src/transformers/trainer.py, modified: src/transformers/trainer_pt_utils.py

* Update trainer.py
* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
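A minimal sketch of the alignment fix (trainer plumbing elided; only the shift itself is shown):

```python
import torch

def shift_for_causal_lm(logits: torch.Tensor, labels: torch.Tensor):
    # The prediction at position i is scored against the token at position
    # i + 1, matching the shift the model's forward() applies internally.
    shift_logits = logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()
    return shift_logits, shift_labels
```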
…huggingface#17988) * Exclude Databricks from notebook env only if the runtime is below 11.0 * Dummy commit to trigger CI * Empty commits to trigger CI
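A hedged sketch of the gate (DATABRICKS_RUNTIME_VERSION is the standard env var Databricks sets; the parse below assumes a "major.minor" format):

```python
import os

def databricks_runtime_below_11() -> bool:
    # e.g. "10.4" -> 10 -> True, "11.0" -> 11 -> False
    dbr = os.environ.get("DATABRICKS_RUNTIME_VERSION")
    return dbr is not None and float(dbr.split(".")[0]) < 11
```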
* Rough TF conversion outline
* Tidy up
* Fix padding differences between layers
* Add back embedder - whoops
* Match test file to main
* Match upstream test file
* Correctly pass and assign image_size parameter (Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>)
* Add in MainLayer
* Correctly name layer
* Tidy up AdaptivePooler
* Small tidy-up: more accurate type hints and remove whitespace
* Change AdaptiveAvgPool: use the AdaptiveAvgPool implementation by @Rocketknight1, which correctly pools when the input shape is not evenly divisible by the output shape, c.f. https://github.com/huggingface/transformers/pull/17554/files/9e26607e22aa8d069c86b50196656012ff0ce62a#r900109509 (Co-authored-by: matt <rocketknight1@gmail.com>, Sayak Paul <spsayakpaul@gmail.com>)
* Use updated AdaptiveAvgPool (Co-authored-by: matt <rocketknight1@gmail.com>)
* Make AdaptiveAvgPool compatible with CPU
* Remove image_size from configuration
* Fixup
* Tensorflow -> TensorFlow
* Fix pt references in tests
* Apply suggestions from code review - grammar and wording (Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>)
* Add TFResNet to doc tests
* PR comments - GlobalAveragePooling and clearer comments
* Remove unused import
* Add in keepdims argument
* Add num_channels check
* grammar fix: by -> of (Co-authored-by: matt <rocketknight1@gmail.com>, Matt <Rocketknight1@users.noreply.github.com>)
* Remove transposes - keep NHWC throughout forward pass
* Fixup look sharp
* Add missing layer names
* Final tidy up - remove from_pt now weights are on the hub

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
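For the 1x1 output case the adaptive pool reduces to global average pooling, which matches the GlobalAveragePooling and keepdims bullets above; a small sketch:

```python
import tensorflow as tf

# keepdims=True preserves the 4D NHWC rank, matching PyTorch's
# AdaptiveAvgPool2d((1, 1)) output shape (N, C, 1, 1) up to layout.
pooler = tf.keras.layers.GlobalAveragePooling2D(keepdims=True)
features = tf.random.normal((2, 7, 7, 2048))  # NHWC feature map
pooled = pooler(features)                     # -> (2, 1, 1, 2048)
```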
…ace#17501) * Refactor to inherit from nn.Module instead of nn.ModuleList * Fix typo * Empty commit to trigger CI re-run

Blender Bot tests are failing (should be unrelated to this PR and pass locally). I don't have sufficient permissions to re-run the CI workflow (in full or from failed).
* Return scalar losses instead of per-sample means * Make loss shape (1,) instead of scalar * Allow scalar losses in test_loss_computation * Allow scalar losses in test_loss_computation * Allow scalar losses in test_loss_computation * Remove XLA loss function for RAG
Comparisons like `version.parse(torch.__version__) > version.parse("1.6")` are True for torch==1.6.0+cu101 or torch==1.6.0+cpu. Comparisons on `version.parse(version.parse(torch.__version__).base_version)` are preferred (and available in pytorch_utils.py).
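Concretely:

```python
from packaging import version
import torch  # suppose torch.__version__ == "1.6.0+cu101"

# Unreliable: the "+cu101" local segment participates in the comparison,
# so "1.6.0+cu101" > "1.6" is True while plain "1.6.0" > "1.6" is not.
naive = version.parse(torch.__version__) > version.parse("1.6")

# Preferred: base_version strips the local segment ("1.6.0+cu101" -> "1.6.0").
robust = version.parse(version.parse(torch.__version__).base_version) > version.parse("1.6")
```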
* Cleanup some code
* Improve signatures
* Try to reduce the number of reshape/copies
* I don't think we actually need the layer_num scaling trick
* No need for duplication
* Try to fix beam_search
* Fix beam search
* Removing layer num normalization seems to be breaking
* Not sure self.layer_number normalization actually matters
* Try and be backward compatible
* Try to fix beam_search
* Revert attempt to be backward compatible
* Improve documentation on past_key_values format
* Optimize the device allocation in case of hidden_states on multiple devices
* No need to manually cast the values to a specific device
* Rename with long version of variables
* Improve type hinting
* Add comment that explains that some methods return views
* Actually I think the attention casting only makes sense when we use torch.float16
* We don't actually need layer_number to be passed anymore
* Fix FX test
* Bypass torch.baddbmm
* Apply suggestions from code review
* Add comment about support for TorchScript v1.11
* fix ONNX support for bloom (huggingface#18456)

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
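For context on the `torch.baddbmm` commits above: `baddbmm` computes `beta * input + alpha * (batch1 @ batch2)` in one call, which is one way to form attention scores with an additive bias such as ALiBi. A minimal sketch with illustrative shapes:

```python
import torch

batch_heads, q_len, kv_len, head_dim = 8, 5, 5, 16
query = torch.randn(batch_heads, q_len, head_dim)
key_t = torch.randn(batch_heads, head_dim, kv_len)  # keys pre-transposed
alibi = torch.zeros(batch_heads, q_len, kv_len)     # additive bias term

# scores = 1.0 * alibi + (1/sqrt(head_dim)) * (query @ key_t)
scores = torch.baddbmm(alibi, query, key_t, beta=1.0, alpha=1.0 / head_dim**0.5)
```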
…te for tf serving (huggingface#18372) * change shape to support dynamic batch input in tf.generate * add tests Co-authored-by: nlpcatcode <nlpcodecat@gmail.com>
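A hedged serving sketch of what dynamic-batch `generate` enables (the model choice and export layout are illustrative assumptions):

```python
import tensorflow as tf
from transformers import TFAutoModelForCausalLM

model = TFAutoModelForCausalLM.from_pretrained("gpt2")

# None batch/sequence dims: the exported signature accepts any batch size.
@tf.function(input_signature=[tf.TensorSpec((None, None), tf.int32, name="input_ids")])
def serving(input_ids):
    return model.generate(input_ids)

tf.saved_model.save(model, "gpt2_serving", signatures={"serving_default": serving})
```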
…e#18457) * Enable HFTracer to trace with custom dummy inputs instead of pre-computed ones * Add HFTracer.trace docstring, and make it possible to handle callable and torch.nn.Module in general * Remove pdb comment * Apply suggestions
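A usage sketch, assuming the `dummy_inputs` argument matches the final `HFTracer.trace` API:

```python
import torch
from transformers import BertForSequenceClassification
from transformers.utils.fx import HFTracer

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

tracer = HFTracer()
# Trace with custom dummy inputs instead of the pre-computed ones.
dummy_inputs = {"input_ids": torch.zeros((2, 16), dtype=torch.long)}
graph = tracer.trace(model, dummy_inputs=dummy_inputs)
traced = torch.fx.GraphModule(model, graph)
```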
* swag_no_trainer updated with gather_metrics * Removed unused variable samples_seen
* First draft
* Add VideoMAEForVideoClassification
* Improve conversion script
* Add VideoMAEForPreTraining
* Add VideoMAEFeatureExtractor
* Improve VideoMAEFeatureExtractor
* Improve docs
* Add first draft of model tests
* Improve VideoMAEForPreTraining
* Fix base_model_prefix
* Make model take pixel_values of shape (B, T, C, H, W)
* Add loss computation of VideoMAEForPreTraining
* Improve tests
* Improve model tests
* Make all tests pass
* Add VideoMAE to main README
* Add tests for VideoMAEFeatureExtractor
* Add integration test
* Improve conversion script
* Rename patch embedding class
* Remove VideoMAELayer from init
* Update design of patch embeddings
* Improve comments
* Improve conversion script
* Improve conversion script
* Add conversion of pretrained model
* Add loss verification of pretrained model
* Add loss verification of unnormalized targets
* Add integration test for pretraining model
* Apply suggestions from code review
* Fix bug to make feature extractor resize only shorter edge
* Address more comments
* Improve normalization of videos
* Add doc examples
* Move constants to dedicated script
* Remove scripts
* Transfer checkpoints, fix docs
* Update script
* Update image mean and std
* Fix doc tests
* Set return_tensors to NumPy by default
* Revert the previous change

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
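An illustrative usage sketch; the checkpoint name is a placeholder and the (B, T, C, H, W) layout is taken from the notes above:

```python
import torch
from transformers import VideoMAEForVideoClassification

model = VideoMAEForVideoClassification.from_pretrained("MCG-NJU/videomae-base")  # placeholder checkpoint
pixel_values = torch.randn(1, 16, 3, 224, 224)  # (B, T, C, H, W): 16 RGB frames of 224x224
with torch.no_grad():
    outputs = model(pixel_values=pixel_values)
print(outputs.logits.shape)
```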
…ce#18459) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
…ace#18474) * swag_no_trainer updated with gather_metrics * Removed unused variable samples_seen * updated examples with gather_for_metrics
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
…e pipeline (huggingface#18392)
* Adding a better error message when the model is improperly configured within transformers.
* Update src/transformers/pipelines/__init__.py
* Black version.
* Overriding task aliases so that tokenizer+feature_extractor values are correct.
* Fixing task aliases by overriding their names early
* X.
* Fixing feature-extraction.
* black again.
* Normalizing `translation` too.
* Fixing last few corner cases. Translation needs to use its non-normalized name (translation_XX_to_YY) so that the task_specific_params are correctly overloaded; this can be removed and cleaned up in a later PR. `speech-encode-decoder` actually REQUIRES a `tokenizer` to be passed manually, so the error needs to be discarded when the `tokenizer` is already there.
* doc-builder fix.
* Fixing the real issue.
* Removing dead code.
* Do not import the actual config classes.
…ble weight (huggingface#18226) Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>
…#18352) * Refactor `TFSwinLayer` to increase serving compatibility Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Fix missed parameters while refactoring Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr> * Fix window_reverse to calculate batch size Signed-off-by: Seunghwan Hong <harrydrippin@gmail.com> Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
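A sketch of the batch-size fix described above: derive the batch size with `tf.shape` at runtime so `window_reverse` also works under serving signatures with a dynamic (None) batch dimension (NHWC layout assumed):

```python
import tensorflow as tf

def window_reverse(windows, window_size, height, width):
    # windows: (batch * num_windows, window_size, window_size, channels)
    num_windows = (height // window_size) * (width // window_size)
    # tf.shape yields a runtime value, unlike the static .shape attribute,
    # so a None batch dimension from a serving signature still works.
    batch_size = tf.shape(windows)[0] // num_windows
    x = tf.reshape(
        windows,
        (batch_size, height // window_size, width // window_size, window_size, window_size, -1),
    )
    x = tf.transpose(x, (0, 1, 3, 2, 4, 5))
    return tf.reshape(x, (batch_size, height, width, -1))
```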
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
This is necessary to allow for casting our images/videos to numpy arrays within the feature extractors' call. We want to do this to make sure the behaviour is as expected when flags such as the `do_xxx` options are False: if some transformations aren't applied, the output type could otherwise be unexpected, e.g. a list of PIL images instead of numpy arrays.
…th different configs
…st-before-normalize-videomae
The documentation is not available anymore as the PR was closed or merged.
@alaradirik @NielsRogge This is the example of the refactor to cast to numpy arrays before normalization that covers the most changes across the models. Almost all of the changes to the models' feature extractors are the same. Once this has been reviewed and approved, I'll merge all model changes into a single branch for a final review before merging.
amyeroberts pushed a commit that referenced this pull request on Mar 15, 2024:
* Cohere Model Release (#1) - Cohere Model Release
* Remove unnecessary files and code (#2) - Some cleanup
* Delete cohere-model directory (#3)
* Make Fix (#5)
* Pr fixes (#6)
* fixes for pr
* pr fixes for the format
* src/transformers/models/auto/tokenization_auto.py
* Tokenizer test (#8)
* tokenizer test
* format fix
* Adding Docs and other minor changes (#7)
* Add modeling tests (#9)
* Smol Fix (#11)
* tokenization tests are fixed
* format fixes
* fix pr doc tests
* fix pr style check
* small changes in cohere.md
* FIX: Address final comments for transformers integration (#13)
* fix modeling final nits and add proper test file
* for now leave empty tests
* add integration test
* push new test
* fix modeling cohere (#14)
* Update chat templates to use the new API (#15)

Co-authored-by: ahmetustun <ahmetustun89@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
What does this PR do?
Modifies the feature extractor call to make the inputs to `BatchFeature` always numpy arrays. This ensures the type of the returned `pixel_values` matches those requested with `return_tensors`. The conversion to numpy arrays happens before normalization to ensure consistent rescaling is done on the inputs.

This solution means the `return_tensors=None` default can stay and that the behaviour is as expected for any combination of the `do_xxx` flag values.
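A minimal sketch of the idea with a hypothetical helper (not the actual feature extractor code): cast first, then apply whichever transforms are enabled.

```python
import numpy as np
from PIL import Image

def extract(images, do_normalize=True, image_mean=0.5, image_std=0.5):
    # Cast to numpy *before* any transform, so the values handed to
    # BatchFeature have one consistent type for any combination of do_xxx flags.
    images = [np.asarray(img, dtype=np.float32) for img in images]
    if do_normalize:
        images = [(img / 255.0 - image_mean) / image_std for img in images]
    return {"pixel_values": images}

batch = extract([Image.new("RGB", (4, 4))], do_normalize=False)
print(type(batch["pixel_values"][0]))  # numpy.ndarray, even with do_normalize=False
```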
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.