diff --git a/.buildinfo b/.buildinfo new file mode 100644 index 0000000000..24b4cea41c --- /dev/null +++ b/.buildinfo @@ -0,0 +1,4 @@ +# Sphinx build info version 1 +# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done. +config: 45af2c3d9f73cbab9591dc1959d3bff6 +tags: 645f666f9bcd5a90fca523b33c5a78b7 diff --git a/.doctrees/adapter_composition.doctree b/.doctrees/adapter_composition.doctree new file mode 100644 index 0000000000..9ee4d325e4 Binary files /dev/null and b/.doctrees/adapter_composition.doctree differ diff --git a/.doctrees/classes/adapter_config.doctree b/.doctrees/classes/adapter_config.doctree new file mode 100644 index 0000000000..7f1861306b Binary files /dev/null and b/.doctrees/classes/adapter_config.doctree differ diff --git a/.doctrees/classes/adapter_layer.doctree b/.doctrees/classes/adapter_layer.doctree new file mode 100644 index 0000000000..32959c7835 Binary files /dev/null and b/.doctrees/classes/adapter_layer.doctree differ diff --git a/.doctrees/classes/adapter_training.doctree b/.doctrees/classes/adapter_training.doctree new file mode 100644 index 0000000000..0090cf9703 Binary files /dev/null and b/.doctrees/classes/adapter_training.doctree differ diff --git a/.doctrees/classes/adapter_utils.doctree b/.doctrees/classes/adapter_utils.doctree new file mode 100644 index 0000000000..4a0f785338 Binary files /dev/null and b/.doctrees/classes/adapter_utils.doctree differ diff --git a/.doctrees/classes/model_adapters_config.doctree b/.doctrees/classes/model_adapters_config.doctree new file mode 100644 index 0000000000..0d33dc1e84 Binary files /dev/null and b/.doctrees/classes/model_adapters_config.doctree differ diff --git a/.doctrees/classes/model_mixins.doctree b/.doctrees/classes/model_mixins.doctree new file mode 100644 index 0000000000..79766c3bdf Binary files /dev/null and b/.doctrees/classes/model_mixins.doctree differ diff --git a/.doctrees/classes/models/albert.doctree b/.doctrees/classes/models/albert.doctree new file mode 100644 index 0000000000..7c5c0dcc02 Binary files /dev/null and b/.doctrees/classes/models/albert.doctree differ diff --git a/.doctrees/classes/models/auto.doctree b/.doctrees/classes/models/auto.doctree new file mode 100644 index 0000000000..35a95ee5ec Binary files /dev/null and b/.doctrees/classes/models/auto.doctree differ diff --git a/.doctrees/classes/models/bart.doctree b/.doctrees/classes/models/bart.doctree new file mode 100644 index 0000000000..cbb30702cc Binary files /dev/null and b/.doctrees/classes/models/bart.doctree differ diff --git a/.doctrees/classes/models/beit.doctree b/.doctrees/classes/models/beit.doctree new file mode 100644 index 0000000000..a2b38f3f94 Binary files /dev/null and b/.doctrees/classes/models/beit.doctree differ diff --git a/.doctrees/classes/models/bert-generation.doctree b/.doctrees/classes/models/bert-generation.doctree new file mode 100644 index 0000000000..47b628fc4b Binary files /dev/null and b/.doctrees/classes/models/bert-generation.doctree differ diff --git a/.doctrees/classes/models/bert.doctree b/.doctrees/classes/models/bert.doctree new file mode 100644 index 0000000000..116a498575 Binary files /dev/null and b/.doctrees/classes/models/bert.doctree differ diff --git a/.doctrees/classes/models/clip.doctree b/.doctrees/classes/models/clip.doctree new file mode 100644 index 0000000000..b656084cf9 Binary files /dev/null and b/.doctrees/classes/models/clip.doctree differ diff --git a/.doctrees/classes/models/deberta.doctree 
b/.doctrees/classes/models/deberta.doctree new file mode 100644 index 0000000000..8ae6ff57de Binary files /dev/null and b/.doctrees/classes/models/deberta.doctree differ diff --git a/.doctrees/classes/models/deberta_v2.doctree b/.doctrees/classes/models/deberta_v2.doctree new file mode 100644 index 0000000000..ed9696286b Binary files /dev/null and b/.doctrees/classes/models/deberta_v2.doctree differ diff --git a/.doctrees/classes/models/distilbert.doctree b/.doctrees/classes/models/distilbert.doctree new file mode 100644 index 0000000000..1790786ea0 Binary files /dev/null and b/.doctrees/classes/models/distilbert.doctree differ diff --git a/.doctrees/classes/models/electra.doctree b/.doctrees/classes/models/electra.doctree new file mode 100644 index 0000000000..3a88984117 Binary files /dev/null and b/.doctrees/classes/models/electra.doctree differ diff --git a/.doctrees/classes/models/encoderdecoder.doctree b/.doctrees/classes/models/encoderdecoder.doctree new file mode 100644 index 0000000000..2bf389efee Binary files /dev/null and b/.doctrees/classes/models/encoderdecoder.doctree differ diff --git a/.doctrees/classes/models/gpt2.doctree b/.doctrees/classes/models/gpt2.doctree new file mode 100644 index 0000000000..1f96937364 Binary files /dev/null and b/.doctrees/classes/models/gpt2.doctree differ diff --git a/.doctrees/classes/models/gptj.doctree b/.doctrees/classes/models/gptj.doctree new file mode 100644 index 0000000000..c5fc536e58 Binary files /dev/null and b/.doctrees/classes/models/gptj.doctree differ diff --git a/.doctrees/classes/models/llama.doctree b/.doctrees/classes/models/llama.doctree new file mode 100644 index 0000000000..9d3a32eb58 Binary files /dev/null and b/.doctrees/classes/models/llama.doctree differ diff --git a/.doctrees/classes/models/mbart.doctree b/.doctrees/classes/models/mbart.doctree new file mode 100644 index 0000000000..b2ea1deab1 Binary files /dev/null and b/.doctrees/classes/models/mbart.doctree differ diff --git a/.doctrees/classes/models/mt5.doctree b/.doctrees/classes/models/mt5.doctree new file mode 100644 index 0000000000..b5eba461f5 Binary files /dev/null and b/.doctrees/classes/models/mt5.doctree differ diff --git a/.doctrees/classes/models/roberta.doctree b/.doctrees/classes/models/roberta.doctree new file mode 100644 index 0000000000..063aea598c Binary files /dev/null and b/.doctrees/classes/models/roberta.doctree differ diff --git a/.doctrees/classes/models/t5.doctree b/.doctrees/classes/models/t5.doctree new file mode 100644 index 0000000000..15f70c1c59 Binary files /dev/null and b/.doctrees/classes/models/t5.doctree differ diff --git a/.doctrees/classes/models/vit.doctree b/.doctrees/classes/models/vit.doctree new file mode 100644 index 0000000000..9265f90980 Binary files /dev/null and b/.doctrees/classes/models/vit.doctree differ diff --git a/.doctrees/classes/models/xlmroberta.doctree b/.doctrees/classes/models/xlmroberta.doctree new file mode 100644 index 0000000000..7aef5cb137 Binary files /dev/null and b/.doctrees/classes/models/xlmroberta.doctree differ diff --git a/.doctrees/classes/models/xmod.doctree b/.doctrees/classes/models/xmod.doctree new file mode 100644 index 0000000000..992c07e915 Binary files /dev/null and b/.doctrees/classes/models/xmod.doctree differ diff --git a/.doctrees/contributing.doctree b/.doctrees/contributing.doctree new file mode 100644 index 0000000000..2e4ca86224 Binary files /dev/null and b/.doctrees/contributing.doctree differ diff --git a/.doctrees/contributing/adding_adapter_methods.doctree 
b/.doctrees/contributing/adding_adapter_methods.doctree new file mode 100644 index 0000000000..6359d28b44 Binary files /dev/null and b/.doctrees/contributing/adding_adapter_methods.doctree differ diff --git a/.doctrees/contributing/adding_adapters_to_a_model.doctree b/.doctrees/contributing/adding_adapters_to_a_model.doctree new file mode 100644 index 0000000000..1527f09c6c Binary files /dev/null and b/.doctrees/contributing/adding_adapters_to_a_model.doctree differ diff --git a/.doctrees/embeddings.doctree b/.doctrees/embeddings.doctree new file mode 100644 index 0000000000..f2967aef78 Binary files /dev/null and b/.doctrees/embeddings.doctree differ diff --git a/.doctrees/environment.pickle b/.doctrees/environment.pickle new file mode 100644 index 0000000000..22346f08d9 Binary files /dev/null and b/.doctrees/environment.pickle differ diff --git a/.doctrees/extending.doctree b/.doctrees/extending.doctree new file mode 100644 index 0000000000..b92cc645ba Binary files /dev/null and b/.doctrees/extending.doctree differ diff --git a/.doctrees/hub_contributing.doctree b/.doctrees/hub_contributing.doctree new file mode 100644 index 0000000000..6df2f2aa11 Binary files /dev/null and b/.doctrees/hub_contributing.doctree differ diff --git a/.doctrees/huggingface_hub.doctree b/.doctrees/huggingface_hub.doctree new file mode 100644 index 0000000000..9ec2324526 Binary files /dev/null and b/.doctrees/huggingface_hub.doctree differ diff --git a/.doctrees/index.doctree b/.doctrees/index.doctree new file mode 100644 index 0000000000..3b5d199ef9 Binary files /dev/null and b/.doctrees/index.doctree differ diff --git a/.doctrees/installation.doctree b/.doctrees/installation.doctree new file mode 100644 index 0000000000..40f67148c3 Binary files /dev/null and b/.doctrees/installation.doctree differ diff --git a/.doctrees/loading.doctree b/.doctrees/loading.doctree new file mode 100644 index 0000000000..930ebb2202 Binary files /dev/null and b/.doctrees/loading.doctree differ diff --git a/.doctrees/method_combinations.doctree b/.doctrees/method_combinations.doctree new file mode 100644 index 0000000000..fbea1fb73d Binary files /dev/null and b/.doctrees/method_combinations.doctree differ diff --git a/.doctrees/methods.doctree b/.doctrees/methods.doctree new file mode 100644 index 0000000000..a356a6715b Binary files /dev/null and b/.doctrees/methods.doctree differ diff --git a/.doctrees/model_overview.doctree b/.doctrees/model_overview.doctree new file mode 100644 index 0000000000..1c58e3180e Binary files /dev/null and b/.doctrees/model_overview.doctree differ diff --git a/.doctrees/overview.doctree b/.doctrees/overview.doctree new file mode 100644 index 0000000000..9594ef656a Binary files /dev/null and b/.doctrees/overview.doctree differ diff --git a/.doctrees/prediction_heads.doctree b/.doctrees/prediction_heads.doctree new file mode 100644 index 0000000000..248f033036 Binary files /dev/null and b/.doctrees/prediction_heads.doctree differ diff --git a/.doctrees/quickstart.doctree b/.doctrees/quickstart.doctree new file mode 100644 index 0000000000..9396f494b2 Binary files /dev/null and b/.doctrees/quickstart.doctree differ diff --git a/.doctrees/training.doctree b/.doctrees/training.doctree new file mode 100644 index 0000000000..6314a4190c Binary files /dev/null and b/.doctrees/training.doctree differ diff --git a/.doctrees/transitioning.doctree b/.doctrees/transitioning.doctree new file mode 100644 index 0000000000..aac39e8c62 Binary files /dev/null and b/.doctrees/transitioning.doctree differ diff --git 
a/.nojekyll b/.nojekyll
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/CNAME b/CNAME
new file mode 100644
index 0000000000..83a410743b
--- /dev/null
+++ b/CNAME
@@ -0,0 +1 @@
+docs.adapterhub.ml
diff --git a/_images/Fusion.png b/_images/Fusion.png
new file mode 100644
index 0000000000..2d30d7b75f
Binary files /dev/null and b/_images/Fusion.png differ
diff --git a/_images/architecture.png b/_images/architecture.png
new file mode 100644
index 0000000000..2db42e0ec8
Binary files /dev/null and b/_images/architecture.png differ
diff --git a/_images/compacter.png b/_images/compacter.png
new file mode 100644
index 0000000000..a7c2786e96
Binary files /dev/null and b/_images/compacter.png differ
diff --git a/_images/hfhub.svg b/_images/hfhub.svg
new file mode 100644
index 0000000000..a3d076dd42
--- /dev/null
+++ b/_images/hfhub.svg
@@ -0,0 +1,66 @@
+[66 lines of SVG markup; the element tags were stripped during text extraction and are not recoverable]
diff --git a/_images/ia3.png b/_images/ia3.png
new file mode 100644
index 0000000000..f335d132be
Binary files /dev/null and b/_images/ia3.png differ
diff --git a/_images/lora.png b/_images/lora.png
new file mode 100644
index 0000000000..310420fb10
Binary files /dev/null and b/_images/lora.png differ
diff --git a/_images/parallel.png b/_images/parallel.png
new file mode 100644
index 0000000000..0629d28afd
Binary files /dev/null and b/_images/parallel.png differ
diff --git a/_images/prefix.png b/_images/prefix.png
new file mode 100644
index 0000000000..59d2909933
Binary files /dev/null and b/_images/prefix.png differ
diff --git a/_images/splitting_adapters.png b/_images/splitting_adapters.png
new file mode 100644
index 0000000000..b7709e70d2
Binary files /dev/null and b/_images/splitting_adapters.png differ
diff --git a/_images/stacking_adapters.png b/_images/stacking_adapters.png
new file mode 100644
index 0000000000..abcd63af4c
Binary files /dev/null and b/_images/stacking_adapters.png differ
diff --git a/_images/unipelt.png b/_images/unipelt.png
new file mode 100644
index 0000000000..110a19ead9
Binary files /dev/null and b/_images/unipelt.png differ
diff --git a/_sources/adapter_composition.md.txt b/_sources/adapter_composition.md.txt
new file mode 100644
index 0000000000..5ff2d4284f
--- /dev/null
+++ b/_sources/adapter_composition.md.txt
@@ -0,0 +1,305 @@
+# Adapter Activation and Composition
+
+With `adapters`, it becomes possible to combine multiple adapters trained on different tasks in so-called *adapter compositions*.
+To enable such compositions, `adapters` comes with a modular and flexible concept to define how the input to the model should flow through the available adapters.
+This allows, e.g., stacking ([_MAD-X_](https://arxiv.org/pdf/2005.00052.pdf)) and fusing ([_AdapterFusion_](https://arxiv.org/pdf/2005.00247.pdf)) adapters and even more complex adapter setups.
+
+## Adapter Activation
+
+The single location where all the adapter composition magic happens is the `active_adapters` property of the model class.
+In the simplest case, you can set the name of a single adapter here to activate it:
+```python
+model.active_adapters = "adapter_name"
+```
+
+```{eval-rst}
+.. important::
+    ``active_adapters`` defines which available adapters are used in each forward and backward pass through the model. This means:
+
+    - You cannot activate an adapter before adding it to the model using either ``add_adapter()`` or ``load_adapter()``.
+    - All adapters not mentioned in the ``active_adapters`` setup are ignored, although they might have been loaded into the model. Thus, after adding an adapter, make sure to activate it.
+```
+Note that we could also have called `model.set_active_adapters("adapter_name")`, which does the same.
+
+Alternatively, the [`AdapterSetup`](adapters.AdapterSetup) context manager allows dynamic configuration of activated setups without changing the model state:
+
+```python
+from adapters import AdapterSetup
+
+model = ...
+model.add_adapter("adapter_name")
+
+with AdapterSetup("adapter_name"):
+    # will use the adapter named "adapter_name" in the forward pass
+    outputs = model(**inputs)
+```
+
+## Composition Blocks - Overview
+
+The basic building blocks of the more advanced setups are objects derived from `AdapterCompositionBlock`,
+each representing a different way to combine single adapters.
+The following table gives an overview of the supported composition blocks and which adapter methods support them.
+
+| Block | Bottleneck<br>Adapters | Prefix<br>Tuning | Compacter | LoRA | (IA)³ | Prompt Tuning |
+| --- | --- | --- | --- | --- | --- | --- |
+| [`Stack`](#stack) | ✅ | ✅ | ✅ | ✅(*) | ✅(*) | |
+| [`Fuse`](#fuse) | ✅ | | ✅ | | | |
+| [`Split`](#split) | ✅ | | ✅ | | | |
+| [`BatchSplit`](#batchsplit) | ✅ | ✅ | ✅ | ✅(*) | ✅(*) | |
+| [`Parallel`](#parallel) | ✅ | ✅ | ✅ | ✅(*) | ✅(*) | |
+| [Output averaging](#output-averaging) | ✅ | | ✅ | ✅(*) | ✅(*) | |
+| [Parameter averaging](#parameter-averaging) | ✅ | ✅ | ✅ | ✅ | ✅ | |
+
+(*) except for Deberta-v1, GPT-2.
+
+Next, we present all composition blocks in more detail.
+
+## `Stack`
+
+```{eval-rst}
+.. figure:: img/stacking_adapters.png
+    :height: 300
+    :align: center
+    :alt: Illustration of stacking adapters.
+
+    Stacking adapters using the 'Stack' block.
+```
+
+The `Stack` block can be used to stack multiple adapters on top of each other.
+This kind of adapter composition is used e.g. in the _MAD-X_ framework for cross-lingual transfer [(Pfeiffer et al., 2020)](https://arxiv.org/pdf/2005.00052.pdf), where language and task adapters are stacked on top of each other.
+For more, check out [this Colab notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/04_Cross_Lingual_Transfer.ipynb) on cross-lingual transfer.
+
+In the following example, we stack the adapters `a`, `b` and `c` so that in each layer, the input is first passed through `a`, the output of `a` is then fed into `b`, and the output of `b` is finally fed into `c`.
+
+```python
+import adapters.composition as ac
+
+# ...
+
+model.add_adapter("a")
+model.add_adapter("b")
+model.add_adapter("c")
+
+model.active_adapters = ac.Stack("a", "b", "c")
+```
+
+```{eval-rst}
+.. note::
+    When using stacking for prefix tuning, the stacked prefixes are prepended to the input states from right to left, i.e. `Stack("a", "b", "c")` will first prepend prefix states for "a" to the input vectors, then prepend "b" to the resulting vectors, etc.
+```
+
+## `Fuse`
+
+```{eval-rst}
+.. figure:: img/Fusion.png
+    :height: 300
+    :align: center
+    :alt: Illustration of AdapterFusion.
+
+    Fusing adapters with AdapterFusion.
+```
+
+The `Fuse` block can be used to activate a fusion layer of adapters.
+_AdapterFusion_ is a non-destructive way to combine the knowledge of multiple pre-trained adapters on a new downstream task, proposed by [Pfeiffer et al., 2021](https://arxiv.org/pdf/2005.00247.pdf).
+In the following example, we activate the adapters `d`, `e` and `f` as well as the fusion layer that combines the outputs of all three.
+The fusion layer is added beforehand using `model.add_adapter_fusion()`, where we specify the names of the adapters which should be fused.
+
+```python
+import adapters.composition as ac
+
+# ...
+
+model.add_adapter("d")
+model.add_adapter("e")
+model.add_adapter("f")
+model.add_adapter_fusion(["d", "e", "f"])
+
+model.active_adapters = ac.Fuse("d", "e", "f")
+```
+
+```{eval-rst}
+.. important::
+    Fusing adapters with the ``Fuse`` block only works successfully if an adapter fusion layer combining all of the adapters listed in the ``Fuse`` has been added to the model.
+    This can be done either using ``add_adapter_fusion()`` or ``load_adapter_fusion()``.
+```
+
+To learn how training an _AdapterFusion_ layer works, check out [this Colab notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/03_Adapter_Fusion.ipynb) from the `adapters` repo.
+
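+Before diving into the notebook, here is a minimal sketch of how such a fusion layer is commonly set up for training, continuing the adapters `d`, `e` and `f` from above (the training loop itself is omitted). `train_adapter_fusion()` freezes both the base model and the single adapters, so that only the fusion parameters receive gradients:
+
+```python
+import adapters.composition as ac
+
+# assumes "d", "e" and "f" have already been added to the model as above
+model.add_adapter_fusion(["d", "e", "f"])
+
+# freeze the base model and the single adapters, train only the fusion layer
+model.train_adapter_fusion(ac.Fuse("d", "e", "f"))
+# ... then run a regular training loop, e.g. with adapters.AdapterTrainer
+```
+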
+### Retrieving AdapterFusion attentions
+
+Finally, it is possible to retrieve the attention scores computed by each fusion layer in a forward pass of the model.
+These scores can be used for analyzing the fused adapter blocks and can serve as the basis for visualizations similar to those in the AdapterFusion paper.
+You can collect the fusion attention scores by passing `output_adapter_fusion_attentions=True` to the model forward call.
+The scores for each layer will then be saved in the `adapter_fusion_attentions` attribute of the output:
+
+```python
+outputs = model(**inputs, output_adapter_fusion_attentions=True)
+attention_scores = outputs.adapter_fusion_attentions
+```
+Note that this parameter is only available to base model classes and [AdapterModel classes](prediction_heads.md#adaptermodel-classes).
+In the example, `attention_scores` holds a dictionary of the following form:
+```
+{
+    '<fusion_name>': {
+        <layer_id>: {
+            '<module_location>': np.array([...]),
+            ...
+        },
+        ...
+    },
+    ...
+}
+```
+
+## `Split`
+
+```{eval-rst}
+.. figure:: img/splitting_adapters.png
+    :height: 300
+    :align: center
+    :alt: Illustration of splitting adapters.
+
+    Splitting the input between two adapters using the 'Split' block.
+```
+
+The `Split` block can be used to split an input sequence between multiple adapters.
+This is done by specifying split indices at which the sequences should be divided.
+In the following example, we split each input sequence between adapters `g` and `h`.
+For each sequence, all tokens from 0 up to 63 are forwarded through `g` while the next 64 tokens are forwarded through `h`:
+
+```python
+import adapters.composition as ac
+
+# ...
+
+model.add_adapter("g")
+model.add_adapter("h")
+
+model.active_adapters = ac.Split("g", "h", splits=[64, 64])
+```
+
+## `BatchSplit`
+
+The `BatchSplit` block is an alternative way to split the input between several adapters. Instead of splitting the input sequences, it splits the batch into smaller sub-batches, so the input sequences themselves remain untouched.
+
+In the following example, we split the batch between the adapters `i`, `k` and `l`. The `batch_sizes` parameter specifies the batch size for each of the adapters: the adapter `i` gets two sequences, `k` gets one sequence and `l` gets two sequences.
+If all adapters should get the same batch size, this can be specified by passing a single batch size, e.g. `batch_sizes = 2`. The sum of the specified batch sizes has to match the batch size of the input.
+```python
+import adapters.composition as ac
+
+# ...
+
+model.add_adapter("i")
+model.add_adapter("k")
+model.add_adapter("l")
+
+model.active_adapters = ac.BatchSplit("i", "k", "l", batch_sizes=[2, 1, 2])
+```
+
+## `Parallel`
+
+```{eval-rst}
+.. figure:: img/parallel.png
+    :height: 300
+    :align: center
+    :alt: Illustration of parallel adapter forward pass.
+
+    Parallel adapter forward pass as implemented by the 'Parallel' block. The input is replicated at the first layer with parallel adapters.
+```
+
+The `Parallel` block can be used to enable parallel multi-task training and inference on different adapters, each with their own prediction head.
+Parallel adapter inference was first used in _AdapterDrop: On the Efficiency of Adapters in Transformers_ [(Rücklé et al., 2020)](https://arxiv.org/pdf/2010.11918.pdf).
+
+In the following example, we load two adapters for semantic textual similarity (STS) from the Hub, one trained on the STS benchmark, the other trained on the MRPC dataset.
+We activate a parallel setup where the input is passed through both adapters and their respective prediction heads.
+
+```python
+import torch
+from adapters import AutoAdapterModel
+from transformers import AutoTokenizer
+import adapters.composition as ac
+
+model = AutoAdapterModel.from_pretrained("distilbert-base-uncased")
+tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
+
+adapter1 = model.load_adapter("sts/sts-b@ukp")
+adapter2 = model.load_adapter("sts/mrpc@ukp")
+
+model.active_adapters = ac.Parallel(adapter1, adapter2)
+
+input_ids = tokenizer("Adapters are great!", "Adapters are awesome!", return_tensors="pt")
+
+output1, output2 = model(**input_ids)
+
+print("STS-B adapter output:", output1[0].item())
+print("MRPC adapter output:", bool(torch.argmax(output2[0]).item()))
+```
+
+## Averaging Outputs or Parameters
+
+Following approaches of ensembling full models at inference time for better generalization, recent work on adapters has explored methods of averaging pre-trained adapters.
+This includes averaging output representations of adapters ([Wang et al., 2021](https://arxiv.org/pdf/2109.04877.pdf)) as well as averaging adapter parameters ([Wang et al., 2022](https://arxiv.org/pdf/2205.12410.pdf), [Chronopoulou et al., 2023](https://aclanthology.org/2023.findings-eacl.153.pdf)).
+`adapters` provides built-in support for both types of inference-time averaging methods.
+
+### Output averaging
+
+Output averaging dynamically aggregates the output representations of multiple adapters in a model forward pass via weighted averaging.
+This is realized via the `Average` composition block, which works similarly to other composition blocks.
+In the example below, the three adapters are averaged with the weights `0.1` for `m`, `0.6` for `n` and `0.3` for `o`.
+
+```python
+import adapters.composition as ac
+
+# ...
+
+model.add_adapter("m")
+model.add_adapter("n")
+model.add_adapter("o")
+
+model.active_adapters = ac.Average("m", "n", "o", weights=[0.1, 0.6, 0.3])
+```
+
+### Parameter averaging
+
+Parameter averaging enables creating a new adapter via weighted averaging of the parameters of multiple pre-trained adapters.
+As this process is typically not done dynamically at runtime, `adapters` provides `average_adapter()` as a dedicated method for parameter averaging.
+In the example below, the parameters of the adapters `m`, `n` and `o` are averaged (with weights `0.1`, `0.6` and `0.3`, respectively) to create a new adapter `avg`.
+Note that for this to succeed, all averaged adapters must use the same adapter configuration.
+
+```python
+model.add_adapter("m")
+model.add_adapter("n")
+model.add_adapter("o")
+
+model.average_adapter("avg", ["m", "n", "o"], weights=[0.1, 0.6, 0.3])
+```
+
+Compared to output averaging, parameter averaging of adapters has the advantage of not inducing any additional inference time relative to using a single adapter.
+
+For both output and parameter averaging, the passed weights are normalized by default.
+To disable normalization, pass `normalize_weights=False`.
+
+## Nesting composition blocks
+
+Of course, it is also possible to combine different composition blocks in one adapter setup.
+E.g., we can nest a `Split` block within a `Stack` of adapters:
+
+```python
+import adapters.composition as ac
+
+model.active_adapters = ac.Stack("a", ac.Split("b", "c", splits=60))
+```
+
+However, combinations of adapter composition blocks cannot be arbitrarily deep. All currently supported possibilities are visualized in the table below.
+
+|Block|Supported Nesting|
+|---|---|
+| [`Stack`](#stack)|[str, Fuse, Split, Parallel, BatchSplit, Average]|
+| [`Fuse`](#fuse)|[str, Stack]|
+|[`Split`](#split)|[str, Split, Stack, BatchSplit, Average]|
+|[`Parallel`](#parallel)|[str, Stack, BatchSplit, Average]|
+|[`BatchSplit`](#batchsplit)|[str, Stack, Split, BatchSplit, Average]|
+|[`Average`](#output-averaging)|[str, Stack, Split, BatchSplit]|
+
+In the table, `str` represents an adapter, e.g. adapter "a" in the nesting example above. Depending on the individual model, some nested compositions might not be possible.
diff --git a/_sources/classes/adapter_config.rst.txt b/_sources/classes/adapter_config.rst.txt
new file mode 100644
index 0000000000..91a9f506a0
--- /dev/null
+++ b/_sources/classes/adapter_config.rst.txt
@@ -0,0 +1,95 @@
+Adapter Configuration
+=======================
+
+Classes representing the architectures of adapter modules and fusion layers.
+
+Single (bottleneck) adapters
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.AdapterConfig
+    :members:
+
+.. autoclass:: adapters.BnConfig
+    :members:
+    :inherited-members: Mapping
+
+.. autoclass:: adapters.SeqBnConfig
+    :members:
+
+.. autoclass:: adapters.SeqBnInvConfig
+    :members:
+
+.. autoclass:: adapters.DoubleSeqBnConfig
+    :members:
+
+.. autoclass:: adapters.DoubleSeqBnInvConfig
+    :members:
+
+.. autoclass:: adapters.ParBnConfig
+    :members:
+
+.. autoclass:: adapters.CompacterConfig
+    :members:
+
+.. autoclass:: adapters.CompacterPlusPlusConfig
+    :members:
+
+Prefix Tuning
+~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.PrefixTuningConfig
+    :members:
+    :inherited-members: Mapping
+
+LoRAConfig
+~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.LoRAConfig
+    :members:
+    :inherited-members: Mapping
+
+IA3Config
+~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.IA3Config
+    :members:
+    :inherited-members: Mapping
+
+PromptTuningConfig
+~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.PromptTuningConfig
+    :members:
+    :inherited-members: Mapping
+
+Combined configurations
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.ConfigUnion
+    :members:
+    :inherited-members: Mapping
+
+.. autoclass:: adapters.MAMConfig
+    :members:
+
+.. autoclass:: adapters.UniPELTConfig
+    :members:
+
+Adapter Fusion
+~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.AdapterFusionConfig
+    :members:
+    :inherited-members: Mapping
+
+.. autoclass:: adapters.StaticAdapterFusionConfig
+    :members:
+
+.. autoclass:: adapters.DynamicAdapterFusionConfig
+    :members:
+
+Adapter Setup
+~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.AdapterSetup
+    :members:
diff --git a/_sources/classes/adapter_layer.rst.txt b/_sources/classes/adapter_layer.rst.txt
new file mode 100644
index 0000000000..01233d6328
--- /dev/null
+++ b/_sources/classes/adapter_layer.rst.txt
@@ -0,0 +1,12 @@
+Adapter Implementation
+=======================
+
+The following classes define the common interfaces for all adapter methods.
+They further hold logic shared by all adapter implementations.
+All newly added adapter methods should inherit from either one of these classes.
+
+.. autoclass:: adapters.AdapterLayerBase
+    :members:
+
+.. autoclass:: adapters.ComposableAdapterLayerBase
+    :members:
diff --git a/_sources/classes/adapter_training.rst.txt b/_sources/classes/adapter_training.rst.txt
new file mode 100644
index 0000000000..67784c5d54
--- /dev/null
+++ b/_sources/classes/adapter_training.rst.txt
@@ -0,0 +1,10 @@
+Adapter Training
+====================
+
+Classes and methods related to training adapters.
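+
+A minimal usage sketch of the ``AdapterTrainer`` documented below (the training arguments, datasets and prepared ``model`` are placeholders)::
+
+    from transformers import TrainingArguments
+    from adapters import AdapterTrainer
+
+    training_args = TrainingArguments(output_dir="./out", num_train_epochs=3)
+
+    # assumes `model` already has an adapter activated for training,
+    # e.g. via model.train_adapter("adapter_name")
+    trainer = AdapterTrainer(
+        model=model,
+        args=training_args,
+        train_dataset=train_dataset,
+        eval_dataset=eval_dataset,
+    )
+    trainer.train()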
+
+.. automodule:: adapters.training
+    :members:
+
+.. automodule:: adapters.trainer
+    :members:
diff --git a/_sources/classes/adapter_utils.rst.txt b/_sources/classes/adapter_utils.rst.txt
new file mode 100644
index 0000000000..957e2aab4f
--- /dev/null
+++ b/_sources/classes/adapter_utils.rst.txt
@@ -0,0 +1,8 @@
+Adapter Utilities
+====================
+
+A collection of utility methods mainly related to searching and loading adapter modules from
+Adapter-Hub.
+
+.. automodule:: adapters.utils
+    :members:
diff --git a/_sources/classes/model_adapters_config.rst.txt b/_sources/classes/model_adapters_config.rst.txt
new file mode 100644
index 0000000000..130f7b647e
--- /dev/null
+++ b/_sources/classes/model_adapters_config.rst.txt
@@ -0,0 +1,7 @@
+Model Adapters Config
+=======================
+
+This class manages the setup and configuration of adapter modules in a pre-trained model.
+
+.. autoclass:: adapters.ModelAdaptersConfig
+    :members:
diff --git a/_sources/classes/model_mixins.rst.txt b/_sources/classes/model_mixins.rst.txt
new file mode 100644
index 0000000000..3a43525eb5
--- /dev/null
+++ b/_sources/classes/model_mixins.rst.txt
@@ -0,0 +1,43 @@
+Model Mixins
+=======================
+
+These classes provide the basis for integrating adapter modules into model classes, covering shared functionality such as adapter saving and loading.
+Depending on the model, one of these mixins should be implemented by every adapter-supporting model class.
+
+InvertibleAdaptersMixin
+----------------------------------
+
+.. autoclass:: adapters.InvertibleAdaptersMixin
+    :members:
+
+
+EmbeddingAdaptersMixin
+----------------------------------
+
+.. autoclass:: adapters.EmbeddingAdaptersMixin
+    :members:
+
+
+ModelAdaptersMixin
+------------------
+
+.. autoclass:: adapters.ModelAdaptersMixin
+    :members:
+
+ModelWithHeadsAdaptersMixin
+----------------------------------
+
+.. autoclass:: adapters.ModelWithHeadsAdaptersMixin
+    :members:
+
+ModelWithFlexibleHeadsAdaptersMixin
+---------------------------------------
+
+.. autoclass:: adapters.ModelWithFlexibleHeadsAdaptersMixin
+    :members:
+
+PushAdapterToHubMixin
+----------------------
+
+.. autoclass:: adapters.hub_mixin.PushAdapterToHubMixin
+    :members:
diff --git a/_sources/classes/models/albert.rst.txt b/_sources/classes/models/albert.rst.txt
new file mode 100644
index 0000000000..8db0d0c12d
--- /dev/null
+++ b/_sources/classes/models/albert.rst.txt
@@ -0,0 +1,22 @@
+ALBERT
+======
+
+.. note::
+    Adapter implementation notes for ALBERT:
+        - As layers are shared between groups, adapters added to a layer are also shared between groups. Therefore, changing the adapter configuration for a layer affects the behavior of all groups that use this layer.
+        - As usual, the ``leave_out`` parameter can be used to specify the layers in which adapters should be added. The layer IDs are counted by putting all layers of the groups into a sequence based on their group number and their position within the group. I.e., for an ALBERT model with `inner_group_num=2`, the first layer of the first group has ID 0, the second layer of the first group has ID 1, the first layer of the second group has ID 2, etc.
+
+
+The ALBERT model was proposed in `ALBERT: A Lite BERT for Self-supervised Learning of Language Representations `__
+by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut.
+It presents two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT:
+
+- Splitting the embedding matrix into two smaller matrices.
+- Using repeating layers split among groups.
+
+AlbertAdapterModel
+~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.AlbertAdapterModel
+    :members:
+    :inherited-members: AlbertPreTrainedModel
diff --git a/_sources/classes/models/auto.rst.txt b/_sources/classes/models/auto.rst.txt
new file mode 100644
index 0000000000..a276854894
--- /dev/null
+++ b/_sources/classes/models/auto.rst.txt
@@ -0,0 +1,14 @@
+Auto Classes
+============
+
+Similar to the ``AutoModel`` classes built into HuggingFace Transformers, adapters provides an ``AutoAdapterModel`` class.
+As with other auto classes, the correct adapter model class is automatically instantiated based on the pre-trained model passed to the ``from_pretrained()`` method.
+
+.. note::
+    If the model loaded with the ``from_pretrained(...)`` function has a head, this head gets loaded as well. However, this only works for non-sharded models. If you want to load a sharded model with a head, you first need to load the model and then the head separately.
+
+AutoAdapterModel
+~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.AutoAdapterModel
+    :members:
diff --git a/_sources/classes/models/bart.rst.txt b/_sources/classes/models/bart.rst.txt
new file mode 100644
index 0000000000..67a5e56572
--- /dev/null
+++ b/_sources/classes/models/bart.rst.txt
@@ -0,0 +1,25 @@
+BART
+=====
+
+The Bart model was proposed in `BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation,
+Translation, and Comprehension `__ by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan
+Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer on 29 Oct, 2019.
+
+According to the abstract,
+
+- Bart uses a standard seq2seq/machine translation architecture with a bidirectional encoder (like BERT) and a
+  left-to-right decoder (like GPT).
+- The pretraining task involves randomly shuffling the order of the original sentences and a novel in-filling scheme,
+  where spans of text are replaced with a single mask token.
+- BART is particularly effective when fine-tuned for text generation but also works well for comprehension tasks. It
+  matches the performance of RoBERTa with comparable training resources on GLUE and SQuAD, achieves new
+  state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains
+  of up to 6 ROUGE.
+
+
+BartAdapterModel
+~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.BartAdapterModel
+    :members:
+    :inherited-members: BartPreTrainedModel
diff --git a/_sources/classes/models/beit.rst.txt b/_sources/classes/models/beit.rst.txt
new file mode 100644
index 0000000000..8d58127241
--- /dev/null
+++ b/_sources/classes/models/beit.rst.txt
@@ -0,0 +1,27 @@
+BEiT
+======
+
+The Bidirectional Encoder representation from Image Transformers (BEiT) model was proposed in `BERT Pre-Training of Image
+Transformers `__ by Hangbo Bao, Li Dong, Songhao Piao, Furu Wei.
+
+
+The abstract from the paper is the following:
+
+*We introduce a self-supervised vision representation model BEiT, which stands for Bidirectional Encoder representation
+from Image Transformers. Following BERT developed in the natural language processing area, we propose a masked image
+modeling task to pretrain vision Transformers. Specifically, each image has two views in our pre-training, i.e, image
+patches (such as 16x16 pixels), and visual tokens (i.e., discrete tokens). We first "tokenize" the original image into
+visual tokens. Then we randomly mask some image patches and fed them into the backbone Transformer. The pre-training
+objective is to recover the original visual tokens based on the corrupted image patches. After pre-training BEiT, we
+directly fine-tune the model parameters on downstream tasks by appending task layers upon the pretrained encoder.
+Experimental results on image classification and semantic segmentation show that our model achieves competitive results
+with previous pre-training methods. For example, base-size BEiT achieves 83.2% top-1 accuracy on ImageNet-1K,
+significantly outperforming from-scratch DeiT training (81.8%) with the same setup. Moreover, large-size BEiT obtains
+86.3% only using ImageNet-1K, even outperforming ViT-L with supervised pre-training on ImageNet-22K (85.2%).*
+
+BeitAdapterModel
+~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.BeitAdapterModel
+    :members:
+    :inherited-members: BeitPreTrainedModel
diff --git a/_sources/classes/models/bert-generation.rst.txt b/_sources/classes/models/bert-generation.rst.txt
new file mode 100644
index 0000000000..ebcdb3205e
--- /dev/null
+++ b/_sources/classes/models/bert-generation.rst.txt
@@ -0,0 +1,40 @@
+..
+    Copyright 2020 The HuggingFace Team. All rights reserved.
+
+    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+    the License. You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+    specific language governing permissions and limitations under the License.
+
+BertGeneration
+-----------------------------------------------------------------------------------------------------------------------
+
+Overview
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The BertGeneration model is a BERT model that can be leveraged for sequence-to-sequence tasks using
+EncoderDecoderModel as proposed in `Leveraging Pre-trained Checkpoints for Sequence Generation
+Tasks `__ by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.
+
+The abstract from the paper is the following:
+
+*Unsupervised pretraining of large neural models has recently revolutionized Natural Language Processing. By
+warm-starting from the publicly released checkpoints, NLP practitioners have pushed the state-of-the-art on multiple
+benchmarks while saving significant amounts of compute time. So far the focus has been mainly on the Natural Language
+Understanding tasks. In this paper, we demonstrate the efficacy of pre-trained checkpoints for Sequence Generation. We
+developed a Transformer-based sequence-to-sequence model that is compatible with publicly available pre-trained BERT,
+GPT-2 and RoBERTa checkpoints and conducted an extensive empirical study on the utility of initializing our model, both
+encoder and decoder, with these checkpoints. Our models result in new state-of-the-art results on Machine Translation,
+Text Summarization, Sentence Splitting, and Sentence Fusion.*
+
+
+BertGenerationAdapterModel
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.BertGenerationAdapterModel
+    :members:
+    :inherited-members: BertGenerationPreTrainedModel
diff --git a/_sources/classes/models/bert.rst.txt b/_sources/classes/models/bert.rst.txt
new file mode 100644
index 0000000000..c022d137bc
--- /dev/null
+++ b/_sources/classes/models/bert.rst.txt
@@ -0,0 +1,14 @@
+BERT
+======
+
+The BERT model was proposed in `BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding `__
+by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova. It is a bidirectional transformer
+pre-trained using a combination of a masked language modeling objective and next sentence prediction.
+
+
+BertAdapterModel
+~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.BertAdapterModel
+    :members:
+    :inherited-members: BertPreTrainedModel
diff --git a/_sources/classes/models/clip.rst.txt b/_sources/classes/models/clip.rst.txt
new file mode 100644
index 0000000000..2112cf74db
--- /dev/null
+++ b/_sources/classes/models/clip.rst.txt
@@ -0,0 +1,50 @@
+CLIP
+=====
+
+.. note::
+    Adapter implementation notes:
+        - CLIP consists of two separate Transformer encoder models, a ViT-style Transformer for visual features and a language model for textual features. Both encoders can be fitted with adapters. As usual, the ``leave_out`` parameter can be used to specify the layers in which adapters should be added. For CLIP, layer IDs are counted globally across both encoders, starting from the text encoder. I.e., for a CLIP model with 12 layers in each Transformer encoder, the text encoder will have IDs 0-11 and the vision encoder will have IDs 12-23.
+        - As CLIP does not come with pre-supported task-specific prediction heads, there is currently no ``CLIPAdapterModel`` class. Use ``CLIPModel`` instead.
+
+The CLIP model was proposed in `Learning Transferable Visual Models From Natural Language Supervision `_ by Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh,
+Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever. CLIP
+(Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. It can be
+instructed in natural language to predict the most relevant text snippet, given an image, without directly optimizing
+for the task, similarly to the zero-shot capabilities of GPT-2 and 3.
+
+The abstract from the paper is the following:
+
+*State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This
+restricted form of supervision limits their generality and usability since additional labeled data is needed to specify
+any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a
+much broader source of supervision. We demonstrate that the simple pre-training task of predicting which caption goes
+with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400
+million (image, text) pairs collected from the internet. After pre-training, natural language is used to reference
+learned visual concepts (or describe new ones) enabling zero-shot transfer of the model to downstream tasks. We study
+the performance of this approach by benchmarking on over 30 different existing computer vision datasets, spanning tasks
+such as OCR, action recognition in videos, geo-localization, and many types of fine-grained object classification. The
+model transfers non-trivially to most tasks and is often competitive with a fully supervised baseline without the need
+for any dataset specific training. For instance, we match the accuracy of the original ResNet-50 on ImageNet zero-shot
+without needing to use any of the 1.28 million training examples it was trained on. We release our code and pre-trained
+model weights at this https URL.*
+
+CLIPTextModel
+~~~~~~~~~~~~~
+
+.. autoclass:: transformers.CLIPTextModel
+    :members:
+    :inherited-members: CLIPPreTrainedModel
+
+CLIPVisionModel
+~~~~~~~~~~~~~~~
+
+.. autoclass:: transformers.CLIPVisionModel
+    :members:
+    :inherited-members: CLIPPreTrainedModel
+
+CLIPModel
+~~~~~~~~~
+
+.. autoclass:: transformers.CLIPModel
+    :members:
+    :inherited-members: CLIPPreTrainedModel
diff --git a/_sources/classes/models/deberta.rst.txt b/_sources/classes/models/deberta.rst.txt
new file mode 100644
index 0000000000..9513ee83d5
--- /dev/null
+++ b/_sources/classes/models/deberta.rst.txt
@@ -0,0 +1,50 @@
+..
+    Copyright 2020 The HuggingFace Team. All rights reserved.
+
+    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+    the License. You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+    specific language governing permissions and limitations under the License.
+
+DeBERTa
+-----------------------------------------------------------------------------------------------------------------------
+
+Overview
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The DeBERTa model was proposed in `DeBERTa: Decoding-enhanced BERT with Disentangled Attention
+`__ by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen. It is based on Google's
+BERT model released in 2018 and Facebook's RoBERTa model released in 2019.
+
+It builds on RoBERTa with disentangled attention and enhanced mask decoder training, using half of the data used in
+RoBERTa.
+
+The abstract from the paper is the following:
+
+*Recent progress in pre-trained neural language models has significantly improved the performance of many natural
+language processing (NLP) tasks. In this paper we propose a new model architecture DeBERTa (Decoding-enhanced BERT with
+disentangled attention) that improves the BERT and RoBERTa models using two novel techniques. The first is the
+disentangled attention mechanism, where each word is represented using two vectors that encode its content and
+position, respectively, and the attention weights among words are computed using disentangled matrices on their
+contents and relative positions. Second, an enhanced mask decoder is used to replace the output softmax layer to
+predict the masked tokens for model pretraining. We show that these two techniques significantly improve the efficiency
+of model pretraining and performance of downstream tasks. Compared to RoBERTa-Large, a DeBERTa model trained on half of
+the training data performs consistently better on a wide range of NLP tasks, achieving improvements on MNLI by +0.9%
+(90.2% vs. 91.1%), on SQuAD v2.0 by +2.3% (88.4% vs. 90.7%) and RACE by +3.6% (83.2% vs. 86.8%). The DeBERTa code and
+pre-trained models will be made publicly available at https://github.com/microsoft/DeBERTa.*
+
+
+This model was contributed by `DeBERTa `__. This model's TF 2.0 implementation was
+contributed by `kamalkraj `__. The original code can be found `here
+`__.
+
+DebertaAdapterModel
+~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.DebertaAdapterModel
+    :members:
+    :inherited-members: DebertaPreTrainedModel
diff --git a/_sources/classes/models/deberta_v2.rst.txt b/_sources/classes/models/deberta_v2.rst.txt
new file mode 100644
index 0000000000..d4e172dc0e
--- /dev/null
+++ b/_sources/classes/models/deberta_v2.rst.txt
@@ -0,0 +1,71 @@
+..
+    Copyright 2020 The HuggingFace Team. All rights reserved.
+
+    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+    the License. You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+    specific language governing permissions and limitations under the License.
+
+DeBERTa-v2
+-----------------------------------------------------------------------------------------------------------------------
+
+Overview
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The DeBERTa model was proposed in `DeBERTa: Decoding-enhanced BERT with Disentangled Attention
+`__ by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen. It is based on Google's
+BERT model released in 2018 and Facebook's RoBERTa model released in 2019.
+
+It builds on RoBERTa with disentangled attention and enhanced mask decoder training, using half of the data used in
+RoBERTa.
+
+The abstract from the paper is the following:
+
+*Recent progress in pre-trained neural language models has significantly improved the performance of many natural
+language processing (NLP) tasks. In this paper we propose a new model architecture DeBERTa (Decoding-enhanced BERT with
+disentangled attention) that improves the BERT and RoBERTa models using two novel techniques. The first is the
+disentangled attention mechanism, where each word is represented using two vectors that encode its content and
+position, respectively, and the attention weights among words are computed using disentangled matrices on their
+contents and relative positions. Second, an enhanced mask decoder is used to replace the output softmax layer to
+predict the masked tokens for model pretraining. We show that these two techniques significantly improve the efficiency
+of model pretraining and performance of downstream tasks. Compared to RoBERTa-Large, a DeBERTa model trained on half of
+the training data performs consistently better on a wide range of NLP tasks, achieving improvements on MNLI by +0.9%
+(90.2% vs. 91.1%), on SQuAD v2.0 by +2.3% (88.4% vs. 90.7%) and RACE by +3.6% (83.2% vs. 86.8%). The DeBERTa code and
+pre-trained models will be made publicly available at https://github.com/microsoft/DeBERTa.*
+
+
+The following information is visible directly on the `original implementation
+repository <https://github.com/microsoft/DeBERTa>`__. DeBERTa v2 is the second version of the DeBERTa model. It includes
+the 1.5B model used for the SuperGLUE single-model submission, which achieves 89.9 versus the human baseline of 89.8. You can
+find more details about this submission in the authors'
+`blog <https://www.microsoft.com/en-us/research/blog/microsoft-deberta-surpasses-human-performance-on-the-superglue-benchmark/>`__.
+
+New in v2:
+
+- **Vocabulary** In v2 the tokenizer is changed to use a new vocabulary of size 128K built from the training data.
+  Instead of a GPT2-based tokenizer, the tokenizer is now a
+  `sentencepiece-based <https://github.com/google/sentencepiece>`__ tokenizer.
+- **nGiE (nGram Induced Input Encoding)** The DeBERTa-v2 model uses an additional convolution layer alongside the first
+  transformer layer to better learn the local dependency of input tokens.
+- **Sharing position projection matrix with content projection matrix in attention layer** Based on previous
+  experiments, this can save parameters without affecting the performance.
+- **Apply bucket to encode relative positions** The DeBERTa-v2 model uses a log bucket to encode relative positions,
+  similar to T5.
+- **900M model & 1.5B model** Two additional model sizes are available: 900M and 1.5B, which significantly improve the
+  performance of downstream tasks.
+
+This model was contributed by `DeBERTa `__. This model's TF 2.0 implementation was
+contributed by `kamalkraj `__. The original code can be found `here
+`__.
+
+
+DebertaV2AdapterModel
+~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.DebertaV2AdapterModel
+    :members:
+    :inherited-members: DebertaV2PreTrainedModel
diff --git a/_sources/classes/models/distilbert.rst.txt b/_sources/classes/models/distilbert.rst.txt
new file mode 100644
index 0000000000..3fceae3910
--- /dev/null
+++ b/_sources/classes/models/distilbert.rst.txt
@@ -0,0 +1,17 @@
+DistilBERT
+===========
+
+The DistilBERT model was proposed in the blog post
+`Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT `__,
+and the paper `DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter `__.
+DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% fewer
+parameters than `bert-base-uncased` and runs 60% faster, while preserving over 95% of BERT's performance as measured on
+the GLUE language understanding benchmark.
+
+
+DistilBertAdapterModel
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.DistilBertAdapterModel
+    :members:
+    :inherited-members: DistilBertPreTrainedModel
diff --git a/_sources/classes/models/electra.rst.txt b/_sources/classes/models/electra.rst.txt
new file mode 100644
index 0000000000..e0dc9c5ef4
--- /dev/null
+++ b/_sources/classes/models/electra.rst.txt
@@ -0,0 +1,32 @@
+ELECTRA
+=======
+
+The ELECTRA model was proposed in the paper `ELECTRA: Pre-training Text Encoders as Discriminators Rather Than
+Generators `__. ELECTRA is a new pretraining approach which trains two
+transformer models: the generator and the discriminator. The generator's role is to replace tokens in a sequence, and
+is therefore trained as a masked language model. The discriminator, which is the model we're interested in, tries to
+identify which tokens were replaced by the generator in the sequence.
+
+The abstract from the paper is the following:
+
+*Masked language modeling (MLM) pretraining methods such as BERT corrupt the input by replacing some tokens with [MASK]
+and then train a model to reconstruct the original tokens. While they produce good results when transferred to
+downstream NLP tasks, they generally require large amounts of compute to be effective. As an alternative, we propose a
+more sample-efficient pretraining task called replaced token detection. Instead of masking the input, our approach
+corrupts it by replacing some tokens with plausible alternatives sampled from a small generator network. Then, instead
+of training a model that predicts the original identities of the corrupted tokens, we train a discriminative model that
+predicts whether each token in the corrupted input was replaced by a generator sample or not. Thorough experiments
+demonstrate this new pretraining task is more efficient than MLM because the task is defined over all input tokens
+rather than just the small subset that was masked out. As a result, the contextual representations learned by our
+approach substantially outperform the ones learned by BERT given the same model size, data, and compute. The gains are
+particularly strong for small models; for example, we train a model on one GPU for 4 days that outperforms GPT (trained
+using 30x more compute) on the GLUE natural language understanding benchmark. Our approach also works well at scale,
+where it performs comparably to RoBERTa and XLNet while using less than 1/4 of their compute and outperforms them when
+using the same amount of compute.*
+
+ElectraAdapterModel
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: adapters.ElectraAdapterModel
+    :members:
+    :inherited-members: ElectraPreTrainedModel
diff --git a/_sources/classes/models/encoderdecoder.rst.txt b/_sources/classes/models/encoderdecoder.rst.txt
new file mode 100644
index 0000000000..8e2f65b2dd
--- /dev/null
+++ b/_sources/classes/models/encoderdecoder.rst.txt
@@ -0,0 +1,43 @@
+..
+    Copyright 2020 The HuggingFace Team. All rights reserved.
+
+    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+    the License. You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+    specific language governing permissions and limitations under the License.
+
+Encoder Decoder Models
+-----------------------------------------------------------------------------------------------------------------------
+
+.. note::
+    Adapter implementation notes:
+        - Unlike other models, an explicit EncoderDecoderAdapterModel for the EncoderDecoderModel has not been implemented. This decision was made due to the lack of support for the EncoderDecoderModel in Hugging Face Transformers' ``AutoModel`` class. As a result, our ``AutoAdapterModel`` class would not support the EncoderDecoderAdapterModel either. Thus, to use an EncoderDecoderModel with *Adapters*, follow these steps (a minimal sketch of both steps follows below the note):
+
+          1. First, create an :class:`~transformers.EncoderDecoderModel` instance, for example, using ``model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "bert-base-uncased")``.
+          2. Next, convert this model to an adapter model using the ``adapters.init(model)`` function.
+
+        - Adapters can be added to both the encoder and the decoder. As usual, the ``leave_out`` parameter can be used to specify the layers where adapters are to be added. For the EncoderDecoderModel, the layer IDs are counted separately over the encoder and decoder, starting from 0. Thus, specifying ``leave_out=[0,1]`` will leave out the first and second layer of the encoder and the first and second layer of the decoder.
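+
+A minimal sketch of the two steps from the note above (the checkpoint and adapter names are illustrative)::
+
+    from transformers import EncoderDecoderModel
+    import adapters
+
+    # 1. create the encoder-decoder model from two pretrained checkpoints
+    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
+        "bert-base-uncased", "bert-base-uncased"
+    )
+    # 2. enable adapter support on the wrapped model
+    adapters.init(model)
+
+    model.add_adapter("my_adapter")
+    model.set_active_adapters("my_adapter")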
 + +The :class:`~transformers.EncoderDecoderModel` can be used to initialize a sequence-to-sequence model with any +pretrained autoencoding model as the encoder and any pretrained autoregressive model as the decoder. + +The effectiveness of initializing sequence-to-sequence models with pretrained checkpoints for sequence generation tasks +was shown in `Leveraging Pre-trained Checkpoints for Sequence Generation Tasks `__ by +Sascha Rothe, Shashi Narayan, Aliaksei Severyn. + +After such an :class:`~transformers.EncoderDecoderModel` has been trained/fine-tuned, it can be saved/loaded just like +any other model (see the examples for more information). + +An application of this architecture could be to leverage two pretrained :class:`~transformers.BertModel` instances as the encoder +and decoder for a summarization model as was shown in: `Text Summarization with Pretrained Encoders +`__ by Yang Liu and Mirella Lapata. + +EncoderDecoderModel +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. autoclass:: transformers.EncoderDecoderModel + :members: forward, from_encoder_decoder_pretrained diff --git a/_sources/classes/models/gpt2.rst.txt b/_sources/classes/models/gpt2.rst.txt new file mode 100644 index 0000000000..05100f0eb5 --- /dev/null +++ b/_sources/classes/models/gpt2.rst.txt @@ -0,0 +1,23 @@ +OpenAI GPT2 +----------------------------------------------------------------------------------------------------------------------- + +The OpenAI GPT-2 model was proposed in `Language Models are Unsupervised Multitask Learners +`_ by Alec +Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei and Ilya Sutskever. It's a causal (unidirectional) +transformer pretrained using language modeling on a very large corpus of ~40 GB of text data. + +The abstract from the paper is the following: + +*GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset[1] of 8 million +web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some +text. The diversity of the dataset causes this simple goal to contain naturally occurring demonstrations of many tasks +across diverse domains. GPT-2 is a direct scale-up of GPT, with more than 10X the parameters and trained on more than +10X the amount of data.* + + +GPT2AdapterModel +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. autoclass:: adapters.GPT2AdapterModel + :members: + :inherited-members: GPT2PreTrainedModel diff --git a/_sources/classes/models/gptj.rst.txt b/_sources/classes/models/gptj.rst.txt new file mode 100644 index 0000000000..31546df179 --- /dev/null +++ b/_sources/classes/models/gptj.rst.txt @@ -0,0 +1,24 @@ +EleutherAI GPT-J-6B +----------------------------------------------------------------------------------------------------------------------- + +EleutherAI GPT-J-6B is an open source, autoregressive language model created by a group of researchers called +EleutherAI. It's one of the most advanced alternatives to OpenAI's GPT-3 and performs well on a wide array of +natural language tasks such as chat, summarization, and question answering, to name a few. + +For a deeper dive, GPT-J is a transformer model trained using Ben Wang's `Mesh Transformer JAX +`_.
"GPT" is short for +generative pre-trained transformer, "J" distinguishes this model from other GPT models, and "6B" represents the 6 +billion trainable parameters. + +The model consists of 28 layers with a model dimension of 4096, and a feedforward dimension of 16384. The model +dimension is split into 16 heads, each with a dimension of 256. Rotary Position Embedding (RoPE) is applied to +64 dimensions of each head. The model is trained with a tokenization vocabulary of 50257, using the same set of +BPEs as GPT-2/GPT-3. + + +GPTJAdapterModel +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. autoclass:: adapters.GPTJAdapterModel + :members: + :inherited-members: GPTJPreTrainedModel diff --git a/_sources/classes/models/llama.rst.txt b/_sources/classes/models/llama.rst.txt new file mode 100644 index 0000000000..f650f93225 --- /dev/null +++ b/_sources/classes/models/llama.rst.txt @@ -0,0 +1,26 @@ +LLaMA +----------------------------------------------------------------------------------------------------------------------- + +.. note:: + Loading a ``LlamaForQuestionAnswering`` via [`AutoAdapterModel`](adapters.AutoAdapterModel) or via [`LlamaAdapterModel`](adapters.LlamaAdapterModel) does not load the head, even if the model is not sharded. Please load the base model first and then subsequently the head. + Note that for sharded models the head is never automatically loaded as described here: [Auto Classes](auto.rst) + + +The LLaMA model was proposed in `LLaMA: Open and Efficient Foundation Language Models `__ by +Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, +Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample. It is a collection of foundation language +models ranging from 7B to 65B parameters. + +The abstract from the paper is the following: + +*We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, +and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary +and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, +Chinchilla-70B and PaLM-540B. We release all our models to the research community.* + +LlamaAdapterModel +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. autoclass:: adapters.LlamaAdapterModel + :members: + :inherited-members: LlamaPreTrainedModel diff --git a/_sources/classes/models/mbart.rst.txt b/_sources/classes/models/mbart.rst.txt new file mode 100644 index 0000000000..263ae456e0 --- /dev/null +++ b/_sources/classes/models/mbart.rst.txt @@ -0,0 +1,19 @@ +MBart +----------------------------------------------------------------------------------------------------------------------- + +The MBart model was presented in `Multilingual Denoising Pre-training for Neural Machine Translation +`_ by Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov Marjan +Ghazvininejad, Mike Lewis, Luke Zettlemoyer. + +According to the abstract, MBART is a sequence-to-sequence denoising auto-encoder pretrained on large-scale monolingual +corpora in many languages using the BART objective. 
mBART is one of the first methods for pretraining a complete +sequence-to-sequence model by denoising full texts in multiple languages, while previous approaches have focused only +on the encoder, decoder, or reconstructing parts of the text. + + +MBartAdapterModel +~~~~~~~~~~~~~~~~~~~~ + +.. autoclass:: adapters.MBartAdapterModel + :members: + :inherited-members: MBartPreTrainedModel diff --git a/_sources/classes/models/mt5.rst.txt b/_sources/classes/models/mt5.rst.txt new file mode 100644 index 0000000000..d05542056d --- /dev/null +++ b/_sources/classes/models/mt5.rst.txt @@ -0,0 +1,24 @@ +MT5 +===== + +The mT5 model was presented in `mT5: A massively multilingual pre-trained text-to-text transformer +`__ by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, +Aditya Siddhant, Aditya Barua, Colin Raffel. + +The abstract from the paper is the following: + + +- The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain + state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a + multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We detail + the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual + benchmarks. We also describe a simple technique to prevent "accidental translation" in the zero-shot setting, where a + generative model chooses to (partially) translate its prediction into the wrong language. All of the code and model + checkpoints used in this work are publicly available. + +MT5AdapterModel +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. autoclass:: adapters.MT5AdapterModel + :members: + :inherited-members: MT5PreTrainedModel \ No newline at end of file diff --git a/_sources/classes/models/roberta.rst.txt b/_sources/classes/models/roberta.rst.txt new file mode 100644 index 0000000000..93b6ab5b38 --- /dev/null +++ b/_sources/classes/models/roberta.rst.txt @@ -0,0 +1,14 @@ +RoBERTa +======== + +The RoBERTa model was proposed in `RoBERTa: A Robustly Optimized BERT Pretraining Approach `_ +by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, +Veselin Stoyanov. It is based on Google's BERT model released in 2018. + + +RobertaAdapterModel +~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. autoclass:: adapters.RobertaAdapterModel + :members: + :inherited-members: RobertaPreTrainedModel diff --git a/_sources/classes/models/t5.rst.txt b/_sources/classes/models/t5.rst.txt new file mode 100644 index 0000000000..085c5ba2cd --- /dev/null +++ b/_sources/classes/models/t5.rst.txt @@ -0,0 +1,25 @@ +T5 +===== + +The T5 model was presented in `Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer +`__ by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, +Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu. + +The abstract from the paper is the following: + + +- T5 is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks and for which + each task is converted into a text-to-text format. T5 works well on a variety of tasks out-of-the-box by prepending a + different prefix to the input corresponding to each task, e.g., for translation: *translate English to German: ...*, + for summarization: *summarize: ...*.
+ + For more information about which prefix to use, it is easiest to look into Appendix D of the `paper + `__. + + +T5AdapterModel +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. autoclass:: adapters.T5AdapterModel + :members: + :inherited-members: T5PreTrainedModel diff --git a/_sources/classes/models/vit.rst.txt b/_sources/classes/models/vit.rst.txt new file mode 100644 index 0000000000..28a44183e4 --- /dev/null +++ b/_sources/classes/models/vit.rst.txt @@ -0,0 +1,27 @@ +Vision Transformer (ViT) +========================= + +The Vision Transformer (ViT) model was proposed in `An Image is Worth 16x16 Words: Transformers for Image Recognition +at Scale `__ by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk +Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob +Uszkoreit, Neil Houlsby. It's the first paper that successfully trains a Transformer encoder on ImageNet, attaining +very good results compared to familiar convolutional architectures. + + +The abstract from the paper is the following: + +*While the Transformer architecture has become the de-facto standard for natural language processing tasks, its +applications to computer vision remain limited. In vision, attention is either applied in conjunction with +convolutional networks, or used to replace certain components of convolutional networks while keeping their overall +structure in place. We show that this reliance on CNNs is not necessary and a pure transformer applied directly to +sequences of image patches can perform very well on image classification tasks. When pre-trained on large amounts of +data and transferred to multiple mid-sized or small image recognition benchmarks (ImageNet, CIFAR-100, VTAB, etc.), +Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring +substantially fewer computational resources to train.* + +ViTAdapterModel +~~~~~~~~~~~~~~~~~~~~ + +.. autoclass:: adapters.ViTAdapterModel + :members: + :inherited-members: ViTPreTrainedModel diff --git a/_sources/classes/models/xlmroberta.rst.txt b/_sources/classes/models/xlmroberta.rst.txt new file mode 100644 index 0000000000..dc4208c335 --- /dev/null +++ b/_sources/classes/models/xlmroberta.rst.txt @@ -0,0 +1,14 @@ +XLM-RoBERTa +============ + +The XLM-RoBERTa model was proposed in `Unsupervised Cross-lingual Representation Learning at Scale `__ +by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, +Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov. It is based on Facebook's RoBERTa model released in 2019. +It is a large multi-lingual language model, trained on 2.5TB of filtered CommonCrawl data. + + +XLMRobertaAdapterModel +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. autoclass:: adapters.XLMRobertaAdapterModel + :members: diff --git a/_sources/classes/models/xmod.rst.txt b/_sources/classes/models/xmod.rst.txt new file mode 100644 index 0000000000..1b92284940 --- /dev/null +++ b/_sources/classes/models/xmod.rst.txt @@ -0,0 +1,23 @@ +X-MOD +===== + +.. important:: + The X-MOD implementation integrated into Transformers already supports adapters. + To make this implementation compatible with Adapters, a few changes were necessary: + + - Pre-trained X-MOD checkpoints require conversion before they can be used with Adapters. We provide pre-converted checkpoints for the following models: + - ``facebook/xmod-base`` -> ``AdapterHub/xmod-base`` with language adapters split into separate repos (e.g. ``AdapterHub/xmod-base-af_ZA``) + - In Adapters, the X-MOD classes rely on the usual adapter methods instead of the custom methods introduced in Transformers, i.e.: + - ``set_active_adapters()`` instead of ``set_default_language()``. + - ``AdapterSetup`` context instead of ``lang_ids`` parameter.
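+
+For illustration, loading the pre-converted base checkpoint together with one of its language adapters might look like this (a minimal sketch using the model IDs named above):
+
+.. code-block:: python
+
+    from adapters import AutoAdapterModel
+
+    # Load the pre-converted base checkpoint
+    model = AutoAdapterModel.from_pretrained("AdapterHub/xmod-base")
+
+    # Load a language adapter from its separate repo and activate it
+    # (instead of calling set_default_language() as in Transformers)
+    adapter_name = model.load_adapter("AdapterHub/xmod-base-af_ZA")
+    model.set_active_adapters(adapter_name)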
 + +The abstract from the paper is the following: + +*Multilingual pre-trained models are known to suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages. We address this issue by introducing language-specific modules, which allows us to grow the total capacity of the model, while keeping the total number of trainable parameters per language constant. In contrast with prior work that learns language-specific components post-hoc, we pre-train the modules of our Cross-lingual Modular (X-MOD) models from the start. Our experiments on natural language inference, named entity recognition and question answering show that our approach not only mitigates the negative interference between languages, but also enables positive transfer, resulting in improved monolingual and cross-lingual performance. Furthermore, our approach enables adding languages post-hoc with no measurable drop in performance, no longer limiting the model usage to the set of pre-trained languages.* + +XmodAdapterModel +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. autoclass:: adapters.XmodAdapterModel + :members: + :inherited-members: XmodPreTrainedModel diff --git a/_sources/contributing.md.txt b/_sources/contributing.md.txt new file mode 100644 index 0000000000..13f363a603 --- /dev/null +++ b/_sources/contributing.md.txt @@ -0,0 +1,78 @@ +# Contributing to AdapterHub + +There are many ways in which you can contribute to AdapterHub and the `adapters` library. +This includes code contributions such as: +- implementing new adapter methods +- adding support for new Transformer models +- fixing open issues + +as well as non-code contributions such as: +- training and uploading adapters to the Hub +- writing documentation and blog posts +- helping others with their issues and questions + +Whichever way you'd like to contribute, you're very welcome to do so! + +## Contributing to the `adapters` codebase + +### Setting up your dev environment + +To get started with writing code for `adapters`, you'll want to set up the project in a local development environment. + +`adapters` closely follows the original Hugging Face Transformers repository in many aspects. +This guide assumes that you want to set up your dev environment on a local machine and that you have basic knowledge of `git`. +Additionally, you need **Python 3.8** or above installed to get started. + +In the following, we go through the setup procedure step by step: + +1. Fork [the `adapters` repository](https://github.com/adapter-hub/adapters) to get a local copy of the code under your user account. +2. Clone your fork to your local machine: + ``` + git clone --recursive git@github.com:/adapters.git + cd adapters + ``` + **Note:** The `--recursive` flag is important to initialize git submodules. +3. Create a virtual environment, e.g. via `virtualenv` or `conda`. +4. 
Install PyTorch, following the installation command for your environment [on their website](https://pytorch.org/get-started/locally/). +5. Install Hugging Face Transformers from the local git submodule: + ``` + pip install ./hf_transformers + ``` +6. Install `adapters` and required dev dependencies: + ``` + pip install -e ".[dev]" + ``` + +### Adding Adapter Methods + +How to integrate new efficient fine-tuning/adapter methods into `adapters` is described at [https://docs.adapterhub.ml/contributing/adding_adapter_methods.html](https://docs.adapterhub.ml/contributing/adding_adapter_methods.html). + +### Adding Adapters to a Model + +How to add adapter support to a model type already supported by Hugging Face Transformers is described at [https://docs.adapterhub.ml/contributing/adding_adapters_to_a_model.html](https://docs.adapterhub.ml/contributing/adding_adapters_to_a_model.html). + +### Testing your changes to the codebase + +`adapters` provides multiple Makefile targets for easily running tests and repo checks. +Make sure these checks run without errors to pass the CI pipeline tasks when you open a pull request. + +To **run all tests** in the repository: +``` +make test +``` + +To **auto format code and imports** in the whole codebase: +``` +make style +``` +This will run `black` and `isort`. + +To **run all quality checks** ensuring code style and repo consistency: +``` +make quality +``` +This will run checks with `black`, `isort` and `flake8` as well as additional custom checks. + +## Publishing Pre-Trained Adapters + +How to make your own trained adapters accessible for the `adapters` library via the HuggingFace Model Hub is described at [https://docs.adapterhub.ml/huggingface_hub.html](https://docs.adapterhub.ml/huggingface_hub.html). diff --git a/_sources/contributing/adding_adapter_methods.md.txt b/_sources/contributing/adding_adapter_methods.md.txt new file mode 100644 index 0000000000..af750a9e1c --- /dev/null +++ b/_sources/contributing/adding_adapter_methods.md.txt @@ -0,0 +1,101 @@ +# Adding Adapter Methods + +This document describes how different efficient fine-tuning methods can be integrated into the codebase of `adapters`. +It can be used as a guide to add new efficient fine-tuning/adapter methods. + +Before we start to go into implementation details, first some important design philosophies of `adapters`: + +- _Adapters should integrate seamlessly with existing model classes_: This means (a) if a model architecture supports adapters, it should be possible to use them with all model classes of this architecture and (b) adapters should be entirely opt-in, i.e. the model classes still must work without adapters. +- _Copying of original code should be minimal_: `adapters` tries to avoid copying of the original HF code as far as possible. We extensively use Python mixins to achieve this. + +Now we highlight the most important components of integrating adapter methods into Transformer models. +Each integration is highly dependent on the specific details of the adapter methods. +Therefore, the described steps might not be applicable to each implementation. + +## Implementation + +❓ As adapter methods typically inject blocks of new parameters into an existing Transformer model, they can mostly be implemented using multiple classes deriving from `torch.nn.Module`. +These module classes then have to be inserted into the correct locations within the Transformer model implementation. +Thus, each adapter method implementation should provide at least two classes: + +- a configuration class deriving from `AdapterConfig` that provides attributes for all configuration options of the method +- a module class deriving from the abstract `AdapterLayerBase` that provides the method parameters and a set of standard adapter management functions + - modules supporting [adapter composition](https://docs.adapterhub.ml/adapter_composition.html) should instead derive from `ComposableAdapterLayerBase` + +### Configuration + +All configuration classes reside in `src/adapters/configuration/adapter_config.py`. +- To add a new configuration class for a new method, create a new subclass of [`AdapterConfig`](adapters.AdapterConfig). + Make sure to set the `architecture` attribute in your class. +- Finally, also make sure the config class is added to the `__init__.py` files in `src/adapters`.
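+
+For illustration, a minimal configuration class for a hypothetical new method might look like the following sketch (the `"scaler"` method name and its option are invented for this example; the existing classes in `adapter_config.py` are the authoritative reference for structure and conventions):
+
+```python
+from dataclasses import dataclass
+from typing import Optional
+
+from adapters import AdapterConfig
+
+
+@dataclass(eq=False)
+class ScalerConfig(AdapterConfig):
+    """Configuration of a hypothetical per-layer scaling method."""
+
+    # identifies the method implementation this config belongs to
+    architecture: Optional[str] = "scaler"
+    # method-specific option: initial value of the learned scaling vector
+    init_value: float = 1.0
+```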
+ +### Modeling + +All adapter method implementations reside in `src/adapters/methods`. + +#### For methods **without** composition support + +The [`AdapterLayerBase`](adapters.AdapterLayerBase) class from which any new adapter modules should derive resides in `src/adapters/methods/adapter_layer_base.py`. +- This abstract base class defines a set of methods that should be implemented by each deriving class, +including methods for adding, enabling and deleting adapter weights. These methods are marked as abstract in the base class. See [`AdapterLayerBase`](adapters.AdapterLayerBase) for details. +- Most importantly, however, the module classes deriving from this base class should implement the forward pass through an adaptation component. +- The concrete implementation of these classes heavily depends on the specifics of the adapter method. + +#### For methods **with** composition support + +The [`ComposableAdapterLayerBase`](adapters.ComposableAdapterLayerBase) class (as subclass of [`AdapterLayerBase`](adapters.AdapterLayerBase)), which resides in `src/adapters/methods/adapter_layer_base.py`, provides the basic skeleton for implementing adapter composition. +- Your deriving module class should first implement all methods required by [`AdapterLayerBase`](adapters.AdapterLayerBase). See section above for details. +- For adapter composition, the pre-implemented `compose()` method constitutes the main entry-point. This method should be called during the forward pass of your adapter module. +- `compose()` expects a `state` object, which is a generic named tuple object defined by your adapter method. This state object should hold all tensors (such as hidden states, attention masks, etc.) and state attributes required for your adapter implementation. See `BottleneckState` for an example. +- Implementations for specific composition blocks are given in methods starting with `compose_`. Some composition blocks provide generic default implementations, some must be implemented by the deriving class if they are to be supported. Make sure to list all supported composition blocks in the `supported_compositions` class attribute of your deriving module. +- In any case, a small set of helper methods should be implemented by any deriving module to support basic composition logic. These are marked as abstract methods in [`ComposableAdapterLayerBase`](adapters.ComposableAdapterLayerBase) and currently consist of the following: `vslice()`, `pad_and_concat()`, `repeat()`, `mean()` and `compose_single()`. See [`ComposableAdapterLayerBase`](adapters.ComposableAdapterLayerBase) for details.
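+
+As a rough structural sketch (continuing the hypothetical `"scaler"` method from above), a module class without composition support could be organized as follows. This is only an illustration of the overall shape; the exact set of abstract methods and helper attributes is defined in `adapter_layer_base.py` and should be checked there:
+
+```python
+import torch
+import torch.nn as nn
+
+from adapters.methods.adapter_layer_base import AdapterLayerBase
+
+
+class ScalerLayer(AdapterLayerBase, nn.Module):
+    """Hypothetical module that learns one scaling vector per adapter."""
+
+    adapter_modules_name = "scalers"  # attribute holding this layer's adapter weights
+
+    def __init__(self, hidden_size: int):
+        super().__init__()
+        self.hidden_size = hidden_size
+        self.scalers = nn.ParameterDict()
+
+    def add_adapter(self, adapter_name: str, layer_idx: int) -> bool:
+        self.layer_idx = layer_idx
+        self.scalers[adapter_name] = nn.Parameter(torch.ones(self.hidden_size))
+        return True
+
+    def delete_adapter(self, adapter_name: str):
+        if adapter_name in self.scalers:
+            del self.scalers[adapter_name]
+
+    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
+        # scale the hidden states with the weights of each active adapter
+        adapter_setup = self.get_active_setup()
+        if adapter_setup is not None:
+            for name in adapter_setup.flatten():
+                hidden_states = hidden_states * self.scalers[name]
+        return hidden_states
+```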
+ +For a reference implementation, have a look at `BottleneckLayer` for bottleneck adapters. + +#### For all methods + +To actually make use of the newly implemented classes, it's finally necessary to integrate the forward calls to the new modules into the actual model implementations. +- This, again, is highly dependent on how the adapter method interacts with the base model classes. Typically, module classes can be integrated either via mixins (see modules starting with "mixin" in `src/adapters/models`) or directly as submodules of the respective model components. +- The model class integration has to be repeated for each supported Transformer model, as they typically don't share a codebase. At this point, it is often worth considering where the adapters need to be added to the Transformer model and whether the integration is possible without copying more classes than the current implementation does. +Please try to integrate any new adapter method into every model class when it's reasonable. +You can find all currently supported model classes at https://docs.adapterhub.ml/model_overview.html. + +**Additional things to consider** + +- New adapter methods typically also require some changes in the `AdapterLoader` class in `src/adapters/loading.py` (also see [here](https://docs.adapterhub.ml/extending.html#loading-custom-module-weights)). +- Depending on the method to be integrated, further changes in other classes might be necessary. + +## Testing + +❓ `adapters` provides a framework for testing adapter methods on the models implementing them in `tests`. +Tests for each adapter method are provided via a mixin class. +All test mixins derive from the common `AdapterMethodBaseTestMixin` class and reside in `tests/methods`. + +**📝 Steps** + +- Add a new `test_.py` module in `tests/methods`. + - This module should contain a `TestMixin` class deriving from `AdapterMethodBaseTestMixin` that implements typical methods of adding, loading and training modules of the new adapter method. + - Have a look at existing test mixins for reference. +- Next, add the newly implemented test mixin to the tests of all model types that support the new adapter method. + - Each model type has its own test class `tests/test_.py` that contains a `AdapterTest` class. + Add the new test mixin to the mixins of this class. + E.g., if the new method is supported by BERT, add its test mixin to `BertAdapterTest`. + +## Documentation + +❓ The documentation for `adapters` lives in the `docs` folder. + +**📝 Steps** + +- Add the class documentation for the configuration class of the new method in `docs/classes/adapter_config.rst`. +- In `docs/overview.md`, add a new section for the new adapter method that describes the most important concepts. Please try to follow the general format of the existing methods. +- Add a new column in the table in `docs/model_overview.md` and check the models that support the new adapter method. + +Finally, please add a row for the new method in the table of supported methods under _Implemented Methods_ in the main `README.md` of this repository. + +## Training Example Adapters + +❓ To make sure the new adapter implementation works properly, it is useful to train some example adapters and compare the training results to full model fine-tuning and/or reference implementations. +Ideally, this would include training adapters on one (or more) tasks that are good for demonstrating the new method and uploading them to AdapterHub.
+ +Hugging Face already provides example training scripts for many tasks; some of them have already been modified to support adapter training (see https://github.com/Adapter-Hub/adapters/tree/main/examples). diff --git a/_sources/contributing/adding_adapters_to_a_model.md.txt b/_sources/contributing/adding_adapters_to_a_model.md.txt new file mode 100644 index 0000000000..d52ac09bb2 --- /dev/null +++ b/_sources/contributing/adding_adapters_to_a_model.md.txt @@ -0,0 +1,90 @@ +# Adding Adapters to a Model +This document gives an overview of how new model architectures of Hugging Face Transformers can be supported by `adapters`. +Before delving into implementation details, you should familiarize yourself with the main design philosophies of `adapters`: + +- _Adapters should integrate seamlessly with existing model classes_: If a model architecture supports adapters, it should be possible to use them with all model classes of this architecture. +- _Copied code should be minimal_: `adapters` extensively uses Python mixins to add adapter support to HF models. Functions that cannot be sufficiently modified by mixins are copied and then modified. Try to avoid copying functions as much as possible. + +## Relevant Classes +Adding adapter support to an existing model architecture requires modifying some parts of the model forward pass logic. These modifications are realized by the four files in the `src/adapters/models//` directory. Let's examine the purpose of these files using BERT as an example. It's important to note that we are adapting the original Hugging Face model, implemented in [transformers/models/bert/modeling_bert.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/bert/modeling_bert.py). The files in `src/adapters/models/bert/` are: + +1. `src/adapters/models/bert/mixin_bert.py`: +This file contains mixins for each class we want to change. For example, in the `BertSelfAttention` class, we need to make changes for LoRA and Prefix Tuning. For this, we create a `BertSelfAttentionAdaptersMixin` to implement these changes. We will discuss how this works in detail below. +2. `src/adapters/models/bert/modeling_bert.py`: +For some classes of the BERT implementation (e.g. `BertModel` or `BertLayer`) the code can be sufficiently customized via mixins. For other classes (like `BertSelfAttention`), we need to edit the original code directly. These classes are copied into `src/adapters/models/bert/modeling_bert.py` and modified. +3. `src/adapters/models/bert/adapter_model.py`: +In this file, the adapter model class is defined. This class allows flexible adding of and switching between multiple prediction heads of different types. This looks about the same for each model, except that each model has different heads and thus different `add_..._head()` functions. +4. `src/adapters/models/bert/__init__.py`: Defines Python's import structure. + + +## Implementation Steps 📝 +Now that we have discussed the purpose of every file in `src/adapters/models//`, we go through the integration of adapters into an existing model architecture step by step. **The following steps might not be applicable to every model architecture.** + +1. **Files:** + - Create the `src/adapters/models//` directory and in it the 4 files: `mixin_.py`, `modeling_.py`, `adapter_model.py` and `__init__.py` +2. **Mixins:** + - In `src/adapters/models//mixin_.py`, create mixins for any class you want to change and where you can't reuse an existing mixin from another class.
- To figure out which classes to change, think about where to insert LoRA, Prefix Tuning, and bottleneck adapters. + - You can use similar model implementations for guidance. + - Often, existing mixins of another class can be reused. E.g. `BertLayer`, `RobertaLayer`, `XLMRobertaLayer`, `DebertaLayer`, `DebertaV2Layer` and `BertGenerationLayer` (all models derived from BERT) use the `BertLayerAdaptersMixin`. + - To additionally support Prefix Tuning, it's necessary to apply the forward call to the `PrefixTuningLayer` module in the respective attention layer (see step 3 for how to modify the code of a Hugging Face class). + - Make sure the calls to `bottleneck_layer_forward()` are added in the right places. + - The mixin for the whole base model class (e.g., `BertModel`) should derive from `ModelBaseAdaptersMixin` and (if possible) `EmbeddingAdaptersMixin` and/or `InvertibleAdaptersMixin`. This mixin should at least implement the `iter_layers()` method (see the sketch after this list) but might require additional modifications depending on the architecture. + - If the model is a combination of different models, such as the EncoderDecoderModel, use `ModelUsingSubmodelsAdaptersMixin` instead of `ModelBaseAdaptersMixin`. +3. **Copied functions:** + - For those classes where the mixin is not enough to realize the wanted behavior, you must: + - Create a new class in `src/adapters/models//modeling_.py` with the name `WithAdapters`. This class should derive from the corresponding mixin and HF class. + - Copy the function you want to change into this class and modify it. + - E.g., the `forward` method of the `BertSelfAttention` class must be adapted to support prefix tuning. We therefore create a class `BertSelfAttentionWithAdapters(BertSelfAttentionAdaptersMixin, BertSelfAttention)`, copy the forward method into it and modify it. + - If the `forward` method of a module is copied and modified, make sure to call `adapters.utils.patch_forward()` in the module's `init_adapters()` method. This ensures adapters work correctly with the `accelerate` package. +4. **Modify MODEL_MIXIN_MAPPING** + - For each mixin whose class was not copied into `modeling_.py`, add the mixin/class combination into `MODEL_MIXIN_MAPPING` in the file `src/adapters/models/__init__.py`. +5. **Create the adapter model:** + - Adapter-supporting architectures should provide a new model class `AdapterModel`. This class allows flexible adding of and switching between multiple prediction heads of different types. + - This is done in the `adapter_model.py` file: + - This module should implement the `AdapterModel` class, deriving from `ModelWithFlexibleHeadsAdaptersMixin` and `PreTrainedModel`. + - In the model class, add methods for those prediction heads that make sense for the new model architecture. + - Again, have a look at existing implementations. + - Add `AdapterModel` to the `ADAPTER_MODEL_MAPPING_NAMES` mapping in `src/adapters/models/auto/adapter_model.py` and to `src/adapters/__init__.py`. + - Define the classes to be added to Python's import structure in `src/adapters/models//__init__.py`. This will likely only be the `AdapterModel`. +6. **Adapt the config classes:** + - Adapt the config class to the requirements of adapters in `src/adapters/wrappers/configuration.py`. + - There are some naming differences in the config attributes of different model architectures. The adapter implementation requires some additional attributes with a specific name to be available. These currently are `num_attention_heads`, `hidden_size`, `hidden_dropout_prob` and `attention_probs_dropout_prob` as in the `BertConfig` class. + If your model config does not provide these, add corresponding mappings to `CONFIG_CLASS_KEYS_MAPPING`.
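+
+For illustration, the base model mixin of step 2 might be sketched as follows for a hypothetical `MyModel` architecture (the class names and the `encoder.layer` attribute are placeholders to be adapted to the actual model; existing mixins in `src/adapters/models` are the authoritative reference):
+
+```python
+from typing import Iterable, Tuple
+
+import torch.nn as nn
+
+from adapters.model_mixin import (
+    EmbeddingAdaptersMixin,
+    InvertibleAdaptersMixin,
+    ModelBaseAdaptersMixin,
+)
+
+
+class MyModelAdaptersMixin(EmbeddingAdaptersMixin, InvertibleAdaptersMixin, ModelBaseAdaptersMixin):
+    """Adds adapter support to the base MyModel class."""
+
+    def iter_layers(self) -> Iterable[Tuple[int, nn.Module]]:
+        # yield every Transformer layer together with its index so that
+        # adapter methods can be attached to and iterated over all layers
+        for i, layer in enumerate(self.encoder.layer):
+            yield i, layer
+```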
 + + +### Additional (optional) implementation steps 📝 + +- Parallel adapter inference via `Parallel` composition block (cf. [documentation](https://docs.adapterhub.ml/adapter_composition.html#parallel), [PR#150](https://github.com/Adapter-Hub/adapters/pull/150)). +- Provide mappings for an architecture's existing (static) prediction heads into `adapters` flex heads (cf. [implementation](https://github.com/adapter-hub/adapters/blob/main/src/adapters/head_utils.py#L11)). + +## Testing + +❓ In addition to the general Hugging Face model tests, there are adapter-specific test cases. All tests are executed from the `tests` folder. You need to add two different test classes. + +**📝 Steps** +1. Add a new `test_.py` module in `tests/` + - This file is used to test that everything related to the usage of adapters (adding, removing, activating, ...) works. + - This module typically holds 2 test classes and a test base class: + - `AdapterTestBase`: This class contains the `tokenizer_name`, `config_class` and `config`. + - `AdapterTest` derives from a collection of test mixins that hold various adapter tests (depending on the implementation). + - (optionally) `ClassConversionTest` runs tests for correct class conversion if conversion of prediction heads is implemented. +2. Add a new `test_.py` module in `tests/models/` + - This file is used to test the AdapterModel class. + - This module typically holds 1 test class with the name `AdapterModelTest`. + - `AdapterModelTest` derives directly from Hugging Face's existing model test class `ModelTest` and adds `AdapterModel` as a class to test. + +## Documentation + +❓ The documentation for `adapters` lives in the `docs` folder. + +**📝 Steps** + +- Add `docs/classes/models/.rst` (oriented at the doc file in the HF docs). Make sure to include `AdapterModel` autodoc. Finally, list the file in `index.rst`. +- Add a new row for the model in the model table of the overview page at `docs/model_overview.md`, listing all the methods implemented by the new model. + +## Training Example Adapters + +❓ To make sure the new adapter implementation works properly, it is useful to train some example adapters and compare the training results to full model fine-tuning. Ideally, this would include training adapters on one (or more) tasks that are good for demonstrating the new model architecture (e.g. GLUE benchmark for BERT, summarization for BART) and uploading them to AdapterHub. + +We provide training scripts for many tasks here: [https://github.com/Adapter-Hub/adapters/tree/main/examples/pytorch/](https://github.com/Adapter-Hub/adapters/tree/main/examples/pytorch/) diff --git a/_sources/embeddings.md.txt b/_sources/embeddings.md.txt new file mode 100644 index 0000000000..5699f11897 --- /dev/null +++ b/_sources/embeddings.md.txt @@ -0,0 +1,53 @@ +# Embeddings + +With `adapters`, we support dynamically adding, loading, and deleting `Embeddings`. This section +will give you an overview of these features. A toy example is illustrated in this [notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/Adapter_With_Embeddings.ipynb). + +## Adding and Deleting Embeddings +The methods for handling embeddings are similar to the ones for handling adapters. To add new embeddings, we call +`add_embeddings`.
This adds new embeddings for the vocabulary of the `tokenizer`. +In some cases, it might be useful to initialize embeddings of tokens to the ones of another embeddings module. If a +`reference_embedding` and a `reference_tokenizer` are provided, all embeddings for tokens that are present in both vocabularies are initialized to the embeddings provided by the `reference_embedding`. The new embedding will be created and set as the active embedding. If you are unsure which embedding +is currently active, the `active_embeddings` property contains the currently active embedding. + +```python +model.add_embeddings('name', tokenizer, reference_embedding='default', reference_tokenizer=reference_tokenizer) +``` + +The original embedding of the transformers model is always available under the name `"default"`. To set it as the active +embedding, simply call the `set_active_embeddings('name')` method. +```python +model.set_active_embeddings('name') +``` +Similarly, all other embeddings can be set as active by passing their name to the `set_active_embeddings` method. + +To delete an embedding that is no longer needed, we can call the `delete_embeddings` method with the name of the embedding +we want to delete. However, you cannot delete the default embedding. +```python +model.delete_embeddings('name') +``` +Please note that if the active embedding is deleted, the default embedding is set as the active embedding. + +## Training Embeddings +Embeddings can only be trained with an adapter. To freeze all weights except for the embedding and the adapter: +```python +model.train_adapter('adapter_name', train_embeddings=True) +``` +Except for the `train_embeddings` flag, the training is the same as for just training an adapter (see [Adapter Training](training.md)). + +## Saving and Loading Embeddings +You can save the embeddings by calling `save_embeddings('path/to/dir', 'name')` and load them with `load_embeddings('path/to/dir', 'name')`. + +```python +model.save_embeddings(path, 'name') +model.load_embeddings(path, 'reloaded_name') +``` + +The path needs to point to a directory in which the weights of the embedding will be saved. + +You can also save and load the tokenizer +with the embedding by passing the tokenizer to `save_embeddings`. +```python +model.save_embeddings(path, 'name', tokenizer) +loaded_tokenizer = model.load_embeddings(path, 'name') +``` diff --git a/_sources/extending.md.txt b/_sources/extending.md.txt new file mode 100644 index 0000000000..290c09b9db --- /dev/null +++ b/_sources/extending.md.txt @@ -0,0 +1,34 @@ +# Extending the Library + +## Integrating new Transformer models +Currently, not all model types included in Hugging Face's `transformers` support adapters yet. +However, it is possible to add the existing adapter implementation to new models. +For detailed instructions, see [Adding Adapters to a Model](https://docs.adapterhub.ml/contributing/adding_adapters_to_a_model.html). + +## Loading custom module weights + +`adapters` provides support for saving and loading adapter and prediction head modules from the local file system or the Hub out of the box. +However, countless additional module integrations into language models are conceivable. +To provide a basis for such new custom model plugins, `adapters` integrates a basic mechanism to save and load custom weights. + +All adapter and head module weights are extracted, saved and loaded by implementations of the `WeightsLoader` class, the two included by default being `AdapterLoader` and `PredictionHeadLoader`. To add basic saving and loading functionalities to your custom module weights, you can implement a new subclass of `WeightsLoader`. The two required abstract methods to be implemented are: + +- `filter_func(self, name: str) -> Callable[[str], bool]`: The callable returned by this method is used to extract the module weights to be saved or loaded based on their names. + +- `rename_func(self, old_name: str, new_name: str) -> Callable[[str], str]`: The callable returned by this method is used to optionally rename the module weights after loading. + +For more advanced functionalities, you may also want to override the `save()` and `load()` methods.
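+
+A rough skeleton of such a subclass might look as follows (the `"scalers."` weight name prefix is a made-up example; the constructor and further details of the base class should be checked in `src/adapters/loading.py`):
+
+```python
+from adapters.loading import WeightsLoader
+
+
+class MyCustomWeightsLoader(WeightsLoader):
+    """Saves and loads the weights of a hypothetical custom module."""
+
+    def filter_func(self, name):
+        # select only the weights belonging to the module with the given name
+        return lambda weight_name: f"scalers.{name}." in weight_name
+
+    def rename_func(self, old_name, new_name):
+        # rename weight entries when loading them under a different module name
+        return lambda weight_name: weight_name.replace(
+            f"scalers.{old_name}.", f"scalers.{new_name}."
+        )
+```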
 + +Using the custom loader class, weights can now be saved with: +```python +loader = MyCustomWeightsLoader(model) +loader.save("path/to/save/dir", "custom_weights_name") +``` + +You can also upload these weights to the Hub and then load them from there together with an adapter: +```python +model.load_adapter( + "adapter_name", + custom_weights_loaders=[MyCustomWeightsLoader] +) +``` diff --git a/_sources/hub_contributing.md.txt b/_sources/hub_contributing.md.txt new file mode 100644 index 0000000000..b427171e5c --- /dev/null +++ b/_sources/hub_contributing.md.txt @@ -0,0 +1,7 @@ +# Contributing Adapters to the Hub + +```{eval-rst} +.. warning:: + The original approach of contributing adapters via the Hub repository is deprecated. Please upload all new adapters to HuggingFace's Model Hub as described in `Integration with Hugging Face's Model Hub `_. + For the legacy documentation, refer to `here `_. +``` diff --git a/_sources/huggingface_hub.md.txt b/_sources/huggingface_hub.md.txt new file mode 100644 index 0000000000..cc1e6034ac --- /dev/null +++ b/_sources/huggingface_hub.md.txt @@ -0,0 +1,71 @@ +# Integration with Hugging Face's Model Hub + +```{eval-rst} +.. figure:: img/hfhub.svg + :align: center + :alt: Hugging Face Hub logo. +``` + +You can download adapters from and upload them to [Hugging Face's Model Hub](https://huggingface.co/models). +This document describes how to interact with the Model Hub when working with adapters. + +## Downloading from the Hub + +The Hugging Face Model Hub already provides hundreds of pre-trained adapters available for download. +To search for available adapters, use the _Adapters_ library filter on the Model Hub website or use this link: [https://huggingface.co/models?library=adapter-transformers](https://huggingface.co/models?library=adapter-transformers). +Alternatively, all adapters on the Hugging Face Model Hub are also listed on [https://adapterhub.ml/explore](https://adapterhub.ml/explore) together with all adapters directly uploaded to AdapterHub. + +After you have found an adapter you would like to use, loading it into a Transformer model is easy. +For example, for loading and activating the adapter [`AdapterHub/roberta-base-pf-sick`](https://huggingface.co/AdapterHub/roberta-base-pf-sick), write: +```python +from adapters import AutoAdapterModel + +model = AutoAdapterModel.from_pretrained("roberta-base") +adapter_name = model.load_adapter("AdapterHub/roberta-base-pf-sick") +model.active_adapters = adapter_name +``` + +## Uploading to the Hub + +Hugging Face's Model Hub provides a convenient way for everyone to upload their pre-trained models and share them with the world. +Of course, this is also possible with adapters now! +In the following, we'll go through the fastest way of uploading an adapter directly via Python in the `adapters` library. +For more options and information, e.g.
for managing models via the CLI and Git, refer to [Hugging Face's documentation](https://huggingface.co/transformers/model_sharing.html). + +1. **Prepare access credentials**: Before being able to push to the Hugging Face Model Hub for the first time, we have to store our access token in the cache. + This can be done via the `huggingface-cli` by running: + ``` + huggingface-cli login + ``` + +2. **Push an adapter**: Next, we can proceed to upload our first adapter. + Let's say we have a standard pre-trained Transformers model with an existing adapter named `awesome_adapter` (e.g. added via `model.add_adapter("awesome_adapter")` and [trained](training.md) afterwards). + We can now push this adapter to the Model Hub using `model.push_adapter_to_hub()` like this: + ```python + model.push_adapter_to_hub( + "my-awesome-adapter", + "awesome_adapter", + adapterhub_tag="sentiment/imdb", + datasets_tag="imdb" + ) + ``` + This will create a repository `my-awesome-adapter` under your username, generate a default adapter card as `README.md` and upload the adapter named `awesome_adapter` together with the adapter card to the new repository. + `adapterhub_tag` and `datasets_tag` provide additional information for categorization. + + ```{eval-rst} + .. important:: + All adapters uploaded to Hugging Face's Model Hub are automatically also listed on AdapterHub.ml. Thus, for better categorization, either ``adapterhub_tag`` or ``datasets_tag`` is required when uploading a new adapter to the Model Hub. + + - ``adapterhub_tag`` specifies the AdapterHub categorization of the adapter in the format ``/`` according to the tasks and subtasks shown on https://adapterhub.ml/explore. For more, see `Add a new task or subtask `_. + - ``datasets_tag`` specifies the dataset the adapter was trained on as an identifier from `Hugging Face Datasets `_. + ``` + +Voilà! Your first adapter is on the Hugging Face Model Hub. +Anyone can now run: +``` +model.load_adapter("/my-awesome-adapter", source="hf") +``` + +To update your adapter, simply run `push_adapter_to_hub()` with the same repository name again. This will push a new commit to the existing repository. + +You can find the full documentation of `push_adapter_to_hub()` [here](adapters.hub_mixin.PushAdapterToHubMixin.push_adapter_to_hub). diff --git a/_sources/index.rst.txt b/_sources/index.rst.txt new file mode 100644 index 0000000000..b78a249c64 --- /dev/null +++ b/_sources/index.rst.txt @@ -0,0 +1,168 @@ +.. adapters documentation main file, created by + sphinx-quickstart on Sat Apr 18 10:21:23 2020. + You can adapt this file completely to your liking, but it should at least + contain the root `toctree` directive. + +AdapterHub Documentation +================================================ + +.. note:: + This documentation is based on the new *Adapters* library. + + The documentation based on the legacy *adapter-transformers* library can be found at: `https://docs-legacy.adapterhub.ml `_. + +*AdapterHub* is a framework simplifying the integration, training and usage of adapters and other efficient fine-tuning methods for Transformer-based language models. +For a full list of currently implemented methods, see the `table in our repository `_. + +The framework consists of two main components: + +.. 
list-table:: + :widths: 50 50 + :header-rows: 1 + + * - `Adapters `_ + - `AdapterHub.ml `_ + * - an add-on to Hugging Face's `Transformers `_ library that adds adapters into transformer models + - a central collection of pre-trained adapter modules + +Currently, we support the PyTorch versions of all models as listed on the `Model Overview `_ page. + +.. toctree:: + :maxdepth: 2 + :caption: Getting Started + + installation + quickstart + training + transitioning + +.. toctree:: + :maxdepth: 2 + :caption: Adapter Methods + + overview + methods + method_combinations + +.. toctree:: + :maxdepth: 2 + :caption: Advanced + + adapter_composition + prediction_heads + embeddings + extending + +.. toctree:: + :maxdepth: 2 + :caption: Loading and Sharing + + loading + huggingface_hub + +.. toctree:: + :maxdepth: 1 + :caption: Supported Models + + model_overview + classes/models/albert + classes/models/auto + classes/models/bart + classes/models/beit + classes/models/bert + classes/models/bert-generation + classes/models/clip + classes/models/deberta + classes/models/deberta_v2 + classes/models/distilbert + classes/models/electra + classes/models/encoderdecoder + classes/models/gpt2 + classes/models/gptj + classes/models/llama + classes/models/mbart + classes/models/mt5 + classes/models/roberta + classes/models/t5 + classes/models/vit + classes/models/xlmroberta + classes/models/xmod + +.. toctree:: + :maxdepth: 1 + :caption: Adapter-Related Classes + + classes/adapter_config + classes/model_adapters_config + classes/adapter_layer + classes/model_mixins + classes/adapter_training + classes/adapter_utils + +.. toctree:: + :maxdepth: 1 + :caption: Contributing + + contributing + contributing/adding_adapter_methods + contributing/adding_adapters_to_a_model + +Citation +======== + +If you use *Adapters* in your work, please consider citing our library paper `Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning `_ + + +.. code-block:: bibtex + + @inproceedings{poth-etal-2023-adapters, + title = "Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning", + author = {Poth, Clifton and + Sterz, Hannah and + Paul, Indraneil and + Purkayastha, Sukannya and + Engl{\"a}nder, Leon and + Imhof, Timo and + Vuli{\'c}, Ivan and + Ruder, Sebastian and + Gurevych, Iryna and + Pfeiffer, Jonas}, + booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations", + month = dec, + year = "2023", + address = "Singapore", + publisher = "Association for Computational Linguistics", + url = "https://aclanthology.org/2023.emnlp-demo.13", + pages = "149--160", + } + + +Alternatively, for the predecessor `adapter-transformers`, the Hub infrastructure and adapters uploaded by the AdapterHub team, please consider citing our initial paper: `AdapterHub: A Framework for Adapting Transformers `_ + + +.. 
code-block:: bibtex + + @inproceedings{pfeiffer2020AdapterHub, + title={AdapterHub: A Framework for Adapting Transformers}, + author={Jonas Pfeiffer and + Andreas R\"uckl\'{e} and + Clifton Poth and + Aishwarya Kamath and + Ivan Vuli\'{c} and + Sebastian Ruder and + Kyunghyun Cho and + Iryna Gurevych}, + booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020): Systems Demonstrations}, + year={2020}, + address = "Online", + publisher = "Association for Computational Linguistics", + url = "https://www.aclweb.org/anthology/2020.emnlp-demos.7", + pages = "46--54", + } + + +Indices and tables +================== + +* :ref:`genindex` +* :ref:`modindex` diff --git a/_sources/installation.md.txt b/_sources/installation.md.txt new file mode 100644 index 0000000000..c3b8468eb8 --- /dev/null +++ b/_sources/installation.md.txt @@ -0,0 +1,40 @@ +# Installation + +The `adapters` package is designed as an add-on for Hugging Face's Transformers library. +It currently supports Python 3.8+ and PyTorch 1.10+. You will have to [install PyTorch](https://pytorch.org/get-started/locally/) first. + +```{eval-rst} +.. important:: + Each ``adapters`` version is built for one specific version of Transformers. + While using a different version of Transformers with ``adapters`` might work, it is highly recommended to use the intended version. + ``adapters`` will automatically install the correct Transformers version if not installed. +``` + +## Using pip + +### From PyPI + +The simplest way of installation is by using pip to install the package from the Python Package Index: + +``` +pip install adapters +``` + +### From GitHub + +You can also install the latest development version directly from our GitHub repository: + +``` +pip install git+https://github.com/adapter-hub/adapters.git +``` + +## From repository + +Alternatively, you can clone the repository first and install the package from source. +This allows you to run the included example scripts directly: + +``` +git clone https://github.com/adapter-hub/adapters.git +cd adapters +pip install . +``` diff --git a/_sources/loading.md.txt b/_sources/loading.md.txt new file mode 100644 index 0000000000..573aab8bc8 --- /dev/null +++ b/_sources/loading.md.txt @@ -0,0 +1,108 @@ +# Loading Pre-Trained Adapters + +## Finding pre-trained adapters + +**[AdapterHub.ml](https://adapterhub.ml/explore)** provides a central collection of all pre-trained adapters uploaded via Hugging Face's [Model Hub](https://huggingface.co/models). +You can easily find pre-trained adapters for your task of interest along with all relevant information and code snippets to get started. + +```{eval-rst} +.. note:: + The original `Hub repository `_ (via ``source="ah"``) has been archived and migrated to the HuggingFace Model Hub. The Adapters library supports automatic redirecting to the HF Model Hub when attempting to load adapters from the original Hub repository. +``` + +Alternatively, [`list_adapters()`](adapters.utils.list_adapters) provides a programmatic way of accessing all available pre-trained adapters. +This will return an [`AdapterInfo`](adapters.utils.AdapterInfo) object for each retrieved adapter.
+E.g., we can use it to retrieve information for all adapters trained for a specific model: + +```python +from adapters import list_adapters + +# source can be "ah" (archived Hub repo), "hf" (huggingface.co) or None (for both, default) +adapter_infos = list_adapters(source="hf", model_name="bert-base-uncased") + +for adapter_info in adapter_infos: + print("Id:", adapter_info.adapter_id) + print("Model name:", adapter_info.model_name) + print("Uploaded by:", adapter_info.username) +``` + +In case the adapter ID is known, information for a single adapter can also be retrieved via [`get_adapter_info()`](adapters.utils.get_adapter_info): + +```python +adapter_info = get_adapter_info("@ukp/bert-base-uncased_sentiment_sst-2_pfeiffer", source="ah") + +print("Id:", adapter_info.adapter_id) +print("Model name:", adapter_info.model_name) +print("Uploaded by:", adapter_info.username) +``` + +## Using pre-trained adapters in your code + +Suppose we have loaded a pre-trained transformer model from Hugging Face, e.g. BERT, and initialized it for adding adapters: + +```python +from transformers import BertModel +import adapters + +model = BertModel.from_pretrained('bert-base-uncased') +adapters.init(model) +``` + +We can now easily load a pre-trained adapter module from Adapter Hub by its identifier using the [`load_adapter()`](adapters.ModelWithHeadsAdaptersMixin.load_adapter) method: + +```python +adapter_name = model.load_adapter('sst-2') +``` + +In the minimal case, that's everything we need to specify to load a pre-trained task adapter for sentiment analysis, trained on the `sst-2` dataset using BERT base and a suitable adapter configuration. +The name of the adapter is returned by [`load_adapter()`](adapters.ModelWithHeadsAdaptersMixin.load_adapter), so we can [activate it](adapter_composition.md) in the next step: +```python +model.set_active_adapters(adapter_name) +``` + +As a second example, let's have a look at how to load an adapter based on the [`AdapterInfo`](adapters.utils.AdapterInfo) returned by the [`list_adapters()`](adapters.utils.list_adapters) method from [above](#finding-pre-trained-adapters): +```python +from adapters import AutoAdapterModel, list_adapters + +adapter_infos = list_adapters(source="ah") +# Take the first adapter info as an example +adapter_info = adapter_infos[0] + +model = AutoAdapterModel.from_pretrained(adapter_info.model_name) +model.load_adapter(adapter_info.adapter_id, source=adapter_info.source) +``` + +### Advanced usage of `load_adapter()` + +To examine what's happening underneath in a bit more detail, let's first write out the full method call with all relevant arguments explicitly stated: + +```python +model.load_adapter( + 'sst-2', + config='pfeiffer', + model_name='bert-base-uncased', + version=1, + load_as='sst', + source='ah' +) +``` + +We will go through the different arguments and their meaning one by one: + +- The first argument passed to the method specifies the name of the adapter we want to load from Adapter-Hub. The library will search for an available adapter module with this name that matches the model architecture as well as the adapter type and configuration we requested. As the identifier `sst-2` resolves to a unique entry in the Hub, the corresponding adapter can be successfully loaded based on this information. To get an overview of all available adapter identifiers, please refer to [the Adapter-Hub website](https://adapterhub.ml/explore).
+ +- The `config` argument defines the adapter architecture the loaded adapter should have. +The value of this parameter can be either a string identifier for one of the predefined architectures, the identifier of an architecture available in the Hub or a dictionary representing a full adapter configuration. +Based on this information, the library will only search for pre-trained adapter modules having the same configuration. + +- Adapter modules trained on different pre-trained language models generally cannot be used interchangeably. +Therefore, we need to make sure to load an adapter matching the language model we are using. +If possible, the library will infer the name of the pre-trained model automatically (e.g. when we use `from_pretrained('identifier')` to load a model from Hugging Face). However, if this is not the case, we must specify the name of the host model in the `model_name` parameter. + +- There could be multiple versions of the same adapter available. To load a specific version, use the `version` parameter. + +- By default, the `load_adapter()` method will add the loaded adapter using the identifier string given as the first argument. +To load the adapter using a custom name, we can use the `load_as` parameter. + +- Finally, the `source` parameter allows loading adapters from alternative adapter repositories. +Besides the default value `ah`, referring to AdapterHub, it's also possible to pass `hf` to [load adapters from Hugging Face's Model Hub](huggingface_hub.md). diff --git a/_sources/method_combinations.md.txt b/_sources/method_combinations.md.txt new file mode 100644 index 0000000000..4bd57ac10c --- /dev/null +++ b/_sources/method_combinations.md.txt @@ -0,0 +1,124 @@ +# Method Combinations + +_Configuration class_: [`ConfigUnion`](adapters.ConfigUnion) + +While different efficient fine-tuning methods and configurations have often been proposed as standalone methods, combining them for joint training might be beneficial. +To make this process easier, `adapters` allows grouping multiple configuration instances using the [`ConfigUnion`](adapters.ConfigUnion) class. + +For example, this could be used to define different reduction factors for the adapter modules placed after the multi-head attention and the feed-forward blocks: + +```python +from adapters import BnConfig, ConfigUnion + +config = ConfigUnion( + BnConfig(mh_adapter=True, output_adapter=False, reduction_factor=16, non_linearity="relu"), + BnConfig(mh_adapter=False, output_adapter=True, reduction_factor=2, non_linearity="relu"), +) +model.add_adapter("union_adapter", config=config) +``` + +## Mix-and-Match Adapters + +_Configuration class_: [`MAMConfig`](adapters.MAMConfig) + +[He et al. (2021)](https://arxiv.org/pdf/2110.04366.pdf) study various variants and combinations of efficient fine-tuning methods. +They propose _Mix-and-Match Adapters_ as a combination of Prefix Tuning and parallel bottleneck adapters.
+This configuration is supported by `adapters` out-of-the-box: + +```python +from adapters import MAMConfig + +config = MAMConfig() +model.add_adapter("mam_adapter", config=config) +``` + +and is identical to using the following `ConfigUnion`: + +```python +from adapters import ConfigUnion, ParBnConfig, PrefixTuningConfig + +config = ConfigUnion( + PrefixTuningConfig(bottleneck_size=800), + ParBnConfig(), +) +model.add_adapter("mam_adapter", config=config) +``` + +_Papers:_ +- [Towards a Unified View of Parameter-Efficient Transfer Learning](https://arxiv.org/pdf/2110.04366.pdf) (He et al., 2021) + +## UniPELT + +_Configuration class_: [`UniPELTConfig`](adapters.UniPELTConfig) + +```{eval-rst} +.. figure:: img/unipelt.png + :height: 300 + :align: center + :alt: Illustration of UniPELT. + + Illustration of the UniPELT method within one Transformer layer. Trained components are colored in shades of magenta. +``` + +An approach similar to the work of [He et al. (2021)](https://arxiv.org/pdf/2110.04366.pdf) is taken by [Mao et al. (2022)](https://arxiv.org/pdf/2110.07577.pdf) in their _UniPELT_ framework. +They, too, combine multiple efficient fine-tuning methods, namely LoRA, Prefix Tuning and bottleneck adapters, in a single unified setup. +_UniPELT_ additionally introduces a gating mechanism that controls the activation of the different submodules. + +Concretely, for each adapted module $m$, UniPELT adds a trainable gating value $\mathcal{G}_m \in (0, 1)$ that is computed via a feed-forward network ($W_{\mathcal{G}_m}$) and sigmoid activation ($\sigma$) from the Transformer layer input states ($x$): + +$$\mathcal{G}_m \leftarrow \sigma(W_{\mathcal{G}_m} \cdot x)$$ + +These gating values are then used to scale the output activations of the injected adapter modules, e.g., for a LoRA layer: + +$$ +h \leftarrow W_0 x + \mathcal{G}_{LoRA} B A x +$$ + +In the configuration classes of `adapters`, these gating mechanisms can be activated via `use_gating=True`. +The full UniPELT setup can be instantiated using `UniPELTConfig`[^unipelt]: + +[^unipelt]: Note that the implementation of UniPELT in `adapters` follows the implementation in the original code, which is slightly different from the description in the paper. See [here](https://github.com/morningmoni/UniPELT/issues/1) for more. + +```python +from adapters import UniPELTConfig + +config = UniPELTConfig() +model.add_adapter("unipelt", config=config) +``` + +which is identical to the following `ConfigUnion`: + +```python +from adapters import ConfigUnion, LoRAConfig, PrefixTuningConfig, SeqBnConfig + +config = ConfigUnion( + LoRAConfig(r=8, alpha=2, use_gating=True), + PrefixTuningConfig(prefix_length=10, use_gating=True), + SeqBnConfig(reduction_factor=16, use_gating=True), +) +model.add_adapter("unipelt", config=config) +``` + +Finally, as the gating values for each adapter module might provide interesting insights for analysis, `adapters` comes with a built-in mechanism for returning all gating values computed during a model forward pass via the `output_adapter_gating_scores` parameter: + +```python +outputs = model(**inputs, output_adapter_gating_scores=True) +gating_scores = outputs.adapter_gating_scores +``` +Note that this parameter is only available to base model classes and [AdapterModel classes](prediction_heads.md#adaptermodel-classes). +In the example, `gating_scores` holds a dictionary of the following form: +``` +{ + '<adapter_name>': { + <layer_id>: { + '<module_location>': np.array([...]), + ... + }, + ... + }, + ...
+} +``` + +_Papers:_ +- [UNIPELT: A Unified Framework for Parameter-Efficient Language Model Tuning](https://arxiv.org/pdf/2110.07577.pdf) (Mao et al., 2022) diff --git a/_sources/methods.md.txt b/_sources/methods.md.txt new file mode 100644 index 0000000000..535b23d088 --- /dev/null +++ b/_sources/methods.md.txt @@ -0,0 +1,297 @@ +# Adapter Methods + +On this page, we present all adapter methods currently integrated into the `adapters` library. +A tabular overview of adapter methods is provided [here](overview.md#table-of-adapter-methods). +Additionally, options to combine multiple adapter methods in a single setup are presented [on the next page](method_combinations.md). + +## Bottleneck Adapters + +_Configuration class_: [`BnConfig`](adapters.BnConfig) + +Bottleneck adapters introduce bottleneck feed-forward layers in each layer of a Transformer model. +Generally, these adapter layers consist of a down-projection matrix $W_{down}$ that projects the layer hidden states into a lower dimension $d_{bottleneck}$, a non-linearity $f$, an up-projection $W_{up}$ that projects back into the original hidden layer dimension and a residual connection $r$: + +$$ +h \leftarrow W_{up} \cdot f(W_{down} \cdot h) + r +$$ + +Depending on the concrete adapter configuration, these layers can be introduced at different locations within a Transformer block. Further, residual connections, layer norms, activation functions, bottleneck sizes, etc., can be configured. + +The most important configuration hyperparameter to be highlighted here is the bottleneck dimension $d_{bottleneck}$. +In `adapters`, this bottleneck dimension is specified indirectly via the `reduction_factor` attribute of a configuration. +This `reduction_factor` defines the ratio between a model's layer hidden dimension and the bottleneck dimension, i.e.: + +$$ +\text{reduction_factor} = \frac{d_{hidden}}{d_{bottleneck}} +$$ + +For example, with a hidden dimension of $d_{hidden} = 768$ and `reduction_factor=16`, the bottleneck dimension is $d_{bottleneck} = 48$. + +A visualization of further configuration options related to the adapter structure is given in the figure below. For more details, we refer to the documentation of [`BnConfig`](adapters.BnConfig). + + +```{eval-rst} +.. figure:: img/architecture.png + :width: 350 + :align: center + :alt: Adapter architectures + + Visualization of possible adapter configurations with corresponding dictionary keys. +``` + +`adapters` comes with pre-defined configurations for some bottleneck adapter architectures proposed in the literature: + +- [`DoubleSeqBnConfig`](adapters.DoubleSeqBnConfig), as proposed by [Houlsby et al. (2019)](https://arxiv.org/pdf/1902.00751.pdf), places adapter layers after both the multi-head attention and feed-forward block in each Transformer layer. +- [`SeqBnConfig`](adapters.SeqBnConfig), as proposed by [Pfeiffer et al. (2020)](https://arxiv.org/pdf/2005.00052.pdf), places an adapter layer only after the feed-forward block in each Transformer layer. +- [`ParBnConfig`](adapters.ParBnConfig), as proposed by [He et al. (2021)](https://arxiv.org/pdf/2110.04366.pdf), places adapter layers in parallel to the original Transformer layers.
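+To make the bottleneck computation above concrete, the following is a minimal PyTorch sketch of a single adapter layer (an illustration only, not the library's internal implementation; the hidden size of 768 and reduction factor of 16 are example values): + +```python +import torch.nn as nn + +class BottleneckAdapter(nn.Module): + # h <- W_up * f(W_down * h) + r, with d_bottleneck = d_hidden / reduction_factor + def __init__(self, d_hidden=768, reduction_factor=16): + super().__init__() + d_bottleneck = d_hidden // reduction_factor # 768 / 16 = 48 + self.down = nn.Linear(d_hidden, d_bottleneck) # W_down + self.up = nn.Linear(d_bottleneck, d_hidden) # W_up + self.f = nn.ReLU() # non-linearity f + + def forward(self, h, residual): + # project down, apply the non-linearity, project up, add the residual + return self.up(self.f(self.down(h))) + residual +```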
+ +_Example_: +```python +from adapters import BnConfig + +config = BnConfig(mh_adapter=True, output_adapter=True, reduction_factor=16, non_linearity="relu") +model.add_adapter("bottleneck_adapter", config=config) +``` + +_Papers:_ + +* [Parameter-Efficient Transfer Learning for NLP](https://arxiv.org/pdf/1902.00751.pdf) (Houlsby et al., 2019) +* [Simple, Scalable Adaptation for Neural Machine Translation](https://arxiv.org/pdf/1909.08478.pdf) (Bapna and Firat, 2019) +* [AdapterFusion: Non-Destructive Task Composition for Transfer Learning](https://aclanthology.org/2021.eacl-main.39.pdf) (Pfeiffer et al., 2021) +* [AdapterHub: A Framework for Adapting Transformers](https://arxiv.org/pdf/2007.07779.pdf) (Pfeiffer et al., 2020) + +## Language Adapters - Invertible Adapters + +_Configuration class_: [`SeqBnInvConfig`](adapters.SeqBnInvConfig), [`DoubleSeqBnInvConfig`](adapters.DoubleSeqBnInvConfig) + +The MAD-X setup ([Pfeiffer et al., 2020](https://arxiv.org/pdf/2005.00052.pdf)) proposes language adapters to learn language-specific transformations. +After being trained on a language modeling task, a language adapter can be stacked before a task adapter for training on a downstream task. +To perform zero-shot cross-lingual transfer, one language adapter can simply be replaced by another. + +In terms of architecture, language adapters are largely similar to regular bottleneck adapters, except for an additional _invertible adapter_ layer after the LM embedding layer. +Embedding outputs are passed through this invertible adapter in the forward direction before entering the first Transformer layer and in the inverse direction after leaving the last Transformer layer. +Invertible adapter architectures are further detailed in [Pfeiffer et al. (2020)](https://arxiv.org/pdf/2005.00052.pdf) and can be configured via the `inv_adapter` attribute of the `BnConfig` class. + +_Example_: +```python +from adapters import SeqBnInvConfig + +config = SeqBnInvConfig() +model.add_adapter("lang_adapter", config=config) +``` + +_Papers:_ +- [MAD-X: An Adapter-based Framework for Multi-task Cross-lingual Transfer](https://arxiv.org/pdf/2005.00052.pdf) (Pfeiffer et al., 2020) + +```{eval-rst} +.. note:: + V1.x of adapters made a distinction between task adapters (without invertible adapters) and language adapters (with invertible adapters) with the help of the ``AdapterType`` enumeration. + This distinction was dropped with v2.x. +``` + +## Prefix Tuning + +_Configuration class_: [`PrefixTuningConfig`](adapters.PrefixTuningConfig) + +```{eval-rst} +.. figure:: img/prefix.png + :height: 300 + :align: center + :alt: Illustration of Prefix Tuning. + + Illustration of the Prefix Tuning method within one Transformer layer. Trained components are colored in shades of magenta. +``` + +Prefix Tuning ([Li and Liang, 2021](https://aclanthology.org/2021.acl-long.353.pdf)) introduces new parameters in the multi-head attention blocks in each Transformer layer. +More specifically, it prepends trainable prefix vectors $P^K$ and $P^V$ to the keys and values of the attention head input, each of a configurable prefix length $l$ (`prefix_length` attribute): + +$$ +head_i = \text{Attention}(Q W_i^Q, [P_i^K, K W_i^K], [P_i^V, V W_i^V]) +$$ + +Following the original authors, the prefix vectors in $P^K$ and $P^V$ are not optimized directly but reparameterized via a bottleneck MLP. +This behavior is controlled via the `flat` attribute of the configuration. 
+Using `PrefixTuningConfig(flat=True)` will create prefix tuning vectors that are optimized without reparameterization. + +_Example_: +```python +from adapters import PrefixTuningConfig + +config = PrefixTuningConfig(flat=False, prefix_length=30) +model.add_adapter("prefix_tuning", config=config) +``` + +As reparameterization using the bottleneck MLP is not necessary for performing inference on an already trained Prefix Tuning module, `adapters` includes a function to "eject" a reparameterized Prefix Tuning into a flat one: +```python +model.eject_prefix_tuning("prefix_tuning") +``` +This retains only the necessary parameters and reduces the size of the trained prefix tuning module. + +_Papers:_ +- [Prefix-Tuning: Optimizing Continuous Prompts for Generation](https://arxiv.org/pdf/2101.00190.pdf) (Li and Liang, 2021) + +## Compacter + +_Configuration class_: [`CompacterConfig`](adapters.CompacterConfig), [`CompacterPlusPlusConfig`](adapters.CompacterPlusPlusConfig) + +```{eval-rst} +.. figure:: img/compacter.png + :height: 300 + :align: center + :alt: Illustration of Compacter. + + Illustration of the Compacter method within one Transformer layer. Trained components are colored in shades of magenta. +``` + +The Compacter architecture proposed by [Mahabadi et al., 2021](https://arxiv.org/pdf/2106.04647.pdf) +is similar to the bottleneck adapter architecture. It merely exchanges the linear down- and +up-projections with PHM layers. Unlike the linear layer, the PHM layer constructs its weight matrix from two smaller matrices, which reduces the number of parameters. + These matrices can be factorized and shared between all adapter layers. You can exchange the down- and up-projection layers from any of the bottleneck adapters described in the previous section +for a PHM layer by specifying `use_phm=True` in the config. + +The PHM layer has the following additional properties: `phm_dim`, `shared_phm_rule`, `factorized_phm_rule`, `learn_phm`, +`factorized_phm_W`, `shared_W_phm`, `phm_c_init`, `phm_init_range`, `hypercomplex_nonlinearity`. + +For more information, check out the [`BnConfig`](adapters.BnConfig) class. + +To add a Compacter to your model, you can use the predefined configs: +```python +from adapters import CompacterConfig + +config = CompacterConfig() +model.add_adapter("dummy", config=config) +``` +_Papers:_ +- [COMPACTER: Efficient Low-Rank Hypercomplex Adapter Layers](https://arxiv.org/pdf/2106.04647.pdf) (Mahabadi, Henderson and Ruder, 2021) + +## LoRA + +_Configuration class_: [`LoRAConfig`](adapters.LoRAConfig) + +```{eval-rst} +.. figure:: img/lora.png + :height: 300 + :align: center + :alt: Illustration of LoRA. + + Illustration of the LoRA method within one Transformer layer. Trained components are colored in shades of magenta. +``` + +Low-Rank Adaptation (LoRA) is an efficient fine-tuning technique proposed by [Hu et al. (2021)](https://arxiv.org/pdf/2106.09685.pdf). +LoRA injects trainable low-rank decomposition matrices into the layers of a pre-trained model. +For any model layer expressed as a matrix multiplication of the form $h = W_0 x$, it performs a reparameterization, such that: + +$$ +h = W_0 x + \frac{\alpha}{r} B A x +$$ + +Here, $A \in \mathbb{R}^{r\times k}$ and $B \in \mathbb{R}^{d\times r}$ are the decomposition matrices and $r$, the low-dimensional rank of the decomposition, is the most important hyperparameter.
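+To illustrate this reparameterization, here is a minimal PyTorch sketch of a LoRA-augmented linear layer (a simplified illustration, not the library's internal implementation): + +```python +import torch +import torch.nn as nn + +class LoRALinear(nn.Module): + # h = W_0 x + (alpha / r) * B A x, with W_0 frozen and only A, B trained + def __init__(self, d_in, d_out, r=8, alpha=16): + super().__init__() + self.w0 = nn.Linear(d_in, d_out, bias=False) + self.w0.weight.requires_grad = False # pre-trained weights stay fixed + self.lora_a = nn.Parameter(torch.randn(r, d_in) * 0.01) # A + self.lora_b = nn.Parameter(torch.zeros(d_out, r)) # B, zero-init so training starts from W_0 + self.scaling = alpha / r + + def forward(self, x): + return self.w0(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T) +```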
+ +While, in principle, this reparameterization can be applied to any weight matrix in a model, the original paper only adapts the attention weights of the Transformer self-attention sub-layer with LoRA. +`adapters` additionally allows injecting LoRA into the dense feed-forward layers in the intermediate and output components of a Transformer block. +You can configure the locations where LoRA weights should be injected using the attributes in the [`LoRAConfig`](adapters.LoRAConfig) class. + +_Example_: +```python +from adapters import LoRAConfig + +config = LoRAConfig(r=8, alpha=16) +model.add_adapter("lora_adapter", config=config) +``` + +In the design of LoRA, Hu et al. (2021) also pay special attention to keeping the inference latency overhead compared to full fine-tuning at a minimum. +To accomplish this, the LoRA reparameterization can be merged with the original pre-trained weights of a model for inference. +Thus, the adapted weights are directly used in every forward pass without passing activations through an additional module. +In `adapters`, this can be realized using the built-in [`merge_adapter()`](adapters.ModelAdaptersMixin.merge_adapter) method: +```python +model.merge_adapter("lora_adapter") +``` + +To continue training on this LoRA adapter or to deactivate it entirely, the merged weights first have to be reset again: +```python +model.reset_adapter() +``` + +_Papers:_ +- [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/pdf/2106.09685.pdf) (Hu et al., 2021) + +## (IA)^3 + +_Configuration class_: [`IA3Config`](adapters.IA3Config) + +```{eval-rst} +.. figure:: img/ia3.png + :height: 300 + :align: center + :alt: Illustration of (IA)^3. + + Illustration of the (IA)^3 method within one Transformer layer. Trained components are colored in shades of magenta. +``` + +_Infused Adapter by Inhibiting and Amplifying Inner Activations ((IA)^3)_ is an efficient fine-tuning method proposed within the _T-Few_ fine-tuning approach by [Liu et al. (2022)](https://arxiv.org/pdf/2205.05638.pdf). +(IA)^3 introduces trainable vectors $l_W$ into different components of a Transformer model, which perform element-wise rescaling of inner model activations. +For any model layer expressed as a matrix multiplication of the form $h = W x$, it therefore performs an element-wise multiplication with $l_W$, such that: + +$$ +h = l_W \odot W x +$$ + +Here, $\odot$ denotes element-wise multiplication where the entries of $l_W$ are broadcasted to the shape of $W$. + +_Example_: +```python +from adapters import IA3Config + +config = IA3Config() +model.add_adapter("ia3_adapter", config=config) +``` + +The implementation of (IA)^3, as well as the [`IA3Config`](adapters.IA3Config) class, are derived from the implementation of [LoRA](#lora), with a few main modifications. +First, (IA)^3 uses multiplicative composition of weights instead of additive composition, as in LoRA. +Second, the added weights are not further decomposed into low-rank matrices. +These modifications are controlled via the `composition_mode` configuration attribute by setting `composition_mode="scale"`. +Additionally, as the added weights are already of rank 1, `r=1` is set. + +Beyond that, both methods share the same configuration attributes that allow you to specify in which Transformer components rescaling vectors will be injected. 
+Following the original implementation, [`IA3Config`](adapters.IA3Config) adds rescaling vectors to the self-attention weights (`selfattn_lora=True`) and the final feed-forward layer (`output_lora=True`). +Further, you can modify which matrices of the attention mechanism to rescale by leveraging the `attn_matrices` attribute. +By default, (IA)^3 injects weights into the key ('k') and value ('v') matrices but not into the query ('q') matrix. + +Finally, similar to LoRA, (IA)^3 also allows merging the injected parameters with the original weight matrices of the Transformer model. +E.g.: +```python +# Merge (IA)^3 adapter +model.merge_adapter("ia3_adapter") + +# Reset merged weights +model.reset_adapter() +``` + +_Papers:_ +- [Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning](https://arxiv.org/pdf/2205.05638.pdf) (Liu et al., 2022) + +## Prompt Tuning +Prompt Tuning is an efficient fine-tuning technique proposed by Lester et al. (2021). Prompt tuning adds tunable tokens, called soft-prompts, that are prepended to the input text. +First, the input sequence $\{x_1, x_2, \dots, x_n\}$ is embedded, resulting in the matrix $X_e \in \mathbb{R}^{n \times e}$ where $e$ is the dimension of +the embedding space. The soft-prompts with length $p$ are represented as $P_e \in \mathbb{R}^{p \times e}$. +$P_e$ and $X_e$ are concatenated, forming the input of the following encoder or decoder: + +$$ +\left[P_e; X_e\right] \in \mathbb{R}^{\left(p + n\right) \times e} +$$ + +The `PromptTuningConfig` has the following properties: +- `prompt_length`: to set the soft-prompt length $p$ +- `prompt_init`: to set the weight initialization method, either `"random_uniform"` or `"from_string"`; the latter initializes each prompt token with an embedding drawn from the model’s vocabulary. + - `prompt_init_text`: the text used for initialization if `prompt_init="from_string"` +- `combine`: to define whether the prefix should be added before the embedded input sequence or after the BOS token + +To add Prompt Tuning to your model, you can use the predefined configs: +```python +from adapters import PromptTuningConfig + +config = PromptTuningConfig(prompt_length=10) +model.add_adapter("dummy", config=config) +``` + +_Papers:_ +- [The Power of Scale for Parameter-Efficient Prompt Tuning](https://aclanthology.org/2021.emnlp-main.243/) (Lester et al., 2021) + diff --git a/_sources/model_overview.md.txt b/_sources/model_overview.md.txt new file mode 100644 index 0000000000..58ae523b43 --- /dev/null +++ b/_sources/model_overview.md.txt @@ -0,0 +1,42 @@ +# Model Overview + +This page gives an overview of the Transformer models currently supported by `adapters`. +The table below further shows which model architectures support which adaptation methods and which features of `adapters`. + +```{eval-rst} +.. note:: + Each supported model architecture X typically provides a class ``XAdapterModel`` for usage with ``AutoAdapterModel``. + Additionally, it is possible to use adapters with the model classes already shipped with Hugging Face Transformers. For these classes, initialize the model for adapters with `adapters.init(model)`. + E.g., for BERT, this means adapters provides a ``BertAdapterModel`` class, but you can also use ``BertModel``, ``BertForSequenceClassification`` etc. together with adapters. +``` + +| Model | (Bottleneck)<br>Adapters | Prefix<br>Tuning | LoRA | Compacter | Adapter<br>Fusion | Invertible<br>Adapters | Parallel<br>block | Prompt<br>Tuning |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | +| [ALBERT](classes/models/albert.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | +| [BART](classes/models/bart.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | +| [BEIT](classes/models/beit.html) | ✅ | ✅ | ✅ | ✅ | ✅ | | | ✅ | +| [BERT-Generation](classes/models/bert-generation.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | +| [BERT](classes/models/bert.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | +| [CLIP](classes/models/clip.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | | +| [DeBERTa](classes/models/deberta.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | +| [DeBERTa-v2](classes/models/deberta_v2.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | +| [DistilBERT](classes/models/distilbert.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | +| [Electra](classes/models/electra.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | +| [Encoder Decoder](classes/models/encoderdecoder.html) | (*) | (*) | (*) | (*) | (*) | (*) | | | +| [GPT-2](classes/models/gpt2.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | +| [GPT-J](classes/models/gptj.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | +| [Llama](classes/models/llama.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | +| [MBart](classes/models/mbart.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | +| [MT5](classes/models/mt5.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | +| [RoBERTa](classes/models/roberta.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | +| [T5](classes/models/t5.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | +| [ViT](classes/models/vit.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | +| [XLM-RoBERTa](classes/models/xlmroberta.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | +| [X-MOD](classes/models/xmod.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | + +(*) Supported if the underlying encoder and decoder model classes are supported. + +**Missing a model architecture you'd like to use?** +`adapters` can easily be extended to new model architectures as described in [Adding Adapters to a Model](https://docs.adapterhub.ml/contributing/adding_adapters_to_a_model.html). +Feel free to [open an issue](https://github.com/Adapter-Hub/adapters/issues) requesting support for a new architecture. +_We very much welcome pull requests adding new model implementations!_ diff --git a/_sources/overview.md.txt b/_sources/overview.md.txt new file mode 100644 index 0000000000..ee76c2b30f --- /dev/null +++ b/_sources/overview.md.txt @@ -0,0 +1,99 @@ +# Overview and Configuration + +Large pre-trained Transformer-based language models (LMs) have become the foundation of NLP in recent years. +While the most prevalent method of using these LMs for transfer learning involves costly *full fine-tuning* of all model parameters, a series of *efficient* and *lightweight* alternatives have recently been established. +Instead of updating all parameters of the pre-trained LM towards a downstream target task, these methods commonly introduce a small number of new parameters and only update these while keeping the pre-trained model weights fixed. + +```{admonition} Why use Efficient Fine-Tuning? +Efficient fine-tuning methods offer multiple benefits over the full fine-tuning of LMs: + +- They are **parameter-efficient**, i.e., they only update a tiny subset (often under 1%) of a model's parameters. +- They are often **modular**, i.e., the updated parameters can be extracted and shared independently of the base model parameters. +- They are easy to share and deploy due to their **small file sizes**, e.g., having only ~3MB per task instead of ~440MB for sharing a full model.
+- They **speed up training**, i.e., efficient fine-tuning often requires less training time than fully fine-tuning LMs. +- They are **composable**, e.g., multiple adapters trained on different tasks can be stacked, fused, or mixed to leverage their combined knowledge. +- They often provide **on-par performance** with full fine-tuning. +``` + +More specifically, let the parameters of an LM be composed of a set of pre-trained parameters $\Theta$ (frozen) and a set of (newly introduced) parameters $\Phi$. +Then, efficient fine-tuning methods optimize only $\Phi$ according to a loss function $L$ on a dataset $D$: + +$$ +\Phi^* \leftarrow \arg \min_{\Phi} L(D; \{\Theta, \Phi\}) +$$ + +Efficient fine-tuning might insert parameters $\Phi$ at different locations of a Transformer-based LM. +One early and successful method, (bottleneck) adapters, introduces bottleneck feed-forward layers in each layer of a Transformer model. +While these adapters have laid the foundation of the `adapters` library, multiple alternative methods have been introduced and integrated since. + +```{eval-rst} +.. important:: + In the literature, different terms are used to refer to efficient fine-tuning methods. + The term "adapter" is usually only applied to bottleneck adapter modules. + However, most efficient fine-tuning methods follow the same general idea of inserting a small set of new parameters and, by this, "adapting" the pre-trained LM to a new task. + In ``adapters``, the term "adapter" thus may refer to any efficient fine-tuning method if not specified otherwise. +``` + +In the remaining sections, we will present how adapter methods can be configured in `adapters`. +The next two pages will then present the methodological details of all currently supported adapter methods. + +## Table of Adapter Methods + +The following table gives an overview of all adapter methods supported by `adapters`. +Identifiers and configuration classes are explained in more detail in the [next section](#configuration).
+ +| Identifier | Configuration class | More information | +| --- | --- | --- | +| `seq_bn` | `SeqBnConfig()` | [Bottleneck Adapters](methods.html#bottleneck-adapters) | +| `double_seq_bn` | `DoubleSeqBnConfig()` | [Bottleneck Adapters](methods.html#bottleneck-adapters) | +| `par_bn` | `ParBnConfig()` | [Bottleneck Adapters](methods.html#bottleneck-adapters) | +| `scaled_par_bn` | `ParBnConfig(scaling="learned")` | [Bottleneck Adapters](methods.html#bottleneck-adapters) | +| `seq_bn_inv` | `SeqBnInvConfig()` | [Invertible Adapters](methods.html#language-adapters---invertible-adapters) | +| `double_seq_bn_inv` | `DoubleSeqBnInvConfig()` | [Invertible Adapters](methods.html#language-adapters---invertible-adapters) | +| `compacter` | `CompacterConfig()` | [Compacter](methods.html#compacter) | +| `compacter++` | `CompacterPlusPlusConfig()` | [Compacter](methods.html#compacter) | +| `prefix_tuning` | `PrefixTuningConfig()` | [Prefix Tuning](methods.html#prefix-tuning) | +| `prefix_tuning_flat` | `PrefixTuningConfig(flat=True)` | [Prefix Tuning](methods.html#prefix-tuning) | +| `lora` | `LoRAConfig()` | [LoRA](methods.html#lora) | +| `ia3` | `IA3Config()` | [IA³](methods.html#ia-3) | +| `mam` | `MAMConfig()` | [Mix-and-Match Adapters](method_combinations.html#mix-and-match-adapters) | +| `unipelt` | `UniPELTConfig()` | [UniPELT](method_combinations.html#unipelt) | +| `prompt_tuning` | `PromptTuningConfig()` | [Prompt Tuning](methods.html#prompt-tuning) | + +## Configuration + +All supported adapter methods can be added, trained, saved and shared using the same set of model class functions (see [class documentation](adapters.ModelAdaptersMixin)). +Each method is specified and configured using a specific configuration class, all of which derive from the common [`AdapterConfig`](adapters.AdapterConfig) class. +E.g., adding one of the supported adapter methods to an existing model instance follows this scheme: +```python +model.add_adapter("name", config=<adapter_config>) +``` + +Here, `<adapter_config>` can either be: +- a configuration string, as described below +- an instance of a configuration class, as listed in the table above +- a path to a JSON file containing a configuration dictionary + +### Configuration strings + +Configuration strings are a concise way of defining a specific adapter method configuration. +They are especially useful when adapter configurations are passed from external sources such as the command line, where using configuration classes is not an option. + +In general, a configuration string for a single method takes the form `<identifier>[<attr>=<value>, ...]`. +Here, `<identifier>` refers to one of the identifiers listed in [the table above](#table-of-adapter-methods), e.g. `par_bn`. +In square brackets after the identifier, you can set specific configuration attributes from the respective configuration class, e.g. `par_bn[reduction_factor=2]`. +If all attributes remain at their default values, this can be omitted.
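+For example, a configuration string can be passed directly to `add_adapter()` in place of a configuration class instance (the model checkpoint and adapter name below are only illustrative choices): + +```python +from adapters import AutoAdapterModel + +model = AutoAdapterModel.from_pretrained("bert-base-uncased") +# equivalent to config=ParBnConfig(reduction_factor=2) +model.add_adapter("my_adapter", config="par_bn[reduction_factor=2]") +```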
+ +Finally, it is also possible to specify a [method combination](method_combinations.md) as a configuration string by joining multiple configuration strings with `|`, e.g.: +```python +config = "prefix_tuning[bottleneck_size=800]|parallel" +``` + +is identical to the following `ConfigUnion`: + +```python +from adapters import ConfigUnion, ParBnConfig, PrefixTuningConfig + +config = ConfigUnion( + PrefixTuningConfig(bottleneck_size=800), + ParBnConfig(), +) +``` diff --git a/_sources/prediction_heads.md.txt b/_sources/prediction_heads.md.txt new file mode 100644 index 0000000000..0fb3810789 --- /dev/null +++ b/_sources/prediction_heads.md.txt @@ -0,0 +1,150 @@ +# Prediction Heads + +This section gives an overview of how different prediction heads can be used together with adapter modules and how pre-trained adapters can be distributed side-by-side with matching prediction heads in AdapterHub. +We will take a look at the `AdapterModel` classes (e.g. `BertAdapterModel`) introduced by `adapters`, which provide **flexible** support for prediction heads, as well as models with **static** heads provided out-of-the-box by Hugging Face Transformers (e.g. `BertForSequenceClassification`). + +```{eval-rst} +.. tip:: + We recommend using the `AdapterModel classes <#adaptermodel-classes>`_ whenever possible. + These **flexible** models have been created specifically for working with adapters. +``` + +## AdapterModel classes + +The AdapterModel classes provided by `adapters` allow a flexible configuration of prediction heads on top of a pre-trained language model. + +First, we load a pre-trained model from the Hugging Face Hub via the [`AutoAdapterModel`](adapters.AutoAdapterModel) class: +```python +from adapters import AutoAdapterModel + +model = AutoAdapterModel.from_pretrained("bert-base-uncased") +``` + +By default, this model doesn't have any heads yet, so let's add a new binary sequence classification head on top of our model: +```python +model.add_classification_head("mrpc", num_labels=2) +``` +All heads have a name; we called this new head `"mrpc"`. Since all heads are named, we can add multiple other heads with different names to the same model. +To see the head types of a model and how they can be configured, please refer to the class references of the respective model classes, e.g. [`BertAdapterModel`](adapters.BertAdapterModel). + +A head alone is just one layer with very few parameters. Hence, we want to train our classification head together with an adapter, so let's add one: +```python +model.add_adapter("mrpc", config="seq_bn") +model.set_active_adapters("mrpc") +``` + +Since we gave the task adapter the same name as our head, we can easily identify them as belonging together. +The call to `set_active_adapters()` in the second line tells our model to use this adapter-head configuration by default in every forward pass. +At this point, we can start to [train our setup](training.md). + +```{eval-rst} +.. note:: + The ``set_active_adapters()`` method will search for an adapter and a prediction head with the given name to be activated. + Alternatively, prediction heads can also be activated explicitly (i.e. without adapter modules). + These three options are possible (in order of priority when multiple are specified): + + 1. If ``head`` is passed to the forward call, the head with the given name is used. + 2. If the forward call is executed within an ``AdapterSetup`` context, the head configuration is read from the context. + 3. If the ``active_head`` property is set, the head configuration is read from there.
``` + +After training has completed, we can save our whole setup (adapter module _and_ prediction head) with a single call: +```python +model.save_adapter("/path/to/dir", "mrpc", with_head=True) +``` + +Now, you just have to [share your work with the world](huggingface_hub.md). +After you have published the adapter together with its head on the Hub, anyone else can load both adapter and head by using the same model class. + +Alternatively, we can also save and load the prediction head separately from an adapter module: + +```python +# save +model.save_head("/path/to/dir", "mrpc") +# load +model.load_head("/path/to/dir") +``` + +Lastly, it's also possible to delete an added head again: + +```python +model.delete_head("mrpc") +``` + +## Model classes with static heads (Hugging Face Transformers) + +The `transformers` library provides strongly typed model classes with heads for various tasks (e.g. `RobertaForSequenceClassification`, `AutoModelForMultipleChoice` ...). +If an adapter module is trained with one of these out-of-the-box classes, we encourage also distributing the prediction head weights together with the adapter weights. +Therefore, we can also easily save the prediction head weights for these models together with an adapter: + +```python +model.save_adapter("/path/to/dir", "mrpc", with_head=True) +``` + +In the next step, we can provide both the adapter weights and the head weights to the Hub. +If someone else then downloads the pre-trained adapter, the resolving method will check if the prediction head matches the class of their model. +In case the classes match, the prediction head weights will be automatically loaded too. + +## Automatic conversion +`adapters` supports loading static heads, e.g., created with `AutoModelForSequenceClassification`, into model classes with flexible heads, e.g. `AutoAdapterModel`. + +For a model created with `AutoModelForSequenceClassification`, we first need to enable adapter support by calling the `init()` method. +```python +from adapters import init, AutoAdapterModel +from transformers import AutoModelForSequenceClassification +import os + +static_head_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased") +# Enable adapter support +init(static_head_model) +``` +Now we can add an adapter and save it together with the head as usual: +```python +static_head_model.add_adapter("test") + +temp_dir = os.path.join(os.getcwd(), "temp_dir") +static_head_model.save_adapter(temp_dir, "test", with_head=True) +``` +When the adapter and head are then loaded into a new AdapterModel, the weights are converted automatically during the `load_adapter()` call, so no additional steps are needed: + +```python +flex_head_model = AutoAdapterModel.from_pretrained("bert-base-uncased") +flex_head_model.load_adapter(temp_dir) + +assert "test" in flex_head_model.adapters_config +assert "test" in flex_head_model.heads +``` + +```{eval-rst} +.. note:: + The conversion in the opposite direction is not supported, i.e. you cannot load a head created with ``AutoAdapterModel`` into a model of type ``AutoModelForSequenceClassification``. +``` + +## Custom Heads +If none of the available prediction heads fit your requirements, you can define and add a custom head. + +First, we need to define the new head class. For that, the initialization and the forward pass need to be implemented. +The initialization of the head receives a reference to the model, the name of the head, and any additionally defined kwargs.
+You can use the following template as a guideline. +```python +class CustomHead(PredictionHead): + def __init__( + self, + model, + head_name, + **kwargs, + ): + # initialization of the custom head + + def forward(self, outputs, cls_output=None, attention_mask=None, return_dict=False, **kwargs): + # implementation of the forward pass +``` + + +Next, we can register the new custom head and give the new head type a name. This only notifies +the model that there is a new head type. Then, we can add an instance of the new head to the model by +calling `add_custom_head` with the name of the new head type, the name of the head instance we are creating, and +additional arguments required by the head. +```python +model.register_custom_head("my_custom_head", CustomHead) +model.add_custom_head(head_type="my_custom_head", head_name="custom_head", **kwargs) +``` +After adding the custom head you can treat it like any other built-in head type. diff --git a/_sources/quickstart.md.txt b/_sources/quickstart.md.txt new file mode 100644 index 0000000000..9cefe33cc1 --- /dev/null +++ b/_sources/quickstart.md.txt @@ -0,0 +1,124 @@ +# Quick Start + +## Introduction + +`adapters` adds adapter functionality to the PyTorch implementations of all Transformer models listed in the [Model Overview](https://docs.adapterhub.ml/model_overview.html). +For working with adapters, a couple of methods, e.g. for creation (`add_adapter()`), loading (`load_adapter()`), +storing (`save_adapter()`) and deletion (`delete_adapter()`), are added to the model classes. +In the following, we will briefly go through some examples to showcase these methods. + +```{eval-rst} +.. note:: + This document focuses on the adapter-related functionalities added by ``adapters``. + For a more general overview of the *transformers* library, visit + `the 'Usage' section in Hugging Face's documentation `_. +``` + +## Initialize a Model with Adapters + +The `XAdapterModel` is the recommended model for training and inference of adapters: + +``` +from adapters import AutoAdapterModel + +model = AutoAdapterModel.from_pretrained(model_name) +``` + +This handles the initialization of the adapter-related functionality internally and provides you with the initialized model. The `XAdapterModel` also supports the dynamic adding, loading, and storing of heads for different tasks. + + +If you want to use adapters in Hugging Face models, the models need to be initialized with the `adapters` library. This adds the functionality for adding, loading, and storing adapters to the `transformers` models. + +``` +import adapters + +adapters.init(model) +``` + + +## Using a Pre-Trained Adapter for Inference + +_We also have a Quickstart Colab notebook for adapter inference:_ [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/02_Adapter_Inference.ipynb) + +The following example shows the usage of a basic pre-trained Transformer model with adapters. +Our goal here is to predict the sentiment of a given sentence.
+ +We use BERT in this example, so we first load a pre-trained `BertTokenizer` to encode the input sentence and a pre-trained +`bert-base-uncased` checkpoint from Hugging Face's Model Hub using the [`BertAdapterModel`](adapters.BertAdapterModel) class: + +```python +import os + +import torch +from transformers import BertTokenizer +from adapters import AutoAdapterModel, BertAdapterModel + +# Load pre-trained BERT tokenizer from Hugging Face +tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') + +# An input sentence +sentence = "It's also, clearly, great fun." + +# Tokenize the input sentence and create a PyTorch input tensor +input_data = tokenizer(sentence, return_tensors="pt") + +# Load pre-trained BERT model from Hugging Face Hub +# The `BertAdapterModel` class is specifically designed for working with adapters +# It can be used with different prediction heads +model = BertAdapterModel.from_pretrained('bert-base-uncased') +``` + +Having loaded the model, we now add a pre-trained task adapter from AdapterHub that is useful for our task. +In this case, for sentiment classification, we use [an adapter trained on the SST-2 dataset](https://adapterhub.ml/adapters/ukp/bert-base-uncased_sentiment_sst-2_pfeiffer/). +The task prediction head loaded together with the adapter gives us a class label for our sentence: + +```python +# Load pre-trained task adapter from Adapter Hub +# This method call will also load a pre-trained classification head for the adapter task +adapter_name = model.load_adapter("sentiment/sst-2@ukp", config='pfeiffer') + +# Activate the adapter we just loaded, so that it is used in every forward pass +model.set_active_adapters(adapter_name) + +# Predict output tensor +outputs = model(**input_data) + +# Retrieve the predicted class label +predicted = torch.argmax(outputs[0]).item() +assert predicted == 1 +``` + +To save our pre-trained model and adapters, we can easily store and reload them as follows: + +```python +# For the sake of this demonstration, an example path for loading and storing is given below +example_path = os.path.join(os.getcwd(), "adapter-quickstart") + +# Save model +model.save_pretrained(example_path) +# Save adapter +model.save_adapter(example_path, adapter_name) + +# Load model, similar to Hugging Face's AutoModel class, +# you can also use AutoAdapterModel instead of BertAdapterModel +model = AutoAdapterModel.from_pretrained(example_path) +model.load_adapter(example_path) +``` + +Similar to how the weights of the full model are saved, the `save_adapter()` method will create a file for saving the adapter weights and a file for saving the adapter configuration in the specified directory. + +Finally, if we have finished working with adapters, we can restore the base Transformer to its original form by deactivating and deleting the adapter: + +```python +# Deactivate all adapters +model.set_active_adapters(None) +# Delete the added adapter +model.delete_adapter(adapter_name) +``` + +## Adapter Training + +_We also have a Quickstart Colab notebook for adapter training:_ [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/01_Adapter_Training.ipynb) + +For more examples of training different adapter setups, refer to the section on [Adapter Training](training.md). +Further information on using adapters with prediction heads can be found in the [Prediction Heads](prediction_heads.md) section.
diff --git a/_sources/training.md.txt b/_sources/training.md.txt new file mode 100644 index 0000000000..aec8b812b5 --- /dev/null +++ b/_sources/training.md.txt @@ -0,0 +1,223 @@ +# Adapter Training + +This section describes some examples of training adapter methods for different scenarios. We focus on integrating adapter methods into existing training scripts for Transformer models. +All presented scripts are only slightly modified from the original [examples from Hugging Face Transformers](https://github.com/huggingface/transformers/tree/main/examples/pytorch#examples). +To run the scripts, make sure you have the latest version of the repository and have installed some additional requirements: + +``` +git clone https://github.com/adapter-hub/adapters +cd adapters +pip install . +pip install -r ./examples/pytorch//requirements.txt +``` + +## Train a Task Adapter + +Training a task adapter module on a dataset only requires minor modifications compared to training the entire model. +Suppose we have an existing script for training a Transformer model. +In the following, we will use Hugging Face's [run_glue.py](https://github.com/Adapter-Hub/adapters/blob/main/examples/pytorch/text-classification/run_glue.py) example script for training on the GLUE benchmark. +We go through all required changes step by step: + +### Step A - Parse `AdapterArguments` + +The [`AdapterArguments`](adapters.training.AdapterArguments) class integrated into `adapters` provides a set of command-line options useful for training adapters. +These include options such as `--train_adapter` for activating adapter training and `--load_adapter` for loading adapters from checkpoints. +Thus, the first step of integrating adapters is to add these arguments to the line where `HfArgumentParser` is instantiated: + +```python +parser = HfArgumentParser((ModelArguments, DataTrainingArguments, TrainingArguments, AdapterArguments)) +# ... +model_args, data_args, training_args, adapter_args = parser.parse_args_into_dataclasses() +``` + +### Step B - Switch model class (optional) + +In our example, we replace the built-in `AutoModelForSequenceClassification` class with the `AutoAdapterModel` class introduced by `adapters`. +Therefore, the model instantiation changes to: + +```python +model = AutoAdapterModel.from_pretrained( + model_args.model_name_or_path, + config=config, +) +model.add_classification_head(data_args.task_name, num_labels=num_labels) +``` + +Alternatively, you can also use the original `transformers` class and initialize the model for the usage of adapters by calling `adapters.init(model)`. +Learn more about the benefits of AdapterModel classes [here](prediction_heads.md). + +### Step C - Setup adapter methods + +```{eval-rst} +.. tip:: + In the following, we show how to set up adapters manually. In most cases, you can use the built-in ``setup_adapter_training()`` method to perform this job automatically. Just add a statement similar to this anywhere between model instantiation and training start in your script: ``setup_adapter_training(model, adapter_args, task_name)`` +``` + +Compared to fine-tuning the entire model, we have to make only one significant adaptation: adding an adapter setup and activating it.
+ +```python +# task adapter - only add if not existing +if task_name not in model.adapters_config: + # resolve the adapter config + adapter_config = AdapterConfig.load(adapter_args.adapter_config) + # add a new adapter + model.add_adapter(task_name, config=adapter_config) +# Enable adapter training +model.train_adapter(task_name) +``` + +```{eval-rst} +.. important:: + The most crucial step when training an adapter module is to freeze all weights in the model except for those of the + adapter. In the previous snippet, this is achieved by calling the ``train_adapter()`` method, which disables training + of all weights outside the task adapter. In case you want to unfreeze all model weights later on, you can use + ``freeze_model(False)``. +``` + +Besides this, we only have to make sure that the task adapter and prediction head are activated so that they are used in every forward pass. To specify the adapter modules to use, we can use the `model.set_active_adapters()` +method and pass the adapter setup. If you only use a single adapter, you can simply pass the name of the adapter. For more information +on complex setups, check out the [Composition Blocks](https://docs.adapterhub.ml/adapter_composition.html). + +```python +model.set_active_adapters(task_name) +``` + +### Step D - Switch to `AdapterTrainer` class + +Finally, we exchange the `Trainer` class built into Transformers for the [`AdapterTrainer`](adapters.trainer.AdapterTrainer) class that is optimized for training adapter methods. +See [below for more information](#adaptertrainer). + +Technically, this change is not strictly necessary, as no modifications to the training loop are required for training adapters. +However, `AdapterTrainer` provides, e.g., better support for checkpointing and reloading adapter weights. + +### Step E - Start training + +The rest of the training procedure does not require any further changes in code. + +You can find the full version of the modified training script for GLUE at [run_glue.py](https://github.com/Adapter-Hub/adapters/blob/master/examples/pytorch/text-classification/run_glue.py) in the `examples` folder of our repository. +We also adapted [various other example scripts](https://github.com/Adapter-Hub/adapters/tree/master/examples/pytorch) (e.g., `run_glue.py`, `run_multiple_choice.py`, `run_squad.py`, ...) to support adapter training. + +To start adapter training on a GLUE task, you can run something similar to: + +``` +export TASK_NAME=mrpc + +python run_glue.py \ + --model_name_or_path bert-base-uncased \ + --task_name $TASK_NAME \ + --do_train \ + --do_eval \ + --max_seq_length 128 \ + --per_device_train_batch_size 32 \ + --learning_rate 1e-4 \ + --num_train_epochs 10.0 \ + --output_dir /tmp/$TASK_NAME \ + --overwrite_output_dir \ + --train_adapter \ + --adapter_config seq_bn +``` + +The important flag here is `--train_adapter`, which switches from fine-tuning the entire model to training an adapter module for the given GLUE task. + +```{eval-rst} +.. tip:: + Adapter weights are usually initialized randomly, which is why we require a higher learning rate. We have found that a default adapter learning rate of ``1e-4`` works well for most settings. +``` + +```{eval-rst} +.. tip:: + Depending on your dataset size, you might also need to train longer than usual. To avoid overfitting, you can evaluate the adapters after each epoch on the development set and only save the best model. +``` + +## Train a Language Adapter + +Training a language adapter is just as straightforward as training a task adapter.
Similar to the steps for task adapters +described above, we add a language adapter module to an existing model training script. Here, we modified Hugging Face's [run_mlm.py](https://github.com/Adapter-Hub/adapters/blob/main/examples/pytorch/language-modeling/run_mlm.py) script for masked language modeling with BERT-based models. + +Training a language adapter on BERT using this script may look like the following: + +```bash +export TRAIN_FILE=/path/to/dataset/train +export VALIDATION_FILE=/path/to/dataset/validation + +python run_mlm.py \ + --model_name_or_path bert-base-uncased \ + --train_file $TRAIN_FILE \ + --validation_file $VALIDATION_FILE \ + --do_train \ + --do_eval \ + --learning_rate 1e-4 \ + --num_train_epochs 10.0 \ + --output_dir /tmp/test-mlm \ + --train_adapter \ + --adapter_config "seq_bn_inv" +``` + +## Train AdapterFusion + +We provide an example for training _AdapterFusion_ ([Pfeiffer et al., 2020](https://arxiv.org/pdf/2005.00247)) on the GLUE dataset: [run_fusion_glue.py](https://github.com/Adapter-Hub/adapters/blob/main/examples/pytorch/adapterfusion/run_fusion_glue.py). +You can adapt this script to train AdapterFusion with different pre-trained adapters on your own dataset. + +```{eval-rst} +.. important:: + AdapterFusion on a target task is trained in a second training stage after independently training adapters on individual tasks. + When setting up a fusion architecture on your model, make sure to load the pre-trained adapter modules to be fused using ``model.load_adapter()`` before adding a fusion layer. + For more on AdapterFusion, also refer to `Pfeiffer et al., 2020 <https://arxiv.org/pdf/2005.00247>`_. +``` + +To start fusion training on SST-2 as the target task, you can run something like the following: + +``` +export GLUE_DIR=/path/to/glue +export TASK_NAME=SST-2 + +python run_fusion_glue.py \ + --model_name_or_path bert-base-uncased \ + --task_name $TASK_NAME \ + --do_train \ + --do_eval \ + --data_dir $GLUE_DIR/$TASK_NAME \ + --max_seq_length 128 \ + --per_device_train_batch_size 32 \ + --learning_rate 5e-5 \ + --num_train_epochs 10.0 \ + --output_dir /tmp/$TASK_NAME \ + --overwrite_output_dir +``` + + +## AdapterTrainer + +Similar to the `Trainer` class provided by Hugging Face, `adapters` provides an `AdapterTrainer` class. This class is only +intended for training adapters. The `Trainer` class should still be used to fully fine-tune models. To train adapters with the `AdapterTrainer` +class, simply initialize it the same way you would initialize the `Trainer` class, e.g.: + +```python +from transformers import TrainingArguments +from adapters import AdapterTrainer + +model.add_adapter(task_name) +model.train_adapter(task_name) + +training_args = TrainingArguments( + learning_rate=1e-4, + num_train_epochs=6, +) + +trainer = AdapterTrainer( + model=model, + args=training_args, + train_dataset=train_dataset, + eval_dataset=eval_dataset, + tokenizer=tokenizer, + data_collator=data_collator, +) +``` +```{eval-rst} +.. tip:: + If you migrate from a previous version, where the Trainer class was used both for adapter training and full fine-tuning, note that the + specialized AdapterTrainer class does not have the parameters `do_save_full_model`, `do_save_adapters` and `do_save_adapter_fusion`. +``` + +## Quantized Model Training + +_Adapters_ supports fine-tuning of quantized language models similar to [QLoRA (Dettmers et al., 2023)](https://arxiv.org/pdf/2305.14314.pdf) via the `bitsandbytes` library integrated into Transformers. +Quantized training is supported for LoRA-based adapters as well as bottleneck adapters and prefix tuning.
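+A minimal sketch of such a setup might look as follows (the checkpoint name is only an example; the quantization settings use the standard `bitsandbytes` integration of Transformers): + +```python +import torch +from transformers import AutoModelForCausalLM, BitsAndBytesConfig +import adapters +from adapters import LoRAConfig + +# Load the base model in 4-bit precision via bitsandbytes +quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16) +model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", quantization_config=quant_config) + +# Prepare the model for adapters and attach a trainable LoRA adapter +adapters.init(model) +model.add_adapter("qlora", config=LoRAConfig(r=8, alpha=16)) +model.train_adapter("qlora")  # freezes the (quantized) base weights +```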
+Please refer to [this notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/QLoRA_Llama_Finetuning.ipynb) for a hands-on guide. diff --git a/_sources/transitioning.md.txt b/_sources/transitioning.md.txt new file mode 100644 index 0000000000..2cdaeb5688 --- /dev/null +++ b/_sources/transitioning.md.txt @@ -0,0 +1,88 @@ +# Transitioning from `adapter-transformers` + +```{eval-rst} +.. important:: + ``adapters`` is fully compatible with ``adapter-transformers`` in terms of model weights, meaning you can load any adapter trained with any version of ``adapter-transformers`` into the new library without degradation. +``` + +The new `adapters` library is the successor to the `adapter-transformers` library. The key difference is that `adapters` is now a stand-alone package, i.e., the package is disentangled from the `transformers` package from Hugging Face and is no longer a drop-in replacement. + +This results in some breaking changes. To transition your code from `adapter-transformers` to `adapters`, you need to consider the following changes: + +## Package and Namespace + To use the library, you need to install +`transformers` and `adapters` in the same environment (unlike `adapter-transformers`, which bundled `transformers` and could not be installed alongside it). + +Run the following to install both (installing `adapters` will automatically trigger the installation of a compatible `transformers` version): + +``` +pip install adapters +``` + +This also changes the namespace to `adapters`. For all imports of adapter classes, change the import from `transformers` to `adapters`. +This mainly affects the following classes: +- AdapterModel classes, e.g. `AutoAdapterModel` (see [AdapterModels](https://docs.adapterhub.ml/model_overview.html) ) +- Adapter configurations e.g. `PrefixTuningConfig` (see [Configurations](https://docs.adapterhub.ml/overview.html) ) +- Adapter composition blocks, e.g. `Stack` (see [Composition Blocks](https://docs.adapterhub.ml/adapter_composition.html) ) +- The `AdapterTrainer` class + +## Model Initialisation + +The Hugging Face model classes, such as `BertModel`, cannot be used directly with adapters. They must first be initialised for adding adapters: + +``` +from transformers import AutoModel +import adapters + +model = AutoModel.from_pretrained("bert-base-uncased") +adapters.init(model) # prepare model for use with adapters +``` + +The necessary change is the call of the `adapters.init()` method. +Note that no additional initialisation is required to use the AdapterModel classes such as `BertAdapterModel`. These classes are provided by the `adapters` library and are already prepared for using adapters in training and inference. + +## Bottleneck Configuration Names + +The `adapters` library supports the configuration of adapters using [config strings](https://docs.adapterhub.ml/overview.html#configuration-strings). Compared to the `adapter-transformers` library, we have changed some of the strings to make them more consistent and intuitive: +- `houlsby` -> `double_seq_bn` +- `pfeiffer` -> `seq_bn` +- `parallel` -> `par_bn` +- `houlsby+inv` -> `double_seq_bn_inv` +- `pfeiffer+inv` -> `seq_bn_inv` + + +For a complete list of config strings and classes see [here](https://docs.adapterhub.ml/overview.html). We strongly recommend using the new config strings, but we will continue to support the old config strings for the time being to make the transition easier.
+Note that, along with the config strings, the corresponding adapter config classes have also been renamed, e.g. `PfeifferConfig` -> `SeqBnConfig`.
+
+Another consequence of this is that the `AdapterConfig` class is no longer specific to bottleneck adapters: it is now the base class of all adapter configurations (previously `AdapterConfigBase`), so the role this class serves has changed. However, you can still load adapter configs with:
+```python
+adapter_config = AdapterConfig.load("lora")
+```
+
+## Features that are not supported by `adapters`
+
+Compared to `adapter-transformers`, there are a few features that are no longer supported by the `adapters` library:
+- Using `transformers` pipelines with adapters.
+- Using invertible adapters in the Hugging Face model classes. To use invertible adapters, you must use the AdapterModel classes.
+- Loading model and adapter checkpoints saved with `save_pretrained` using Hugging Face classes. This is only supported by the AdapterModel classes.
+
+## What has remained the same
+
+- The new library is fully backwards compatible in terms of adapter weights, i.e. you can load all adapter modules trained with `adapter-transformers`.
+- The functionality for adding, activating, and training adapters has __not__ changed, except for the renaming of some adapter configs. You still add and activate adapters as follows:
+```python
+# add adapter to the model
+model.add_adapter("adapter_name", config="lora")
+# activate adapter
+model.set_active_adapters("adapter_name")
+# freeze model weights and activate adapter
+model.train_adapter("adapter_name")
+```
+
+## Where can I still find `adapter-transformers`?
+
+The codebase of `adapter-transformers` has moved to [https://github.com/adapter-hub/adapter-transformers-legacy](https://github.com/adapter-hub/adapter-transformers-legacy) for archival purposes.
+
+The full documentation of the old library is now hosted at [https://docs-legacy.adapterhub.ml](https://docs-legacy.adapterhub.ml/).
+ */ +jQuery.getQueryParameters = function(s) { + if (typeof s === 'undefined') + s = document.location.search; + var parts = s.substr(s.indexOf('?') + 1).split('&'); + var result = {}; + for (var i = 0; i < parts.length; i++) { + var tmp = parts[i].split('=', 2); + var key = jQuery.urldecode(tmp[0]); + var value = jQuery.urldecode(tmp[1]); + if (key in result) + result[key].push(value); + else + result[key] = [value]; + } + return result; +}; + +/** + * highlight a given string on a jquery object by wrapping it in + * span elements with the given class name. + */ +jQuery.fn.highlightText = function(text, className) { + function highlight(node, addItems) { + if (node.nodeType === 3) { + var val = node.nodeValue; + var pos = val.toLowerCase().indexOf(text); + if (pos >= 0 && + !jQuery(node.parentNode).hasClass(className) && + !jQuery(node.parentNode).hasClass("nohighlight")) { + var span; + var isInSVG = jQuery(node).closest("body, svg, foreignObject").is("svg"); + if (isInSVG) { + span = document.createElementNS("http://www.w3.org/2000/svg", "tspan"); + } else { + span = document.createElement("span"); + span.className = className; + } + span.appendChild(document.createTextNode(val.substr(pos, text.length))); + node.parentNode.insertBefore(span, node.parentNode.insertBefore( + document.createTextNode(val.substr(pos + text.length)), + node.nextSibling)); + node.nodeValue = val.substr(0, pos); + if (isInSVG) { + var rect = document.createElementNS("http://www.w3.org/2000/svg", "rect"); + var bbox = node.parentElement.getBBox(); + rect.x.baseVal.value = bbox.x; + rect.y.baseVal.value = bbox.y; + rect.width.baseVal.value = bbox.width; + rect.height.baseVal.value = bbox.height; + rect.setAttribute('class', className); + addItems.push({ + "parent": node.parentNode, + "target": rect}); + } + } + } + else if (!jQuery(node).is("button, select, textarea")) { + jQuery.each(node.childNodes, function() { + highlight(this, addItems); + }); + } + } + var addItems = []; + var result = this.each(function() { + highlight(this, addItems); + }); + for (var i = 0; i < addItems.length; ++i) { + jQuery(addItems[i].parent).before(addItems[i].target); + } + return result; +}; + +/* + * backward compatibility for jQuery.browser + * This will be supported until firefox bug is fixed. + */ +if (!jQuery.browser) { + jQuery.uaMatch = function(ua) { + ua = ua.toLowerCase(); + + var match = /(chrome)[ \/]([\w.]+)/.exec(ua) || + /(webkit)[ \/]([\w.]+)/.exec(ua) || + /(opera)(?:.*version|)[ \/]([\w.]+)/.exec(ua) || + /(msie) ([\w.]+)/.exec(ua) || + ua.indexOf("compatible") < 0 && /(mozilla)(?:.*? rv:([\w.]+)|)/.exec(ua) || + []; + + return { + browser: match[ 1 ] || "", + version: match[ 2 ] || "0" + }; + }; + jQuery.browser = {}; + jQuery.browser[jQuery.uaMatch(navigator.userAgent).browser] = true; +} diff --git a/_static/basic.css b/_static/basic.css new file mode 100644 index 0000000000..7d5974c322 --- /dev/null +++ b/_static/basic.css @@ -0,0 +1,928 @@ +/* + * basic.css + * ~~~~~~~~~ + * + * Sphinx stylesheet -- basic theme. + * + * :copyright: Copyright 2007-2022 by the Sphinx team, see AUTHORS. + * :license: BSD, see LICENSE for details. 
+ * + */ + +/* -- main layout ----------------------------------------------------------- */ + +div.clearer { + clear: both; +} + +div.section::after { + display: block; + content: ''; + clear: left; +} + +/* -- relbar ---------------------------------------------------------------- */ + +div.related { + width: 100%; + font-size: 90%; +} + +div.related h3 { + display: none; +} + +div.related ul { + margin: 0; + padding: 0 0 0 10px; + list-style: none; +} + +div.related li { + display: inline; +} + +div.related li.right { + float: right; + margin-right: 5px; +} + +/* -- sidebar --------------------------------------------------------------- */ + +div.sphinxsidebarwrapper { + padding: 10px 5px 0 10px; +} + +div.sphinxsidebar { + float: left; + width: 230px; + margin-left: -100%; + font-size: 90%; + word-wrap: break-word; + overflow-wrap : break-word; +} + +div.sphinxsidebar ul { + list-style: none; +} + +div.sphinxsidebar ul ul, +div.sphinxsidebar ul.want-points { + margin-left: 20px; + list-style: square; +} + +div.sphinxsidebar ul ul { + margin-top: 0; + margin-bottom: 0; +} + +div.sphinxsidebar form { + margin-top: 10px; +} + +div.sphinxsidebar input { + border: 1px solid #98dbcc; + font-family: sans-serif; + font-size: 1em; +} + +div.sphinxsidebar #searchbox form.search { + overflow: hidden; +} + +div.sphinxsidebar #searchbox input[type="text"] { + float: left; + width: 80%; + padding: 0.25em; + box-sizing: border-box; +} + +div.sphinxsidebar #searchbox input[type="submit"] { + float: left; + width: 20%; + border-left: none; + padding: 0.25em; + box-sizing: border-box; +} + + +img { + border: 0; + max-width: 100%; +} + +/* -- search page ----------------------------------------------------------- */ + +ul.search { + margin: 10px 0 0 20px; + padding: 0; +} + +ul.search li { + padding: 5px 0 5px 20px; + background-image: url(file.png); + background-repeat: no-repeat; + background-position: 0 7px; +} + +ul.search li a { + font-weight: bold; +} + +ul.search li p.context { + color: #888; + margin: 2px 0 0 30px; + text-align: left; +} + +ul.keywordmatches li.goodmatch a { + font-weight: bold; +} + +/* -- index page ------------------------------------------------------------ */ + +table.contentstable { + width: 90%; + margin-left: auto; + margin-right: auto; +} + +table.contentstable p.biglink { + line-height: 150%; +} + +a.biglink { + font-size: 1.3em; +} + +span.linkdescr { + font-style: italic; + padding-top: 5px; + font-size: 90%; +} + +/* -- general index --------------------------------------------------------- */ + +table.indextable { + width: 100%; +} + +table.indextable td { + text-align: left; + vertical-align: top; +} + +table.indextable ul { + margin-top: 0; + margin-bottom: 0; + list-style-type: none; +} + +table.indextable > tbody > tr > td > ul { + padding-left: 0em; +} + +table.indextable tr.pcap { + height: 10px; +} + +table.indextable tr.cap { + margin-top: 10px; + background-color: #f2f2f2; +} + +img.toggler { + margin-right: 3px; + margin-top: 3px; + cursor: pointer; +} + +div.modindex-jumpbox { + border-top: 1px solid #ddd; + border-bottom: 1px solid #ddd; + margin: 1em 0 1em 0; + padding: 0.4em; +} + +div.genindex-jumpbox { + border-top: 1px solid #ddd; + border-bottom: 1px solid #ddd; + margin: 1em 0 1em 0; + padding: 0.4em; +} + +/* -- domain module index --------------------------------------------------- */ + +table.modindextable td { + padding: 2px; + border-collapse: collapse; +} + +/* -- general body styles --------------------------------------------------- */ + 
+div.body { + min-width: 360px; + max-width: 800px; +} + +div.body p, div.body dd, div.body li, div.body blockquote { + -moz-hyphens: auto; + -ms-hyphens: auto; + -webkit-hyphens: auto; + hyphens: auto; +} + +a.headerlink { + visibility: hidden; +} +a.brackets:before, +span.brackets > a:before{ + content: "["; +} + +a.brackets:after, +span.brackets > a:after { + content: "]"; +} + + +h1:hover > a.headerlink, +h2:hover > a.headerlink, +h3:hover > a.headerlink, +h4:hover > a.headerlink, +h5:hover > a.headerlink, +h6:hover > a.headerlink, +dt:hover > a.headerlink, +caption:hover > a.headerlink, +p.caption:hover > a.headerlink, +div.code-block-caption:hover > a.headerlink { + visibility: visible; +} + +div.body p.caption { + text-align: inherit; +} + +div.body td { + text-align: left; +} + +.first { + margin-top: 0 !important; +} + +p.rubric { + margin-top: 30px; + font-weight: bold; +} + +img.align-left, figure.align-left, .figure.align-left, object.align-left { + clear: left; + float: left; + margin-right: 1em; +} + +img.align-right, figure.align-right, .figure.align-right, object.align-right { + clear: right; + float: right; + margin-left: 1em; +} + +img.align-center, figure.align-center, .figure.align-center, object.align-center { + display: block; + margin-left: auto; + margin-right: auto; +} + +img.align-default, figure.align-default, .figure.align-default { + display: block; + margin-left: auto; + margin-right: auto; +} + +.align-left { + text-align: left; +} + +.align-center { + text-align: center; +} + +.align-default { + text-align: center; +} + +.align-right { + text-align: right; +} + +/* -- sidebars -------------------------------------------------------------- */ + +div.sidebar, +aside.sidebar { + margin: 0 0 0.5em 1em; + border: 1px solid #ddb; + padding: 7px; + background-color: #ffe; + width: 40%; + float: right; + clear: right; + overflow-x: auto; +} + +p.sidebar-title { + font-weight: bold; +} +div.admonition, div.topic, blockquote { + clear: left; +} + +/* -- topics ---------------------------------------------------------------- */ +div.topic { + border: 1px solid #ccc; + padding: 7px; + margin: 10px 0 10px 0; +} + +p.topic-title { + font-size: 1.1em; + font-weight: bold; + margin-top: 10px; +} + +/* -- admonitions ----------------------------------------------------------- */ + +div.admonition { + margin-top: 10px; + margin-bottom: 10px; + padding: 7px; +} + +div.admonition dt { + font-weight: bold; +} + +p.admonition-title { + margin: 0px 10px 5px 0px; + font-weight: bold; +} + +div.body p.centered { + text-align: center; + margin-top: 25px; +} + +/* -- content of sidebars/topics/admonitions -------------------------------- */ + +div.sidebar > :last-child, +aside.sidebar > :last-child, +div.topic > :last-child, +div.admonition > :last-child { + margin-bottom: 0; +} + +div.sidebar::after, +aside.sidebar::after, +div.topic::after, +div.admonition::after, +blockquote::after { + display: block; + content: ''; + clear: both; +} + +/* -- tables ---------------------------------------------------------------- */ + +table.docutils { + margin-top: 10px; + margin-bottom: 10px; + border: 0; + border-collapse: collapse; +} + +table.align-center { + margin-left: auto; + margin-right: auto; +} + +table.align-default { + margin-left: auto; + margin-right: auto; +} + +table caption span.caption-number { + font-style: italic; +} + +table caption span.caption-text { +} + +table.docutils td, table.docutils th { + padding: 1px 8px 1px 5px; + border-top: 0; + border-left: 0; + border-right: 
0; + border-bottom: 1px solid #aaa; +} + +th { + text-align: left; + padding-right: 5px; +} + +table.citation { + border-left: solid 1px gray; + margin-left: 1px; +} + +table.citation td { + border-bottom: none; +} + +th > :first-child, +td > :first-child { + margin-top: 0px; +} + +th > :last-child, +td > :last-child { + margin-bottom: 0px; +} + +/* -- figures --------------------------------------------------------------- */ + +div.figure, figure { + margin: 0.5em; + padding: 0.5em; +} + +div.figure p.caption, figcaption { + padding: 0.3em; +} + +div.figure p.caption span.caption-number, +figcaption span.caption-number { + font-style: italic; +} + +div.figure p.caption span.caption-text, +figcaption span.caption-text { +} + +/* -- field list styles ----------------------------------------------------- */ + +table.field-list td, table.field-list th { + border: 0 !important; +} + +.field-list ul { + margin: 0; + padding-left: 1em; +} + +.field-list p { + margin: 0; +} + +.field-name { + -moz-hyphens: manual; + -ms-hyphens: manual; + -webkit-hyphens: manual; + hyphens: manual; +} + +/* -- hlist styles ---------------------------------------------------------- */ + +table.hlist { + margin: 1em 0; +} + +table.hlist td { + vertical-align: top; +} + +/* -- object description styles --------------------------------------------- */ + +.sig { + font-family: 'Consolas', 'Menlo', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', monospace; +} + +.sig-name, code.descname { + background-color: transparent; + font-weight: bold; +} + +.sig-name { + font-size: 1.1em; +} + +code.descname { + font-size: 1.2em; +} + +.sig-prename, code.descclassname { + background-color: transparent; +} + +.optional { + font-size: 1.3em; +} + +.sig-paren { + font-size: larger; +} + +.sig-param.n { + font-style: italic; +} + +/* C++ specific styling */ + +.sig-inline.c-texpr, +.sig-inline.cpp-texpr { + font-family: unset; +} + +.sig.c .k, .sig.c .kt, +.sig.cpp .k, .sig.cpp .kt { + color: #0033B3; +} + +.sig.c .m, +.sig.cpp .m { + color: #1750EB; +} + +.sig.c .s, .sig.c .sc, +.sig.cpp .s, .sig.cpp .sc { + color: #067D17; +} + + +/* -- other body styles ----------------------------------------------------- */ + +ol.arabic { + list-style: decimal; +} + +ol.loweralpha { + list-style: lower-alpha; +} + +ol.upperalpha { + list-style: upper-alpha; +} + +ol.lowerroman { + list-style: lower-roman; +} + +ol.upperroman { + list-style: upper-roman; +} + +:not(li) > ol > li:first-child > :first-child, +:not(li) > ul > li:first-child > :first-child { + margin-top: 0px; +} + +:not(li) > ol > li:last-child > :last-child, +:not(li) > ul > li:last-child > :last-child { + margin-bottom: 0px; +} + +ol.simple ol p, +ol.simple ul p, +ul.simple ol p, +ul.simple ul p { + margin-top: 0; +} + +ol.simple > li:not(:first-child) > p, +ul.simple > li:not(:first-child) > p { + margin-top: 0; +} + +ol.simple p, +ul.simple p { + margin-bottom: 0; +} + +/* Docutils 0.17 and older (footnotes & citations) */ +dl.footnote > dt, +dl.citation > dt { + float: left; + margin-right: 0.5em; +} + +dl.footnote > dd, +dl.citation > dd { + margin-bottom: 0em; +} + +dl.footnote > dd:after, +dl.citation > dd:after { + content: ""; + clear: both; +} + +/* Docutils 0.18+ (footnotes & citations) */ +aside.footnote > span, +div.citation > span { + float: left; +} +aside.footnote > span:last-of-type, +div.citation > span:last-of-type { + padding-right: 0.5em; +} +aside.footnote > p { + margin-left: 2em; +} +div.citation > p { + margin-left: 4em; +} +aside.footnote > 
p:last-of-type, +div.citation > p:last-of-type { + margin-bottom: 0em; +} +aside.footnote > p:last-of-type:after, +div.citation > p:last-of-type:after { + content: ""; + clear: both; +} + +/* Footnotes & citations ends */ + +dl.field-list { + display: grid; + grid-template-columns: fit-content(30%) auto; +} + +dl.field-list > dt { + font-weight: bold; + word-break: break-word; + padding-left: 0.5em; + padding-right: 5px; +} + +dl.field-list > dt:after { + content: ":"; +} + +dl.field-list > dd { + padding-left: 0.5em; + margin-top: 0em; + margin-left: 0em; + margin-bottom: 0em; +} + +dl { + margin-bottom: 15px; +} + +dd > :first-child { + margin-top: 0px; +} + +dd ul, dd table { + margin-bottom: 10px; +} + +dd { + margin-top: 3px; + margin-bottom: 10px; + margin-left: 30px; +} + +dl > dd:last-child, +dl > dd:last-child > :last-child { + margin-bottom: 0; +} + +dt:target, span.highlighted { + background-color: #fbe54e; +} + +rect.highlighted { + fill: #fbe54e; +} + +dl.glossary dt { + font-weight: bold; + font-size: 1.1em; +} + +.versionmodified { + font-style: italic; +} + +.system-message { + background-color: #fda; + padding: 5px; + border: 3px solid red; +} + +.footnote:target { + background-color: #ffa; +} + +.line-block { + display: block; + margin-top: 1em; + margin-bottom: 1em; +} + +.line-block .line-block { + margin-top: 0; + margin-bottom: 0; + margin-left: 1.5em; +} + +.guilabel, .menuselection { + font-family: sans-serif; +} + +.accelerator { + text-decoration: underline; +} + +.classifier { + font-style: oblique; +} + +.classifier:before { + font-style: normal; + margin: 0 0.5em; + content: ":"; + display: inline-block; +} + +abbr, acronym { + border-bottom: dotted 1px; + cursor: help; +} + +/* -- code displays --------------------------------------------------------- */ + +pre { + overflow: auto; + overflow-y: hidden; /* fixes display issues on Chrome browsers */ +} + +pre, div[class*="highlight-"] { + clear: both; +} + +span.pre { + -moz-hyphens: none; + -ms-hyphens: none; + -webkit-hyphens: none; + hyphens: none; + white-space: nowrap; +} + +div[class*="highlight-"] { + margin: 1em 0; +} + +td.linenos pre { + border: 0; + background-color: transparent; + color: #aaa; +} + +table.highlighttable { + display: block; +} + +table.highlighttable tbody { + display: block; +} + +table.highlighttable tr { + display: flex; +} + +table.highlighttable td { + margin: 0; + padding: 0; +} + +table.highlighttable td.linenos { + padding-right: 0.5em; +} + +table.highlighttable td.code { + flex: 1; + overflow: hidden; +} + +.highlight .hll { + display: block; +} + +div.highlight pre, +table.highlighttable pre { + margin: 0; +} + +div.code-block-caption + div { + margin-top: 0; +} + +div.code-block-caption { + margin-top: 1em; + padding: 2px 5px; + font-size: small; +} + +div.code-block-caption code { + background-color: transparent; +} + +table.highlighttable td.linenos, +span.linenos, +div.highlight span.gp { /* gp: Generic.Prompt */ + user-select: none; + -webkit-user-select: text; /* Safari fallback only */ + -webkit-user-select: none; /* Chrome/Safari */ + -moz-user-select: none; /* Firefox */ + -ms-user-select: none; /* IE10+ */ +} + +div.code-block-caption span.caption-number { + padding: 0.1em 0.3em; + font-style: italic; +} + +div.code-block-caption span.caption-text { +} + +div.literal-block-wrapper { + margin: 1em 0; +} + +code.xref, a code { + background-color: transparent; + font-weight: bold; +} + +h1 code, h2 code, h3 code, h4 code, h5 code, h6 code { + background-color: 
transparent; +} + +.viewcode-link { + float: right; +} + +.viewcode-back { + float: right; + font-family: sans-serif; +} + +div.viewcode-block:target { + margin: -1px -10px; + padding: 0 10px; +} + +/* -- math display ---------------------------------------------------------- */ + +img.math { + vertical-align: middle; +} + +div.body div.math p { + text-align: center; +} + +span.eqno { + float: right; +} + +span.eqno a.headerlink { + position: absolute; + z-index: 1; +} + +div.math:hover a.headerlink { + visibility: visible; +} + +/* -- printout stylesheet --------------------------------------------------- */ + +@media print { + div.document, + div.documentwrapper, + div.bodywrapper { + margin: 0 !important; + width: 100%; + } + + div.sphinxsidebar, + div.related, + div.footer, + #top-link { + display: none; + } +} \ No newline at end of file diff --git a/_static/check-solid.svg b/_static/check-solid.svg new file mode 100644 index 0000000000..92fad4b5c0 --- /dev/null +++ b/_static/check-solid.svg @@ -0,0 +1,4 @@ + + + + diff --git a/_static/clipboard.min.js b/_static/clipboard.min.js new file mode 100644 index 0000000000..54b3c46381 --- /dev/null +++ b/_static/clipboard.min.js @@ -0,0 +1,7 @@ +/*! + * clipboard.js v2.0.8 + * https://clipboardjs.com/ + * + * Licensed MIT © Zeno Rocha + */ +!function(t,e){"object"==typeof exports&&"object"==typeof module?module.exports=e():"function"==typeof define&&define.amd?define([],e):"object"==typeof exports?exports.ClipboardJS=e():t.ClipboardJS=e()}(this,function(){return n={686:function(t,e,n){"use strict";n.d(e,{default:function(){return o}});var e=n(279),i=n.n(e),e=n(370),u=n.n(e),e=n(817),c=n.n(e);function a(t){try{return document.execCommand(t)}catch(t){return}}var f=function(t){t=c()(t);return a("cut"),t};var l=function(t){var e,n,o,r=1 + + + + diff --git a/_static/copybutton.css b/_static/copybutton.css new file mode 100644 index 0000000000..f1916ec7d1 --- /dev/null +++ b/_static/copybutton.css @@ -0,0 +1,94 @@ +/* Copy buttons */ +button.copybtn { + position: absolute; + display: flex; + top: .3em; + right: .3em; + width: 1.7em; + height: 1.7em; + opacity: 0; + transition: opacity 0.3s, border .3s, background-color .3s; + user-select: none; + padding: 0; + border: none; + outline: none; + border-radius: 0.4em; + /* The colors that GitHub uses */ + border: #1b1f2426 1px solid; + background-color: #f6f8fa; + color: #57606a; +} + +button.copybtn.success { + border-color: #22863a; + color: #22863a; +} + +button.copybtn svg { + stroke: currentColor; + width: 1.5em; + height: 1.5em; + padding: 0.1em; +} + +div.highlight { + position: relative; +} + +/* Show the copybutton */ +.highlight:hover button.copybtn, button.copybtn.success { + opacity: 1; +} + +.highlight button.copybtn:hover { + background-color: rgb(235, 235, 235); +} + +.highlight button.copybtn:active { + background-color: rgb(187, 187, 187); +} + +/** + * A minimal CSS-only tooltip copied from: + * https://codepen.io/mildrenben/pen/rVBrpK + * + * To use, write HTML like the following: + * + *

Short

+ */ + .o-tooltip--left { + position: relative; + } + + .o-tooltip--left:after { + opacity: 0; + visibility: hidden; + position: absolute; + content: attr(data-tooltip); + padding: .2em; + font-size: .8em; + left: -.2em; + background: grey; + color: white; + white-space: nowrap; + z-index: 2; + border-radius: 2px; + transform: translateX(-102%) translateY(0); + transition: opacity 0.2s cubic-bezier(0.64, 0.09, 0.08, 1), transform 0.2s cubic-bezier(0.64, 0.09, 0.08, 1); +} + +.o-tooltip--left:hover:after { + display: block; + opacity: 1; + visibility: visible; + transform: translateX(-100%) translateY(0); + transition: opacity 0.2s cubic-bezier(0.64, 0.09, 0.08, 1), transform 0.2s cubic-bezier(0.64, 0.09, 0.08, 1); + transition-delay: .5s; +} + +/* By default the copy button shouldn't show up when printing a page */ +@media print { + button.copybtn { + display: none; + } +} diff --git a/_static/copybutton.js b/_static/copybutton.js new file mode 100644 index 0000000000..2ea7ff3e21 --- /dev/null +++ b/_static/copybutton.js @@ -0,0 +1,248 @@ +// Localization support +const messages = { + 'en': { + 'copy': 'Copy', + 'copy_to_clipboard': 'Copy to clipboard', + 'copy_success': 'Copied!', + 'copy_failure': 'Failed to copy', + }, + 'es' : { + 'copy': 'Copiar', + 'copy_to_clipboard': 'Copiar al portapapeles', + 'copy_success': '¡Copiado!', + 'copy_failure': 'Error al copiar', + }, + 'de' : { + 'copy': 'Kopieren', + 'copy_to_clipboard': 'In die Zwischenablage kopieren', + 'copy_success': 'Kopiert!', + 'copy_failure': 'Fehler beim Kopieren', + }, + 'fr' : { + 'copy': 'Copier', + 'copy_to_clipboard': 'Copier dans le presse-papier', + 'copy_success': 'Copié !', + 'copy_failure': 'Échec de la copie', + }, + 'ru': { + 'copy': 'Скопировать', + 'copy_to_clipboard': 'Скопировать в буфер', + 'copy_success': 'Скопировано!', + 'copy_failure': 'Не удалось скопировать', + }, + 'zh-CN': { + 'copy': '复制', + 'copy_to_clipboard': '复制到剪贴板', + 'copy_success': '复制成功!', + 'copy_failure': '复制失败', + }, + 'it' : { + 'copy': 'Copiare', + 'copy_to_clipboard': 'Copiato negli appunti', + 'copy_success': 'Copiato!', + 'copy_failure': 'Errore durante la copia', + } +} + +let locale = 'en' +if( document.documentElement.lang !== undefined + && messages[document.documentElement.lang] !== undefined ) { + locale = document.documentElement.lang +} + +let doc_url_root = DOCUMENTATION_OPTIONS.URL_ROOT; +if (doc_url_root == '#') { + doc_url_root = ''; +} + +/** + * SVG files for our copy buttons + */ +let iconCheck = ` + ${messages[locale]['copy_success']} + + +` + +// If the user specified their own SVG use that, otherwise use the default +let iconCopy = ``; +if (!iconCopy) { + iconCopy = ` + ${messages[locale]['copy_to_clipboard']} + + + +` +} + +/** + * Set up copy/paste for code blocks + */ + +const runWhenDOMLoaded = cb => { + if (document.readyState != 'loading') { + cb() + } else if (document.addEventListener) { + document.addEventListener('DOMContentLoaded', cb) + } else { + document.attachEvent('onreadystatechange', function() { + if (document.readyState == 'complete') cb() + }) + } +} + +const codeCellId = index => `codecell${index}` + +// Clears selected text since ClipboardJS will select the text when copying +const clearSelection = () => { + if (window.getSelection) { + window.getSelection().removeAllRanges() + } else if (document.selection) { + document.selection.empty() + } +} + +// Changes tooltip text for a moment, then changes it back +// We want the timeout of our `success` class to be a bit shorter than the +// 
tooltip and icon change, so that we can hide the icon before changing back. +var timeoutIcon = 2000; +var timeoutSuccessClass = 1500; + +const temporarilyChangeTooltip = (el, oldText, newText) => { + el.setAttribute('data-tooltip', newText) + el.classList.add('success') + // Remove success a little bit sooner than we change the tooltip + // So that we can use CSS to hide the copybutton first + setTimeout(() => el.classList.remove('success'), timeoutSuccessClass) + setTimeout(() => el.setAttribute('data-tooltip', oldText), timeoutIcon) +} + +// Changes the copy button icon for two seconds, then changes it back +const temporarilyChangeIcon = (el) => { + el.innerHTML = iconCheck; + setTimeout(() => {el.innerHTML = iconCopy}, timeoutIcon) +} + +const addCopyButtonToCodeCells = () => { + // If ClipboardJS hasn't loaded, wait a bit and try again. This + // happens because we load ClipboardJS asynchronously. + if (window.ClipboardJS === undefined) { + setTimeout(addCopyButtonToCodeCells, 250) + return + } + + // Add copybuttons to all of our code cells + const COPYBUTTON_SELECTOR = 'div.highlight pre'; + const codeCells = document.querySelectorAll(COPYBUTTON_SELECTOR) + codeCells.forEach((codeCell, index) => { + const id = codeCellId(index) + codeCell.setAttribute('id', id) + + const clipboardButton = id => + `` + codeCell.insertAdjacentHTML('afterend', clipboardButton(id)) + }) + +function escapeRegExp(string) { + return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& means the whole matched string +} + +/** + * Removes excluded text from a Node. + * + * @param {Node} target Node to filter. + * @param {string} exclude CSS selector of nodes to exclude. + * @returns {DOMString} Text from `target` with text removed. + */ +function filterText(target, exclude) { + const clone = target.cloneNode(true); // clone as to not modify the live DOM + if (exclude) { + // remove excluded nodes + clone.querySelectorAll(exclude).forEach(node => node.remove()); + } + return clone.innerText; +} + +// Callback when a copy button is clicked. Will be passed the node that was clicked +// should then grab the text and replace pieces of text that shouldn't be used in output +function formatCopyText(textContent, copybuttonPromptText, isRegexp = false, onlyCopyPromptLines = true, removePrompts = true, copyEmptyLines = true, lineContinuationChar = "", hereDocDelim = "") { + var regexp; + var match; + + // Do we check for line continuation characters and "HERE-documents"? 
+ var useLineCont = !!lineContinuationChar + var useHereDoc = !!hereDocDelim + + // create regexp to capture prompt and remaining line + if (isRegexp) { + regexp = new RegExp('^(' + copybuttonPromptText + ')(.*)') + } else { + regexp = new RegExp('^(' + escapeRegExp(copybuttonPromptText) + ')(.*)') + } + + const outputLines = []; + var promptFound = false; + var gotLineCont = false; + var gotHereDoc = false; + const lineGotPrompt = []; + for (const line of textContent.split('\n')) { + match = line.match(regexp) + if (match || gotLineCont || gotHereDoc) { + promptFound = regexp.test(line) + lineGotPrompt.push(promptFound) + if (removePrompts && promptFound) { + outputLines.push(match[2]) + } else { + outputLines.push(line) + } + gotLineCont = line.endsWith(lineContinuationChar) & useLineCont + if (line.includes(hereDocDelim) & useHereDoc) + gotHereDoc = !gotHereDoc + } else if (!onlyCopyPromptLines) { + outputLines.push(line) + } else if (copyEmptyLines && line.trim() === '') { + outputLines.push(line) + } + } + + // If no lines with the prompt were found then just use original lines + if (lineGotPrompt.some(v => v === true)) { + textContent = outputLines.join('\n'); + } + + // Remove a trailing newline to avoid auto-running when pasting + if (textContent.endsWith("\n")) { + textContent = textContent.slice(0, -1) + } + return textContent +} + + +var copyTargetText = (trigger) => { + var target = document.querySelector(trigger.attributes['data-clipboard-target'].value); + + // get filtered text + let exclude = '.linenos'; + + let text = filterText(target, exclude); + return formatCopyText(text, '', false, true, true, true, '', '') +} + + // Initialize with a callback so we can modify the text before copy + const clipboard = new ClipboardJS('.copybtn', {text: copyTargetText}) + + // Update UI with error/success messages + clipboard.on('success', event => { + clearSelection() + temporarilyChangeTooltip(event.trigger, messages[locale]['copy'], messages[locale]['copy_success']) + temporarilyChangeIcon(event.trigger) + }) + + clipboard.on('error', event => { + temporarilyChangeTooltip(event.trigger, messages[locale]['copy'], messages[locale]['copy_failure']) + }) +} + +runWhenDOMLoaded(addCopyButtonToCodeCells) \ No newline at end of file diff --git a/_static/copybutton_funcs.js b/_static/copybutton_funcs.js new file mode 100644 index 0000000000..dbe1aaad79 --- /dev/null +++ b/_static/copybutton_funcs.js @@ -0,0 +1,73 @@ +function escapeRegExp(string) { + return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& means the whole matched string +} + +/** + * Removes excluded text from a Node. + * + * @param {Node} target Node to filter. + * @param {string} exclude CSS selector of nodes to exclude. + * @returns {DOMString} Text from `target` with text removed. + */ +export function filterText(target, exclude) { + const clone = target.cloneNode(true); // clone as to not modify the live DOM + if (exclude) { + // remove excluded nodes + clone.querySelectorAll(exclude).forEach(node => node.remove()); + } + return clone.innerText; +} + +// Callback when a copy button is clicked. 
Will be passed the node that was clicked +// should then grab the text and replace pieces of text that shouldn't be used in output +export function formatCopyText(textContent, copybuttonPromptText, isRegexp = false, onlyCopyPromptLines = true, removePrompts = true, copyEmptyLines = true, lineContinuationChar = "", hereDocDelim = "") { + var regexp; + var match; + + // Do we check for line continuation characters and "HERE-documents"? + var useLineCont = !!lineContinuationChar + var useHereDoc = !!hereDocDelim + + // create regexp to capture prompt and remaining line + if (isRegexp) { + regexp = new RegExp('^(' + copybuttonPromptText + ')(.*)') + } else { + regexp = new RegExp('^(' + escapeRegExp(copybuttonPromptText) + ')(.*)') + } + + const outputLines = []; + var promptFound = false; + var gotLineCont = false; + var gotHereDoc = false; + const lineGotPrompt = []; + for (const line of textContent.split('\n')) { + match = line.match(regexp) + if (match || gotLineCont || gotHereDoc) { + promptFound = regexp.test(line) + lineGotPrompt.push(promptFound) + if (removePrompts && promptFound) { + outputLines.push(match[2]) + } else { + outputLines.push(line) + } + gotLineCont = line.endsWith(lineContinuationChar) & useLineCont + if (line.includes(hereDocDelim) & useHereDoc) + gotHereDoc = !gotHereDoc + } else if (!onlyCopyPromptLines) { + outputLines.push(line) + } else if (copyEmptyLines && line.trim() === '') { + outputLines.push(line) + } + } + + // If no lines with the prompt were found then just use original lines + if (lineGotPrompt.some(v => v === true)) { + textContent = outputLines.join('\n'); + } + + // Remove a trailing newline to avoid auto-running when pasting + if (textContent.endsWith("\n")) { + textContent = textContent.slice(0, -1) + } + return textContent +} diff --git a/_static/css/badge_only.css b/_static/css/badge_only.css new file mode 100644 index 0000000000..3c33cef545 --- /dev/null +++ b/_static/css/badge_only.css @@ -0,0 +1 @@ +.fa:before{-webkit-font-smoothing:antialiased}.clearfix{*zoom:1}.clearfix:before,.clearfix:after{display:table;content:""}.clearfix:after{clear:both}@font-face{font-family:FontAwesome;font-weight:normal;font-style:normal;src:url("../fonts/fontawesome-webfont.eot");src:url("../fonts/fontawesome-webfont.eot?#iefix") format("embedded-opentype"),url("../fonts/fontawesome-webfont.woff") format("woff"),url("../fonts/fontawesome-webfont.ttf") format("truetype"),url("../fonts/fontawesome-webfont.svg#FontAwesome") format("svg")}.fa:before{display:inline-block;font-family:FontAwesome;font-style:normal;font-weight:normal;line-height:1;text-decoration:inherit}a .fa{display:inline-block;text-decoration:inherit}li .fa{display:inline-block}li .fa-large:before,li .fa-large:before{width:1.875em}ul.fas{list-style-type:none;margin-left:2em;text-indent:-0.8em}ul.fas li .fa{width:.8em}ul.fas li .fa-large:before,ul.fas li .fa-large:before{vertical-align:baseline}.fa-book:before{content:""}.icon-book:before{content:""}.fa-caret-down:before{content:""}.icon-caret-down:before{content:""}.fa-caret-up:before{content:""}.icon-caret-up:before{content:""}.fa-caret-left:before{content:""}.icon-caret-left:before{content:""}.fa-caret-right:before{content:""}.icon-caret-right:before{content:""}.rst-versions{position:fixed;bottom:0;left:0;width:300px;color:#fcfcfc;background:#1f1d1d;font-family:"Lato","proxima-nova","Helvetica Neue",Arial,sans-serif;z-index:400}.rst-versions a{color:#2980B9;text-decoration:none}.rst-versions .rst-badge-small{display:none}.rst-versions 
.rst-current-version{padding:12px;background-color:#272525;display:block;text-align:right;font-size:90%;cursor:pointer;color:#27AE60;*zoom:1}.rst-versions .rst-current-version:before,.rst-versions .rst-current-version:after{display:table;content:""}.rst-versions .rst-current-version:after{clear:both}.rst-versions .rst-current-version .fa{color:#fcfcfc}.rst-versions .rst-current-version .fa-book{float:left}.rst-versions .rst-current-version .icon-book{float:left}.rst-versions .rst-current-version.rst-out-of-date{background-color:#E74C3C;color:#fff}.rst-versions .rst-current-version.rst-active-old-version{background-color:#F1C40F;color:#000}.rst-versions.shift-up{height:auto;max-height:100%;overflow-y:scroll}.rst-versions.shift-up .rst-other-versions{display:block}.rst-versions .rst-other-versions{font-size:90%;padding:12px;color:gray;display:none}.rst-versions .rst-other-versions hr{display:block;height:1px;border:0;margin:20px 0;padding:0;border-top:solid 1px #413d3d}.rst-versions .rst-other-versions dd{display:inline-block;margin:0}.rst-versions .rst-other-versions dd a{display:inline-block;padding:6px;color:#fcfcfc}.rst-versions.rst-badge{width:auto;bottom:20px;right:20px;left:auto;border:none;max-width:300px;max-height:90%}.rst-versions.rst-badge .icon-book{float:none}.rst-versions.rst-badge .fa-book{float:none}.rst-versions.rst-badge.shift-up .rst-current-version{text-align:right}.rst-versions.rst-badge.shift-up .rst-current-version .fa-book{float:left}.rst-versions.rst-badge.shift-up .rst-current-version .icon-book{float:left}.rst-versions.rst-badge .rst-current-version{width:auto;height:30px;line-height:30px;padding:0 6px;display:block;text-align:center}@media screen and (max-width: 768px){.rst-versions{width:85%;display:none}.rst-versions.shift{display:block}} diff --git a/_static/css/theme.css b/_static/css/theme.css new file mode 100644 index 0000000000..aed8cef066 --- /dev/null +++ b/_static/css/theme.css @@ -0,0 +1,6 @@ +/* sphinx_rtd_theme version 0.4.3 | MIT license */ +/* Built 20190212 16:02 */ +*{-webkit-box-sizing:border-box;-moz-box-sizing:border-box;box-sizing:border-box}article,aside,details,figcaption,figure,footer,header,hgroup,nav,section{display:block}audio,canvas,video{display:inline-block;*display:inline;*zoom:1}audio:not([controls]){display:none}[hidden]{display:none}*{-webkit-box-sizing:border-box;-moz-box-sizing:border-box;box-sizing:border-box}html{font-size:100%;-webkit-text-size-adjust:100%;-ms-text-size-adjust:100%}body{margin:0}a:hover,a:active{outline:0}abbr[title]{border-bottom:1px dotted}b,strong{font-weight:bold}blockquote{margin:0}dfn{font-style:italic}ins{background:#ff9;color:#000;text-decoration:none}mark{background:#ff0;color:#000;font-style:italic;font-weight:bold}pre,code,.rst-content tt,.rst-content code,kbd,samp{font-family:monospace,serif;_font-family:"courier 
new",monospace;font-size:1em}pre{white-space:pre}q{quotes:none}q:before,q:after{content:"";content:none}small{font-size:85%}sub,sup{font-size:75%;line-height:0;position:relative;vertical-align:baseline}sup{top:-0.5em}sub{bottom:-0.25em}ul,ol,dl{margin:0;padding:0;list-style:none;list-style-image:none}li{list-style:none}dd{margin:0}img{border:0;-ms-interpolation-mode:bicubic;vertical-align:middle;max-width:100%}svg:not(:root){overflow:hidden}figure{margin:0}form{margin:0}fieldset{border:0;margin:0;padding:0}label{cursor:pointer}legend{border:0;*margin-left:-7px;padding:0;white-space:normal}button,input,select,textarea{font-size:100%;margin:0;vertical-align:baseline;*vertical-align:middle}button,input{line-height:normal}button,input[type="button"],input[type="reset"],input[type="submit"]{cursor:pointer;-webkit-appearance:button;*overflow:visible}button[disabled],input[disabled]{cursor:default}input[type="checkbox"],input[type="radio"]{box-sizing:border-box;padding:0;*width:13px;*height:13px}input[type="search"]{-webkit-appearance:textfield;-moz-box-sizing:content-box;-webkit-box-sizing:content-box;box-sizing:content-box}input[type="search"]::-webkit-search-decoration,input[type="search"]::-webkit-search-cancel-button{-webkit-appearance:none}button::-moz-focus-inner,input::-moz-focus-inner{border:0;padding:0}textarea{overflow:auto;vertical-align:top;resize:vertical}table{border-collapse:collapse;border-spacing:0}td{vertical-align:top}.chromeframe{margin:.2em 0;background:#ccc;color:#000;padding:.2em 0}.ir{display:block;border:0;text-indent:-999em;overflow:hidden;background-color:transparent;background-repeat:no-repeat;text-align:left;direction:ltr;*line-height:0}.ir br{display:none}.hidden{display:none !important;visibility:hidden}.visuallyhidden{border:0;clip:rect(0 0 0 0);height:1px;margin:-1px;overflow:hidden;padding:0;position:absolute;width:1px}.visuallyhidden.focusable:active,.visuallyhidden.focusable:focus{clip:auto;height:auto;margin:0;overflow:visible;position:static;width:auto}.invisible{visibility:hidden}.relative{position:relative}big,small{font-size:100%}@media print{html,body,section{background:none !important}*{box-shadow:none !important;text-shadow:none !important;filter:none !important;-ms-filter:none !important}a,a:visited{text-decoration:underline}.ir a:after,a[href^="javascript:"]:after,a[href^="#"]:after{content:""}pre,blockquote{page-break-inside:avoid}thead{display:table-header-group}tr,img{page-break-inside:avoid}img{max-width:100% !important}@page{margin:.5cm}p,h2,.rst-content .toctree-wrapper p.caption,h3{orphans:3;widows:3}h2,.rst-content .toctree-wrapper p.caption,h3{page-break-after:avoid}}.fa:before,.wy-menu-vertical li span.toctree-expand:before,.wy-menu-vertical li.on a span.toctree-expand:before,.wy-menu-vertical li.current>a span.toctree-expand:before,.rst-content .admonition-title:before,.rst-content h1 .headerlink:before,.rst-content h2 .headerlink:before,.rst-content h3 .headerlink:before,.rst-content h4 .headerlink:before,.rst-content h5 .headerlink:before,.rst-content h6 .headerlink:before,.rst-content dl dt .headerlink:before,.rst-content p.caption .headerlink:before,.rst-content table>caption .headerlink:before,.rst-content .code-block-caption .headerlink:before,.rst-content tt.download span:first-child:before,.rst-content code.download span:first-child:before,.icon:before,.wy-dropdown .caret:before,.wy-inline-validate.wy-inline-validate-success .wy-input-context:before,.wy-inline-validate.wy-inline-validate-danger 
.wy-input-context:before,.wy-inline-validate.wy-inline-validate-warning .wy-input-context:before,.wy-inline-validate.wy-inline-validate-info .wy-input-context:before,.wy-alert,.rst-content .note,.rst-content .attention,.rst-content .caution,.rst-content .danger,.rst-content .error,.rst-content .hint,.rst-content .important,.rst-content .tip,.rst-content .warning,.rst-content .seealso,.rst-content .admonition-todo,.rst-content .admonition,.btn,input[type="text"],input[type="password"],input[type="email"],input[type="url"],input[type="date"],input[type="month"],input[type="time"],input[type="datetime"],input[type="datetime-local"],input[type="week"],input[type="number"],input[type="search"],input[type="tel"],input[type="color"],select,textarea,.wy-menu-vertical li.on a,.wy-menu-vertical li.current>a,.wy-side-nav-search>a,.wy-side-nav-search .wy-dropdown>a,.wy-nav-top a{-webkit-font-smoothing:antialiased}.clearfix{*zoom:1}.clearfix:before,.clearfix:after{display:table;content:""}.clearfix:after{clear:both}/*! + * Font Awesome 4.7.0 by @davegandy - http://fontawesome.io - @fontawesome + * License - http://fontawesome.io/license (Font: SIL OFL 1.1, CSS: MIT License) + */@font-face{font-family:'FontAwesome';src:url("../fonts/fontawesome-webfont.eot?v=4.7.0");src:url("../fonts/fontawesome-webfont.eot?#iefix&v=4.7.0") format("embedded-opentype"),url("../fonts/fontawesome-webfont.woff2?v=4.7.0") format("woff2"),url("../fonts/fontawesome-webfont.woff?v=4.7.0") format("woff"),url("../fonts/fontawesome-webfont.ttf?v=4.7.0") format("truetype"),url("../fonts/fontawesome-webfont.svg?v=4.7.0#fontawesomeregular") format("svg");font-weight:normal;font-style:normal}.fa,.wy-menu-vertical li span.toctree-expand,.wy-menu-vertical li.on a span.toctree-expand,.wy-menu-vertical li.current>a span.toctree-expand,.rst-content .admonition-title,.rst-content h1 .headerlink,.rst-content h2 .headerlink,.rst-content h3 .headerlink,.rst-content h4 .headerlink,.rst-content h5 .headerlink,.rst-content h6 .headerlink,.rst-content dl dt .headerlink,.rst-content p.caption .headerlink,.rst-content table>caption .headerlink,.rst-content .code-block-caption .headerlink,.rst-content tt.download span:first-child,.rst-content code.download span:first-child,.icon{display:inline-block;font:normal normal normal 14px/1 FontAwesome;font-size:inherit;text-rendering:auto;-webkit-font-smoothing:antialiased;-moz-osx-font-smoothing:grayscale}.fa-lg{font-size:1.3333333333em;line-height:.75em;vertical-align:-15%}.fa-2x{font-size:2em}.fa-3x{font-size:3em}.fa-4x{font-size:4em}.fa-5x{font-size:5em}.fa-fw{width:1.2857142857em;text-align:center}.fa-ul{padding-left:0;margin-left:2.1428571429em;list-style-type:none}.fa-ul>li{position:relative}.fa-li{position:absolute;left:-2.1428571429em;width:2.1428571429em;top:.1428571429em;text-align:center}.fa-li.fa-lg{left:-1.8571428571em}.fa-border{padding:.2em .25em .15em;border:solid 0.08em #eee;border-radius:.1em}.fa-pull-left{float:left}.fa-pull-right{float:right}.fa.fa-pull-left,.wy-menu-vertical li span.fa-pull-left.toctree-expand,.wy-menu-vertical li.on a span.fa-pull-left.toctree-expand,.wy-menu-vertical li.current>a span.fa-pull-left.toctree-expand,.rst-content .fa-pull-left.admonition-title,.rst-content h1 .fa-pull-left.headerlink,.rst-content h2 .fa-pull-left.headerlink,.rst-content h3 .fa-pull-left.headerlink,.rst-content h4 .fa-pull-left.headerlink,.rst-content h5 .fa-pull-left.headerlink,.rst-content h6 .fa-pull-left.headerlink,.rst-content dl dt .fa-pull-left.headerlink,.rst-content p.caption 
.fa-pull-left.headerlink,.rst-content table>caption .fa-pull-left.headerlink,.rst-content .code-block-caption .fa-pull-left.headerlink,.rst-content tt.download span.fa-pull-left:first-child,.rst-content code.download span.fa-pull-left:first-child,.fa-pull-left.icon{margin-right:.3em}.fa.fa-pull-right,.wy-menu-vertical li span.fa-pull-right.toctree-expand,.wy-menu-vertical li.on a span.fa-pull-right.toctree-expand,.wy-menu-vertical li.current>a span.fa-pull-right.toctree-expand,.rst-content .fa-pull-right.admonition-title,.rst-content h1 .fa-pull-right.headerlink,.rst-content h2 .fa-pull-right.headerlink,.rst-content h3 .fa-pull-right.headerlink,.rst-content h4 .fa-pull-right.headerlink,.rst-content h5 .fa-pull-right.headerlink,.rst-content h6 .fa-pull-right.headerlink,.rst-content dl dt .fa-pull-right.headerlink,.rst-content p.caption .fa-pull-right.headerlink,.rst-content table>caption .fa-pull-right.headerlink,.rst-content .code-block-caption .fa-pull-right.headerlink,.rst-content tt.download span.fa-pull-right:first-child,.rst-content code.download span.fa-pull-right:first-child,.fa-pull-right.icon{margin-left:.3em}.pull-right{float:right}.pull-left{float:left}.fa.pull-left,.wy-menu-vertical li span.pull-left.toctree-expand,.wy-menu-vertical li.on a span.pull-left.toctree-expand,.wy-menu-vertical li.current>a span.pull-left.toctree-expand,.rst-content .pull-left.admonition-title,.rst-content h1 .pull-left.headerlink,.rst-content h2 .pull-left.headerlink,.rst-content h3 .pull-left.headerlink,.rst-content h4 .pull-left.headerlink,.rst-content h5 .pull-left.headerlink,.rst-content h6 .pull-left.headerlink,.rst-content dl dt .pull-left.headerlink,.rst-content p.caption .pull-left.headerlink,.rst-content table>caption .pull-left.headerlink,.rst-content .code-block-caption .pull-left.headerlink,.rst-content tt.download span.pull-left:first-child,.rst-content code.download span.pull-left:first-child,.pull-left.icon{margin-right:.3em}.fa.pull-right,.wy-menu-vertical li span.pull-right.toctree-expand,.wy-menu-vertical li.on a span.pull-right.toctree-expand,.wy-menu-vertical li.current>a span.pull-right.toctree-expand,.rst-content .pull-right.admonition-title,.rst-content h1 .pull-right.headerlink,.rst-content h2 .pull-right.headerlink,.rst-content h3 .pull-right.headerlink,.rst-content h4 .pull-right.headerlink,.rst-content h5 .pull-right.headerlink,.rst-content h6 .pull-right.headerlink,.rst-content dl dt .pull-right.headerlink,.rst-content p.caption .pull-right.headerlink,.rst-content table>caption .pull-right.headerlink,.rst-content .code-block-caption .pull-right.headerlink,.rst-content tt.download span.pull-right:first-child,.rst-content code.download span.pull-right:first-child,.pull-right.icon{margin-left:.3em}.fa-spin{-webkit-animation:fa-spin 2s infinite linear;animation:fa-spin 2s infinite linear}.fa-pulse{-webkit-animation:fa-spin 1s infinite steps(8);animation:fa-spin 1s infinite steps(8)}@-webkit-keyframes fa-spin{0%{-webkit-transform:rotate(0deg);transform:rotate(0deg)}100%{-webkit-transform:rotate(359deg);transform:rotate(359deg)}}@keyframes 
fa-spin{0%{-webkit-transform:rotate(0deg);transform:rotate(0deg)}100%{-webkit-transform:rotate(359deg);transform:rotate(359deg)}}.fa-rotate-90{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=1)";-webkit-transform:rotate(90deg);-ms-transform:rotate(90deg);transform:rotate(90deg)}.fa-rotate-180{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=2)";-webkit-transform:rotate(180deg);-ms-transform:rotate(180deg);transform:rotate(180deg)}.fa-rotate-270{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=3)";-webkit-transform:rotate(270deg);-ms-transform:rotate(270deg);transform:rotate(270deg)}.fa-flip-horizontal{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=0, mirror=1)";-webkit-transform:scale(-1, 1);-ms-transform:scale(-1, 1);transform:scale(-1, 1)}.fa-flip-vertical{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=2, mirror=1)";-webkit-transform:scale(1, -1);-ms-transform:scale(1, -1);transform:scale(1, -1)}:root .fa-rotate-90,:root .fa-rotate-180,:root .fa-rotate-270,:root .fa-flip-horizontal,:root .fa-flip-vertical{filter:none}.fa-stack{position:relative;display:inline-block;width:2em;height:2em;line-height:2em;vertical-align:middle}.fa-stack-1x,.fa-stack-2x{position:absolute;left:0;width:100%;text-align:center}.fa-stack-1x{line-height:inherit}.fa-stack-2x{font-size:2em}.fa-inverse{color:#fff}.fa-glass:before{content:""}.fa-music:before{content:""}.fa-search:before,.icon-search:before{content:""}.fa-envelope-o:before{content:""}.fa-heart:before{content:""}.fa-star:before{content:""}.fa-star-o:before{content:""}.fa-user:before{content:""}.fa-film:before{content:""}.fa-th-large:before{content:""}.fa-th:before{content:""}.fa-th-list:before{content:""}.fa-check:before{content:""}.fa-remove:before,.fa-close:before,.fa-times:before{content:""}.fa-search-plus:before{content:""}.fa-search-minus:before{content:""}.fa-power-off:before{content:""}.fa-signal:before{content:""}.fa-gear:before,.fa-cog:before{content:""}.fa-trash-o:before{content:""}.fa-home:before,.icon-home:before{content:""}.fa-file-o:before{content:""}.fa-clock-o:before{content:""}.fa-road:before{content:""}.fa-download:before,.rst-content tt.download span:first-child:before,.rst-content code.download 
span:first-child:before{content:""}.fa-arrow-circle-o-down:before{content:""}.fa-arrow-circle-o-up:before{content:""}.fa-inbox:before{content:""}.fa-play-circle-o:before{content:""}.fa-rotate-right:before,.fa-repeat:before{content:""}.fa-refresh:before{content:""}.fa-list-alt:before{content:""}.fa-lock:before{content:""}.fa-flag:before{content:""}.fa-headphones:before{content:""}.fa-volume-off:before{content:""}.fa-volume-down:before{content:""}.fa-volume-up:before{content:""}.fa-qrcode:before{content:""}.fa-barcode:before{content:""}.fa-tag:before{content:""}.fa-tags:before{content:""}.fa-book:before,.icon-book:before{content:""}.fa-bookmark:before{content:""}.fa-print:before{content:""}.fa-camera:before{content:""}.fa-font:before{content:""}.fa-bold:before{content:""}.fa-italic:before{content:""}.fa-text-height:before{content:""}.fa-text-width:before{content:""}.fa-align-left:before{content:""}.fa-align-center:before{content:""}.fa-align-right:before{content:""}.fa-align-justify:before{content:""}.fa-list:before{content:""}.fa-dedent:before,.fa-outdent:before{content:""}.fa-indent:before{content:""}.fa-video-camera:before{content:""}.fa-photo:before,.fa-image:before,.fa-picture-o:before{content:""}.fa-pencil:before{content:""}.fa-map-marker:before{content:""}.fa-adjust:before{content:""}.fa-tint:before{content:""}.fa-edit:before,.fa-pencil-square-o:before{content:""}.fa-share-square-o:before{content:""}.fa-check-square-o:before{content:""}.fa-arrows:before{content:""}.fa-step-backward:before{content:""}.fa-fast-backward:before{content:""}.fa-backward:before{content:""}.fa-play:before{content:""}.fa-pause:before{content:""}.fa-stop:before{content:""}.fa-forward:before{content:""}.fa-fast-forward:before{content:""}.fa-step-forward:before{content:""}.fa-eject:before{content:""}.fa-chevron-left:before{content:""}.fa-chevron-right:before{content:""}.fa-plus-circle:before{content:""}.fa-minus-circle:before{content:""}.fa-times-circle:before,.wy-inline-validate.wy-inline-validate-danger .wy-input-context:before{content:""}.fa-check-circle:before,.wy-inline-validate.wy-inline-validate-success .wy-input-context:before{content:""}.fa-question-circle:before{content:""}.fa-info-circle:before{content:""}.fa-crosshairs:before{content:""}.fa-times-circle-o:before{content:""}.fa-check-circle-o:before{content:""}.fa-ban:before{content:""}.fa-arrow-left:before{content:""}.fa-arrow-right:before{content:""}.fa-arrow-up:before{content:""}.fa-arrow-down:before{content:""}.fa-mail-forward:before,.fa-share:before{content:""}.fa-expand:before{content:""}.fa-compress:before{content:""}.fa-plus:before{content:""}.fa-minus:before{content:""}.fa-asterisk:before{content:""}.fa-exclamation-circle:before,.wy-inline-validate.wy-inline-validate-warning .wy-input-context:before,.wy-inline-validate.wy-inline-validate-info .wy-input-context:before,.rst-content 
.admonition-title:before{content:""}.fa-gift:before{content:""}.fa-leaf:before{content:""}.fa-fire:before,.icon-fire:before{content:""}.fa-eye:before{content:""}.fa-eye-slash:before{content:""}.fa-warning:before,.fa-exclamation-triangle:before{content:""}.fa-plane:before{content:""}.fa-calendar:before{content:""}.fa-random:before{content:""}.fa-comment:before{content:""}.fa-magnet:before{content:""}.fa-chevron-up:before{content:""}.fa-chevron-down:before{content:""}.fa-retweet:before{content:""}.fa-shopping-cart:before{content:""}.fa-folder:before{content:""}.fa-folder-open:before{content:""}.fa-arrows-v:before{content:""}.fa-arrows-h:before{content:""}.fa-bar-chart-o:before,.fa-bar-chart:before{content:""}.fa-twitter-square:before{content:""}.fa-facebook-square:before{content:""}.fa-camera-retro:before{content:""}.fa-key:before{content:""}.fa-gears:before,.fa-cogs:before{content:""}.fa-comments:before{content:""}.fa-thumbs-o-up:before{content:""}.fa-thumbs-o-down:before{content:""}.fa-star-half:before{content:""}.fa-heart-o:before{content:""}.fa-sign-out:before{content:""}.fa-linkedin-square:before{content:""}.fa-thumb-tack:before{content:""}.fa-external-link:before{content:""}.fa-sign-in:before{content:""}.fa-trophy:before{content:""}.fa-github-square:before{content:""}.fa-upload:before{content:""}.fa-lemon-o:before{content:""}.fa-phone:before{content:""}.fa-square-o:before{content:""}.fa-bookmark-o:before{content:""}.fa-phone-square:before{content:""}.fa-twitter:before{content:""}.fa-facebook-f:before,.fa-facebook:before{content:""}.fa-github:before,.icon-github:before{content:""}.fa-unlock:before{content:""}.fa-credit-card:before{content:""}.fa-feed:before,.fa-rss:before{content:""}.fa-hdd-o:before{content:""}.fa-bullhorn:before{content:""}.fa-bell:before{content:""}.fa-certificate:before{content:""}.fa-hand-o-right:before{content:""}.fa-hand-o-left:before{content:""}.fa-hand-o-up:before{content:""}.fa-hand-o-down:before{content:""}.fa-arrow-circle-left:before,.icon-circle-arrow-left:before{content:""}.fa-arrow-circle-right:before,.icon-circle-arrow-right:before{content:""}.fa-arrow-circle-up:before{content:""}.fa-arrow-circle-down:before{content:""}.fa-globe:before{content:""}.fa-wrench:before{content:""}.fa-tasks:before{content:""}.fa-filter:before{content:""}.fa-briefcase:before{content:""}.fa-arrows-alt:before{content:""}.fa-group:before,.fa-users:before{content:""}.fa-chain:before,.fa-link:before,.icon-link:before{content:""}.fa-cloud:before{content:""}.fa-flask:before{content:""}.fa-cut:before,.fa-scissors:before{content:""}.fa-copy:before,.fa-files-o:before{content:""}.fa-paperclip:before{content:""}.fa-save:before,.fa-floppy-o:before{content:""}.fa-square:before{content:""}.fa-navicon:before,.fa-reorder:before,.fa-bars:before{content:""}.fa-list-ul:before{content:""}.fa-list-ol:before{content:""}.fa-strikethrough:before{content:""}.fa-underline:before{content:""}.fa-table:before{content:""}.fa-magic:before{content:""}.fa-truck:before{content:""}.fa-pinterest:before{content:""}.fa-pinterest-square:before{content:""}.fa-google-plus-square:before{content:""}.fa-google-plus:before{content:""}.fa-money:before{content:""}.fa-caret-down:before,.wy-dropdown 
.caret:before,.icon-caret-down:before{content:""}.fa-caret-up:before{content:""}.fa-caret-left:before{content:""}.fa-caret-right:before{content:""}.fa-columns:before{content:""}.fa-unsorted:before,.fa-sort:before{content:""}.fa-sort-down:before,.fa-sort-desc:before{content:""}.fa-sort-up:before,.fa-sort-asc:before{content:""}.fa-envelope:before{content:""}.fa-linkedin:before{content:""}.fa-rotate-left:before,.fa-undo:before{content:""}.fa-legal:before,.fa-gavel:before{content:""}.fa-dashboard:before,.fa-tachometer:before{content:""}.fa-comment-o:before{content:""}.fa-comments-o:before{content:""}.fa-flash:before,.fa-bolt:before{content:""}.fa-sitemap:before{content:""}.fa-umbrella:before{content:""}.fa-paste:before,.fa-clipboard:before{content:""}.fa-lightbulb-o:before{content:""}.fa-exchange:before{content:""}.fa-cloud-download:before{content:""}.fa-cloud-upload:before{content:""}.fa-user-md:before{content:""}.fa-stethoscope:before{content:""}.fa-suitcase:before{content:""}.fa-bell-o:before{content:""}.fa-coffee:before{content:""}.fa-cutlery:before{content:""}.fa-file-text-o:before{content:""}.fa-building-o:before{content:""}.fa-hospital-o:before{content:""}.fa-ambulance:before{content:""}.fa-medkit:before{content:""}.fa-fighter-jet:before{content:""}.fa-beer:before{content:""}.fa-h-square:before{content:""}.fa-plus-square:before{content:""}.fa-angle-double-left:before{content:""}.fa-angle-double-right:before{content:""}.fa-angle-double-up:before{content:""}.fa-angle-double-down:before{content:""}.fa-angle-left:before{content:""}.fa-angle-right:before{content:""}.fa-angle-up:before{content:""}.fa-angle-down:before{content:""}.fa-desktop:before{content:""}.fa-laptop:before{content:""}.fa-tablet:before{content:""}.fa-mobile-phone:before,.fa-mobile:before{content:""}.fa-circle-o:before{content:""}.fa-quote-left:before{content:""}.fa-quote-right:before{content:""}.fa-spinner:before{content:""}.fa-circle:before{content:""}.fa-mail-reply:before,.fa-reply:before{content:""}.fa-github-alt:before{content:""}.fa-folder-o:before{content:""}.fa-folder-open-o:before{content:""}.fa-smile-o:before{content:""}.fa-frown-o:before{content:""}.fa-meh-o:before{content:""}.fa-gamepad:before{content:""}.fa-keyboard-o:before{content:""}.fa-flag-o:before{content:""}.fa-flag-checkered:before{content:""}.fa-terminal:before{content:""}.fa-code:before{content:""}.fa-mail-reply-all:before,.fa-reply-all:before{content:""}.fa-star-half-empty:before,.fa-star-half-full:before,.fa-star-half-o:before{content:""}.fa-location-arrow:before{content:""}.fa-crop:before{content:""}.fa-code-fork:before{content:""}.fa-unlink:before,.fa-chain-broken:before{content:""}.fa-question:before{content:""}.fa-info:before{content:""}.fa-exclamation:before{content:""}.fa-superscript:before{content:""}.fa-subscript:before{content:""}.fa-eraser:before{content:""}.fa-puzzle-piece:before{content:""}.fa-microphone:before{content:""}.fa-microphone-slash:before{content:""}.fa-shield:before{content:""}.fa-calendar-o:before{content:""}.fa-fire-extinguisher:before{content:""}.fa-rocket:before{content:""}.fa-maxcdn:before{content:""}.fa-chevron-circle-left:before{content:""}.fa-chevron-circle-right:before{content:""}.fa-chevron-circle-up:before{content:""}.fa-chevron-circle-down:before{content:""}.fa-html5:before{content:""}.fa-css3:before{content:""}.fa-anchor:before{content:""}.fa-unlock-alt:before{content:""}.fa-bullseye:before{content:""}
.fa-ellipsis-h:before{content:""}.fa-ellipsis-v:before{content:""}.fa-rss-square:before{content:""}.fa-play-circle:before{content:""}.fa-ticket:before{content:""}.fa-minus-square:before{content:""}.fa-minus-square-o:before,.wy-menu-vertical li.on a span.toctree-expand:before,.wy-menu-vertical li.current>a span.toctree-expand:before{content:""}.fa-level-up:before{content:""}.fa-level-down:before{content:""}.fa-check-square:before{content:""}.fa-pencil-square:before{content:""}.fa-external-link-square:before{content:""}.fa-share-square:before{content:""}.fa-compass:before{content:""}.fa-toggle-down:before,.fa-caret-square-o-down:before{content:""}.fa-toggle-up:before,.fa-caret-square-o-up:before{content:""}.fa-toggle-right:before,.fa-caret-square-o-right:before{content:""}.fa-euro:before,.fa-eur:before{content:""}.fa-gbp:before{content:""}.fa-dollar:before,.fa-usd:before{content:""}.fa-rupee:before,.fa-inr:before{content:""}.fa-cny:before,.fa-rmb:before,.fa-yen:before,.fa-jpy:before{content:""}.fa-ruble:before,.fa-rouble:before,.fa-rub:before{content:""}.fa-won:before,.fa-krw:before{content:""}.fa-bitcoin:before,.fa-btc:before{content:""}.fa-file:before{content:""}.fa-file-text:before{content:""}.fa-sort-alpha-asc:before{content:""}.fa-sort-alpha-desc:before{content:""}.fa-sort-amount-asc:before{content:""}.fa-sort-amount-desc:before{content:""}.fa-sort-numeric-asc:before{content:""}.fa-sort-numeric-desc:before{content:""}.fa-thumbs-up:before{content:""}.fa-thumbs-down:before{content:""}.fa-youtube-square:before{content:""}.fa-youtube:before{content:""}.fa-xing:before{content:""}.fa-xing-square:before{content:""}.fa-youtube-play:before{content:""}.fa-dropbox:before{content:""}.fa-stack-overflow:before{content:""}.fa-instagram:before{content:""}.fa-flickr:before{content:""}.fa-adn:before{content:""}.fa-bitbucket:before,.icon-bitbucket:before{content:""}.fa-bitbucket-square:before{content:""}.fa-tumblr:before{content:""}.fa-tumblr-square:before{content:""}.fa-long-arrow-down:before{content:""}.fa-long-arrow-up:before{content:""}.fa-long-arrow-left:before{content:""}.fa-long-arrow-right:before{content:""}.fa-apple:before{content:""}.fa-windows:before{content:""}.fa-android:before{content:""}.fa-linux:before{content:""}.fa-dribbble:before{content:""}.fa-skype:before{content:""}.fa-foursquare:before{content:""}.fa-trello:before{content:""}.fa-female:before{content:""}.fa-male:before{content:""}.fa-gittip:before,.fa-gratipay:before{content:""}.fa-sun-o:before{content:""}.fa-moon-o:before{content:""}.fa-archive:before{content:""}.fa-bug:before{content:""}.fa-vk:before{content:""}.fa-weibo:before{content:""}.fa-renren:before{content:""}.fa-pagelines:before{content:""}.fa-stack-exchange:before{content:""}.fa-arrow-circle-o-right:before{content:""}.fa-arrow-circle-o-left:before{content:""}.fa-toggle-left:before,.fa-caret-square-o-left:before{content:""}.fa-dot-circle-o:before{content:""}.fa-wheelchair:before{content:""}.fa-vimeo-square:before{content:""}.fa-turkish-lira:before,.fa-try:before{content:""}.fa-plus-square-o:before,.wy-menu-vertical li
span.toctree-expand:before{content:""}.fa-space-shuttle:before{content:""}.fa-slack:before{content:""}.fa-envelope-square:before{content:""}.fa-wordpress:before{content:""}.fa-openid:before{content:""}.fa-institution:before,.fa-bank:before,.fa-university:before{content:""}.fa-mortar-board:before,.fa-graduation-cap:before{content:""}.fa-yahoo:before{content:""}.fa-google:before{content:""}.fa-reddit:before{content:""}.fa-reddit-square:before{content:""}.fa-stumbleupon-circle:before{content:""}.fa-stumbleupon:before{content:""}.fa-delicious:before{content:""}.fa-digg:before{content:""}.fa-pied-piper-pp:before{content:""}.fa-pied-piper-alt:before{content:""}.fa-drupal:before{content:""}.fa-joomla:before{content:""}.fa-language:before{content:""}.fa-fax:before{content:""}.fa-building:before{content:""}.fa-child:before{content:""}.fa-paw:before{content:""}.fa-spoon:before{content:""}.fa-cube:before{content:""}.fa-cubes:before{content:""}.fa-behance:before{content:""}.fa-behance-square:before{content:""}.fa-steam:before{content:""}.fa-steam-square:before{content:""}.fa-recycle:before{content:""}.fa-automobile:before,.fa-car:before{content:""}.fa-cab:before,.fa-taxi:before{content:""}.fa-tree:before{content:""}.fa-spotify:before{content:""}.fa-deviantart:before{content:""}.fa-soundcloud:before{content:""}.fa-database:before{content:""}.fa-file-pdf-o:before{content:""}.fa-file-word-o:before{content:""}.fa-file-excel-o:before{content:""}.fa-file-powerpoint-o:before{content:""}.fa-file-photo-o:before,.fa-file-picture-o:before,.fa-file-image-o:before{content:""}.fa-file-zip-o:before,.fa-file-archive-o:before{content:""}.fa-file-sound-o:before,.fa-file-audio-o:before{content:""}.fa-file-movie-o:before,.fa-file-video-o:before{content:""}.fa-file-code-o:before{content:""}.fa-vine:before{content:""}.fa-codepen:before{content:""}.fa-jsfiddle:before{content:""}.fa-life-bouy:before,.fa-life-buoy:before,.fa-life-saver:before,.fa-support:before,.fa-life-ring:before{content:""}.fa-circle-o-notch:before{content:""}.fa-ra:before,.fa-resistance:before,.fa-rebel:before{content:""}.fa-ge:before,.fa-empire:before{content:""}.fa-git-square:before{content:""}.fa-git:before{content:""}.fa-y-combinator-square:before,.fa-yc-square:before,.fa-hacker-news:before{content:""}.fa-tencent-weibo:before{content:""}.fa-qq:before{content:""}.fa-wechat:before,.fa-weixin:before{content:""}.fa-send:before,.fa-paper-plane:before{content:""}.fa-send-o:before,.fa-paper-plane-o:before{content:""}.fa-history:before{content:""}.fa-circle-thin:before{content:""}.fa-header:before{content:""}.fa-paragraph:before{content:""}.fa-sliders:before{content:""}.fa-share-alt:before{content:""}.fa-share-alt-square:before{content:""}.fa-bomb:before{content:""}.fa-soccer-ball-o:before,.fa-futbol-o:before{content:""}.fa-tty:before{content:""}.fa-binoculars:before{content:""}.fa-plug:before{content:""}.fa-slideshare:before{content:""}.fa-twitch:before{content:""}.fa-yelp:before{content:""}.fa-newspaper-o:before{content:""}.fa-wifi:before{content:""}.fa-calculator:before{content:""}.fa-paypal:before{content:""}.fa-google-wallet:before{content:""}.fa-cc-visa:before{content:""}.fa-cc-mastercard:before{content:""}.fa-cc-discover:before{content:""}.fa-cc-amex:before{content:""}.fa-cc-paypal:before{content:""}.fa-cc-stripe:before{content:""}.fa-bell-slash:before{content:""}.fa-bell-slash-o:before{content:""}.fa-trash:before{content:""}.fa-copyright:before{content:""}
.fa-at:before{content:""}.fa-eyedropper:before{content:""}.fa-paint-brush:before{content:""}.fa-birthday-cake:before{content:""}.fa-area-chart:before{content:""}.fa-pie-chart:before{content:""}.fa-line-chart:before{content:""}.fa-lastfm:before{content:""}.fa-lastfm-square:before{content:""}.fa-toggle-off:before{content:""}.fa-toggle-on:before{content:""}.fa-bicycle:before{content:""}.fa-bus:before{content:""}.fa-ioxhost:before{content:""}.fa-angellist:before{content:""}.fa-cc:before{content:""}.fa-shekel:before,.fa-sheqel:before,.fa-ils:before{content:""}.fa-meanpath:before{content:""}.fa-buysellads:before{content:""}.fa-connectdevelop:before{content:""}.fa-dashcube:before{content:""}.fa-forumbee:before{content:""}.fa-leanpub:before{content:""}.fa-sellsy:before{content:""}.fa-shirtsinbulk:before{content:""}.fa-simplybuilt:before{content:""}.fa-skyatlas:before{content:""}.fa-cart-plus:before{content:""}.fa-cart-arrow-down:before{content:""}.fa-diamond:before{content:""}.fa-ship:before{content:""}.fa-user-secret:before{content:""}.fa-motorcycle:before{content:""}.fa-street-view:before{content:""}.fa-heartbeat:before{content:""}.fa-venus:before{content:""}.fa-mars:before{content:""}.fa-mercury:before{content:""}.fa-intersex:before,.fa-transgender:before{content:""}.fa-transgender-alt:before{content:""}.fa-venus-double:before{content:""}.fa-mars-double:before{content:""}.fa-venus-mars:before{content:""}.fa-mars-stroke:before{content:""}.fa-mars-stroke-v:before{content:""}.fa-mars-stroke-h:before{content:""}.fa-neuter:before{content:""}.fa-genderless:before{content:""}.fa-facebook-official:before{content:""}.fa-pinterest-p:before{content:""}.fa-whatsapp:before{content:""}.fa-server:before{content:""}.fa-user-plus:before{content:""}.fa-user-times:before{content:""}.fa-hotel:before,.fa-bed:before{content:""}.fa-viacoin:before{content:""}.fa-train:before{content:""}.fa-subway:before{content:""}.fa-medium:before{content:""}.fa-yc:before,.fa-y-combinator:before{content:""}.fa-optin-monster:before{content:""}.fa-opencart:before{content:""}.fa-expeditedssl:before{content:""}.fa-battery-4:before,.fa-battery:before,.fa-battery-full:before{content:""}.fa-battery-3:before,.fa-battery-three-quarters:before{content:""}.fa-battery-2:before,.fa-battery-half:before{content:""}.fa-battery-1:before,.fa-battery-quarter:before{content:""}.fa-battery-0:before,.fa-battery-empty:before{content:""}.fa-mouse-pointer:before{content:""}.fa-i-cursor:before{content:""}.fa-object-group:before{content:""}.fa-object-ungroup:before{content:""}.fa-sticky-note:before{content:""}.fa-sticky-note-o:before{content:""}.fa-cc-jcb:before{content:""}.fa-cc-diners-club:before{content:""}.fa-clone:before{content:""}.fa-balance-scale:before{content:""}.fa-hourglass-o:before{content:""}.fa-hourglass-1:before,.fa-hourglass-start:before{content:""}.fa-hourglass-2:before,.fa-hourglass-half:before{content:""}.fa-hourglass-3:before,.fa-hourglass-end:before{content:""}.fa-hourglass:before{content:""}.fa-hand-grab-o:before,.fa-hand-rock-o:before{content:""}.fa-hand-stop-o:before,.fa-hand-paper-o:before{content:""}.fa-hand-scissors-o:before{content:""}.fa-hand-lizard-o:before{content:""}.fa-hand-spock-o:before{content:""}.fa-hand-pointer-o:before{content:""}.fa-hand-peace-o:before{content:""}.fa-trademark:before{content:""}.fa-registered:before{content:""}.fa-creative-commons:before{content:""}.fa-gg:before{content:""}.fa-gg-circle:before{content:""}
.fa-tripadvisor:before{content:""}.fa-odnoklassniki:before{content:""}.fa-odnoklassniki-square:before{content:""}.fa-get-pocket:before{content:""}.fa-wikipedia-w:before{content:""}.fa-safari:before{content:""}.fa-chrome:before{content:""}.fa-firefox:before{content:""}.fa-opera:before{content:""}.fa-internet-explorer:before{content:""}.fa-tv:before,.fa-television:before{content:""}.fa-contao:before{content:""}.fa-500px:before{content:""}.fa-amazon:before{content:""}.fa-calendar-plus-o:before{content:""}.fa-calendar-minus-o:before{content:""}.fa-calendar-times-o:before{content:""}.fa-calendar-check-o:before{content:""}.fa-industry:before{content:""}.fa-map-pin:before{content:""}.fa-map-signs:before{content:""}.fa-map-o:before{content:""}.fa-map:before{content:""}.fa-commenting:before{content:""}.fa-commenting-o:before{content:""}.fa-houzz:before{content:""}.fa-vimeo:before{content:""}.fa-black-tie:before{content:""}.fa-fonticons:before{content:""}.fa-reddit-alien:before{content:""}.fa-edge:before{content:""}.fa-credit-card-alt:before{content:""}.fa-codiepie:before{content:""}.fa-modx:before{content:""}.fa-fort-awesome:before{content:""}.fa-usb:before{content:""}.fa-product-hunt:before{content:""}.fa-mixcloud:before{content:""}.fa-scribd:before{content:""}.fa-pause-circle:before{content:""}.fa-pause-circle-o:before{content:""}.fa-stop-circle:before{content:""}.fa-stop-circle-o:before{content:""}.fa-shopping-bag:before{content:""}.fa-shopping-basket:before{content:""}.fa-hashtag:before{content:""}.fa-bluetooth:before{content:""}.fa-bluetooth-b:before{content:""}.fa-percent:before{content:""}.fa-gitlab:before,.icon-gitlab:before{content:""}.fa-wpbeginner:before{content:""}.fa-wpforms:before{content:""}.fa-envira:before{content:""}.fa-universal-access:before{content:""}.fa-wheelchair-alt:before{content:""}.fa-question-circle-o:before{content:""}.fa-blind:before{content:""}.fa-audio-description:before{content:""}.fa-volume-control-phone:before{content:""}.fa-braille:before{content:""}.fa-assistive-listening-systems:before{content:""}.fa-asl-interpreting:before,.fa-american-sign-language-interpreting:before{content:""}.fa-deafness:before,.fa-hard-of-hearing:before,.fa-deaf:before{content:""}.fa-glide:before{content:""}.fa-glide-g:before{content:""}.fa-signing:before,.fa-sign-language:before{content:""}.fa-low-vision:before{content:""}.fa-viadeo:before{content:""}.fa-viadeo-square:before{content:""}.fa-snapchat:before{content:""}.fa-snapchat-ghost:before{content:""}.fa-snapchat-square:before{content:""}.fa-pied-piper:before{content:""}.fa-first-order:before{content:""}.fa-yoast:before{content:""}.fa-themeisle:before{content:""}.fa-google-plus-circle:before,.fa-google-plus-official:before{content:""}.fa-fa:before,.fa-font-awesome:before{content:""}.fa-handshake-o:before{content:""}.fa-envelope-open:before{content:""}.fa-envelope-open-o:before{content:""}.fa-linode:before{content:""}.fa-address-book:before{content:""}.fa-address-book-o:before{content:""}.fa-vcard:before,.fa-address-card:before{content:""}.fa-vcard-o:before,.fa-address-card-o:before{content:""}.fa-user-circle:before{content:""}.fa-user-circle-o:before{content:""}.fa-user-o:before{content:""}.fa-id-badge:before{content:""}.fa-drivers-license:before,.fa-id-card:before{content:""}.fa-drivers-license-o:before,.fa-id-card-o:before{content:""}.fa-quora:before{content:""}.fa-free-code-camp:before{content:""}.fa-telegram:before{content:""}
.fa-thermometer-4:before,.fa-thermometer:before,.fa-thermometer-full:before{content:""}.fa-thermometer-3:before,.fa-thermometer-three-quarters:before{content:""}.fa-thermometer-2:before,.fa-thermometer-half:before{content:""}.fa-thermometer-1:before,.fa-thermometer-quarter:before{content:""}.fa-thermometer-0:before,.fa-thermometer-empty:before{content:""}.fa-shower:before{content:""}.fa-bathtub:before,.fa-s15:before,.fa-bath:before{content:""}.fa-podcast:before{content:""}.fa-window-maximize:before{content:""}.fa-window-minimize:before{content:""}.fa-window-restore:before{content:""}.fa-times-rectangle:before,.fa-window-close:before{content:""}.fa-times-rectangle-o:before,.fa-window-close-o:before{content:""}.fa-bandcamp:before{content:""}.fa-grav:before{content:""}.fa-etsy:before{content:""}.fa-imdb:before{content:""}.fa-ravelry:before{content:""}.fa-eercast:before{content:""}.fa-microchip:before{content:""}.fa-snowflake-o:before{content:""}.fa-superpowers:before{content:""}.fa-wpexplorer:before{content:""}.fa-meetup:before{content:""}.sr-only{position:absolute;width:1px;height:1px;padding:0;margin:-1px;overflow:hidden;clip:rect(0, 0, 0, 0);border:0}.sr-only-focusable:active,.sr-only-focusable:focus{position:static;width:auto;height:auto;margin:0;overflow:visible;clip:auto}.fa,.wy-menu-vertical li span.toctree-expand,.wy-menu-vertical li.on a span.toctree-expand,.wy-menu-vertical li.current>a span.toctree-expand,.rst-content .admonition-title,.rst-content h1 .headerlink,.rst-content h2 .headerlink,.rst-content h3 .headerlink,.rst-content h4 .headerlink,.rst-content h5 .headerlink,.rst-content h6 .headerlink,.rst-content dl dt .headerlink,.rst-content p.caption .headerlink,.rst-content table>caption .headerlink,.rst-content .code-block-caption .headerlink,.rst-content tt.download span:first-child,.rst-content code.download span:first-child,.icon,.wy-dropdown .caret,.wy-inline-validate.wy-inline-validate-success .wy-input-context,.wy-inline-validate.wy-inline-validate-danger .wy-input-context,.wy-inline-validate.wy-inline-validate-warning .wy-input-context,.wy-inline-validate.wy-inline-validate-info .wy-input-context{font-family:inherit}.fa:before,.wy-menu-vertical li span.toctree-expand:before,.wy-menu-vertical li.on a span.toctree-expand:before,.wy-menu-vertical li.current>a span.toctree-expand:before,.rst-content .admonition-title:before,.rst-content h1 .headerlink:before,.rst-content h2 .headerlink:before,.rst-content h3 .headerlink:before,.rst-content h4 .headerlink:before,.rst-content h5 .headerlink:before,.rst-content h6 .headerlink:before,.rst-content dl dt .headerlink:before,.rst-content p.caption .headerlink:before,.rst-content table>caption .headerlink:before,.rst-content .code-block-caption .headerlink:before,.rst-content tt.download span:first-child:before,.rst-content code.download span:first-child:before,.icon:before,.wy-dropdown .caret:before,.wy-inline-validate.wy-inline-validate-success .wy-input-context:before,.wy-inline-validate.wy-inline-validate-danger .wy-input-context:before,.wy-inline-validate.wy-inline-validate-warning .wy-input-context:before,.wy-inline-validate.wy-inline-validate-info .wy-input-context:before{font-family:"FontAwesome";display:inline-block;font-style:normal;font-weight:normal;line-height:1;text-decoration:inherit}a .fa,a .wy-menu-vertical li span.toctree-expand,.wy-menu-vertical li a span.toctree-expand,.wy-menu-vertical li.on a span.toctree-expand,.wy-menu-vertical li.current>a span.toctree-expand,a .rst-content .admonition-title,.rst-content
a .admonition-title,a .rst-content h1 .headerlink,.rst-content h1 a .headerlink,a .rst-content h2 .headerlink,.rst-content h2 a .headerlink,a .rst-content h3 .headerlink,.rst-content h3 a .headerlink,a .rst-content h4 .headerlink,.rst-content h4 a .headerlink,a .rst-content h5 .headerlink,.rst-content h5 a .headerlink,a .rst-content h6 .headerlink,.rst-content h6 a .headerlink,a .rst-content dl dt .headerlink,.rst-content dl dt a .headerlink,a .rst-content p.caption .headerlink,.rst-content p.caption a .headerlink,a .rst-content table>caption .headerlink,.rst-content table>caption a .headerlink,a .rst-content .code-block-caption .headerlink,.rst-content .code-block-caption a .headerlink,a .rst-content tt.download span:first-child,.rst-content tt.download a span:first-child,a .rst-content code.download span:first-child,.rst-content code.download a span:first-child,a .icon{display:inline-block;text-decoration:inherit}.btn .fa,.btn .wy-menu-vertical li span.toctree-expand,.wy-menu-vertical li .btn span.toctree-expand,.btn .wy-menu-vertical li.on a span.toctree-expand,.wy-menu-vertical li.on a .btn span.toctree-expand,.btn .wy-menu-vertical li.current>a span.toctree-expand,.wy-menu-vertical li.current>a .btn span.toctree-expand,.btn .rst-content .admonition-title,.rst-content .btn .admonition-title,.btn .rst-content h1 .headerlink,.rst-content h1 .btn .headerlink,.btn .rst-content h2 .headerlink,.rst-content h2 .btn .headerlink,.btn .rst-content h3 .headerlink,.rst-content h3 .btn .headerlink,.btn .rst-content h4 .headerlink,.rst-content h4 .btn .headerlink,.btn .rst-content h5 .headerlink,.rst-content h5 .btn .headerlink,.btn .rst-content h6 .headerlink,.rst-content h6 .btn .headerlink,.btn .rst-content dl dt .headerlink,.rst-content dl dt .btn .headerlink,.btn .rst-content p.caption .headerlink,.rst-content p.caption .btn .headerlink,.btn .rst-content table>caption .headerlink,.rst-content table>caption .btn .headerlink,.btn .rst-content .code-block-caption .headerlink,.rst-content .code-block-caption .btn .headerlink,.btn .rst-content tt.download span:first-child,.rst-content tt.download .btn span:first-child,.btn .rst-content code.download span:first-child,.rst-content code.download .btn span:first-child,.btn .icon,.nav .fa,.nav .wy-menu-vertical li span.toctree-expand,.wy-menu-vertical li .nav span.toctree-expand,.nav .wy-menu-vertical li.on a span.toctree-expand,.wy-menu-vertical li.on a .nav span.toctree-expand,.nav .wy-menu-vertical li.current>a span.toctree-expand,.wy-menu-vertical li.current>a .nav span.toctree-expand,.nav .rst-content .admonition-title,.rst-content .nav .admonition-title,.nav .rst-content h1 .headerlink,.rst-content h1 .nav .headerlink,.nav .rst-content h2 .headerlink,.rst-content h2 .nav .headerlink,.nav .rst-content h3 .headerlink,.rst-content h3 .nav .headerlink,.nav .rst-content h4 .headerlink,.rst-content h4 .nav .headerlink,.nav .rst-content h5 .headerlink,.rst-content h5 .nav .headerlink,.nav .rst-content h6 .headerlink,.rst-content h6 .nav .headerlink,.nav .rst-content dl dt .headerlink,.rst-content dl dt .nav .headerlink,.nav .rst-content p.caption .headerlink,.rst-content p.caption .nav .headerlink,.nav .rst-content table>caption .headerlink,.rst-content table>caption .nav .headerlink,.nav .rst-content .code-block-caption .headerlink,.rst-content .code-block-caption .nav .headerlink,.nav .rst-content tt.download span:first-child,.rst-content tt.download .nav span:first-child,.nav .rst-content code.download span:first-child,.rst-content code.download .nav 
span:first-child,.nav .icon{display:inline}.btn .fa.fa-large,.btn .wy-menu-vertical li span.fa-large.toctree-expand,.wy-menu-vertical li .btn span.fa-large.toctree-expand,.btn .rst-content .fa-large.admonition-title,.rst-content .btn .fa-large.admonition-title,.btn .rst-content h1 .fa-large.headerlink,.rst-content h1 .btn .fa-large.headerlink,.btn .rst-content h2 .fa-large.headerlink,.rst-content h2 .btn .fa-large.headerlink,.btn .rst-content h3 .fa-large.headerlink,.rst-content h3 .btn .fa-large.headerlink,.btn .rst-content h4 .fa-large.headerlink,.rst-content h4 .btn .fa-large.headerlink,.btn .rst-content h5 .fa-large.headerlink,.rst-content h5 .btn .fa-large.headerlink,.btn .rst-content h6 .fa-large.headerlink,.rst-content h6 .btn .fa-large.headerlink,.btn .rst-content dl dt .fa-large.headerlink,.rst-content dl dt .btn .fa-large.headerlink,.btn .rst-content p.caption .fa-large.headerlink,.rst-content p.caption .btn .fa-large.headerlink,.btn .rst-content table>caption .fa-large.headerlink,.rst-content table>caption .btn .fa-large.headerlink,.btn .rst-content .code-block-caption .fa-large.headerlink,.rst-content .code-block-caption .btn .fa-large.headerlink,.btn .rst-content tt.download span.fa-large:first-child,.rst-content tt.download .btn span.fa-large:first-child,.btn .rst-content code.download span.fa-large:first-child,.rst-content code.download .btn span.fa-large:first-child,.btn .fa-large.icon,.nav .fa.fa-large,.nav .wy-menu-vertical li span.fa-large.toctree-expand,.wy-menu-vertical li .nav span.fa-large.toctree-expand,.nav .rst-content .fa-large.admonition-title,.rst-content .nav .fa-large.admonition-title,.nav .rst-content h1 .fa-large.headerlink,.rst-content h1 .nav .fa-large.headerlink,.nav .rst-content h2 .fa-large.headerlink,.rst-content h2 .nav .fa-large.headerlink,.nav .rst-content h3 .fa-large.headerlink,.rst-content h3 .nav .fa-large.headerlink,.nav .rst-content h4 .fa-large.headerlink,.rst-content h4 .nav .fa-large.headerlink,.nav .rst-content h5 .fa-large.headerlink,.rst-content h5 .nav .fa-large.headerlink,.nav .rst-content h6 .fa-large.headerlink,.rst-content h6 .nav .fa-large.headerlink,.nav .rst-content dl dt .fa-large.headerlink,.rst-content dl dt .nav .fa-large.headerlink,.nav .rst-content p.caption .fa-large.headerlink,.rst-content p.caption .nav .fa-large.headerlink,.nav .rst-content table>caption .fa-large.headerlink,.rst-content table>caption .nav .fa-large.headerlink,.nav .rst-content .code-block-caption .fa-large.headerlink,.rst-content .code-block-caption .nav .fa-large.headerlink,.nav .rst-content tt.download span.fa-large:first-child,.rst-content tt.download .nav span.fa-large:first-child,.nav .rst-content code.download span.fa-large:first-child,.rst-content code.download .nav span.fa-large:first-child,.nav .fa-large.icon{line-height:.9em}.btn .fa.fa-spin,.btn .wy-menu-vertical li span.fa-spin.toctree-expand,.wy-menu-vertical li .btn span.fa-spin.toctree-expand,.btn .rst-content .fa-spin.admonition-title,.rst-content .btn .fa-spin.admonition-title,.btn .rst-content h1 .fa-spin.headerlink,.rst-content h1 .btn .fa-spin.headerlink,.btn .rst-content h2 .fa-spin.headerlink,.rst-content h2 .btn .fa-spin.headerlink,.btn .rst-content h3 .fa-spin.headerlink,.rst-content h3 .btn .fa-spin.headerlink,.btn .rst-content h4 .fa-spin.headerlink,.rst-content h4 .btn .fa-spin.headerlink,.btn .rst-content h5 .fa-spin.headerlink,.rst-content h5 .btn .fa-spin.headerlink,.btn .rst-content h6 .fa-spin.headerlink,.rst-content h6 .btn .fa-spin.headerlink,.btn .rst-content dl dt 
.fa-spin.headerlink,.rst-content dl dt .btn .fa-spin.headerlink,.btn .rst-content p.caption .fa-spin.headerlink,.rst-content p.caption .btn .fa-spin.headerlink,.btn .rst-content table>caption .fa-spin.headerlink,.rst-content table>caption .btn .fa-spin.headerlink,.btn .rst-content .code-block-caption .fa-spin.headerlink,.rst-content .code-block-caption .btn .fa-spin.headerlink,.btn .rst-content tt.download span.fa-spin:first-child,.rst-content tt.download .btn span.fa-spin:first-child,.btn .rst-content code.download span.fa-spin:first-child,.rst-content code.download .btn span.fa-spin:first-child,.btn .fa-spin.icon,.nav .fa.fa-spin,.nav .wy-menu-vertical li span.fa-spin.toctree-expand,.wy-menu-vertical li .nav span.fa-spin.toctree-expand,.nav .rst-content .fa-spin.admonition-title,.rst-content .nav .fa-spin.admonition-title,.nav .rst-content h1 .fa-spin.headerlink,.rst-content h1 .nav .fa-spin.headerlink,.nav .rst-content h2 .fa-spin.headerlink,.rst-content h2 .nav .fa-spin.headerlink,.nav .rst-content h3 .fa-spin.headerlink,.rst-content h3 .nav .fa-spin.headerlink,.nav .rst-content h4 .fa-spin.headerlink,.rst-content h4 .nav .fa-spin.headerlink,.nav .rst-content h5 .fa-spin.headerlink,.rst-content h5 .nav .fa-spin.headerlink,.nav .rst-content h6 .fa-spin.headerlink,.rst-content h6 .nav .fa-spin.headerlink,.nav .rst-content dl dt .fa-spin.headerlink,.rst-content dl dt .nav .fa-spin.headerlink,.nav .rst-content p.caption .fa-spin.headerlink,.rst-content p.caption .nav .fa-spin.headerlink,.nav .rst-content table>caption .fa-spin.headerlink,.rst-content table>caption .nav .fa-spin.headerlink,.nav .rst-content .code-block-caption .fa-spin.headerlink,.rst-content .code-block-caption .nav .fa-spin.headerlink,.nav .rst-content tt.download span.fa-spin:first-child,.rst-content tt.download .nav span.fa-spin:first-child,.nav .rst-content code.download span.fa-spin:first-child,.rst-content code.download .nav span.fa-spin:first-child,.nav .fa-spin.icon{display:inline-block}.btn.fa:before,.wy-menu-vertical li span.btn.toctree-expand:before,.rst-content .btn.admonition-title:before,.rst-content h1 .btn.headerlink:before,.rst-content h2 .btn.headerlink:before,.rst-content h3 .btn.headerlink:before,.rst-content h4 .btn.headerlink:before,.rst-content h5 .btn.headerlink:before,.rst-content h6 .btn.headerlink:before,.rst-content dl dt .btn.headerlink:before,.rst-content p.caption .btn.headerlink:before,.rst-content table>caption .btn.headerlink:before,.rst-content .code-block-caption .btn.headerlink:before,.rst-content tt.download span.btn:first-child:before,.rst-content code.download span.btn:first-child:before,.btn.icon:before{opacity:.5;-webkit-transition:opacity .05s ease-in;-moz-transition:opacity .05s ease-in;transition:opacity .05s ease-in}.btn.fa:hover:before,.wy-menu-vertical li span.btn.toctree-expand:hover:before,.rst-content .btn.admonition-title:hover:before,.rst-content h1 .btn.headerlink:hover:before,.rst-content h2 .btn.headerlink:hover:before,.rst-content h3 .btn.headerlink:hover:before,.rst-content h4 .btn.headerlink:hover:before,.rst-content h5 .btn.headerlink:hover:before,.rst-content h6 .btn.headerlink:hover:before,.rst-content dl dt .btn.headerlink:hover:before,.rst-content p.caption .btn.headerlink:hover:before,.rst-content table>caption .btn.headerlink:hover:before,.rst-content .code-block-caption .btn.headerlink:hover:before,.rst-content tt.download span.btn:first-child:hover:before,.rst-content code.download 
span.btn:first-child:hover:before,.btn.icon:hover:before{opacity:1}.btn-mini .fa:before,.btn-mini .wy-menu-vertical li span.toctree-expand:before,.wy-menu-vertical li .btn-mini span.toctree-expand:before,.btn-mini .rst-content .admonition-title:before,.rst-content .btn-mini .admonition-title:before,.btn-mini .rst-content h1 .headerlink:before,.rst-content h1 .btn-mini .headerlink:before,.btn-mini .rst-content h2 .headerlink:before,.rst-content h2 .btn-mini .headerlink:before,.btn-mini .rst-content h3 .headerlink:before,.rst-content h3 .btn-mini .headerlink:before,.btn-mini .rst-content h4 .headerlink:before,.rst-content h4 .btn-mini .headerlink:before,.btn-mini .rst-content h5 .headerlink:before,.rst-content h5 .btn-mini .headerlink:before,.btn-mini .rst-content h6 .headerlink:before,.rst-content h6 .btn-mini .headerlink:before,.btn-mini .rst-content dl dt .headerlink:before,.rst-content dl dt .btn-mini .headerlink:before,.btn-mini .rst-content p.caption .headerlink:before,.rst-content p.caption .btn-mini .headerlink:before,.btn-mini .rst-content table>caption .headerlink:before,.rst-content table>caption .btn-mini .headerlink:before,.btn-mini .rst-content .code-block-caption .headerlink:before,.rst-content .code-block-caption .btn-mini .headerlink:before,.btn-mini .rst-content tt.download span:first-child:before,.rst-content tt.download .btn-mini span:first-child:before,.btn-mini .rst-content code.download span:first-child:before,.rst-content code.download .btn-mini span:first-child:before,.btn-mini .icon:before{font-size:14px;vertical-align:-15%}.wy-alert,.rst-content .note,.rst-content .attention,.rst-content .caution,.rst-content .danger,.rst-content .error,.rst-content .hint,.rst-content .important,.rst-content .tip,.rst-content .warning,.rst-content .seealso,.rst-content .admonition-todo,.rst-content .admonition{padding:12px;line-height:24px;margin-bottom:24px;background:#e7f2fa}.wy-alert-title,.rst-content .admonition-title{color:#fff;font-weight:bold;display:block;color:#fff;background:#6ab0de;margin:-12px;padding:6px 12px;margin-bottom:12px}.wy-alert.wy-alert-danger,.rst-content .wy-alert-danger.note,.rst-content .wy-alert-danger.attention,.rst-content .wy-alert-danger.caution,.rst-content .danger,.rst-content .error,.rst-content .wy-alert-danger.hint,.rst-content .wy-alert-danger.important,.rst-content .wy-alert-danger.tip,.rst-content .wy-alert-danger.warning,.rst-content .wy-alert-danger.seealso,.rst-content .wy-alert-danger.admonition-todo,.rst-content .wy-alert-danger.admonition{background:#fdf3f2}.wy-alert.wy-alert-danger .wy-alert-title,.rst-content .wy-alert-danger.note .wy-alert-title,.rst-content .wy-alert-danger.attention .wy-alert-title,.rst-content .wy-alert-danger.caution .wy-alert-title,.rst-content .danger .wy-alert-title,.rst-content .error .wy-alert-title,.rst-content .wy-alert-danger.hint .wy-alert-title,.rst-content .wy-alert-danger.important .wy-alert-title,.rst-content .wy-alert-danger.tip .wy-alert-title,.rst-content .wy-alert-danger.warning .wy-alert-title,.rst-content .wy-alert-danger.seealso .wy-alert-title,.rst-content .wy-alert-danger.admonition-todo .wy-alert-title,.rst-content .wy-alert-danger.admonition .wy-alert-title,.wy-alert.wy-alert-danger .rst-content .admonition-title,.rst-content .wy-alert.wy-alert-danger .admonition-title,.rst-content .wy-alert-danger.note .admonition-title,.rst-content .wy-alert-danger.attention .admonition-title,.rst-content .wy-alert-danger.caution .admonition-title,.rst-content .danger .admonition-title,.rst-content 
.error .admonition-title,.rst-content .wy-alert-danger.hint .admonition-title,.rst-content .wy-alert-danger.important .admonition-title,.rst-content .wy-alert-danger.tip .admonition-title,.rst-content .wy-alert-danger.warning .admonition-title,.rst-content .wy-alert-danger.seealso .admonition-title,.rst-content .wy-alert-danger.admonition-todo .admonition-title,.rst-content .wy-alert-danger.admonition .admonition-title{background:#f29f97}.wy-alert.wy-alert-warning,.rst-content .wy-alert-warning.note,.rst-content .attention,.rst-content .caution,.rst-content .wy-alert-warning.danger,.rst-content .wy-alert-warning.error,.rst-content .wy-alert-warning.hint,.rst-content .wy-alert-warning.important,.rst-content .wy-alert-warning.tip,.rst-content .warning,.rst-content .wy-alert-warning.seealso,.rst-content .admonition-todo,.rst-content .wy-alert-warning.admonition{background:#ffedcc}.wy-alert.wy-alert-warning .wy-alert-title,.rst-content .wy-alert-warning.note .wy-alert-title,.rst-content .attention .wy-alert-title,.rst-content .caution .wy-alert-title,.rst-content .wy-alert-warning.danger .wy-alert-title,.rst-content .wy-alert-warning.error .wy-alert-title,.rst-content .wy-alert-warning.hint .wy-alert-title,.rst-content .wy-alert-warning.important .wy-alert-title,.rst-content .wy-alert-warning.tip .wy-alert-title,.rst-content .warning .wy-alert-title,.rst-content .wy-alert-warning.seealso .wy-alert-title,.rst-content .admonition-todo .wy-alert-title,.rst-content .wy-alert-warning.admonition .wy-alert-title,.wy-alert.wy-alert-warning .rst-content .admonition-title,.rst-content .wy-alert.wy-alert-warning .admonition-title,.rst-content .wy-alert-warning.note .admonition-title,.rst-content .attention .admonition-title,.rst-content .caution .admonition-title,.rst-content .wy-alert-warning.danger .admonition-title,.rst-content .wy-alert-warning.error .admonition-title,.rst-content .wy-alert-warning.hint .admonition-title,.rst-content .wy-alert-warning.important .admonition-title,.rst-content .wy-alert-warning.tip .admonition-title,.rst-content .warning .admonition-title,.rst-content .wy-alert-warning.seealso .admonition-title,.rst-content .admonition-todo .admonition-title,.rst-content .wy-alert-warning.admonition .admonition-title{background:#f0b37e}.wy-alert.wy-alert-info,.rst-content .note,.rst-content .wy-alert-info.attention,.rst-content .wy-alert-info.caution,.rst-content .wy-alert-info.danger,.rst-content .wy-alert-info.error,.rst-content .wy-alert-info.hint,.rst-content .wy-alert-info.important,.rst-content .wy-alert-info.tip,.rst-content .wy-alert-info.warning,.rst-content .seealso,.rst-content .wy-alert-info.admonition-todo,.rst-content .wy-alert-info.admonition{background:#e7f2fa}.wy-alert.wy-alert-info .wy-alert-title,.rst-content .note .wy-alert-title,.rst-content .wy-alert-info.attention .wy-alert-title,.rst-content .wy-alert-info.caution .wy-alert-title,.rst-content .wy-alert-info.danger .wy-alert-title,.rst-content .wy-alert-info.error .wy-alert-title,.rst-content .wy-alert-info.hint .wy-alert-title,.rst-content .wy-alert-info.important .wy-alert-title,.rst-content .wy-alert-info.tip .wy-alert-title,.rst-content .wy-alert-info.warning .wy-alert-title,.rst-content .seealso .wy-alert-title,.rst-content .wy-alert-info.admonition-todo .wy-alert-title,.rst-content .wy-alert-info.admonition .wy-alert-title,.wy-alert.wy-alert-info .rst-content .admonition-title,.rst-content .wy-alert.wy-alert-info .admonition-title,.rst-content .note .admonition-title,.rst-content .wy-alert-info.attention 
.admonition-title,.rst-content .wy-alert-info.caution .admonition-title,.rst-content .wy-alert-info.danger .admonition-title,.rst-content .wy-alert-info.error .admonition-title,.rst-content .wy-alert-info.hint .admonition-title,.rst-content .wy-alert-info.important .admonition-title,.rst-content .wy-alert-info.tip .admonition-title,.rst-content .wy-alert-info.warning .admonition-title,.rst-content .seealso .admonition-title,.rst-content .wy-alert-info.admonition-todo .admonition-title,.rst-content .wy-alert-info.admonition .admonition-title{background:#6ab0de}.wy-alert.wy-alert-success,.rst-content .wy-alert-success.note,.rst-content .wy-alert-success.attention,.rst-content .wy-alert-success.caution,.rst-content .wy-alert-success.danger,.rst-content .wy-alert-success.error,.rst-content .hint,.rst-content .important,.rst-content .tip,.rst-content .wy-alert-success.warning,.rst-content .wy-alert-success.seealso,.rst-content .wy-alert-success.admonition-todo,.rst-content .wy-alert-success.admonition{background:#dbfaf4}.wy-alert.wy-alert-success .wy-alert-title,.rst-content .wy-alert-success.note .wy-alert-title,.rst-content .wy-alert-success.attention .wy-alert-title,.rst-content .wy-alert-success.caution .wy-alert-title,.rst-content .wy-alert-success.danger .wy-alert-title,.rst-content .wy-alert-success.error .wy-alert-title,.rst-content .hint .wy-alert-title,.rst-content .important .wy-alert-title,.rst-content .tip .wy-alert-title,.rst-content .wy-alert-success.warning .wy-alert-title,.rst-content .wy-alert-success.seealso .wy-alert-title,.rst-content .wy-alert-success.admonition-todo .wy-alert-title,.rst-content .wy-alert-success.admonition .wy-alert-title,.wy-alert.wy-alert-success .rst-content .admonition-title,.rst-content .wy-alert.wy-alert-success .admonition-title,.rst-content .wy-alert-success.note .admonition-title,.rst-content .wy-alert-success.attention .admonition-title,.rst-content .wy-alert-success.caution .admonition-title,.rst-content .wy-alert-success.danger .admonition-title,.rst-content .wy-alert-success.error .admonition-title,.rst-content .hint .admonition-title,.rst-content .important .admonition-title,.rst-content .tip .admonition-title,.rst-content .wy-alert-success.warning .admonition-title,.rst-content .wy-alert-success.seealso .admonition-title,.rst-content .wy-alert-success.admonition-todo .admonition-title,.rst-content .wy-alert-success.admonition .admonition-title{background:#1abc9c}.wy-alert.wy-alert-neutral,.rst-content .wy-alert-neutral.note,.rst-content .wy-alert-neutral.attention,.rst-content .wy-alert-neutral.caution,.rst-content .wy-alert-neutral.danger,.rst-content .wy-alert-neutral.error,.rst-content .wy-alert-neutral.hint,.rst-content .wy-alert-neutral.important,.rst-content .wy-alert-neutral.tip,.rst-content .wy-alert-neutral.warning,.rst-content .wy-alert-neutral.seealso,.rst-content .wy-alert-neutral.admonition-todo,.rst-content .wy-alert-neutral.admonition{background:#f3f6f6}.wy-alert.wy-alert-neutral .wy-alert-title,.rst-content .wy-alert-neutral.note .wy-alert-title,.rst-content .wy-alert-neutral.attention .wy-alert-title,.rst-content .wy-alert-neutral.caution .wy-alert-title,.rst-content .wy-alert-neutral.danger .wy-alert-title,.rst-content .wy-alert-neutral.error .wy-alert-title,.rst-content .wy-alert-neutral.hint .wy-alert-title,.rst-content .wy-alert-neutral.important .wy-alert-title,.rst-content .wy-alert-neutral.tip .wy-alert-title,.rst-content .wy-alert-neutral.warning .wy-alert-title,.rst-content .wy-alert-neutral.seealso 
.wy-alert-title,.rst-content .wy-alert-neutral.admonition-todo .wy-alert-title,.rst-content .wy-alert-neutral.admonition .wy-alert-title,.wy-alert.wy-alert-neutral .rst-content .admonition-title,.rst-content .wy-alert.wy-alert-neutral .admonition-title,.rst-content .wy-alert-neutral.note .admonition-title,.rst-content .wy-alert-neutral.attention .admonition-title,.rst-content .wy-alert-neutral.caution .admonition-title,.rst-content .wy-alert-neutral.danger .admonition-title,.rst-content .wy-alert-neutral.error .admonition-title,.rst-content .wy-alert-neutral.hint .admonition-title,.rst-content .wy-alert-neutral.important .admonition-title,.rst-content .wy-alert-neutral.tip .admonition-title,.rst-content .wy-alert-neutral.warning .admonition-title,.rst-content .wy-alert-neutral.seealso .admonition-title,.rst-content .wy-alert-neutral.admonition-todo .admonition-title,.rst-content .wy-alert-neutral.admonition .admonition-title{color:#404040;background:#e1e4e5}.wy-alert.wy-alert-neutral a,.rst-content .wy-alert-neutral.note a,.rst-content .wy-alert-neutral.attention a,.rst-content .wy-alert-neutral.caution a,.rst-content .wy-alert-neutral.danger a,.rst-content .wy-alert-neutral.error a,.rst-content .wy-alert-neutral.hint a,.rst-content .wy-alert-neutral.important a,.rst-content .wy-alert-neutral.tip a,.rst-content .wy-alert-neutral.warning a,.rst-content .wy-alert-neutral.seealso a,.rst-content .wy-alert-neutral.admonition-todo a,.rst-content .wy-alert-neutral.admonition a{color:#2980B9}.wy-alert p:last-child,.rst-content .note p:last-child,.rst-content .attention p:last-child,.rst-content .caution p:last-child,.rst-content .danger p:last-child,.rst-content .error p:last-child,.rst-content .hint p:last-child,.rst-content .important p:last-child,.rst-content .tip p:last-child,.rst-content .warning p:last-child,.rst-content .seealso p:last-child,.rst-content .admonition-todo p:last-child,.rst-content .admonition p:last-child{margin-bottom:0}.wy-tray-container{position:fixed;bottom:0px;left:0;z-index:600}.wy-tray-container li{display:block;width:300px;background:transparent;color:#fff;text-align:center;box-shadow:0 5px 5px 0 rgba(0,0,0,0.1);padding:0 24px;min-width:20%;opacity:0;height:0;line-height:56px;overflow:hidden;-webkit-transition:all .3s ease-in;-moz-transition:all .3s ease-in;transition:all .3s ease-in}.wy-tray-container li.wy-tray-item-success{background:#27AE60}.wy-tray-container li.wy-tray-item-info{background:#2980B9}.wy-tray-container li.wy-tray-item-warning{background:#E67E22}.wy-tray-container li.wy-tray-item-danger{background:#E74C3C}.wy-tray-container li.on{opacity:1;height:56px}@media screen and (max-width: 768px){.wy-tray-container{bottom:auto;top:0;width:100%}.wy-tray-container li{width:100%}}button{font-size:100%;margin:0;vertical-align:baseline;*vertical-align:middle;cursor:pointer;line-height:normal;-webkit-appearance:button;*overflow:visible}button::-moz-focus-inner,input::-moz-focus-inner{border:0;padding:0}button[disabled]{cursor:default}.btn{display:inline-block;border-radius:2px;line-height:normal;white-space:nowrap;text-align:center;cursor:pointer;font-size:100%;padding:6px 12px 8px 12px;color:#fff;border:1px solid rgba(0,0,0,0.1);background-color:#27AE60;text-decoration:none;font-weight:normal;font-family:"Lato","proxima-nova","Helvetica Neue",Arial,sans-serif;box-shadow:0px 1px 2px -1px rgba(255,255,255,0.5) inset,0px -2px 0px 0px rgba(0,0,0,0.1) 
inset;outline-none:false;vertical-align:middle;*display:inline;zoom:1;-webkit-user-drag:none;-webkit-user-select:none;-moz-user-select:none;-ms-user-select:none;user-select:none;-webkit-transition:all .1s linear;-moz-transition:all .1s linear;transition:all .1s linear}.btn-hover{background:#2e8ece;color:#fff}.btn:hover{background:#2cc36b;color:#fff}.btn:focus{background:#2cc36b;outline:0}.btn:active{box-shadow:0px -1px 0px 0px rgba(0,0,0,0.05) inset,0px 2px 0px 0px rgba(0,0,0,0.1) inset;padding:8px 12px 6px 12px}.btn:visited{color:#fff}.btn:disabled{background-image:none;filter:progid:DXImageTransform.Microsoft.gradient(enabled = false);filter:alpha(opacity=40);opacity:.4;cursor:not-allowed;box-shadow:none}.btn-disabled{background-image:none;filter:progid:DXImageTransform.Microsoft.gradient(enabled = false);filter:alpha(opacity=40);opacity:.4;cursor:not-allowed;box-shadow:none}.btn-disabled:hover,.btn-disabled:focus,.btn-disabled:active{background-image:none;filter:progid:DXImageTransform.Microsoft.gradient(enabled = false);filter:alpha(opacity=40);opacity:.4;cursor:not-allowed;box-shadow:none}.btn::-moz-focus-inner{padding:0;border:0}.btn-small{font-size:80%}.btn-info{background-color:#2980B9 !important}.btn-info:hover{background-color:#2e8ece !important}.btn-neutral{background-color:#f3f6f6 !important;color:#404040 !important}.btn-neutral:hover{background-color:#e5ebeb !important;color:#404040}.btn-neutral:visited{color:#404040 !important}.btn-success{background-color:#27AE60 !important}.btn-success:hover{background-color:#295 !important}.btn-danger{background-color:#E74C3C !important}.btn-danger:hover{background-color:#ea6153 !important}.btn-warning{background-color:#E67E22 !important}.btn-warning:hover{background-color:#e98b39 !important}.btn-invert{background-color:#222}.btn-invert:hover{background-color:#2f2f2f !important}.btn-link{background-color:transparent !important;color:#2980B9;box-shadow:none;border-color:transparent !important}.btn-link:hover{background-color:transparent !important;color:#409ad5 !important;box-shadow:none}.btn-link:active{background-color:transparent !important;color:#409ad5 !important;box-shadow:none}.btn-link:visited{color:#9B59B6}.wy-btn-group .btn,.wy-control .btn{vertical-align:middle}.wy-btn-group{margin-bottom:24px;*zoom:1}.wy-btn-group:before,.wy-btn-group:after{display:table;content:""}.wy-btn-group:after{clear:both}.wy-dropdown{position:relative;display:inline-block}.wy-dropdown-active .wy-dropdown-menu{display:block}.wy-dropdown-menu{position:absolute;left:0;display:none;float:left;top:100%;min-width:100%;background:#fcfcfc;z-index:100;border:solid 1px #cfd7dd;box-shadow:0 2px 2px 0 rgba(0,0,0,0.1);padding:12px}.wy-dropdown-menu>dd>a{display:block;clear:both;color:#404040;white-space:nowrap;font-size:90%;padding:0 12px;cursor:pointer}.wy-dropdown-menu>dd>a:hover{background:#2980B9;color:#fff}.wy-dropdown-menu>dd.divider{border-top:solid 1px #cfd7dd;margin:6px 0}.wy-dropdown-menu>dd.search{padding-bottom:12px}.wy-dropdown-menu>dd.search input[type="search"]{width:100%}.wy-dropdown-menu>dd.call-to-action{background:#e3e3e3;text-transform:uppercase;font-weight:500;font-size:80%}.wy-dropdown-menu>dd.call-to-action:hover{background:#e3e3e3}.wy-dropdown-menu>dd.call-to-action .btn{color:#fff}.wy-dropdown.wy-dropdown-up .wy-dropdown-menu{bottom:100%;top:auto;left:auto;right:0}.wy-dropdown.wy-dropdown-bubble .wy-dropdown-menu{background:#fcfcfc;margin-top:2px}.wy-dropdown.wy-dropdown-bubble .wy-dropdown-menu a{padding:6px 
12px}.wy-dropdown.wy-dropdown-bubble .wy-dropdown-menu a:hover{background:#2980B9;color:#fff}.wy-dropdown.wy-dropdown-left .wy-dropdown-menu{right:0;left:auto;text-align:right}.wy-dropdown-arrow:before{content:" ";border-bottom:5px solid #f5f5f5;border-left:5px solid transparent;border-right:5px solid transparent;position:absolute;display:block;top:-4px;left:50%;margin-left:-3px}.wy-dropdown-arrow.wy-dropdown-arrow-left:before{left:11px}.wy-form-stacked select{display:block}.wy-form-aligned input,.wy-form-aligned textarea,.wy-form-aligned select,.wy-form-aligned .wy-help-inline,.wy-form-aligned label{display:inline-block;*display:inline;*zoom:1;vertical-align:middle}.wy-form-aligned .wy-control-group>label{display:inline-block;vertical-align:middle;width:10em;margin:6px 12px 0 0;float:left}.wy-form-aligned .wy-control{float:left}.wy-form-aligned .wy-control label{display:block}.wy-form-aligned .wy-control select{margin-top:6px}fieldset{border:0;margin:0;padding:0}legend{display:block;width:100%;border:0;padding:0;white-space:normal;margin-bottom:24px;font-size:150%;*margin-left:-7px}label{display:block;margin:0 0 .3125em 0;color:#333;font-size:90%}input,select,textarea{font-size:100%;margin:0;vertical-align:baseline;*vertical-align:middle}.wy-control-group{margin-bottom:24px;*zoom:1;max-width:68em;margin-left:auto;margin-right:auto;*zoom:1}.wy-control-group:before,.wy-control-group:after{display:table;content:""}.wy-control-group:after{clear:both}.wy-control-group:before,.wy-control-group:after{display:table;content:""}.wy-control-group:after{clear:both}.wy-control-group.wy-control-group-required>label:after{content:" *";color:#E74C3C}.wy-control-group .wy-form-full,.wy-control-group .wy-form-halves,.wy-control-group .wy-form-thirds{padding-bottom:12px}.wy-control-group .wy-form-full select,.wy-control-group .wy-form-halves select,.wy-control-group .wy-form-thirds select{width:100%}.wy-control-group .wy-form-full input[type="text"],.wy-control-group .wy-form-full input[type="password"],.wy-control-group .wy-form-full input[type="email"],.wy-control-group .wy-form-full input[type="url"],.wy-control-group .wy-form-full input[type="date"],.wy-control-group .wy-form-full input[type="month"],.wy-control-group .wy-form-full input[type="time"],.wy-control-group .wy-form-full input[type="datetime"],.wy-control-group .wy-form-full input[type="datetime-local"],.wy-control-group .wy-form-full input[type="week"],.wy-control-group .wy-form-full input[type="number"],.wy-control-group .wy-form-full input[type="search"],.wy-control-group .wy-form-full input[type="tel"],.wy-control-group .wy-form-full input[type="color"],.wy-control-group .wy-form-halves input[type="text"],.wy-control-group .wy-form-halves input[type="password"],.wy-control-group .wy-form-halves input[type="email"],.wy-control-group .wy-form-halves input[type="url"],.wy-control-group .wy-form-halves input[type="date"],.wy-control-group .wy-form-halves input[type="month"],.wy-control-group .wy-form-halves input[type="time"],.wy-control-group .wy-form-halves input[type="datetime"],.wy-control-group .wy-form-halves input[type="datetime-local"],.wy-control-group .wy-form-halves input[type="week"],.wy-control-group .wy-form-halves input[type="number"],.wy-control-group .wy-form-halves input[type="search"],.wy-control-group .wy-form-halves input[type="tel"],.wy-control-group .wy-form-halves input[type="color"],.wy-control-group .wy-form-thirds input[type="text"],.wy-control-group .wy-form-thirds input[type="password"],.wy-control-group 
.wy-form-thirds input[type="email"],.wy-control-group .wy-form-thirds input[type="url"],.wy-control-group .wy-form-thirds input[type="date"],.wy-control-group .wy-form-thirds input[type="month"],.wy-control-group .wy-form-thirds input[type="time"],.wy-control-group .wy-form-thirds input[type="datetime"],.wy-control-group .wy-form-thirds input[type="datetime-local"],.wy-control-group .wy-form-thirds input[type="week"],.wy-control-group .wy-form-thirds input[type="number"],.wy-control-group .wy-form-thirds input[type="search"],.wy-control-group .wy-form-thirds input[type="tel"],.wy-control-group .wy-form-thirds input[type="color"]{width:100%}.wy-control-group .wy-form-full{float:left;display:block;margin-right:2.3576515979%;width:100%;margin-right:0}.wy-control-group .wy-form-full:last-child{margin-right:0}.wy-control-group .wy-form-halves{float:left;display:block;margin-right:2.3576515979%;width:48.821174201%}.wy-control-group .wy-form-halves:last-child{margin-right:0}.wy-control-group .wy-form-halves:nth-of-type(2n){margin-right:0}.wy-control-group .wy-form-halves:nth-of-type(2n+1){clear:left}.wy-control-group .wy-form-thirds{float:left;display:block;margin-right:2.3576515979%;width:31.7615656014%}.wy-control-group .wy-form-thirds:last-child{margin-right:0}.wy-control-group .wy-form-thirds:nth-of-type(3n){margin-right:0}.wy-control-group .wy-form-thirds:nth-of-type(3n+1){clear:left}.wy-control-group.wy-control-group-no-input .wy-control{margin:6px 0 0 0;font-size:90%}.wy-control-no-input{display:inline-block;margin:6px 0 0 0;font-size:90%}.wy-control-group.fluid-input input[type="text"],.wy-control-group.fluid-input input[type="password"],.wy-control-group.fluid-input input[type="email"],.wy-control-group.fluid-input input[type="url"],.wy-control-group.fluid-input input[type="date"],.wy-control-group.fluid-input input[type="month"],.wy-control-group.fluid-input input[type="time"],.wy-control-group.fluid-input input[type="datetime"],.wy-control-group.fluid-input input[type="datetime-local"],.wy-control-group.fluid-input input[type="week"],.wy-control-group.fluid-input input[type="number"],.wy-control-group.fluid-input input[type="search"],.wy-control-group.fluid-input input[type="tel"],.wy-control-group.fluid-input input[type="color"]{width:100%}.wy-form-message-inline{display:inline-block;padding-left:.3em;color:#666;vertical-align:middle;font-size:90%}.wy-form-message{display:block;color:#999;font-size:70%;margin-top:.3125em;font-style:italic}.wy-form-message p{font-size:inherit;font-style:italic;margin-bottom:6px}.wy-form-message p:last-child{margin-bottom:0}input{line-height:normal}input[type="button"],input[type="reset"],input[type="submit"]{-webkit-appearance:button;cursor:pointer;font-family:"Lato","proxima-nova","Helvetica Neue",Arial,sans-serif;*overflow:visible}input[type="text"],input[type="password"],input[type="email"],input[type="url"],input[type="date"],input[type="month"],input[type="time"],input[type="datetime"],input[type="datetime-local"],input[type="week"],input[type="number"],input[type="search"],input[type="tel"],input[type="color"]{-webkit-appearance:none;padding:6px;display:inline-block;border:1px solid #ccc;font-size:80%;font-family:"Lato","proxima-nova","Helvetica Neue",Arial,sans-serif;box-shadow:inset 0 1px 3px #ddd;border-radius:0;-webkit-transition:border .3s linear;-moz-transition:border .3s linear;transition:border .3s linear}input[type="datetime-local"]{padding:.34375em 
.625em}input[disabled]{cursor:default}input[type="checkbox"],input[type="radio"]{-webkit-box-sizing:border-box;-moz-box-sizing:border-box;box-sizing:border-box;padding:0;margin-right:.3125em;*height:13px;*width:13px}input[type="search"]{-webkit-box-sizing:border-box;-moz-box-sizing:border-box;box-sizing:border-box}input[type="search"]::-webkit-search-cancel-button,input[type="search"]::-webkit-search-decoration{-webkit-appearance:none}input[type="text"]:focus,input[type="password"]:focus,input[type="email"]:focus,input[type="url"]:focus,input[type="date"]:focus,input[type="month"]:focus,input[type="time"]:focus,input[type="datetime"]:focus,input[type="datetime-local"]:focus,input[type="week"]:focus,input[type="number"]:focus,input[type="search"]:focus,input[type="tel"]:focus,input[type="color"]:focus{outline:0;outline:thin dotted \9;border-color:#333}input.no-focus:focus{border-color:#ccc !important}input[type="file"]:focus,input[type="radio"]:focus,input[type="checkbox"]:focus{outline:thin dotted #333;outline:1px auto #129FEA}input[type="text"][disabled],input[type="password"][disabled],input[type="email"][disabled],input[type="url"][disabled],input[type="date"][disabled],input[type="month"][disabled],input[type="time"][disabled],input[type="datetime"][disabled],input[type="datetime-local"][disabled],input[type="week"][disabled],input[type="number"][disabled],input[type="search"][disabled],input[type="tel"][disabled],input[type="color"][disabled]{cursor:not-allowed;background-color:#fafafa}input:focus:invalid,textarea:focus:invalid,select:focus:invalid{color:#E74C3C;border:1px solid #E74C3C}input:focus:invalid:focus,textarea:focus:invalid:focus,select:focus:invalid:focus{border-color:#E74C3C}input[type="file"]:focus:invalid:focus,input[type="radio"]:focus:invalid:focus,input[type="checkbox"]:focus:invalid:focus{outline-color:#E74C3C}input.wy-input-large{padding:12px;font-size:100%}textarea{overflow:auto;vertical-align:top;width:100%;font-family:"Lato","proxima-nova","Helvetica Neue",Arial,sans-serif}select,textarea{padding:.5em .625em;display:inline-block;border:1px solid #ccc;font-size:80%;box-shadow:inset 0 1px 3px #ddd;-webkit-transition:border .3s linear;-moz-transition:border .3s linear;transition:border .3s linear}select{border:1px solid #ccc;background-color:#fff}select[multiple]{height:auto}select:focus,textarea:focus{outline:0}select[disabled],textarea[disabled],input[readonly],select[readonly],textarea[readonly]{cursor:not-allowed;background-color:#fafafa}input[type="radio"][disabled],input[type="checkbox"][disabled]{cursor:not-allowed}.wy-checkbox,.wy-radio{margin:6px 0;color:#404040;display:block}.wy-checkbox input,.wy-radio input{vertical-align:baseline}.wy-form-message-inline{display:inline-block;*display:inline;*zoom:1;vertical-align:middle}.wy-input-prefix,.wy-input-suffix{white-space:nowrap;padding:6px}.wy-input-prefix .wy-input-context,.wy-input-suffix .wy-input-context{line-height:27px;padding:0 8px;display:inline-block;font-size:80%;background-color:#f3f6f6;border:solid 1px #ccc;color:#999}.wy-input-suffix .wy-input-context{border-left:0}.wy-input-prefix .wy-input-context{border-right:0}.wy-switch{position:relative;display:block;height:24px;margin-top:12px;cursor:pointer}.wy-switch:before{position:absolute;content:"";display:block;left:0;top:0;width:36px;height:12px;border-radius:4px;background:#ccc;-webkit-transition:all .2s ease-in-out;-moz-transition:all .2s ease-in-out;transition:all .2s 
ease-in-out}.wy-switch:after{position:absolute;content:"";display:block;width:18px;height:18px;border-radius:4px;background:#999;left:-3px;top:-3px;-webkit-transition:all .2s ease-in-out;-moz-transition:all .2s ease-in-out;transition:all .2s ease-in-out}.wy-switch span{position:absolute;left:48px;display:block;font-size:12px;color:#ccc;line-height:1}.wy-switch.active:before{background:#1e8449}.wy-switch.active:after{left:24px;background:#27AE60}.wy-switch.disabled{cursor:not-allowed;opacity:.8}.wy-control-group.wy-control-group-error .wy-form-message,.wy-control-group.wy-control-group-error>label{color:#E74C3C}.wy-control-group.wy-control-group-error input[type="text"],.wy-control-group.wy-control-group-error input[type="password"],.wy-control-group.wy-control-group-error input[type="email"],.wy-control-group.wy-control-group-error input[type="url"],.wy-control-group.wy-control-group-error input[type="date"],.wy-control-group.wy-control-group-error input[type="month"],.wy-control-group.wy-control-group-error input[type="time"],.wy-control-group.wy-control-group-error input[type="datetime"],.wy-control-group.wy-control-group-error input[type="datetime-local"],.wy-control-group.wy-control-group-error input[type="week"],.wy-control-group.wy-control-group-error input[type="number"],.wy-control-group.wy-control-group-error input[type="search"],.wy-control-group.wy-control-group-error input[type="tel"],.wy-control-group.wy-control-group-error input[type="color"]{border:solid 1px #E74C3C}.wy-control-group.wy-control-group-error textarea{border:solid 1px #E74C3C}.wy-inline-validate{white-space:nowrap}.wy-inline-validate .wy-input-context{padding:.5em .625em;display:inline-block;font-size:80%}.wy-inline-validate.wy-inline-validate-success .wy-input-context{color:#27AE60}.wy-inline-validate.wy-inline-validate-danger .wy-input-context{color:#E74C3C}.wy-inline-validate.wy-inline-validate-warning .wy-input-context{color:#E67E22}.wy-inline-validate.wy-inline-validate-info .wy-input-context{color:#2980B9}.rotate-90{-webkit-transform:rotate(90deg);-moz-transform:rotate(90deg);-ms-transform:rotate(90deg);-o-transform:rotate(90deg);transform:rotate(90deg)}.rotate-180{-webkit-transform:rotate(180deg);-moz-transform:rotate(180deg);-ms-transform:rotate(180deg);-o-transform:rotate(180deg);transform:rotate(180deg)}.rotate-270{-webkit-transform:rotate(270deg);-moz-transform:rotate(270deg);-ms-transform:rotate(270deg);-o-transform:rotate(270deg);transform:rotate(270deg)}.mirror{-webkit-transform:scaleX(-1);-moz-transform:scaleX(-1);-ms-transform:scaleX(-1);-o-transform:scaleX(-1);transform:scaleX(-1)}.mirror.rotate-90{-webkit-transform:scaleX(-1) rotate(90deg);-moz-transform:scaleX(-1) rotate(90deg);-ms-transform:scaleX(-1) rotate(90deg);-o-transform:scaleX(-1) rotate(90deg);transform:scaleX(-1) rotate(90deg)}.mirror.rotate-180{-webkit-transform:scaleX(-1) rotate(180deg);-moz-transform:scaleX(-1) rotate(180deg);-ms-transform:scaleX(-1) rotate(180deg);-o-transform:scaleX(-1) rotate(180deg);transform:scaleX(-1) rotate(180deg)}.mirror.rotate-270{-webkit-transform:scaleX(-1) rotate(270deg);-moz-transform:scaleX(-1) rotate(270deg);-ms-transform:scaleX(-1) rotate(270deg);-o-transform:scaleX(-1) rotate(270deg);transform:scaleX(-1) rotate(270deg)}@media only screen and (max-width: 480px){.wy-form button[type="submit"]{margin:.7em 0 0}.wy-form input[type="text"],.wy-form input[type="password"],.wy-form input[type="email"],.wy-form input[type="url"],.wy-form input[type="date"],.wy-form input[type="month"],.wy-form 
input[type="time"],.wy-form input[type="datetime"],.wy-form input[type="datetime-local"],.wy-form input[type="week"],.wy-form input[type="number"],.wy-form input[type="search"],.wy-form input[type="tel"],.wy-form input[type="color"]{margin-bottom:.3em;display:block}.wy-form label{margin-bottom:.3em;display:block}.wy-form input[type="password"],.wy-form input[type="email"],.wy-form input[type="url"],.wy-form input[type="date"],.wy-form input[type="month"],.wy-form input[type="time"],.wy-form input[type="datetime"],.wy-form input[type="datetime-local"],.wy-form input[type="week"],.wy-form input[type="number"],.wy-form input[type="search"],.wy-form input[type="tel"],.wy-form input[type="color"]{margin-bottom:0}.wy-form-aligned .wy-control-group label{margin-bottom:.3em;text-align:left;display:block;width:100%}.wy-form-aligned .wy-control{margin:1.5em 0 0 0}.wy-form .wy-help-inline,.wy-form-message-inline,.wy-form-message{display:block;font-size:80%;padding:6px 0}}@media screen and (max-width: 768px){.tablet-hide{display:none}}@media screen and (max-width: 480px){.mobile-hide{display:none}}.float-left{float:left}.float-right{float:right}.full-width{width:100%}.wy-table,.rst-content table.docutils,.rst-content table.field-list{border-collapse:collapse;border-spacing:0;empty-cells:show;margin-bottom:24px}.wy-table caption,.rst-content table.docutils caption,.rst-content table.field-list caption{color:#000;font:italic 85%/1 arial,sans-serif;padding:1em 0;text-align:center}.wy-table td,.rst-content table.docutils td,.rst-content table.field-list td,.wy-table th,.rst-content table.docutils th,.rst-content table.field-list th{font-size:90%;margin:0;overflow:visible;padding:8px 16px}.wy-table td:first-child,.rst-content table.docutils td:first-child,.rst-content table.field-list td:first-child,.wy-table th:first-child,.rst-content table.docutils th:first-child,.rst-content table.field-list th:first-child{border-left-width:0}.wy-table thead,.rst-content table.docutils thead,.rst-content table.field-list thead{color:#000;text-align:left;vertical-align:bottom;white-space:nowrap}.wy-table thead th,.rst-content table.docutils thead th,.rst-content table.field-list thead th{font-weight:bold;border-bottom:solid 2px #e1e4e5}.wy-table td,.rst-content table.docutils td,.rst-content table.field-list td{background-color:transparent;vertical-align:middle}.wy-table td p,.rst-content table.docutils td p,.rst-content table.field-list td p{line-height:18px}.wy-table td p:last-child,.rst-content table.docutils td p:last-child,.rst-content table.field-list td p:last-child{margin-bottom:0}.wy-table .wy-table-cell-min,.rst-content table.docutils .wy-table-cell-min,.rst-content table.field-list .wy-table-cell-min{width:1%;padding-right:0}.wy-table .wy-table-cell-min input[type=checkbox],.rst-content table.docutils .wy-table-cell-min input[type=checkbox],.rst-content table.field-list .wy-table-cell-min input[type=checkbox],.wy-table .wy-table-cell-min input[type=checkbox],.rst-content table.docutils .wy-table-cell-min input[type=checkbox],.rst-content table.field-list .wy-table-cell-min input[type=checkbox]{margin:0}.wy-table-secondary{color:gray;font-size:90%}.wy-table-tertiary{color:gray;font-size:80%}.wy-table-odd td,.wy-table-striped tr:nth-child(2n-1) td,.rst-content table.docutils:not(.field-list) tr:nth-child(2n-1) td{background-color:#f3f6f6}.wy-table-backed{background-color:#f3f6f6}.wy-table-bordered-all,.rst-content table.docutils{border:1px solid #e1e4e5}.wy-table-bordered-all td,.rst-content table.docutils 
td{border-bottom:1px solid #e1e4e5;border-left:1px solid #e1e4e5}.wy-table-bordered-all tbody>tr:last-child td,.rst-content table.docutils tbody>tr:last-child td{border-bottom-width:0}.wy-table-bordered{border:1px solid #e1e4e5}.wy-table-bordered-rows td{border-bottom:1px solid #e1e4e5}.wy-table-bordered-rows tbody>tr:last-child td{border-bottom-width:0}.wy-table-horizontal tbody>tr:last-child td{border-bottom-width:0}.wy-table-horizontal td,.wy-table-horizontal th{border-width:0 0 1px 0;border-bottom:1px solid #e1e4e5}.wy-table-horizontal tbody>tr:last-child td{border-bottom-width:0}.wy-table-responsive{margin-bottom:24px;max-width:100%;overflow:auto}.wy-table-responsive table{margin-bottom:0 !important}.wy-table-responsive table td,.wy-table-responsive table th{white-space:nowrap}a{color:#2980B9;text-decoration:none;cursor:pointer}a:hover{color:#3091d1}a:visited{color:#9B59B6}html{height:100%;overflow-x:hidden}body{font-family:"Lato","proxima-nova","Helvetica Neue",Arial,sans-serif;font-weight:normal;color:#404040;min-height:100%;overflow-x:hidden;background:#edf0f2}.wy-text-left{text-align:left}.wy-text-center{text-align:center}.wy-text-right{text-align:right}.wy-text-large{font-size:120%}.wy-text-normal{font-size:100%}.wy-text-small,small{font-size:80%}.wy-text-strike{text-decoration:line-through}.wy-text-warning{color:#E67E22 !important}a.wy-text-warning:hover{color:#eb9950 !important}.wy-text-info{color:#2980B9 !important}a.wy-text-info:hover{color:#409ad5 !important}.wy-text-success{color:#27AE60 !important}a.wy-text-success:hover{color:#36d278 !important}.wy-text-danger{color:#E74C3C !important}a.wy-text-danger:hover{color:#ed7669 !important}.wy-text-neutral{color:#404040 !important}a.wy-text-neutral:hover{color:#595959 !important}h1,h2,.rst-content .toctree-wrapper p.caption,h3,h4,h5,h6,legend{margin-top:0;font-weight:700;font-family:"Roboto Slab","ff-tisa-web-pro","Georgia",Arial,sans-serif}p{line-height:24px;margin:0;font-size:16px;margin-bottom:24px}h1{font-size:175%}h2,.rst-content .toctree-wrapper p.caption{font-size:150%}h3{font-size:125%}h4{font-size:115%}h5{font-size:110%}h6{font-size:100%}hr{display:block;height:1px;border:0;border-top:1px solid #e1e4e5;margin:24px 0;padding:0}code,.rst-content tt,.rst-content code{white-space:nowrap;max-width:100%;background:#fff;border:solid 1px #e1e4e5;font-size:75%;padding:0 5px;font-family:SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",Courier,monospace;color:#E74C3C;overflow-x:auto}code.code-large,.rst-content tt.code-large{font-size:90%}.wy-plain-list-disc,.rst-content .section ul,.rst-content .toctree-wrapper ul,article ul{list-style:disc;line-height:24px;margin-bottom:24px}.wy-plain-list-disc li,.rst-content .section ul li,.rst-content .toctree-wrapper ul li,article ul li{list-style:disc;margin-left:24px}.wy-plain-list-disc li p:last-child,.rst-content .section ul li p:last-child,.rst-content .toctree-wrapper ul li p:last-child,article ul li p:last-child{margin-bottom:0}.wy-plain-list-disc li ul,.rst-content .section ul li ul,.rst-content .toctree-wrapper ul li ul,article ul li ul{margin-bottom:0}.wy-plain-list-disc li li,.rst-content .section ul li li,.rst-content .toctree-wrapper ul li li,article ul li li{list-style:circle}.wy-plain-list-disc li li li,.rst-content .section ul li li li,.rst-content .toctree-wrapper ul li li li,article ul li li li{list-style:square}.wy-plain-list-disc li ol li,.rst-content .section ul li ol li,.rst-content .toctree-wrapper ul li ol li,article ul li ol 
li{list-style:decimal}.wy-plain-list-decimal,.rst-content .section ol,.rst-content ol.arabic,article ol{list-style:decimal;line-height:24px;margin-bottom:24px}.wy-plain-list-decimal li,.rst-content .section ol li,.rst-content ol.arabic li,article ol li{list-style:decimal;margin-left:24px}.wy-plain-list-decimal li p:last-child,.rst-content .section ol li p:last-child,.rst-content ol.arabic li p:last-child,article ol li p:last-child{margin-bottom:0}.wy-plain-list-decimal li ul,.rst-content .section ol li ul,.rst-content ol.arabic li ul,article ol li ul{margin-bottom:0}.wy-plain-list-decimal li ul li,.rst-content .section ol li ul li,.rst-content ol.arabic li ul li,article ol li ul li{list-style:disc}.wy-breadcrumbs{*zoom:1}.wy-breadcrumbs:before,.wy-breadcrumbs:after{display:table;content:""}.wy-breadcrumbs:after{clear:both}.wy-breadcrumbs li{display:inline-block}.wy-breadcrumbs li.wy-breadcrumbs-aside{float:right}.wy-breadcrumbs li a{display:inline-block;padding:5px}.wy-breadcrumbs li a:first-child{padding-left:0}.wy-breadcrumbs li code,.wy-breadcrumbs li .rst-content tt,.rst-content .wy-breadcrumbs li tt{padding:5px;border:none;background:none}.wy-breadcrumbs li code.literal,.wy-breadcrumbs li .rst-content tt.literal,.rst-content .wy-breadcrumbs li tt.literal{color:#404040}.wy-breadcrumbs-extra{margin-bottom:0;color:#b3b3b3;font-size:80%;display:inline-block}@media screen and (max-width: 480px){.wy-breadcrumbs-extra{display:none}.wy-breadcrumbs li.wy-breadcrumbs-aside{display:none}}@media print{.wy-breadcrumbs li.wy-breadcrumbs-aside{display:none}}html{font-size:16px}.wy-affix{position:fixed;top:1.618em}.wy-menu a:hover{text-decoration:none}.wy-menu-horiz{*zoom:1}.wy-menu-horiz:before,.wy-menu-horiz:after{display:table;content:""}.wy-menu-horiz:after{clear:both}.wy-menu-horiz ul,.wy-menu-horiz li{display:inline-block}.wy-menu-horiz li:hover{background:rgba(255,255,255,0.1)}.wy-menu-horiz li.divide-left{border-left:solid 1px #404040}.wy-menu-horiz li.divide-right{border-right:solid 1px #404040}.wy-menu-horiz a{height:32px;display:inline-block;line-height:32px;padding:0 16px}.wy-menu-vertical{width:300px}.wy-menu-vertical header,.wy-menu-vertical p.caption{color:#3a7ca8;height:32px;display:inline-block;line-height:32px;padding:0 1.618em;margin:12px 0 0 0;display:block;font-weight:bold;text-transform:uppercase;font-size:85%;white-space:nowrap}.wy-menu-vertical ul{margin-bottom:0}.wy-menu-vertical li.divide-top{border-top:solid 1px #404040}.wy-menu-vertical li.divide-bottom{border-bottom:solid 1px #404040}.wy-menu-vertical li.current{background:#e3e3e3}.wy-menu-vertical li.current a{color:gray;border-right:solid 1px #c9c9c9;padding:.4045em 2.427em}.wy-menu-vertical li.current a:hover{background:#d6d6d6}.wy-menu-vertical li code,.wy-menu-vertical li .rst-content tt,.rst-content .wy-menu-vertical li tt{border:none;background:inherit;color:inherit;padding-left:0;padding-right:0}.wy-menu-vertical li span.toctree-expand{display:block;float:left;margin-left:-1.2em;font-size:.8em;line-height:1.6em;color:#4d4d4d}.wy-menu-vertical li.on a,.wy-menu-vertical li.current>a{color:#404040;padding:.4045em 1.618em;font-weight:bold;position:relative;background:#fcfcfc;border:none;padding-left:1.618em -4px}.wy-menu-vertical li.on a:hover,.wy-menu-vertical li.current>a:hover{background:#fcfcfc}.wy-menu-vertical li.on a:hover span.toctree-expand,.wy-menu-vertical li.current>a:hover span.toctree-expand{color:gray}.wy-menu-vertical li.on a span.toctree-expand,.wy-menu-vertical li.current>a 
span.toctree-expand{display:block;font-size:.8em;line-height:1.6em;color:#333}.wy-menu-vertical li.toctree-l1.current>a{border-bottom:solid 1px #c9c9c9;border-top:solid 1px #c9c9c9}.wy-menu-vertical li.toctree-l2 a,.wy-menu-vertical li.toctree-l3 a,.wy-menu-vertical li.toctree-l4 a{color:#404040}.wy-menu-vertical li.toctree-l1.current li.toctree-l2>ul,.wy-menu-vertical li.toctree-l2.current li.toctree-l3>ul{display:none}.wy-menu-vertical li.toctree-l1.current li.toctree-l2.current>ul,.wy-menu-vertical li.toctree-l2.current li.toctree-l3.current>ul{display:block}.wy-menu-vertical li.toctree-l2.current>a{background:#c9c9c9;padding:.4045em 2.427em}.wy-menu-vertical li.toctree-l2.current li.toctree-l3>a{display:block;background:#c9c9c9;padding:.4045em 4.045em}.wy-menu-vertical li.toctree-l2 a:hover span.toctree-expand{color:gray}.wy-menu-vertical li.toctree-l2 span.toctree-expand{color:#a3a3a3}.wy-menu-vertical li.toctree-l3{font-size:.9em}.wy-menu-vertical li.toctree-l3.current>a{background:#bdbdbd;padding:.4045em 4.045em}.wy-menu-vertical li.toctree-l3.current li.toctree-l4>a{display:block;background:#bdbdbd;padding:.4045em 5.663em}.wy-menu-vertical li.toctree-l3 a:hover span.toctree-expand{color:gray}.wy-menu-vertical li.toctree-l3 span.toctree-expand{color:#969696}.wy-menu-vertical li.toctree-l4{font-size:.9em}.wy-menu-vertical li.current ul{display:block}.wy-menu-vertical li ul{margin-bottom:0;display:none}.wy-menu-vertical li ul li a{margin-bottom:0;color:#d9d9d9;font-weight:normal}.wy-menu-vertical a{display:inline-block;line-height:18px;padding:.4045em 1.618em;display:block;position:relative;font-size:90%;color:#d9d9d9}.wy-menu-vertical a:hover{background-color:#4e4a4a;cursor:pointer}.wy-menu-vertical a:hover span.toctree-expand{color:#d9d9d9}.wy-menu-vertical a:active{background-color:#2980B9;cursor:pointer;color:#fff}.wy-menu-vertical a:active span.toctree-expand{color:#fff}.wy-side-nav-search{display:block;width:300px;padding:.809em;margin-bottom:.809em;z-index:200;background-color:#2980B9;text-align:center;padding:.809em;display:block;color:#fcfcfc;margin-bottom:.809em}.wy-side-nav-search input[type=text]{width:100%;border-radius:50px;padding:6px 12px;border-color:#2472a4}.wy-side-nav-search img{display:block;margin:auto auto .809em auto;height:45px;width:45px;background-color:#2980B9;padding:5px;border-radius:100%}.wy-side-nav-search>a,.wy-side-nav-search .wy-dropdown>a{color:#fcfcfc;font-size:100%;font-weight:bold;display:inline-block;padding:4px 6px;margin-bottom:.809em}.wy-side-nav-search>a:hover,.wy-side-nav-search .wy-dropdown>a:hover{background:rgba(255,255,255,0.1)}.wy-side-nav-search>a img.logo,.wy-side-nav-search .wy-dropdown>a img.logo{display:block;margin:0 auto;height:auto;width:auto;border-radius:0;max-width:100%;background:transparent}.wy-side-nav-search>a.icon img.logo,.wy-side-nav-search .wy-dropdown>a.icon img.logo{margin-top:.85em}.wy-side-nav-search>div.version{margin-top:-.4045em;margin-bottom:.809em;font-weight:normal;color:rgba(255,255,255,0.3)}.wy-nav .wy-menu-vertical header{color:#2980B9}.wy-nav .wy-menu-vertical a{color:#b3b3b3}.wy-nav .wy-menu-vertical a:hover{background-color:#2980B9;color:#fff}[data-menu-wrap]{-webkit-transition:all .2s ease-in;-moz-transition:all .2s ease-in;transition:all .2s 
ease-in;position:absolute;opacity:1;width:100%;opacity:0}[data-menu-wrap].move-center{left:0;right:auto;opacity:1}[data-menu-wrap].move-left{right:auto;left:-100%;opacity:0}[data-menu-wrap].move-right{right:-100%;left:auto;opacity:0}.wy-body-for-nav{background:#fcfcfc}.wy-grid-for-nav{position:absolute;width:100%;height:100%}.wy-nav-side{position:fixed;top:0;bottom:0;left:0;padding-bottom:2em;width:300px;overflow-x:hidden;overflow-y:hidden;min-height:100%;color:#9b9b9b;background:#343131;z-index:200}.wy-side-scroll{width:320px;position:relative;overflow-x:hidden;overflow-y:scroll;height:100%}.wy-nav-top{display:none;background:#2980B9;color:#fff;padding:.4045em .809em;position:relative;line-height:50px;text-align:center;font-size:100%;*zoom:1}.wy-nav-top:before,.wy-nav-top:after{display:table;content:""}.wy-nav-top:after{clear:both}.wy-nav-top a{color:#fff;font-weight:bold}.wy-nav-top img{margin-right:12px;height:45px;width:45px;background-color:#2980B9;padding:5px;border-radius:100%}.wy-nav-top i{font-size:30px;float:left;cursor:pointer;padding-top:inherit}.wy-nav-content-wrap{margin-left:300px;background:#fcfcfc;min-height:100%}.wy-nav-content{padding:1.618em 3.236em;height:100%;max-width:800px;margin:auto}.wy-body-mask{position:fixed;width:100%;height:100%;background:rgba(0,0,0,0.2);display:none;z-index:499}.wy-body-mask.on{display:block}footer{color:gray}footer p{margin-bottom:12px}footer span.commit code,footer span.commit .rst-content tt,.rst-content footer span.commit tt{padding:0px;font-family:SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",Courier,monospace;font-size:1em;background:none;border:none;color:gray}.rst-footer-buttons{*zoom:1}.rst-footer-buttons:before,.rst-footer-buttons:after{width:100%}.rst-footer-buttons:before,.rst-footer-buttons:after{display:table;content:""}.rst-footer-buttons:after{clear:both}.rst-breadcrumbs-buttons{margin-top:12px;*zoom:1}.rst-breadcrumbs-buttons:before,.rst-breadcrumbs-buttons:after{display:table;content:""}.rst-breadcrumbs-buttons:after{clear:both}#search-results .search li{margin-bottom:24px;border-bottom:solid 1px #e1e4e5;padding-bottom:24px}#search-results .search li:first-child{border-top:solid 1px #e1e4e5;padding-top:24px}#search-results .search li a{font-size:120%;margin-bottom:12px;display:inline-block}#search-results .context{color:gray;font-size:90%}.genindextable li>ul{margin-left:24px}@media screen and (max-width: 768px){.wy-body-for-nav{background:#fcfcfc}.wy-nav-top{display:block}.wy-nav-side{left:-300px}.wy-nav-side.shift{width:85%;left:0}.wy-side-scroll{width:auto}.wy-side-nav-search{width:auto}.wy-menu.wy-menu-vertical{width:auto}.wy-nav-content-wrap{margin-left:0}.wy-nav-content-wrap .wy-nav-content{padding:1.618em}.wy-nav-content-wrap.shift{position:fixed;min-width:100%;left:85%;top:0;height:100%;overflow:hidden}}@media screen and (min-width: 1100px){.wy-nav-content-wrap{background:rgba(0,0,0,0.05)}.wy-nav-content{margin:0;background:#fcfcfc}}@media print{.rst-versions,footer,.wy-nav-side{display:none}.wy-nav-content-wrap{margin-left:0}}.rst-versions{position:fixed;bottom:0;left:0;width:300px;color:#fcfcfc;background:#1f1d1d;font-family:"Lato","proxima-nova","Helvetica Neue",Arial,sans-serif;z-index:400}.rst-versions a{color:#2980B9;text-decoration:none}.rst-versions .rst-badge-small{display:none}.rst-versions .rst-current-version{padding:12px;background-color:#272525;display:block;text-align:right;font-size:90%;cursor:pointer;color:#27AE60;*zoom:1}.rst-versions 
.rst-current-version:before,.rst-versions .rst-current-version:after{display:table;content:""}.rst-versions .rst-current-version:after{clear:both}.rst-versions .rst-current-version .fa,.rst-versions .rst-current-version .wy-menu-vertical li span.toctree-expand,.wy-menu-vertical li .rst-versions .rst-current-version span.toctree-expand,.rst-versions .rst-current-version .rst-content .admonition-title,.rst-content .rst-versions .rst-current-version .admonition-title,.rst-versions .rst-current-version .rst-content h1 .headerlink,.rst-content h1 .rst-versions .rst-current-version .headerlink,.rst-versions .rst-current-version .rst-content h2 .headerlink,.rst-content h2 .rst-versions .rst-current-version .headerlink,.rst-versions .rst-current-version .rst-content h3 .headerlink,.rst-content h3 .rst-versions .rst-current-version .headerlink,.rst-versions .rst-current-version .rst-content h4 .headerlink,.rst-content h4 .rst-versions .rst-current-version .headerlink,.rst-versions .rst-current-version .rst-content h5 .headerlink,.rst-content h5 .rst-versions .rst-current-version .headerlink,.rst-versions .rst-current-version .rst-content h6 .headerlink,.rst-content h6 .rst-versions .rst-current-version .headerlink,.rst-versions .rst-current-version .rst-content dl dt .headerlink,.rst-content dl dt .rst-versions .rst-current-version .headerlink,.rst-versions .rst-current-version .rst-content p.caption .headerlink,.rst-content p.caption .rst-versions .rst-current-version .headerlink,.rst-versions .rst-current-version .rst-content table>caption .headerlink,.rst-content table>caption .rst-versions .rst-current-version .headerlink,.rst-versions .rst-current-version .rst-content .code-block-caption .headerlink,.rst-content .code-block-caption .rst-versions .rst-current-version .headerlink,.rst-versions .rst-current-version .rst-content tt.download span:first-child,.rst-content tt.download .rst-versions .rst-current-version span:first-child,.rst-versions .rst-current-version .rst-content code.download span:first-child,.rst-content code.download .rst-versions .rst-current-version span:first-child,.rst-versions .rst-current-version .icon{color:#fcfcfc}.rst-versions .rst-current-version .fa-book,.rst-versions .rst-current-version .icon-book{float:left}.rst-versions .rst-current-version .icon-book{float:left}.rst-versions .rst-current-version.rst-out-of-date{background-color:#E74C3C;color:#fff}.rst-versions .rst-current-version.rst-active-old-version{background-color:#F1C40F;color:#000}.rst-versions.shift-up{height:auto;max-height:100%;overflow-y:scroll}.rst-versions.shift-up .rst-other-versions{display:block}.rst-versions .rst-other-versions{font-size:90%;padding:12px;color:gray;display:none}.rst-versions .rst-other-versions hr{display:block;height:1px;border:0;margin:20px 0;padding:0;border-top:solid 1px #413d3d}.rst-versions .rst-other-versions dd{display:inline-block;margin:0}.rst-versions .rst-other-versions dd a{display:inline-block;padding:6px;color:#fcfcfc}.rst-versions.rst-badge{width:auto;bottom:20px;right:20px;left:auto;border:none;max-width:300px;max-height:90%}.rst-versions.rst-badge .icon-book{float:none}.rst-versions.rst-badge .fa-book,.rst-versions.rst-badge .icon-book{float:none}.rst-versions.rst-badge.shift-up .rst-current-version{text-align:right}.rst-versions.rst-badge.shift-up .rst-current-version .fa-book,.rst-versions.rst-badge.shift-up .rst-current-version .icon-book{float:left}.rst-versions.rst-badge.shift-up .rst-current-version .icon-book{float:left}.rst-versions.rst-badge 
.rst-current-version{width:auto;height:30px;line-height:30px;padding:0 6px;display:block;text-align:center}@media screen and (max-width: 768px){.rst-versions{width:85%;display:none}.rst-versions.shift{display:block}}.rst-content img{max-width:100%;height:auto}.rst-content div.figure{margin-bottom:24px}.rst-content div.figure p.caption{font-style:italic}.rst-content div.figure p:last-child.caption{margin-bottom:0px}.rst-content div.figure.align-center{text-align:center}.rst-content .section>img,.rst-content .section>a>img{margin-bottom:24px}.rst-content abbr[title]{text-decoration:none}.rst-content.style-external-links a.reference.external:after{font-family:FontAwesome;content:"";color:#b3b3b3;vertical-align:super;font-size:60%;margin:0 .2em}.rst-content blockquote{margin-left:24px;line-height:24px;margin-bottom:24px}.rst-content pre.literal-block{white-space:pre;margin:0;padding:12px 12px;font-family:SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",Courier,monospace;display:block;overflow:auto}.rst-content pre.literal-block,.rst-content div[class^='highlight']{border:1px solid #e1e4e5;overflow-x:auto;margin:1px 0 24px 0}.rst-content pre.literal-block div[class^='highlight'],.rst-content div[class^='highlight'] div[class^='highlight']{padding:0px;border:none;margin:0}.rst-content div[class^='highlight'] td.code{width:100%}.rst-content .linenodiv pre{border-right:solid 1px #e6e9ea;margin:0;padding:12px 12px;font-family:SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",Courier,monospace;user-select:none;pointer-events:none}.rst-content div[class^='highlight'] pre{white-space:pre;margin:0;padding:12px 12px;display:block;overflow:auto}.rst-content div[class^='highlight'] pre .hll{display:block;margin:0 -12px;padding:0 12px}.rst-content pre.literal-block,.rst-content div[class^='highlight'] pre,.rst-content .linenodiv pre{font-family:SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",Courier,monospace;font-size:12px;line-height:1.4}.rst-content .code-block-caption{font-style:italic;font-size:85%;line-height:1;padding:1em 0;text-align:center}@media print{.rst-content .codeblock,.rst-content div[class^='highlight'],.rst-content div[class^='highlight'] pre{white-space:pre-wrap}}.rst-content .note .last,.rst-content .attention .last,.rst-content .caution .last,.rst-content .danger .last,.rst-content .error .last,.rst-content .hint .last,.rst-content .important .last,.rst-content .tip .last,.rst-content .warning .last,.rst-content .seealso .last,.rst-content .admonition-todo .last,.rst-content .admonition .last{margin-bottom:0}.rst-content .admonition-title:before{margin-right:4px}.rst-content .admonition table{border-color:rgba(0,0,0,0.1)}.rst-content .admonition table td,.rst-content .admonition table th{background:transparent !important;border-color:rgba(0,0,0,0.1) !important}.rst-content .section ol.loweralpha,.rst-content .section ol.loweralpha li{list-style:lower-alpha}.rst-content .section ol.upperalpha,.rst-content .section ol.upperalpha li{list-style:upper-alpha}.rst-content .section ol p,.rst-content .section ul p{margin-bottom:12px}.rst-content .section ol p:last-child,.rst-content .section ul p:last-child{margin-bottom:24px}.rst-content .line-block{margin-left:0px;margin-bottom:24px;line-height:24px}.rst-content .line-block .line-block{margin-left:24px;margin-bottom:0px}.rst-content .topic-title{font-weight:bold;margin-bottom:12px}.rst-content .toc-backref{color:#404040}.rst-content .align-right{float:right;margin:0px 0px 24px 
24px}.rst-content .align-left{float:left;margin:0px 24px 24px 0px}.rst-content .align-center{margin:auto}.rst-content .align-center:not(table){display:block}.rst-content h1 .headerlink,.rst-content h2 .headerlink,.rst-content .toctree-wrapper p.caption .headerlink,.rst-content h3 .headerlink,.rst-content h4 .headerlink,.rst-content h5 .headerlink,.rst-content h6 .headerlink,.rst-content dl dt .headerlink,.rst-content p.caption .headerlink,.rst-content table>caption .headerlink,.rst-content .code-block-caption .headerlink{visibility:hidden;font-size:14px}.rst-content h1 .headerlink:after,.rst-content h2 .headerlink:after,.rst-content .toctree-wrapper p.caption .headerlink:after,.rst-content h3 .headerlink:after,.rst-content h4 .headerlink:after,.rst-content h5 .headerlink:after,.rst-content h6 .headerlink:after,.rst-content dl dt .headerlink:after,.rst-content p.caption .headerlink:after,.rst-content table>caption .headerlink:after,.rst-content .code-block-caption .headerlink:after{content:"";font-family:FontAwesome}.rst-content h1:hover .headerlink:after,.rst-content h2:hover .headerlink:after,.rst-content .toctree-wrapper p.caption:hover .headerlink:after,.rst-content h3:hover .headerlink:after,.rst-content h4:hover .headerlink:after,.rst-content h5:hover .headerlink:after,.rst-content h6:hover .headerlink:after,.rst-content dl dt:hover .headerlink:after,.rst-content p.caption:hover .headerlink:after,.rst-content table>caption:hover .headerlink:after,.rst-content .code-block-caption:hover .headerlink:after{visibility:visible}.rst-content table>caption .headerlink:after{font-size:12px}.rst-content .centered{text-align:center}.rst-content .sidebar{float:right;width:40%;display:block;margin:0 0 24px 24px;padding:24px;background:#f3f6f6;border:solid 1px #e1e4e5}.rst-content .sidebar p,.rst-content .sidebar ul,.rst-content .sidebar dl{font-size:90%}.rst-content .sidebar .last{margin-bottom:0}.rst-content .sidebar .sidebar-title{display:block;font-family:"Roboto Slab","ff-tisa-web-pro","Georgia",Arial,sans-serif;font-weight:bold;background:#e1e4e5;padding:6px 12px;margin:-24px;margin-bottom:24px;font-size:100%}.rst-content .highlighted{background:#F1C40F;display:inline-block;font-weight:bold;padding:0 6px}.rst-content .footnote-reference,.rst-content .citation-reference{vertical-align:baseline;position:relative;top:-0.4em;line-height:0;font-size:90%}.rst-content table.docutils.citation,.rst-content table.docutils.footnote{background:none;border:none;color:gray}.rst-content table.docutils.citation td,.rst-content table.docutils.citation tr,.rst-content table.docutils.footnote td,.rst-content table.docutils.footnote tr{border:none;background-color:transparent !important;white-space:normal}.rst-content table.docutils.citation td.label,.rst-content table.docutils.footnote td.label{padding-left:0;padding-right:0;vertical-align:top}.rst-content table.docutils.citation tt,.rst-content table.docutils.citation code,.rst-content table.docutils.footnote tt,.rst-content table.docutils.footnote code{color:#555}.rst-content .wy-table-responsive.citation,.rst-content .wy-table-responsive.footnote{margin-bottom:0}.rst-content .wy-table-responsive.citation+:not(.citation),.rst-content .wy-table-responsive.footnote+:not(.footnote){margin-top:24px}.rst-content .wy-table-responsive.citation:last-child,.rst-content .wy-table-responsive.footnote:last-child{margin-bottom:24px}.rst-content table.docutils th{border-color:#e1e4e5}.rst-content table.docutils td .last,.rst-content table.docutils td .last 
:last-child{margin-bottom:0}.rst-content table.field-list{border:none}.rst-content table.field-list td{border:none}.rst-content table.field-list td p{font-size:inherit;line-height:inherit}.rst-content table.field-list td>strong{display:inline-block}.rst-content table.field-list .field-name{padding-right:10px;text-align:left;white-space:nowrap}.rst-content table.field-list .field-body{text-align:left}.rst-content tt,.rst-content tt,.rst-content code{color:#000;font-family:SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",Courier,monospace;padding:2px 5px}.rst-content tt big,.rst-content tt em,.rst-content tt big,.rst-content code big,.rst-content tt em,.rst-content code em{font-size:100% !important;line-height:normal}.rst-content tt.literal,.rst-content tt.literal,.rst-content code.literal{color:#E74C3C}.rst-content tt.xref,a .rst-content tt,.rst-content tt.xref,.rst-content code.xref,a .rst-content tt,a .rst-content code{font-weight:bold;color:#404040}.rst-content pre,.rst-content kbd,.rst-content samp{font-family:SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",Courier,monospace}.rst-content a tt,.rst-content a tt,.rst-content a code{color:#2980B9}.rst-content dl{margin-bottom:24px}.rst-content dl dt{font-weight:bold;margin-bottom:12px}.rst-content dl p,.rst-content dl table,.rst-content dl ul,.rst-content dl ol{margin-bottom:12px !important}.rst-content dl dd{margin:0 0 12px 24px;line-height:24px}.rst-content dl:not(.docutils){margin-bottom:24px}.rst-content dl:not(.docutils) dt{display:table;margin:6px 0;font-size:90%;line-height:normal;background:#e7f2fa;color:#2980B9;border-top:solid 3px #6ab0de;padding:6px;position:relative}.rst-content dl:not(.docutils) dt:before{color:#6ab0de}.rst-content dl:not(.docutils) dt .headerlink{color:#404040;font-size:100% !important}.rst-content dl:not(.docutils) dl dt{margin-bottom:6px;border:none;border-left:solid 3px #ccc;background:#f0f0f0;color:#555}.rst-content dl:not(.docutils) dl dt .headerlink{color:#404040;font-size:100% !important}.rst-content dl:not(.docutils) dt:first-child{margin-top:0}.rst-content dl:not(.docutils) tt,.rst-content dl:not(.docutils) tt,.rst-content dl:not(.docutils) code{font-weight:bold}.rst-content dl:not(.docutils) tt.descname,.rst-content dl:not(.docutils) tt.descclassname,.rst-content dl:not(.docutils) tt.descname,.rst-content dl:not(.docutils) code.descname,.rst-content dl:not(.docutils) tt.descclassname,.rst-content dl:not(.docutils) code.descclassname{background-color:transparent;border:none;padding:0;font-size:100% !important}.rst-content dl:not(.docutils) tt.descname,.rst-content dl:not(.docutils) tt.descname,.rst-content dl:not(.docutils) code.descname{font-weight:bold}.rst-content dl:not(.docutils) .optional{display:inline-block;padding:0 4px;color:#000;font-weight:bold}.rst-content dl:not(.docutils) .property{display:inline-block;padding-right:8px}.rst-content .viewcode-link,.rst-content .viewcode-back{display:inline-block;color:#27AE60;font-size:80%;padding-left:24px}.rst-content .viewcode-back{display:block;float:right}.rst-content p.rubric{margin-bottom:12px;font-weight:bold}.rst-content tt.download,.rst-content code.download{background:inherit;padding:inherit;font-weight:normal;font-family:inherit;font-size:inherit;color:inherit;border:inherit;white-space:inherit}.rst-content tt.download span:first-child,.rst-content code.download span:first-child{-webkit-font-smoothing:subpixel-antialiased}.rst-content tt.download span:first-child:before,.rst-content code.download 
span:first-child:before{margin-right:4px}.rst-content .guilabel{border:1px solid #7fbbe3;background:#e7f2fa;font-size:80%;font-weight:700;border-radius:4px;padding:2.4px 6px;margin:auto 2px}.rst-content .versionmodified{font-style:italic}@media screen and (max-width: 480px){.rst-content .sidebar{width:100%}}span[id*='MathJax-Span']{color:#404040}.math{text-align:center}@font-face{font-family:"Lato";src:url("../fonts/Lato/lato-regular.eot");src:url("../fonts/Lato/lato-regular.eot?#iefix") format("embedded-opentype"),url("../fonts/Lato/lato-regular.woff2") format("woff2"),url("../fonts/Lato/lato-regular.woff") format("woff"),url("../fonts/Lato/lato-regular.ttf") format("truetype");font-weight:400;font-style:normal}@font-face{font-family:"Lato";src:url("../fonts/Lato/lato-bold.eot");src:url("../fonts/Lato/lato-bold.eot?#iefix") format("embedded-opentype"),url("../fonts/Lato/lato-bold.woff2") format("woff2"),url("../fonts/Lato/lato-bold.woff") format("woff"),url("../fonts/Lato/lato-bold.ttf") format("truetype");font-weight:700;font-style:normal}@font-face{font-family:"Lato";src:url("../fonts/Lato/lato-bolditalic.eot");src:url("../fonts/Lato/lato-bolditalic.eot?#iefix") format("embedded-opentype"),url("../fonts/Lato/lato-bolditalic.woff2") format("woff2"),url("../fonts/Lato/lato-bolditalic.woff") format("woff"),url("../fonts/Lato/lato-bolditalic.ttf") format("truetype");font-weight:700;font-style:italic}@font-face{font-family:"Lato";src:url("../fonts/Lato/lato-italic.eot");src:url("../fonts/Lato/lato-italic.eot?#iefix") format("embedded-opentype"),url("../fonts/Lato/lato-italic.woff2") format("woff2"),url("../fonts/Lato/lato-italic.woff") format("woff"),url("../fonts/Lato/lato-italic.ttf") format("truetype");font-weight:400;font-style:italic}@font-face{font-family:"Roboto Slab";font-style:normal;font-weight:400;src:url("../fonts/RobotoSlab/roboto-slab.eot");src:url("../fonts/RobotoSlab/roboto-slab-v7-regular.eot?#iefix") format("embedded-opentype"),url("../fonts/RobotoSlab/roboto-slab-v7-regular.woff2") format("woff2"),url("../fonts/RobotoSlab/roboto-slab-v7-regular.woff") format("woff"),url("../fonts/RobotoSlab/roboto-slab-v7-regular.ttf") format("truetype")}@font-face{font-family:"Roboto Slab";font-style:normal;font-weight:700;src:url("../fonts/RobotoSlab/roboto-slab-v7-bold.eot");src:url("../fonts/RobotoSlab/roboto-slab-v7-bold.eot?#iefix") format("embedded-opentype"),url("../fonts/RobotoSlab/roboto-slab-v7-bold.woff2") format("woff2"),url("../fonts/RobotoSlab/roboto-slab-v7-bold.woff") format("woff"),url("../fonts/RobotoSlab/roboto-slab-v7-bold.ttf") format("truetype")} diff --git a/_static/custom.css b/_static/custom.css new file mode 100644 index 0000000000..e4ba5749a4 --- /dev/null +++ b/_static/custom.css @@ -0,0 +1,38 @@ +/* The search field on top of the toc tree */ +/* Mobile header */ +.wy-side-nav-search, .wy-nav-top { + background: #39B3C6; +} +/* toc tree text */ +.wy-menu-vertical header, +.wy-menu-vertical p.caption { + color: #39B3C6 +} +/* toc tree activated link */ +.wy-menu-vertical a:active { + background-color:#39B3C6; +} +/* Links */ +a { + color: #39B3C6 +} +/* Source spans */ +.rst-content .viewcode-link, .rst-content .viewcode-back{ + color: #39B3C6; +} +/* The literal code blocks */ +.rst-content tt.literal, .rst-content tt.literal, .rst-content code.literal { + color: #666; +} +.rst-content a code.literal { + color: #39B3C6; +} +/* Sidebar scroll space for version switcher */ +.wy-side-scroll { + padding-bottom: 1em; +} + +/* override table no-wrap */ 
+.wy-table-responsive table td, .wy-table-responsive table th { + white-space: normal; +} diff --git a/_static/doctools.js b/_static/doctools.js new file mode 100644 index 0000000000..c3db08d1c3 --- /dev/null +++ b/_static/doctools.js @@ -0,0 +1,264 @@ +/* + * doctools.js + * ~~~~~~~~~~~ + * + * Base JavaScript utilities for all Sphinx HTML documentation. + * + * :copyright: Copyright 2007-2022 by the Sphinx team, see AUTHORS. + * :license: BSD, see LICENSE for details. + * + */ +"use strict"; + +const _ready = (callback) => { + if (document.readyState !== "loading") { + callback(); + } else { + document.addEventListener("DOMContentLoaded", callback); + } +}; + +/** + * highlight a given string on a node by wrapping it in + * span elements with the given class name. + */ +const _highlight = (node, addItems, text, className) => { + if (node.nodeType === Node.TEXT_NODE) { + const val = node.nodeValue; + const parent = node.parentNode; + const pos = val.toLowerCase().indexOf(text); + if ( + pos >= 0 && + !parent.classList.contains(className) && + !parent.classList.contains("nohighlight") + ) { + let span; + + const closestNode = parent.closest("body, svg, foreignObject"); + const isInSVG = closestNode && closestNode.matches("svg"); + if (isInSVG) { + span = document.createElementNS("http://www.w3.org/2000/svg", "tspan"); + } else { + span = document.createElement("span"); + span.classList.add(className); + } + + span.appendChild(document.createTextNode(val.substr(pos, text.length))); + parent.insertBefore( + span, + parent.insertBefore( + document.createTextNode(val.substr(pos + text.length)), + node.nextSibling + ) + ); + node.nodeValue = val.substr(0, pos); + + if (isInSVG) { + const rect = document.createElementNS( + "http://www.w3.org/2000/svg", + "rect" + ); + const bbox = parent.getBBox(); + rect.x.baseVal.value = bbox.x; + rect.y.baseVal.value = bbox.y; + rect.width.baseVal.value = bbox.width; + rect.height.baseVal.value = bbox.height; + rect.setAttribute("class", className); + addItems.push({ parent: parent, target: rect }); + } + } + } else if (node.matches && !node.matches("button, select, textarea")) { + node.childNodes.forEach((el) => _highlight(el, addItems, text, className)); + } +}; +const _highlightText = (thisNode, text, className) => { + let addItems = []; + _highlight(thisNode, addItems, text, className); + addItems.forEach((obj) => + obj.parent.insertAdjacentElement("beforebegin", obj.target) + ); +}; + +/** + * Small JavaScript module for the documentation. + */ +const Documentation = { + init: () => { + Documentation.highlightSearchWords(); + Documentation.initDomainIndexTable(); + Documentation.initOnKeyListeners(); + }, + + /** + * i18n support + */ + TRANSLATIONS: {}, + PLURAL_EXPR: (n) => (n === 1 ? 0 : 1), + LOCALE: "unknown", + + // gettext and ngettext don't access this so that the functions + // can safely bound to a different name (_ = Documentation.gettext) + gettext: (string) => { + const translated = Documentation.TRANSLATIONS[string]; + switch (typeof translated) { + case "undefined": + return string; // no translation + case "string": + return translated; // translation exists + default: + return translated[0]; // (singular, plural) translation tuple exists + } + }, + + ngettext: (singular, plural, n) => { + const translated = Documentation.TRANSLATIONS[singular]; + if (typeof translated !== "undefined") + return translated[Documentation.PLURAL_EXPR(n)]; + return n === 1 ? 
singular : plural; + }, + + addTranslations: (catalog) => { + Object.assign(Documentation.TRANSLATIONS, catalog.messages); + Documentation.PLURAL_EXPR = new Function( + "n", + `return (${catalog.plural_expr})` + ); + Documentation.LOCALE = catalog.locale; + }, + + /** + * highlight the search words provided in the url in the text + */ + highlightSearchWords: () => { + const highlight = + new URLSearchParams(window.location.search).get("highlight") || ""; + const terms = highlight.toLowerCase().split(/\s+/).filter(x => x); + if (terms.length === 0) return; // nothing to do + + // There should never be more than one element matching "div.body" + const divBody = document.querySelectorAll("div.body"); + const body = divBody.length ? divBody[0] : document.querySelector("body"); + window.setTimeout(() => { + terms.forEach((term) => _highlightText(body, term, "highlighted")); + }, 10); + + const searchBox = document.getElementById("searchbox"); + if (searchBox === null) return; + searchBox.appendChild( + document + .createRange() + .createContextualFragment( + '<p class="highlight-link">' + + '<a href="javascript:Documentation.hideSearchWords()">' + + Documentation.gettext("Hide Search Matches") + + "</a></p>" + ) + ); + }, + + /** + * helper function to hide the search marks again + */ + hideSearchWords: () => { + document + .querySelectorAll("#searchbox .highlight-link") + .forEach((el) => el.remove()); + document + .querySelectorAll("span.highlighted") + .forEach((el) => el.classList.remove("highlighted")); + const url = new URL(window.location); + url.searchParams.delete("highlight"); + window.history.replaceState({}, "", url); + }, + + /** + * helper function to focus on search bar + */ + focusSearchBar: () => { + document.querySelectorAll("input[name=q]")[0]?.focus(); + }, + + /** + * Initialise the domain index toggle buttons + */ + initDomainIndexTable: () => { + const toggler = (el) => { + const idNumber = el.id.substr(7); + const toggledRows = document.querySelectorAll(`tr.cg-${idNumber}`); + if (el.src.substr(-9) === "minus.png") { + el.src = `${el.src.substr(0, el.src.length - 9)}plus.png`; + toggledRows.forEach((el) => (el.style.display = "none")); + } else { + el.src = `${el.src.substr(0, el.src.length - 8)}minus.png`; + toggledRows.forEach((el) => (el.style.display = "")); + } + }; + + const togglerElements = document.querySelectorAll("img.toggler"); + togglerElements.forEach((el) => + el.addEventListener("click", (event) => toggler(event.currentTarget)) + ); + togglerElements.forEach((el) => (el.style.display = "")); + if (DOCUMENTATION_OPTIONS.COLLAPSE_INDEX) togglerElements.forEach(toggler); + }, + + initOnKeyListeners: () => { + // only install a listener if it is really needed + if ( + !DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS && + !DOCUMENTATION_OPTIONS.ENABLE_SEARCH_SHORTCUTS + ) + return; + + const blacklistedElements = new Set([ + "TEXTAREA", + "INPUT", + "SELECT", + "BUTTON", + ]); + document.addEventListener("keydown", (event) => { + if (blacklistedElements.has(document.activeElement.tagName)) return; // bail for input elements + if (event.altKey || event.ctrlKey || event.metaKey) return; // bail with special keys + + if (!event.shiftKey) { + switch (event.key) { + case "ArrowLeft": + if (!DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS) break; + + const prevLink = document.querySelector('link[rel="prev"]'); + if (prevLink && prevLink.href) { + window.location.href = prevLink.href; + event.preventDefault(); + } + break; + case "ArrowRight": + if (!DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS) break; + + const nextLink = document.querySelector('link[rel="next"]'); + if (nextLink && nextLink.href) { + window.location.href =
nextLink.href; + event.preventDefault(); + } + break; + case "Escape": + if (!DOCUMENTATION_OPTIONS.ENABLE_SEARCH_SHORTCUTS) break; + Documentation.hideSearchWords(); + event.preventDefault(); + } + } + + // some keyboard layouts may need Shift to get / + switch (event.key) { + case "/": + if (!DOCUMENTATION_OPTIONS.ENABLE_SEARCH_SHORTCUTS) break; + Documentation.focusSearchBar(); + event.preventDefault(); + } + }); + }, +}; + +// quick alias for translations +const _ = Documentation.gettext; + +_ready(Documentation.init); diff --git a/_static/documentation_options.js b/_static/documentation_options.js new file mode 100644 index 0000000000..a750e4d5ee --- /dev/null +++ b/_static/documentation_options.js @@ -0,0 +1,14 @@ +var DOCUMENTATION_OPTIONS = { + URL_ROOT: document.getElementById("documentation_options").getAttribute('data-url_root'), + VERSION: '', + LANGUAGE: 'en', + COLLAPSE_INDEX: false, + BUILDER: 'html', + FILE_SUFFIX: '.html', + LINK_SUFFIX: '.html', + HAS_SOURCE: true, + SOURCELINK_SUFFIX: '.txt', + NAVIGATION_WITH_KEYS: false, + SHOW_SEARCH_SUMMARY: true, + ENABLE_SEARCH_SHORTCUTS: false, +}; \ No newline at end of file diff --git a/_static/favicon.png b/_static/favicon.png new file mode 100644 index 0000000000..505eff5074 Binary files /dev/null and b/_static/favicon.png differ diff --git a/_static/file.png b/_static/file.png new file mode 100644 index 0000000000..a858a410e4 Binary files /dev/null and b/_static/file.png differ diff --git a/_static/fonts/Inconsolata-Bold.ttf b/_static/fonts/Inconsolata-Bold.ttf new file mode 100644 index 0000000000..809c1f5828 Binary files /dev/null and b/_static/fonts/Inconsolata-Bold.ttf differ diff --git a/_static/fonts/Inconsolata-Regular.ttf b/_static/fonts/Inconsolata-Regular.ttf new file mode 100644 index 0000000000..fc981ce7ad Binary files /dev/null and b/_static/fonts/Inconsolata-Regular.ttf differ diff --git a/_static/fonts/Inconsolata.ttf b/_static/fonts/Inconsolata.ttf new file mode 100644 index 0000000000..4b8a36d249 Binary files /dev/null and b/_static/fonts/Inconsolata.ttf differ diff --git a/_static/fonts/Lato-Bold.ttf b/_static/fonts/Lato-Bold.ttf new file mode 100644 index 0000000000..1d23c7066e Binary files /dev/null and b/_static/fonts/Lato-Bold.ttf differ diff --git a/_static/fonts/Lato-Regular.ttf b/_static/fonts/Lato-Regular.ttf new file mode 100644 index 0000000000..0f3d0f837d Binary files /dev/null and b/_static/fonts/Lato-Regular.ttf differ diff --git a/_static/fonts/Lato/lato-bold.eot b/_static/fonts/Lato/lato-bold.eot new file mode 100644 index 0000000000..3361183a41 Binary files /dev/null and b/_static/fonts/Lato/lato-bold.eot differ diff --git a/_static/fonts/Lato/lato-bold.ttf b/_static/fonts/Lato/lato-bold.ttf new file mode 100644 index 0000000000..29f691d5ed Binary files /dev/null and b/_static/fonts/Lato/lato-bold.ttf differ diff --git a/_static/fonts/Lato/lato-bold.woff b/_static/fonts/Lato/lato-bold.woff new file mode 100644 index 0000000000..c6dff51f06 Binary files /dev/null and b/_static/fonts/Lato/lato-bold.woff differ diff --git a/_static/fonts/Lato/lato-bold.woff2 b/_static/fonts/Lato/lato-bold.woff2 new file mode 100644 index 0000000000..bb195043cf Binary files /dev/null and b/_static/fonts/Lato/lato-bold.woff2 differ diff --git a/_static/fonts/Lato/lato-bolditalic.eot b/_static/fonts/Lato/lato-bolditalic.eot new file mode 100644 index 0000000000..3d4154936b Binary files /dev/null and b/_static/fonts/Lato/lato-bolditalic.eot differ diff --git a/_static/fonts/Lato/lato-bolditalic.ttf 
b/_static/fonts/Lato/lato-bolditalic.ttf new file mode 100644 index 0000000000..f402040b3e Binary files /dev/null and b/_static/fonts/Lato/lato-bolditalic.ttf differ diff --git a/_static/fonts/Lato/lato-bolditalic.woff b/_static/fonts/Lato/lato-bolditalic.woff new file mode 100644 index 0000000000..88ad05b9ff Binary files /dev/null and b/_static/fonts/Lato/lato-bolditalic.woff differ diff --git a/_static/fonts/Lato/lato-bolditalic.woff2 b/_static/fonts/Lato/lato-bolditalic.woff2 new file mode 100644 index 0000000000..c4e3d804b5 Binary files /dev/null and b/_static/fonts/Lato/lato-bolditalic.woff2 differ diff --git a/_static/fonts/Lato/lato-italic.eot b/_static/fonts/Lato/lato-italic.eot new file mode 100644 index 0000000000..3f826421a1 Binary files /dev/null and b/_static/fonts/Lato/lato-italic.eot differ diff --git a/_static/fonts/Lato/lato-italic.ttf b/_static/fonts/Lato/lato-italic.ttf new file mode 100644 index 0000000000..b4bfc9b24a Binary files /dev/null and b/_static/fonts/Lato/lato-italic.ttf differ diff --git a/_static/fonts/Lato/lato-italic.woff b/_static/fonts/Lato/lato-italic.woff new file mode 100644 index 0000000000..76114bc033 Binary files /dev/null and b/_static/fonts/Lato/lato-italic.woff differ diff --git a/_static/fonts/Lato/lato-italic.woff2 b/_static/fonts/Lato/lato-italic.woff2 new file mode 100644 index 0000000000..3404f37e2e Binary files /dev/null and b/_static/fonts/Lato/lato-italic.woff2 differ diff --git a/_static/fonts/Lato/lato-regular.eot b/_static/fonts/Lato/lato-regular.eot new file mode 100644 index 0000000000..11e3f2a5f0 Binary files /dev/null and b/_static/fonts/Lato/lato-regular.eot differ diff --git a/_static/fonts/Lato/lato-regular.ttf b/_static/fonts/Lato/lato-regular.ttf new file mode 100644 index 0000000000..74decd9ebb Binary files /dev/null and b/_static/fonts/Lato/lato-regular.ttf differ diff --git a/_static/fonts/Lato/lato-regular.woff b/_static/fonts/Lato/lato-regular.woff new file mode 100644 index 0000000000..ae1307ff5f Binary files /dev/null and b/_static/fonts/Lato/lato-regular.woff differ diff --git a/_static/fonts/Lato/lato-regular.woff2 b/_static/fonts/Lato/lato-regular.woff2 new file mode 100644 index 0000000000..3bf9843328 Binary files /dev/null and b/_static/fonts/Lato/lato-regular.woff2 differ diff --git a/_static/fonts/RobotoSlab-Bold.ttf b/_static/fonts/RobotoSlab-Bold.ttf new file mode 100644 index 0000000000..df5d1df273 Binary files /dev/null and b/_static/fonts/RobotoSlab-Bold.ttf differ diff --git a/_static/fonts/RobotoSlab-Regular.ttf b/_static/fonts/RobotoSlab-Regular.ttf new file mode 100644 index 0000000000..eb52a79073 Binary files /dev/null and b/_static/fonts/RobotoSlab-Regular.ttf differ diff --git a/_static/fonts/RobotoSlab/roboto-slab-v7-bold.eot b/_static/fonts/RobotoSlab/roboto-slab-v7-bold.eot new file mode 100644 index 0000000000..79dc8efed3 Binary files /dev/null and b/_static/fonts/RobotoSlab/roboto-slab-v7-bold.eot differ diff --git a/_static/fonts/RobotoSlab/roboto-slab-v7-bold.ttf b/_static/fonts/RobotoSlab/roboto-slab-v7-bold.ttf new file mode 100644 index 0000000000..df5d1df273 Binary files /dev/null and b/_static/fonts/RobotoSlab/roboto-slab-v7-bold.ttf differ diff --git a/_static/fonts/RobotoSlab/roboto-slab-v7-bold.woff b/_static/fonts/RobotoSlab/roboto-slab-v7-bold.woff new file mode 100644 index 0000000000..6cb6000018 Binary files /dev/null and b/_static/fonts/RobotoSlab/roboto-slab-v7-bold.woff differ diff --git a/_static/fonts/RobotoSlab/roboto-slab-v7-bold.woff2 
b/_static/fonts/RobotoSlab/roboto-slab-v7-bold.woff2 new file mode 100644 index 0000000000..7059e23142 Binary files /dev/null and b/_static/fonts/RobotoSlab/roboto-slab-v7-bold.woff2 differ diff --git a/_static/fonts/RobotoSlab/roboto-slab-v7-regular.eot b/_static/fonts/RobotoSlab/roboto-slab-v7-regular.eot new file mode 100644 index 0000000000..2f7ca78a1e Binary files /dev/null and b/_static/fonts/RobotoSlab/roboto-slab-v7-regular.eot differ diff --git a/_static/fonts/RobotoSlab/roboto-slab-v7-regular.ttf b/_static/fonts/RobotoSlab/roboto-slab-v7-regular.ttf new file mode 100644 index 0000000000..eb52a79073 Binary files /dev/null and b/_static/fonts/RobotoSlab/roboto-slab-v7-regular.ttf differ diff --git a/_static/fonts/RobotoSlab/roboto-slab-v7-regular.woff b/_static/fonts/RobotoSlab/roboto-slab-v7-regular.woff new file mode 100644 index 0000000000..f815f63f99 Binary files /dev/null and b/_static/fonts/RobotoSlab/roboto-slab-v7-regular.woff differ diff --git a/_static/fonts/RobotoSlab/roboto-slab-v7-regular.woff2 b/_static/fonts/RobotoSlab/roboto-slab-v7-regular.woff2 new file mode 100644 index 0000000000..f2c76e5bda Binary files /dev/null and b/_static/fonts/RobotoSlab/roboto-slab-v7-regular.woff2 differ diff --git a/_static/fonts/fontawesome-webfont.eot b/_static/fonts/fontawesome-webfont.eot new file mode 100644 index 0000000000..e9f60ca953 Binary files /dev/null and b/_static/fonts/fontawesome-webfont.eot differ diff --git a/_static/fonts/fontawesome-webfont.svg b/_static/fonts/fontawesome-webfont.svg new file mode 100644 index 0000000000..855c845e53 --- /dev/null +++ b/_static/fonts/fontawesome-webfont.svg @@ -0,0 +1,2671 @@ + + + + +Created by FontForge 20120731 at Mon Oct 24 17:37:40 2016 + By ,,, +Copyright Dave Gandy 2016. All rights reserved. 
[note: the fontawesome-webfont.svg hunk above lost its XML markup in extraction; its roughly 2,650 remaining added lines of glyph outline definitions survived only as bare "+" markers and are omitted here, leaving just the FontForge metadata text readable] diff --git a/_static/fonts/fontawesome-webfont.ttf b/_static/fonts/fontawesome-webfont.ttf new file mode 100644 index 0000000000..35acda2fa1 Binary files /dev/null and b/_static/fonts/fontawesome-webfont.ttf differ diff --git a/_static/fonts/fontawesome-webfont.woff b/_static/fonts/fontawesome-webfont.woff new file mode 100644 index 0000000000..400014a4b0 Binary files /dev/null and b/_static/fonts/fontawesome-webfont.woff differ diff --git a/_static/fonts/fontawesome-webfont.woff2 b/_static/fonts/fontawesome-webfont.woff2 new file mode 100644 index 0000000000..4d13fc6040 Binary files /dev/null and b/_static/fonts/fontawesome-webfont.woff2 differ diff --git a/_static/jquery-3.6.0.js b/_static/jquery-3.6.0.js new file mode 100644 index 0000000000..fc6c299b73 --- /dev/null +++ b/_static/jquery-3.6.0.js @@ -0,0 +1,10881 @@ +/*! + * jQuery JavaScript Library v3.6.0 + * https://jquery.com/ + * + * Includes Sizzle.js + * https://sizzlejs.com/ + * + * Copyright OpenJS Foundation and other contributors + * Released under the MIT license + * https://jquery.org/license + * + * Date: 2021-03-02T17:08Z + */ +( function( global, factory ) { + + "use strict"; + + if ( typeof module === "object" && typeof module.exports === "object" ) { + + // For CommonJS and CommonJS-like environments where a proper `window` + // is present, execute the factory and get jQuery. + // For environments that do not have a `window` with a `document` + // (such as Node.js), expose a factory as module.exports. + // This accentuates the need for the creation of a real `window`. + // e.g. var jQuery = require("jquery")(window); + // See ticket #14549 for more info. + module.exports = global.document ? + factory( global, true ) : + function( w ) { + if ( !w.document ) { + throw new Error( "jQuery requires a window with a document" ); + } + return factory( w ); + }; + } else { + factory( global ); + } + +// Pass this if window is not defined yet +} )( typeof window !== "undefined" ?
window : this, function( window, noGlobal ) { + +// Edge <= 12 - 13+, Firefox <=18 - 45+, IE 10 - 11, Safari 5.1 - 9+, iOS 6 - 9.1 +// throw exceptions when non-strict code (e.g., ASP.NET 4.5) accesses strict mode +// arguments.callee.caller (trac-13335). But as of jQuery 3.0 (2016), strict mode should be common +// enough that all such attempts are guarded in a try block. +"use strict"; + +var arr = []; + +var getProto = Object.getPrototypeOf; + +var slice = arr.slice; + +var flat = arr.flat ? function( array ) { + return arr.flat.call( array ); +} : function( array ) { + return arr.concat.apply( [], array ); +}; + + +var push = arr.push; + +var indexOf = arr.indexOf; + +var class2type = {}; + +var toString = class2type.toString; + +var hasOwn = class2type.hasOwnProperty; + +var fnToString = hasOwn.toString; + +var ObjectFunctionString = fnToString.call( Object ); + +var support = {}; + +var isFunction = function isFunction( obj ) { + + // Support: Chrome <=57, Firefox <=52 + // In some browsers, typeof returns "function" for HTML <object> elements + // (i.e., `typeof document.createElement( "object" ) === "function"`). + // We don't want to classify *any* DOM node as a function. + // Support: QtWeb <=3.8.5, WebKit <=534.34, wkhtmltopdf tool <=0.12.5 + // Plus for old WebKit, typeof returns "function" for HTML collections + // (e.g., `typeof document.getElementsByTagName("div") === "function"`). (gh-4756) + return typeof obj === "function" && typeof obj.nodeType !== "number" && + typeof obj.item !== "function"; + }; + + +var isWindow = function isWindow( obj ) { + return obj != null && obj === obj.window; + }; + + +var document = window.document; + + + + var preservedScriptAttributes = { + type: true, + src: true, + nonce: true, + noModule: true + }; + + function DOMEval( code, node, doc ) { + doc = doc || document; + + var i, val, + script = doc.createElement( "script" ); + + script.text = code; + if ( node ) { + for ( i in preservedScriptAttributes ) { + + // Support: Firefox 64+, Edge 18+ + // Some browsers don't support the "nonce" property on scripts. + // On the other hand, just using `getAttribute` is not enough as + // the `nonce` attribute is reset to an empty string whenever it + // becomes browsing-context connected. + // See https://github.com/whatwg/html/issues/2369 + // See https://html.spec.whatwg.org/#nonce-attributes + // The `node.getAttribute` check was added for the sake of + // `jQuery.globalEval` so that it can fake a nonce-containing node + // via an object. + val = node[ i ] || node.getAttribute && node.getAttribute( i ); + if ( val ) { + script.setAttribute( i, val ); + } + } + } + doc.head.appendChild( script ).parentNode.removeChild( script ); + } + + +function toType( obj ) { + if ( obj == null ) { + return obj + ""; + } + + // Support: Android <=2.3 only (functionish RegExp) + return typeof obj === "object" || typeof obj === "function" ?
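+ // Objects and functions: report the tag recorded in the class2type map
+ // (populated further below from Object.prototype.toString results),
+ // falling back to "object" for unmapped tags; primitives use plain typeof.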
+ class2type[ toString.call( obj ) ] || "object" : + typeof obj; +} +/* global Symbol */ +// Defining this global in .eslintrc.json would create a danger of using the global +// unguarded in another place, it seems safer to define global only for this module + + + +var + version = "3.6.0", + + // Define a local copy of jQuery + jQuery = function( selector, context ) { + + // The jQuery object is actually just the init constructor 'enhanced' + // Need init if jQuery is called (just allow error to be thrown if not included) + return new jQuery.fn.init( selector, context ); + }; + +jQuery.fn = jQuery.prototype = { + + // The current version of jQuery being used + jquery: version, + + constructor: jQuery, + + // The default length of a jQuery object is 0 + length: 0, + + toArray: function() { + return slice.call( this ); + }, + + // Get the Nth element in the matched element set OR + // Get the whole matched element set as a clean array + get: function( num ) { + + // Return all the elements in a clean array + if ( num == null ) { + return slice.call( this ); + } + + // Return just the one element from the set + return num < 0 ? this[ num + this.length ] : this[ num ]; + }, + + // Take an array of elements and push it onto the stack + // (returning the new matched element set) + pushStack: function( elems ) { + + // Build a new jQuery matched element set + var ret = jQuery.merge( this.constructor(), elems ); + + // Add the old object onto the stack (as a reference) + ret.prevObject = this; + + // Return the newly-formed element set + return ret; + }, + + // Execute a callback for every element in the matched set. + each: function( callback ) { + return jQuery.each( this, callback ); + }, + + map: function( callback ) { + return this.pushStack( jQuery.map( this, function( elem, i ) { + return callback.call( elem, i, elem ); + } ) ); + }, + + slice: function() { + return this.pushStack( slice.apply( this, arguments ) ); + }, + + first: function() { + return this.eq( 0 ); + }, + + last: function() { + return this.eq( -1 ); + }, + + even: function() { + return this.pushStack( jQuery.grep( this, function( _elem, i ) { + return ( i + 1 ) % 2; + } ) ); + }, + + odd: function() { + return this.pushStack( jQuery.grep( this, function( _elem, i ) { + return i % 2; + } ) ); + }, + + eq: function( i ) { + var len = this.length, + j = +i + ( i < 0 ? len : 0 ); + return this.pushStack( j >= 0 && j < len ? [ this[ j ] ] : [] ); + }, + + end: function() { + return this.prevObject || this.constructor(); + }, + + // For internal use only. + // Behaves like an Array's method, not like a jQuery method. 
+ push: push, + sort: arr.sort, + splice: arr.splice +}; + +jQuery.extend = jQuery.fn.extend = function() { + var options, name, src, copy, copyIsArray, clone, + target = arguments[ 0 ] || {}, + i = 1, + length = arguments.length, + deep = false; + + // Handle a deep copy situation + if ( typeof target === "boolean" ) { + deep = target; + + // Skip the boolean and the target + target = arguments[ i ] || {}; + i++; + } + + // Handle case when target is a string or something (possible in deep copy) + if ( typeof target !== "object" && !isFunction( target ) ) { + target = {}; + } + + // Extend jQuery itself if only one argument is passed + if ( i === length ) { + target = this; + i--; + } + + for ( ; i < length; i++ ) { + + // Only deal with non-null/undefined values + if ( ( options = arguments[ i ] ) != null ) { + + // Extend the base object + for ( name in options ) { + copy = options[ name ]; + + // Prevent Object.prototype pollution + // Prevent never-ending loop + if ( name === "__proto__" || target === copy ) { + continue; + } + + // Recurse if we're merging plain objects or arrays + if ( deep && copy && ( jQuery.isPlainObject( copy ) || + ( copyIsArray = Array.isArray( copy ) ) ) ) { + src = target[ name ]; + + // Ensure proper type for the source value + if ( copyIsArray && !Array.isArray( src ) ) { + clone = []; + } else if ( !copyIsArray && !jQuery.isPlainObject( src ) ) { + clone = {}; + } else { + clone = src; + } + copyIsArray = false; + + // Never move original objects, clone them + target[ name ] = jQuery.extend( deep, clone, copy ); + + // Don't bring in undefined values + } else if ( copy !== undefined ) { + target[ name ] = copy; + } + } + } + } + + // Return the modified object + return target; +}; + +jQuery.extend( { + + // Unique for each copy of jQuery on the page + expando: "jQuery" + ( version + Math.random() ).replace( /\D/g, "" ), + + // Assume jQuery is ready without the ready module + isReady: true, + + error: function( msg ) { + throw new Error( msg ); + }, + + noop: function() {}, + + isPlainObject: function( obj ) { + var proto, Ctor; + + // Detect obvious negatives + // Use toString instead of jQuery.type to catch host objects + if ( !obj || toString.call( obj ) !== "[object Object]" ) { + return false; + } + + proto = getProto( obj ); + + // Objects with no prototype (e.g., `Object.create( null )`) are plain + if ( !proto ) { + return true; + } + + // Objects with prototype are plain iff they were constructed by a global Object function + Ctor = hasOwn.call( proto, "constructor" ) && proto.constructor; + return typeof Ctor === "function" && fnToString.call( Ctor ) === ObjectFunctionString; + }, + + isEmptyObject: function( obj ) { + var name; + + for ( name in obj ) { + return false; + } + return true; + }, + + // Evaluates a script in a provided context; falls back to the global one + // if not specified. 
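+ // e.g. jQuery.globalEval( "var evaluatedGlobally = true;" ) defines a real
+ // global, because DOMEval() above runs the code through a transient script
+ // element rather than eval().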
+ globalEval: function( code, options, doc ) { + DOMEval( code, { nonce: options && options.nonce }, doc ); + }, + + each: function( obj, callback ) { + var length, i = 0; + + if ( isArrayLike( obj ) ) { + length = obj.length; + for ( ; i < length; i++ ) { + if ( callback.call( obj[ i ], i, obj[ i ] ) === false ) { + break; + } + } + } else { + for ( i in obj ) { + if ( callback.call( obj[ i ], i, obj[ i ] ) === false ) { + break; + } + } + } + + return obj; + }, + + // results is for internal usage only + makeArray: function( arr, results ) { + var ret = results || []; + + if ( arr != null ) { + if ( isArrayLike( Object( arr ) ) ) { + jQuery.merge( ret, + typeof arr === "string" ? + [ arr ] : arr + ); + } else { + push.call( ret, arr ); + } + } + + return ret; + }, + + inArray: function( elem, arr, i ) { + return arr == null ? -1 : indexOf.call( arr, elem, i ); + }, + + // Support: Android <=4.0 only, PhantomJS 1 only + // push.apply(_, arraylike) throws on ancient WebKit + merge: function( first, second ) { + var len = +second.length, + j = 0, + i = first.length; + + for ( ; j < len; j++ ) { + first[ i++ ] = second[ j ]; + } + + first.length = i; + + return first; + }, + + grep: function( elems, callback, invert ) { + var callbackInverse, + matches = [], + i = 0, + length = elems.length, + callbackExpect = !invert; + + // Go through the array, only saving the items + // that pass the validator function + for ( ; i < length; i++ ) { + callbackInverse = !callback( elems[ i ], i ); + if ( callbackInverse !== callbackExpect ) { + matches.push( elems[ i ] ); + } + } + + return matches; + }, + + // arg is for internal usage only + map: function( elems, callback, arg ) { + var length, value, + i = 0, + ret = []; + + // Go through the array, translating each of the items to their new values + if ( isArrayLike( elems ) ) { + length = elems.length; + for ( ; i < length; i++ ) { + value = callback( elems[ i ], i, arg ); + + if ( value != null ) { + ret.push( value ); + } + } + + // Go through every key on the object, + } else { + for ( i in elems ) { + value = callback( elems[ i ], i, arg ); + + if ( value != null ) { + ret.push( value ); + } + } + } + + // Flatten any nested arrays + return flat( ret ); + }, + + // A global GUID counter for objects + guid: 1, + + // jQuery.support is not used in Core but other projects attach their + // properties to it so it needs to exist. + support: support +} ); + +if ( typeof Symbol === "function" ) { + jQuery.fn[ Symbol.iterator ] = arr[ Symbol.iterator ]; +} + +// Populate the class2type map +jQuery.each( "Boolean Number String Function Array Date RegExp Object Error Symbol".split( " " ), + function( _i, name ) { + class2type[ "[object " + name + "]" ] = name.toLowerCase(); + } ); + +function isArrayLike( obj ) { + + // Support: real iOS 8.2 only (not reproducible in simulator) + // `in` check used to prevent JIT error (gh-2145) + // hasOwn isn't used here due to false negatives + // regarding Nodelist length in IE + var length = !!obj && "length" in obj && obj.length, + type = toType( obj ); + + if ( isFunction( obj ) || isWindow( obj ) ) { + return false; + } + + return type === "array" || length === 0 || + typeof length === "number" && length > 0 && ( length - 1 ) in obj; +} +var Sizzle = +/*! 
+ * Sizzle CSS Selector Engine v2.3.6 + * https://sizzlejs.com/ + * + * Copyright JS Foundation and other contributors + * Released under the MIT license + * https://js.foundation/ + * + * Date: 2021-02-16 + */ +( function( window ) { +var i, + support, + Expr, + getText, + isXML, + tokenize, + compile, + select, + outermostContext, + sortInput, + hasDuplicate, + + // Local document vars + setDocument, + document, + docElem, + documentIsHTML, + rbuggyQSA, + rbuggyMatches, + matches, + contains, + + // Instance-specific data + expando = "sizzle" + 1 * new Date(), + preferredDoc = window.document, + dirruns = 0, + done = 0, + classCache = createCache(), + tokenCache = createCache(), + compilerCache = createCache(), + nonnativeSelectorCache = createCache(), + sortOrder = function( a, b ) { + if ( a === b ) { + hasDuplicate = true; + } + return 0; + }, + + // Instance methods + hasOwn = ( {} ).hasOwnProperty, + arr = [], + pop = arr.pop, + pushNative = arr.push, + push = arr.push, + slice = arr.slice, + + // Use a stripped-down indexOf as it's faster than native + // https://jsperf.com/thor-indexof-vs-for/5 + indexOf = function( list, elem ) { + var i = 0, + len = list.length; + for ( ; i < len; i++ ) { + if ( list[ i ] === elem ) { + return i; + } + } + return -1; + }, + + booleans = "checked|selected|async|autofocus|autoplay|controls|defer|disabled|hidden|" + + "ismap|loop|multiple|open|readonly|required|scoped", + + // Regular expressions + + // http://www.w3.org/TR/css3-selectors/#whitespace + whitespace = "[\\x20\\t\\r\\n\\f]", + + // https://www.w3.org/TR/css-syntax-3/#ident-token-diagram + identifier = "(?:\\\\[\\da-fA-F]{1,6}" + whitespace + + "?|\\\\[^\\r\\n\\f]|[\\w-]|[^\0-\\x7f])+", + + // Attribute selectors: http://www.w3.org/TR/selectors/#attribute-selectors + attributes = "\\[" + whitespace + "*(" + identifier + ")(?:" + whitespace + + + // Operator (capture 2) + "*([*^$|!~]?=)" + whitespace + + + // "Attribute values must be CSS identifiers [capture 5] + // or strings [capture 3 or capture 4]" + "*(?:'((?:\\\\.|[^\\\\'])*)'|\"((?:\\\\.|[^\\\\\"])*)\"|(" + identifier + "))|)" + + whitespace + "*\\]", + + pseudos = ":(" + identifier + ")(?:\\((" + + + // To reduce the number of selectors needing tokenize in the preFilter, prefer arguments: + // 1. quoted (capture 3; capture 4 or capture 5) + "('((?:\\\\.|[^\\\\'])*)'|\"((?:\\\\.|[^\\\\\"])*)\")|" + + + // 2. simple (capture 6) + "((?:\\\\.|[^\\\\()[\\]]|" + attributes + ")*)|" + + + // 3. 
anything else (capture 2) + ".*" + + ")\\)|)", + + // Leading and non-escaped trailing whitespace, capturing some non-whitespace characters preceding the latter + rwhitespace = new RegExp( whitespace + "+", "g" ), + rtrim = new RegExp( "^" + whitespace + "+|((?:^|[^\\\\])(?:\\\\.)*)" + + whitespace + "+$", "g" ), + + rcomma = new RegExp( "^" + whitespace + "*," + whitespace + "*" ), + rcombinators = new RegExp( "^" + whitespace + "*([>+~]|" + whitespace + ")" + whitespace + + "*" ), + rdescend = new RegExp( whitespace + "|>" ), + + rpseudo = new RegExp( pseudos ), + ridentifier = new RegExp( "^" + identifier + "$" ), + + matchExpr = { + "ID": new RegExp( "^#(" + identifier + ")" ), + "CLASS": new RegExp( "^\\.(" + identifier + ")" ), + "TAG": new RegExp( "^(" + identifier + "|[*])" ), + "ATTR": new RegExp( "^" + attributes ), + "PSEUDO": new RegExp( "^" + pseudos ), + "CHILD": new RegExp( "^:(only|first|last|nth|nth-last)-(child|of-type)(?:\\(" + + whitespace + "*(even|odd|(([+-]|)(\\d*)n|)" + whitespace + "*(?:([+-]|)" + + whitespace + "*(\\d+)|))" + whitespace + "*\\)|)", "i" ), + "bool": new RegExp( "^(?:" + booleans + ")$", "i" ), + + // For use in libraries implementing .is() + // We use this for POS matching in `select` + "needsContext": new RegExp( "^" + whitespace + + "*[>+~]|:(even|odd|eq|gt|lt|nth|first|last)(?:\\(" + whitespace + + "*((?:-\\d)?\\d*)" + whitespace + "*\\)|)(?=[^-]|$)", "i" ) + }, + + rhtml = /HTML$/i, + rinputs = /^(?:input|select|textarea|button)$/i, + rheader = /^h\d$/i, + + rnative = /^[^{]+\{\s*\[native \w/, + + // Easily-parseable/retrievable ID or TAG or CLASS selectors + rquickExpr = /^(?:#([\w-]+)|(\w+)|\.([\w-]+))$/, + + rsibling = /[+~]/, + + // CSS escapes + // http://www.w3.org/TR/CSS21/syndata.html#escaped-characters + runescape = new RegExp( "\\\\[\\da-fA-F]{1,6}" + whitespace + "?|\\\\([^\\r\\n\\f])", "g" ), + funescape = function( escape, nonHex ) { + var high = "0x" + escape.slice( 1 ) - 0x10000; + + return nonHex ? + + // Strip the backslash prefix from a non-hex escape sequence + nonHex : + + // Replace a hexadecimal escape sequence with the encoded Unicode code point + // Support: IE <=11+ + // For values outside the Basic Multilingual Plane (BMP), manually construct a + // surrogate pair + high < 0 ? 
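+ // high is the code point minus 0x10000, so a negative value means the
+ // escape is still inside the BMP and a single code unit suffices: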
+ String.fromCharCode( high + 0x10000 ) : + String.fromCharCode( high >> 10 | 0xD800, high & 0x3FF | 0xDC00 ); + }, + + // CSS string/identifier serialization + // https://drafts.csswg.org/cssom/#common-serializing-idioms + rcssescape = /([\0-\x1f\x7f]|^-?\d)|^-$|[^\0-\x1f\x7f-\uFFFF\w-]/g, + fcssescape = function( ch, asCodePoint ) { + if ( asCodePoint ) { + + // U+0000 NULL becomes U+FFFD REPLACEMENT CHARACTER + if ( ch === "\0" ) { + return "\uFFFD"; + } + + // Control characters and (dependent upon position) numbers get escaped as code points + return ch.slice( 0, -1 ) + "\\" + + ch.charCodeAt( ch.length - 1 ).toString( 16 ) + " "; + } + + // Other potentially-special ASCII characters get backslash-escaped + return "\\" + ch; + }, + + // Used for iframes + // See setDocument() + // Removing the function wrapper causes a "Permission Denied" + // error in IE + unloadHandler = function() { + setDocument(); + }, + + inDisabledFieldset = addCombinator( + function( elem ) { + return elem.disabled === true && elem.nodeName.toLowerCase() === "fieldset"; + }, + { dir: "parentNode", next: "legend" } + ); + +// Optimize for push.apply( _, NodeList ) +try { + push.apply( + ( arr = slice.call( preferredDoc.childNodes ) ), + preferredDoc.childNodes + ); + + // Support: Android<4.0 + // Detect silently failing push.apply + // eslint-disable-next-line no-unused-expressions + arr[ preferredDoc.childNodes.length ].nodeType; +} catch ( e ) { + push = { apply: arr.length ? + + // Leverage slice if possible + function( target, els ) { + pushNative.apply( target, slice.call( els ) ); + } : + + // Support: IE<9 + // Otherwise append directly + function( target, els ) { + var j = target.length, + i = 0; + + // Can't trust NodeList.length + while ( ( target[ j++ ] = els[ i++ ] ) ) {} + target.length = j - 1; + } + }; +} + +function Sizzle( selector, context, results, seed ) { + var m, i, elem, nid, match, groups, newSelector, + newContext = context && context.ownerDocument, + + // nodeType defaults to 9, since context defaults to document + nodeType = context ? 
context.nodeType : 9; + + results = results || []; + + // Return early from calls with invalid selector or context + if ( typeof selector !== "string" || !selector || + nodeType !== 1 && nodeType !== 9 && nodeType !== 11 ) { + + return results; + } + + // Try to shortcut find operations (as opposed to filters) in HTML documents + if ( !seed ) { + setDocument( context ); + context = context || document; + + if ( documentIsHTML ) { + + // If the selector is sufficiently simple, try using a "get*By*" DOM method + // (excepting DocumentFragment context, where the methods don't exist) + if ( nodeType !== 11 && ( match = rquickExpr.exec( selector ) ) ) { + + // ID selector + if ( ( m = match[ 1 ] ) ) { + + // Document context + if ( nodeType === 9 ) { + if ( ( elem = context.getElementById( m ) ) ) { + + // Support: IE, Opera, Webkit + // TODO: identify versions + // getElementById can match elements by name instead of ID + if ( elem.id === m ) { + results.push( elem ); + return results; + } + } else { + return results; + } + + // Element context + } else { + + // Support: IE, Opera, Webkit + // TODO: identify versions + // getElementById can match elements by name instead of ID + if ( newContext && ( elem = newContext.getElementById( m ) ) && + contains( context, elem ) && + elem.id === m ) { + + results.push( elem ); + return results; + } + } + + // Type selector + } else if ( match[ 2 ] ) { + push.apply( results, context.getElementsByTagName( selector ) ); + return results; + + // Class selector + } else if ( ( m = match[ 3 ] ) && support.getElementsByClassName && + context.getElementsByClassName ) { + + push.apply( results, context.getElementsByClassName( m ) ); + return results; + } + } + + // Take advantage of querySelectorAll + if ( support.qsa && + !nonnativeSelectorCache[ selector + " " ] && + ( !rbuggyQSA || !rbuggyQSA.test( selector ) ) && + + // Support: IE 8 only + // Exclude object elements + ( nodeType !== 1 || context.nodeName.toLowerCase() !== "object" ) ) { + + newSelector = selector; + newContext = context; + + // qSA considers elements outside a scoping root when evaluating child or + // descendant combinators, which is not what we want. + // In such cases, we work around the behavior by prefixing every selector in the + // list with an ID selector referencing the scope context. + // The technique has to be used as well when a leading combinator is used + // as such selectors are not recognized by querySelectorAll. + // Thanks to Andrew Dupont for this technique. + if ( nodeType === 1 && + ( rdescend.test( selector ) || rcombinators.test( selector ) ) ) { + + // Expand context for sibling selectors + newContext = rsibling.test( selector ) && testContext( context.parentNode ) || + context; + + // We can use :scope instead of the ID hack if the browser + // supports it & if we're not changing the context. + if ( newContext !== context || !support.scope ) { + + // Capture the context ID, setting it first if necessary + if ( ( nid = context.getAttribute( "id" ) ) ) { + nid = nid.replace( rcssescape, fcssescape ); + } else { + context.setAttribute( "id", ( nid = expando ) ); + } + } + + // Prefix every selector in the list + groups = tokenize( selector ); + i = groups.length; + while ( i-- ) { + groups[ i ] = ( nid ? 
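+ // Use the id captured (and CSS-escaped) or assigned above; when nid is
+ // unset, :scope is supported and the context element was left unchanged: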
"#" + nid : ":scope" ) + " " + + toSelector( groups[ i ] ); + } + newSelector = groups.join( "," ); + } + + try { + push.apply( results, + newContext.querySelectorAll( newSelector ) + ); + return results; + } catch ( qsaError ) { + nonnativeSelectorCache( selector, true ); + } finally { + if ( nid === expando ) { + context.removeAttribute( "id" ); + } + } + } + } + } + + // All others + return select( selector.replace( rtrim, "$1" ), context, results, seed ); +} + +/** + * Create key-value caches of limited size + * @returns {function(string, object)} Returns the Object data after storing it on itself with + * property name the (space-suffixed) string and (if the cache is larger than Expr.cacheLength) + * deleting the oldest entry + */ +function createCache() { + var keys = []; + + function cache( key, value ) { + + // Use (key + " ") to avoid collision with native prototype properties (see Issue #157) + if ( keys.push( key + " " ) > Expr.cacheLength ) { + + // Only keep the most recent entries + delete cache[ keys.shift() ]; + } + return ( cache[ key + " " ] = value ); + } + return cache; +} + +/** + * Mark a function for special use by Sizzle + * @param {Function} fn The function to mark + */ +function markFunction( fn ) { + fn[ expando ] = true; + return fn; +} + +/** + * Support testing using an element + * @param {Function} fn Passed the created element and returns a boolean result + */ +function assert( fn ) { + var el = document.createElement( "fieldset" ); + + try { + return !!fn( el ); + } catch ( e ) { + return false; + } finally { + + // Remove from its parent by default + if ( el.parentNode ) { + el.parentNode.removeChild( el ); + } + + // release memory in IE + el = null; + } +} + +/** + * Adds the same handler for all of the specified attrs + * @param {String} attrs Pipe-separated list of attributes + * @param {Function} handler The method that will be applied + */ +function addHandle( attrs, handler ) { + var arr = attrs.split( "|" ), + i = arr.length; + + while ( i-- ) { + Expr.attrHandle[ arr[ i ] ] = handler; + } +} + +/** + * Checks document order of two siblings + * @param {Element} a + * @param {Element} b + * @returns {Number} Returns less than 0 if a precedes b, greater than 0 if a follows b + */ +function siblingCheck( a, b ) { + var cur = b && a, + diff = cur && a.nodeType === 1 && b.nodeType === 1 && + a.sourceIndex - b.sourceIndex; + + // Use IE sourceIndex if available on both nodes + if ( diff ) { + return diff; + } + + // Check if b follows a + if ( cur ) { + while ( ( cur = cur.nextSibling ) ) { + if ( cur === b ) { + return -1; + } + } + } + + return a ? 
1 : -1; +} + +/** + * Returns a function to use in pseudos for input types + * @param {String} type + */ +function createInputPseudo( type ) { + return function( elem ) { + var name = elem.nodeName.toLowerCase(); + return name === "input" && elem.type === type; + }; +} + +/** + * Returns a function to use in pseudos for buttons + * @param {String} type + */ +function createButtonPseudo( type ) { + return function( elem ) { + var name = elem.nodeName.toLowerCase(); + return ( name === "input" || name === "button" ) && elem.type === type; + }; +} + +/** + * Returns a function to use in pseudos for :enabled/:disabled + * @param {Boolean} disabled true for :disabled; false for :enabled + */ +function createDisabledPseudo( disabled ) { + + // Known :disabled false positives: fieldset[disabled] > legend:nth-of-type(n+2) :can-disable + return function( elem ) { + + // Only certain elements can match :enabled or :disabled + // https://html.spec.whatwg.org/multipage/scripting.html#selector-enabled + // https://html.spec.whatwg.org/multipage/scripting.html#selector-disabled + if ( "form" in elem ) { + + // Check for inherited disabledness on relevant non-disabled elements: + // * listed form-associated elements in a disabled fieldset + // https://html.spec.whatwg.org/multipage/forms.html#category-listed + // https://html.spec.whatwg.org/multipage/forms.html#concept-fe-disabled + // * option elements in a disabled optgroup + // https://html.spec.whatwg.org/multipage/forms.html#concept-option-disabled + // All such elements have a "form" property. + if ( elem.parentNode && elem.disabled === false ) { + + // Option elements defer to a parent optgroup if present + if ( "label" in elem ) { + if ( "label" in elem.parentNode ) { + return elem.parentNode.disabled === disabled; + } else { + return elem.disabled === disabled; + } + } + + // Support: IE 6 - 11 + // Use the isDisabled shortcut property to check for disabled fieldset ancestors + return elem.isDisabled === disabled || + + // Where there is no isDisabled, check manually + /* jshint -W018 */ + elem.isDisabled !== !disabled && + inDisabledFieldset( elem ) === disabled; + } + + return elem.disabled === disabled; + + // Try to winnow out elements that can't be disabled before trusting the disabled property. + // Some victims get caught in our net (label, legend, menu, track), but it shouldn't + // even exist on them, let alone have a boolean value. 
+ } else if ( "label" in elem ) { + return elem.disabled === disabled; + } + + // Remaining elements are neither :enabled nor :disabled + return false; + }; +} + +/** + * Returns a function to use in pseudos for positionals + * @param {Function} fn + */ +function createPositionalPseudo( fn ) { + return markFunction( function( argument ) { + argument = +argument; + return markFunction( function( seed, matches ) { + var j, + matchIndexes = fn( [], seed.length, argument ), + i = matchIndexes.length; + + // Match elements found at the specified indexes + while ( i-- ) { + if ( seed[ ( j = matchIndexes[ i ] ) ] ) { + seed[ j ] = !( matches[ j ] = seed[ j ] ); + } + } + } ); + } ); +} + +/** + * Checks a node for validity as a Sizzle context + * @param {Element|Object=} context + * @returns {Element|Object|Boolean} The input node if acceptable, otherwise a falsy value + */ +function testContext( context ) { + return context && typeof context.getElementsByTagName !== "undefined" && context; +} + +// Expose support vars for convenience +support = Sizzle.support = {}; + +/** + * Detects XML nodes + * @param {Element|Object} elem An element or a document + * @returns {Boolean} True iff elem is a non-HTML XML node + */ +isXML = Sizzle.isXML = function( elem ) { + var namespace = elem && elem.namespaceURI, + docElem = elem && ( elem.ownerDocument || elem ).documentElement; + + // Support: IE <=8 + // Assume HTML when documentElement doesn't yet exist, such as inside loading iframes + // https://bugs.jquery.com/ticket/4833 + return !rhtml.test( namespace || docElem && docElem.nodeName || "HTML" ); +}; + +/** + * Sets document-related variables once based on the current document + * @param {Element|Object} [doc] An element or document object to use to set the document + * @returns {Object} Returns the current document + */ +setDocument = Sizzle.setDocument = function( node ) { + var hasCompare, subWindow, + doc = node ? node.ownerDocument || node : preferredDoc; + + // Return early if doc is invalid or already selected + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( doc == document || doc.nodeType !== 9 || !doc.documentElement ) { + return document; + } + + // Update global variables + document = doc; + docElem = document.documentElement; + documentIsHTML = !isXML( document ); + + // Support: IE 9 - 11+, Edge 12 - 18+ + // Accessing iframe documents after unload throws "permission denied" errors (jQuery #13936) + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( preferredDoc != document && + ( subWindow = document.defaultView ) && subWindow.top !== subWindow ) { + + // Support: IE 11, Edge + if ( subWindow.addEventListener ) { + subWindow.addEventListener( "unload", unloadHandler, false ); + + // Support: IE 9 - 10 only + } else if ( subWindow.attachEvent ) { + subWindow.attachEvent( "onunload", unloadHandler ); + } + } + + // Support: IE 8 - 11+, Edge 12 - 18+, Chrome <=16 - 25 only, Firefox <=3.6 - 31 only, + // Safari 4 - 5 only, Opera <=11.6 - 12.x only + // IE/Edge & older browsers don't support the :scope pseudo-class. + // Support: Safari 6.0 only + // Safari 6.0 supports :scope but it's an alias of :root there. 
+ support.scope = assert( function( el ) { + docElem.appendChild( el ).appendChild( document.createElement( "div" ) ); + return typeof el.querySelectorAll !== "undefined" && + !el.querySelectorAll( ":scope fieldset div" ).length; + } ); + + /* Attributes + ---------------------------------------------------------------------- */ + + // Support: IE<8 + // Verify that getAttribute really returns attributes and not properties + // (excepting IE8 booleans) + support.attributes = assert( function( el ) { + el.className = "i"; + return !el.getAttribute( "className" ); + } ); + + /* getElement(s)By* + ---------------------------------------------------------------------- */ + + // Check if getElementsByTagName("*") returns only elements + support.getElementsByTagName = assert( function( el ) { + el.appendChild( document.createComment( "" ) ); + return !el.getElementsByTagName( "*" ).length; + } ); + + // Support: IE<9 + support.getElementsByClassName = rnative.test( document.getElementsByClassName ); + + // Support: IE<10 + // Check if getElementById returns elements by name + // The broken getElementById methods don't pick up programmatically-set names, + // so use a roundabout getElementsByName test + support.getById = assert( function( el ) { + docElem.appendChild( el ).id = expando; + return !document.getElementsByName || !document.getElementsByName( expando ).length; + } ); + + // ID filter and find + if ( support.getById ) { + Expr.filter[ "ID" ] = function( id ) { + var attrId = id.replace( runescape, funescape ); + return function( elem ) { + return elem.getAttribute( "id" ) === attrId; + }; + }; + Expr.find[ "ID" ] = function( id, context ) { + if ( typeof context.getElementById !== "undefined" && documentIsHTML ) { + var elem = context.getElementById( id ); + return elem ? [ elem ] : []; + } + }; + } else { + Expr.filter[ "ID" ] = function( id ) { + var attrId = id.replace( runescape, funescape ); + return function( elem ) { + var node = typeof elem.getAttributeNode !== "undefined" && + elem.getAttributeNode( "id" ); + return node && node.value === attrId; + }; + }; + + // Support: IE 6 - 7 only + // getElementById is not reliable as a find shortcut + Expr.find[ "ID" ] = function( id, context ) { + if ( typeof context.getElementById !== "undefined" && documentIsHTML ) { + var node, i, elems, + elem = context.getElementById( id ); + + if ( elem ) { + + // Verify the id attribute + node = elem.getAttributeNode( "id" ); + if ( node && node.value === id ) { + return [ elem ]; + } + + // Fall back on getElementsByName + elems = context.getElementsByName( id ); + i = 0; + while ( ( elem = elems[ i++ ] ) ) { + node = elem.getAttributeNode( "id" ); + if ( node && node.value === id ) { + return [ elem ]; + } + } + } + + return []; + } + }; + } + + // Tag + Expr.find[ "TAG" ] = support.getElementsByTagName ? 
+ function( tag, context ) { + if ( typeof context.getElementsByTagName !== "undefined" ) { + return context.getElementsByTagName( tag ); + + // DocumentFragment nodes don't have gEBTN + } else if ( support.qsa ) { + return context.querySelectorAll( tag ); + } + } : + + function( tag, context ) { + var elem, + tmp = [], + i = 0, + + // By happy coincidence, a (broken) gEBTN appears on DocumentFragment nodes too + results = context.getElementsByTagName( tag ); + + // Filter out possible comments + if ( tag === "*" ) { + while ( ( elem = results[ i++ ] ) ) { + if ( elem.nodeType === 1 ) { + tmp.push( elem ); + } + } + + return tmp; + } + return results; + }; + + // Class + Expr.find[ "CLASS" ] = support.getElementsByClassName && function( className, context ) { + if ( typeof context.getElementsByClassName !== "undefined" && documentIsHTML ) { + return context.getElementsByClassName( className ); + } + }; + + /* QSA/matchesSelector + ---------------------------------------------------------------------- */ + + // QSA and matchesSelector support + + // matchesSelector(:active) reports false when true (IE9/Opera 11.5) + rbuggyMatches = []; + + // qSa(:focus) reports false when true (Chrome 21) + // We allow this because of a bug in IE8/9 that throws an error + // whenever `document.activeElement` is accessed on an iframe + // So, we allow :focus to pass through QSA all the time to avoid the IE error + // See https://bugs.jquery.com/ticket/13378 + rbuggyQSA = []; + + if ( ( support.qsa = rnative.test( document.querySelectorAll ) ) ) { + + // Build QSA regex + // Regex strategy adopted from Diego Perini + assert( function( el ) { + + var input; + + // Select is set to empty string on purpose + // This is to test IE's treatment of not explicitly + // setting a boolean content attribute, + // since its presence should be enough + // https://bugs.jquery.com/ticket/12359 + docElem.appendChild( el ).innerHTML = "<a id='" + expando + "'></a>" + + "<select id='" + expando + "-\r\\' msallowcapture=''>" + + "<option selected=''></option></select>"; + + // Support: IE8, Opera 11-12.16 + // Nothing should be selected when empty strings follow ^= or $= or *= + // The test attribute must be unknown in Opera but "safe" for WinRT + // https://msdn.microsoft.com/en-us/library/ie/hh465388.aspx#attribute_section + if ( el.querySelectorAll( "[msallowcapture^='']" ).length ) { + rbuggyQSA.push( "[*^$]=" + whitespace + "*(?:''|\"\")" ); + } + + // Support: IE8 + // Boolean attributes and "value" are not treated correctly + if ( !el.querySelectorAll( "[selected]" ).length ) { + rbuggyQSA.push( "\\[" + whitespace + "*(?:value|" + booleans + ")" ); + } + + // Support: Chrome<29, Android<4.4, Safari<7.0+, iOS<7.0+, PhantomJS<1.9.8+ + if ( !el.querySelectorAll( "[id~=" + expando + "-]" ).length ) { + rbuggyQSA.push( "~=" ); + } + + // Support: IE 11+, Edge 15 - 18+ + // IE 11/Edge don't find elements on a `[name='']` query in some cases. + // Adding a temporary attribute to the document before the selection works + // around the issue. + // Interestingly, IE 10 & older don't seem to have the issue.
+ input = document.createElement( "input" ); + input.setAttribute( "name", "" ); + el.appendChild( input ); + if ( !el.querySelectorAll( "[name='']" ).length ) { + rbuggyQSA.push( "\\[" + whitespace + "*name" + whitespace + "*=" + + whitespace + "*(?:''|\"\")" ); + } + + // Webkit/Opera - :checked should return selected option elements + // http://www.w3.org/TR/2011/REC-css3-selectors-20110929/#checked + // IE8 throws error here and will not see later tests + if ( !el.querySelectorAll( ":checked" ).length ) { + rbuggyQSA.push( ":checked" ); + } + + // Support: Safari 8+, iOS 8+ + // https://bugs.webkit.org/show_bug.cgi?id=136851 + // In-page `selector#id sibling-combinator selector` fails + if ( !el.querySelectorAll( "a#" + expando + "+*" ).length ) { + rbuggyQSA.push( ".#.+[+~]" ); + } + + // Support: Firefox <=3.6 - 5 only + // Old Firefox doesn't throw on a badly-escaped identifier. + el.querySelectorAll( "\\\f" ); + rbuggyQSA.push( "[\\r\\n\\f]" ); + } ); + + assert( function( el ) { + el.innerHTML = "<a href='' disabled='disabled'></a>" + + "<select disabled='disabled'><option/></select>"; + + // Support: Windows 8 Native Apps + // The type and name attributes are restricted during .innerHTML assignment + var input = document.createElement( "input" ); + input.setAttribute( "type", "hidden" ); + el.appendChild( input ).setAttribute( "name", "D" ); + + // Support: IE8 + // Enforce case-sensitivity of name attribute + if ( el.querySelectorAll( "[name=d]" ).length ) { + rbuggyQSA.push( "name" + whitespace + "*[*^$|!~]?=" ); + } + + // FF 3.5 - :enabled/:disabled and hidden elements (hidden elements are still enabled) + // IE8 throws error here and will not see later tests + if ( el.querySelectorAll( ":enabled" ).length !== 2 ) { + rbuggyQSA.push( ":enabled", ":disabled" ); + } + + // Support: IE9-11+ + // IE's :disabled selector does not pick up the children of disabled fieldsets + docElem.appendChild( el ).disabled = true; + if ( el.querySelectorAll( ":disabled" ).length !== 2 ) { + rbuggyQSA.push( ":enabled", ":disabled" ); + } + + // Support: Opera 10 - 11 only + // Opera 10-11 does not throw on post-comma invalid pseudos + el.querySelectorAll( "*,:x" ); + rbuggyQSA.push( ",.*:" ); + } ); + } + + if ( ( support.matchesSelector = rnative.test( ( matches = docElem.matches || + docElem.webkitMatchesSelector || + docElem.mozMatchesSelector || + docElem.oMatchesSelector || + docElem.msMatchesSelector ) ) ) ) { + + assert( function( el ) { + + // Check to see if it's possible to do matchesSelector + // on a disconnected node (IE 9) + support.disconnectedMatch = matches.call( el, "*" ); + + // This should fail with an exception + // Gecko does not error, returns false instead + matches.call( el, "[s!='']:x" ); + rbuggyMatches.push( "!=", pseudos ); + } ); + } + + rbuggyQSA = rbuggyQSA.length && new RegExp( rbuggyQSA.join( "|" ) ); + rbuggyMatches = rbuggyMatches.length && new RegExp( rbuggyMatches.join( "|" ) ); + + /* Contains + ---------------------------------------------------------------------- */ + hasCompare = rnative.test( docElem.compareDocumentPosition ); + + // Element contains another + // Purposefully self-exclusive + // As in, an element does not contain itself + contains = hasCompare || rnative.test( docElem.contains ) ? + function( a, b ) { + var adown = a.nodeType === 9 ? a.documentElement : a, + bup = b && b.parentNode; + return a === bup || !!( bup && bup.nodeType === 1 && ( + adown.contains ?
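+ // Prefer the native contains() method; otherwise test the
+ // DOCUMENT_POSITION_CONTAINED_BY bit (16) of compareDocumentPosition: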
+ adown.contains( bup ) : + a.compareDocumentPosition && a.compareDocumentPosition( bup ) & 16 + ) ); + } : + function( a, b ) { + if ( b ) { + while ( ( b = b.parentNode ) ) { + if ( b === a ) { + return true; + } + } + } + return false; + }; + + /* Sorting + ---------------------------------------------------------------------- */ + + // Document order sorting + sortOrder = hasCompare ? + function( a, b ) { + + // Flag for duplicate removal + if ( a === b ) { + hasDuplicate = true; + return 0; + } + + // Sort on method existence if only one input has compareDocumentPosition + var compare = !a.compareDocumentPosition - !b.compareDocumentPosition; + if ( compare ) { + return compare; + } + + // Calculate position if both inputs belong to the same document + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + compare = ( a.ownerDocument || a ) == ( b.ownerDocument || b ) ? + a.compareDocumentPosition( b ) : + + // Otherwise we know they are disconnected + 1; + + // Disconnected nodes + if ( compare & 1 || + ( !support.sortDetached && b.compareDocumentPosition( a ) === compare ) ) { + + // Choose the first element that is related to our preferred document + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( a == document || a.ownerDocument == preferredDoc && + contains( preferredDoc, a ) ) { + return -1; + } + + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( b == document || b.ownerDocument == preferredDoc && + contains( preferredDoc, b ) ) { + return 1; + } + + // Maintain original order + return sortInput ? + ( indexOf( sortInput, a ) - indexOf( sortInput, b ) ) : + 0; + } + + return compare & 4 ? -1 : 1; + } : + function( a, b ) { + + // Exit early if the nodes are identical + if ( a === b ) { + hasDuplicate = true; + return 0; + } + + var cur, + i = 0, + aup = a.parentNode, + bup = b.parentNode, + ap = [ a ], + bp = [ b ]; + + // Parentless nodes are either documents or disconnected + if ( !aup || !bup ) { + + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + /* eslint-disable eqeqeq */ + return a == document ? -1 : + b == document ? 1 : + /* eslint-enable eqeqeq */ + aup ? -1 : + bup ? 1 : + sortInput ? + ( indexOf( sortInput, a ) - indexOf( sortInput, b ) ) : + 0; + + // If the nodes are siblings, we can do a quick check + } else if ( aup === bup ) { + return siblingCheck( a, b ); + } + + // Otherwise we need full lists of their ancestors for comparison + cur = a; + while ( ( cur = cur.parentNode ) ) { + ap.unshift( cur ); + } + cur = b; + while ( ( cur = cur.parentNode ) ) { + bp.unshift( cur ); + } + + // Walk down the tree looking for a discrepancy + while ( ap[ i ] === bp[ i ] ) { + i++; + } + + return i ? + + // Do a sibling check if the nodes have a common ancestor + siblingCheck( ap[ i ], bp[ i ] ) : + + // Otherwise nodes in our document sort first + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. 
+ /* eslint-disable eqeqeq */ + ap[ i ] == preferredDoc ? -1 : + bp[ i ] == preferredDoc ? 1 : + /* eslint-enable eqeqeq */ + 0; + }; + + return document; +}; + +Sizzle.matches = function( expr, elements ) { + return Sizzle( expr, null, null, elements ); +}; + +Sizzle.matchesSelector = function( elem, expr ) { + setDocument( elem ); + + if ( support.matchesSelector && documentIsHTML && + !nonnativeSelectorCache[ expr + " " ] && + ( !rbuggyMatches || !rbuggyMatches.test( expr ) ) && + ( !rbuggyQSA || !rbuggyQSA.test( expr ) ) ) { + + try { + var ret = matches.call( elem, expr ); + + // IE 9's matchesSelector returns false on disconnected nodes + if ( ret || support.disconnectedMatch || + + // As well, disconnected nodes are said to be in a document + // fragment in IE 9 + elem.document && elem.document.nodeType !== 11 ) { + return ret; + } + } catch ( e ) { + nonnativeSelectorCache( expr, true ); + } + } + + return Sizzle( expr, document, null, [ elem ] ).length > 0; +}; + +Sizzle.contains = function( context, elem ) { + + // Set document vars if needed + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( ( context.ownerDocument || context ) != document ) { + setDocument( context ); + } + return contains( context, elem ); +}; + +Sizzle.attr = function( elem, name ) { + + // Set document vars if needed + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( ( elem.ownerDocument || elem ) != document ) { + setDocument( elem ); + } + + var fn = Expr.attrHandle[ name.toLowerCase() ], + + // Don't get fooled by Object.prototype properties (jQuery #13807) + val = fn && hasOwn.call( Expr.attrHandle, name.toLowerCase() ) ? + fn( elem, name, !documentIsHTML ) : + undefined; + + return val !== undefined ? + val : + support.attributes || !documentIsHTML ? + elem.getAttribute( name ) : + ( val = elem.getAttributeNode( name ) ) && val.specified ? 
+ val.value : + null; +}; + +Sizzle.escape = function( sel ) { + return ( sel + "" ).replace( rcssescape, fcssescape ); +}; + +Sizzle.error = function( msg ) { + throw new Error( "Syntax error, unrecognized expression: " + msg ); +}; + +/** + * Document sorting and removing duplicates + * @param {ArrayLike} results + */ +Sizzle.uniqueSort = function( results ) { + var elem, + duplicates = [], + j = 0, + i = 0; + + // Unless we *know* we can detect duplicates, assume their presence + hasDuplicate = !support.detectDuplicates; + sortInput = !support.sortStable && results.slice( 0 ); + results.sort( sortOrder ); + + if ( hasDuplicate ) { + while ( ( elem = results[ i++ ] ) ) { + if ( elem === results[ i ] ) { + j = duplicates.push( i ); + } + } + while ( j-- ) { + results.splice( duplicates[ j ], 1 ); + } + } + + // Clear input after sorting to release objects + // See https://github.com/jquery/sizzle/pull/225 + sortInput = null; + + return results; +}; + +/** + * Utility function for retrieving the text value of an array of DOM nodes + * @param {Array|Element} elem + */ +getText = Sizzle.getText = function( elem ) { + var node, + ret = "", + i = 0, + nodeType = elem.nodeType; + + if ( !nodeType ) { + + // If no nodeType, this is expected to be an array + while ( ( node = elem[ i++ ] ) ) { + + // Do not traverse comment nodes + ret += getText( node ); + } + } else if ( nodeType === 1 || nodeType === 9 || nodeType === 11 ) { + + // Use textContent for elements + // innerText usage removed for consistency of new lines (jQuery #11153) + if ( typeof elem.textContent === "string" ) { + return elem.textContent; + } else { + + // Traverse its children + for ( elem = elem.firstChild; elem; elem = elem.nextSibling ) { + ret += getText( elem ); + } + } + } else if ( nodeType === 3 || nodeType === 4 ) { + return elem.nodeValue; + } + + // Do not include comment or processing instruction nodes + + return ret; +}; + +Expr = Sizzle.selectors = { + + // Can be adjusted by the user + cacheLength: 50, + + createPseudo: markFunction, + + match: matchExpr, + + attrHandle: {}, + + find: {}, + + relative: { + ">": { dir: "parentNode", first: true }, + " ": { dir: "parentNode" }, + "+": { dir: "previousSibling", first: true }, + "~": { dir: "previousSibling" } + }, + + preFilter: { + "ATTR": function( match ) { + match[ 1 ] = match[ 1 ].replace( runescape, funescape ); + + // Move the given value to match[3] whether quoted or unquoted + match[ 3 ] = ( match[ 3 ] || match[ 4 ] || + match[ 5 ] || "" ).replace( runescape, funescape ); + + if ( match[ 2 ] === "~=" ) { + match[ 3 ] = " " + match[ 3 ] + " "; + } + + return match.slice( 0, 4 ); + }, + + "CHILD": function( match ) { + + /* matches from matchExpr["CHILD"] + 1 type (only|nth|...) + 2 what (child|of-type) + 3 argument (even|odd|\d*|\d*n([+-]\d+)?|...) + 4 xn-component of xn+y argument ([+-]?\d*n|) + 5 sign of xn-component + 6 x of xn-component + 7 sign of y-component + 8 y of y-component + */ + match[ 1 ] = match[ 1 ].toLowerCase(); + + if ( match[ 1 ].slice( 0, 3 ) === "nth" ) { + + // nth-* requires argument + if ( !match[ 3 ] ) { + Sizzle.error( match[ 0 ] ); + } + + // numeric x and y parameters for Expr.filter.CHILD + // remember that false/true cast respectively to 0/1 + match[ 4 ] = +( match[ 4 ] ? 
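+ // "an+b" form: concatenate the sign (capture 5) with the coefficient
+ // (capture 6, defaulting to 1 for a bare "n") and coerce to a number;
+ // "even" and "odd" instead yield a cycle of 2 (2n and 2n+1):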
+ match[ 5 ] + ( match[ 6 ] || 1 ) : + 2 * ( match[ 3 ] === "even" || match[ 3 ] === "odd" ) ); + match[ 5 ] = +( ( match[ 7 ] + match[ 8 ] ) || match[ 3 ] === "odd" ); + + // other types prohibit arguments + } else if ( match[ 3 ] ) { + Sizzle.error( match[ 0 ] ); + } + + return match; + }, + + "PSEUDO": function( match ) { + var excess, + unquoted = !match[ 6 ] && match[ 2 ]; + + if ( matchExpr[ "CHILD" ].test( match[ 0 ] ) ) { + return null; + } + + // Accept quoted arguments as-is + if ( match[ 3 ] ) { + match[ 2 ] = match[ 4 ] || match[ 5 ] || ""; + + // Strip excess characters from unquoted arguments + } else if ( unquoted && rpseudo.test( unquoted ) && + + // Get excess from tokenize (recursively) + ( excess = tokenize( unquoted, true ) ) && + + // advance to the next closing parenthesis + ( excess = unquoted.indexOf( ")", unquoted.length - excess ) - unquoted.length ) ) { + + // excess is a negative index + match[ 0 ] = match[ 0 ].slice( 0, excess ); + match[ 2 ] = unquoted.slice( 0, excess ); + } + + // Return only captures needed by the pseudo filter method (type and argument) + return match.slice( 0, 3 ); + } + }, + + filter: { + + "TAG": function( nodeNameSelector ) { + var nodeName = nodeNameSelector.replace( runescape, funescape ).toLowerCase(); + return nodeNameSelector === "*" ? + function() { + return true; + } : + function( elem ) { + return elem.nodeName && elem.nodeName.toLowerCase() === nodeName; + }; + }, + + "CLASS": function( className ) { + var pattern = classCache[ className + " " ]; + + return pattern || + ( pattern = new RegExp( "(^|" + whitespace + + ")" + className + "(" + whitespace + "|$)" ) ) && classCache( + className, function( elem ) { + return pattern.test( + typeof elem.className === "string" && elem.className || + typeof elem.getAttribute !== "undefined" && + elem.getAttribute( "class" ) || + "" + ); + } ); + }, + + "ATTR": function( name, operator, check ) { + return function( elem ) { + var result = Sizzle.attr( elem, name ); + + if ( result == null ) { + return operator === "!="; + } + if ( !operator ) { + return true; + } + + result += ""; + + /* eslint-disable max-len */ + + return operator === "=" ? result === check : + operator === "!=" ? result !== check : + operator === "^=" ? check && result.indexOf( check ) === 0 : + operator === "*=" ? check && result.indexOf( check ) > -1 : + operator === "$=" ? check && result.slice( -check.length ) === check : + operator === "~=" ? ( " " + result.replace( rwhitespace, " " ) + " " ).indexOf( check ) > -1 : + operator === "|=" ? result === check || result.slice( 0, check.length + 1 ) === check + "-" : + false; + /* eslint-enable max-len */ + + }; + }, + + "CHILD": function( type, what, _argument, first, last ) { + var simple = type.slice( 0, 3 ) !== "nth", + forward = type.slice( -4 ) !== "last", + ofType = what === "of-type"; + + return first === 1 && last === 0 ? + + // Shortcut for :nth-*(n) + function( elem ) { + return !!elem.parentNode; + } : + + function( elem, _context, xml ) { + var cache, uniqueCache, outerCache, node, nodeIndex, start, + dir = simple !== forward ? "nextSibling" : "previousSibling", + parent = elem.parentNode, + name = ofType && elem.nodeName.toLowerCase(), + useCache = !xml && !ofType, + diff = false; + + if ( parent ) { + + // :(first|last|only)-(child|of-type) + if ( simple ) { + while ( dir ) { + node = elem; + while ( ( node = node[ dir ] ) ) { + if ( ofType ? 
+ node.nodeName.toLowerCase() === name : + node.nodeType === 1 ) { + + return false; + } + } + + // Reverse direction for :only-* (if we haven't yet done so) + start = dir = type === "only" && !start && "nextSibling"; + } + return true; + } + + start = [ forward ? parent.firstChild : parent.lastChild ]; + + // non-xml :nth-child(...) stores cache data on `parent` + if ( forward && useCache ) { + + // Seek `elem` from a previously-cached index + + // ...in a gzip-friendly way + node = parent; + outerCache = node[ expando ] || ( node[ expando ] = {} ); + + // Support: IE <9 only + // Defend against cloned attroperties (jQuery gh-1709) + uniqueCache = outerCache[ node.uniqueID ] || + ( outerCache[ node.uniqueID ] = {} ); + + cache = uniqueCache[ type ] || []; + nodeIndex = cache[ 0 ] === dirruns && cache[ 1 ]; + diff = nodeIndex && cache[ 2 ]; + node = nodeIndex && parent.childNodes[ nodeIndex ]; + + while ( ( node = ++nodeIndex && node && node[ dir ] || + + // Fallback to seeking `elem` from the start + ( diff = nodeIndex = 0 ) || start.pop() ) ) { + + // When found, cache indexes on `parent` and break + if ( node.nodeType === 1 && ++diff && node === elem ) { + uniqueCache[ type ] = [ dirruns, nodeIndex, diff ]; + break; + } + } + + } else { + + // Use previously-cached element index if available + if ( useCache ) { + + // ...in a gzip-friendly way + node = elem; + outerCache = node[ expando ] || ( node[ expando ] = {} ); + + // Support: IE <9 only + // Defend against cloned attroperties (jQuery gh-1709) + uniqueCache = outerCache[ node.uniqueID ] || + ( outerCache[ node.uniqueID ] = {} ); + + cache = uniqueCache[ type ] || []; + nodeIndex = cache[ 0 ] === dirruns && cache[ 1 ]; + diff = nodeIndex; + } + + // xml :nth-child(...) + // or :nth-last-child(...) or :nth(-last)?-of-type(...) + if ( diff === false ) { + + // Use the same loop as above to seek `elem` from the start + while ( ( node = ++nodeIndex && node && node[ dir ] || + ( diff = nodeIndex = 0 ) || start.pop() ) ) { + + if ( ( ofType ? + node.nodeName.toLowerCase() === name : + node.nodeType === 1 ) && + ++diff ) { + + // Cache the index of each encountered element + if ( useCache ) { + outerCache = node[ expando ] || + ( node[ expando ] = {} ); + + // Support: IE <9 only + // Defend against cloned attroperties (jQuery gh-1709) + uniqueCache = outerCache[ node.uniqueID ] || + ( outerCache[ node.uniqueID ] = {} ); + + uniqueCache[ type ] = [ dirruns, diff ]; + } + + if ( node === elem ) { + break; + } + } + } + } + } + + // Incorporate the offset, then check against cycle size + diff -= last; + return diff === first || ( diff % first === 0 && diff / first >= 0 ); + } + }; + }, + + "PSEUDO": function( pseudo, argument ) { + + // pseudo-class names are case-insensitive + // http://www.w3.org/TR/selectors/#pseudo-classes + // Prioritize by case sensitivity in case custom pseudos are added with uppercase letters + // Remember that setFilters inherits from pseudos + var args, + fn = Expr.pseudos[ pseudo ] || Expr.setFilters[ pseudo.toLowerCase() ] || + Sizzle.error( "unsupported pseudo: " + pseudo ); + + // The user may use createPseudo to indicate that + // arguments are needed to create the filter function + // just as Sizzle does + if ( fn[ expando ] ) { + return fn( argument ); + } + + // But maintain support for old signatures + if ( fn.length > 1 ) { + args = [ pseudo, pseudo, "", argument ]; + return Expr.setFilters.hasOwnProperty( pseudo.toLowerCase() ) ? 
+ markFunction( function( seed, matches ) { + var idx, + matched = fn( seed, argument ), + i = matched.length; + while ( i-- ) { + idx = indexOf( seed, matched[ i ] ); + seed[ idx ] = !( matches[ idx ] = matched[ i ] ); + } + } ) : + function( elem ) { + return fn( elem, 0, args ); + }; + } + + return fn; + } + }, + + pseudos: { + + // Potentially complex pseudos + "not": markFunction( function( selector ) { + + // Trim the selector passed to compile + // to avoid treating leading and trailing + // spaces as combinators + var input = [], + results = [], + matcher = compile( selector.replace( rtrim, "$1" ) ); + + return matcher[ expando ] ? + markFunction( function( seed, matches, _context, xml ) { + var elem, + unmatched = matcher( seed, null, xml, [] ), + i = seed.length; + + // Match elements unmatched by `matcher` + while ( i-- ) { + if ( ( elem = unmatched[ i ] ) ) { + seed[ i ] = !( matches[ i ] = elem ); + } + } + } ) : + function( elem, _context, xml ) { + input[ 0 ] = elem; + matcher( input, null, xml, results ); + + // Don't keep the element (issue #299) + input[ 0 ] = null; + return !results.pop(); + }; + } ), + + "has": markFunction( function( selector ) { + return function( elem ) { + return Sizzle( selector, elem ).length > 0; + }; + } ), + + "contains": markFunction( function( text ) { + text = text.replace( runescape, funescape ); + return function( elem ) { + return ( elem.textContent || getText( elem ) ).indexOf( text ) > -1; + }; + } ), + + // "Whether an element is represented by a :lang() selector + // is based solely on the element's language value + // being equal to the identifier C, + // or beginning with the identifier C immediately followed by "-". + // The matching of C against the element's language value is performed case-insensitively. + // The identifier C does not have to be a valid language name." + // http://www.w3.org/TR/selectors/#lang-pseudo + "lang": markFunction( function( lang ) { + + // lang value must be a valid identifier + if ( !ridentifier.test( lang || "" ) ) { + Sizzle.error( "unsupported lang: " + lang ); + } + lang = lang.replace( runescape, funescape ).toLowerCase(); + return function( elem ) { + var elemLang; + do { + if ( ( elemLang = documentIsHTML ? 
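+ // HTML documents expose the (inherited) lang property directly; XML
+ // documents are queried for their xml:lang / lang attributes instead: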
+ elem.lang : + elem.getAttribute( "xml:lang" ) || elem.getAttribute( "lang" ) ) ) { + + elemLang = elemLang.toLowerCase(); + return elemLang === lang || elemLang.indexOf( lang + "-" ) === 0; + } + } while ( ( elem = elem.parentNode ) && elem.nodeType === 1 ); + return false; + }; + } ), + + // Miscellaneous + "target": function( elem ) { + var hash = window.location && window.location.hash; + return hash && hash.slice( 1 ) === elem.id; + }, + + "root": function( elem ) { + return elem === docElem; + }, + + "focus": function( elem ) { + return elem === document.activeElement && + ( !document.hasFocus || document.hasFocus() ) && + !!( elem.type || elem.href || ~elem.tabIndex ); + }, + + // Boolean properties + "enabled": createDisabledPseudo( false ), + "disabled": createDisabledPseudo( true ), + + "checked": function( elem ) { + + // In CSS3, :checked should return both checked and selected elements + // http://www.w3.org/TR/2011/REC-css3-selectors-20110929/#checked + var nodeName = elem.nodeName.toLowerCase(); + return ( nodeName === "input" && !!elem.checked ) || + ( nodeName === "option" && !!elem.selected ); + }, + + "selected": function( elem ) { + + // Accessing this property makes selected-by-default + // options in Safari work properly + if ( elem.parentNode ) { + // eslint-disable-next-line no-unused-expressions + elem.parentNode.selectedIndex; + } + + return elem.selected === true; + }, + + // Contents + "empty": function( elem ) { + + // http://www.w3.org/TR/selectors/#empty-pseudo + // :empty is negated by element (1) or content nodes (text: 3; cdata: 4; entity ref: 5), + // but not by others (comment: 8; processing instruction: 7; etc.) + // nodeType < 6 works because attributes (2) do not appear as children + for ( elem = elem.firstChild; elem; elem = elem.nextSibling ) { + if ( elem.nodeType < 6 ) { + return false; + } + } + return true; + }, + + "parent": function( elem ) { + return !Expr.pseudos[ "empty" ]( elem ); + }, + + // Element/input types + "header": function( elem ) { + return rheader.test( elem.nodeName ); + }, + + "input": function( elem ) { + return rinputs.test( elem.nodeName ); + }, + + "button": function( elem ) { + var name = elem.nodeName.toLowerCase(); + return name === "input" && elem.type === "button" || name === "button"; + }, + + "text": function( elem ) { + var attr; + return elem.nodeName.toLowerCase() === "input" && + elem.type === "text" && + + // Support: IE<8 + // New HTML5 attribute values (e.g., "search") appear with elem.type === "text" + ( ( attr = elem.getAttribute( "type" ) ) == null || + attr.toLowerCase() === "text" ); + }, + + // Position-in-collection + "first": createPositionalPseudo( function() { + return [ 0 ]; + } ), + + "last": createPositionalPseudo( function( _matchIndexes, length ) { + return [ length - 1 ]; + } ), + + "eq": createPositionalPseudo( function( _matchIndexes, length, argument ) { + return [ argument < 0 ? argument + length : argument ]; + } ), + + "even": createPositionalPseudo( function( matchIndexes, length ) { + var i = 0; + for ( ; i < length; i += 2 ) { + matchIndexes.push( i ); + } + return matchIndexes; + } ), + + "odd": createPositionalPseudo( function( matchIndexes, length ) { + var i = 1; + for ( ; i < length; i += 2 ) { + matchIndexes.push( i ); + } + return matchIndexes; + } ), + + "lt": createPositionalPseudo( function( matchIndexes, length, argument ) { + var i = argument < 0 ? + argument + length : + argument > length ? 
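+
+ // For illustration (assumed usage): these positional pseudos index into
+ // the current match, so jQuery( "li:lt(3)" ) keeps the first three items
+ // and jQuery( "li:eq(-1)" ) the last one; the .slice() and .eq() methods
+ // are the equivalent, non-selector spelling.
+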
+ length : + argument; + for ( ; --i >= 0; ) { + matchIndexes.push( i ); + } + return matchIndexes; + } ), + + "gt": createPositionalPseudo( function( matchIndexes, length, argument ) { + var i = argument < 0 ? argument + length : argument; + for ( ; ++i < length; ) { + matchIndexes.push( i ); + } + return matchIndexes; + } ) + } +}; + +Expr.pseudos[ "nth" ] = Expr.pseudos[ "eq" ]; + +// Add button/input type pseudos +for ( i in { radio: true, checkbox: true, file: true, password: true, image: true } ) { + Expr.pseudos[ i ] = createInputPseudo( i ); +} +for ( i in { submit: true, reset: true } ) { + Expr.pseudos[ i ] = createButtonPseudo( i ); +} + +// Easy API for creating new setFilters +function setFilters() {} +setFilters.prototype = Expr.filters = Expr.pseudos; +Expr.setFilters = new setFilters(); + +tokenize = Sizzle.tokenize = function( selector, parseOnly ) { + var matched, match, tokens, type, + soFar, groups, preFilters, + cached = tokenCache[ selector + " " ]; + + if ( cached ) { + return parseOnly ? 0 : cached.slice( 0 ); + } + + soFar = selector; + groups = []; + preFilters = Expr.preFilter; + + while ( soFar ) { + + // Comma and first run + if ( !matched || ( match = rcomma.exec( soFar ) ) ) { + if ( match ) { + + // Don't consume trailing commas as valid + soFar = soFar.slice( match[ 0 ].length ) || soFar; + } + groups.push( ( tokens = [] ) ); + } + + matched = false; + + // Combinators + if ( ( match = rcombinators.exec( soFar ) ) ) { + matched = match.shift(); + tokens.push( { + value: matched, + + // Cast descendant combinators to space + type: match[ 0 ].replace( rtrim, " " ) + } ); + soFar = soFar.slice( matched.length ); + } + + // Filters + for ( type in Expr.filter ) { + if ( ( match = matchExpr[ type ].exec( soFar ) ) && ( !preFilters[ type ] || + ( match = preFilters[ type ]( match ) ) ) ) { + matched = match.shift(); + tokens.push( { + value: matched, + type: type, + matches: match + } ); + soFar = soFar.slice( matched.length ); + } + } + + if ( !matched ) { + break; + } + } + + // Return the length of the invalid excess + // if we're just parsing + // Otherwise, throw an error or return tokens + return parseOnly ? + soFar.length : + soFar ? + Sizzle.error( selector ) : + + // Cache the tokens + tokenCache( selector, groups ).slice( 0 ); +}; + +function toSelector( tokens ) { + var i = 0, + len = tokens.length, + selector = ""; + for ( ; i < len; i++ ) { + selector += tokens[ i ].value; + } + return selector; +} + +function addCombinator( matcher, combinator, base ) { + var dir = combinator.dir, + skip = combinator.next, + key = skip || dir, + checkNonElements = base && key === "parentNode", + doneName = done++; + + return combinator.first ? 
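+
+ // For illustration, a sketch of the tokens this combinator logic consumes
+ // (shape inferred from tokenize above; fields are value/type and, for
+ // filter tokens, matches). Sizzle.tokenize( "div > p" ) yields one group like:
+ //
+ //     [ { value: "div", type: "TAG", matches: [ ... ] },
+ //       { value: " > ", type: ">" },
+ //       { value: "p", type: "TAG", matches: [ ... ] } ]
+ //
+ // matcherFromTokens below walks such a group, wrapping element matchers
+ // with addCombinator for each combinator token.
+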
+ + // Check against closest ancestor/preceding element + function( elem, context, xml ) { + while ( ( elem = elem[ dir ] ) ) { + if ( elem.nodeType === 1 || checkNonElements ) { + return matcher( elem, context, xml ); + } + } + return false; + } : + + // Check against all ancestor/preceding elements + function( elem, context, xml ) { + var oldCache, uniqueCache, outerCache, + newCache = [ dirruns, doneName ]; + + // We can't set arbitrary data on XML nodes, so they don't benefit from combinator caching + if ( xml ) { + while ( ( elem = elem[ dir ] ) ) { + if ( elem.nodeType === 1 || checkNonElements ) { + if ( matcher( elem, context, xml ) ) { + return true; + } + } + } + } else { + while ( ( elem = elem[ dir ] ) ) { + if ( elem.nodeType === 1 || checkNonElements ) { + outerCache = elem[ expando ] || ( elem[ expando ] = {} ); + + // Support: IE <9 only + // Defend against cloned attroperties (jQuery gh-1709) + uniqueCache = outerCache[ elem.uniqueID ] || + ( outerCache[ elem.uniqueID ] = {} ); + + if ( skip && skip === elem.nodeName.toLowerCase() ) { + elem = elem[ dir ] || elem; + } else if ( ( oldCache = uniqueCache[ key ] ) && + oldCache[ 0 ] === dirruns && oldCache[ 1 ] === doneName ) { + + // Assign to newCache so results back-propagate to previous elements + return ( newCache[ 2 ] = oldCache[ 2 ] ); + } else { + + // Reuse newcache so results back-propagate to previous elements + uniqueCache[ key ] = newCache; + + // A match means we're done; a fail means we have to keep checking + if ( ( newCache[ 2 ] = matcher( elem, context, xml ) ) ) { + return true; + } + } + } + } + } + return false; + }; +} + +function elementMatcher( matchers ) { + return matchers.length > 1 ? + function( elem, context, xml ) { + var i = matchers.length; + while ( i-- ) { + if ( !matchers[ i ]( elem, context, xml ) ) { + return false; + } + } + return true; + } : + matchers[ 0 ]; +} + +function multipleContexts( selector, contexts, results ) { + var i = 0, + len = contexts.length; + for ( ; i < len; i++ ) { + Sizzle( selector, contexts[ i ], results ); + } + return results; +} + +function condense( unmatched, map, filter, context, xml ) { + var elem, + newUnmatched = [], + i = 0, + len = unmatched.length, + mapped = map != null; + + for ( ; i < len; i++ ) { + if ( ( elem = unmatched[ i ] ) ) { + if ( !filter || filter( elem, context, xml ) ) { + newUnmatched.push( elem ); + if ( mapped ) { + map.push( i ); + } + } + } + } + + return newUnmatched; +} + +function setMatcher( preFilter, selector, matcher, postFilter, postFinder, postSelector ) { + if ( postFilter && !postFilter[ expando ] ) { + postFilter = setMatcher( postFilter ); + } + if ( postFinder && !postFinder[ expando ] ) { + postFinder = setMatcher( postFinder, postSelector ); + } + return markFunction( function( seed, results, context, xml ) { + var temp, i, elem, + preMap = [], + postMap = [], + preexisting = results.length, + + // Get initial elements from seed or context + elems = seed || multipleContexts( + selector || "*", + context.nodeType ? [ context ] : context, + [] + ), + + // Prefilter to get matcher input, preserving a map for seed-results synchronization + matcherIn = preFilter && ( seed || !selector ) ? + condense( elems, preMap, preFilter, context, xml ) : + elems, + + matcherOut = matcher ? + + // If we have a postFinder, or filtered seed, or non-seed postFilter or preexisting results, + postFinder || ( seed ? preFilter : preexisting || postFilter ) ? 
+ + // ...intermediate processing is necessary + [] : + + // ...otherwise use results directly + results : + matcherIn; + + // Find primary matches + if ( matcher ) { + matcher( matcherIn, matcherOut, context, xml ); + } + + // Apply postFilter + if ( postFilter ) { + temp = condense( matcherOut, postMap ); + postFilter( temp, [], context, xml ); + + // Un-match failing elements by moving them back to matcherIn + i = temp.length; + while ( i-- ) { + if ( ( elem = temp[ i ] ) ) { + matcherOut[ postMap[ i ] ] = !( matcherIn[ postMap[ i ] ] = elem ); + } + } + } + + if ( seed ) { + if ( postFinder || preFilter ) { + if ( postFinder ) { + + // Get the final matcherOut by condensing this intermediate into postFinder contexts + temp = []; + i = matcherOut.length; + while ( i-- ) { + if ( ( elem = matcherOut[ i ] ) ) { + + // Restore matcherIn since elem is not yet a final match + temp.push( ( matcherIn[ i ] = elem ) ); + } + } + postFinder( null, ( matcherOut = [] ), temp, xml ); + } + + // Move matched elements from seed to results to keep them synchronized + i = matcherOut.length; + while ( i-- ) { + if ( ( elem = matcherOut[ i ] ) && + ( temp = postFinder ? indexOf( seed, elem ) : preMap[ i ] ) > -1 ) { + + seed[ temp ] = !( results[ temp ] = elem ); + } + } + } + + // Add elements to results, through postFinder if defined + } else { + matcherOut = condense( + matcherOut === results ? + matcherOut.splice( preexisting, matcherOut.length ) : + matcherOut + ); + if ( postFinder ) { + postFinder( null, results, matcherOut, xml ); + } else { + push.apply( results, matcherOut ); + } + } + } ); +} + +function matcherFromTokens( tokens ) { + var checkContext, matcher, j, + len = tokens.length, + leadingRelative = Expr.relative[ tokens[ 0 ].type ], + implicitRelative = leadingRelative || Expr.relative[ " " ], + i = leadingRelative ? 1 : 0, + + // The foundational matcher ensures that elements are reachable from top-level context(s) + matchContext = addCombinator( function( elem ) { + return elem === checkContext; + }, implicitRelative, true ), + matchAnyContext = addCombinator( function( elem ) { + return indexOf( checkContext, elem ) > -1; + }, implicitRelative, true ), + matchers = [ function( elem, context, xml ) { + var ret = ( !leadingRelative && ( xml || context !== outermostContext ) ) || ( + ( checkContext = context ).nodeType ? + matchContext( elem, context, xml ) : + matchAnyContext( elem, context, xml ) ); + + // Avoid hanging onto element (issue #299) + checkContext = null; + return ret; + } ]; + + for ( ; i < len; i++ ) { + if ( ( matcher = Expr.relative[ tokens[ i ].type ] ) ) { + matchers = [ addCombinator( elementMatcher( matchers ), matcher ) ]; + } else { + matcher = Expr.filter[ tokens[ i ].type ].apply( null, tokens[ i ].matches ); + + // Return special upon seeing a positional matcher + if ( matcher[ expando ] ) { + + // Find the next relative operator (if any) for proper handling + j = ++i; + for ( ; j < len; j++ ) { + if ( Expr.relative[ tokens[ j ].type ] ) { + break; + } + } + return setMatcher( + i > 1 && elementMatcher( matchers ), + i > 1 && toSelector( + + // If the preceding token was a descendant combinator, insert an implicit any-element `*` + tokens + .slice( 0, i - 1 ) + .concat( { value: tokens[ i - 2 ].type === " " ? 
"*" : "" } ) + ).replace( rtrim, "$1" ), + matcher, + i < j && matcherFromTokens( tokens.slice( i, j ) ), + j < len && matcherFromTokens( ( tokens = tokens.slice( j ) ) ), + j < len && toSelector( tokens ) + ); + } + matchers.push( matcher ); + } + } + + return elementMatcher( matchers ); +} + +function matcherFromGroupMatchers( elementMatchers, setMatchers ) { + var bySet = setMatchers.length > 0, + byElement = elementMatchers.length > 0, + superMatcher = function( seed, context, xml, results, outermost ) { + var elem, j, matcher, + matchedCount = 0, + i = "0", + unmatched = seed && [], + setMatched = [], + contextBackup = outermostContext, + + // We must always have either seed elements or outermost context + elems = seed || byElement && Expr.find[ "TAG" ]( "*", outermost ), + + // Use integer dirruns iff this is the outermost matcher + dirrunsUnique = ( dirruns += contextBackup == null ? 1 : Math.random() || 0.1 ), + len = elems.length; + + if ( outermost ) { + + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + outermostContext = context == document || context || outermost; + } + + // Add elements passing elementMatchers directly to results + // Support: IE<9, Safari + // Tolerate NodeList properties (IE: "length"; Safari: ) matching elements by id + for ( ; i !== len && ( elem = elems[ i ] ) != null; i++ ) { + if ( byElement && elem ) { + j = 0; + + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( !context && elem.ownerDocument != document ) { + setDocument( elem ); + xml = !documentIsHTML; + } + while ( ( matcher = elementMatchers[ j++ ] ) ) { + if ( matcher( elem, context || document, xml ) ) { + results.push( elem ); + break; + } + } + if ( outermost ) { + dirruns = dirrunsUnique; + } + } + + // Track unmatched elements for set filters + if ( bySet ) { + + // They will have gone through all possible matchers + if ( ( elem = !matcher && elem ) ) { + matchedCount--; + } + + // Lengthen the array for every element, matched or not + if ( seed ) { + unmatched.push( elem ); + } + } + } + + // `i` is now the count of elements visited above, and adding it to `matchedCount` + // makes the latter nonnegative. + matchedCount += i; + + // Apply set filters to unmatched elements + // NOTE: This can be skipped if there are no unmatched elements (i.e., `matchedCount` + // equals `i`), unless we didn't visit _any_ elements in the above loop because we have + // no element matchers and no seed. + // Incrementing an initially-string "0" `i` allows `i` to remain a string only in that + // case, which will result in a "00" `matchedCount` that differs from `i` but is also + // numerically zero. 
+ if ( bySet && i !== matchedCount ) { + j = 0; + while ( ( matcher = setMatchers[ j++ ] ) ) { + matcher( unmatched, setMatched, context, xml ); + } + + if ( seed ) { + + // Reintegrate element matches to eliminate the need for sorting + if ( matchedCount > 0 ) { + while ( i-- ) { + if ( !( unmatched[ i ] || setMatched[ i ] ) ) { + setMatched[ i ] = pop.call( results ); + } + } + } + + // Discard index placeholder values to get only actual matches + setMatched = condense( setMatched ); + } + + // Add matches to results + push.apply( results, setMatched ); + + // Seedless set matches succeeding multiple successful matchers stipulate sorting + if ( outermost && !seed && setMatched.length > 0 && + ( matchedCount + setMatchers.length ) > 1 ) { + + Sizzle.uniqueSort( results ); + } + } + + // Override manipulation of globals by nested matchers + if ( outermost ) { + dirruns = dirrunsUnique; + outermostContext = contextBackup; + } + + return unmatched; + }; + + return bySet ? + markFunction( superMatcher ) : + superMatcher; +} + +compile = Sizzle.compile = function( selector, match /* Internal Use Only */ ) { + var i, + setMatchers = [], + elementMatchers = [], + cached = compilerCache[ selector + " " ]; + + if ( !cached ) { + + // Generate a function of recursive functions that can be used to check each element + if ( !match ) { + match = tokenize( selector ); + } + i = match.length; + while ( i-- ) { + cached = matcherFromTokens( match[ i ] ); + if ( cached[ expando ] ) { + setMatchers.push( cached ); + } else { + elementMatchers.push( cached ); + } + } + + // Cache the compiled function + cached = compilerCache( + selector, + matcherFromGroupMatchers( elementMatchers, setMatchers ) + ); + + // Save selector and tokenization + cached.selector = selector; + } + return cached; +}; + +/** + * A low-level selection function that works with Sizzle's compiled + * selector functions + * @param {String|Function} selector A selector or a pre-compiled + * selector function built with Sizzle.compile + * @param {Element} context + * @param {Array} [results] + * @param {Array} [seed] A set of elements to match against + */ +select = Sizzle.select = function( selector, context, results, seed ) { + var i, tokens, token, type, find, + compiled = typeof selector === "function" && selector, + match = !seed && tokenize( ( selector = compiled.selector || selector ) ); + + results = results || []; + + // Try to minimize operations if there is only one selector in the list and no seed + // (the latter of which guarantees us context) + if ( match.length === 1 ) { + + // Reduce context if the leading compound selector is an ID + tokens = match[ 0 ] = match[ 0 ].slice( 0 ); + if ( tokens.length > 2 && ( token = tokens[ 0 ] ).type === "ID" && + context.nodeType === 9 && documentIsHTML && Expr.relative[ tokens[ 1 ].type ] ) { + + context = ( Expr.find[ "ID" ]( token.matches[ 0 ] + .replace( runescape, funescape ), context ) || [] )[ 0 ]; + if ( !context ) { + return results; + + // Precompiled matchers will still verify ancestry, so step up a level + } else if ( compiled ) { + context = context.parentNode; + } + + selector = selector.slice( tokens.shift().value.length ); + } + + // Fetch a seed set for right-to-left matching + i = matchExpr[ "needsContext" ].test( selector ) ? 
0 : tokens.length;
+ while ( i-- ) {
+ token = tokens[ i ];
+
+ // Abort if we hit a combinator
+ if ( Expr.relative[ ( type = token.type ) ] ) {
+ break;
+ }
+ if ( ( find = Expr.find[ type ] ) ) {
+
+ // Search, expanding context for leading sibling combinators
+ if ( ( seed = find(
+ token.matches[ 0 ].replace( runescape, funescape ),
+ rsibling.test( tokens[ 0 ].type ) && testContext( context.parentNode ) ||
+ context
+ ) ) ) {
+
+ // If seed is empty or no tokens remain, we can return early
+ tokens.splice( i, 1 );
+ selector = seed.length && toSelector( tokens );
+ if ( !selector ) {
+ push.apply( results, seed );
+ return results;
+ }
+
+ break;
+ }
+ }
+ }
+ }
+
+ // Compile and execute a filtering function if one is not provided
+ // Provide `match` to avoid retokenization if we modified the selector above
+ ( compiled || compile( selector, match ) )(
+ seed,
+ context,
+ !documentIsHTML,
+ results,
+ !context || rsibling.test( selector ) && testContext( context.parentNode ) || context
+ );
+ return results;
+};
+
+// One-time assignments
+
+// Sort stability
+support.sortStable = expando.split( "" ).sort( sortOrder ).join( "" ) === expando;
+
+// Support: Chrome 14-35+
+// Always assume duplicates if they aren't passed to the comparison function
+support.detectDuplicates = !!hasDuplicate;
+
+// Initialize against the default document
+setDocument();
+
+// Support: Webkit<537.32 - Safari 6.0.3/Chrome 25 (fixed in Chrome 27)
+// Detached nodes confoundingly follow *each other*
+support.sortDetached = assert( function( el ) {
+
+ // Should return 1, but returns 4 (following)
+ return el.compareDocumentPosition( document.createElement( "fieldset" ) ) & 1;
+} );
+
+// Support: IE<8
+// Prevent attribute/property "interpolation"
+// https://msdn.microsoft.com/en-us/library/ms536429%28VS.85%29.aspx
+if ( !assert( function( el ) {
+ el.innerHTML = "<a href='#'></a>";
+ return el.firstChild.getAttribute( "href" ) === "#";
+} ) ) {
+ addHandle( "type|href|height|width", function( elem, name, isXML ) {
+ if ( !isXML ) {
+ return elem.getAttribute( name, name.toLowerCase() === "type" ? 1 : 2 );
+ }
+ } );
+}
+
+// Support: IE<9
+// Use defaultValue in place of getAttribute("value")
+if ( !support.attributes || !assert( function( el ) {
+ el.innerHTML = "<input/>";
+ el.firstChild.setAttribute( "value", "" );
+ return el.firstChild.getAttribute( "value" ) === "";
+} ) ) {
+ addHandle( "value", function( elem, _name, isXML ) {
+ if ( !isXML && elem.nodeName.toLowerCase() === "input" ) {
+ return elem.defaultValue;
+ }
+ } );
+}
+
+// Support: IE<9
+// Use getAttributeNode to fetch booleans when getAttribute lies
+if ( !assert( function( el ) {
+ return el.getAttribute( "disabled" ) == null;
+} ) ) {
+ addHandle( booleans, function( elem, name, isXML ) {
+ var val;
+ if ( !isXML ) {
+ return elem[ name ] === true ? name.toLowerCase() :
+ ( val = elem.getAttributeNode( name ) ) && val.specified ?
+ val.value : + null; + } + } ); +} + +return Sizzle; + +} )( window ); + + + +jQuery.find = Sizzle; +jQuery.expr = Sizzle.selectors; + +// Deprecated +jQuery.expr[ ":" ] = jQuery.expr.pseudos; +jQuery.uniqueSort = jQuery.unique = Sizzle.uniqueSort; +jQuery.text = Sizzle.getText; +jQuery.isXMLDoc = Sizzle.isXML; +jQuery.contains = Sizzle.contains; +jQuery.escapeSelector = Sizzle.escape; + + + + +var dir = function( elem, dir, until ) { + var matched = [], + truncate = until !== undefined; + + while ( ( elem = elem[ dir ] ) && elem.nodeType !== 9 ) { + if ( elem.nodeType === 1 ) { + if ( truncate && jQuery( elem ).is( until ) ) { + break; + } + matched.push( elem ); + } + } + return matched; +}; + + +var siblings = function( n, elem ) { + var matched = []; + + for ( ; n; n = n.nextSibling ) { + if ( n.nodeType === 1 && n !== elem ) { + matched.push( n ); + } + } + + return matched; +}; + + +var rneedsContext = jQuery.expr.match.needsContext; + + + +function nodeName( elem, name ) { + + return elem.nodeName && elem.nodeName.toLowerCase() === name.toLowerCase(); + +} +var rsingleTag = ( /^<([a-z][^\/\0>:\x20\t\r\n\f]*)[\x20\t\r\n\f]*\/?>(?:<\/\1>|)$/i ); + + + +// Implement the identical functionality for filter and not +function winnow( elements, qualifier, not ) { + if ( isFunction( qualifier ) ) { + return jQuery.grep( elements, function( elem, i ) { + return !!qualifier.call( elem, i, elem ) !== not; + } ); + } + + // Single element + if ( qualifier.nodeType ) { + return jQuery.grep( elements, function( elem ) { + return ( elem === qualifier ) !== not; + } ); + } + + // Arraylike of elements (jQuery, arguments, Array) + if ( typeof qualifier !== "string" ) { + return jQuery.grep( elements, function( elem ) { + return ( indexOf.call( qualifier, elem ) > -1 ) !== not; + } ); + } + + // Filtered directly for both simple and complex selectors + return jQuery.filter( qualifier, elements, not ); +} + +jQuery.filter = function( expr, elems, not ) { + var elem = elems[ 0 ]; + + if ( not ) { + expr = ":not(" + expr + ")"; + } + + if ( elems.length === 1 && elem.nodeType === 1 ) { + return jQuery.find.matchesSelector( elem, expr ) ? [ elem ] : []; + } + + return jQuery.find.matches( expr, jQuery.grep( elems, function( elem ) { + return elem.nodeType === 1; + } ) ); +}; + +jQuery.fn.extend( { + find: function( selector ) { + var i, ret, + len = this.length, + self = this; + + if ( typeof selector !== "string" ) { + return this.pushStack( jQuery( selector ).filter( function() { + for ( i = 0; i < len; i++ ) { + if ( jQuery.contains( self[ i ], this ) ) { + return true; + } + } + } ) ); + } + + ret = this.pushStack( [] ); + + for ( i = 0; i < len; i++ ) { + jQuery.find( selector, self[ i ], ret ); + } + + return len > 1 ? jQuery.uniqueSort( ret ) : ret; + }, + filter: function( selector ) { + return this.pushStack( winnow( this, selector || [], false ) ); + }, + not: function( selector ) { + return this.pushStack( winnow( this, selector || [], true ) ); + }, + is: function( selector ) { + return !!winnow( + this, + + // If this is a positional/relative selector, check membership in the returned set + // so $("p:first").is("p:last") won't return true for a doc with two "p". + typeof selector === "string" && rneedsContext.test( selector ) ? 
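+
+ // For illustration of the winnow()-backed API above (assumed usage):
+ //
+ //     jQuery( "li" ).filter( ".active" )       // keep matches
+ //     jQuery( "li" ).not( ".active" )          // drop matches
+ //     jQuery( "li" ).is( function( i, el ) {   // functions get ( index, element )
+ //         return el.id === "target";
+ //     } )
+ //
+ // Positional selectors take the jQuery( selector ) branch below, so with
+ // two paragraphs jQuery( "p:first" ).is( "p:last" ) is false, as noted above.
+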
+ jQuery( selector ) :
+ selector || [],
+ false
+ ).length;
+ }
+} );
+
+
+// Initialize a jQuery object
+
+
+// A central reference to the root jQuery(document)
+var rootjQuery,
+
+ // A simple way to check for HTML strings
+ // Prioritize #id over <tag> to avoid XSS via location.hash (#9521)
+ // Strict HTML recognition (#11290: must start with <)
+ // Shortcut simple #id case for speed
+ rquickExpr = /^(?:\s*(<[\w\W]+>)[^>]*|#([\w-]+))$/,
+
+ init = jQuery.fn.init = function( selector, context, root ) {
+ var match, elem;
+
+ // HANDLE: $(""), $(null), $(undefined), $(false)
+ if ( !selector ) {
+ return this;
+ }
+
+ // Method init() accepts an alternate rootjQuery
+ // so migrate can support jQuery.sub (gh-2101)
+ root = root || rootjQuery;
+
+ // Handle HTML strings
+ if ( typeof selector === "string" ) {
+ if ( selector[ 0 ] === "<" &&
+ selector[ selector.length - 1 ] === ">" &&
+ selector.length >= 3 ) {
+
+ // Assume that strings that start and end with <> are HTML and skip the regex check
+ match = [ null, selector, null ];
+
+ } else {
+ match = rquickExpr.exec( selector );
+ }
+
+ // Match html or make sure no context is specified for #id
+ if ( match && ( match[ 1 ] || !context ) ) {
+
+ // HANDLE: $(html) -> $(array)
+ if ( match[ 1 ] ) {
+ context = context instanceof jQuery ? context[ 0 ] : context;
+
+ // Option to run scripts is true for back-compat
+ // Intentionally let the error be thrown if parseHTML is not present
+ jQuery.merge( this, jQuery.parseHTML(
+ match[ 1 ],
+ context && context.nodeType ? context.ownerDocument || context : document,
+ true
+ ) );
+
+ // HANDLE: $(html, props)
+ if ( rsingleTag.test( match[ 1 ] ) && jQuery.isPlainObject( context ) ) {
+ for ( match in context ) {
+
+ // Properties of context are called as methods if possible
+ if ( isFunction( this[ match ] ) ) {
+ this[ match ]( context[ match ] );
+
+ // ...and otherwise set as attributes
+ } else {
+ this.attr( match, context[ match ] );
+ }
+ }
+ }
+
+ return this;
+
+ // HANDLE: $(#id)
+ } else {
+ elem = document.getElementById( match[ 2 ] );
+
+ if ( elem ) {
+
+ // Inject the element directly into the jQuery object
+ this[ 0 ] = elem;
+ this.length = 1;
+ }
+ return this;
+ }
+
+ // HANDLE: $(expr, $(...))
+ } else if ( !context || context.jquery ) {
+ return ( context || root ).find( selector );
+
+ // HANDLE: $(expr, context)
+ // (which is just equivalent to: $(context).find(expr)
+ } else {
+ return this.constructor( context ).find( selector );
+ }
+
+ // HANDLE: $(DOMElement)
+ } else if ( selector.nodeType ) {
+ this[ 0 ] = selector;
+ this.length = 1;
+ return this;
+
+ // HANDLE: $(function)
+ // Shortcut for document ready
+ } else if ( isFunction( selector ) ) {
+ return root.ready !== undefined ?
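+
+ // For illustration, the main call forms this init function handles
+ // (assumed usage of the built bundle):
+ //
+ //     jQuery( "<p>hi</p>" )                 // parse HTML into new elements
+ //     jQuery( "<p/>", { text: "hi" } )      // single tag + props object
+ //     jQuery( "#main" )                     // fast getElementById path
+ //     jQuery( ".item", container )          // same as jQuery( container ).find( ".item" )
+ //     jQuery( document.body )               // wrap an existing DOM element
+ //     jQuery( function() {} )               // ready shortcut, handled just below
+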
+ root.ready( selector ) : + + // Execute immediately if ready is not present + selector( jQuery ); + } + + return jQuery.makeArray( selector, this ); + }; + +// Give the init function the jQuery prototype for later instantiation +init.prototype = jQuery.fn; + +// Initialize central reference +rootjQuery = jQuery( document ); + + +var rparentsprev = /^(?:parents|prev(?:Until|All))/, + + // Methods guaranteed to produce a unique set when starting from a unique set + guaranteedUnique = { + children: true, + contents: true, + next: true, + prev: true + }; + +jQuery.fn.extend( { + has: function( target ) { + var targets = jQuery( target, this ), + l = targets.length; + + return this.filter( function() { + var i = 0; + for ( ; i < l; i++ ) { + if ( jQuery.contains( this, targets[ i ] ) ) { + return true; + } + } + } ); + }, + + closest: function( selectors, context ) { + var cur, + i = 0, + l = this.length, + matched = [], + targets = typeof selectors !== "string" && jQuery( selectors ); + + // Positional selectors never match, since there's no _selection_ context + if ( !rneedsContext.test( selectors ) ) { + for ( ; i < l; i++ ) { + for ( cur = this[ i ]; cur && cur !== context; cur = cur.parentNode ) { + + // Always skip document fragments + if ( cur.nodeType < 11 && ( targets ? + targets.index( cur ) > -1 : + + // Don't pass non-elements to Sizzle + cur.nodeType === 1 && + jQuery.find.matchesSelector( cur, selectors ) ) ) { + + matched.push( cur ); + break; + } + } + } + } + + return this.pushStack( matched.length > 1 ? jQuery.uniqueSort( matched ) : matched ); + }, + + // Determine the position of an element within the set + index: function( elem ) { + + // No argument, return index in parent + if ( !elem ) { + return ( this[ 0 ] && this[ 0 ].parentNode ) ? this.first().prevAll().length : -1; + } + + // Index in selector + if ( typeof elem === "string" ) { + return indexOf.call( jQuery( elem ), this[ 0 ] ); + } + + // Locate the position of the desired element + return indexOf.call( this, + + // If it receives a jQuery object, the first element is used + elem.jquery ? elem[ 0 ] : elem + ); + }, + + add: function( selector, context ) { + return this.pushStack( + jQuery.uniqueSort( + jQuery.merge( this.get(), jQuery( selector, context ) ) + ) + ); + }, + + addBack: function( selector ) { + return this.add( selector == null ? + this.prevObject : this.prevObject.filter( selector ) + ); + } +} ); + +function sibling( cur, dir ) { + while ( ( cur = cur[ dir ] ) && cur.nodeType !== 1 ) {} + return cur; +} + +jQuery.each( { + parent: function( elem ) { + var parent = elem.parentNode; + return parent && parent.nodeType !== 11 ? 
parent : null;
+ },
+ parents: function( elem ) {
+ return dir( elem, "parentNode" );
+ },
+ parentsUntil: function( elem, _i, until ) {
+ return dir( elem, "parentNode", until );
+ },
+ next: function( elem ) {
+ return sibling( elem, "nextSibling" );
+ },
+ prev: function( elem ) {
+ return sibling( elem, "previousSibling" );
+ },
+ nextAll: function( elem ) {
+ return dir( elem, "nextSibling" );
+ },
+ prevAll: function( elem ) {
+ return dir( elem, "previousSibling" );
+ },
+ nextUntil: function( elem, _i, until ) {
+ return dir( elem, "nextSibling", until );
+ },
+ prevUntil: function( elem, _i, until ) {
+ return dir( elem, "previousSibling", until );
+ },
+ siblings: function( elem ) {
+ return siblings( ( elem.parentNode || {} ).firstChild, elem );
+ },
+ children: function( elem ) {
+ return siblings( elem.firstChild );
+ },
+ contents: function( elem ) {
+ if ( elem.contentDocument != null &&
+
+ // Support: IE 11+
+ // <object> elements with no `data` attribute has an object
+ // `contentDocument` with a `null` prototype.
+ getProto( elem.contentDocument ) ) {
+
+ return elem.contentDocument;
+ }
+
+ // Support: IE 9 - 11 only, iOS 7 only, Android Browser <=4.3 only
+ // Treat the template element as a regular one in browsers that
+ // don't support it.
+ if ( nodeName( elem, "template" ) ) {
+ elem = elem.content || elem;
+ }
+
+ return jQuery.merge( [], elem.childNodes );
+ }
+}, function( name, fn ) {
+ jQuery.fn[ name ] = function( until, selector ) {
+ var matched = jQuery.map( this, fn, until );
+
+ if ( name.slice( -5 ) !== "Until" ) {
+ selector = until;
+ }
+
+ if ( selector && typeof selector === "string" ) {
+ matched = jQuery.filter( selector, matched );
+ }
+
+ if ( this.length > 1 ) {
+
+ // Remove duplicates
+ if ( !guaranteedUnique[ name ] ) {
+ jQuery.uniqueSort( matched );
+ }
+
+ // Reverse order for parents* and prev-derivatives
+ if ( rparentsprev.test( name ) ) {
+ matched.reverse();
+ }
+ }
+
+ return this.pushStack( matched );
+ };
+} );
+var rnothtmlwhite = ( /[^\x20\t\r\n\f]+/g );
+
+
+
+// Convert String-formatted options into Object-formatted ones
+function createOptions( options ) {
+ var object = {};
+ jQuery.each( options.match( rnothtmlwhite ) || [], function( _, flag ) {
+ object[ flag ] = true;
+ } );
+ return object;
+}
+
+/*
+ * Create a callback list using the following parameters:
+ *
+ * options: an optional list of space-separated options that will change how
+ * the callback list behaves or a more traditional option object
+ *
+ * By default a callback list will act like an event callback list and can be
+ * "fired" multiple times.
+ *
+ * Possible options:
+ *
+ * once: will ensure the callback list can only be fired once (like a Deferred)
+ *
+ * memory: will keep track of previous values and will call any callback added
+ * after the list has been fired right away with the latest "memorized"
+ * values (like a Deferred)
+ *
+ * unique: will ensure a callback can only be added once (no duplicate in the list)
+ *
+ * stopOnFalse: interrupt callings when a callback returns false
+ *
+ */
+jQuery.Callbacks = function( options ) {
+
+ // Convert options from String-formatted to Object-formatted if needed
+ // (we check in cache first)
+ options = typeof options === "string" ?
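+
+ // For illustration, how the flag combinations documented above behave
+ // (assumed usage):
+ //
+ //     var cb = jQuery.Callbacks( "once memory" );
+ //     cb.add( function( v ) { console.log( "first:", v ); } );
+ //     cb.fire( 42 );                               // logs "first: 42"
+ //     cb.add( function( v ) { console.log( "late:", v ); } );
+ //                                                  // "memory": logs "late: 42" immediately
+ //     cb.fire( 43 );                               // "once": ignored
+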
+ createOptions( options ) : + jQuery.extend( {}, options ); + + var // Flag to know if list is currently firing + firing, + + // Last fire value for non-forgettable lists + memory, + + // Flag to know if list was already fired + fired, + + // Flag to prevent firing + locked, + + // Actual callback list + list = [], + + // Queue of execution data for repeatable lists + queue = [], + + // Index of currently firing callback (modified by add/remove as needed) + firingIndex = -1, + + // Fire callbacks + fire = function() { + + // Enforce single-firing + locked = locked || options.once; + + // Execute callbacks for all pending executions, + // respecting firingIndex overrides and runtime changes + fired = firing = true; + for ( ; queue.length; firingIndex = -1 ) { + memory = queue.shift(); + while ( ++firingIndex < list.length ) { + + // Run callback and check for early termination + if ( list[ firingIndex ].apply( memory[ 0 ], memory[ 1 ] ) === false && + options.stopOnFalse ) { + + // Jump to end and forget the data so .add doesn't re-fire + firingIndex = list.length; + memory = false; + } + } + } + + // Forget the data if we're done with it + if ( !options.memory ) { + memory = false; + } + + firing = false; + + // Clean up if we're done firing for good + if ( locked ) { + + // Keep an empty list if we have data for future add calls + if ( memory ) { + list = []; + + // Otherwise, this object is spent + } else { + list = ""; + } + } + }, + + // Actual Callbacks object + self = { + + // Add a callback or a collection of callbacks to the list + add: function() { + if ( list ) { + + // If we have memory from a past run, we should fire after adding + if ( memory && !firing ) { + firingIndex = list.length - 1; + queue.push( memory ); + } + + ( function add( args ) { + jQuery.each( args, function( _, arg ) { + if ( isFunction( arg ) ) { + if ( !options.unique || !self.has( arg ) ) { + list.push( arg ); + } + } else if ( arg && arg.length && toType( arg ) !== "string" ) { + + // Inspect recursively + add( arg ); + } + } ); + } )( arguments ); + + if ( memory && !firing ) { + fire(); + } + } + return this; + }, + + // Remove a callback from the list + remove: function() { + jQuery.each( arguments, function( _, arg ) { + var index; + while ( ( index = jQuery.inArray( arg, list, index ) ) > -1 ) { + list.splice( index, 1 ); + + // Handle firing indexes + if ( index <= firingIndex ) { + firingIndex--; + } + } + } ); + return this; + }, + + // Check if a given callback is in the list. + // If no argument is given, return whether or not list has callbacks attached. + has: function( fn ) { + return fn ? + jQuery.inArray( fn, list ) > -1 : + list.length > 0; + }, + + // Remove all callbacks from the list + empty: function() { + if ( list ) { + list = []; + } + return this; + }, + + // Disable .fire and .add + // Abort any current/pending executions + // Clear all callbacks and values + disable: function() { + locked = queue = []; + list = memory = ""; + return this; + }, + disabled: function() { + return !list; + }, + + // Disable .fire + // Also disable .add unless we have memory (since it would have no effect) + // Abort any pending executions + lock: function() { + locked = queue = []; + if ( !memory && !firing ) { + list = memory = ""; + } + return this; + }, + locked: function() { + return !!locked; + }, + + // Call all callbacks with the given context and arguments + fireWith: function( context, args ) { + if ( !locked ) { + args = args || []; + args = [ context, args.slice ? 
args.slice() : args ]; + queue.push( args ); + if ( !firing ) { + fire(); + } + } + return this; + }, + + // Call all the callbacks with the given arguments + fire: function() { + self.fireWith( this, arguments ); + return this; + }, + + // To know if the callbacks have already been called at least once + fired: function() { + return !!fired; + } + }; + + return self; +}; + + +function Identity( v ) { + return v; +} +function Thrower( ex ) { + throw ex; +} + +function adoptValue( value, resolve, reject, noValue ) { + var method; + + try { + + // Check for promise aspect first to privilege synchronous behavior + if ( value && isFunction( ( method = value.promise ) ) ) { + method.call( value ).done( resolve ).fail( reject ); + + // Other thenables + } else if ( value && isFunction( ( method = value.then ) ) ) { + method.call( value, resolve, reject ); + + // Other non-thenables + } else { + + // Control `resolve` arguments by letting Array#slice cast boolean `noValue` to integer: + // * false: [ value ].slice( 0 ) => resolve( value ) + // * true: [ value ].slice( 1 ) => resolve() + resolve.apply( undefined, [ value ].slice( noValue ) ); + } + + // For Promises/A+, convert exceptions into rejections + // Since jQuery.when doesn't unwrap thenables, we can skip the extra checks appearing in + // Deferred#then to conditionally suppress rejection. + } catch ( value ) { + + // Support: Android 4.0 only + // Strict mode functions invoked without .call/.apply get global-object context + reject.apply( undefined, [ value ] ); + } +} + +jQuery.extend( { + + Deferred: function( func ) { + var tuples = [ + + // action, add listener, callbacks, + // ... .then handlers, argument index, [final state] + [ "notify", "progress", jQuery.Callbacks( "memory" ), + jQuery.Callbacks( "memory" ), 2 ], + [ "resolve", "done", jQuery.Callbacks( "once memory" ), + jQuery.Callbacks( "once memory" ), 0, "resolved" ], + [ "reject", "fail", jQuery.Callbacks( "once memory" ), + jQuery.Callbacks( "once memory" ), 1, "rejected" ] + ], + state = "pending", + promise = { + state: function() { + return state; + }, + always: function() { + deferred.done( arguments ).fail( arguments ); + return this; + }, + "catch": function( fn ) { + return promise.then( null, fn ); + }, + + // Keep pipe for back-compat + pipe: function( /* fnDone, fnFail, fnProgress */ ) { + var fns = arguments; + + return jQuery.Deferred( function( newDefer ) { + jQuery.each( tuples, function( _i, tuple ) { + + // Map tuples (progress, done, fail) to arguments (done, fail, progress) + var fn = isFunction( fns[ tuple[ 4 ] ] ) && fns[ tuple[ 4 ] ]; + + // deferred.progress(function() { bind to newDefer or newDefer.notify }) + // deferred.done(function() { bind to newDefer or newDefer.resolve }) + // deferred.fail(function() { bind to newDefer or newDefer.reject }) + deferred[ tuple[ 1 ] ]( function() { + var returned = fn && fn.apply( this, arguments ); + if ( returned && isFunction( returned.promise ) ) { + returned.promise() + .progress( newDefer.notify ) + .done( newDefer.resolve ) + .fail( newDefer.reject ); + } else { + newDefer[ tuple[ 0 ] + "With" ]( + this, + fn ? 
[ returned ] : arguments + ); + } + } ); + } ); + fns = null; + } ).promise(); + }, + then: function( onFulfilled, onRejected, onProgress ) { + var maxDepth = 0; + function resolve( depth, deferred, handler, special ) { + return function() { + var that = this, + args = arguments, + mightThrow = function() { + var returned, then; + + // Support: Promises/A+ section 2.3.3.3.3 + // https://promisesaplus.com/#point-59 + // Ignore double-resolution attempts + if ( depth < maxDepth ) { + return; + } + + returned = handler.apply( that, args ); + + // Support: Promises/A+ section 2.3.1 + // https://promisesaplus.com/#point-48 + if ( returned === deferred.promise() ) { + throw new TypeError( "Thenable self-resolution" ); + } + + // Support: Promises/A+ sections 2.3.3.1, 3.5 + // https://promisesaplus.com/#point-54 + // https://promisesaplus.com/#point-75 + // Retrieve `then` only once + then = returned && + + // Support: Promises/A+ section 2.3.4 + // https://promisesaplus.com/#point-64 + // Only check objects and functions for thenability + ( typeof returned === "object" || + typeof returned === "function" ) && + returned.then; + + // Handle a returned thenable + if ( isFunction( then ) ) { + + // Special processors (notify) just wait for resolution + if ( special ) { + then.call( + returned, + resolve( maxDepth, deferred, Identity, special ), + resolve( maxDepth, deferred, Thrower, special ) + ); + + // Normal processors (resolve) also hook into progress + } else { + + // ...and disregard older resolution values + maxDepth++; + + then.call( + returned, + resolve( maxDepth, deferred, Identity, special ), + resolve( maxDepth, deferred, Thrower, special ), + resolve( maxDepth, deferred, Identity, + deferred.notifyWith ) + ); + } + + // Handle all other returned values + } else { + + // Only substitute handlers pass on context + // and multiple values (non-spec behavior) + if ( handler !== Identity ) { + that = undefined; + args = [ returned ]; + } + + // Process the value(s) + // Default process is resolve + ( special || deferred.resolveWith )( that, args ); + } + }, + + // Only normal processors (resolve) catch and reject exceptions + process = special ? + mightThrow : + function() { + try { + mightThrow(); + } catch ( e ) { + + if ( jQuery.Deferred.exceptionHook ) { + jQuery.Deferred.exceptionHook( e, + process.stackTrace ); + } + + // Support: Promises/A+ section 2.3.3.3.4.1 + // https://promisesaplus.com/#point-61 + // Ignore post-resolution exceptions + if ( depth + 1 >= maxDepth ) { + + // Only substitute handlers pass on context + // and multiple values (non-spec behavior) + if ( handler !== Thrower ) { + that = undefined; + args = [ e ]; + } + + deferred.rejectWith( that, args ); + } + } + }; + + // Support: Promises/A+ section 2.3.3.3.1 + // https://promisesaplus.com/#point-57 + // Re-resolve promises immediately to dodge false rejection from + // subsequent errors + if ( depth ) { + process(); + } else { + + // Call an optional hook to record the stack, in case of exception + // since it's otherwise lost when execution goes async + if ( jQuery.Deferred.getStackHook ) { + process.stackTrace = jQuery.Deferred.getStackHook(); + } + window.setTimeout( process ); + } + }; + } + + return jQuery.Deferred( function( newDefer ) { + + // progress_handlers.add( ... ) + tuples[ 0 ][ 3 ].add( + resolve( + 0, + newDefer, + isFunction( onProgress ) ? + onProgress : + Identity, + newDefer.notifyWith + ) + ); + + // fulfilled_handlers.add( ... 
) + tuples[ 1 ][ 3 ].add( + resolve( + 0, + newDefer, + isFunction( onFulfilled ) ? + onFulfilled : + Identity + ) + ); + + // rejected_handlers.add( ... ) + tuples[ 2 ][ 3 ].add( + resolve( + 0, + newDefer, + isFunction( onRejected ) ? + onRejected : + Thrower + ) + ); + } ).promise(); + }, + + // Get a promise for this deferred + // If obj is provided, the promise aspect is added to the object + promise: function( obj ) { + return obj != null ? jQuery.extend( obj, promise ) : promise; + } + }, + deferred = {}; + + // Add list-specific methods + jQuery.each( tuples, function( i, tuple ) { + var list = tuple[ 2 ], + stateString = tuple[ 5 ]; + + // promise.progress = list.add + // promise.done = list.add + // promise.fail = list.add + promise[ tuple[ 1 ] ] = list.add; + + // Handle state + if ( stateString ) { + list.add( + function() { + + // state = "resolved" (i.e., fulfilled) + // state = "rejected" + state = stateString; + }, + + // rejected_callbacks.disable + // fulfilled_callbacks.disable + tuples[ 3 - i ][ 2 ].disable, + + // rejected_handlers.disable + // fulfilled_handlers.disable + tuples[ 3 - i ][ 3 ].disable, + + // progress_callbacks.lock + tuples[ 0 ][ 2 ].lock, + + // progress_handlers.lock + tuples[ 0 ][ 3 ].lock + ); + } + + // progress_handlers.fire + // fulfilled_handlers.fire + // rejected_handlers.fire + list.add( tuple[ 3 ].fire ); + + // deferred.notify = function() { deferred.notifyWith(...) } + // deferred.resolve = function() { deferred.resolveWith(...) } + // deferred.reject = function() { deferred.rejectWith(...) } + deferred[ tuple[ 0 ] ] = function() { + deferred[ tuple[ 0 ] + "With" ]( this === deferred ? undefined : this, arguments ); + return this; + }; + + // deferred.notifyWith = list.fireWith + // deferred.resolveWith = list.fireWith + // deferred.rejectWith = list.fireWith + deferred[ tuple[ 0 ] + "With" ] = list.fireWith; + } ); + + // Make the deferred a promise + promise.promise( deferred ); + + // Call given func if any + if ( func ) { + func.call( deferred, deferred ); + } + + // All done! + return deferred; + }, + + // Deferred helper + when: function( singleValue ) { + var + + // count of uncompleted subordinates + remaining = arguments.length, + + // count of unprocessed arguments + i = remaining, + + // subordinate fulfillment data + resolveContexts = Array( i ), + resolveValues = slice.call( arguments ), + + // the primary Deferred + primary = jQuery.Deferred(), + + // subordinate callback factory + updateFunc = function( i ) { + return function( value ) { + resolveContexts[ i ] = this; + resolveValues[ i ] = arguments.length > 1 ? slice.call( arguments ) : value; + if ( !( --remaining ) ) { + primary.resolveWith( resolveContexts, resolveValues ); + } + }; + }; + + // Single- and empty arguments are adopted like Promise.resolve + if ( remaining <= 1 ) { + adoptValue( singleValue, primary.done( updateFunc( i ) ).resolve, primary.reject, + !remaining ); + + // Use .then() to unwrap secondary thenables (cf. gh-3000) + if ( primary.state() === "pending" || + isFunction( resolveValues[ i ] && resolveValues[ i ].then ) ) { + + return primary.then(); + } + } + + // Multiple arguments are aggregated like Promise.all array elements + while ( i-- ) { + adoptValue( resolveValues[ i ], updateFunc( i ), primary.reject ); + } + + return primary.promise(); + } +} ); + + +// These usually indicate a programmer mistake during development, +// warn about them ASAP rather than swallowing them by default. 
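+
+// For illustration of the Deferred/when machinery defined above (assumed
+// usage of the built bundle):
+//
+//     var d = jQuery.Deferred();
+//     d.done( function( v ) { console.log( "resolved with", v ); } )
+//      .fail( function( e ) { console.log( "rejected with", e ); } );
+//     d.resolve( 1 );                     // d.state() becomes "resolved"
+//
+//     jQuery.when( jQuery.get( "/a" ), jQuery.get( "/b" ) )
+//         .then( function( a, b ) { /* both requests finished */ } );
+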
+var rerrorNames = /^(Eval|Internal|Range|Reference|Syntax|Type|URI)Error$/; + +jQuery.Deferred.exceptionHook = function( error, stack ) { + + // Support: IE 8 - 9 only + // Console exists when dev tools are open, which can happen at any time + if ( window.console && window.console.warn && error && rerrorNames.test( error.name ) ) { + window.console.warn( "jQuery.Deferred exception: " + error.message, error.stack, stack ); + } +}; + + + + +jQuery.readyException = function( error ) { + window.setTimeout( function() { + throw error; + } ); +}; + + + + +// The deferred used on DOM ready +var readyList = jQuery.Deferred(); + +jQuery.fn.ready = function( fn ) { + + readyList + .then( fn ) + + // Wrap jQuery.readyException in a function so that the lookup + // happens at the time of error handling instead of callback + // registration. + .catch( function( error ) { + jQuery.readyException( error ); + } ); + + return this; +}; + +jQuery.extend( { + + // Is the DOM ready to be used? Set to true once it occurs. + isReady: false, + + // A counter to track how many items to wait for before + // the ready event fires. See #6781 + readyWait: 1, + + // Handle when the DOM is ready + ready: function( wait ) { + + // Abort if there are pending holds or we're already ready + if ( wait === true ? --jQuery.readyWait : jQuery.isReady ) { + return; + } + + // Remember that the DOM is ready + jQuery.isReady = true; + + // If a normal DOM Ready event fired, decrement, and wait if need be + if ( wait !== true && --jQuery.readyWait > 0 ) { + return; + } + + // If there are functions bound, to execute + readyList.resolveWith( document, [ jQuery ] ); + } +} ); + +jQuery.ready.then = readyList.then; + +// The ready event handler and self cleanup method +function completed() { + document.removeEventListener( "DOMContentLoaded", completed ); + window.removeEventListener( "load", completed ); + jQuery.ready(); +} + +// Catch cases where $(document).ready() is called +// after the browser event has already occurred. +// Support: IE <=9 - 10 only +// Older IE sometimes signals "interactive" too soon +if ( document.readyState === "complete" || + ( document.readyState !== "loading" && !document.documentElement.doScroll ) ) { + + // Handle it asynchronously to allow scripts the opportunity to delay ready + window.setTimeout( jQuery.ready ); + +} else { + + // Use the handy event callback + document.addEventListener( "DOMContentLoaded", completed ); + + // A fallback to window.onload, that will always work + window.addEventListener( "load", completed ); +} + + + + +// Multifunctional method to get and set values of a collection +// The value/s can optionally be executed if it's a function +var access = function( elems, fn, key, value, chainable, emptyGet, raw ) { + var i = 0, + len = elems.length, + bulk = key == null; + + // Sets many values + if ( toType( key ) === "object" ) { + chainable = true; + for ( i in key ) { + access( elems, fn, i, key[ i ], true, emptyGet, raw ); + } + + // Sets one value + } else if ( value !== undefined ) { + chainable = true; + + if ( !isFunction( value ) ) { + raw = true; + } + + if ( bulk ) { + + // Bulk operations run against the entire set + if ( raw ) { + fn.call( elems, value ); + fn = null; + + // ...except when executing function values + } else { + bulk = fn; + fn = function( elem, _key, value ) { + return bulk.call( jQuery( elem ), value ); + }; + } + } + + if ( fn ) { + for ( ; i < len; i++ ) { + fn( + elems[ i ], key, raw ? 
+ value : + value.call( elems[ i ], i, fn( elems[ i ], key ) ) + ); + } + } + } + + if ( chainable ) { + return elems; + } + + // Gets + if ( bulk ) { + return fn.call( elems ); + } + + return len ? fn( elems[ 0 ], key ) : emptyGet; +}; + + +// Matches dashed string for camelizing +var rmsPrefix = /^-ms-/, + rdashAlpha = /-([a-z])/g; + +// Used by camelCase as callback to replace() +function fcamelCase( _all, letter ) { + return letter.toUpperCase(); +} + +// Convert dashed to camelCase; used by the css and data modules +// Support: IE <=9 - 11, Edge 12 - 15 +// Microsoft forgot to hump their vendor prefix (#9572) +function camelCase( string ) { + return string.replace( rmsPrefix, "ms-" ).replace( rdashAlpha, fcamelCase ); +} +var acceptData = function( owner ) { + + // Accepts only: + // - Node + // - Node.ELEMENT_NODE + // - Node.DOCUMENT_NODE + // - Object + // - Any + return owner.nodeType === 1 || owner.nodeType === 9 || !( +owner.nodeType ); +}; + + + + +function Data() { + this.expando = jQuery.expando + Data.uid++; +} + +Data.uid = 1; + +Data.prototype = { + + cache: function( owner ) { + + // Check if the owner object already has a cache + var value = owner[ this.expando ]; + + // If not, create one + if ( !value ) { + value = {}; + + // We can accept data for non-element nodes in modern browsers, + // but we should not, see #8335. + // Always return an empty object. + if ( acceptData( owner ) ) { + + // If it is a node unlikely to be stringify-ed or looped over + // use plain assignment + if ( owner.nodeType ) { + owner[ this.expando ] = value; + + // Otherwise secure it in a non-enumerable property + // configurable must be true to allow the property to be + // deleted when data is removed + } else { + Object.defineProperty( owner, this.expando, { + value: value, + configurable: true + } ); + } + } + } + + return value; + }, + set: function( owner, data, value ) { + var prop, + cache = this.cache( owner ); + + // Handle: [ owner, key, value ] args + // Always use camelCase key (gh-2257) + if ( typeof data === "string" ) { + cache[ camelCase( data ) ] = value; + + // Handle: [ owner, { properties } ] args + } else { + + // Copy the properties one-by-one to the cache object + for ( prop in data ) { + cache[ camelCase( prop ) ] = data[ prop ]; + } + } + return cache; + }, + get: function( owner, key ) { + return key === undefined ? + this.cache( owner ) : + + // Always use camelCase key (gh-2257) + owner[ this.expando ] && owner[ this.expando ][ camelCase( key ) ]; + }, + access: function( owner, key, value ) { + + // In cases where either: + // + // 1. No key was specified + // 2. A string key was specified, but no value provided + // + // Take the "read" path and allow the get method to determine + // which value to return, respectively either: + // + // 1. The entire cache object + // 2. The data stored at the key + // + if ( key === undefined || + ( ( key && typeof key === "string" ) && value === undefined ) ) { + + return this.get( owner, key ); + } + + // When the key is not a string, or both a key and value + // are specified, set or extend (existing objects) with either: + // + // 1. An object of properties + // 2. A key and value + // + this.set( owner, key, value ); + + // Since the "set" path can have two possible entry points + // return the expected data based on which path was taken[*] + return value !== undefined ? 
value : key; + }, + remove: function( owner, key ) { + var i, + cache = owner[ this.expando ]; + + if ( cache === undefined ) { + return; + } + + if ( key !== undefined ) { + + // Support array or space separated string of keys + if ( Array.isArray( key ) ) { + + // If key is an array of keys... + // We always set camelCase keys, so remove that. + key = key.map( camelCase ); + } else { + key = camelCase( key ); + + // If a key with the spaces exists, use it. + // Otherwise, create an array by matching non-whitespace + key = key in cache ? + [ key ] : + ( key.match( rnothtmlwhite ) || [] ); + } + + i = key.length; + + while ( i-- ) { + delete cache[ key[ i ] ]; + } + } + + // Remove the expando if there's no more data + if ( key === undefined || jQuery.isEmptyObject( cache ) ) { + + // Support: Chrome <=35 - 45 + // Webkit & Blink performance suffers when deleting properties + // from DOM nodes, so set to undefined instead + // https://bugs.chromium.org/p/chromium/issues/detail?id=378607 (bug restricted) + if ( owner.nodeType ) { + owner[ this.expando ] = undefined; + } else { + delete owner[ this.expando ]; + } + } + }, + hasData: function( owner ) { + var cache = owner[ this.expando ]; + return cache !== undefined && !jQuery.isEmptyObject( cache ); + } +}; +var dataPriv = new Data(); + +var dataUser = new Data(); + + + +// Implementation Summary +// +// 1. Enforce API surface and semantic compatibility with 1.9.x branch +// 2. Improve the module's maintainability by reducing the storage +// paths to a single mechanism. +// 3. Use the same single mechanism to support "private" and "user" data. +// 4. _Never_ expose "private" data to user code (TODO: Drop _data, _removeData) +// 5. Avoid exposing implementation details on user objects (eg. expando properties) +// 6. Provide a clear path for implementation upgrade to WeakMap in 2014 + +var rbrace = /^(?:\{[\w\W]*\}|\[[\w\W]*\])$/, + rmultiDash = /[A-Z]/g; + +function getData( data ) { + if ( data === "true" ) { + return true; + } + + if ( data === "false" ) { + return false; + } + + if ( data === "null" ) { + return null; + } + + // Only convert to a number if it doesn't change the string + if ( data === +data + "" ) { + return +data; + } + + if ( rbrace.test( data ) ) { + return JSON.parse( data ); + } + + return data; +} + +function dataAttr( elem, key, data ) { + var name; + + // If nothing was found internally, try to fetch any + // data from the HTML5 data-* attribute + if ( data === undefined && elem.nodeType === 1 ) { + name = "data-" + key.replace( rmultiDash, "-$&" ).toLowerCase(); + data = elem.getAttribute( name ); + + if ( typeof data === "string" ) { + try { + data = getData( data ); + } catch ( e ) {} + + // Make sure we set the data so it isn't changed later + dataUser.set( elem, key, data ); + } else { + data = undefined; + } + } + return data; +} + +jQuery.extend( { + hasData: function( elem ) { + return dataUser.hasData( elem ) || dataPriv.hasData( elem ); + }, + + data: function( elem, name, data ) { + return dataUser.access( elem, name, data ); + }, + + removeData: function( elem, name ) { + dataUser.remove( elem, name ); + }, + + // TODO: Now that all calls to _data and _removeData have been replaced + // with direct calls to dataPriv methods, these can be deprecated. 
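+
+ // For illustration of the data()/data-* lookup implemented around here
+ // (assumed usage): given <div id="d" data-user-name="amy" data-count="3">,
+ //
+ //     jQuery( "#d" ).data( "userName" )   // "amy" (key camelCased from data-user-name)
+ //     jQuery( "#d" ).data( "count" )      // 3, coerced by getData()
+ //     jQuery( "#d" ).data( "count", 10 )  // stored via dataUser; the attribute is untouched
+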
+ _data: function( elem, name, data ) { + return dataPriv.access( elem, name, data ); + }, + + _removeData: function( elem, name ) { + dataPriv.remove( elem, name ); + } +} ); + +jQuery.fn.extend( { + data: function( key, value ) { + var i, name, data, + elem = this[ 0 ], + attrs = elem && elem.attributes; + + // Gets all values + if ( key === undefined ) { + if ( this.length ) { + data = dataUser.get( elem ); + + if ( elem.nodeType === 1 && !dataPriv.get( elem, "hasDataAttrs" ) ) { + i = attrs.length; + while ( i-- ) { + + // Support: IE 11 only + // The attrs elements can be null (#14894) + if ( attrs[ i ] ) { + name = attrs[ i ].name; + if ( name.indexOf( "data-" ) === 0 ) { + name = camelCase( name.slice( 5 ) ); + dataAttr( elem, name, data[ name ] ); + } + } + } + dataPriv.set( elem, "hasDataAttrs", true ); + } + } + + return data; + } + + // Sets multiple values + if ( typeof key === "object" ) { + return this.each( function() { + dataUser.set( this, key ); + } ); + } + + return access( this, function( value ) { + var data; + + // The calling jQuery object (element matches) is not empty + // (and therefore has an element appears at this[ 0 ]) and the + // `value` parameter was not undefined. An empty jQuery object + // will result in `undefined` for elem = this[ 0 ] which will + // throw an exception if an attempt to read a data cache is made. + if ( elem && value === undefined ) { + + // Attempt to get data from the cache + // The key will always be camelCased in Data + data = dataUser.get( elem, key ); + if ( data !== undefined ) { + return data; + } + + // Attempt to "discover" the data in + // HTML5 custom data-* attrs + data = dataAttr( elem, key ); + if ( data !== undefined ) { + return data; + } + + // We tried really hard, but the data doesn't exist. + return; + } + + // Set the data... 
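+			// [Editorial note] Illustrative sketch, not part of the jQuery source:
+			// both the read path above and the write path below camelCase the key,
+			// so "last-value" and "lastValue" address the same entry. Given markup
+			// such as <div data-last-value="42">:
+			//
+			//     jQuery( "div" ).data( "last-value" );   // 42 (number, via getData)
+			//     jQuery( "div" ).data( "lastValue", 7 ); // updates the same entry
+			//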
+ this.each( function() { + + // We always store the camelCased key + dataUser.set( this, key, value ); + } ); + }, null, value, arguments.length > 1, null, true ); + }, + + removeData: function( key ) { + return this.each( function() { + dataUser.remove( this, key ); + } ); + } +} ); + + +jQuery.extend( { + queue: function( elem, type, data ) { + var queue; + + if ( elem ) { + type = ( type || "fx" ) + "queue"; + queue = dataPriv.get( elem, type ); + + // Speed up dequeue by getting out quickly if this is just a lookup + if ( data ) { + if ( !queue || Array.isArray( data ) ) { + queue = dataPriv.access( elem, type, jQuery.makeArray( data ) ); + } else { + queue.push( data ); + } + } + return queue || []; + } + }, + + dequeue: function( elem, type ) { + type = type || "fx"; + + var queue = jQuery.queue( elem, type ), + startLength = queue.length, + fn = queue.shift(), + hooks = jQuery._queueHooks( elem, type ), + next = function() { + jQuery.dequeue( elem, type ); + }; + + // If the fx queue is dequeued, always remove the progress sentinel + if ( fn === "inprogress" ) { + fn = queue.shift(); + startLength--; + } + + if ( fn ) { + + // Add a progress sentinel to prevent the fx queue from being + // automatically dequeued + if ( type === "fx" ) { + queue.unshift( "inprogress" ); + } + + // Clear up the last queue stop function + delete hooks.stop; + fn.call( elem, next, hooks ); + } + + if ( !startLength && hooks ) { + hooks.empty.fire(); + } + }, + + // Not public - generate a queueHooks object, or return the current one + _queueHooks: function( elem, type ) { + var key = type + "queueHooks"; + return dataPriv.get( elem, key ) || dataPriv.access( elem, key, { + empty: jQuery.Callbacks( "once memory" ).add( function() { + dataPriv.remove( elem, [ type + "queue", key ] ); + } ) + } ); + } +} ); + +jQuery.fn.extend( { + queue: function( type, data ) { + var setter = 2; + + if ( typeof type !== "string" ) { + data = type; + type = "fx"; + setter--; + } + + if ( arguments.length < setter ) { + return jQuery.queue( this[ 0 ], type ); + } + + return data === undefined ? 
+ this : + this.each( function() { + var queue = jQuery.queue( this, type, data ); + + // Ensure a hooks for this queue + jQuery._queueHooks( this, type ); + + if ( type === "fx" && queue[ 0 ] !== "inprogress" ) { + jQuery.dequeue( this, type ); + } + } ); + }, + dequeue: function( type ) { + return this.each( function() { + jQuery.dequeue( this, type ); + } ); + }, + clearQueue: function( type ) { + return this.queue( type || "fx", [] ); + }, + + // Get a promise resolved when queues of a certain type + // are emptied (fx is the type by default) + promise: function( type, obj ) { + var tmp, + count = 1, + defer = jQuery.Deferred(), + elements = this, + i = this.length, + resolve = function() { + if ( !( --count ) ) { + defer.resolveWith( elements, [ elements ] ); + } + }; + + if ( typeof type !== "string" ) { + obj = type; + type = undefined; + } + type = type || "fx"; + + while ( i-- ) { + tmp = dataPriv.get( elements[ i ], type + "queueHooks" ); + if ( tmp && tmp.empty ) { + count++; + tmp.empty.add( resolve ); + } + } + resolve(); + return defer.promise( obj ); + } +} ); +var pnum = ( /[+-]?(?:\d*\.|)\d+(?:[eE][+-]?\d+|)/ ).source; + +var rcssNum = new RegExp( "^(?:([+-])=|)(" + pnum + ")([a-z%]*)$", "i" ); + + +var cssExpand = [ "Top", "Right", "Bottom", "Left" ]; + +var documentElement = document.documentElement; + + + + var isAttached = function( elem ) { + return jQuery.contains( elem.ownerDocument, elem ); + }, + composed = { composed: true }; + + // Support: IE 9 - 11+, Edge 12 - 18+, iOS 10.0 - 10.2 only + // Check attachment across shadow DOM boundaries when possible (gh-3504) + // Support: iOS 10.0-10.2 only + // Early iOS 10 versions support `attachShadow` but not `getRootNode`, + // leading to errors. We need to check for `getRootNode`. + if ( documentElement.getRootNode ) { + isAttached = function( elem ) { + return jQuery.contains( elem.ownerDocument, elem ) || + elem.getRootNode( composed ) === elem.ownerDocument; + }; + } +var isHiddenWithinTree = function( elem, el ) { + + // isHiddenWithinTree might be called from jQuery#filter function; + // in that case, element will be second argument + elem = el || elem; + + // Inline style trumps all + return elem.style.display === "none" || + elem.style.display === "" && + + // Otherwise, check computed style + // Support: Firefox <=43 - 45 + // Disconnected elements can have computed display: none, so first confirm that elem is + // in the document. + isAttached( elem ) && + + jQuery.css( elem, "display" ) === "none"; + }; + + + +function adjustCSS( elem, prop, valueParts, tween ) { + var adjusted, scale, + maxIterations = 20, + currentValue = tween ? + function() { + return tween.cur(); + } : + function() { + return jQuery.css( elem, prop, "" ); + }, + initial = currentValue(), + unit = valueParts && valueParts[ 3 ] || ( jQuery.cssNumber[ prop ] ? 
"" : "px" ), + + // Starting value computation is required for potential unit mismatches + initialInUnit = elem.nodeType && + ( jQuery.cssNumber[ prop ] || unit !== "px" && +initial ) && + rcssNum.exec( jQuery.css( elem, prop ) ); + + if ( initialInUnit && initialInUnit[ 3 ] !== unit ) { + + // Support: Firefox <=54 + // Halve the iteration target value to prevent interference from CSS upper bounds (gh-2144) + initial = initial / 2; + + // Trust units reported by jQuery.css + unit = unit || initialInUnit[ 3 ]; + + // Iteratively approximate from a nonzero starting point + initialInUnit = +initial || 1; + + while ( maxIterations-- ) { + + // Evaluate and update our best guess (doubling guesses that zero out). + // Finish if the scale equals or crosses 1 (making the old*new product non-positive). + jQuery.style( elem, prop, initialInUnit + unit ); + if ( ( 1 - scale ) * ( 1 - ( scale = currentValue() / initial || 0.5 ) ) <= 0 ) { + maxIterations = 0; + } + initialInUnit = initialInUnit / scale; + + } + + initialInUnit = initialInUnit * 2; + jQuery.style( elem, prop, initialInUnit + unit ); + + // Make sure we update the tween properties later on + valueParts = valueParts || []; + } + + if ( valueParts ) { + initialInUnit = +initialInUnit || +initial || 0; + + // Apply relative offset (+=/-=) if specified + adjusted = valueParts[ 1 ] ? + initialInUnit + ( valueParts[ 1 ] + 1 ) * valueParts[ 2 ] : + +valueParts[ 2 ]; + if ( tween ) { + tween.unit = unit; + tween.start = initialInUnit; + tween.end = adjusted; + } + } + return adjusted; +} + + +var defaultDisplayMap = {}; + +function getDefaultDisplay( elem ) { + var temp, + doc = elem.ownerDocument, + nodeName = elem.nodeName, + display = defaultDisplayMap[ nodeName ]; + + if ( display ) { + return display; + } + + temp = doc.body.appendChild( doc.createElement( nodeName ) ); + display = jQuery.css( temp, "display" ); + + temp.parentNode.removeChild( temp ); + + if ( display === "none" ) { + display = "block"; + } + defaultDisplayMap[ nodeName ] = display; + + return display; +} + +function showHide( elements, show ) { + var display, elem, + values = [], + index = 0, + length = elements.length; + + // Determine new display value for elements that need to change + for ( ; index < length; index++ ) { + elem = elements[ index ]; + if ( !elem.style ) { + continue; + } + + display = elem.style.display; + if ( show ) { + + // Since we force visibility upon cascade-hidden elements, an immediate (and slow) + // check is required in this first loop unless we have a nonempty display value (either + // inline or about-to-be-restored) + if ( display === "none" ) { + values[ index ] = dataPriv.get( elem, "display" ) || null; + if ( !values[ index ] ) { + elem.style.display = ""; + } + } + if ( elem.style.display === "" && isHiddenWithinTree( elem ) ) { + values[ index ] = getDefaultDisplay( elem ); + } + } else { + if ( display !== "none" ) { + values[ index ] = "none"; + + // Remember what we're overwriting + dataPriv.set( elem, "display", display ); + } + } + } + + // Set the display of the elements in a second loop to avoid constant reflow + for ( index = 0; index < length; index++ ) { + if ( values[ index ] != null ) { + elements[ index ].style.display = values[ index ]; + } + } + + return elements; +} + +jQuery.fn.extend( { + show: function() { + return showHide( this, true ); + }, + hide: function() { + return showHide( this ); + }, + toggle: function( state ) { + if ( typeof state === "boolean" ) { + return state ? 
this.show() : this.hide();
+		}
+
+		return this.each( function() {
+			if ( isHiddenWithinTree( this ) ) {
+				jQuery( this ).show();
+			} else {
+				jQuery( this ).hide();
+			}
+		} );
+	}
+} );
+var rcheckableType = ( /^(?:checkbox|radio)$/i );
+
+var rtagName = ( /<([a-z][^\/\0>\x20\t\r\n\f]*)/i );
+
+var rscriptType = ( /^$|^module$|\/(?:java|ecma)script/i );
+
+
+
+( function() {
+	var fragment = document.createDocumentFragment(),
+		div = fragment.appendChild( document.createElement( "div" ) ),
+		input = document.createElement( "input" );
+
+	// Support: Android 4.0 - 4.3 only
+	// Check state lost if the name is set (#11217)
+	// Support: Windows Web Apps (WWA)
+	// `name` and `type` must use .setAttribute for WWA (#14901)
+	input.setAttribute( "type", "radio" );
+	input.setAttribute( "checked", "checked" );
+	input.setAttribute( "name", "t" );
+
+	div.appendChild( input );
+
+	// Support: Android <=4.1 only
+	// Older WebKit doesn't clone checked state correctly in fragments
+	support.checkClone = div.cloneNode( true ).cloneNode( true ).lastChild.checked;
+
+	// Support: IE <=11 only
+	// Make sure textarea (and checkbox) defaultValue is properly cloned
+	div.innerHTML = "<textarea>x</textarea>";
+	support.noCloneChecked = !!div.cloneNode( true ).lastChild.defaultValue;
+
+	// Support: IE <=9 only
+	// IE <=9 replaces <option> tags with their contents when inserted outside
+	// the select element.
+	div.innerHTML = "<option></option>";
+	support.option = !!div.lastChild;
+} )();
+
+
+// We have to close these tags to support XHTML (#13200)
+var wrapMap = {
+
+	// XHTML parsers do not magically insert elements in the
+	// same way that tag soup parsers do. So we cannot shorten
+	// this by omitting <tbody> or other required elements.
+	thead: [ 1, "<table>", "</table>" ],
+	col: [ 2, "<table><colgroup>", "</colgroup></table>" ],
+	tr: [ 2, "<table><tbody>", "</tbody></table>" ],
+	td: [ 3, "<table><tbody><tr>", "</tr></tbody></table>" ],
+
+	_default: [ 0, "", "" ]
+};
+
+wrapMap.tbody = wrapMap.tfoot = wrapMap.colgroup = wrapMap.caption = wrapMap.thead;
+wrapMap.th = wrapMap.td;
+
+// Support: IE <=9 only
+if ( !support.option ) {
+	wrapMap.optgroup = wrapMap.option = [ 1, "<select multiple='multiple'>", "</select>" ];
+}
+
+
+function getAll( context, tag ) {
+
+	// Support: IE <=9 - 11 only
+	// Use typeof to avoid zero-argument method invocation on host objects (#15151)
+	var ret;
+
+	if ( typeof context.getElementsByTagName !== "undefined" ) {
+		ret = context.getElementsByTagName( tag || "*" );
+
+	} else if ( typeof context.querySelectorAll !== "undefined" ) {
+		ret = context.querySelectorAll( tag || "*" );
+
+	} else {
+		ret = [];
+	}
+
+	if ( tag === undefined || tag && nodeName( context, tag ) ) {
+		return jQuery.merge( [ context ], ret );
+	}
+
+	return ret;
+}
+
+
+// Mark scripts as having already been evaluated
+function setGlobalEval( elems, refElements ) {
+	var i = 0,
+		l = elems.length;
+
+	for ( ; i < l; i++ ) {
+		dataPriv.set(
+			elems[ i ],
+			"globalEval",
+			!refElements || dataPriv.get( refElements[ i ], "globalEval" )
+		);
+	}
+}
+
+
+var rhtml = /<|&#?\w+;/;
+
+function buildFragment( elems, context, scripts, selection, ignored ) {
+	var elem, tmp, tag, wrap, attached, j,
+		fragment = context.createDocumentFragment(),
+		nodes = [],
+		i = 0,
+		l = elems.length;
+
+	for ( ; i < l; i++ ) {
+		elem = elems[ i ];
+
+		if ( elem || elem === 0 ) {
+
+			// Add nodes directly
+			if ( toType( elem ) === "object" ) {
+
+				// Support: Android <=4.0 only, PhantomJS 1 only
+				// push.apply(_, arraylike) throws on ancient WebKit
+				jQuery.merge( nodes, elem.nodeType ? [ elem ] : elem );
+
+			// Convert non-html into a text node
+			} else if ( !rhtml.test( elem ) ) {
+				nodes.push( context.createTextNode( elem ) );
+
+			// Convert html into DOM nodes
+			} else {
+				tmp = tmp || fragment.appendChild( context.createElement( "div" ) );
+
+				// Deserialize a standard representation
+				tag = ( rtagName.exec( elem ) || [ "", "" ] )[ 1 ].toLowerCase();
+				wrap = wrapMap[ tag ] || wrapMap._default;
+				tmp.innerHTML = wrap[ 1 ] + jQuery.htmlPrefilter( elem ) + wrap[ 2 ];
+
+				// Descend through wrappers to the right content
+				j = wrap[ 0 ];
+				while ( j-- ) {
+					tmp = tmp.lastChild;
+				}
+
+				// Support: Android <=4.0 only, PhantomJS 1 only
+				// push.apply(_, arraylike) throws on ancient WebKit
+				jQuery.merge( nodes, tmp.childNodes );
+
+				// Remember the top-level container
+				tmp = fragment.firstChild;
+
+				// Ensure the created nodes are orphaned (#12392)
+				tmp.textContent = "";
+			}
+		}
+	}
+
+	// Remove wrapper from fragment
+	fragment.textContent = "";
+
+	i = 0;
+	while ( ( elem = nodes[ i++ ] ) ) {
+
+		// Skip elements already in the context collection (trac-4087)
+		if ( selection && jQuery.inArray( elem, selection ) > -1 ) {
+			if ( ignored ) {
+				ignored.push( elem );
+			}
+			continue;
+		}
+
+		attached = isAttached( elem );
+
+		// Append to fragment
+		tmp = getAll( fragment.appendChild( elem ), "script" );
+
+		// Preserve script evaluation history
+		if ( attached ) {
+			setGlobalEval( tmp );
+		}
+
+		// Capture executables
+		if ( scripts ) {
+			j = 0;
+			while ( ( elem = tmp[ j++ ] ) ) {
+				if ( rscriptType.test( elem.type || "" ) ) {
+					scripts.push( elem );
+				}
+			}
+		}
+	}
+
+	return fragment;
+}
+
+
+var rtypenamespace = /^([^.]*)(?:\.(.+)|)/;
+
+function returnTrue() {
+	return true;
+}
+
+function returnFalse() {
+	return false;
+}
+
+// Support: IE <=9 - 11+
+// focus() and blur() are asynchronous, except when they are no-op.
+// So expect focus to be synchronous when the element is already active, +// and blur to be synchronous when the element is not already active. +// (focus and blur are always synchronous in other supported browsers, +// this just defines when we can count on it). +function expectSync( elem, type ) { + return ( elem === safeActiveElement() ) === ( type === "focus" ); +} + +// Support: IE <=9 only +// Accessing document.activeElement can throw unexpectedly +// https://bugs.jquery.com/ticket/13393 +function safeActiveElement() { + try { + return document.activeElement; + } catch ( err ) { } +} + +function on( elem, types, selector, data, fn, one ) { + var origFn, type; + + // Types can be a map of types/handlers + if ( typeof types === "object" ) { + + // ( types-Object, selector, data ) + if ( typeof selector !== "string" ) { + + // ( types-Object, data ) + data = data || selector; + selector = undefined; + } + for ( type in types ) { + on( elem, type, selector, data, types[ type ], one ); + } + return elem; + } + + if ( data == null && fn == null ) { + + // ( types, fn ) + fn = selector; + data = selector = undefined; + } else if ( fn == null ) { + if ( typeof selector === "string" ) { + + // ( types, selector, fn ) + fn = data; + data = undefined; + } else { + + // ( types, data, fn ) + fn = data; + data = selector; + selector = undefined; + } + } + if ( fn === false ) { + fn = returnFalse; + } else if ( !fn ) { + return elem; + } + + if ( one === 1 ) { + origFn = fn; + fn = function( event ) { + + // Can use an empty set, since event contains the info + jQuery().off( event ); + return origFn.apply( this, arguments ); + }; + + // Use same guid so caller can remove using origFn + fn.guid = origFn.guid || ( origFn.guid = jQuery.guid++ ); + } + return elem.each( function() { + jQuery.event.add( this, types, fn, data, selector ); + } ); +} + +/* + * Helper functions for managing events -- not part of the public interface. + * Props to Dean Edwards' addEvent library for many of the ideas. + */ +jQuery.event = { + + global: {}, + + add: function( elem, types, handler, data, selector ) { + + var handleObjIn, eventHandle, tmp, + events, t, handleObj, + special, handlers, type, namespaces, origType, + elemData = dataPriv.get( elem ); + + // Only attach events to objects that accept data + if ( !acceptData( elem ) ) { + return; + } + + // Caller can pass in an object of custom data in lieu of the handler + if ( handler.handler ) { + handleObjIn = handler; + handler = handleObjIn.handler; + selector = handleObjIn.selector; + } + + // Ensure that invalid selectors throw exceptions at attach time + // Evaluate against documentElement in case elem is a non-element node (e.g., document) + if ( selector ) { + jQuery.find.matchesSelector( documentElement, selector ); + } + + // Make sure that the handler has a unique ID, used to find/remove it later + if ( !handler.guid ) { + handler.guid = jQuery.guid++; + } + + // Init the element's event structure and main handler, if this is the first + if ( !( events = elemData.events ) ) { + events = elemData.events = Object.create( null ); + } + if ( !( eventHandle = elemData.handle ) ) { + eventHandle = elemData.handle = function( e ) { + + // Discard the second event of a jQuery.event.trigger() and + // when an event is called after a page has unloaded + return typeof jQuery !== "undefined" && jQuery.event.triggered !== e.type ? 
+ jQuery.event.dispatch.apply( elem, arguments ) : undefined; + }; + } + + // Handle multiple events separated by a space + types = ( types || "" ).match( rnothtmlwhite ) || [ "" ]; + t = types.length; + while ( t-- ) { + tmp = rtypenamespace.exec( types[ t ] ) || []; + type = origType = tmp[ 1 ]; + namespaces = ( tmp[ 2 ] || "" ).split( "." ).sort(); + + // There *must* be a type, no attaching namespace-only handlers + if ( !type ) { + continue; + } + + // If event changes its type, use the special event handlers for the changed type + special = jQuery.event.special[ type ] || {}; + + // If selector defined, determine special event api type, otherwise given type + type = ( selector ? special.delegateType : special.bindType ) || type; + + // Update special based on newly reset type + special = jQuery.event.special[ type ] || {}; + + // handleObj is passed to all event handlers + handleObj = jQuery.extend( { + type: type, + origType: origType, + data: data, + handler: handler, + guid: handler.guid, + selector: selector, + needsContext: selector && jQuery.expr.match.needsContext.test( selector ), + namespace: namespaces.join( "." ) + }, handleObjIn ); + + // Init the event handler queue if we're the first + if ( !( handlers = events[ type ] ) ) { + handlers = events[ type ] = []; + handlers.delegateCount = 0; + + // Only use addEventListener if the special events handler returns false + if ( !special.setup || + special.setup.call( elem, data, namespaces, eventHandle ) === false ) { + + if ( elem.addEventListener ) { + elem.addEventListener( type, eventHandle ); + } + } + } + + if ( special.add ) { + special.add.call( elem, handleObj ); + + if ( !handleObj.handler.guid ) { + handleObj.handler.guid = handler.guid; + } + } + + // Add to the element's handler list, delegates in front + if ( selector ) { + handlers.splice( handlers.delegateCount++, 0, handleObj ); + } else { + handlers.push( handleObj ); + } + + // Keep track of which events have ever been used, for event optimization + jQuery.event.global[ type ] = true; + } + + }, + + // Detach an event or set of events from an element + remove: function( elem, types, handler, selector, mappedTypes ) { + + var j, origCount, tmp, + events, t, handleObj, + special, handlers, type, namespaces, origType, + elemData = dataPriv.hasData( elem ) && dataPriv.get( elem ); + + if ( !elemData || !( events = elemData.events ) ) { + return; + } + + // Once for each type.namespace in types; type may be omitted + types = ( types || "" ).match( rnothtmlwhite ) || [ "" ]; + t = types.length; + while ( t-- ) { + tmp = rtypenamespace.exec( types[ t ] ) || []; + type = origType = tmp[ 1 ]; + namespaces = ( tmp[ 2 ] || "" ).split( "." ).sort(); + + // Unbind all events (on this namespace, if provided) for the element + if ( !type ) { + for ( type in events ) { + jQuery.event.remove( elem, type + types[ t ], handler, selector, true ); + } + continue; + } + + special = jQuery.event.special[ type ] || {}; + type = ( selector ? 
special.delegateType : special.bindType ) || type; + handlers = events[ type ] || []; + tmp = tmp[ 2 ] && + new RegExp( "(^|\\.)" + namespaces.join( "\\.(?:.*\\.|)" ) + "(\\.|$)" ); + + // Remove matching events + origCount = j = handlers.length; + while ( j-- ) { + handleObj = handlers[ j ]; + + if ( ( mappedTypes || origType === handleObj.origType ) && + ( !handler || handler.guid === handleObj.guid ) && + ( !tmp || tmp.test( handleObj.namespace ) ) && + ( !selector || selector === handleObj.selector || + selector === "**" && handleObj.selector ) ) { + handlers.splice( j, 1 ); + + if ( handleObj.selector ) { + handlers.delegateCount--; + } + if ( special.remove ) { + special.remove.call( elem, handleObj ); + } + } + } + + // Remove generic event handler if we removed something and no more handlers exist + // (avoids potential for endless recursion during removal of special event handlers) + if ( origCount && !handlers.length ) { + if ( !special.teardown || + special.teardown.call( elem, namespaces, elemData.handle ) === false ) { + + jQuery.removeEvent( elem, type, elemData.handle ); + } + + delete events[ type ]; + } + } + + // Remove data and the expando if it's no longer used + if ( jQuery.isEmptyObject( events ) ) { + dataPriv.remove( elem, "handle events" ); + } + }, + + dispatch: function( nativeEvent ) { + + var i, j, ret, matched, handleObj, handlerQueue, + args = new Array( arguments.length ), + + // Make a writable jQuery.Event from the native event object + event = jQuery.event.fix( nativeEvent ), + + handlers = ( + dataPriv.get( this, "events" ) || Object.create( null ) + )[ event.type ] || [], + special = jQuery.event.special[ event.type ] || {}; + + // Use the fix-ed jQuery.Event rather than the (read-only) native event + args[ 0 ] = event; + + for ( i = 1; i < arguments.length; i++ ) { + args[ i ] = arguments[ i ]; + } + + event.delegateTarget = this; + + // Call the preDispatch hook for the mapped type, and let it bail if desired + if ( special.preDispatch && special.preDispatch.call( this, event ) === false ) { + return; + } + + // Determine handlers + handlerQueue = jQuery.event.handlers.call( this, event, handlers ); + + // Run delegates first; they may want to stop propagation beneath us + i = 0; + while ( ( matched = handlerQueue[ i++ ] ) && !event.isPropagationStopped() ) { + event.currentTarget = matched.elem; + + j = 0; + while ( ( handleObj = matched.handlers[ j++ ] ) && + !event.isImmediatePropagationStopped() ) { + + // If the event is namespaced, then each handler is only invoked if it is + // specially universal or its namespaces are a superset of the event's. 
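+					// [Editorial note] Illustrative sketch, not part of the jQuery
+					// source -- this check is what namespaced binding relies on:
+					//
+					//     jQuery( el ).on( "click.menu", fn );
+					//     jQuery( el ).trigger( "click.menu" );  // fn runs
+					//     jQuery( el ).trigger( "click.other" ); // fn is skipped
+					//     jQuery( el ).off( ".menu" );           // detaches fn
+					//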
+ if ( !event.rnamespace || handleObj.namespace === false || + event.rnamespace.test( handleObj.namespace ) ) { + + event.handleObj = handleObj; + event.data = handleObj.data; + + ret = ( ( jQuery.event.special[ handleObj.origType ] || {} ).handle || + handleObj.handler ).apply( matched.elem, args ); + + if ( ret !== undefined ) { + if ( ( event.result = ret ) === false ) { + event.preventDefault(); + event.stopPropagation(); + } + } + } + } + } + + // Call the postDispatch hook for the mapped type + if ( special.postDispatch ) { + special.postDispatch.call( this, event ); + } + + return event.result; + }, + + handlers: function( event, handlers ) { + var i, handleObj, sel, matchedHandlers, matchedSelectors, + handlerQueue = [], + delegateCount = handlers.delegateCount, + cur = event.target; + + // Find delegate handlers + if ( delegateCount && + + // Support: IE <=9 + // Black-hole SVG instance trees (trac-13180) + cur.nodeType && + + // Support: Firefox <=42 + // Suppress spec-violating clicks indicating a non-primary pointer button (trac-3861) + // https://www.w3.org/TR/DOM-Level-3-Events/#event-type-click + // Support: IE 11 only + // ...but not arrow key "clicks" of radio inputs, which can have `button` -1 (gh-2343) + !( event.type === "click" && event.button >= 1 ) ) { + + for ( ; cur !== this; cur = cur.parentNode || this ) { + + // Don't check non-elements (#13208) + // Don't process clicks on disabled elements (#6911, #8165, #11382, #11764) + if ( cur.nodeType === 1 && !( event.type === "click" && cur.disabled === true ) ) { + matchedHandlers = []; + matchedSelectors = {}; + for ( i = 0; i < delegateCount; i++ ) { + handleObj = handlers[ i ]; + + // Don't conflict with Object.prototype properties (#13203) + sel = handleObj.selector + " "; + + if ( matchedSelectors[ sel ] === undefined ) { + matchedSelectors[ sel ] = handleObj.needsContext ? + jQuery( sel, this ).index( cur ) > -1 : + jQuery.find( sel, this, null, [ cur ] ).length; + } + if ( matchedSelectors[ sel ] ) { + matchedHandlers.push( handleObj ); + } + } + if ( matchedHandlers.length ) { + handlerQueue.push( { elem: cur, handlers: matchedHandlers } ); + } + } + } + } + + // Add the remaining (directly-bound) handlers + cur = this; + if ( delegateCount < handlers.length ) { + handlerQueue.push( { elem: cur, handlers: handlers.slice( delegateCount ) } ); + } + + return handlerQueue; + }, + + addProp: function( name, hook ) { + Object.defineProperty( jQuery.Event.prototype, name, { + enumerable: true, + configurable: true, + + get: isFunction( hook ) ? + function() { + if ( this.originalEvent ) { + return hook( this.originalEvent ); + } + } : + function() { + if ( this.originalEvent ) { + return this.originalEvent[ name ]; + } + }, + + set: function( value ) { + Object.defineProperty( this, name, { + enumerable: true, + configurable: true, + writable: true, + value: value + } ); + } + } ); + }, + + fix: function( originalEvent ) { + return originalEvent[ jQuery.expando ] ? + originalEvent : + new jQuery.Event( originalEvent ); + }, + + special: { + load: { + + // Prevent triggered image.load events from bubbling to window.load + noBubble: true + }, + click: { + + // Utilize native event to ensure correct state for checkable inputs + setup: function( data ) { + + // For mutual compressibility with _default, replace `this` access with a local var. + // `|| data` is dead code meant only to preserve the variable through minification. 
+ var el = this || data; + + // Claim the first handler + if ( rcheckableType.test( el.type ) && + el.click && nodeName( el, "input" ) ) { + + // dataPriv.set( el, "click", ... ) + leverageNative( el, "click", returnTrue ); + } + + // Return false to allow normal processing in the caller + return false; + }, + trigger: function( data ) { + + // For mutual compressibility with _default, replace `this` access with a local var. + // `|| data` is dead code meant only to preserve the variable through minification. + var el = this || data; + + // Force setup before triggering a click + if ( rcheckableType.test( el.type ) && + el.click && nodeName( el, "input" ) ) { + + leverageNative( el, "click" ); + } + + // Return non-false to allow normal event-path propagation + return true; + }, + + // For cross-browser consistency, suppress native .click() on links + // Also prevent it if we're currently inside a leveraged native-event stack + _default: function( event ) { + var target = event.target; + return rcheckableType.test( target.type ) && + target.click && nodeName( target, "input" ) && + dataPriv.get( target, "click" ) || + nodeName( target, "a" ); + } + }, + + beforeunload: { + postDispatch: function( event ) { + + // Support: Firefox 20+ + // Firefox doesn't alert if the returnValue field is not set. + if ( event.result !== undefined && event.originalEvent ) { + event.originalEvent.returnValue = event.result; + } + } + } + } +}; + +// Ensure the presence of an event listener that handles manually-triggered +// synthetic events by interrupting progress until reinvoked in response to +// *native* events that it fires directly, ensuring that state changes have +// already occurred before other listeners are invoked. +function leverageNative( el, type, expectSync ) { + + // Missing expectSync indicates a trigger call, which must force setup through jQuery.event.add + if ( !expectSync ) { + if ( dataPriv.get( el, type ) === undefined ) { + jQuery.event.add( el, type, returnTrue ); + } + return; + } + + // Register the controller as a special universal handler for all event namespaces + dataPriv.set( el, type, false ); + jQuery.event.add( el, type, { + namespace: false, + handler: function( event ) { + var notAsync, result, + saved = dataPriv.get( this, type ); + + if ( ( event.isTrigger & 1 ) && this[ type ] ) { + + // Interrupt processing of the outer synthetic .trigger()ed event + // Saved data should be false in such cases, but might be a leftover capture object + // from an async native handler (gh-4350) + if ( !saved.length ) { + + // Store arguments for use when handling the inner native event + // There will always be at least one argument (an event object), so this array + // will not be confused with a leftover capture object. + saved = slice.call( arguments ); + dataPriv.set( this, type, saved ); + + // Trigger the native event and capture its result + // Support: IE <=9 - 11+ + // focus() and blur() are asynchronous + notAsync = expectSync( this, type ); + this[ type ](); + result = dataPriv.get( this, type ); + if ( saved !== result || notAsync ) { + dataPriv.set( this, type, false ); + } else { + result = {}; + } + if ( saved !== result ) { + + // Cancel the outer synthetic event + event.stopImmediatePropagation(); + event.preventDefault(); + + // Support: Chrome 86+ + // In Chrome, if an element having a focusout handler is blurred by + // clicking outside of it, it invokes the handler synchronously. 
If + // that handler calls `.remove()` on the element, the data is cleared, + // leaving `result` undefined. We need to guard against this. + return result && result.value; + } + + // If this is an inner synthetic event for an event with a bubbling surrogate + // (focus or blur), assume that the surrogate already propagated from triggering the + // native event and prevent that from happening again here. + // This technically gets the ordering wrong w.r.t. to `.trigger()` (in which the + // bubbling surrogate propagates *after* the non-bubbling base), but that seems + // less bad than duplication. + } else if ( ( jQuery.event.special[ type ] || {} ).delegateType ) { + event.stopPropagation(); + } + + // If this is a native event triggered above, everything is now in order + // Fire an inner synthetic event with the original arguments + } else if ( saved.length ) { + + // ...and capture the result + dataPriv.set( this, type, { + value: jQuery.event.trigger( + + // Support: IE <=9 - 11+ + // Extend with the prototype to reset the above stopImmediatePropagation() + jQuery.extend( saved[ 0 ], jQuery.Event.prototype ), + saved.slice( 1 ), + this + ) + } ); + + // Abort handling of the native event + event.stopImmediatePropagation(); + } + } + } ); +} + +jQuery.removeEvent = function( elem, type, handle ) { + + // This "if" is needed for plain objects + if ( elem.removeEventListener ) { + elem.removeEventListener( type, handle ); + } +}; + +jQuery.Event = function( src, props ) { + + // Allow instantiation without the 'new' keyword + if ( !( this instanceof jQuery.Event ) ) { + return new jQuery.Event( src, props ); + } + + // Event object + if ( src && src.type ) { + this.originalEvent = src; + this.type = src.type; + + // Events bubbling up the document may have been marked as prevented + // by a handler lower down the tree; reflect the correct value. + this.isDefaultPrevented = src.defaultPrevented || + src.defaultPrevented === undefined && + + // Support: Android <=2.3 only + src.returnValue === false ? + returnTrue : + returnFalse; + + // Create target properties + // Support: Safari <=6 - 7 only + // Target should not be a text node (#504, #13143) + this.target = ( src.target && src.target.nodeType === 3 ) ? 
+ src.target.parentNode : + src.target; + + this.currentTarget = src.currentTarget; + this.relatedTarget = src.relatedTarget; + + // Event type + } else { + this.type = src; + } + + // Put explicitly provided properties onto the event object + if ( props ) { + jQuery.extend( this, props ); + } + + // Create a timestamp if incoming event doesn't have one + this.timeStamp = src && src.timeStamp || Date.now(); + + // Mark it as fixed + this[ jQuery.expando ] = true; +}; + +// jQuery.Event is based on DOM3 Events as specified by the ECMAScript Language Binding +// https://www.w3.org/TR/2003/WD-DOM-Level-3-Events-20030331/ecma-script-binding.html +jQuery.Event.prototype = { + constructor: jQuery.Event, + isDefaultPrevented: returnFalse, + isPropagationStopped: returnFalse, + isImmediatePropagationStopped: returnFalse, + isSimulated: false, + + preventDefault: function() { + var e = this.originalEvent; + + this.isDefaultPrevented = returnTrue; + + if ( e && !this.isSimulated ) { + e.preventDefault(); + } + }, + stopPropagation: function() { + var e = this.originalEvent; + + this.isPropagationStopped = returnTrue; + + if ( e && !this.isSimulated ) { + e.stopPropagation(); + } + }, + stopImmediatePropagation: function() { + var e = this.originalEvent; + + this.isImmediatePropagationStopped = returnTrue; + + if ( e && !this.isSimulated ) { + e.stopImmediatePropagation(); + } + + this.stopPropagation(); + } +}; + +// Includes all common event props including KeyEvent and MouseEvent specific props +jQuery.each( { + altKey: true, + bubbles: true, + cancelable: true, + changedTouches: true, + ctrlKey: true, + detail: true, + eventPhase: true, + metaKey: true, + pageX: true, + pageY: true, + shiftKey: true, + view: true, + "char": true, + code: true, + charCode: true, + key: true, + keyCode: true, + button: true, + buttons: true, + clientX: true, + clientY: true, + offsetX: true, + offsetY: true, + pointerId: true, + pointerType: true, + screenX: true, + screenY: true, + targetTouches: true, + toElement: true, + touches: true, + which: true +}, jQuery.event.addProp ); + +jQuery.each( { focus: "focusin", blur: "focusout" }, function( type, delegateType ) { + jQuery.event.special[ type ] = { + + // Utilize native event if possible so blur/focus sequence is correct + setup: function() { + + // Claim the first handler + // dataPriv.set( this, "focus", ... ) + // dataPriv.set( this, "blur", ... ) + leverageNative( this, type, expectSync ); + + // Return false to allow normal processing in the caller + return false; + }, + trigger: function() { + + // Force setup before trigger + leverageNative( this, type ); + + // Return non-false to allow normal event-path propagation + return true; + }, + + // Suppress native focus or blur as it's already being fired + // in leverageNative. + _default: function() { + return true; + }, + + delegateType: delegateType + }; +} ); + +// Create mouseenter/leave events using mouseover/out and event-time checks +// so that event delegation works in jQuery. +// Do the same for pointerenter/pointerleave and pointerover/pointerout +// +// Support: Safari 7 only +// Safari sends mouseenter too often; see: +// https://bugs.chromium.org/p/chromium/issues/detail?id=470258 +// for the description of the bug (it existed in older Chrome versions as well). 
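+//
+// [Editorial note] Illustrative sketch, not part of the jQuery source: since
+// native mouseenter/mouseleave do not bubble, the mapping below is what makes
+// their delegated form work, e.g.:
+//
+//     jQuery( "#menu" ).on( "mouseenter", "li", function() {
+//         jQuery( this ).addClass( "hover" );
+//     } );
+//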
+jQuery.each( {
+	mouseenter: "mouseover",
+	mouseleave: "mouseout",
+	pointerenter: "pointerover",
+	pointerleave: "pointerout"
+}, function( orig, fix ) {
+	jQuery.event.special[ orig ] = {
+		delegateType: fix,
+		bindType: fix,
+
+		handle: function( event ) {
+			var ret,
+				target = this,
+				related = event.relatedTarget,
+				handleObj = event.handleObj;
+
+			// For mouseenter/leave call the handler if related is outside the target.
+			// NB: No relatedTarget if the mouse left/entered the browser window
+			if ( !related || ( related !== target && !jQuery.contains( target, related ) ) ) {
+				event.type = handleObj.origType;
+				ret = handleObj.handler.apply( this, arguments );
+				event.type = fix;
+			}
+			return ret;
+		}
+	};
+} );
+
+jQuery.fn.extend( {
+
+	on: function( types, selector, data, fn ) {
+		return on( this, types, selector, data, fn );
+	},
+	one: function( types, selector, data, fn ) {
+		return on( this, types, selector, data, fn, 1 );
+	},
+	off: function( types, selector, fn ) {
+		var handleObj, type;
+		if ( types && types.preventDefault && types.handleObj ) {
+
+			// ( event )  dispatched jQuery.Event
+			handleObj = types.handleObj;
+			jQuery( types.delegateTarget ).off(
+				handleObj.namespace ?
+					handleObj.origType + "." + handleObj.namespace :
+					handleObj.origType,
+				handleObj.selector,
+				handleObj.handler
+			);
+			return this;
+		}
+		if ( typeof types === "object" ) {
+
+			// ( types-object [, selector] )
+			for ( type in types ) {
+				this.off( type, selector, types[ type ] );
+			}
+			return this;
+		}
+		if ( selector === false || typeof selector === "function" ) {
+
+			// ( types [, fn] )
+			fn = selector;
+			selector = undefined;
+		}
+		if ( fn === false ) {
+			fn = returnFalse;
+		}
+		return this.each( function() {
+			jQuery.event.remove( this, types, fn, selector );
+		} );
+	}
+} );
+
+
+var
+
+	// Support: IE <=10 - 11, Edge 12 - 13 only
+	// In IE/Edge using regex groups here causes severe slowdowns.
+	// See https://connect.microsoft.com/IE/feedback/details/1736512/
+	rnoInnerhtml = /<script|<style|<link/i,
+
+	// checked="checked" or checked
+	rchecked = /checked\s*(?:[^=]|=\s*.checked.)/i,
+
+	rcleanScript = /^\s*<!(?:\[CDATA\[|--)|(?:\]\]|--)>\s*$/g;
+
+// Prefer a tbody over its parent table for containing new rows
+function manipulationTarget( elem, content ) {
+	if ( nodeName( elem, "table" ) &&
+		nodeName( content.nodeType !== 11 ? content : content.firstChild, "tr" ) ) {
+
+		return jQuery( elem ).children( "tbody" )[ 0 ] || elem;
+	}
+
+	return elem;
+}
+
+// Replace/restore the type attribute of script elements for safe DOM manipulation
+function disableScript( elem ) {
+	elem.type = ( elem.getAttribute( "type" ) !== null ) + "/" + elem.type;
+	return elem;
+}
+function restoreScript( elem ) {
+	if ( ( elem.type || "" ).slice( 0, 5 ) === "true/" ) {
+		elem.type = elem.type.slice( 5 );
+	} else {
+		elem.removeAttribute( "type" );
+	}
+
+	return elem;
+}
+
+function cloneCopyEvent( src, dest ) {
+	var i, l, type, pdataOld, udataOld, udataCur, events;
+
+	if ( dest.nodeType !== 1 ) {
+		return;
+	}
+
+	// 1. Copy private data: events, handlers, etc.
+	if ( dataPriv.hasData( src ) ) {
+		pdataOld = dataPriv.get( src );
+		events = pdataOld.events;
+
+		if ( events ) {
+			dataPriv.remove( dest, "handle events" );
+
+			for ( type in events ) {
+				for ( i = 0, l = events[ type ].length; i < l; i++ ) {
+					jQuery.event.add( dest, type, events[ type ][ i ] );
+				}
+			}
+		}
+	}
+
+	// 2.
Copy user data + if ( dataUser.hasData( src ) ) { + udataOld = dataUser.access( src ); + udataCur = jQuery.extend( {}, udataOld ); + + dataUser.set( dest, udataCur ); + } +} + +// Fix IE bugs, see support tests +function fixInput( src, dest ) { + var nodeName = dest.nodeName.toLowerCase(); + + // Fails to persist the checked state of a cloned checkbox or radio button. + if ( nodeName === "input" && rcheckableType.test( src.type ) ) { + dest.checked = src.checked; + + // Fails to return the selected option to the default selected state when cloning options + } else if ( nodeName === "input" || nodeName === "textarea" ) { + dest.defaultValue = src.defaultValue; + } +} + +function domManip( collection, args, callback, ignored ) { + + // Flatten any nested arrays + args = flat( args ); + + var fragment, first, scripts, hasScripts, node, doc, + i = 0, + l = collection.length, + iNoClone = l - 1, + value = args[ 0 ], + valueIsFunction = isFunction( value ); + + // We can't cloneNode fragments that contain checked, in WebKit + if ( valueIsFunction || + ( l > 1 && typeof value === "string" && + !support.checkClone && rchecked.test( value ) ) ) { + return collection.each( function( index ) { + var self = collection.eq( index ); + if ( valueIsFunction ) { + args[ 0 ] = value.call( this, index, self.html() ); + } + domManip( self, args, callback, ignored ); + } ); + } + + if ( l ) { + fragment = buildFragment( args, collection[ 0 ].ownerDocument, false, collection, ignored ); + first = fragment.firstChild; + + if ( fragment.childNodes.length === 1 ) { + fragment = first; + } + + // Require either new content or an interest in ignored elements to invoke the callback + if ( first || ignored ) { + scripts = jQuery.map( getAll( fragment, "script" ), disableScript ); + hasScripts = scripts.length; + + // Use the original fragment for the last item + // instead of the first because it can end up + // being emptied incorrectly in certain situations (#8070). + for ( ; i < l; i++ ) { + node = fragment; + + if ( i !== iNoClone ) { + node = jQuery.clone( node, true, true ); + + // Keep references to cloned scripts for later restoration + if ( hasScripts ) { + + // Support: Android <=4.0 only, PhantomJS 1 only + // push.apply(_, arraylike) throws on ancient WebKit + jQuery.merge( scripts, getAll( node, "script" ) ); + } + } + + callback.call( collection[ i ], node, i ); + } + + if ( hasScripts ) { + doc = scripts[ scripts.length - 1 ].ownerDocument; + + // Reenable scripts + jQuery.map( scripts, restoreScript ); + + // Evaluate executable scripts on first document insertion + for ( i = 0; i < hasScripts; i++ ) { + node = scripts[ i ]; + if ( rscriptType.test( node.type || "" ) && + !dataPriv.access( node, "globalEval" ) && + jQuery.contains( doc, node ) ) { + + if ( node.src && ( node.type || "" ).toLowerCase() !== "module" ) { + + // Optional AJAX dependency, but won't run scripts if not present + if ( jQuery._evalUrl && !node.noModule ) { + jQuery._evalUrl( node.src, { + nonce: node.nonce || node.getAttribute( "nonce" ) + }, doc ); + } + } else { + DOMEval( node.textContent.replace( rcleanScript, "" ), node, doc ); + } + } + } + } + } + } + + return collection; +} + +function remove( elem, selector, keepData ) { + var node, + nodes = selector ? 
jQuery.filter( selector, elem ) : elem, + i = 0; + + for ( ; ( node = nodes[ i ] ) != null; i++ ) { + if ( !keepData && node.nodeType === 1 ) { + jQuery.cleanData( getAll( node ) ); + } + + if ( node.parentNode ) { + if ( keepData && isAttached( node ) ) { + setGlobalEval( getAll( node, "script" ) ); + } + node.parentNode.removeChild( node ); + } + } + + return elem; +} + +jQuery.extend( { + htmlPrefilter: function( html ) { + return html; + }, + + clone: function( elem, dataAndEvents, deepDataAndEvents ) { + var i, l, srcElements, destElements, + clone = elem.cloneNode( true ), + inPage = isAttached( elem ); + + // Fix IE cloning issues + if ( !support.noCloneChecked && ( elem.nodeType === 1 || elem.nodeType === 11 ) && + !jQuery.isXMLDoc( elem ) ) { + + // We eschew Sizzle here for performance reasons: https://jsperf.com/getall-vs-sizzle/2 + destElements = getAll( clone ); + srcElements = getAll( elem ); + + for ( i = 0, l = srcElements.length; i < l; i++ ) { + fixInput( srcElements[ i ], destElements[ i ] ); + } + } + + // Copy the events from the original to the clone + if ( dataAndEvents ) { + if ( deepDataAndEvents ) { + srcElements = srcElements || getAll( elem ); + destElements = destElements || getAll( clone ); + + for ( i = 0, l = srcElements.length; i < l; i++ ) { + cloneCopyEvent( srcElements[ i ], destElements[ i ] ); + } + } else { + cloneCopyEvent( elem, clone ); + } + } + + // Preserve script evaluation history + destElements = getAll( clone, "script" ); + if ( destElements.length > 0 ) { + setGlobalEval( destElements, !inPage && getAll( elem, "script" ) ); + } + + // Return the cloned set + return clone; + }, + + cleanData: function( elems ) { + var data, elem, type, + special = jQuery.event.special, + i = 0; + + for ( ; ( elem = elems[ i ] ) !== undefined; i++ ) { + if ( acceptData( elem ) ) { + if ( ( data = elem[ dataPriv.expando ] ) ) { + if ( data.events ) { + for ( type in data.events ) { + if ( special[ type ] ) { + jQuery.event.remove( elem, type ); + + // This is a shortcut to avoid jQuery.event.remove's overhead + } else { + jQuery.removeEvent( elem, type, data.handle ); + } + } + } + + // Support: Chrome <=35 - 45+ + // Assign undefined instead of using delete, see Data#remove + elem[ dataPriv.expando ] = undefined; + } + if ( elem[ dataUser.expando ] ) { + + // Support: Chrome <=35 - 45+ + // Assign undefined instead of using delete, see Data#remove + elem[ dataUser.expando ] = undefined; + } + } + } + } +} ); + +jQuery.fn.extend( { + detach: function( selector ) { + return remove( this, selector, true ); + }, + + remove: function( selector ) { + return remove( this, selector ); + }, + + text: function( value ) { + return access( this, function( value ) { + return value === undefined ? 
+ jQuery.text( this ) : + this.empty().each( function() { + if ( this.nodeType === 1 || this.nodeType === 11 || this.nodeType === 9 ) { + this.textContent = value; + } + } ); + }, null, value, arguments.length ); + }, + + append: function() { + return domManip( this, arguments, function( elem ) { + if ( this.nodeType === 1 || this.nodeType === 11 || this.nodeType === 9 ) { + var target = manipulationTarget( this, elem ); + target.appendChild( elem ); + } + } ); + }, + + prepend: function() { + return domManip( this, arguments, function( elem ) { + if ( this.nodeType === 1 || this.nodeType === 11 || this.nodeType === 9 ) { + var target = manipulationTarget( this, elem ); + target.insertBefore( elem, target.firstChild ); + } + } ); + }, + + before: function() { + return domManip( this, arguments, function( elem ) { + if ( this.parentNode ) { + this.parentNode.insertBefore( elem, this ); + } + } ); + }, + + after: function() { + return domManip( this, arguments, function( elem ) { + if ( this.parentNode ) { + this.parentNode.insertBefore( elem, this.nextSibling ); + } + } ); + }, + + empty: function() { + var elem, + i = 0; + + for ( ; ( elem = this[ i ] ) != null; i++ ) { + if ( elem.nodeType === 1 ) { + + // Prevent memory leaks + jQuery.cleanData( getAll( elem, false ) ); + + // Remove any remaining nodes + elem.textContent = ""; + } + } + + return this; + }, + + clone: function( dataAndEvents, deepDataAndEvents ) { + dataAndEvents = dataAndEvents == null ? false : dataAndEvents; + deepDataAndEvents = deepDataAndEvents == null ? dataAndEvents : deepDataAndEvents; + + return this.map( function() { + return jQuery.clone( this, dataAndEvents, deepDataAndEvents ); + } ); + }, + + html: function( value ) { + return access( this, function( value ) { + var elem = this[ 0 ] || {}, + i = 0, + l = this.length; + + if ( value === undefined && elem.nodeType === 1 ) { + return elem.innerHTML; + } + + // See if we can take a shortcut and just use innerHTML + if ( typeof value === "string" && !rnoInnerhtml.test( value ) && + !wrapMap[ ( rtagName.exec( value ) || [ "", "" ] )[ 1 ].toLowerCase() ] ) { + + value = jQuery.htmlPrefilter( value ); + + try { + for ( ; i < l; i++ ) { + elem = this[ i ] || {}; + + // Remove element nodes and prevent memory leaks + if ( elem.nodeType === 1 ) { + jQuery.cleanData( getAll( elem, false ) ); + elem.innerHTML = value; + } + } + + elem = 0; + + // If using innerHTML throws an exception, use the fallback method + } catch ( e ) {} + } + + if ( elem ) { + this.empty().append( value ); + } + }, null, value, arguments.length ); + }, + + replaceWith: function() { + var ignored = []; + + // Make the changes, replacing each non-ignored context element with the new content + return domManip( this, arguments, function( elem ) { + var parent = this.parentNode; + + if ( jQuery.inArray( this, ignored ) < 0 ) { + jQuery.cleanData( getAll( this ) ); + if ( parent ) { + parent.replaceChild( elem, this ); + } + } + + // Force callback invocation + }, ignored ); + } +} ); + +jQuery.each( { + appendTo: "append", + prependTo: "prepend", + insertBefore: "before", + insertAfter: "after", + replaceAll: "replaceWith" +}, function( name, original ) { + jQuery.fn[ name ] = function( selector ) { + var elems, + ret = [], + insert = jQuery( selector ), + last = insert.length - 1, + i = 0; + + for ( ; i <= last; i++ ) { + elems = i === last ? 
this : this.clone( true ); + jQuery( insert[ i ] )[ original ]( elems ); + + // Support: Android <=4.0 only, PhantomJS 1 only + // .get() because push.apply(_, arraylike) throws on ancient WebKit + push.apply( ret, elems.get() ); + } + + return this.pushStack( ret ); + }; +} ); +var rnumnonpx = new RegExp( "^(" + pnum + ")(?!px)[a-z%]+$", "i" ); + +var getStyles = function( elem ) { + + // Support: IE <=11 only, Firefox <=30 (#15098, #14150) + // IE throws on elements created in popups + // FF meanwhile throws on frame elements through "defaultView.getComputedStyle" + var view = elem.ownerDocument.defaultView; + + if ( !view || !view.opener ) { + view = window; + } + + return view.getComputedStyle( elem ); + }; + +var swap = function( elem, options, callback ) { + var ret, name, + old = {}; + + // Remember the old values, and insert the new ones + for ( name in options ) { + old[ name ] = elem.style[ name ]; + elem.style[ name ] = options[ name ]; + } + + ret = callback.call( elem ); + + // Revert the old values + for ( name in options ) { + elem.style[ name ] = old[ name ]; + } + + return ret; +}; + + +var rboxStyle = new RegExp( cssExpand.join( "|" ), "i" ); + + + +( function() { + + // Executing both pixelPosition & boxSizingReliable tests require only one layout + // so they're executed at the same time to save the second computation. + function computeStyleTests() { + + // This is a singleton, we need to execute it only once + if ( !div ) { + return; + } + + container.style.cssText = "position:absolute;left:-11111px;width:60px;" + + "margin-top:1px;padding:0;border:0"; + div.style.cssText = + "position:relative;display:block;box-sizing:border-box;overflow:scroll;" + + "margin:auto;border:1px;padding:1px;" + + "width:60%;top:1%"; + documentElement.appendChild( container ).appendChild( div ); + + var divStyle = window.getComputedStyle( div ); + pixelPositionVal = divStyle.top !== "1%"; + + // Support: Android 4.0 - 4.3 only, Firefox <=3 - 44 + reliableMarginLeftVal = roundPixelMeasures( divStyle.marginLeft ) === 12; + + // Support: Android 4.0 - 4.3 only, Safari <=9.1 - 10.1, iOS <=7.0 - 9.3 + // Some styles come back with percentage values, even though they shouldn't + div.style.right = "60%"; + pixelBoxStylesVal = roundPixelMeasures( divStyle.right ) === 36; + + // Support: IE 9 - 11 only + // Detect misreporting of content dimensions for box-sizing:border-box elements + boxSizingReliableVal = roundPixelMeasures( divStyle.width ) === 36; + + // Support: IE 9 only + // Detect overflow:scroll screwiness (gh-3699) + // Support: Chrome <=64 + // Don't get tricked when zoom affects offsetWidth (gh-4029) + div.style.position = "absolute"; + scrollboxSizeVal = roundPixelMeasures( div.offsetWidth / 3 ) === 12; + + documentElement.removeChild( container ); + + // Nullify the div so it wouldn't be stored in the memory and + // it will also be a sign that checks already performed + div = null; + } + + function roundPixelMeasures( measure ) { + return Math.round( parseFloat( measure ) ); + } + + var pixelPositionVal, boxSizingReliableVal, scrollboxSizeVal, pixelBoxStylesVal, + reliableTrDimensionsVal, reliableMarginLeftVal, + container = document.createElement( "div" ), + div = document.createElement( "div" ); + + // Finish early in limited (non-browser) environments + if ( !div.style ) { + return; + } + + // Support: IE <=9 - 11 only + // Style of cloned element affects source element cloned (#8908) + div.style.backgroundClip = "content-box"; + div.cloneNode( true ).style.backgroundClip = ""; + 
support.clearCloneStyle = div.style.backgroundClip === "content-box"; + + jQuery.extend( support, { + boxSizingReliable: function() { + computeStyleTests(); + return boxSizingReliableVal; + }, + pixelBoxStyles: function() { + computeStyleTests(); + return pixelBoxStylesVal; + }, + pixelPosition: function() { + computeStyleTests(); + return pixelPositionVal; + }, + reliableMarginLeft: function() { + computeStyleTests(); + return reliableMarginLeftVal; + }, + scrollboxSize: function() { + computeStyleTests(); + return scrollboxSizeVal; + }, + + // Support: IE 9 - 11+, Edge 15 - 18+ + // IE/Edge misreport `getComputedStyle` of table rows with width/height + // set in CSS while `offset*` properties report correct values. + // Behavior in IE 9 is more subtle than in newer versions & it passes + // some versions of this test; make sure not to make it pass there! + // + // Support: Firefox 70+ + // Only Firefox includes border widths + // in computed dimensions. (gh-4529) + reliableTrDimensions: function() { + var table, tr, trChild, trStyle; + if ( reliableTrDimensionsVal == null ) { + table = document.createElement( "table" ); + tr = document.createElement( "tr" ); + trChild = document.createElement( "div" ); + + table.style.cssText = "position:absolute;left:-11111px;border-collapse:separate"; + tr.style.cssText = "border:1px solid"; + + // Support: Chrome 86+ + // Height set through cssText does not get applied. + // Computed height then comes back as 0. + tr.style.height = "1px"; + trChild.style.height = "9px"; + + // Support: Android 8 Chrome 86+ + // In our bodyBackground.html iframe, + // display for all div elements is set to "inline", + // which causes a problem only in Android 8 Chrome 86. + // Ensuring the div is display: block + // gets around this issue. + trChild.style.display = "block"; + + documentElement + .appendChild( table ) + .appendChild( tr ) + .appendChild( trChild ); + + trStyle = window.getComputedStyle( tr ); + reliableTrDimensionsVal = ( parseInt( trStyle.height, 10 ) + + parseInt( trStyle.borderTopWidth, 10 ) + + parseInt( trStyle.borderBottomWidth, 10 ) ) === tr.offsetHeight; + + documentElement.removeChild( table ); + } + return reliableTrDimensionsVal; + } + } ); +} )(); + + +function curCSS( elem, name, computed ) { + var width, minWidth, maxWidth, ret, + + // Support: Firefox 51+ + // Retrieving style before computed somehow + // fixes an issue with getting wrong values + // on detached elements + style = elem.style; + + computed = computed || getStyles( elem ); + + // getPropertyValue is needed for: + // .css('filter') (IE 9 only, #12537) + // .css('--customProperty) (#3144) + if ( computed ) { + ret = computed.getPropertyValue( name ) || computed[ name ]; + + if ( ret === "" && !isAttached( elem ) ) { + ret = jQuery.style( elem, name ); + } + + // A tribute to the "awesome hack by Dean Edwards" + // Android Browser returns percentage for some values, + // but width seems to be reliably pixels. 
+ // This is against the CSSOM draft spec: + // https://drafts.csswg.org/cssom/#resolved-values + if ( !support.pixelBoxStyles() && rnumnonpx.test( ret ) && rboxStyle.test( name ) ) { + + // Remember the original values + width = style.width; + minWidth = style.minWidth; + maxWidth = style.maxWidth; + + // Put in the new values to get a computed value out + style.minWidth = style.maxWidth = style.width = ret; + ret = computed.width; + + // Revert the changed values + style.width = width; + style.minWidth = minWidth; + style.maxWidth = maxWidth; + } + } + + return ret !== undefined ? + + // Support: IE <=9 - 11 only + // IE returns zIndex value as an integer. + ret + "" : + ret; +} + + +function addGetHookIf( conditionFn, hookFn ) { + + // Define the hook, we'll check on the first run if it's really needed. + return { + get: function() { + if ( conditionFn() ) { + + // Hook not needed (or it's not possible to use it due + // to missing dependency), remove it. + delete this.get; + return; + } + + // Hook needed; redefine it so that the support test is not executed again. + return ( this.get = hookFn ).apply( this, arguments ); + } + }; +} + + +var cssPrefixes = [ "Webkit", "Moz", "ms" ], + emptyStyle = document.createElement( "div" ).style, + vendorProps = {}; + +// Return a vendor-prefixed property or undefined +function vendorPropName( name ) { + + // Check for vendor prefixed names + var capName = name[ 0 ].toUpperCase() + name.slice( 1 ), + i = cssPrefixes.length; + + while ( i-- ) { + name = cssPrefixes[ i ] + capName; + if ( name in emptyStyle ) { + return name; + } + } +} + +// Return a potentially-mapped jQuery.cssProps or vendor prefixed property +function finalPropName( name ) { + var final = jQuery.cssProps[ name ] || vendorProps[ name ]; + + if ( final ) { + return final; + } + if ( name in emptyStyle ) { + return name; + } + return vendorProps[ name ] = vendorPropName( name ) || name; +} + + +var + + // Swappable if display is none or starts with table + // except "table", "table-cell", or "table-caption" + // See here for display values: https://developer.mozilla.org/en-US/docs/CSS/display + rdisplayswap = /^(none|table(?!-c[ea]).+)/, + rcustomProp = /^--/, + cssShow = { position: "absolute", visibility: "hidden", display: "block" }, + cssNormalTransform = { + letterSpacing: "0", + fontWeight: "400" + }; + +function setPositiveNumber( _elem, value, subtract ) { + + // Any relative (+/-) values have already been + // normalized at this point + var matches = rcssNum.exec( value ); + return matches ? + + // Guard against undefined "subtract", e.g., when used as in cssHooks + Math.max( 0, matches[ 2 ] - ( subtract || 0 ) ) + ( matches[ 3 ] || "px" ) : + value; +} + +function boxModelAdjustment( elem, dimension, box, isBorderBox, styles, computedVal ) { + var i = dimension === "width" ? 1 : 0, + extra = 0, + delta = 0; + + // Adjustment may not be necessary + if ( box === ( isBorderBox ? 
"border" : "content" ) ) { + return 0; + } + + for ( ; i < 4; i += 2 ) { + + // Both box models exclude margin + if ( box === "margin" ) { + delta += jQuery.css( elem, box + cssExpand[ i ], true, styles ); + } + + // If we get here with a content-box, we're seeking "padding" or "border" or "margin" + if ( !isBorderBox ) { + + // Add padding + delta += jQuery.css( elem, "padding" + cssExpand[ i ], true, styles ); + + // For "border" or "margin", add border + if ( box !== "padding" ) { + delta += jQuery.css( elem, "border" + cssExpand[ i ] + "Width", true, styles ); + + // But still keep track of it otherwise + } else { + extra += jQuery.css( elem, "border" + cssExpand[ i ] + "Width", true, styles ); + } + + // If we get here with a border-box (content + padding + border), we're seeking "content" or + // "padding" or "margin" + } else { + + // For "content", subtract padding + if ( box === "content" ) { + delta -= jQuery.css( elem, "padding" + cssExpand[ i ], true, styles ); + } + + // For "content" or "padding", subtract border + if ( box !== "margin" ) { + delta -= jQuery.css( elem, "border" + cssExpand[ i ] + "Width", true, styles ); + } + } + } + + // Account for positive content-box scroll gutter when requested by providing computedVal + if ( !isBorderBox && computedVal >= 0 ) { + + // offsetWidth/offsetHeight is a rounded sum of content, padding, scroll gutter, and border + // Assuming integer scroll gutter, subtract the rest and round down + delta += Math.max( 0, Math.ceil( + elem[ "offset" + dimension[ 0 ].toUpperCase() + dimension.slice( 1 ) ] - + computedVal - + delta - + extra - + 0.5 + + // If offsetWidth/offsetHeight is unknown, then we can't determine content-box scroll gutter + // Use an explicit zero to avoid NaN (gh-3964) + ) ) || 0; + } + + return delta; +} + +function getWidthOrHeight( elem, dimension, extra ) { + + // Start with computed style + var styles = getStyles( elem ), + + // To avoid forcing a reflow, only fetch boxSizing if we need it (gh-4322). + // Fake content-box until we know it's needed to know the true value. + boxSizingNeeded = !support.boxSizingReliable() || extra, + isBorderBox = boxSizingNeeded && + jQuery.css( elem, "boxSizing", false, styles ) === "border-box", + valueIsBorderBox = isBorderBox, + + val = curCSS( elem, dimension, styles ), + offsetProp = "offset" + dimension[ 0 ].toUpperCase() + dimension.slice( 1 ); + + // Support: Firefox <=54 + // Return a confounding non-pixel value or feign ignorance, as appropriate. + if ( rnumnonpx.test( val ) ) { + if ( !extra ) { + return val; + } + val = "auto"; + } + + + // Support: IE 9 - 11 only + // Use offsetWidth/offsetHeight for when box sizing is unreliable. + // In those cases, the computed value can be trusted to be border-box. + if ( ( !support.boxSizingReliable() && isBorderBox || + + // Support: IE 10 - 11+, Edge 15 - 18+ + // IE/Edge misreport `getComputedStyle` of table rows with width/height + // set in CSS while `offset*` properties report correct values. + // Interestingly, in some cases IE 9 doesn't suffer from this issue. 
+ !support.reliableTrDimensions() && nodeName( elem, "tr" ) || + + // Fall back to offsetWidth/offsetHeight when value is "auto" + // This happens for inline elements with no explicit setting (gh-3571) + val === "auto" || + + // Support: Android <=4.1 - 4.3 only + // Also use offsetWidth/offsetHeight for misreported inline dimensions (gh-3602) + !parseFloat( val ) && jQuery.css( elem, "display", false, styles ) === "inline" ) && + + // Make sure the element is visible & connected + elem.getClientRects().length ) { + + isBorderBox = jQuery.css( elem, "boxSizing", false, styles ) === "border-box"; + + // Where available, offsetWidth/offsetHeight approximate border box dimensions. + // Where not available (e.g., SVG), assume unreliable box-sizing and interpret the + // retrieved value as a content box dimension. + valueIsBorderBox = offsetProp in elem; + if ( valueIsBorderBox ) { + val = elem[ offsetProp ]; + } + } + + // Normalize "" and auto + val = parseFloat( val ) || 0; + + // Adjust for the element's box model + return ( val + + boxModelAdjustment( + elem, + dimension, + extra || ( isBorderBox ? "border" : "content" ), + valueIsBorderBox, + styles, + + // Provide the current computed size to request scroll gutter calculation (gh-3589) + val + ) + ) + "px"; +} + +jQuery.extend( { + + // Add in style property hooks for overriding the default + // behavior of getting and setting a style property + cssHooks: { + opacity: { + get: function( elem, computed ) { + if ( computed ) { + + // We should always get a number back from opacity + var ret = curCSS( elem, "opacity" ); + return ret === "" ? "1" : ret; + } + } + } + }, + + // Don't automatically add "px" to these possibly-unitless properties + cssNumber: { + "animationIterationCount": true, + "columnCount": true, + "fillOpacity": true, + "flexGrow": true, + "flexShrink": true, + "fontWeight": true, + "gridArea": true, + "gridColumn": true, + "gridColumnEnd": true, + "gridColumnStart": true, + "gridRow": true, + "gridRowEnd": true, + "gridRowStart": true, + "lineHeight": true, + "opacity": true, + "order": true, + "orphans": true, + "widows": true, + "zIndex": true, + "zoom": true + }, + + // Add in properties whose names you wish to fix before + // setting or getting the value + cssProps: {}, + + // Get and set the style property on a DOM Node + style: function( elem, name, value, extra ) { + + // Don't set styles on text and comment nodes + if ( !elem || elem.nodeType === 3 || elem.nodeType === 8 || !elem.style ) { + return; + } + + // Make sure that we're working with the right name + var ret, type, hooks, + origName = camelCase( name ), + isCustomProp = rcustomProp.test( name ), + style = elem.style; + + // Make sure that we're working with the right name. We don't + // want to query the value if it is a CSS custom property + // since they are user-defined. 
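+
+ // ( Aside, added for illustration; not part of upstream jQuery. Custom
+ //   properties keep their exact name and are routed through
+ //   style.setProperty() below, so with a hypothetical property name:
+ //
+ //       jQuery( ":root" ).css( "--main-color", "#663399" );
+ //       jQuery( ":root" ).css( "--main-color" );   // -> "#663399"
+ // )
+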
+ if ( !isCustomProp ) { + name = finalPropName( origName ); + } + + // Gets hook for the prefixed version, then unprefixed version + hooks = jQuery.cssHooks[ name ] || jQuery.cssHooks[ origName ]; + + // Check if we're setting a value + if ( value !== undefined ) { + type = typeof value; + + // Convert "+=" or "-=" to relative numbers (#7345) + if ( type === "string" && ( ret = rcssNum.exec( value ) ) && ret[ 1 ] ) { + value = adjustCSS( elem, name, ret ); + + // Fixes bug #9237 + type = "number"; + } + + // Make sure that null and NaN values aren't set (#7116) + if ( value == null || value !== value ) { + return; + } + + // If a number was passed in, add the unit (except for certain CSS properties) + // The isCustomProp check can be removed in jQuery 4.0 when we only auto-append + // "px" to a few hardcoded values. + if ( type === "number" && !isCustomProp ) { + value += ret && ret[ 3 ] || ( jQuery.cssNumber[ origName ] ? "" : "px" ); + } + + // background-* props affect original clone's values + if ( !support.clearCloneStyle && value === "" && name.indexOf( "background" ) === 0 ) { + style[ name ] = "inherit"; + } + + // If a hook was provided, use that value, otherwise just set the specified value + if ( !hooks || !( "set" in hooks ) || + ( value = hooks.set( elem, value, extra ) ) !== undefined ) { + + if ( isCustomProp ) { + style.setProperty( name, value ); + } else { + style[ name ] = value; + } + } + + } else { + + // If a hook was provided get the non-computed value from there + if ( hooks && "get" in hooks && + ( ret = hooks.get( elem, false, extra ) ) !== undefined ) { + + return ret; + } + + // Otherwise just get the value from the style object + return style[ name ]; + } + }, + + css: function( elem, name, extra, styles ) { + var val, num, hooks, + origName = camelCase( name ), + isCustomProp = rcustomProp.test( name ); + + // Make sure that we're working with the right name. We don't + // want to modify the value if it is a CSS custom property + // since they are user-defined. + if ( !isCustomProp ) { + name = finalPropName( origName ); + } + + // Try prefixed name followed by the unprefixed name + hooks = jQuery.cssHooks[ name ] || jQuery.cssHooks[ origName ]; + + // If a hook was provided get the computed value from there + if ( hooks && "get" in hooks ) { + val = hooks.get( elem, true, extra ); + } + + // Otherwise, if a way to get the computed value exists, use that + if ( val === undefined ) { + val = curCSS( elem, name, styles ); + } + + // Convert "normal" to computed value + if ( val === "normal" && name in cssNormalTransform ) { + val = cssNormalTransform[ name ]; + } + + // Make numeric if forced or a qualifier was provided and val looks numeric + if ( extra === "" || extra ) { + num = parseFloat( val ); + return extra === true || isFinite( num ) ? num || 0 : val; + } + + return val; + } +} ); + +jQuery.each( [ "height", "width" ], function( _i, dimension ) { + jQuery.cssHooks[ dimension ] = { + get: function( elem, computed, extra ) { + if ( computed ) { + + // Certain elements can have dimension info if we invisibly show them + // but it must have a current display style that would benefit + return rdisplayswap.test( jQuery.css( elem, "display" ) ) && + + // Support: Safari 8+ + // Table columns in Safari have non-zero offsetWidth & zero + // getBoundingClientRect().width unless display is changed. + // Support: IE <=11 only + // Running getBoundingClientRect on a disconnected node + // in IE throws an error. 
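+
+ // ( Aside, added for illustration; not part of upstream jQuery. For an
+ //   element hidden with display:none, the swap() branch below temporarily
+ //   applies cssShow ( position:absolute, visibility:hidden, display:block ),
+ //   measures, and restores the previous inline styles. That is why, for a
+ //   hypothetical selector,
+ //
+ //       jQuery( ".collapsed" ).css( "width" )
+ //
+ //   can still report a usable pixel width for a hidden element. )
+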
+ ( !elem.getClientRects().length || !elem.getBoundingClientRect().width ) ? + swap( elem, cssShow, function() { + return getWidthOrHeight( elem, dimension, extra ); + } ) : + getWidthOrHeight( elem, dimension, extra ); + } + }, + + set: function( elem, value, extra ) { + var matches, + styles = getStyles( elem ), + + // Only read styles.position if the test has a chance to fail + // to avoid forcing a reflow. + scrollboxSizeBuggy = !support.scrollboxSize() && + styles.position === "absolute", + + // To avoid forcing a reflow, only fetch boxSizing if we need it (gh-3991) + boxSizingNeeded = scrollboxSizeBuggy || extra, + isBorderBox = boxSizingNeeded && + jQuery.css( elem, "boxSizing", false, styles ) === "border-box", + subtract = extra ? + boxModelAdjustment( + elem, + dimension, + extra, + isBorderBox, + styles + ) : + 0; + + // Account for unreliable border-box dimensions by comparing offset* to computed and + // faking a content-box to get border and padding (gh-3699) + if ( isBorderBox && scrollboxSizeBuggy ) { + subtract -= Math.ceil( + elem[ "offset" + dimension[ 0 ].toUpperCase() + dimension.slice( 1 ) ] - + parseFloat( styles[ dimension ] ) - + boxModelAdjustment( elem, dimension, "border", false, styles ) - + 0.5 + ); + } + + // Convert to pixels if value adjustment is needed + if ( subtract && ( matches = rcssNum.exec( value ) ) && + ( matches[ 3 ] || "px" ) !== "px" ) { + + elem.style[ dimension ] = value; + value = jQuery.css( elem, dimension ); + } + + return setPositiveNumber( elem, value, subtract ); + } + }; +} ); + +jQuery.cssHooks.marginLeft = addGetHookIf( support.reliableMarginLeft, + function( elem, computed ) { + if ( computed ) { + return ( parseFloat( curCSS( elem, "marginLeft" ) ) || + elem.getBoundingClientRect().left - + swap( elem, { marginLeft: 0 }, function() { + return elem.getBoundingClientRect().left; + } ) + ) + "px"; + } + } +); + +// These hooks are used by animate to expand properties +jQuery.each( { + margin: "", + padding: "", + border: "Width" +}, function( prefix, suffix ) { + jQuery.cssHooks[ prefix + suffix ] = { + expand: function( value ) { + var i = 0, + expanded = {}, + + // Assumes a single number if not a string + parts = typeof value === "string" ? value.split( " " ) : [ value ]; + + for ( ; i < 4; i++ ) { + expanded[ prefix + cssExpand[ i ] + suffix ] = + parts[ i ] || parts[ i - 2 ] || parts[ 0 ]; + } + + return expanded; + } + }; + + if ( prefix !== "margin" ) { + jQuery.cssHooks[ prefix + suffix ].set = setPositiveNumber; + } +} ); + +jQuery.fn.extend( { + css: function( name, value ) { + return access( this, function( elem, name, value ) { + var styles, len, + map = {}, + i = 0; + + if ( Array.isArray( name ) ) { + styles = getStyles( elem ); + len = name.length; + + for ( ; i < len; i++ ) { + map[ name[ i ] ] = jQuery.css( elem, name[ i ], false, styles ); + } + + return map; + } + + return value !== undefined ? + jQuery.style( elem, name, value ) : + jQuery.css( elem, name ); + }, name, value, arguments.length > 1 ); + } +} ); + + +function Tween( elem, options, prop, end, easing ) { + return new Tween.prototype.init( elem, options, prop, end, easing ); +} +jQuery.Tween = Tween; + +Tween.prototype = { + constructor: Tween, + init: function( elem, options, prop, end, easing, unit ) { + this.elem = elem; + this.prop = prop; + this.easing = easing || jQuery.easing._default; + this.options = options; + this.start = this.now = this.cur(); + this.end = end; + this.unit = unit || ( jQuery.cssNumber[ prop ] ? 
"" : "px" ); + }, + cur: function() { + var hooks = Tween.propHooks[ this.prop ]; + + return hooks && hooks.get ? + hooks.get( this ) : + Tween.propHooks._default.get( this ); + }, + run: function( percent ) { + var eased, + hooks = Tween.propHooks[ this.prop ]; + + if ( this.options.duration ) { + this.pos = eased = jQuery.easing[ this.easing ]( + percent, this.options.duration * percent, 0, 1, this.options.duration + ); + } else { + this.pos = eased = percent; + } + this.now = ( this.end - this.start ) * eased + this.start; + + if ( this.options.step ) { + this.options.step.call( this.elem, this.now, this ); + } + + if ( hooks && hooks.set ) { + hooks.set( this ); + } else { + Tween.propHooks._default.set( this ); + } + return this; + } +}; + +Tween.prototype.init.prototype = Tween.prototype; + +Tween.propHooks = { + _default: { + get: function( tween ) { + var result; + + // Use a property on the element directly when it is not a DOM element, + // or when there is no matching style property that exists. + if ( tween.elem.nodeType !== 1 || + tween.elem[ tween.prop ] != null && tween.elem.style[ tween.prop ] == null ) { + return tween.elem[ tween.prop ]; + } + + // Passing an empty string as a 3rd parameter to .css will automatically + // attempt a parseFloat and fallback to a string if the parse fails. + // Simple values such as "10px" are parsed to Float; + // complex values such as "rotate(1rad)" are returned as-is. + result = jQuery.css( tween.elem, tween.prop, "" ); + + // Empty strings, null, undefined and "auto" are converted to 0. + return !result || result === "auto" ? 0 : result; + }, + set: function( tween ) { + + // Use step hook for back compat. + // Use cssHook if its there. + // Use .style if available and use plain properties where available. + if ( jQuery.fx.step[ tween.prop ] ) { + jQuery.fx.step[ tween.prop ]( tween ); + } else if ( tween.elem.nodeType === 1 && ( + jQuery.cssHooks[ tween.prop ] || + tween.elem.style[ finalPropName( tween.prop ) ] != null ) ) { + jQuery.style( tween.elem, tween.prop, tween.now + tween.unit ); + } else { + tween.elem[ tween.prop ] = tween.now; + } + } + } +}; + +// Support: IE <=9 only +// Panic based approach to setting things on disconnected nodes +Tween.propHooks.scrollTop = Tween.propHooks.scrollLeft = { + set: function( tween ) { + if ( tween.elem.nodeType && tween.elem.parentNode ) { + tween.elem[ tween.prop ] = tween.now; + } + } +}; + +jQuery.easing = { + linear: function( p ) { + return p; + }, + swing: function( p ) { + return 0.5 - Math.cos( p * Math.PI ) / 2; + }, + _default: "swing" +}; + +jQuery.fx = Tween.prototype.init; + +// Back compat <1.8 extension point +jQuery.fx.step = {}; + + + + +var + fxNow, inProgress, + rfxtypes = /^(?:toggle|show|hide)$/, + rrun = /queueHooks$/; + +function schedule() { + if ( inProgress ) { + if ( document.hidden === false && window.requestAnimationFrame ) { + window.requestAnimationFrame( schedule ); + } else { + window.setTimeout( schedule, jQuery.fx.interval ); + } + + jQuery.fx.tick(); + } +} + +// Animations created synchronously will run synchronously +function createFxNow() { + window.setTimeout( function() { + fxNow = undefined; + } ); + return ( fxNow = Date.now() ); +} + +// Generate parameters to create a standard animation +function genFx( type, includeWidth ) { + var which, + i = 0, + attrs = { height: type }; + + // If we include width, step value is 1 to do all cssExpand values, + // otherwise step value is 2 to skip over Left and Right + includeWidth = includeWidth ? 
1 : 0; + for ( ; i < 4; i += 2 - includeWidth ) { + which = cssExpand[ i ]; + attrs[ "margin" + which ] = attrs[ "padding" + which ] = type; + } + + if ( includeWidth ) { + attrs.opacity = attrs.width = type; + } + + return attrs; +} + +function createTween( value, prop, animation ) { + var tween, + collection = ( Animation.tweeners[ prop ] || [] ).concat( Animation.tweeners[ "*" ] ), + index = 0, + length = collection.length; + for ( ; index < length; index++ ) { + if ( ( tween = collection[ index ].call( animation, prop, value ) ) ) { + + // We're done with this property + return tween; + } + } +} + +function defaultPrefilter( elem, props, opts ) { + var prop, value, toggle, hooks, oldfire, propTween, restoreDisplay, display, + isBox = "width" in props || "height" in props, + anim = this, + orig = {}, + style = elem.style, + hidden = elem.nodeType && isHiddenWithinTree( elem ), + dataShow = dataPriv.get( elem, "fxshow" ); + + // Queue-skipping animations hijack the fx hooks + if ( !opts.queue ) { + hooks = jQuery._queueHooks( elem, "fx" ); + if ( hooks.unqueued == null ) { + hooks.unqueued = 0; + oldfire = hooks.empty.fire; + hooks.empty.fire = function() { + if ( !hooks.unqueued ) { + oldfire(); + } + }; + } + hooks.unqueued++; + + anim.always( function() { + + // Ensure the complete handler is called before this completes + anim.always( function() { + hooks.unqueued--; + if ( !jQuery.queue( elem, "fx" ).length ) { + hooks.empty.fire(); + } + } ); + } ); + } + + // Detect show/hide animations + for ( prop in props ) { + value = props[ prop ]; + if ( rfxtypes.test( value ) ) { + delete props[ prop ]; + toggle = toggle || value === "toggle"; + if ( value === ( hidden ? "hide" : "show" ) ) { + + // Pretend to be hidden if this is a "show" and + // there is still data from a stopped show/hide + if ( value === "show" && dataShow && dataShow[ prop ] !== undefined ) { + hidden = true; + + // Ignore all other no-op show/hide data + } else { + continue; + } + } + orig[ prop ] = dataShow && dataShow[ prop ] || jQuery.style( elem, prop ); + } + } + + // Bail out if this is a no-op like .hide().hide() + propTween = !jQuery.isEmptyObject( props ); + if ( !propTween && jQuery.isEmptyObject( orig ) ) { + return; + } + + // Restrict "overflow" and "display" styles during box animations + if ( isBox && elem.nodeType === 1 ) { + + // Support: IE <=9 - 11, Edge 12 - 15 + // Record all 3 overflow attributes because IE does not infer the shorthand + // from identically-valued overflowX and overflowY and Edge just mirrors + // the overflowX value there. 
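+
+ // ( Aside, added for illustration; not part of upstream jQuery. Any
+ //   width/height animation passes through this branch; for a hypothetical
+ //   element,
+ //
+ //       jQuery( "#panel" ).animate( { height: "toggle" }, 200 );
+ //
+ //   records the inline overflow values here, runs with overflow:hidden
+ //   forced for the duration, and restores the originals in the
+ //   anim.always() handler further down. )
+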
+ opts.overflow = [ style.overflow, style.overflowX, style.overflowY ]; + + // Identify a display type, preferring old show/hide data over the CSS cascade + restoreDisplay = dataShow && dataShow.display; + if ( restoreDisplay == null ) { + restoreDisplay = dataPriv.get( elem, "display" ); + } + display = jQuery.css( elem, "display" ); + if ( display === "none" ) { + if ( restoreDisplay ) { + display = restoreDisplay; + } else { + + // Get nonempty value(s) by temporarily forcing visibility + showHide( [ elem ], true ); + restoreDisplay = elem.style.display || restoreDisplay; + display = jQuery.css( elem, "display" ); + showHide( [ elem ] ); + } + } + + // Animate inline elements as inline-block + if ( display === "inline" || display === "inline-block" && restoreDisplay != null ) { + if ( jQuery.css( elem, "float" ) === "none" ) { + + // Restore the original display value at the end of pure show/hide animations + if ( !propTween ) { + anim.done( function() { + style.display = restoreDisplay; + } ); + if ( restoreDisplay == null ) { + display = style.display; + restoreDisplay = display === "none" ? "" : display; + } + } + style.display = "inline-block"; + } + } + } + + if ( opts.overflow ) { + style.overflow = "hidden"; + anim.always( function() { + style.overflow = opts.overflow[ 0 ]; + style.overflowX = opts.overflow[ 1 ]; + style.overflowY = opts.overflow[ 2 ]; + } ); + } + + // Implement show/hide animations + propTween = false; + for ( prop in orig ) { + + // General show/hide setup for this element animation + if ( !propTween ) { + if ( dataShow ) { + if ( "hidden" in dataShow ) { + hidden = dataShow.hidden; + } + } else { + dataShow = dataPriv.access( elem, "fxshow", { display: restoreDisplay } ); + } + + // Store hidden/visible for toggle so `.stop().toggle()` "reverses" + if ( toggle ) { + dataShow.hidden = !hidden; + } + + // Show elements before animating them + if ( hidden ) { + showHide( [ elem ], true ); + } + + /* eslint-disable no-loop-func */ + + anim.done( function() { + + /* eslint-enable no-loop-func */ + + // The final step of a "hide" animation is actually hiding the element + if ( !hidden ) { + showHide( [ elem ] ); + } + dataPriv.remove( elem, "fxshow" ); + for ( prop in orig ) { + jQuery.style( elem, prop, orig[ prop ] ); + } + } ); + } + + // Per-property setup + propTween = createTween( hidden ? dataShow[ prop ] : 0, prop, anim ); + if ( !( prop in dataShow ) ) { + dataShow[ prop ] = propTween.start; + if ( hidden ) { + propTween.end = propTween.start; + propTween.start = 0; + } + } + } +} + +function propFilter( props, specialEasing ) { + var index, name, easing, value, hooks; + + // camelCase, specialEasing and expand cssHook pass + for ( index in props ) { + name = camelCase( index ); + easing = specialEasing[ name ]; + value = props[ index ]; + if ( Array.isArray( value ) ) { + easing = value[ 1 ]; + value = props[ index ] = value[ 0 ]; + } + + if ( index !== name ) { + props[ name ] = value; + delete props[ index ]; + } + + hooks = jQuery.cssHooks[ name ]; + if ( hooks && "expand" in hooks ) { + value = hooks.expand( value ); + delete props[ name ]; + + // Not quite $.extend, this won't overwrite existing keys. 
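+
+ // ( Aside, added for illustration; not part of upstream jQuery. A
+ //   shorthand such as, for a hypothetical element,
+ //
+ //       jQuery( "#box" ).animate( { padding: "10px 20px" }, 400 );
+ //
+ //   hits the "padding" expand hook defined earlier and is rewritten here
+ //   into paddingTop/paddingRight/paddingBottom/paddingLeft tweens of
+ //   10px/20px/10px/20px, each inheriting the same easing. )
+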
+ // Reusing 'index' because we have the correct "name" + for ( index in value ) { + if ( !( index in props ) ) { + props[ index ] = value[ index ]; + specialEasing[ index ] = easing; + } + } + } else { + specialEasing[ name ] = easing; + } + } +} + +function Animation( elem, properties, options ) { + var result, + stopped, + index = 0, + length = Animation.prefilters.length, + deferred = jQuery.Deferred().always( function() { + + // Don't match elem in the :animated selector + delete tick.elem; + } ), + tick = function() { + if ( stopped ) { + return false; + } + var currentTime = fxNow || createFxNow(), + remaining = Math.max( 0, animation.startTime + animation.duration - currentTime ), + + // Support: Android 2.3 only + // Archaic crash bug won't allow us to use `1 - ( 0.5 || 0 )` (#12497) + temp = remaining / animation.duration || 0, + percent = 1 - temp, + index = 0, + length = animation.tweens.length; + + for ( ; index < length; index++ ) { + animation.tweens[ index ].run( percent ); + } + + deferred.notifyWith( elem, [ animation, percent, remaining ] ); + + // If there's more to do, yield + if ( percent < 1 && length ) { + return remaining; + } + + // If this was an empty animation, synthesize a final progress notification + if ( !length ) { + deferred.notifyWith( elem, [ animation, 1, 0 ] ); + } + + // Resolve the animation and report its conclusion + deferred.resolveWith( elem, [ animation ] ); + return false; + }, + animation = deferred.promise( { + elem: elem, + props: jQuery.extend( {}, properties ), + opts: jQuery.extend( true, { + specialEasing: {}, + easing: jQuery.easing._default + }, options ), + originalProperties: properties, + originalOptions: options, + startTime: fxNow || createFxNow(), + duration: options.duration, + tweens: [], + createTween: function( prop, end ) { + var tween = jQuery.Tween( elem, animation.opts, prop, end, + animation.opts.specialEasing[ prop ] || animation.opts.easing ); + animation.tweens.push( tween ); + return tween; + }, + stop: function( gotoEnd ) { + var index = 0, + + // If we are going to the end, we want to run all the tweens + // otherwise we skip this part + length = gotoEnd ? 
animation.tweens.length : 0; + if ( stopped ) { + return this; + } + stopped = true; + for ( ; index < length; index++ ) { + animation.tweens[ index ].run( 1 ); + } + + // Resolve when we played the last frame; otherwise, reject + if ( gotoEnd ) { + deferred.notifyWith( elem, [ animation, 1, 0 ] ); + deferred.resolveWith( elem, [ animation, gotoEnd ] ); + } else { + deferred.rejectWith( elem, [ animation, gotoEnd ] ); + } + return this; + } + } ), + props = animation.props; + + propFilter( props, animation.opts.specialEasing ); + + for ( ; index < length; index++ ) { + result = Animation.prefilters[ index ].call( animation, elem, props, animation.opts ); + if ( result ) { + if ( isFunction( result.stop ) ) { + jQuery._queueHooks( animation.elem, animation.opts.queue ).stop = + result.stop.bind( result ); + } + return result; + } + } + + jQuery.map( props, createTween, animation ); + + if ( isFunction( animation.opts.start ) ) { + animation.opts.start.call( elem, animation ); + } + + // Attach callbacks from options + animation + .progress( animation.opts.progress ) + .done( animation.opts.done, animation.opts.complete ) + .fail( animation.opts.fail ) + .always( animation.opts.always ); + + jQuery.fx.timer( + jQuery.extend( tick, { + elem: elem, + anim: animation, + queue: animation.opts.queue + } ) + ); + + return animation; +} + +jQuery.Animation = jQuery.extend( Animation, { + + tweeners: { + "*": [ function( prop, value ) { + var tween = this.createTween( prop, value ); + adjustCSS( tween.elem, prop, rcssNum.exec( value ), tween ); + return tween; + } ] + }, + + tweener: function( props, callback ) { + if ( isFunction( props ) ) { + callback = props; + props = [ "*" ]; + } else { + props = props.match( rnothtmlwhite ); + } + + var prop, + index = 0, + length = props.length; + + for ( ; index < length; index++ ) { + prop = props[ index ]; + Animation.tweeners[ prop ] = Animation.tweeners[ prop ] || []; + Animation.tweeners[ prop ].unshift( callback ); + } + }, + + prefilters: [ defaultPrefilter ], + + prefilter: function( callback, prepend ) { + if ( prepend ) { + Animation.prefilters.unshift( callback ); + } else { + Animation.prefilters.push( callback ); + } + } +} ); + +jQuery.speed = function( speed, easing, fn ) { + var opt = speed && typeof speed === "object" ? 
jQuery.extend( {}, speed ) : { + complete: fn || !fn && easing || + isFunction( speed ) && speed, + duration: speed, + easing: fn && easing || easing && !isFunction( easing ) && easing + }; + + // Go to the end state if fx are off + if ( jQuery.fx.off ) { + opt.duration = 0; + + } else { + if ( typeof opt.duration !== "number" ) { + if ( opt.duration in jQuery.fx.speeds ) { + opt.duration = jQuery.fx.speeds[ opt.duration ]; + + } else { + opt.duration = jQuery.fx.speeds._default; + } + } + } + + // Normalize opt.queue - true/undefined/null -> "fx" + if ( opt.queue == null || opt.queue === true ) { + opt.queue = "fx"; + } + + // Queueing + opt.old = opt.complete; + + opt.complete = function() { + if ( isFunction( opt.old ) ) { + opt.old.call( this ); + } + + if ( opt.queue ) { + jQuery.dequeue( this, opt.queue ); + } + }; + + return opt; +}; + +jQuery.fn.extend( { + fadeTo: function( speed, to, easing, callback ) { + + // Show any hidden elements after setting opacity to 0 + return this.filter( isHiddenWithinTree ).css( "opacity", 0 ).show() + + // Animate to the value specified + .end().animate( { opacity: to }, speed, easing, callback ); + }, + animate: function( prop, speed, easing, callback ) { + var empty = jQuery.isEmptyObject( prop ), + optall = jQuery.speed( speed, easing, callback ), + doAnimation = function() { + + // Operate on a copy of prop so per-property easing won't be lost + var anim = Animation( this, jQuery.extend( {}, prop ), optall ); + + // Empty animations, or finishing resolves immediately + if ( empty || dataPriv.get( this, "finish" ) ) { + anim.stop( true ); + } + }; + + doAnimation.finish = doAnimation; + + return empty || optall.queue === false ? + this.each( doAnimation ) : + this.queue( optall.queue, doAnimation ); + }, + stop: function( type, clearQueue, gotoEnd ) { + var stopQueue = function( hooks ) { + var stop = hooks.stop; + delete hooks.stop; + stop( gotoEnd ); + }; + + if ( typeof type !== "string" ) { + gotoEnd = clearQueue; + clearQueue = type; + type = undefined; + } + if ( clearQueue ) { + this.queue( type || "fx", [] ); + } + + return this.each( function() { + var dequeue = true, + index = type != null && type + "queueHooks", + timers = jQuery.timers, + data = dataPriv.get( this ); + + if ( index ) { + if ( data[ index ] && data[ index ].stop ) { + stopQueue( data[ index ] ); + } + } else { + for ( index in data ) { + if ( data[ index ] && data[ index ].stop && rrun.test( index ) ) { + stopQueue( data[ index ] ); + } + } + } + + for ( index = timers.length; index--; ) { + if ( timers[ index ].elem === this && + ( type == null || timers[ index ].queue === type ) ) { + + timers[ index ].anim.stop( gotoEnd ); + dequeue = false; + timers.splice( index, 1 ); + } + } + + // Start the next in the queue if the last step wasn't forced. + // Timers currently will call their complete callbacks, which + // will dequeue but only if they were gotoEnd. + if ( dequeue || !gotoEnd ) { + jQuery.dequeue( this, type ); + } + } ); + }, + finish: function( type ) { + if ( type !== false ) { + type = type || "fx"; + } + return this.each( function() { + var index, + data = dataPriv.get( this ), + queue = data[ type + "queue" ], + hooks = data[ type + "queueHooks" ], + timers = jQuery.timers, + length = queue ? 
queue.length : 0; + + // Enable finishing flag on private data + data.finish = true; + + // Empty the queue first + jQuery.queue( this, type, [] ); + + if ( hooks && hooks.stop ) { + hooks.stop.call( this, true ); + } + + // Look for any active animations, and finish them + for ( index = timers.length; index--; ) { + if ( timers[ index ].elem === this && timers[ index ].queue === type ) { + timers[ index ].anim.stop( true ); + timers.splice( index, 1 ); + } + } + + // Look for any animations in the old queue and finish them + for ( index = 0; index < length; index++ ) { + if ( queue[ index ] && queue[ index ].finish ) { + queue[ index ].finish.call( this ); + } + } + + // Turn off finishing flag + delete data.finish; + } ); + } +} ); + +jQuery.each( [ "toggle", "show", "hide" ], function( _i, name ) { + var cssFn = jQuery.fn[ name ]; + jQuery.fn[ name ] = function( speed, easing, callback ) { + return speed == null || typeof speed === "boolean" ? + cssFn.apply( this, arguments ) : + this.animate( genFx( name, true ), speed, easing, callback ); + }; +} ); + +// Generate shortcuts for custom animations +jQuery.each( { + slideDown: genFx( "show" ), + slideUp: genFx( "hide" ), + slideToggle: genFx( "toggle" ), + fadeIn: { opacity: "show" }, + fadeOut: { opacity: "hide" }, + fadeToggle: { opacity: "toggle" } +}, function( name, props ) { + jQuery.fn[ name ] = function( speed, easing, callback ) { + return this.animate( props, speed, easing, callback ); + }; +} ); + +jQuery.timers = []; +jQuery.fx.tick = function() { + var timer, + i = 0, + timers = jQuery.timers; + + fxNow = Date.now(); + + for ( ; i < timers.length; i++ ) { + timer = timers[ i ]; + + // Run the timer and safely remove it when done (allowing for external removal) + if ( !timer() && timers[ i ] === timer ) { + timers.splice( i--, 1 ); + } + } + + if ( !timers.length ) { + jQuery.fx.stop(); + } + fxNow = undefined; +}; + +jQuery.fx.timer = function( timer ) { + jQuery.timers.push( timer ); + jQuery.fx.start(); +}; + +jQuery.fx.interval = 13; +jQuery.fx.start = function() { + if ( inProgress ) { + return; + } + + inProgress = true; + schedule(); +}; + +jQuery.fx.stop = function() { + inProgress = null; +}; + +jQuery.fx.speeds = { + slow: 600, + fast: 200, + + // Default speed + _default: 400 +}; + + +// Based off of the plugin by Clint Helfers, with permission. +// https://web.archive.org/web/20100324014747/http://blindsignals.com/index.php/2009/07/jquery-delay/ +jQuery.fn.delay = function( time, type ) { + time = jQuery.fx ? 
jQuery.fx.speeds[ time ] || time : time; + type = type || "fx"; + + return this.queue( type, function( next, hooks ) { + var timeout = window.setTimeout( next, time ); + hooks.stop = function() { + window.clearTimeout( timeout ); + }; + } ); +}; + + +( function() { + var input = document.createElement( "input" ), + select = document.createElement( "select" ), + opt = select.appendChild( document.createElement( "option" ) ); + + input.type = "checkbox"; + + // Support: Android <=4.3 only + // Default value for a checkbox should be "on" + support.checkOn = input.value !== ""; + + // Support: IE <=11 only + // Must access selectedIndex to make default options select + support.optSelected = opt.selected; + + // Support: IE <=11 only + // An input loses its value after becoming a radio + input = document.createElement( "input" ); + input.value = "t"; + input.type = "radio"; + support.radioValue = input.value === "t"; +} )(); + + +var boolHook, + attrHandle = jQuery.expr.attrHandle; + +jQuery.fn.extend( { + attr: function( name, value ) { + return access( this, jQuery.attr, name, value, arguments.length > 1 ); + }, + + removeAttr: function( name ) { + return this.each( function() { + jQuery.removeAttr( this, name ); + } ); + } +} ); + +jQuery.extend( { + attr: function( elem, name, value ) { + var ret, hooks, + nType = elem.nodeType; + + // Don't get/set attributes on text, comment and attribute nodes + if ( nType === 3 || nType === 8 || nType === 2 ) { + return; + } + + // Fallback to prop when attributes are not supported + if ( typeof elem.getAttribute === "undefined" ) { + return jQuery.prop( elem, name, value ); + } + + // Attribute hooks are determined by the lowercase version + // Grab necessary hook if one is defined + if ( nType !== 1 || !jQuery.isXMLDoc( elem ) ) { + hooks = jQuery.attrHooks[ name.toLowerCase() ] || + ( jQuery.expr.match.bool.test( name ) ? boolHook : undefined ); + } + + if ( value !== undefined ) { + if ( value === null ) { + jQuery.removeAttr( elem, name ); + return; + } + + if ( hooks && "set" in hooks && + ( ret = hooks.set( elem, value, name ) ) !== undefined ) { + return ret; + } + + elem.setAttribute( name, value + "" ); + return value; + } + + if ( hooks && "get" in hooks && ( ret = hooks.get( elem, name ) ) !== null ) { + return ret; + } + + ret = jQuery.find.attr( elem, name ); + + // Non-existent attributes return null, we normalize to undefined + return ret == null ? 
undefined : ret; + }, + + attrHooks: { + type: { + set: function( elem, value ) { + if ( !support.radioValue && value === "radio" && + nodeName( elem, "input" ) ) { + var val = elem.value; + elem.setAttribute( "type", value ); + if ( val ) { + elem.value = val; + } + return value; + } + } + } + }, + + removeAttr: function( elem, value ) { + var name, + i = 0, + + // Attribute names can contain non-HTML whitespace characters + // https://html.spec.whatwg.org/multipage/syntax.html#attributes-2 + attrNames = value && value.match( rnothtmlwhite ); + + if ( attrNames && elem.nodeType === 1 ) { + while ( ( name = attrNames[ i++ ] ) ) { + elem.removeAttribute( name ); + } + } + } +} ); + +// Hooks for boolean attributes +boolHook = { + set: function( elem, value, name ) { + if ( value === false ) { + + // Remove boolean attributes when set to false + jQuery.removeAttr( elem, name ); + } else { + elem.setAttribute( name, name ); + } + return name; + } +}; + +jQuery.each( jQuery.expr.match.bool.source.match( /\w+/g ), function( _i, name ) { + var getter = attrHandle[ name ] || jQuery.find.attr; + + attrHandle[ name ] = function( elem, name, isXML ) { + var ret, handle, + lowercaseName = name.toLowerCase(); + + if ( !isXML ) { + + // Avoid an infinite loop by temporarily removing this function from the getter + handle = attrHandle[ lowercaseName ]; + attrHandle[ lowercaseName ] = ret; + ret = getter( elem, name, isXML ) != null ? + lowercaseName : + null; + attrHandle[ lowercaseName ] = handle; + } + return ret; + }; +} ); + + + + +var rfocusable = /^(?:input|select|textarea|button)$/i, + rclickable = /^(?:a|area)$/i; + +jQuery.fn.extend( { + prop: function( name, value ) { + return access( this, jQuery.prop, name, value, arguments.length > 1 ); + }, + + removeProp: function( name ) { + return this.each( function() { + delete this[ jQuery.propFix[ name ] || name ]; + } ); + } +} ); + +jQuery.extend( { + prop: function( elem, name, value ) { + var ret, hooks, + nType = elem.nodeType; + + // Don't get/set properties on text, comment and attribute nodes + if ( nType === 3 || nType === 8 || nType === 2 ) { + return; + } + + if ( nType !== 1 || !jQuery.isXMLDoc( elem ) ) { + + // Fix name and attach hooks + name = jQuery.propFix[ name ] || name; + hooks = jQuery.propHooks[ name ]; + } + + if ( value !== undefined ) { + if ( hooks && "set" in hooks && + ( ret = hooks.set( elem, value, name ) ) !== undefined ) { + return ret; + } + + return ( elem[ name ] = value ); + } + + if ( hooks && "get" in hooks && ( ret = hooks.get( elem, name ) ) !== null ) { + return ret; + } + + return elem[ name ]; + }, + + propHooks: { + tabIndex: { + get: function( elem ) { + + // Support: IE <=9 - 11 only + // elem.tabIndex doesn't always return the + // correct value when it hasn't been explicitly set + // https://web.archive.org/web/20141116233347/http://fluidproject.org/blog/2008/01/09/getting-setting-and-removing-tabindex-values-with-javascript/ + // Use proper attribute retrieval(#12072) + var tabindex = jQuery.find.attr( elem, "tabindex" ); + + if ( tabindex ) { + return parseInt( tabindex, 10 ); + } + + if ( + rfocusable.test( elem.nodeName ) || + rclickable.test( elem.nodeName ) && + elem.href + ) { + return 0; + } + + return -1; + } + } + }, + + propFix: { + "for": "htmlFor", + "class": "className" + } +} ); + +// Support: IE <=11 only +// Accessing the selectedIndex property +// forces the browser to respect setting selected +// on the option +// The getter ensures a default option is selected +// when in an 
optgroup +// eslint rule "no-unused-expressions" is disabled for this code +// since it considers such accessions noop +if ( !support.optSelected ) { + jQuery.propHooks.selected = { + get: function( elem ) { + + /* eslint no-unused-expressions: "off" */ + + var parent = elem.parentNode; + if ( parent && parent.parentNode ) { + parent.parentNode.selectedIndex; + } + return null; + }, + set: function( elem ) { + + /* eslint no-unused-expressions: "off" */ + + var parent = elem.parentNode; + if ( parent ) { + parent.selectedIndex; + + if ( parent.parentNode ) { + parent.parentNode.selectedIndex; + } + } + } + }; +} + +jQuery.each( [ + "tabIndex", + "readOnly", + "maxLength", + "cellSpacing", + "cellPadding", + "rowSpan", + "colSpan", + "useMap", + "frameBorder", + "contentEditable" +], function() { + jQuery.propFix[ this.toLowerCase() ] = this; +} ); + + + + + // Strip and collapse whitespace according to HTML spec + // https://infra.spec.whatwg.org/#strip-and-collapse-ascii-whitespace + function stripAndCollapse( value ) { + var tokens = value.match( rnothtmlwhite ) || []; + return tokens.join( " " ); + } + + +function getClass( elem ) { + return elem.getAttribute && elem.getAttribute( "class" ) || ""; +} + +function classesToArray( value ) { + if ( Array.isArray( value ) ) { + return value; + } + if ( typeof value === "string" ) { + return value.match( rnothtmlwhite ) || []; + } + return []; +} + +jQuery.fn.extend( { + addClass: function( value ) { + var classes, elem, cur, curValue, clazz, j, finalValue, + i = 0; + + if ( isFunction( value ) ) { + return this.each( function( j ) { + jQuery( this ).addClass( value.call( this, j, getClass( this ) ) ); + } ); + } + + classes = classesToArray( value ); + + if ( classes.length ) { + while ( ( elem = this[ i++ ] ) ) { + curValue = getClass( elem ); + cur = elem.nodeType === 1 && ( " " + stripAndCollapse( curValue ) + " " ); + + if ( cur ) { + j = 0; + while ( ( clazz = classes[ j++ ] ) ) { + if ( cur.indexOf( " " + clazz + " " ) < 0 ) { + cur += clazz + " "; + } + } + + // Only assign if different to avoid unneeded rendering. + finalValue = stripAndCollapse( cur ); + if ( curValue !== finalValue ) { + elem.setAttribute( "class", finalValue ); + } + } + } + } + + return this; + }, + + removeClass: function( value ) { + var classes, elem, cur, curValue, clazz, j, finalValue, + i = 0; + + if ( isFunction( value ) ) { + return this.each( function( j ) { + jQuery( this ).removeClass( value.call( this, j, getClass( this ) ) ); + } ); + } + + if ( !arguments.length ) { + return this.attr( "class", "" ); + } + + classes = classesToArray( value ); + + if ( classes.length ) { + while ( ( elem = this[ i++ ] ) ) { + curValue = getClass( elem ); + + // This expression is here for better compressibility (see addClass) + cur = elem.nodeType === 1 && ( " " + stripAndCollapse( curValue ) + " " ); + + if ( cur ) { + j = 0; + while ( ( clazz = classes[ j++ ] ) ) { + + // Remove *all* instances + while ( cur.indexOf( " " + clazz + " " ) > -1 ) { + cur = cur.replace( " " + clazz + " ", " " ); + } + } + + // Only assign if different to avoid unneeded rendering. + finalValue = stripAndCollapse( cur ); + if ( curValue !== finalValue ) { + elem.setAttribute( "class", finalValue ); + } + } + } + } + + return this; + }, + + toggleClass: function( value, stateVal ) { + var type = typeof value, + isValidValue = type === "string" || Array.isArray( value ); + + if ( typeof stateVal === "boolean" && isValidValue ) { + return stateVal ? 
this.addClass( value ) : this.removeClass( value ); + } + + if ( isFunction( value ) ) { + return this.each( function( i ) { + jQuery( this ).toggleClass( + value.call( this, i, getClass( this ), stateVal ), + stateVal + ); + } ); + } + + return this.each( function() { + var className, i, self, classNames; + + if ( isValidValue ) { + + // Toggle individual class names + i = 0; + self = jQuery( this ); + classNames = classesToArray( value ); + + while ( ( className = classNames[ i++ ] ) ) { + + // Check each className given, space separated list + if ( self.hasClass( className ) ) { + self.removeClass( className ); + } else { + self.addClass( className ); + } + } + + // Toggle whole class name + } else if ( value === undefined || type === "boolean" ) { + className = getClass( this ); + if ( className ) { + + // Store className if set + dataPriv.set( this, "__className__", className ); + } + + // If the element has a class name or if we're passed `false`, + // then remove the whole classname (if there was one, the above saved it). + // Otherwise bring back whatever was previously saved (if anything), + // falling back to the empty string if nothing was stored. + if ( this.setAttribute ) { + this.setAttribute( "class", + className || value === false ? + "" : + dataPriv.get( this, "__className__" ) || "" + ); + } + } + } ); + }, + + hasClass: function( selector ) { + var className, elem, + i = 0; + + className = " " + selector + " "; + while ( ( elem = this[ i++ ] ) ) { + if ( elem.nodeType === 1 && + ( " " + stripAndCollapse( getClass( elem ) ) + " " ).indexOf( className ) > -1 ) { + return true; + } + } + + return false; + } +} ); + + + + +var rreturn = /\r/g; + +jQuery.fn.extend( { + val: function( value ) { + var hooks, ret, valueIsFunction, + elem = this[ 0 ]; + + if ( !arguments.length ) { + if ( elem ) { + hooks = jQuery.valHooks[ elem.type ] || + jQuery.valHooks[ elem.nodeName.toLowerCase() ]; + + if ( hooks && + "get" in hooks && + ( ret = hooks.get( elem, "value" ) ) !== undefined + ) { + return ret; + } + + ret = elem.value; + + // Handle most common string cases + if ( typeof ret === "string" ) { + return ret.replace( rreturn, "" ); + } + + // Handle cases where value is null/undef or number + return ret == null ? "" : ret; + } + + return; + } + + valueIsFunction = isFunction( value ); + + return this.each( function( i ) { + var val; + + if ( this.nodeType !== 1 ) { + return; + } + + if ( valueIsFunction ) { + val = value.call( this, i, jQuery( this ).val() ); + } else { + val = value; + } + + // Treat null/undefined as ""; convert numbers to string + if ( val == null ) { + val = ""; + + } else if ( typeof val === "number" ) { + val += ""; + + } else if ( Array.isArray( val ) ) { + val = jQuery.map( val, function( value ) { + return value == null ? "" : value + ""; + } ); + } + + hooks = jQuery.valHooks[ this.type ] || jQuery.valHooks[ this.nodeName.toLowerCase() ]; + + // If set returns undefined, fall back to normal setting + if ( !hooks || !( "set" in hooks ) || hooks.set( this, val, "value" ) === undefined ) { + this.value = val; + } + } ); + } +} ); + +jQuery.extend( { + valHooks: { + option: { + get: function( elem ) { + + var val = jQuery.find.attr( elem, "value" ); + return val != null ? 
+ val : + + // Support: IE <=10 - 11 only + // option.text throws exceptions (#14686, #14858) + // Strip and collapse whitespace + // https://html.spec.whatwg.org/#strip-and-collapse-whitespace + stripAndCollapse( jQuery.text( elem ) ); + } + }, + select: { + get: function( elem ) { + var value, option, i, + options = elem.options, + index = elem.selectedIndex, + one = elem.type === "select-one", + values = one ? null : [], + max = one ? index + 1 : options.length; + + if ( index < 0 ) { + i = max; + + } else { + i = one ? index : 0; + } + + // Loop through all the selected options + for ( ; i < max; i++ ) { + option = options[ i ]; + + // Support: IE <=9 only + // IE8-9 doesn't update selected after form reset (#2551) + if ( ( option.selected || i === index ) && + + // Don't return options that are disabled or in a disabled optgroup + !option.disabled && + ( !option.parentNode.disabled || + !nodeName( option.parentNode, "optgroup" ) ) ) { + + // Get the specific value for the option + value = jQuery( option ).val(); + + // We don't need an array for one selects + if ( one ) { + return value; + } + + // Multi-Selects return an array + values.push( value ); + } + } + + return values; + }, + + set: function( elem, value ) { + var optionSet, option, + options = elem.options, + values = jQuery.makeArray( value ), + i = options.length; + + while ( i-- ) { + option = options[ i ]; + + /* eslint-disable no-cond-assign */ + + if ( option.selected = + jQuery.inArray( jQuery.valHooks.option.get( option ), values ) > -1 + ) { + optionSet = true; + } + + /* eslint-enable no-cond-assign */ + } + + // Force browsers to behave consistently when non-matching value is set + if ( !optionSet ) { + elem.selectedIndex = -1; + } + return values; + } + } + } +} ); + +// Radios and checkboxes getter/setter +jQuery.each( [ "radio", "checkbox" ], function() { + jQuery.valHooks[ this ] = { + set: function( elem, value ) { + if ( Array.isArray( value ) ) { + return ( elem.checked = jQuery.inArray( jQuery( elem ).val(), value ) > -1 ); + } + } + }; + if ( !support.checkOn ) { + jQuery.valHooks[ this ].get = function( elem ) { + return elem.getAttribute( "value" ) === null ? "on" : elem.value; + }; + } +} ); + + + + +// Return jQuery for attributes-only inclusion + + +support.focusin = "onfocusin" in window; + + +var rfocusMorph = /^(?:focusinfocus|focusoutblur)$/, + stopPropagationCallback = function( e ) { + e.stopPropagation(); + }; + +jQuery.extend( jQuery.event, { + + trigger: function( event, data, elem, onlyHandlers ) { + + var i, cur, tmp, bubbleType, ontype, handle, special, lastElement, + eventPath = [ elem || document ], + type = hasOwn.call( event, "type" ) ? event.type : event, + namespaces = hasOwn.call( event, "namespace" ) ? event.namespace.split( "." ) : []; + + cur = lastElement = tmp = elem = elem || document; + + // Don't do events on text and comment nodes + if ( elem.nodeType === 3 || elem.nodeType === 8 ) { + return; + } + + // focus/blur morphs to focusin/out; ensure we're not firing them right now + if ( rfocusMorph.test( type + jQuery.event.triggered ) ) { + return; + } + + if ( type.indexOf( "." ) > -1 ) { + + // Namespaced trigger; create a regexp to match event type in handle() + namespaces = type.split( "." ); + type = namespaces.shift(); + namespaces.sort(); + } + ontype = type.indexOf( ":" ) < 0 && "on" + type; + + // Caller can pass in a jQuery.Event object, Object, or just an event type string + event = event[ jQuery.expando ] ? 
+ event : + new jQuery.Event( type, typeof event === "object" && event ); + + // Trigger bitmask: & 1 for native handlers; & 2 for jQuery (always true) + event.isTrigger = onlyHandlers ? 2 : 3; + event.namespace = namespaces.join( "." ); + event.rnamespace = event.namespace ? + new RegExp( "(^|\\.)" + namespaces.join( "\\.(?:.*\\.|)" ) + "(\\.|$)" ) : + null; + + // Clean up the event in case it is being reused + event.result = undefined; + if ( !event.target ) { + event.target = elem; + } + + // Clone any incoming data and prepend the event, creating the handler arg list + data = data == null ? + [ event ] : + jQuery.makeArray( data, [ event ] ); + + // Allow special events to draw outside the lines + special = jQuery.event.special[ type ] || {}; + if ( !onlyHandlers && special.trigger && special.trigger.apply( elem, data ) === false ) { + return; + } + + // Determine event propagation path in advance, per W3C events spec (#9951) + // Bubble up to document, then to window; watch for a global ownerDocument var (#9724) + if ( !onlyHandlers && !special.noBubble && !isWindow( elem ) ) { + + bubbleType = special.delegateType || type; + if ( !rfocusMorph.test( bubbleType + type ) ) { + cur = cur.parentNode; + } + for ( ; cur; cur = cur.parentNode ) { + eventPath.push( cur ); + tmp = cur; + } + + // Only add window if we got to document (e.g., not plain obj or detached DOM) + if ( tmp === ( elem.ownerDocument || document ) ) { + eventPath.push( tmp.defaultView || tmp.parentWindow || window ); + } + } + + // Fire handlers on the event path + i = 0; + while ( ( cur = eventPath[ i++ ] ) && !event.isPropagationStopped() ) { + lastElement = cur; + event.type = i > 1 ? + bubbleType : + special.bindType || type; + + // jQuery handler + handle = ( dataPriv.get( cur, "events" ) || Object.create( null ) )[ event.type ] && + dataPriv.get( cur, "handle" ); + if ( handle ) { + handle.apply( cur, data ); + } + + // Native handler + handle = ontype && cur[ ontype ]; + if ( handle && handle.apply && acceptData( cur ) ) { + event.result = handle.apply( cur, data ); + if ( event.result === false ) { + event.preventDefault(); + } + } + } + event.type = type; + + // If nobody prevented the default action, do it now + if ( !onlyHandlers && !event.isDefaultPrevented() ) { + + if ( ( !special._default || + special._default.apply( eventPath.pop(), data ) === false ) && + acceptData( elem ) ) { + + // Call a native DOM method on the target with the same name as the event. 
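+
+ // ( Aside, added for illustration; not part of upstream jQuery. For a
+ //   hypothetical form,
+ //
+ //       jQuery( "#login" ).trigger( "submit" );
+ //
+ //   fires jQuery submit handlers along the bubble path first; only if no
+ //   handler called event.preventDefault() does this block invoke the
+ //   native elem.submit(), with jQuery.event.triggered set as a guard so
+ //   the same handlers are not dispatched a second time. )
+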

Adapter Activation and Composition

+

With adapters, it becomes possible to combine multiple adapters trained on different tasks in so-called adapter compositions. To enable such compositions, adapters comes with a modular and flexible concept for defining how the input to the model should flow through the available adapters. This allows, e.g., stacking (MAD-X) and fusing (AdapterFusion) adapters, as well as even more complex adapter setups.

+
+

Adapter Activation

+

The single location where all the adapter composition magic happens is the active_adapters property of the model class. +In the simplest case, you can set the name of a single adapter here to activate it:

+
model.active_adapters = "adapter_name"
+
+
+
+

Important

+

active_adapters defines which available adapters are used in each forward and backward pass through the model. This means:

+
  • You cannot activate an adapter before adding it to the model using either add_adapter() or load_adapter().
  • All adapters not mentioned in the active_adapters setup are ignored, even though they might have been loaded into the model. Thus, after adding an adapter, make sure to activate it.
+

Note that we could equivalently use the set_active_adapters method: model.set_active_adapters("adapter_name") has the same effect.
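To illustrate the full add-then-activate flow, here is a minimal sketch (the checkpoint name is only an example):

from adapters import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("adapter_name")
# Equivalent to: model.active_adapters = "adapter_name"
model.set_active_adapters("adapter_name")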

+

Alternatively, the AdapterSetup context manager allows dynamic configuration of activated setups without changing the model state:

+
from adapters import AdapterSetup
+
+model = ...
+model.add_adapter("adapter_name")
+
+with AdapterSetup("adapter_name"):
+    # will use the adapter named "adapter_name" in the forward pass
+    outputs = model(**inputs)
+
+
+
+
+

Composition Blocks - Overview

+

The basic building blocks of the more advanced setups are objects derived from AdapterCompositionBlock, +each representing a different possibility to combine single adapters. +The following table gives an overview on the supported composition blocks and their support by different adapter methods.

| Block | Bottleneck Adapters | Prefix Tuning | Compacter | LoRA | (IA)³ | Prompt Tuning |
| --- | --- | --- | --- | --- | --- | --- |
| Stack | ✅ | ✅ | ✅ | ✅(*) | ✅(*) | ❌ |
| Fuse | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| Split | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| BatchSplit | ✅ | ✅ | ✅ | ✅(*) | ✅(*) | ❌ |
| Parallel | ✅ | ✅ | ✅ | ✅(*) | ✅(*) | ❌ |
| Output averaging | ✅ | ✅ | ✅ | ✅(*) | ✅(*) | ❌ |
| Parameter averaging | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+

(*) except for Deberta-v1, GPT-2.

+

Next, we present all composition blocks in more detail.

+
+
+

Stack

+
+Illustration of stacking adapters. +

Stacking adapters using the ‘Stack’ block.

+
+

The Stack block can be used to stack multiple adapters on top of each other. +This kind of adapter composition is used e.g. in the MAD-X framework for cross-lingual transfer (Pfeiffer et al., 2020), where language and task adapters are stacked on top of each other. +For more, check out this Colab notebook on cross-lingual transfer.

+

In the following example, we stack the adapters a, b and c so that in each layer, the input is first passed through a, the output of a is then passed to b, and the output of b is finally passed to c.

+
import adapters.composition as ac
+
+# ...
+
+model.add_adapter("a")
+model.add_adapter("b")
+model.add_adapter("c")
+
+model.active_adapters = ac.Stack("a", "b", "c")
+
+
+
+

Note

+

When using stacking for prefix tuning, the stacked prefixes are prepended to the input states from right to left, i.e. Stack("a", "b", "c") will first prepend the prefix states for "a" to the input vectors, then prepend those for "b" to the resulting vectors, etc.

+
+
+
+

Fuse

+
+Illustration of AdapterFusion. +

Fusing adapters with AdapterFusion.

+
+

The Fuse block can be used to activate a fusion layer of adapters. +AdapterFusion is a non-destructive way to combine the knowledge of multiple pre-trained adapters on a new downstream task, proposed by Pfeiffer et al., 2021. +In the following example, we activate the adapters d, e and f as well as the fusion layer that combines the outputs of all three. +The fusion layer is added beforehand using model.add_adapter_fusion(), where we specify the names of the adapters which should be fused.

+
import adapters.composition as ac
+
+# ...
+
+model.add_adapter("d")
+model.add_adapter("e")
+model.add_adapter("f")
+model.add_adapter_fusion(["d", "e", "f"])
+
+model.active_adapters = ac.Fuse("d", "e", "f")
+
+
+
+

Important

+

Fusing adapters with the Fuse block only works successfully if an adapter fusion layer combining all of the adapters listed in the Fuse block has been added to the model. This can be done either using add_adapter_fusion() or load_adapter_fusion().
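As a sketch of the second option: assuming a fusion layer over "d", "e" and "f" was saved earlier (the directory path below is hypothetical), it could be restored and activated like this:

import adapters.composition as ac

# Hypothetical path; assumes the fusion was previously saved there,
# e.g. via model.save_adapter_fusion("./saved_fusion", ["d", "e", "f"])
model.load_adapter_fusion("./saved_fusion")
model.active_adapters = ac.Fuse("d", "e", "f")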

+
+

To learn how training an AdapterFusion layer works, check out this Colab notebook from the adapters repo.

+
+

Retrieving AdapterFusion attentions

+

Finally, it is possible to retrieve the attention scores computed by each fusion layer in a forward pass of the model. +These scores can be used for analyzing the fused adapter blocks and can serve as the basis for visualizations similar to those in the AdapterFusion paper. +You can collect the fusion attention scores by passing output_adapter_fusion_attentions=True to the model forward call. +The scores for each layer will then be saved in the adapter_fusion_attentions attribute of the output:

+
outputs = model(**inputs, output_adapter_fusion_attentions=True)
+attention_scores = outputs.adapter_fusion_attentions
+
+
+

Note that this parameter is only available to base model classes and AdapterModel classes. +In the example, attention_scores holds a dictionary of the following form:

+
{
+    '<fusion_name>': {
+        <layer_id>: {
+            '<module_location>': np.array([...]),
+            ...
+        },
+        ...
+    },
+    ...
+}
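For example, a short sketch for traversing this nested dictionary (the loop variable names are placeholders):

for fusion_name, layers in attention_scores.items():
    for layer_id, modules in layers.items():
        for module_location, scores in modules.items():
            # scores is a numpy array of fusion attention weights
            print(fusion_name, layer_id, module_location, scores.shape)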
+
+
+
+
+
+

Split

+
+Illustration of splitting adapters. +

Splitting the input between two adapters using the ‘Split’ block.

+
+

The Split block can be used to split an input sequence between multiple adapters. This is done by specifying the lengths of the consecutive sequence segments that should be routed to each adapter. In the following example, we split each input sequence between adapters g and h: for each sequence, all tokens from 0 up to 63 are forwarded through g, while the next 64 tokens are forwarded through h:

+
import adapters.composition as ac
+
+# ...
+
+model.add_adapter("g")
+model.add_adapter("h")
+
+model.active_adapters = ac.Split("g", "h", splits=[64, 64])
+
+
+
+
+

BatchSplit

+

The BatchSplit block is an alternative way of splitting the input between several adapters: instead of splitting the input sequences, it splits the batch into smaller sub-batches. As a result, the individual input sequences remain untouched.

+

In the following example, we split the batch between adapters i, k and l. The batch_sizes parameter specifies the batch size for each of the adapters: adapter i gets two sequences, k gets one sequence and l gets two sequences. If all adapters should get the same batch size, this can be specified by passing a single value, e.g. batch_sizes = 2. The sum of the specified batch sizes has to match the batch size of the input.

+
import adapters.composition as ac
+
+# ...
+
+model.add_adapter("i")
+model.add_adapter("k")
+model.add_adapter("l")
+
+model.active_adapters = ac.BatchSplit("i", "k", "l", batch_sizes=[2, 1, 2])
+
+
+
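As a sketch of the scalar variant mentioned above, in which every adapter receives the same sub-batch size:

# Each of "i", "k" and "l" receives a sub-batch of 2 sequences,
# so the total input batch size must be 6.
model.active_adapters = ac.BatchSplit("i", "k", "l", batch_sizes=2)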
+
+
+

Parallel

+
+Illustration of parallel adapter forward pass. +

Parallel adapter forward pass as implemented by the ‘Parallel’ block. The input is replicated at the first layer with parallel adapters.

+
+

The Parallel block can be used to enable parallel multi-task training and inference on different adapters, each with their own prediction head. +Parallel adapter inference was first used in AdapterDrop: On the Efficiency of Adapters in Transformers (Rücklé et al., 2020).

+

In the following example, we load two adapters for semantic textual similarity (STS) from the Hub, one trained on the STS benchmark, the other trained on the MRPC dataset. +We activate a parallel setup where the input is passed through both adapters and their respective prediction heads.

+
import torch
+
+import adapters.composition as ac
+from adapters import AutoAdapterModel
+from transformers import AutoTokenizer
+
+model = AutoAdapterModel.from_pretrained("distilbert-base-uncased")
+tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
+
+adapter1 = model.load_adapter("sts/sts-b@ukp")
+adapter2 = model.load_adapter("sts/mrpc@ukp")
+
+model.active_adapters = ac.Parallel(adapter1, adapter2)
+
+input_ids = tokenizer("Adapters are great!", "Adapters are awesome!", return_tensors="pt")
+
+output1, output2 = model(**input_ids)
+
+print("STS-B adapter output:", output1[0].item())
+print("MRPC adapter output:", bool(torch.argmax(output2[0]).item()))
+
+
+
+
+

Averaging Outputs or Parameters

+

Following approaches that ensemble full models at inference time for better generalization, recent work on adapters has explored methods of averaging pre-trained adapters. This includes averaging the output representations of adapters (Wang et al., 2021) as well as averaging adapter parameters (Wang et al., 2022, Chronopoulou et al., 2023). adapters provides built-in support for both types of inference-time averaging.

+
+

Output averaging

+

Output averaging makes it possible to dynamically aggregate the output representations of multiple adapters in a model forward pass via weighted averaging. This is realized via the Average composition block, which works similarly to the other composition blocks. In the example below, the three adapters are averaged with the weights 0.1 for m, 0.6 for n and 0.3 for o.

+
import adapters.composition as ac
+
+# ...
+
+model.add_adapter("m")
+model.add_adapter("n")
+model.add_adapter("o")
+
+model.active_adapters = ac.Average("m", "n", "o", weights=[0.1, 0.6, 0.3])
+
+
+
+
+

Parameter averaging

+

Parameter averaging enables creating a new adapter via weighted averaging of the parameters of multiple pre-trained adapters. As this process is typically not done dynamically at runtime, adapters provides average_adapter() as a dedicated method for parameter averaging. In the example below, the parameters of the adapters m, n and o are averaged (with weights 0.1, 0.6 and 0.3, respectively) to create a new adapter avg. Note that for this to succeed, all averaged adapters must use the same adapter configuration.

+
model.add_adapter("m")
+model.add_adapter("n")
+model.add_adapter("o")
+
+model.average_adapter("avg", ["m", "n", "o"], weights=[0.1, 0.6, 0.3])
+
+
+

Compared to output averaging, parameter averaging has the advantage of adding no inference overhead relative to using a single adapter.

+

For both output and parameter averaging, passed weights are normalized by default. +To disable normalization, pass normalize_weights=False.
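For instance, a sketch with unnormalized weights (the adapter name avg_raw and the weight values are illustrative):

# Weights are used as given instead of being rescaled to sum to 1.
model.average_adapter("avg_raw", ["m", "n", "o"], weights=[0.2, 1.2, 0.6], normalize_weights=False)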

+
+
+
+

Nesting composition blocks

+

Of course, it is also possible to combine different composition blocks in one adapter setup. +E.g., we can nest a Split block within a Stack of adapters:

+
import adapters.composition as ac
+
+model.active_adapters = ac.Stack("a", ac.Split("b", "c", splits=60))
+
+
+

However, combinations of adapter composition blocks cannot be arbitrarily deep. All currently supported possibilities are visualized in the table below.

| Block | Supported Nesting |
| --- | --- |
| Stack | [str, Fuse, Split, Parallel, BatchSplit, Average] |
| Fuse | [str, Stack] |
| Split | [str, Split, Stack, BatchSplit, Average] |
| Parallel | [str, Stack, BatchSplit, Average] |
| BatchSplit | [str, Stack, Split, BatchSplit, Average] |
| Average | [str, Stack, Split, BatchSplit] |
+

In the table, str represents an adapter, e.g. adapter “a” in the nesting example above. Depending on the individual model, some nested compositions might not be possible.
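For example, per the table, a Parallel block may contain Stack blocks, so a setup like the following sketch is possible (assuming the adapters a to d have been added to the model before):

model.active_adapters = ac.Parallel(ac.Stack("a", "b"), ac.Stack("c", "d"))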

+
+
\ No newline at end of file
diff --git a/classes/adapter_config.html b/classes/adapter_config.html new file mode 100644 index 0000000000..087a98b953 --- /dev/null +++ b/classes/adapter_config.html @@ -0,0 +1,942 @@

Adapter Configuration

+

Classes representing the architectures of adapter modules and fusion layers.

+
+

Single (bottleneck) adapters

+
+
+class adapters.AdapterConfig
+

Base class for all adaptation methods. This class does not define specific configuration keys, but only provides +some common helper methods.

+
+
Parameters
+

architecture (str, optional) – The type of adaptation method defined by the configuration.

+
+
+
+
+classmethod from_dict(config)
+

Creates a config class from a Python dict.

+
+ +
+
+classmethod load(config: Union[dict, str], download_kwargs=None, **kwargs)
+

Loads a given adapter configuration specifier into a full AdapterConfig instance.

+
+
Parameters
+

config (Union[dict, str]) –

The configuration to load. Can be either:

+
    +
  • a dictionary representing the full config

  • +
  • an identifier string available in ADAPTER_CONFIG_MAP

  • +
  • the path to a file containing a full adapter configuration

  • +
  • an identifier string available in Adapter-Hub

  • +
+

+
+
Returns
+

The resolved adapter configuration dictionary.

+
+
Return type
+

dict

+
+
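As a usage sketch (assuming "seq_bn" is one of the identifiers available in ADAPTER_CONFIG_MAP; additional keyword arguments override individual configuration values):

from adapters import AdapterConfig

config = AdapterConfig.load("seq_bn")
config = AdapterConfig.load("seq_bn", reduction_factor=8)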
+
+ +
+
+replace(**changes)
+

Returns a new instance of the config class with the specified changes applied.

+
+ +
+
+to_dict()
+

Converts the config class to a Python dict.

+
+ +
+ +
+
+class adapters.BnConfig(mh_adapter: bool, output_adapter: bool, reduction_factor: ~typing.Union[float, ~collections.abc.Mapping], non_linearity: str, original_ln_before: bool = False, original_ln_after: bool = True, ln_before: bool = False, ln_after: bool = False, init_weights: str = 'bert', is_parallel: bool = False, scaling: ~typing.Union[float, str] = 1.0, use_gating: bool = False, residual_before_ln: ~typing.Union[bool, str] = True, adapter_residual_before_ln: bool = False, inv_adapter: ~typing.Optional[str] = None, inv_adapter_reduction_factor: ~typing.Optional[float] = None, cross_adapter: bool = False, leave_out: ~typing.List[int] = <factory>, dropout: float = 0.0, phm_layer: bool = False, phm_dim: int = 4, factorized_phm_W: ~typing.Optional[bool] = True, shared_W_phm: ~typing.Optional[bool] = False, shared_phm_rule: ~typing.Optional[bool] = True, factorized_phm_rule: ~typing.Optional[bool] = False, phm_c_init: ~typing.Optional[str] = 'normal', phm_init_range: ~typing.Optional[float] = 0.0001, learn_phm: ~typing.Optional[bool] = True, hypercomplex_nonlinearity: ~typing.Optional[str] = 'glorot-uniform', phm_rank: ~typing.Optional[int] = 1, phm_bias: ~typing.Optional[bool] = True)
+

Base class that models the architecture of a bottleneck adapter.

+
+
Parameters
+
    +
  • mh_adapter (bool) – If True, add adapter modules after the multi-head attention block of each layer.

  • +
  • output_adapter (bool) – If True, add adapter modules after the output FFN of each layer.

  • +
  • reduction_factor (float or Mapping) – Either a scalar float (> 0) specifying the reduction factor for all layers or a mapping from layer ID +(starting at 0) to values specifying the reduction_factor for individual layers. If not all layers are +represented in the mapping a default value should be given e.g. {‘1’: 8, ‘6’: 32, ‘default’: 16}. +Specifying a reduction factor < 1 will result in an up-projection layer.

  • +
  • non_linearity (str) – The activation function to use in the adapter bottleneck.

  • +
  • original_ln_before (bool, optional) – If True, apply the pre-trained layer normalization and residual connection before the adapter modules. Defaults to False. Only applicable if is_parallel is False.

  • +
  • original_ln_after (bool, optional) – If True, apply pre-trained layer normalization and residual connection after the adapter modules. Defaults +to True.

  • +
  • ln_before (bool, optional) – If True, add a new layer normalization before the adapter bottleneck. +Defaults to False.

  • +
  • ln_after (bool, optional) – If True, add a new layer normalization after the adapter bottleneck. +Defaults to False.

  • +
  • init_weights (str, optional) – Initialization method for the weights of the adapter modules. +Currently, this can be either “bert” (default) or “mam_adapter”.

  • +
  • is_parallel (bool, optional) – If True, apply adapter transformations in parallel. +By default (False), sequential application is used.

  • +
  • scaling (float or str, optional) – Scaling factor to use for scaled addition of adapter outputs as done by He et al. (2021). Can be either a +constant factor (float) or the string “learned”, in which case the scaling factor is learned. Defaults to +1.0.

  • +
  • use_gating (bool, optional) – Place a trainable gating module besides the added parameter module to control module activation. This is +e.g. used for UniPELT. Defaults to False.

  • +
  • residual_before_ln (bool or str, optional) – If True, take the residual connection around the adapter bottleneck before the layer normalization. If set +to “post_add”, take the residual connection around the adapter bottleneck after the previous residual +connection. Only applicable if original_ln_before is True.

  • +
  • adapter_residual_before_ln (bool, optional) – If True, apply the residual connection around the adapter modules before the new layer normalization within +the adapter. Only applicable if ln_after is True and is_parallel is False.

  • +
  • inv_adapter (str, optional) – If not None (default), add invertible adapter modules after the model embedding layer. Currently, this can +be either “nice” or “glow”.

  • +
  • inv_adapter_reduction_factor (float, optional) – The reduction to use within the invertible adapter modules. Only applicable if inv_adapter is not +None.

  • +
  • cross_adapter (bool, optional) – If True, add adapter modules after the cross attention block of each decoder layer in an encoder-decoder +model. Defaults to False.

  • +
  • leave_out (List[int], optional) – The IDs of the layers (starting at 0) where NO adapter modules should be added.

  • +
  • dropout (float, optional) – The dropout rate used in the adapter layer. Defaults to 0.0.

  • +
  • phm_layer (bool, optional) – If True the down and up projection layers are a PHMLayer. +Defaults to False

  • +
  • phm_dim (int, optional) – The dimension of the phm matrix. +Only applicable if phm_layer is set to True. Defaults to 4.

  • +
  • shared_phm_rule (bool, optional) – Whether the phm matrix is shared across all layers. +Defaults to True

  • +
  • factorized_phm_rule (bool, optional) – Whether the phm matrix is factorized into a left and right matrix. Defaults to False.

  • +
  • learn_phm (bool, optional) – Whether the phm matrix should be learned during training. +Defaults to True

  • +
  • factorized_phm_W (bool, optional) – Whether the weights matrix is factorized into a left and right matrix. Defaults to True.

  • +
  • shared_W_phm (bool, optional) – Whether the weights matrix is shared across all layers. +Defaults to False.

  • +
  • phm_c_init (str, optional) – The initialization function for the weights of the phm matrix. +The possible values are [“normal”, “uniform”]. Defaults to normal.

  • +
  • phm_init_range (float, optional) – std for initializing phm weights if phm_c_init=”normal”. +Defaults to 0.0001.

  • +
  • hypercomplex_nonlinearity (str, optional) – This specifies the distribution to draw the weights in the phm layer from. Defaults to glorot-uniform.

  • +
  • phm_rank (int, optional) – If the weight matrix is factorized this specifies the rank of the matrix. E.g. the left matrix of the down +projection has the shape (phm_dim, _in_feats_per_axis, phm_rank) and the right matrix (phm_dim, phm_rank, +_out_feats_per_axis). Defaults to 1

  • +
  • phm_bias (bool, optional) – If True the down and up projection PHMLayer has a bias term. If phm_layer is False this is ignored. +Defaults to True

  • +
+
+
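To make the long signature concrete, here is a small sketch of constructing a custom bottleneck config and attaching it to a model (all values are illustrative, and model is assumed to be an adapters-enabled model):

import adapters

config = adapters.BnConfig(
    mh_adapter=True,
    output_adapter=True,
    reduction_factor=16,
    non_linearity="relu",
)
model.add_adapter("custom_adapter", config=config)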
+
+
+classmethod from_dict(config)
+

Creates a config class from a Python dict.

+
+ +
+
+classmethod load(config: Union[dict, str], download_kwargs=None, **kwargs)
+

Loads a given adapter configuration specifier into a full AdapterConfig instance.

+
+
Parameters
+

config (Union[dict, str]) –

The configuration to load. Can be either:

+
    +
  • a dictionary representing the full config

  • +
  • an identifier string available in ADAPTER_CONFIG_MAP

  • +
  • the path to a file containing a full adapter configuration

  • +
  • an identifier string available in Adapter-Hub

  • +
+

+
+
Returns
+

The resolved adapter configuration dictionary.

+
+
Return type
+

dict

+
+
+
+ +
+
+replace(**changes)
+

Returns a new instance of the config class with the specified changes applied.

+
+ +
+
+to_dict()
+

Converts the config class to a Python dict.

+
+ +
+ +
+
+class adapters.SeqBnConfig(mh_adapter: bool = False, output_adapter: bool = True, reduction_factor: ~typing.Union[float, ~collections.abc.Mapping] = 16, non_linearity: str = 'relu', original_ln_before: bool = True, original_ln_after: bool = True, ln_before: bool = False, ln_after: bool = False, init_weights: str = 'bert', is_parallel: bool = False, scaling: ~typing.Union[float, str] = 1.0, use_gating: bool = False, residual_before_ln: ~typing.Union[bool, str] = True, adapter_residual_before_ln: bool = False, inv_adapter: ~typing.Optional[str] = None, inv_adapter_reduction_factor: ~typing.Optional[float] = None, cross_adapter: bool = False, leave_out: ~typing.List[int] = <factory>, dropout: float = 0.0, phm_layer: bool = False, phm_dim: int = 4, factorized_phm_W: ~typing.Optional[bool] = True, shared_W_phm: ~typing.Optional[bool] = False, shared_phm_rule: ~typing.Optional[bool] = True, factorized_phm_rule: ~typing.Optional[bool] = False, phm_c_init: ~typing.Optional[str] = 'normal', phm_init_range: ~typing.Optional[float] = 0.0001, learn_phm: ~typing.Optional[bool] = True, hypercomplex_nonlinearity: ~typing.Optional[str] = 'glorot-uniform', phm_rank: ~typing.Optional[int] = 1, phm_bias: ~typing.Optional[bool] = True)
+

The adapter architecture proposed by Pfeiffer et al. (2020). See https://arxiv.org/pdf/2005.00247.pdf.

+
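In practice, the predefined config classes are typically passed to add_adapter(); a sketch (the adapter name is arbitrary):

from adapters import SeqBnConfig

model.add_adapter("task_adapter", config=SeqBnConfig(reduction_factor=8))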
+ +
+
+class adapters.SeqBnInvConfig(mh_adapter: bool = False, output_adapter: bool = True, reduction_factor: ~typing.Union[float, ~collections.abc.Mapping] = 16, non_linearity: str = 'relu', original_ln_before: bool = True, original_ln_after: bool = True, ln_before: bool = False, ln_after: bool = False, init_weights: str = 'bert', is_parallel: bool = False, scaling: ~typing.Union[float, str] = 1.0, use_gating: bool = False, residual_before_ln: ~typing.Union[bool, str] = True, adapter_residual_before_ln: bool = False, inv_adapter: ~typing.Optional[str] = 'nice', inv_adapter_reduction_factor: ~typing.Optional[float] = 2, cross_adapter: bool = False, leave_out: ~typing.List[int] = <factory>, dropout: float = 0.0, phm_layer: bool = False, phm_dim: int = 4, factorized_phm_W: ~typing.Optional[bool] = True, shared_W_phm: ~typing.Optional[bool] = False, shared_phm_rule: ~typing.Optional[bool] = True, factorized_phm_rule: ~typing.Optional[bool] = False, phm_c_init: ~typing.Optional[str] = 'normal', phm_init_range: ~typing.Optional[float] = 0.0001, learn_phm: ~typing.Optional[bool] = True, hypercomplex_nonlinearity: ~typing.Optional[str] = 'glorot-uniform', phm_rank: ~typing.Optional[int] = 1, phm_bias: ~typing.Optional[bool] = True)
+

The adapter architecture proposed by Pfeiffer et al. (2020). See https://arxiv.org/pdf/2005.00247.pdf.

+
+ +
+
+class adapters.DoubleSeqBnConfig(mh_adapter: bool = True, output_adapter: bool = True, reduction_factor: ~typing.Union[float, ~collections.abc.Mapping] = 16, non_linearity: str = 'swish', original_ln_before: bool = False, original_ln_after: bool = True, ln_before: bool = False, ln_after: bool = False, init_weights: str = 'bert', is_parallel: bool = False, scaling: ~typing.Union[float, str] = 1.0, use_gating: bool = False, residual_before_ln: ~typing.Union[bool, str] = True, adapter_residual_before_ln: bool = False, inv_adapter: ~typing.Optional[str] = None, inv_adapter_reduction_factor: ~typing.Optional[float] = None, cross_adapter: bool = False, leave_out: ~typing.List[int] = <factory>, dropout: float = 0.0, phm_layer: bool = False, phm_dim: int = 4, factorized_phm_W: ~typing.Optional[bool] = True, shared_W_phm: ~typing.Optional[bool] = False, shared_phm_rule: ~typing.Optional[bool] = True, factorized_phm_rule: ~typing.Optional[bool] = False, phm_c_init: ~typing.Optional[str] = 'normal', phm_init_range: ~typing.Optional[float] = 0.0001, learn_phm: ~typing.Optional[bool] = True, hypercomplex_nonlinearity: ~typing.Optional[str] = 'glorot-uniform', phm_rank: ~typing.Optional[int] = 1, phm_bias: ~typing.Optional[bool] = True)
+

The adapter architecture proposed by Houlsby et al. (2019). See https://arxiv.org/pdf/1902.00751.pdf.

+
+ +
+
+class adapters.DoubleSeqBnInvConfig(mh_adapter: bool = True, output_adapter: bool = True, reduction_factor: ~typing.Union[float, ~collections.abc.Mapping] = 16, non_linearity: str = 'swish', original_ln_before: bool = False, original_ln_after: bool = True, ln_before: bool = False, ln_after: bool = False, init_weights: str = 'bert', is_parallel: bool = False, scaling: ~typing.Union[float, str] = 1.0, use_gating: bool = False, residual_before_ln: ~typing.Union[bool, str] = True, adapter_residual_before_ln: bool = False, inv_adapter: ~typing.Optional[str] = 'nice', inv_adapter_reduction_factor: ~typing.Optional[float] = 2, cross_adapter: bool = False, leave_out: ~typing.List[int] = <factory>, dropout: float = 0.0, phm_layer: bool = False, phm_dim: int = 4, factorized_phm_W: ~typing.Optional[bool] = True, shared_W_phm: ~typing.Optional[bool] = False, shared_phm_rule: ~typing.Optional[bool] = True, factorized_phm_rule: ~typing.Optional[bool] = False, phm_c_init: ~typing.Optional[str] = 'normal', phm_init_range: ~typing.Optional[float] = 0.0001, learn_phm: ~typing.Optional[bool] = True, hypercomplex_nonlinearity: ~typing.Optional[str] = 'glorot-uniform', phm_rank: ~typing.Optional[int] = 1, phm_bias: ~typing.Optional[bool] = True)
+

The adapter architecture proposed by Houlsby et al. (2019). See https://arxiv.org/pdf/1902.00751.pdf.

+
+ +
+
+class adapters.ParBnConfig(mh_adapter: bool = False, output_adapter: bool = True, reduction_factor: ~typing.Union[float, ~collections.abc.Mapping] = 2, non_linearity: str = 'relu', original_ln_before: bool = False, original_ln_after: bool = True, ln_before: bool = False, ln_after: bool = False, init_weights: str = 'mam_adapter', is_parallel: bool = True, scaling: ~typing.Union[float, str] = 4.0, use_gating: bool = False, residual_before_ln: ~typing.Union[bool, str] = True, adapter_residual_before_ln: bool = False, inv_adapter: ~typing.Optional[str] = None, inv_adapter_reduction_factor: ~typing.Optional[float] = None, cross_adapter: bool = False, leave_out: ~typing.List[int] = <factory>, dropout: float = 0.0, phm_layer: bool = False, phm_dim: int = 4, factorized_phm_W: ~typing.Optional[bool] = True, shared_W_phm: ~typing.Optional[bool] = False, shared_phm_rule: ~typing.Optional[bool] = True, factorized_phm_rule: ~typing.Optional[bool] = False, phm_c_init: ~typing.Optional[str] = 'normal', phm_init_range: ~typing.Optional[float] = 0.0001, learn_phm: ~typing.Optional[bool] = True, hypercomplex_nonlinearity: ~typing.Optional[str] = 'glorot-uniform', phm_rank: ~typing.Optional[int] = 1, phm_bias: ~typing.Optional[bool] = True)
+

The parallel adapter architecture proposed by He et al. (2021). See https://arxiv.org/pdf/2110.04366.pdf.

+
+ +
+
+class adapters.CompacterConfig(mh_adapter: bool = True, output_adapter: bool = True, reduction_factor: ~typing.Union[float, ~collections.abc.Mapping] = 32, non_linearity: str = 'gelu', original_ln_before: bool = False, original_ln_after: bool = True, ln_before: bool = False, ln_after: bool = False, init_weights: str = 'bert', is_parallel: bool = False, scaling: ~typing.Union[float, str] = 1.0, use_gating: bool = False, residual_before_ln: ~typing.Union[bool, str] = True, adapter_residual_before_ln: bool = False, inv_adapter: ~typing.Optional[str] = None, inv_adapter_reduction_factor: ~typing.Optional[float] = None, cross_adapter: bool = False, leave_out: ~typing.List[int] = <factory>, dropout: float = 0.0, phm_layer: bool = True, phm_dim: int = 4, factorized_phm_W: ~typing.Optional[bool] = True, shared_W_phm: ~typing.Optional[bool] = False, shared_phm_rule: ~typing.Optional[bool] = True, factorized_phm_rule: ~typing.Optional[bool] = False, phm_c_init: ~typing.Optional[str] = 'normal', phm_init_range: ~typing.Optional[float] = 0.0001, learn_phm: ~typing.Optional[bool] = True, hypercomplex_nonlinearity: ~typing.Optional[str] = 'glorot-uniform', phm_rank: ~typing.Optional[int] = 1, phm_bias: ~typing.Optional[bool] = True)
+

The Compacter architecture proposed by Mahabadi et al. (2021). See https://arxiv.org/pdf/2106.04647.pdf.

+
+ +
+
+class adapters.CompacterPlusPlusConfig(mh_adapter: bool = False, output_adapter: bool = True, reduction_factor: ~typing.Union[float, ~collections.abc.Mapping] = 32, non_linearity: str = 'gelu', original_ln_before: bool = True, original_ln_after: bool = True, ln_before: bool = False, ln_after: bool = False, init_weights: str = 'bert', is_parallel: bool = False, scaling: ~typing.Union[float, str] = 1.0, use_gating: bool = False, residual_before_ln: ~typing.Union[bool, str] = True, adapter_residual_before_ln: bool = False, inv_adapter: ~typing.Optional[str] = None, inv_adapter_reduction_factor: ~typing.Optional[float] = None, cross_adapter: bool = False, leave_out: ~typing.List[int] = <factory>, dropout: float = 0.0, phm_layer: bool = True, phm_dim: int = 4, factorized_phm_W: ~typing.Optional[bool] = True, shared_W_phm: ~typing.Optional[bool] = False, shared_phm_rule: ~typing.Optional[bool] = True, factorized_phm_rule: ~typing.Optional[bool] = False, phm_c_init: ~typing.Optional[str] = 'normal', phm_init_range: ~typing.Optional[float] = 0.0001, learn_phm: ~typing.Optional[bool] = True, hypercomplex_nonlinearity: ~typing.Optional[str] = 'glorot-uniform', phm_rank: ~typing.Optional[int] = 1, phm_bias: ~typing.Optional[bool] = True)
+

The Compacter++ architecture proposed by Mahabadi et al. (2021). See https://arxiv.org/pdf/2106.04647.pdf.

+
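All of these bottleneck configurations can be passed to model.add_adapter(). A minimal sketch, assuming a bert-base-uncased checkpoint (the adapter names are arbitrary):

import adapters
from adapters import DoubleSeqBnConfig, CompacterPlusPlusConfig
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
adapters.init(model)  # enable adapter support on a plain Transformers model

# Houlsby-style bottleneck adapter with stronger compression than the default
model.add_adapter("bottleneck", config=DoubleSeqBnConfig(reduction_factor=32))

# Compacter++ variant with its PHM-parameterized down/up projections
model.add_adapter("compacter_pp", config=CompacterPlusPlusConfig())

model.set_active_adapters("bottleneck")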
+ +
+
+

Prefix Tuning

+
+
+class adapters.PrefixTuningConfig(architecture: ~typing.Optional[str] = 'prefix_tuning', encoder_prefix: bool = True, cross_prefix: bool = True, leave_out: ~typing.List[int] = <factory>, flat: bool = False, prefix_length: int = 30, bottleneck_size: int = 512, non_linearity: str = 'tanh', dropout: float = 0.0, use_gating: bool = False, shared_gating: bool = True)
+

The Prefix Tuning architecture proposed by Li & Liang (2021). See https://arxiv.org/pdf/2101.00190.pdf.

+
+
Parameters
+
    +
  • encoder_prefix (bool) – If True, add prefixes to the encoder of an encoder-decoder model.

  • +
  • cross_prefix (bool) – If True, add prefixes to the cross attention of an encoder-decoder model.

  • +
  • flat (bool) – If True, train the prefix parameters directly. Otherwise, reparametrize using a bottleneck MLP.

  • +
  • prefix_length (int) – The length of the prefix.

  • +
  • bottleneck_size (int) – If flat=False, the size of the bottleneck MLP.

  • +
  • non_linearity (str) – If flat=False, the non-linearity used in the bottleneck MLP.

  • +
  • dropout (float) – The dropout rate used in the prefix tuning layer.

  • +
  • leave_out (List[int]) – The IDs of the layers (starting at 0) where NO prefix should be added.

  • +
  • use_gating (bool, optional) – Place a trainable gating module besides the added parameter module to control module activation. This is +e.g. used for UniPELT. Defaults to False.

  • +
  • shared_gating (bool, optional) – Whether to use a shared gate for the prefixes of all attention matrices. Only applicable if use_gating=True. Defaults to True.

  • +
+
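A short sketch of how these parameters fit together, reusing the model initialized in the bottleneck example above (the adapter name is a placeholder):

from adapters import PrefixTuningConfig

# a reparameterized prefix: a bottleneck MLP of size 512 produces 30 prefix vectors
config = PrefixTuningConfig(flat=False, prefix_length=30, bottleneck_size=512)
model.add_adapter("prefix", config=config)

# after training, the reparameterization can be dropped via
# model.eject_prefix_tuning("prefix") (see ModelAdaptersMixin further below)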
+
+
+
+classmethod from_dict(config)
+

Creates a config class from a Python dict.

+
+ +
+
+classmethod load(config: Union[dict, str], download_kwargs=None, **kwargs)
+

Loads a given adapter configuration specifier into a full AdapterConfig instance.

+
+
Parameters
+

config (Union[dict, str]) –

The configuration to load. Can be either:

+
    +
  • a dictionary representing the full config

  • +
  • an identifier string available in ADAPTER_CONFIG_MAP

  • +
  • the path to a file containing a full adapter configuration

  • +
  • an identifier string available in Adapter-Hub

  • +
+

+
+
Returns
+

The resolved adapter configuration dictionary.

+
+
Return type
+

dict

+
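A minimal sketch of the loader ("lora" is an identifier from ADAPTER_CONFIG_MAP; the local path is hypothetical, and the extra kwargs are assumed here to override values of the loaded config):

from adapters import AdapterConfig

# resolve an identifier string and override one of its values
config = AdapterConfig.load("lora", r=16)

# load a full configuration from a file
config = AdapterConfig.load("./my_adapter/adapter_config.json")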
+
+
+ +
+
+replace(**changes)
+

Returns a new instance of the config class with the specified changes applied.

+
+ +
+
+to_dict()
+

Converts the config class to a Python dict.

+
+ +
+ +
+
+

LoRAConfig

+
+
+class adapters.LoRAConfig(architecture: ~typing.Optional[str] = 'lora', selfattn_lora: bool = True, intermediate_lora: bool = False, output_lora: bool = False, leave_out: ~typing.List[int] = <factory>, r: int = 8, alpha: int = 8, dropout: float = 0.0, attn_matrices: ~typing.List[str] = <factory>, composition_mode: str = 'add', init_weights: str = 'lora', use_gating: bool = False)
+

The Low-Rank Adaptation (LoRA) architecture proposed by Hu et al. (2021). See https://arxiv.org/pdf/2106.09685.pdf. +LoRA adapts a model by reparametrizing the weights of a layer matrix. You can merge the additional weights with the +original layer weights using model.merge_adapter("lora_name").

+
+
Parameters
+
    +
  • selfattn_lora (bool, optional) – If True, add LoRA to the self-attention weights of a model. +Defaults to True.

  • +
  • intermediate_lora (bool, optional) – If True, add LoRA to the intermediate MLP weights of a model. +Defaults to False.

  • +
  • output_lora (bool, optional) – If True, add LoRA to the output MLP weights of a model. +Defaults to False.

  • +
  • leave_out (List[int], optional) – The IDs of the layers (starting at 0) where NO adapter modules should be added.

  • +
  • r (int, optional) – The rank of the LoRA layer. Defaults to 8.

  • +
  • alpha (int, optional) – The hyperparameter used for scaling the LoRA reparametrization. Defaults to 8.

  • +
  • dropout (float, optional) – The dropout rate used in the LoRA layer. Defaults to 0.0.

  • +
  • attn_matrices (List[str], optional) – Determines which matrices of the self-attention module to adapt. +A list that may contain the strings “q” (query), “k” (key), “v” (value). Defaults to [“q”, “v”].

  • +
  • composition_mode (str, optional) – Defines how the injected weights are composed with the original model weights. Can be either “add” +(addition of decomposed matrix, as in LoRA) or “scale” (element-wise multiplication of vector, as in +(IA)^3). “scale” can only be used together with r=1. Defaults to “add”.

  • +
  • init_weights (str, optional) – Initialization method for the weights of the LoRA modules. +Currently, this can be either “lora” (default) or “bert”.

  • +
  • use_gating (bool, optional) – Place a trainable gating module besides the added parameter module to control module activation. This is +e.g. used for UniPELT. Defaults to False. Note that modules with use_gating=True cannot be merged using +merge_adapter().

  • +
+
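A sketch of the merge workflow described above (model and adapter names are placeholders):

from adapters import LoRAConfig

config = LoRAConfig(r=16, alpha=16, attn_matrices=["q", "k", "v"])
model.add_adapter("lora_adapter", config=config)

# fold the trained low-rank weights into the base weights for inference ...
model.merge_adapter("lora_adapter")
# ... and undo the merge again if needed
model.reset_adapter()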
+
+
+
+classmethod from_dict(config)
+

Creates a config class from a Python dict.

+
+ +
+
+classmethod load(config: Union[dict, str], download_kwargs=None, **kwargs)
+

Loads a given adapter configuration specifier into a full AdapterConfig instance.

+
+
Parameters
+

config (Union[dict, str]) –

The configuration to load. Can be either:

+
    +
  • a dictionary representing the full config

  • +
  • an identifier string available in ADAPTER_CONFIG_MAP

  • +
  • the path to a file containing a full adapter configuration

  • +
  • an identifier string available in Adapter-Hub

  • +
+

+
+
Returns
+

The resolved adapter configuration dictionary.

+
+
Return type
+

dict

+
+
+
+ +
+
+replace(**changes)
+

Returns a new instance of the config class with the specified changes applied.

+
+ +
+
+to_dict()
+

Converts the config class to a Python dict.

+
+ +
+ +
+
+

IA3Config

+
+
+class adapters.IA3Config(architecture: ~typing.Optional[str] = 'lora', selfattn_lora: bool = True, intermediate_lora: bool = True, output_lora: bool = False, leave_out: ~typing.List[int] = <factory>, r: int = 1, alpha: int = 1, dropout: float = 0.0, attn_matrices: ~typing.List[str] = <factory>, composition_mode: str = 'scale', init_weights: str = 'ia3', use_gating: bool = False)
+

The ‘Infused Adapter by Inhibiting and Amplifying Inner Activations’ ((IA)^3) architecture proposed by Liu et al. (2022). See https://arxiv.org/pdf/2205.05638.pdf. (IA)^3 builds on LoRA but, unlike LoRA’s additive composition, scales the weights of a layer using an injected vector.

+
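Since (IA)^3 is expressed through the LoRA parameters (r=1, composition_mode="scale"), the default configuration is typically used as-is; a minimal sketch:

from adapters import IA3Config

model.add_adapter("ia3_adapter", config=IA3Config())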
+
+classmethod from_dict(config)
+

Creates a config class from a Python dict.

+
+ +
+
+classmethod load(config: Union[dict, str], download_kwargs=None, **kwargs)
+

Loads a given adapter configuration specifier into a full AdapterConfig instance.

+
+
Parameters
+

config (Union[dict, str]) –

The configuration to load. Can be either:

+
    +
  • a dictionary representing the full config

  • +
  • an identifier string available in ADAPTER_CONFIG_MAP

  • +
  • the path to a file containing a full adapter configuration

  • +
  • an identifier string available in Adapter-Hub

  • +
+

+
+
Returns
+

The resolved adapter configuration dictionary.

+
+
Return type
+

dict

+
+
+
+ +
+
+replace(**changes)
+

Returns a new instance of the config class with the specified changes applied.

+
+ +
+
+to_dict()
+

Converts the config class to a Python dict.

+
+ +
+ +
+
+

PromptTuningConfig

+
+
+class adapters.PromptTuningConfig(architecture: str = 'prompt_tuning', prompt_length: int = 10, prompt_init: str = 'random_uniform', prompt_init_text: Optional[str] = None, combine: str = 'prefix')
+

The Prompt Tuning architecture proposed by Lester et al. (2021). See https://arxiv.org/pdf/2104.08691.pdf.

+
+
Parameters
+
    +
  • prompt_length (int) – The number of tokens in the prompt. +Defaults to 10.

  • +
  • prompt_init (str) – The initialization method for the prompt. Can be either “random_uniform” or “from_string”. +Defaults to “random_uniform”.

  • +
  • prompt_init_text (str) – The text to use for prompt initialization if prompt_init=”from_string”.

  • +
  • random_uniform_scale (float) – The scale of the random uniform initialization if prompt_init=”random_uniform”. +Defaults to 0.5 as in the paper.

  • +
  • combine (str) – The method used to combine the prompt with the input. Can be either “prefix” or “prefix_after_bos”. +Defaults to “prefix”.

  • +
+
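A sketch initializing the soft prompt from a text string (the prompt text is an arbitrary example):

from adapters import PromptTuningConfig

config = PromptTuningConfig(
    prompt_length=10,
    prompt_init="from_string",
    prompt_init_text="Classify the sentiment of this review:",
)
model.add_adapter("prompt", config=config)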
+
+
+
+classmethod from_dict(config)
+

Creates a config class from a Python dict.

+
+ +
+
+classmethod load(config: Union[dict, str], download_kwargs=None, **kwargs)
+

Loads a given adapter configuration specifier into a full AdapterConfig instance.

+
+
Parameters
+

config (Union[dict, str]) –

The configuration to load. Can be either:

+
    +
  • a dictionary representing the full config

  • +
  • an identifier string available in ADAPTER_CONFIG_MAP

  • +
  • the path to a file containing a full adapter configuration

  • +
  • an identifier string available in Adapter-Hub

  • +
+

+
+
Returns
+

The resolved adapter configuration dictionary.

+
+
Return type
+

dict

+
+
+
+ +
+
+replace(**changes)
+

Returns a new instance of the config class with the specified changes applied.

+
+ +
+
+to_dict()
+

Converts the config class to a Python dict.

+
+ +
+ +
+
+

Combined configurations

+
+
+class adapters.ConfigUnion(*configs: List[AdapterConfig])
+

Composes multiple adaptation method configurations into one. This class can be used to define complex adaptation +method setups.

+
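A sketch combining two methods into one setup; validate() can be called up front to surface conflicts early (whether a particular pair is combinable depends on the configs):

from adapters import ConfigUnion, LoRAConfig, ParBnConfig

configs = [LoRAConfig(r=8, use_gating=True), ParBnConfig(reduction_factor=4)]
ConfigUnion.validate(configs)  # raises TypeError/ValueError on invalid combinations
model.add_adapter("mixed", config=ConfigUnion(*configs))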
+
+classmethod from_dict(config)
+

Creates a config class from a Python dict.

+
+ +
+
+classmethod load(config: Union[dict, str], download_kwargs=None, **kwargs)
+

Loads a given adapter configuration specifier into a full AdapterConfig instance.

+
+
Parameters
+

config (Union[dict, str]) –

The configuration to load. Can be either:

+
    +
  • a dictionary representing the full config

  • +
  • an identifier string available in ADAPTER_CONFIG_MAP

  • +
  • the path to a file containing a full adapter configuration

  • +
  • an identifier string available in Adapter-Hub

  • +
+

+
+
Returns
+

The resolved adapter configuration dictionary.

+
+
Return type
+

dict

+
+
+
+ +
+
+replace(**changes)
+

Returns a new instance of the config class with the specified changes applied.

+
+ +
+
+to_dict()
+

Converts the config class to a Python dict.

+
+ +
+
+static validate(configs)
+

Performs simple validations of a list of configurations to check whether they can be combined to a common +setup.

+
+
Parameters
+

configs (List[AdapterConfig]) – list of configs to check.

+
+
Raises
+
    +
  • TypeError – One of the configurations has a wrong type.

  • +
  • ValueError – At least two of the given configurations conflict.

  • +
+
+
+
+ +
+ +
+
+class adapters.MAMConfig(prefix_tuning: Optional[PrefixTuningConfig] = None, adapter: Optional[BnConfig] = None)
+

The Mix-And-Match adapter architecture proposed by He et al. (2021). See https://arxiv.org/pdf/2110.04366.pdf.

+
+ +
+
+class adapters.UniPELTConfig(prefix_tuning: Optional[PrefixTuningConfig] = None, adapter: Optional[BnConfig] = None, lora: Optional[LoRAConfig] = None)
+

The UniPELT adapter architecture proposed by Mao et al. (2022). See https://arxiv.org/pdf/2110.07577.pdf.

+
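Both classes bundle pre-configured method combinations and can be passed like any other config; a minimal sketch:

from adapters import MAMConfig, UniPELTConfig

model.add_adapter("mam", config=MAMConfig())          # prefix tuning + parallel bottleneck
model.add_adapter("unipelt", config=UniPELTConfig())  # gated prefix tuning + bottleneck + LoRA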
+ +
+
+

Adapter Fusion

+
+
+class adapters.AdapterFusionConfig(key: bool, query: bool, value: bool, query_before_ln: bool, regularization: bool, residual_before: bool, temperature: bool, value_before_softmax: bool, value_initialized: str, dropout_prob: float)
+

Base class that models the architecture of an adapter fusion layer.

+
+
+classmethod from_dict(config)
+

Creates a config class from a Python dict.

+
+ +
+
+classmethod load(config: Union[dict, str], **kwargs)
+

Loads a given adapter fusion configuration specifier into a full AdapterFusionConfig instance.

+
+
Parameters
+

config (Union[dict, str]) –

The configuration to load. Can be either:

+
    +
  • a dictionary representing the full config

  • +
  • an identifier string available in ADAPTERFUSION_CONFIG_MAP

  • +
  • the path to a file containing a full adapter fusion configuration

  • +
+

+
+
Returns
+

The resolved adapter fusion configuration dictionary.

+
+
Return type
+

dict

+
+
+
+ +
+
+replace(**changes)
+

Returns a new instance of the config class with the specified changes applied.

+
+ +
+
+to_dict()
+

Converts the config class to a Python dict.

+
+ +
+ +
+
+class adapters.StaticAdapterFusionConfig(key: bool = True, query: bool = True, value: bool = False, query_before_ln: bool = False, regularization: bool = False, residual_before: bool = False, temperature: bool = False, value_before_softmax: bool = True, value_initialized: str = False, dropout_prob: Optional[float] = None)
+

Static version of adapter fusion without a value matrix. See https://arxiv.org/pdf/2005.00247.pdf.

+
+ +
+
+class adapters.DynamicAdapterFusionConfig(key: bool = True, query: bool = True, value: bool = True, query_before_ln: bool = False, regularization: bool = True, residual_before: bool = False, temperature: bool = False, value_before_softmax: bool = True, value_initialized: str = True, dropout_prob: Optional[float] = None)
+

Dynamic version of adapter fusion with a value matrix and regularization. See https://arxiv.org/pdf/2005.00247.pdf.

+
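A sketch of setting up fusion over two previously trained adapters (the names "a" and "b" are placeholders; "dynamic" is assumed to be a valid identifier in ADAPTERFUSION_CONFIG_MAP):

from adapters.composition import Fuse

# assumes adapters "a" and "b" have already been added to the model
model.add_adapter_fusion(Fuse("a", "b"), config="dynamic")
model.train_adapter_fusion(Fuse("a", "b"))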
+ +
+
+

Adapter Setup

+
+
+class adapters.AdapterSetup(adapter_setup, head_setup=None, ignore_empty: bool = False)
+

Represents an adapter setup of a model including active adapters and active heads. This class is intended to be +used as a context manager using the with statement. The setup defined by the AdapterSetup context will +override static adapter setups defined in a model (i.e. setups specified via active_adapters).

+

Example:

+
with AdapterSetup(Stack("a", "b")):
+    # will use the adapter stack "a" and "b"
+    outputs = model(**inputs)
+
+
+

Note that the context manager is thread-local, i.e. it can be used with different setups in a multi-threaded +environment.

+
+ +
+
\ No newline at end of file
diff --git a/classes/adapter_layer.html b/classes/adapter_layer.html
new file mode 100644
index 0000000000..6579dccb43
--- /dev/null
+++ b/classes/adapter_layer.html
@@ -0,0 +1,591 @@

Adapter Implementation

+

The following classes define the common interfaces for all adapter methods. +They further hold logic shared by all adapter implementations. +All newly added adapter methods should inherit from either one of these classes.

+
+
+class adapters.AdapterLayerBase
+

Base class for all adaptation methods that require per-layer modules.

+

Make sure the ‘adapter_modules_name’ attribute is overridden in derived classes.

+
+
+abstract add_adapter(adapter_name: str, layer_idx: int) bool
+

Adds a new adapter module to the layer.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the new adapter to add.

  • +
  • layer_idx (int) – The index of the adapter layer (this should be set once by the first added adapter and then kept fixed).

  • +
+
+
Returns
+

True if the adapter was added, False otherwise.

+
+
Return type
+

bool

+
+
+
+ +
+
+abstract average_adapter(adapter_name: str, input_adapters: Dict[str, float]) bool
+

Averages a set of adapter modules into a new adapter module.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the new (averaged) adapter module to add.

  • +
  • input_adapters (Dict[str, float]) – Either a list of adapter names (with equal weighting) or a dictionary of adapter names and their corresponding weights.

  • +
+
+
Returns
+

True if the adapter was added, False otherwise.

+
+
Return type
+

bool

+
+
+
+ +
+
+abstract delete_adapter(adapter_name: str)
+

Deletes an adapter module from the layer.

+
+
Parameters
+

adapter_name (str) – The name of the adapter to delete.

+
+
+
+ +
+
+abstract enable_adapters(adapter_setup: AdapterCompositionBlock, unfreeze_adapters: bool, unfreeze_fusion: bool)
+

Enables/ disables a set of adapter modules within the layer.

+
+
Parameters
+
    +
  • adapter_setup (AdapterCompositionBlock) – The adapter setup to enable/ disable.

  • +
  • unfreeze_adapters (bool) – Whether to unfreeze the adapters.

  • +
  • unfreeze_fusion (bool) – Whether to unfreeze the fusion layers.

  • +
+
+
+
+ +
+
+abstract get_adapter(adapter_name: str) Module
+

Returns the adapter module with the given name.

+
+
Parameters
+

adapter_name (str) – The name of the adapter module.

+
+
+
+ +
+ +
+
+class adapters.ComposableAdapterLayerBase(*args, **kwargs)
+

Base class for all adapter methods that support composition.

+

Make sure the ‘adapter_modules_name’ and ‘supported_compositions’ attributes as well as all abstract methods are overridden in derived classes. ‘allow_multi_parallelize’ can be set to True to allow inputs to be parallelized independently multiple times. This is useful when there are multiple parallel input flows through an adapter layer (e.g. in LoRA).

+
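A skeletal sketch of a new composable method implementation (the module path and attribute values below are assumptions; all abstract-method bodies are elided):

import torch

# module path is an assumption; adjust to where the base classes live in your version
from adapters.methods.adapter_layer_base import ComposableAdapterLayerBase
from adapters.composition import Stack, Parallel


class MyMethodLayer(ComposableAdapterLayerBase, torch.nn.Module):
    adapter_modules_name = "my_modules"         # name of the container holding per-adapter modules
    supported_compositions = [Stack, Parallel]  # composition blocks this method supports

    def __init__(self):
        super().__init__()
        self.my_modules = torch.nn.ModuleDict()

    # all abstract methods of AdapterLayerBase and ComposableAdapterLayerBase
    # (add_adapter, average_adapter, delete_adapter, enable_adapters, get_adapter,
    #  compose_single, vslice, pad_and_concat, repeat, mean) must be implemented
    def add_adapter(self, adapter_name: str, layer_idx: int) -> bool:
        ...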
+
+check_composition_valid(parent: AdapterCompositionBlock, child: AdapterCompositionBlock, lvl: int)
+

Checks whether the given composition is valid.

+
+
Parameters
+
    +
  • parent (AdapterCompositionBlock) – The parent composition block.

  • +
  • child (AdapterCompositionBlock) – The child composition block.

  • +
  • lvl (int) – The composition depth.

  • +
+
+
Raises
+

ValueError – If the composition is invalid.

+
+
+
+ +
+
+compose(adapter_setup: Union[AdapterCompositionBlock, str], state: NamedTuple) NamedTuple
+

The main composition forward method, which recursively calls the forward methods of the composition blocks. This method should be called by the forward method of the derived class.

+
+
Parameters
+
    +
  • adapter_setup (Union[AdapterCompositionBlock, str]) – The adapter setup to be used.

  • +
  • state (NamedTuple) – The current state.

  • +
+
+
Returns
+

The state after forwarding through the adapter setup.

+
+
Return type
+

NamedTuple

+
+
+
+ +
+
+compose_average(adapter_setup: Average, state: NamedTuple, lvl: int = 0)
+

For averaging the output representations of multiple adapters.

+
+ +
+
+compose_batch_split(adapter_setup: BatchSplit, state: NamedTuple, lvl: int = 0)
+

For splitting to multiple adapters along the batch size dimension.

+
+ +
+
+compose_fuse(adapter_setup: Fuse, state: NamedTuple, lvl: int = 0)
+

For fusing multiple adapters using adapter fusion. NOTE: This method has no default implementation.

+
+ +
+
+compose_parallel(adapter_setup: Parallel, state: NamedTuple, lvl: int = 0)
+

For parallel execution of the adapters on the same input. This means that the input is repeated N times before +feeding it to the adapters (where N is the number of adapters).

+
+ +
+
+abstract compose_single(adapter_setup: str, state: NamedTuple, lvl: int = 0) NamedTuple
+

Forwards the given state through the given single adapter.

+
+
Parameters
+
    +
  • adapter_setup (str) – The name of the adapter.

  • +
  • state (NamedTuple) – The state to be forwarded.

  • +
  • lvl (int, optional) – The composition depth. Defaults to 0.

  • +
+
+
Returns
+

The state after forwarding through the adapter.

+
+
Return type
+

NamedTuple

+
+
+
+ +
+
+compose_split(adapter_setup: Split, state: NamedTuple, lvl: int = 0)
+

For splitting to multiple adapters along the sequence length dimension. NOTE: This method has no default +implementation.

+
+ +
+
+compose_stack(adapter_setup: Stack, state: NamedTuple, lvl: int = 0) NamedTuple
+

For sequentially stacking multiple adapters.

+
+ +
+
+abstract mean(states: List[NamedTuple], weights: Tensor) NamedTuple
+

Averages the given states along the batch size dimension by the given weights. +This is e.g. used by the Average composition block. IMPORTANT: Has to be implemented by all derived classes.

+
+
Parameters
+
    +
  • states (List[NamedTuple]) – The states to be averaged.

  • +
  • weights (torch.Tensor) – The averaging weights.

  • +
+
+
Returns
+

The averaged state.

+
+
Return type
+

NamedTuple

+
+
+
+ +
+
+abstract pad_and_concat(states: List[NamedTuple]) NamedTuple
+

Concatenates the given states along the batch size dimension. +Pads the states before concatenation if necessary. This is e.g. used by the BatchSplit and Parallel composition +blocks. IMPORTANT: Has to be implemented by all derived classes.

+
+
Parameters
+

states (List[NamedTuple]) – The states to be concatenated.

+
+
Returns
+

The concatenated state.

+
+
Return type
+

NamedTuple

+
+
+
+ +
+
+pre_block(adapter_setup: Union[AdapterCompositionBlock, str], state: NamedTuple) NamedTuple
+

Optional state pre-processing method which is invoked before passing the state to the first child block of a +composition. By default, this method does not contain any logic. E.g. used for bottleneck adapters to implement +residuals and LNs.

+
+
Parameters
+
    +
  • adapter_setup (Union[AdapterCompositionBlock, str]) – The current composition or single adapter.

  • +
  • state (NamedTuple) – The current state.

  • +
+
+
Returns
+

The pre-processed state.

+
+
Return type
+

NamedTuple

+
+
+
+ +
+
+abstract repeat(state: NamedTuple, channels: int) NamedTuple
+

Repeats the given state along the batch size dimension for the given number of times. +This is e.g. used by the Parallel composition block. IMPORTANT: Has to be implemented by all derived classes.

+
+
Parameters
+
    +
  • state (NamedTuple) – The state to be repeated.

  • +
  • channels (int) – The number of times the state should be repeated.

  • +
+
+
Returns
+

The repeated state.

+
+
Return type
+

NamedTuple

+
+
+
+ +
+
+abstract vslice(state: NamedTuple, slice_obj: slice) NamedTuple
+

Slices the given state along the batch size (vertical) dimension. +This is e.g. used by the BatchSplit and Parallel composition blocks. IMPORTANT: Has to be implemented by all +derived classes.

+
+
Parameters
+
    +
  • state (NamedTuple) – The state to be sliced.

  • +
  • slice_obj (slice) – The slice object.

  • +
+
+
Returns
+

The sliced state.

+
+
Return type
+

NamedTuple

+
+
+
+ +
+ +
\ No newline at end of file
diff --git a/classes/adapter_training.html b/classes/adapter_training.html
new file mode 100644
index 0000000000..4795c55143
--- /dev/null
+++ b/classes/adapter_training.html
@@ -0,0 +1,364 @@

Adapter Training

+

Classes and methods related to training adapters.

+
+
+class adapters.training.AdapterArguments(train_adapter: bool = False, load_adapter: Optional[str] = '', adapter_config: Optional[str] = 'seq_bn', load_lang_adapter: Optional[str] = None, lang_adapter_config: Optional[str] = None)
+

The subset of arguments related to adapter training.

+
+
Parameters
+
    +
  • train_adapter (bool) – Whether to train an adapter instead of the full model.

  • +
  • load_adapter (str) – Pre-trained adapter module to be loaded from Hub.

  • +
  • adapter_config (str) – Adapter configuration. Either a config string or a path to a file.

  • +
  • load_lang_adapter (str) – Pre-trained language adapter module to be loaded from Hub.

  • +
  • lang_adapter_config (str) – Language adapter configuration. Either an identifier or a path to a file.

  • +
+
+
+
+ +
+
+adapters.training.setup_adapter_training(model, adapter_args: AdapterArguments, adapter_name: str, adapter_config_kwargs: Optional[dict] = None, adapter_load_kwargs: Optional[dict] = None)
+

Setup model for adapter training based on given adapter arguments.

+
+
Parameters
+
    +
  • model – The model instance to be trained.

  • +
  • adapter_args (AdapterArguments) – The adapter arguments used for configuration.

  • +
  • adapter_name (str) – The name of the adapter to be added.

  • +
+
+
Returns
+

A tuple containing the names of the loaded adapters.

+
+
Return type
+

Tuple[str, str]

+
+
+
+ +
+
+class adapters.trainer.AdapterTrainer(model: Optional[Union[PreTrainedModel, Module]] = None, args: Optional[TrainingArguments] = None, data_collator: Optional[DataCollator] = None, train_dataset: Optional[Dataset] = None, eval_dataset: Optional[Dataset] = None, tokenizer: Optional[PreTrainedTokenizerBase] = None, model_init: Optional[Callable[[], PreTrainedModel]] = None, compute_metrics: Optional[Callable[[EvalPrediction], Dict]] = None, callbacks: Optional[List[TrainerCallback]] = None, adapter_names: Optional[List[List[str]]] = None, optimizers: Tuple[Optimizer, LambdaLR] = (None, None), preprocess_logits_for_metrics: Optional[Callable[[Tensor, Tensor], Tensor]] = None)
+
+
+create_optimizer()
+

Setup the optimizer.

+

We provide a reasonable default that works well. If you want to use something else, you can pass a tuple in the +Trainer’s init through optimizers, or subclass and override this method in a subclass.

+
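A hedged sketch of a typical training run with AdapterTrainer (the datasets and adapter name are assumed to exist):

from transformers import TrainingArguments
from adapters import AdapterTrainer

model.train_adapter("bottleneck")  # freeze the base model, train only the adapter

trainer = AdapterTrainer(
    model=model,
    args=TrainingArguments(output_dir="./out", num_train_epochs=3),
    train_dataset=train_dataset,  # assumed to be defined
    eval_dataset=eval_dataset,    # assumed to be defined
)
trainer.train()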
+ +
+ +
+
+class adapters.trainer.AdapterTrainerCallback(trainer)
+
+
+on_step_end(args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs)
+

Event called at the end of a training step. If using gradient accumulation, one training step might take +several inputs.

+
+ +
+
+on_train_begin(args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs)
+

Event called at the beginning of training.

+
+ +
+ +
+
+class adapters.trainer.Seq2SeqAdapterTrainer(model: Optional[Union[PreTrainedModel, Module]] = None, args: Optional[TrainingArguments] = None, data_collator: Optional[DataCollator] = None, train_dataset: Optional[Dataset] = None, eval_dataset: Optional[Dataset] = None, tokenizer: Optional[PreTrainedTokenizerBase] = None, model_init: Optional[Callable[[], PreTrainedModel]] = None, compute_metrics: Optional[Callable[[EvalPrediction], Dict]] = None, callbacks: Optional[List[TrainerCallback]] = None, adapter_names: Optional[List[List[str]]] = None, optimizers: Tuple[Optimizer, LambdaLR] = (None, None), preprocess_logits_for_metrics: Optional[Callable[[Tensor, Tensor], Tensor]] = None)
+
+ +
\ No newline at end of file
diff --git a/classes/adapter_utils.html b/classes/adapter_utils.html
new file mode 100644
index 0000000000..9aea244f27
--- /dev/null
+++ b/classes/adapter_utils.html
@@ -0,0 +1,505 @@

Adapter Utilities

+

A collection of utility methods mainly related to searching and loading adapter modules from +Adapter-Hub.

+
+
+class adapters.utils.AdapterInfo(source: str, adapter_id: str, model_name: Optional[str] = None, task: Optional[str] = None, subtask: Optional[str] = None, username: Optional[str] = None, adapter_config: Optional[dict] = None, sha1_checksum: Optional[str] = None)
+

Holds information about an adapter publicly available on AdapterHub or huggingface.co. Returned by +list_adapters().

+
+
Parameters
+
    +
  • source (str) – The source repository of this adapter. Can be either “ah” (AdapterHub) or “hf” (huggingface.co).

  • +
  • adapter_id (str) – The unique identifier of this adapter.

  • +
  • model_name (str, optional) – The identifier of the model this adapter was trained for.

  • +
  • task (str, optional) – The task this adapter was trained for.

  • +
  • subtask (str, optional) – The subtask or dataset this adapter was trained on.

  • +
  • username (str, optional) – The username of the author(s) of this adapter.

  • +
  • adapter_config (dict, optional) – The configuration dictionary of this adapter.

  • +
+
+
+
+ +
+
+class adapters.utils.AdapterType(value)
+

Models all currently available model adapter types.

+
+ +
+
+adapters.utils.get_adapter_config_hash(config, length=16)
+

Calculates the hash of a given adapter configuration which is used to identify this configuration.

+
+
Returns
+

The resulting hash of the given config dict.

+
+
Return type
+

str

+
+
+
+ +
+
+adapters.utils.get_adapter_info(adapter_id: str, source: str = 'ah') Optional[AdapterInfo]
+

Retrieves information about a specific adapter.

+
+
Parameters
+
    +
  • adapter_id (str) – The identifier of the adapter to retrieve.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to get adapters. Can be either:

    +
      +
    • ”ah”: search on AdapterHub.ml.

    • +
    • ”hf”: search on HuggingFace model hub (huggingface.co).

    • +
    +

  • +
+
+
Returns
+

The adapter information or None if the adapter was not found.

+
+
Return type
+

AdapterInfo

+
+
+
+ +
+
+adapters.utils.get_from_cache(url: str, cache_dir=None, force_download=False, proxies=None, etag_timeout=10, resume_download=False, user_agent: Optional[Union[Dict, str]] = None, use_auth_token: Optional[Union[bool, str]] = None, local_files_only=False) Optional[str]
+

Given a URL, look for the corresponding file in the local cache. If it’s not there, download it. Then return the +path to the cached file.

+
+
Returns
+

Local path (string) of file or if networking is off, last version of file cached on disk.

+
+
Raises
+

In case of a non-recoverable file (non-existent or inaccessible URL and no cache on disk).

+
+
+
+ +
+
+adapters.utils.list_adapters(source: Optional[str] = None, model_name: Optional[str] = None) List[AdapterInfo]
+

Retrieves a list of all publicly available adapters on AdapterHub.ml or on huggingface.co.

+
+
Parameters
+
    +
  • source (str, optional) –

    Identifier of the source(s) from where to get adapters. Can be either:

    +
      +
    • ”ah”: search on AdapterHub.ml.

    • +
    • ”hf”: search on HuggingFace model hub (huggingface.co).

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • model_name (str, optional) – If specified, only returns adapters trained for the model with this identifier.

  • +
+
+
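A sketch of querying the index (results depend on what is currently published; the adapter id below is only an example):

from adapters.utils import get_adapter_info, list_adapters

infos = list_adapters(source="hf", model_name="bert-base-uncased")
for info in infos[:5]:
    print(info.source, info.adapter_id, info.task)

info = get_adapter_info("AdapterHub/bert-base-uncased-pf-squad", source="hf")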
+
+ +
+
+adapters.utils.parse_adapter_config_string(config_string: str) List[Tuple[str, dict]]
+

Parses an adapter configuration string into a list of tuples. Each tuple consists of an adapter config identifier and a dictionary.

+
+ +
+
+adapters.utils.prefix_attention_mask(attention_mask, dim: int = 3, prefix_value: int = 0)
+

Adds a prefix to an attention mask. The length of the prefix is determined by the prefix_attention_mask_length +attribute in the ForwardContext.

+
+
Parameters
+
    +
  • attention_mask – The attention mask to add the prefix to.

  • +
  • dim (int) – The dimension along which to concatenate the prefix_attention_mask. Defaults to 3.

  • +
  • prefix_value (int) – The value to use for the prefix_attention_mask. Defaults to 0; however, some models, e.g. DistilBert, use different values. BERT-like models invert their extended_attention_mask, hence they use 0 as the value for non-masked tokens. This inversion is usually done in the forward method of the model in two different ways: (1) by calling self.invert_attention_mask, as BERT does, or (2) by doing the inversion manually, as ALBERT does: extended_attention_mask = (1.0 - extended_attention_mask) * torch.finfo(self.dtype).min

  • +
+
+
+
+ +
+
+adapters.utils.pull_from_hub(specifier: str, model_name: str, adapter_config: Optional[Union[dict, str]] = None, version: Optional[str] = None, strict: bool = False, redirect_to_hf_hub: bool = False, **kwargs) str
+

Downloads a pre-trained adapter module from Adapter-Hub.

+
+
Parameters
+
    +
  • specifier (str) – A string specifying the adapter to be loaded.

  • +
  • model_name (str) – The identifier of the pre-trained model for which to load an adapter.

  • +
  • adapter_config (Union[dict, str], optional) – The configuration of the adapter to be loaded.

  • +
  • version (str, optional) – The version of the adapter to be loaded. Defaults to None.

  • +
  • strict (bool, optional) – If set to True, only allow adapters exactly matching the given config to be loaded. Defaults to False.

  • +
  • redirect_to_hf_hub (bool, optional) – If set to True, the function will redirect to the HuggingFace Model Hub instead of AdapterHub. +Defaults to False.

  • +
+
+
Returns
+

The local path to which the adapter has been downloaded.

+
+
Return type
+

str

+
+
+
+ +
+
+adapters.utils.resolve_adapter_config(config: Union[dict, str], local_map=None, try_loading_from_hub=True, **kwargs) dict
+

Resolves a given adapter configuration specifier to a full configuration dictionary.

+
+
Parameters
+

config (Union[dict, str]) –

The configuration to resolve. Can be either:

+
    +
  • a dictionary: returned without further action

  • +
  • an identifier string available in local_map

  • +
  • the path to a file containing a full adapter configuration

  • +
  • an identifier string available in Adapter-Hub

  • +
+

+
+
Returns
+

The resolved adapter configuration dictionary.

+
+
Return type
+

dict

+
+
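A minimal sketch ("seq_bn" is assumed to be a resolvable identifier, matching the adapter_config default used in AdapterArguments above):

from adapters.utils import resolve_adapter_config

full_config = resolve_adapter_config("seq_bn")  # expands the identifier to a full dict
print(full_config["reduction_factor"])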
+
+ +
+
+adapters.utils.resolve_adapter_path(adapter_name_or_path, model_name: Optional[str] = None, adapter_config: Optional[Union[dict, str]] = None, version: Optional[str] = None, source: Optional[str] = None, redirect_to_hf_hub: bool = False, **kwargs) str
+

Resolves the path to a pre-trained adapter module. Note: If attempting to resolve an adapter from the Hub, +adapter_config and model_name must be present.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    Can be either:

    +
      +
    • the path to a folder in the file system containing the adapter configuration and weights

    • +
    • an url pointing to a zip folder containing the adapter configuration and weights

    • +
    • a specifier matching a pre-trained adapter uploaded to Adapter-Hub

    • +
    +

  • +
  • model_name (str, optional) – The identifier of the pre-trained model for which to load an adapter.

  • +
  • adapter_config (Union[dict, str], optional) – The configuration of the adapter to be loaded.

  • +
  • version (str, optional) – The version of the adapter to be loaded. Defaults to None.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to get adapters. Can be either:

    +
      +
    • ”ah”: search on AdapterHub.ml. Note: this source is deprecated in favor of “hf”.

    • +
    • ”hf”: search on HuggingFace model hub (huggingface.co).

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • redirect_to_hf_hub (bool, optional) – If set to True, the function will redirect to the HuggingFace Model Hub instead of AdapterHub. +Defaults to False.

  • +
+
+
Returns
+

The local path from where the adapter module can be loaded.

+
+
Return type
+

str

+
+
+
+ +
\ No newline at end of file
diff --git a/classes/model_adapters_config.html b/classes/model_adapters_config.html
new file mode 100644
index 0000000000..3eca432032
--- /dev/null
+++ b/classes/model_adapters_config.html
@@ -0,0 +1,380 @@
+

Model Adapters Config

+

This class manages the setup and configuration of adapter modules in a pre-trained model.

+
+
+class adapters.ModelAdaptersConfig(**kwargs)
+

This class manages the setup and configuration of adapter modules in a pre-trained model.

+
+
+add(adapter_name: str, config: Optional[Union[dict, str]] = None)
+

Adds a new adapter with the given name to the model config.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter.

  • +
  • config (Optional[Union[str, dict]], optional) – The adapter config. Defaults to None.

  • +
+
+
+
+ +
+
+add_fusion(fusion_name: Union[str, List[str]], config: Optional[Union[dict, str]] = None)
+

Adds a new AdapterFusion.

+
+
Parameters
+
    +
  • fusion_name (Union[str, List[str]]) – The name of the AdapterFusion or the adapters to fuse.

  • +
  • config (Optional[Union[str, dict]], optional) – AdapterFusion config. Defaults to None.

  • +
+
+
+
+ +
+
+common_config_value(adapter_names: list, attribute: str)
+

Checks whether all adapters in a list share the same config setting for a given attribute and returns the +shared value.

+
+
Parameters
+
    +
  • adapter_names (list) – The adapters to check.

  • +
  • attribute (str) – The config attribute to check.

  • +
+
+
+
+ +
+
+get(adapter_name: str) Optional[dict]
+

Gets the config dictionary for a given adapter.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
Returns
+

The adapter configuration.

+
+
Return type
+

Mapping

+
+
+
+ +
+
+get_fusion(fusion_name: Union[str, List[str]]) Optional[dict]
+

Gets the config dictionary for a given AdapterFusion.

+
+
Parameters
+

fusion_name (Union[str, List[str]]) – The name of the AdapterFusion or the adapters to fuse.

+
+
Returns
+

The AdapterFusion configuration.

+
+
Return type
+

Optional[dict]

+
+
+
+ +
+
+match(adapter_name: str, config_type: type, layer_idx: Optional[int] = None, location_key: Optional[str] = None) Optional[dict]
+

Tries to match the given criteria to an existing adapter. Return the adapter config if a match is found, +otherwise None.

+
+ +
+ +
\ No newline at end of file
diff --git a/classes/model_mixins.html b/classes/model_mixins.html
new file mode 100644
index 0000000000..ad13d37f56
--- /dev/null
+++ b/classes/model_mixins.html
@@ -0,0 +1,1393 @@
+

Model Mixins

+

These classes provide the basis of adapter module integration into model classes such as adapter saving and loading. +Depending on the model, one of these mixins should be implemented by every adapter-supporting model class.

+
+

InvertibleAdaptersMixin

+
+
+class adapters.InvertibleAdaptersMixin
+

Mixin for Transformer models adding invertible adapters.

+
+
+add_invertible_adapter(adapter_name: str) bool
+

Adds an invertible adapter module for the adapter with the given name. If the given adapter does not specify an +invertible adapter config, this method does nothing.

+
+
Parameters
+

adapter_name (str) – The name of the adapter for which to add an invertible adapter module.

+
+
+
+ +
+ +
+
+

EmbeddingAdaptersMixin

+
+
+class adapters.EmbeddingAdaptersMixin
+

Mixin for Transformer models adding support for dynamically switching embeddings.

+
+
+add_embeddings(name, tokenizer, reference_embedding=None, reference_tokenizer=None, embedding_dim=None)
+

Add a new embedding to the model. If a reference embedding and reference tokenizer are provided, tokens present in both tokenizers are initialized from the reference embedding.

+
+
Parameters
+
    +
  • name – the name of the embedding

  • +
  • tokenizer – the tokenizer determining the vocab of the embedding

  • +
  • reference_embedding – the reference embedding to use for initializing the embeddings of tokens present in the newly created +embedding

  • +
  • reference_tokenizer – the tokenizer providing the vocab for the reference embedding

  • +
  • embedding_dim – the dimension of the embeddings (if None the embedding_size, or if this doesn’t exist the hidden_size, +from the config is used)

  • +
+
+
+
+ +
+
+delete_embeddings(name)
+

Deletes the embedding with the given name

+
+
Parameters
+

name – The name of the embedding that should be deleted

+
+
+
+ +
+
+load_embeddings(path: str, name: str)
+

Load a saved embedding from the given path. If the embedding was saved together with a tokenizer, the tokenizer is returned as well.

+
+
Parameters
+
    +
  • path – the path to the saved embedding

  • +
  • name – the name the embedding should be loaded as

  • +
+
+
+

Returns: a tokenizer if one was saved with the embedding, otherwise None

+
+ +
+
+save_embeddings(path, name, tokenizer=None)
+

Saves the embedding with the given name. If a tokenizer is passed as well the tokenizer is saved together with +the embedding.

+
+
Parameters
+
    +
  • path – The path where the embedding should be saved

  • +
  • name – The name of the embedding that should be saved

  • +
  • tokenizer – optionally a tokenizer to save with the embedding (default is None)

  • +
+
+
+
+ +
+
+set_active_embeddings(name)
+

Sets the active embedding for the forward pass of the model

+
+
Parameters
+

name – The name of the embedding that should be used

+
+
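A sketch of the embedding workflow (the custom tokenizer identifier is hypothetical, and "default" is assumed to be the name of the model's original embedding):

from transformers import AutoTokenizer

ref_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
new_tokenizer = AutoTokenizer.from_pretrained("my/custom-tokenizer")  # hypothetical

# tokens shared between the two vocabularies are initialized from the "default" embedding
model.add_embeddings("custom", new_tokenizer,
                     reference_embedding="default", reference_tokenizer=ref_tokenizer)
model.set_active_embeddings("custom")

model.save_embeddings("./embeddings/custom", "custom", tokenizer=new_tokenizer)
loaded_tokenizer = model.load_embeddings("./embeddings/custom", "custom")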
+
+ +
+ +
+
+

ModelAdaptersMixin

+
+
+class adapters.ModelAdaptersMixin(config, *args, **kwargs)
+

Mixin for transformer models adding support for loading/ saving adapters.

+
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict or AdapterConfig, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – The names of the existing adapters whose weights should be averaged. Optional averaging weights can be passed via the weights argument; if omitted, the adapters are weighted equally. With normalize_weights=True (the default), the given weights are normalized.

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
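A sketch of averaging two trained adapters into a new one (adapter names are placeholders):

model.average_adapter(
    "avg_adapter",
    ["adapter_a", "adapter_b"],
    weights=[0.7, 0.3],  # normalized by default (normalize_weights=True)
    set_active=True,
)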
+
+ +
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name) dict
+

Returns a dictionary with all weights of the adapter with the specified name.

+
+
Parameters
+

name (str) – The adapter name.

+
+
Returns
+

A nested dictionary containing the weights of the adapter. The dictionary is structured as follows: {<layer id>: {<module location>: <nn.Module>}}. <layer id> = -1 indicates global/ shared weights.

+
+
Return type
+

dict

+
+
+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+abstract iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+

This abstract method has to be implemented by every implementing model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.saved_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, this will be either the default adapter config for the requested adapter (if one is specified) or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if a safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
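A sketch of the save/load round trip (paths and names are placeholders):

# save an adapter together with its configuration ...
model.save_adapter("./checkpoints/bottleneck", "bottleneck")

# ... and load it back later, e.g. under a different name
name = model.load_adapter("./checkpoints/bottleneck", load_as="reloaded", set_active=True)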
+
+ +
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if a safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
+
+save_adapter(save_directory: str, adapter_name: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
save_all_adapters(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)

Saves all adapters of this model together with their configuration to subfolders of the given location.

Parameters

  • save_directory (str) – Path to a directory where the adapters should be saved.

  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.
set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)

Sets the adapter modules to be used by default in every forward pass. If no adapter with the given name is found, no module of the respective type will be activated.

Parameters

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.
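Both plain names and composition blocks are accepted. A sketch with illustrative adapter names:

    from adapters import AutoAdapterModel
    from adapters.composition import Stack

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")
    model.add_adapter("lang")
    model.add_adapter("task")

    # Activate a single adapter ...
    model.set_active_adapters("task")
    # ... or a composition block, e.g. stacking "task" on top of "lang".
    model.set_active_adapters(Stack("lang", "task"))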
train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)

Sets the model into training mode for the given adapters.


train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)

Sets the model into training mode for the AdapterFusion layer determined by a list of adapter names.


train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)

Sets the model into training mode for the AdapterFusion layer determined by a list of adapter names.
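A sketch of the two training modes, with illustrative adapter names:

    from adapters import AutoAdapterModel
    from adapters.composition import Fuse

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")
    model.add_adapter("task")

    # Freeze all base-model weights and unfreeze only the "task" adapter.
    model.train_adapter("task")

    # For fusion training instead: keep the single adapters frozen and
    # train only the fusion layer that combines them.
    model.add_adapter("a")
    model.add_adapter("b")
    model.add_adapter_fusion(Fuse("a", "b"))
    model.train_adapter_fusion(Fuse("a", "b"))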

ModelWithHeadsAdaptersMixin

class adapters.ModelWithHeadsAdaptersMixin(config, *args, **kwargs)

Mixin adding support for loading/saving adapters to transformer models with head(s).
add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)

Adds a new adapter module of the specified type to the model.

Parameters

  • adapter_name (str) – The name of the adapter module to be added.

  • config (str or dict, optional) – The adapter configuration, can be either:

    • the string identifier of a pre-defined configuration dictionary

    • a configuration dictionary specifying the full config

    • if not given, the default configuration for this adapter type will be used

  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.
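A sketch of the two ways of passing a configuration; names and values are illustrative:

    from adapters import AutoAdapterModel, SeqBnConfig

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")

    # By string identifier of a pre-defined configuration ...
    model.add_adapter("bottleneck", config="seq_bn")
    # ... or with an explicit configuration object.
    model.add_adapter("bottleneck_small", config=SeqBnConfig(reduction_factor=32), set_active=True)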
delete_adapter(adapter_name: str)

Deletes the adapter with the specified name from the model.

Parameters

adapter_name (str) – The name of the adapter.


get_adapter(name)

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.


init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)

This method initializes adapter modules and fusion modules from the model config.


iter_layers() → Iterable[Tuple[int, Module]]

Iterates over all layers of the model.
load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) → str

Loads a pre-trained PyTorch adapter module from the local file system or a remote location.

Parameters

  • adapter_name_or_path (str) – can be either:

    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • a path to a directory containing adapter weights saved using model.save_adapter()

    • a URL pointing to a zip folder containing a saved adapter module

  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, will be either the default adapter config for the requested adapter if specified, or the global default adapter config.

  • version (str, optional) – The version of the adapter to be loaded.

  • model_name (str, optional) – The string identifier of the pre-trained model.

  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was saved will be used.

  • source (str, optional) – Identifier of the source(s) from where to load the adapter. Can be:

    • "ah": search on the AdapterHub Hub repo. Note: the Hub repo has been archived and all adapters have been moved to the HuggingFace Model Hub. Loading from this source is deprecated.

    • "hf": search on the HuggingFace Model Hub.

    • None (default): search on all sources.

  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not activated.

  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if a safetensors checkpoint is available. Otherwise, the regular torch save method is used.

Returns

The name with which the adapter was added to the model.

Return type

str
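A sketch of loading from the HuggingFace Model Hub; the repository id below is illustrative, substitute any adapter repository that matches the base model:

    from adapters import AutoAdapterModel

    model = AutoAdapterModel.from_pretrained("roberta-base")
    name = model.load_adapter("AdapterHub/roberta-base-pf-imdb", source="hf", set_active=True)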
load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) → str

Loads a pre-trained AdapterFusion layer from the local file system.

Parameters

  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • load_as (str, optional) – Load the AdapterFusion using this name. By default, the name with which the AdapterFusion layer was saved will be used.

  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if a safetensors checkpoint is available. Otherwise, the regular torch save method is used.

Returns

The name with which the AdapterFusion was added to the model.

Return type

str
load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) → str

Loads a model prediction head from a directory where it was saved using save_head().

Parameters

  • save_directory (str) – Path to the directory where the prediction head is saved.

  • load_as (str, optional) – Load the head using this name. By default, the name with which the head was saved will be used.

  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if a safetensors checkpoint is available. Otherwise, the regular torch save method is used.

Returns

The name with which the prediction head was added to the model.

Return type

str
save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using load_adapter().

Parameters

  • save_directory (str) – Path to a directory where the adapter should be saved.

  • adapter_name (str) – Name of the adapter to be saved.

  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

Raises

ValueError – If the given adapter name is invalid.
save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded using load_adapter_fusion().

Parameters

  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used as the name of the head to be saved.

  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

Raises

ValueError – If the given AdapterFusion name is invalid.
save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)

Saves all adapters of this model together with their configuration to subfolders of the given location.

Parameters

  • save_directory (str) – Path to a directory where the adapters should be saved.

  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.


save_all_heads(save_directory: str, use_safetensors: bool = False)

Saves all prediction heads of this model to subfolders of the given location.

Parameters

  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.
save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) → None

Saves a model prediction head to a directory such that it can be reloaded using load_head().

Parameters

  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • head_name (str, optional) – Name of the head to save. Set to None if the model only has one head. Defaults to None.

  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.
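A sketch of a head save/reload round trip, with illustrative paths and names:

    from adapters import AutoAdapterModel

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")
    model.add_classification_head("topic", num_labels=4)

    model.save_head("./heads/topic", "topic")

    # Restore the head, optionally under a different name.
    other = AutoAdapterModel.from_pretrained("bert-base-uncased")
    head_name = other.load_head("./heads/topic", load_as="topic_v2")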
train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)

Sets the model into training mode for the given adapters. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.


train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)

Sets the model into training mode for the AdapterFusion layer determined by a list of adapter names. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

ModelWithFlexibleHeadsAdaptersMixin

class adapters.ModelWithFlexibleHeadsAdaptersMixin(*args, **kwargs)

Adds flexible prediction heads to a model class. Implemented by the XModelWithHeads classes.
property active_head: Union[str, List[str]]

The active prediction head configuration of this model. Can be either the name of a single available head (string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded through all specified heads.

Returns

A string or a list of strings describing the active head configuration.

Return type

Union[str, List[str]]
adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)

Moves the adapter with the given name to the specified device and data type.

Parameters

  • name (str) – The name of the adapter to be moved.

  • device (torch.device or str, optional) – The device to which the adapter should be moved.

  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.
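A sketch of moving only the adapter weights, e.g. for mixed-precision inference; the device and dtype are illustrative:

    import torch
    from adapters import AutoAdapterModel

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")
    model.add_adapter("task")

    model.adapter_to("task", device="cuda:0", dtype=torch.float16)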
add_causal_lm_head(head_name, activation_function='gelu', overwrite_ok=False)

Adds a causal language modeling head on top of the model.

Parameters

  • head_name (str) – The name of the head.

  • activation_function (str, optional) – Activation function. Defaults to 'gelu'.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.
add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)

Adds a sequence classification head on top of the model.

Parameters

  • head_name (str) – The name of the head.

  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • layers (int, optional) – Number of layers. Defaults to 2.

  • activation_function (str, optional) – Activation function. Defaults to 'tanh'.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.
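A sketch of adding a binary classification head; the head name and label mapping are illustrative:

    from adapters import AutoAdapterModel

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")
    model.add_classification_head(
        "sentiment",
        num_labels=2,
        id2label={0: "negative", 1: "positive"},
    )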
add_dependency_parsing_head(head_name, num_labels=2, overwrite_ok=False, id2label=None)

Adds a biaffine dependency parsing head on top of the model. The parsing head uses the architecture described in "Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation" (Glavaš & Vulić, 2021) (https://arxiv.org/pdf/2008.06788.pdf).

Parameters

  • head_name (str) – The name of the head.

  • num_labels (int, optional) – Number of labels. Defaults to 2.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • id2label (dict, optional) – Mapping from label ids to labels. Defaults to None.
add_image_classification_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)

Adds an image classification head on top of the model.

Parameters

  • head_name (str) – The name of the head.

  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • layers (int, optional) – Number of layers. Defaults to 1.

  • activation_function (str, optional) – Activation function. Defaults to 'tanh'.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.


add_masked_lm_head(head_name, activation_function='gelu', overwrite_ok=False)

Adds a masked language modeling head on top of the model.

Parameters

  • head_name (str) – The name of the head.

  • activation_function (str, optional) – Activation function. Defaults to 'gelu'.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.


add_multiple_choice_head(head_name, num_choices=2, layers=2, activation_function='tanh', overwrite_ok=False, id2label=None, use_pooler=False)

Adds a multiple choice head on top of the model.

Parameters

  • head_name (str) – The name of the head.

  • num_choices (int, optional) – Number of choices. Defaults to 2.

  • layers (int, optional) – Number of layers. Defaults to 2.

  • activation_function (str, optional) – Activation function. Defaults to 'tanh'.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.


add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)

Adds a question answering head on top of the model.

Parameters

  • head_name (str) – The name of the head.

  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • layers (int, optional) – Number of layers. Defaults to 1.

  • activation_function (str, optional) – Activation function. Defaults to 'tanh'.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.
add_seq2seq_lm_head(head_name, overwrite_ok=False)

Adds a sequence-to-sequence language modeling head on top of the model.

Parameters

  • head_name (str) – The name of the head.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.


add_tagging_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)

Adds a token classification head on top of the model.

Parameters

  • head_name (str) – The name of the head.

  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • layers (int, optional) – Number of layers. Defaults to 1.

  • activation_function (str, optional) – Activation function. Defaults to 'tanh'.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.
delete_head(head_name: str)

Deletes the prediction head with the specified name from the model.

Parameters

head_name (str) – The name of the prediction head to delete.
forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)

The forward pass through a prediction head configuration. There are three ways to specify the used prediction head configuration (in order of priority):

  1. If a head_name is passed, the head with the given name is used.

  2. If the forward call is executed within an AdapterSetup context, the head configuration is read from the context.

  3. If the active_head property is set, the head configuration is read from there.

Parameters

  • all_outputs (dict) – The outputs of the base model.

  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. Setting to True requires eos_mask to be passed as well.

  • **kwargs – Additional keyword arguments passed to the forward pass of the head.
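A sketch of the head-selection priority in practice; model, tokenizer and head names are illustrative:

    from adapters import AutoAdapterModel
    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoAdapterModel.from_pretrained("bert-base-uncased")
    model.add_classification_head("sentiment", num_labels=2)
    model.add_tagging_head("ner", num_labels=9)

    batch = tok("Adapters are neat.", return_tensors="pt")
    # Select the head per call (highest priority) ...
    out = model(**batch, head="ner")
    # ... or set a default that is used when no head is passed.
    model.active_head = "sentiment"
    out = model(**batch)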
get_labels(head_name=None)

Returns the labels the given head is assigning/predicting.

Parameters

head_name – (str, optional) the name of the head whose labels should be returned. Default is None. If the name is None, the labels of the active head are returned.

Returns: labels
get_labels_dict(head_name=None)

Returns the id2label dict for the given head.

Parameters

head_name – (str, optional) the name of the head whose labels should be returned. Default is None. If the name is None, the labels of the active head are returned.

Returns: id2label
head_type()

Checks which head type the decorated function belongs to and raises an error if the model does not support the head type.
set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

Parameters

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.
tie_weights()

Tie the weights between the input embeddings and the output embeddings.

If the torchscript flag is set in the configuration, parameter sharing can't be handled, so we clone the weights instead.

PushAdapterToHubMixin

class adapters.hub_mixin.PushAdapterToHubMixin

Mixin providing support for uploading adapters to HuggingFace's Model Hub.
push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)

Uploads an adapter to HuggingFace's Model Hub.

Parameters

  • repo_name (str) – The name of the repository on the model hub to upload to.

  • adapter_name (str) – The name of the adapter to be uploaded.

  • organization (str, optional) – Organization in which to push the adapter (you must be a member of this organization). Defaults to None.

  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to None.

  • local_path (str, optional) – Local path used as clone directory of the adapter repository. If not specified, will create a temporary directory. Defaults to None.

  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or "add model" depending on the type of the class.

  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url is not specified.

  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. If set to False, will only generate an adapter card if none exists. Defaults to False.

  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • revision (str, optional) – Branch to push the uploaded files to.

  • commit_description (str, optional) – The description of the commit that will be created.

Returns

The url of the adapter repository on the model hub.

Return type

str
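A minimal sketch; the repository name, adapter name and dataset tag are illustrative (a datasets_tag or adapterhub_tag is required whenever a new adapter card is generated):

    from adapters import AutoAdapterModel

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")
    model.add_adapter("imdb_sentiment")
    # ... train the adapter ...

    model.push_adapter_to_hub(
        "my-bert-adapter-imdb",
        "imdb_sentiment",
        datasets_tag="imdb",
    )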
diff --git a/classes/models/albert.html b/classes/models/albert.html

ALBERT

Note

Adapter implementation notes for ALBERT:

  • As layers are shared between groups, adapters added to a layer are also shared between groups. Therefore, changing the adapter configuration for a layer affects the behavior of all groups that use this layer.

  • As usual, the leave_out parameter can be used to specify the layers in which adapters should be added. The layer IDs are counted by putting all layers of the groups into a sequence depending on the group number and their position in the group. I.e., for an ALBERT model with inner_group_num=2, the first layer of the first group has ID 0, the second layer of the first group has ID 1, the first layer of the second group has ID 2, etc. A usage sketch follows this note.
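A minimal sketch of dropping adapters from the first shared layer positions; the model and configuration values are illustrative, and leave_out is assumed to be passed as part of the adapter configuration:

    from adapters import AutoAdapterModel, SeqBnConfig

    model = AutoAdapterModel.from_pretrained("albert-base-v2")

    # albert-base-v2 uses inner_group_num=1, so layer IDs simply run 0..11.
    # Skip adapter modules in the first two layer positions:
    model.add_adapter("task", config=SeqBnConfig(leave_out=[0, 1]))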

The ALBERT model was proposed in ALBERT: A Lite BERT for Self-supervised Learning of Language Representations by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut. It presents two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT:

  • Splitting the embedding matrix into two smaller matrices.

  • Using repeating layers split among groups.

AlbertAdapterModel

class adapters.AlbertAdapterModel(config)

Albert Model transformer with the option to add multiple flexible heads on top.

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads etc.)

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

Parameters

config ([AlbertConfig]) – Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.
property active_adapters: AdapterCompositionBlock

If you are not familiar with adapters and PEFT methods, we invite you to read more about them in the official PEFT documentation: https://huggingface.co/docs/peft

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters for inference), returns the list of all active adapters so that users can deal with them accordingly.

For previous PEFT versions (that do not support multi-adapter inference), module.active_adapter will return a single string.
property active_head: Union[str, List[str]]

The active prediction head configuration of this model. Can be either the name of a single available head (string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded through all specified heads.

Returns

A string or a list of strings describing the active head configuration.

Return type

Union[str, List[str]]
adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)

Moves the adapter fusion layer with the given name to the specified device and data type.

Parameters

  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • device (torch.device or str, optional) – The device to which the adapter fusion layer should be moved.

  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.
adapter_summary(as_dict=False) → Union[str, dict]

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the following attributes:

  • name: the name of the adapter

  • architecture: the architectural base of the adapter

  • #param: the number of parameters of the adapter

  • %param: the number of parameters of the adapter relative to the full model

  • active: whether the adapter is active

  • train: whether the adapter weights are enabled for training
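A quick way to inspect the adapter setup, shown as a sketch:

    from adapters import AutoAdapterModel

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")
    model.add_adapter("a")
    model.add_adapter("b", set_active=True)

    print(model.adapter_summary())              # formatted table as a string
    info = model.adapter_summary(as_dict=True)  # same data in machine-readable form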
adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)

Moves the adapter with the given name to the specified device and data type.

Parameters

  • name (str) – The name of the adapter to be moved.

  • device (torch.device or str, optional) – The device to which the adapter should be moved.

  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.
add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)

Adds a new adapter module of the specified type to the model.

Parameters

  • adapter_name (str) – The name of the adapter module to be added.

  • config (str or dict, optional) – The adapter configuration, can be either:

    • the string identifier of a pre-defined configuration dictionary

    • a configuration dictionary specifying the full config

    • if not given, the default configuration for this adapter type will be used

  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.
add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

Parameters

  • adapter_names (Fuse or list or str) – AdapterFusion layer to add. Can be either:

    • a Fuse composition block

    • a list of adapter names to fuse

    • a comma-separated string of adapter names to fuse

  • config (str or dict) – adapter fusion configuration, can be either:

    • a string identifying a pre-defined adapter fusion configuration

    • a dictionary representing the adapter fusion configuration

    • the path to a file containing the adapter fusion configuration

  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is thrown.

  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.
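The three accepted forms of adapter_names, shown as a sketch with illustrative names:

    from adapters import AutoAdapterModel
    from adapters.composition import Fuse

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")
    model.add_adapter("a")
    model.add_adapter("b")

    # Equivalent ways to add a fusion layer over "a" and "b":
    model.add_adapter_fusion(Fuse("a", "b"), set_active=True)
    # model.add_adapter_fusion(["a", "b"])
    # model.add_adapter_fusion("a,b")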
add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)

Adds a sequence classification head on top of the model.

Parameters

  • head_name (str) – The name of the head.

  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • layers (int, optional) – Number of layers. Defaults to 2.

  • activation_function (str, optional) – Activation function. Defaults to 'tanh'.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.


add_masked_lm_head(head_name, activation_function='gelu', overwrite_ok=False)

Adds a masked language modeling head on top of the model.

Parameters

  • head_name (str) – The name of the head.

  • activation_function (str, optional) – Activation function. Defaults to 'gelu'.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.


add_multiple_choice_head(head_name, num_choices=2, layers=2, activation_function='tanh', overwrite_ok=False, id2label=None, use_pooler=False)

Adds a multiple choice head on top of the model.

Parameters

  • head_name (str) – The name of the head.

  • num_choices (int, optional) – Number of choices. Defaults to 2.

  • layers (int, optional) – Number of layers. Defaults to 2.

  • activation_function (str, optional) – Activation function. Defaults to 'tanh'.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.


add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)

Adds a question answering head on top of the model.

Parameters

  • head_name (str) – The name of the head.

  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • layers (int, optional) – Number of layers. Defaults to 1.

  • activation_function (str, optional) – Activation function. Defaults to 'tanh'.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.


add_tagging_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)

Adds a token classification head on top of the model.

Parameters

  • head_name (str) – The name of the head.

  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • layers (int, optional) – Number of layers. Defaults to 1.

  • activation_function (str, optional) – Activation function. Defaults to 'tanh'.

  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.
apply_to_adapter_layers(fn)

Applies a function to all adapter layers of the model.


apply_to_basemodel_childs(fn)

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.
average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)

Adds a new adapter module as the weighted average of a set of existing adapter modules.

Parameters

  • adapter_name (str) – The name of the adapter module to be added.

  • adapter_list (List[str]) – Specifies the existing adapters whose weights should be averaged.

  • weights (List[float], optional) – The weights used for averaging. If not given, all adapters are weighted equally; if normalize_weights is True, the given weights are normalized to sum to 1.

  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.
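A sketch of weighted averaging; names and weights are illustrative:

    from adapters import AutoAdapterModel

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")
    model.add_adapter("a")
    model.add_adapter("b")

    # New adapter "avg" whose weights are 0.7 * a + 0.3 * b.
    model.average_adapter("avg", ["a", "b"], weights=[0.7, 0.3], set_active=True)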
delete_adapter(adapter_name: str)

Deletes the adapter with the specified name from the model.

Parameters

adapter_name (str) – The name of the adapter.


delete_adapter_fusion(adapter_names: Union[Fuse, list, str])

Deletes the AdapterFusion layer of the specified adapters.

Parameters

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.


delete_head(head_name: str)

Deletes the prediction head with the specified name from the model.

Parameters

head_name (str) – The name of the prediction head to delete.


eject_prefix_tuning(name: str)

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

Parameters

name (str) – The name of the prefix tuning.
forward(input_ids=None, attention_mask=None, token_type_ids=None, position_ids=None, head_mask=None, inputs_embeds=None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)

The [AlbertAdapterModel] forward method, overrides the __call__ special method.

Note

Although the recipe for the forward pass needs to be defined within this function, one should call the [Module] instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Parameters

  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) – Indices of input sequence tokens in the vocabulary.

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.__call__] and [PreTrainedTokenizer.encode] for details.

    [What are input IDs?](../glossary#input-ids)

  • attention_mask (torch.FloatTensor of shape (batch_size, sequence_length), optional) – Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    • 1 for tokens that are not masked,

    • 0 for tokens that are masked.

    [What are attention masks?](../glossary#attention-mask)

  • token_type_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) – Segment token indices to indicate first and second portions of the inputs. Indices are selected in [0, 1]:

    • 0 corresponds to a sentence A token,

    • 1 corresponds to a sentence B token.

    [What are token type IDs?](../glossary#token-type-ids)

  • position_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) – Indices of positions of each input sequence token in the position embeddings. Selected in the range [0, config.max_position_embeddings - 1].

    [What are position IDs?](../glossary#position-ids)

  • head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) – Mask to nullify selected heads of the self-attention modules. Mask values selected in [0, 1]:

    • 1 indicates the head is not masked,

    • 0 indicates the head is masked.

  • inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation. This is useful if you want more control over how to convert input_ids indices into associated vectors than the model's internal embedding lookup matrix.

  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned tensors for more detail.

  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for more detail.

  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.
forward_context(context: ForwardContext, *args, **kwargs)

This method is called by the ForwardContext at the beginning of the forward pass.
forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)

The forward pass through a prediction head configuration. There are three ways to specify the used prediction head configuration (in order of priority):

  1. If a head_name is passed, the head with the given name is used.

  2. If the forward call is executed within an AdapterSetup context, the head configuration is read from the context.

  3. If the active_head property is set, the head configuration is read from there.

Parameters

  • all_outputs (dict) – The outputs of the base model.

  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. Setting to True requires eos_mask to be passed as well.

  • **kwargs – Additional keyword arguments passed to the forward pass of the head.
freeze_model(freeze=True)

Freezes all weights of the model.


get_adapter(name)

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.


get_labels(head_name=None)

Returns the labels the given head is assigning/predicting.

Parameters

head_name – (str, optional) the name of the head whose labels should be returned. Default is None. If the name is None, the labels of the active head are returned.

Returns: labels


get_labels_dict(head_name=None)

Returns the id2label dict for the given head.

Parameters

head_name – (str, optional) the name of the head whose labels should be returned. Default is None. If the name is None, the labels of the active head are returned.

Returns: id2label
get_output_embeddings() → Union[Module, List[Module]]

Returns the model's output embeddings.

Returns

A torch module mapping hidden states to vocabulary.

Return type

nn.Module


head_type()

Checks which head type the decorated function belongs to and raises an error if the model does not support the head type.


init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)

This method initializes adapter modules and fusion modules from the model config.


iter_layers() → Iterable[Tuple[int, Module]]

Iterates over all layers of the model.
load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) → str

Loads a pre-trained PyTorch adapter module from the local file system or a remote location.

Parameters

  • adapter_name_or_path (str) – can be either:

    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • a path to a directory containing adapter weights saved using model.save_adapter()

    • a URL pointing to a zip folder containing a saved adapter module

  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, will be either the default adapter config for the requested adapter if specified, or the global default adapter config.

  • version (str, optional) – The version of the adapter to be loaded.

  • model_name (str, optional) – The string identifier of the pre-trained model.

  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was saved will be used.

  • source (str, optional) – Identifier of the source(s) from where to load the adapter. Can be:

    • "ah": search on the AdapterHub Hub repo. Note: the Hub repo has been archived and all adapters have been moved to the HuggingFace Model Hub. Loading from this source is deprecated.

    • "hf": search on the HuggingFace Model Hub.

    • None (default): search on all sources.

  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not activated.

  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if a safetensors checkpoint is available. Otherwise, the regular torch save method is used.

Returns

The name with which the adapter was added to the model.

Return type

str
load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) → str

Loads a pre-trained AdapterFusion layer from the local file system.

Parameters

  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • load_as (str, optional) – Load the AdapterFusion using this name. By default, the name with which the AdapterFusion layer was saved will be used.

  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if a safetensors checkpoint is available. Otherwise, the regular torch save method is used.

Returns

The name with which the AdapterFusion was added to the model.

Return type

str
load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) → str

Loads a model prediction head from a directory where it was saved using save_head().

Parameters

  • save_directory (str) – Path to the directory where the prediction head is saved.

  • load_as (str, optional) – Load the head using this name. By default, the name with which the head was saved will be used.

  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if a safetensors checkpoint is available. Otherwise, the regular torch save method is used.

Returns

The name with which the prediction head was added to the model.

Return type

str
merge_adapter(name: str)

Merges the weights of the given LoRA module with the Transformer weights as described in the LoRA paper.

Parameters

name (str) – LoRA module to merge.
push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)

Uploads an adapter to HuggingFace's Model Hub.

Parameters

  • repo_name (str) – The name of the repository on the model hub to upload to.

  • adapter_name (str) – The name of the adapter to be uploaded.

  • organization (str, optional) – Organization in which to push the adapter (you must be a member of this organization). Defaults to None.

  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to None.

  • local_path (str, optional) – Local path used as clone directory of the adapter repository. If not specified, will create a temporary directory. Defaults to None.

  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or "add model" depending on the type of the class.

  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url is not specified.

  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. If set to False, will only generate an adapter card if none exists. Defaults to False.

  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • revision (str, optional) – Branch to push the uploaded files to.

  • commit_description (str, optional) – The description of the commit that will be created.

Returns

The url of the adapter repository on the model hub.

Return type

str
reset_adapter()

Resets the weights of a LoRA module merged using model.merge_adapter(name).
save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using load_adapter().

Parameters

  • save_directory (str) – Path to a directory where the adapter should be saved.

  • adapter_name (str) – Name of the adapter to be saved.

  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

Raises

ValueError – If the given adapter name is invalid.


save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded using load_adapter_fusion().

Parameters

  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used as the name of the head to be saved.

  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

Raises

ValueError – If the given AdapterFusion name is invalid.
save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given location.

Parameters

  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.


save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)

Saves all adapters of this model together with their configuration to subfolders of the given location.

Parameters

  • save_directory (str) – Path to a directory where the adapters should be saved.

  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.


save_all_heads(save_directory: str, use_safetensors: bool = False)

Saves all prediction heads of this model to subfolders of the given location.

Parameters

  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.
save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) → None

Saves a model prediction head to a directory such that it can be reloaded using load_head().

Parameters

  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • head_name (str, optional) – Name of the head to save. Set to None if the model only has one head. Defaults to None.

  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.
save_pretrained(save_directory: Union[str, PathLike], **kwargs)

Save a model and its configuration file to a directory, so that it can be re-loaded using the [~PreTrainedModel.from_pretrained] class method.

Parameters

  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful during distributed training (e.g., on TPUs) when this function needs to be called on all processes; in that case, set is_main_process=True only on the main process to avoid race conditions.

  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only save parts of the model or if special precautions need to be taken when recovering the state dictionary of a model (like when using model parallelism).

  • save_function (Callable) – The function to use to save the state dictionary. Useful during distributed training (e.g., on TPUs) when one needs to replace torch.save with another method.

  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the repository you want to push to with repo_id (will default to the name of save_directory in your namespace).

  • max_shard_size (int or str, optional, defaults to “5GB”) – The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). We default to 5GB so that models can easily run on free-tier Google Colab instances without CPU OOM issues. Note: if a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard, which will be bigger than max_shard_size.

  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (which uses pickle).

  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all keys of the adapter state dict need to be prepended with base_model.model. Advanced users can disable this behavior by setting save_peft_format to False.

  • kwargs (Dict[str, Any], optional) – Additional keyword arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.
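For example, assuming model is an adapter model instance (the path is a placeholder):

```python
# Save model, adapters, and heads with safetensors weights;
# reload later with from_pretrained().
model.save_pretrained("./checkpoints/my-model", safe_serialization=True)
```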
set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
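An illustrative composition sketch, assuming two adapters were added beforehand (the names are hypothetical):

```python
from adapters.composition import Stack

model.add_adapter("task_a")
model.add_adapter("task_b")
# Use both adapters, stacked in order, in every forward pass.
model.set_active_adapters(Stack("task_a", "task_b"))
```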
+
tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, parameter sharing can’t be handled, so we clone the weights instead.

+
+ +
+
train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)

Sets the model into training mode for the given adapters. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
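A minimal training-setup sketch (the adapter name is hypothetical; seq_bn is a pre-defined bottleneck config identifier in adapters):

```python
model.add_adapter("task_a", config="seq_bn")
# Freezes all model weights except those of "task_a" and activates it.
model.train_adapter("task_a")
```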
+
train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)

Sets the model into training mode for AdapterFusion, determined by a list of adapter names. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)

Sets the model into training mode for AdapterFusion, determined by a list of adapter names.

+
diff --git a/classes/models/auto.html b/classes/models/auto.html new file mode 100644 index 0000000000..b4147bb469 --- /dev/null +++ b/classes/models/auto.html @@ -0,0 +1,493 @@
[HTML page scaffolding, navigation, and version-switcher chrome omitted: Auto Classes — AdapterHub documentation]

Auto Classes

+

Similar to the AutoModel classes built into Hugging Face Transformers, adapters provides an AutoAdapterModel class. As with other auto classes, the correct adapter model class is automatically instantiated based on the pre-trained model passed to the from_pretrained() method.

+
+

Note

+

If the model loaded with the from_pretrained(...) function has a head, this head gets loaded as well. However, this only works for non-sharded models. If you want to load a sharded model with a head, you first need to load the model and then the head separately.

+
+
+

AutoAdapterModel

+
+
class adapters.AutoAdapterModel(*args, **kwargs)

This is a generic model class that will be instantiated as one of the model classes of the library (with adapter support and flexible heads) when created with the [~AutoAdapterModel.from_pretrained] class method or the [~AutoAdapterModel.from_config] class method.

+

This class cannot be instantiated directly using __init__() (throws an error).

+
+
classmethod from_config(**kwargs)

Instantiates one of the model classes of the library (with adapter support and flexible heads) from a configuration.

+
+

Note

+

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use [~AutoAdapterModel.from_pretrained] to load the model weights.

+
+
+
Parameters
+
    +
  • config ([PretrainedConfig]) –

    The model class to instantiate is selected based on the configuration class:

    +
      +
    • [AlbertConfig] configuration class: [AlbertAdapterModel] (ALBERT model)

    • +
    • [BartConfig] configuration class: [BartAdapterModel] (BART model)

    • +
    • [BeitConfig] configuration class: [BeitAdapterModel] (BEiT model)

    • +
    • [BertConfig] configuration class: [BertAdapterModel] (BERT model)

    • +
    • [BertGenerationConfig] configuration class: [BertGenerationAdapterModel] (Bert Generation model)

    • +
    • [CLIPConfig] configuration class: [CLIPAdapterModel] (CLIP model)

    • +
    • [DebertaConfig] configuration class: [DebertaAdapterModel] (DeBERTa model)

    • +
    • [DebertaV2Config] configuration class: [DebertaV2AdapterModel] (DeBERTa-v2 model)

    • +
    • [DistilBertConfig] configuration class: [DistilBertAdapterModel] (DistilBERT model)

    • +
    • [ElectraConfig] configuration class: [ElectraAdapterModel] (ELECTRA model)

    • +
    • [GPT2Config] configuration class: [GPT2AdapterModel] (OpenAI GPT-2 model)

    • +
    • [GPTJConfig] configuration class: [GPTJAdapterModel] (GPT-J model)

    • +
    • [LlamaConfig] configuration class: [LlamaAdapterModel] (LLaMA model)

    • +
    • [MBartConfig] configuration class: [MBartAdapterModel] (mBART model)

    • +
    • [MT5Config] configuration class: [MT5AdapterModel] (MT5 model)

    • +
    • [RobertaConfig] configuration class: [RobertaAdapterModel] (RoBERTa model)

    • +
    • [T5Config] configuration class: [T5AdapterModel] (T5 model)

    • +
    • [ViTConfig] configuration class: [ViTAdapterModel] (ViT model)

    • +
    • [XLMRobertaConfig] configuration class: [XLMRobertaAdapterModel] (XLM-RoBERTa model)

    • +
    • [XmodConfig] configuration class: [XmodAdapterModel] (X-MOD model)

    • +
    +

  • +
  • attn_implementation (str, optional) – The attention implementation to use in the model (if relevant). Can be any of “eager” (manual implementation of the attention), “sdpa” (using [F.scaled_dot_product_attention](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), or “flash_attention_2” (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual “eager” implementation.

  • +
+
+
+

Examples:

+

```python
>>> from transformers import AutoConfig
>>> from adapters import AutoAdapterModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoAdapterModel.from_config(config)
```
+
+
+
+ +
+
classmethod from_pretrained(*model_args, **kwargs)

Instantiate one of the model classes of the library (with adapter support and flexible heads) from a pretrained model.

+

The model class to instantiate is selected based on the model_type property of the config object (either +passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by +falling back to using pattern matching on pretrained_model_name_or_path:

+
+
    +
  • albert – [AlbertAdapterModel] (ALBERT model)

  • +
  • bart – [BartAdapterModel] (BART model)

  • +
  • beit – [BeitAdapterModel] (BEiT model)

  • +
  • bert – [BertAdapterModel] (BERT model)

  • +
  • bert-generation – [BertGenerationAdapterModel] (Bert Generation model)

  • +
  • clip – [CLIPAdapterModel] (CLIP model)

  • +
  • deberta – [DebertaAdapterModel] (DeBERTa model)

  • +
  • deberta-v2 – [DebertaV2AdapterModel] (DeBERTa-v2 model)

  • +
  • distilbert – [DistilBertAdapterModel] (DistilBERT model)

  • +
  • electra – [ElectraAdapterModel] (ELECTRA model)

  • +
  • gpt2 – [GPT2AdapterModel] (OpenAI GPT-2 model)

  • +
  • gptj – [GPTJAdapterModel] (GPT-J model)

  • +
  • llama – [LlamaAdapterModel] (LLaMA model)

  • +
  • mbart – [MBartAdapterModel] (mBART model)

  • +
  • mt5 – [MT5AdapterModel] (MT5 model)

  • +
  • roberta – [RobertaAdapterModel] (RoBERTa model)

  • +
  • t5 – [T5AdapterModel] (T5 model)

  • +
  • vit – [ViTAdapterModel] (ViT model)

  • +
  • xlm-roberta – [XLMRobertaAdapterModel] (XLM-RoBERTa model)

  • +
  • xmod – [XmodAdapterModel] (X-MOD model)

  • +
+
+

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

+
+
Parameters
+
    +
  • pretrained_model_name_or_path (str or os.PathLike) –

    Can be either:

    +
    +
      +
    • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.

    • +
    • A path to a directory containing model weights saved using +[~PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.

    • +
    • A path or url to a tensorflow index checkpoint file (e.g, ./tf_model/model.ckpt.index). In +this case, from_tf should be set to True and a configuration object should be provided as +config argument. This loading path is slower than converting the TensorFlow checkpoint in a +PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

    • +
    +
    +

  • +
  • model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.

  • +
  • config ([PretrainedConfig], optional) –

    Configuration for the model to use instead of an automatically loaded configuration. Configuration can +be automatically loaded when:

    +
    +
      +
    • The model is a model provided by the library (loaded with the model id string of a pretrained +model).

    • +
    • The model was saved using [~PreTrainedModel.save_pretrained] and is reloaded by supplying the +save directory.

    • +
    • The model is loaded by supplying a local directory as pretrained_model_name_or_path and a +configuration JSON file named config.json is found in the directory.

    • +
    +
    +

  • +
  • state_dict (Dict[str, torch.Tensor], optional) –

    A state dictionary to use instead of a state dictionary loaded from saved weights file.

    +

    This option can be used if you want to create a model from a pretrained configuration but load your own +weights. In this case though, you should check if using [~PreTrainedModel.save_pretrained] and +[~PreTrainedModel.from_pretrained] is not a simpler option.

    +

  • +
  • cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the +standard cache should not be used.

  • +
  • from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of +pretrained_model_name_or_path argument).

  • +
  • force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the +cached versions if they exist.

  • +
  • resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a +file exists.

  • +
  • proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, +‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.

  • +
  • output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

  • +
  • local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).

  • +
  • revision (str, optional, defaults to “main”) – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a +git-based system for storing models and other artifacts on huggingface.co, so revision can be any +identifier allowed by git.

  • +
  • trust_remote_code (bool, optional, defaults to False) – Whether or not to allow for custom models defined on the Hub in their own modeling files. This option +should only be set to True for repositories you trust and in which you have read the code, as it will +execute code present on the Hub on your local machine.

  • +
  • code_revision (str, optional, defaults to “main”) – The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

  • +
  • kwargs (additional keyword arguments, optional) –

    Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., +output_attentions=True). Behaves differently depending on whether a config is provided or +automatically loaded:

    +
    +
      +
    • If a configuration is provided with config, **kwargs will be directly passed to the +underlying model’s __init__ method (we assume all relevant updates to the configuration have +already been done)

    • +
    • If a configuration is not provided, kwargs will be first passed to the configuration class +initialization function ([~PretrainedConfig.from_pretrained]). Each key of kwargs that +corresponds to a configuration attribute will be used to override said attribute with the +supplied kwargs value. Remaining keys that do not correspond to any configuration attribute +will be passed to the underlying model’s __init__ function.

    • +
    +
    +

  • +
+
+
+

Examples:

+

```python
>>> from transformers import AutoConfig
>>> from adapters import AutoAdapterModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoAdapterModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoAdapterModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoAdapterModel.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
```
+
+
+
diff --git a/classes/models/bart.html b/classes/models/bart.html new file mode 100644 index 0000000000..bc603c5119 --- /dev/null +++ b/classes/models/bart.html @@ -0,0 +1,1119 @@
[HTML page scaffolding, navigation, and version-switcher chrome omitted: BART — AdapterHub documentation]

BART

+

The Bart model was proposed in BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, +Translation, and Comprehension by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan +Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer on 29 Oct, 2019.

+

According to the abstract,

+
    +
  • Bart uses a standard seq2seq/machine translation architecture with a bidirectional encoder (like BERT) and a +left-to-right decoder (like GPT).

  • +
  • The pretraining task involves randomly shuffling the order of the original sentences and a novel in-filling scheme, +where spans of text are replaced with a single mask token.

  • +
  • BART is particularly effective when fine tuned for text generation but also works well for comprehension tasks. It +matches the performance of RoBERTa with comparable training resources on GLUE and SQuAD, achieves new +state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains +of up to 6 ROUGE.

  • +
+
+

BartAdapterModel

+
+
class adapters.BartAdapterModel(config: BartConfig, **kwargs)

BART Model with the option to add multiple flexible prediction heads on top. This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads etc.)

+

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

+
+
Parameters
+

config ([BartConfig]) – Model configuration class with all the parameters of the model. Initializing with a config file does not +load the weights associated with the model, only the configuration. Check out the +[~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
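An illustrative setup sketch (facebook/bart-base is a standard BART checkpoint; the adapter/head name is hypothetical):

```python
from adapters import BartAdapterModel

model = BartAdapterModel.from_pretrained("facebook/bart-base")
model.add_adapter("summarization")
model.add_seq2seq_lm_head("summarization")
model.set_active_adapters("summarization")
```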
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (which do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
adapter_summary(as_dict=False) → Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
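For example:

```python
# Print a table of all adapters with parameter counts and training status.
print(model.adapter_summary())
```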
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
+
add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
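An illustrative sketch of both config styles (the adapter names are hypothetical; seq_bn and LoRAConfig are pre-defined configurations in adapters):

```python
from adapters import LoRAConfig

# Add a bottleneck adapter via a pre-defined config identifier...
model.add_adapter("bottleneck_adapter", config="seq_bn", set_active=True)
# ...or add an adapter with an explicit configuration object.
model.add_adapter("lora_adapter", config=LoRAConfig(r=8, alpha=16))
```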
+
add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
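A fusion sketch, assuming adapters named "a" and "b" were added beforehand:

```python
from adapters.composition import Fuse

# Add a fusion layer over the two adapters, then train only its weights.
model.add_adapter_fusion(Fuse("a", "b"))
model.train_adapter_fusion(Fuse("a", "b"))
```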
+
add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
+ +
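For instance (the head name and label mapping are illustrative):

```python
model.add_classification_head(
    "topic",
    num_labels=3,
    id2label={0: "sports", 1: "politics", 2: "tech"},
)
```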
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_seq2seq_lm_head(head_name, overwrite_ok=False)
+

Adds a sequence-to-sequence language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – Names of the existing adapters whose weights should be averaged.

  • weights (List[float], optional) – Weights corresponding to the adapters in adapter_list. If not provided, all adapters are weighted equally; if normalize_weights is True, the given weights are normalized to sum to 1.

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
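A weight-space averaging sketch (adapter names and weights are illustrative):

```python
# Create "merged" as the weighted average of two existing adapters.
model.average_adapter(
    "merged", ["task_a", "task_b"], weights=[0.7, 0.3], set_active=True
)
```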
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, attention_mask=None, decoder_input_ids=None, decoder_attention_mask=None, head_mask=None, decoder_head_mask=None, cross_attn_head_mask=None, encoder_outputs=None, inputs_embeds=None, decoder_inputs_embeds=None, use_cache=None, output_attentions=None, output_hidden_states=None, return_dict=None, past_key_values=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

The [BartAdapterModel] forward method, overrides the __call__ special method.

+

Note

Although the recipe for the forward pass needs to be defined within this function, one should call the [Module] instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary. Padding will be ignored by default should you provide +it.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

  • +
  • attention_mask (torch.Tensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • decoder_input_ids (torch.LongTensor of shape (batch_size, target_sequence_length), optional) –

    Indices of decoder input sequence tokens in the vocabulary.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are decoder input IDs?](../glossary#decoder-input-ids)

    +

    Bart uses the eos_token_id as the starting token for decoder_input_ids generation. If past_key_values +is used, optionally only the last decoder_input_ids have to be input (see past_key_values).

    +

    For translation and summarization training, decoder_input_ids should be provided. If no +decoder_input_ids is provided, the model will create this tensor by shifting the input_ids to the right +for denoising pre-training following the paper.

    +

  • +
  • decoder_attention_mask (torch.LongTensor of shape (batch_size, target_sequence_length), optional) –

    Default behavior: generate a tensor that ignores pad tokens in decoder_input_ids. Causal mask will also +be used by default.

    +

    If you want to change padding behavior, you should read [modeling_bart._prepare_decoder_attention_mask] +and modify to your needs. See diagram 1 in [the paper](https://arxiv.org/abs/1910.13461) for more +information on the default strategy.

    +

  • +
  • head_mask (torch.Tensor of shape (encoder_layers, encoder_attention_heads), optional) –

    Mask to nullify selected heads of the attention modules in the encoder. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • decoder_head_mask (torch.Tensor of shape (decoder_layers, decoder_attention_heads), optional) –

    Mask to nullify selected heads of the attention modules in the decoder. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • cross_attn_head_mask (torch.Tensor of shape (decoder_layers, decoder_attention_heads), optional) –

    Mask to nullify selected heads of the cross-attention modules in the decoder. Mask values selected in [0, +1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • encoder_outputs (tuple(tuple(torch.FloatTensor), optional) – Tuple consists of (last_hidden_state, optional: hidden_states, optional: attentions) +last_hidden_state of shape (batch_size, sequence_length, hidden_size), optional) is a sequence of +hidden-states at the output of the last layer of the encoder. Used in the cross-attention of the decoder.

  • +
  • past_key_values (tuple(tuple(torch.FloatTensor)), optional, returned when use_cache=True is passed or when config.use_cache=True) –

    Tuple of tuple(torch.FloatTensor) of length config.n_layers, with each tuple having 2 tensors of shape +(batch_size, num_heads, sequence_length, embed_size_per_head)) and 2 additional tensors of shape +(batch_size, num_heads, encoder_sequence_length, embed_size_per_head).

    +

    Contains pre-computed hidden-states (key and values in the self-attention blocks and in the cross-attention +blocks) that can be used (see past_key_values input) to speed up sequential decoding.

    +

    If past_key_values are used, the user can optionally input only the last decoder_input_ids (those that +don’t have their past key value states given to this model) of shape (batch_size, 1) instead of all +decoder_input_ids of shape (batch_size, sequence_length).

    +

  • +
  • inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation. +This is useful if you want more control over how to convert input_ids indices into associated vectors +than the model’s internal embedding lookup matrix.

  • +
  • decoder_inputs_embeds (torch.FloatTensor of shape (batch_size, target_sequence_length, hidden_size), optional) –

    Optionally, instead of passing decoder_input_ids you can choose to directly pass an embedded +representation. If past_key_values is used, optionally only the last decoder_inputs_embeds have to be +input (see past_key_values). This is useful if you want more control over how to convert +decoder_input_ids indices into associated vectors than the model’s internal embedding lookup matrix.

    +

    If decoder_input_ids and decoder_inputs_embeds are both unset, decoder_inputs_embeds takes the value +of inputs_embeds.

    +

  • +
  • use_cache (bool, optional) – If set to True, past_key_values key value states are returned and can be used to speed up decoding (see +past_key_values).

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
  • labels (torch.LongTensor of shape (batch_size,), optional) – Labels for computing the sequence classification/regression loss. Indices should be in [0, ..., +config.num_labels - 1]. If config.num_labels > 1 a classification loss is computed (Cross-Entropy).

  • +
+
+
+
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
  1. If a head_name is passed, the head with the given name is used.

  2. If the forward call is executed within an AdapterSetup context, the head configuration is read from the context.

  3. If the active_head property is set, the head configuration is read from there.
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
get_labels(head_name=None)

Returns the labels the given head is assigning/predicting.

Parameters

  • head_name – (str, optional) the name of the head whose labels should be returned. Default is None. If the name is None, the labels of the active head are returned.

Returns: labels

+
+ +
+
get_labels_dict(head_name=None)

Returns the id2label dict for the given head.

Parameters

  • head_name – (str, optional) the name of the head whose labels should be returned. Default is None. If the name is None, the labels of the active head are returned.

Returns: id2label

+
+ +
+
get_output_embeddings() → Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
iter_layers() → Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) → str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.save_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, this will be either the default adapter config for the requested adapter (if specified) or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
+ +
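A loading sketch (the Hub id below is illustrative; substitute an adapter matching your model):

```python
# Load a pre-trained adapter from the HuggingFace Model Hub and activate it.
adapter_name = model.load_adapter(
    "AdapterHub/roberta-base-pf-imdb", source="hf", set_active=True
)
```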
+
load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) → str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) → str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the head using this name. By default, the name with which the head was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created.

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
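An upload sketch (repository name, adapter name, and dataset tag are placeholders):

```python
model.push_adapter_to_hub(
    "my-bart-summarization-adapter",
    "summarization",
    datasets_tag="cnn_dailymail",
)
```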
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
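A merge/unmerge sketch, assuming a LoRA adapter named "lora_adapter" was added:

```python
# Fold the LoRA weights into the base weights for faster inference...
model.merge_adapter("lora_adapter")
# ...and restore the original, unmerged weights later if needed.
model.reset_adapter()
```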
+
save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
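For example (path and adapter name are placeholders):

```python
# Save the adapter (without its head) so it can be reloaded via load_adapter().
model.save_adapter("./saved/task_a", "task_a", with_head=False)
```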
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) → None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if the model has only one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
save_pretrained(save_directory: Union[str, PathLike], **kwargs)

Save a model and its configuration file to a directory, so that it can be re-loaded using the [~PreTrainedModel.from_pretrained] class method.

Parameters

  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful during distributed training (e.g., on TPUs) when this function needs to be called on all processes; in that case, set is_main_process=True only on the main process to avoid race conditions.

  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only save parts of the model or if special precautions need to be taken when recovering the state dictionary of a model (like when using model parallelism).

  • save_function (Callable) – The function to use to save the state dictionary. Useful during distributed training (e.g., on TPUs) when one needs to replace torch.save with another method.

  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the repository you want to push to with repo_id (will default to the name of save_directory in your namespace).

  • max_shard_size (int or str, optional, defaults to “5GB”) – The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). We default to 5GB so that models can easily run on free-tier Google Colab instances without CPU OOM issues. Note: if a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard, which will be bigger than max_shard_size.

  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (which uses pickle).

  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all keys of the adapter state dict need to be prepended with base_model.model. Advanced users can disable this behavior by setting save_peft_format to False.

  • kwargs (Dict[str, Any], optional) – Additional keyword arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.
set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, parameter sharing can’t be handled, so we clone the weights instead.

+
+ +
+
train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)

Sets the model into training mode for the given adapters. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)

Sets the model into training mode for AdapterFusion, determined by a list of adapter names. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)

Sets the model into training mode for AdapterFusion, determined by a list of adapter names.

+
diff --git a/classes/models/beit.html b/classes/models/beit.html new file mode 100644 index 0000000000..b663d7d825 --- /dev/null +++ b/classes/models/beit.html @@ -0,0 +1,1019 @@
[HTML page scaffolding, navigation, and version-switcher chrome omitted: BEiT — AdapterHub documentation]

BEiT

+

The Bidirectional Encoder representation from Image Transformers (BEiT) model was proposed in BEiT: BERT Pre-Training of Image Transformers by Hangbo Bao, Li Dong, Songhao Piao, and Furu Wei.

+

The abstract from the paper is the following:

+

We introduce a self-supervised vision representation model BEiT, which stands for Bidirectional Encoder representation from Image Transformers. Following BERT developed in the natural language processing area, we propose a masked image modeling task to pretrain vision Transformers. Specifically, each image has two views in our pre-training, i.e., image patches (such as 16x16 pixels), and visual tokens (i.e., discrete tokens). We first “tokenize” the original image into visual tokens. Then we randomly mask some image patches and feed them into the backbone Transformer. The pre-training objective is to recover the original visual tokens based on the corrupted image patches. After pre-training BEiT, we directly fine-tune the model parameters on downstream tasks by appending task layers upon the pretrained encoder. Experimental results on image classification and semantic segmentation show that our model achieves competitive results with previous pre-training methods. For example, base-size BEiT achieves 83.2% top-1 accuracy on ImageNet-1K, significantly outperforming from-scratch DeiT training (81.8%) with the same setup. Moreover, large-size BEiT obtains 86.3% only using ImageNet-1K, even outperforming ViT-L with supervised pre-training on ImageNet-22K (85.2%).

+
+

BeitAdapterModel

+
+
class adapters.BeitAdapterModel(config)

Beit Model transformer with the option to add multiple flexible heads on top. This model is a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

+
+
Parameters
+

config ([BeitConfig]) – Model configuration class with all the parameters of the model. +Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (which do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
adapter_summary(as_dict=False) → Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
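A sketch under the same assumptions as above ("beit_demo" is the adapter added earlier):

import torch

# Move only the "beit_demo" adapter's weights to GPU in half precision;
# the rest of the model is left where it is.
model.adapter_to("beit_demo", device="cuda", dtype=torch.float16)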
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite +recursion.

+
+ +
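A sketch of the two configuration styles, assuming the model from the class-level example; the adapter name "task_lora" is illustrative:

from adapters import LoRAConfig

# A preset identifier or an explicit config object both work; overwrite_ok
# replaces an existing adapter of the same name instead of raising.
model.add_adapter("task_lora", config=LoRAConfig(r=8, alpha=16))
model.add_adapter("task_lora", config="lora", overwrite_ok=True, set_active=True)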
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
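A sketch of the three accepted naming formats; the adapter names "a" and "b" are placeholders:

from adapters.composition import Fuse

model.add_adapter("a")
model.add_adapter("b")

# The three formats are interchangeable:
model.add_adapter_fusion(Fuse("a", "b"))   # composition block
# model.add_adapter_fusion(["a", "b"])     # list of adapter names
# model.add_adapter_fusion("a,b")          # comma-separated string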
+
+add_image_classification_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds an image classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
+ +
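A hypothetical ten-class setup as a sketch; the head name and label mapping are illustrative:

# id2label attaches a readable label mapping to the new head.
model.add_image_classification_head(
    "cifar10_head",
    num_labels=10,
    id2label={i: f"class_{i}" for i in range(10)},
)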
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – Specifies the existing adapters whose weights should be averaged, given as a list of adapter names. The +optional weights argument assigns a weight to each listed adapter; if normalize_weights is True (the default), the given +weights are normalized to sum to 1.

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
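A sketch building a merged adapter from the two placeholder adapters added earlier:

# With normalize_weights=True (the default) the weights are scaled to sum to 1.
model.average_adapter(
    "merged",
    adapter_list=["a", "b"],
    weights=[0.7, 0.3],
    set_active=True,
)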
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(pixel_values: Optional[Tensor] = None, bool_masked_pos: Optional[BoolTensor] = None, head_mask: Optional[Tensor] = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

The [BeitAdapterModel] forward method, overrides the __call__ special method.

+

<Tip>

+

Although the recipe for forward pass needs to be defined within this function, one should call the [Module] +instance afterwards instead of this since the former takes care of running the pre and post processing steps while +the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • pixel_values (torch.FloatTensor of shape (batch_size, num_channels, height, width)) – Pixel values. Pixel values can be obtained using [AutoImageProcessor]. See +[BeitImageProcessor.__call__] for details.

  • +
  • head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the self-attention modules. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
+
+
+
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
    +
  1. If a head_name is passed, the head with the given name is used.

  2. +
  3. If the forward call is executed within an AdapterSetup context, the head configuration is read from +the context.

  4. +
  5. If the active_head property is set, the head configuration is read from there.

  6. +
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite +recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. Defaults to None. If None, the labels of +the active head are returned.

  • +
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. Defaults to None. If None, the labels of +the active head are returned.

  • +
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() → Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() → Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) → str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.saved_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. +If not specified, will be either: - the default adapter config for the requested adapter if specified - +the global default adapter config

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
+ +
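A sketch of loading from the Hugging Face Hub; the repository id below is a placeholder, not a published adapter:

name = model.load_adapter(
    "username/beit-demo-adapter",  # hypothetical Hub repo id
    load_as="demo",
    set_active=True,
)
print(name)  # -> "demo"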
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) → str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) → str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
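A sketch of a typical merge/unmerge cycle, assuming the "task_lora" LoRA adapter added earlier:

# Fold LoRA weights into the backbone for zero-overhead inference, then
# undo the merge (see reset_adapter() below) before continuing training.
model.merge_adapter("task_lora")
# ... run inference ...
model.reset_adapter()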
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
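A save/reload round trip as a sketch; the directory path is illustrative:

import os

save_dir = os.path.join("checkpoints", "beit_demo")
# with_head=True stores the matching prediction head alongside the adapter,
# so a later load_adapter() restores both in one call.
model.save_adapter(save_dir, "beit_demo", with_head=True)
model.load_adapter(save_dir, load_as="beit_demo_reloaded")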
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) → None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training like +TPUs and need to call this function on all processes. In this case, set is_main_process=True only on +the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training like TPUs when one +need to replace torch.save by another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before it is sharded. Each checkpoint shard will then be smaller +than this size. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). +The default is 5GB so that models can run easily on free-tier Google Colab instances +without CPU OOM issues.

    +

    <Tip warning={true}>

    +

    If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard +which will be bigger than max_shard_size.

    +

    </Tip>

    +

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library: in case adapter weights are attached to the model, all +keys of the adapters’ state dict need to be prepended with base_model.model. Advanced users can +disable this behavior by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional key word arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing +the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of +the respective type will be activated. In case the calling model class supports named prediction heads, this +method will attempt to activate a prediction head with the name of the last adapter in the list of passed +adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
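A sketch using a composition block, assuming the placeholder adapters "a" and "b" from earlier:

from adapters.composition import Stack

# A composition block activates several adapters at once; Stack applies
# them in sequence. Passing None deactivates all adapters again.
model.set_active_adapters(Stack("a", "b"))
model.set_active_adapters(None)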
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, can’t handle parameter sharing so we are cloning +the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into training mode for the given adapters. If self.base_model is self, the model must inherit from a class +that implements this method to preclude infinite recursion.

+
+ +
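A sketch of the usual pre-training-loop call, assuming the "beit_demo" adapter from the class-level example:

# Freeze the backbone and unfreeze only the named adapter before training.
model.train_adapter("beit_demo")
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
# `trainable` should now contain adapter (and head) parameters only.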
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for AdapterFusion, determined by a list of adapter names. If +self.base_model is self, the model must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for AdapterFusion, determined by a list of adapter names.

+
+ +
+ +
+
\ No newline at end of file
diff --git a/classes/models/bert-generation.html b/classes/models/bert-generation.html
new file mode 100644
index 0000000000..50c58a837f
--- /dev/null
+++ b/classes/models/bert-generation.html
@@ -0,0 +1,1053 @@
+ BertGeneration — AdapterHub documentation
+

BertGeneration

+
+

Overview

+

The BertGeneration model is a BERT model that can be leveraged for sequence-to-sequence tasks using +EncoderDecoderModel as proposed in Leveraging Pre-trained Checkpoints for Sequence Generation +Tasks by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.

+

The abstract from the paper is the following:

+

Unsupervised pretraining of large neural models has recently revolutionized Natural Language Processing. By +warm-starting from the publicly released checkpoints, NLP practitioners have pushed the state-of-the-art on multiple +benchmarks while saving significant amounts of compute time. So far the focus has been mainly on the Natural Language +Understanding tasks. In this paper, we demonstrate the efficacy of pre-trained checkpoints for Sequence Generation. We +developed a Transformer-based sequence-to-sequence model that is compatible with publicly available pre-trained BERT, +GPT-2 and RoBERTa checkpoints and conducted an extensive empirical study on the utility of initializing our model, both +encoder and decoder, with these checkpoints. Our models result in new state-of-the-art results on Machine Translation, +Text Summarization, Sentence Splitting, and Sentence Fusion.

+
+
+

BertGenerationAdapterModel

+
+
+class adapters.BertGenerationAdapterModel(config)
+

Bert Model transformer with the option to add multiple flexible heads on top.

+

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the +library implements for all its model (such as downloading or saving, resizing the input embeddings, pruning heads +etc.)

+

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. +Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matter related to general usage +and behavior.

+
+
Parameters
+

config ([BertGenerationConfig]) – Model configuration class with all the parameters of the model. +Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
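A minimal usage sketch (not part of the upstream reference); the checkpoint identifier is an assumed example of a public BertGeneration encoder, and the adapter name is illustrative:

from adapters import BertGenerationAdapterModel

# Assumed example checkpoint; substitute any compatible checkpoint.
model = BertGenerationAdapterModel.from_pretrained(
    "google/bert_for_seq_generation_L-24_bbc_encoder"
)
model.add_adapter("gen_task", config="seq_bn")
model.add_causal_lm_head("gen_task")
model.train_adapter("gen_task")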
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (that does not support multi-adapter inference), module.active_adapter will return +a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) → Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite +recursion.

+
+ +
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
+
+add_causal_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a causal language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_masked_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a masked language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
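A sketch of adding both head types side by side; the head names are placeholders:

model.add_causal_lm_head("lm")    # left-to-right generation
model.add_masked_lm_head("mlm")   # masked-token prediction
model.active_head = "lm"          # select the head used by forward()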
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – Specifies the existing adapters whose weights should be averaged, given as a list of adapter names. The +optional weights argument assigns a weight to each listed adapter; if normalize_weights is True (the default), the given +weights are normalized to sum to 1.

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, encoder_hidden_states=None, encoder_attention_mask=None, past_key_values=None, use_cache=None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

The [BertGenerationAdapterModel] forward method, overrides the __call__ special method.

+

<Tip>

+

Although the recipe for forward pass needs to be defined within this function, one should call the [Module] +instance afterwards instead of this since the former takes care of running the pre and post processing steps while +the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.__call__] and +[PreTrainedTokenizer.encode] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

  • +
  • attention_mask (torch.FloatTensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • position_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Indices of positions of each input sequence tokens in the position embeddings. Selected in the range [0, +config.max_position_embeddings - 1].

    +

    [What are position IDs?](../glossary#position-ids)

    +

  • +
  • head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the self-attention modules. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation. This +is useful if you want more control over how to convert input_ids indices into associated vectors than the +model’s internal embedding lookup matrix.

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
+
+
+
+ +
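A small inference sketch under the same assumptions as the class-level example (checkpoint and head name are placeholders):

from transformers import AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained(
    "google/bert_for_seq_generation_L-24_bbc_encoder"
)
batch = tok("Adapters are parameter-efficient.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**batch, head="lm")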
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
    +
  1. If a head_name is passed, the head with the given name is used.

  2. +
  3. If the forward call is executed within an AdapterSetup context, the head configuration is read from +the context.

  4. +
  5. If the active_head property is set, the head configuration is read from there.

  6. +
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite +recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. Defaults to None. If None, the labels of +the active head are returned.

  • +
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. Defaults to None. If None, the labels of +the active head are returned.

  • +
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() → Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() → Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) → str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.saved_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. +If not specified, will be either: - the default adapter config for the requested adapter if specified - +the global default adapter config

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) → str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) → str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
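A sketch of a typical upload call; the repository name and dataset tag are placeholders for illustration:

url = model.push_adapter_to_hub(
    "my-bert-generation-adapter",  # hypothetical repo name
    "gen_task",
    datasets_tag="cnn_dailymail",
    private=True,
)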
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) → None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training like +TPUs and need to call this function on all processes. In this case, set is_main_process=True only on +the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training like TPUs when one +need to replace torch.save by another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before it is sharded. Each checkpoint shard will then be smaller +than this size. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). +The default is 5GB so that models can run easily on free-tier Google Colab instances +without CPU OOM issues.

    +

    <Tip warning={true}>

    +

    If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard +which will be bigger than max_shard_size.

    +

    </Tip>

    +

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library: in case adapter weights are attached to the model, all +keys of the adapters’ state dict need to be prepended with base_model.model. Advanced users can +disable this behavior by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional key word arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
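As a rough sketch (assuming two adapters named "a" and "b" have already been added to the model; checkpoint and adapter names are illustrative), activating a stacked setup could look like this:

```python
from adapters import AutoAdapterModel
from adapters.composition import Stack

model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("a")
model.add_adapter("b")

# activate both adapters as a stack for every following forward pass
model.set_active_adapters(Stack("a", "b"))
```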
+ +
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, TorchScript can’t handle parameter sharing, so we clone the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into training mode for the given adapters. If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite recursion.

+
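A minimal training-setup sketch (assuming an adapter named "a" was added beforehand; train_adapter() freezes all other weights so only the adapter is updated):

```python
from adapters import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("a")

# freeze the base model and enable gradient updates only for adapter "a";
# this also activates the adapter for the forward pass
model.train_adapter("a")
```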
+ +
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for the adapter fusion determined by a list of adapter names. If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for the adapter fusion determined by a list of adapter names.

+
\ No newline at end of file
diff --git a/classes/models/bert.html b/classes/models/bert.html
new file mode 100644
index 0000000000..56221d4ef7
--- /dev/null
+++ b/classes/models/bert.html
@@ -0,0 +1,1135 @@
BERT

+

The BERT model was proposed in BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova. It is a bidirectional transformer pre-trained using a combination of masked language modeling objective and next sentence prediction.

+
+

BertAdapterModel

+
+
+class adapters.BertAdapterModel(config)
+

Bert Model transformer with the option to add multiple flexible heads on top.

+

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads etc.)

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

+
+
Parameters
+

config ([BertConfig]) – Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
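For illustration, a minimal way to instantiate this class (the checkpoint name is an example):

```python
from adapters import BertAdapterModel

# load pre-trained BERT weights into the flexible-head model class
model = BertAdapterModel.from_pretrained("bert-base-uncased")
```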
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT official documentation: https://huggingface.co/docs/peft

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters for inference) returns the list of all active adapters so that users can deal with them accordingly.

For previous PEFT versions (that do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head (string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
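A sketch of moving a single adapter (assumes an adapter named "a" exists and a CUDA device is available):

```python
import torch
from adapters import BertAdapterModel

model = BertAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("a")

# move only the adapter weights to the GPU in half precision
model.adapter_to("a", device="cuda", dtype=torch.float16)
```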
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite recursion.
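A short sketch of adding and activating a new adapter; the configuration string "seq_bn" is assumed to be one of the library’s pre-defined identifiers (any other supported identifier or config object could be used):

```python
from adapters import BertAdapterModel

model = BertAdapterModel.from_pretrained("bert-base-uncased")

# add a bottleneck adapter with a pre-defined configuration and activate it
model.add_adapter("my_adapter", config="seq_bn", set_active=True)
```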

+
+ +
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
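A sketch of adding a fusion layer (assumes adapters "a" and "b" already exist on the model):

```python
from adapters import BertAdapterModel
from adapters.composition import Fuse

model = BertAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("a")
model.add_adapter("b")

# add an AdapterFusion layer combining the two adapters and activate it
model.add_adapter_fusion(Fuse("a", "b"), set_active=True)
```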
+ +
+
+add_causal_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a causal language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
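For example (head name and label mapping chosen for illustration):

```python
from adapters import BertAdapterModel

model = BertAdapterModel.from_pretrained("bert-base-uncased")

# binary sequence classification head with an explicit label mapping
model.add_classification_head(
    "sst-2", num_labels=2, id2label={0: "negative", 1: "positive"}
)
```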
+ +
+
+add_dependency_parsing_head(head_name, num_labels=2, overwrite_ok=False, id2label=None)
+

Adds a biaffine dependency parsing head on top of the model. The parsing head uses the architecture described in “Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation” (Glavaš & Vulić, 2021) (https://arxiv.org/pdf/2008.06788.pdf).

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of labels. Defaults to 2.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • id2label (dict, optional) – Mapping from label ids to labels. Defaults to None.

  • +
+
+
+
+ +
+
+add_masked_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a masked language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_multiple_choice_head(head_name, num_choices=2, layers=2, activation_function='tanh', overwrite_ok=False, id2label=None, use_pooler=False)
+

Adds a multiple choice head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_choices (int, optional) – Number of choices. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_tagging_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a token classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
  • adapter_name (str) – The name of the adapter module to be added.

  • adapter_list (List[str]) – Names of the existing adapter modules whose weights should be averaged.

  • weights (List[float], optional) – Weights assigned to each adapter in adapter_list. If not provided, all adapters are weighted equally. If normalize_weights is True, the weights are normalized to sum to 1.

  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
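A sketch (assumes adapters "a" and "b" exist; the weights are chosen arbitrarily):

```python
from adapters import BertAdapterModel

model = BertAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("a")
model.add_adapter("b")

# create a new adapter whose weights are a weighted average of "a" and "b"
model.average_adapter("avg", ["a", "b"], weights=[0.7, 0.3], set_active=True)
```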
+ +
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, attention_mask=None, token_type_ids=None, position_ids=None, head_mask=None, inputs_embeds=None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

The [BertAdapterModel] forward method, overrides the __call__ special method.

+

<Tip>

+

Although the recipe for forward pass needs to be defined within this function, one should call the [Module] +instance afterwards instead of this since the former takes care of running the pre and post processing steps while +the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

  • +
  • attention_mask (torch.FloatTensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • token_type_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Segment token indices to indicate first and second portions of the inputs. Indices are selected in [0, +1]:

    +
      +
    • 0 corresponds to a sentence A token,

    • +
    • 1 corresponds to a sentence B token.

    • +
    +

    [What are token type IDs?](../glossary#token-type-ids)

    +

  • +
  • position_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Indices of positions of each input sequence tokens in the position embeddings. Selected in the range [0, +config.max_position_embeddings - 1].

    +

    [What are position IDs?](../glossary#position-ids)

    +

  • +
  • head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the self-attention modules. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation. This +is useful if you want more control over how to convert input_ids indices into associated vectors than the +model’s internal embedding lookup matrix.

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
+
+
+
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
  1. If a head_name is passed, the head with the given name is used.

  2. If the forward call is executed within an AdapterSetup context, the head configuration is read from the context.

  3. If the active_head property is set, the head configuration is read from there.
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

Parameters

head_name (str, optional) – The name of the head whose labels should be returned. Default is None. If the name is None, the labels of the active head are returned.

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

Parameters

head_name (str, optional) – The name of the head whose label mapping should be returned. Default is None. If the name is None, the id2label of the active head is returned.

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
  • a path to a directory containing adapter weights saved using model.save_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, will be either: the default adapter config for the requested adapter if specified, or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str
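For example, reloading an adapter previously saved with save_adapter() (the directory path and adapter name are illustrative):

```python
from adapters import BertAdapterModel

model = BertAdapterModel.from_pretrained("bert-base-uncased")

# load adapter weights from a local directory and activate them
name = model.load_adapter(
    "./checkpoints/my_adapter", load_as="my_adapter", set_active=True
)
```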

+
+
+
+ +
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the head using this name. By default, the name with which the head was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
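A sketch for a LoRA adapter (assumes a LoRA module named "lora_a" was added, e.g. via add_adapter with a LoRA config):

```python
from adapters import BertAdapterModel, LoRAConfig

model = BertAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("lora_a", config=LoRAConfig())

# merge the LoRA weights into the base model weights for overhead-free inference ...
model.merge_adapter("lora_a")
# ... and undo the merge again later if needed
model.reset_adapter()
```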
+
+
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str
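An illustrative call could look as follows (repository name, adapter name and dataset tag are placeholders; a valid Hugging Face token is required, e.g. via huggingface-cli login):

```python
from adapters import BertAdapterModel

model = BertAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("my_adapter")

# upload the adapter to a (hypothetical) Hub repository
url = model.push_adapter_to_hub(
    "my-bert-adapter",
    "my_adapter",
    datasets_tag="imdb",
)
```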

+
+
+
+ +
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
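A round-trip sketch (directory path and names chosen for illustration):

```python
from adapters import BertAdapterModel

model = BertAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("my_adapter")

# save the adapter (and a same-named head, if any) using safetensors ...
model.save_adapter("./saved/my_adapter", "my_adapter", use_safetensors=True)
# ... and reload it later under a different name
model.load_adapter("./saved/my_adapter", load_as="restored", use_safetensors=True)
```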
+
+
+ +
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if the model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training like +TPUs and need to call this function on all processes. In this case, set is_main_process=True only on +the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training like TPUs when one +need to replace torch.save by another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Checkpoints shard will then be each of size +lower than this size. If expressed as a string, needs to be digits followed by a unit (like “5MB”). +We default it to 5GB in order for models to be able to run easily on free-tier google colab instances +without CPU OOM issues.

    +

    <Tip warning={true}>

    +

    If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard +which will be bigger than max_shard_size.

    +

    </Tip>

    +

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with PEFT library, in case adapter weights are attached to the model, all +keys of the state dict of adapters needs to be pre-pended with base_model.model. Advanced users can +disable this behaviours by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional key word arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, TorchScript can’t handle parameter sharing, so we clone the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into training mode for the given adapters. If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for the adapter fusion determined by a list of adapter names. If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for the adapter fusion determined by a list of adapter names.

+
\ No newline at end of file
diff --git a/classes/models/clip.html b/classes/models/clip.html
new file mode 100644
index 0000000000..b58945e3d9
--- /dev/null
+++ b/classes/models/clip.html
@@ -0,0 +1,743 @@

CLIP

+
+

Note

+
+
Adapter implementation notes:
    +
  • CLIP consists of two separate Transformer encoder models, a ViT-style Transformer for visual features and a language model for textual features. Both encoders can be fitted with adapters. As usual, the leave_out parameter can be used to specify the layers in which adapters should be added. For CLIP, layer IDs are counted globally across both encoders, starting from the text encoder. I.e., for a CLIP model with 12 layers in each Transformer encoder, the text encoder will have IDs 0-11 and the vision encoder will have IDs 12-23. (A short sketch follows below this note.)

  • +
  • As CLIP does not come with pre-supported task-specific prediction heads, there is currently no CLIPAdapterModel class. Use CLIPModel instead.

  • +
+
+
+
+
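Since there is no CLIPAdapterModel, a sketch of the workflow described in the note could look like this (the config class and checkpoint are chosen for illustration; adapters.init() is assumed to make the plain transformers model adapter-compatible):

```python
import adapters
from adapters import SeqBnConfig
from transformers import CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
adapters.init(model)  # enable adapter support on the plain CLIPModel

# layer IDs are counted globally: 0-11 = text encoder, 12-23 = vision encoder.
# leave_out=[0..11] therefore adds the adapter only to the vision encoder.
config = SeqBnConfig(leave_out=list(range(12)))
model.add_adapter("vision_only", config=config, set_active=True)
```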

The CLIP model was proposed in Learning Transferable Visual Models From Natural Language Supervision by Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever. CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. It can be instructed in natural language to predict the most relevant text snippet, given an image, without directly optimizing for the task, similarly to the zero-shot capabilities of GPT-2 and 3.

+

The abstract from the paper is the following:

+

State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstrate that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet. After pre-training, natural language is used to reference learned visual concepts (or describe new ones) enabling zero-shot transfer of the model to downstream tasks. We study the performance of this approach by benchmarking on over 30 different existing computer vision datasets, spanning tasks such as OCR, action recognition in videos, geo-localization, and many types of fine-grained object classification. The model transfers non-trivially to most tasks and is often competitive with a fully supervised baseline without the need for any dataset specific training. For instance, we match the accuracy of the original ResNet-50 on ImageNet zero-shot without needing to use any of the 1.28 million training examples it was trained on. We release our code and pre-trained model weights at this https URL.

+
+

CLIPTextModel

+
+
+class transformers.CLIPTextModel(config: CLIPTextConfig)
+

The text model from CLIP without any head or projection on top. This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads etc.)

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

+
+
Parameters
+

config ([CLIPConfig]) – Model configuration class with all the parameters of the model. +Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
+config_class
+

alias of CLIPTextConfig

+
+ +
+
+forward(input_ids: Optional[Tensor] = None, attention_mask: Optional[Tensor] = None, position_ids: Optional[Tensor] = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None) Union[Tuple, BaseModelOutputWithPooling]
+

The [CLIPTextModel] forward method, overrides the __call__ special method.

+

<Tip>

+

Although the recipe for forward pass needs to be defined within this function, one should call the [Module] +instance afterwards instead of this since the former takes care of running the pre and post processing steps while +the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary. Padding will be ignored by default should you provide +it.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

  • +
  • attention_mask (torch.Tensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • position_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Indices of positions of each input sequence tokens in the position embeddings. Selected in the range [0, +config.max_position_embeddings - 1].

    +

    [What are position IDs?](../glossary#position-ids)

    +

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
Returns

[transformers.modeling_outputs.BaseModelOutputWithPooling] or tuple(torch.FloatTensor): A [transformers.modeling_outputs.BaseModelOutputWithPooling] or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration ([CLIPTextConfig]) and inputs.

  • last_hidden_state (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size)) – Sequence of hidden-states at the output of the last layer of the model.

  • pooler_output (torch.FloatTensor of shape (batch_size, hidden_size)) – Last layer hidden-state of the first token of the sequence (classification token) after further processing through the layers used for the auxiliary pretraining task. E.g. for BERT-family of models, this returns the classification token after processing through a linear layer and a tanh activation function. The linear layer weights are trained from the next sentence prediction (classification) objective during pretraining.

  • hidden_states (tuple(torch.FloatTensor), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) – Tuple of torch.FloatTensor (one for the output of the embeddings, if the model has an embedding layer, + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size). Hidden-states of the model at the output of each layer plus the optional initial embedding outputs.

  • attentions (tuple(torch.FloatTensor), optional, returned when output_attentions=True is passed or when config.output_attentions=True) – Tuple of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length). Attentions weights after the attention softmax, used to compute the weighted average in the self-attention heads.

Examples:

```python
>>> from transformers import AutoTokenizer, CLIPTextModel

>>> model = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")
>>> tokenizer = AutoTokenizer.from_pretrained("openai/clip-vit-base-patch32")

>>> inputs = tokenizer(["a photo of a cat", "a photo of a dog"], padding=True, return_tensors="pt")

>>> outputs = model(**inputs)
>>> last_hidden_state = outputs.last_hidden_state
>>> pooled_output = outputs.pooler_output  # pooled (EOS token) states
```
+
+
+
+ +
+
+get_input_embeddings() Module
+

Returns the model’s input embeddings.

+
+
Returns
+

A torch module mapping vocabulary to hidden states.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+set_input_embeddings(value)
+

Set model’s input embeddings.

+
+
Parameters
+

value (nn.Module) – A module mapping vocabulary to hidden states.

+
+
+
+ +
+ +
+
+

CLIPVisionModel

+
+
+class transformers.CLIPVisionModel(config: CLIPVisionConfig)
+

The vision model from CLIP without any head or projection on top. This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads etc.)

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

+
+
Parameters
+

config ([CLIPConfig]) – Model configuration class with all the parameters of the model. +Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
+config_class
+

alias of CLIPVisionConfig

+
+ +
+
+forward(pixel_values: Optional[FloatTensor] = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None) Union[Tuple, BaseModelOutputWithPooling]
+

The [CLIPVisionModel] forward method, overrides the __call__ special method.

+

<Tip>

+

Although the recipe for forward pass needs to be defined within this function, one should call the [Module] +instance afterwards instead of this since the former takes care of running the pre and post processing steps while +the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • pixel_values (torch.FloatTensor of shape (batch_size, num_channels, height, width)) – Pixel values. Padding will be ignored by default should you provide it. Pixel values can be obtained using +[AutoImageProcessor]. See [CLIPImageProcessor.__call__] for details.

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
Returns

[transformers.modeling_outputs.BaseModelOutputWithPooling] or tuple(torch.FloatTensor): A [transformers.modeling_outputs.BaseModelOutputWithPooling] or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration ([CLIPVisionConfig]) and inputs.

  • last_hidden_state (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size)) – Sequence of hidden-states at the output of the last layer of the model.

  • pooler_output (torch.FloatTensor of shape (batch_size, hidden_size)) – Last layer hidden-state of the first token of the sequence (classification token) after further processing through the layers used for the auxiliary pretraining task. E.g. for BERT-family of models, this returns the classification token after processing through a linear layer and a tanh activation function. The linear layer weights are trained from the next sentence prediction (classification) objective during pretraining.

  • hidden_states (tuple(torch.FloatTensor), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) – Tuple of torch.FloatTensor (one for the output of the embeddings, if the model has an embedding layer, + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size). Hidden-states of the model at the output of each layer plus the optional initial embedding outputs.

  • attentions (tuple(torch.FloatTensor), optional, returned when output_attentions=True is passed or when config.output_attentions=True) – Tuple of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length). Attentions weights after the attention softmax, used to compute the weighted average in the self-attention heads.

Examples:

```python
>>> from PIL import Image
>>> import requests
>>> from transformers import AutoProcessor, CLIPVisionModel

>>> model = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32")
>>> processor = AutoProcessor.from_pretrained("openai/clip-vit-base-patch32")

>>> url = "http://images.cocodataset.org/val2017/000000039769.jpg"
>>> image = Image.open(requests.get(url, stream=True).raw)

>>> inputs = processor(images=image, return_tensors="pt")

>>> outputs = model(**inputs)
>>> last_hidden_state = outputs.last_hidden_state
>>> pooled_output = outputs.pooler_output  # pooled CLS states
```
+
+
+
+ +
+
+get_input_embeddings() Module
+

Returns the model’s input embeddings.

+
+
Returns
+

A torch module mapping vocabulary to hidden states.

+
+
Return type
+

nn.Module

+
+
+
+ +
+ +
+
+

CLIPModel

+
+
+class transformers.CLIPModel(config: CLIPConfig)
+

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads etc.)

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

+
+
Parameters
+

config ([CLIPConfig]) – Model configuration class with all the parameters of the model. +Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
+config_class
+

alias of CLIPConfig

+
+ +
+
+forward(input_ids: Optional[LongTensor] = None, pixel_values: Optional[FloatTensor] = None, attention_mask: Optional[Tensor] = None, position_ids: Optional[LongTensor] = None, return_loss: Optional[bool] = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None) Union[Tuple, CLIPOutput]
+

The [CLIPModel] forward method, overrides the __call__ special method.

+

<Tip>

+

Although the recipe for forward pass needs to be defined within this function, one should call the [Module] +instance afterwards instead of this since the former takes care of running the pre and post processing steps while +the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary. Padding will be ignored by default should you provide +it.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

  • +
  • attention_mask (torch.Tensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • position_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Indices of positions of each input sequence tokens in the position embeddings. Selected in the range [0, +config.max_position_embeddings - 1].

    +

    [What are position IDs?](../glossary#position-ids)

    +

  • +
  • pixel_values (torch.FloatTensor of shape (batch_size, num_channels, height, width)) – Pixel values. Padding will be ignored by default should you provide it. Pixel values can be obtained using +[AutoImageProcessor]. See [CLIPImageProcessor.__call__] for details.

  • +
  • return_loss (bool, optional) – Whether or not to return the contrastive loss.

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
Returns

[transformers.models.clip.modeling_clip.CLIPOutput] or tuple(torch.FloatTensor): A [transformers.models.clip.modeling_clip.CLIPOutput] or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration ([CLIPConfig]) and inputs.

  • loss (torch.FloatTensor of shape (1,), optional, returned when return_loss is True) – Contrastive loss for image-text similarity.

  • logits_per_image (torch.FloatTensor of shape (image_batch_size, text_batch_size)) – The scaled dot product scores between image_embeds and text_embeds. This represents the image-text similarity scores.

  • logits_per_text (torch.FloatTensor of shape (text_batch_size, image_batch_size)) – The scaled dot product scores between text_embeds and image_embeds. This represents the text-image similarity scores.

  • text_embeds (torch.FloatTensor of shape (batch_size, output_dim)) – The text embeddings obtained by applying the projection layer to the pooled output of [CLIPTextModel].

  • image_embeds (torch.FloatTensor of shape (batch_size, output_dim)) – The image embeddings obtained by applying the projection layer to the pooled output of [CLIPVisionModel].

  • text_model_output (BaseModelOutputWithPooling) – The output of the [CLIPTextModel].

  • vision_model_output (BaseModelOutputWithPooling) – The output of the [CLIPVisionModel].

Examples:

```python
>>> from PIL import Image
>>> import requests
>>> from transformers import AutoProcessor, CLIPModel

>>> model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
>>> processor = AutoProcessor.from_pretrained("openai/clip-vit-base-patch32")

>>> url = "http://images.cocodataset.org/val2017/000000039769.jpg"
>>> image = Image.open(requests.get(url, stream=True).raw)

>>> inputs = processor(
...     text=["a photo of a cat", "a photo of a dog"], images=image, return_tensors="pt", padding=True
... )

>>> outputs = model(**inputs)
>>> logits_per_image = outputs.logits_per_image  # this is the image-text similarity score
>>> probs = logits_per_image.softmax(dim=1)  # label probabilities
```
+
+
+
+ +
+
+get_image_features(pixel_values: Optional[FloatTensor] = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None) FloatTensor
+

The [CLIPModel] forward method overrides the __call__ special method.

+

<Tip>

+

Although the recipe for the forward pass needs to be defined within this function, one should call the [Module] instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • pixel_values (torch.FloatTensor of shape (batch_size, num_channels, height, width)) – Pixel values. Padding will be ignored by default should you provide it. Pixel values can be obtained using +[AutoImageProcessor]. See [CLIPImageProcessor.__call__] for details.

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
  • Returns – image_features (torch.FloatTensor of shape (batch_size, output_dim)): The image embeddings obtained by applying the projection layer to the pooled output of [CLIPVisionModel].

  Examples:

  ```python
  >>> from PIL import Image
  >>> import requests
  >>> from transformers import AutoProcessor, CLIPModel

  >>> model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
  >>> processor = AutoProcessor.from_pretrained("openai/clip-vit-base-patch32")

  >>> url = "http://images.cocodataset.org/val2017/000000039769.jpg"
  >>> image = Image.open(requests.get(url, stream=True).raw)

  >>> inputs = processor(images=image, return_tensors="pt")

  >>> image_features = model.get_image_features(**inputs)
  ```

  • +
+
+
+
+ +
+
+get_text_features(input_ids: Optional[Tensor] = None, attention_mask: Optional[Tensor] = None, position_ids: Optional[Tensor] = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None) FloatTensor
+

The [CLIPModel] forward method overrides the __call__ special method.

+

<Tip>

+

Although the recipe for the forward pass needs to be defined within this function, one should call the [Module] instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary. Padding will be ignored by default should you provide +it.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

  • +
  • attention_mask (torch.Tensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • position_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Indices of positions of each input sequence tokens in the position embeddings. Selected in the range [0, +config.max_position_embeddings - 1].

    +

    [What are position IDs?](../glossary#position-ids)

    +

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
  • Returns – text_features (torch.FloatTensor of shape (batch_size, output_dim)): The text embeddings obtained by applying the projection layer to the pooled output of [CLIPTextModel].

  Examples:

  ```python
  >>> from transformers import AutoTokenizer, CLIPModel

  >>> model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
  >>> tokenizer = AutoTokenizer.from_pretrained("openai/clip-vit-base-patch32")

  >>> inputs = tokenizer(["a photo of a cat", "a photo of a dog"], padding=True, return_tensors="pt")
  >>> text_features = model.get_text_features(**inputs)
  ```

  • +
+
+
+
+ +
+ +
+
+ + +
+ +
+ + +
+
+ +
+ +
+ +
\ No newline at end of file diff --git a/classes/models/deberta.html b/classes/models/deberta.html new file mode 100644 index 0000000000..da0337bd4e --- /dev/null +++ b/classes/models/deberta.html @@ -0,0 +1,1067 @@ + DeBERTa — AdapterHub documentation

DeBERTa

+
+

Overview

+

The DeBERTa model was proposed in DeBERTa: Decoding-enhanced BERT with Disentangled Attention by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen. It is based on Google’s BERT model released in 2018 and Facebook’s RoBERTa model released in 2019.

+

It builds on RoBERTa with disentangled attention and enhanced mask decoder training with half of the data used in +RoBERTa.

+

The abstract from the paper is the following:

+

Recent progress in pre-trained neural language models has significantly improved the performance of many natural +language processing (NLP) tasks. In this paper we propose a new model architecture DeBERTa (Decoding-enhanced BERT with +disentangled attention) that improves the BERT and RoBERTa models using two novel techniques. The first is the +disentangled attention mechanism, where each word is represented using two vectors that encode its content and +position, respectively, and the attention weights among words are computed using disentangled matrices on their +contents and relative positions. Second, an enhanced mask decoder is used to replace the output softmax layer to +predict the masked tokens for model pretraining. We show that these two techniques significantly improve the efficiency +of model pretraining and performance of downstream tasks. Compared to RoBERTa-Large, a DeBERTa model trained on half of +the training data performs consistently better on a wide range of NLP tasks, achieving improvements on MNLI by +0.9% +(90.2% vs. 91.1%), on SQuAD v2.0 by +2.3% (88.4% vs. 90.7%) and RACE by +3.6% (83.2% vs. 86.8%). The DeBERTa code and +pre-trained models will be made publicly available at https://github.com/microsoft/DeBERTa.

+

This model was contributed by DeBERTa. The TF 2.0 implementation of this model was contributed by kamalkraj. The original code can be found here.

+
+
+

DebertaAdapterModel

+
+
+class adapters.DebertaAdapterModel(config)
+

Deberta Model transformer with the option to add multiple flexible heads on top.

+
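For orientation, a minimal usage sketch follows; the checkpoint name, the adapter/head name "sentiment", and the "seq_bn" config string are illustrative placeholders, not prescribed values.

```python
from adapters import DebertaAdapterModel

# Load a DeBERTa checkpoint with flexible-head support
# (checkpoint name is illustrative).
model = DebertaAdapterModel.from_pretrained("microsoft/deberta-base")

# Attach a bottleneck adapter and a matching classification head.
model.add_adapter("sentiment", config="seq_bn")
model.add_classification_head("sentiment", num_labels=2)

# Freeze the base model and train only the adapter (also activates it).
model.train_adapter("sentiment")
```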
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (that do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
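For example, a short sketch of inspecting the summary (assumes adapters were added as in the sketch above):

```python
# Formatted table of all adapters currently added to the model.
print(model.adapter_summary())

# Structured form of the same information.
summary = model.adapter_summary(as_dict=True)
```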
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
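A short sketch of moving one adapter's weights (the adapter name is a placeholder):

```python
import torch

# Cast only the named adapter to GPU half precision;
# the remaining model weights are left untouched.
model.adapter_to("sentiment", device="cuda", dtype=torch.float16)
```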
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, must inherit from a class that implements this method, to preclude infinite +recursion

+
+ +
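The config variants above, sketched (adapter names are placeholders; "seq_bn" is one of the pre-defined config identifiers):

```python
from adapters import LoRAConfig

# Pre-defined configuration via string identifier.
model.add_adapter("task_a", config="seq_bn")

# Full configuration object.
model.add_adapter("task_b", config=LoRAConfig(r=8, alpha=16))

# Default configuration, added and activated in one call.
model.add_adapter("task_c", set_active=True)
```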
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
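The three accepted forms of adapter_names, sketched (assumes adapters "task_a" and "task_b" were added beforehand; pick one form):

```python
from adapters.composition import Fuse

# As a Fuse composition block; equivalently:
#   model.add_adapter_fusion(["task_a", "task_b"])
#   model.add_adapter_fusion("task_a,task_b")
model.add_adapter_fusion(Fuse("task_a", "task_b"), set_active=True)
```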
+
+add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
+ +
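For instance (the head name and labels are illustrative):

```python
# Binary classification head with an explicit label mapping.
model.add_classification_head(
    "sentiment",
    num_labels=2,
    id2label={0: "negative", 1: "positive"},
)
```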
+
+add_masked_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a masked language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_multiple_choice_head(head_name, num_choices=2, layers=2, activation_function='tanh', overwrite_ok=False, id2label=None, use_pooler=False)
+

Adds a multiple choice head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_choices (int, optional) – Number of choices. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_tagging_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a token classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – Specifies the names of the existing adapters whose weights should be averaged.

  • +
  • weights (List[float], optional) – Weights for the adapters in adapter_list. If not given, all adapters are weighted equally; with normalize_weights=True (the default), the given weights are normalized to sum to 1.

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
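A sketch of averaging two existing adapters (names and weights are illustrative):

```python
# Create "task_avg" as a weighted average of two existing adapters.
# With normalize_weights=True (the default), weights are rescaled to sum to 1.
model.average_adapter(
    "task_avg",
    ["task_a", "task_b"],
    weights=[0.7, 0.3],
    set_active=True,
)
```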
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, attention_mask=None, token_type_ids=None, position_ids=None, inputs_embeds=None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

Define the computation performed at every call.

+

Should be overridden by all subclasses.

+
+

Note

+

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.

+
+
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
    +
  1. If a head_name is passed, the head with the given name is used.

  2. If the forward call is executed within an AdapterSetup context, the head configuration is read from the context.

  3. If the active_head property is set, the head configuration is read from there.
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the model class must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. Default is None. If the name is None, the labels of the active head are returned.

  • +
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose id2label mapping should be returned. Default is None. If the name is None, the mapping of the active head is returned.

  • +
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.saved_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, it will be either the default adapter config for the requested adapter, if specified, or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
+ +
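The two most common call patterns, sketched (the Hub identifier and the local path are placeholders):

```python
# From the HuggingFace Model Hub (identifier is illustrative).
name = model.load_adapter("AdapterHub/some-deberta-adapter", source="hf", set_active=True)

# From a local directory written earlier by model.save_adapter(...).
name = model.load_adapter("./checkpoints/sentiment", load_as="sentiment")
```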
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the prediction head using this name. By default, the name with which the head was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
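A hedged sketch (the repository name, adapter name, and dataset tag are placeholders):

```python
# Uploads the adapter and generates an adapter card tagged with the dataset.
url = model.push_adapter_to_hub(
    "deberta-base-sentiment-adapter",  # repo name on the Hub (placeholder)
    "sentiment",                       # local adapter to upload
    datasets_tag="rotten_tomatoes",
)
```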
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
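How merge_adapter() and reset_adapter() pair up, sketched (assumes "task_lora" was added with a LoRA config):

```python
# Fold the LoRA weights into the base Transformer weights
# for zero-overhead inference ...
model.merge_adapter("task_lora")

# ... and undo the merge later to resume training or swap adapters.
model.reset_adapter()
```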
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
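For example (the path and adapter name are placeholders):

```python
# Writes the adapter weights plus adapter_config.json; with_head=True
# (the default) also stores the matching prediction head.
model.save_adapter("./checkpoints/sentiment", "sentiment", use_safetensors=True)
```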
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training like +TPUs and need to call this function on all processes. In this case, set is_main_process=True only on +the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training like TPUs when one +need to replace torch.save by another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be of a size lower than this. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). We default it to 5GB so that models can run easily on free-tier Google Colab instances without CPU OOM issues.

    +

    <Tip warning={true}>

    +

    If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard +which will be bigger than max_shard_size.

    +

    </Tip>

    +

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all keys of the adapter state dict need to be prepended with base_model.model. Advanced users can disable this behaviour by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional key word arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
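A sketch of plain and composed activation (adapter names are placeholders):

```python
from adapters.composition import Stack

# Activate a single adapter for every forward pass ...
model.set_active_adapters("task_a")

# ... or a composition, e.g. stacking one adapter on top of another.
model.set_active_adapters(Stack("task_a", "task_b"))
```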
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, TorchScript can’t handle parameter sharing, so we clone the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into training mode for the given adapters. If self.base_model is self, the model class must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
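A typical training setup, sketched (the adapter name is a placeholder):

```python
# Freeze all base-model weights; enable gradients only for "task_a"
# and activate it for the forward pass.
model.train_adapter("task_a")

# Sanity check: only adapter (and head) parameters should be trainable now.
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```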
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for AdapterFusion over the given list of adapter names. If self.base_model is self, the model class must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for AdapterFusion over the given list of adapter names.

+
+ +
+ +
+
+ + +
+ +
+ + +
+
+ +
+ +
+ +
\ No newline at end of file diff --git a/classes/models/deberta_v2.html b/classes/models/deberta_v2.html new file mode 100644 index 0000000000..262753d4c6 --- /dev/null +++ b/classes/models/deberta_v2.html @@ -0,0 +1,1086 @@ + DeBERTa-v2 — AdapterHub documentation

DeBERTa-v2

+
+

Overview

+

The DeBERTa model was proposed in DeBERTa: Decoding-enhanced BERT with Disentangled Attention by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen. It is based on Google’s BERT model released in 2018 and Facebook’s RoBERTa model released in 2019.

+

It builds on RoBERTa with disentangled attention and enhanced mask decoder training with half of the data used in +RoBERTa.

+

The abstract from the paper is the following:

+

Recent progress in pre-trained neural language models has significantly improved the performance of many natural +language processing (NLP) tasks. In this paper we propose a new model architecture DeBERTa (Decoding-enhanced BERT with +disentangled attention) that improves the BERT and RoBERTa models using two novel techniques. The first is the +disentangled attention mechanism, where each word is represented using two vectors that encode its content and +position, respectively, and the attention weights among words are computed using disentangled matrices on their +contents and relative positions. Second, an enhanced mask decoder is used to replace the output softmax layer to +predict the masked tokens for model pretraining. We show that these two techniques significantly improve the efficiency +of model pretraining and performance of downstream tasks. Compared to RoBERTa-Large, a DeBERTa model trained on half of +the training data performs consistently better on a wide range of NLP tasks, achieving improvements on MNLI by +0.9% +(90.2% vs. 91.1%), on SQuAD v2.0 by +2.3% (88.4% vs. 90.7%) and RACE by +3.6% (83.2% vs. 86.8%). The DeBERTa code and +pre-trained models will be made publicly available at https://github.com/microsoft/DeBERTa.

+

The following information is visible directly on the [original implementation repository](https://github.com/microsoft/DeBERTa). DeBERTa v2 is the second version of the DeBERTa model. It includes the 1.5B model used for the SuperGLUE single-model submission, which achieved a score of 89.9 versus the human baseline of 89.8. You can find more details about this submission in the authors’ [blog](https://www.microsoft.com/en-us/research/blog/microsoft-deberta-surpasses-human-performance-on-the-superglue-benchmark/).

+

New in v2:

+
    +
  • Vocabulary In v2 the tokenizer is changed to use a new vocabulary of size 128K built from the training data. Instead of a GPT2-based tokenizer, the tokenizer is now a [sentencepiece-based](https://github.com/google/sentencepiece) tokenizer.

  • +
  • nGiE (nGram Induced Input Encoding) The DeBERTa-v2 model uses an additional convolution layer alongside the first transformer layer to better learn the local dependency of input tokens.

  • +
  • Sharing position projection matrix with content projection matrix in attention layer Based on previous +experiments, this can save parameters without affecting the performance.

  • +
  • Apply bucket to encode relative positions The DeBERTa-v2 model uses log bucket to encode relative positions +similar to T5.

  • +
  • 900M model & 1.5B model Two additional model sizes are available: 900M and 1.5B, which significantly improve the performance of downstream tasks.

  • +
+

This model was contributed by DeBERTa. The TF 2.0 implementation of this model was contributed by kamalkraj. The original code can be found here.

+
+
+

DebertaV2AdapterModel

+
+
+class adapters.DebertaV2AdapterModel(config)
+

Deberta v2 Model transformer with the option to add multiple flexible heads on top.

+
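Usage mirrors DebertaAdapterModel above; only the class and the checkpoint differ (the checkpoint name and config string are illustrative):

```python
from adapters import DebertaV2AdapterModel

model = DebertaV2AdapterModel.from_pretrained("microsoft/deberta-v2-xlarge")
model.add_adapter("task_a", config="seq_bn")
```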
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (that do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the model class must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
+
+add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
+ +
+
+add_masked_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a masked language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_multiple_choice_head(head_name, num_choices=2, layers=2, activation_function='tanh', overwrite_ok=False, id2label=None, use_pooler=False)
+

Adds a multiple choice head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_choices (int, optional) – Number of choices. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_tagging_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a token classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – Specifies the names of the existing adapters whose weights should be averaged.

  • +
  • weights (List[float], optional) – Weights for the adapters in adapter_list. If not given, all adapters are weighted equally; with normalize_weights=True (the default), the given weights are normalized to sum to 1.

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, attention_mask=None, token_type_ids=None, position_ids=None, inputs_embeds=None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

Define the computation performed at every call.

+

Should be overridden by all subclasses.

+
+

Note

+

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.

+
+
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
    +
  1. If a head_name is passed, the head with the given name is used.

  2. If the forward call is executed within an AdapterSetup context, the head configuration is read from the context.

  3. If the active_head property is set, the head configuration is read from there.
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the model class must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. Default is None. If the name is None, the labels of the active head are returned.

  • +
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose id2label mapping should be returned. Default is None. If the name is None, the mapping of the active head is returned.

  • +
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.saved_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, it will be either the default adapter config for the requested adapter, if specified, or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
+ +
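
Example (a minimal sketch; the Hub identifier below is a placeholder, and model is assumed to be an adapter model instance):

    # load an adapter from the Hub (or a local path) and activate it directly
    adapter_name = model.load_adapter(
        "AdapterHub/my-task-adapter",  # placeholder identifier or local directory
        source="hf",
        set_active=True,
    )
    print(adapter_name)  # the name under which the adapter was registered
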
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
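
A hedged usage sketch (repository, adapter, and dataset names are placeholders; pushing requires a valid Hugging Face login token):

    url = model.push_adapter_to_hub(
        "my-adapter-repo",    # placeholder repository name
        "my_adapter",         # adapter previously added to the model
        datasets_tag="imdb",  # placeholder dataset identifier for the adapter card
    )
    print(url)  # URL of the adapter repository on the Model Hub
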
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
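
merge_adapter() and reset_adapter() are typically used as a pair for LoRA modules: merging removes the adapter's inference-time overhead, resetting restores the unmerged weights. A sketch, assuming a LoRA adapter named "my_lora" was added beforehand:

    model.merge_adapter("my_lora")  # fold the LoRA weights into the Transformer weights
    # ... run inference without any adapter overhead ...
    model.reset_adapter()           # undo the merge and restore the original weights
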
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
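
A round-trip sketch (the path and adapter name are placeholders):

    model.save_adapter("./checkpoints/my_adapter", "my_adapter", with_head=True)
    # later, possibly in another process:
    model.load_adapter("./checkpoints/my_adapter", load_as="my_adapter", set_active=True)
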
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful in distributed training (e.g. on TPUs) where this function needs to be called on all processes; in that case, set is_main_process=True only on the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful in distributed training (e.g. on TPUs) when one needs to replace torch.save with another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). We default it to 5GB so that models can run easily on free-tier Google Colab instances without CPU OOM issues.

    +

    Warning:

    +

    If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard +which will be bigger than max_shard_size.

    +


    +

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all keys of the adapters’ state dict need to be prepended with base_model.model. Advanced users can disable this behaviour by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional key word arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() call. If no adapter with the given name is found, no module of the respective type will be activated. If the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
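
Besides a single adapter name, composition blocks can be passed. A sketch using the Stack block from adapters.composition (the adapter names "lang" and "task" are placeholders for adapters added beforehand):

    from adapters.composition import Stack

    model.set_active_adapters("task")                 # activate a single adapter
    model.set_active_adapters(Stack("lang", "task"))  # activate a stacked composition
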
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, TorchScript can’t handle parameter sharing, so we clone the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into training mode for the given adapters. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
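
train_adapter() is the usual preparation step before adapter training: it freezes all base model weights and unfreezes only the given adapters. A sketch (the adapter name is a placeholder):

    model.add_adapter("my_adapter")
    model.train_adapter("my_adapter")        # freeze the base model, unfreeze the adapter
    model.set_active_adapters("my_adapter")
    # the model can now be passed to a trainer; only adapter weights receive gradients
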
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for adapter fusion, determined by a list of adapter names. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for adapter fusion, determined by a list of adapter names.

+
+ +
+ +
+
\ No newline at end of file diff --git a/classes/models/distilbert.html b/classes/models/distilbert.html new file mode 100644 index 0000000000..dfc5d12ff4 --- /dev/null +++ b/classes/models/distilbert.html @@ -0,0 +1,1148 @@ +DistilBERT — AdapterHub documentation
+
+
+
+ +
+

DistilBERT

+

The DistilBERT model was proposed in the blog post Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT, and the paper DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than bert-base-uncased and runs 60% faster while preserving over 95% of BERT’s performance as measured on the GLUE language understanding benchmark.

+
+

DistilBertAdapterModel

+
+
+class adapters.DistilBertAdapterModel(config)
+

DistilBert Model transformer with the option to add multiple flexible heads on top.

+

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

+

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

+
+
Parameters
+

config ([DistilBertConfig]) – Model configuration class with all the parameters of the model. +Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
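
Instantiation follows the usual from_pretrained pattern, for example:

    from adapters import DistilBertAdapterModel

    model = DistilBertAdapterModel.from_pretrained("distilbert-base-uncased")
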
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (which do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
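
A sketch adding a bottleneck adapter and inspecting it ("seq_bn" is the identifier of the sequential bottleneck configuration in recent adapters releases; the adapter name is a placeholder):

    model.add_adapter("my_adapter", config="seq_bn", set_active=True)
    print(model.adapter_summary())  # name, architecture, #param, %param, active, train
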
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
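
A sketch fusing two previously added adapters via a Fuse composition block (adapter names are placeholders):

    from adapters.composition import Fuse

    model.add_adapter("adapter_a")
    model.add_adapter("adapter_b")
    model.add_adapter_fusion(Fuse("adapter_a", "adapter_b"), set_active=True)
    model.train_adapter_fusion(Fuse("adapter_a", "adapter_b"))  # train only the fusion layer
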
+
+add_causal_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a causal language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
+ +
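
A sketch adding a binary classification head; giving the head the same name as an adapter lets both be activated together (names and labels are placeholders):

    model.add_classification_head(
        "my_adapter",  # same name as an existing adapter
        num_labels=2,
        id2label={0: "negative", 1: "positive"},
    )
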
+
+add_dependency_parsing_head(head_name, num_labels=2, overwrite_ok=False, id2label=None)
+

Adds a biaffine dependency parsing head on top of the model. The parsing head uses the architecture described +in “Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation” (Glavaš +& Vulić, 2021) (https://arxiv.org/pdf/2008.06788.pdf).

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of labels. Defaults to 2.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • id2label (dict, optional) – Mapping from label ids to labels. Defaults to None.

  • +
+
+
+
+ +
+
+add_masked_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a masked language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_multiple_choice_head(head_name, num_choices=2, layers=2, activation_function='tanh', overwrite_ok=False, id2label=None, use_pooler=False)
+

Adds a multiple choice head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_choices (int, optional) – Number of choices. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_tagging_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a token classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – The names of the existing adapters whose weights should be averaged; the optional weights argument assigns a weight to each adapter (normalized by default).

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
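
A sketch averaging two existing adapters into a new module (names and weights are placeholders):

    model.average_adapter(
        "avg_adapter",               # name of the new, averaged adapter
        ["adapter_a", "adapter_b"],  # existing adapters to average
        weights=[0.7, 0.3],          # normalized by default (normalize_weights=True)
        set_active=True,
    )
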
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, attention_mask=None, head_mask=None, inputs_embeds=None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

The [DistilBertAdapterModel] forward method, overrides the __call__ special method.

+

Note:

+

Although the recipe for the forward pass needs to be defined within this function, one should call the [Module] instance instead of this method, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

+


+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, num_choices)) –

    Indices of input sequence tokens in the vocabulary.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

  • +
  • attention_mask (torch.FloatTensor of shape (batch_size, num_choices), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the self-attention modules. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • inputs_embeds (torch.FloatTensor of shape (batch_size, num_choices, hidden_size), optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation. This +is useful if you want more control over how to convert input_ids indices into associated vectors than the +model’s internal embedding lookup matrix.

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
+
+
+
+ +
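
A sketch of a complete forward pass through a named head (the tokenizer checkpoint and head name are placeholders; model is assumed to be a DistilBertAdapterModel with a head named "my_adapter"):

    import torch
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    inputs = tokenizer("Adapters are parameter-efficient.", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, head="my_adapter")
    print(outputs.logits.shape)
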
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
    +
  1. If a head_name is passed, the head with the given name is used.

  2. If the forward call is executed within an AdapterSetup context, the head configuration is read from the context.

  3. If the active_head property is set, the head configuration is read from there.
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. Default is None. If the name is None, the labels of the active head are returned.

  • +
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. Default is None. If the name is None, the id2label mapping of the active head is returned.

  • +
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+get_position_embeddings() Embedding
+

Returns the position embeddings.

+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.save_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, this will be either the default adapter config for the requested adapter (if available) or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
+
+resize_position_embeddings(new_num_position_embeddings: int)
+

Resizes position embeddings of the model if new_num_position_embeddings != +config.max_position_embeddings.

+
+
Parameters
+

new_num_position_embeddings (int) – The desired number of position embeddings. If position embeddings are learned, increasing the size will add newly initialized vectors at the end, whereas reducing the size will remove vectors from the end. If position embeddings are not learned (e.g. sinusoidal position embeddings), increasing the size will add correct vectors at the end following the position encoding algorithm, whereas reducing the size will remove vectors from the end.

+
+
+
+ +
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful in distributed training (e.g. on TPUs) where this function needs to be called on all processes; in that case, set is_main_process=True only on the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful in distributed training (e.g. on TPUs) when one needs to replace torch.save with another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). We default it to 5GB so that models can run easily on free-tier Google Colab instances without CPU OOM issues.

    +

    Warning:

    +

    If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard +which will be bigger than max_shard_size.

    +


    +

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all keys of the adapters’ state dict need to be prepended with base_model.model. Advanced users can disable this behaviour by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional key word arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() call. If no adapter with the given name is found, no module of the respective type will be activated. If the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, TorchScript can’t handle parameter sharing, so we clone the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into training mode for the given adapters. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for adapter fusion, determined by a list of adapter names. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for adapter fusion, determined by a list of adapter names.

+
+ +
+ +
+
\ No newline at end of file diff --git a/classes/models/electra.html b/classes/models/electra.html new file mode 100644 index 0000000000..9f01c0497e --- /dev/null +++ b/classes/models/electra.html @@ -0,0 +1,1161 @@ +ELECTRA — AdapterHub documentation
+
+
+
+ +
+

ELECTRA

+

The ELECTRA model was proposed in the paper ELECTRA: Pre-training Text Encoders as Discriminators Rather Than +Generators. ELECTRA is a new pretraining approach which trains two +transformer models: the generator and the discriminator. The generator’s role is to replace tokens in a sequence, and +is therefore trained as a masked language model. The discriminator, which is the model we’re interested in, tries to +identify which tokens were replaced by the generator in the sequence.

+

The abstract from the paper is the following:

+

Masked language modeling (MLM) pretraining methods such as BERT corrupt the input by replacing some tokens with [MASK] +and then train a model to reconstruct the original tokens. While they produce good results when transferred to +downstream NLP tasks, they generally require large amounts of compute to be effective. As an alternative, we propose a +more sample-efficient pretraining task called replaced token detection. Instead of masking the input, our approach +corrupts it by replacing some tokens with plausible alternatives sampled from a small generator network. Then, instead +of training a model that predicts the original identities of the corrupted tokens, we train a discriminative model that +predicts whether each token in the corrupted input was replaced by a generator sample or not. Thorough experiments +demonstrate this new pretraining task is more efficient than MLM because the task is defined over all input tokens +rather than just the small subset that was masked out. As a result, the contextual representations learned by our +approach substantially outperform the ones learned by BERT given the same model size, data, and compute. The gains are +particularly strong for small models; for example, we train a model on one GPU for 4 days that outperforms GPT (trained +using 30x more compute) on the GLUE natural language understanding benchmark. Our approach also works well at scale, +where it performs comparably to RoBERTa and XLNet while using less than 1/4 of their compute and outperforms them when +using the same amount of compute.

+
+

ElectraAdapterModel

+
+
+class adapters.ElectraAdapterModel(config)
+

Electra Model transformer with the option to add multiple flexible heads on top.

+

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

+

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

+
+
Parameters
+

config ([ElectraConfig]) – Model configuration class with all the parameters of the model. +Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
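
A compact end-to-end sketch combining the methods documented below (the adapter and head names are placeholders; "google/electra-base-discriminator" is the standard ELECTRA base checkpoint, and "seq_bn" is assumed to be the sequential bottleneck configuration identifier):

    from adapters import ElectraAdapterModel

    model = ElectraAdapterModel.from_pretrained("google/electra-base-discriminator")
    model.add_adapter("my_task", config="seq_bn")
    model.add_classification_head("my_task", num_labels=2)
    model.train_adapter("my_task")  # freeze ELECTRA weights, train only the adapter
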
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (which do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the class must inherit from one that implements this method to preclude infinite recursion.

+
+ +
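A minimal sketch; "seq_bn" is assumed to be one of the pre-defined bottleneck configuration identifiers shipped with recent adapters releases:

```python
# Add a bottleneck adapter and activate it in one call.
model.add_adapter("qa_adapter", config="seq_bn", set_active=True)

# Re-adding under the same name raises an exception unless overwrite_ok=True.
model.add_adapter("qa_adapter", config="seq_bn", overwrite_ok=True)
```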
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
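The three accepted forms of adapter_names are interchangeable; a sketch using the Fuse composition block (the fused adapters must already exist on the model; names are illustrative):

```python
from adapters.composition import Fuse

model.add_adapter("task_a")
model.add_adapter("task_b")

# Equivalent ways of adding the same AdapterFusion layer:
model.add_adapter_fusion(Fuse("task_a", "task_b"))  # composition block
# model.add_adapter_fusion(["task_a", "task_b"])    # list of names
# model.add_adapter_fusion("task_a,task_b")         # comma-separated string
```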
+
+add_causal_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a causal language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
+ +
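A sketch adding a binary sentiment head whose name matches the adapter from the earlier examples (the label mapping is illustrative):

```python
model.add_classification_head(
    "sentiment",
    num_labels=2,
    id2label={0: "negative", 1: "positive"},
)
```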
+
+add_dependency_parsing_head(head_name, num_labels=2, overwrite_ok=False, id2label=None)
+

Adds a biaffine dependency parsing head on top of the model. The parsing head uses the architecture described +in “Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation” (Glavaš +& Vulić, 2021) (https://arxiv.org/pdf/2008.06788.pdf).

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of labels. Defaults to 2.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • id2label (dict, optional) – Mapping from label ids to labels. Defaults to None.

  • +
+
+
+
+ +
+
+add_masked_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a masked language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_multiple_choice_head(head_name, num_choices=2, layers=2, activation_function='tanh', overwrite_ok=False, id2label=None, use_pooler=False)
+

Adds a multiple choice head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_choices (int, optional) – Number of choices. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_tagging_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a token classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – Names of the existing adapters whose weights should be averaged. Relative weights can be passed via the weights argument; if omitted, all adapters are weighted equally. With normalize_weights=True (the default), the weights are normalized to sum to 1.

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
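A sketch averaging two existing adapters into a new one (assumes "task_a" and "task_b" from the fusion example; with normalize_weights left at its default, the weights are normalized to sum to 1):

```python
model.average_adapter(
    "task_avg",
    ["task_a", "task_b"],
    weights=[0.7, 0.3],
    set_active=True,
)
```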
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, attention_mask=None, token_type_ids=None, position_ids=None, head_mask=None, inputs_embeds=None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

The [ElectraAdapterModel] forward method, overrides the __call__ special method.

+

<Tip>

+

Although the recipe for forward pass needs to be defined within this function, one should call the [Module] +instance afterwards instead of this since the former takes care of running the pre and post processing steps while +the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

  • +
  • attention_mask (torch.FloatTensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • token_type_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Segment token indices to indicate first and second portions of the inputs. Indices are selected in [0, +1]:

    +
      +
    • 0 corresponds to a sentence A token,

    • +
    • 1 corresponds to a sentence B token.

    • +
    +

    [What are token type IDs?](../glossary#token-type-ids)

    +

  • +
  • position_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Indices of positions of each input sequence tokens in the position embeddings. Selected in the range [0, +config.max_position_embeddings - 1].

    +

    [What are position IDs?](../glossary#position-ids)

    +

  • +
  • head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the self-attention modules. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation. This +is useful if you want more control over how to convert input_ids indices into associated vectors than the +model’s internal embedding lookup matrix.

  • +
  • encoder_hidden_states (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Sequence of hidden-states at the output of the last layer of the encoder. Used in the cross-attention if +the model is configured as a decoder.

  • +
  • encoder_attention_mask (torch.FloatTensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on the padding token indices of the encoder input. This mask is used in +the cross-attention if the model is configured as a decoder. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
+
+
+
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
  1. If a head_name is passed, the head with the given name is used.

  2. If the forward call is executed within an AdapterSetup context, the head configuration is read from the context.

  3. If the active_head property is set, the head configuration is read from there.
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
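The priority order above can be exercised through the AdapterSetup context; a sketch, assuming AdapterSetup accepts a head_setup argument in your installed version:

```python
from transformers import AutoTokenizer
from adapters import AdapterSetup

tokenizer = AutoTokenizer.from_pretrained("google/electra-base-discriminator")
inputs = tokenizer("A short example input.", return_tensors="pt")

# head_name (priority 1) beats the context (priority 2),
# which beats active_head (priority 3).
with AdapterSetup("sentiment", head_setup="sentiment"):
    outputs = model(**inputs)  # head resolved from the context
```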
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

Returns the adapter module(s) with the given name. If self.base_model is self, the class must inherit from one that implements this method to preclude infinite recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. If None (default), the labels of the active head are returned.
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. If None (default), the labels of the active head are returned.
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.saved_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, this will be either the default adapter config for the requested adapter (if one is specified) or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
+ +
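A sketch loading a pre-trained adapter from the Hugging Face Model Hub under a local name (the repository identifier is illustrative; substitute one that matches your base model):

```python
name = model.load_adapter(
    "AdapterHub/some-task-adapter",  # illustrative hub id
    load_as="my_task",
    set_active=True,
)
print(name)  # "my_task"
```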
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
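A sketch of publishing an adapter (repository name and tags are illustrative; a valid Hugging Face token is required):

```python
url = model.push_adapter_to_hub(
    "my-awesome-adapter",   # repo to create or update
    "sentiment",            # adapter to upload
    datasets_tag="sst2",    # or adapterhub_tag="sentiment/sst-2"
)
```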
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
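merge_adapter() and reset_adapter() form a pair; a sketch, assuming "lora" is a pre-defined LoRA configuration identifier in your installed version:

```python
model.add_adapter("lora_a", config="lora")

# Fold the LoRA weights into the Transformer weights for faster inference.
model.merge_adapter("lora_a")

# ... run inference ...

# Restore the original, unmerged weights.
model.reset_adapter()
```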
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
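A save/reload round trip, as a minimal sketch (the directory path is illustrative):

```python
# Save the adapter together with its prediction head.
model.save_adapter("./saved/sentiment", "sentiment", with_head=True)

# Later, or in another process, load it back under the same name.
model.load_adapter("./saved/sentiment", load_as="sentiment", set_active=True)
```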
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training like +TPUs and need to call this function on all processes. In this case, set is_main_process=True only on +the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training like TPUs when one +need to replace torch.save by another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). We default it to 5GB so that models can run easily on free-tier Google Colab instances without CPU OOM issues.

    +

    <Tip warning={true}>

    +

    If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard +which will be bigger than max_shard_size.

    +

    </Tip>

    +

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all keys of the adapters’ state dict need to be prepended with base_model.model. Advanced users can disable this behaviour by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional key word arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
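A sketch of activating a single adapter, a composition, and deactivating everything (Stack is assumed to be importable from adapters.composition):

```python
from adapters.composition import Stack

model.set_active_adapters("sentiment")                # single adapter
model.set_active_adapters(Stack("task_b", "task_a"))  # stacked composition
model.set_active_adapters(None)                       # deactivate all adapters
```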
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, parameter sharing can’t be handled, so we clone the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into the mode for training the given adapters. If self.base_model is self, the class must inherit from one that implements this method to preclude infinite recursion.

+
+ +
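The typical training setup, as a sketch: freeze all model weights, then enable gradients only for the named adapter:

```python
model.train_adapter("sentiment")

# Only adapter (and head) parameters should now require gradients.
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable parameter tensors")
```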
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into the mode for training adapter fusion, determined by a list of adapter names. If self.base_model is self, the class must inherit from one that implements this method to preclude infinite recursion.

+
+ +
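A sketch for the second, fusion-training stage; the single adapters stay frozen unless unfreeze_adapters=True:

```python
from adapters.composition import Fuse

# Train only the fusion weights on top of the two frozen adapters.
model.train_adapter_fusion(Fuse("task_a", "task_b"))
```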
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into the mode for training adapter fusion, determined by a list of adapter names.

+
+ +
+ +
+
\ No newline at end of file
diff --git a/classes/models/encoderdecoder.html b/classes/models/encoderdecoder.html
new file mode 100644
index 0000000000..bdcba49522
--- /dev/null
+++ b/classes/models/encoderdecoder.html
@@ -0,0 +1,552 @@

Encoder Decoder Models — AdapterHub documentation

Encoder Decoder Models

+
+

Note

+
+
Adapter implementation notes:
  • Unlike other models, an explicit EncoderDecoderAdapterModel for the EncoderDecoderModel has not been implemented. This decision was made due to the lack of support for the EncoderDecoderModel in Hugging Face Transformers’ AutoModel class. As a result, our AutoAdapterModel class would not support the EncoderDecoderAdapterModel either. Thus, to use an EncoderDecoderModel with Adapters, follow these steps (see the sketch after this list):

    1. First, create an EncoderDecoderModel instance, for example, using model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "bert-base-uncased").

    2. Next, convert this model to an adapter model using the adapters.init(model) function.

  • Adapters can be added to both the encoder and the decoder. As usual, the leave_out parameter can be used to specify the layers where adapters are to be added. For the EncoderDecoderModel the layer IDs are counted separately over the encoder and decoder, starting from 0. Thus, specifying leave_out=[0,1] will leave out the first and second layer of the encoder and the first and second layer of the decoder.
+
+
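A minimal sketch of the two steps from the note above; SeqBnConfig is assumed to be the bottleneck configuration class exposed by the adapters package:

```python
from transformers import EncoderDecoderModel
import adapters
from adapters import SeqBnConfig

# Step 1: create a plain EncoderDecoderModel from two pretrained checkpoints.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)

# Step 2: enable adapter support on the existing instance.
adapters.init(model)

# leave_out counts encoder and decoder layers separately,
# so [0, 1] skips the first two layers of each.
model.add_adapter("summarization", config=SeqBnConfig(leave_out=[0, 1]))
model.set_active_adapters("summarization")
```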
+

The EncoderDecoderModel can be used to initialize a sequence-to-sequence model with any +pretrained autoencoding model as the encoder and any pretrained autoregressive model as the decoder.

+

The effectiveness of initializing sequence-to-sequence models with pretrained checkpoints for sequence generation tasks +was shown in Leveraging Pre-trained Checkpoints for Sequence Generation Tasks by +Sascha Rothe, Shashi Narayan, Aliaksei Severyn.

+

After such an EncoderDecoderModel has been trained/fine-tuned, it can be saved/loaded just like +any other models (see the examples for more information).

+

An application of this architecture could be to leverage two pretrained BertModel as the encoder +and decoder for a summarization model as was shown in: Text Summarization with Pretrained Encoders by Yang Liu and Mirella Lapata.

+
+

EncoderDecoderModel

+
+
+class transformers.EncoderDecoderModel(config: Optional[PretrainedConfig] = None, encoder: Optional[PreTrainedModel] = None, decoder: Optional[PreTrainedModel] = None)
+

This class can be used to initialize a sequence-to-sequence model with any pretrained autoencoding model as the +encoder and any pretrained autoregressive model as the decoder. The encoder is loaded via +[~AutoModel.from_pretrained] function and the decoder is loaded via [~AutoModelForCausalLM.from_pretrained] +function. Cross-attention layers are automatically added to the decoder and should be fine-tuned on a downstream +generative task, like summarization.

+

The effectiveness of initializing sequence-to-sequence models with pretrained checkpoints for sequence generation tasks was shown in [Leveraging Pre-trained Checkpoints for Sequence Generation Tasks](https://arxiv.org/abs/1907.12461) by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.

+

After such an Encoder Decoder model has been trained/fine-tuned, it can be saved/loaded just like any other models +(see the examples for more information).

+

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the +library implements for all its model (such as downloading or saving, resizing the input embeddings, pruning heads +etc.)

+

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. +Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matter related to general usage +and behavior.

+
+
Parameters
+

config ([EncoderDecoderConfig]) – Model configuration class with all the parameters of the model. +Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+

[EncoderDecoderModel] is a generic model class that will be instantiated as a transformer architecture with one of the base model classes of the library as encoder and another one as decoder when created with the [~transformers.AutoModel.from_pretrained] class method for the encoder and the [~transformers.AutoModelForCausalLM.from_pretrained] class method for the decoder.

+
+
+forward(input_ids: Optional[LongTensor] = None, attention_mask: Optional[FloatTensor] = None, decoder_input_ids: Optional[LongTensor] = None, decoder_attention_mask: Optional[BoolTensor] = None, encoder_outputs: Optional[Tuple[FloatTensor]] = None, past_key_values: Optional[Tuple[Tuple[FloatTensor]]] = None, inputs_embeds: Optional[FloatTensor] = None, decoder_inputs_embeds: Optional[FloatTensor] = None, labels: Optional[LongTensor] = None, use_cache: Optional[bool] = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None, **kwargs) Union[Tuple, Seq2SeqLMOutput]
+

The [EncoderDecoderModel] forward method, overrides the __call__ special method.

+

<Tip>

+

Although the recipe for forward pass needs to be defined within this function, one should call the [Module] +instance afterwards instead of this since the former takes care of running the pre and post processing steps while +the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary.

    +

    Indices can be obtained using [PreTrainedTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

  • +
  • attention_mask (torch.FloatTensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • decoder_input_ids (torch.LongTensor of shape (batch_size, target_sequence_length), optional) –

    Indices of decoder input sequence tokens in the vocabulary.

    +

    Indices can be obtained using [PreTrainedTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

    If past_key_values is used, optionally only the last decoder_input_ids have to be input (see +past_key_values).

    +

    For training, decoder_input_ids are automatically created by the model by shifting the labels to the +right, replacing -100 by the pad_token_id and prepending them with the decoder_start_token_id.

    +

  • +
  • decoder_attention_mask (torch.BoolTensor of shape (batch_size, target_sequence_length), optional) – Default behavior: generate a tensor that ignores pad tokens in decoder_input_ids. Causal mask will also +be used by default.

  • +
  • encoder_outputs (tuple(torch.FloatTensor), optional) – This tuple must consist of (last_hidden_state, optional: hidden_states, optional: attentions) +last_hidden_state (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size)) is a tensor +of hidden-states at the output of the last layer of the encoder. Used in the cross-attention of the +decoder.

  • +
  • past_key_values (tuple(tuple(torch.FloatTensor)) of length config.n_layers with each tuple having 4 tensors of shape (batch_size, num_heads, sequence_length - 1, embed_size_per_head)) –

    Contains precomputed key and value hidden states of the attention blocks. Can be used to speed up decoding.

    +

    If past_key_values are used, the user can optionally input only the last decoder_input_ids (those that +don’t have their past key value states given to this model) of shape (batch_size, 1) instead of all +decoder_input_ids of shape (batch_size, sequence_length).

    +

  • +
  • inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation. This +is useful if you want more control over how to convert input_ids indices into associated vectors than the +model’s internal embedding lookup matrix.

  • +
  • decoder_inputs_embeds (torch.FloatTensor of shape (batch_size, target_sequence_length, hidden_size), optional) – Optionally, instead of passing decoder_input_ids you can choose to directly pass an embedded +representation. This is useful if you want more control over how to convert decoder_input_ids indices +into associated vectors than the model’s internal embedding lookup matrix.

  • +
  • labels (torch.LongTensor of shape (batch_size, sequence_length), optional) – Labels for computing the masked language modeling loss for the decoder. Indices should be in [-100, 0, +…, config.vocab_size] (see input_ids docstring) Tokens with indices set to -100 are ignored +(masked), the loss is only computed for the tokens with labels in [0, …, config.vocab_size]

  • +
  • use_cache (bool, optional) – If set to True, past_key_values key value states are returned and can be used to speed up decoding (see +past_key_values).

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – If set to True, the model will return a [~utils.Seq2SeqLMOutput] instead of a plain tuple.

  • +
  • kwargs (optional) –

    Remaining dictionary of keyword arguments. Keyword arguments come in two flavors:

    +
      +
    • Without a prefix which will be input as **encoder_kwargs for the encoder forward function.

    • +
    • With a decoder_ prefix which will be input as **decoder_kwargs for the decoder forward function.

    • +
    +

Returns

[transformers.modeling_outputs.Seq2SeqLMOutput] or tuple(torch.FloatTensor): A [transformers.modeling_outputs.Seq2SeqLMOutput] or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration ([EncoderDecoderConfig]) and inputs.

    +
      +
    • loss (torch.FloatTensor of shape (1,), optional, returned when labels is provided) – Language modeling loss.

    • +
    • logits (torch.FloatTensor of shape (batch_size, sequence_length, config.vocab_size)) – Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax).

    • +
    • past_key_values (tuple(tuple(torch.FloatTensor)), optional, returned when use_cache=True is passed or when config.use_cache=True) – Tuple of tuple(torch.FloatTensor) of length config.n_layers, with each tuple having 2 tensors of shape +(batch_size, num_heads, sequence_length, embed_size_per_head)) and 2 additional tensors of shape +(batch_size, num_heads, encoder_sequence_length, embed_size_per_head).

      +

      Contains pre-computed hidden-states (key and values in the self-attention blocks and in the cross-attention +blocks) that can be used (see past_key_values input) to speed up sequential decoding.

      +
    • +
    • decoder_hidden_states (tuple(torch.FloatTensor), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) – Tuple of torch.FloatTensor (one for the output of the embeddings, if the model has an embedding layer, + +one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).

      +

      Hidden-states of the decoder at the output of each layer plus the initial embedding outputs.

      +
    • +
    • decoder_attentions (tuple(torch.FloatTensor), optional, returned when output_attentions=True is passed or when config.output_attentions=True) – Tuple of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, sequence_length, +sequence_length).

      +

      Attentions weights of the decoder, after the attention softmax, used to compute the weighted average in the +self-attention heads.

      +
    • +
    • cross_attentions (tuple(torch.FloatTensor), optional, returned when output_attentions=True is passed or when config.output_attentions=True) – Tuple of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, sequence_length, +sequence_length).

      +

      Attentions weights of the decoder’s cross-attention layer, after the attention softmax, used to compute the +weighted average in the cross-attention heads.

      +
    • +
    • encoder_last_hidden_state (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Sequence of hidden-states at the output of the last layer of the encoder of the model.

    • +
    • encoder_hidden_states (tuple(torch.FloatTensor), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) – Tuple of torch.FloatTensor (one for the output of the embeddings, if the model has an embedding layer, + +one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).

      +

      Hidden-states of the encoder at the output of each layer plus the initial embedding outputs.

      +
    • +
    • encoder_attentions (tuple(torch.FloatTensor), optional, returned when output_attentions=True is passed or when config.output_attentions=True) – Tuple of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, sequence_length, +sequence_length).

      +

      Attentions weights of the encoder, after the attention softmax, used to compute the weighted average in the +self-attention heads.

      +
    • +
    +

Examples:

```python
>>> from transformers import EncoderDecoderModel, BertTokenizer
>>> import torch

>>> tokenizer = BertTokenizer.from_pretrained("google-bert/bert-base-uncased")
>>> model = EncoderDecoderModel.from_encoder_decoder_pretrained(
...     "google-bert/bert-base-uncased", "google-bert/bert-base-uncased"
... )  # initialize Bert2Bert from pre-trained checkpoints

>>> # training
>>> model.config.decoder_start_token_id = tokenizer.cls_token_id
>>> model.config.pad_token_id = tokenizer.pad_token_id
>>> model.config.vocab_size = model.config.decoder.vocab_size

>>> input_ids = tokenizer(..., return_tensors="pt").input_ids
>>> labels = tokenizer(..., return_tensors="pt").input_ids
>>> outputs = model(input_ids=input_ids, labels=labels)
>>> loss, logits = outputs.loss, outputs.logits

>>> # save and load from pretrained
>>> model.save_pretrained(...)
>>> model = EncoderDecoderModel.from_pretrained(...)

>>> # generation
>>> generated = model.generate(input_ids)
```
+
+
+
+ +
+
+classmethod from_encoder_decoder_pretrained(encoder_pretrained_model_name_or_path: Optional[str] = None, decoder_pretrained_model_name_or_path: Optional[str] = None, *model_args, **kwargs) PreTrainedModel
+

Instantiate an encoder and a decoder from one or two base classes of the library from pretrained model +checkpoints.

+

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train +the model, you need to first set it back in training mode with model.train().

+
+
Params:
+
encoder_pretrained_model_name_or_path (str, optional):

Information necessary to initiate the encoder. Can be either:

+
+
    +
  • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.

  • +
  • A path to a directory containing model weights saved using +[~PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.

  • +
  • A path or url to a tensorflow index checkpoint file (e.g, ./tf_model/model.ckpt.index). In +this case, from_tf should be set to True and a configuration object should be provided as +config argument. This loading path is slower than converting the TensorFlow checkpoint in a +PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • +
+
+
+
decoder_pretrained_model_name_or_path (str, optional, defaults to None):

Information necessary to initiate the decoder. Can be either:

+
+
    +
  • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.

  • +
  • A path to a directory containing model weights saved using +[~PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.

  • +
  • A path or url to a tensorflow index checkpoint file (e.g, ./tf_model/model.ckpt.index). In +this case, from_tf should be set to True and a configuration object should be provided as +config argument. This loading path is slower than converting the TensorFlow checkpoint in a +PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • +
+
+
+
model_args (remaining positional arguments, optional):

All remaining positional arguments will be passed to the underlying model’s __init__ method.

+
+
kwargs (remaining dictionary of keyword arguments, optional):

Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., +output_attentions=True).

+
    +
  • To update the encoder configuration, use the prefix encoder_ for each configuration parameter.

  • +
  • To update the decoder configuration, use the prefix decoder_ for each configuration parameter.

  • +
  • To update the parent model configuration, do not use a prefix for each configuration parameter.

  • +
+

Behaves differently depending on whether a config is provided or automatically loaded.

+
+
+
+
+

Example:

+

```python +>>> from transformers import EncoderDecoderModel

+
>>> # initialize a bert2bert from two pretrained BERT models. Note that the cross-attention layers will be randomly initialized
+>>> model = EncoderDecoderModel.from_encoder_decoder_pretrained("google-bert/bert-base-uncased", "google-bert/bert-base-uncased")
+>>> # saving model after fine-tuning
+>>> model.save_pretrained("./bert2bert")
+>>> # load fine-tuned model
+>>> model = EncoderDecoderModel.from_pretrained("./bert2bert")
+```
+
+
+
+ +
+ +
+
\ No newline at end of file
diff --git a/classes/models/gpt2.html b/classes/models/gpt2.html
new file mode 100644
index 0000000000..f68a7dce69
--- /dev/null
+++ b/classes/models/gpt2.html
@@ -0,0 +1,1058 @@

OpenAI GPT2 — AdapterHub documentation

OpenAI GPT2

+

OpenAI GPT-2 model was proposed in Language Models are Unsupervised Multitask Learners by Alec +Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei and Ilya Sutskever. It’s a causal (unidirectional) +transformer pretrained using language modeling on a very large corpus of ~40 GB of text data.

+

The abstract from the paper is the following:

+

GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset[1] of 8 million +web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some +text. The diversity of the dataset causes this simple goal to contain naturally occurring demonstrations of many tasks +across diverse domains. GPT-2 is a direct scale-up of GPT, with more than 10X the parameters and trained on more than +10X the amount of data.

+
+

GPT2AdapterModel

+
+
+class adapters.GPT2AdapterModel(config)
+

The GPT2 Model that allows the loading of different heads for different tasks. This enables flexible use of the models and adapters. Since this class does classification on the last token, it needs to know the position of the last token. If a pad_token_id is defined in the configuration, it finds the last token that is not a padding token in each row. If no pad_token_id is defined, it simply takes the last value in each row of the batch. Since it cannot guess the padding tokens when inputs_embeds are passed instead of input_ids, it does the same (takes the last value in each row of the batch).

+
+

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the +library implements for all its model (such as downloading or saving, resizing the input embeddings, pruning heads +etc.)

+

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. +Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matter related to general usage +and behavior.

+
+
Parameters:
+
config ([GPT2Config]): Model configuration class with all the parameters of the model.

Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
+
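Because classification happens on the last non-padding token, it helps to define a pad token before adding heads; a minimal sketch (names are illustrative):

```python
from adapters import GPT2AdapterModel

model = GPT2AdapterModel.from_pretrained("gpt2")
model.config.pad_token_id = model.config.eos_token_id  # GPT-2 has no pad token

model.add_adapter("clf", set_active=True)
model.add_classification_head("clf", num_labels=2)
```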
+
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (which do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the class must inherit from one that implements this method to preclude infinite recursion.

+
+ +
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
+
+add_causal_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a causal language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
+ +
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_tagging_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a token classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as the weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – Names of the existing adapters whose weights should be averaged.

  • weights (List[float], optional) – Averaging weights, one per adapter in adapter_list. If not given, all adapters are weighted equally. With normalize_weights=True (the default), the weights are normalized to sum to 1.

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, past_key_values=None, attention_mask=None, token_type_ids=None, position_ids=None, head_mask=None, inputs_embeds=None, encoder_hidden_states=None, encoder_attention_mask=None, use_cache=None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

Define the computation performed at every call.

+

Should be overridden by all subclasses.

+
+

Note

+

Although the recipe for forward pass needs to be defined within +this function, one should call the Module instance afterwards +instead of this since the former takes care of running the +registered hooks while the latter silently ignores them.

+
+
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
    +
  1. If a head_name is passed, the head with the given name is used.

  2. +
  3. If the forward call is executed within an AdapterSetup context, the head configuration is read from +the context.

  4. +
  5. If the active_head property is set, the head configuration is read from there.

  6. +
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the class must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. If None, the labels of the active head are returned.

  • +
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose id2label dict should be returned. If None, the dict of the active head is returned.

  • +
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.save_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, this will be either the default adapter config for the requested adapter (if one is specified) or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training like +TPUs and need to call this function on all processes. In this case, set is_main_process=True only on +the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training like TPUs when one +need to replace torch.save by another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Checkpoints shard will then be each of size +lower than this size. If expressed as a string, needs to be digits followed by a unit (like “5MB”). +We default it to 5GB in order for models to be able to run easily on free-tier google colab instances +without CPU OOM issues.

    +

    Tip: If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard, which will be bigger than max_shard_size.

    +

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with PEFT library, in case adapter weights are attached to the model, all +keys of the state dict of adapters needs to be pre-pended with base_model.model. Advanced users can +disable this behaviours by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional key word arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, can’t handle parameter sharing so we are cloning +the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into training mode for the given adapters. If self.base_model is self, the class must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for adapter fusion determined by a list of adapter names. If self.base_model is self, the class must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for adapter fusion determined by a list of adapter names.

+
+ +
+ +
+
+ + +
+ +
+ + +
+
+ +
+ +
+ +
+ + + + + + + + + + \ No newline at end of file diff --git a/classes/models/gptj.html b/classes/models/gptj.html new file mode 100644 index 0000000000..8a2108588e --- /dev/null +++ b/classes/models/gptj.html @@ -0,0 +1,1056 @@ + + + + + + + + + + + EleutherAI GPT-J-6B — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+ + + + + +
+ +
+ + + + + + + + + + + + + + + + + +
+ + + + +
+
+
+
+ +
+

EleutherAI GPT-J-6B

+

EleutherAI GPT-J-6B is an open source, autoregressive language model created by a group of researchers called +EleutherAI. It’s one of the most advanced alternatives to OpenAI’s GPT-3 and performs well on a wide array of +natural language tasks such as chat, summarization, and question answering, to name a few.

+

For a deeper dive, GPT-J is a transformer model trained using Ben Wang’s Mesh Transformer JAX. “GPT” is short for generative pre-trained transformer, “J” distinguishes this model from other GPT models, and “6B” represents the 6 billion trainable parameters.

+

The model consists of 28 layers with a model dimension of 4096, and a feedforward dimension of 16384. The model +dimension is split into 16 heads, each with a dimension of 256. Rotary Position Embedding (RoPE) is applied to +64 dimensions of each head. The model is trained with a tokenization vocabulary of 50257, using the same set of +BPEs as GPT-2/GPT-3.

+
+

GPTJAdapterModel

+
+
+class adapters.GPTJAdapterModel(config)
+

The GPTJ Model that allows the loading of different heads for different tasks. This enables a flexible use of the models and adapters. Since this class does classification on the last token, it needs to know the position of the last token. If a pad_token_id is defined in the configuration, it finds the last token that is not a padding token in each row. If no pad_token_id is defined, it simply takes the last value in each row of the batch. Since it cannot guess the padding tokens when inputs_embeds are passed instead of input_ids, it does the same (takes the last value in each row of the batch).

+
+

This model is a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) sub-class. Use +it as a regular PyTorch Module and refer to the PyTorch documentation for all matter related to general usage and +behavior.

+
+
Parameters:
+
config ([GPTJConfig]): Model configuration class with all the parameters of the model.

Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
+
+
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (that do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the class must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
+
+add_causal_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a causal language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
+ +
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_tagging_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a token classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as the weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – Names of the existing adapters whose weights should be averaged.

  • weights (List[float], optional) – Averaging weights, one per adapter in adapter_list. If not given, all adapters are weighted equally. With normalize_weights=True (the default), the weights are normalized to sum to 1.

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, past_key_values=None, attention_mask=None, token_type_ids=None, position_ids=None, head_mask=None, inputs_embeds=None, use_cache=None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

Define the computation performed at every call.

+

Should be overridden by all subclasses.

+
+

Note

+

Although the recipe for forward pass needs to be defined within +this function, one should call the Module instance afterwards +instead of this since the former takes care of running the +registered hooks while the latter silently ignores them.

+
+
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
    +
  1. If a head_name is passed, the head with the given name is used.

  2. +
  3. If the forward call is executed within an AdapterSetup context, the head configuration is read from +the context.

  4. +
  5. If the active_head property is set, the head configuration is read from there.

  6. +
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the class must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. If None, the labels of the active head are returned.

  • +
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose id2label dict should be returned. If None, the dict of the active head is returned.

  • +
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.save_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, this will be either the default adapter config for the requested adapter (if one is specified) or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training like +TPUs and need to call this function on all processes. In this case, set is_main_process=True only on +the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training like TPUs when one +need to replace torch.save by another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Checkpoints shard will then be each of size +lower than this size. If expressed as a string, needs to be digits followed by a unit (like “5MB”). +We default it to 5GB in order for models to be able to run easily on free-tier google colab instances +without CPU OOM issues.

    +

    Tip: If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard, which will be bigger than max_shard_size.

    +

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with PEFT library, in case adapter weights are attached to the model, all +keys of the state dict of adapters needs to be pre-pended with base_model.model. Advanced users can +disable this behaviours by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional key word arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, can’t handle parameter sharing so we are cloning +the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into training mode for the given adapters. If self.base_model is self, the class must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for adapter fusion determined by a list of adapter names. If self.base_model is self, the class must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for adapter fusion determined by a list of adapter names.

+
+ +
+ +
+
+ + +
+ +
+ + +
+
+ +
+ +
+ +
+ + + + + + + + + + \ No newline at end of file diff --git a/classes/models/llama.html b/classes/models/llama.html new file mode 100644 index 0000000000..0d96873968 --- /dev/null +++ b/classes/models/llama.html @@ -0,0 +1,1064 @@ + + + + + + + + + + + LLaMA — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

LLaMA

+
+

Note

+

Loading a LlamaForQuestionAnswering via [AutoAdapterModel](adapters.AutoAdapterModel) or via [LlamaAdapterModel](adapters.LlamaAdapterModel) does not load the head, even if the model is not sharded. Please load the base model first and then load the head subsequently. Note that for sharded models the head is never automatically loaded, as described here: [Auto Classes](auto.rst)

+
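A minimal sketch of the two-step loading described in the note above; the checkpoint identifier and head directory are placeholders, not guaranteed artifacts:

    from adapters import LlamaAdapterModel

    model = LlamaAdapterModel.from_pretrained("meta-llama/Llama-2-7b-hf")
    model.load_head("path/to/saved_qa_head")  # load the prediction head separately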
+

The LLaMA model was proposed in LLaMA: Open and Efficient Foundation Language Models by Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample. It is a collection of foundation language models ranging from 7B to 65B parameters.

+

The abstract from the paper is the following:

+

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

+
+

LlamaAdapterModel

+
+
+class adapters.LlamaAdapterModel(config)
+

The Llama Model that allows the loading of different heads for different tasks. This enables a flexible use of the models and adapters. Since this class does classification on the last token, it needs to know the position of the last token. If a pad_token_id is defined in the configuration, it finds the last token that is not a padding token in each row. If no pad_token_id is defined, it simply takes the last value in each row of the batch. Since it cannot guess the padding tokens when inputs_embeds are passed instead of input_ids, it does the same (takes the last value in each row of the batch).

+
+

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

+

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. +Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matter related to general usage +and behavior.

+
+
Parameters:
+
config ([LlamaConfig]):

Model configuration class with all the parameters of the model. Initializing with a config file does not +load the weights associated with the model, only the configuration. Check out the +[~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
+
+
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (which do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
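For example (assuming adapters have already been added to the model):

    print(model.adapter_summary())                 # human-readable table
    entries = model.adapter_summary(as_dict=True)  # machine-readable form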
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
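For example, a sketch of running one adapter in a different precision or on a different device than the base model (the adapter name is illustrative):

    import torch

    model.adapter_to("bn_task", device="cuda:0", dtype=torch.float32)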
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
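A hedged sketch; the config identifiers are assumptions based on the library's pre-defined configurations (e.g. a bottleneck and a LoRA setup), and the adapter names are illustrative:

    model.add_adapter("bn_task", config="seq_bn")
    model.add_adapter("lora_task", config="lora", set_active=True)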
+ +
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
+
+add_causal_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a causal language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
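For example, a sketch of a binary sentiment head; giving the head the same name as an adapter lets both be activated together (names illustrative):

    model.add_classification_head(
        "bn_task", num_labels=2, id2label={0: "negative", 1: "positive"}
    )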
+ +
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_tagging_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a token classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – The names of the existing adapters whose weights should be averaged; the optional weights argument assigns a weight to each of them (normalized by default, see normalize_weights).

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
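For example, a sketch creating a merged adapter as the weighted average of two existing ones (names illustrative):

    model.average_adapter("avg_task", ["bn_task", "other_task"], weights=[0.7, 0.3])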
+ +
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, attention_mask=None, position_ids=None, past_key_values=None, inputs_embeds=None, use_cache=None, cache_position: Optional[LongTensor] = None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

Define the computation performed at every call.

+

Should be overridden by all subclasses.

+
+

Note

+

Although the recipe for forward pass needs to be defined within +this function, one should call the Module instance afterwards +instead of this since the former takes care of running the +registered hooks while the latter silently ignores them.

+
+
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
    +
  1. If a head_name is passed, the head with the given name is used.

  2. +
  3. If the forward call is executed within an AdapterSetup context, the head configuration is read from +the context.

  4. +
  5. If the active_head property is set, the head configuration is read from there.

  6. +
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
  • head_name (str, optional) – The name of the head whose labels should be returned. Defaults to None, in which case the labels of the active head are returned.
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
  • head_name (str, optional) – The name of the head whose id2label mapping should be returned. Defaults to None, in which case the mapping of the active head is returned.
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.saved_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, will be either the default adapter config for the requested adapter (if specified) or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
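A hedged sketch; the Hub identifier below is a placeholder, not a guaranteed artifact:

    name = model.load_adapter("AdapterHub/some-task-adapter", set_active=True)
    print(name)  # the name under which the adapter was registered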
+ +
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the prediction head using this name. By default, the name with which the head was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
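For example, a sketch of merging a hypothetical LoRA module for inference and restoring the original weights afterwards (see reset_adapter() below):

    model.merge_adapter("lora_task")   # fold LoRA weights into the base weights
    # ... run inference without adapter overhead ...
    model.reset_adapter()              # undo the merge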
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
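A hedged sketch; repository, adapter, and dataset names are placeholders:

    url = model.push_adapter_to_hub(
        "my-llama-adapter",    # repo_name
        "bn_task",             # adapter_name
        datasets_tag="imdb",
    )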
+ +
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
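For example, a round-trip sketch with a placeholder directory and adapter name:

    model.save_adapter("./checkpoints/bn_task", "bn_task", with_head=True)
    restored = model.load_adapter("./checkpoints/bn_task")  # returns the adapter name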
+ +
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training like +TPUs and need to call this function on all processes. In this case, set is_main_process=True only on +the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training like TPUs when one +need to replace torch.save by another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). We default to 5GB so that models can run easily on free-tier Google Colab instances without CPU OOM issues.

    Note: if a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard, which will be bigger than max_shard_size.

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all keys of the adapters’ state dict need to be prepended with base_model.model. Advanced users can disable this behavior by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional keyword arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, parameter sharing cannot be handled, so we clone the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into mode for training the given adapters. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into mode for training an adapter fusion layer determined by a list of adapter names. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into mode for training an adapter fusion layer determined by a list of adapter names.

+
+ +
+ +
+
+ + + + + + + + + + \ No newline at end of file diff --git a/classes/models/mbart.html b/classes/models/mbart.html new file mode 100644 index 0000000000..359e55f5c7 --- /dev/null +++ b/classes/models/mbart.html @@ -0,0 +1,1109 @@ + + + + + + + + + + + MBart — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+

MBart

+

The MBart model was presented in Multilingual Denoising Pre-training for Neural Machine Translation by Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer.

+

According to the abstract, MBART is a sequence-to-sequence denoising auto-encoder pretrained on large-scale monolingual corpora in many languages using the BART objective. mBART is one of the first methods for pretraining a complete sequence-to-sequence model by denoising full texts in multiple languages, while previous approaches have focused only on the encoder, decoder, or reconstructing parts of the text.

+
+

MBartAdapterModel

+
+
+class adapters.MBartAdapterModel(config: MBartConfig, **kwargs)
+

MBART Model with the option to add multiple flexible prediction heads on top. This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

+

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. +Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matter related to general usage +and behavior.

+
+
Parameters
+

config ([MBartConfig]) – Model configuration class with all the parameters of the model. Initializing with a config file does not +load the weights associated with the model, only the configuration. Check out the +[~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (which do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
+
+add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
+ +
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_seq2seq_lm_head(head_name, overwrite_ok=False)
+

Adds a sequence-to-sequence language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
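A hedged sketch of a typical MBart setup combining an adapter with a sequence-to-sequence LM head (the adapter name is illustrative):

    from adapters import MBartAdapterModel

    mbart = MBartAdapterModel.from_pretrained("facebook/mbart-large-cc25")
    mbart.add_adapter("mt_task")
    mbart.add_seq2seq_lm_head("mt_task")
    mbart.train_adapter("mt_task")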
+
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – The names of the existing adapters whose weights should be averaged; the optional weights argument assigns a weight to each of them (normalized by default, see normalize_weights).

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, attention_mask=None, decoder_input_ids=None, decoder_attention_mask=None, head_mask=None, decoder_head_mask=None, cross_attn_head_mask=None, encoder_outputs=None, inputs_embeds=None, decoder_inputs_embeds=None, use_cache=None, output_attentions=None, output_hidden_states=None, return_dict=None, past_key_values=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

The [MBartAdapterModel] forward method overrides the __call__ special method.

+

Note

Although the recipe for the forward pass needs to be defined within this function, one should call the [Module] instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary. Padding will be ignored by default should you provide +it.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

  • +
  • attention_mask (torch.Tensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • decoder_input_ids (torch.LongTensor of shape (batch_size, target_sequence_length), optional) –

    Indices of decoder input sequence tokens in the vocabulary.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are decoder input IDs?](../glossary#decoder-input-ids)

    +

    MBart uses a specific language id token as the starting token for decoder_input_ids generation that +varies according to source and target language, e.g. 25004 for en_XX, and 25003 for de_DE. If +past_key_values is used, optionally only the last decoder_input_ids have to be input (see +past_key_values).

    +

    For translation and summarization training, decoder_input_ids should be provided. If no +decoder_input_ids is provided, the model will create this tensor by shifting the input_ids to the right +for denoising pre-training following the paper.

    +

  • +
  • decoder_attention_mask (torch.LongTensor of shape (batch_size, target_sequence_length), optional) – Default behavior: generate a tensor that ignores pad tokens in decoder_input_ids. Causal mask will also +be used by default.

  • +
  • head_mask (torch.Tensor of shape (encoder_layers, encoder_attention_heads), optional) –

    Mask to nullify selected heads of the attention modules in the encoder. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • decoder_head_mask (torch.Tensor of shape (decoder_layers, decoder_attention_heads), optional) –

    Mask to nullify selected heads of the attention modules in the decoder. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • cross_attn_head_mask (torch.Tensor of shape (decoder_layers, decoder_attention_heads), optional) –

    Mask to nullify selected heads of the cross-attention modules in the decoder. Mask values selected in [0, +1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • encoder_outputs (tuple(tuple(torch.FloatTensor)), optional) – Tuple consists of (last_hidden_state, optional: hidden_states, optional: attentions). last_hidden_state of shape (batch_size, sequence_length, hidden_size) is a sequence of hidden-states at the output of the last layer of the encoder, used in the cross-attention of the decoder.

  • +
  • past_key_values (tuple(tuple(torch.FloatTensor)), optional, returned when use_cache=True is passed or when config.use_cache=True) –

    Tuple of tuple(torch.FloatTensor) of length config.n_layers, with each tuple having 2 tensors of shape +(batch_size, num_heads, sequence_length, embed_size_per_head)) and 2 additional tensors of shape +(batch_size, num_heads, encoder_sequence_length, embed_size_per_head).

    +

    Contains pre-computed hidden-states (key and values in the self-attention blocks and in the cross-attention +blocks) that can be used (see past_key_values input) to speed up sequential decoding.

    +

    If past_key_values are used, the user can optionally input only the last decoder_input_ids (those that +don’t have their past key value states given to this model) of shape (batch_size, 1) instead of all +decoder_input_ids of shape (batch_size, sequence_length).

    +

  • +
  • inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation. +This is useful if you want more control over how to convert input_ids indices into associated vectors +than the model’s internal embedding lookup matrix.

  • +
  • decoder_inputs_embeds (torch.FloatTensor of shape (batch_size, target_sequence_length, hidden_size), optional) –

    Optionally, instead of passing decoder_input_ids you can choose to directly pass an embedded +representation. If past_key_values is used, optionally only the last decoder_inputs_embeds have to be +input (see past_key_values). This is useful if you want more control over how to convert +decoder_input_ids indices into associated vectors than the model’s internal embedding lookup matrix.

    +

    If decoder_input_ids and decoder_inputs_embeds are both unset, decoder_inputs_embeds takes the value +of inputs_embeds.

    +

  • +
  • use_cache (bool, optional) – If set to True, past_key_values key value states are returned and can be used to speed up decoding (see +past_key_values).

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
  • labels (torch.LongTensor of shape (batch_size,), optional) – Labels for computing the sequence classification/regression loss. Indices should be in [0, ..., +config.num_labels - 1]. If config.num_labels > 1 a classification loss is computed (Cross-Entropy).

  • +
+
+
+
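A hedged sketch of a forward pass through a specific head; the tokenizer checkpoint and head name are assumptions continuing the MBart example above:

    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("facebook/mbart-large-cc25")
    batch = tok("Hello world", return_tensors="pt")
    outputs = mbart(**batch, head="mt_task")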
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
    +
  1. If a head_name is passed, the head with the given name is used.

  2. +
  3. If the forward call is executed within an AdapterSetup context, the head configuration is read from +the context.

  4. +
  5. If the active_head property is set, the head configuration is read from there.

  6. +
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
  • head_name (str, optional) – The name of the head whose labels should be returned. Defaults to None, in which case the labels of the active head are returned.
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
  • head_name (str, optional) – The name of the head whose id2label mapping should be returned. Defaults to None, in which case the mapping of the active head is returned.
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.saved_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, will be either the default adapter config for the requested adapter (if specified) or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the head using this name. By default, the name with which the head was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the LoRA paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created.

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
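A hedged usage sketch (the repository and adapter names below are made up, and pushing requires being logged in via huggingface-cli login):

    # Assumes an adapter named "my_task" has already been added to `model`.
    url = model.push_adapter_to_hub(
        "my-adapter-repo",    # placeholder repository name
        "my_task",            # name of the adapter to upload
        datasets_tag="imdb",  # dataset tag used when generating the adapter card
    )
    print(url)  # URL of the adapter repository on the Model Hub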
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
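To make the merge/reset pair concrete, a small sketch assuming a LoRA adapter named "my_lora" (both the name and the config values are illustrative):

    from adapters import LoRAConfig

    model.add_adapter("my_lora", config=LoRAConfig(r=8, alpha=16))
    model.set_active_adapters("my_lora")

    # Fold the LoRA weights into the base weights, removing the extra
    # adapter computation at inference time ...
    model.merge_adapter("my_lora")

    # ... and restore the original, unmerged base weights afterwards.
    model.reset_adapter()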
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
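A save/reload round trip might look as follows (the directory path is a placeholder):

    # Persist the adapter (and its prediction head, since with_head defaults to True).
    model.save_adapter("./saved/my_task", "my_task", use_safetensors=True)

    # Later, or in another process: reload it under the same name and activate it.
    model.load_adapter("./saved/my_task", load_as="my_task", set_active=True)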
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training like +TPUs and need to call this function on all processes. In this case, set is_main_process=True only on +the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training setups like TPUs when one needs to replace torch.save with another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be of a size lower than this limit. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). We default it to 5GB so that models can easily run on free-tier Google Colab instances without CPU OOM issues.

    Note: If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard, which will be bigger than max_shard_size.

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all keys of the adapters’ state dict need to be prepended with base_model.model. Advanced users can disable this behaviour by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional keyword arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
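Because adapter_setup also accepts composition blocks, activation is not limited to a single name; a sketch assuming adapters "a" and "b" were added beforehand:

    from adapters.composition import Stack

    model.set_active_adapters("a")              # activate a single adapter
    model.set_active_adapters(Stack("a", "b"))  # pass activations through "a", then "b"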
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, TorchScript can’t handle parameter sharing, so we clone the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into the mode for training the given adapters. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
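A typical adapter fine-tuning setup freezes the base model and trains a single adapter; a minimal sketch assuming an adapter named "my_task" exists:

    # Freezes all base model weights and enables training of the given adapter
    # (and activates it for the forward pass).
    model.train_adapter("my_task")

    # Sanity check: only adapter (and head) parameters should require gradients now.
    trainable = [n for n, p in model.named_parameters() if p.requires_grad]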
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into the mode for training an AdapterFusion layer determined by a list of adapter names. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into the mode for training an AdapterFusion layer determined by a list of adapter names.

+
+ +
\ No newline at end of file
diff --git a/classes/models/mt5.html b/classes/models/mt5.html
new file mode 100644
index 0000000000..0f6d0c7d92
--- /dev/null
+++ b/classes/models/mt5.html
@@ -0,0 +1,1113 @@
+MT5 — AdapterHub documentation

MT5

+

The mT5 model was presented in mT5: A massively multilingual pre-trained text-to-text transformer by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.

+

The abstract from the paper is the following:

The recent “Text-to-Text Transfer Transformer” (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We detail the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual benchmarks. We also describe a simple technique to prevent “accidental translation” in the zero-shot setting, where a generative model chooses to (partially) translate its prediction into the wrong language. All of the code and model checkpoints used in this work are publicly available.
+

MT5AdapterModel

+
+
+class adapters.MT5AdapterModel(config)
+

MT5 Model with the option to add multiple flexible prediction heads on top.

+

The MT5 model was proposed in [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/abs/1910.10683) by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu. It’s an encoder-decoder transformer pre-trained in a text-to-text denoising generative setting.

+

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

+

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

+
+
Parameters
+

config ([MT5Config]) – Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
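In practice the class is usually instantiated from a pre-trained checkpoint rather than from a bare config; a minimal sketch:

    from adapters import MT5AdapterModel

    # Loads the pre-trained base weights; adapters and heads are added afterwards.
    model = MT5AdapterModel.from_pretrained("google/mt5-small")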
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (that do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
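For example, the summary can be printed or consumed programmatically:

    print(model.adapter_summary())              # formatted table as a string
    info = model.adapter_summary(as_dict=True)  # same data as a dict for further processing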
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
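A sketch of moving one adapter to a GPU in half precision while the rest of the model stays where it is (assumes an adapter named "my_task" and an available CUDA device):

    import torch

    model.adapter_to("my_task", device="cuda:0", dtype=torch.float16)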
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
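Both config styles listed above work; a short sketch, where "seq_bn" is the string identifier of a pre-defined bottleneck configuration and the adapter names are arbitrary:

    from adapters import SeqBnConfig

    # Pre-defined configuration via its string identifier.
    model.add_adapter("task_a", config="seq_bn", set_active=True)

    # Equivalent style with an explicit config object instead of a string.
    model.add_adapter("task_b", config=SeqBnConfig(reduction_factor=16))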
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
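A sketch assuming two adapters "a" and "b" were added beforehand:

    from adapters.composition import Fuse

    model.add_adapter_fusion(Fuse("a", "b"), set_active=True)

    # Train only the fusion layer; the fused adapters stay frozen by default.
    model.train_adapter_fusion(Fuse("a", "b"))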
+
+add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
+ +
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_seq2seq_lm_head(head_name, overwrite_ok=False)
+

Adds a sequence-to-sequence language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
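For MT5, a generation setup typically pairs an adapter with this head; a hedged sketch with placeholder names (giving the head the same name as the adapter lets the training setup activate both together, per set_active_adapters()):

    model.add_adapter("xsum", config="seq_bn")
    model.add_seq2seq_lm_head("xsum")  # head name matches the adapter name
    model.train_adapter("xsum")        # trains the adapter, activating adapter + head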
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as a weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – Specifies the existing adapters whose weights should be averaged; per-adapter averaging weights can be passed via the separate weights argument.

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
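A sketch of merging two trained adapters into a new averaged module (all names and weights are illustrative):

    model.average_adapter(
        "avg_task",                    # name of the new averaged adapter
        ["task_seed1", "task_seed2"],  # existing adapters to average
        weights=[0.7, 0.3],
        set_active=True,
    )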
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, attention_mask=None, decoder_input_ids=None, decoder_attention_mask=None, head_mask=None, decoder_head_mask=None, cross_attn_head_mask=None, encoder_outputs=None, past_key_values=None, inputs_embeds=None, decoder_inputs_embeds=None, labels=None, use_cache=None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

The [MT5AdapterModel] forward method, overrides the __call__ special method.

+

Note: Although the recipe for the forward pass needs to be defined within this function, one should call the [Module] instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary. MT5 is a model with relative position embeddings so you +should be able to pad the inputs on both the right and the left.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for detail.

    +

    [What are input IDs?](../glossary#input-ids)

    +

    To know more on how to prepare input_ids for pretraining take a look a [MT5 Training](./mt5#training).

    +

  • +
  • attention_mask (torch.FloatTensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • decoder_input_ids (torch.LongTensor of shape (batch_size, target_sequence_length), optional) –

    Indices of decoder input sequence tokens in the vocabulary.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are decoder input IDs?](../glossary#decoder-input-ids)

    +

    MT5 uses the pad_token_id as the starting token for decoder_input_ids generation. If past_key_values +is used, optionally only the last decoder_input_ids have to be input (see past_key_values).

    +

    To know more on how to prepare decoder_input_ids for pretraining take a look at [MT5 +Training](./mt5#training).

    +

  • +
  • decoder_attention_mask (torch.BoolTensor of shape (batch_size, target_sequence_length), optional) – Default behavior: generate a tensor that ignores pad tokens in decoder_input_ids. Causal mask will also +be used by default.

  • +
  • head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the self-attention modules in the encoder. Mask values selected in [0, +1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • decoder_head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the self-attention modules in the decoder. Mask values selected in [0, +1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • cross_attn_head_mask (torch.Tensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the cross-attention modules in the decoder. Mask values selected in +[0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • encoder_outputs (tuple(tuple(torch.FloatTensor)), optional) – Tuple consists of (last_hidden_state, optional: hidden_states, optional: attentions). last_hidden_state of shape (batch_size, sequence_length, hidden_size) is a sequence of hidden states at the output of the last layer of the encoder. Used in the cross-attention of the decoder.

  • +
  • past_key_values (tuple(tuple(torch.FloatTensor)) of length config.n_layers with each tuple having 4 tensors of shape (batch_size, num_heads, sequence_length - 1, embed_size_per_head)) –

    Contains precomputed key and value hidden states of the attention blocks. Can be used to speed up decoding.

    +

    If past_key_values are used, the user can optionally input only the last decoder_input_ids (those that +don’t have their past key value states given to this model) of shape (batch_size, 1) instead of all +decoder_input_ids of shape (batch_size, sequence_length).

    +

  • +
  • inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation. This +is useful if you want more control over how to convert input_ids indices into associated vectors than the +model’s internal embedding lookup matrix.

  • +
  • decoder_inputs_embeds (torch.FloatTensor of shape (batch_size, target_sequence_length, hidden_size), optional) –

    Optionally, instead of passing decoder_input_ids you can choose to directly pass an embedded +representation. If past_key_values is used, optionally only the last decoder_inputs_embeds have to be +input (see past_key_values). This is useful if you want more control over how to convert +decoder_input_ids indices into associated vectors than the model’s internal embedding lookup matrix.

    +

    If decoder_input_ids and decoder_inputs_embeds are both unset, decoder_inputs_embeds takes the value +of inputs_embeds.

    +

  • +
  • use_cache (bool, optional) – If set to True, past_key_values key value states are returned and can be used to speed up decoding (see +past_key_values).

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
+
+
+
+ +
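Putting the pieces together, a hedged end-to-end inference sketch with a sequence-to-sequence head (the checkpoint name is real; the adapter/head name is a placeholder, and since the adapter is untrained the generated output is arbitrary):

    from transformers import AutoTokenizer
    from adapters import MT5AdapterModel

    tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
    model = MT5AdapterModel.from_pretrained("google/mt5-small")
    model.add_adapter("demo", config="seq_bn")
    model.add_seq2seq_lm_head("demo")
    model.set_active_adapters("demo")

    inputs = tokenizer("Translate to German: Hello!", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))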
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
  1. If a head_name is passed, the head with the given name is used.

  2. If the forward call is executed within an AdapterSetup context, the head configuration is read from the context.

  3. If the active_head property is set, the head configuration is read from there.
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
  • head_name (str, optional) – The name of the head whose labels should be returned. Default is None. If the name is None, the labels of the active head are returned.
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
  • head_name (str, optional) – The name of the head whose id2label mapping should be returned. Default is None. If the name is None, the id2label mapping of the active head is returned.
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained PyTorch adapter module from the local file system or a remote location.

+
+
Parameters

  • adapter_name_or_path (str) – can be either:

    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • a path to a directory containing adapter weights saved using model.save_adapter()

    • a URL pointing to a zip folder containing a saved adapter module

  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, will be either: the default adapter config for the requested adapter, if specified; otherwise the global default adapter config.

  • version (str, optional) – The version of the adapter to be loaded.

  • model_name (str, optional) – The string identifier of the pre-trained model.

  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was saved will be used.

  • source (str, optional) – Identifier of the source(s) from which to load the adapter. Can be:

    • "ah": search on the AdapterHub Hub repo. Note: the Hub repo has been archived and all adapters have been moved to the HuggingFace Model Hub. Loading from this source is deprecated.

    • "hf": search on the HuggingFace Model Hub.

    • None (default): search on all sources.

  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not activated.

  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if a safetensors checkpoint is available. Otherwise, the regular torch save method is used.
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the head using this name. By default, the name with which the head was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the LoRA paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created.

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training like +TPUs and need to call this function on all processes. In this case, set is_main_process=True only on +the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training setups like TPUs when one needs to replace torch.save with another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be of a size lower than this limit. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). We default it to 5GB so that models can easily run on free-tier Google Colab instances without CPU OOM issues.

    Note: If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard, which will be bigger than max_shard_size.

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all keys of the adapters’ state dict need to be prepended with base_model.model. Advanced users can disable this behaviour by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional keyword arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, TorchScript can’t handle parameter sharing, so we clone the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into the mode for training the given adapters. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into the mode for training an AdapterFusion layer determined by a list of adapter names. If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into the mode for training an AdapterFusion layer determined by a list of adapter names.

+
+ +
\ No newline at end of file
diff --git a/classes/models/roberta.html b/classes/models/roberta.html
new file mode 100644
index 0000000000..8c2af61520
--- /dev/null
+++ b/classes/models/roberta.html
@@ -0,0 +1,1136 @@
+RoBERTa — AdapterHub documentation

RoBERTa

+

The RoBERTa model was proposed in RoBERTa: A Robustly Optimized BERT Pretraining Approach by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov. It is based on Google’s BERT model released in 2018.

+
+

RobertaAdapterModel

+
+
+class adapters.RobertaAdapterModel(config)
+

Roberta Model transformer with the option to add multiple flexible heads on top.

+

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

+

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

+
+
Parameters
+

config ([RobertaConfig]) – Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (that do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
+
+add_causal_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a causal language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
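For example (head and label names are illustrative):

# Three-way sentiment head; id2label customizes the label mapping.
model.add_classification_head(
    "sentiment",
    num_labels=3,
    id2label={0: "negative", 1: "neutral", 2: "positive"},
)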
+ +
+
+add_dependency_parsing_head(head_name, num_labels=2, overwrite_ok=False, id2label=None)
+

Adds a biaffine dependency parsing head on top of the model. The parsing head uses the architecture described +in “Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation” (Glavaš +& Vulić, 2021) (https://arxiv.org/pdf/2008.06788.pdf).

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of labels. Defaults to 2.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • id2label (dict, optional) – Mapping from label ids to labels. Defaults to None.

  • +
+
+
+
+ +
+
+add_masked_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a masked language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_multiple_choice_head(head_name, num_choices=2, layers=2, activation_function='tanh', overwrite_ok=False, id2label=None, use_pooler=False)
+

Adds a multiple choice head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_choices (int, optional) – Number of choices. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_tagging_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a token classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – Names of the existing adapters whose weights should be averaged. Per-adapter averaging weights can be given via the weights argument and are normalized by default (see normalize_weights).

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
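A minimal sketch, assuming two source adapters already exist under the names used here:

# Create "avg" as a 70/30 weighted average of two existing adapters.
model.average_adapter(
    "avg",
    ["task_a", "task_b"],
    weights=[0.7, 0.3],
    set_active=True,
)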
+ +
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, attention_mask=None, token_type_ids=None, position_ids=None, head_mask=None, inputs_embeds=None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

The [RobertaAdapterModel] forward method, overrides the __call__ special method.

+

<Tip>

+

Although the recipe for forward pass needs to be defined within this function, one should call the [Module] +instance afterwards instead of this since the former takes care of running the pre and post processing steps while +the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

  • +
  • attention_mask (torch.FloatTensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • token_type_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Segment token indices to indicate first and second portions of the inputs. Indices are selected in [0,1]:

    +
      +
    • 0 corresponds to a sentence A token,

    • +
    • 1 corresponds to a sentence B token.

    • +
    +

    This parameter can only be used when the model is initialized with a type_vocab_size parameter with value >= 2. All values in this tensor should always be < type_vocab_size.

    +

    [What are token type IDs?](../glossary#token-type-ids)

    +

  • +
  • position_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Indices of positions of each input sequence tokens in the position embeddings. Selected in the range [0, +config.max_position_embeddings - 1].

    +

    [What are position IDs?](../glossary#position-ids)

    +

  • +
  • head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the self-attention modules. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation. This +is useful if you want more control over how to convert input_ids indices into associated vectors than the +model’s internal embedding lookup matrix.

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
+
+
+
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
    +
  1. If a head_name is passed, the head with the given name is used.

  2. If the forward call is executed within an AdapterSetup context, the head configuration is read from the context.

  3. If the active_head property is set, the head configuration is read from there.
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. If the name is None, the labels of the active head are returned. Defaults to None.

  • +
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. If the name is None, the labels of the active head are returned. Defaults to None.

  • +
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.saved_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, this will be either the default adapter config for the requested adapter (if specified) or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out (List[int], optional) – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
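A typical save/load round trip might look like this (paths and names are illustrative):

# Persist an adapter, then load it back under a new name and activate it.
model.save_adapter("./checkpoints/sst-2", "sst-2")
name = model.load_adapter("./checkpoints/sst-2", load_as="sst2_reloaded", set_active=True)
print(name)  # -> "sst2_reloaded"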
+ +
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the prediction head using this name. By default, the name with which the head was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
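A minimal sketch, assuming a LoRA adapter was added under the name used here:

model.add_adapter("lora_adapter", config="lora")
model.merge_adapter("lora_adapter")  # fold the LoRA weights into the base weights for inference
# ... run inference without the extra adapter computation ...
model.reset_adapter()                # undo the merge and restore the original weights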
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created.

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
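For illustration (the repository, adapter, and dataset identifiers are placeholders):

url = model.push_adapter_to_hub(
    "my-roberta-sst2-adapter",  # repo name on the Hub
    "sst-2",                    # local adapter to upload
    datasets_tag="sst2",        # used for the generated adapter card
)
print(url)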
+ +
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training like +TPUs and need to call this function on all processes. In this case, set is_main_process=True only on +the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training like TPUs when one +need to replace torch.save by another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). We default it to 5GB in order for models to be able to run easily on free-tier Google Colab instances without CPU OOM issues.

    +

    <Tip warning={true}>

    +

    If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard +which will be bigger than max_shard_size.

    +

    </Tip>

    +

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all keys of the adapters’ state dict need to be prepended with base_model.model. Advanced users can disable this behaviour by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional keyword arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
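For example, activation can name a single adapter or a composition block (adapter names are illustrative):

from adapters.composition import Stack

model.set_active_adapters("sst-2")                    # a single adapter
model.set_active_adapters(Stack("lang_en", "sst-2"))  # stack a language adapter below a task adapter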
+ +
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, parameter sharing cannot be handled, so we clone the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into training mode for the given adapters. If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite recursion.

+
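A minimal sketch of the typical training setup (the adapter name is illustrative):

# Freeze all base model weights and activate the adapter for training.
model.train_adapter("sst-2")
# The summary's "train" column should now mark only this adapter as trainable.
print(model.adapter_summary())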
+ +
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for AdapterFusion, determined by a list of adapter names. If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into training mode for AdapterFusion, determined by a list of adapter names.

+
+ +
\ No newline at end of file diff --git a/classes/models/t5.html b/classes/models/t5.html new file mode 100644 index 0000000000..9ae96c4342 --- /dev/null +++ b/classes/models/t5.html @@ -0,0 +1,1112 @@ + T5 — AdapterHub documentation
+
+
+
+ +
+

T5

+

The T5 model was presented in Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, +Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu.

+

The abstract from the paper is the following:

+
T5 is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks and for which each task is converted into a text-to-text format. T5 works well on a variety of tasks out-of-the-box by prepending a different prefix to the input corresponding to each task, e.g., for translation: translate English to German: …, for summarization: summarize: ….

For more information about which prefix to use, it is easiest to look into Appendix D of the paper.

+
+

T5AdapterModel

+
+
+class adapters.T5AdapterModel(config)
+

T5 Model with the option to add multiple flexible prediction heads on top.

+

The T5 model was proposed in [Exploring the Limits of Transfer Learning with a Unified Text-to-Text +Transformer](https://arxiv.org/abs/1910.10683) by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan +Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu. It’s an encoder decoder transformer pre-trained in a +text-to-text denoising generative setting.

+

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the +library implements for all its model (such as downloading or saving, resizing the input embeddings, pruning heads +etc.)

+

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. +Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matter related to general usage +and behavior.

+
+
Parameters
+

config ([T5Config]) – Model configuration class with all the parameters of the model. +Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (that do not support multi-adapter inference), module.active_adapter will return a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the model must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
+
+add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
+ +
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_seq2seq_lm_head(head_name, overwrite_ok=False)
+

Adds a sequence-to-sequence language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
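A minimal sketch for setting up T5 generation with an adapter (names are illustrative):

from adapters import T5AdapterModel

model = T5AdapterModel.from_pretrained("t5-small")
model.add_adapter("summarization", set_active=True)
# Giving the head the same name as the adapter lets set_active_adapters
# activate both together (see set_active_adapters above).
model.add_seq2seq_lm_head("summarization")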
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – Names of the existing adapters whose weights should be averaged. Per-adapter averaging weights can be given via the weights argument and are normalized by default (see normalize_weights).

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids=None, attention_mask=None, decoder_input_ids=None, decoder_attention_mask=None, head_mask=None, decoder_head_mask=None, cross_attn_head_mask=None, encoder_outputs=None, past_key_values=None, inputs_embeds=None, decoder_inputs_embeds=None, labels=None, use_cache=None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

The [T5AdapterModel] forward method, overrides the __call__ special method.

+

<Tip>

+

Although the recipe for forward pass needs to be defined within this function, one should call the [Module] +instance afterwards instead of this since the former takes care of running the pre and post processing steps while +the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary. T5 is a model with relative position embeddings so you +should be able to pad the inputs on both the right and the left.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and [PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

    To know more on how to prepare input_ids for pretraining take a look a [T5 Training](./t5#training).

    +

  • +
  • attention_mask (torch.FloatTensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • decoder_input_ids (torch.LongTensor of shape (batch_size, target_sequence_length), optional) –

    Indices of decoder input sequence tokens in the vocabulary.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are decoder input IDs?](../glossary#decoder-input-ids)

    +

    T5 uses the pad_token_id as the starting token for decoder_input_ids generation. If past_key_values +is used, optionally only the last decoder_input_ids have to be input (see past_key_values).

    +

    To know more on how to prepare decoder_input_ids for pretraining take a look at [T5 +Training](./t5#training).

    +

  • +
  • decoder_attention_mask (torch.BoolTensor of shape (batch_size, target_sequence_length), optional) – Default behavior: generate a tensor that ignores pad tokens in decoder_input_ids. Causal mask will also +be used by default.

  • +
  • head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the self-attention modules in the encoder. Mask values selected in [0, +1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • decoder_head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the self-attention modules in the decoder. Mask values selected in [0, +1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • cross_attn_head_mask (torch.Tensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the cross-attention modules in the decoder. Mask values selected in +[0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • encoder_outputs (tuple(tuple(torch.FloatTensor)), optional) – Tuple consists of (last_hidden_state, optional: hidden_states, optional: attentions). last_hidden_state of shape (batch_size, sequence_length, hidden_size) is a sequence of hidden states at the output of the last layer of the encoder. Used in the cross-attention of the decoder.

  • +
  • past_key_values (tuple(tuple(torch.FloatTensor)) of length config.n_layers with each tuple having 4 tensors of shape (batch_size, num_heads, sequence_length - 1, embed_size_per_head)) –

    Contains precomputed key and value hidden states of the attention blocks. Can be used to speed up decoding.

    +

    If past_key_values are used, the user can optionally input only the last decoder_input_ids (those that +don’t have their past key value states given to this model) of shape (batch_size, 1) instead of all +decoder_input_ids of shape (batch_size, sequence_length).

    +

  • +
  • inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation. This +is useful if you want more control over how to convert input_ids indices into associated vectors than the +model’s internal embedding lookup matrix.

  • +
  • decoder_inputs_embeds (torch.FloatTensor of shape (batch_size, target_sequence_length, hidden_size), optional) –

    Optionally, instead of passing decoder_input_ids you can choose to directly pass an embedded +representation. If past_key_values is used, optionally only the last decoder_inputs_embeds have to be +input (see past_key_values). This is useful if you want more control over how to convert +decoder_input_ids indices into associated vectors than the model’s internal embedding lookup matrix.

    +

    If decoder_input_ids and decoder_inputs_embeds are both unset, decoder_inputs_embeds takes the value +of inputs_embeds.

    +

  • +
  • use_cache (bool, optional) – If set to True, past_key_values key value states are returned and can be used to speed up decoding (see +past_key_values).

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
+
+
+
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
    +
  1. If a head_name is passed, the head with the given name is used.

  2. If the forward call is executed within an AdapterSetup context, the head configuration is read from the context.

  3. If the active_head property is set, the head configuration is read from there.
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the model must inherit from a class that implements this method, to preclude infinite recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. If the name is None, the labels of the active head are returned. Defaults to None.

  • +
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. If the name is None, the labels of the active head are returned. Defaults to None.

  • +
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.saved_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, this will be either the default adapter config for the requested adapter (if specified) or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out (List[int], optional) – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the prediction head using this name. By default, the name with which the head was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, will only generate an adapter card, if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created.

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
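For example, assuming a model holding two adapters named "a" and "b" (placeholder names), all of them can be persisted in one call:

    from adapters import AutoAdapterModel

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")
    model.add_adapter("a")
    model.add_adapter("b")

    # Creates ./all_adapters/a/ and ./all_adapters/b/, each holding the
    # adapter weights and configuration.
    model.save_all_adapters("./all_adapters", use_safetensors=True)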
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training like +TPUs and need to call this function on all processes. In this case, set is_main_process=True only on +the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training like TPUs when one +need to replace torch.save by another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than +this size. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). +We default it to 5GB so that models can run easily on free-tier Google Colab instances +without CPU OOM issues.

    +

    <Tip warning={true}>

    +

    If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard +which will be bigger than max_shard_size.

    +

    </Tip>

    +

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all +keys of the adapters’ state dict need to be prepended with base_model.model. Advanced users can +disable this behaviour by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional keyword arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
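A sketch of a full-model save with explicit sharding and safetensors serialization (the directory and shard size are arbitrary choices):

    from adapters import AutoAdapterModel

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")

    # Shards any checkpoint larger than 2GB; reload via from_pretrained().
    model.save_pretrained("./full_model", max_shard_size="2GB", safe_serialization=True)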
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing +the adapter_names parameter in the forward() call. If no adapter with the given name is found, no module of +the respective type will be activated. If the calling model class supports named prediction heads, this +method will attempt to activate a prediction head with the name of the last adapter in the list of passed +adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
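For illustration, activating a single adapter versus a stacked composition (adapter names are placeholders):

    from adapters import AutoAdapterModel
    from adapters.composition import Stack

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")
    model.add_adapter("lang")
    model.add_adapter("task")

    model.set_active_adapters("task")                 # single adapter
    model.set_active_adapters(Stack("lang", "task"))  # "task" stacked on "lang"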
+ +
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, parameter sharing cannot be handled, so the weights are +cloned instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into the mode for training the given adapters. If self.base_model is self, the model must inherit +from a class that implements this method to preclude infinite recursion.

+
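A typical setup, sketched with placeholder names: the call freezes the base model and leaves only the adapter weights trainable.

    from adapters import AutoAdapterModel

    model = AutoAdapterModel.from_pretrained("bert-base-uncased")
    model.add_adapter("my_adapter", config="seq_bn")

    # Freeze base weights, unfreeze and activate "my_adapter" for training.
    model.train_adapter("my_adapter")
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"trainable parameters: {trainable}")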
+ +
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into the mode for training adapter fusion, as determined by a list of adapter names. If +self.base_model is self, the model must inherit from a class that implements this method to preclude infinite +recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into the mode for training adapter fusion, as determined by a list of adapter names.

+
+ +
+ \ No newline at end of file diff --git a/classes/models/vit.html b/classes/models/vit.html new file mode 100644 index 0000000000..b151425265 --- /dev/null +++ b/classes/models/vit.html @@ -0,0 +1,1020 @@ + Vision Transformer (ViT) — AdapterHub documentation

Vision Transformer (ViT)

+

The Vision Transformer (ViT) model was proposed in An Image is Worth 16x16 Words: Transformers for Image Recognition +at Scale by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk +Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob +Uszkoreit, Neil Houlsby. It’s the first paper that successfully trains a Transformer encoder on ImageNet, attaining +very good results compared to familiar convolutional architectures.

+

The abstract from the paper is the following:

+

While the Transformer architecture has become the de-facto standard for natural language processing tasks, its +applications to computer vision remain limited. In vision, attention is either applied in conjunction with +convolutional networks, or used to replace certain components of convolutional networks while keeping their overall +structure in place. We show that this reliance on CNNs is not necessary and a pure transformer applied directly to +sequences of image patches can perform very well on image classification tasks. When pre-trained on large amounts of +data and transferred to multiple mid-sized or small image recognition benchmarks (ImageNet, CIFAR-100, VTAB, etc.), +Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring +substantially fewer computational resources to train.

+
+

ViTAdapterModel

+
+
+class adapters.ViTAdapterModel(config)
+

ViT Model transformer with the option to add multiple flexible heads on top. +This model is a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. Use it +as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and +behavior.

+
+
Parameters
+

config ([ViTConfig]) – Model configuration class with all the parameters of the model. +Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (that do not support multi-adapter inference), module.active_adapter will return +a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
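For instance (a sketch with placeholder adapter names):

    from adapters import ViTAdapterModel

    model = ViTAdapterModel.from_pretrained("google/vit-base-patch16-224-in21k")
    model.add_adapter("a", config="seq_bn")
    model.add_adapter("b", config="lora")

    print(model.adapter_summary())                 # human-readable table
    summary = model.adapter_summary(as_dict=True)  # machine-readable form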
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite +recursion.

+
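The config argument accepts either form; a sketch (adapter names and hyperparameters are illustrative):

    from adapters import ViTAdapterModel, LoRAConfig

    model = ViTAdapterModel.from_pretrained("google/vit-base-patch16-224-in21k")

    # A string identifier of a pre-defined configuration ...
    model.add_adapter("bottleneck", config="seq_bn")
    # ... or a full configuration object / dictionary.
    model.add_adapter("lora", config=LoRAConfig(r=8, alpha=16), set_active=True)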
+ +
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
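A sketch of adding a fusion layer over two adapters (placeholder names); per the options above, a Fuse block, a list, or a comma-separated string are interchangeable here:

    from adapters import ViTAdapterModel
    from adapters.composition import Fuse

    model = ViTAdapterModel.from_pretrained("google/vit-base-patch16-224-in21k")
    model.add_adapter("a")
    model.add_adapter("b")

    # Equivalent alternatives: ["a", "b"] or "a,b".
    model.add_adapter_fusion(Fuse("a", "b"), set_active=True)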
+ +
+
+add_image_classification_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds an image classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
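Combined with forward(), a head added this way can be used for inference; a sketch with a placeholder head name and a dummy image:

    import torch
    from PIL import Image
    from transformers import AutoImageProcessor
    from adapters import ViTAdapterModel

    ckpt = "google/vit-base-patch16-224-in21k"
    model = ViTAdapterModel.from_pretrained(ckpt)
    model.add_image_classification_head("pets", num_labels=2)

    processor = AutoImageProcessor.from_pretrained(ckpt)
    image = Image.new("RGB", (224, 224))  # stand-in for a real image
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, head="pets")
    print(outputs.logits.shape)  # torch.Size([1, 2])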
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – The names of the existing adapters whose weights should be averaged. Relative +weights can be provided via the weights argument; by default, all adapters are weighted equally.

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
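A sketch of averaging two adapters that share the same configuration (names and weights are illustrative):

    from adapters import ViTAdapterModel

    model = ViTAdapterModel.from_pretrained("google/vit-base-patch16-224-in21k")
    model.add_adapter("a", config="seq_bn")
    model.add_adapter("b", config="seq_bn")

    # "avg" = 0.7 * "a" + 0.3 * "b" (weights are normalized by default).
    model.average_adapter("avg", ["a", "b"], weights=[0.7, 0.3], set_active=True)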
+ +
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(pixel_values: Optional[Tensor] = None, head_mask: Optional[Tensor] = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, interpolate_pos_encoding: Optional[bool] = None, return_dict: Optional[bool] = None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

The [ViTAdapterModel] forward method, overrides the __call__ special method.

+

<Tip>

+

Although the recipe for forward pass needs to be defined within this function, one should call the [Module] +instance afterwards instead of this since the former takes care of running the pre and post processing steps while +the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • pixel_values (torch.FloatTensor of shape (batch_size, num_channels, height, width)) – Pixel values. Pixel values can be obtained using [AutoImageProcessor]. See [ViTImageProcessor.__call__] +for details.

  • +
  • head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the self-attention modules. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • interpolate_pos_encoding (bool, optional) – Whether to interpolate the pre-trained position encodings.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
+
+
+
+ +
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction +head configuration (in order of priority):

+
+
    +
  1. If a head_name is passed, the head with the given name is used.

  2. If the forward call is executed within an AdapterSetup context, the head configuration is read from +the context.

  3. If the active_head property is set, the head configuration is read from there.
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. +Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite +recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. If the name is None, +the labels of the active head are returned. Defaults to None.

  • +
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose id2label dict should be returned. If the name is None, +the dict of the active head is returned. Defaults to None.

  • +
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained pytorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.save_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. +If not specified, will be either: the default adapter config for the requested adapter, if specified, or +the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was +saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      “ah”: search on the AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to HuggingFace Model Hub. +Loading from this source is deprecated.

      +
      +
      +
    • +
    • “hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not +activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
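For example, loading from the HuggingFace Model Hub or from a local directory (the repository name below is hypothetical):

    from adapters import ViTAdapterModel

    model = ViTAdapterModel.from_pretrained("google/vit-base-patch16-224-in21k")

    # From the Hub (hypothetical repo) or a local path such as "./saved/my_task".
    name = model.load_adapter(
        "my-org/vit-my-task",  # hypothetical adapter repository
        load_as="my_task",
        set_active=True,
    )
    print(name)  # "my_task"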
+ +
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. +By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the prediction head using this name. +By default, the name with which the head was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
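merge_adapter() pairs with reset_adapter() documented below; a sketch with an illustrative LoRA setup:

    from adapters import ViTAdapterModel, LoRAConfig

    model = ViTAdapterModel.from_pretrained("google/vit-base-patch16-224-in21k")
    model.add_adapter("lora", config=LoRAConfig(r=8, alpha=16))

    model.merge_adapter("lora")  # fold LoRA weights into the base weights
    # ... run inference with zero adapter overhead ...
    model.reset_adapter()        # restore the original base weights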
+ +
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter +(you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See +https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, +datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. +If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to +None.

  • +
  • local_path (str, optional) – Local path used as clone directory of the adapter repository. +If not specified, will create a temporary directory. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or +"add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated +when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url +is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. +If set to False, an adapter card is only generated if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created.

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
+
+reset_adapter()
+

Resets weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using +load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded +using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used +as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given +location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the +[~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training like +TPUs and need to call this function on all processes. In this case, set is_main_process=True only on +the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only +save parts of the model or if special precautions need to be taken when recovering the state dictionary +of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training like TPUs when one +need to replace torch.save by another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the +repository you want to push to with repo_id (will default to the name of save_directory in your +namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than +this size. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). +We default it to 5GB so that models can run easily on free-tier Google Colab instances +without CPU OOM issues.

    +

    <Tip warning={true}>

    +

    If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard +which will be bigger than max_shard_size.

    +

    </Tip>

    +

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all +keys of the adapters’ state dict need to be prepended with base_model.model. Advanced users can +disable this behaviour by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional keyword arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing +the adapter_names parameter in the forward() call. If no adapter with the given name is found, no module of +the respective type will be activated. If the calling model class supports named prediction heads, this +method will attempt to activate a prediction head with the name of the last adapter in the list of passed +adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, parameter sharing cannot be handled, so the weights are +cloned instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into the mode for training the given adapters. If self.base_model is self, the model must inherit +from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into the mode for training adapter fusion, as determined by a list of adapter names. If +self.base_model is self, the model must inherit from a class that implements this method to preclude infinite +recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into the mode for training adapter fusion, as determined by a list of adapter names.

+
+ +
+ \ No newline at end of file diff --git a/classes/models/xlmroberta.html b/classes/models/xlmroberta.html new file mode 100644 index 0000000000..5da7e59f66 --- /dev/null +++ b/classes/models/xlmroberta.html @@ -0,0 +1,373 @@ + XLM-RoBERTa — AdapterHub documentation

XLM-RoBERTa

+

The XLM-RoBERTa model was proposed in Unsupervised Cross-lingual Representation Learning at Scale +by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, +Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov. It is based on Facebook’s RoBERTa model released in 2019. +It is a large multi-lingual language model, trained on 2.5TB of filtered CommonCrawl data.

+
+

XLMRobertaAdapterModel

+
+
+class adapters.XLMRobertaAdapterModel(config)
+

XLM-Roberta Model transformer with the option to add multiple flexible heads on top.

+

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the +library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, +etc.).

+

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. +Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage +and behavior.

+
+
Parameters
+

config ([XLMRobertaConfig]) – Model configuration class with all the parameters of the +model. Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
+forward(input_ids=None, attention_mask=None, token_type_ids=None, position_ids=None, head_mask=None, inputs_embeds=None, output_attentions=None, output_hidden_states=None, return_dict=None, head=None, output_adapter_gating_scores=False, output_adapter_fusion_attentions=False, **kwargs)
+

The [XLMRobertaAdapterModel] forward method, overrides the __call__ special method.

+

<Tip>

+

Although the recipe for forward pass needs to be defined within this function, one should call the [Module] +instance afterwards instead of this since the former takes care of running the pre and post processing steps while +the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and +[PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

  • +
  • attention_mask (torch.FloatTensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • token_type_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Segment token indices to indicate first and second portions of the inputs. Indices are selected in [0, +1]:

    +
      +
    • 0 corresponds to a sentence A token,

    • +
    • 1 corresponds to a sentence B token.

    • +
    +

    [What are token type IDs?](../glossary#token-type-ids)

    +

  • +
  • position_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Indices of positions of each input sequence tokens in the position embeddings. Selected in the range [0, +config.max_position_embeddings - 1].

    +

    [What are position IDs?](../glossary#position-ids)

    +

  • +
  • head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the self-attention modules. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation. This +is useful if you want more control over how to convert input_ids indices into associated vectors than the +model’s internal embedding lookup matrix.

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned +tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for +more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
+
+
+
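An end-to-end sketch of the forward pass through a flexible prediction head (the head name and inputs are illustrative):

    import torch
    from transformers import AutoTokenizer
    from adapters import XLMRobertaAdapterModel

    model = XLMRobertaAdapterModel.from_pretrained("xlm-roberta-base")
    model.add_classification_head("sentiment", num_labels=2)

    tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
    # German example input, showcasing the multilingual model.
    inputs = tokenizer("Ein großartiger Film!", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, head="sentiment")
    print(outputs.logits)  # shape (1, 2)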
+ +
+ \ No newline at end of file diff --git a/classes/models/xmod.html b/classes/models/xmod.html new file mode 100644 index 0000000000..4b16dab216 --- /dev/null +++ b/classes/models/xmod.html @@ -0,0 +1,1158 @@ + X-MOD — AdapterHub documentation

X-MOD

+
+

Important

+

The X-MOD implementation integrated into Transformers already supports adapters. +To make this implementation compatible with Adapters, a few changes were necessary:

+
    +
  • +
    Pre-trained X-MOD checkpoints require conversion before they can be used with Adapters. We provide pre-converted checkpoints for the following models:
      +
facebook/xmod-base -> AdapterHub/xmod-base, with language adapters split into separate repos (e.g. AdapterHub/xmod-base-af_ZA)

    • +
    +
    +
    +
  • +
  • +
    In Adapters, the X-MOD classes rely on the usual adapter methods instead of the custom methods introduced in Transformers, i.e.:
      +
set_active_adapters() instead of set_default_language() (see the sketch after this list).

    • +
    • AdapterSetup context instead of lang_ids parameter.

    • +
    +
    +
    +
  • +
+
+
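A sketch of this workflow, using the converted base checkpoint named above; the language adapter repository follows the naming pattern shown (here with a hypothetical de_DE variant):

    from adapters import XmodAdapterModel

    model = XmodAdapterModel.from_pretrained("AdapterHub/xmod-base")

    # Language adapters live in separate repos, e.g. AdapterHub/xmod-base-de_DE
    # (hypothetical here, following the af_ZA pattern above).
    model.load_adapter("AdapterHub/xmod-base-de_DE", load_as="de_DE")

    # Replaces set_default_language() from the Transformers implementation.
    model.set_active_adapters("de_DE")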

The abstract from the paper is the following:

+

Multilingual pre-trained models are known to suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages. We address this issue by introducing language-specific modules, which allows us to grow the total capacity of the model, while keeping the total number of trainable parameters per language constant. In contrast with prior work that learns language-specific components post-hoc, we pre-train the modules of our Cross-lingual Modular (X-MOD) models from the start. Our experiments on natural language inference, named entity recognition and question answering show that our approach not only mitigates the negative interference between languages, but also enables positive transfer, resulting in improved monolingual and cross-lingual performance. Furthermore, our approach enables adding languages post-hoc with no measurable drop in performance, no longer limiting the model usage to the set of pre-trained languages.

+
+

XmodAdapterModel

+
+
+class adapters.XmodAdapterModel(config)
+

X-MOD Model transformer with the option to add multiple flexible heads on top.

+

This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the +library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, +etc.).

+

This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. +Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage +and behavior.

+
+
Parameters
+

config ([XmodConfig]) – Model configuration class with all the parameters of the +model. Initializing with a config file does not load the weights associated with the model, only the +configuration. Check out the [~PreTrainedModel.from_pretrained] method to load the model weights.

+
+
+
+
+property active_adapters: AdapterCompositionBlock
+

If you are not familiar with adapters and PEFT methods, we invite you to read more about them on the PEFT +official documentation: https://huggingface.co/docs/peft

+

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters +for inference) returns the list of all active adapters so that users can deal with them accordingly.

+

For previous PEFT versions (that do not support multi-adapter inference), module.active_adapter will return +a single string.

+
+ +
+
+property active_head: Union[str, List[str]]
+

The active prediction head configuration of this model. Can be either the name of a single available head +(string) or a list of multiple available heads. In case of a list of heads, the same base model is forwarded +through all specified heads.

+
+
Returns
+

A string or a list of strings describing the active head configuration.

+
+
Return type
+

Union[str, List[str]]

+
+
+
+ +
+
+adapter_fusion_to(adapter_names: Union[Fuse, list, str], device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter fusion layer with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • adapter_names (Union[Fuse, list, str]) – The name of the adapter fusion layer to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter fusion layer should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter fusion layer should be cast.

  • +
+
+
+
+ +
+
+adapter_summary(as_dict=False) Union[str, dict]
+

Returns a string summary of all adapters currently added to the model. Each entry in the summary table has the +following attributes:

+
+
    +
  • name: the name of the adapter

  • +
  • architecture: the architectural base of the adapter

  • +
  • #param: the number of parameters of the adapter

  • +
  • %param: the number of parameters of the adapter relative to the full model

  • +
  • active: whether the adapter is active

  • +
  • train: whether the adapter weights are enabled for training

  • +
+
+
+ +
+
+adapter_to(name: str, device: Optional[Union[device, str]] = None, dtype: Optional[dtype] = None)
+

Moves the adapter with the given name to the specified device and data type.

+
+
Parameters
+
    +
  • name (str) – The name of the adapter to be moved.

  • +
  • device (torch.device or str, optional) – The device on which the adapter should be moved.

  • +
  • dtype (torch.dtype, optional) – The data type to which the adapter should be cast.

  • +
+
+
+
+ +
+
+add_adapter(adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module of the specified type to the model.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • config (str or dict, optional) –

    The adapter configuration, can be either:

    +
      +
    • the string identifier of a pre-defined configuration dictionary

    • +
    • a configuration dictionary specifying the full config

    • +
    • if not given, the default configuration for this adapter type will be used

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+

If self.base_model is self, the model must inherit from a class that implements this method to preclude infinite +recursion.

+
+ +
+
+add_adapter_fusion(adapter_names: Union[Fuse, list, str], config=None, overwrite_ok: bool = False, set_active: bool = False)
+

Adds AdapterFusion to the model with all the necessary configurations and weight initializations.

+
+
Parameters
+
    +
  • adapter_names (Fuse or list or str) –

    AdapterFusion layer to add. Can be either:

    +
      +
    • a Fuse composition block

    • +
    • a list of adapter names to fuse

    • +
    • a comma-separated string of adapter names to fuse

    • +
    +

  • +
  • config (str or dict) –

    adapter fusion configuration, can be either:

    +
      +
    • a string identifying a pre-defined adapter fusion configuration

    • +
    • a dictionary representing the adapter fusion configuration

    • +
    • the path to a file containing the adapter fusion configuration

    • +
    +

  • +
  • overwrite_ok (bool, optional) – Overwrite an AdapterFusion layer with the same name if it exists. By default (False), an exception is +thrown.

  • +
  • set_active (bool, optional) – Activate the added AdapterFusion. By default (False), the AdapterFusion is added but not activated.

  • +
+
+
+
+ +
+
+add_causal_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a causal language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_classification_head(head_name, num_labels=2, layers=2, activation_function='tanh', overwrite_ok=False, multilabel=False, id2label=None, use_pooler=False)
+

Adds a sequence classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • multilabel (bool, optional) – Enable multilabel classification setup. Defaults to False.

  • +
+
+
+
+ +
+
+add_dependency_parsing_head(head_name, num_labels=2, overwrite_ok=False, id2label=None)
+

Adds a biaffine dependency parsing head on top of the model. The parsing head uses the architecture described +in “Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation” (Glavaš +& Vulić, 2021) (https://arxiv.org/pdf/2008.06788.pdf).

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of labels. Defaults to 2.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
  • id2label (dict, optional) – Mapping from label ids to labels. Defaults to None.

  • +
+
+
+
+ +
+
+add_masked_lm_head(head_name, activation_function='gelu', overwrite_ok=False)
+

Adds a masked language modeling head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘gelu’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_multiple_choice_head(head_name, num_choices=2, layers=2, activation_function='tanh', overwrite_ok=False, id2label=None, use_pooler=False)
+

Adds a multiple choice head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_choices (int, optional) – Number of choices. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 2.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_qa_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a question answering head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+add_tagging_head(head_name, num_labels=2, layers=1, activation_function='tanh', overwrite_ok=False, id2label=None)
+

Adds a token classification head on top of the model.

+
+
Parameters
+
    +
  • head_name (str) – The name of the head.

  • +
  • num_labels (int, optional) – Number of classification labels. Defaults to 2.

  • +
  • layers (int, optional) – Number of layers. Defaults to 1.

  • +
  • activation_function (str, optional) – Activation function. Defaults to ‘tanh’.

  • +
  • overwrite_ok (bool, optional) – Force overwrite if a head with the same name exists. Defaults to False.

  • +
+
+
+
+ +
+
+apply_to_adapter_layers(fn)
+

Applies a function to all adapter layers of the model.

+
+ +
+
+apply_to_basemodel_childs(fn)
+

Applies a function to all direct children of the model if they are an instance of AdapterLayerBase.

+
+ +
+
+average_adapter(adapter_name: str, adapter_list: List[str], weights: Optional[List[float]] = None, normalize_weights: bool = True, overwrite_ok: bool = False, set_active: bool = False)
+

Adds a new adapter module as weighted average of a set of existing adapter modules.

+
+
Parameters
+
    +
  • adapter_name (str) – The name of the adapter module to be added.

  • +
  • adapter_list (List[str]) – The names of the existing adapter modules whose weights should be averaged.

  • +
  • weights (List[float], optional) – The averaging weights, one per adapter in adapter_list. If not provided, all adapters are weighted equally.

  • +
  • normalize_weights (bool, optional) – Whether to normalize the given weights so that they sum to 1. Defaults to True.

  • +
  • overwrite_ok (bool, optional) – Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.

  • +
  • set_active (bool, optional) – Set the adapter to be the active one. By default (False), the adapter is added but not activated.

  • +
+
+
+
+ +
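A minimal sketch (the adapter names are illustrative), assuming "adapter_a" and "adapter_b" have already been added to the model:

# Create "avg_adapter" as a weighted average of two existing adapters.
model.average_adapter(
    adapter_name="avg_adapter",
    adapter_list=["adapter_a", "adapter_b"],
    weights=[0.7, 0.3],
    set_active=True,
)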
+
+delete_adapter(adapter_name: str)
+

Deletes the adapter with the specified name from the model.

+
+
Parameters
+

adapter_name (str) – The name of the adapter.

+
+
+
+ +
+
+delete_adapter_fusion(adapter_names: Union[Fuse, list, str])
+

Deletes the AdapterFusion layer of the specified adapters.

+
+
Parameters
+

adapter_names (Union[Fuse, list, str]) – AdapterFusion layer to delete.

+
+
+
+ +
+
+delete_head(head_name: str)
+

Deletes the prediction head with the specified name from the model.

+
+
Parameters
+

head_name (str) – The name of the prediction head to delete.

+
+
+
+ +
+
+eject_prefix_tuning(name: str)
+

Converts the prefix tuning with the given name from the reparameterized form into the flat form.

+
+
Parameters
+

name (str) – The name of the prefix tuning.

+
+
+
+ +
+
+forward(input_ids: Optional[Tensor] = None, lang_ids: Optional[LongTensor] = None, attention_mask: Optional[Tensor] = None, token_type_ids: Optional[Tensor] = None, position_ids: Optional[Tensor] = None, head_mask: Optional[Tensor] = None, inputs_embeds: Optional[Tensor] = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None, head: Optional[str] = None, output_adapter_gating_scores: Optional[bool] = False, output_adapter_fusion_attentions: Optional[bool] = False, **kwargs)
+

The [XmodAdapterModel] forward method overrides the __call__ special method.

+

<Tip>

+

Although the recipe for the forward pass needs to be defined within this function, one should call the [Module] instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

+

</Tip>

+
+
Parameters
+
    +
  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –

    Indices of input sequence tokens in the vocabulary.

    +

    Indices can be obtained using [AutoTokenizer]. See [PreTrainedTokenizer.encode] and [PreTrainedTokenizer.__call__] for details.

    +

    [What are input IDs?](../glossary#input-ids)

    +

  • +
  • lang_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) – Indices of the language adapters that should be activated for each sample, respectively. Defaults to the index that corresponds to self.config.default_language.

  • +
  • attention_mask (torch.FloatTensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    +
      +
    • 1 for tokens that are not masked,

    • +
    • 0 for tokens that are masked.

    • +
    +

    [What are attention masks?](../glossary#attention-mask)

    +

  • +
  • token_type_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Segment token indices to indicate first and second portions of the inputs. Indices are selected in [0, 1]:

    +
      +
    • 0 corresponds to a sentence A token,

    • +
    • 1 corresponds to a sentence B token.

    • +
    +

    [What are token type IDs?](../glossary#token-type-ids)

    +

  • +
  • position_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –

    Indices of positions of each input sequence token in the position embeddings. Selected in the range [0, config.max_position_embeddings - 1].

    +

    [What are position IDs?](../glossary#position-ids)

    +

  • +
  • head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –

    Mask to nullify selected heads of the self-attention modules. Mask values selected in [0, 1]:

    +
      +
    • 1 indicates the head is not masked,

    • +
    • 0 indicates the head is masked.

    • +
    +

  • +
  • inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids, you can choose to directly pass an embedded representation. This is useful if you want more control over how to convert input_ids indices into associated vectors than the model’s internal embedding lookup matrix.

  • +
  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned tensors for more detail.

  • +
  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for more detail.

  • +
  • return_dict (bool, optional) – Whether or not to return a [~utils.ModelOutput] instead of a plain tuple.

  • +
+
+
+
+ +
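As a sketch of how lang_ids enters the forward pass (a matching tokenizer and a previously added head "ner_head" are assumed, and index 0 is assumed to select the desired language adapter):

import torch

inputs = tokenizer("Hello world", return_tensors="pt")
# One language index per token position, shape (batch_size, sequence_length).
lang_ids = torch.zeros_like(inputs["input_ids"])
outputs = model(**inputs, lang_ids=lang_ids, head="ner_head")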
+
+forward_context(context: ForwardContext, *args, **kwargs)
+

This method is called by the ForwardContext at the beginning of the forward pass.

+
+ +
+
+forward_head(all_outputs, head_name=None, cls_output=None, attention_mask=None, return_dict=False, context=None, **kwargs)
+

The forward pass through a prediction head configuration. There are three ways to specify the used prediction head configuration (in order of priority):

+
+
    +
  1. If a head_name is passed, the head with the given name is used.

  2. +
  3. If the forward call is executed within an AdapterSetup context, the head configuration is read from the context.

  4. +
  5. If the active_head property is set, the head configuration is read from there.

  6. +
+
+
+
Parameters
+
    +
  • all_outputs (dict) – The outputs of the base model.

  • +
  • head_name (str, optional) – The name of the prediction head to use. If None, the active head is used.

  • +
  • cls_output (torch.Tensor, optional) – The classification output of the model.

  • +
  • attention_mask (torch.Tensor, optional) – The attention mask of the model.

  • +
  • return_dict (bool) – Whether or not to return a ModelOutput instead of a plain tuple.

  • +
  • get_cls_from_eos_tokens (bool) – If set to True, retrieve classifier token representations from the last <eos> token in the sequence. Setting to True requires eos_mask to be passed as well.

  • +
  • **kwargs – Additional keyword arguments passed to the forward pass of the head.

  • +
+
+
+
+ +
+
+freeze_model(freeze=True)
+

Freezes all weights of the model.

+
+ +
+
+get_adapter(name)
+

If self.base_model is self, the model class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+get_labels(head_name=None)
+

Returns the labels the given head is assigning/predicting.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. If None, the labels of the active head are returned. Defaults to None.

  • +
+
+
+

Returns: labels

+
+ +
+
+get_labels_dict(head_name=None)
+

Returns the id2label dict for the given head.

+
+
Parameters
+
    +
  • head_name (str, optional) – The name of the head whose labels should be returned. If None, the labels of the active head are returned. Defaults to None.

  • +
+
+
+

Returns: id2label

+
+ +
+
+get_output_embeddings() Union[Module, List[Module]]
+

Returns the model’s output embeddings.

+
+
Returns
+

A torch module mapping hidden states to vocabulary.

+
+
Return type
+

nn.Module

+
+
+
+ +
+
+head_type()
+

Checks which head type the decorated function belongs to and raises an error if the model does not support the +head type.

+
+ +
+
+init_adapters(model_config, adapters_config, add_prefix_tuning_pool=True)
+

This method initializes adapter modules and fusion modules from the model config.

+
+ +
+
+iter_layers() Iterable[Tuple[int, Module]]
+

Iterates over all layers of the model.

+
+ +
+
+load_adapter(adapter_name_or_path: str, config: Optional[Union[dict, str]] = None, version: Optional[str] = None, model_name: Optional[str] = None, load_as: Optional[str] = None, source: Optional[str] = None, with_head: bool = True, custom_weights_loaders: Optional[List[WeightsLoader]] = None, leave_out: Optional[List[int]] = None, id2label=None, set_active: bool = False, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained PyTorch adapter module from the local file system or a remote location.

+
+
Parameters
+
    +
  • adapter_name_or_path (str) –

    can be either:

    +
      +
    • the identifier of a pre-trained task adapter to be loaded from Adapter Hub

    • +
    • a path to a directory containing adapter weights saved using model.save_adapter()

    • +
    • a URL pointing to a zip folder containing a saved adapter module

    • +
    +

  • +
  • config (dict or str, optional) – The requested configuration of the adapter. If not specified, this will be either the default adapter config for the requested adapter (if available) or the global default adapter config.

  • +
  • version (str, optional) – The version of the adapter to be loaded.

  • +
  • model_name (str, optional) – The string identifier of the pre-trained model.

  • +
  • load_as (str, optional) – Load the adapter using this name. By default, the name with which the adapter was saved will be used.

  • +
  • source (str, optional) –

    Identifier of the source(s) from where to load the adapter. Can be:

    +
      +
    • +
      ”ah”: search on AdapterHub Hub repo.

      Note: the Hub repo has been archived and all adapters have been moved to the HuggingFace Model Hub. Loading from this source is deprecated.

      +
      +
      +
    • +
    • ”hf”: search on HuggingFace Model Hub.

    • +
    • None (default): search on all sources

    • +
    +

  • +
  • leave_out – Dynamically drop adapter modules in the specified Transformer layers when loading the adapter.

  • +
  • set_active (bool, optional) – Set the loaded adapter to be the active one. By default (False), the adapter is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if a safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the adapter was added to the model.

+
+
Return type
+

str

+
+
+
+ +
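For example, loading an adapter from the Hugging Face Model Hub and activating it in one call (the adapter identifier is illustrative):

adapter_name = model.load_adapter(
    "AdapterHub/roberta-base-pf-sick",  # Hub identifier or local directory
    source="hf",
    set_active=True,
)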
+
+load_adapter_fusion(adapter_fusion_name_or_path: str, load_as: Optional[str] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, set_active: bool = False, with_head: bool = True, use_safetensors: bool = False, **kwargs) str
+

Loads a pre-trained AdapterFusion layer from the local file system.

+
+
Parameters
+
    +
  • adapter_fusion_name_or_path (str) – a path to a directory containing AdapterFusion weights saved using model.save_adapter_fusion().

  • +
  • load_as (str, optional) – Load the AdapterFusion using this name. By default, the name with which the AdapterFusion layer was saved will be used.

  • +
  • set_active (bool, optional) – Activate the loaded AdapterFusion. By default (False), the AdapterFusion is loaded but not activated.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if a safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the AdapterFusion was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+load_head(save_directory: str, load_as: Optional[str] = None, id2label: Optional[Dict[int, str]] = None, use_safetensors: bool = False, **kwargs) str
+

Loads a model prediction head from a directory where it was saved using save_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head is saved.

  • +
  • load_as (str, optional) – Load the prediction head using this name. By default, the name with which the head was saved will be used.

  • +
  • id2label (Dict[int, str], optional) – Provide a custom mapping from class ids to class labels. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are loaded via safetensors if a safetensors checkpoint is available. Otherwise, the regular torch save method is used.

  • +
+
+
Returns
+

The name with which the prediction head was added to the model.

+
+
Return type
+

str

+
+
+
+ +
+
+merge_adapter(name: str)
+

Merges the weights of the given LoRA module with the Transformer weights as described in the paper.

+
+
Parameters
+

name (str) – LoRA module to merge.

+
+
+
+ +
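A sketch of the merge/unmerge workflow for a LoRA adapter named "my_lora" (assumed to have been added beforehand; see also reset_adapter() below):

model.merge_adapter("my_lora")  # fold the LoRA weights into the base weights
# ... run inference without the extra adapter computation ...
model.reset_adapter()           # restore the original, unmerged weights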
+
+push_adapter_to_hub(repo_name: str, adapter_name: str, organization: Optional[str] = None, adapterhub_tag: Optional[str] = None, datasets_tag: Optional[str] = None, local_path: Optional[str] = None, commit_message: Optional[str] = None, private: Optional[bool] = None, token: Optional[Union[bool, str]] = None, overwrite_adapter_card: bool = False, create_pr: bool = False, revision: Optional[str] = None, commit_description: Optional[str] = None, adapter_card_kwargs: Optional[dict] = None, **deprecated_kwargs)
+

Upload an adapter to HuggingFace’s Model Hub.

+
+
Parameters
+
    +
  • repo_name (str) – The name of the repository on the model hub to upload to.

  • +
  • adapter_name (str) – The name of the adapter to be uploaded.

  • +
  • organization (str, optional) – Organization in which to push the adapter (you must be a member of this organization). Defaults to None.

  • +
  • adapterhub_tag (str, optional) – Tag of the format <task>/<subtask> for categorization on https://adapterhub.ml/explore/. See https://docs.adapterhub.ml/contributing.html#add-a-new-task-or-subtask for more. If not specified, datasets_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • datasets_tag (str, optional) – Dataset identifier from https://huggingface.co/datasets. If not specified, adapterhub_tag must be given in case a new adapter card is generated. Defaults to None.

  • +
  • local_path (str, optional) – Local path used as the clone directory of the adapter repository. If not specified, a temporary directory will be created. Defaults to None.

  • +
  • commit_message (str, optional) – Message to commit while pushing. Will default to "add config", "add tokenizer" or "add model" depending on the type of the class.

  • +
  • private (bool, optional) – Whether or not the repository created should be private (requires a paid subscription).

  • +
  • token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url is not specified.

  • +
  • overwrite_adapter_card (bool, optional) – Overwrite an existing adapter card with a newly generated one. If set to False, an adapter card is only generated if none exists. Defaults to False.

  • +
  • create_pr (bool, optional) – Whether or not to create a PR with the uploaded files or directly commit.

  • +
  • revision (str, optional) – Branch to push the uploaded files to.

  • +
  • commit_description (str, optional) – The description of the commit that will be created.

  • +
+
+
Returns
+

The url of the adapter repository on the model hub.

+
+
Return type
+

str

+
+
+
+ +
+
+reset_adapter()
+

Resets the weights of a LoRA module merged using model.merge_adapter(name).

+
+ +
+
+save_adapter(save_directory: str, adapter_name: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves an adapter and its configuration file to a directory so that it can be shared or reloaded using load_adapter().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapter should be saved.

  • +
  • adapter_name (str) – Name of the adapter to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given adapter name is invalid.

+
+
+
+ +
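A minimal save/reload roundtrip (paths and names are illustrative):

model.save_adapter("./checkpoints/ner_adapter", "ner_adapter", with_head=True)
# Later, possibly in a fresh process:
model.load_adapter("./checkpoints/ner_adapter", load_as="ner_adapter")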
+
+save_adapter_fusion(save_directory: str, adapter_names: Union[Fuse, list, str], meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, with_head: Union[bool, str] = False, use_safetensors: bool = False)
+

Saves an AdapterFusion layer and its configuration file to a directory so that it can be shared or reloaded using load_adapter_fusion().

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion should be saved.

  • +
  • adapter_names (Union[Fuse, list, str]) – AdapterFusion to be saved.

  • +
  • with_head (Union[bool, str]) – If True, will save a head with the same name as the AdapterFusionLayer. If a string, this will be used as the name of the head to be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
Raises
+

ValueError – If the given AdapterFusion name is invalid.

+
+
+
+ +
+
+save_all_adapter_fusions(save_directory: str, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all AdapterFusion layers of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the AdapterFusion layers should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_adapters(save_directory: str, with_head: bool = True, meta_dict: Optional[dict] = None, custom_weights_loaders: Optional[List[WeightsLoader]] = None, use_safetensors: bool = False)
+

Saves all adapters of this model together with their configuration to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to a directory where the adapters should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_all_heads(save_directory: str, use_safetensors: bool = False)
+

Saves all prediction heads of this model to subfolders of the given location.

+
+
Parameters
+
    +
  • save_directory (str) – Path to the base directory where prediction heads should be saved.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_head(save_directory: str, head_name: Optional[str] = None, use_safetensors: bool = False) None
+

Saves a model prediction head to a directory such that it can be reloaded using load_head().

+
+
Parameters
+
    +
  • save_directory (str) – Path to the directory where the prediction head should be saved.

  • +
  • head_name (str, optional) – Name of the head to save. Set to None if model only has one head. Defaults to None.

  • +
  • use_safetensors (bool, optional) – If True, weights are saved via safetensors. Otherwise, the regular torch save method is used.

  • +
+
+
+
+ +
+
+save_pretrained(save_directory: Union[str, PathLike], **kwargs)
+

Save a model and its configuration file to a directory, so that it can be re-loaded using the [~PreTrainedModel.from_pretrained] class method.

+
+
Parameters
+
    +
  • save_directory (str or os.PathLike) – Directory to which to save. Will be created if it doesn’t exist.

  • +
  • is_main_process (bool, optional, defaults to True) – Whether the process calling this is the main process or not. Useful when in distributed training (e.g. on TPUs), where this function needs to be called on all processes. In this case, set is_main_process=True only on the main process to avoid race conditions.

  • +
  • state_dict (nested dictionary of torch.Tensor) – The state dictionary of the model to save. Will default to self.state_dict(), but can be used to only save parts of the model or if special precautions need to be taken when recovering the state dictionary of a model (like when using model parallelism).

  • +
  • save_function (Callable) – The function to use to save the state dictionary. Useful on distributed training like TPUs when one needs to replace torch.save with another method.

  • +
  • push_to_hub (bool, optional, defaults to False) – Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the repository you want to push to with repo_id (will default to the name of save_directory in your namespace).

  • +
  • max_shard_size (int or str, optional, defaults to “5GB”) –

    The maximum size for a checkpoint before being sharded. Checkpoint shards will then each be of a size lower than this. If expressed as a string, it needs to be digits followed by a unit (like “5MB”). We default it to 5GB in order for models to be able to run easily on free-tier Google Colab instances without CPU OOM issues.

    +

    <Tip warning={true}>

    +

    If a single weight of the model is bigger than max_shard_size, it will be in its own checkpoint shard, which will be bigger than max_shard_size.

    +

    </Tip>

    +

  • +
  • safe_serialization (bool, optional, defaults to True) – Whether to save the model using safetensors or the traditional PyTorch way (that uses pickle).

  • +
  • variant (str, optional) – If specified, weights are saved in the format pytorch_model.<variant>.bin.

  • +
  • token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use +the token generated when running huggingface-cli login (stored in ~/.huggingface).

  • +
  • save_peft_format (bool, optional, defaults to True) – For backward compatibility with the PEFT library, in case adapter weights are attached to the model, all keys of the adapter state dict need to be prepended with base_model.model. Advanced users can disable this behaviour by setting save_peft_format to False.

  • +
  • kwargs (Dict[str, Any], optional) – Additional keyword arguments passed along to the [~utils.PushToHubMixin.push_to_hub] method.

  • +
+
+
+
+ +
+
+set_active_adapters(adapter_setup: Union[list, AdapterCompositionBlock], skip_layers: Optional[List[int]] = None)
+

Sets the adapter modules to be used by default in every forward pass. This setting can be overridden by passing the adapter_names parameter in the forward() pass. If no adapter with the given name is found, no module of the respective type will be activated. In case the calling model class supports named prediction heads, this method will attempt to activate a prediction head with the name of the last adapter in the list of passed adapter names.

+
+
Parameters
+

adapter_setup (list) – The list of adapters to be activated by default. Can be a fusion or stacking configuration.

+
+
+
+ +
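For example, activating a single adapter or a stacked composition (the adapter names are illustrative):

import adapters.composition as ac

model.set_active_adapters("ner_adapter")
# Or activate a composition, e.g. stacking a language and a task adapter:
model.set_active_adapters(ac.Stack("lang_adapter", "task_adapter"))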
+
+tie_weights()
+

Tie the weights between the input embeddings and the output embeddings.

+

If the torchscript flag is set in the configuration, parameter sharing cannot be handled, so we clone the weights instead.

+
+ +
+
+train_adapter(adapter_setup: Union[list, AdapterCompositionBlock], train_embeddings=False)
+

Sets the model into the mode for training the given adapters. If self.base_model is self, the model class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_adapter_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into the mode for training adapter fusion, as determined by a list of adapter names. If self.base_model is self, the model class must inherit from a class that implements this method to preclude infinite recursion.

+
+ +
+
+train_fusion(adapter_setup: Union[list, AdapterCompositionBlock], unfreeze_adapters=False)
+

Sets the model into the mode for training adapter fusion, as determined by a list of adapter names.

+
+ +
diff --git a/contributing.html b/contributing.html new file mode 100644 index 0000000000..55e3a77b67 --- /dev/null +++ b/contributing.html @@ -0,0 +1,377 @@

Contributing to AdapterHub

+

There are many ways in which you can contribute to AdapterHub and the adapters library. This includes code contributions such as:

+
    +
  • implementing new adapter methods

  • +
  • adding support for new Transformer models

  • +
  • fixing open issues

  • +
+

as well as non-code contributions such as:

+
    +
  • training and uploading adapters to the Hub

  • +
  • writing documentation and blog posts

  • +
  • helping others with their issues and questions

  • +
+

Whichever way you’d like to contribute, you’re very welcome to do so!

+
+

Contributing to the adapters codebase

+
+

Setting up your dev environment

+

To get started with writing code for adapters, you will want to set up the project in a local development environment.

+

adapters closely follows the original Hugging Face Transformers repository in many aspects. This guide assumes that you want to set up your dev environment on a local machine and that you have basic knowledge of git. Additionally, Python 3.8 or above must be installed to get started.

+

In the following, we go through the setup procedure step by step:

+
    +
  1. Fork the adapters repository to get a local copy of the code under your user account.

  2. +
  3. Clone your fork to your local machine:

    +
    git clone --recursive git@github.com:<YOUR_USERNAME>/adapters.git
cd adapters
    +
    +
    +

    Note: The --recursive flag is important to initialize git submodules.

    +
  4. +
  5. Create a virtual environment, e.g. via virtualenv or conda.

  6. +
  7. Install PyTorch, following the installation command for your environment on their website.

  8. +
  9. Install Hugging Face Transformers from the local git submodule:

    +
    pip install ./hf_transformers
    +
    +
    +
  10. +
  11. Install adapters and required dev dependencies:

    +
    pip install -e ".[dev]"
    +
    +
    +
  12. +
+
+
+

Adding Adapter Methods

+

How to integrate new efficient fine-tuning/adapter methods into adapters is described at https://docs.adapterhub.ml/contributing/adding_adapter_methods.html.

+
+
+

Adding Adapters to a Model

+

How to add adapter support to a model type already supported by Hugging Face Transformers is described at https://docs.adapterhub.ml/contributing/adding_adapters_to_a_model.html.

+
+
+

Testing your changes to the codebase

+

adapters provides multiple Makefile targets for easily running tests and repo checks. Make sure these checks run without errors to pass the CI pipeline tasks when you open a pull request.

+

To run all tests in the repository:

+
make test
+
+
+

To auto format code and imports in the whole codebase:

+
make style
+
+
+

This will run black and isort.

+

To run all quality checks ensuring code style and repo consistency:

+
make quality
+
+
+

This will run checks with black, isort and flake8 as well as additional custom checks.

+
+
+
+

Publishing Pre-Trained Adapters

+

How to make your own trained adapters accessible for the adapters library via the HuggingFace Model Hub is described at https://docs.adapterhub.ml/huggingface_hub.html.

+
+
diff --git a/contributing/adding_adapter_methods.html b/contributing/adding_adapter_methods.html new file mode 100644 index 0000000000..eabea7e3d6 --- /dev/null +++ b/contributing/adding_adapter_methods.html @@ -0,0 +1,417 @@

Adding Adapter Methods

+

This document describes how different efficient fine-tuning methods can be integrated into the codebase of adapters. It can be used as a guide to add new efficient fine-tuning/adapter methods.

+

Before we start to go into implementation details, first some important design philosophies of adapters:

+
    +
  • Adapters should integrate seamlessly with existing model classes: This means (a) if a model architecture supports adapters, it should be possible to use them with all model classes of this architecture and (b) adapters should be entirely opt-in, i.e. the model classes still must work without adapters.

  • +
  • Copied code should be minimal: adapters tries to avoid copying the original HF code as far as possible. We extensively use Python mixins to achieve this.

  • +
+

Now we highlight the most important components of integrating adapter methods into Transformer models. Each integration is highly dependent on the specific details of the adapter method. Therefore, the described steps might not be applicable to every implementation.

+
+

Implementation

+

❓ As adapter methods typically inject blocks of new parameters into an existing Transformer model, they can mostly be implemented using multiple blocks of classes deriving from torch.nn.Module. These module classes then have to be inserted into the correct locations within the Transformer model implementation. Thus, each adapter method implementation should provide at least two classes:

+
    +
  • a configuration class deriving from AdapterConfig that provides attributes for all configuration options of the method

  • +
  • a module class deriving from the abstract AdapterLayerBase that provides the method parameters and a set of standard adapter management functions

    +
      +
    • modules supporting adapter composition should instead derive from ComposableAdapterLayerBase

    • +
    +
  • +
+
+

Configuration

+

All configuration classes reside in src/adapters/configuration/adapter_config.py.

+
    +
  • To add a new configuration class for a new method, create a new subclass of AdapterConfig. Make sure to set the architecture attribute in your class (a schematic sketch follows below).

  • +
  • Finally, also make sure the config class is added to the __init__.py files in src/adapters.

  • +
+
+
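As a schematic sketch (not actual library code; the method name and its option are hypothetical), such a configuration class could look like this:

from dataclasses import dataclass

from adapters import AdapterConfig

@dataclass(eq=False)
class MyMethodConfig(AdapterConfig):
    # Identifies the adapter method this config belongs to.
    architecture: str = "my_method"
    # A hypothetical method-specific hyperparameter.
    reduction_factor: int = 16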
+

Modeling

+

All adapter method implementations reside in src/adapters/methods.

+
+

For methods without composition support

+

The AdapterLayerBase class from which any new adapter modules should derive resides in src/adapters/methods/adapter_layer_base.py.

+
    +
  • This abstract base class defines a set of methods that should be implemented by each deriving class, including methods for adding, enabling and deleting adapter weights. These methods are marked as abstract in the base class. See AdapterLayerBase for details.

  • +
  • Most importantly however, the module classes deriving from this base class should implement the forward pass through an adaptation component.

  • +
  • The concrete implementation of these classes heavily depends on the specifics of the adapter method.

  • +
+
+
+

For methods with composition support

+

The ComposableAdapterLayerBase class (a subclass of AdapterLayerBase), which resides in src/adapters/methods/adapter_layer_base.py, provides the basic skeleton for implementing adapter composition.

+
    +
  • Your deriving module class should first implement all methods required by AdapterLayerBase. See the section above for details.

  • +
  • For adapter composition, the pre-implemented compose() method constitutes the main entry-point. This method should be called during the forward pass of your adapter module.

  • +
  • compose() expects a state object, which is a generic named tuple object defined by your adapter method. This state object should hold all tensors (such as hidden states, attention masks etc.) and state attributes required for your adapter implementation. See BottleneckState for an example.

  • +
  • Implementations for specific composition blocks are given in methods starting with compose_. Some composition blocks provide generic default implementations, some must be implemented by the deriving class if they should be supported. Make sure to list all supported composition blocks in the supported_compositions class attribute of your deriving module.

  • +
  • In any case, a small set of helper methods should be implemented by any deriving module to support basic composition logic. These are marked as abstract methods in ComposableAdapterLayerBase and currently consist of the following: vslice(), pad_and_concat(), repeat(), mean(), compose_single(). See ComposableAdapterLayerBase for details.

  • +
+

For a reference implementation, have a look at BottleneckLayer for bottleneck adapters.

+
+
+

For all methods

+

To actually make use of the newly implemented classes, it’s finally necessary to integrate the forward calls to the modules in the actual model implementations.

+
    +
  • This, again, is highly dependent on how the adapter method interacts with the base model classes. Typically, module classes can be integrated either via mixins (see modules starting with “mixin” in src/adapters/models) or directly as submodules of the respective model components.

  • +
  • The model class integration has to be repeated for each supported Transformer model, as they typically don’t share a codebase. At this point it is often important to consider where the adapters need to be added to the Transformer model and whether there is an implementation that does not require more copying of classes than the current implementation. Please try to integrate any new adapter method into every model class when it’s reasonable. You can find all currently supported model classes at https://docs.adapterhub.ml/model_overview.html.

  • +
+

Additional things to consider

+
    +
  • New adapter methods typically also require some changes in the AdapterLoader class in src/adapters/loading.py (also see here).

  • +
  • Depending on the method to be integrated, further changes in other classes might be necessary.

  • +
+
+
+
+
+

Testing

+

adapters provides a framework for testing adapter methods on implementing models in tests. Tests for each adapter method are provided via a mixin class. All test mixins derive from the common AdapterMethodBaseTestMixin class and reside in tests/methods.

+

📝 Steps

+
    +
  • Add a new test_<method>.py module in tests/methods.

    +
      +
    • This module should contain a <method>TestMixin class deriving from AdapterMethodBaseTestMixin that implements typical methods of adding, loading and training modules of the new adapter method.

    • +
    • Have a look at existing test mixins for reference.

    • +
    +
  • +
  • Next, add the newly implemented test mixin to the tests of all model types that support the new adapter method.

    +
      +
    • Each model type has its own test class tests/test_<model_type>.py that contains a <model_type>AdapterTest class. Add the new test mixin to the mixins of this class. E.g., if the new method is supported by BERT, add its test mixin to BertAdapterTest.

    • +
    +
  • +
+
+
+

Documentation

+

❓ The documentation for adapters lives in the docs folder.

+

📝 Steps

+
    +
  • Add the class documentation for the configuration class of the new method in docs/classes/adapter_config.rst.

  • +
  • In docs/overview.md, add a new section for the new adapter method that describes the most important concepts. Please try to follow the general format of the existing methods.

  • +
  • Add a new column in the table in docs/model_overview.md and check the models that support the new adapter method.

  • +
+

Finally, please add a row for the new method in the table of supported methods under Implemented Methods in the main README.md of this repository.

+
+
+

Training Example Adapters

+

❓ To make sure the new adapter implementation works properly, it is useful to train some example adapters and compare the training results to full model fine-tuning and/or reference implementations. Ideally, this would include training adapters on one (or more) tasks that are good for demonstrating the new method and uploading them to AdapterHub.

+

Hugging Face already provides example training scripts for many tasks; some of them have already been modified to support adapter training (see https://github.com/Adapter-Hub/adapters/tree/main/examples).

+
+
diff --git a/contributing/adding_adapters_to_a_model.html b/contributing/adding_adapters_to_a_model.html new file mode 100644 index 0000000000..426627c37f --- /dev/null +++ b/contributing/adding_adapters_to_a_model.html @@ -0,0 +1,432 @@

Adding Adapters to a Model

+

This document gives an overview of how new model architectures of Hugging Face Transformers can be supported by adapters. Before delving into implementation details, you should familiarize yourself with the main design philosophies of adapters:

+
    +
  • Adapters should integrate seamlessly with existing model classes: If a model architecture supports adapters, it should be possible to use them with all model classes of this architecture.

  • +
  • Copied code should be minimal: adapters extensively uses Python mixins to add adapter support to HF models. Functions that cannot be sufficiently modified by mixins are copied and then modified. Try to avoid copying functions as much as possible.

  • +
+
+

Relevant Classes

+

Adding adapter support to an existing model architecture requires modifying some parts of the model forward pass logic. These modifications are realized by the four files in the src/adapters/models/<model_type>/ directory. Let’s examine the purpose of these files in the example of BERT. It’s important to note that we are adapting the original Hugging Face model, implemented in transformers/models/bert/modeling_bert.py. The files in src/adapters/models/bert/ are:

+
    +
  1. src/adapters/models/bert/mixin_bert.py: This file contains mixins for each class we want to change. For example, in the BertSelfAttention class, we need to make changes for LoRA and Prefix Tuning. For this, we create a BertSelfAttentionAdaptersMixin to implement these changes. We will discuss how this works in detail below.

  2. +
  3. src/adapters/models/bert/modeling_bert.py: For some classes of the BERT implementation (e.g. BertModel or BertLayer) the code can be sufficiently customized via mixins. For other classes (like BertSelfAttention), we need to edit the original code directly. These classes are copied into src/adapters/models/bert/modeling_bert.py and modified.

  4. +
  5. src/adapters/models/bert/adapter_model.py: In this file, the adapter model class is defined. This class allows flexible adding of and switching between multiple prediction heads of different types. This looks about the same for each model, except that each model has different heads and thus different add_..._head() functions.

  6. +
  7. src/adapters/models/bert/__init__.py: Defines Python’s import structure.

  8. +
+
+
+

Implementation Steps 📝

+

Now that we have discussed the purpose of every file in src/adapters/models/<model_type>/, we go through the integration of adapters into an existing model architecture step by step. The following steps might not be applicable to every model architecture.

+
    +
  1. Files:

    +
      +
      • Create the src/adapters/models/<model_type>/ directory and in it the 4 files: mixin_<model_type>.py, modeling_<model_type>.py, adapter_model.py and __init__.py

    • +
    +
  2. +
  3. Mixins:

    +
      +
      • In src/adapters/models/<model_type>/mixin_<model_type>.py, create mixins for any class you want to change and for which you can’t reuse an existing mixin from another class.

      +
        +
      • To figure out which classes to change, think about where to insert LoRA, Prefix Tuning, and bottleneck adapters.

      • +
      • You can use similar model implementations for guidance.

      • +
      • Often, existing mixins of another class can be reused. E.g. BertLayer, RobertaLayer, XLMRobertaLayer, DebertaLayer, DebertaV2Layer and BertGenerationLayer (all models derived from BERT) use the BertLayerAdaptersMixin.

      • +
      +
    • +
    • To additionally support Prefix Tuning, it’s necessary to apply the forward call to the PrefixTuningLayer module in the respective attention layer (see step 3 for how to modify the code of a Hugging Face class).

    • +
    • Make sure the calls to bottleneck_layer_forward() are added in the right places.

    • +
    • The mixin for the whole base model class (e.g., BertModel) should derive from ModelBaseAdaptersMixin and (if possible) EmbeddingAdaptersMixin and/or InvertibleAdaptersMixin. This mixin should at least implement the iter_layers() method but might require additional modifications depending on the architecture.

      +
        +
      • If the model is a combination of different models, such as the EncoderDecoderModel, use ModelUsingSubmodelsAdaptersMixin instead of ModelBaseAdaptersMixin.

      • +
      +
    • +
    +
  4. +
  5. Copied functions:

    +
      +
    • For those classes where the mixin is not enough to realize the wanted behavior, you must:

    • +
    • Create a new class in src/adapters/models/<model_type>/modeling_<model_type>.py with the name <class>WithAdapters. This class should derive from the corresponding mixin and HF class.

    • +
    • Copy the function you want to change into this class and modify it.

      +
        +
      • e.g., the forward method of the BertSelfAttention class must be adapted to support prefix tuning. We therefore create a class BertSelfAttentionWithAdapters(BertSelfAttentionAdaptersMixin, BertSelfAttention), copy the forward method into it and modify it (see the schematic sketch after this list).

      • +
      • if the forward method of a module is copied and modified, make sure to call adapters.utils.patch_forward() in the module’s init_adapters() method. This ensures adapters work correctly with the accelerate package.

      • +
      +
    • +
    +
  6. +
  7. Modify MODEL_MIXIN_MAPPING

    +
      +
    • For each mixin whose class was not copied into modeling_<model_type>.py, add the mixin/class combination into MODEL_MIXIN_MAPPING in the file src/adapters/models/__init__.py.

    • +
    +
  8. +
  9. Create the adapter model:

    +
      +
    • Adapter-supporting architectures should provide a new model class <model_type>AdapterModel. This class allows flexible adding of and switching between multiple prediction heads of different types.

    • +
    • This is done in the adapter_model.py file:

      +
        +
      • This module should implement the <model_type>AdapterModel class, deriving from ModelWithFlexibleHeadsAdaptersMixin and <model_type>PreTrainedModel.

      • +
      • In the model class, add methods for those prediction heads that make sense for the new model architecture.

      • +
      • Again, have a look at existing implementations.

      • +
      +
    • +
    • Add <model_type>AdapterModel to the ADAPTER_MODEL_MAPPING_NAMES mapping in src/adapters/models/auto/adapter_model.py and to src/adapters/__init__.py.

    • +
    • Define the classes to be added to Python’s import structure in src/adapters/models/<model_type>/__init__.py. This will likely only be the <model_type>AdapterModel.

    • +
    +
  10. +
  11. Adapt the config classes:

    +
      +
    • Adapt the config class to the requirements of adapters in src/transformers/adapters/wrappers/configuration.py.

    • +
    • There are some naming differences in the config attributes of different model architectures. The adapter implementation requires some additional attributes with a specific name to be available. These currently are num_attention_heads, hidden_size, hidden_dropout_prob and attention_probs_dropout_prob, as in the BertConfig class. If your model config does not provide these, add corresponding mappings to CONFIG_CLASS_KEYS_MAPPING.

    • +
    +
  12. +
+
+
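The following is a schematic sketch of step 3 (the import path of the mixin is assumed and the forward body is omitted; this is not the actual library code):

from transformers.models.bert.modeling_bert import BertSelfAttention

from .mixin_bert import BertSelfAttentionAdaptersMixin  # mixin from step 2

class BertSelfAttentionWithAdapters(BertSelfAttentionAdaptersMixin, BertSelfAttention):
    def forward(self, hidden_states, *args, **kwargs):
        # Body copied from BertSelfAttention.forward and modified, e.g. to
        # apply the prefix tuning forward call; details omitted here.
        ...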

Additional (optional) implementation steps 📝

+
    +
  • Parallel adapter inference via Parallel composition block (cf. documentation, PR#150).

  • +
  • Provide mappings for an architecture’s existing (static) prediction heads into adapters flex heads (cf. implementation).

  • +
+
+
+
+

Testing

+

❓ In addition to the general Hugging Face model tests, there are adapter-specific test cases. All tests are executed from the tests folder. You need to add two different test classes.

+

📝 Steps

+
    +
  1. Add a new test_<model_type>.py module in tests/

    +
      +
    • This file is used to test that everything related to the usage of adapters (adding, removing, activating, …) works.

    • +
    • This module typically holds 2 test classes and a test base class:

      +
        +
      • <model_type>AdapterTestBase: This class contains the tokenizer_name, config_class and config.

      • +
      • <model_type>AdapterTest derives from a collection of test mixins that hold various adapter tests (depending on the implementation).

      • +
      • (optionally) <model_type>ClassConversionTest runs tests for correct class conversion if conversion of prediction heads is implemented.

      • +
      +
    • +
    +
  2. +
  3. Add a new test_<model_type>.py module in tests/models/

    +
      +
    • This file is used to test the AdapterModel class.

    • +
    • This module typically holds 1 test class with the name <model_type>AdapterModelTest

      +
        +
      • <model_type>AdapterModelTest derives directly from Hugging Face’s existing model test class <model_type>ModelTest and adds <model_type>AdapterModel as a class to test.

      • +
      +
    • +
    +
  4. +
+
+
+

Documentation

+

❓ The documentation for adapters lives in the docs folder.

+

📝 Steps

+
    +
  • Add docs/classes/models/<model_type>.rst (oriented at the doc file in the HF docs). Make sure to include <model_type>AdapterModel autodoc. Finally, list the file in index.rst.

  • +
  • Add a new row for the model in the model table of the overview page at docs/model_overview.md, listing all the methods implemented by the new model.

  • +
+
+
+

Training Example Adapters

+

❓ To make sure the new adapter implementation works properly, it is useful to train some example adapters and compare the training results to full model fine-tuning. Ideally, this would include training adapters on one (or more) tasks that are good for demonstrating the new model architecture (e.g. GLUE benchmark for BERT, summarization for BART) and uploading them to AdapterHub.

+

We provide training scripts for many tasks here: https://github.com/Adapter-Hub/adapters/tree/main/examples/pytorch/

+
+
diff --git a/embeddings.html b/embeddings.html new file mode 100644 index 0000000000..e905afd169 --- /dev/null +++ b/embeddings.html @@ -0,0 +1,342 @@

Embeddings

+

With adapters, we support dynamically adding, loading, and deleting embeddings. This section will give you an overview of these features. A toy example is illustrated in this notebook.

+
+

Adding and Deleting Embeddings

+

The methods for handling embeddings are similar to those for handling adapters. To add new embeddings, we call add_embeddings. This adds new embeddings for the vocabulary of the tokenizer. In some cases, it might be useful to initialize the embeddings of tokens to the ones of another embedding module. If a reference_embedding and a reference_tokenizer are provided, all embeddings for tokens that are present in both vocabularies are initialized to the embedding provided by the reference_embedding. The new embedding will be created and set as the active embedding. If you are unsure which embedding is currently active, the active_embeddings property contains the currently active embedding.

+
model.add_embeddings('name', tokenizer, reference_embedding='default', reference_tokenizer=reference_tokenizer)
+
+
+

The original embedding of the transformers model is always available under the name "default". To set it as the active embedding, simply call the set_active_embeddings('name') method.

+
model.set_active_embeddings('name')
+
+
+

Similarly, all other embeddings can be set as active by passing their name to the set_active_embeddings method.

+

To delete an embedding that is no longer needed, we can call the delete_embeddings method with the name of the embedding we want to delete. However, you cannot delete the default embedding.

+
model.delete_embeddings('name')
+
+
+

Please note that if the active embedding is deleted, the default embedding is set as the active embedding.

+
+
+

Training Embeddings

+

Embeddings can only be trained with an adapter. To freeze all weights except for the embedding and the adapter:

+
model.train_adapter('adapter_name', train_embeddings=True)
+
+
+

Except for the train_embeddings flag, the training is the same as for just training an adapter (see Adapter Training).

+
+
+

Saving and Loading Embeddings

+

You can save the embeddings by calling save_embeddings('path/to/dir', 'name') and load them with load_embeddings('path/to/dir', 'name').

+
model.save_embeddings(path, 'name')
model.load_embeddings(path, 'reloaded_name')
+
+
+

The path needs to point to a directory in which the weights of the embedding will be saved.

+

You can also save and load the tokenizer with the embedding by passing the tokenizer to save_embeddings.

+
model.save_embeddings(path, 'name', tokenizer)
loaded_tokenizer = model.load_embeddings(path, 'name')
+
+
+
+
diff --git a/extending.html b/extending.html new file mode 100644 index 0000000000..2ac7fe51a7 --- /dev/null +++ b/extending.html @@ -0,0 +1,323 @@

Extending the Library

+
+

Integrating new Transformer models

+

Currently, not all model types included in Hugging Face’s transformers support adapters yet. However, it is possible to add the existing adapter implementation to new models. For detailed instructions, see Adding Adapters to a Model.

+
+
+

Loading custom module weights

+

adapters provides out-of-the-box support for saving and loading adapter and prediction head modules from the local file system or the Hub. However, countless additional module integrations into language models are conceivable. To provide a basis for such new custom model plugins, adapters integrates a basic mechanism to save and load custom weights.

+

All adapter and head module weights are extracted, saved and loaded by implementations of the WeightsLoader class, the two pre-included ones being AdapterLoader and PredictionHeadLoader. To add basic saving and loading functionality to your custom module weights, you can implement a new subclass of WeightsLoader. The two required abstract methods to be implemented are:

+
    +
  • filter_func(self, name: str) -> Callable[[str], bool]: The callable returned by this method is used to extract the module weights to be saved or loaded based on their names.

  • +
  • rename_func(self, old_name: str, new_name: str) -> Callable[[str], str]: The callable returned by this method is used to optionally rename the module weights after loading.

  • +
+

For more advanced functionality, you may also want to override the save() and load() methods. A minimal sketch of such a custom loader follows.

+

Using the custom loader class, weights can now be saved with:

+
loader = MyCustomWeightsLoader(model)
loader.save("path/to/save/dir", "custom_weights_name")
+
+
+

You can also upload these weights to the Hub and then load them from there together with an adapter:

+
model.load_adapter(
    "adapter_name",
    custom_weights_loaders=[MyCustomWeightsLoader(model)]
)
+
+
+
+
diff --git a/genindex.html b/genindex.html new file mode 100644 index 0000000000..d8332c7a14 --- /dev/null +++ b/genindex.html @@ -0,0 +1,2730 @@

Index — AdapterHub documentation (auto-generated Sphinx index page; entry links omitted)

diff --git a/hub_contributing.html b/hub_contributing.html new file mode 100644 index 0000000000..a0daa8c0dd --- /dev/null +++ b/hub_contributing.html @@ -0,0 +1,283 @@

Contributing Adapters to the Hub

+
+

Warning

+

The original approach of contributing adapters via the Hub repository is deprecated. Please upload all new adapters to HuggingFace’s Model Hub as described in Integration with Hugging Face’s Model Hub. For the legacy documentation, see here.

+
+
diff --git a/huggingface_hub.html b/huggingface_hub.html new file mode 100644 index 0000000000..e9302fb44f --- /dev/null +++ b/huggingface_hub.html @@ -0,0 +1,356 @@

Integration with Hugging Face’s Model Hub

+
Hugging Face Hub logo.
+

You can download adapters from and upload them to Hugging Face’s Model Hub. This document describes how to interact with the Model Hub when working with adapters.

+
+

Downloading from the Hub

+

The Hugging Face Model Hub already provides hundreds of pre-trained adapters available for download. To search for available adapters, use the Adapters library filter on the Model Hub website or use this link: https://huggingface.co/models?library=adapter-transformers. Alternatively, all adapters on the Hugging Face Model Hub are also listed on https://adapterhub.ml/explore together with all adapters directly uploaded to AdapterHub.

+

After you have found an adapter you would like to use, loading it into a Transformer model is easy. For example, to load and activate the adapter AdapterHub/roberta-base-pf-sick, write:

+
from adapters import AutoAdapterModel
+
+model = AutoAdapterModel.from_pretrained("roberta-base")
+adapter_name = model.load_adapter("AdapterHub/roberta-base-pf-sick")
+model.active_adapters = adapter_name
+
+
+
+
+

Uploading to the Hub

+

Hugging Face’s Model Hub provides a convenient way for everyone to upload their pre-trained models and share them with the world. Of course, this is also possible with adapters now! In the following, we’ll go through the fastest way of uploading an adapter directly via Python in the adapters library. For more options and information, e.g. for managing models via the CLI and Git, refer to Hugging Face’s documentation.

+
    +
  1. Prepare access credentials: Before being able to push to the Hugging Face Model Hub for the first time, we have to store our access token in the cache. +This can be done via the huggingface-cli by running:

    +
    huggingface-cli login
    +
    +
    +
  2. Push an adapter: Next, we can proceed to upload our first adapter. Let’s say we have a standard pre-trained Transformers model with an existing adapter named awesome_adapter (e.g. added via model.add_adapter("awesome_adapter") and trained afterwards). We can now push this adapter to the Model Hub using model.push_adapter_to_hub() like this:

    +
    model.push_adapter_to_hub(
    +    "my-awesome-adapter",
    +    "awesome_adapter",
    +    adapterhub_tag="sentiment/imdb",
    +    datasets_tag="imdb"
    +)
    +
    +
    +

    This will create a repository my-awesome-adapter under your username, generate a default adapter card as README.md and upload the adapter named awesome_adapter together with the adapter card to the new repository. +adapterhub_tag and datasets_tag provide additional information for categorization.

    +
    +

    Important

    +

    All adapters uploaded to Hugging Face’s Model Hub are automatically also listed on AdapterHub.ml. Thus, for better categorization, either adapterhub_tag or datasets_tag is required when uploading a new adapter to the Model Hub.

    + +
    +
+

Voilà! Your first adapter is on the Hugging Face Model Hub. +Anyone can now run:

+
model.load_adapter("<your_username>/my-awesome-adapter", source="hf")
+
+
+

To update your adapter, simply run push_adapter_to_hub() with the same repository name again. This will push a new commit to the existing repository.

+
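For example, re-using the repository and adapter names from the example above, a later call like the following sketch creates a new commit with the updated adapter weights:

model.push_adapter_to_hub(
    "my-awesome-adapter",
    "awesome_adapter",
    adapterhub_tag="sentiment/imdb",
)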

You can find the full documentation of push_adapter_to_hub() here.

+
+
+ + +
+ +
+ + +
+
+ +
+ +
+ +
+ + + + + + + + + + \ No newline at end of file diff --git a/index.html b/index.html new file mode 100644 index 0000000000..dd45c8ddc2 --- /dev/null +++ b/index.html @@ -0,0 +1,520 @@ + + + + + + + + + + + AdapterHub Documentation — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+ + + + + +
+ +
+ + + + + + + + + + + + + + + + + +
+ + + + +
+
+
+
+ +
+

AdapterHub Documentation

+
+

Note

+

This documentation is based on the new Adapters library.

+

The documentation based on the legacy adapter-transformers library can be found at: https://docs-legacy.adapterhub.ml.

+
+

AdapterHub is a framework simplifying the integration, training and usage of adapters and other efficient fine-tuning methods for Transformer-based language models. +For a full list of currently implemented methods, see the table in our repository.

+

The framework consists of two main components:

+ ++++ + + + + + + + + + + +

Adapters: an add-on to Hugging Face’s Transformers library that adds adapters into transformer models

AdapterHub.ml: a central collection of pre-trained adapter modules

+

Currently, we support the PyTorch versions of all models as listed on the Model Overview page.

+ + + + + + + +
+
+

Citation

+

If you use Adapters in your work, please consider citing our library paper: Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning (https://arxiv.org/abs/2311.11077).

+
@inproceedings{poth-etal-2023-adapters,
+   title = "Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning",
+   author = {Poth, Clifton  and
+      Sterz, Hannah  and
+      Paul, Indraneil  and
+      Purkayastha, Sukannya  and
+      Engl{\"a}nder, Leon  and
+      Imhof, Timo  and
+      Vuli{\'c}, Ivan  and
+      Ruder, Sebastian  and
+      Gurevych, Iryna  and
+      Pfeiffer, Jonas},
+   booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
+   month = dec,
+   year = "2023",
+   address = "Singapore",
+   publisher = "Association for Computational Linguistics",
+   url = "https://aclanthology.org/2023.emnlp-demo.13",
+   pages = "149--160",
+}
+
+
+

Alternatively, for the predecessor adapter-transformers, the Hub infrastructure and adapters uploaded by the AdapterHub team, please consider citing our initial paper: AdapterHub: A Framework for Adapting Transformers

+
@inproceedings{pfeiffer2020AdapterHub,
+   title={AdapterHub: A Framework for Adapting Transformers},
+   author={Jonas Pfeiffer and
+            Andreas R\"uckl\'{e} and
+            Clifton Poth and
+            Aishwarya Kamath and
+            Ivan Vuli\'{c} and
+            Sebastian Ruder and
+            Kyunghyun Cho and
+            Iryna Gurevych},
+   booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020): Systems Demonstrations},
+   year={2020},
+   address = "Online",
+   publisher = "Association for Computational Linguistics",
+   url = "https://www.aclweb.org/anthology/2020.emnlp-demos.7",
+   pages = "46--54",
+}
+
+
+
+
+

Indices and tables

+ +
+ + +
+ +
+ + +
+
+ +
+ +
+ +
+ + + + + + + + + + \ No newline at end of file diff --git a/installation.html b/installation.html new file mode 100644 index 0000000000..e42a2a4e43 --- /dev/null +++ b/installation.html @@ -0,0 +1,332 @@ + + + + + + + + + + + Installation — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+ + + + + +
+ +
+ + + + + + + + + + + + + + + + + +
+ + + + +
+
+
+
+ +
+

Installation

+

The adapters package is designed as an add-on for Hugging Face’s Transformers library. +It currently supports Python 3.8+ and PyTorch 1.10+. You will have to install PyTorch first.

+
+

Important

+

Each adapters version is built for one specific version of Transformers. While using a different version of Transformers with adapters might work, it is highly recommended to use the intended version. adapters will automatically install the correct Transformers version if it is not installed.

+
+
+

Using pip

+
+

From PyPI

+

The simplest way of installation is by using pip to install the package from the Python Package Index:

+
pip install adapters
+
+
+
+
+

From GitHub

+

You can also install the latest development version directly from our GitHub repository:

+
pip install git+https://github.com/adapter-hub/adapters.git
+
+
+
+
+
+

From repository

+

Alternatively, you can clone the repository first and install the package from source. +This allows you to run the included example scripts directly:

+
git clone https://github.com/adapter-hub/adapters.git
+cd adapters
+pip install .
+
+
+
+
+ + +
+ +
+ + +
+
+ +
+ +
+ +
+ + + + + + + + + + \ No newline at end of file diff --git a/loading.html b/loading.html new file mode 100644 index 0000000000..bf14117454 --- /dev/null +++ b/loading.html @@ -0,0 +1,387 @@ + + + + + + + + + + + Loading Pre-Trained Adapters — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+ + + + + +
+ +
+ + + + + + + + + + + + + + + + + +
+ + + + +
+
+
+
+ +
+

Loading Pre-Trained Adapters

+
+

Finding pre-trained adapters

+

AdapterHub.ml provides a central collection of all pre-trained adapters uploaded via Hugging Face’s Model Hub. +You can easily find pre-trained adapters for your task of interest along with all relevant information and code snippets to get started.

+
+

Note

+

The original Hub repository (via source="ah") has been archived and migrated to the HuggingFace Model Hub. The Adapters library supports automatic redirecting to the HF Model Hub when attempting to load adapters from the original Hub repository.

+
+

Alternatively, list_adapters() provides a programmatic way of accessing all available pre-trained adapters. This will return an AdapterInfo object for each retrieved adapter. E.g., we can use it to retrieve information for all adapters trained for a specific model:

+
from adapters import list_adapters
+
+# source can be "ah" (archived Hub repo), "hf" (huggingface.co) or None (for both, default)
+adapter_infos = list_adapters(source="hf", model_name="bert-base-uncased")
+
+for adapter_info in adapter_infos:
+    print("Id:", adapter_info.adapter_id)
+    print("Model name:", adapter_info.model_name)
+    print("Uploaded by:", adapter_info.username)
+
+
+

In case the adapter ID is known, information for a single adapter can also be retrieved via get_adapter_info():

+
from adapters import get_adapter_info
+
+adapter_info = get_adapter_info("@ukp/bert-base-uncased_sentiment_sst-2_pfeiffer", source="ah")
+
+print("Id:", adapter_info.adapter_id)
+print("Model name:", adapter_info.model_name)
+print("Uploaded by:", adapter_info.username)
+
+
+
+
+

Using pre-trained adapters in your code

+

Suppose we have loaded a pre-trained transformer model from Hugging Face, e.g. BERT, and initialized it for adding adapters:

+
from transformers import BertModel
+import adapters
+
+model = BertModel.from_pretrained('bert-base-uncased')
+adapters.init(model)
+
+
+

We can now easily load a pre-trained adapter module from Adapter Hub by its identifier using the load_adapter() method:

+
adapter_name = model.load_adapter('sst-2')
+
+
+

In the minimal case, that’s everything we need to specify to load a pre-trained task adapter for sentiment analysis, trained on the sst-2 dataset using BERT base and a suitable adapter configuration. +The name of the adapter is returned by load_adapter(), so we can activate it in the next step:

+
model.set_active_adapters(adapter_name)
+
+
+

As a second example, let’s have a look at how to load an adapter based on the AdapterInfo returned by the list_adapters() method from above:

+
from adapters import AutoAdapterModel, list_adapters
+
+adapter_infos = list_adapters(source="ah")
+# Take the first adapter info as an example
+adapter_info = adapter_infos[0]
+
+model = AutoAdapterModel.from_pretrained(adapter_info.model_name)
+model.load_adapter(adapter_info.adapter_id, source=adapter_info.source)
+
+
+
+

Advanced usage of load_adapter()

+

To examine what’s happening underneath in a bit more detail, let’s first write out the full method call with all relevant arguments explicitly stated:

+
model.load_adapter(
+    'sst-2',
+    config='pfeiffer',
+    model_name='bert-base-uncased',
+    version=1,
+    load_as='sst',
+    source='ah'
+)
+
+
+

We will go through the different arguments and their meaning one by one:

+
    +
  • The first argument passed to the method specifies the name of the adapter we want to load from Adapter-Hub. The library will search for an available adapter module with this name that matches the model architecture as well as the adapter type and configuration we requested. As the identifier sst-2 resolves to a unique entry in the Hub, the corresponding adapter can be successfully loaded based on this information. To get an overview of all available adapter identifiers, please refer to the Adapter-Hub website.

  • +
  • The config argument defines the adapter architecture the loaded adapter should have. +The value of this parameter can be either a string identifier for one of the predefined architectures, the identifier of an architecture available in the Hub or a dictionary representing a full adapter configuration. +Based on this information, the library will only search for pre-trained adapter modules having the same configuration.

  • +
  • Adapter modules trained on different pre-trained language models generally cannot be used interchangeably. Therefore, we need to make sure to load an adapter matching the language model we are using. If possible, the library will infer the name of the pre-trained model automatically (e.g. when we use from_pretrained('identifier') to load a model from Hugging Face). However, if this is not the case, we must specify the name of the host model in the model_name parameter.

  • +
  • There could be multiple versions of the same adapter available. To load a specific version, use the version parameter.

  • +
  • By default, the load_adapter() method will add the loaded adapter using the identifier string given as the first argument. +To load the adapter using a custom name, we can use the load_as parameter.

  • +
  • Finally, the source parameter makes it possible to load adapters from alternative adapter repositories. Besides the default value ah, referring to AdapterHub, it’s also possible to pass hf to load adapters from Hugging Face’s Model Hub.

  • +
+
+
+
+ + +
+ +
+ + +
+
+ +
+ +
+ +
+ + + + + + + + + + \ No newline at end of file diff --git a/method_combinations.html b/method_combinations.html new file mode 100644 index 0000000000..e99eccfa45 --- /dev/null +++ b/method_combinations.html @@ -0,0 +1,402 @@ + + + + + + + + + + + Method Combinations — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+ + + + + +
+ +
+ + + + + + + + + + + + + + + + + +
+ + + + +
+
+
+
+ +
+

Method Combinations

+

Configuration class: ConfigUnion

+

While different efficient fine-tuning methods and configurations have often been proposed as standalone techniques, combining them for joint training might be beneficial. To make this process easier, adapters provides the possibility to group multiple configuration instances using the ConfigUnion class.

+

For example, this could be used to define different reduction factors for the adapter modules placed after the multi-head attention and the feed-forward blocks:

+
from adapters import BnConfig, ConfigUnion
+
+config = ConfigUnion(
+    BnConfig(mh_adapter=True, output_adapter=False, reduction_factor=16, non_linearity="relu"),
+    BnConfig(mh_adapter=False, output_adapter=True, reduction_factor=2, non_linearity="relu"),
+)
+model.add_adapter("union_adapter", config=config)
+
+
+
+

Mix-and-Match Adapters

+

Configuration class: MAMConfig

+

He et al. (2021) study various variants and combinations of efficient fine-tuning methods. +They propose Mix-and-Match Adapters as a combination of Prefix Tuning and parallel bottleneck adapters. +This configuration is supported by adapters out-of-the-box:

+
from adapters import MAMConfig
+
+config = MAMConfig()
+model.add_adapter("mam_adapter", config=config)
+
+
+

and is identical to using the following ConfigUnion:

+
from adapters import ConfigUnion, ParBnConfig, PrefixTuningConfig
+
+config = ConfigUnion(
+    PrefixTuningConfig(bottleneck_size=800),
+    ParBnConfig(),
+)
+model.add_adapter("mam_adapter", config=config)
+
+
+

Papers:

+ +
+
+

UniPELT

+

Configuration class: UniPELTConfig

+
+Illustration of UniPELT. +

Illustration of the UniPELT method within one Transformer layer. Trained components are colored in shades of magenta.

+
+

An approach similar to the work of He et al. (2021) is taken by Mao et al. (2022) in their UniPELT framework. +They, too, combine multiple efficient fine-tuning methods, namely LoRA, Prefix Tuning and bottleneck adapters, in a single unified setup. +UniPELT additionally introduces a gating mechanism that controls the activation of the different submodules.

+

Concretely, for each adapted module \(m\), UniPELT adds a trainable gating value \(\mathcal{G}_m \in (0, 1)\) that is computed via a feed-forward network (\(W_{\mathcal{G}_m}\)) and sigmoid activation (\(\sigma\)) from the Transformer layer input states (\(x\)):

+
+\[\mathcal{G}_m \leftarrow \sigma(W_{\mathcal{G}_m} \cdot x)\]
+

These gating values are then used to scale the output activations of the injected adapter modules, e.g., for a LoRA layer:

+
+\[ +h \leftarrow W_0 x + \mathcal{G}_{LoRA} B A x +\]
+

In the configuration classes of adapters, these gating mechanisms can be activated via use_gating=True. The full UniPELT setup can be instantiated using UniPELTConfig [1]:

+
from adapters import UniPELTConfig
+
+config = UniPELTConfig()
+model.add_adapter("unipelt", config=config)
+
+
+

which is identical to the following ConfigUnion:

+
from adapters import ConfigUnion, LoRAConfig, PrefixTuningConfig, SeqBnConfig
+
+config = ConfigUnion(
+    LoRAConfig(r=8, alpha=2, use_gating=True),
+    PrefixTuningConfig(prefix_length=10, use_gating=True),
+    SeqBnConfig(reduction_factor=16, use_gating=True),
+)
+model.add_adapter("unipelt", config=config)
+
+
+

Finally, as the gating values for each adapter module might provide interesting insights for analysis, adapters comes with an integrated mechanism of returning all gating values computed during a model forward pass via the output_adapter_gating_scores parameter:

+
outputs = model(**inputs, output_adapter_gating_scores=True)
+gating_scores = outputs.adapter_gating_scores
+
+
+

Note that this parameter is only available to base model classes and AdapterModel classes. +In the example, gating_scores holds a dictionary of the following form:

+
{
+    '<adapter_name>': {
+        <layer_id>: {
+            '<module_location>': np.array([...]),
+            ...
+        },
+        ...
+    },
+    ...
+}
+
+
+

Papers:

+ +
+
+
[1]

Note that the implementation of UniPELT in adapters follows the implementation in the original code, which is slightly different from the description in the paper. See here for more.

+
+
+
+
+ + +
+ +
+ + +
+
+ +
+ +
+ +
+ + + + + + + + + + \ No newline at end of file diff --git a/methods.html b/methods.html new file mode 100644 index 0000000000..330def42c5 --- /dev/null +++ b/methods.html @@ -0,0 +1,550 @@ + + + + + + + + + + + Adapter Methods — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+ + + + + +
+ +
+ + + + + + + + + + + + + + + + + +
+ + + + +
+
+
+
+ +
+

Adapter Methods

+

On this page, we present all adapter methods currently integrated into the adapters library. +A tabular overview of adapter methods is provided here. +Additionally, options to combine multiple adapter methods in a single setup are presented on the next page.

+
+

Bottleneck Adapters

+

Configuration class: BnConfig

+

Bottleneck adapters introduce bottleneck feed-forward layers in each layer of a Transformer model. +Generally, these adapter layers consist of a down-projection matrix \(W_{down}\) that projects the layer hidden states into a lower dimension \(d_{bottleneck}\), a non-linearity \(f\), an up-projection \(W_{up}\) that projects back into the original hidden layer dimension and a residual connection \(r\):

+
+\[ +h \leftarrow W_{up} \cdot f(W_{down} \cdot h) + r +\]
+

Depending on the concrete adapter configuration, these layers can be introduced at different locations within a Transformer block. Further, residual connections, layer norms, activation functions, bottleneck sizes, etc. can be configured.

+

The most important configuration hyperparameter to be highlighted here is the bottleneck dimension \(d_{bottleneck}\). +In adapters, this bottleneck dimension is specified indirectly via the reduction_factor attribute of a configuration. +This reduction_factor defines the ratio between a model’s layer hidden dimension and the bottleneck dimension, i.e.:

+
+\[ +\text{reduction_factor} = \frac{d_{hidden}}{d_{bottleneck}} +\]
+
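For example, for BERT-base with a hidden dimension of \(d_{hidden} = 768\), setting reduction_factor=16 results in a bottleneck dimension of \(d_{bottleneck} = 768 / 16 = 48\).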

A visualization of further configuration options related to the adapter structure is given in the figure below. For more details, we refer to the documentation of BnConfig.

+
+Adapter architectures +

Visualization of possible adapter configurations with corresponding dictionary keys.

+
+

adapters comes with pre-defined configurations for some bottleneck adapter architectures proposed in literature:

+ +

Example:

+
from adapters import BnConfig
+
+config = BnConfig(mh_adapter=True, output_adapter=True, reduction_factor=16, non_linearity="relu")
+model.add_adapter("bottleneck_adapter", config=config)
+
+
+

Papers:

+ +
+
+

Language Adapters - Invertible Adapters

+

Configuration class: SeqBnInvConfig, DoubleSeqBnInvConfig

+

The MAD-X setup (Pfeiffer et al., 2020) proposes language adapters to learn language-specific transformations. +After being trained on a language modeling task, a language adapter can be stacked before a task adapter for training on a downstream task. +To perform zero-shot cross-lingual transfer, one language adapter can simply be replaced by another.

+

In terms of architecture, language adapters are largely similar to regular bottleneck adapters, except for an additional invertible adapter layer after the LM embedding layer. +Embedding outputs are passed through this invertible adapter in the forward direction before entering the first Transformer layer and in the inverse direction after leaving the last Transformer layer. +Invertible adapter architectures are further detailed in Pfeiffer et al. (2020) and can be configured via the inv_adapter attribute of the BnConfig class.

+

Example:

+
from adapters import SeqBnInvConfig
+
+config = SeqBnInvConfig()
+model.add_adapter("lang_adapter", config=config)
+
+
+
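Building on this, the cross-lingual transfer setup described above can be activated via the Stack composition block. The following minimal sketch assumes a task adapter named "task_adapter" has also been added and trained:

from adapters.composition import Stack

# Inputs pass through the language adapter first, then through the task adapter
model.active_adapters = Stack("lang_adapter", "task_adapter")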

Papers:

+ +
+

Note

+

V1.x of adapters made a distinction between task adapters (without invertible adapters) and language adapters (with invertible adapters) with the help of the AdapterType enumeration. +This distinction was dropped with v2.x.

+
+
+
+

Prefix Tuning

+

Configuration class: PrefixTuningConfig

+
+Illustration of Prefix Tuning. +

Illustration of the Prefix Tuning method within one Transformer layer. Trained components are colored in shades of magenta.

+
+

Prefix Tuning (Li and Liang, 2021) introduces new parameters in the multi-head attention blocks in each Transformer layer. +More specifically, it prepends trainable prefix vectors \(P^K\) and \(P^V\) to the keys and values of the attention head input, each of a configurable prefix length \(l\) (prefix_length attribute):

+
+\[ +head_i = \text{Attention}(Q W_i^Q, [P_i^K, K W_i^K], [P_i^V, V W_i^V]) +\]
+

Following the original authors, the prefix vectors in \(P^K\) and \(P^V\) are not optimized directly but reparameterized via a bottleneck MLP. +This behavior is controlled via the flat attribute of the configuration. +Using PrefixTuningConfig(flat=True) will create prefix tuning vectors that are optimized without reparameterization.

+

Example:

+
from adapters import PrefixTuningConfig
+
+config = PrefixTuningConfig(flat=False, prefix_length=30)
+model.add_adapter("prefix_tuning", config=config)
+
+
+

As reparameterization using the bottleneck MLP is not necessary for performing inference on an already trained Prefix Tuning module, adapters includes a function to “eject” a reparameterized Prefix Tuning into a flat one:

+
model.eject_prefix_tuning("prefix_tuning")
+
+
+

This will only retain the necessary parameters and reduce the size of the trained Prefix Tuning module.

+

Papers:

+ +
+
+

Compacter

+

Configuration class: CompacterConfig, CompacterPlusPlusConfig

+
+Illustration of Compacter. +

Illustration of the Compacter method within one Transformer layer. Trained components are colored in shades of magenta.

+
+

The Compacter architecture proposed by Mahabadi et al., 2021 +is similar to the bottleneck adapter architecture. It only exchanges the linear down- and +up-projection with a PHM layer. Unlike the linear layer, the PHM layer constructs its weight matrix from two smaller matrices, which reduces the number of parameters. +These matrices can be factorized and shared between all adapter layers. You can exchange the down- and up-projection layers from any of the bottleneck adapters described in the previous section +for a PHM layer by specifying use_phm=True in the config.

+
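For example, a sketch following the use_phm flag described above, which turns a sequential bottleneck adapter into a PHM-based one while keeping all other PHM attributes at their defaults:

from adapters import SeqBnConfig

# Bottleneck adapter whose down- and up-projections are PHM layers
config = SeqBnConfig(use_phm=True)
model.add_adapter("phm_adapter", config=config)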

The PHM layer has the following additional properties: phm_dim, shared_phm_rule, factorized_phm_rule, learn_phm, factorized_phm_W, shared_W_phm, phm_c_init, phm_init_range, hypercomplex_nonlinearity.

+

For more information, check out the BnConfig class.

+

To add a Compacter to your model, you can use the predefined configs:

+
from adapters import CompacterConfig
+
+config = CompacterConfig()
+model.add_adapter("dummy", config=config)
+
+
+

Papers:

+ +
+
+

LoRA

+

Configuration class: LoRAConfig

+
+Illustration of LoRA. +

Illustration of the LoRA method within one Transformer layer. Trained components are colored in shades of magenta.

+
+

Low-Rank Adaptation (LoRA) is an efficient fine-tuning technique proposed by Hu et al. (2021). +LoRA injects trainable low-rank decomposition matrices into the layers of a pre-trained model. +For any model layer expressed as a matrix multiplication of the form \(h = W_0 x\), it performs a reparameterization, such that:

+
+\[ +h = W_0 x + \frac{\alpha}{r} B A x +\]
+

Here, \(A \in \mathbb{R}^{r\times k}\) and \(B \in \mathbb{R}^{d\times r}\) are the decomposition matrices and \(r\), the low-dimensional rank of the decomposition, is the most important hyperparameter.

+

While, in principle, this reparameterization can be applied to any weight matrix in a model, the original paper only adapts the attention weights of the Transformer self-attention sub-layer with LoRA. +adapters additionally allows injecting LoRA into the dense feed-forward layers in the intermediate and output components of a Transformer block. +You can configure the locations where LoRA weights should be injected using the attributes in the LoRAConfig class.

+

Example:

+
from adapters import LoRAConfig
+
+config = LoRAConfig(r=8, alpha=16)
+model.add_adapter("lora_adapter", config=config)
+
+
+

In the design of LoRA, Hu et al. (2021) also pay special attention to keeping the inference latency overhead compared to full fine-tuning at a minimum. +To accomplish this, the LoRA reparameterization can be merged with the original pre-trained weights of a model for inference. +Thus, the adapted weights are directly used in every forward pass without passing activations through an additional module. +In adapters, this can be realized using the built-in merge_adapter() method:

+
model.merge_adapter("lora_adapter")
+
+
+

To continue training on this LoRA adapter or to deactivate it entirely, the merged weights first have to be reset again:

+
model.reset_adapter()
+
+
+

Papers:

+ +
+
+

(IA)^3

+

Configuration class: IA3Config

+
+Illustration of (IA)^3. +

Illustration of the (IA)^3 method within one Transformer layer. Trained components are colored in shades of magenta.

+
+

Infused Adapter by Inhibiting and Amplifying Inner Activations ((IA)^3) is an efficient fine-tuning method proposed within the T-Few fine-tuning approach by Liu et al. (2022). +(IA)^3 introduces trainable vectors \(l_W\) into different components of a Transformer model, which perform element-wise rescaling of inner model activations. +For any model layer expressed as a matrix multiplication of the form \(h = W x\), it therefore performs an element-wise multiplication with \(l_W\), such that:

+
+\[ +h = l_W \odot W x +\]
+

Here, \(\odot\) denotes element-wise multiplication where the entries of \(l_W\) are broadcasted to the shape of \(W\).

+

Example:

+
from adapters import IA3Config
+
+config = IA3Config()
+model.add_adapter("ia3_adapter", config=config)
+
+
+

The implementation of (IA)^3, as well as the IA3Config class, are derived from the implementation of LoRA, with a few main modifications. +First, (IA)^3 uses multiplicative composition of weights instead of additive composition, as in LoRA. +Second, the added weights are not further decomposed into low-rank matrices. +These modifications are controlled via the composition_mode configuration attribute by setting composition_mode="scale". +Additionally, as the added weights are already of rank 1, r=1 is set.

+

Beyond that, both methods share the same configuration attributes that allow you to specify in which Transformer components rescaling vectors will be injected. +Following the original implementation, IA3Config adds rescaling vectors to the self-attention weights (selfattn_lora=True) and the final feed-forward layer (output_lora=True). +Further, you can modify which matrices of the attention mechanism to rescale by leveraging the attn_matrices attribute. +By default, (IA)^3 injects weights into the key (‘k’) and value (‘v’) matrices but not in the query (‘q’) matrix.

+
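For example, the following sketch uses the attn_matrices attribute described above to additionally rescale the query matrix:

from adapters import IA3Config

# Inject rescaling vectors into query, key and value matrices
config = IA3Config(attn_matrices=["q", "k", "v"])
model.add_adapter("ia3_qkv_adapter", config=config)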

Finally, similar to LoRA, (IA)^3 also allows merging the injected parameters with the original weight matrices of the Transformer model. +E.g.:

+
# Merge (IA)^3 adapter
+model.merge_adapter("ia3_adapter")
+
+# Reset merged weights
+model.reset_adapter()
+
+
+

Papers:

+ +
+
+

Prompt Tuning

+

Prompt Tuning is an efficient fine-tuning technique proposed by Lester et al. (2021). Prompt tuning adds tunable tokens, called soft-prompts, that are prepended to the input text. First, the input sequence \(\{x_1, x_2, \dots, x_n\}\) gets embedded, resulting in the matrix \(X_e \in \mathbb{R}^{n \times e}\) where \(e\) is the dimension of the embedding space. The soft-prompts with length \(p\) are represented as \(P_e \in \mathbb{R}^{p \times e}\). \(P_e\) and \(X_e\) get concatenated, forming the input of the following encoder or decoder:

+
+\[ +\left[P_e; X_e\right] \in \mathbb{R}^{\left(p + n\right) \times e} +\]
+

The PromptTuningConfig has the properties:

+
    +
  • prompt_length: sets the soft prompt length \(p\)

  • prompt_init: sets the weight initialization method, either "random_uniform" or "from_string" to initialize each prompt token with an embedding drawn from the model’s vocabulary

      • prompt_init_text: the text used for initialization if prompt_init="from_string"

  • combine: defines whether the prefix is added before the embedded input sequence or after the BOS token

  • +
+

To add Prompt Tuning to your model, you can use the predefined configs:

+
from adapters import PromptTuningConfig
+
+config = PromptTuningConfig(prompt_length=10)
+model.add_adapter("dummy", config=config)
+
+
+
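For instance, a sketch that initializes the soft prompt from a text string (the prompt text below is only a placeholder):

from adapters import PromptTuningConfig

config = PromptTuningConfig(
    prompt_length=10,
    prompt_init="from_string",
    prompt_init_text="Classify the sentiment of this review:",
)
model.add_adapter("prompt_tuning_adapter", config=config)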

Papers:

+ +
+
+ + +
+ +
+ + +
+
+ +
+ +
+ +
+ + + + + + + + + + \ No newline at end of file diff --git a/model_overview.html b/model_overview.html new file mode 100644 index 0000000000..e595277681 --- /dev/null +++ b/model_overview.html @@ -0,0 +1,550 @@ + + + + + + + + + + + Model Overview — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+ + + + + +
+ +
+ + + + + + + + + + + + + + + + + +
+ + + + +
+
+
+
+ +
+

Model Overview

+

This page gives an overview of the Transformer models currently supported by adapters. +The table below further shows which model architectures support which adaptation methods and which features of adapters.

+
+

Note

+

Each supported model architecture X typically provides a class XAdapterModel for usage with AutoAdapterModel. +Additionally, it is possible to use adapters with the model classes already shipped with Hugging Face Transformers. For these classes, initialize the model for adapters with adapters.init(model). +E.g., for BERT, this means adapters provides a BertAdapterModel class, but you can also use BertModel, BertForSequenceClassification etc. together with adapters.

+
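For example, a minimal sketch of this initialization for a static Hugging Face model class:

from transformers import BertForSequenceClassification
import adapters

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
adapters.init(model)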
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Model | (Bottleneck) Adapters | Prefix Tuning | LoRA | Compacter | Adapter Fusion | Invertible Adapters | Parallel block | Prompt Tuning
ALBERT
BART
BEIT
BERT-Generation
BERT
CLIP
DeBERTa
DeBERTa-v2
DistilBERT
Electra
Encoder Decoder (*)
GPT-2
GPT-J
Llama
MBart
MT5
RoBERTa
T5
ViT
XLM-RoBERTa
X-MOD
+

(*) Supported if the underlying encoder and decoder model classes are supported.

+

Missing a model architecture you’d like to use? +adapters can be easily extended to new model architectures as described in Adding Adapters to a Model. +Feel free to open an issue requesting support for a new architecture. +We very much welcome pull requests adding new model implementations!

+
+ + +
+ +
+ + +
+
+ +
+ +
+ +
+ + + + + + + + + + \ No newline at end of file diff --git a/objects.inv b/objects.inv new file mode 100644 index 0000000000..cbc2271219 Binary files /dev/null and b/objects.inv differ diff --git a/overview.html b/overview.html new file mode 100644 index 0000000000..961f22e8a2 --- /dev/null +++ b/overview.html @@ -0,0 +1,457 @@ + + + + + + + + + + + Overview and Configuration — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+ + + + + +
+ +
+ + + + + + + + + + + + + + + + + +
+ + + + +
+
+
+
+ +
+

Overview and Configuration

+

Large pre-trained Transformer-based language models (LMs) have become the foundation of NLP in recent years. +While the most prevalent method of using these LMs for transfer learning involves costly full fine-tuning of all model parameters, a series of efficient and lightweight alternatives have recently been established. +Instead of updating all parameters of the pre-trained LM towards a downstream target task, these methods commonly introduce a small number of new parameters and only update these while keeping the pre-trained model weights fixed.

+
+

Why use Efficient Fine-Tuning?

+

Efficient fine-tuning methods offer multiple benefits over the full fine-tuning of LMs:

+
    +
  • They are parameter-efficient, i.e., they only update a tiny subset (often under 1%) of a model’s parameters.

  • +
  • They often are modular, i.e., the updated parameters can be extracted and shared independently of the base model parameters.

  • +
  • They are easy to share and deploy due to their small file sizes, e.g., having only ~3MB per task instead of ~440MB for sharing a full model.

  • +
  • They speed up training, i.e., efficient fine-tuning often requires less training time than fully fine-tuning LMs.

  • +
  • They are composable, e.g., multiple adapters trained on different tasks can be stacked, fused, or mixed to leverage their combined knowledge.

  • +
  • They often provide on-par performance with full fine-tuning.

  • +
+
+

More specifically, let the parameters of a LM be composed of a set of pre-trained parameters \(\Theta\) (frozen) and a set of (newly introduced) parameters \(\Phi\). +Then, efficient fine-tuning methods optimize only \(\Phi\) according to a loss function \(L\) on a dataset \(D\):

+
+\[ +\Phi^* \leftarrow \arg \min_{\Phi} L(D; \{\Theta, \Phi\}) +\]
+

Efficient fine-tuning might insert parameters \(\Phi\) at different locations of a Transformer-based LM. +One early and successful method, (bottleneck) adapters, introduces bottleneck feed-forward layers in each layer of a Transformer model. +While these adapters have laid the foundation of the adapters library, multiple alternative methods have been introduced and integrated since.

+
+

Important

+

In literature, different terms are used to refer to efficient fine-tuning methods. +The term “adapter” is usually only applied to bottleneck adapter modules. +However, most efficient fine-tuning methods follow the same general idea of inserting a small set of new parameters and, by this, “adapting” the pre-trained LM to a new task. +In adapters, the term “adapter” thus may refer to any efficient fine-tuning method if not specified otherwise.

+
+

In the remaining sections, we will present how adapter methods can be configured in adapters. +The next two pages will then present the methodological details of all currently supported adapter methods.

+
+

Table of Adapter Methods

+

The following table gives an overview of all adapter methods supported by adapters. +Identifiers and configuration classes are explained in more detail in the next section.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
IdentifierConfiguration classMore information
seq_bnSeqBnConfig()Bottleneck Adapters
double_seq_bnDoubleSeqBnConfig()Bottleneck Adapters
par_bnParBnConfig()Bottleneck Adapters
scaled_par_bnParBnConfig(scaling="learned")Bottleneck Adapters
seq_bn_invSeqBnInvConfig()Invertible Adapters
double_seq_bn_invDoubleSeqBnInvConfig()Invertible Adapters
compacterCompacterConfig()Compacter
compacter++CompacterPlusPlusConfig()Compacter
prefix_tuningPrefixTuningConfig()Prefix Tuning
prefix_tuning_flatPrefixTuningConfig(flat=True)Prefix Tuning
loraLoRAConfig()LoRA
ia3IA3Config()IA³
mamMAMConfig()Mix-and-Match Adapters
unipeltUniPELTConfig()UniPELT
prompt_tuningPromptTuningConfig()Prompt Tuning
+
+
+

Configuration

+

All supported adapter methods can be added, trained, saved and shared using the same set of model class functions (see class documentation). +Each method is specified and configured using a specific configuration class, all of which derive from the common AdapterConfig class. +E.g., adding one of the supported adapter methods to an existing model instance follows this scheme:

+
model.add_adapter("name", config=<ADAPTER_CONFIG>)
+
+
+

Here, <ADAPTER_CONFIG> can be any of the following (a short sketch of all three options follows the list):

+
    +
  • a configuration string, as described below

  • +
  • an instance of a configuration class, as listed in the table above

  • +
  • a path to a JSON file containing a configuration dictionary

  • +
+
+
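A short sketch of all three options (the JSON path is only a placeholder):

from adapters import SeqBnConfig

# 1. a configuration string
model.add_adapter("adapter1", config="seq_bn")
# 2. a configuration class instance
model.add_adapter("adapter2", config=SeqBnConfig(reduction_factor=2))
# 3. a path to a JSON file containing a configuration dictionary
model.add_adapter("adapter3", config="/path/to/config.json")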

Configuration strings

+

Configuration strings are a concise way of defining a specific adapter method configuration. They are especially useful when adapter configurations are passed from external sources such as the command line, where using configuration classes is not an option.

+

In general, a configuration string for a single method takes the form <identifier>[<key>=<value>, ...]. +Here, <identifier> refers to one of the identifiers listed in the table above, e.g. par_bn. +In square brackets after the identifier, you can set specific configuration attributes from the respective configuration class, e.g. par_bn[reduction_factor=2]. +If all attributes remain at their default values, this can be omitted.

+
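For instance, such a string can be passed directly when adding an adapter:

model.add_adapter("par_adapter", config="par_bn[reduction_factor=2]")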

Finally, it is also possible to specify a method combination as a configuration string by joining multiple configuration strings with |, e.g.:

+
config = "prefix_tuning[bottleneck_size=800]|parallel"
+
+
+

is identical to the following ConfigUnion:

+
config = ConfigUnion(
+    PrefixTuningConfig(bottleneck_size=800),
+    ParBnConfig(),
+)
+
+
+
+
+
+ + +
+ +
+ + +
+
+ +
+ +
+ +
+ + + + + + + + + + \ No newline at end of file diff --git a/prediction_heads.html b/prediction_heads.html new file mode 100644 index 0000000000..9d9bac04e3 --- /dev/null +++ b/prediction_heads.html @@ -0,0 +1,426 @@ + + + + + + + + + + + Prediction Heads — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+ + + + + +
+ +
+ + + + + + + + + + + + + + + + + +
+ + + + +
+
+
+
+ +
+

Prediction Heads

+

This section gives an overview of how different prediction heads can be used together with adapter modules and how pre-trained adapters can be distributed side-by-side with matching prediction heads in AdapterHub. +We will take a look at the AdapterModel classes (e.g. BertAdapterModel) introduced by adapters, which provide flexible support for prediction heads, as well as models with static heads provided out-of-the-box by Hugging Face Transformers (e.g. BertForSequenceClassification).

+
+

Tip

+

We recommend using the AdapterModel classes whenever possible. These flexible models have been created specifically for working with adapters.

+
+
+

AdapterModel classes

+

The AdapterModel classes provided by adapters allow a flexible configuration of prediction heads on top of a pre-trained language model.

+

First, we load a pre-trained model from the Hugging Face Hub via the AutoAdapterModel class:

+
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
+
+
+

By default, this model doesn’t have any heads yet, so let’s add a new binary sequence classification head on top of our model:

+
model.add_classification_head("mrpc", num_labels=2)
+
+
+

All heads have a name; we called this new head "mrpc". Since all heads are named, we can add multiple other heads with different names to the same model. To see the head types of a model and how they can get configured, please refer to the class references of the respective model classes, e.g. BertAdapterModel.

+

A head alone is just one layer with very few parameters. Hence, we want to train our classification head together with an adapter, so let’s add one:

+
model.add_adapter("mrpc", config="seq_bn")
+model.set_active_adapters("mrpc")
+
+
+

Since we gave the task adapter the same name as our head, we can easily identify them as belonging together. The call to set_active_adapters() in the second line tells our model to use the adapter-head configuration we specified by default in a forward pass. At this point, we can start to train our setup.

+
+

Note

+

set_active_adapters() will search for an adapter and a prediction head with the given name to activate. Alternatively, prediction heads can also be activated explicitly (i.e. without adapter modules). These three options are possible (in order of priority when multiple are specified), as illustrated in the sketch after this list:

+
    +
  1. If head is passed to the forward call, the head with the given name is used.

  2. If the forward call is executed within an AdapterSetup context, the head configuration is read from the context.

  3. If the active_head property is set, the head configuration is read from there.
+
+
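A minimal sketch of the first two options, assuming an adapter and head both named "mrpc" as above and already tokenized inputs:

from adapters import AdapterSetup

# Option 1: select the head explicitly for a single forward call
outputs = model(**inputs, head="mrpc")

# Option 2: read the adapter/head setup from an AdapterSetup context
with AdapterSetup("mrpc"):
    outputs = model(**inputs)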

After training has completed, we can save our whole setup (adapter module and prediction head) with a single call:

+
model.save_adapter("/path/to/dir", "mrpc", with_head=True)
+
+
+

Now, you just have to share your work with the world. After you have published the adapter together with its head in the Hub, anyone else can load both adapter and head by using the same model class.

+

Alternatively, we can also save and load the prediction head separately from an adapter module:

+
# save
+model.save_head("/path/to/dir", "mrpc")
+# load
+model.load_head("/path/to/dir")
+
+
+

Lastly, it’s also possible to delete an added head again:

+
model.delete_head("mrpc")
+
+
+
+
+

Model classes with static heads (Hugging Face Transformers)

+

The transformers library provides strongly typed model classes with heads for various tasks (e.g. RobertaForSequenceClassification, AutoModelForMultipleChoice, …). If an adapter module is trained with one of these out-of-the-box classes, it is encouraged to also distribute the prediction head weights together with the adapter weights. Therefore, we can also easily save the prediction head weights for these models together with an adapter:

+
model.save_adapter("/path/to/dir", "mrpc", with_head=True)
+
+
+

In the next step, we can provide both the adapter weights and the head weights to the Hub. If someone else then downloads the pre-trained adapter, the resolving method will check if the prediction head matches the class of their model. In case the classes match, the prediction head weights will be automatically loaded too.

+
+
+

Automatic conversion

+

adapters supports loading static heads, e.g., created with AutoModelForSequenceClassification, into model classes with flexible heads, e.g. AutoAdapterModel.

+

For a model created with AutoModelForSequenceClassification, we first need to enable adapter support by calling the init() method.

+
from adapters import init, AutoAdapterModel
+from transformers import AutoModelForSequenceClassification
+import os
+
+static_head_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
+# Enable adapter support
+init(static_head_model) 
+
+
+

Now we can add an adapter and save it together with the head as usual:

+
static_head_model.add_adapter("test")
+
+temp_dir = os.path.join(os.getcwd(), "temp_dir")
+static_head_model.save_adapter(temp_dir, "test", with_head=True)
+
+
+

When the adapter and head are then loaded into a new AdapterModel, the weights are converted automatically during the load_adapter() call, so no additional steps are needed:

+
flex_head_model = AutoAdapterModel.from_pretrained("bert-base-uncased")
+flex_head_model.load_adapter(temp_dir)
+
+assert "test" in flex_head_model.adapters_config
+assert "test" in flex_head_model.heads
+
+
+
+

Note

+

The conversion in the opposite direction is not supported, i.e. you cannot load a head created with AutoAdapterModel into a model of type AutoModelForSequenceClassification.

+
+
+
+

Custom Heads

+

If none of the available prediction heads fit your requirements, you can define and add a custom head.

+

First, we need to define the new head class. For that, the initialization and the forward pass need to be implemented. +The initialization of the head gets a reference to the model, the name of the head, and additionally defined kwargs. +You can use the following template as a guideline.

+
class CustomHead(PredictionHead):
+    def __init__(
+        self,
+        model,
+        head_name,
+        **kwargs,
+    ):
+        # initialization of the custom head
+        ...
+
+    def forward(self, outputs, cls_output=None, attention_mask=None, return_dict=False, **kwargs):
+        # implementation of the forward pass
+        ...
+
+
+

Next, we can register the new custom head and give the new head type a name. This only notifies +the model that there is a new head type. Then, we can add an instance of the new head to the model by +calling add_custom_head with the name of the new head type, the name of the head instance we are creating, and +additional arguments required by the head.

+
model.register_custom_head("my_custom_head", CustomHead)
+model.add_custom_head(head_type="my_custom_head", head_name="custom_head", **kwargs)
+
+
+

After adding the custom head, you can treat it like any other built-in head type.

+
+
+ + +
+ +
+ + +
+
+ +
+ +
+ +
+ + + + + + + + + + \ No newline at end of file diff --git a/py-modindex.html b/py-modindex.html new file mode 100644 index 0000000000..fbad4d06ca --- /dev/null +++ b/py-modindex.html @@ -0,0 +1,307 @@ + + + + + + + + + + + Python Module Index — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+ + + + + +
+ +
+ + + + + + + + + + + + + + + + + +
+ +
    + +
+
+
+
+ + +

Python Module Index

+ +
adapters
adapters.trainer
adapters.training
adapters.utils
+ +
+
+ + +
+ +
+

+ + + + + + + + + + \ No newline at end of file diff --git a/quickstart.html b/quickstart.html new file mode 100644 index 0000000000..2c832552a5 --- /dev/null +++ b/quickstart.html @@ -0,0 +1,400 @@ + + + + + + + + + + + Quick Start — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+ + + + + +
+ +
+ + + + + + + + + + + + + + + + + +
+ + + + +
+
+
+
+ +
+

Quick Start

+
+

Introduction

+

adapters adds adapter functionality to the PyTorch implementations of all Transformer models listed in the Model Overview. +For working with adapters, a couple of methods, e.g. for creation (add_adapter()), loading (load_adapter()), +storing (save_adapter()) and deletion (delete_adapter()) are added to the model classes. +In the following, we will briefly go through some examples to showcase these methods.

+
+

Note

+

This document focuses on the adapter-related functionalities added by adapters. +For a more general overview of the transformers library, visit +the ‘Usage’ section in Hugging Face’s documentation.

+
+
+
+

Initialize a Model with Adapters

+

The XAdapterModel classes are the recommended way to train and run inference with adapters:

+
from adapters import AutoAdapterModel
+
+model = AutoAdapterModel.from_pretrained(model_name)
+
+
+

This handles the initialization of the adapter-related functionality internally and provides you with the initialized model. The XAdapterModel also supports the dynamic adding, loading, and storing of heads for different tasks.

+

If you want to use adapters in Hugging Face models, the models need to be initialized with the adapters library. This initializes the functionality of adding, loading, and storing adapters within the transformers models.

+
import adapters
+
+adapters.init(model)
+
+
+
+
+

Using a Pre-Trained Adapter for Inference

+

We also have a Quickstart Colab notebook for adapter inference: Open In Colab

+

The following example shows the usage of a basic pre-trained Transformer model with adapters. +Our goal here is to predict the sentiment of a given sentence.

+

We use BERT in this example, so we first load a pre-trained BertTokenizer to encode the input sentence and a pre-trained +bert-base-uncased checkpoint from Hugging Face’s Model Hub using the BertAdapterModel class:

+
import os
+
+import torch
+from transformers import BertTokenizer
+from adapters import AutoAdapterModel, BertAdapterModel
+
+# Load pre-trained BERT tokenizer from Hugging Face
+tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
+
+# An input sentence
+sentence = "It's also, clearly, great fun."
+
+# Tokenize the input sentence and create a PyTorch input tensor
+input_data = tokenizer(sentence, return_tensors="pt")
+
+# Load pre-trained BERT model from Hugging Face Hub
+# The `BertAdapterModel` class is specifically designed for working with adapters
+# It can be used with different prediction heads
+model = BertAdapterModel.from_pretrained('bert-base-uncased')
+
+
+

Having loaded the model, we now add a pre-trained task adapter from AdapterHub that is useful for our task. In this case, for sentiment classification, we thus use an adapter trained on the SST-2 dataset. The task prediction head loaded together with the adapter gives us a class label for our sentence:

+
# Load pre-trained task adapter from Adapter Hub
+# This method call will also load a pre-trained classification head for the adapter task
+adapter_name = model.load_adapter("sentiment/sst-2@ukp", config='pfeiffer')
+
+# Activate the adapter we just loaded, so that it is used in every forward pass
+model.set_active_adapters(adapter_name)
+
+# Predict output tensor
+outputs = model(**input_data)
+
+# Retrieve the predicted class label
+predicted = torch.argmax(outputs[0]).item()
+assert predicted == 1
+
+
+

To save our pre-trained model and adapters, we can easily store and reload them as follows:

+
# For the sake of this demonstration an example path for loading and storing is given below
+example_path = os.path.join(os.getcwd(), "adapter-quickstart")
+
+# Save model
+model.save_pretrained(example_path)
+# Save adapter
+model.save_adapter(example_path, adapter_name)
+
+# Load model, similar to Hugging Face's AutoModel class, 
+# you can also use AutoAdapterModel instead of BertAdapterModel
+model = AutoAdapterModel.from_pretrained(example_path)
+model.load_adapter(example_path)
+
+
+

Similar to how the weights of the full model are saved, save_adapter() will create a file for saving the adapter weights and a file for saving the adapter configuration in the specified directory.

+

Finally, if we have finished working with adapters, we can restore the base Transformer to its original form by deactivating and deleting the adapter:

+
# Deactivate all adapters
+model.set_active_adapters(None)
+# Delete the added adapter
+model.delete_adapter(adapter_name)
+
+
+
+
+

Adapter training

+

We also have a Quickstart Colab notebook for adapter training: Open In Colab

+

For more examples of training different adapter setups, refer to the section on Adapter Training. +Further information on using adapters with prediction heads can be found in the Prediction Heads section.

+
+
+ + +
+ +
+ + +
+
+ +
+ +
+ +
+ + + + + + + + + + \ No newline at end of file diff --git a/search.html b/search.html new file mode 100644 index 0000000000..af7ffb68c4 --- /dev/null +++ b/search.html @@ -0,0 +1,293 @@ + + + + + + + + + + + Search — AdapterHub documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+ + + + + +
+ +
+ + + + + + + + + + + + + + + + + +
+ +
    + +
+
+
+
+ + + + +
+ +
+ +
+ +
+
+ + +
+ +
+

+ + + + + + + + + + + + + + + \ No newline at end of file diff --git a/searchindex.js b/searchindex.js new file mode 100644 index 0000000000..bc2cb60a79 --- /dev/null +++ b/searchindex.js @@ -0,0 +1 @@ +Search.setIndex({"docnames": ["adapter_composition", "classes/adapter_config", "classes/adapter_layer", "classes/adapter_training", "classes/adapter_utils", "classes/model_adapters_config", "classes/model_mixins", "classes/models/albert", "classes/models/auto", "classes/models/bart", "classes/models/beit", "classes/models/bert", "classes/models/bert-generation", "classes/models/clip", "classes/models/deberta", "classes/models/deberta_v2", "classes/models/distilbert", "classes/models/electra", "classes/models/encoderdecoder", "classes/models/gpt2", "classes/models/gptj", "classes/models/llama", "classes/models/mbart", "classes/models/mt5", "classes/models/roberta", "classes/models/t5", "classes/models/vit", "classes/models/xlmroberta", "classes/models/xmod", "contributing", "contributing/adding_adapter_methods", "contributing/adding_adapters_to_a_model", "embeddings", "extending", "hub_contributing", "huggingface_hub", "index", "installation", "loading", "method_combinations", "methods", "model_overview", "overview", "prediction_heads", "quickstart", "training", "transitioning"], "filenames": ["adapter_composition.md", "classes/adapter_config.rst", "classes/adapter_layer.rst", "classes/adapter_training.rst", "classes/adapter_utils.rst", "classes/model_adapters_config.rst", "classes/model_mixins.rst", "classes/models/albert.rst", "classes/models/auto.rst", "classes/models/bart.rst", "classes/models/beit.rst", "classes/models/bert.rst", "classes/models/bert-generation.rst", "classes/models/clip.rst", "classes/models/deberta.rst", "classes/models/deberta_v2.rst", "classes/models/distilbert.rst", "classes/models/electra.rst", "classes/models/encoderdecoder.rst", "classes/models/gpt2.rst", "classes/models/gptj.rst", "classes/models/llama.rst", "classes/models/mbart.rst", "classes/models/mt5.rst", "classes/models/roberta.rst", "classes/models/t5.rst", "classes/models/vit.rst", "classes/models/xlmroberta.rst", "classes/models/xmod.rst", "contributing.md", "contributing/adding_adapter_methods.md", "contributing/adding_adapters_to_a_model.md", "embeddings.md", "extending.md", "hub_contributing.md", "huggingface_hub.md", "index.rst", "installation.md", "loading.md", "method_combinations.md", "methods.md", "model_overview.md", "overview.md", "prediction_heads.md", "quickstart.md", "training.md", "transitioning.md"], "titles": ["Adapter Activation and Composition", "Adapter Configuration", "Adapter Implementation", "Adapter Training", "Adapter Utilities", "Model Adapters Config", "Model Mixins", "ALBERT", "Auto Classes", "BART", "BEiT", "BERT", "BertGeneration", "CLIP", "DeBERTa", "DeBERTa-v2", "DistilBERT", "ELECTRA", "Encoder Decoder Models", "OpenAI GPT2", "EleutherAI GPT-J-6B", "LLaMA", "MBart", "MT5", "RoBERTa", "T5", "Vision Transformer (ViT)", "XLM-RoBERTa", "X-MOD", "Contributing to AdapterHub", "Adding Adapter Methods", "Adding Adapters to a Model", "Embeddings", "Extending the Library", "Contributing Adapters to the Hub", "Integration with Hugging Face\u2019s Model Hub", "AdapterHub Documentation", "Installation", "Loading Pre-Trained Adapters", "Method Combinations", "Adapter Methods", "Model Overview", "Overview and Configuration", "Prediction Heads", "Quick Start", "Adapter Training", "Transitioning from adapter-transformers"], "terms": {"With": [0, 18, 32], "becom": [0, 26, 42], "possibl": 
21, 22, 23, 24, 25, 27, 28], "stabl": [7, 9, 10, 11, 12, 13, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], "matter": [7, 9, 10, 11, 12, 13, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], "usag": [7, 9, 10, 11, 12, 13, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 31, 36, 41, 44, 45], "albertconfig": [7, 8], "associ": [7, 9, 10, 11, 12, 13, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 36], "familiar": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 31], "peft": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "invit": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "them": [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 30, 31, 32, 33, 35, 39, 43, 44, 46], "offici": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "user": [7, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29], "deal": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "accordingli": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "token_type_id": [7, 11, 14, 15, 17, 19, 20, 24, 27, 28], "position_id": [7, 11, 12, 13, 14, 15, 17, 19, 20, 21, 24, 27, 28], "head_mask": [7, 9, 10, 11, 12, 16, 17, 19, 20, 22, 23, 24, 25, 26, 27, 28], "inputs_emb": [7, 9, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 27, 28], "output_attent": [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], "output_hidden_st": [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], "output_adapter_gating_scor": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 39], "__call__": [7, 9, 10, 11, 12, 13, 16, 17, 18, 22, 23, 24, 25, 26, 27, 28], "special": [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 40, 45], "tip": [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], "recip": [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], "need": [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 30, 31, 32, 38, 43, 44, 45, 46], "afterward": [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 35], "sinc": [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 42, 43], "former": [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], "care": [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], "post": [7, 9, 10, 11, 12, 13, 16, 17, 18, 22, 23, 24, 25, 26, 27, 28, 29], "latter": [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], "silent": [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], "longtensor": [7, 9, 11, 12, 13, 16, 17, 18, 21, 22, 23, 24, 25, 27, 28], "sequence_length": [7, 9, 11, 12, 13, 17, 18, 22, 23, 24, 25, 27, 28], "vocabulari": [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 32, 40], "obtain": [7, 9, 10, 11, 12, 13, 16, 17, 18, 22, 23, 24, 25, 26, 27, 28], "pretrainedtoken": [7, 9, 11, 12, 13, 16, 17, 18, 22, 23, 24, 25, 27, 28], "what": [7, 9, 11, 12, 13, 16, 17, 18, 22, 23, 24, 25, 27, 28, 36, 38], "glossari": [7, 9, 11, 12, 13, 16, 17, 18, 22, 23, 24, 25, 27, 28], "floattensor": [7, 9, 10, 11, 12, 13, 16, 17, 18, 22, 23, 24, 25, 26, 27, 28], "avoid": [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 30, 31, 45], "select": [7, 8, 9, 10, 11, 12, 13, 16, 17, 18, 22, 23, 24, 25, 26, 27, 28], "segment": [7, 10, 11, 17, 24, 27, 28], "portion": [7, 11, 17, 24, 27, 28], "sentenc": [7, 9, 11, 12, 13, 17, 24, 27, 28, 44], "rang": [7, 9, 11, 12, 13, 14, 15, 17, 21, 24, 27, 28], "max_position_embed": [7, 11, 12, 13, 16, 17, 24, 27, 28], "num_head": [7, 9, 10, 11, 12, 13, 16, 17, 18, 22, 23, 24, 25, 26, 27, 28], "num_lay": [7, 10, 11, 12, 16, 17, 23, 24, 25, 26, 27, 28], "nullifi": [7, 9, 10, 11, 12, 16, 17, 22, 23, 24, 25, 26, 27, 28], "choos": [7, 9, 11, 12, 16, 17, 18, 22, 23, 24, 25, 27, 28], "than": [7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 30, 40, 42, 45], "intern": [7, 9, 11, 12, 16, 17, 18, 22, 23, 24, 25, 27, 28, 44], "lookup": [7, 9, 11, 12, 16, 17, 18, 22, 23, 24, 25, 27, 28], "under": [7, 9, 10, 11, 12, 13, 16, 17, 18, 22, 23, 24, 25, 26, 27, 28, 29, 30, 32, 35, 42], "hidden": [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 30, 40], "hidden_st": [7, 9, 10, 11, 12, 13, 16, 17, 18, 22, 23, 24, 25, 26, 27, 28], "util": [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 31, 36], "get_output_embed": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "save_pretrain": [7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 44, 46], "pathlik": [7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "re": [7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29], "os": [7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 43, 44], "is_main_process": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "tpu": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "race": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "condit": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "state_dict": [7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "part": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 31], "precaut": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "taken": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 39], "recov": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "save_funct": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "anoth": [7, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 31, 32, 40, 46], "push_to_hub": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "your": [7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 30, 31, 33, 35, 36, 40, 43, 45, 46], "hug": [7, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 33, 34, 36, 37, 38, 41, 44, 45, 46], "face": [7, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 33, 34, 36, 37, 38, 41, 44, 45, 46], "repo_id": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "namespac": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 36], "max_shard_s": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "5gb": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "maximum": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "being": [7, 8, 9, 
10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 33, 35, 40, 46], "shard": [7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "express": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 40], "digit": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "unit": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "5mb": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "abl": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 35], "easili": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 38, 41, 43, 44], "free": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 41], "tier": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "googl": [7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28], "cpu": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "oom": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "issu": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 41], "warn": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "bigger": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "safe_seri": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "tradit": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "pickl": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "variant": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 39], "pytorch_model": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "bin": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "save_peft_format": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "compat": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28, 46], "attach": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "pend": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "behaviour": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "word": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "pushtohubmixin": [7, 9, 10, 11, 12, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 28], "automodel": [8, 18, 44, 46], "correct": [8, 16, 30, 31, 37], "automat": [8, 18, 21, 35, 36, 37, 38, 45, 46], "instanti": [8, 18, 39, 45], "from_config": 8, "__init__": [8, 18, 30, 31, 43], "throw": 8, "pretrainedconfig": [8, 18], "albertadaptermodel": 8, "bartconfig": [8, 9], "bartadaptermodel": 8, "bart": [8, 22, 31, 36, 41], "beitconfig": [8, 10], "beitadaptermodel": 8, "beit": [8, 36, 41], "bertconfig": [8, 11, 31], "bertadaptermodel": [8, 41, 43, 44, 46], "bertgenerationconfig": [8, 12], "bertgenerationadaptermodel": 8, "clipconfig": [8, 13], "clipadaptermodel": [8, 13], "clip": [8, 36, 41], "debertaconfig": 8, "debertaadaptermodel": 8, "debertav2config": 8, "debertav2adaptermodel": 8, "v2": [8, 14, 36, 40, 41], "distilbertconfig": [8, 16], "distilbertadaptermodel": 8, "electraconfig": [8, 17], "electraadaptermodel": 8, "electra": [8, 36, 41], "gpt2config": [8, 19], "gpt2adaptermodel": 8, "openai": [8, 20, 36], "gptjconfig": [8, 20], "gptjadaptermodel": 8, "j": [8, 18, 23, 25, 36, 41], "llamaconfig": [8, 21], "llamaadaptermodel": 8, "llama": [8, 36, 41], "mbartconfig": [8, 22], 
"mbartadaptermodel": 8, "mbart": [8, 36, 41], "mt5config": [8, 23], "mt5adaptermodel": 8, "mt5": [8, 36, 41], "robertaconfig": [8, 24], "robertaadaptermodel": 8, "roberta": [8, 9, 12, 14, 15, 17, 35, 36, 41], "t5config": [8, 25], "t5adaptermodel": 8, "t5": [8, 15, 23, 36, 41], "vitconfig": [8, 26], "vitadaptermodel": 8, "vit": [8, 10, 13, 36, 41], "xlmrobertaconfig": [8, 27], "xlmrobertaadaptermodel": 8, "xlm": [8, 36, 41], "xmodconfig": [8, 28], "xmodadaptermodel": 8, "mod": [8, 36, 41], "attn_implement": 8, "relev": [8, 13, 38], "eager": 8, "sdpa": 8, "scaled_dot_product_attent": 8, "master": 8, "flash_attention_2": 8, "dao": 8, "ailab": 8, "flash": 8, "github": [8, 14, 15, 29, 30, 31, 45, 46], "com": [8, 14, 15, 29, 30, 31, 37, 45, 46], "autoconfig": 8, "model_arg": [8, 18, 45], "pretrain": [8, 9, 10, 12, 13, 14, 15, 17, 18, 19, 22, 23, 24, 25], "model_typ": [8, 30, 31], "pretrained_model_name_or_path": 8, "miss": [8, 41], "fall": 8, "back": [8, 18, 40], "pattern": 8, "gpt2": [8, 15, 36], "gptj": [8, 20], "xmod": [8, 28], "evalu": [8, 18, 45], "eval": [8, 18], "deactiv": [8, 18, 40, 44], "host": [8, 18, 38, 46], "insid": [8, 18], "my_model_directori": [8, 18], "tensorflow": [8, 18], "tf_model": [8, 18], "ckpt": [8, 18], "from_tf": [8, 18], "slower": [8, 18], "convers": [8, 18, 28, 31, 36], "script": [8, 18, 30, 31, 37, 45], "underli": [8, 18], "suppli": 8, "json": [8, 42], "though": 8, "simpler": 8, "standard": [8, 9, 26, 30, 35], "docstr": [8, 18], "incomplet": 8, "receiv": 8, "resum": 8, "server": 8, "protocol": 8, "endpoint": 8, "foo": 8, "bar": 8, "3128": 8, "hostnam": 8, "4012": 8, "output_loading_info": 8, "ot": 8, "unexpect": 8, "try": [8, 30, 31], "git": [8, 29, 35, 37, 45], "artifact": 8, "trust_remote_cod": 8, "trust": 8, "code": [8, 13, 14, 15, 23, 29, 30, 31, 36, 39, 45, 46], "machin": [8, 9, 12, 22, 29, 40], "code_revis": 8, "leav": [8, 18, 40], "rest": [8, 45], "updat": [8, 18, 35, 42], "behav": [8, 18], "assum": [8, 29], "alreadi": [8, 28, 29, 30, 35, 40, 41, 46], "said": 8, "tf": [8, 14, 15], "bert_tf_model_config": 8, "bert_tf_checkpoint": 8, "denois": [9, 22, 23, 25], "natur": [9, 10, 12, 13, 14, 15, 17, 19, 20, 26, 28, 36], "translat": [9, 12, 22, 23, 25, 40], "comprehens": 9, "mike": [9, 22, 24], "lewi": [9, 22, 24], "yinhan": [9, 22, 24], "naman": [9, 21, 22, 24, 27], "goyal": [9, 21, 22, 24, 27], "marjan": [9, 22], "ghazvininejad": [9, 22], "abdelrahman": 9, "moham": 9, "omer": [9, 24], "levi": [9, 24], "ve": 9, "stoyanov": [9, 24, 27], "luke": [9, 22, 24, 27], "zettlemoy": [9, 22, 24, 27], "29": 9, "oct": 9, "accord": [9, 22, 35, 42], "seq2seq": 9, "bidirect": [9, 10, 11], "involv": [9, 42], "randomli": [9, 10, 18, 45], "shuffl": 9, "novel": [9, 14, 15], "fill": 9, "scheme": [9, 42], "span": [9, 13], "particularli": [9, 17], "effect": [9, 17, 18], "fine": [9, 10, 13, 18, 29, 30, 31, 36, 39, 40, 45], "resourc": [9, 26], "glue": [9, 16, 17, 31, 45], "squad": [9, 14, 15], "achiev": [9, 10, 14, 15, 30, 45], "art": [9, 12, 13, 21, 23, 26], "dialogu": 9, "summar": [9, 12, 18, 20, 22, 25, 31], "gain": [9, 17], "roug": 9, "decoder_input_id": [9, 18, 22, 23, 25], "decoder_attention_mask": [9, 18, 22, 23, 25], "decoder_head_mask": [9, 22, 23, 25], "cross_attn_head_mask": [9, 22, 23, 25], "encoder_output": [9, 18, 22, 23, 25], "decoder_inputs_emb": [9, 18, 22, 23, 25], "use_cach": [9, 12, 18, 19, 20, 21, 22, 23, 25], "past_key_valu": [9, 12, 18, 19, 20, 21, 22, 23, 25], "target_sequence_length": [9, 18, 22, 23, 25], "eos_token_id": 9, "shift": [9, 18, 22], "modeling_bart": 
9, "_prepare_decoder_attention_mask": 9, "modifi": [9, 23, 30, 31, 40, 45], "diagram": 9, "ab": [9, 18, 23, 25, 36], "1910": [9, 23, 25], "13461": 9, "strategi": 9, "encoder_lay": [9, 22], "encoder_attention_head": [9, 22], "decoder_lay": [9, 22], "decoder_attention_head": [9, 22], "consist": [9, 13, 14, 15, 18, 20, 22, 23, 25, 29, 30, 36, 40, 46], "last_hidden_st": [9, 13, 18, 22, 23, 25], "n_layer": [9, 18, 22, 23, 25], "embed_size_per_head": [9, 18, 22, 23, 25], "encoder_sequence_length": [9, 18, 22], "don": [9, 18, 22, 23, 25, 30], "past": [9, 18, 22, 23, 25], "unset": [9, 22, 23, 25], "regress": [9, 22], "loss": [9, 13, 18, 22, 42], "entropi": [9, 22], "hangbo": 10, "bao": 10, "dong": 10, "songhao": 10, "piao": 10, "furu": 10, "wei": [10, 11, 18, 23, 25], "introduc": [10, 16, 21, 23, 28, 39, 40, 42, 43, 45], "vision": [10, 13, 36], "stand": [10, 46], "develop": [10, 12, 29, 37, 45], "area": 10, "view": [10, 39], "our": [10, 12, 13, 17, 18, 21, 28, 35, 36, 37, 43, 44, 45], "patch": [10, 26], "16x16": [10, 26], "pixel": [10, 13, 26], "discret": 10, "fed": 10, "backbon": 10, "corrupt": [10, 17], "append": 10, "upon": 10, "experiment": 10, "show": [10, 14, 15, 21, 26, 28, 41, 44, 45], "competit": [10, 13, 21], "83": [10, 14, 15], "accuraci": [10, 13], "imagenet": [10, 13, 26], "1k": 10, "significantli": [10, 14, 15], "outperform": [10, 17, 21], "scratch": [10, 13], "deit": 10, "81": 10, "moreov": 10, "larg": [10, 12, 14, 15, 17, 19, 22, 26, 27, 40, 42], "86": [10, 14, 15], "22k": 10, "85": 10, "pixel_valu": [10, 13, 26], "bool_masked_po": 10, "booltensor": [10, 18, 23, 25], "num_channel": [10, 13, 26], "height": [10, 13, 26], "width": [10, 13, 26], "autoimageprocessor": [10, 13, 26], "beitimageprocessor": 10, "jacob": 11, "devlin": 11, "ming": 11, "kenton": 11, "lee": [11, 23, 25], "kristina": 11, "toutanova": 11, "leverag": [12, 13, 18, 23, 40, 42], "encoderdecodermodel": [12, 31], "sascha": [12, 18], "roth": [12, 18], "shashi": [12, 18], "narayan": [12, 18], "aliaksei": [12, 18], "severyn": [12, 18], "unsupervis": [12, 19, 25, 27], "neural": [12, 13, 14, 15, 22, 40], "revolution": 12, "warm": 12, "releas": [12, 13, 14, 15, 21, 24, 27], "nlp": [12, 14, 15, 17, 23, 40, 42], "practition": 12, "signific": [12, 45], "amount": [12, 17, 19, 26], "far": [12, 30], "focu": [12, 45], "demonstr": [12, 13, 17, 19, 23, 30, 31, 36, 44], "efficaci": 12, "conduct": 12, "extens": [12, 30, 31], "studi": [12, 13, 39], "encoder_hidden_st": [12, 17, 18, 19], "encoder_attention_mask": [12, 17, 19], "style": [13, 29], "featur": [13, 32, 36, 41], "fit": [13, 43], "12": 13, "11": 13, "23": 13, "alec": [13, 19], "radford": [13, 19], "jong": 13, "wook": 13, "kim": 13, "chri": 13, "hallaci": 13, "aditya": [13, 23], "ramesh": 13, "gabriel": 13, "goh": 13, "sandhini": 13, "agarw": 13, "girish": 13, "sastri": 13, "amanda": 13, "askel": 13, "pamela": 13, "mishkin": 13, "jack": 13, "clark": 13, "gretchen": 13, "krueger": 13, "ilya": [13, 19], "sutskev": [13, 19], "contrast": [13, 28], "varieti": [13, 23, 25], "pair": 13, "instruct": [13, 33], "most": [13, 20, 21, 30, 40, 42, 45], "snippet": [13, 38, 45], "similarli": [13, 32, 45], "zero": [13, 23, 40], "shot": [13, 23, 40], "capabl": 13, "predetermin": 13, "categori": 13, "restrict": 13, "limit": [13, 23, 25, 26, 28], "usabl": 13, "raw": 13, "promis": 13, "much": [13, 31, 41], "broader": 13, "caption": 13, "goe": 13, "scalabl": [13, 40], "sota": 13, "400": 13, "million": [13, 19], "internet": 13, "ones": [13, 17, 32], "ocr": 13, "recognit": [13, 26, 28], "video": 13, 
"geo": 13, "mani": [13, 14, 15, 19, 22, 23, 29, 30, 31], "grain": 13, "trivial": 13, "often": [13, 30, 31, 39, 42], "fulli": [13, 42, 45, 46], "baselin": [13, 15], "resnet": 13, "50": 13, "28": [13, 20], "cliptextconfig": 13, "config_class": [13, 31], "alia": 13, "basemodeloutputwithpool": 13, "modeling_output": [13, 18], "compris": [13, 18], "variou": [13, 18, 31, 39, 43, 45], "configuration_clip": 13, "pooler_output": 13, "auxiliari": 13, "famili": 13, "plu": [13, 18], "softmax": [13, 14, 15, 18], "pool": 13, "pooled_output": 13, "get_input_embed": 13, "set_input_embed": 13, "clipvisionconfig": 13, "clipimageprocessor": 13, "pil": 13, "autoprocessor": 13, "processor": 13, "cocodataset": 13, "val2017": 13, "000000039769": 13, "jpg": 13, "open": [13, 20, 21, 29, 41], "cl": 13, "return_loss": 13, "clipoutput": 13, "modeling_clip": 13, "logits_per_imag": 13, "image_batch_s": 13, "text_batch_s": 13, "dot": [13, 40], "product": 13, "image_emb": 13, "text_emb": 13, "logits_per_text": 13, "output_dim": 13, "text_model_output": 13, "vision_model_output": 13, "cat": 13, "photo": 13, "dog": 13, "prob": 13, "get_image_featur": 13, "image_featur": 13, "get_text_featur": 13, "text_featur": 13, "enhanc": [14, 15], "disentangl": [14, 15, 46], "pengcheng": [14, 15], "xiaodong": [14, 15], "jianfeng": [14, 15], "gao": [14, 15], "weizhu": [14, 15], "2018": [14, 15, 24], "facebook": [14, 15, 27, 28], "half": [14, 15], "progress": [14, 15], "improv": [14, 15, 28], "mechan": [14, 15, 33, 39, 40], "content": [14, 15], "wide": [14, 15, 20, 23], "mnli": [14, 15], "9": [14, 15], "90": [14, 15], "vs": [14, 15], "91": [14, 15], "88": [14, 15], "7": [14, 15, 36], "made": [14, 15, 18, 40], "microsoft": [14, 15], "kamalkraj": [14, 15], "overridden": [14, 15, 19, 20, 21], "regist": [14, 15, 19, 20, 21, 43], "hook": [14, 15, 19, 20, 21], "visibl": 15, "5b": 15, "superglu": 15, "submiss": 15, "89": 15, "versu": 15, "human": 15, "find": [15, 19, 20, 21, 30, 35, 36, 45], "blog": [15, 16, 29], "www": [15, 36], "en": 15, "research": [15, 20, 21], "surpass": 15, "128k": 15, "now": [15, 30, 31, 33, 35, 38, 43, 44, 46], "sentencepiec": 15, "ngie": 15, "ngram": 15, "convolut": [15, 26], "asid": 15, "experi": [15, 17, 28], "bucket": 15, "log": 15, "900m": 15, "faster": 16, "cheaper": [16, 40], "lighter": 16, "distil": 16, "small": [16, 17, 26, 30, 42], "fast": 16, "cheap": 16, "light": 16, "40": [16, 19], "less": [16, 17, 42], "preserv": 16, "95": 16, "measur": [16, 28], "get_position_embed": 16, "resize_position_embed": 16, "new_num_position_embed": 16, "wherea": 16, "reduc": [16, 40], "remov": [16, 31], "sinusoid": 16, "algorithm": 16, "discrimin": 17, "rather": 17, "role": 17, "interest": [17, 38, 39], "were": [17, 28], "mlm": [17, 45], "reconstruct": [17, 22], "produc": 17, "good": [17, 26, 30, 31], "sampl": [17, 28], "detect": 17, "plausibl": 17, "ident": [17, 39, 42], "thorough": 17, "becaus": 17, "just": [17, 18, 32, 43, 44, 45], "contextu": 17, "substanti": [17, 26], "strong": 17, "gpu": 17, "dai": 17, "30x": 17, "xlnet": 17, "explicit": 18, "encoderdecoderadaptermodel": 18, "decis": 18, "due": [18, 42], "lack": 18, "would": [18, 30, 31, 35, 45], "from_encoder_decoder_pretrain": 18, "seper": 18, "autoencod": 18, "autoregress": [18, 20], "shown": [18, 35], "bertmodel": [18, 31, 38, 41, 46], "yang": 18, "mirella": 18, "lapata": 18, "automodelforcausallm": 18, "1907": 18, "12461": 18, "michael": [18, 23, 25], "matena": [18, 23, 25], "yanqi": [18, 23, 25], "zhou": [18, 23, 25], "peter": [18, 23, 25], "encoderdecoderconfig": 
18, "meth": 18, "seq2seqlmoutput": 18, "100": [18, 26], "pad_token_id": [18, 19, 20, 21, 23, 25], "decoder_start_token_id": 18, "precomput": [18, 23, 25], "vocab_s": 18, "flavor": 18, "encoder_kwarg": 18, "decoder_": 18, "decoder_kwarg": 18, "logit": 18, "decoder_hidden_st": 18, "decoder_attent": 18, "cross_attent": 18, "encoder_last_hidden_st": 18, "encoder_attent": 18, "berttoken": [18, 44], "bert2bert": 18, "cls_token_id": 18, "encoder_pretrained_model_name_or_path": 18, "decoder_pretrained_model_name_or_path": 18, "encoder_": 18, "multitask": 19, "learner": 19, "jeffrei": 19, "wu": 19, "rewon": 19, "david": 19, "luan": 19, "dario": 19, "amodei": 19, "unidirect": 19, "veri": [19, 26, 29, 41, 43], "corpu": 19, "gb": 19, "billion": [19, 20], "web": 19, "page": [19, 31, 36, 40, 41, 42], "divers": 19, "caus": [19, 28], "goal": [19, 44], "occur": 19, "domain": 19, "10x": 19, "dor": [19, 21], "adpter": [19, 21], "know": [19, 20, 21, 23, 25], "row": [19, 20, 21, 30, 31], "simpli": [19, 20, 21, 32, 35, 40, 45], "guess": [19, 20, 21], "chat": 20, "few": [20, 28, 40, 43, 46], "deeper": 20, "dive": 20, "ben": 20, "mesh": 20, "jax": 20, "short": 20, "distinguish": 20, "4096": 20, "feedforward": 20, "16384": 20, "256": 20, "rotari": 20, "rope": 20, "50257": 20, "bpe": 20, "sub": [20, 40], "llamaforquestionansw": 21, "pleas": [21, 30, 32, 34, 36, 38, 43, 45], "subsequ": 21, "never": 21, "auto": [21, 22, 29, 31, 36], "rst": [21, 30, 31], "foundat": [21, 42], "hugo": 21, "touvron": 21, "thibaut": 21, "lavril": 21, "gautier": 21, "izacard": 21, "xavier": 21, "martinet": 21, "mari": 21, "ann": 21, "lachaux": 21, "timoth\u00e9": 21, "lacroix": 21, "baptist": 21, "rozi\u00e8r": 21, "eric": 21, "hambro": 21, "faisal": 21, "azhar": 21, "aurelien": 21, "rodriguez": 21, "armand": 21, "joulin": 21, "edouard": [21, 27], "grave": [21, 27], "guillaum": [21, 27], "lampl": 21, "7b": 21, "65b": 21, "trillion": 21, "exclus": 21, "resort": 21, "proprietari": 21, "particular": 21, "13b": 21, "175b": 21, "best": [21, 45], "chinchilla": 21, "70b": 21, "palm": 21, "540b": 21, "commun": 21, "cache_posit": 21, "multilingu": [22, 23, 28], "jiatao": 22, "gu": 22, "xian": 22, "sergei": 22, "edunov": 22, "monolingu": [22, 28], "corpora": 22, "complet": [22, 43, 46], "focus": [22, 44], "vari": 22, "target": [22, 29, 42, 45], "25004": 22, "en_xx": 22, "25003": 22, "de_d": 22, "massiv": 23, "lint": 23, "xue": 23, "noah": 23, "adam": [23, 25], "robert": [23, 25], "mihir": 23, "kale": 23, "rami": 23, "rfou": 23, "siddhant": 23, "barua": 23, "colin": [23, 25], "raffel": [23, 25], "unifi": [23, 25, 36, 39], "attain": [23, 26], "english": [23, 25], "crawl": 23, "cover": [23, 28], "101": 23, "design": [23, 30, 31, 37, 40, 44], "prevent": 23, "accident": 23, "partial": 23, "10683": [23, 25], "noam": [23, 25], "shazeer": [23, 25], "katherin": [23, 25], "sharan": [23, 25], "narang": [23, 25], "prepar": [23, 25, 35, 46], "robustli": 24, "myle": [24, 27], "ott": [24, 27], "jingfei": 24, "du": 24, "mandar": 24, "joshi": 24, "danqi": 24, "veselin": [24, 27], "type_vocab_s": 24, "alwai": [24, 32], "mixtur": 25, "box": [25, 33, 39, 43], "german": 25, "easiest": 25, "appendix": 25, "worth": 26, "alexei": 26, "dosovitskii": 26, "luca": 26, "beyer": 26, "alexand": 26, "kolesnikov": 26, "dirk": 26, "weissenborn": 26, "xiaohua": 26, "zhai": 26, "thoma": 26, "unterthin": 26, "mostafa": 26, "dehghani": 26, "matthia": 26, "minder": 26, "georg": 26, "heigold": 26, "sylvain": 26, "gelli": 26, "jakob": 26, "uszkoreit": 26, "neil": 26, "de": 26, "facto": 26, 
"conjunct": 26, "certain": 26, "compon": [26, 28, 30, 36, 39, 40], "keep": [26, 28, 40, 42], "overal": 26, "relianc": 26, "cnn": 26, "pure": 26, "mid": 26, "cifar": 26, "vtab": 26, "excel": 26, "fewer": 26, "interpolate_pos_encod": 26, "vitimageprocessor": 26, "interpol": 26, "alexi": 27, "conneau": 27, "kartikai": 27, "khandelw": 27, "vishrav": 27, "chaudhari": 27, "wenzek": 27, "francisco": 27, "guzm\u00e1n": 27, "5tb": 27, "filter": [27, 35], "commoncrawl": 27, "af_za": 28, "reli": 28, "set_default_languag": 28, "lang_id": 28, "known": [28, 38], "suffer": 28, "curs": 28, "address": [28, 36], "grow": 28, "total": 28, "capac": 28, "prior": 28, "hoc": 28, "entiti": 28, "mitig": 28, "neg": 28, "interfer": 28, "furthermor": 28, "longer": [28, 32, 45, 46], "default_languag": 28, "write": [29, 35, 38], "help": [29, 40], "whichev": 29, "welcom": [29, 41], "close": 29, "aspect": 29, "guid": [29, 30, 45], "addition": [29, 31, 39, 40, 41, 43], "instal": [29, 36, 45, 46], "go": [29, 30, 31, 35, 38, 44, 45], "procedur": [29, 45], "fork": 29, "copi": [29, 30, 31], "account": 29, "your_usernam": [29, 35], "cd": [29, 37, 45], "submodul": [29, 30, 39], "virtual": 29, "virtualenv": 29, "conda": 29, "command": [29, 42, 45], "websit": [29, 35, 38], "pip": [29, 36, 45, 46], "hf_transform": 29, "adding_adapter_method": 29, "adding_adapters_to_a_model": 29, "makefil": 29, "ci": 29, "pipelin": [29, 46], "pull": [29, 41], "whole": [29, 31, 43], "black": 29, "isort": 29, "qualiti": 29, "ensur": [29, 31], "flake8": 29, "access": [29, 35, 38], "huggingface_hub": 29, "codebas": [30, 46], "philosophi": [30, 31], "seamlessli": [30, 31], "entir": [30, 40, 45], "opt": 30, "still": [30, 36, 45], "minim": [30, 31, 38], "mixin": [30, 31, 36], "highlight": [30, 40], "highli": [30, 37], "mostli": 30, "insert": [30, 31, 42], "resid": 30, "src": [30, 31], "py": [30, 31, 45], "adapter_layer_bas": 30, "mark": 30, "importantli": 30, "concret": [30, 39, 40], "heavili": 30, "skeleton": 30, "firstli": 30, "section": [30, 32, 40, 42, 43, 44, 45], "constitut": 30, "expect": 30, "bottleneckst": 30, "compose_": 30, "bottlenecklay": 30, "actual": 30, "again": [30, 31, 35, 40, 43], "interact": [30, 35], "consid": [30, 36, 46], "model_overview": [30, 31], "thing": 30, "adapterload": [30, 33], "adaptermethodbasetestmixin": 30, "test_": [30, 31], "testmixin": 30, "adaptertest": [30, 31], "bertadaptertest": 30, "live": [30, 31], "overview": [30, 31, 32, 36, 38, 40, 43, 44], "md": [30, 31, 35], "column": 30, "readm": [30, 35], "properli": [30, 31], "ideal": [30, 31], "tree": [30, 31], "delv": 31, "yourself": 31, "suffici": 31, "modif": [31, 40, 45], "four": 31, "let": [31, 35, 38, 42, 43], "examin": [31, 38], "purpos": [31, 46], "modeling_bert": 31, "mixin_bert": 31, "bertselfattent": 31, "bertselfattentionadaptersmixin": 31, "discuss": 31, "bertlay": 31, "edit": 31, "adapter_model": 31, "add_": 31, "_head": 31, "mixin_": 31, "modeling_": 31, "reus": 31, "figur": [31, 40], "think": 31, "guidanc": 31, "robertalay": 31, "xlmrobertalay": 31, "debertalay": 31, "debertav2lay": 31, "bertgenerationlay": 31, "bertlayeradaptersmixin": 31, "prefixtuninglay": 31, "bottleneck_layer_forward": 31, "modelbaseadaptersmixin": 31, "embeddingadaptersmixin": 31, "invertibleadaptersmixin": 31, "modelusingsubmodelsadaptersmixin": 31, "enough": 31, "withadapt": 31, "bertselfattentionwithadapt": 31, "patch_forward": 31, "correctli": 31, "acceler": 31, "packag": [31, 36, 37], "model_mixin_map": 31, "modelwithflexibleheadsadaptersmixin": 31, "sens": 31, 
"adapter_model_mapping_nam": 31, "wrapper": 31, "num_attention_head": 31, "hidden_dropout_prob": 31, "attention_probs_dropout_prob": 31, "config_class_keys_map": 31, "cf": 31, "150": 31, "flex": 31, "everyth": [31, 38], "adaptertestbas": 31, "tokenizer_nam": 31, "classconversiontest": 31, "adaptermodeltest": 31, "modeltest": 31, "orient": 31, "autodoc": 31, "toi": 32, "illustr": [32, 39, 40], "unsur": 32, "active_embed": 32, "dir": [32, 33, 43], "reloaded_nam": 32, "loaded_token": 32, "yet": [33, 43], "countless": 33, "thinkabl": 33, "plugin": 33, "extract": [33, 42], "preinclud": 33, "predictionheadload": 33, "filter_func": 33, "rename_func": 33, "old_nam": 33, "new_nam": 33, "renam": [33, 46], "loader": 33, "mycustomweightsload": 33, "custom_weights_nam": 33, "legaci": [34, 36, 46], "hundr": 35, "link": 35, "easi": [35, 42], "pf": 35, "sick": 35, "conveni": 35, "everyon": 35, "world": [35, 43], "ll": 35, "fastest": 35, "hugginfac": 35, "credenti": 35, "proce": 35, "sai": 35, "awesome_adapt": 35, "my": 35, "sentiment": [35, 38, 44], "imdb": 35, "voil\u00e0": 35, "anyon": [35, 43], "simplifi": 36, "central": [36, 38], "quick": 36, "introduct": 36, "quantiz": 36, "transit": 36, "initialis": [36, 40], "extend": [36, 41], "bertgener": 36, "eleutherai": 36, "6b": 36, "_adapters_": 36, "cite": 36, "2311": 36, "11077": 36, "inproceed": 36, "poth": 36, "etal": 36, "titl": 36, "clifton": 36, "sterz": 36, "hannah": 36, "paul": 36, "indraneil": 36, "purkayastha": 36, "sukannya": 36, "engl": 36, "nder": 36, "leon": 36, "imhof": 36, "timo": 36, "vuli": 36, "ivan": 36, "ruder": [36, 40], "gurevych": 36, "iryna": 36, "jona": 36, "booktitl": 36, "proceed": 36, "confer": 36, "month": 36, "dec": 36, "year": [36, 42], "singapor": 36, "publish": [36, 43], "linguist": 36, "aclantholog": 36, "emnlp": 36, "demo": 36, "13": 36, "149": 36, "160": 36, "predecessor": 36, "infrastructur": 36, "team": 36, "pfeiffer2020adapterhub": 36, "andrea": 36, "uckl": 36, "aishwarya": 36, "kamath": 36, "kyunghyun": 36, "cho": 36, "onlin": 36, "aclweb": 36, "antholog": 36, "46": 36, "54": 36, "recommend": [37, 43, 44, 46], "latest": [37, 45], "migrat": [38, 45], "programmat": 38, "adapter_info": 38, "uncased_sentiment_sst": 38, "2_pfeiffer": 38, "suppos": [38, 45], "adaptr": 38, "sst": [38, 44, 45], "analysi": [38, 39], "suitabl": 38, "list_available_adapt": 38, "info": 38, "underneath": 38, "bit": 38, "explicitli": [38, 43], "predefin": [38, 40], "interchang": 38, "standalon": 39, "joint": 39, "easier": [39, 46], "union_adapt": 39, "800": [39, 42], "toward": [39, 42], "color": [39, 40], "shade": [39, 40], "magenta": [39, 40], "too": [39, 43], "mathcal": 39, "_m": 39, "w_": [39, 40], "sigmoid": 39, "sigma": 39, "leftarrow": [39, 40, 42], "cdot": [39, 40], "w_0": [39, 40], "_": 39, "loraconfig": [39, 40, 42], "insight": 39, "gating_scor": 39, "adapter_gating_scor": 39, "slightlti": 39, "tabular": 40, "d_": 40, "norm": 40, "indirectli": 40, "ratio": 40, "frac": 40, "literatur": [40, 42], "bottleneck_adapt": 40, "bapna": 40, "firat": 40, "lm": [40, 42], "enter": 40, "lang_adapt": 40, "distinct": 40, "enumer": 40, "p": 40, "head_i": 40, "w_i": 40, "p_i": 40, "eject": 40, "retain": 40, "continu": [40, 46], "exchang": [40, 45], "construct": 40, "use_phm": 40, "dummi": 40, "hypercomplex": 40, "henderson": 40, "decomposit": 40, "mathbb": 40, "dimension": 40, "principl": 40, "dens": 40, "lora_adapt": 40, "latenc": 40, "overhead": 40, "minimum": 40, "accomplish": 40, "ia3config": [40, 42], "l_w": 40, "rescal": 40, "w": 40, "odot": 40, 
"denot": 40, "broadcast": 40, "ia3_adapt": 40, "beyond": 40, "tunabl": 40, "soft": 40, "x_1": 40, "x_2": 40, "x_n": 40, "x_e": 40, "space": 40, "p_e": 40, "prompttuningconfig": [40, 42], "drawn": 40, "bo": 40, "power": 40, "xadaptermodel": [41, 44], "ship": 41, "bertforsequenceclassif": [41, 43], "feel": 41, "preval": 42, "costli": 42, "seri": 42, "lightweight": 42, "establish": 42, "commonli": 42, "offer": 42, "benefit": [42, 45], "tini": 42, "deploi": 42, "3mb": 42, "440mb": 42, "par": 42, "theta": 42, "frozen": 42, "phi": 42, "min_": 42, "earli": 42, "success": 42, "laid": 42, "idea": 42, "methodolog": 42, "explain": 42, "double_seq_bn": [42, 46], "par_bn": 42, "scaled_par_bn": 42, "seq_bn_inv": [42, 45, 46], "double_seq_bn_inv": [42, 46], "prefix_tuning_flat": 42, "ia\u00b3": 42, "mam": 42, "concis": 42, "especi": 42, "extern": 42, "line": [42, 43, 45], "squar": 42, "bracket": 42, "omit": 42, "join": [42, 43, 44], "side": 43, "whenev": 43, "binari": 43, "alon": [43, 46], "gave": 43, "tell": 43, "lastli": 43, "strongli": [43, 46], "robertaforsequenceclassif": 43, "automodelformultiplechoic": 43, "encourag": 43, "someon": 43, "hi": 43, "automodelforsequenceclassif": [43, 45], "static_head_model": 43, "test": [43, 45], "temp_dir": 43, "getcwd": [43, 44], "flex_head_model": 43, "assert": [43, 44], "opposit": 43, "templat": 43, "guidelin": 43, "customhead": 43, "predictionhead": 43, "def": 43, "inniti": 43, "notifi": 43, "add_custom_head": 43, "register_custom_head": 43, "my_custom_head": 43, "custom_head": 43, "treat": 43, "coupl": 44, "creation": 44, "briefli": 44, "showcas": 44, "visit": 44, "quickstart": 44, "clearli": 44, "fun": 44, "input_data": 44, "sake": 44, "example_path": 44, "finish": 44, "restor": 44, "scenario": 45, "slightli": 45, "your_examples_fold": 45, "txt": 45, "minor": 45, "run_glu": 45, "hfargumentpars": 45, "parser": 45, "modelargu": 45, "datatrainingargu": 45, "data_arg": 45, "training_arg": 45, "parse_args_into_dataclass": 45, "model_name_or_path": 45, "task_nam": 45, "job": 45, "anywher": 45, "crucial": 45, "outsid": 45, "later": 45, "checkout": 45, "technic": 45, "loop": 45, "run_multiple_choic": 45, "run_squad": 45, "export": 45, "do_train": 45, "do_ev": 45, "max_seq_length": 45, "128": 45, "per_device_train_batch_s": 45, "learning_r": 45, "1e": 45, "num_train_epoch": 45, "output_dir": 45, "tmp": 45, "overwrite_output_dir": 45, "why": 45, "higher": 45, "overfit": 45, "epoch": 45, "straightforward": 45, "run_mlm": 45, "train_fil": 45, "validation_fil": 45, "run_fusion_glu": 45, "stage": 45, "glue_dir": 45, "data_dir": 45, "5e": 45, "trainings_arg": 45, "trainingsargu": 45, "do_save_full_model": 45, "do_save_adapt": 45, "do_save_adapter_fus": 45, "qlora": 45, "dettmer": 45, "bitsandbyt": 45, "hand": 45, "degrad": 46, "successor": 46, "essenti": 46, "break": 46, "trigger": 46, "intuit": 46, "par_seq_bn": 46, "inv": 46, "old": 46, "pfeifferconfig": 46, "consequ": 46, "anymor": 46, "adapterconfigbas": 46}, "objects": {"adapters": [[1, 0, 1, "", "AdapterConfig"], [1, 0, 1, "", "AdapterFusionConfig"], [2, 0, 1, "", "AdapterLayerBase"], [1, 0, 1, "", "AdapterSetup"], [7, 0, 1, "", "AlbertAdapterModel"], [8, 0, 1, "", "AutoAdapterModel"], [9, 0, 1, "", "BartAdapterModel"], [10, 0, 1, "", "BeitAdapterModel"], [11, 0, 1, "", "BertAdapterModel"], [12, 0, 1, "", "BertGenerationAdapterModel"], [1, 0, 1, "", "BnConfig"], [1, 0, 1, "", "CompacterConfig"], [1, 0, 1, "", "CompacterPlusPlusConfig"], [2, 0, 1, "", "ComposableAdapterLayerBase"], [1, 0, 1, "", "ConfigUnion"], [14, 0, 
1, "", "DebertaAdapterModel"], [15, 0, 1, "", "DebertaV2AdapterModel"], [16, 0, 1, "", "DistilBertAdapterModel"], [1, 0, 1, "", "DoubleSeqBnConfig"], [1, 0, 1, "", "DoubleSeqBnInvConfig"], [1, 0, 1, "", "DynamicAdapterFusionConfig"], [17, 0, 1, "", "ElectraAdapterModel"], [6, 0, 1, "", "EmbeddingAdaptersMixin"], [19, 0, 1, "", "GPT2AdapterModel"], [20, 0, 1, "", "GPTJAdapterModel"], [1, 0, 1, "", "IA3Config"], [6, 0, 1, "", "InvertibleAdaptersMixin"], [21, 0, 1, "", "LlamaAdapterModel"], [1, 0, 1, "", "LoRAConfig"], [1, 0, 1, "", "MAMConfig"], [22, 0, 1, "", "MBartAdapterModel"], [23, 0, 1, "", "MT5AdapterModel"], [5, 0, 1, "", "ModelAdaptersConfig"], [6, 0, 1, "", "ModelAdaptersMixin"], [6, 0, 1, "", "ModelWithFlexibleHeadsAdaptersMixin"], [6, 0, 1, "", "ModelWithHeadsAdaptersMixin"], [1, 0, 1, "", "ParBnConfig"], [1, 0, 1, "", "PrefixTuningConfig"], [1, 0, 1, "", "PromptTuningConfig"], [24, 0, 1, "", "RobertaAdapterModel"], [1, 0, 1, "", "SeqBnConfig"], [1, 0, 1, "", "SeqBnInvConfig"], [1, 0, 1, "", "StaticAdapterFusionConfig"], [25, 0, 1, "", "T5AdapterModel"], [1, 0, 1, "", "UniPELTConfig"], [26, 0, 1, "", "ViTAdapterModel"], [27, 0, 1, "", "XLMRobertaAdapterModel"], [28, 0, 1, "", "XmodAdapterModel"], [3, 3, 0, "-", "trainer"], [3, 3, 0, "-", "training"], [4, 3, 0, "-", "utils"]], "adapters.AdapterConfig": [[1, 1, 1, "", "from_dict"], [1, 1, 1, "", "load"], [1, 1, 1, "", "replace"], [1, 1, 1, "", "to_dict"]], "adapters.AdapterFusionConfig": [[1, 1, 1, "", "from_dict"], [1, 1, 1, "", "load"], [1, 1, 1, "", "replace"], [1, 1, 1, "", "to_dict"]], "adapters.AdapterLayerBase": [[2, 1, 1, "", "add_adapter"], [2, 1, 1, "", "average_adapter"], [2, 1, 1, "", "delete_adapter"], [2, 1, 1, "", "enable_adapters"], [2, 1, 1, "", "get_adapter"]], "adapters.AlbertAdapterModel": [[7, 2, 1, "", "active_adapters"], [7, 2, 1, "", "active_head"], [7, 1, 1, "", "adapter_fusion_to"], [7, 1, 1, "", "adapter_summary"], [7, 1, 1, "", "adapter_to"], [7, 1, 1, "", "add_adapter"], [7, 1, 1, "", "add_adapter_fusion"], [7, 1, 1, "", "add_classification_head"], [7, 1, 1, "", "add_masked_lm_head"], [7, 1, 1, "", "add_multiple_choice_head"], [7, 1, 1, "", "add_qa_head"], [7, 1, 1, "", "add_tagging_head"], [7, 1, 1, "", "apply_to_adapter_layers"], [7, 1, 1, "", "apply_to_basemodel_childs"], [7, 1, 1, "", "average_adapter"], [7, 1, 1, "", "delete_adapter"], [7, 1, 1, "", "delete_adapter_fusion"], [7, 1, 1, "", "delete_head"], [7, 1, 1, "", "eject_prefix_tuning"], [7, 1, 1, "", "forward"], [7, 1, 1, "", "forward_context"], [7, 1, 1, "", "forward_head"], [7, 1, 1, "", "freeze_model"], [7, 1, 1, "", "get_adapter"], [7, 1, 1, "", "get_labels"], [7, 1, 1, "", "get_labels_dict"], [7, 1, 1, "", "get_output_embeddings"], [7, 1, 1, "", "head_type"], [7, 1, 1, "", "init_adapters"], [7, 1, 1, "", "iter_layers"], [7, 1, 1, "", "load_adapter"], [7, 1, 1, "", "load_adapter_fusion"], [7, 1, 1, "", "load_head"], [7, 1, 1, "", "merge_adapter"], [7, 1, 1, "", "push_adapter_to_hub"], [7, 1, 1, "", "reset_adapter"], [7, 1, 1, "", "save_adapter"], [7, 1, 1, "", "save_adapter_fusion"], [7, 1, 1, "", "save_all_adapter_fusions"], [7, 1, 1, "", "save_all_adapters"], [7, 1, 1, "", "save_all_heads"], [7, 1, 1, "", "save_head"], [7, 1, 1, "", "save_pretrained"], [7, 1, 1, "", "set_active_adapters"], [7, 1, 1, "", "tie_weights"], [7, 1, 1, "", "train_adapter"], [7, 1, 1, "", "train_adapter_fusion"], [7, 1, 1, "", "train_fusion"]], "adapters.AutoAdapterModel": [[8, 1, 1, "", "from_config"], [8, 1, 1, "", "from_pretrained"]], 
"adapters.BartAdapterModel": [[9, 2, 1, "", "active_adapters"], [9, 2, 1, "", "active_head"], [9, 1, 1, "", "adapter_fusion_to"], [9, 1, 1, "", "adapter_summary"], [9, 1, 1, "", "adapter_to"], [9, 1, 1, "", "add_adapter"], [9, 1, 1, "", "add_adapter_fusion"], [9, 1, 1, "", "add_classification_head"], [9, 1, 1, "", "add_qa_head"], [9, 1, 1, "", "add_seq2seq_lm_head"], [9, 1, 1, "", "apply_to_adapter_layers"], [9, 1, 1, "", "apply_to_basemodel_childs"], [9, 1, 1, "", "average_adapter"], [9, 1, 1, "", "delete_adapter"], [9, 1, 1, "", "delete_adapter_fusion"], [9, 1, 1, "", "delete_head"], [9, 1, 1, "", "eject_prefix_tuning"], [9, 1, 1, "", "forward"], [9, 1, 1, "", "forward_context"], [9, 1, 1, "", "forward_head"], [9, 1, 1, "", "freeze_model"], [9, 1, 1, "", "get_adapter"], [9, 1, 1, "", "get_labels"], [9, 1, 1, "", "get_labels_dict"], [9, 1, 1, "", "get_output_embeddings"], [9, 1, 1, "", "head_type"], [9, 1, 1, "", "init_adapters"], [9, 1, 1, "", "iter_layers"], [9, 1, 1, "", "load_adapter"], [9, 1, 1, "", "load_adapter_fusion"], [9, 1, 1, "", "load_head"], [9, 1, 1, "", "merge_adapter"], [9, 1, 1, "", "push_adapter_to_hub"], [9, 1, 1, "", "reset_adapter"], [9, 1, 1, "", "save_adapter"], [9, 1, 1, "", "save_adapter_fusion"], [9, 1, 1, "", "save_all_adapter_fusions"], [9, 1, 1, "", "save_all_adapters"], [9, 1, 1, "", "save_all_heads"], [9, 1, 1, "", "save_head"], [9, 1, 1, "", "save_pretrained"], [9, 1, 1, "", "set_active_adapters"], [9, 1, 1, "", "tie_weights"], [9, 1, 1, "", "train_adapter"], [9, 1, 1, "", "train_adapter_fusion"], [9, 1, 1, "", "train_fusion"]], "adapters.BeitAdapterModel": [[10, 2, 1, "", "active_adapters"], [10, 2, 1, "", "active_head"], [10, 1, 1, "", "adapter_fusion_to"], [10, 1, 1, "", "adapter_summary"], [10, 1, 1, "", "adapter_to"], [10, 1, 1, "", "add_adapter"], [10, 1, 1, "", "add_adapter_fusion"], [10, 1, 1, "", "add_image_classification_head"], [10, 1, 1, "", "apply_to_adapter_layers"], [10, 1, 1, "", "apply_to_basemodel_childs"], [10, 1, 1, "", "average_adapter"], [10, 1, 1, "", "delete_adapter"], [10, 1, 1, "", "delete_adapter_fusion"], [10, 1, 1, "", "delete_head"], [10, 1, 1, "", "eject_prefix_tuning"], [10, 1, 1, "", "forward"], [10, 1, 1, "", "forward_context"], [10, 1, 1, "", "forward_head"], [10, 1, 1, "", "freeze_model"], [10, 1, 1, "", "get_adapter"], [10, 1, 1, "", "get_labels"], [10, 1, 1, "", "get_labels_dict"], [10, 1, 1, "", "get_output_embeddings"], [10, 1, 1, "", "head_type"], [10, 1, 1, "", "init_adapters"], [10, 1, 1, "", "iter_layers"], [10, 1, 1, "", "load_adapter"], [10, 1, 1, "", "load_adapter_fusion"], [10, 1, 1, "", "load_head"], [10, 1, 1, "", "merge_adapter"], [10, 1, 1, "", "push_adapter_to_hub"], [10, 1, 1, "", "reset_adapter"], [10, 1, 1, "", "save_adapter"], [10, 1, 1, "", "save_adapter_fusion"], [10, 1, 1, "", "save_all_adapter_fusions"], [10, 1, 1, "", "save_all_adapters"], [10, 1, 1, "", "save_all_heads"], [10, 1, 1, "", "save_head"], [10, 1, 1, "", "save_pretrained"], [10, 1, 1, "", "set_active_adapters"], [10, 1, 1, "", "tie_weights"], [10, 1, 1, "", "train_adapter"], [10, 1, 1, "", "train_adapter_fusion"], [10, 1, 1, "", "train_fusion"]], "adapters.BertAdapterModel": [[11, 2, 1, "", "active_adapters"], [11, 2, 1, "", "active_head"], [11, 1, 1, "", "adapter_fusion_to"], [11, 1, 1, "", "adapter_summary"], [11, 1, 1, "", "adapter_to"], [11, 1, 1, "", "add_adapter"], [11, 1, 1, "", "add_adapter_fusion"], [11, 1, 1, "", "add_causal_lm_head"], [11, 1, 1, "", "add_classification_head"], [11, 1, 1, "", "add_dependency_parsing_head"], 
[11, 1, 1, "", "add_masked_lm_head"], [11, 1, 1, "", "add_multiple_choice_head"], [11, 1, 1, "", "add_qa_head"], [11, 1, 1, "", "add_tagging_head"], [11, 1, 1, "", "apply_to_adapter_layers"], [11, 1, 1, "", "apply_to_basemodel_childs"], [11, 1, 1, "", "average_adapter"], [11, 1, 1, "", "delete_adapter"], [11, 1, 1, "", "delete_adapter_fusion"], [11, 1, 1, "", "delete_head"], [11, 1, 1, "", "eject_prefix_tuning"], [11, 1, 1, "", "forward"], [11, 1, 1, "", "forward_context"], [11, 1, 1, "", "forward_head"], [11, 1, 1, "", "freeze_model"], [11, 1, 1, "", "get_adapter"], [11, 1, 1, "", "get_labels"], [11, 1, 1, "", "get_labels_dict"], [11, 1, 1, "", "get_output_embeddings"], [11, 1, 1, "", "head_type"], [11, 1, 1, "", "init_adapters"], [11, 1, 1, "", "iter_layers"], [11, 1, 1, "", "load_adapter"], [11, 1, 1, "", "load_adapter_fusion"], [11, 1, 1, "", "load_head"], [11, 1, 1, "", "merge_adapter"], [11, 1, 1, "", "push_adapter_to_hub"], [11, 1, 1, "", "reset_adapter"], [11, 1, 1, "", "save_adapter"], [11, 1, 1, "", "save_adapter_fusion"], [11, 1, 1, "", "save_all_adapter_fusions"], [11, 1, 1, "", "save_all_adapters"], [11, 1, 1, "", "save_all_heads"], [11, 1, 1, "", "save_head"], [11, 1, 1, "", "save_pretrained"], [11, 1, 1, "", "set_active_adapters"], [11, 1, 1, "", "tie_weights"], [11, 1, 1, "", "train_adapter"], [11, 1, 1, "", "train_adapter_fusion"], [11, 1, 1, "", "train_fusion"]], "adapters.BertGenerationAdapterModel": [[12, 2, 1, "", "active_adapters"], [12, 2, 1, "", "active_head"], [12, 1, 1, "", "adapter_fusion_to"], [12, 1, 1, "", "adapter_summary"], [12, 1, 1, "", "adapter_to"], [12, 1, 1, "", "add_adapter"], [12, 1, 1, "", "add_adapter_fusion"], [12, 1, 1, "", "add_causal_lm_head"], [12, 1, 1, "", "add_masked_lm_head"], [12, 1, 1, "", "apply_to_adapter_layers"], [12, 1, 1, "", "apply_to_basemodel_childs"], [12, 1, 1, "", "average_adapter"], [12, 1, 1, "", "delete_adapter"], [12, 1, 1, "", "delete_adapter_fusion"], [12, 1, 1, "", "delete_head"], [12, 1, 1, "", "eject_prefix_tuning"], [12, 1, 1, "", "forward"], [12, 1, 1, "", "forward_context"], [12, 1, 1, "", "forward_head"], [12, 1, 1, "", "freeze_model"], [12, 1, 1, "", "get_adapter"], [12, 1, 1, "", "get_labels"], [12, 1, 1, "", "get_labels_dict"], [12, 1, 1, "", "get_output_embeddings"], [12, 1, 1, "", "head_type"], [12, 1, 1, "", "init_adapters"], [12, 1, 1, "", "iter_layers"], [12, 1, 1, "", "load_adapter"], [12, 1, 1, "", "load_adapter_fusion"], [12, 1, 1, "", "load_head"], [12, 1, 1, "", "merge_adapter"], [12, 1, 1, "", "push_adapter_to_hub"], [12, 1, 1, "", "reset_adapter"], [12, 1, 1, "", "save_adapter"], [12, 1, 1, "", "save_adapter_fusion"], [12, 1, 1, "", "save_all_adapter_fusions"], [12, 1, 1, "", "save_all_adapters"], [12, 1, 1, "", "save_all_heads"], [12, 1, 1, "", "save_head"], [12, 1, 1, "", "save_pretrained"], [12, 1, 1, "", "set_active_adapters"], [12, 1, 1, "", "tie_weights"], [12, 1, 1, "", "train_adapter"], [12, 1, 1, "", "train_adapter_fusion"], [12, 1, 1, "", "train_fusion"]], "adapters.BnConfig": [[1, 1, 1, "", "from_dict"], [1, 1, 1, "", "load"], [1, 1, 1, "", "replace"], [1, 1, 1, "", "to_dict"]], "adapters.ComposableAdapterLayerBase": [[2, 1, 1, "", "check_composition_valid"], [2, 1, 1, "", "compose"], [2, 1, 1, "", "compose_average"], [2, 1, 1, "", "compose_batch_split"], [2, 1, 1, "", "compose_fuse"], [2, 1, 1, "", "compose_parallel"], [2, 1, 1, "", "compose_single"], [2, 1, 1, "", "compose_split"], [2, 1, 1, "", "compose_stack"], [2, 1, 1, "", "mean"], [2, 1, 1, "", "pad_and_concat"], [2, 1, 1, "", 
"pre_block"], [2, 1, 1, "", "repeat"], [2, 1, 1, "", "vslice"]], "adapters.ConfigUnion": [[1, 1, 1, "", "from_dict"], [1, 1, 1, "", "load"], [1, 1, 1, "", "replace"], [1, 1, 1, "", "to_dict"], [1, 1, 1, "", "validate"]], "adapters.DebertaAdapterModel": [[14, 2, 1, "", "active_adapters"], [14, 2, 1, "", "active_head"], [14, 1, 1, "", "adapter_fusion_to"], [14, 1, 1, "", "adapter_summary"], [14, 1, 1, "", "adapter_to"], [14, 1, 1, "", "add_adapter"], [14, 1, 1, "", "add_adapter_fusion"], [14, 1, 1, "", "add_classification_head"], [14, 1, 1, "", "add_masked_lm_head"], [14, 1, 1, "", "add_multiple_choice_head"], [14, 1, 1, "", "add_qa_head"], [14, 1, 1, "", "add_tagging_head"], [14, 1, 1, "", "apply_to_adapter_layers"], [14, 1, 1, "", "apply_to_basemodel_childs"], [14, 1, 1, "", "average_adapter"], [14, 1, 1, "", "delete_adapter"], [14, 1, 1, "", "delete_adapter_fusion"], [14, 1, 1, "", "delete_head"], [14, 1, 1, "", "eject_prefix_tuning"], [14, 1, 1, "", "forward"], [14, 1, 1, "", "forward_context"], [14, 1, 1, "", "forward_head"], [14, 1, 1, "", "freeze_model"], [14, 1, 1, "", "get_adapter"], [14, 1, 1, "", "get_labels"], [14, 1, 1, "", "get_labels_dict"], [14, 1, 1, "", "get_output_embeddings"], [14, 1, 1, "", "head_type"], [14, 1, 1, "", "init_adapters"], [14, 1, 1, "", "iter_layers"], [14, 1, 1, "", "load_adapter"], [14, 1, 1, "", "load_adapter_fusion"], [14, 1, 1, "", "load_head"], [14, 1, 1, "", "merge_adapter"], [14, 1, 1, "", "push_adapter_to_hub"], [14, 1, 1, "", "reset_adapter"], [14, 1, 1, "", "save_adapter"], [14, 1, 1, "", "save_adapter_fusion"], [14, 1, 1, "", "save_all_adapter_fusions"], [14, 1, 1, "", "save_all_adapters"], [14, 1, 1, "", "save_all_heads"], [14, 1, 1, "", "save_head"], [14, 1, 1, "", "save_pretrained"], [14, 1, 1, "", "set_active_adapters"], [14, 1, 1, "", "tie_weights"], [14, 1, 1, "", "train_adapter"], [14, 1, 1, "", "train_adapter_fusion"], [14, 1, 1, "", "train_fusion"]], "adapters.DebertaV2AdapterModel": [[15, 2, 1, "", "active_adapters"], [15, 2, 1, "", "active_head"], [15, 1, 1, "", "adapter_fusion_to"], [15, 1, 1, "", "adapter_summary"], [15, 1, 1, "", "adapter_to"], [15, 1, 1, "", "add_adapter"], [15, 1, 1, "", "add_adapter_fusion"], [15, 1, 1, "", "add_classification_head"], [15, 1, 1, "", "add_masked_lm_head"], [15, 1, 1, "", "add_multiple_choice_head"], [15, 1, 1, "", "add_qa_head"], [15, 1, 1, "", "add_tagging_head"], [15, 1, 1, "", "apply_to_adapter_layers"], [15, 1, 1, "", "apply_to_basemodel_childs"], [15, 1, 1, "", "average_adapter"], [15, 1, 1, "", "delete_adapter"], [15, 1, 1, "", "delete_adapter_fusion"], [15, 1, 1, "", "delete_head"], [15, 1, 1, "", "eject_prefix_tuning"], [15, 1, 1, "", "forward"], [15, 1, 1, "", "forward_context"], [15, 1, 1, "", "forward_head"], [15, 1, 1, "", "freeze_model"], [15, 1, 1, "", "get_adapter"], [15, 1, 1, "", "get_labels"], [15, 1, 1, "", "get_labels_dict"], [15, 1, 1, "", "get_output_embeddings"], [15, 1, 1, "", "head_type"], [15, 1, 1, "", "init_adapters"], [15, 1, 1, "", "iter_layers"], [15, 1, 1, "", "load_adapter"], [15, 1, 1, "", "load_adapter_fusion"], [15, 1, 1, "", "load_head"], [15, 1, 1, "", "merge_adapter"], [15, 1, 1, "", "push_adapter_to_hub"], [15, 1, 1, "", "reset_adapter"], [15, 1, 1, "", "save_adapter"], [15, 1, 1, "", "save_adapter_fusion"], [15, 1, 1, "", "save_all_adapter_fusions"], [15, 1, 1, "", "save_all_adapters"], [15, 1, 1, "", "save_all_heads"], [15, 1, 1, "", "save_head"], [15, 1, 1, "", "save_pretrained"], [15, 1, 1, "", "set_active_adapters"], [15, 1, 1, "", "tie_weights"], 
[15, 1, 1, "", "train_adapter"], [15, 1, 1, "", "train_adapter_fusion"], [15, 1, 1, "", "train_fusion"]], "adapters.DistilBertAdapterModel": [[16, 2, 1, "", "active_adapters"], [16, 2, 1, "", "active_head"], [16, 1, 1, "", "adapter_fusion_to"], [16, 1, 1, "", "adapter_summary"], [16, 1, 1, "", "adapter_to"], [16, 1, 1, "", "add_adapter"], [16, 1, 1, "", "add_adapter_fusion"], [16, 1, 1, "", "add_causal_lm_head"], [16, 1, 1, "", "add_classification_head"], [16, 1, 1, "", "add_dependency_parsing_head"], [16, 1, 1, "", "add_masked_lm_head"], [16, 1, 1, "", "add_multiple_choice_head"], [16, 1, 1, "", "add_qa_head"], [16, 1, 1, "", "add_tagging_head"], [16, 1, 1, "", "apply_to_adapter_layers"], [16, 1, 1, "", "apply_to_basemodel_childs"], [16, 1, 1, "", "average_adapter"], [16, 1, 1, "", "delete_adapter"], [16, 1, 1, "", "delete_adapter_fusion"], [16, 1, 1, "", "delete_head"], [16, 1, 1, "", "eject_prefix_tuning"], [16, 1, 1, "", "forward"], [16, 1, 1, "", "forward_context"], [16, 1, 1, "", "forward_head"], [16, 1, 1, "", "freeze_model"], [16, 1, 1, "", "get_adapter"], [16, 1, 1, "", "get_labels"], [16, 1, 1, "", "get_labels_dict"], [16, 1, 1, "", "get_output_embeddings"], [16, 1, 1, "", "get_position_embeddings"], [16, 1, 1, "", "head_type"], [16, 1, 1, "", "init_adapters"], [16, 1, 1, "", "iter_layers"], [16, 1, 1, "", "load_adapter"], [16, 1, 1, "", "load_adapter_fusion"], [16, 1, 1, "", "load_head"], [16, 1, 1, "", "merge_adapter"], [16, 1, 1, "", "push_adapter_to_hub"], [16, 1, 1, "", "reset_adapter"], [16, 1, 1, "", "resize_position_embeddings"], [16, 1, 1, "", "save_adapter"], [16, 1, 1, "", "save_adapter_fusion"], [16, 1, 1, "", "save_all_adapter_fusions"], [16, 1, 1, "", "save_all_adapters"], [16, 1, 1, "", "save_all_heads"], [16, 1, 1, "", "save_head"], [16, 1, 1, "", "save_pretrained"], [16, 1, 1, "", "set_active_adapters"], [16, 1, 1, "", "tie_weights"], [16, 1, 1, "", "train_adapter"], [16, 1, 1, "", "train_adapter_fusion"], [16, 1, 1, "", "train_fusion"]], "adapters.ElectraAdapterModel": [[17, 2, 1, "", "active_adapters"], [17, 2, 1, "", "active_head"], [17, 1, 1, "", "adapter_fusion_to"], [17, 1, 1, "", "adapter_summary"], [17, 1, 1, "", "adapter_to"], [17, 1, 1, "", "add_adapter"], [17, 1, 1, "", "add_adapter_fusion"], [17, 1, 1, "", "add_causal_lm_head"], [17, 1, 1, "", "add_classification_head"], [17, 1, 1, "", "add_dependency_parsing_head"], [17, 1, 1, "", "add_masked_lm_head"], [17, 1, 1, "", "add_multiple_choice_head"], [17, 1, 1, "", "add_qa_head"], [17, 1, 1, "", "add_tagging_head"], [17, 1, 1, "", "apply_to_adapter_layers"], [17, 1, 1, "", "apply_to_basemodel_childs"], [17, 1, 1, "", "average_adapter"], [17, 1, 1, "", "delete_adapter"], [17, 1, 1, "", "delete_adapter_fusion"], [17, 1, 1, "", "delete_head"], [17, 1, 1, "", "eject_prefix_tuning"], [17, 1, 1, "", "forward"], [17, 1, 1, "", "forward_context"], [17, 1, 1, "", "forward_head"], [17, 1, 1, "", "freeze_model"], [17, 1, 1, "", "get_adapter"], [17, 1, 1, "", "get_labels"], [17, 1, 1, "", "get_labels_dict"], [17, 1, 1, "", "get_output_embeddings"], [17, 1, 1, "", "head_type"], [17, 1, 1, "", "init_adapters"], [17, 1, 1, "", "iter_layers"], [17, 1, 1, "", "load_adapter"], [17, 1, 1, "", "load_adapter_fusion"], [17, 1, 1, "", "load_head"], [17, 1, 1, "", "merge_adapter"], [17, 1, 1, "", "push_adapter_to_hub"], [17, 1, 1, "", "reset_adapter"], [17, 1, 1, "", "save_adapter"], [17, 1, 1, "", "save_adapter_fusion"], [17, 1, 1, "", "save_all_adapter_fusions"], [17, 1, 1, "", "save_all_adapters"], [17, 1, 1, "", 
"save_all_heads"], [17, 1, 1, "", "save_head"], [17, 1, 1, "", "save_pretrained"], [17, 1, 1, "", "set_active_adapters"], [17, 1, 1, "", "tie_weights"], [17, 1, 1, "", "train_adapter"], [17, 1, 1, "", "train_adapter_fusion"], [17, 1, 1, "", "train_fusion"]], "adapters.EmbeddingAdaptersMixin": [[6, 1, 1, "", "add_embeddings"], [6, 1, 1, "", "delete_embeddings"], [6, 1, 1, "", "load_embeddings"], [6, 1, 1, "", "save_embeddings"], [6, 1, 1, "", "set_active_embeddings"]], "adapters.GPT2AdapterModel": [[19, 2, 1, "", "active_adapters"], [19, 2, 1, "", "active_head"], [19, 1, 1, "", "adapter_fusion_to"], [19, 1, 1, "", "adapter_summary"], [19, 1, 1, "", "adapter_to"], [19, 1, 1, "", "add_adapter"], [19, 1, 1, "", "add_adapter_fusion"], [19, 1, 1, "", "add_causal_lm_head"], [19, 1, 1, "", "add_classification_head"], [19, 1, 1, "", "add_qa_head"], [19, 1, 1, "", "add_tagging_head"], [19, 1, 1, "", "apply_to_adapter_layers"], [19, 1, 1, "", "apply_to_basemodel_childs"], [19, 1, 1, "", "average_adapter"], [19, 1, 1, "", "delete_adapter"], [19, 1, 1, "", "delete_adapter_fusion"], [19, 1, 1, "", "delete_head"], [19, 1, 1, "", "eject_prefix_tuning"], [19, 1, 1, "", "forward"], [19, 1, 1, "", "forward_context"], [19, 1, 1, "", "forward_head"], [19, 1, 1, "", "freeze_model"], [19, 1, 1, "", "get_adapter"], [19, 1, 1, "", "get_labels"], [19, 1, 1, "", "get_labels_dict"], [19, 1, 1, "", "get_output_embeddings"], [19, 1, 1, "", "head_type"], [19, 1, 1, "", "init_adapters"], [19, 1, 1, "", "iter_layers"], [19, 1, 1, "", "load_adapter"], [19, 1, 1, "", "load_adapter_fusion"], [19, 1, 1, "", "load_head"], [19, 1, 1, "", "merge_adapter"], [19, 1, 1, "", "push_adapter_to_hub"], [19, 1, 1, "", "reset_adapter"], [19, 1, 1, "", "save_adapter"], [19, 1, 1, "", "save_adapter_fusion"], [19, 1, 1, "", "save_all_adapter_fusions"], [19, 1, 1, "", "save_all_adapters"], [19, 1, 1, "", "save_all_heads"], [19, 1, 1, "", "save_head"], [19, 1, 1, "", "save_pretrained"], [19, 1, 1, "", "set_active_adapters"], [19, 1, 1, "", "tie_weights"], [19, 1, 1, "", "train_adapter"], [19, 1, 1, "", "train_adapter_fusion"], [19, 1, 1, "", "train_fusion"]], "adapters.GPTJAdapterModel": [[20, 2, 1, "", "active_adapters"], [20, 2, 1, "", "active_head"], [20, 1, 1, "", "adapter_fusion_to"], [20, 1, 1, "", "adapter_summary"], [20, 1, 1, "", "adapter_to"], [20, 1, 1, "", "add_adapter"], [20, 1, 1, "", "add_adapter_fusion"], [20, 1, 1, "", "add_causal_lm_head"], [20, 1, 1, "", "add_classification_head"], [20, 1, 1, "", "add_qa_head"], [20, 1, 1, "", "add_tagging_head"], [20, 1, 1, "", "apply_to_adapter_layers"], [20, 1, 1, "", "apply_to_basemodel_childs"], [20, 1, 1, "", "average_adapter"], [20, 1, 1, "", "delete_adapter"], [20, 1, 1, "", "delete_adapter_fusion"], [20, 1, 1, "", "delete_head"], [20, 1, 1, "", "eject_prefix_tuning"], [20, 1, 1, "", "forward"], [20, 1, 1, "", "forward_context"], [20, 1, 1, "", "forward_head"], [20, 1, 1, "", "freeze_model"], [20, 1, 1, "", "get_adapter"], [20, 1, 1, "", "get_labels"], [20, 1, 1, "", "get_labels_dict"], [20, 1, 1, "", "get_output_embeddings"], [20, 1, 1, "", "head_type"], [20, 1, 1, "", "init_adapters"], [20, 1, 1, "", "iter_layers"], [20, 1, 1, "", "load_adapter"], [20, 1, 1, "", "load_adapter_fusion"], [20, 1, 1, "", "load_head"], [20, 1, 1, "", "merge_adapter"], [20, 1, 1, "", "push_adapter_to_hub"], [20, 1, 1, "", "reset_adapter"], [20, 1, 1, "", "save_adapter"], [20, 1, 1, "", "save_adapter_fusion"], [20, 1, 1, "", "save_all_adapter_fusions"], [20, 1, 1, "", "save_all_adapters"], [20, 1, 1, "", 
"save_all_heads"], [20, 1, 1, "", "save_head"], [20, 1, 1, "", "save_pretrained"], [20, 1, 1, "", "set_active_adapters"], [20, 1, 1, "", "tie_weights"], [20, 1, 1, "", "train_adapter"], [20, 1, 1, "", "train_adapter_fusion"], [20, 1, 1, "", "train_fusion"]], "adapters.IA3Config": [[1, 1, 1, "", "from_dict"], [1, 1, 1, "", "load"], [1, 1, 1, "", "replace"], [1, 1, 1, "", "to_dict"]], "adapters.InvertibleAdaptersMixin": [[6, 1, 1, "", "add_invertible_adapter"]], "adapters.LlamaAdapterModel": [[21, 2, 1, "", "active_adapters"], [21, 2, 1, "", "active_head"], [21, 1, 1, "", "adapter_fusion_to"], [21, 1, 1, "", "adapter_summary"], [21, 1, 1, "", "adapter_to"], [21, 1, 1, "", "add_adapter"], [21, 1, 1, "", "add_adapter_fusion"], [21, 1, 1, "", "add_causal_lm_head"], [21, 1, 1, "", "add_classification_head"], [21, 1, 1, "", "add_qa_head"], [21, 1, 1, "", "add_tagging_head"], [21, 1, 1, "", "apply_to_adapter_layers"], [21, 1, 1, "", "apply_to_basemodel_childs"], [21, 1, 1, "", "average_adapter"], [21, 1, 1, "", "delete_adapter"], [21, 1, 1, "", "delete_adapter_fusion"], [21, 1, 1, "", "delete_head"], [21, 1, 1, "", "eject_prefix_tuning"], [21, 1, 1, "", "forward"], [21, 1, 1, "", "forward_context"], [21, 1, 1, "", "forward_head"], [21, 1, 1, "", "freeze_model"], [21, 1, 1, "", "get_adapter"], [21, 1, 1, "", "get_labels"], [21, 1, 1, "", "get_labels_dict"], [21, 1, 1, "", "get_output_embeddings"], [21, 1, 1, "", "head_type"], [21, 1, 1, "", "init_adapters"], [21, 1, 1, "", "iter_layers"], [21, 1, 1, "", "load_adapter"], [21, 1, 1, "", "load_adapter_fusion"], [21, 1, 1, "", "load_head"], [21, 1, 1, "", "merge_adapter"], [21, 1, 1, "", "push_adapter_to_hub"], [21, 1, 1, "", "reset_adapter"], [21, 1, 1, "", "save_adapter"], [21, 1, 1, "", "save_adapter_fusion"], [21, 1, 1, "", "save_all_adapter_fusions"], [21, 1, 1, "", "save_all_adapters"], [21, 1, 1, "", "save_all_heads"], [21, 1, 1, "", "save_head"], [21, 1, 1, "", "save_pretrained"], [21, 1, 1, "", "set_active_adapters"], [21, 1, 1, "", "tie_weights"], [21, 1, 1, "", "train_adapter"], [21, 1, 1, "", "train_adapter_fusion"], [21, 1, 1, "", "train_fusion"]], "adapters.LoRAConfig": [[1, 1, 1, "", "from_dict"], [1, 1, 1, "", "load"], [1, 1, 1, "", "replace"], [1, 1, 1, "", "to_dict"]], "adapters.MBartAdapterModel": [[22, 2, 1, "", "active_adapters"], [22, 2, 1, "", "active_head"], [22, 1, 1, "", "adapter_fusion_to"], [22, 1, 1, "", "adapter_summary"], [22, 1, 1, "", "adapter_to"], [22, 1, 1, "", "add_adapter"], [22, 1, 1, "", "add_adapter_fusion"], [22, 1, 1, "", "add_classification_head"], [22, 1, 1, "", "add_qa_head"], [22, 1, 1, "", "add_seq2seq_lm_head"], [22, 1, 1, "", "apply_to_adapter_layers"], [22, 1, 1, "", "apply_to_basemodel_childs"], [22, 1, 1, "", "average_adapter"], [22, 1, 1, "", "delete_adapter"], [22, 1, 1, "", "delete_adapter_fusion"], [22, 1, 1, "", "delete_head"], [22, 1, 1, "", "eject_prefix_tuning"], [22, 1, 1, "", "forward"], [22, 1, 1, "", "forward_context"], [22, 1, 1, "", "forward_head"], [22, 1, 1, "", "freeze_model"], [22, 1, 1, "", "get_adapter"], [22, 1, 1, "", "get_labels"], [22, 1, 1, "", "get_labels_dict"], [22, 1, 1, "", "get_output_embeddings"], [22, 1, 1, "", "head_type"], [22, 1, 1, "", "init_adapters"], [22, 1, 1, "", "iter_layers"], [22, 1, 1, "", "load_adapter"], [22, 1, 1, "", "load_adapter_fusion"], [22, 1, 1, "", "load_head"], [22, 1, 1, "", "merge_adapter"], [22, 1, 1, "", "push_adapter_to_hub"], [22, 1, 1, "", "reset_adapter"], [22, 1, 1, "", "save_adapter"], [22, 1, 1, "", "save_adapter_fusion"], [22, 1, 
1, "", "save_all_adapter_fusions"], [22, 1, 1, "", "save_all_adapters"], [22, 1, 1, "", "save_all_heads"], [22, 1, 1, "", "save_head"], [22, 1, 1, "", "save_pretrained"], [22, 1, 1, "", "set_active_adapters"], [22, 1, 1, "", "tie_weights"], [22, 1, 1, "", "train_adapter"], [22, 1, 1, "", "train_adapter_fusion"], [22, 1, 1, "", "train_fusion"]], "adapters.MT5AdapterModel": [[23, 2, 1, "", "active_adapters"], [23, 2, 1, "", "active_head"], [23, 1, 1, "", "adapter_fusion_to"], [23, 1, 1, "", "adapter_summary"], [23, 1, 1, "", "adapter_to"], [23, 1, 1, "", "add_adapter"], [23, 1, 1, "", "add_adapter_fusion"], [23, 1, 1, "", "add_classification_head"], [23, 1, 1, "", "add_qa_head"], [23, 1, 1, "", "add_seq2seq_lm_head"], [23, 1, 1, "", "apply_to_adapter_layers"], [23, 1, 1, "", "apply_to_basemodel_childs"], [23, 1, 1, "", "average_adapter"], [23, 1, 1, "", "delete_adapter"], [23, 1, 1, "", "delete_adapter_fusion"], [23, 1, 1, "", "delete_head"], [23, 1, 1, "", "eject_prefix_tuning"], [23, 1, 1, "", "forward"], [23, 1, 1, "", "forward_context"], [23, 1, 1, "", "forward_head"], [23, 1, 1, "", "freeze_model"], [23, 1, 1, "", "get_adapter"], [23, 1, 1, "", "get_labels"], [23, 1, 1, "", "get_labels_dict"], [23, 1, 1, "", "get_output_embeddings"], [23, 1, 1, "", "head_type"], [23, 1, 1, "", "init_adapters"], [23, 1, 1, "", "iter_layers"], [23, 1, 1, "", "load_adapter"], [23, 1, 1, "", "load_adapter_fusion"], [23, 1, 1, "", "load_head"], [23, 1, 1, "", "merge_adapter"], [23, 1, 1, "", "push_adapter_to_hub"], [23, 1, 1, "", "reset_adapter"], [23, 1, 1, "", "save_adapter"], [23, 1, 1, "", "save_adapter_fusion"], [23, 1, 1, "", "save_all_adapter_fusions"], [23, 1, 1, "", "save_all_adapters"], [23, 1, 1, "", "save_all_heads"], [23, 1, 1, "", "save_head"], [23, 1, 1, "", "save_pretrained"], [23, 1, 1, "", "set_active_adapters"], [23, 1, 1, "", "tie_weights"], [23, 1, 1, "", "train_adapter"], [23, 1, 1, "", "train_adapter_fusion"], [23, 1, 1, "", "train_fusion"]], "adapters.ModelAdaptersConfig": [[5, 1, 1, "", "add"], [5, 1, 1, "", "add_fusion"], [5, 1, 1, "", "common_config_value"], [5, 1, 1, "", "get"], [5, 1, 1, "", "get_fusion"], [5, 1, 1, "", "match"]], "adapters.ModelAdaptersMixin": [[6, 1, 1, "", "adapter_fusion_to"], [6, 1, 1, "", "adapter_summary"], [6, 1, 1, "", "adapter_to"], [6, 1, 1, "", "add_adapter"], [6, 1, 1, "", "add_adapter_fusion"], [6, 1, 1, "", "apply_to_adapter_layers"], [6, 1, 1, "", "apply_to_basemodel_childs"], [6, 1, 1, "", "average_adapter"], [6, 1, 1, "", "delete_adapter"], [6, 1, 1, "", "delete_adapter_fusion"], [6, 1, 1, "", "eject_prefix_tuning"], [6, 1, 1, "", "forward_context"], [6, 1, 1, "", "freeze_model"], [6, 1, 1, "", "get_adapter"], [6, 1, 1, "", "init_adapters"], [6, 1, 1, "", "iter_layers"], [6, 1, 1, "", "load_adapter"], [6, 1, 1, "", "load_adapter_fusion"], [6, 1, 1, "", "merge_adapter"], [6, 1, 1, "", "reset_adapter"], [6, 1, 1, "", "save_adapter"], [6, 1, 1, "", "save_adapter_fusion"], [6, 1, 1, "", "save_all_adapter_fusions"], [6, 1, 1, "", "save_all_adapters"], [6, 1, 1, "", "set_active_adapters"], [6, 1, 1, "", "train_adapter"], [6, 1, 1, "", "train_adapter_fusion"], [6, 1, 1, "", "train_fusion"]], "adapters.ModelWithFlexibleHeadsAdaptersMixin": [[6, 2, 1, "", "active_head"], [6, 1, 1, "", "adapter_to"], [6, 1, 1, "", "add_causal_lm_head"], [6, 1, 1, "", "add_classification_head"], [6, 1, 1, "", "add_dependency_parsing_head"], [6, 1, 1, "", "add_image_classification_head"], [6, 1, 1, "", "add_masked_lm_head"], [6, 1, 1, "", "add_multiple_choice_head"], [6, 1, 
1, "", "add_qa_head"], [6, 1, 1, "", "add_seq2seq_lm_head"], [6, 1, 1, "", "add_tagging_head"], [6, 1, 1, "", "delete_head"], [6, 1, 1, "", "forward_head"], [6, 1, 1, "", "get_labels"], [6, 1, 1, "", "get_labels_dict"], [6, 1, 1, "", "head_type"], [6, 1, 1, "", "set_active_adapters"], [6, 1, 1, "", "tie_weights"]], "adapters.ModelWithHeadsAdaptersMixin": [[6, 1, 1, "", "add_adapter"], [6, 1, 1, "", "delete_adapter"], [6, 1, 1, "", "get_adapter"], [6, 1, 1, "", "init_adapters"], [6, 1, 1, "", "iter_layers"], [6, 1, 1, "", "load_adapter"], [6, 1, 1, "", "load_adapter_fusion"], [6, 1, 1, "", "load_head"], [6, 1, 1, "", "save_adapter"], [6, 1, 1, "", "save_adapter_fusion"], [6, 1, 1, "", "save_all_adapters"], [6, 1, 1, "", "save_all_heads"], [6, 1, 1, "", "save_head"], [6, 1, 1, "", "train_adapter"], [6, 1, 1, "", "train_adapter_fusion"]], "adapters.PrefixTuningConfig": [[1, 1, 1, "", "from_dict"], [1, 1, 1, "", "load"], [1, 1, 1, "", "replace"], [1, 1, 1, "", "to_dict"]], "adapters.PromptTuningConfig": [[1, 1, 1, "", "from_dict"], [1, 1, 1, "", "load"], [1, 1, 1, "", "replace"], [1, 1, 1, "", "to_dict"]], "adapters.RobertaAdapterModel": [[24, 2, 1, "", "active_adapters"], [24, 2, 1, "", "active_head"], [24, 1, 1, "", "adapter_fusion_to"], [24, 1, 1, "", "adapter_summary"], [24, 1, 1, "", "adapter_to"], [24, 1, 1, "", "add_adapter"], [24, 1, 1, "", "add_adapter_fusion"], [24, 1, 1, "", "add_causal_lm_head"], [24, 1, 1, "", "add_classification_head"], [24, 1, 1, "", "add_dependency_parsing_head"], [24, 1, 1, "", "add_masked_lm_head"], [24, 1, 1, "", "add_multiple_choice_head"], [24, 1, 1, "", "add_qa_head"], [24, 1, 1, "", "add_tagging_head"], [24, 1, 1, "", "apply_to_adapter_layers"], [24, 1, 1, "", "apply_to_basemodel_childs"], [24, 1, 1, "", "average_adapter"], [24, 1, 1, "", "delete_adapter"], [24, 1, 1, "", "delete_adapter_fusion"], [24, 1, 1, "", "delete_head"], [24, 1, 1, "", "eject_prefix_tuning"], [24, 1, 1, "", "forward"], [24, 1, 1, "", "forward_context"], [24, 1, 1, "", "forward_head"], [24, 1, 1, "", "freeze_model"], [24, 1, 1, "", "get_adapter"], [24, 1, 1, "", "get_labels"], [24, 1, 1, "", "get_labels_dict"], [24, 1, 1, "", "get_output_embeddings"], [24, 1, 1, "", "head_type"], [24, 1, 1, "", "init_adapters"], [24, 1, 1, "", "iter_layers"], [24, 1, 1, "", "load_adapter"], [24, 1, 1, "", "load_adapter_fusion"], [24, 1, 1, "", "load_head"], [24, 1, 1, "", "merge_adapter"], [24, 1, 1, "", "push_adapter_to_hub"], [24, 1, 1, "", "reset_adapter"], [24, 1, 1, "", "save_adapter"], [24, 1, 1, "", "save_adapter_fusion"], [24, 1, 1, "", "save_all_adapter_fusions"], [24, 1, 1, "", "save_all_adapters"], [24, 1, 1, "", "save_all_heads"], [24, 1, 1, "", "save_head"], [24, 1, 1, "", "save_pretrained"], [24, 1, 1, "", "set_active_adapters"], [24, 1, 1, "", "tie_weights"], [24, 1, 1, "", "train_adapter"], [24, 1, 1, "", "train_adapter_fusion"], [24, 1, 1, "", "train_fusion"]], "adapters.T5AdapterModel": [[25, 2, 1, "", "active_adapters"], [25, 2, 1, "", "active_head"], [25, 1, 1, "", "adapter_fusion_to"], [25, 1, 1, "", "adapter_summary"], [25, 1, 1, "", "adapter_to"], [25, 1, 1, "", "add_adapter"], [25, 1, 1, "", "add_adapter_fusion"], [25, 1, 1, "", "add_classification_head"], [25, 1, 1, "", "add_qa_head"], [25, 1, 1, "", "add_seq2seq_lm_head"], [25, 1, 1, "", "apply_to_adapter_layers"], [25, 1, 1, "", "apply_to_basemodel_childs"], [25, 1, 1, "", "average_adapter"], [25, 1, 1, "", "delete_adapter"], [25, 1, 1, "", "delete_adapter_fusion"], [25, 1, 1, "", "delete_head"], [25, 1, 1, "", 
"eject_prefix_tuning"], [25, 1, 1, "", "forward"], [25, 1, 1, "", "forward_context"], [25, 1, 1, "", "forward_head"], [25, 1, 1, "", "freeze_model"], [25, 1, 1, "", "get_adapter"], [25, 1, 1, "", "get_labels"], [25, 1, 1, "", "get_labels_dict"], [25, 1, 1, "", "get_output_embeddings"], [25, 1, 1, "", "head_type"], [25, 1, 1, "", "init_adapters"], [25, 1, 1, "", "iter_layers"], [25, 1, 1, "", "load_adapter"], [25, 1, 1, "", "load_adapter_fusion"], [25, 1, 1, "", "load_head"], [25, 1, 1, "", "merge_adapter"], [25, 1, 1, "", "push_adapter_to_hub"], [25, 1, 1, "", "reset_adapter"], [25, 1, 1, "", "save_adapter"], [25, 1, 1, "", "save_adapter_fusion"], [25, 1, 1, "", "save_all_adapter_fusions"], [25, 1, 1, "", "save_all_adapters"], [25, 1, 1, "", "save_all_heads"], [25, 1, 1, "", "save_head"], [25, 1, 1, "", "save_pretrained"], [25, 1, 1, "", "set_active_adapters"], [25, 1, 1, "", "tie_weights"], [25, 1, 1, "", "train_adapter"], [25, 1, 1, "", "train_adapter_fusion"], [25, 1, 1, "", "train_fusion"]], "adapters.ViTAdapterModel": [[26, 2, 1, "", "active_adapters"], [26, 2, 1, "", "active_head"], [26, 1, 1, "", "adapter_fusion_to"], [26, 1, 1, "", "adapter_summary"], [26, 1, 1, "", "adapter_to"], [26, 1, 1, "", "add_adapter"], [26, 1, 1, "", "add_adapter_fusion"], [26, 1, 1, "", "add_image_classification_head"], [26, 1, 1, "", "apply_to_adapter_layers"], [26, 1, 1, "", "apply_to_basemodel_childs"], [26, 1, 1, "", "average_adapter"], [26, 1, 1, "", "delete_adapter"], [26, 1, 1, "", "delete_adapter_fusion"], [26, 1, 1, "", "delete_head"], [26, 1, 1, "", "eject_prefix_tuning"], [26, 1, 1, "", "forward"], [26, 1, 1, "", "forward_context"], [26, 1, 1, "", "forward_head"], [26, 1, 1, "", "freeze_model"], [26, 1, 1, "", "get_adapter"], [26, 1, 1, "", "get_labels"], [26, 1, 1, "", "get_labels_dict"], [26, 1, 1, "", "get_output_embeddings"], [26, 1, 1, "", "head_type"], [26, 1, 1, "", "init_adapters"], [26, 1, 1, "", "iter_layers"], [26, 1, 1, "", "load_adapter"], [26, 1, 1, "", "load_adapter_fusion"], [26, 1, 1, "", "load_head"], [26, 1, 1, "", "merge_adapter"], [26, 1, 1, "", "push_adapter_to_hub"], [26, 1, 1, "", "reset_adapter"], [26, 1, 1, "", "save_adapter"], [26, 1, 1, "", "save_adapter_fusion"], [26, 1, 1, "", "save_all_adapter_fusions"], [26, 1, 1, "", "save_all_adapters"], [26, 1, 1, "", "save_all_heads"], [26, 1, 1, "", "save_head"], [26, 1, 1, "", "save_pretrained"], [26, 1, 1, "", "set_active_adapters"], [26, 1, 1, "", "tie_weights"], [26, 1, 1, "", "train_adapter"], [26, 1, 1, "", "train_adapter_fusion"], [26, 1, 1, "", "train_fusion"]], "adapters.XLMRobertaAdapterModel": [[27, 1, 1, "", "forward"]], "adapters.XmodAdapterModel": [[28, 2, 1, "", "active_adapters"], [28, 2, 1, "", "active_head"], [28, 1, 1, "", "adapter_fusion_to"], [28, 1, 1, "", "adapter_summary"], [28, 1, 1, "", "adapter_to"], [28, 1, 1, "", "add_adapter"], [28, 1, 1, "", "add_adapter_fusion"], [28, 1, 1, "", "add_causal_lm_head"], [28, 1, 1, "", "add_classification_head"], [28, 1, 1, "", "add_dependency_parsing_head"], [28, 1, 1, "", "add_masked_lm_head"], [28, 1, 1, "", "add_multiple_choice_head"], [28, 1, 1, "", "add_qa_head"], [28, 1, 1, "", "add_tagging_head"], [28, 1, 1, "", "apply_to_adapter_layers"], [28, 1, 1, "", "apply_to_basemodel_childs"], [28, 1, 1, "", "average_adapter"], [28, 1, 1, "", "delete_adapter"], [28, 1, 1, "", "delete_adapter_fusion"], [28, 1, 1, "", "delete_head"], [28, 1, 1, "", "eject_prefix_tuning"], [28, 1, 1, "", "forward"], [28, 1, 1, "", "forward_context"], [28, 1, 1, "", "forward_head"], 
[28, 1, 1, "", "freeze_model"], [28, 1, 1, "", "get_adapter"], [28, 1, 1, "", "get_labels"], [28, 1, 1, "", "get_labels_dict"], [28, 1, 1, "", "get_output_embeddings"], [28, 1, 1, "", "head_type"], [28, 1, 1, "", "init_adapters"], [28, 1, 1, "", "iter_layers"], [28, 1, 1, "", "load_adapter"], [28, 1, 1, "", "load_adapter_fusion"], [28, 1, 1, "", "load_head"], [28, 1, 1, "", "merge_adapter"], [28, 1, 1, "", "push_adapter_to_hub"], [28, 1, 1, "", "reset_adapter"], [28, 1, 1, "", "save_adapter"], [28, 1, 1, "", "save_adapter_fusion"], [28, 1, 1, "", "save_all_adapter_fusions"], [28, 1, 1, "", "save_all_adapters"], [28, 1, 1, "", "save_all_heads"], [28, 1, 1, "", "save_head"], [28, 1, 1, "", "save_pretrained"], [28, 1, 1, "", "set_active_adapters"], [28, 1, 1, "", "tie_weights"], [28, 1, 1, "", "train_adapter"], [28, 1, 1, "", "train_adapter_fusion"], [28, 1, 1, "", "train_fusion"]], "adapters.hub_mixin": [[6, 0, 1, "", "PushAdapterToHubMixin"]], "adapters.hub_mixin.PushAdapterToHubMixin": [[6, 1, 1, "", "push_adapter_to_hub"]], "adapters.trainer": [[3, 0, 1, "", "AdapterTrainer"], [3, 0, 1, "", "AdapterTrainerCallback"], [3, 0, 1, "", "Seq2SeqAdapterTrainer"]], "adapters.trainer.AdapterTrainer": [[3, 1, 1, "", "create_optimizer"]], "adapters.trainer.AdapterTrainerCallback": [[3, 1, 1, "", "on_step_end"], [3, 1, 1, "", "on_train_begin"]], "adapters.training": [[3, 0, 1, "", "AdapterArguments"], [3, 4, 1, "", "setup_adapter_training"]], "adapters.utils": [[4, 0, 1, "", "AdapterInfo"], [4, 0, 1, "", "AdapterType"], [4, 4, 1, "", "get_adapter_config_hash"], [4, 4, 1, "", "get_adapter_info"], [4, 4, 1, "", "get_from_cache"], [4, 4, 1, "", "list_adapters"], [4, 4, 1, "", "parse_adapter_config_string"], [4, 4, 1, "", "prefix_attention_mask"], [4, 4, 1, "", "pull_from_hub"], [4, 4, 1, "", "resolve_adapter_config"], [4, 4, 1, "", "resolve_adapter_path"]], "transformers": [[13, 0, 1, "", "CLIPModel"], [13, 0, 1, "", "CLIPTextModel"], [13, 0, 1, "", "CLIPVisionModel"], [18, 0, 1, "", "EncoderDecoderModel"]], "transformers.CLIPModel": [[13, 5, 1, "", "config_class"], [13, 1, 1, "", "forward"], [13, 1, 1, "", "get_image_features"], [13, 1, 1, "", "get_text_features"]], "transformers.CLIPTextModel": [[13, 5, 1, "", "config_class"], [13, 1, 1, "", "forward"], [13, 1, 1, "", "get_input_embeddings"], [13, 1, 1, "", "set_input_embeddings"]], "transformers.CLIPVisionModel": [[13, 5, 1, "", "config_class"], [13, 1, 1, "", "forward"], [13, 1, 1, "", "get_input_embeddings"]], "transformers.EncoderDecoderModel": [[18, 1, 1, "", "forward"], [18, 1, 1, "", "from_encoder_decoder_pretrained"]]}, "objtypes": {"0": "py:class", "1": "py:method", "2": "py:property", "3": "py:module", "4": "py:function", "5": "py:attribute"}, "objnames": {"0": ["py", "class", "Python class"], "1": ["py", "method", "Python method"], "2": ["py", "property", "Python property"], "3": ["py", "module", "Python module"], "4": ["py", "function", "Python function"], "5": ["py", "attribute", "Python attribute"]}, "titleterms": {"adapt": [0, 1, 2, 3, 4, 5, 29, 30, 31, 34, 36, 38, 39, 40, 42, 44, 45, 46], "activ": 0, "composit": [0, 30], "block": 0, "overview": [0, 12, 14, 15, 41, 42], "stack": 0, "fuse": 0, "retriev": 0, "adapterfus": [0, 45], "attent": 0, "split": 0, "batchsplit": 0, "parallel": 0, "averag": 0, "output": 0, "paramet": 0, "nest": 0, "configur": [1, 30, 42, 46], "singl": 1, "bottleneck": [1, 40, 46], "prefix": [1, 40], "tune": [1, 40, 42], "loraconfig": 1, "ia3config": 1, "prompttuningconfig": 1, "combin": [1, 39], "fusion": 1, 
"setup": [1, 45], "implement": [2, 30, 31], "train": [3, 29, 30, 31, 32, 38, 44, 45], "util": 4, "model": [5, 6, 18, 29, 30, 31, 33, 35, 36, 41, 43, 44, 45, 46], "config": 5, "mixin": 6, "invertibleadaptersmixin": 6, "embeddingadaptersmixin": 6, "modeladaptersmixin": 6, "modelwithheadsadaptersmixin": 6, "modelwithflexibleheadsadaptersmixin": 6, "pushadaptertohubmixin": 6, "albert": 7, "albertadaptermodel": 7, "auto": 8, "class": [8, 31, 36, 43, 45], "autoadaptermodel": 8, "bart": 9, "bartadaptermodel": 9, "beit": 10, "beitadaptermodel": 10, "bert": 11, "bertadaptermodel": 11, "bertgener": 12, "bertgenerationadaptermodel": 12, "clip": 13, "cliptextmodel": 13, "clipvisionmodel": 13, "clipmodel": 13, "deberta": [14, 15], "debertaadaptermodel": 14, "v2": 15, "debertav2adaptermodel": 15, "distilbert": 16, "distilbertadaptermodel": 16, "electra": 17, "electraadaptermodel": 17, "encod": 18, "decod": 18, "encoderdecodermodel": 18, "openai": 19, "gpt2": 19, "gpt2adaptermodel": 19, "eleutherai": 20, "gpt": 20, "j": 20, "6b": 20, "gptjadaptermodel": 20, "llama": 21, "llamaadaptermodel": 21, "mbart": 22, "mbartadaptermodel": 22, "mt5": 23, "mt5adaptermodel": 23, "roberta": [24, 27], "robertaadaptermodel": 24, "t5": 25, "t5adaptermodel": 25, "vision": 26, "transform": [26, 33, 43, 46], "vit": 26, "vitadaptermodel": 26, "xlm": 27, "xlmrobertaadaptermodel": 27, "x": 28, "mod": 28, "xmodadaptermodel": 28, "contribut": [29, 34, 36], "adapterhub": [29, 36], "codebas": 29, "set": 29, "up": 29, "your": [29, 38], "dev": 29, "environ": 29, "ad": [29, 30, 31, 32], "method": [29, 30, 36, 39, 40, 42, 45], "test": [29, 30, 31], "chang": 29, "publish": 29, "pre": [29, 38, 44], "For": 30, "without": 30, "support": [30, 36, 46], "all": 30, "document": [30, 31, 36], "exampl": [30, 31], "relev": 31, "step": [31, 45], "addit": 31, "option": [31, 45], "embed": 32, "delet": 32, "save": 32, "load": [32, 33, 36, 38], "extend": 33, "librari": 33, "integr": [33, 35], "new": 33, "custom": [33, 43], "modul": 33, "weight": 33, "hub": [34, 35], "hug": [35, 43], "face": [35, 43], "s": 35, "download": 35, "from": [35, 37, 46], "upload": 35, "get": 36, "start": [36, 44, 45], "advanc": [36, 38], "share": 36, "relat": 36, "citat": 36, "indic": 36, "tabl": [36, 42], "instal": 37, "us": [37, 38, 42, 44], "pip": 37, "pypi": 37, "github": 37, "repositori": 37, "find": [38, 46], "code": 38, "usag": 38, "load_adapt": 38, "mix": 39, "match": 39, "unipelt": 39, "languag": [40, 45], "invert": 40, "compact": 40, "lora": 40, "ia": 40, "3": 40, "prompt": 40, "why": 42, "effici": 42, "fine": 42, "string": 42, "predict": 43, "head": 43, "adaptermodel": 43, "static": 43, "automat": 43, "convers": 43, "quick": 44, "introduct": 44, "initi": 44, "infer": 44, "task": 45, "A": 45, "pars": 45, "adapterargu": 45, "b": 45, "switch": 45, "c": 45, "d": 45, "adaptertrain": 45, "e": 45, "quantiz": 45, "transit": 46, "packag": 46, "namespac": 46, "initialis": 46, "name": 46, "featur": 46, "ar": 46, "what": 46, "ha": 46, "remain": 46, "same": 46, "where": 46, "can": 46, "i": 46, "still": 46}, "envversion": {"sphinx.domains.c": 2, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 6, "sphinx.domains.index": 1, "sphinx.domains.javascript": 2, "sphinx.domains.math": 2, "sphinx.domains.python": 3, "sphinx.domains.rst": 2, "sphinx.domains.std": 2, "sphinx": 56}}) \ No newline at end of file diff --git a/training.html b/training.html new file mode 100644 index 0000000000..7f53fd0d75 --- /dev/null +++ b/training.html @@ -0,0 +1,501 @@ + + 
+ Adapter Training — AdapterHub documentation

Adapter Training

+

This section presents examples of training adapter methods in different scenarios. We focus on integrating adapter methods into existing training scripts for Transformer models. +All presented scripts are only slightly modified versions of the original examples from Hugging Face Transformers. +To run the scripts, make sure you have the latest version of the repository and have installed some additional requirements:

+
git clone https://github.com/adapter-hub/adapters
+cd adapters
+pip install .
+pip install -r ./examples/pytorch/<your_examples_folder>/requirements.txt
+
+
+
+

Train a Task Adapter

+

Training a task adapter module on a dataset only requires minor modifications compared to training the entire model. +Suppose we have an existing script for training a Transformer model. +In the following, we will use Hugging Face’s run_glue.py example script for training on the GLUE benchmark. +We go through all required changes step by step:

+
+

Step A - Parse AdapterArguments

+

The AdapterArguments class integrated into adapters provides a set of command-line options useful for training adapters. +These include options such as --train_adapter for activating adapter training and --load_adapter for loading adapters from checkpoints. +Thus, the first step of integrating adapters is to add these arguments to the line where HfArgumentParser is instantiated:

+
parser = HfArgumentParser((ModelArguments, DataTrainingArguments, TrainingArguments, AdapterArguments))
+# ...
+model_args, data_args, training_args, adapter_args = parser.parse_args_into_dataclasses()
+
+
+
+
+

Step B - Switch model class (optional)

+

In our example, we replace the built-in AutoModelForSequenceClassification class with the AutoAdapterModel class introduced by adapters. +Therefore, the model instantiation changes to:

+
model = AutoAdapterModel.from_pretrained(
+        model_args.model_name_or_path,
+        config=config,
+)
+model.add_classification_head(data_args.task_name, num_labels=num_labels)
+
+
+

Alternatively, you can keep the original transformers class and prepare the model for use with adapters by calling adapters.init(model). +Learn more about the benefits of AdapterModel classes here

+
+
+

Step C - Setup adapter methods

+
+

Tip

+

In the following, we show how to set up adapters manually. In most cases, you can use the built-in setup_adapter_training() method to perform this job automatically. Just add a statement similar to this anywhere between model instantiation and the start of training in your script: setup_adapter_training(model, adapter_args, task_name). A minimal sketch follows below.

+
+
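For reference, the automatic setup boils down to a single call (a minimal sketch, assuming model, adapter_args, and task_name are defined as in the surrounding script):

from adapters import setup_adapter_training

# Adds the adapter (or loads it via --load_adapter) and activates it for training.
setup_adapter_training(model, adapter_args, task_name)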

Compared to fine-tuning the entire model, we have to make only one significant adaptation: adding an adapter setup and activating it.

+
# task adapter - only add if not existing
+if task_name not in model.adapters_config:
+    # resolve the adapter config
+    adapter_config = AdapterConfig.load(adapter_args.adapter_config)
+    # add a new adapter
+    model.add_adapter(task_name, config=adapter_config)
+# Enable adapter training
+model.train_adapter(task_name)
+
+
+
+

Important

+

The most crucial step when training an adapter module is to freeze all weights in the model except for those of the +adapter. In the previous snippet, this is achieved by calling the train_adapter() method, which disables training +of all weights outside the task adapter. In case you want to unfreeze all model weights later on, you can use +freeze_model(False).

+
+
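A quick sanity check of this freezing behavior (a sketch; it uses only standard PyTorch parameter attributes and the adapter_summary() method):

# After train_adapter(), only adapter (and head) parameters should require gradients.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable:,} of {total:,}")
# Tabular overview of all adapters added to the model:
print(model.adapter_summary())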

Besides this, we only have to make sure that the task adapter and prediction head are activated so that they are used in every forward pass. To specify the adapter modules to use, we can use the model.set_active_adapters() +method and pass the adapter setup. If you only use a single adapter, you can simply pass the name of the adapter. For more information +on complex setups, check out the Composition Blocks.

+
model.set_active_adapters(task_name)
+
+
+
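For illustration, a composed setup could be activated like this (a sketch; "lang" and "task" are hypothetical names of adapters added beforehand):

import adapters.composition as ac

# Stack a language adapter below the task adapter in every layer.
model.set_active_adapters(ac.Stack("lang", "task"))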
+
+

Step D - Switch to AdapterTrainer class

+

Finally, we exchange the Trainer class built into Transformers for the AdapterTrainer class that is optimized for training adapter methods. +See below for more information.

+

Technically, this change is not strictly required, as training adapters does not modify the training loop. +However, AdapterTrainer provides, for example, better support for checkpointing and reloading adapter weights.

+
+
+

Step E - Start training

+

The rest of the training procedure does not require any further changes in code.

+

You can find the full version of the modified training script for GLUE at run_glue.py in the examples folder of our repository. +We also adapted various other example scripts (e.g., run_glue.py, run_multiple_choice.py, run_squad.py, …) to support adapter training.

+

To start adapter training on a GLUE task, you can run something similar to:

+
export TASK_NAME=mrpc
+
+python run_glue.py \
+  --model_name_or_path bert-base-uncased \
+  --task_name $TASK_NAME \
+  --do_train \
+  --do_eval \
+  --max_seq_length 128 \
+  --per_device_train_batch_size 32 \
+  --learning_rate 1e-4 \
+  --num_train_epochs 10.0 \
+  --output_dir /tmp/$TASK_NAME \
+  --overwrite_output_dir \
+  --train_adapter \
+  --adapter_config seq_bn
+
+
+

The important flag here is --train_adapter, which switches from fine-tuning the entire model to training an adapter module for the given GLUE task.

+
+

Tip

+

Adapter weights are usually initialized randomly, which is why we require a higher learning rate than for full fine-tuning. We have found that a default adapter learning rate of 1e-4 works well for most settings.

+
+
+

Tip

+

Depending on your dataset size, you might also need to train longer than usual. To avoid overfitting, you can evaluate the adapters after each epoch on the development set and only save the best model. One way to set this up is sketched below.

+
+
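One possible setup (a sketch; the argument names come from Hugging Face's TrainingArguments and may differ between versions, and the output path is a placeholder):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="/tmp/mrpc_adapter",   # hypothetical output path
    learning_rate=1e-4,
    num_train_epochs=10,
    evaluation_strategy="epoch",      # evaluate on the dev set after each epoch
    save_strategy="epoch",
    load_best_model_at_end=True,      # restore the best checkpoint at the end
)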
+
+
+

Train a Language Adapter

+

Training a language adapter is just as straightforward as training a task adapter. Similar to the steps for task adapters +described above, we add a language adapter module to an existing model training script. Here, we modified Hugging Face’s run_mlm.py script for masked language modeling with BERT-based models.

+

Training a language adapter on BERT using this script may look like the following:

+
export TRAIN_FILE=/path/to/dataset/train
+export VALIDATION_FILE=/path/to/dataset/validation
+
+python run_mlm.py \
+    --model_name_or_path bert-base-uncased \
+    --train_file $TRAIN_FILE \
+    --validation_file $VALIDATION_FILE \
+    --do_train \
+    --do_eval \
+    --learning_rate 1e-4 \
+    --num_train_epochs 10.0 \
+    --output_dir /tmp/test-mlm \
+    --train_adapter \
+    --adapter_config "seq_bn_inv"
+
+
+
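For comparison, the manual equivalent of the --adapter_config seq_bn_inv setup might look like this (a sketch; "en" is a hypothetical adapter name, and the model is assumed to be an AdapterModel class with MLM head support):

# Bottleneck adapter plus invertible adapter for language-adaptive MLM training.
model.add_adapter("en", config="seq_bn_inv")
model.add_masked_lm_head("en")
model.train_adapter("en")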
+
+

Train AdapterFusion

+

We provide an example for training AdapterFusion (Pfeiffer et al., 2020) on the GLUE dataset: run_fusion_glue.py. +You can adapt this script to train AdapterFusion with different pre-trained adapters on your own dataset.

+
+

Important

+

AdapterFusion on a target task is trained in a second training stage after independently training adapters on individual tasks. +When setting up a fusion architecture on your model, make sure to load the pre-trained adapter modules to be fused using model.load_adapter() before adding a fusion layer. +For more on AdapterFusion, also refer to Pfeiffer et al., 2020.

+
+
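A sketch of this two-stage setup (the Hub identifiers and adapter names are examples only):

import adapters.composition as ac

# Stage 1 happened earlier: adapters were trained independently on individual tasks.
# Load them (without their prediction heads) before adding the fusion layer.
model.load_adapter("sentiment/sst-2@ukp", load_as="sst-2", with_head=False)
model.load_adapter("nli/multinli@ukp", load_as="mnli", with_head=False)

# Stage 2: add a fusion layer over the loaded adapters and train only that.
fusion_setup = ac.Fuse("sst-2", "mnli")
model.add_adapter_fusion(fusion_setup)
model.set_active_adapters(fusion_setup)
model.train_adapter_fusion(fusion_setup)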

To start fusion training on SST-2 as the target task, you can run something like the following:

+
export GLUE_DIR=/path/to/glue
+export TASK_NAME=SST-2
+
+python run_fusion_glue.py \
+  --model_name_or_path bert-base-uncased \
+  --task_name $TASK_NAME \
+  --do_train \
+  --do_eval \
+  --data_dir $GLUE_DIR/$TASK_NAME \
+  --max_seq_length 128 \
+  --per_device_train_batch_size 32 \
+  --learning_rate 5e-5 \
+  --num_train_epochs 10.0 \
+  --output_dir /tmp/$TASK_NAME \
+  --overwrite_output_dir
+
+
+
+
+

AdapterTrainer

+

Similar to the Trainer class provided by Hugging Face, adapters provides an AdapterTrainer class. This class is intended +only for training adapters; the Trainer class should still be used to fully fine-tune models. To train adapters with the AdapterTrainer +class, simply initialize it the same way you would initialize the Trainer class, e.g.:

+
model.add_adapter(task_name)
+model.train_adapter(task_name)
+
+training_args = TrainingArguments(
+    learning_rate=1e-4,
+    num_train_epochs=6,
+)
+
+trainer = AdapterTrainer(
+    model=model,
+    args=training_args,
+    train_dataset=train_dataset,
+    eval_dataset=eval_dataset,
+    tokenizer=tokenizer,
+    data_collator=data_collator,
+)
+
+
+
+

Tip

+

When migrating from previous versions, which used the Trainer class for both adapter training and full fine-tuning, note that the +specialized AdapterTrainer class does not have the parameters do_save_full_model, do_save_adapters, and do_save_adapter_fusion.

+
+
+
+

Quantized Model Training

+

Adapters supports fine-tuning of quantized language models similar to QLoRA (Dettmers et al., 2023) via the bitsandbytes library integrated into Transformers. +Quantized training is supported for LoRA-based adapters as well as bottleneck adapters and prefix tuning. +Please refer to this notebook for a hands-on guide.

+
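A rough sketch of such a setup (the model name and quantization parameters are examples only; see the linked notebook for a complete walkthrough):

import torch
from transformers import BitsAndBytesConfig
from adapters import AutoAdapterModel, LoRAConfig

# Load the base model in 4-bit precision via bitsandbytes.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoAdapterModel.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # example model
    quantization_config=bnb_config,
)

# Train only the (full-precision) LoRA adapter on top of the quantized weights.
model.add_adapter("qlora", config=LoRAConfig())
model.train_adapter("qlora")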
\ No newline at end of file diff --git a/transitioning.html b/transitioning.html new file mode 100644 index 0000000000..e5f8dffcb1 --- /dev/null +++ b/transitioning.html @@ -0,0 +1,379 @@ + Transitioning from adapter-transformers — AdapterHub documentation

Transitioning from adapter-transformers

+
+

Important

+

adapters is fully compatible with adapter-transformers in terms of model weights, meaning you can load any adapter trained with any version of adapter-transformers into the new library without degradation.

+
+

The new adapters library is the successor to the adapter-transformers library. The essential difference is that adapters is now a stand-alone package, i.e., the package is disentangled from the transformers package from Hugging Face and is no longer a drop-in replacement for it.

+

This results in some breaking changes. To transition your code from adapter-transformers to adapters, you need to consider the following changes:

+
+

Package and Namespace

+

To use the library, you need to install +transformers and adapters in the same environment (unlike adapter-transformers, which bundled transformers and therefore could not be installed alongside it).

+

Run the following to install both (installing adapters will automatically trigger the installation of a compatible transformers version):

+
pip install adapters
+
+
+

This also changes the namespace to adapters. For all imports of adapter classes, change the import from transformers to adapters (see the sketch after the following list). +This mainly affects the following classes:

+
  • AdapterModel classes, e.g. AutoAdapterModel (see AdapterModels)
  • Adapter configurations, e.g. PrefixTuningConfig (see Configurations)
  • Adapter composition blocks, e.g. Stack (see Composition Blocks)
  • The AdapterTrainer class
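For example, the imports change roughly like this (a sketch; the listed classes are illustrative):

# Before (adapter-transformers):
#   from transformers import AutoAdapterModel, PrefixTuningConfig, AdapterTrainer
#   from transformers.adapters.composition import Stack

# After (adapters):
from adapters import AutoAdapterModel, PrefixTuningConfig, AdapterTrainer
from adapters.composition import Stack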
+
+

Model Initialisation

+

The Hugging Face model classes, such as BertModel, cannot be used directly with adapters. They must first be initialised for adding adapters:

+
from transformers import AutoModel
+import adapters
+
+model = AutoModel.from_pretrained("bert-base-uncased")
+adapters.init(model) # prepare model for use with adapters
+
+
+

The necessary change is the call of the adapters.init() method. +Note that no additional initialisation is required to use the AdapterModel classes such as BertAdapterModel. These classes are provided by the adapters library and are already prepared for using adapters in training and inference.

+
+
+

Bottleneck Configuration Names

+

The adapters library supports the configuration of adapters using config strings. Compared to the adapter-transformers library, we have changed some of the strings to make them more consistent and intuitive:

+
  • houlsby -> double_seq_bn
  • pfeiffer -> seq_bn
  • parallel -> par_seq_bn
  • houlsby+inv -> double_seq_bn_inv
  • pfeiffer+inv -> seq_bn_inv

For a complete list of config strings and classes see here. We strongly recommend using the new config strings, but we will continue to support the old config strings for the time being to make the transition easier. +Note that, along with the config strings, the corresponding adapter config classes have been renamed, e.g. PfeifferConfig -> SeqBnConfig; a short sketch follows below.

+
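For example, the following two calls are equivalent (a sketch; "a" and "b" are placeholder adapter names):

from adapters import SeqBnConfig

model.add_adapter("a", config="seq_bn")       # new config string (was "pfeiffer")
model.add_adapter("b", config=SeqBnConfig())  # corresponding config class (was PfeifferConfig)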

Another consequence of this is that the AdapterConfig class is no longer reserved for bottleneck adapters but is now the base class of all adapter configurations (previously AdapterConfigBase). Hence, the role this class serves has changed. However, you can still load adapter configs with:

+
adapter_config = AdapterConfig.load("lora")
+
+
+
+
+

Features that are not supported by adapters

+

Compared to adapter-transformers, there are a few features that are no longer supported by the adapters library:

+
  • Using transformers pipelines with adapters.
  • Using invertible adapters in the Hugging Face model classes. To use invertible adapters, you must use the AdapterModel class.
  • Loading model and adapter checkpoints saved with save_pretrained using Hugging Face classes. This is only supported by the AdapterModel classes.
+
+
+

What has remained the same

+
  • The new library is fully backwards compatible in terms of adapter weights, i.e. you can load all adapter modules trained with adapter-transformers.
  • The functionality for adding, activating, and training adapters has not changed, except for the renaming of some adapter configs. You still add and activate adapters as follows:
+
# add adapter to the model
+model.add_adapter("adapter_name", config="lora")
+# activate adapter
+model.set_active_adapters("adapter_name")
+# freeze model weights and activate adapter
+model.train_adapter("adapter_name")
+
+
+
+
+

Where can I still find adapter-transformers?

+

The codebase of adapter-transformers has moved to https://github.com/adapter-hub/adapter-transformers-legacy for archival purposes.

+

The full documentation of the old library is now hosted at https://docs-legacy.adapterhub.ml.

+
+
\ No newline at end of file