
feat: add IA3 prompt tuning #2

Draft · maw501 wants to merge 9 commits into main
Conversation

@maw501 commented on Mar 21, 2023

This PR adds $(IA)^3$ tuning as described in *Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning*.

The main change is the addition of two new classes, ParallelMLPIA3 and ParallelSelfAttentionIA3. Both simply create a new learned scaling vector and modify the forward pass to apply the (re)scaling; the forward pass is otherwise unchanged except where $(IA)^3$ is applied.
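
For reference, a minimal single-GPU sketch of the MLP pattern (the PR's ParallelMLPIA3 wraps Megatron's model-parallel layers instead of nn.Linear; layer names and sizes here are illustrative only):

```python
import torch
import torch.nn as nn

class MLPWithIA3(nn.Module):
    """Single-GPU sketch of the (IA)^3 MLP rescaling.

    The PR's ParallelMLPIA3 wraps Megatron's model-parallel linear
    layers instead of nn.Linear; the placement of l_ff is the same.
    """

    def __init__(self, hidden_size: int, ffn_size: int):
        super().__init__()
        self.dense_in = nn.Linear(hidden_size, ffn_size)
        self.dense_out = nn.Linear(ffn_size, hidden_size)
        self.act = nn.GELU()
        # The (IA)^3 scaling vector: initialised to ones so that,
        # before training, the module matches the frozen base model.
        self.l_ff = nn.Parameter(torch.ones(ffn_size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.act(self.dense_in(x))
        h = h * self.l_ff  # the only (IA)^3 change to the forward pass
        return self.dense_out(h)
```

Only the `l_ff`-style vectors are trained; the base model weights stay frozen.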

This has been tested using model parallelism (i.e. tensor parallelism in DeepSpeed terminology) on the Stability cluster.

Notes

  • There are no hyperparameters for the method.
  • We rescale each attention head independently (the paper is unclear on this point); see the sketch after this list.
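
A sketch of the per-head interpretation, assuming a [batch, heads, seq, head_dim] layout (the class and names below are hypothetical, not the PR's ParallelSelfAttentionIA3):

```python
import torch
import torch.nn as nn

class PerHeadIA3Rescale(nn.Module):
    """Hypothetical helper showing per-head (IA)^3 rescaling of K and V.

    Assumes keys/values are laid out as [batch, heads, seq, head_dim];
    each head gets its own slice of l_k / l_v, so heads are rescaled
    independently.
    """

    def __init__(self, num_heads: int, head_dim: int):
        super().__init__()
        self.l_k = nn.Parameter(torch.ones(num_heads, 1, head_dim))
        self.l_v = nn.Parameter(torch.ones(num_heads, 1, head_dim))

    def forward(self, k: torch.Tensor, v: torch.Tensor):
        # Broadcasts over the batch and sequence dimensions.
        return k * self.l_k, v * self.l_v
```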

TODO

Pipeline parallelism is currently failing with $(IA)^3$. To reproduce the issue (see the config fragment after the list):

  • pipe-parallel-size: 2, model-parallel-size: 1, ia3_tuning: True. Note: num_gpus not set. 👎
  • pipe-parallel-size: 2, model-parallel-size: 1, ia3_tuning: False. Note: num_gpus not set. 👍
    [screenshot of the pipeline-parallel failure]
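
For convenience, the failing setup as a config fragment (assuming gpt-neox-style YAML keys matching the bullets above):

```yaml
# Hypothetical config fragment (gpt-neox-style keys as in the
# bullets above). This fails; flipping ia3_tuning to false passes.
pipe-parallel-size: 2
model-parallel-size: 1
ia3_tuning: true
# num_gpus deliberately not set
```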

megatron/mpu/layers.py — review thread (outdated, resolved)
@maw501 marked this pull request as draft on March 23, 2023 14:50