[Llama] Make torchao's Llama trainable #728

gau-nernst · 2024-08-22T13:19:03Z

Fixes #674

To make Llama trainable, I changed the following:

Do not initialize the KV cache if training=True is passed to setup_caches().
When input_pos is not passed to the model, fall back to plain causal attention via F.scaled_dot_product_attention(is_causal=True).
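
As a rough illustration of the second point (a minimal sketch, not the code in this PR): when no explicit mask is available, attention can rely on the built-in causal masking of F.scaled_dot_product_attention. The helper name causal_attention, its mask parameter, and the tensor shapes are hypothetical and chosen only for this example.

```python
import torch
import torch.nn.functional as F


def causal_attention(q, k, v, mask=None):
    """Hypothetical helper; q, k, v have shape (batch, heads, seq, head_dim)."""
    if mask is None:
        # Training-style path: no KV cache and no explicit mask,
        # so let SDPA apply a lower-triangular causal mask itself.
        return F.scaled_dot_product_attention(q, k, v, is_causal=True)
    # Inference-style path: the caller supplies a boolean mask
    # (True means "this position may be attended to").
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)


torch.manual_seed(0)
q, k, v = (torch.randn(1, 2, 4, 8) for _ in range(3))

# Explicit lower-triangular mask for a sequence of length 4.
explicit_mask = torch.tril(torch.ones(4, 4, dtype=torch.bool))

out_training = causal_attention(q, k, v)
out_masked = causal_attention(q, k, v, mask=explicit_mask)
```

Both calls should produce the same values up to floating-point tolerance, which is why the is_causal=True path can stand in for an explicit mask when the full sequence is processed at once.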

Other minor changes:

Ignore .safetensors weights in scripts/download.py to save bandwidth and speed up downloads.
Use torchao's Llama as an example for benchmarks/quantized_training
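
The idea behind skipping .safetensors files can be sketched with a glob-style filter. This is an illustrative assumption, not the contents of scripts/download.py: the function name filter_files and the file names below are hypothetical. If the script uses huggingface_hub's snapshot_download, that function's ignore_patterns argument accepts glob patterns of this kind, but whether the PR does it that way is not stated here.

```python
from fnmatch import fnmatch


def filter_files(filenames, ignore_patterns=("*.safetensors",)):
    """Hypothetical helper: drop any file matching one of the glob patterns."""
    return [
        name
        for name in filenames
        if not any(fnmatch(name, pattern) for pattern in ignore_patterns)
    ]


# Hypothetical repository listing, for illustration only.
files = ["model.safetensors", "pytorch_model.bin", "tokenizer.model", "config.json"]
kept = filter_files(files)
```

Only the files that do not match the ignore patterns remain in kept, so duplicate weight formats are never fetched.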

I manually ran

python torchao/_models/llama/generate.py --checkpoint_path checkpoints/meta-llama/Llama-2-7b-chat-hf/model.pth --prompt "Hello, my name is" --compile --precision float16

to confirm that the generated outputs are identical to those produced before this change.

pytorch-bot · 2024-08-22T13:19:06Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/728

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 806721c with merge base 99644e9:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

gau-nernst added 4 commits August 22, 2024 17:17

initial change

848aecc

skip safetensors weights

e47f272

update quantized training script

13eee14

add activation checkpointing

806721c

facebook-github-bot added the CLA Signed label Aug 22, 2024

gau-nernst requested review from HDCharles and msaroufim August 22, 2024 13:22

msaroufim approved these changes Aug 22, 2024

msaroufim merged commit 8002099 into pytorch:main Aug 22, 2024
16 checks passed

gau-nernst deleted the llama_train branch August 22, 2024 18:27
