Conversation

@pszemraj
Contributor

Signed-off-by: peter szemraj [email protected]

What does this PR do?

This PR adds accelerate support for the LongT5 models (i.e., makes it possible to use device_map="auto"), so these models can be loaded in 8-bit with load_in_8bit=True.

This helps enable inference with trained/fine-tuned SoTA long-summarization models on memory-limited hardware ☺️
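
For illustration, loading a LongT5 checkpoint in 8-bit then looks roughly like the sketch below (the checkpoint name, input, and generation settings are just examples; this assumes a GPU plus accelerate and bitsandbytes are installed):

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/long-t5-tglobal-base"  # any LongT5 checkpoint; illustrative
long_document = "..."  # your long input text goes here

tokenizer = AutoTokenizer.from_pretrained(model_name)

# device_map="auto" lets accelerate dispatch layers across the available devices;
# load_in_8bit=True quantizes the weights via bitsandbytes.
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_8bit=True,
)

# The model is already placed on GPU by accelerate, so no .cuda() call is needed.
inputs = tokenizer(long_document, return_tensors="pt").to(0)
summary_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))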

I took inspiration from similar PRs for other models: #19912 and #19927.

cc @sgugger

Test results

I made a Colab notebook that clones the branch from my fork to demo load_in_8bit=True working. For comparison purposes, everything else matches the fp32/standard notebook linked on my fine-tuned model card (except the function that reports the model size).

I also ran the tests for longT5 locally:

$ python -m pytest -n auto --dist=loadfile -s -v tests/models/longt5/test_modeling_longt5.py 

( ... many things here ...)

=================================================== 196 passed, 58 skipped, 118 warnings in 30.49s ===================================================

@pszemraj
Contributor Author

cc @KMFODA for input on tests & more 🤞

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Nov 21, 2022

The documentation is not available anymore as the PR was closed or merged.

pszemraj marked this pull request as ready for review November 21, 2022 01:59
@younesbelkada
Contributor


Very cool PR! Glad to see that 8-bit integration is gaining interest and attention on more models 🔥
Just a small typo in the Google Colab: the .cuda() is not needed after instantiating the model with load_in_8bit=True and device_map="auto", so I would advise removing it ;)

Can you make sure the slow tests pass with the command RUN_SLOW=1 pytest tests/models/longt5/test_modeling_longt5.py? (You will need access to a GPU instance.) When I ran your fix, the accelerate tests were failing; you can fix them by adding the lines below, as was done for BART/NLLB in #19912:

# Always create the embedding on this module first, then tie its weight to the
# shared embedding if one was passed in.
self.embed_tokens = nn.Embedding(config.vocab_size, config.d_model)
if embed_tokens is not None:
    self.embed_tokens.weight = embed_tokens.weight
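
(For context, and as I understand the BART/NLLB fix: creating the nn.Embedding on the module itself and only tying its weight to the shared embedding, rather than assigning the passed-in module directly, is what lets accelerate's device placement handle the shared parameter correctly.)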

@pszemraj
Contributor Author

Thanks for the feedback & good catch on the Colab! I've updated the notebook; I'll run and resolve the slow tests/accelerate items later today or tomorrow and report back 👌

@younesbelkada
Contributor

younesbelkada commented Nov 23, 2022

Hey @pszemraj!
How is the integration going 💪? Let me know if I can help at some point with debugging or making the tests pass ;)!

@younesbelkada
Contributor

Hi @pszemraj!
Is it ok if I take over the PR? This addition would be very nice for the library! Let me know what you think :)

@pszemraj
Contributor Author

pszemraj commented Dec 6, 2022 via email

@pszemraj
Contributor Author

pszemraj commented Dec 7, 2022

@younesbelkada hey - I was trying to get the tests to pass and evaluate further, but unfortunately the machine I have GPU access to was running into install issues with the dev dependencies for pytest etc.

If you're willing to finish this, that would probably be easiest 😅 I'll add the line for accelerate as you suggested and rebase per the contrib guidelines; feel free to take whatever you find useful :)

@younesbelkada
Contributor

Thanks a lot @pszemraj for your great efforts; I will have a look ASAP ;) This is definitely on my TODO list.

@pszemraj
Contributor Author

pszemraj commented Dec 8, 2022

Thanks so much! I see you pushed, so I'll leave you to it (but feel free to let me know if you have questions or need me to change anything on my end).

Then we can get this bad boi usable on free Colab runtimes :)

@younesbelkada
Contributor


I can confirm all slow tests pass (single & multi-GPU)!
Thanks so much @pszemraj for your great contribution and patience, and for making LongT5 models more accessible to everyone.
Leaving it now to @sgugger for a final review.

@sgugger
Collaborator


Thanks a lot!

sgugger merged commit a3345c1 into huggingface:main Dec 12, 2022
@pszemraj
Contributor Author

Thanks for taking it home @younesbelkada! And thanks for the review @sgugger. Happy to help :)

mpierrau pushed a commit to mpierrau/transformers that referenced this pull request Dec 15, 2022
* ✨ add accelerate support for LongT5 models

Signed-off-by: peter szemraj <[email protected]>

* fix `accelerate` tests

* Trigger CI test

Signed-off-by: peter szemraj <[email protected]>
Co-authored-by: younesbelkada <[email protected]>