Enable Llama2 70B to run with hqt on single card (#50) #780

Merged
regisss merged 3 commits into
huggingface:synapse_1.15 from
HabanaAI:dev/dsemiat/llama_single_card_pre_1.10.0_no_rebase
Mar 22, 2024

Conversation

@HolyFalafel
Contributor

Add a disk_offload flag that controls device_map=auto. Setting this flag enables offloading weights to disk when CPU memory runs out (OOM).
Add a const serialization path flag that takes the path where const sections are serialized, so that if the device does not have enough space to hold all const sections, they are offloaded to disk.
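
For readers unfamiliar with the mechanism, here is a minimal sketch of how a `disk_offload` switch could translate into model-loading arguments. It is an illustration only, not the code from this PR: the helper names `build_parser` and `build_load_kwargs` and the `--offload_folder` option are hypothetical, and the default folder path is an arbitrary placeholder. The keys `device_map`, `offload_folder`, and `offload_state_dict` are existing keyword arguments of `from_pretrained` in Transformers.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a small CLI parser (hypothetical helper, for illustration)."""
    parser = argparse.ArgumentParser(description="Sketch of a disk-offload switch")
    # Flag name taken from the PR description; the help text is ours.
    parser.add_argument(
        "--disk_offload",
        action="store_true",
        help="Offload weights to disk when CPU memory is insufficient.",
    )
    # Hypothetical option (not from the PR): where offloaded weights are written.
    parser.add_argument(
        "--offload_folder",
        default="/tmp/offload_folder",
        help="Directory for weights that do not fit in memory.",
    )
    return parser


def build_load_kwargs(args: argparse.Namespace) -> dict:
    """Return extra keyword arguments for model loading (hypothetical helper)."""
    kwargs = {}
    if args.disk_offload:
        # Let the loader place layers automatically and spill the rest to disk.
        kwargs["device_map"] = "auto"
        kwargs["offload_folder"] = args.offload_folder
        kwargs["offload_state_dict"] = True
    return kwargs
```

Under these assumptions, the resulting dictionary would be passed along as `AutoModelForCausalLM.from_pretrained(model_name, **build_load_kwargs(args))`; without the flag the dictionary is empty and loading is unchanged.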

This branch replaces the branch from PR #762

Yantom1 and others added 2 commits March 11, 2024 11:16
Add a disk_offload flag that controls device_map=auto. Setting this flag enables offloading
weights to disk when CPU memory runs out (OOM).
Add a const serialization path flag that takes the path where const sections are serialized,
so that if the device does not have enough space to hold all const sections, they are offloaded to disk.
@HolyFalafel HolyFalafel requested a review from regisss as a code owner March 11, 2024 09:27
@HolyFalafel
Contributor Author

> Agree with Libin, please provide an example.
>
> Can you also rebase your branch on main and run the following please?
>
> pip install -U ruff
> make style

We'll update this.

@libinta libinta added the run-test (Run CI for PRs from external contributors) and synapse 1.15 labels Mar 13, 2024
@regisss
Collaborator

regisss commented Mar 13, 2024

@HolyFalafel I guess this PR wouldn't work with Synapse 1.14 right?

@HolyFalafel
Contributor Author

> @HolyFalafel I guess this PR wouldn't work with Synapse 1.14 right?

You are right, v1.14 doesn't support some of the changes made here, so Llama 70B won't be able to run on a single card.

@HolyFalafel
Contributor Author

> @HolyFalafel I guess this PR wouldn't work with Synapse 1.14 right?
>
> You are right, v1.14 doesn't support some of the changes done here, and llama 70B won't be able to run on a single card

@regisss, if you can try running it on 1.14, that would be good. It might still work; we're not sure.

Comment thread examples/text-generation/utils.py Outdated
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@regisss regisss changed the base branch from main to synapse_1.15 March 22, 2024 22:31
@regisss regisss merged commit fe1c8e0 into huggingface:synapse_1.15 Mar 22, 2024