Enable Llama2 70B to run with hqt on single card #50

Merged
MrGeva merged 1 commit into habana-main from dev/yant on Feb 20, 2024

@Yantom1 Yantom1 commented Feb 18, 2024

Add a disk_offload flag that controls device_map=auto. Setting this flag enables offloading weights to disk when CPU memory runs out (OOM).
Add a const serialization path flag that takes a path under which const sections are serialized, so that if the device does not have enough space to hold all const sections, they are offloaded to disk.
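
As a rough illustration of the first flag, here is a minimal sketch of how a boolean disk_offload option might be turned into model-loading keyword arguments. This is not the code from this PR: the `--const_serialization_path` spelling, the helper `model_load_kwargs`, and the default offload directory are assumptions made for the example. `device_map` and `offload_folder` are existing keyword arguments of `from_pretrained` in transformers (backed by accelerate), but whether the PR passes them exactly this way is not shown in this conversation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two options described in the PR (names partly assumed)."""
    parser = argparse.ArgumentParser(description="Sketch of the offload-related flags")
    parser.add_argument(
        "--disk_offload",
        action="store_true",
        help="Let weights spill to disk when CPU memory is exhausted (enables device_map=auto).",
    )
    # Hypothetical flag spelling: the PR text only says "const serialization path flag".
    parser.add_argument(
        "--const_serialization_path",
        type=str,
        default=None,
        help="Directory in which const sections are serialized when they do not fit on the device.",
    )
    return parser


def model_load_kwargs(args: argparse.Namespace, offload_folder: str = "/tmp/offload_weights") -> dict:
    """Return keyword arguments that a caller could forward to from_pretrained.

    The offload_folder default is a placeholder chosen for this sketch.
    """
    kwargs = {}
    if args.disk_offload:
        # device_map="auto" lets accelerate place weights on device, CPU, then disk.
        kwargs["device_map"] = "auto"
        kwargs["offload_folder"] = offload_folder
    return kwargs


if __name__ == "__main__":
    demo_args = build_parser().parse_args(
        ["--disk_offload", "--const_serialization_path", "/tmp/const_sections"]
    )
    print(model_load_kwargs(demo_args))
    print(demo_args.const_serialization_path)
```

Without `--disk_offload` the helper returns an empty dict, so the default loading path is left unchanged. How the const serialization path is consumed by hqt is not described in this conversation, so the sketch only parses it.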

@Yantom1 Yantom1 changed the title [SW-172523] Enable Llama2 70B to run with hqt on single card Enable Llama2 70B to run with hqt on single card Feb 18, 2024
@Yantom1 Yantom1 requested review from MrGeva and ulivne February 18, 2024 18:16
Comment thread examples/text-generation/utils.py
@MrGeva MrGeva merged commit 8d30377 into habana-main Feb 20, 2024
bhargaveede pushed a commit that referenced this pull request Feb 22, 2024
HolyFalafel pushed a commit that referenced this pull request Mar 5, 2024
HolyFalafel pushed a commit that referenced this pull request Mar 5, 2024
HolyFalafel pushed a commit that referenced this pull request Mar 5, 2024
HolyFalafel pushed a commit that referenced this pull request Mar 5, 2024
HolyFalafel pushed a commit that referenced this pull request Mar 5, 2024
HolyFalafel pushed a commit that referenced this pull request Mar 11, 2024
HolyFalafel added a commit that referenced this pull request Mar 24, 2024
Co-authored-by: Yan Tomsinsky <73292515+Yantom1@users.noreply.github.com>
@astachowiczhabana
huggingface#831
