
Asking for advice: what GPU-memory recommendations and optimizations do you have? #25

Open
datalee opened this issue Jan 18, 2024 · 2 comments

Comments

@datalee commented Jan 18, 2024

Even a slightly longer document blows up GPU memory on a 32 GB V100:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 34.13 GiB (GPU 0; 31.75 GiB total capacity; 18.88 GiB already allocated; 7.01 GiB free; 20.61 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
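The error message itself suggests trying `max_split_size_mb` via `PYTORCH_CUDA_ALLOC_CONF`, which can help when reserved memory far exceeds allocated memory (fragmentation). A minimal sketch of how that is usually applied; the 128 MiB value is an arbitrary illustration, not a recommendation from this project:

```python
import os

# PYTORCH_CUDA_ALLOC_CONF must be set before torch initializes CUDA.
# max_split_size_mb caps the block size the caching allocator will split,
# which can reduce fragmentation when reserved >> allocated memory.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# import torch  # import torch only AFTER setting the variable,
#               # then load the model and run inference as usual
```

Alternatively, the same setting can be exported in the shell (`export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128`) before launching the script.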

@miange91 (Contributor) commented:

> Even a slightly longer document blows up GPU memory on a 32 GB V100: torch.cuda.OutOfMemoryError: CUDA out of memory. [...]

You can run inference with TensorRT-LLM.
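Independent of TensorRT-LLM, a common way to keep peak memory bounded on long documents is to split the input into fixed-size windows and process them one at a time. A framework-agnostic sketch; `window_size` and `overlap` are illustrative values, not parameters of this project:

```python
def chunk_tokens(tokens, window_size=512, overlap=64):
    """Split a long token sequence into overlapping windows so each
    forward pass sees at most window_size tokens."""
    if window_size <= overlap:
        raise ValueError("window_size must exceed overlap")
    step = window_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + window_size])
        if start + window_size >= len(tokens):
            break
    return chunks

# Example: a 1200-token document becomes three windows of <= 512 tokens.
windows = chunk_tokens(list(range(1200)))
```

The overlap preserves some left context at each window boundary; per-window results then need to be merged downstream, which is task-specific.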

@datalee (Author) commented Jan 22, 2024:

> You can run inference with TensorRT-LLM.

Is there a document we can refer to? Does it use the LLaMA 2 architecture?
