CUDA out of memory #6
I haven't tested this code with less than 16GB of GPU memory, but this is a bit surprising since each model is roughly 400M parameters and therefore around 800MB of memory. One suggestion: try loading the checkpoint on CPU, and then moving to GPU, like so:
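A minimal sketch of that pattern, assuming a standard PyTorch checkpoint (the checkpoint path and the tiny stand-in model below are placeholders, not GLIDE's actual loading code):

```python
import torch
from torch import nn

# Tiny stand-in model; in the real code this would be the ~400M-parameter model.
model = nn.Linear(4, 4)
torch.save(model.state_dict(), "/tmp/ckpt.pt")  # placeholder checkpoint path

# Load the checkpoint into CPU RAM first (avoids a large transient GPU
# allocation during deserialization), then move the model to the GPU
# only if one is available.
state_dict = torch.load("/tmp/ckpt.pt", map_location="cpu")
model.load_state_dict(state_dict)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
```

The key is `map_location="cpu"`: without it, `torch.load` restores tensors to the device they were saved from, which can double the peak GPU memory during loading.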
I still get the following error even when trying the above code.
It looks like you had a 12-GB-VRAM GPU, while OP managed to work with a 4-GB-VRAM GPU. Try clearing the memory, e.g. by restarting the Colab session if you use Google Colab.
Hi @woctezuma. Thank you for your suggestion. I am running the code locally, and I am not sure how to clear the memory in the code.
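One way to clear GPU memory from inside a script (a hedged sketch; the `nn.Linear` stand-in is just a placeholder for the real model) is to drop all Python references to the model and then release PyTorch's cached blocks:

```python
import gc
import torch

model = torch.nn.Linear(8, 8)  # placeholder for the real model
# ... run inference ...

# Drop the Python reference, force garbage collection, then return
# PyTorch's cached-but-unused GPU blocks to the driver.
del model
gc.collect()
if torch.cuda.is_available():
    torch.cuda.empty_cache()
```

Note that `empty_cache()` only releases memory that is cached but no longer referenced; it cannot free tensors that are still held by live Python objects.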
Setting the batch size to greater than 16 will result in an OOM error.
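If the batch-size limit is the problem, one generic workaround (a sketch, not GLIDE's API; `run_in_chunks` and the dummy model are hypothetical) is to split a large batch into chunks that fit in memory and run them sequentially:

```python
import torch

def run_in_chunks(fn, inputs, max_batch=16):
    """Apply fn to inputs in chunks of max_batch and concatenate the results."""
    outputs = []
    for i in range(0, len(inputs), max_batch):
        outputs.append(fn(inputs[i:i + max_batch]))
    return torch.cat(outputs, dim=0)

# Example with a dummy "model" that just doubles its input:
x = torch.arange(40, dtype=torch.float32).reshape(40, 1)
y = run_in_chunks(lambda b: b * 2, x, max_batch=16)  # 3 chunks: 16 + 16 + 8
```

The results are identical to a single large batch for any per-sample model, at the cost of some extra latency.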
I ran into the same issue on a 4GB GPU. Oddly enough, loading the upsample model before the base model worked for me with no other changes.
@kgullion Thanks, I will try your suggestion!
@kgullion thanks for your suggestion. I have tried running the upsample model before the base model, but I still get the following error:
Note this error only occurs if I run the code with a higher batch size, e.g. >50.
I get an OOM when loading the upsample model; the allocation error was:

RuntimeError: CUDA out of memory. Tried to allocate 100.00 MiB (GPU 0; 3.94 GiB total capacity; 3.00 GiB already allocated; 30.94 MiB free; 3.06 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

My nvidia-smi is:
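As the error message itself suggests, one thing worth trying is setting `max_split_size_mb` through the allocator config before launching the script. This is only a fragmentation mitigation, not a fix for genuinely insufficient VRAM, and the value 128 below is just an example to tune:

```shell
# Ask PyTorch's caching allocator to split blocks larger than 128 MiB,
# which can reduce fragmentation-related OOMs on small GPUs.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
echo "$PYTORCH_CUDA_ALLOC_CONF"
# then run your sampling script in the same shell
```

The variable must be set before the Python process starts, since PyTorch reads it when the CUDA allocator is initialized.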