Running out of GPU memory / VRAM #73

Open
Courage-1984 opened this issue Nov 5, 2024 · 0 comments

Courage-1984 commented Nov 5, 2024

I am using the following arguments:

python test_seesr.py --pretrained_model_path preset/models/stable-diffusion-2-base --prompt '' --seesr_model_path preset/models/seesr --ram_ft_path preset/models/DAPE.pth --image_path preset/datasets/test_datasets --output_dir preset/datasets/output --start_point lr --num_inference_steps 30 --guidance_scale 5.0 --process_size 384 --vae_decoder_tiled_size 384 --vae_encoder_tiled_size 384 --latent_tiled_size 48 --latent_tiled_overlap 4

but I get the following output after hours of processing:

input size: 2488x2864
[Tiled VAE]: input_size: torch.Size([2, 3, 2488, 2864]), tile_size: 384, padding: 32
[Tiled VAE]: split to 7x8 = 56 tiles. Optimal tile size 352x352, original tile size 384x384
[Tiled VAE]: Executing Encoder Task Queue: 100%|█████████████████████████████████| 5096/5096 [08:44<00:00,  9.72it/s]
[Tiled VAE]: Done in 525.361s, max VRAM alloc 7304.777 MB
  0%|                                                                                         | 0/30 [00:00<?, ?it/s][Tiled Latent]: the input size is 2488x2864, need to tiled
100%|█████████████████████████████████████████████████████████████████████████████| 30/30 [7:16:56<00:00, 873.88s/it]
[Tiled VAE]: the input size is tiny and unnecessary to tile.
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.70 GiB (GPU 0; 4.00 GiB total capacity; 9.37 GiB 
already allocated; 0 bytes free; 9.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try 
setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
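If more detail is needed, this is a minimal sketch of what I can run on this machine to report the allocator state (nothing SeeSR-specific, just plain PyTorch calls):

import torch

# Report free/total VRAM on GPU 0 and the caching-allocator breakdown.
free_bytes, total_bytes = torch.cuda.mem_get_info(0)
print(f"free: {free_bytes / 1024**3:.2f} GiB, total: {total_bytes / 1024**3:.2f} GiB")
print(torch.cuda.memory_summary(device=0))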

I am running on a GTX 1050 Ti, and I know it's not ideal, but I am willing to wait... what can/should I change to avoid this error?

I tried running the following in my terminal before running SeeSR, but it didn't help:

set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:256
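In case the terminal `set` is not being picked up by the Python process, I could also try setting it from inside test_seesr.py before torch is imported. A minimal sketch, assuming the variable only needs to be in the process environment before CUDA is initialised:

import os

# Must be set before torch initialises CUDA, otherwise it is ignored.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:256"

import torch
print(torch.cuda.get_device_name(0))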

I don't know if this helps or means anything, but SeeSR did work with the default arguments on the test dataset image provided by SeeSR.
