Hi, thanks for the great work! I am fine-tuning FLUX on a dataset of ~100K images. I have good GPUs (A100) but a small disk, and I'm afraid I will run out of disk space if I do the caching. Can I skip caching the VAE & text embeds?
Replies: 1 comment 5 replies
You cannot completely disable caching. Enabling --vae_cache_ondemand will greatly slow down training and increase VRAM requirements, perhaps uncomfortably so even on A100 devices at a 1024px image area.
There is also the --compress_disk_cache option, which applies Gzip to the image embeds and reduces their size.
Even without compression, 21,000 16-channel VAE embeds at 1024x1024 take just 11GB, so for 100k images you are looking at only about 50GB of disk space.
The real problem is the text embeds, which require disk cache compression to be reasonable. Those you can store in a Cloudflare R2 bucket. See DATALOADER.md for how to offload disk embeds.
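
To make the VAE disk-space estimate above easy to redo for other dataset sizes, here is a rough Python sketch. The first helper only scales the quoted figure (21,000 embeds taking about 11GB) linearly. The second is a sanity check that assumes an 8x spatial downsample and 2 bytes per value (e.g. bf16) and ignores any per-file overhead; both are assumptions of mine, not something confirmed in this thread. The function names are hypothetical and not part of the trainer.

```python
def estimate_cache_gb(num_images, ref_images=21_000, ref_gb=11.0):
    """Linearly scale the reference figure quoted in the reply (21,000 embeds ~ 11 GB)."""
    return ref_gb * num_images / ref_images


def latent_bytes(width=1024, height=1024, channels=16,
                 downsample=8, bytes_per_value=2):
    """Raw size of one latent.

    ASSUMPTIONS (not from the thread): 8x spatial downsample, 2 bytes per
    value, no per-file serialization overhead.
    """
    return channels * (height // downsample) * (width // downsample) * bytes_per_value


if __name__ == "__main__":
    # Extrapolate the quoted figure to a 100k-image dataset.
    print(round(estimate_cache_gb(100_000), 1))      # 52.4

    # Cross-check the quoted 11 GB for 21,000 latents under the assumptions above.
    print(latent_bytes())                            # 524288
    print(round(21_000 * latent_bytes() / 1e9, 2))   # 11.01
```

Under those assumptions the raw tensor size reproduces the quoted 11GB closely, which suggests the uncompressed VAE cache grows roughly linearly with image count at a fixed resolution.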