Why are 4 different dataset options generated by default? #1175

slimy-pufflefish · 2024-11-22T09:56:30Z

slimy-pufflefish
Nov 22, 2024

I used configure.py to generate a multidatabackend.json. However, it generated 4 datasets: a 512 variant, a 1024 variant, and 2 versions that have random cropping enabled.

I don't understand the differences between the variants with "crop": false and "crop": true. I also don't understand what happens if an image is smaller than, e.g., 512. Is it upscaled to be 512x512? Will this cause the model to output blurry results (since upscaling will make the image blurry)?

Answered by bghira

Nov 23, 2024

yes they are all treated as additional datasets


bghira · 2024-11-22T14:46:00Z

bghira
Nov 22, 2024
Maintainer

this is because models often generalise better across aspect ratios and base resolutions when you include them in the training.

DiT models like Flux or SD3 actually enter representation collapse rather easily when you go too far from their training sequence lengths (resolutions)

crop=false just buckets things by aspect (image shape) and crop=true makes them all squares.
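To make the difference concrete, here is a minimal Python sketch of how a target size could be chosen in each mode. It is not SimpleTuner's actual code: the function name `target_size`, the rounding to multiples of 64, and the reading of `pixel_area` as a budget of `resolution * resolution` pixels are all assumptions made for illustration. It also says nothing about whether a small image is upscaled.

```python
import math


def target_size(width, height, resolution, crop, multiple=64):
    """Return a (width, height) target for one image.

    Hypothetical helper, not part of SimpleTuner. Assumes the pixel budget is
    resolution * resolution and that edges snap to a multiple of 64.
    """
    if crop:
        # crop=true: every image ends up as a square of the base resolution.
        return resolution, resolution

    # crop=false: keep the image's aspect ratio and spend the same pixel budget.
    aspect = width / height
    area = resolution * resolution
    ideal_w = math.sqrt(area * aspect)
    ideal_h = math.sqrt(area / aspect)

    def snap(value):
        # Round to the nearest allowed edge length, never below one step.
        return max(multiple, round(value / multiple) * multiple)

    return snap(ideal_w), snap(ideal_h)


landscape_bucket = target_size(1920, 1080, 512, crop=False)
landscape_square = target_size(1920, 1080, 512, crop=True)
portrait_bucket = target_size(1080, 1920, 512, crop=False)
```

Under these assumptions a 16:9 image lands in a wide bucket with roughly the same pixel count as a 512x512 square, while the cropped variant is always exactly square.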

training the same images in both bucketed and square formats helps the model avoid associating any particular style or quality with a given resolution bucket, which is something that all U-Net and DiT models do.
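As a rough illustration of the four-variant layout described in this thread (two base resolutions, each with and without cropping), the following Python sketch builds the entries programmatically. It is not what configure.py actually runs: `build_variants`, the `-crop` id suffix, and the per-variant cache directory names are hypothetical, and only keys that already appear in the configs in this thread are used.

```python
import json


def build_variants(dataset_id, data_dir, resolutions=(512, 1024), repeats=200):
    """Build one dataset entry per (crop, resolution) combination.

    Hypothetical helper for illustration; the id and cache directory naming
    scheme is an assumption, not the generator's real output.
    """
    entries = []
    for crop in (False, True):
        for res in resolutions:
            suffix = f"{res}-crop" if crop else f"{res}"
            entries.append(
                {
                    "id": f"{dataset_id}-{suffix}",
                    "type": "local",
                    "instance_data_dir": data_dir,
                    "crop": crop,
                    "crop_style": "random",
                    "resolution": res,
                    "resolution_type": "pixel_area",
                    "repeats": repeats,
                    # Assumed: each variant gets its own VAE cache directory.
                    "cache_dir_vae": f"cache/vae-{suffix}",
                }
            )
    return entries


variants = build_variants("test_dataset", "some/path/to/data")
print(json.dumps(variants, indent=4))
```

All four entries point at the same image folder; only the crop flag, the resolution, and the cache location differ.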

4 replies

slimy-pufflefish Nov 22, 2024
Author

Thanks for the response! Here's what I specifically don't understand. Given a config like this:

    {
        "id": "test_dataset-512",
        "type": "local",
        "instance_data_dir": "some/path/to/data",
        "crop": false,
        "crop_style": "random",
        "minimum_image_size": 128,
        "resolution": 512,
        "resolution_type": "pixel_area",
        "repeats": 200,
        "metadata_backend": "discovery",
        "caption_strategy": "textfile",
        "cache_dir_vae": "cache//vae-512"
    },

What happens if an image of resolution 128x128 is inside the dataset folder?
Will it be upscaled to 512x512? (And if so, won't that make the images blurry?)
Given that crop is false, does crop_style do anything?

Now, let's say I have 2 datasets in my config file:

    {
        "id": "test_dataset-512",
        "type": "local",
        "instance_data_dir": "some/path/to/data",
        "crop": false,
        "crop_style": "random",
        "minimum_image_size": 128,
        "resolution": 512,
        "resolution_type": "pixel_area",
        "repeats": 200,
        "metadata_backend": "discovery",
        "caption_strategy": "textfile",
        "cache_dir_vae": "cache//vae-512"
    },
    {
        "id": "test_dataset-1024",
        "type": "local",
        "instance_data_dir": "some/path/to/data",
        "crop": false,
        "crop_style": "random",
        "minimum_image_size": 128,
        "resolution": 1024,
        "resolution_type": "pixel_area",
        "repeats": 200,
        "metadata_backend": "discovery",
        "caption_strategy": "textfile",
        "cache_dir_vae": "cache//vae-1024"
    },

If there's an image of size 128x128, which dataset will it belong to? Will it get upscaled to 512 or to 1024?

bghira Nov 22, 2024
Maintainer

yes, the minimum image size is set to a non-ideal value, but the default is none, which is arguably worse. a value of 128 allows nearly every image to be used, for better or worse, while signalling to the user that it is an option that can be set.

slimy-pufflefish Nov 23, 2024
Author

Got it. And are datasets additive? In other words, if I have 2 datasets pointing to the same folder of images (but one dataset has "crop": true and the other one does not), will the trainer do 200 repeats on cropped images and 200 repeats on un-cropped images?

bghira Nov 23, 2024
Maintainer

yes they are all treated as additional datasets
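A small Python sketch of what "additional datasets" means for the repeat counts in the question: each entry keeps its own `repeats`, so two entries that share a folder add up. The helper `repeats_per_folder` and the `test_dataset-512-crop` id are hypothetical, and the sketch only sums the configured values; how the trainer turns `repeats` into actual passes over the data is not modelled here.

```python
from collections import defaultdict


def repeats_per_folder(datasets):
    """Sum the configured repeats of every dataset entry, grouped by folder.

    Hypothetical helper for illustration only; it does not reproduce the
    trainer's sampling logic.
    """
    totals = defaultdict(int)
    for entry in datasets:
        totals[entry["instance_data_dir"]] += entry["repeats"]
    return dict(totals)


datasets = [
    {
        "id": "test_dataset-512",
        "instance_data_dir": "some/path/to/data",
        "crop": False,
        "repeats": 200,
    },
    {
        "id": "test_dataset-512-crop",  # hypothetical id for the cropped variant
        "instance_data_dir": "some/path/to/data",
        "crop": True,
        "repeats": 200,
    },
]

totals = repeats_per_folder(datasets)
print(totals)
```

With the two entries above, the same folder is configured for 200 repeats un-cropped plus 200 repeats cropped, which matches the answer in this thread.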

Answer selected by slimy-pufflefish
