Exception: No images were discovered by the bucket manager in the dataset #1035

Closed
a-l-e-x-d-s-9 opened this issue Oct 8, 2024 · 9 comments

@a-l-e-x-d-s-9

My dataset is a single image:
dataset.zip
Settings:
s01_multidatabackend.json
s01_config_01.json
Log:

No dependencies to install or update
INFO:root:lm_eval is not installed, GPTQ may not be usable
/home/alexds9/Documents/stable_diffusion/SimpleTuner/.venv/lib/python3.11/site-packages/xformers/ops/fmha/flash.py:211: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch.
  @torch.library.impl_abstract("xformers_flash::flash_fwd")
/home/alexds9/Documents/stable_diffusion/SimpleTuner/.venv/lib/python3.11/site-packages/xformers/ops/fmha/flash.py:344: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch.
  @torch.library.impl_abstract("xformers_flash::flash_bwd")
2024-10-08 04:09:41,385 [INFO] Using json configuration backend.
2024-10-08 04:09:41,385 [INFO] [CONFIG.JSON] Loaded configuration from config/config.json
2024-10-08 04:09:41,385 [WARNING] Skipping false argument: --push_to_hub
2024-10-08 04:09:41,385 [WARNING] Skipping false argument: --push_checkpoints_to_hub
2024-10-08 04:09:41,385 [WARNING] Skipping false argument: --validation_torch_compile
2024-10-08 04:09:41,385 [WARNING] Skipping false argument: --disable_benchmark
--model_type=lora
--lora_type=lycoris
--lycoris_config=/home/alexds9/Documents/stable_diffusion/Models_2024_10/simple_image_test/lycoris_config_03.json
--pretrained_model_name_or_path=black-forest-labs/FLUX.1-dev
--model_family=flux
--data_backend_config=/home/alexds9/Documents/stable_diffusion/Models_2024_10/simple_image_test/s01_multidatabackend.json
--output_dir=/home/alexds9/stable-diffusion-webui/models/Lora/My/Flux/Training/Models_2024_10/simple_image_test/tr_01/
--user_prompt_library=/home/alexds9/Documents/stable_diffusion/Models_2024_10/simple_image_test/s01_prompt_library.json
--hub_model_id=simpletuner-lora-01_simple_image_test_tr_01
--tracker_project_name=simpletuner-lora-01_simple_image_test_tr_01
--tracker_run_name=tr_01
--seed=5612103
--lora_rank=8
--lora_alpha=8
--mixed_precision=bf16
--optimizer=adamw_bf16
--learning_rate=7.5e-3
--train_batch_size=2
--gradient_accumulation_steps=2
--lr_scheduler=cosine
--lr_warmup_steps=20
--max_train_steps=1000
--num_train_epochs=0
--checkpointing_steps=100
--base_model_precision=int8-quanto
--base_model_default_dtype=bf16
--keep_vae_loaded
--flux_lora_target=all+ffs
--gradient_precision=fp32
--noise_offset=0.15
--noise_offset_probability=0.5
--checkpoints_total_limit=20
--aspect_bucket_rounding=2
--minimum_image_size=0
--resume_from_checkpoint=latest
--report_to=wandb
--metadata_update_interval=60
--gradient_checkpointing
--caption_dropout_probability=0.20
--resolution_type=pixel_area
--resolution=256
--validation_seed=10
--validation_steps=100
--validation_resolution=512x768
--validation_guidance=3.5
--validation_guidance_rescale=0.0
--validation_num_inference_steps=20
--validation_prompt=woman, brown hair, blue eyes, white shirt, upper body, indoors,
--num_validation_images=1
--snr_gamma=5
--inference_scheduler_timestep_spacing=trailing
--training_scheduler_timestep_spacing=trailing
--max_workers=32
--read_batch_size=25
--write_batch_size=64
--torch_num_threads=8
--image_processing_batch_size=32
--vae_batch_size=4
--compress_disk_cache
--max_grad_norm=0.02
--disable_bucket_pruning
--override_dataset_config
--quantize_via=cpu
2024-10-08 04:09:41,390 [WARNING] The VAE model madebyollin/sdxl-vae-fp16-fix is not compatible. Please use a compatible VAE to eliminate this warning. The baked-in VAE will be used, instead.
2024-10-08 04:09:41,391 [INFO] VAE Model: black-forest-labs/FLUX.1-dev
2024-10-08 04:09:41,391 [INFO] Default VAE Cache location: 
2024-10-08 04:09:41,391 [INFO] Text Cache location: cache
2024-10-08 04:09:41,391 [WARNING] Updating T5 XXL tokeniser max length to 512 for Flux.
2024-10-08 04:09:41,391 [WARNING] Gradient accumulation steps are enabled, but gradient precision is set to 'unmodified'. This may lead to numeric instability. Consider disabling gradient accumulation steps. Continuing in 10 seconds..
2024-10-08 04:09:51,391 [INFO] Enabled NVIDIA TF32 for faster training on Ampere GPUs. Use --disable_tf32 if this causes any problems.
2024-10-08 04:09:51,912 [INFO] Load VAE: black-forest-labs/FLUX.1-dev
2024-10-08 04:09:52,464 [INFO] Loading VAE onto accelerator, converting from torch.float32 to torch.bfloat16
2024-10-08 04:09:52,603 [INFO] Load tokenizers
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
2024-10-08 04:09:53,495 [INFO] Loading OpenAI CLIP-L text encoder from black-forest-labs/FLUX.1-dev/text_encoder..
2024-10-08 04:09:53,895 [INFO] Loading T5 XXL v1.1 text encoder from black-forest-labs/FLUX.1-dev/text_encoder_2..
Downloading shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 7876.63it/s]
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00,  7.17it/s]
2024-10-08 04:09:57,514 [INFO] Moving text encoder to GPU.
2024-10-08 04:09:57,707 [INFO] Moving text encoder 2 to GPU.
2024-10-08 04:10:06,404 [INFO] Loading data backend config from /home/alexds9/Documents/stable_diffusion/Models_2024_10/simple_image_test/s01_multidatabackend.json
2024-10-08 04:10:06,405 [INFO] Configuring text embed backend: alt-embed-cache
Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 593.91it/s]
2024-10-08 04:10:06,757 [INFO] (Rank: 0) (id=alt-embed-cache) Listing all text embed cache entries
2024-10-08 04:10:06,758 [INFO] Pre-computing null embedding
2024-10-08 04:10:13,232 [INFO] Completed loading text embed services.                                                        
2024-10-08 04:10:13,232 [INFO] Configuring data backend: all_dataset_768
2024-10-08 04:10:13,232 [INFO] (id=all_dataset_768) Loading bucket manager.                                                  
2024-10-08 04:10:13,243 [WARNING] No cache file found, creating new one.
2024-10-08 04:10:13,243 [INFO] (id=all_dataset_768) Refreshing aspect buckets on main process.
2024-10-08 04:10:13,243 [INFO] Discovering new files...
2024-10-08 04:10:13,245 [INFO] Compressed 0 existing files from 0.
Generating aspect bucket cache:   0%|                                         | 0/1 [00:00<?, ?it/s]2024-10-08 04:10:13,267 [ERROR] Error processing image: Aspect buckets must be a list of floats or dictionaries.
2024-10-08 04:10:13,268 [ERROR] Error traceback: Traceback (most recent call last):
  File "/home/alexds9/Documents/stable_diffusion/SimpleTuner/helpers/metadata/backends/discovery.py", line 237, in _process_for_bucket
    prepared_sample = training_sample.prepare()
                      ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexds9/Documents/stable_diffusion/SimpleTuner/helpers/image_manipulation/training_sample.py", line 314, in prepare
    self.crop()
  File "/home/alexds9/Documents/stable_diffusion/SimpleTuner/helpers/image_manipulation/training_sample.py", line 529, in crop
    self.calculate_target_size()
  File "/home/alexds9/Documents/stable_diffusion/SimpleTuner/helpers/image_manipulation/training_sample.py", line 484, in calculate_target_size
    self.aspect_ratio = self._select_random_aspect()
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexds9/Documents/stable_diffusion/SimpleTuner/helpers/image_manipulation/training_sample.py", line 280, in _select_random_aspect
    available_aspects = self._trim_aspect_bucket_list()
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexds9/Documents/stable_diffusion/SimpleTuner/helpers/image_manipulation/training_sample.py", line 198, in _trim_aspect_bucket_list
    raise ValueError(
ValueError: Aspect buckets must be a list of floats or dictionaries.

2024-10-08 04:10:13,270 [INFO] Image processing statistics: {'total_processed': 0, 'skipped': {'already_exists': 0, 'metadata_missing': 0, 'not_found': 0, 'too_small': 0, 'other': 0}}
2024-10-08 04:10:13,270 [INFO] Enforcing minimum image size of 0.013225. This could take a while for very-large datasets.
2024-10-08 04:10:13,270 [INFO] Completed aspect bucket update.
2024-10-08 04:10:13,271 [INFO] Configured backend: {'id': 'all_dataset_768', 'config': {'vae_cache_clear_each_epoch': False, 'probability': 1.0, 'repeats': 5, 'crop': True, 'crop_aspect': 'random', 'crop_aspect_buckets': [0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1.0, 1.125, 1.25, 1.375, 1.5, 1.625, 1.75, 1.875, 2], 'crop_style': 'random', 'disable_validation': False, 'resolution': 0.589824, 'resolution_type': 'area', 'caption_strategy': 'textfile', 'instance_data_dir': '/home/alexds9/Documents/stable_diffusion/Models_2024_10/simple_image_test/dataset', 'maximum_image_size': 1.048576, 'target_downsample_size': 0.589824, 'config_version': 2}, 'dataset_type': 'image', 'data_backend': <helpers.data_backend.local.LocalDataBackend object at 0x732528accc90>, 'instance_data_dir': '/home/alexds9/Documents/stable_diffusion/Models_2024_10/simple_image_test/dataset', 'metadata_backend': <helpers.metadata.backends.discovery.DiscoveryMetadataBackend object at 0x732528a8a690>}
(Rank: 0)  | Bucket     | Image Count (per-GPU)
------------------------------
2024-10-08 04:10:13,272 [ERROR] No images were discovered by the bucket manager in the dataset: all_dataset_768., traceback: Traceback (most recent call last):
  File "/home/alexds9/Documents/stable_diffusion/SimpleTuner/helpers/training/trainer.py", line 605, in init_data_backend
    configure_multi_databackend(
  File "/home/alexds9/Documents/stable_diffusion/SimpleTuner/helpers/data_backend/factory.py", line 823, in configure_multi_databackend
    raise Exception(
Exception: No images were discovered by the bucket manager in the dataset: all_dataset_768.

No images were discovered by the bucket manager in the dataset: all_dataset_768.
Traceback (most recent call last):
  File "/home/alexds9/Documents/stable_diffusion/SimpleTuner/train.py", line 30, in <module>
    trainer.init_data_backend()
  File "/home/alexds9/Documents/stable_diffusion/SimpleTuner/helpers/training/trainer.py", line 631, in init_data_backend
    raise e
  File "/home/alexds9/Documents/stable_diffusion/SimpleTuner/helpers/training/trainer.py", line 605, in init_data_backend
    configure_multi_databackend(
  File "/home/alexds9/Documents/stable_diffusion/SimpleTuner/helpers/data_backend/factory.py", line 823, in configure_multi_databackend
    raise Exception(
Exception: No images were discovered by the bucket manager in the dataset: all_dataset_768.

@a-l-e-x-d-s-9

The dataset includes a single image, and it uses two settings for resolution: 512px and 768px. For 768px, it seems to remove the file and assume there is nothing to train.

@a-l-e-x-d-s-9

I don't use "--delete_problematic_images" or "--delete_unwanted_images". I cleared the cache and all the JSON files the script had generated in the dataset and output folders, and tried again - it crashed again.
When I used "crop": false, the image was discovered and training worked, so it seems that the crop option is deleting the image and causing the issue.
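
This is consistent with the traceback above: the failing aspect-bucket check sits inside the crop path (prepare -> crop -> calculate_target_size -> _select_random_aspect -> _trim_aspect_bucket_list), so with cropping disabled that code is presumably never reached. For reference, a minimal sketch of the workaround, written as a Python dict mirroring the crop-related keys of the dataset entry in s01_multidatabackend.json (the remaining keys are assumed unchanged):

    # Hypothetical excerpt of the dataset entry, shown as a Python dict
    # (the JSON file uses lowercase true/false).
    dataset_entry_workaround = {
        "crop": False,           # with cropping disabled, the image is discovered again
        "crop_style": "random",  # presumably only consulted while cropping is enabled
        "crop_aspect": "random",
    }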

@bghira

bghira commented Oct 8, 2024

clearly the system is haunted by poltergeist

@a-l-e-x-d-s-9

The image size: 852 × 480
Crop settings that caused the image to be deleted:

        "crop": true,
        "crop_style": "random",
        "crop_aspect": "random",
        "crop_aspect_buckets": [0.125, 0.250, 0.375, 0.500, 0.625, 0.750, 0.875, 1.0, 1.125, 1.250, 1.375, 1.500, 1.625, 1.750, 1.875, 2],
        "resolution": 768,
        "resolution_type": "pixel_area",
        "minimum_image_size": 115,
        "maximum_image_size": 1024,
        "target_downsample_size": 768,

@a-l-e-x-d-s-9

I removed the 768px resolution from the dataset settings and tried with only 512px; it crashed with a similar error for 512px:

[ERROR] No images were discovered by the bucket manager in the dataset: all_dataset_512., traceback: Traceback

So the crop option removes the image even at the smaller resolution.

@a-l-e-x-d-s-9

a-l-e-x-d-s-9 commented Oct 8, 2024

@bghira
The problem was caused by having an integer in crop_aspect_buckets without a decimal point.
For example, you can reproduce the problem with: "crop_aspect_buckets": [1],
But if you change it to 1.0 - it will work: "crop_aspect_buckets": [1.0],
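
That matches the ValueError in the traceback above ("Aspect buckets must be a list of floats or dictionaries."): JSON's 1 is parsed as a Python int, which is not a float, while 1.0 is parsed as a float. Note that in the settings quoted above, the last bucket entry is a bare 2. A minimal sketch of that kind of strict type check, for illustration only (the function name is made up and this is not the actual SimpleTuner code):

    import json

    def validate_aspect_buckets(buckets):
        """Reject entries that are neither floats nor dicts, mirroring the error above."""
        for bucket in buckets:
            # isinstance(1, float) is False, so a bare integer like 1 or 2 fails
            # this check, while 1.0 passes.
            if not isinstance(bucket, (float, dict)):
                raise ValueError("Aspect buckets must be a list of floats or dictionaries.")
        return buckets

    validate_aspect_buckets(json.loads('[1.0, 1.25, 1.5]'))  # OK: all floats
    validate_aspect_buckets(json.loads('[1]'))               # raises ValueError: 1 is an int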

@bghira

bghira commented Oct 8, 2024

thanks for figuring that part out. i looked into the file deletions, and really every call to data_backend.delete(...) is wrapped by a check for delete_problematic_images etc., so those flags might be lurking somewhere?

@bghira bghira closed this as completed Oct 8, 2024
@a-l-e-x-d-s-9

Thank you. I meant to say that the image was deleted from the list of recognized/used images, not from the file system itself. So there is no problem in this regard.

@bghira

bghira commented Oct 8, 2024

oh, that is a relief
