
feat add & bugfix #642

Merged: 26 commits, Jan 8, 2025
Conversation

@OleehyO (Collaborator) commented Jan 6, 2025

  1. Added prompt embedding pre-caching in the dataset to reduce memory usage during training, and adapted it for the T2V and I2V models.
  2. Modified training details for the T2V and I2V models.
  3. Deleted the computation graph before evaluation and enabled offloading to further reduce memory usage.
  4. Fixed some minor issues in the SAT-related code.
  5. Adjusted default parameters in certain scripts and updated the README.

OleehyO and others added 15 commits January 2, 2025 03:07
- Add validation check to ensure number of frames is multiple of 8
- Add format validation for train_resolution string (frames x height x width)
- Add caching for prompt embeddings
- Store cached files using safetensors format
- Add cache directory structure under data_root/cache
- Optimize memory usage by moving tensors to CPU after caching
- Add debug logging for cache hits
- Add info logging for cache writes

The caching system helps reduce redundant computation and memory usage during training by:
1. Caching prompt embeddings based on prompt text hash
2. Caching encoded video latents based on video filename
3. Moving tensors to CPU after caching to free GPU memory
…lution

- Add validation to ensure (frames - 1) is multiple of 8
- Add specific resolution check (480x720) for cogvideox-5b models
- Add error handling for invalid resolution format
- Add support for cached prompt embeddings in dataset
- Fix bug where first frame wasn't properly padded in latent space
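The resolution checks listed above might look roughly like the sketch below. The function name `validate_train_resolution` and the exact error messages are hypothetical; the two rules it enforces, (frames - 1) divisible by 8 and a fixed 480x720 resolution for cogvideox-5b models, are taken from the commit messages.

```python
def validate_train_resolution(train_resolution: str, model_name: str) -> tuple[int, int, int]:
    """Parse and validate a 'frames x height x width' resolution string."""
    try:
        frames, height, width = (int(x) for x in train_resolution.split("x"))
    except ValueError:
        raise ValueError(
            f"train_resolution must be 'frames x height x width', got {train_resolution!r}"
        ) from None
    # The temporal VAE compresses (frames - 1) by a factor of 8,
    # so (frames - 1) must be a multiple of 8.
    if (frames - 1) % 8 != 0:
        raise ValueError(f"(frames - 1) must be a multiple of 8, got frames={frames}")
    # Per this PR, cogvideox-5b models only support 480x720 training.
    if "cogvideox-5b" in model_name and (height, width) != (480, 720):
        raise ValueError(f"{model_name} requires 480x720, got {height}x{width}")
    return frames, height, width
```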
This change enables caching of prompt embeddings in the CogVideoX text-to-video
LoRA trainer, which can improve training efficiency by avoiding redundant text
encoding operations.
Add docstring to train_frames field in State schema to explicitly indicate
that it includes one image padding frame
- Add pipe.remove_all_hooks() after validation to prevent memory leaks
- Clean up validation pipeline properly to avoid potential issues in subsequent training steps
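The post-validation cleanup could take roughly this shape. `remove_all_hooks()` is a real diffusers `DiffusionPipeline` method that detaches accelerate offload hooks; the wrapper function itself and the exact teardown order are assumptions, not the PR's code.

```python
import gc

import torch


def cleanup_after_validation(pipe) -> None:
    """Tear down the validation pipeline so the next training step starts clean."""
    # Detach offload hooks registered while building the validation pipeline;
    # leaving them in place can keep module references alive and leak memory.
    pipe.remove_all_hooks()
    del pipe
    gc.collect()
    if torch.cuda.is_available():
        # Return freed blocks to the CUDA driver so training can reuse them.
        torch.cuda.empty_cache()
```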
- Add text embedding support in dataset collation
- Pad 2 random noise frames at the beginning of latent space during training
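The noise-frame padding can be sketched as below, assuming a `[batch, frames, channels, height, width]` latent layout (the actual tensor layout in the repo may differ; the helper name `pad_noise_frames` is also hypothetical). The count of 2 padded frames follows the commit message.

```python
import torch


def pad_noise_frames(latent: torch.Tensor, num_pad: int = 2) -> torch.Tensor:
    """Prepend random-noise frames along the temporal axis of a latent tensor."""
    noise = torch.randn(
        latent.shape[0], num_pad, *latent.shape[2:],
        dtype=latent.dtype, device=latent.device,
    )
    # Concatenate along the frame dimension so the original latents follow
    # the noise frames unchanged.
    return torch.cat([noise, latent], dim=1)
```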
@OleehyO OleehyO requested a review from zRzRzRzRzRzRzR January 6, 2025 10:54
@zRzRzRzRzRzRzR (Member) left a comment
Today I will proceed with the operation, thank you.

zRzRzRzRzRzRzR and others added 11 commits January 7, 2025 13:16
When training i2v models without specifying image_column, automatically extract
and use first frames from training videos as conditioning images. This includes:

- Add load_images_from_videos() utility function to extract and cache first frames
- Update BaseI2VDataset to support auto-extraction when image_column is None
- Add validation and warning message in Args schema for i2v without image_column

The first frames are extracted once and cached to avoid repeated video loading.
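A decoder-agnostic sketch of that behavior is shown below. The real `load_images_from_videos()` utility presumably calls a specific video decoder directly; here the decoding and saving steps are injected as hypothetical `extract_first_frame` / `save_frame` hooks so only the caching logic from the commit message is illustrated.

```python
from pathlib import Path


def load_images_from_videos(video_paths, cache_dir, extract_first_frame, save_frame):
    """Cache the first frame of each video for use as the conditioning image.

    `extract_first_frame(video_path)` and `save_frame(frame, out_path)` stand in
    for the repo's actual video decoding and image writing.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    image_paths = []
    for video_path in video_paths:
        out = cache / (Path(video_path).stem + ".png")
        if not out.exists():  # decode each video only once, then reuse the cached frame
            frame = extract_first_frame(video_path)
            save_frame(frame, out)
        image_paths.append(out)
    return image_paths
```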
…ache

Before precomputing the latent cache and text embeddings, cast the VAE and
text encoder to the target training dtype (fp16/bf16) instead of keeping them
in fp32. This reduces memory usage during the precomputation phase.

The change occurs in prepare_dataset() where the models are moved to device
and cast to weight_dtype before being used to generate the cache.
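The casting step might look like this. The function name `prepare_models_for_precompute` is an assumption (the commit refers to `prepare_dataset()`); the substance, moving the frozen VAE and text encoder to the device and casting them to the training dtype before cache generation, is from the commit message.

```python
import torch


def prepare_models_for_precompute(vae, text_encoder, device, weight_dtype=torch.bfloat16):
    """Move the frozen VAE and text encoder to the device and cast them to the
    training dtype before generating the latent/embedding cache.

    Casting to bf16/fp16 roughly halves their memory footprint versus fp32;
    since both models stay frozen during LoRA training, the precision loss
    during precomputation is generally acceptable.
    """
    vae = vae.to(device, dtype=weight_dtype)
    text_encoder = text_encoder.to(device, dtype=weight_dtype)
    # Frozen during training: disable grads and put in eval mode.
    vae.requires_grad_(False)
    text_encoder.requires_grad_(False)
    vae.eval()
    text_encoder.eval()
    return vae, text_encoder
```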
Add a table in README files showing hardware requirements for training
different CogVideoX models, including:
- Memory requirements for each model variant
- Supported training types (LoRA)
- Training resolutions
- Mixed precision settings

Updated in all language versions (EN/ZH/JA).