Commit b946be3

Merge pull request #6 from bmaltais/dev
v17.1
2 parents: 90dad54 + 8856172

File tree

5 files changed (+494, -665 lines)

README.md (+18, -293)
@@ -77,6 +77,20 @@ pip install --upgrade -r requirements.txt
 
 Once the commands have completed successfully you should be ready to use the new version.
 
+## GUI
+
+There is now support for GUI based training using gradio. You can start the GUI interface by running:
+
+```powershell
+python .\dreambooth_gui.py
+```
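If you are starting from a fresh shell, activate the virtual environment first. A minimal launch sketch, assuming the repo location and venv layout used elsewhere in this README:

```powershell
# Activate the venv created during installation, then start the gradio GUI.
# gradio serves the interface locally and prints the URL to open in the console.
cd D:\kohya_ss
.\venv\Scripts\activate
python .\dreambooth_gui.py
```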
+
+## Quickstart screencast
+
+You can find a screencast on how to use the GUI at the following location:
+
+https://youtu.be/RlvqEKj03WI
+
 ## Folders configuration
 
 Refer to the note to understand how to create the folder structure. In short it should look like:
@@ -106,305 +120,16 @@ my_asd_dog_dreambooth
 `- dog8.png
 ```
 
-## GUI
-
-There is now support for GUI based training using gradio. You can start the GUI interface by running:
-
-```powershell
-python .\dreambooth_gui.py
-```
-
 ## Support
 
 Drop by the discord server for support: https://discord.com/channels/1041518562487058594/1041518563242020906
 
-## Manual Script Execution
-
-### SD1.5 example
-
-Edit and paste the following in a PowerShell terminal:
-
-```powershell
-accelerate launch --num_cpu_threads_per_process 6 train_db_fixed.py `
-  --pretrained_model_name_or_path="D:\models\last.ckpt" `
-  --train_data_dir="D:\dreambooth\train_bernard\train_man" `
-  --reg_data_dir="D:\dreambooth\train_bernard\reg_man" `
-  --output_dir="D:\dreambooth\train_bernard" `
-  --prior_loss_weight=1.0 `
-  --resolution=512 `
-  --train_batch_size=1 `
-  --learning_rate=1e-6 `
-  --max_train_steps=2100 `
-  --use_8bit_adam `
-  --xformers `
-  --mixed_precision="fp16" `
-  --cache_latents `
-  --gradient_checkpointing `
-  --save_every_n_epochs=1
-```
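The training and regularization folders use subfolder names of the form `<repeats>_<prompt>` to encode the repeat count and the prompt, as in the Folders configuration section. A hypothetical layout for the example above (all folder and file names are placeholders):

```
D:\dreambooth\train_bernard
|- train_man
|  `- 20_bernard man   <- training images, repeated 20 times per epoch
`- reg_man
   `- 1_man            <- regularization images for the "man" class
```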
-
-### SD2.0 512 Base example
-
-```powershell
-# variable values
-$pretrained_model_name_or_path = "D:\models\512-base-ema.ckpt"
-$data_dir = "D:\models\dariusz_zawadzki\kohya_reg\data"
-$reg_data_dir = "D:\models\dariusz_zawadzki\kohya_reg\reg"
-$logging_dir = "D:\models\dariusz_zawadzki\logs"
-$output_dir = "D:\models\dariusz_zawadzki\train_db_fixed_model_reg_v2"
-$resolution = "512,512"
-$lr_scheduler = "polynomial"
-$cache_latents = 1 # 1 = true, 0 = false
-
-$image_num = Get-ChildItem $data_dir -Recurse -File -Include *.png, *.jpg, *.webp | Measure-Object | %{$_.Count}
-
-Write-Output "image_num: $image_num"
-
-$dataset_repeats = 200
-$learning_rate = 2e-6
-$train_batch_size = 4
-$epoch = 1
-$save_every_n_epochs = 1
-$mixed_precision = "bf16"
-$num_cpu_threads_per_process = 6
-
-# You should not have to change values past this point
-if ($cache_latents -eq 1) {
-  $cache_latents_value = "--cache_latents"
-}
-else {
-  $cache_latents_value = ""
-}
-
-$repeats = $image_num * $dataset_repeats
-$mts = [Math]::Ceiling($repeats / $train_batch_size * $epoch)
-
-Write-Output "Repeats: $repeats"
-
-cd D:\kohya_ss
-.\venv\Scripts\activate
-
-accelerate launch --num_cpu_threads_per_process $num_cpu_threads_per_process train_db_fixed.py `
-  --v2 `
-  --pretrained_model_name_or_path=$pretrained_model_name_or_path `
-  --train_data_dir=$data_dir `
-  --output_dir=$output_dir `
-  --resolution=$resolution `
-  --train_batch_size=$train_batch_size `
-  --learning_rate=$learning_rate `
-  --max_train_steps=$mts `
-  --use_8bit_adam `
-  --xformers `
-  --mixed_precision=$mixed_precision `
-  $cache_latents_value `
-  --save_every_n_epochs=$save_every_n_epochs `
-  --logging_dir=$logging_dir `
-  --save_precision="fp16" `
-  --reg_data_dir=$reg_data_dir `
-  --seed=494481440 `
-  --lr_scheduler=$lr_scheduler
-
-# Add the inference yaml file along with the model for proper loading. It needs to have the same name as the model, most likely "last.yaml" in our case.
-cp v2_inference\v2-inference.yaml $output_dir"\last.yaml"
-```
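For reference, here is the step computation above worked through with hypothetical counts (the values below are placeholders, not taken from the folders in the script):

```powershell
# Hypothetical counts to illustrate how $mts is derived.
$image_num = 40; $dataset_repeats = 200; $train_batch_size = 4; $epoch = 1
$repeats = $image_num * $dataset_repeats                       # 40 * 200 = 8000
$mts = [Math]::Ceiling($repeats / $train_batch_size * $epoch)  # ceil(8000 / 4 * 1) = 2000
Write-Output "max_train_steps: $mts"                           # prints 2000
```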
-
-### SD2.0 768v Base example
-
-```powershell
-# variable values
-$pretrained_model_name_or_path = "C:\Users\berna\Downloads\768-v-ema.ckpt"
-$data_dir = "D:\dreambooth\train_paper_artwork\kohya\data"
-$logging_dir = "D:\dreambooth\train_paper_artwork"
-$output_dir = "D:\models\paper_artwork\train_db_fixed_model_v2_768v"
-$resolution = "768,768"
-$lr_scheduler = "polynomial"
-$cache_latents = 1 # 1 = true, 0 = false
-
-$image_num = Get-ChildItem $data_dir -Recurse -File -Include *.png, *.jpg, *.webp | Measure-Object | %{$_.Count}
-
-Write-Output "image_num: $image_num"
-
-$dataset_repeats = 200
-$learning_rate = 2e-6
-$train_batch_size = 4
-$epoch = 1
-$save_every_n_epochs = 1
-$mixed_precision = "bf16"
-$num_cpu_threads_per_process = 6
-
-# You should not have to change values past this point
-if ($cache_latents -eq 1) {
-  $cache_latents_value = "--cache_latents"
-}
-else {
-  $cache_latents_value = ""
-}
-
-$repeats = $image_num * $dataset_repeats
-$mts = [Math]::Ceiling($repeats / $train_batch_size * $epoch)
-
-Write-Output "Repeats: $repeats"
-
-cd D:\kohya_ss
-.\venv\Scripts\activate
-
-accelerate launch --num_cpu_threads_per_process $num_cpu_threads_per_process train_db_fixed.py `
-  --v2 `
-  --v_parameterization `
-  --pretrained_model_name_or_path=$pretrained_model_name_or_path `
-  --train_data_dir=$data_dir `
-  --output_dir=$output_dir `
-  --resolution=$resolution `
-  --train_batch_size=$train_batch_size `
-  --learning_rate=$learning_rate `
-  --max_train_steps=$mts `
-  --use_8bit_adam `
-  --xformers `
-  --mixed_precision=$mixed_precision `
-  $cache_latents_value `
-  --save_every_n_epochs=$save_every_n_epochs `
-  --logging_dir=$logging_dir `
-  --save_precision="fp16" `
-  --seed=494481440 `
-  --lr_scheduler=$lr_scheduler
-
-# Add the inference 768v yaml file along with the model for proper loading. It needs to have the same name as the model, most likely "last.yaml" in our case.
-cp v2_inference\v2-inference-v.yaml $output_dir"\last.yaml"
-```
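Apart from the paths, the key differences from the 512 base example are the resolution, the absence of regularization images, and the extra `--v_parameterization` flag: the 768-v checkpoints are trained with v-prediction, so `--v2` must be paired with `--v_parameterization`, and the matching `v2-inference-v.yaml` (rather than `v2-inference.yaml`) is the file copied next to the saved model.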
-
-## Finetuning
-
-If you would rather use model fine tuning than the DreamBooth method, you can use a command similar to the following. The advantage of fine tuning is that you do not need to worry about regularization images... but you need to provide captions for every image. The captions will be used to train the model. You can use auto1111 to preprocess your training images and add either BLIP or danbooru captions to them. You then need to edit those to add the name of the model and correct any wrong descriptions.
-
-```powershell
-accelerate launch --num_cpu_threads_per_process 6 train_db_fixed-ber.py `
-  --pretrained_model_name_or_path="D:\models\alexandrine_teissier_and_bernard_maltais-400-kohya-sd15-v1.ckpt" `
-  --train_data_dir="D:\dreambooth\source\alet_et_bernard\landscape-pp" `
-  --output_dir="D:\dreambooth\train_alex_and_bernard" `
-  --resolution="640,448" `
-  --train_batch_size=1 `
-  --learning_rate=1e-6 `
-  --max_train_steps=550 `
-  --use_8bit_adam `
-  --xformers `
-  --mixed_precision="fp16" `
-  --cache_latents `
-  --save_every_n_epochs=1 `
-  --fine_tuning `
-  --enable_bucket `
-  --dataset_repeats=200 `
-  --seed=23 `
-  --save_precision="fp16"
-```
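Captions live in plain text files next to each training image, one file per image. A hypothetical layout (file names are placeholders; the caption file extension is whatever `--caption_extension` is set to):

```
D:\dreambooth\source\alet_et_bernard\landscape-pp
|- img0001.png
|- img0001.caption   <- e.g. "alet and bernard standing in a field, landscape"
|- img0002.png
`- img0002.caption
```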
-
-Refer to this URL for more details about fine tuning: https://note.com/kohya_ss/n/n1269f1e1a54e
-
-## Options list
-
-```txt
-usage: train_db_fixed.py [-h] [--v2] [--v_parameterization] [--pretrained_model_name_or_path PRETRAINED_MODEL_NAME_OR_PATH]
-                         [--fine_tuning] [--shuffle_caption] [--caption_extention CAPTION_EXTENTION]
-                         [--caption_extension CAPTION_EXTENSION] [--train_data_dir TRAIN_DATA_DIR]
-                         [--reg_data_dir REG_DATA_DIR] [--dataset_repeats DATASET_REPEATS] [--output_dir OUTPUT_DIR]
-                         [--use_safetensors] [--save_every_n_epochs SAVE_EVERY_N_EPOCHS] [--save_state] [--resume RESUME]
-                         [--prior_loss_weight PRIOR_LOSS_WEIGHT] [--no_token_padding]
-                         [--stop_text_encoder_training STOP_TEXT_ENCODER_TRAINING] [--color_aug] [--flip_aug]
-                         [--face_crop_aug_range FACE_CROP_AUG_RANGE] [--random_crop] [--debug_dataset]
-                         [--resolution RESOLUTION] [--train_batch_size TRAIN_BATCH_SIZE] [--use_8bit_adam] [--mem_eff_attn]
-                         [--xformers] [--vae VAE] [--cache_latents] [--enable_bucket] [--min_bucket_reso MIN_BUCKET_RESO]
-                         [--max_bucket_reso MAX_BUCKET_RESO] [--learning_rate LEARNING_RATE]
-                         [--max_train_steps MAX_TRAIN_STEPS] [--seed SEED] [--gradient_checkpointing]
-                         [--mixed_precision {no,fp16,bf16}] [--full_fp16] [--save_precision {None,float,fp16,bf16}]
-                         [--clip_skip CLIP_SKIP] [--logging_dir LOGGING_DIR] [--log_prefix LOG_PREFIX]
-                         [--lr_scheduler LR_SCHEDULER] [--lr_warmup_steps LR_WARMUP_STEPS]
-
-options:
-  -h, --help            show this help message and exit
-  --v2                  load Stable Diffusion v2.0 model / Stable Diffusion 2.0のモデルを読み込む
-  --v_parameterization  enable v-parameterization training / v-parameterization学習を有効にする
-  --pretrained_model_name_or_path PRETRAINED_MODEL_NAME_OR_PATH
-                        pretrained model to train, directory to Diffusers model or StableDiffusion checkpoint /
-                        学習元モデル、Diffusers形式モデルのディレクトリまたはStableDiffusionのckptファイル
-  --fine_tuning         fine tune the model instead of DreamBooth / DreamBoothではなくfine tuningする
-  --shuffle_caption     shuffle comma-separated caption / コンマで区切られたcaptionの各要素をshuffleする
-  --caption_extention CAPTION_EXTENTION
-                        extension of caption files (backward compatibility) / 読み込むcaptionファイルの拡張子(スペルミスを残してあります)
-  --caption_extension CAPTION_EXTENSION
-                        extension of caption files / 読み込むcaptionファイルの拡張子
-  --train_data_dir TRAIN_DATA_DIR
-                        directory for train images / 学習画像データのディレクトリ
-  --reg_data_dir REG_DATA_DIR
-                        directory for regularization images / 正則化画像データのディレクトリ
-  --dataset_repeats DATASET_REPEATS
-                        repeat dataset in fine tuning / fine tuning時にデータセットを繰り返す回数
-  --output_dir OUTPUT_DIR
-                        directory to output trained model / 学習後のモデル出力先ディレクトリ
-  --use_safetensors     use safetensors format to save checkpoint / モデルをsafetensors形式で保存する
-  --save_every_n_epochs SAVE_EVERY_N_EPOCHS
-                        save checkpoint every N epochs / 学習中のモデルを指定エポックごとに保存する
-  --save_state          save training state additionally (including optimizer states etc.) /
-                        optimizerなど学習状態も含めたstateを追加で保存する
-  --resume RESUME       saved state to resume training / 学習再開するモデルのstate
-  --prior_loss_weight PRIOR_LOSS_WEIGHT
-                        loss weight for regularization images / 正則化画像のlossの重み
-  --no_token_padding    disable token padding (same as Diffusers' DreamBooth) /
-                        トークンのpaddingを無効にする(Diffusers版DreamBoothと同じ動作)
-  --stop_text_encoder_training STOP_TEXT_ENCODER_TRAINING
-                        steps to stop text encoder training / Text Encoderの学習を止めるステップ数
-  --color_aug           enable weak color augmentation / 学習時に色合いのaugmentationを有効にする
-  --flip_aug            enable horizontal flip augmentation / 学習時に左右反転のaugmentationを有効にする
-  --face_crop_aug_range FACE_CROP_AUG_RANGE
-                        enable face-centered crop augmentation and its range (e.g. 2.0,4.0) /
-                        学習時に顔を中心とした切り出しaugmentationを有効にするときは倍率を指定する(例:2.0,4.0)
-  --random_crop         enable random crop (for style training in face-centered crop augmentation) /
-                        ランダムな切り出しを有効にする(顔を中心としたaugmentationを行うときに画風の学習用に指定する)
-  --debug_dataset       show images for debugging (do not train) / デバッグ用に学習データを画面表示する(学習は行わない)
-  --resolution RESOLUTION
-                        resolution in training ('size' or 'width,height') / 学習時の画像解像度('サイズ'指定、または'幅,高さ'指定)
-  --train_batch_size TRAIN_BATCH_SIZE
-                        batch size for training (1 means one train or reg data, not train/reg pair) /
-                        学習時のバッチサイズ(1でtrain/regをそれぞれ1件ずつ学習)
-  --use_8bit_adam       use 8bit Adam optimizer (requires bitsandbytes) / 8bit Adamオプティマイザを使う(bitsandbytesのインストールが必要)
-  --mem_eff_attn        use memory efficient attention for CrossAttention / CrossAttentionに省メモリ版attentionを使う
-  --xformers            use xformers for CrossAttention / CrossAttentionにxformersを使う
-  --vae VAE             path to checkpoint of vae to replace / VAEを入れ替える場合、VAEのcheckpointファイルまたはディレクトリ
-  --cache_latents       cache latents to reduce memory (augmentations must be disabled) /
-                        メモリ削減のためにlatentをcacheする(augmentationは使用不可)
-  --enable_bucket       enable buckets for multi aspect ratio training / 複数解像度学習のためのbucketを有効にする
-  --min_bucket_reso MIN_BUCKET_RESO
-                        minimum resolution for buckets / bucketの最小解像度
-  --max_bucket_reso MAX_BUCKET_RESO
-                        maximum resolution for buckets / bucketの最大解像度
-  --learning_rate LEARNING_RATE
-                        learning rate / 学習率
-  --max_train_steps MAX_TRAIN_STEPS
-                        training steps / 学習ステップ数
-  --seed SEED           random seed for training / 学習時の乱数のseed
-  --gradient_checkpointing
-                        enable gradient checkpointing / gradient checkpointingを有効にする
-  --mixed_precision {no,fp16,bf16}
-                        use mixed precision / 混合精度を使う場合、その精度
-  --full_fp16           fp16 training including gradients / 勾配も含めてfp16で学習する
-  --save_precision {None,float,fp16,bf16}
-                        precision in saving (available in StableDiffusion checkpoint) /
-                        保存時に精度を変更して保存する(StableDiffusion形式での保存時のみ有効)
-  --clip_skip CLIP_SKIP
-                        use output of nth layer from back of text encoder (n>=1) / text encoderの後ろからn番目の層の出力を用いる(nは1以上)
-  --logging_dir LOGGING_DIR
-                        enable logging and output TensorBoard log to this directory /
-                        ログ出力を有効にしてこのディレクトリにTensorBoard用のログを出力する
-  --log_prefix LOG_PREFIX
-                        add prefix for each log directory / ログディレクトリ名の先頭に追加する文字列
-  --lr_scheduler LR_SCHEDULER
-                        scheduler to use for learning rate / 学習率のスケジューラ: linear, cosine, cosine_with_restarts, polynomial,
-                        constant (default), constant_with_warmup
-  --lr_warmup_steps LR_WARMUP_STEPS
-                        Number of steps for the warmup in the lr scheduler (default is 0) /
-                        学習率のスケジューラをウォームアップするステップ数(デフォルト0)
-```
-
 ## Change history
 
+* 12/17 (v17.1) update:
+    - Added a GUI for kohya_ss, called dreambooth_gui.py
+    - Removed support for `--fine_tuning`, as there is now a dedicated python repo for fine tuning. The flag is still there behind the scenes until kohya_ss removes it in a future code release.
+    - Removed the CLI examples, as I will now focus on the GUI for training. People who prefer CLI based training can still do that.
 * 12/13 (v17) update:
     - Added support for training with fp16 gradients (experimental feature). SD1.x can be trained with 8GB of VRAM. Specify the `--full_fp16` option.
 * 12/06 (v16) update:
