Class centers become less and less distinct during training #15

Open
s5248 opened this issue Dec 15, 2024 · 1 comment

Comments

s5248 commented Dec 15, 2024

train(
iterations=1500000,
save_model_every=100000,
batch_size=256,
vae_input_dim=3584,
vae_hidden_dims=[512, 256, 128],
vae_embed_dim=1024,
vae_codebook_size=64
)

Training the tokenizer with the parameters above: while the step count is under 100k, the resulting sem_ids are distributed across the 0-64 range. But as training goes on, past 400k steps the loss only fluctuates slightly up and down while the sem_ids become more and more alike, eventually all falling into a single class. Could this be training collapse caused by an inadequate loss function? I don't see any constraint in the loss that forces the 64 codebook entries to stay distinct.
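
For reference, this is roughly how I check how many of the 64 codes the tokenizer is still using at a checkpoint (a rough sketch; the function and variable names are only illustrative, not the repo's API):

import torch

def report_code_usage(sem_ids: torch.Tensor, codebook_size: int = 64) -> None:
    """Print how many distinct codes a batch of sem_ids actually uses."""
    unique_codes, counts = torch.unique(sem_ids, return_counts=True)
    print(f"{unique_codes.numel()} / {codebook_size} codes in use")
    print(f"most frequent code covers {counts.max().item() / counts.sum().item():.1%} of the batch")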

EdoardoBotta (Owner) commented Dec 17, 2024

Yes. I have seen codebook collapse a lot in my experiments. This is a very common issue with VQ-VAE. I would recommend keeping track of the entropy of the codebook throughout training to ensure healthy codebook usage.
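
For anyone hitting this, a minimal sketch of what tracking that entropy could look like (the function and variable names are mine, not the repo's API):

import torch

def codebook_entropy(code_ids: torch.Tensor, codebook_size: int) -> float:
    """Normalized entropy of codebook usage: ~1.0 means uniform usage, ~0.0 means collapse."""
    counts = torch.bincount(code_ids.flatten(), minlength=codebook_size).float()
    probs = counts / counts.sum()
    entropy = -(probs * (probs + 1e-10).log()).sum()
    return (entropy / torch.log(torch.tensor(float(codebook_size)))).item()

# Log this every few thousand steps; a steady drift toward 0 is an early warning of collapse.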

I would also recommend pulling the new changes from master. I have experimented a bit with various techniques to reduce codebook collapse and added support for the larger, more up-to-date MovieLens 32M dataset.
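
As a point of reference, one widely used mitigation (not necessarily the exact change now on master) is to periodically re-initialize codes that have gone unused, e.g. by resetting them to randomly chosen encoder outputs from the current batch. A rough sketch:

import torch

def reset_dead_codes(codebook: torch.nn.Embedding, code_ids: torch.Tensor,
                     encoder_outputs: torch.Tensor) -> int:
    """Re-initialize codebook entries that were never selected in this batch.

    codebook:        the VQ codebook, shape (codebook_size, embed_dim)
    code_ids:        code indices assigned in this batch
    encoder_outputs: encoder embeddings for the batch, shape (batch, embed_dim)
    Returns the number of codes that were reset.
    """
    with torch.no_grad():
        usage = torch.bincount(code_ids.flatten(), minlength=codebook.num_embeddings)
        dead = (usage == 0).nonzero(as_tuple=True)[0]
        if dead.numel() > 0:
            # Replace each dead code with a random encoder output from the batch.
            rand_idx = torch.randint(0, encoder_outputs.size(0), (dead.numel(),))
            codebook.weight[dead] = encoder_outputs[rand_idx]
    return int(dead.numel())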

With the training parameters I have set in this commit: 7b7f08b, I am not seeing codebook collapse.
