Yes. I have seen codebook collapse a lot in my experiments. This is a very common issue with VQ-VAE. I would recommend keeping track of the entropy of the codebook throughout training to ensure healthy codebook usage.
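Tracking codebook entropy can be done with a few lines. A minimal sketch (the function name and use of numpy are illustrative, not from the repo): compute the empirical distribution of code indices in a batch and its Shannon entropy. Healthy usage stays near log2(codebook_size); collapse drives it toward 0.

```python
import numpy as np

def codebook_entropy(indices, codebook_size):
    """Shannon entropy (in bits) of empirical code usage for one batch.

    Illustrative helper: `indices` is any array of code indices
    produced by the quantizer.
    """
    counts = np.bincount(np.asarray(indices).ravel(), minlength=codebook_size)
    probs = counts / counts.sum()
    probs = probs[probs > 0]  # drop unused codes to avoid log(0)
    return float(-(probs * np.log2(probs)).sum())

# Uniform usage of all 64 codes -> log2(64) = 6 bits
uniform = np.tile(np.arange(64), 10)
print(codebook_entropy(uniform, 64))    # -> 6.0

# Collapsed codebook: every vector maps to a single code -> 0 bits
collapsed = np.zeros(640, dtype=int)
print(codebook_entropy(collapsed, 64))  # -> 0.0
```

Logging this number every few hundred steps makes collapse visible long before the loss curve shows anything.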
I also recommend pulling the latest changes from master. I have experimented with various techniques to reduce codebook collapse and added support for the larger, more up-to-date MovieLens 32M dataset.
With the training parameters set in commit 7b7f08b, I am not seeing codebook collapse.
Training the tokenizer with the parameters above, the resulting sem_ids are spread across 0-64 while the step count stays under 100k. But as training progresses, after around 400k steps the loss only fluctuates slightly while the sem_ids become increasingly identical, all falling into a single cluster. Could the loss function be the cause of this training collapse? I don't see any term in the loss that explicitly forces the 64 codebook entries to stay distinct.
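One generic mitigation for the missing diversity constraint (not claimed to be what the repo does; the function and variable names here are hypothetical) is an entropy regularizer on the batch-averaged soft assignment over codes: minimizing the negative entropy of the mean usage distribution pushes the model to keep all codes in play.

```python
import numpy as np

def usage_entropy_penalty(soft_assign):
    """Negative entropy of batch-averaged code usage (lower is better).

    Sketch only: `soft_assign` is a (batch, K) array of softmax weights
    over the K codebook vectors (e.g. softmax of negative distances).
    Adding this penalty to the loss rewards spreading assignments
    across all K codes instead of collapsing onto one.
    """
    mean_probs = soft_assign.mean(axis=0)            # average usage per code
    mean_probs = np.clip(mean_probs, 1e-12, 1.0)     # numerical safety
    return float((mean_probs * np.log(mean_probs)).sum())  # = -H(usage)

# A collapsed batch (all mass on code 0) gets a higher penalty than a
# uniform one, so gradient descent is nudged away from collapse.
collapsed = np.zeros((8, 4)); collapsed[:, 0] = 1.0
uniform = np.full((8, 4), 0.25)
print(usage_entropy_penalty(collapsed) > usage_entropy_penalty(uniform))
```

Other standard options are EMA codebook updates with dead-code re-initialization, or resetting codes whose usage count falls below a threshold.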