About the exact iteration number of LJSpeech pretrained model #153

Closed
Interfish opened this issue May 8, 2019 · 7 comments

Comments

@Interfish

Hi all,

I downloaded the pretrained model 20180510_mixture_lj_checkpoint_step000320000_ema.pth. As its name suggests, it was trained for 320k steps. I tried synthesizing audio from it, and the quality is very good.

Then I trained my own model on LJSpeech using the same preset as the author's. It has now run for over 620k steps, but there is a little noise in the background. Here is my log:
[screenshot: training log]
So I am wondering what's wrong with my training. After some reading, I found a contradictory explanation in README.md.
For example:
[screenshot: README note on the pretrained model]
It says this model was trained for over 1000k steps, but its filename suggests it was trained for only 320k.
And in #1 (comment), the author also mentioned he trained the model for over 1000k steps.

So, what is the exact iteration number of 20180510_mixture_lj_checkpoint_step000320000_ema.pth?
This is important because I am trying to figure out whether something is wrong with my own training. If the pretrained model really was trained for only 320k steps, then my training is definitely wrong and I can start debugging.

@r9y9
Owner

r9y9 commented May 8, 2019

Sorry for the confusion, but the name doesn't suggest the model was trained for 320k steps in total. I had noted this before:

Note: As for the pretrained model for LJSpeech, the model was fine-tuned multiple times and trained for more than 1000k steps in total. Please refer to the issues (#1, #75, #45) to know how the model was trained.

Ref: #129

I initially thought the note was enough of an explanation, but apparently not, as I have gotten the same question twice. I will update the filename of the pretrained model to avoid the confusion.

@Interfish
Author

Thanks for the quick response and hard work. Now I know I have to keep on training. I hope anyone who shares the same confusion can see this post!

@puppylpg

puppylpg commented Nov 2, 2020

Sorry for the confusion, but the name doesn't suggest the model was trained for 320k steps in total. I had noted this before:

Note: As for the pretrained model for LJSpeech, the model was fine-tuned multiple times and trained for more than 1000k steps in total. Please refer to the issues (#1, #75, #45) to know how the model was trained.

Ref: #129

I initially thought the note was enough of an explanation, but apparently not, as I have gotten the same question twice. I will update the filename of the pretrained model to avoid the confusion.

Thanks for this explanation. However, I'm still puzzled after loading 20180510_mixture_lj_checkpoint_step000320000_ema.pth as a checkpoint. The code output shows that (see the sketch after this list):

  • checkpoint["global_step"] = 320000
  • checkpoint["global_epoch"] = 51
  • checkpoint["global_test_step"] = 14739
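For reference, a minimal sketch of how these values can be read, assuming PyTorch is installed and the checkpoint file is in the current directory:

```python
# Minimal sketch: inspect the step counters stored in the checkpoint.
import torch

checkpoint = torch.load(
    "20180510_mixture_lj_checkpoint_step000320000_ema.pth",
    map_location="cpu",
)
for key in ("global_step", "global_epoch", "global_test_step"):
    print(key, "=", checkpoint[key])
# -> global_step = 320000, global_epoch = 51, global_test_step = 14739
```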

And if I train a new model based on 20180510_mixture_lj_checkpoint_step000320000_ema.pth, a checkpoint at 320k iterations will be saved, then 330k, and so on.

Don't these mean that 20180510_mixture_lj_checkpoint_step000320000_ema.pth is actually a model with only 320k iterations rather than more than 1000k?
Again, thanks for this great work.

@r9y9
Owner

r9y9 commented Nov 2, 2020

--reset-optimizer Reset optimizer.

--reset-optimizer will clear the number of steps in the checkpoint and start training from step 0.
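As a hypothetical sketch of what such a flag typically does when restoring a checkpoint (the function and key names here are illustrative, not necessarily the repo's actual code):

```python
# Hypothetical sketch of the logic behind a --reset-optimizer flag.
import torch

def load_checkpoint(path, model, optimizer, reset_optimizer=False):
    checkpoint = torch.load(path, map_location="cpu")
    model.load_state_dict(checkpoint["state_dict"])
    if reset_optimizer:
        # Discard optimizer state and step counters: training restarts at step 0.
        return 0
    # Otherwise resume exactly where the checkpoint left off.
    optimizer.load_state_dict(checkpoint["optimizer"])
    return checkpoint["global_step"]
```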

@puppylpg

puppylpg commented Nov 2, 2020

--reset-optimizer Reset optimizer.

--reset-optimizer will clear the number of steps in the checkpoint and start training from step 0.

Hmm... maybe I didn't express myself clearly. I just want to confirm why the step count continues from 320k rather than 1000k, since you said it was trained for over 1000k steps...

@r9y9
Owner

r9y9 commented Nov 2, 2020

For example,

  • model 1: a model trained for 700k steps
  • model 2: a model fine-tuned from model 1 with --reset-optimizer for 300k steps

Then model 2 was trained for 1000k steps in total, but the checkpoint only keeps the iteration count of the last run (i.e. 300k steps).
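A quick restatement of that accounting with the numbers above:

```python
# Step accounting across two training rounds (numbers from the example).
round_1 = 700_000  # model 1: first training run
round_2 = 300_000  # model 2: fine-tuned from model 1 with --reset-optimizer

total_trained = round_1 + round_2  # 1,000,000 steps of training in total
checkpoint_step = round_2          # but the saved global_step shows only 300,000
```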

@puppylpg

puppylpg commented Nov 2, 2020

For example,

  • model 1: a model trained for 700k steps
  • model 2: a model fine-tuned from model 1 with --reset-optimizer for 300k steps

Then model 2 was trained for 1000k steps in total, but the checkpoint only keeps the iteration count of the last run (i.e. 300k steps).

Gotcha! Thanks for your patient reply.
