
Fine-tune PLATO-2 #127


Open

sdai654416 opened this issue Apr 12, 2022 · 2 comments

Comments

@sdai654416

  1. I downloaded the 24L model and ran the fine-tuning script bash ./scripts/local/job.sh ./projects/PLATO-2/finetune/24L_train.conf, but the loss is NaN from the very beginning of fine-tuning. Am I missing any stages?

  2. If I run the pre-training script, pre-training stage 1 does not store anything in output/. I assume stage 2.1 and stage 2.2 require stage 1's output, right? How do I store stage 1's output?

Thanks!

@sserdoubleh
Collaborator

sserdoubleh commented Apr 13, 2022

You can change the AMP setting in knover/core/model.py:
https://github.com/PaddlePaddle/Knover/blame/develop/knover/core/model.py#L165

"custom_white_list": ["gelu"],

It seems that old models need fp16 softmax / layer_norm disabled.
Thanks for the feedback!
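A minimal sketch of what that change amounts to, assuming the paddle.static.amp API (the exact wrapper and argument names used in Knover may differ by version):

```python
# Sketch: restrict the AMP custom white list so that softmax and
# layer_norm run in fp32 and only gelu stays in fp16. The real setup
# sits around knover/core/model.py#L165; names here are illustrative.
import paddle
from paddle.static import amp

paddle.enable_static()

# Base optimizer; in Knover this is built from the training config.
optimizer = paddle.optimizer.Adam(learning_rate=1e-4)

amp_lists = amp.AutoMixedPrecisionLists(
    custom_white_list=["gelu"],  # drop "softmax" and "layer_norm" here
)
optimizer = amp.decorate(
    optimizer,
    amp_lists=amp_lists,
    use_dynamic_loss_scaling=True,  # illustrative; Knover sets its own values
)
```

Ops not in the white list fall back to fp32, so removing softmax and layer_norm from the list keeps those numerically sensitive ops out of fp16, which can prevent the overflow that produces NaN losses with older checkpoints.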

@py703703

py703703 commented May 3, 2022

As for your second question, I think the pre-training data is too small and save_steps is too large, so training finishes before the first save step is reached and no checkpoint is written to output/. save_steps can be modified in projects/PLATO-2/pretrain/24L_train_stage-1.conf.
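For illustration, a sketch of that knob (the key name and file layout are assumptions based on typical shell-style .conf files in Knover; check your version of the file for the actual setting):

```
# projects/PLATO-2/pretrain/24L_train_stage-1.conf (illustrative excerpt)
# A checkpoint is written to output/ every save_steps steps; if this value
# exceeds the total number of training steps, training finishes before the
# first checkpoint is saved. Lower it for small pre-training datasets.
save_steps=100
```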
