I wrote a tutorial while studying the code, hoping it helps other learners #73
Comments
For LoRA SFT, how can I use the model I trained myself in the earlier steps instead of downloading one from Hugging Face? (One possible approach is sketched right below.)
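A minimal sketch of one way to do this, assuming a plain PyTorch checkpoint saved by the earlier scripts. The import paths, class names, checkpoint path, and LoRA layer names are assumptions and need to be adapted to the actual repository code; the `peft` wrapping is just one option and may differ from how 4-lora_sft.py implements LoRA:

```python
# Sketch only: names and paths below are assumptions about the repo layout.
import torch
from peft import LoraConfig, get_peft_model   # one option; the repo may roll its own LoRA

from model.model import Transformer           # assumed model class
from model.LMConfig import LMConfig           # assumed config class

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1) Rebuild the model and load your own full-SFT checkpoint instead of HF weights.
model = Transformer(LMConfig())
state_dict = torch.load("./out/full_sft.pth", map_location=device)  # hypothetical path
model.load_state_dict(state_dict, strict=False)
model.to(device)

# 2) Attach LoRA adapters so that only the adapter weights are trained.
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["wq", "wv"],  # replace with the real Linear layer names in the model
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()
```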
Thanks for your tutorial!
So once I have a checkpoint, how do I actually get 5-dpo_train.py to run? Could you add more details, @jingyaogong? Would it work to patch the init_model function directly, along the lines sketched below?
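The commenter's original snippet is not preserved in this thread; the following is only an illustration of the idea, assuming minimind's model/config classes and a checkpoint produced by 3-full_sft.py. The real init_model in 5-dpo_train.py may have a different signature:

```python
# Illustration only, not the commenter's original patch; class names and the
# checkpoint path are assumptions about the repo layout.
import copy
import torch

from model.model import Transformer   # assumed model class
from model.LMConfig import LMConfig   # assumed config class


def init_model(ckpt_path="./out/full_sft.pth", device="cpu"):
    """Build the DPO policy model from a local checkpoint instead of downloading
    from Hugging Face, plus a frozen copy to use as the reference model."""
    model = Transformer(LMConfig())
    state_dict = torch.load(ckpt_path, map_location=device)
    model.load_state_dict(state_dict, strict=False)
    model.to(device)

    # DPO also needs a frozen reference model; reuse the same weights.
    ref_model = copy.deepcopy(model).eval()
    for p in ref_model.parameters():
        p.requires_grad_(False)

    return model, ref_model
```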
This tutorial is based on the official README and on "Not an issue, just some personal notes on training minimind #26".
Since I only have a MacBook (M1 Pro), I use this project purely to debug and study the code; I have not trained a usable model at all. Starting from the author's code, I reduced the number of epochs and trained on only a tiny amount of data within each epoch, so that every script runs end to end and I can follow the logic of the code (a rough sketch of these debug-sized settings appears after the list). The workflow I followed while studying the code:
1. Train the tokenizer.
2. Run data_process.py to process the data and prepare the pretrain dataset.
3. Pretrain the model: 1-pretrain.py
4. Supervised Fine-Tuning (SFT): 3-full_sft.py
5. At this point you can run 2-eval.py to evaluate the model.
6. LoRA SFT: 4-lora_sft.py
7. DPO training: 5-dpo_train.py
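A minimal sketch of the "debug-sized" run described above: cap the dataset to a few hundred samples and run a single epoch so each script finishes quickly on a laptop. The helper name and the dataset/loader wiring are hypothetical, not the exact code in the minimind scripts:

```python
# Hypothetical helper for smoke-testing the training scripts on a laptop.
from torch.utils.data import DataLoader, Subset


def make_debug_loader(full_dataset, max_samples=256, batch_size=8):
    """Return a DataLoader over a small slice of the dataset so one epoch is fast."""
    n = min(max_samples, len(full_dataset))
    small = Subset(full_dataset, range(n))
    return DataLoader(small, batch_size=batch_size, shuffle=True)


# Inside each training script, swap in the small loader and drop the epoch count:
#   train_loader = make_debug_loader(train_ds)
#   epochs = 1
```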