Roughly how long does training take? #5

Open
xYxSeries opened this issue May 24, 2024 · 9 comments

Comments

@xYxSeries

No description provided.

@wdndev
Owner

wdndev commented May 24, 2024

It depends on your hardware and the number of training tokens.
For reference, my run: 8×A100, 42B tokens, 1 epoch, 92M-parameter model, about 15 hours.
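To turn that reference run into a rough estimate for other setups, here is a back-of-envelope sketch. It assumes the same 92M model, the same per-GPU throughput, and linear scaling across GPUs, none of which is guaranteed on different hardware. The function names are made up for illustration.

```python
# Back-of-envelope estimate scaled from the run quoted above:
# 8 x A100, 42B tokens, 1 epoch, 92M-parameter model, about 15 hours.
REF_GPUS = 8
REF_TOKENS = 42e9
REF_HOURS = 15.0


def tokens_per_second_per_gpu() -> float:
    """Average per-GPU throughput implied by the reference run."""
    return REF_TOKENS / (REF_HOURS * 3600) / REF_GPUS


def estimate_hours(tokens: float, gpus: int) -> float:
    """Estimated wall-clock hours for `tokens` on `gpus` GPUs.

    Assumption: same model size, same per-GPU throughput as the
    reference run, and linear scaling with the number of GPUs.
    """
    return tokens / (tokens_per_second_per_gpu() * gpus * 3600)


if __name__ == "__main__":
    print(f"{tokens_per_second_per_gpu():,.0f} tokens/s per GPU")
    print(f"{estimate_hours(2e9, 1):.1f} h for 2B tokens on 1 GPU")
```

Treat the result as an order-of-magnitude guide only; a different GPU model, batch size, or sequence length will change the throughput.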

@xYxSeries
Author

OK, understood. Thanks.

@dage0127

Which model size does the training time below refer to: the 96M or the 440M parameter model?
"8*A100, 42B tokens, 1 epoch, 92M, about 15 hours"

@wdndev
Owner

wdndev commented Jul 24, 2024

That is for the 92M model, not the 440M or 96M ones. The time depends on your training resources.

@dage0127

For a model with 96M parameters, would around 2B tokens be enough?

@wdndev
Owner

wdndev commented Jul 27, 2024

@dage0127 As many as possible. I trained on more than 40B tokens to get these results, and they are still somewhat weak.
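For a sense of scale, the two data budgets mentioned in this thread can be compared as tokens per parameter. This is plain arithmetic on the numbers quoted above (2B tokens for 96M parameters, and 42B tokens for 92M parameters). The helper name is made up for illustration, and the ratio alone does not predict model quality.

```python
def tokens_per_param(tokens: float, params: float) -> float:
    """Training tokens seen per model parameter."""
    return tokens / params


if __name__ == "__main__":
    proposed = tokens_per_param(2e9, 96e6)     # 2B tokens, 96M params
    reference = tokens_per_param(42e9, 92e6)   # 42B tokens, 92M params
    print(f"proposed:  {proposed:.1f} tokens per parameter")
    print(f"reference: {reference:.1f} tokens per parameter")
```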

@dage0127

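One more question: is all of this data loaded at once for training?
If 40B tokens of data are loaded in one go, what memory and CPU configuration does the server need? On my side, even 2B of data fails to load.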

@wdndev
Owner

wdndev commented Jul 29, 2024

@dage0127 Load the data in MAP (memory-mapped) mode, so that it does not all have to be read into memory at once.
image
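A minimal sketch of memory-mapped loading with `numpy.memmap`, assuming the tokenized corpus is stored as one flat binary file of `uint16` token ids. The file layout, dtype, and the function name `open_token_file` are assumptions for illustration and are not taken from the screenshot above.

```python
import os
import tempfile

import numpy as np


def open_token_file(path: str, seq_len: int) -> np.memmap:
    """Memory-map a flat uint16 token file as (num_samples, seq_len).

    The file is not read into RAM here; the OS pages data in from disk
    only when rows are actually indexed. Trailing tokens that do not
    fill a whole sequence are ignored.
    """
    n_tokens = os.path.getsize(path) // np.dtype(np.uint16).itemsize
    num_samples = n_tokens // seq_len
    return np.memmap(path, dtype=np.uint16, mode="r",
                     shape=(num_samples, seq_len))


if __name__ == "__main__":
    # Write a tiny stand-in corpus so the example is self-contained.
    path = os.path.join(tempfile.mkdtemp(), "tokens.bin")
    np.arange(1000, dtype=np.uint16).tofile(path)

    tokens = open_token_file(path, seq_len=128)
    print(tokens.shape)                       # number of full sequences
    sample = np.asarray(tokens[3], dtype=np.int64)  # copies one row only
    print(sample[:5])
```

A dataset class can keep the returned array and index it by sample number, so memory use stays close to one batch rather than the whole corpus.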

@dage0127

Thanks a lot, I'll give it a try!
