Reminder
I have read the README and searched the existing issues.
System Info
Currently, every fine-tuning run spends 6 hours pre-tokenizing the dataset, which is too time-consuming.
(P.S. I have tried launching in streaming mode, but GPU utilization is almost zero and training takes even longer than with offline preprocessing, which is not acceptable.)
Reproduction
[None]
Expected behavior
If the pre-tokenized data were saved locally, it could be loaded directly on subsequent runs over the same data, speeding up this process.
Others
No response