GPU Middle Class? #2161
Labels: discussion, distributed, triaged
Does torchtune have any plans to support "GPU middle class" users?
We're evaluating torchtune for post-training, especially since it already implements many useful features (RLHF, LoRA, etc.). However, one big sticking point is that the system seems heavily geared towards single-node training. Are there plans to support multi-node training (e.g. 16-64 nodes) and things like model parallelism, 128k-context training, etc.?
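For concreteness, here is a rough plain-PyTorch sketch of the kind of multi-node, model-parallel setup we have in mind. This is not torchtune's API; the node/GPU counts and the torchrun launch are just illustrative assumptions.

```python
# Plain-PyTorch sketch (not torchtune's API) of a multi-node, model-parallel
# setup, assuming the script is launched with torchrun on every node, e.g.:
#   torchrun --nnodes=16 --nproc-per-node=8 --rdzv-backend=c10d \
#            --rdzv-endpoint=$HOST_NODE:29500 train.py
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh

dist.init_process_group(backend="nccl")

# Illustrative shape: 16 nodes x 8 GPUs = 128 ranks, split into a
# data-parallel axis across nodes and a tensor-parallel axis within a node.
mesh = init_device_mesh("cuda", (16, 8), mesh_dim_names=("dp", "tp"))

# The "dp" sub-mesh would back FSDP-style sharding, and the "tp" sub-mesh
# would back tensor parallelism for long-context (e.g. 128k) training.
dp_mesh, tp_mesh = mesh["dp"], mesh["tp"]
```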
If not, is torchtitan the recommended system to use?
Thanks!