The TRL SFT trainer uses an IterableDataset, so DataLoaderDispatcher is called. The data shapes and tensors are broadcast from rank 0 to the other ranks in DDP. Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
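For readers unfamiliar with the dispatch pattern mentioned above, here is a minimal single-process sketch of the idea: only rank 0 iterates the dataset, and each rank then receives its slice of the global batch. The helper names `dispatch_batches` and `split_for_ranks` are hypothetical and are not Accelerate's API. The real DataLoaderDispatcher moves tensors with collective broadcast operations, which this sketch only simulates with list slicing. Dropping the trailing incomplete batch is a simplification made here, not a claim about the library's behavior.

```python
from typing import Iterable, Iterator, List


def split_for_ranks(batch: List[int], num_processes: int) -> List[List[int]]:
    """Slice one global batch into equal, contiguous per-rank chunks."""
    per_rank = len(batch) // num_processes
    return [batch[r * per_rank:(r + 1) * per_rank] for r in range(num_processes)]


def dispatch_batches(
    dataset: Iterable[int], batch_size: int, num_processes: int
) -> Iterator[List[List[int]]]:
    """Yield, per step, the list of per-rank slices of one global batch.

    Hypothetical helper: all ranks are simulated inside a single process.
    """
    global_batch: List[int] = []
    for sample in dataset:  # only "rank 0" reads from the iterable dataset
        global_batch.append(sample)
        if len(global_batch) == batch_size * num_processes:
            # In a real DDP run this is where rank 0 would broadcast the batch.
            yield split_for_ranks(global_batch, num_processes)
            global_batch = []
    # Assumption of this sketch: a trailing incomplete batch is dropped.


steps = list(dispatch_batches(iter(range(8)), batch_size=2, num_processes=2))
for step, slices in enumerate(steps):
    print(step, slices)
```

Because every rank depends on what rank 0 sends, the shape and contents of the broadcast batch have to be consistent across ranks, which is the part of the pipeline this PR touches.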
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Traceback (most recent call last):
What does this PR do?
Fixes # (issue)
Before submitting