[4/n] DP Enhancement: Optimize communication when dp < tp by using all_gather_into_tensor and reduce_scatter_tensor #8279
Closed
ch-wan wants to merge 4 commits into gh/ch-wam/4/base from