New WR 156s (1.25% better than PR #122): Optimize distributed training, improve skip connection gating, and enhance bfloat16 usage #125
Merged
Commits
Commits on Aug 23, 2025
- committed
- authored
Commits on Aug 27, 2025
- committed
Commits on Sep 2, 2025
Commits on Sep 11, 2025
- authored and committed
- authored and committed
- authored and committed
- committed
- committed
- committed
- committed
Bernardino