estimate_zero2_model_states_mem_needs: fixing memory estimation (#5099)
The estimate was considering 4 bytes per model param and 4 bytes per gradient. Fixed it to 2 bytes each, under the assumption of FP16/BF16.

---------

Co-authored-by: Olatunji Ruwase <[email protected]>
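To illustrate the size of the correction, here is a minimal sketch of the parameter-plus-gradient arithmetic only. The helper `params_and_grads_bytes` and its arguments are hypothetical names for this example, not part of DeepSpeed, and the sketch deliberately ignores optimizer states, FP32 master weights, and buffer factors that the real estimator may also account for.

```python
def params_and_grads_bytes(total_params: int,
                           bytes_per_param: int = 2,
                           bytes_per_grad: int = 2) -> int:
    """Bytes needed for one copy of the params and one copy of the gradients.

    Defaults assume FP16/BF16 (2 bytes per element), as in the commit message.
    Hypothetical helper for illustration; not the DeepSpeed implementation.
    """
    return total_params * (bytes_per_param + bytes_per_grad)


total_params = 1_000_000_000  # example model size, chosen arbitrarily

# Old assumption: 4 bytes per param and 4 bytes per gradient.
old_estimate = params_and_grads_bytes(total_params, bytes_per_param=4, bytes_per_grad=4)

# Fixed assumption: 2 bytes each under FP16/BF16.
new_estimate = params_and_grads_bytes(total_params)

print(f"old: {old_estimate / 2**30:.2f} GiB, new: {new_estimate / 2**30:.2f} GiB")
```

Under these assumptions, the old accounting doubles the parameter and gradient portion of the estimate relative to the FP16/BF16 figure.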