Conversation

@samyam
Contributor

@samyam samyam commented Mar 10, 2021

When a parameter's size is not divisible by the world size, the partitioned gradients are misaligned because the padding is handled incorrectly. This PR should fix that.
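
To illustrate the padding arithmetic involved, here is a minimal sketch in plain Python. It is not the DeepSpeed code: `partition_with_padding` and `unpartition` are hypothetical helpers, and the layout (zero padding appended after the last real element, equal-sized shards per rank) is an assumption made for the example.

```python
import math


def partition_with_padding(flat, world_size):
    """Split a flat list into world_size equal shards, zero-padding the tail.

    Hypothetical helper: assumes padding goes after the last real element.
    """
    numel = len(flat)
    shard_size = math.ceil(numel / world_size)
    padding = shard_size * world_size - numel
    padded = flat + [0.0] * padding
    return [padded[r * shard_size:(r + 1) * shard_size] for r in range(world_size)]


def unpartition(shards, numel):
    """Concatenate the shards and drop the trailing padding."""
    flat = [x for shard in shards for x in shard]
    return flat[:numel]


# 10 gradient elements across 4 ranks: not evenly divisible.
grads = [float(i) for i in range(10)]
shards = partition_with_padding(grads, world_size=4)
restored = unpartition(shards, numel=len(grads))
```

In this sketch each rank owns 3 elements and the last rank's shard carries 2 padding zeros. If the padding is counted or placed inconsistently between partitioning and reassembly, every element after that point shifts, which is the kind of misalignment described above.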

@ShadenSmith ShadenSmith linked an issue Mar 10, 2021 that may be closed by this pull request
Contributor

@ShadenSmith ShadenSmith left a comment

Thanks @samyam! Can we add a unit test to cover this case in our CI?

@samyam
Contributor Author

samyam commented Mar 10, 2021

> Thanks @samyam! Can we add a unit test to cover this case in our CI?

@ShadenSmith Done

@jeffra
Collaborator

jeffra commented Mar 12, 2021

These changes are included in #851.

@jeffra jeffra closed this Mar 12, 2021

Development

Successfully merging this pull request may close these issues.

Trouble with the backward pass in ZeRO 3

4 participants