Fix bucketing seeding #6254

Merged
merged 2 commits into from
Mar 19, 2023

Conversation

VahidooX
Collaborator

What does this PR do ?

When the user sets and controls the global seed, fully_randomized bucketing uses the same bucket order on all workers, so the workers do not get independently shuffled buckets. This PR fixes that seeding bug in the bucketing dataset by adding the rank of the data loader to the seed.
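The effect of the fix can be illustrated with a minimal sketch. It is not the NeMo implementation: the function `bucket_order`, its parameters, and the use of Python's `random` module are assumptions made for illustration. It only shows why a shared seed gives every rank the same bucket order, and how offsetting the seed by the rank restores per-rank randomness while staying reproducible.

```python
import random


def bucket_order(num_buckets, seed, rank=0, add_rank=True):
    """Return a shuffled bucket order for one data-loading rank.

    Hypothetical helper for illustration only; not part of NeMo.
    """
    # With add_rank=False every rank seeds its generator identically,
    # which reproduces the bug described above.
    effective_seed = seed + rank if add_rank else seed
    rng = random.Random(effective_seed)
    order = list(range(num_buckets))
    rng.shuffle(order)
    return order


# Before the fix: all ranks share the seed, so the orders are identical.
shared = [bucket_order(8, seed=1234, rank=r, add_rank=False) for r in range(4)]

# After the fix: the rank is added to the seed, so each rank draws its own
# order, and rerunning with the same seed and rank gives the same order.
per_rank = [bucket_order(8, seed=1234, rank=r) for r in range(4)]
```

Adding the rank to the seed keeps runs reproducible for a given global seed, since each rank's effective seed is still determined by that seed.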

Changelog

  • Add the rank of the data worker to the seed of the bucketing dataset to keep the randomness of the fully_randomized bucketing mode.

PR Type:

  • New Feature
  • Bugfix
  • Documentation

Collaborator

@titu1994 titu1994 left a comment

Looks good !

@titu1994 titu1994 merged commit 850b1f2 into NVIDIA:r1.17.0 Mar 19, 2023
github-actions bot pushed a commit that referenced this pull request Mar 19, 2023
* fixed the seeding bug of bucketing.

Signed-off-by: Vahid <[email protected]>

* fixed the seeding bug of bucketing.

Signed-off-by: Vahid <[email protected]>

---------

Signed-off-by: Vahid <[email protected]>
titu1994 pushed a commit that referenced this pull request Mar 19, 2023
* fixed the seeding bug of bucketing.

* fixed the seeding bug of bucketing.

---------

Signed-off-by: Vahid <[email protected]>
Co-authored-by: Vahid Noroozi <[email protected]>
hsiehjackson pushed a commit to hsiehjackson/NeMo that referenced this pull request Jun 2, 2023
* fixed the seeding bug of bucketing.

* fixed the seeding bug of bucketing.

---------

Signed-off-by: Vahid <[email protected]>
Co-authored-by: Vahid Noroozi <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>