Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distributed optimizer support for experimental FP8 tensors (r1.20.0 branch) #7565

Closed

Conversation

timmoon10
Copy link
Collaborator

timmoon10 and others added 2 commits September 28, 2023 11:02
* Update distopt wrapper with multiple dtype support

Remove manual handling of separate FP32 optimizer.

Signed-off-by: Tim Moon <[email protected]>

* Use distopt support for contiguous buffers with multiple dtypes

Signed-off-by: Tim Moon <[email protected]>

* Fix typo

Signed-off-by: Tim Moon <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Separate distopt buckets for first GPT layer and non-overlapped params

Signed-off-by: Tim Moon <[email protected]>

* Add distopt logic for int dtypes

Signed-off-by: Tim Moon <[email protected]>

* Update Apex commit

Signed-off-by: Tim Moon <[email protected]>

* Remove unused variables

Signed-off-by: Tim Moon <[email protected]>

* Update Apex commit in README and Jenkensfile

Signed-off-by: Tim Moon <[email protected]>

* Debug Dockerfile and Jenkinsfile

Signed-off-by: Tim Moon <[email protected]>

---------

Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Eric Harper <[email protected]>
@github-actions github-actions bot added core Changes to NeMo Core NLP CI labels Sep 28, 2023
@timmoon10 timmoon10 changed the title Distributed optimizer support for experimental FP8 tensors for r1.20.0 branch Distributed optimizer support for experimental FP8 tensors (r1.20.0 branch) Sep 28, 2023
@timmoon10 timmoon10 marked this pull request as draft October 9, 2023 20:53
@github-actions
Copy link
Contributor

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Oct 24, 2023
Copy link
Contributor

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Nov 18, 2023
Copy link
Contributor

This PR was closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this Nov 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI core Changes to NeMo Core NLP stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants