Set num_nodes and sync_batchnorm From Trainer for Manually Passed Training Type Plugin #7026

@shuyingsunshine21 (Contributor) commented Apr 15, 2021

What does this PR do?

Fixes #7007
Fixes #7429
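
For readers unfamiliar with the change named in the title, here is a minimal, self-contained sketch of the general idea: the Trainer holds `num_nodes` and `sync_batchnorm` and copies them onto the training type plugin, even when the user constructed that plugin manually. `ToyTrainingTypePlugin` and `ToyTrainer` are hypothetical stand-ins written for illustration only; they are not the Lightning classes and do not reflect the actual diff in this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyTrainingTypePlugin:
    """Hypothetical stand-in for a training type plugin (not the Lightning class)."""

    num_nodes: int = 1
    sync_batchnorm: bool = False


class ToyTrainer:
    """Hypothetical stand-in for a Trainer that owns the cluster-level settings."""

    def __init__(
        self,
        num_nodes: int = 1,
        sync_batchnorm: bool = False,
        plugin: Optional[ToyTrainingTypePlugin] = None,
    ) -> None:
        # Fall back to a default plugin when the user did not pass one.
        self.plugin = plugin if plugin is not None else ToyTrainingTypePlugin()
        # Copy the Trainer-level values onto the plugin, including a manually
        # passed one, so both objects agree on the configuration.
        self.plugin.num_nodes = num_nodes
        self.plugin.sync_batchnorm = sync_batchnorm


# The user builds the plugin by hand and only configures the Trainer.
manual_plugin = ToyTrainingTypePlugin()
trainer = ToyTrainer(num_nodes=2, sync_batchnorm=True, plugin=manual_plugin)
```

In this sketch the Trainer arguments are the single source of truth, so a manually created plugin does not need to repeat them.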

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

Shuying Sun and others added 30 commits March 23, 2021 12:06
…oint_consolidate

Update test_all_gather_grad.py
…1-checkpoint_consolidate"

This reverts commit c5053da, reversing
changes made to 0d23d75.
This reverts commit 70fe5da.
This reverts commit a9aae99.
Shuying Sun added 2 commits May 4, 2021 03:27
@mergify mergify bot removed the has conflicts label May 4, 2021
@shuyingsunshine21 (Contributor, Author)

added deprecation tests.

@shuyingsunshine21 shuyingsunshine21 requested a review from carmocca May 4, 2021 18:50
@mergify mergify bot added the has conflicts label May 7, 2021
@awaelchli (Contributor) left a comment

great!

pytorch_lightning/plugins/training_type/ddp.py (outdated review thread, resolved)
@mergify mergify bot removed the has conflicts label May 8, 2021
@awaelchli awaelchli changed the title from "[RFC] Set num_nodes and sync_batchnorm From Trainer for Manually Passed Training Type Plugin" to "Set num_nodes and sync_batchnorm From Trainer for Manually Passed Training Type Plugin" May 8, 2021
@awaelchli awaelchli enabled auto-merge (squash) May 8, 2021 11:02
@awaelchli awaelchli merged commit 987530c into Lightning-AI:master May 8, 2021
@shuyingsunshine21 shuyingsunshine21 deleted the training_type_plugin_consolidate branch May 11, 2021 18:22
@awaelchli awaelchli mentioned this pull request May 14, 2021
@carmocca carmocca modified the milestones: v1.4, v1.3.x May 17, 2021
Labels: distributed (Generic distributed-related topic), ready (PRs ready to be merged), refactor
Projects: None yet
8 participants