Enable timeout for DDPStrategy #13244

Merged
merged 20 commits into Lightning-AI:master from ddp_with_timeout on Jun 21, 2022

Conversation

Contributor

@lsy643 lsy643 commented Jun 7, 2022

What does this PR do?

Enable timeout for DDPStrategy

Fixes #13211
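
For readers landing here, a minimal usage sketch of what the PR enables, assuming the `timeout` argument is a `datetime.timedelta` passed to `DDPStrategy` (available from the 1.7 milestone). The helper names `ddp_timeout` and `build_strategy` are hypothetical and only serve the illustration:

```python
from datetime import timedelta


def ddp_timeout(minutes: float) -> timedelta:
    """Convert a timeout in minutes to a timedelta (hypothetical helper)."""
    if minutes <= 0:
        raise ValueError("timeout must be positive")
    return timedelta(minutes=minutes)


def build_strategy(minutes: float):
    """Create a DDPStrategy with a custom process-group timeout.

    The import is done lazily so this sketch can be loaded without
    Lightning installed. Assumes pytorch_lightning >= 1.7.
    """
    from pytorch_lightning.strategies import DDPStrategy

    return DDPStrategy(timeout=ddp_timeout(minutes))


# Example (needs pytorch_lightning and a multi-device setup):
# trainer = pytorch_lightning.Trainer(strategy=build_strategy(60), accelerator="gpu", devices=2)
```

The timeout is kept as a `timedelta` end to end so that no unit conversion has to be guessed at the call site.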

Does your PR introduce any breaking changes? If yes, please list them.

None

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or minor internal changes/refactors)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

enable timeout for DDPStrategy
pytorch_lightning/strategies/ddp.py (outdated, resolved)
pytorch_lightning/strategies/ddp.py (outdated, resolved)
@carmocca carmocca added the feature, strategy: ddp, and community labels Jun 7, 2022
@justusschock
Member

@carmocca should we also set the env variables for the non-gloo backends? Or at least check them and error out early?
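
A rough sketch of what an early check could look like. This is not part of the PR. It assumes the NCCL backend only honors the timeout when `NCCL_BLOCKING_WAIT` or `NCCL_ASYNC_ERROR_HANDLING` is set to `1`, which is how the PyTorch documentation described it around the time of this PR; newer releases may differ. The function names are hypothetical:

```python
import os

# Assumption: variable names come from the PyTorch docs of that period,
# not from this PR.
NCCL_TIMEOUT_ENV_VARS = ("NCCL_BLOCKING_WAIT", "NCCL_ASYNC_ERROR_HANDLING")


def nccl_timeout_is_effective(environ=None) -> bool:
    """Return True if at least one of the NCCL variables is set to '1'."""
    env = os.environ if environ is None else environ
    return any(env.get(name) == "1" for name in NCCL_TIMEOUT_ENV_VARS)


def check_timeout_env(backend: str, environ=None) -> None:
    """Fail early when a timeout is requested but would be ignored (sketch)."""
    if backend == "nccl" and not nccl_timeout_is_effective(environ):
        raise RuntimeError(
            "A timeout was requested for the nccl backend, but neither "
            + " nor ".join(NCCL_TIMEOUT_ENV_VARS)
            + " is set to 1, so the timeout may have no effect."
        )
```

Raising instead of setting the variables silently keeps the user in control of the process environment; setting them automatically would be the alternative raised in the question.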

Add a timeout unit test
Contributor

@carmocca carmocca left a comment

LGTM as it is!

Let's continue the discussion about the env vars in the issue and if necessary, open a separate PR for checking them.

CHANGELOG.md (outdated, resolved)
@carmocca carmocca added this to the 1.7 milestone Jun 14, 2022
@carmocca carmocca self-assigned this Jun 14, 2022
@mergify mergify bot removed the has conflicts label Jun 15, 2022
@mergify mergify bot added the ready and has conflicts labels Jun 15, 2022
@mergify mergify bot removed the ready label Jun 15, 2022
@mergify mergify bot added the ready label and removed the has conflicts and ready labels Jun 16, 2022
@awaelchli awaelchli enabled auto-merge (squash) June 21, 2022 09:34
auto-merge was automatically disabled June 21, 2022 11:40

Head branch was pushed to by a user without write access

@carmocca carmocca merged commit c600f98 into Lightning-AI:master Jun 21, 2022
@lsy643 lsy643 deleted the ddp_with_timeout branch July 11, 2022 06:23
@RuABraun

How is one supposed to use this in a YAML config? The timeout arg is a class with a named arg.
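
The thread does not answer this. One possible workaround, sketched under assumptions: keep a plain number in the config and convert it to a `timedelta` in code before constructing the strategy. The key name `timeout_seconds` and the function `strategy_kwargs_from_config` are hypothetical, and the dict below stands in for whatever the YAML loader returns:

```python
from datetime import timedelta


def strategy_kwargs_from_config(config: dict) -> dict:
    """Translate a plain-number timeout into the timedelta DDPStrategy expects.

    `timeout_seconds` is a made-up config key for this sketch; all other
    keys are passed through unchanged.
    """
    kwargs = dict(config)  # copy so the caller's config is not mutated
    seconds = kwargs.pop("timeout_seconds", None)
    if seconds is not None:
        kwargs["timeout"] = timedelta(seconds=float(seconds))
    return kwargs


# Stand-in for a parsed YAML section such as:
#   timeout_seconds: 1800
#   find_unused_parameters: false
parsed = {"timeout_seconds": 1800, "find_unused_parameters": False}
kwargs = strategy_kwargs_from_config(parsed)
# kwargs could then be passed on as DDPStrategy(**kwargs).
```

This avoids having to express a class instance in YAML at all, at the cost of one extra conversion step in user code.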

Labels
community, feature, ready, strategy: ddp
Development

Successfully merging this pull request may close these issues.

Be able to Set Timeout When DDP Strategy is Used
5 participants