Simplify TPUSpawn rank management #11163

Merged
merged 52 commits into from
Jul 14, 2022

Conversation

Contributor

@awaelchli awaelchli commented Dec 19, 2021

What does this PR do?

Fixes #10986

The attributes tpu_local_core_rank and tpu_global_core_rank currently behave identically to the local_rank and global_rank attributes, so they are no longer needed.

It is a breaking change for everyone who was accessing them through the Strategy.
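
For code that read the TPU-specific names, the migration can be sketched as follows. This is a minimal illustration, assuming the generic local_rank and global_rank attributes are available on the strategy as described above. The SimpleNamespace object and the two helper functions are hypothetical stand-ins, not part of the library.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a strategy object after this change:
# only the generic rank attributes are present.
strategy = SimpleNamespace(local_rank=1, global_rank=5)


def is_global_zero(strategy) -> bool:
    # Before this change, user code might have read strategy.tpu_global_core_rank.
    return strategy.global_rank == 0


def describe_process(strategy) -> str:
    # Before this change: strategy.tpu_local_core_rank / strategy.tpu_global_core_rank.
    return f"local_rank={strategy.local_rank}, global_rank={strategy.global_rank}"


print(describe_process(strategy))
```

Since both pairs of attributes are described as behaving identically, switching names should not change the values user code sees.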

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the Review guidelines. In short, see the following bullet list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

I made sure I had fun coding 🙃

Part of #1 (it's a lie, this is just here to avoid noisy GitHub bot)

cc @Borda @carmocca @JackCaoG @Liyang90 @gkroiz @justusschock @awaelchli @kaushikb11 @rohitgr7 @akihironitta

@awaelchli awaelchli added the "breaking change" label Jan 4, 2022
@awaelchli awaelchli marked this pull request as ready for review January 4, 2022 09:36
Review thread on CHANGELOG.md (outdated, resolved)
@mergify mergify bot added the "ready" label Jan 4, 2022
Review thread on pytorch_lightning/strategies/tpu_spawn.py (outdated, resolved)
Review thread on pytorch_lightning/strategies/tpu_spawn.py (outdated, resolved)
Contributor

@akihironitta akihironitta left a comment

> It is a breaking change for everyone who was accessing them through the Strategy.

@awaelchli Are we removing these attributes without any deprecation? To me, it looks quite inconsistent that these are removed instantly in v1.6 while SingleTPUPlugin will be removed in v1.8.

Contributor Author

awaelchli commented Jan 4, 2022

> @awaelchli Are we removing these attributes without any deprecation?

Yes. I will add the deprecation if reviewers specifically request it.

> To me, it looks quite inconsistent that these are removed instantly in v1.6 while SingleTPUPlugin will be removed in v1.8.

Yes, it is inconsistent, because not all the changes we want to make for 1.6 can be made backward compatible or deprecation-friendly.
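
For reference, a deprecation path of the kind asked about here could look roughly like the sketch below. It is not what this PR implements. StrategySketch is a hypothetical stand-in class, and the warning messages are assumptions made for illustration.

```python
import warnings


class StrategySketch:
    """Hypothetical stand-in; not the actual TPU spawn strategy class."""

    def __init__(self, local_rank: int, global_rank: int) -> None:
        self.local_rank = local_rank
        self.global_rank = global_rank

    @property
    def tpu_local_core_rank(self) -> int:
        # Old name kept as a thin alias that warns and forwards to the generic attribute.
        warnings.warn(
            "tpu_local_core_rank is deprecated, use local_rank instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.local_rank

    @property
    def tpu_global_core_rank(self) -> int:
        warnings.warn(
            "tpu_global_core_rank is deprecated, use global_rank instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.global_rank
```

With such aliases, existing code would keep working for a release cycle while emitting a DeprecationWarning on each access to the old names.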

@awaelchli awaelchli marked this pull request as draft January 5, 2022 11:11
@awaelchli awaelchli changed the base branch from master to feature/xla_environment May 15, 2022 00:54
Contributor

@akihironitta akihironitta left a comment

LGTM!

Base automatically changed from feature/xla_environment to master June 22, 2022 08:57
@awaelchli awaelchli marked this pull request as ready for review June 22, 2022 13:13
@mergify mergify bot added the "ready" label Jun 22, 2022
@awaelchli
Contributor Author

Waiting for TPU CI to be reactivated

@awaelchli awaelchli self-assigned this Jun 22, 2022
@awaelchli awaelchli marked this pull request as draft June 22, 2022 15:04
@awaelchli awaelchli marked this pull request as ready for review July 11, 2022 18:26
@awaelchli awaelchli requested a review from kaushikb11 July 12, 2022 11:41
@mergify mergify bot added the "has conflicts" label and removed the "ready" label Jul 13, 2022
@mergify mergify bot added the "ready" label and removed the "has conflicts" and "ready" labels Jul 14, 2022
@awaelchli awaelchli enabled auto-merge (squash) July 14, 2022 13:40
@awaelchli awaelchli merged commit bb5e8be into master Jul 14, 2022
@awaelchli awaelchli deleted the refactor/tpu-rank branch July 14, 2022 15:43
@awaelchli awaelchli added the "strategy: ddp" label and removed the "strategy: ddp spawn" label Nov 4, 2023
Labels
accelerator: tpu (Tensor Processing Unit), breaking change (Includes a breaking change), ready (PRs ready to be merged), refactor, strategy: ddp (DistributedDataParallel)
Development

Successfully merging this pull request may close these issues.

Property TPUSpawnPlugin.tpu_global_core_rank may be replaced by global rank property
7 participants