Add `strategy` argument to Trainer #8597

kaushikb11 · 2021-07-28T12:16:28Z

What does this PR do?

Supports #6090
Related Issue #9053

strategy argument supports passing training type aliases (ddp, ddp_spawn), TrainingTypeRegistry plugins ("ddp_spawn_find_unused_parameters_false") and custom plugin objects (DDPPlugin())

At the moment, there’s a single flag accelerator tied for Accelerators as well as Training Type plugins. We wish to have them decoupled!

trainer = Trainer(accelerator=GPUaccelerator(..))
trainer = Trainer(accelerator='ddp_spawn')

Alternate flags to set Training Types

accelerator
- type: Optional[Union[str, Accelerator]] = None
- Supports training types and Accelerator Objects
distributed_backend
- type: Optional[str] = None
- Deprecated, should use accelerator instead
plugins
- type: Optional[Union[List[Union[Plugin, ClusterEnvironment, str]], Plugin, ClusterEnvironment, str]] = None
- Supports custom lightning plugins & environment

What's the difference between passing training type to accelerator, distributed_backend, or plugins?

accelerator and distributed_backend only support DistributedType, whereas plugins support Custom Training Types.

Exceptions:

Trainer(distributed_backend="ddp_cpu", strategy="ddp_spawn")
Trainer(accelerator="ddp", strategy="ddp_spawn")
Trainer(plugins="ddp_find_unused_parameters_false", strategy="ddp_spawn")

Deprecations: (Deprecated in v1.5 & will be removed in v1.6)

Passing training type to accelerator flag
Passing training type to plugins flag

Does your PR introduce any breaking changes? If yes, please list them.

Before submitting

Was this discussed/approved via a GitHub issue? (not for typos and docs)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you list all the breaking changes introduced by this pull request?
Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

codecov · 2021-07-28T12:17:39Z

Codecov Report

Merging #8597 (dfecb4f) into master (28fc8d2) will decrease coverage by 4%.
The diff coverage is 100%.

@@           Coverage Diff           @@
##           master   #8597    +/-   ##
=======================================
- Coverage      93%     89%    -4%     
=======================================
  Files         178     178            
  Lines       15668   15695    +27     
=======================================
- Hits        14526   13943   -583     
- Misses       1142    1752   +610

ananthsub · 2021-07-30T07:42:40Z

since training type plugins are themselves in beta, i have a naming question: training type isn't only for training, but also other stages like evaluation and prediction. people could be confused why the plugin name references training if it also applies during these other situations. with that in mind, is there another name we should formalize his under? i fully acknowledge renaming existing training type plugins would be super annoying, but it'll be much harder to change once this is on the trainer constructor

justusschock · 2021-07-30T09:05:00Z

@ananthsub I fully agree. This comes back from when we introduced it. Back then there was mainly training and validation (which was considered to be only a part of training). How would you call it though? Some kind of strategy_plugin? But strategy for what? Precision is also some kind of strategy. TBH, I initially came up with the name, because I couldn't find anything better and needed something to prototype this... And somehow we kept it :D

for more information, see https://pre-commit.ci

tchaton

LGMT !