Renamed the DDPSpawnPlugin to DDPSpawnStrategy #11145

Merged · 10 commits · Dec 21, 2021
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -145,8 +145,12 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- DeepSpeed does not require lightning module zero 3 partitioning ([#10655](https://github.com/PyTorchLightning/pytorch-lightning/pull/10655))


- Renamed the `DDPSpawnPlugin` to `DDPSpawnStrategy` ([#11145](https://github.com/PyTorchLightning/pytorch-lightning/pull/11145))


- Renamed the `DDPFullyShardedPlugin` to `DDPFullyShardedStrategy` ([#11143](https://github.com/PyTorchLightning/pytorch-lightning/pull/11143))


### Deprecated

- Deprecated `ClusterEnvironment.master_{address,port}` in favor of `ClusterEnvironment.main_{address,port}` ([#10103](https://github.com/PyTorchLightning/pytorch-lightning/issues/10103))
2 changes: 1 addition & 1 deletion docs/source/advanced/training_tricks.rst
@@ -178,7 +178,7 @@ For example, when training Graph Neural Networks, a common strategy is to load t

A simple way to prevent redundant dataset replicas is to rely on :obj:`torch.multiprocessing` to share the `data automatically between spawned processes via shared memory <https://pytorch.org/docs/stable/notes/multiprocessing.html>`_.
For this, all data pre-loading should be done on the main process inside :meth:`DataModule.__init__`.
-As a result, all tensor-data will get automatically shared when using the :class:`~pytorch_lightning.plugins.DDPSpawnPlugin` training type plugin:
+As a result, all tensor-data will get automatically shared when using the :class:`~pytorch_lightning.plugins.DDPSpawnStrategy` training type strategy:

.. warning::

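The code example that this sentence introduces in the docs is not shown in the diff. A minimal sketch of the pattern it describes (the tensor sizes and the `SharedTensorDataModule` name are illustrative, not part of the PR):

.. code-block:: python

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    import pytorch_lightning as pl


    class SharedTensorDataModule(pl.LightningDataModule):
        def __init__(self):
            super().__init__()
            # Pre-load everything once on the main process; torch.multiprocessing
            # shares these CPU tensors with the processes spawned by
            # DDPSpawnStrategy via shared memory instead of copying them.
            self.inputs = torch.randn(100_000, 32)
            self.targets = torch.randint(0, 10, (100_000,))

        def train_dataloader(self):
            return DataLoader(TensorDataset(self.inputs, self.targets), batch_size=64)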
2 changes: 1 addition & 1 deletion docs/source/api_references.rst
@@ -154,7 +154,7 @@ Training Type Plugins
DDP2Plugin
DDPShardedPlugin
DDPSpawnShardedPlugin
-DDPSpawnPlugin
+DDPSpawnStrategy
DeepSpeedPlugin
HorovodPlugin
SingleTPUPlugin
2 changes: 1 addition & 1 deletion docs/source/extensions/plugins.rst
@@ -113,7 +113,7 @@ Training Type Plugins
DDP2Plugin
DDPShardedPlugin
DDPSpawnShardedPlugin
-DDPSpawnPlugin
+DDPSpawnStrategy
DeepSpeedPlugin
HorovodPlugin
SingleTPUPlugin
4 changes: 2 additions & 2 deletions docs/source/guides/speed.rst
@@ -95,11 +95,11 @@ This by default comes with a performance hit, and can be disabled in most cases.

.. code-block:: python

-from pytorch_lightning.plugins import DDPSpawnPlugin
+from pytorch_lightning.plugins import DDPSpawnStrategy

trainer = pl.Trainer(
gpus=2,
-strategy=DDPSpawnPlugin(find_unused_parameters=False),
+strategy=DDPSpawnStrategy(find_unused_parameters=False),
)

When using DDP on a multi-node cluster, set NCCL parameters
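For context on why this is usually safe: with `find_unused_parameters=True`, DistributedDataParallel traverses the autograd graph every iteration to detect parameters that received no gradient, which is the performance hit mentioned above. Disabling it assumes every registered parameter contributes to the loss; otherwise DDP raises an error about unused parameters at runtime.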
4 changes: 2 additions & 2 deletions pytorch_lightning/core/lightning.py
@@ -1926,7 +1926,7 @@ def add_to_queue(self, queue: pl.plugins.training_type.ddp_spawn._FakeQueue) ->
queue: the instance of the queue to append the data.

.. deprecated:: v1.5
-This method was deprecated in v1.5 in favor of `DDPSpawnPlugin.add_to_queue`
+This method was deprecated in v1.5 in favor of `DDPSpawnStrategy.add_to_queue`
and will be removed in v1.7.
"""

@@ -1938,7 +1938,7 @@ def get_from_queue(self, queue: pl.plugins.training_type.ddp_spawn._FakeQueue) -
queue: the instance of the queue from where to get the data.

.. deprecated:: v1.5
-This method was deprecated in v1.5 in favor of `DDPSpawnPlugin.get_from_queue`
+This method was deprecated in v1.5 in favor of `DDPSpawnStrategy.get_from_queue`
and will be removed in v1.7.
"""

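To illustrate what this deprecation covers, a hedged sketch of a LightningModule that still overrides the old hooks (the model and the queued payload are illustrative; the queue is assumed to expose the `put`/`get` interface of the `_FakeQueue` referenced in the signatures above). Overrides like these now trigger the deprecation warnings emitted by the configuration validator shown further down in this diff:

.. code-block:: python

    import pytorch_lightning as pl


    class MyModel(pl.LightningModule):
        def add_to_queue(self, queue) -> None:
            # Deprecated since v1.5; the replacement hook lives on DDPSpawnStrategy.
            queue.put({"extra_state": 123})

        def get_from_queue(self, queue) -> None:
            # Deprecated since v1.5; the replacement hook lives on DDPSpawnStrategy.
            self.extra_state = queue.get()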
2 changes: 1 addition & 1 deletion pytorch_lightning/distributed/dist.py
@@ -23,7 +23,7 @@ class LightningDistributed:
"""
.. deprecated:: v1.5
This class is deprecated in v1.5 and will be removed in v1.7.
-The broadcast logic will be moved to the :class:`DDPPlugin` and :class`DDPSpawnPlugin` classes.
+The broadcast logic will be moved to the :class:`DDPPlugin` and :class`DDPSpawnStrategy` classes.
"""

def __init__(self, rank=None, device=None):
6 changes: 3 additions & 3 deletions pytorch_lightning/lite/lite.py
@@ -26,7 +26,7 @@

from pytorch_lightning.accelerators.accelerator import Accelerator
from pytorch_lightning.lite.wrappers import _LiteDataLoader, _LiteModule, _LiteOptimizer
-from pytorch_lightning.plugins import DDPSpawnPlugin, DeepSpeedPlugin, PLUGIN_INPUT, Strategy, TPUSpawnPlugin
+from pytorch_lightning.plugins import DDPSpawnStrategy, DeepSpeedPlugin, PLUGIN_INPUT, Strategy, TPUSpawnPlugin
from pytorch_lightning.plugins.training_type.training_type_plugin import TBroadcast
from pytorch_lightning.trainer.connectors.accelerator_connector import AcceleratorConnector
from pytorch_lightning.utilities import _AcceleratorType, _StrategyType, move_data_to_device
@@ -310,7 +310,7 @@ def to_device(self, obj: Union[nn.Module, Tensor, Any]) -> Union[nn.Module, Tens
"""
if isinstance(obj, nn.Module):
if self.device.type == "cuda":
-# need to call this manually here again in case we spawned with DDPSpawnPlugin
+# need to call this manually here again in case we spawned with DDPSpawnStrategy
# TODO: refactor to let plugin handle this cleanly
torch.cuda.set_device(self.device)
return obj.to(self.device)
@@ -403,7 +403,7 @@ def _run_impl(self, run_method: Callable, *args: Any, **kwargs: Any) -> Any:
# apply sharded context to prevent OOM
run_method = partial(self._run_with_sharded_context, run_method)

-if isinstance(self._strategy, DDPSpawnPlugin):
+if isinstance(self._strategy, DDPSpawnStrategy):
return self._strategy.spawn(run_method, *args, **kwargs)
else:
return run_method(*args, **kwargs)
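From the user's side, this dispatch is what makes a spawn-based Lite run execute the `run` method once per process. A minimal sketch, assuming the v1.5-era `LightningLite` constructor arguments (the subclass name and argument values are illustrative):

.. code-block:: python

    from pytorch_lightning.lite import LightningLite


    class Lite(LightningLite):
        def run(self):
            # With a spawn-based strategy, `_run_impl` hands this method to
            # `DDPSpawnStrategy.spawn`, so it runs once in each spawned process.
            print(f"rank {self.global_rank} running on {self.device}")


    if __name__ == "__main__":
        Lite(accelerator="cpu", strategy="ddp_spawn", devices=2).run()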
10 changes: 5 additions & 5 deletions pytorch_lightning/loops/dataloader/prediction_loop.py
@@ -5,7 +5,7 @@

from pytorch_lightning.loops.dataloader.dataloader_loop import DataLoaderLoop
from pytorch_lightning.loops.epoch.prediction_epoch_loop import PredictionEpochLoop
-from pytorch_lightning.plugins import DDPSpawnPlugin
+from pytorch_lightning.plugins import DDPSpawnStrategy
from pytorch_lightning.utilities.exceptions import MisconfigurationException
from pytorch_lightning.utilities.types import _PREDICT_OUTPUT

@@ -29,14 +29,14 @@ def return_predictions(self) -> bool:

@return_predictions.setter
def return_predictions(self, return_predictions: Optional[bool] = None) -> None:
-# `DDPSpawnPlugin` plugins and derivatives don't support return predictions.
-is_ddp_spawn = isinstance(self.trainer.training_type_plugin, DDPSpawnPlugin)
+# `DDPSpawnStrategy` plugins and derivatives don't support return predictions.
+is_ddp_spawn = isinstance(self.trainer.training_type_plugin, DDPSpawnStrategy)
if return_predictions and is_ddp_spawn:
raise MisconfigurationException(
-"`return_predictions` should be set to `False` when using the `DDPSpawnPlugin` or children class. "
+"`return_predictions` should be set to `False` when using the `DDPSpawnStrategy` or children class. "
f"Found {return_predictions} with training_type_plugin {type(self.trainer.training_type_plugin)}."
)
-# For non `DDPSpawnPlugin` plugin, the `return_predictions` is True by default unless user decide otherwise.
+# For non `DDPSpawnStrategy` plugin, the `return_predictions` is True by default unless user decide otherwise.
self._return_predictions = not is_ddp_spawn if return_predictions is None else return_predictions

@property
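A usage-level illustration of the constraint enforced above, as a sketch only (the model, the random data, and the 2-GPU machine are assumptions, not part of the PR):

.. code-block:: python

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    import pytorch_lightning as pl
    from pytorch_lightning.plugins import DDPSpawnStrategy


    class TinyModel(pl.LightningModule):
        def __init__(self):
            super().__init__()
            self.layer = torch.nn.Linear(8, 1)

        def predict_step(self, batch, batch_idx):
            (x,) = batch
            return self.layer(x)


    if __name__ == "__main__":
        dataloader = DataLoader(TensorDataset(torch.randn(64, 8)), batch_size=16)
        trainer = pl.Trainer(gpus=2, strategy=DDPSpawnStrategy())
        # Predictions cannot be gathered back into the main process with a
        # spawn-based strategy, so `return_predictions` defaults to False here,
        # and forcing it to True raises the MisconfigurationException above.
        trainer.predict(TinyModel(), dataloaders=dataloader, return_predictions=False)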
4 changes: 2 additions & 2 deletions pytorch_lightning/plugins/__init__.py
@@ -21,7 +21,7 @@
from pytorch_lightning.plugins.precision.tpu_bf16 import TPUBf16PrecisionPlugin
from pytorch_lightning.plugins.training_type.ddp import DDPPlugin
from pytorch_lightning.plugins.training_type.ddp2 import DDP2Plugin
-from pytorch_lightning.plugins.training_type.ddp_spawn import DDPSpawnPlugin
+from pytorch_lightning.plugins.training_type.ddp_spawn import DDPSpawnStrategy
from pytorch_lightning.plugins.training_type.deepspeed import DeepSpeedPlugin
from pytorch_lightning.plugins.training_type.dp import DataParallelPlugin
from pytorch_lightning.plugins.training_type.fully_sharded import DDPFullyShardedStrategy
@@ -46,7 +46,7 @@
"DataParallelPlugin",
"DDP2Plugin",
"DDPPlugin",
-"DDPSpawnPlugin",
+"DDPSpawnStrategy",
"DDPFullyShardedStrategy",
"DeepSpeedPlugin",
"DeepSpeedPrecisionPlugin",
2 changes: 1 addition & 1 deletion pytorch_lightning/plugins/training_type/__init__.py
@@ -1,6 +1,6 @@
from pytorch_lightning.plugins.training_type.ddp import DDPPlugin # noqa: F401
from pytorch_lightning.plugins.training_type.ddp2 import DDP2Plugin # noqa: F401
-from pytorch_lightning.plugins.training_type.ddp_spawn import DDPSpawnPlugin # noqa: F401
+from pytorch_lightning.plugins.training_type.ddp_spawn import DDPSpawnStrategy # noqa: F401
from pytorch_lightning.plugins.training_type.deepspeed import DeepSpeedPlugin # noqa: F401
from pytorch_lightning.plugins.training_type.dp import DataParallelPlugin # noqa: F401
from pytorch_lightning.plugins.training_type.fully_sharded import DDPFullyShardedStrategy # noqa: F401
2 changes: 1 addition & 1 deletion pytorch_lightning/plugins/training_type/ddp_spawn.py
@@ -54,7 +54,7 @@
log = logging.getLogger(__name__)


-class DDPSpawnPlugin(ParallelPlugin):
+class DDPSpawnStrategy(ParallelPlugin):
"""Spawns processes using the :func:`torch.multiprocessing.spawn` method and joins processes after training
finishes."""

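For readers tracking the rename, the two equivalent ways to request this strategy after the change (import path as it stands in this PR; the `gpus=2` setting is only an example):

.. code-block:: python

    import pytorch_lightning as pl
    from pytorch_lightning.plugins import DDPSpawnStrategy

    # Either the registered shorthand...
    trainer = pl.Trainer(gpus=2, strategy="ddp_spawn")
    # ...or an explicit instance, e.g. to pass additional DDP arguments.
    trainer = pl.Trainer(gpus=2, strategy=DDPSpawnStrategy(find_unused_parameters=False))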
4 changes: 2 additions & 2 deletions pytorch_lightning/plugins/training_type/sharded_spawn.py
@@ -19,7 +19,7 @@
from torch.optim import Optimizer

import pytorch_lightning as pl
-from pytorch_lightning.plugins.training_type.ddp_spawn import DDPSpawnPlugin
+from pytorch_lightning.plugins.training_type.ddp_spawn import DDPSpawnStrategy
from pytorch_lightning.trainer.states import TrainerFn
from pytorch_lightning.utilities import _FAIRSCALE_AVAILABLE, rank_zero_only
from pytorch_lightning.utilities.enums import _StrategyType
@@ -32,7 +32,7 @@
from pytorch_lightning.overrides.fairscale import LightningShardedDataParallel, unwrap_lightning_module_sharded


-class DDPSpawnShardedPlugin(DDPSpawnPlugin):
+class DDPSpawnShardedPlugin(DDPSpawnStrategy):
"""Optimizer sharded training provided by FairScale."""

distributed_backend = _StrategyType.DDP_SHARDED_SPAWN
6 changes: 3 additions & 3 deletions pytorch_lightning/plugins/training_type/tpu_spawn.py
@@ -26,7 +26,7 @@
from pytorch_lightning.overrides import LightningDistributedModule
from pytorch_lightning.plugins.io.xla_plugin import XLACheckpointIO
from pytorch_lightning.plugins.precision import PrecisionPlugin
-from pytorch_lightning.plugins.training_type.ddp_spawn import _FakeQueue, _SpawnOutput, DDPSpawnPlugin
+from pytorch_lightning.plugins.training_type.ddp_spawn import _FakeQueue, _SpawnOutput, DDPSpawnStrategy
from pytorch_lightning.trainer.connectors.data_connector import DataConnector
from pytorch_lightning.trainer.states import TrainerFn
from pytorch_lightning.utilities import _TPU_AVAILABLE, find_shared_parameters, set_shared_parameters
@@ -48,7 +48,7 @@
xm, xmp, MpDeviceLoader, rendezvous = [None] * 4


-class TPUSpawnPlugin(DDPSpawnPlugin):
+class TPUSpawnPlugin(DDPSpawnStrategy):
"""Plugin for training multiple TPU devices using the :func:`torch.multiprocessing.spawn` method."""

def __init__(
@@ -340,7 +340,7 @@ def should_rank_save_checkpoint(self) -> bool:
def register_plugins(cls, plugin_registry: Dict) -> None:
plugin_registry.register("tpu_spawn_debug", cls, description="TPUSpawn Plugin with `debug` as True", debug=True)

-@DDPSpawnPlugin.checkpoint_io.setter
+@DDPSpawnStrategy.checkpoint_io.setter
def checkpoint_io(self, io: Optional[XLACheckpointIO]) -> None:
if io is not None and not isinstance(io, XLACheckpointIO):
raise MisconfigurationException(f"{self.__class__.__name__}.checkpoint_io` must be a `XLACheckpointIO`.")
4 changes: 2 additions & 2 deletions pytorch_lightning/trainer/configuration_validator.py
@@ -263,12 +263,12 @@ def _check_add_get_queue(model: "pl.LightningModule") -> None:
if is_overridden("add_to_queue", model):
rank_zero_deprecation(
"The `LightningModule.add_to_queue` method was deprecated in v1.5 and will be removed in v1.7 in "
-"favor of `DDPSpawnPlugin.add_to_queue`"
+"favor of `DDPSpawnStrategy.add_to_queue`"
)
if is_overridden("get_from_queue", model):
rank_zero_deprecation(
"The `LightningModule.get_from_queue` method was deprecated in v1.5 and will be removed in v1.7 in "
-"favor of `DDPSpawnPlugin.get_from_queue`"
+"favor of `DDPSpawnStrategy.get_from_queue`"
)


pytorch_lightning/trainer/connectors/accelerator_connector.py
@@ -32,8 +32,8 @@
DDPFullyShardedStrategy,
DDPPlugin,
DDPShardedPlugin,
-DDPSpawnPlugin,
DDPSpawnShardedPlugin,
+DDPSpawnStrategy,
DeepSpeedPlugin,
DeepSpeedPrecisionPlugin,
DoublePrecisionPlugin,
@@ -735,7 +735,7 @@ def select_training_type_plugin(self) -> Strategy:
):
ddp_plugin_cls = DDPPlugin
elif use_ddp_spawn or use_ddp_cpu_spawn:
-ddp_plugin_cls = DDPSpawnPlugin
+ddp_plugin_cls = DDPSpawnStrategy
elif use_ddp_fully_sharded:
ddp_plugin_cls = DDPFullyShardedStrategy
else:
8 changes: 4 additions & 4 deletions pytorch_lightning/trainer/trainer.py
@@ -41,7 +41,7 @@
from pytorch_lightning.loops.utilities import _parse_loop_limits
from pytorch_lightning.plugins import (
ApexMixedPrecisionPlugin,
-DDPSpawnPlugin,
+DDPSpawnStrategy,
NativeMixedPrecisionPlugin,
ParallelPlugin,
PLUGIN_INPUT,
@@ -669,7 +669,7 @@ def _call_and_handle_interrupt(self, trainer_fn: Callable, *args: Any, **kwargs:
**kwargs: keyword arguments to be passed to `trainer_fn`
"""
try:
-if isinstance(self.training_type_plugin, DDPSpawnPlugin):
+if isinstance(self.training_type_plugin, DDPSpawnStrategy):
spawn_output: _SpawnOutput = self.training_type_plugin.spawn(trainer_fn, *args, **kwargs)
self.training_type_plugin._recover_results_in_main_process(spawn_output, self)
return spawn_output.trainer_results
@@ -1179,7 +1179,7 @@ def _run(
self.state.status = TrainerStatus.FINISHED
self.state.stage = None

-if isinstance(self.training_type_plugin, DDPSpawnPlugin):
+if isinstance(self.training_type_plugin, DDPSpawnStrategy):
results = self.training_type_plugin._collect_rank_zero_results(self, results)

return results
@@ -1416,7 +1416,7 @@ def _handle_meta_model(self) -> None:
if not is_on_meta_device(self.lightning_module):
return

-if isinstance(self.training_type_plugin, DDPSpawnPlugin):
+if isinstance(self.training_type_plugin, DDPSpawnStrategy):
raise MisconfigurationException("LightningModule on meta device isn't supported with spawn.")

materialize_module(self.lightning_module)