From 5f34f2b049f70de805b472dad13afc0fdea9bc45 Mon Sep 17 00:00:00 2001
From: edenlightning <66261195+edenlightning@users.noreply.github.com>
Date: Fri, 11 Dec 2020 20:42:04 -0500
Subject: [PATCH] Update installation instructions for FairScale (#5099)

Co-authored-by: Jirka Borovec
---
 docs/source/multi_gpu.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/multi_gpu.rst b/docs/source/multi_gpu.rst
index def47810504d6..b3e0b905f27f4 100644
--- a/docs/source/multi_gpu.rst
+++ b/docs/source/multi_gpu.rst
@@ -663,7 +663,7 @@ It is highly recommended to use Sharded Training in multi-GPU environments where
 A technical note: as batch size scales, storing activations for the backwards pass becomes the bottleneck in training. As a result, sharding optimizer state and gradients becomes less impactful.
 Work within the future will bring optional sharding to activations and model parameters to reduce memory further, but come with a speed cost.

-To use Sharded Training, you need to first install FairScale using the command below or install all extras using ``pip install pytorch-lightning["extra"]``.
+To use Sharded Training, you need to first install FairScale using the command below.

 .. code-block:: bash
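For context, a minimal sketch of how the Sharded Training feature this docs section describes is enabled from a training script, assuming the Lightning 1.1-era API where sharding is requested via the ``plugins="ddp_sharded"`` Trainer argument and FairScale is installed from PyPI with ``pip install fairscale``; the exact install command in the (truncated) ``code-block`` above may differ, and ``TinyModel`` is a hypothetical placeholder module.

.. code-block:: python

    import torch
    from torch.utils.data import DataLoader, TensorDataset
    import pytorch_lightning as pl


    class TinyModel(pl.LightningModule):
        """Hypothetical minimal LightningModule, used only to illustrate the Trainer flags."""

        def __init__(self):
            super().__init__()
            self.layer = torch.nn.Linear(32, 2)

        def training_step(self, batch, batch_idx):
            x, y = batch
            return torch.nn.functional.cross_entropy(self.layer(x), y)

        def configure_optimizers(self):
            return torch.optim.SGD(self.parameters(), lr=0.1)


    # Toy dataset so the script is self-contained.
    train_loader = DataLoader(
        TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,))),
        batch_size=8,
    )

    trainer = pl.Trainer(
        gpus=2,
        accelerator="ddp",      # multi-GPU DDP, the setting Sharded Training targets
        plugins="ddp_sharded",  # FairScale-based sharding of optimizer state and gradients (assumed flag name)
    )
    trainer.fit(TinyModel(), train_loader)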