Merged
Changes from 5 commits
3 changes: 3 additions & 0 deletions deepspeed/runtime/zero/partition_parameters.py
@@ -279,6 +279,9 @@ def __init__(self,
For example, if a node has 1TB of memory and 8 GPUs, we could fit a trillion
parameter model with 4 nodes and 32 GPUs.

Important: If the fp16 weights of the model cannot fit in the memory of a
single GPU, this feature must be used.

.. note::
    Initializes ``torch.distributed`` if it has not already been initialized.
    See :meth:`deepspeed.init_distributed` for more information.
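The condition in the added "Important" sentence can be checked with simple arithmetic. The following is a minimal sketch, not part of the PR: the helper names `fp16_weight_bytes` and `fits_on_single_gpu` are hypothetical, and the 32 GB GPU size is an assumed figure chosen only for illustration. It counts fp16 weights alone (2 bytes per parameter) and ignores gradients, optimizer state, and activations.

```python
# Hypothetical helpers for illustration; they are not DeepSpeed APIs.

BYTES_PER_FP16_PARAM = 2  # an fp16 value occupies 2 bytes


def fp16_weight_bytes(num_params: int) -> int:
    """Return the bytes needed to store the model weights in fp16."""
    return num_params * BYTES_PER_FP16_PARAM


def fits_on_single_gpu(num_params: int, gpu_memory_bytes: int) -> bool:
    """Return True if the fp16 weights alone fit in one GPU's memory."""
    return fp16_weight_bytes(num_params) <= gpu_memory_bytes


GB = 10**9
TB = 10**12

trillion_params = 10**12
assumed_gpu_memory = 32 * GB  # assumption: a 32 GB GPU, for illustration only

# fp16 weights of a trillion-parameter model, in TB
weights_tb = fp16_weight_bytes(trillion_params) / TB

# Aggregate CPU memory in the docstring's example: 4 nodes with 1 TB each
aggregate_cpu_memory = 4 * 1 * TB

print(f"fp16 weights: {weights_tb} TB")
print("fits on one GPU:", fits_on_single_gpu(trillion_params, assumed_gpu_memory))
print("fits in aggregate CPU memory:",
      fp16_weight_bytes(trillion_params) <= aggregate_cpu_memory)
```

Under these assumptions the trillion-parameter weights do not fit on a single GPU, which is the case where the docstring says the feature must be used.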