
Multiple GPU Support #26

Closed
GraV1337y opened this issue Dec 8, 2021 · 2 comments

GraV1337y (Contributor) commented Dec 8, 2021

Description
We follow this idea and spawn one learner on each available GPU. Then we use Torch DDP to average the gradients like we already do when using multiple nodes.

For starters, we are not using policy workers on multiple GPUs, as suggested here:

To take full advantage of this, we also need to support policy workers on multiple GPUs. This requires exchanging the parameter vectors between learner and policy worker through CPU memory, rather than shared GPU memory. This can be a step 1 of the implementation.

The only advantage of this would be to save some memory (and maybe some time transferring the gradients through DDP), by not storing the model multiple times per node.

Tasks

  • Add host list as parameter for multi-sample-factory
  • Initialize num_policies learner workers on each GPU
  • Initialize policy_workers_per_policy * num_policies policy workers on each GPU
  • Initialize one SharedBuffer per GPU
  • Add each GPU separately to the DDP host list
  • Assign actor workers a fixed SharedBuffer to write to
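The layout in the tasks above ultimately reduces to DDP-style gradient averaging across the per-GPU learners. A minimal pure-Python sketch of the all-reduce-mean step (the function name is hypothetical, not MSF code; DDP performs the same reduction with NCCL all-reduce):

```python
def allreduce_mean_gradients(grads_per_learner):
    """Average gradients element-wise across learners (one per GPU).

    grads_per_learner: list of gradient vectors (lists of floats),
    one entry per learner process. This only illustrates the math
    that DDP's all-reduce performs on the actual gradient tensors.
    """
    num_learners = len(grads_per_learner)
    return [sum(column) / num_learners
            for column in zip(*grads_per_learner)]
```

For example, two learners with gradients [1.0, 2.0] and [3.0, 4.0] yield the averaged gradient [2.0, 3.0], which every learner then applies identically.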
@GraV1337y GraV1337y added the enhancement New feature or request label Dec 8, 2021
KonstantinRamthun (Member) commented
With commit 9012399 we get a deadlock, because the learners are initialized synchronously: learner 0 waits for the DDP initialization of the following learners, which have not been started yet.

Log output

[2021-12-14 10:52:34,672][17487] Default env families supported: ['doom_*', 'atari_*', 'dmlab_*', 'mujoco_*', 'MiniGrid*', 'unity_*']
[2021-12-14 10:52:34,671][32665] Default env families supported: ['doom_*', 'atari_*', 'dmlab_*', 'mujoco_*', 'MiniGrid*', 'unity_*']
[2021-12-14 10:52:35,366][17487] Env registry entry created: unity_
[2021-12-14 10:52:35,366][32665] Env registry entry created: unity_
[2021-12-14 10:52:35,470][17487] Saved parameter configuration for experiment saving_training_iss26 not found!
[2021-12-14 10:52:35,470][17487] Starting experiment from scratch!
[2021-12-14 10:52:35,470][32665] Saved parameter configuration for experiment saving_training_iss26 not found!
[2021-12-14 10:52:35,470][32665] Starting experiment from scratch!
[2021-12-14 10:52:37,505][32665] Queried available GPUs: 0,1
[2021-12-14 10:52:37,505][17487] Queried available GPUs: 0,1
[INFO] Connected to Unity environment with package version 2.0.0-pre.3 and communication version 1.5.0
[INFO] Connected to Unity environment with package version 2.0.0-pre.3 and communication version 1.5.0
[INFO] Connected new brain: GoalKeeping?team=0
[INFO] Connected new brain: GoalKeeping?team=0
[WARNING] The environment contains multiple observations. You must define allow_multiple_obs=True to receive them all. Otherwise, only the first visual observation (or vector observation if there are no visual observations) will be provi$
[WARNING] The environment contains multiple observations. You must define allow_multiple_obs=True to receive them all. Otherwise, only the first visual observation (or vector observation if there are no visual observations) will be provi$
/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/site-packages/gym/logger.py:34: UserWarning: WARN: Box bound precision lowered by casting to float32
  warnings.warn(colorize("%s: %s" % ("WARN", msg % args), "yellow"))
/work/grudelpg/envs/multi-sample-factory-env/lib/python3.9/site-packages/gym/logger.py:34: UserWarning: WARN: Box bound precision lowered by casting to float32
  warnings.warn(colorize("%s: %s" % ("WARN", msg % args), "yellow"))
[2021-12-14 10:52:43,084][17487] Using a total of 240 trajectory buffers
[2021-12-14 10:52:43,085][17487] Allocating shared memory for trajectories
[2021-12-14 10:52:43,085][32665] Using a total of 240 trajectory buffers
[2021-12-14 10:52:43,085][32665] Allocating shared memory for trajectories
[2021-12-14 10:52:44,738][32665] Initializing learners...
[2021-12-14 10:52:44,739][32665] Initializing the learner 0 for policy 0
[2021-12-14 10:52:44,752][00322] Set environment var CUDA_VISIBLE_DEVICES to '0' for learner process 0
[2021-12-14 10:52:44,770][17487] Initializing learners...
[2021-12-14 10:52:44,771][17487] Initializing the learner 0 for policy 0
[2021-12-14 10:52:44,783][17610] Set environment var CUDA_VISIBLE_DEVICES to '0' for learner process 0
[2021-12-14 10:52:44,790][00322] Visible devices: 1
[2021-12-14 10:52:44,793][00322] Starting seed is not provided
[2021-12-14 10:52:44,820][17610] Visible devices: 1
[2021-12-14 10:52:44,822][17610] Starting seed is not provided
[2021-12-14 10:52:45,871][00322] Waiting for the learner to initialize...
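This deadlock is the classic rendezvous problem: DDP's process-group initialization blocks until every rank has joined, so initializing learner 0 and waiting on it before launching the remaining learners can never complete. A toy illustration of the fix using a threading.Barrier as a stand-in for the DDP rendezvous (all names here are hypothetical, not MSF code):

```python
import threading

def init_learners(num_learners):
    """Launch all learner threads before waiting on any of them.

    The Barrier stands in for torch.distributed.init_process_group,
    which blocks until every rank has called it. The buggy pattern
    (start learner 0, wait for it, then start learner 1) would hang
    forever at the first rendezvous.wait().
    """
    rendezvous = threading.Barrier(num_learners)
    ready = []

    def learner(rank):
        rendezvous.wait(timeout=5)  # every rank must arrive here
        ready.append(rank)

    threads = [threading.Thread(target=learner, args=(r,))
               for r in range(num_learners)]
    for t in threads:  # launch everything first...
        t.start()
    for t in threads:  # ...only then wait for initialization
        t.join()
    return sorted(ready)
```

Applied to the learners, this means spawning all learner processes asynchronously and only afterwards waiting for their DDP initialization to finish.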

KonstantinRamthun (Member) commented

Currently our models are all initialized randomly. As stated here, torch DDP

performs an all-reduce step on gradients and assumes that they will be modified by the optimizer in all processes in the same way.

We need a method that initializes the models deterministically, because we don't want full models to be sent from one node to another. Deterministic initializers include e.g. torch.nn.init.constant_, torch.nn.init.ones_, torch.nn.init.zeros_, and torch.nn.init.eye_. To fix this, we could do one of the following:

  1. Add an additional option for the --policy_initialization parameter that initializes models deterministically. The disadvantage: users of MSF must set this option by hand when using multiple nodes, which is easy to forget.
  2. Check in this method whether --num_policies is greater than 1 and --with_pbt is false. If so, apply deterministic initialization automatically.
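Option 2 could look roughly like the following check (a sketch only; the argument names mirror the existing CLI flags, but the helper itself is hypothetical):

```python
def resolve_policy_initialization(policy_initialization, num_policies, with_pbt):
    """Force a deterministic initialization scheme when gradients will be
    averaged across processes via DDP, so all replicas start from identical
    weights without broadcasting the model between nodes.

    Mirrors option 2 above: override the user's choice only when
    num_policies > 1 and PBT is disabled.
    """
    if num_policies > 1 and not with_pbt:
        # e.g. backed by torch.nn.init.constant_ / zeros_ on the model
        return "deterministic"
    return policy_initialization
```

Users would then never need to remember to set the option by hand, which avoids the drawback of option 1.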
