Multiple GPU Support #26
With commit 9012399 we get a deadlock because the learners are initialized synchronously, resulting in learner 0 waiting for the DDP initialization of the following learners. Log output
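To make the failure mode concrete, here is a minimal sketch (not the project's actual code) of why a synchronous setup deadlocks: `torch.distributed.init_process_group` blocks until every rank in the group has joined, so if the learners are created one after another on a single thread, learner 0 never returns from the rendezvous. Spawning each learner in its own process before any of them initializes the process group avoids this.

```python
import torch.distributed as dist
import torch.multiprocessing as mp

def learner_main(rank: int, world_size: int):
    # init_process_group blocks until all `world_size` ranks have called it.
    # If the learners were created sequentially in one loop, rank 0 would
    # block here while the remaining ranks are never started -> deadlock.
    dist.init_process_group(
        backend="gloo",  # "nccl" on GPU nodes
        init_method="tcp://127.0.0.1:29500",
        rank=rank,
        world_size=world_size,
    )
    print(f"learner {rank} joined the process group")
    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = 2
    # Start all learner processes first, then let them rendezvous concurrently.
    mp.spawn(learner_main, args=(world_size,), nprocs=world_size, join=True)
```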
Currently our models are all initialized randomly. As stated here, torch DDP broadcasts the model state from rank 0 to all other processes at construction. We need a method that initializes the models in a deterministic way (because we don't want full models to be sent from one node to another). Methods that initialize models deterministically are, for example:
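One common way to get such deterministic initialization (an assumption here, not necessarily the method the issue had in mind) is to seed every relevant RNG with the same value on all ranks before building the model, so each learner constructs bit-identical weights locally and nothing has to be broadcast:

```python
import random
import numpy as np
import torch
from torch import nn

def build_model_deterministically(seed: int = 42) -> nn.Module:
    # Seed every RNG that can influence weight initialization, so each
    # learner process constructs identical parameters on its own and no
    # full model has to be sent from one node to another.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

    # Placeholder for the actual policy network.
    model = nn.Sequential(
        nn.Linear(64, 128),
        nn.ReLU(),
        nn.Linear(128, 8),
    )
    return model
```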
Description
We follow this idea and spawn one learner on each available GPU. Then we use Torch DDP to average the gradients like we already do when using multiple nodes.
For starters, we are not using policy workers on multiple GPUs, as suggested here:
The only advantage of this would be to save some memory (and maybe some time transferring the gradients through DDP) by not storing the model multiple times per node.
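A rough sketch of the described setup, under the assumption of one learner process per visible GPU, the NCCL backend, and DDP averaging gradients during `backward()`; the model and training loop are placeholders, not the project's real learner:

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP

def learner(rank: int, world_size: int):
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    # One process per GPU; all learners join the same process group.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    model = torch.nn.Linear(64, 8).to(rank)        # placeholder policy model
    # DDP all-reduces (averages) gradients across learners during backward(),
    # the same mechanism already used across multiple nodes.
    ddp_model = DDP(model, device_ids=[rank])
    optimizer = torch.optim.Adam(ddp_model.parameters(), lr=1e-4)

    for _ in range(10):                            # placeholder training loop
        batch = torch.randn(32, 64, device=rank)
        loss = ddp_model(batch).pow(2).mean()      # placeholder loss
        optimizer.zero_grad()
        loss.backward()                            # gradients averaged here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = torch.cuda.device_count()         # one learner per available GPU
    mp.spawn(learner, args=(world_size,), nprocs=world_size, join=True)
```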
Tasks
- `num_policies` learner workers, one learner worker on each GPU
- `policy_workers_per_policy` * `num_policies` policy workers on each GPU