
[feature request] Add Multi-Processing / VecEnvironment support for SAC and DDPG #170

Closed
Sohojoe opened this issue Jan 21, 2019 · 3 comments
Labels: enhancement (New feature or request), v3 (Discussion about V3)

Comments

Sohojoe commented Jan 21, 2019

Thanks for adding SAC! I've been able to get it running in
https://github.com/Sohojoe/MarathonEnvsBaselines

The sample efficiency looks promising; however, the wall clock training time is poor compared to PPO2 due to the lack of support for Multi-Processing / VecEnvironment.

Unity / ML-Agents really shines when using multiple instances of an agent; this is a much better approach than MPI or asking the researcher to manage threads. Training with 16 agents gives a 10x speed increase over a single agent, and I've pushed some environments to 64–100 agents.

It would also be great to have VecEnvironment support for DDPG for techniques that require offline learning.

Would it be relatively easy to add proper VecEnvironment support to SAC and DDPG? I would be happy to spend some cycles on this.
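
For context, here is a rough sketch of the gap (the env id and hyperparameters are just placeholders): PPO2 accepts a SubprocVecEnv, so experience is collected from many workers at once, while SAC currently takes a single environment.

```python
import gym

from stable_baselines import PPO2, SAC
from stable_baselines.common.vec_env import SubprocVecEnv

# PPO2 collects experience from many environments in parallel.
make_env = lambda: gym.make('Pendulum-v0')  # placeholder env id
vec_env = SubprocVecEnv([make_env for _ in range(16)])
ppo_model = PPO2('MlpPolicy', vec_env, verbose=1)
ppo_model.learn(total_timesteps=100000)

# SAC only accepts a single (non-vectorized) environment, so data
# collection is limited to one simulator instance.
sac_model = SAC('MlpPolicy', make_env(), verbose=1)
sac_model.learn(total_timesteps=100000)
```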

araffin added the enhancement label Jan 21, 2019
araffin (Collaborator) commented Jan 21, 2019

Hello,

> The sample efficiency looks promising; however, the wall clock training time is poor compared to PPO2 due to the lack of support for Multi-Processing / VecEnvironment.

Yes, SAC was designed to be applied on real robots, where multiprocessing is not really possible.

> It would also be great to have VecEnvironment support for DDPG for techniques that require offline learning.
> Would it be relatively easy to add proper VecEnvironment support to SAC and DDPG? I would be happy to spend some cycles on this.

I think the current DDPG version supports MPI (but I have not tried it yet). Also, multiprocessing training would change the original algorithms, so I would be careful doing so.
In fact, I think this is called D4PG, and it was a research paper in itself (you should also look at Distributed Prioritized Experience Replay). As SAC is quite new, I'm not aware of any distributed version yet, but maybe the previous techniques can be adapted.

I did not have the time to take a deeper look at those papers, so I don't know how easy it would be to implement...
Personally, I would be interested in having at least one off-policy learning method that supports multiprocessing.
Pinging @hill-a and @erniejunior ;)
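
To make the idea concrete, here is a minimal, hypothetical sketch (not existing stable-baselines code; all names are made up) of the distributed-replay pattern those papers use: several actors only push transitions into a shared replay buffer, while a single learner samples mini-batches from it for the usual off-policy updates.

```python
import random
from collections import deque

class ReplayBuffer:
    """Shared buffer: actors push transitions, the learner samples mini-batches."""
    def __init__(self, capacity=1000000):
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size):
        return random.sample(self.storage, batch_size)

def actor_rollout(env, policy, buffer, n_steps):
    """One actor: step its own copy of the env with the current policy and store data."""
    obs = env.reset()
    for _ in range(n_steps):
        action = policy(obs)
        next_obs, reward, done, _ = env.step(action)
        buffer.add((obs, action, reward, next_obs, done))
        obs = env.reset() if done else next_obs

# Hypothetical learner loop: alternate between collecting from every actor
# (run in parallel processes in D4PG / Ape-X) and doing the standard
# SAC/DDPG gradient updates on mini-batches sampled from the shared buffer.
# for iteration in range(n_iterations):
#     for env in envs:
#         actor_rollout(env, current_policy, buffer, n_steps=100)
#     for _ in range(gradient_steps):
#         batch = buffer.sample(batch_size=256)
#         update_actor_and_critic(batch)
```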

araffin (Collaborator) commented Jan 22, 2019

Also related: rail-berkeley/softlearning#8

araffin (Collaborator) commented Oct 24, 2020

Closing this in favor of DLR-RM/stable-baselines3#179.

araffin closed this as completed Oct 24, 2020