recurrent policy implementation in ppo [feature-request] #18

pushkalkatara · 2020-05-12T09:09:44Z

Hi, is a CNN-LSTM-based policy implementation planned anytime soon for PPO?

araffin · 2020-05-12T09:19:38Z

Hello,
Please take a look at the roadmap: #1

It is planned, but for v1.1+ (so not for at least one or two months). In the meantime, you can always use frame-stacking if you need to account for history (most of the time it yields competitive results).

This feature will need extra care: it may complicate the codebase, it is a feature users want, and it is also an open research question.
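For readers unfamiliar with the frame-stacking workaround mentioned above, here is a minimal, library-free sketch of the idea. SB3 ships its own `VecFrameStack` wrapper for this; the `FrameStacker` class below is a hypothetical illustration, not that implementation. Filling the window with the first observation at reset is an assumption of this sketch.

```python
from collections import deque
from typing import Deque, List, Sequence


class FrameStacker:
    """Hypothetical helper: keep the last `n_stack` observations as one flat observation."""

    def __init__(self, n_stack: int) -> None:
        if n_stack < 1:
            raise ValueError("n_stack must be >= 1")
        self.n_stack = n_stack
        self.frames: Deque[List[float]] = deque(maxlen=n_stack)

    def reset(self, first_obs: Sequence[float]) -> List[float]:
        # Assumption of this sketch: at episode start, fill the whole window
        # with the first observation so the output length never changes.
        self.frames.clear()
        for _ in range(self.n_stack):
            self.frames.append(list(first_obs))
        return self.stacked()

    def step(self, obs: Sequence[float]) -> List[float]:
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(list(obs))
        return self.stacked()

    def stacked(self) -> List[float]:
        # Oldest frame first, newest frame last.
        return [x for frame in self.frames for x in frame]


stacker = FrameStacker(n_stack=3)
print(stacker.reset([0.0, 1.0]))
print(stacker.step([2.0, 3.0]))
```

The policy then receives the stacked observation in place of the single frame, so a feed-forward network can see a short, fixed-length slice of history.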

araffin · 2021-05-10T15:13:06Z

Related #160

zhihanyang2022 · 2021-11-14T08:53:01Z

@pushkalkatara Having worked on getting recurrent networks to work with DDPG, TD3 and SAC (https://arxiv.org/pdf/2110.12628.pdf), I think one important question is: do you want to use recurrence to take into account (1) the entire history or (2) just a short window of it? As araffin mentioned, if your problem is (2), then frame-stacking would be the easier option. In fact, there isn't a simple solution that allows both (1) and (2) to be implemented together, so before diving into coding we should reflect on our actual needs :D.
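To make the distinction between (1) and (2) concrete, here is a toy sketch (not taken from the linked paper). The functions `window_view` and `recurrent_summary` are hypothetical names, and the "recurrent" update is hand-written rather than learned. It only shows that a fixed window forgets anything older than its length, while a carried state can, in principle, depend on the whole history.

```python
from collections import deque
from typing import List


def window_view(observations: List[int], k: int) -> List[int]:
    """What a policy with k stacked frames sees at the final step."""
    window = deque(maxlen=k)
    for obs in observations:
        window.append(obs)
    return list(window)


def recurrent_summary(observations: List[int]) -> int:
    """Hand-written stand-in for a recurrent state: it latches the last non-zero cue.

    A real LSTM would have to learn such an update; this is only an illustration.
    """
    state = 0
    for obs in observations:
        if obs != 0:
            state = obs
    return state


# The cue `7` is shown only at the first step, followed by nine blank steps.
episode = [7] + [0] * 9

print(7 in window_view(episode, k=4))   # has the cue survived a 4-step window?
print(7 in window_view(episode, k=10))  # a window as long as the episode
print(recurrent_summary(episode))       # state carried across all steps
```

If the information needed always lies within the last few steps, the window is enough. If it can lie arbitrarily far back, the window would have to grow with the episode length, which is where a recurrent policy becomes attractive.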

araffin · 2021-11-25T16:20:39Z

I have a very experimental version of recurrent PPO in an SB3 contrib branch, based on the SB2/cleanRL implementations: Stable-Baselines-Team/stable-baselines3-contrib#53

Use it at your own risk :p
(I will try to continue to work on it but help is welcome too)

araffin added the enhancement (New feature or request) label May 12, 2020

Miffyli added this to the v1.2 milestone Jun 15, 2020

Miffyli mentioned this issue Feb 17, 2021

getting MlpLstmPolicy working with PPO DQN or A2C?? #319

Closed

araffin added the help wanted (Help from contributors is welcomed) label Mar 18, 2021

Miffyli mentioned this issue Aug 22, 2021

Recurrent layer support for Online policy algorithms #550

Closed

araffin mentioned this issue Oct 1, 2021

[Feature Request] Recurrent Policies for A2C #592

Closed

araffin pinned this issue Nov 4, 2021

araffin mentioned this issue Nov 4, 2021

Roadmap to Stable-Baselines3 V1.0 #1

Closed

42 tasks

araffin mentioned this issue Nov 25, 2021

[feature request] LSTM policies with custom feature extractors #160

Closed

araffin mentioned this issue Nov 26, 2021

Fix evaluation script for recurrent policies #678

Merged

14 tasks

araffin mentioned this issue Dec 7, 2021

[question] How to get the model architecture when using recurrent policy? hill-a/stable-baselines#1145

Closed

araffin mentioned this issue Apr 12, 2022

Recurrent PPO Stable-Baselines-Team/stable-baselines3-contrib#53

Merged

20 tasks

araffin closed this as completed in Stable-Baselines-Team/stable-baselines3-contrib#53 May 30, 2022

araffin unpinned this issue May 30, 2022

araffin mentioned this issue Feb 6, 2025

[Feature Request] LSTM policies #2081

Closed

2 tasks
