[Feature Request] Implementation of Lattice exploration (Chiappa et al., NeurIPS 2023) #1829

albertochiappa · 2024-02-09T19:33:18Z

🚀 Feature

I propose adding to Stable Baselines 3 an option to use Lattice exploration, an action noise that some colleagues and I presented at NeurIPS last year (Chiappa et al., 2023). Lattice injects noise into the policy network before the last dense layer, which makes the action distribution a multivariate Gaussian with a full covariance matrix. It can improve the performance of SAC and PPO in high-dimensional environments with many actuators. In particular, we have been using it successfully in the musculoskeletal simulation library MyoSuite, where we benchmarked it together with recurrent PPO and obtained good results:

We also tested it with SAC in the common PyBullet locomotion environments, where it is especially competitive in Humanoid:

It also powered our winning solution to the NeurIPS MyoChallenge 2023.
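
As a rough illustration of why noise injected before the last dense layer yields a full covariance matrix in action space, here is a minimal NumPy sketch. It is a simplification under stated assumptions, not the code in the fork and not the exact parameterization from the paper: the last layer is taken to be a plain linear map `a = W z + b`, the latent noise is independent Gaussian with per-dimension standard deviation `latent_std`, and the names `latent_noise_action_cov` and `sample_actions` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def latent_noise_action_cov(W, latent_std, action_std=None):
    """Covariance of a = W @ (z + eps_z) + b + eps_a.

    eps_z ~ N(0, diag(latent_std**2)) is the noise added before the last
    layer; eps_a ~ N(0, diag(action_std**2)) is optional independent noise
    added directly to the action. Both are assumptions of this sketch.
    """
    W = np.asarray(W, dtype=float)
    latent_var = np.asarray(latent_std, dtype=float) ** 2
    cov = W @ np.diag(latent_var) @ W.T  # generally has off-diagonal terms
    if action_std is not None:
        cov = cov + np.diag(np.asarray(action_std, dtype=float) ** 2)
    return cov


def sample_actions(W, b, z, latent_std, n_samples, rng):
    """Sample actions by perturbing the latent state z, then applying the last layer."""
    eps = rng.normal(size=(n_samples, z.shape[0])) * latent_std
    return (z + eps) @ W.T + b


rng = np.random.default_rng(0)
latent_dim, action_dim = 8, 3
W = rng.normal(size=(action_dim, latent_dim))  # stand-in for last-layer weights
b = np.zeros(action_dim)
z = rng.normal(size=latent_dim)                # stand-in for the latent state
latent_std = np.full(latent_dim, 0.5)

analytic_cov = latent_noise_action_cov(W, latent_std)
samples = sample_actions(W, b, z, latent_std, 200_000, rng)
empirical_cov = np.cov(samples, rowvar=False)
```

Because the same latent perturbation is shared by every action dimension through `W`, the action components end up correlated, whereas noise added independently to each action dimension would only give a diagonal covariance.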

Motivation

It would be easier for SB3 users to test Lattice in their environment of choice if it were part of the library, rather than having to install a separate package or download another repository. The change does not break any current behavior of the library, as the feature is purely additive.

Pitch

I have tried my best to integrate Lattice into SB3 while modifying the codebase as little as possible. In the branch feature/lattice of this fork of SB3, I have implemented Lattice for SAC and PPO. It can be used by setting the argument "use_lattice=True" and passing additional hyperparameters in a dictionary called "lattice_kwargs". It seems to work correctly when called from the configuration files of SB3 zoo. I would invite an SB3 developer to check whether the integration I propose follows the library's guidelines and spirit. If you have no major concerns, I would be happy to prepare a pull request!

Alternatives

Alternatively, Lattice could become part of the contrib repository of SB3. However, I don't see a way to implement it this way without creating entirely new algorithms (e.g., LatticePPO, LatticeSAC, …), which is, in my opinion, excessive, given that relatively limited changes have to be implemented in the original algorithms to enable this option.

Additional context

No response

Checklist

I have checked that there is no similar issue in the repo
If I'm requesting a new feature, I have proposed alternatives

AlexEMG · 2024-02-12T19:59:27Z

Thanks for considering the feature proposal. We're looking forward to hearing your feedback!

araffin · 2024-02-13T09:50:05Z

Hello,
thanks for the proposal.
I still need to read the paper, but in any case I would welcome a PR for the project section of the documentation =) (combining Lattice exploration and MyoChallenge)

araffin · 2024-02-15T16:07:33Z

> I still need to read the paper, but in any case I would welcome a PR for the project section of the documentation =) (combining Lattice exploration and MyoChallenge)

I have read the paper, but I still need some time to process the content (I'll probably be back with some questions). In the meantime, I would be happy to receive a PR that updates the project section.

albertochiappa added the enhancement New feature or request label Feb 9, 2024
