Fix rllib related documentation and examples. #2132

Merged (4 commits) on Jan 4, 2024

Gamenot (Collaborator) commented Jan 4, 2024

See changelog.

Gamenot (Collaborator, Author) commented Jan 4, 2024

Failing docs test is expected because {ppo_example|ppo_pbt_example}.py do not exist yet on master.

Gamenot merged commit 94a0bcf into master on Jan 4, 2024
25 of 26 checks passed
Gamenot deleted the tucker/bugfix-ray_examples branch on January 4, 2024, 20:55
@@ -6,17 +6,17 @@ RLlib

**RLlib** is an open-source library for reinforcement learning that offers both high scalability and a unified API for a variety of applications. ``RLlib`` natively supports ``TensorFlow``, ``TensorFlow Eager``, and ``PyTorch``. Most of its internals are agnostic to such deep learning frameworks.

-SMARTS contains two examples using `Policy Gradients (PG) <https://docs.ray.io/en/latest/rllib-algorithms.html#policy-gradients-pg>`_.
+SMARTS contains two examples using `Proximal Policy Optimization (PPO) <https://docs.ray.io/en/latest/rllib/rllib-algorithms.html#ppo>`_.

#. Policy gradient
Member

PPO

Collaborator Author

I pushed a fix to these on master.


#. Policy gradient

- + script: :examples:`e12_rllib/pg_example.py`
+ + script: :examples:`e12_rllib/ppo_example.py`
+ Shows the basics of using RLlib with SMARTS through :class:`~smarts.env.rllib_hiway_env.RLlibHiWayEnv`.

#. Policy gradient with population based training
Member

PPO with population based training
