Add gymnasium support for DQN #370
One small comment but otherwise LGTM. Feel free to start the RLops process.
@vwxyzjn Line 208 of the JAX Atari script and line 180 of the JAX classic control script use np rather than jnp:
https://github.com/vwxyzjn/cleanrl/blob/599f9adfec89d63721578b08b75ec38ab0209372/cleanrl/dqn_jax.py#L180
I'm guessing this is a simple mistake (it shouldn't affect performance). Can we change it to jnp?
The error occurs because Stable Baselines3 ==2 is required.
No sign of regression, as shown in the PR description. Merging now.
Hi @vwxyzjn @charraut, I'm wondering what part of this change forced us to add the following line: Vectorization was a useful feature earlier. Thank you!
@ronuchit this is because SB3's replay buffer doesn't support
I believe it does, actually: https://github.com/DLR-RM/stable-baselines3/blame/master/stable_baselines3/common/buffers.py#L162 We would just need to pass in
I see. That's interesting. Would you be interested in making a PR that optionally supports num_envs>1?
Sure, done: #395
Description
This PR updates the following DQN files to use the latest version of gymnasium in place of gym:
dqn.py
dqn_jax.py
dqn_atari.py
dqn_atari_jax.py
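For readers unfamiliar with the migration, the main API difference is that gymnasium's reset returns (observation, info) and step returns five values, with the old done flag split into terminated and truncated. The sketch below illustrates that shape with a hypothetical ToyEnv stand-in (not a real gymnasium environment and not code from this PR), so it runs without any dependencies. The td_target helper and its choice to bootstrap on truncation are an assumption about a common convention, not a statement of what the updated scripts do.

```python
class ToyEnv:
    """Hypothetical stand-in that mimics gymnasium's reset/step signatures."""

    def __init__(self, max_steps=5, goal=3):
        self.max_steps = max_steps
        self.goal = goal
        self.position = 0
        self.steps = 0

    def reset(self, seed=None):
        # gymnasium-style reset returns (observation, info), not just the observation.
        self.position = 0
        self.steps = 0
        return self.position, {}

    def step(self, action):
        # gymnasium-style step returns five values:
        # (observation, reward, terminated, truncated, info).
        self.position += action
        self.steps += 1
        terminated = self.position >= self.goal  # task actually finished
        truncated = (not terminated) and self.steps >= self.max_steps  # time limit hit
        reward = 1.0 if terminated else 0.0
        return self.position, reward, terminated, truncated, {}


def td_target(reward, next_q_max, terminated, gamma=0.99):
    # Assumed convention: stop bootstrapping only on true termination,
    # so a time-limit truncation still uses the next state's value.
    return reward + gamma * next_q_max * (1.0 - float(terminated))


def collect(env, policy):
    """Roll out one episode and return its transitions."""
    obs, _ = env.reset(seed=0)
    transitions = []
    while True:
        action = policy(obs)
        next_obs, reward, terminated, truncated, _ = env.step(action)
        transitions.append((obs, action, reward, next_obs, terminated, truncated))
        if terminated or truncated:  # the old `done` is the OR of the two flags
            break
        obs = next_obs
    return transitions


reached = collect(ToyEnv(), lambda obs: 1)    # always moves forward, reaches the goal
timed_out = collect(ToyEnv(), lambda obs: 0)  # never moves, hits the step limit
```

Here reached ends with terminated set and timed_out ends with truncated set, which is the distinction the old single done flag could not express.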
Types of changes
Checklist:
pre-commit run --all-files passes (required).
mkdocs serve.
If you need to run benchmark experiments for a performance-impacting change:
--capture-video.
python -m openrlbenchmark.rlops.
python -m openrlbenchmark.rlops utility to the documentation.
python -m openrlbenchmark.rlops ....your_args... --report, to the documentation.
Regression report
https://wandb.ai/costa-huang/cleanrl/reports/Regression-Report-dqn_atari_jax--Vmlldzo0MjQ5OTA2
https://wandb.ai/costa-huang/cleanrl/reports/Regression-Report-dqn_jax--Vmlldzo0MjUwMDM1