
issue in train_sac.py #14

Open · Astik-2002 opened this issue Jul 19, 2024 · 4 comments

Astik-2002 commented Jul 19, 2024

While training SAC, the following error occurred:

  File "/home/astik/double_pendulum/examples/reinforcement_learning/SAC/train_sac_noisy_env.py", line 357, in <module>
    agent = SAC(
  File "/home/astik/anaconda3/envs/drones/lib/python3.10/site-packages/stable_baselines3/sac/sac.py", line 106, in __init__
    super(SAC, self).__init__(
  File "/home/astik/anaconda3/envs/drones/lib/python3.10/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 107, in __init__
    super(OffPolicyAlgorithm, self).__init__(
  File "/home/astik/anaconda3/envs/drones/lib/python3.10/site-packages/stable_baselines3/common/base_class.py", line 171, in __init__
    assert isinstance(self.action_space, supported_action_spaces), (
AssertionError: The algorithm only supports <class 'gym.spaces.box.Box'> as action spaces but Box(-1.0, 1.0, (1,), float32) was provided```
Astik-2002 (Author) commented Jul 19, 2024

Also, DQN was unable to stabilize the trajectory after training for 100 epochs. Is that expected? The final output of the DQN training is attached below.

video.mp4

fwiebe (Member) commented Jul 19, 2024

Hi @Astik-2002 ,
Thanks for raising the issue.
Regarding your first comment: that error occurs because the SAC example uses StableBaselines3, which expects the gym library, while the training environment was migrated to the more modern gymnasium. I will check if/how StableBaselines3 can be used with gymnasium; a sketch of one likely route is below.
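
For reference, a minimal sketch of that route (an assumption about the fix, not the repository's confirmed solution): stable-baselines3 >= 2.0 targets gymnasium natively, so upgrading SB3 and building SAC directly on a gymnasium environment avoids the gym/gymnasium space mismatch. Pendulum-v1 stands in for the actual double-pendulum training environment here:

```python
# Assumes: pip install "stable_baselines3>=2.0" gymnasium
import gymnasium as gym
from stable_baselines3 import SAC

# Pendulum-v1 is a stand-in for the repository's double-pendulum environment.
env = gym.make("Pendulum-v1")
# With SB3 >= 2.0 this is a gymnasium Box, which SAC accepts directly.
assert isinstance(env.action_space, gym.spaces.Box)

agent = SAC("MlpPolicy", env, verbose=1)
agent.learn(total_timesteps=1_000)
```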
Regarding your second comment: yes, that is expected. The DQN implementation discretizes the state space and produces only a subpar policy.
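
To illustrate what a discretization for DQN looks like in practice, here is a hedged sketch of binning the continuous torque command, since SB3's DQN only supports Discrete action spaces; the wrapper name and bin count are illustrative, not the repository's code:

```python
import gymnasium as gym
import numpy as np
from stable_baselines3 import DQN


class DiscretizeActions(gym.ActionWrapper):
    """Map a Discrete index onto a fixed grid of continuous torques."""

    def __init__(self, env, n_bins=9):
        super().__init__(env)
        # One torque per bin, evenly spaced over the continuous action range.
        self._torques = np.linspace(env.action_space.low, env.action_space.high, n_bins)
        self.action_space = gym.spaces.Discrete(n_bins)

    def action(self, act):
        return self._torques[act]


# Pendulum-v1 again stands in for the double-pendulum environment.
env = DiscretizeActions(gym.make("Pendulum-v1"), n_bins=9)
agent = DQN("MlpPolicy", env, verbose=1)
agent.learn(total_timesteps=1_000)
```

A coarse grid like this inevitably throws away control resolution, which is one reason a discretizing DQN ends up with only a subpar policy on this task.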

Astik-2002 (Author) commented
Thanks for the comment. I'm having trouble understanding how the competition is going to be judged. Most controllers from the literature used to stabilize the acrobot or pendubot are already implemented in the realistic examples. Since they require significant parameter optimization, can a working controller from the examples be submitted to the leaderboard, even if it's not a new controller?

fwiebe (Member) commented Jul 23, 2024

Hi @Astik-2002 , yes, it is allowed to submit a modified controller from the leaderboard with tuned parameters, a different filtering method, etc.
