I am trying to train a controller using the PPO2 algorithm. The action space for my problem consists of two continuous actions and one discrete action. I tried using a Tuple action space (similar to the examples on the Gym website), but PPO2 raises a not-implemented error (I also tried TRPO, with the same result). As a workaround, I defined the action space as a Box with 3 actions and, before stepping the environment, I threshold the third value: if it is below the threshold I set the action to 0, otherwise to 1. However, this simplification makes it hard for the controller to learn the task. Is there a way to use Tuple action spaces, or do you have ideas from similar problems?
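For readers who want to see the workaround concretely, here is a minimal sketch of the thresholding step described above. The function name `split_flat_action` and the default threshold of 0.0 are assumptions for illustration; they are not part of stable-baselines or Gym, and the real threshold depends on the bounds of the Box.

```python
from typing import Sequence, Tuple


def split_flat_action(
    action: Sequence[float], threshold: float = 0.0
) -> Tuple[Tuple[float, float], int]:
    """Split a flat 3-value action into two continuous values and one binary choice.

    Hypothetical helper: the first two entries are passed through unchanged,
    the third is mapped to 0 if it is below `threshold`, otherwise to 1.
    """
    if len(action) != 3:
        raise ValueError("expected exactly 3 action values")
    continuous = (float(action[0]), float(action[1]))
    discrete = 0 if action[2] < threshold else 1
    return continuous, discrete


# Example: call this at the top of the environment's step() before using the action.
continuous_part, discrete_part = split_flat_action([0.5, -0.2, -0.1])
```

One likely reason this is hard to learn from is that the policy only sees a hard cut on a continuous output, so many different values of the third component produce exactly the same behaviour.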
sahilgupta2105 changed the title from "Tuple action space with stable baselines PPO2 [question]" to "Tuple action space with stable baselines PPO2" on Dec 1, 2018
sahilgupta2105 changed the title from "Tuple action space with stable baselines PPO2" to "Tuple action space with stable baselines PPO2 [question]" on Dec 1, 2018
Thanks for the prompt reply. I saw the comment. So I am guessing that implementing a probability distribution for a Tuple space will suffice. I will update you if I manage to implement it.
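To make the idea of a probability distribution over a Tuple space more concrete, here is a rough sketch of the joint log-probability such a distribution would need. It assumes the continuous part is a diagonal Gaussian, the discrete part is a categorical, and the two are independent given the observation, so their log-probabilities add. All function names here are hypothetical; this is plain Python for illustration, not the stable-baselines distribution interface.

```python
import math
from typing import Sequence


def gaussian_log_prob(
    x: Sequence[float], mean: Sequence[float], log_std: Sequence[float]
) -> float:
    """Log-density of a diagonal Gaussian, summed over dimensions."""
    total = 0.0
    for xi, mu, ls in zip(x, mean, log_std):
        std = math.exp(ls)
        total += -0.5 * ((xi - mu) / std) ** 2 - ls - 0.5 * math.log(2.0 * math.pi)
    return total


def categorical_log_prob(index: int, logits: Sequence[float]) -> float:
    """Log-probability of one category under a softmax over logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[index] - log_norm


def tuple_log_prob(
    continuous: Sequence[float],
    discrete: int,
    mean: Sequence[float],
    log_std: Sequence[float],
    logits: Sequence[float],
) -> float:
    """Joint log-probability, assuming the two parts are independent."""
    return gaussian_log_prob(continuous, mean, log_std) + categorical_log_prob(
        discrete, logits
    )
```

Under the same independence assumption, the entropy and KL divergence of the joint distribution would also be sums of the per-component terms, which is what a PPO-style loss needs in addition to the log-probability.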