
BC: Problems with Actor-Critic (PPO) getting action probabilities #8

Closed
benspoek opened this issue Feb 22, 2022 · 1 comment
Labels
bug Something isn't working

Comments

@benspoek

I am implementing the Stable Baselines3 "Pretraining with Behavior Cloning" example for a PPO agent with a discrete action space. However, I cannot retrieve the logits with the method proposed in the code

latent_pi, _, _ = model._get_latent(data)
logits = model.action_net(latent_pi)
action_prediction = logits

due to an AttributeError: 'ActorCriticPolicy' object has no attribute '_get_latent'. How can I work around that? Is there another way to get the action probabilities?

@araffin araffin added the bug Something isn't working label Feb 22, 2022
@araffin
Member

araffin commented Feb 22, 2022

Hello,
the PPO code was updated, but apparently the BC example was not...
The PPO policy now has a get_distribution() method from which you should be able to extract the logits ;)
see https://github.com/DLR-RM/stable-baselines3/blob/52c29dc497fa2eb235d0476b067bed8ac488fe64/stable_baselines3/common/policies.py#L650

A PR that solves this issue is welcome ;)
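
For readers hitting the same error, here is a minimal sketch of the workaround suggested above. It assumes a discrete action space, so that get_distribution() returns SB3's CategoricalDistribution, whose .distribution attribute wraps a torch.distributions.Categorical exposing .logits and .probs. The helper name discrete_action_logits_and_probs is hypothetical and not part of Stable Baselines3.

```python
def discrete_action_logits_and_probs(policy, obs_tensor):
    """Return (logits, probs) for a discrete-action actor-critic policy.

    Assumptions (not confirmed in the thread):
      * `policy` is an ActorCriticPolicy, e.g. the `.policy` of a PPO model.
      * `obs_tensor` is a batch of observations as a tensor on the
        policy's device.
      * The action space is discrete, so the wrapped distribution is a
        torch.distributions.Categorical.
    """
    # Replaces the removed `_get_latent` + `action_net` two-step call.
    action_dist = policy.get_distribution(obs_tensor)

    # SB3 wraps the underlying torch distribution in `.distribution`.
    categorical = action_dist.distribution

    # Categorical normalizes its logits, so `.logits` holds log-probabilities
    # and `.probs` the matching probabilities (each row sums to 1).
    return categorical.logits, categorical.probs
```

In the snippet from the question, where model is the ActorCriticPolicy and data is the observation batch, this could be called as logits, probs = discrete_action_logits_and_probs(model, data), with action_prediction = logits as before. Gradients still flow through the returned tensors, so they should remain usable in a behavior cloning loss.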
