intuition on the loss function #55

ss555 · 2023-06-20T18:17:12Z

I am running the HalfCheetah env using MBPO (SAC + 3 NNs for the dynamics model), and my training loss increases along with the reward.
I don't have the intuition to interpret this.
Why would the training loss of model-based policy optimization increase?
I can share my wandb logs.

Thanks
