I greatly appreciate your contribution! The library helps a lot, especially given how scarce open-source PSRO code for large games is.
There seems to be a bug in the rollout module of PSRO, specifically at the following location:
malib/malib/rollout/inference/ray/server.py, line 115 at commit ea37d5d
It appears that the policy of an agent is sampled every timestep, whereas I believe it should be sampled every episode.
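To illustrate the difference I mean, here is a minimal sketch. It is not MALib's actual code: `sample_policy_id`, `run_episode`, the `meta_strategy` dictionary, and the `resample_every_step` flag are hypothetical names used only for illustration. The expected behaviour is that a policy is drawn from the meta-strategy once at the start of an episode and then held fixed, while the behaviour I think I am seeing corresponds to `resample_every_step=True`.

```python
import random


def sample_policy_id(meta_strategy, rng):
    """Draw one policy id from a {policy_id: probability} meta-strategy."""
    ids = list(meta_strategy)
    weights = [meta_strategy[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]


def run_episode(meta_strategy, num_steps, rng, resample_every_step=False):
    """Return the policy id that acts at each timestep of one episode."""
    # Expected behaviour: sample once, at the start of the episode.
    policy_id = sample_policy_id(meta_strategy, rng)
    used = []
    for _ in range(num_steps):
        if resample_every_step:
            # Suspected behaviour: a fresh draw at every timestep.
            policy_id = sample_policy_id(meta_strategy, rng)
        used.append(policy_id)
    return used


rng = random.Random(0)
# Hypothetical meta-strategy over three policies in the population.
meta_strategy = {"policy_0": 0.5, "policy_1": 0.3, "policy_2": 0.2}

per_episode = run_episode(meta_strategy, 20, rng)
per_step = run_episode(meta_strategy, 20, rng, resample_every_step=True)

print("distinct policies, per-episode sampling:", len(set(per_episode)))
print("distinct policies, per-timestep sampling:", len(set(per_step)))
```

With per-episode sampling, a single policy acts for the whole trajectory. With per-timestep sampling, the trajectory can mix actions from several policies, so it is no longer a rollout of any single policy in the population.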
By the way, the code combines Ray with multi-threading, which makes it difficult to debug (naively calling ray.init(local_mode=True) doesn't work). Could you suggest a cleaner way to debug it?