-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add bubble_env
contribution
#1537
Conversation
acdb102
to
5f88b32
Compare
try: | ||
self._data = copy.deepcopy(dict(**kwargs)) | ||
except RecursionError: | ||
self._data = copy.copy(dict(**kwargs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deepcopy
was used to prevent external users from modifying some of the infos
objects later using additional env wrappers.
How does RecursionError
occur and would a shallow copy be a sufficient tamper protection here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am unsure how the RecursionError
occurred but some part of the SMARTS Observation
type causes it. I intend to fix this during the competition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Adaickalavan To be clear, I think this is fine to update at a slightly later date because it is only useful to us.
@@ -28,6 +28,6 @@ def step( | |||
obs, reward, done, info = self.env.step(action) | |||
|
|||
for agent_id in info.keys(): | |||
info[agent_id]["is_success"] = bool(info[agent_id]["score"]) | |||
info[agent_id]["is_success"] = bool(info[agent_id].get("score", True)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The is_success
is set here to enable the SB3 library to count the number of successfully completed episodes and to print it during evaluation. An episode is considered to be successfully completed if the agent reaches the goal, i.e., obs.events.reached_goal=True
.
Why is the score
being set to True
by default here, as this is misleading?
@@ -2,3 +2,4 @@ phase: track1 | |||
eval_episodes: 50 | |||
seed: 42 | |||
scenarios: [] | |||
bubble_env_evaluation_seeds: [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If bubble_env_evaluation_seeds
is an empty list, then the additional bubble_env will not be run during Codalab evaluation (i.e., not included in competition scoring and ranking). It only runs during local evaluation by participants as the default config sets bubble_env_evaluation_seeds=[6]
.
Was this the intended outcome?
@@ -15,6 +15,7 @@ | |||
"eval_episodes", | |||
"seed", | |||
"scenarios", | |||
"bubble_env_evaluation_seeds", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using short variable names. Here, we could use bubble_env_seeds
instead of bubble_env_evaluation_seeds
.
No description provided.