Fixes Termination Manager logging to report aggregated percentage of environments done due to each term. #3107

ooctipus · 2025-08-07T06:02:56Z

Description

Currently, the Termination Manager writes the current step's done count for each term whenever a reset is detected. This leads to two problems:

1. Users see different counts just by varying num_envs.
2. The count can be an over-count or an under-count depending on when the reset happens, as pointed out in [Bug Report] Termination Overcounting Caused by Missing Log Buffer Reset in manager_based_rl_env.py #2977 (Thanks, @Kyu3224)

The bug arises because we report the current step's status into a buffer that is supposed to record episodic dones. So instead of overwriting the entire buffer based on the current values, the update now preserves the old values of non-resetting environments, and instead of reporting a count, we report the percentage of environments that were done due to each particular term.
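To make the idea above concrete, here is a minimal sketch of the bookkeeping, not the code from this PR. Plain Python lists stand in for the boolean tensor of shape (num_envs, num_terms), and the function names `update_last_episode_dones` and `term_fractions` are hypothetical. Only rows of environments that terminated in the current step are overwritten; every other environment keeps the cause recorded for its last episode.

```python
def update_last_episode_dones(last_dones, step_dones):
    """Overwrite a row only if that environment terminated this step."""
    for env_id, row in enumerate(step_dones):
        if any(row):  # at least one term fired for this environment
            last_dones[env_id] = list(row)
    return last_dones


def term_fractions(last_dones, term_names):
    """Fraction of all environments whose last episode ended due to each term."""
    num_envs = len(last_dones)
    return {
        name: sum(row[i] for row in last_dones) / num_envs
        for i, name in enumerate(term_names)
    }


term_names = ["time_out", "base_contact"]
last_dones = [[False, False] for _ in range(4)]

# Step 1: env 0 times out, env 2 terminates on base contact.
update_last_episode_dones(
    last_dones,
    [[True, False], [False, False], [False, True], [False, False]],
)
# Step 2: only env 1 terminates; envs 0 and 2 keep their recorded cause.
update_last_episode_dones(
    last_dones,
    [[False, False], [False, True], [False, False], [False, False]],
)

fractions = term_fractions(last_dones, term_names)
print(fractions)
```

Because the result is a fraction of all environments rather than a count, it stays comparable when num_envs changes.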

Tested on Isaac-Velocity-Rough-Anymal-C-v0

Before fix:

Red: num_envs = 4096, Orange: num_envs = 1024

After fix:

Red: num_envs = 4096, Orange: num_envs = 1024

Note that curves of the same color ran on the same seed and matched exactly; the only difference is how the termination data is reported. The percentage version conveys much more clearly how the agent currently fails and what percentage of agents fail, and it shows that increasing num_envs to 4096 helps the agent learn to avoid termination by base_contact much more quickly than num_envs=1024. That message is hard to read from the first image.

Checklist

- [x] I have run the pre-commit checks with ./isaaclab.sh --format
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] I have updated the changelog and the corresponding version in the extension's config/extension.toml file
- [x] I have added my name to the CONTRIBUTORS.md or my name already exists there

Mayankm96 · 2025-08-07T08:05:37Z

source/isaaclab/isaaclab/managers/termination_manager.py

            The corresponding termination term value. Shape is (num_envs,).
        """
-        return self._term_dones[name]
+        return self._term_dones[name, self._term_names.index(name)]


The indexing here is not great. We use the value from here to give termination rewards in different environments. Doing the .index lookup every time for reward computation may lead to slowdowns (as it searches over the list in O(n) fashion).

ooctipus · 2025-08-07

Ahh, if you like the single-tensor approach, I can also store a name -> idx dict and make the lookup here O(1). I wasn't aware of termination rewards (my bad!!) and thought this function might be used infrequently. I should do a global search next time rather than assume!
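A minimal sketch of the name -> idx lookup discussed here, assuming a fixed list of term names. The term name `bad_orientation`, the mapping `term_name_to_idx`, and the helper `get_term_column` are hypothetical, and nested lists stand in for the (num_envs, num_terms) tensor. The mapping is built once, so each later lookup is an average O(1) dict access instead of an O(n) `list.index` scan.

```python
term_names = ["time_out", "base_contact", "bad_orientation"]

# Built once at setup time; the terms do not change afterwards.
term_name_to_idx = {name: idx for idx, name in enumerate(term_names)}


def get_term_column(term_dones, name):
    """Return the per-environment done flags for one term."""
    idx = term_name_to_idx[name]  # dict lookup, no linear search
    return [row[idx] for row in term_dones]


# Two environments, three terms.
term_dones = [
    [False, True, False],
    [True, False, False],
]
base_contact_dones = get_term_column(term_dones, "base_contact")
print(base_contact_dones)
```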

Mayankm96 · 2025-08-07T08:06:07Z

source/isaaclab/isaaclab/managers/termination_manager.py

-        self._term_dones = dict()
-        for term_name in self._term_names:
-            self._term_dones[term_name] = torch.zeros(self.num_envs, device=self.device, dtype=torch.bool)
+        self._term_dones = torch.zeros((self.num_envs, len(self._term_names)), device=self.device, dtype=torch.bool)


Is there a reason to change this from a dict of tensors to a single tensor?

ooctipus · 2025-08-07

I thought the operation last_episode_done_stats = self._term_dones.float().mean(dim=0) was very nice and optimized, which is why I did it, but if you think the dict is clearer I can revert : ))

Mayankm96 · 2025-08-07T08:08:36Z

source/isaaclab/isaaclab/managers/termination_manager.py

+        for i, key in enumerate(self._term_names):
            # store information
-            extras["Episode_Termination/" + key] = torch.count_nonzero(self._term_dones[key][env_ids]).item()
+            extras["Episode_Termination/" + key] = last_episode_done_stats[i].item()


If you want the ratio, isn't that simply:

self._term_dones[key][env_ids].sum() / len(env_ids)

ooctipus · 2025-08-07

Thanks for reviewing!
I guess computing it with env_ids would give the ratio among the resetting environments only. I thought reporting stats over all environments might be nicer, since the user can verify from the graph that all terms sum up to 1. Of course, you can do self._term_dones[key].sum() / self.env.num_envs as well.

But if this approach is what I'm after, doing it in one tensor operation seems quite nice, in terms of both speed and memory use.
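The two ratios in this thread answer different questions, which a small sketch can show. The helper names `ratio_over_resetting` and `ratio_over_all` are hypothetical, and a plain list of booleans stands in for one term's column of the done buffer.

```python
def ratio_over_resetting(term_column, env_ids):
    """Share of the currently resetting environments that ended on this term."""
    return sum(term_column[i] for i in env_ids) / len(env_ids)


def ratio_over_all(term_column):
    """Share of all environments whose recorded episode ended on this term."""
    return sum(term_column) / len(term_column)


# Recorded done flags of one term for four environments.
term_column = [True, False, True, False]
env_ids = [0]  # only environment 0 is resetting right now

resetting_ratio = ratio_over_resetting(term_column, env_ids)
overall_ratio = ratio_over_all(term_column)
print(resetting_ratio, overall_ratio)
```

The first value depends on which environments happen to reset in a given step, while the second is always normalized by the total number of environments.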

github-actions · 2025-08-10T20:11:43Z

Test Results Summary

2 419 tests: 2 011 ✅ passed, 408 💤 skipped, 0 ❌ failed
90 suites, 1 file
Duration: 2h 22m 39s ⏱️

Results for commit a0767c6.

♻️ This comment has been updated with latest results.

ooctipus · 2025-08-13T00:30:48Z

@Mayankm96
I did my best to optimize the operations in compute and reset.
Name indexing changed from O(n) to O(1) throughout; this part is not captured in the benchmark below.

Benchmarked task: velocity rough Anymal C, 4096 envs, 1000 steps

Before change:

one step total: 79.038 ms

| termination.compute     |       0.229 ms|
| termination.reset       |       0.007 ms|

After change:

one step total: 80.187 ms

| termination.compute     |       0.274 ms|  slower due to one extra operation
| termination.reset       |       0.004 ms|  roughly twice as fast

The cost of this PR is about 0.04 ms out of 80 ms per step, or 0.05%, mostly because compute needs to modify the other terms' done values to correctly update the done buffer. I think this is reasonable.
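The overhead estimate can be reproduced from the numbers in the benchmark above. This is only an arithmetic check; the variable names are illustrative.

```python
# Per-step timings in milliseconds, copied from the benchmark above.
before = {"termination.compute": 0.229, "termination.reset": 0.007}
after = {"termination.compute": 0.274, "termination.reset": 0.004}
step_total_after_ms = 80.187

# Net change in termination-manager time per step.
delta_ms = sum(after.values()) - sum(before.values())

# Share of one full environment step.
overhead_pct = 100.0 * delta_ms / step_total_after_ms

print(f"delta: {delta_ms:.3f} ms, overhead: {overhead_pct:.3f} %")
```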


…ng (#3745)

# Description

This PR fixes the issue where get_done_term returned the last-episode value rather than the current-step value. It recognizes that the values used for get_term should be different from those used for logging, and that mixing the two leads to non-intuitive behavior:

- using the per-step value for logging leads to the overcounting and undercounting reported in #2977
- using the last-episode value for get_term leads to the misalignment with expectations reported in #3720

Fixes #2977 #3720

---

The logging behavior remains *mostly* the same as #3107, and this PR also gets rid of the weird overwriting behavior (yay). I get exactly the same termination curve as #3107 when run on `Isaac-Velocity-Rough-Anymal-C-v0`.

Here is a benchmark summary from 1000 steps of `Isaac-Velocity-Rough-Anymal-C-v0` with 4096 envs:

| Version | termination.compute | termination.reset |
|---|---|---|
| Before #3107 | 0.229 ms | 0.007 ms |
| PR #3107 | 0.274 ms | 0.004 ms |
| This PR | 0.258 ms | 0.004 ms |

We actually see an improvement over #3107, because the expensive maintenance of last_episode_value is only done once per compute (#3107 computes last_episode_value for every term).

## Type of change

- Bug fix (non-breaking change which fixes an issue)

## Checklist

- [x] I have read and understood the [contribution guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

---------

Signed-off-by: Kelly Guo <[email protected]>
Co-authored-by: Kelly Guo <[email protected]>
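The separation described in this follow-up can be sketched as two buffers with different update rules. This is an illustration under stated assumptions, not the code from #3745: the class `TerminationBookkeeping` and its attribute and method names are hypothetical, and nested lists stand in for tensors of shape (num_envs, num_terms).

```python
class TerminationBookkeeping:
    """Keeps per-step dones for queries and last-episode dones for logging."""

    def __init__(self, num_envs, term_names):
        self.term_names = list(term_names)
        self.name_to_idx = {n: i for i, n in enumerate(self.term_names)}
        num_terms = len(self.term_names)
        self.step_dones = [[False] * num_terms for _ in range(num_envs)]
        self.last_episode_dones = [[False] * num_terms for _ in range(num_envs)]

    def compute(self, step_dones):
        # The per-step buffer is fully overwritten on every step.
        self.step_dones = [list(row) for row in step_dones]
        # The last-episode buffer changes only for environments that terminated.
        for env_id, row in enumerate(self.step_dones):
            if any(row):
                self.last_episode_dones[env_id] = list(row)

    def get_term(self, name):
        # Queries see the current step, never stale episode values.
        idx = self.name_to_idx[name]
        return [row[idx] for row in self.step_dones]

    def log_fractions(self):
        # Logging sees the cause of each environment's last finished episode.
        num_envs = len(self.last_episode_dones)
        return {
            "Episode_Termination/" + name: sum(row[i] for row in self.last_episode_dones) / num_envs
            for name, i in self.name_to_idx.items()
        }


book = TerminationBookkeeping(num_envs=2, term_names=["time_out", "base_contact"])
book.compute([[False, True], [False, False]])   # env 0 terminates on base_contact
book.compute([[False, False], [False, False]])  # nothing terminates this step

current = book.get_term("base_contact")
logged = book.log_fractions()
print(current, logged)
```

After the second step the per-step query reports no termination, while the logged statistics still reflect that environment 0 last ended on base_contact.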

ooctipus requested review from Mayankm96, jtigue-bdai and kellyguo11 as code owners August 7, 2025 06:02

ooctipus mentioned this pull request Aug 7, 2025

[Bug Report] Termination Overcounting Caused by Missing Log Buffer Reset in manager_based_rl_env.py #2977

Closed

2 tasks

ooctipus force-pushed the fix/termination_reporting branch from 6244bfa to 45ff89d Compare August 7, 2025 07:47

ooctipus changed the title ~~Fixed Termination Manager logging to report aggregated percentage of environments done due to each term.~~ Fixes Termination Manager logging to report aggregated percentage of environments done due to each term. Aug 7, 2025

Mayankm96 reviewed Aug 7, 2025

ooctipus force-pushed the fix/termination_reporting branch from 25d4594 to c220593 Compare August 10, 2025 18:28

ooctipus force-pushed the fix/termination_reporting branch 2 times, most recently from ea5fa9a to 41ccce3 Compare August 11, 2025 08:55

ooctipus added 6 commits August 13, 2025 13:55

fix the per-step termination log on reset to per-episode termination log

c7b9ebd

update change log

fc8c511

make name indexing O(1) operation

842cea0

make performance a bit faster

7ef71cc

make performance a bit faster

183b4aa

pass precommit

a0767c6

ooctipus force-pushed the fix/termination_reporting branch from 5ace435 to a0767c6 Compare August 13, 2025 20:55

ooctipus merged commit 8dabd3f into main Aug 14, 2025
10 checks passed

ooctipus deleted the fix/termination_reporting branch August 14, 2025 01:04

This was referenced Oct 15, 2025

[Bug Report] TerminationManager._term_dones not updated per step (stale values during rollout) #3720

Closed

Separates per-step termination and last-episode termination bookkeeping #3745

Merged


3 participants
