Skip to content

Conversation

@kamil-kaczmarek
Copy link
Contributor

@kamil-kaczmarek kamil-kaczmarek commented Aug 13, 2025

Why are these changes needed?

Fix failing release test: learning_tests_multi_agent_cartpole_appo_multi_gpu.

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
@gemini-code-assist
Copy link
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@kamil-kaczmarek kamil-kaczmarek self-assigned this Aug 13, 2025
@kamil-kaczmarek kamil-kaczmarek marked this pull request as ready for review August 13, 2025 06:01
@kamil-kaczmarek kamil-kaczmarek requested a review from a team as a code owner August 13, 2025 06:01
Copy link
Contributor

@simonsays1980 simonsays1980 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR @kamil-kaczmarek! Thanks a lot for working through this complex setup. I left some comments and request another debug iteration to understand better why this happens now and the intended process is somehow interrupted.

)
# In case all environments had been terminated `to_module` will be
# empty and no actions are needed b/c we reset all environemnts.
# empty and no actions are needed b/c we reset all environments.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why this happens now that the to_module is None. Can we debug another round and see where this happens? Then check the autoreset and the connector run (I know this is complex).

What should happen is: env resets automatically; init obs goes through the to_module connector pipeline and produces to_module which can in turn passed through the module and the to_env pipeline to produce an action.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me investigate this more. This first happened last Thursday in the release tests. Will look into the code diff.

@sven1977 sven1977 changed the title Fix failing env step in MultiAgentEnvRunner [RLlib] Fix failing env step in MultiAgentEnvRunner. Aug 13, 2025
@ray-gardener ray-gardener bot added rllib RLlib related issues release-test release test labels Aug 14, 2025
@github-actions
Copy link

github-actions bot commented Sep 4, 2025

This pull request has been automatically marked as stale because it has not had
any activity for 14 days. It will be closed in another 14 days if no further activity occurs.
Thank you for your contributions.

You can always ask for help on our discussion forum or Ray's public slack channel.

If you'd like to keep this open, just leave any comment, and the stale label will be removed.

@github-actions github-actions bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Sep 4, 2025
@kamil-kaczmarek
Copy link
Contributor Author

kamil-kaczmarek commented Sep 4, 2025

unstale

@github-actions github-actions bot added unstale A PR that has been marked unstale. It will not get marked stale again if this label is on it. and removed stale The issue is stale. It will be closed within 7 days unless there are further conversation labels Sep 4, 2025
@simonsays1980 simonsays1980 added the go add ONLY when ready to merge, run all tests label Sep 11, 2025
kamil-kaczmarek and others added 11 commits September 16, 2025 17:32
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Copy link
Contributor

@simonsays1980 simonsays1980 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Nice patch @kamil-kaczmarek ! Thanks for the work!

@simonsays1980 simonsays1980 enabled auto-merge (squash) September 17, 2025 11:02
@github-actions github-actions bot disabled auto-merge September 17, 2025 18:08
@simonsays1980 simonsays1980 merged commit 29bc824 into master Sep 18, 2025
5 checks passed
@simonsays1980 simonsays1980 deleted the kk/fix-failing-env-step branch September 18, 2025 09:37
zma2 pushed a commit to zma2/ray that referenced this pull request Sep 23, 2025
…5567)

## Why are these changes needed?

Fix failing release test:
`learning_tests_multi_agent_cartpole_appo_multi_gpu`.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [x] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Zhiqiang Ma <[email protected]>
ZacAttack pushed a commit to ZacAttack/ray that referenced this pull request Sep 24, 2025
…5567)

## Why are these changes needed?

Fix failing release test:
`learning_tests_multi_agent_cartpole_appo_multi_gpu`.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [x] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: zac <[email protected]>
elliot-barn pushed a commit that referenced this pull request Sep 24, 2025
## Why are these changes needed?

Fix failing release test:
`learning_tests_multi_agent_cartpole_appo_multi_gpu`.

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [x] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: elliot-barn <[email protected]>
marcostephan pushed a commit to marcostephan/ray that referenced this pull request Sep 24, 2025
…5567)

## Why are these changes needed?

Fix failing release test:
`learning_tests_multi_agent_cartpole_appo_multi_gpu`.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [x] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Marco Stephan <[email protected]>
elliot-barn pushed a commit that referenced this pull request Sep 27, 2025
## Why are these changes needed?

Fix failing release test:
`learning_tests_multi_agent_cartpole_appo_multi_gpu`.

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [x] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: elliot-barn <[email protected]>
dstrodtman pushed a commit to dstrodtman/ray that referenced this pull request Oct 6, 2025
…5567)

## Why are these changes needed?

Fix failing release test:
`learning_tests_multi_agent_cartpole_appo_multi_gpu`.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [x] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Douglas Strodtman <[email protected]>
justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
…5567)

## Why are these changes needed?

Fix failing release test:
`learning_tests_multi_agent_cartpole_appo_multi_gpu`.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [x] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…5567)

## Why are these changes needed?

Fix failing release test:
`learning_tests_multi_agent_cartpole_appo_multi_gpu`.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [x] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
ArturNiederfahrenhorst added a commit that referenced this pull request Nov 25, 2025
## Why are these changes needed?

We introduced an improved error message when environments fail in
#55567.
At the same time, this bypasses the silencing of env step errors.
This PR consolidates the messages.

---------

Co-authored-by: Kamil Kaczmarek <[email protected]>
SheldonTsen pushed a commit to SheldonTsen/ray that referenced this pull request Dec 1, 2025
## Why are these changes needed?

We introduced an improved error message when environments fail in
ray-project#55567.
At the same time, this bypasses the silencing of env step errors.
This PR consolidates the messages.

---------

Co-authored-by: Kamil Kaczmarek <[email protected]>
Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025
…5567)

## Why are these changes needed?

Fix failing release test:
`learning_tests_multi_agent_cartpole_appo_multi_gpu`.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [x] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Kamil Kaczmarek <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

go add ONLY when ready to merge, run all tests release-test release test rllib RLlib related issues unstale A PR that has been marked unstale. It will not get marked stale again if this label is on it.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants