Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RecordingStream context managers are broken when used with generators #6238

Closed
jleibs opened this issue May 6, 2024 · 2 comments · Fixed by #6240
Closed

RecordingStream context managers are broken when used with generators #6238

jleibs opened this issue May 6, 2024 · 2 comments · Fixed by #6240
Assignees
Labels
🪳 bug Something isn't working 🐍 Python API Python logging API
Milestone

Comments

@jleibs
Copy link
Member

jleibs commented May 6, 2024

Problem

Example of problematic code:

import rerun as rr

def my_gen_func(stream, name):
    with stream:
        for i in range(10):
            print(f"{name} {i}")
            rr.log("stream", rr.TextLog(f"{name} {i}"))
            yield

rr.init("rerun_example_leak_context")

stream1 = rr.new_recording("rerun_example_stream1")
stream1.save("stream1.rrd")
stream2 = rr.new_recording("rerun_example_stream2")
stream2.save("stream2.rrd")

gen1 = my_gen_func(stream1, "stream1")
gen2 = my_gen_func(stream2, "stream2")

next(gen1)
next(gen2)
rr.log("stream", rr.TextLog("This should go to the global stream"))
next(gen1)
next(gen1)
next(gen1)

Run this and then open stream2.rrd.
You will see that both global messages and messages that should have been destined for stream 1 leaked into stream 2:
image

Investigation

The reason this happens is our context-manager tries to establish the context by running:

bindings.set_thread_local_data_recording(...)

But the context manager stays active on the thread after yielding the result, since the context is held around by the generator state. This means state leaks out of the generator into the thread and additionally when resuming the generator, can leak in from whatever context is calling the generator.

@jleibs jleibs added 🪳 bug Something isn't working 🐍 Python API Python logging API labels May 6, 2024
@jleibs jleibs added this to the 0.16 milestone May 6, 2024
@jleibs
Copy link
Member Author

jleibs commented May 6, 2024

Adding to 0.16 since this is very easy to hit when working while running code inside of an environment like Gradio when using streaming generators.

@jleibs
Copy link
Member Author

jleibs commented May 6, 2024

Turns out this may not be generically solvable. See https://peps.python.org/pep-0568/ and https://discuss.python.org/t/preventing-yield-inside-certain-context-managers

The two best things we may be able to offer are probably:

  • a utility-decorator that can capture and restore stream context on yield.
  • maybe a sanity check in our context manager that catches some edge-cases, though this seems hard to guarantee

However, the footgun is going to be there regardless... this seems like something the python devs aren't really interested in fixing so not much we can do about it.

@jleibs jleibs self-assigned this May 6, 2024
jleibs added a commit that referenced this issue May 7, 2024
…used with generators (#6240)

### What
 - Resolves: #6238

In `thread_local_stream`, don't yield while holding the stream context
open. Re-create context before continuing the generator.

Also introduce a new `recording_stream_generator_ctx` for more advanced
usage. This is mainly an escape hatch.

The problem from the example can now be handled using:
```
import rerun as rr

@rr.recording_stream_generator_ctx
def my_gen_func(stream, name):
    with stream:
        for i in range(10):
            print(f"{name} {i}")
            rr.log("stream", rr.TextLog(f"{name} {i}"))
            yield

rr.init("rerun_example_leak_context")

stream1 = rr.new_recording("rerun_example_stream1")
stream1.save("stream1.rrd")
stream2 = rr.new_recording("rerun_example_stream2")
stream2.save("stream2.rrd")

gen1 = my_gen_func(stream1, "stream1")
gen2 = my_gen_func(stream2, "stream2")

next(gen1)
next(gen2)
rr.log("stream", rr.TextLog("This should go to the global stream"))
next(gen1)
next(gen1)
next(gen1)
```

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using examples from latest `main` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/6240?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/6240?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!

- [PR Build Summary](https://build.rerun.io/pr/6240)
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot
full-check`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🪳 bug Something isn't working 🐍 Python API Python logging API
Projects
None yet
1 participant