Logging large values to `rerun.set_time_seconds()` causes hang/crash #2582

Nicholas-Autio-Mitchell · 2023-07-02T11:41:34Z

Describe the bug

Using the Python SDK, logging a time before logging an entity causes the GUI to hang then crash. The process itself never exits.

I am logging various sensor data to the same stream via rerun.log_points(), rerun.log_image(), etc.

For each call, I first set the time of that recorded sensor data.

Using a "frame ID" works fine e.g. rr.set_time_sequence("frame_id", step).
I tried logging my dataset's timestamps as nanoseconds: rr.set_time_nanos("timestamp", timestamp). This wasn't correct for my case.
I instead logged timestamps as seconds: rr.set_time_nanos("timestamp", timestamp). This causes the hang/crash.

[sidenote: turned out my dataset uses a custom time reference: nanoseconds since something domain specific]

To Reproduce
Reproduced on the clock example: examples/python/clock/main.py. Apply the diff below, where we scale up the step by 1e9 e.g. nanoseconds:

@ examples/python/clock/main.py:46 @ def log_clock(steps: int) -> None:
    for step in range(steps):
        t_secs = step

-        rr.set_time_seconds("sim_time", t_secs)
+        rr.set_time_seconds("seconds", t_secs * 1e9)        # causes the app to hang
+        # rr.set_time_nanos("nano seconds", t_secs * 1e9)   # this works fine

The full process:

pip install -U rerun-sdk  # get 0.7.0 at time of writing
git clone [email protected]:rerun-io/rerun.git  # Clone the repository
cd rerun/
git checkout 0.7.0
# code -n examples/python/clock/main.py  # open file to apply the diff above
python examples/python/clock/main.py  # opens the GUI

Interestingly, the logging itself takes a really high, suggesting some type checks/conversions are slowing things down?

In the GUI, change the logging time to choose seconds, which causes the app to hang
Wait ~7 minutes, the GUI app does crash and close... but the process does not. You can still find python -m rerun --port 9876 in htop still chugging along doing something (for me CPU% floats around ~25%). Watching htop as I switch to seconds in the GUI to enter failure mode, I notice that memory spikes to fill my memory, then seems to clear itself and again spike, then clear itself and become stable.

Playing with datatypes on the Python side had no effect. E.g rr.set_time_seconds("seconds", float(t_secs * 1e9)), ends with the same result. I haven't starting poking around into the bindings/rust code.

Expected behavior

An error, or perhaps a sane conversion? The docs already speak of some assumptions on timestamps based on their magnitude.

Backtrace

No backtrace is given.

Desktop (please complete the following information):

OS: MacOS Ventura 13.4.1

Rerun version

rerun_py 0.7.0 [rustc 1.69.0 (84c898d65 2023-04-16), LLVM 15.0.7] aarch64-apple-darwin prepare-0.7 9cf3033, built 2023-06-16T15:39:37Z

The text was updated successfully, but these errors were encountered:

emilk · 2023-07-03T11:03:38Z

Whoa, thanks for the report and the reproduce!

I can reproduce on main, and in debug builds we get a panic:

thread 'ThreadId(1)' panicked at 'attempt to add with overflow', re_time_panel/src/paint_ticks.rs:213

   8: core::panicking::panic_fmt
             at core/src/panicking.rs:64:14
   9: core::panicking::panic
             at core/src/panicking.rs:114:5
  10: re_time_panel::paint_ticks::paint_ticks
             at re_time_panel/src/paint_ticks.rs:213:9
  11: re_time_panel::paint_ticks::paint_time_range_ticks
      re_time_panel::paint_ticks::paint_time_ranges_and_ticks
             at re_time_panel/src/paint_ticks.rs:49:21
  12: re_time_panel::TimePanel::expanded_ui
             at re_time_panel/src/lib.rs:263:9

I'll take a look right away

### What If the user logged times that were at the limit of i64 we would sometimes get freezes and crashes. Closes #2582 ### Checklist * [x] I have read and agree to [Contributor Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and the [Code of Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md) * [x] I've included a screenshot or gif (if applicable) * [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/2588) (if applicable) - [PR Build Summary](https://build.rerun.io/pr/2588) - [Docs preview](https://rerun.io/preview/pr%3Aemilk%2Ffix-freeze-on-large-times/docs) - [Examples preview](https://rerun.io/preview/pr%3Aemilk%2Ffix-freeze-on-large-times/examples)

Nicholas-Autio-Mitchell added 👀 needs triage This issue needs to be triaged by the Rerun team 🪳 bug Something isn't working labels Jul 2, 2023

emilk added 💣 crash crash, deadlock/freeze, do-no-start and removed 👀 needs triage This issue needs to be triaged by the Rerun team labels Jul 3, 2023

emilk self-assigned this Jul 3, 2023

emilk mentioned this issue Jul 3, 2023

Fix freeze/crash when logging large times #2588

Merged

3 tasks

emilk closed this as completed in #2588 Jul 4, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Logging large values to `rerun.set_time_seconds()` causes hang/crash #2582

Logging large values to `rerun.set_time_seconds()` causes hang/crash #2582

Nicholas-Autio-Mitchell commented Jul 2, 2023

emilk commented Jul 3, 2023

Logging large values to rerun.set_time_seconds() causes hang/crash #2582

Logging large values to rerun.set_time_seconds() causes hang/crash #2582

Comments

Nicholas-Autio-Mitchell commented Jul 2, 2023

emilk commented Jul 3, 2023

Logging large values to `rerun.set_time_seconds()` causes hang/crash #2582

Logging large values to `rerun.set_time_seconds()` causes hang/crash #2582