Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Garbage collection (--memory-limit) not correctly applied to streams histogram #2517

Closed
emilk opened this issue Jun 26, 2023 · 2 comments · Fixed by #3364
Closed

Garbage collection (--memory-limit) not correctly applied to streams histogram #2517

emilk opened this issue Jun 26, 2023 · 2 comments · Fixed by #3364
Assignees
Labels
🪳 bug Something isn't working ⛃ re_datastore affects the datastore itself
Milestone

Comments

@emilk
Copy link
Member

emilk commented Jun 26, 2023

Reproduce:

cargo rerun --memory-limit 200MB
examples/python/face_tracking/main.py --connect

The oldest images will be pruned, but the streams view still show the data points, so when you go back in time you'll see no image:

image

It seems like TimeHistogramPerTimeline is not beeing correctly pruned


Keywords: purge, prune, gc

@emilk emilk added 🪳 bug Something isn't working 🏎️ Quick Issue Can be fixed in a few hours or less labels Jun 26, 2023
@emilk emilk self-assigned this Jul 4, 2023
@emilk
Copy link
Member Author

emilk commented Jul 4, 2023

It works for the log_tick timeline, but not the log_time timeline. The cutoff_times is not updated for the latter. I suspect something weird inside the data store. Using a single cutoff_time may not be the best idea anyhow, because data is shared in mini-batches.

@emilk emilk removed the 🏎️ Quick Issue Can be fixed in a few hours or less label Jul 4, 2023
@emilk emilk removed their assignment Jul 4, 2023
@emilk emilk added this to the 0.9 milestone Aug 16, 2023
@emilk emilk added the ⛃ re_datastore affects the datastore itself label Aug 25, 2023
@emilk emilk changed the title Garbage collection (--memory-limit) not applied to streams? Garbage collection (--memory-limit) not correctly applied to streams histogram Aug 25, 2023
@emilk
Copy link
Member Author

emilk commented Sep 19, 2023

This does not reproduce consistently, but it does reproduce.

@emilk emilk self-assigned this Sep 19, 2023
emilk added a commit that referenced this issue Sep 20, 2023
### What
* Closes #2517

The datastore is not the only place we store data. For each node in the
entity tree we store a _time histogram_ of where we have data. This was
never properly purged post GC - a very rough heuristic was instead used
(throw away everything up to the oldest time in the store - which will
be an even worse heursistic after
#3357).

With this PR, the store gc will book-keep exactly what was deleted, and
it will be properly purged from all secondary indices.

The resulting code is a bit complicated, because the store has no idea
about the hierarchical nature of the entity paths, but we store the time
histograms (and other book-keeping) for each node. That is, logging to
`/foo/bar` we will note the data on `/`, `/foo` and `/foo/bar`, but the
data will only be registered in the store for `/foo/bar`.

I also fixed a bunch of other smaller things that came up.

Best reviewed commit-by-commit.

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/3364) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/3364)
- [Docs
preview](https://rerun.io/preview/18cbc53fe96798cbe45ab84ddeb66aa183253e12/docs)
<!--DOCS-PREVIEW-->
- [Examples
preview](https://rerun.io/preview/18cbc53fe96798cbe45ab84ddeb66aa183253e12/examples)
<!--EXAMPLES-PREVIEW-->
- [Recent benchmark results](https://ref.rerun.io/dev/bench/)
- [Wasm size tracking](https://ref.rerun.io/dev/sizes/)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🪳 bug Something isn't working ⛃ re_datastore affects the datastore itself
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant