Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix post-GC purging of streams view time histogram #3364

Merged
merged 14 commits into from
Sep 20, 2023
Merged

Conversation

emilk
Copy link
Member

@emilk emilk commented Sep 19, 2023

What

The datastore is not the only place we store data. For each node in the entity tree we store a time histogram of where we have data. This was never properly purged post GC - a very rough heuristic was instead used (throw away everything up to the oldest time in the store - which will be an even worse heursistic after #3357).

With this PR, the store gc will book-keep exactly what was deleted, and it will be properly purged from all secondary indices.

The resulting code is a bit complicated, because the store has no idea about the hierarchical nature of the entity paths, but we store the time histograms (and other book-keeping) for each node. That is, logging to /foo/bar we will note the data on /, /foo and /foo/bar, but the data will only be registered in the store for /foo/bar.

I also fixed a bunch of other smaller things that came up.

Best reviewed commit-by-commit.

Checklist

@emilk emilk changed the title Emilk/better purging Fix post-GC purging of streams view time histogram Sep 19, 2023
@emilk emilk added 🪳 bug Something isn't working 📺 re_viewer affects re_viewer itself labels Sep 19, 2023
@emilk emilk marked this pull request as ready for review September 19, 2023 12:10
@jleibs jleibs self-requested a review September 19, 2023 12:17
Copy link
Member

@jleibs jleibs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from the bug on the component-book-keeping this looks good. In reality that case is pretty rare, but still seems worth fixing for completeness.

crates/re_arrow_store/src/store_gc.rs Outdated Show resolved Hide resolved
crates/re_data_store/src/entity_tree.rs Outdated Show resolved Hide resolved
@emilk emilk force-pushed the emilk/better-purging branch from a32d34a to c7ebb4c Compare September 20, 2023 09:21
@emilk
Copy link
Member Author

emilk commented Sep 20, 2023

Thanks for that @jleibs, it wasn't too bad keeping track of the individual components, and the code makes much more sense now imho!

@emilk emilk merged commit a91f415 into main Sep 20, 2023
@emilk emilk deleted the emilk/better-purging branch September 20, 2023 09:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🪳 bug Something isn't working 📺 re_viewer affects re_viewer itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Garbage collection (--memory-limit) not correctly applied to streams histogram
2 participants