Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always protect at least one value on the timeline when running GC #3357

Merged
merged 1 commit into from
Sep 19, 2023

Conversation

jleibs
Copy link
Member

@jleibs jleibs commented Sep 19, 2023

What

Now that GC has the abillity to protect data, turn the feature on for our normal purge_fraction_of_ram operations.

Resolves: #1803

Checklist

@jleibs jleibs added the ⛃ re_datastore affects the datastore itself label Sep 19, 2023
@jleibs jleibs marked this pull request as ready for review September 19, 2023 02:06
@nikolausWest
Copy link
Member

Does this also work for "Save loop selection"?

@jleibs
Copy link
Member Author

jleibs commented Sep 19, 2023

Does this also work for "Save loop selection"?

I don't believe so but creating a ticket to do something similar there: #3366

@jleibs jleibs merged commit b1fcd57 into main Sep 19, 2023
@jleibs jleibs deleted the jleibs/gc_protections branch September 19, 2023 12:16
emilk added a commit that referenced this pull request Sep 20, 2023
### What
* Closes #2517

The datastore is not the only place we store data. For each node in the
entity tree we store a _time histogram_ of where we have data. This was
never properly purged post GC - a very rough heuristic was instead used
(throw away everything up to the oldest time in the store - which will
be an even worse heursistic after
#3357).

With this PR, the store gc will book-keep exactly what was deleted, and
it will be properly purged from all secondary indices.

The resulting code is a bit complicated, because the store has no idea
about the hierarchical nature of the entity paths, but we store the time
histograms (and other book-keeping) for each node. That is, logging to
`/foo/bar` we will note the data on `/`, `/foo` and `/foo/bar`, but the
data will only be registered in the store for `/foo/bar`.

I also fixed a bunch of other smaller things that came up.

Best reviewed commit-by-commit.

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/3364) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/3364)
- [Docs
preview](https://rerun.io/preview/18cbc53fe96798cbe45ab84ddeb66aa183253e12/docs)
<!--DOCS-PREVIEW-->
- [Examples
preview](https://rerun.io/preview/18cbc53fe96798cbe45ab84ddeb66aa183253e12/examples)
<!--EXAMPLES-PREVIEW-->
- [Recent benchmark results](https://ref.rerun.io/dev/bench/)
- [Wasm size tracking](https://ref.rerun.io/dev/sizes/)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
include in changelog ⛃ re_datastore affects the datastore itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

datastore: bake latest-at semantics into the garbage collector
4 participants