Primary caching 3: bare-bone latest-at caching #4659
Just a rename of `VecDeque::insert_range` to `VecDeque::insert_many` to match the terminology in `FlatVecDeque`.
This is a literal port of our existing benchmarks, only using the cached query APIs instead.
// TODO(cmc): we need an extra indirection layer so that cached entries can be shared across
// queries with different query timestamps but identical data timestamps.
// This requires keeping track of all `RowId`s in `ArchetypeView`, not just the `RowId` of the
// point-of-view component.
// TODO(cmc): Centralize and harmonize all caches (query, jpeg, mesh).
static CACHES: Lazy<Caches> = Lazy::new(Caches::default);

/// Maintains the top-level cache mappings.
Are we expected to have one cache per `DataStore`?
`CACHES` covers everything; the `StoreId` is part of the `CacheKey`.
Put that in the docstring :)
Done in next PR
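The arrangement discussed in this thread (one global cache, with the store identifier folded into the key) can be sketched roughly as follows. This is only an illustration: the actual layout of `CacheKey` is not shown here, so the `entity_path` and `timeline` fields, the `insert`/`get` methods, and the `Vec<u8>` payload are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real `StoreId`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct StoreId(pub String);

/// Hypothetical key layout: because the store is part of the key,
/// a single global map can cover every store at once.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct CacheKey {
    pub store_id: StoreId,
    pub entity_path: String,
    pub timeline: String,
}

/// Top-level cache mappings, shared by all stores (payload type is made up).
#[derive(Default)]
pub struct Caches {
    entries: HashMap<CacheKey, Vec<u8>>,
}

impl Caches {
    pub fn insert(&mut self, key: CacheKey, data: Vec<u8>) {
        self.entries.insert(key, data);
    }

    pub fn get(&self, key: &CacheKey) -> Option<&[u8]> {
        self.entries.get(key).map(Vec::as_slice)
    }
}
```

Two keys that differ only in their `StoreId` map to different entries, so stores never see each other's cached data.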
Make it possible to toggle primary caching on and off at runtime, for both latest-at and range queries.

![image](https://github.com/rerun-io/rerun/assets/2910679/46404d8d-ea27-441c-9bae-ba5e3476adef)

---

Part of the primary caching series of PR (index search, joins, deserialization):
- #4592
- #4593
- #4659
- #4680
- #4681
- #4698
- #4711
- #4712
- #4721
- #4726
Integrates the cached APIs with the 2D & 3D spatial views, which is a pretty tough thing to do because there's a lot of abstraction going on in there.

`main` vs. cache disabled vs. cache enabled (5950X, Arch):
```
group                        main                                      primcache_5_uncached                      primcache_5_cached
-----                        ----                                      --------------------                      ------------------
Points3D/load_all                1.68    10.1±0.14ms  94.2 MElem/sec       1.00     6.0±0.07ms  157.9 MElem/sec      1.01     6.1±0.06ms  155.7 MElem/sec
Points3D/load_colors             1.44     3.8±0.02ms  252.6 MElem/sec      1.00     2.6±0.05ms  364.0 MElem/sec      1.07     2.8±0.06ms  339.5 MElem/sec
Points3D/load_picking_ids       15.16  1859.6±7.01µs  512.9 MElem/sec      1.01   124.3±3.92µs  7.5 GElem/sec        1.00   122.7±3.86µs  7.6 GElem/sec
Points3D/load_positions          2.29   420.1±0.76µs  2.2 GElem/sec        1.03   189.3±7.44µs  4.9 GElem/sec        1.00   183.4±5.56µs  5.1 GElem/sec
Points3D/load_radii              1.46     3.3±0.04ms  290.1 MElem/sec      1.05     2.4±0.03ms  404.8 MElem/sec      1.00     2.2±0.00ms  423.9 MElem/sec
Points3D/query_archetype         2.51   676.1±7.59ns  ? ?/sec          15859.98     4.3±0.06ms  ? ?/sec              1.00   268.9±3.39ns  ? ?/sec
```
Integrates the cached APIs with the TextLog & TimeSeries views, which is pretty trivial.

This of course does nothing, since the cache doesn't cache range queries yet.
_99% grunt work, the only somewhat interesting thing happens in `query_archetype`_

Our query model always operates with two distinct timestamps: the timestamp you're querying for (`query_time`) vs. the timestamp of the data you get back (`data_time`).

This is the result of our latest-at semantics: a query for a point at time `10` can return a point at time `2`.
This is important to know when caching the data: a query at time `4` and a query at time `8` that both return the data at time `2` must share the same single entry or the memory budget would explode.

This PR just updates all existing latest-at APIs so they return the data time in their response.
This was already the case for range APIs.

Note that in the case of `query_archetype`, which is a compound API that emits multiple queries, the data time of the final result is the most recent data time among all of its components.

A follow-up PR will use the data time to deduplicate entries in the latest-at cache.

---

Part of the primary caching series of PR (index search, joins, deserialization):
- #4592
- #4593
- #4659
- #4680
- #4681
- #4698
- #4711
- #4712
- #4721
- #4726
- #4773
- #4784
- #4785
- #4793
- #4800
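To make the `query_time` vs. `data_time` distinction concrete, here is a minimal sketch of a latest-at lookup that reports the data time alongside the value, plus the "most recent among all components" rule for compound queries. The `Column` type and the `compound_data_time` helper are hypothetical names invented for this illustration; they are not the store's actual API.

```rust
use std::collections::BTreeMap;

/// Hypothetical time-indexed column: data time -> value.
pub struct Column<T> {
    rows: BTreeMap<i64, T>,
}

impl<T> Column<T> {
    pub fn new() -> Self {
        Self { rows: BTreeMap::new() }
    }

    pub fn insert(&mut self, data_time: i64, value: T) {
        self.rows.insert(data_time, value);
    }

    /// Latest-at: the most recent row at or before `query_time`,
    /// returned together with the time the data was actually logged at.
    pub fn latest_at(&self, query_time: i64) -> Option<(i64, &T)> {
        self.rows
            .range(..=query_time)
            .next_back()
            .map(|(data_time, value)| (*data_time, value))
    }
}

/// For a compound query, the data time of the final result is the most
/// recent data time among the components that returned something.
pub fn compound_data_time(component_times: &[Option<i64>]) -> Option<i64> {
    component_times.iter().flatten().copied().max()
}
```

With a single position logged at time `2`, `latest_at(10)` yields that position with a data time of `2`, which is the pair a cache would need in order to key on data time rather than query time.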
…ation (#4712)

Introduces the notion of cache deduplication: given a query at time `4` and a query at time `8` that both return data at time `2`, they must share a single cache entry.

I.e. starting with this PR, scrubbing through the OPF example will not result in more cache memory being used.
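One way to picture the deduplication is a two-level map: query times point at reference-counted entries that are stored once per data time. The sketch below is an assumption about the general shape, not the PR's implementation; `LatestAtCache`, its fields, and the `fetch` callback (standing in for the uncached query) are hypothetical.

```rust
use std::collections::{BTreeMap, HashMap};
use std::sync::Arc;

/// Hypothetical latest-at cache for a single entity/component pair.
#[derive(Default)]
pub struct LatestAtCache {
    /// Exactly one shared entry per data time.
    per_data_time: HashMap<i64, Arc<Vec<f32>>>,
    /// Query time -> the shared entry that query resolved to.
    per_query_time: BTreeMap<i64, Arc<Vec<f32>>>,
}

impl LatestAtCache {
    /// `fetch(query_time)` stands in for the uncached query and returns `(data_time, data)`.
    /// In this simplified sketch it runs on every query-time miss.
    pub fn query(
        &mut self,
        query_time: i64,
        fetch: impl FnOnce(i64) -> (i64, Vec<f32>),
    ) -> Arc<Vec<f32>> {
        if let Some(hit) = self.per_query_time.get(&query_time) {
            return Arc::clone(hit);
        }
        let (data_time, data) = fetch(query_time);
        // Reuse the entry if another query already resolved to the same data time.
        let entry = Arc::clone(
            self.per_data_time
                .entry(data_time)
                .or_insert_with(|| Arc::new(data)),
        );
        self.per_query_time.insert(query_time, Arc::clone(&entry));
        entry
    }

    pub fn num_unique_entries(&self) -> usize {
        self.per_data_time.len()
    }
}
```

Querying at `4` and then at `8`, with both resolving to data time `2`, leaves a single allocation behind: the second query only adds a pointer.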
Introduces a dedicated cache bucket for timeless data and properly forwards the information through all APIs downstream.
This implements cache invalidation via a `StoreSubscriber`.

We keep track of the timestamps to invalidate in the `StoreSubscriber`, but we only do the actual removal of components at query time.

This is similar to how we handle bucket sorting in the main store: doing it at query time has the benefit that the frame time effectively behaves as a natural micro-batching mechanism that vastly improves performance.
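The deferred-invalidation pattern described above can be sketched as a cache that merely records timestamps when notified and applies them in one batch on the next query. Everything below is hypothetical (`InvalidatingCache`, `on_store_event`, the `String` payload), and it is deliberately simplified: it removes only the exact timestamps that were reported, whereas the real rules for what a store event invalidates are not described here.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical cache with deferred invalidation.
#[derive(Default)]
pub struct InvalidatingCache {
    entries: BTreeMap<i64, String>,
    /// Timestamps reported by store events but not yet applied.
    pending: BTreeSet<i64>,
}

impl InvalidatingCache {
    pub fn insert(&mut self, data_time: i64, value: String) {
        self.entries.insert(data_time, value);
    }

    /// Called from the subscriber: cheap, only records what changed.
    pub fn on_store_event(&mut self, data_time: i64) {
        self.pending.insert(data_time);
    }

    /// Called at query time: apply every pending invalidation in one batch, then look up.
    pub fn query(&mut self, data_time: i64) -> Option<&str> {
        for stale in std::mem::take(&mut self.pending) {
            self.entries.remove(&stale);
        }
        self.entries.get(&data_time).map(String::as_str)
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Any number of events arriving between two queries collapse into a single pass over `pending`, which is the micro-batching effect the message refers to.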
The primary cache now tracks memory statistics and displays them in the memory panel.

This immediately highlights a very stupid thing that the cache does: missing optional components that have been turned into streams of default values by the `ArchetypeView` are materialized as such :man_facepalming:
- #4779

https://github.com/rerun-io/rerun/assets/2910679/876b264a-3f77-4d91-934e-aa8897bb32fe

- Fixes #4730
**Prefer on a per-commit basis, stuff has moved around**

Range queries are back!... in the most primitive form possible.
No invalidation, no bucketing, no optimization, no nothing. Just putting everything in place.

https://github.com/rerun-io/rerun/assets/2910679/a65281e4-9843-4598-9547-ce7e45197995
#4785)

Title.

https://github.com/rerun-io/rerun/assets/2910679/cf2c2748-a461-49fe-8124-c2a94164c956
… range queries (#4793)

Our low-level range APIs used to bake the latest-at results at `range.min - 1` into the range results, which is a big problem in a multi-tenant setting because `range(1, 10)` vs. `latestat(1) + range(2, 10)` are two completely different things.

Side-effect: a plot with a window of len 1 now behaves as expected:

https://github.com/rerun-io/rerun/assets/2910679/957ac367-35a6-4bea-9f40-59d51c556639
The most obvious and most important performance optimization when doing cached range queries: only upsert data at the edges of the bucket / ring-buffer.

This works because our buckets (well, singular, at the moment) are always dense.

- #4793
![image](https://github.com/rerun-io/rerun/assets/2910679/7246827c-4977-4b3f-9ef9-f8e96b8a9bea)
- #4800:
![image](https://github.com/rerun-io/rerun/assets/2910679/ab78643b-a98b-4568-b510-2b8827467095)
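The edge-only upsert can be illustrated with a dense bucket backed by a `VecDeque`: since the bucket already holds everything in the span it covers, a new range query only needs to fetch what lies before its front or after its back. This is a sketch under assumptions; `RangeBucket`, `covered`, and the `fetch(lo, hi)` callback (standing in for an uncached, inclusive, time-sorted range query) are hypothetical, and the sketch assumes `lo <= hi`.

```rust
use std::collections::VecDeque;

/// Hypothetical dense bucket: holds every row within `covered`, sorted by time.
#[derive(Default)]
pub struct RangeBucket {
    rows: VecDeque<(i64, f32)>,
    /// Inclusive time span already covered by the bucket, if any.
    covered: Option<(i64, i64)>,
}

impl RangeBucket {
    /// Makes sure `[lo, hi]` is cached, and returns how many fetches were needed.
    pub fn upsert(
        &mut self,
        lo: i64,
        hi: i64,
        mut fetch: impl FnMut(i64, i64) -> Vec<(i64, f32)>,
    ) -> usize {
        let mut fetches = 0;
        match self.covered {
            None => {
                self.rows.extend(fetch(lo, hi));
                self.covered = Some((lo, hi));
                fetches += 1;
            }
            Some((min, max)) => {
                if lo < min {
                    // Front edge: push in reverse so the deque stays sorted.
                    for row in fetch(lo, min - 1).into_iter().rev() {
                        self.rows.push_front(row);
                    }
                    fetches += 1;
                }
                if hi > max {
                    // Back edge: plain append.
                    self.rows.extend(fetch(max + 1, hi));
                    fetches += 1;
                }
                self.covered = Some((min.min(lo), max.max(hi)));
            }
        }
        fetches
    }

    pub fn times(&self) -> Vec<i64> {
        self.rows.iter().map(|(time, _)| *time).collect()
    }
}
```

A query that falls entirely inside the covered span triggers no fetch at all, and a query that overhangs on both sides triggers exactly two, one per edge.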
Range queries used to A) return the frame at T-1, B) accumulate state starting at T-1 and then C) yield frames starting at T.

A) was a huge issue for many reasons, which #4793 took care of by eliminating both A) and B).

But we need B) for range queries to be context-free, i.e. to be guaranteed that `Range(5, 10)` and `Range(4, 10)` will return the exact same data for frame `5`.
This is crucial for multi-tenant settings where those 2 example queries would share the same cache.

It also is the nicer-nicer version of the range semantics that we wanted anyway, I just didn't realize back then that it would require so few changes, or I would've gone straight for that.

---

Part of the primary caching series of PR (index search, joins, deserialization):
- #4592
- #4593
- #4659
- #4680
- #4681
- #4698
- #4711
- #4712
- #4721
- #4726
- #4773
- #4784
- #4785
- #4793
- #4800
- #4851
- #4852
- #4853
- #4856
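The context-free property can be demonstrated with a toy two-component store: state is bootstrapped from `min - 1` (step B) but no frame is yielded for it (no step A), and frames start at `min` (step C). The `Store`, `Frame`, and `latest_at` names below are hypothetical and the join logic is heavily simplified compared to the real range APIs.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One yielded frame: the latest known value of each component at `time`.
#[derive(Clone, Debug, PartialEq)]
pub struct Frame {
    pub time: i64,
    pub position: Option<&'static str>,
    pub color: Option<&'static str>,
}

/// Hypothetical store with two components, each indexed by data time.
pub struct Store {
    pub positions: BTreeMap<i64, &'static str>,
    pub colors: BTreeMap<i64, &'static str>,
}

fn latest_at(column: &BTreeMap<i64, &'static str>, time: i64) -> Option<&'static str> {
    column.range(..=time).next_back().map(|(_, value)| *value)
}

impl Store {
    pub fn range(&self, min: i64, max: i64) -> Vec<Frame> {
        // B) accumulate state as of `min - 1`, without yielding a frame for it.
        let mut position = latest_at(&self.positions, min - 1);
        let mut color = latest_at(&self.colors, min - 1);

        // C) yield one frame per timestamp that has data within `[min, max]`.
        let times: BTreeSet<i64> = self
            .positions
            .range(min..=max)
            .chain(self.colors.range(min..=max))
            .map(|(time, _)| *time)
            .collect();

        let mut frames = Vec::new();
        for time in times {
            if let Some(p) = self.positions.get(&time) {
                position = Some(*p);
            }
            if let Some(c) = self.colors.get(&time) {
                color = Some(*c);
            }
            frames.push(Frame { time, position, color });
        }
        frames
    }
}
```

Because the bootstrap state comes from a latest-at lookup rather than from whatever the range happened to start at, the frame at time `5` is the same whether the query began at `4` or at `5`, which is what lets both queries share one cache.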
Simply add a timeless path for the range cache, and actually only iterate over the range the user asked for (we were still blindly iterating over everything until now).

Also some very minimal clean up related to #4832, but we have a long way to go...
- #4832

---

- Fixes #4821
Implement range invalidation and do a quality pass over all the size tracking stuff in the cache.

**Range caching is now enabled by default!**

- Fixes #4809
- Fixes #374
- Quick sanity pass over all the intermediary locks and refcounts to make sure we don't hold anything for longer than we need.
- Get rid of all static globals and let the caches live with their associated stores in `EntityDb`.
- `CacheKey` no longer requires a `StoreId`.

---

- Fixes #4815
This implements the most barebone latest-at caching support.
The goal is merely to introduce all the machinery and boilerplate required to get the primary cache running, actual caching features will be implemented on top of this foundation in follow up PRs.
The existing benchmark suite has been ported as-is to the cached APIs (5950X, Arch):
Part of the primary caching series of PR (index search, joins, deserialization):
- `VecDeque` extensions #4592
- `FlatVecDeque` #4593

Checklist
- `main` build: app.rerun.io
- `nightly` build: app.rerun.io