-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DataStore
changelog 6: event-driven EntityTree
#4209
Conversation
DataStore
changelog 5: event-driven EntityTree
DataStore
changelog 6: event-driven EntityTree
8a98ae5
to
d231e2a
Compare
f197e9d
to
538d2ff
Compare
d231e2a
to
6dcd79e
Compare
538d2ff
to
3c568d0
Compare
6dcd79e
to
1f3c790
Compare
83eaad7
to
35b5f88
Compare
fn on_events(&mut self, _events: &[StoreEvent]) { | ||
unimplemented!( | ||
r"EntityTree view is maintained manually, see `EntityTree::on_store_{{additions|deletions}}`" | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the last review -- what use is implementing StoreView
if we don't support on_events
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #4208 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that our first StoreView is not really a StoreView -- doesn't use on_events
and isn't dispatched via the generic store-view dispatch mechanism -- makes me a bit sad.
The existence of global StoreViews in the first place still strikes me as possibly problematic -- and would certainly be problematic if we tried to make EntityTree a proper store-view (ignoring the fact that EntityTree itself is also recursive).
It seems like every "proper" StoreView is going to need to implement custom behavior for filtering per-store-id, and I'm struggling to come up with a case where we actually want these to be global. Rather, I suspect we may be better served by having StoreView registered as a factory such that every Store gets its own isolated instance of each StoreView. This would likewise then mean we could move with_view
to a store instance and have most store-views just "do the right thing" in their respective isolated contexts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is complex enough that I'm trusting the tests more than I trust my ability to guarantee correctness by inspection. Seems like an improvement overall but I would still love for this to not need special-handling.
…4202) The upcoming `StoreView` works in global scope: by registering a view you subscribe to changes to _all_ `DataStore`s, including those that are yet to be created. This is very powerful as it allows views & triggers implementers to build cross-recording indices as well as be notified as soon as new recordings come in and go out. But it means that `StoreEvent`s must indicate which `DataStore` they originate from, which isn't possible today since the stores themselves don't know who they are to begin with. This trivial PR plumbs the `StoreId` all the way through so `DataStore`s know about their own ID. Also made `StoreGeneration` account for the GC counter while I was at it. --- Requires: - #4215 `DataStore` changelog PR series: - #4202 - #4203 - #4205 - #4206 - #4208 - #4209
Introduces `StoreEvent`, an event that describes the atomic unit of change in the Rerun `DataStore`: a row has been added to or removed from the store. `StoreEvent`s are fired on both the insertion and garbage collection paths, enabling listeners to build arbitrary, always up-to-date views & trigger systems. ```rust /// The atomic unit of change in the Rerun [`DataStore`]. /// /// A [`StoreEvent`] describes the changes caused by the addition or deletion of a /// [`re_log_types::DataRow`] in the store. /// /// Methods that mutate the [`DataStore`], such as [`DataStore::insert_row`] and [`DataStore::gc`], /// return [`StoreEvent`]s that describe the changes. /// /// Refer to field-level documentation for more details and check out [`StoreDiff`] for a precise /// definition of what an event involves. #[derive(Debug, Clone, PartialEq)] pub struct StoreEvent { /// Which [`DataStore`] sent this event? pub store_id: StoreId, /// What was the store's generation when it sent that event? pub store_generation: StoreGeneration, /// Monotonically increasing ID of the event. /// /// This is on a per-store basis. /// /// When handling a [`StoreEvent`], if this is the first time you process this [`StoreId`] and /// the associated `event_id` is not `1`, it means you registered late and missed some updates. pub event_id: u64, /// What actually changed? /// /// Refer to [`StoreDiff`] for more information. pub diff: StoreDiff, } /// Describes an atomic change in the Rerun [`DataStore`]: a row has been added or deleted. /// /// From a query model standpoint, the [`DataStore`] _always_ operates one row at a time: /// - The contents of a row (i.e. its columns) are immutable past insertion, by virtue of /// [`RowId`]s being unique and non-reusable. /// - Similarly, garbage collection always removes _all the data_ associated with a row in one go: /// there cannot be orphaned columns. When a row is gone, all data associated with it is gone too. /// /// Refer to field-level documentation for more information. #[derive(Debug, Clone, PartialEq)] pub struct StoreDiff { /// Addition or deletion? /// /// The store's internals are opaque and don't necessarily reflect the query model (e.g. there /// might be data in the store that cannot by reached by any query). /// /// A [`StoreDiff`] answers a logical question: "does there exist a query path which can return /// data from that row?". pub kind: StoreDiffKind, /// What's the row's [`RowId`]? /// /// [`RowId`]s are guaranteed to be unique within a single [`DataStore`]. /// /// Put another way, the same [`RowId`] can only appear twice in a [`StoreDiff`] event: /// one addition and (optionally) one deletion (in that order!). pub row_id: RowId, /// The [`TimePoint`] associated with that row. /// /// Since insertions and deletions both work on a row-level basis, this is guaranteed to be the /// same value for both the insertion and deletion events (if any). pub timepoint: TimePoint, /// The [`EntityPath`] associated with that row. /// /// Since insertions and deletions both work on a row-level basis, this is guaranteed to be the /// same value for both the insertion and deletion events (if any). pub entity_path: EntityPath, /// All the [`DataCell`]s associated with that row. /// /// Since insertions and deletions both work on a row-level basis, this is guaranteed to be the /// same set of values for both the insertion and deletion events (if any). pub cells: IntMap<ComponentName, DataCell>, } ``` --- `DataStore` changelog PR series: - #4202 - #4203 - #4205 - #4206 - #4208 - #4209
Introducing the `StoreView` trait and registration system, allowing anybody to subscribe to `DataStore` changes, even from external code. `StoreView`s global scope: by registering a view you subscribe to changes to _all_ `DataStore`s, including those that are yet to be created. This is very powerful as it allows views & triggers implementers to build cross-recording indices as well as be notified as soon as new recordings come in and go out. ```rust /// A [`StoreView`] subscribes to atomic changes in one or more [`DataStore`]s through [`StoreEvent`]s. /// /// [`StoreView`]s can be used to build both secondary indices and trigger systems. pub trait StoreView: std::any::Any + Send + Sync { /// Arbitrary name for the view. /// /// Does not need to be unique. fn name(&self) -> String; /// Workaround for downcasting support, simply return `self`: /// ```ignore /// fn as_any(&self) -> &dyn std::any::Any { /// self /// } /// ``` fn as_any(&self) -> &dyn std::any::Any; /// Workaround for downcasting support, simply return `self`: /// ```ignore /// fn as_any_mut(&mut self) -> &mut dyn std::any::Any { /// self /// } /// ``` fn as_any_mut(&mut self) -> &mut dyn std::any::Any; /// The core of this trait: get notified of changes happening in one or more [`DataStore`]s. /// /// This will be called automatically by the [`DataStore`] itself if the view has been /// registered: [`DataStore::register_view`]. /// Or you might want to feed it [`StoreEvent`]s manually, depending on your use case. /// /// ## Example /// /// ```ignore /// fn on_events(&mut self, events: &[StoreEvent]) { /// use re_arrow_store::StoreDiffKind; /// for event in events { /// match event.kind { /// StoreDiffKind::Addition => println!("Row added: {}", event.row_id), /// StoreDiffKind::Deletion => println!("Row removed: {}", event.row_id), /// } /// } /// } /// ``` fn on_events(&mut self, events: &[StoreEvent]); } ``` --- `DataStore` changelog PR series: - #4202 - #4203 - #4205 - #4206 - #4208 - #4209
…4206) Standalone example of how to implement and register custom `StoreView`s, even from external code. <picture> <img src="https://static.rerun.io/custom_store_view/f7258673486f91d944180bd4a83307bce09b741e/full.png" alt=""> <source media="(max-width: 480px)" srcset="https://static.rerun.io/custom_store_view/f7258673486f91d944180bd4a83307bce09b741e/480w.png"> <source media="(max-width: 768px)" srcset="https://static.rerun.io/custom_store_view/f7258673486f91d944180bd4a83307bce09b741e/768w.png"> <source media="(max-width: 1024px)" srcset="https://static.rerun.io/custom_store_view/f7258673486f91d944180bd4a83307bce09b741e/1024w.png"> <source media="(max-width: 1200px)" srcset="https://static.rerun.io/custom_store_view/f7258673486f91d944180bd4a83307bce09b741e/1200w.png"> </picture> --- `DataStore` changelog PR series: - #4202 - #4203 - #4205 - #4206 - #4208 - #4209
1f3c790
to
452875b
Compare
This is mostly preliminary work for #4209, which makes this PR a bit weird. Basically just trying to offload complexity from #4209. `TimesPerTimeline` as well as `TimeHistogramPerTimeline` are now living on their own and are maintained as `StoreView`s, i.e. they react to changes to the `DataStore` rather than constructing alternate truths. This is the first step towards turning the `EntityTree` giga-structure into an event-driven view in the next PR. --- `DataStore` changelog PR series: - #4202 - #4203 - #4205 - #4206 - #4208 - #4209
ac40dbb
to
e328d44
Compare
Eh, this doesn't particularly surprise nor saddens me. It's pretty common for builtin views with extra powers to A) be special cased and B) consume the event stream through private APIs / lower-level means because of that, IME. That said, in our specific case, I don't think it would be particularly difficult to fix though. The special casing essentially comes from the
Not a hill I'm willing to die on, but in my experience globally scoped events scale much better: it's just much easier to filter stuff out on entry than it is to try and reconcile data across multiple views scattered all over the place when you need to.
I can think of plenty, especially since our data model is based around the idea of many similar recordings (i.e. Going even one-level higher: anything that wants to compute Viewer-level metrics. |
Turns the
EntityTree
giga-datastructure into aStoreView
, meaning it now reacts toStoreEvent
s rather than creating alternate truths.This introduces the notion of cascading side-effects, and more specifically
ClearCascade
s.When the
EntityTree
reacts to changes in the store, this might cause cascading effects (e.g. pending clears), that in turn need to write back to the store, which in turn sends more events to react to!The cycle is guaranteed finite because "clears don't get cleared"!
Cascading side-effects have an interesting requirement: they need to log their cascaded data using a
RowId
similar to the one used in the original event that caused the cascade (so they get GC'd at roughly the same pace)."Similar" in this cases means that their
TUID
shares the same timestamp and that the newRowId
is strictly greater than the old one.PathOp
has finally been annihilated.According to our new "Clears" & "Time Histograms" test suites, this behaves exactly like the
main
branch.DataStore
changelog PR series:DataStore
changelog 1: letDataStore
s know about theirStoreId
#4202DataStore
changelog 2: introduceStoreEvent
s #4203DataStore
changelog 3: introduceStoreView
s #4205DataStore
changelog 4: add standalone "Custom StoreView" example #4206DataStore
changelog 5: event-driven time histograms #4208DataStore
changelog 6: event-drivenEntityTree
#4209Checklist