EntityTree: only check for entity deletions when necessary #8103

Merged
merged 4 commits into from
Nov 12, 2024
19 changes: 15 additions & 4 deletions crates/store/re_entity_db/src/entity_db.rs
@@ -5,8 +5,8 @@ use parking_lot::Mutex;

 use re_chunk::{Chunk, ChunkResult, RowId, TimeInt};
 use re_chunk_store::{
-    ChunkStore, ChunkStoreChunkStats, ChunkStoreConfig, ChunkStoreEvent, ChunkStoreHandle,
-    ChunkStoreSubscriber, GarbageCollectionOptions, GarbageCollectionTarget,
+    ChunkStore, ChunkStoreChunkStats, ChunkStoreConfig, ChunkStoreDiffKind, ChunkStoreEvent,
+    ChunkStoreHandle, ChunkStoreSubscriber, GarbageCollectionOptions, GarbageCollectionTarget,
 };
 use re_log_types::{
     ApplicationId, EntityPath, EntityPathHash, LogMsg, ResolvedTimeRange, ResolvedTimeRangeF,
@@ -385,7 +385,13 @@ impl EntityDb {

         // It is possible for writes to trigger deletions: specifically in the case of
         // overwritten static data leading to dangling chunks.
-        self.tree.on_store_deletions(&engine, &store_events);
+        let entity_paths_with_deletions = store_events
+            .iter()
+            .filter(|event| event.kind == ChunkStoreDiffKind::Deletion)
+            .map(|event| event.chunk.entity_path().clone())
+            .collect();
+        self.tree
+            .on_store_deletions(&engine, &entity_paths_with_deletions, &store_events);

         // We inform the stats last, since it measures e2e latency.
         self.stats.on_events(&store_events);
@@ -544,7 +550,12 @@ impl EntityDb {
         time_histogram_per_timeline.on_events(store_events);

         let engine = engine.downgrade();
-        tree.on_store_deletions(&engine, store_events);
+        let entity_paths_with_deletions = store_events
+            .iter()
+            .filter(|event| event.kind == ChunkStoreDiffKind::Deletion)
+            .map(|event| event.chunk.entity_path().clone())
+            .collect();
+        tree.on_store_deletions(&engine, &entity_paths_with_deletions, store_events);
     }

     /// Key used for sorting recordings in the UI.
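
The filtering step added in both hunks above can be tried in isolation with toy types. This is only a minimal sketch: `DiffKind`, `Event` and `paths_with_deletions` are hypothetical stand-ins rather than the `re_chunk_store` API, and a std `HashSet<String>` replaces the `IntSet<EntityPath>` used in the diff.

```rust
use std::collections::HashSet;

/// Toy stand-in for `ChunkStoreDiffKind` (assumption: only two kinds matter here).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DiffKind {
    Addition,
    Deletion,
}

/// Toy stand-in for `ChunkStoreEvent`, reduced to the fields the filter reads.
#[derive(Clone, Debug)]
struct Event {
    kind: DiffKind,
    entity_path: String,
}

/// Collects the paths of all entities touched by at least one deletion event.
/// Duplicate paths collapse because the result is a set.
fn paths_with_deletions(events: &[Event]) -> HashSet<String> {
    events
        .iter()
        .filter(|event| event.kind == DiffKind::Deletion)
        .map(|event| event.entity_path.clone())
        .collect()
}
```

Building the set once per batch of events means each later membership test during the tree walk is a cheap lookup rather than a scan over the events.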
9 changes: 6 additions & 3 deletions crates/store/re_entity_db/src/entity_tree.rs
@@ -1,7 +1,7 @@
 use std::collections::BTreeMap;

 use ahash::HashSet;
-use nohash_hasher::IntMap;
+use nohash_hasher::{IntMap, IntSet};

 use re_chunk::RowId;
 use re_chunk_store::{ChunkStoreDiffKind, ChunkStoreEvent, ChunkStoreSubscriber};
@@ -160,6 +160,7 @@ impl EntityTree {
     pub fn on_store_deletions(
         &mut self,
         engine: &StorageEngineReadGuard<'_>,
+        entity_paths_with_deletions: &IntSet<EntityPath>,
         events: &[ChunkStoreEvent],
     ) {
         re_tracing::profile_function!();
@@ -171,9 +172,11 @@ impl EntityTree {

         self.children.retain(|_, entity| {
             // this is placed first, because we'll only know if the child entity is empty after telling it to clear itself.
-            entity.on_store_deletions(engine, events);
+            entity.on_store_deletions(engine, entity_paths_with_deletions, events);

Review comment (Member):

I'm not totally following how this improves performance.

The inclusion of entity_paths_with_deletions doesn't change the fact that on_store_deletions does an entire tree-walk.

Is the whole point of this optimization to bypass the overhead of the is_empty() call in cases where we know that the intermediate child couldn't have been deleted?

That said, it seems this will no longer successfully delete intermediate children that only existed as containers for other entities but don't have their own data.

Reply (Member, Author):

> The inclusion of entity_paths_with_deletions doesn't change the fact that on_store_deletions does an entire tree-walk.
>
> Is the whole point of this optimization to bypass the overhead of the is_empty() call in cases where we know that the intermediate child couldn't have been deleted?

Yeah, the tree walk in itself is imperceptible (it's just a few thousand recursive calls in the worst case, which is barely measurable) -- a few thousand is_empty() calls, on the other hand, are extremely costly.

This PR basically brings the latency down from several seconds (ever increasing) to a constant 10ms (using the benchmark script in the issue).
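
To illustrate why gating helps, here is a small sketch under stated assumptions: `Engine`, `Node`, `leaf` and `on_deletions` are toy, hypothetical stand-ins for the real storage engine and `EntityTree`, and the counter only exists to make the number of costly checks visible. It relies on `||` short-circuiting, so the emptiness check runs only for nodes whose path is in the deletion set.

```rust
use std::cell::Cell;
use std::collections::{BTreeMap, HashSet};

/// Toy stand-in for the storage engine; counts how often the costly query runs.
struct Engine {
    paths_with_data: HashSet<String>,
    emptiness_checks: Cell<usize>,
}

impl Engine {
    /// Stands in for the expensive store query behind `is_empty()`.
    fn has_data(&self, path: &str) -> bool {
        self.emptiness_checks.set(self.emptiness_checks.get() + 1);
        self.paths_with_data.contains(path)
    }
}

/// Toy stand-in for `EntityTree`.
struct Node {
    path: String,
    children: BTreeMap<String, Node>,
}

/// Helper that builds a node without children.
fn leaf(path: &str) -> Node {
    Node { path: path.to_owned(), children: BTreeMap::new() }
}

impl Node {
    /// Empty means: no children and no data in the store (assumed definition).
    fn is_empty(&self, engine: &Engine) -> bool {
        self.children.is_empty() && !engine.has_data(&self.path)
    }

    fn on_deletions(&mut self, engine: &Engine, deleted: &HashSet<String>) {
        self.children.retain(|_, child| {
            // Recurse first: a child can only become empty after clearing itself.
            child.on_deletions(engine, deleted);

            let has_deletion_events = deleted.contains(&child.path);

            // `||` short-circuits, so `is_empty` is skipped for untouched nodes.
            !has_deletion_events || !child.is_empty(engine)
        });
    }
}
```

The walk still visits every node, but with three leaves and a single deleted path, the costly query runs once instead of three times.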

> That said, it seems this will no longer successfully delete intermediate children that only existed as containers for other entities but don't have their own data.

Haa! I didn't even know that that was the point of this thing. We need to tweak this slightly then.
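
One hypothetical way to cover the container case, not necessarily the tweak this PR ended up with, would be to widen the deletion set so it also contains every ancestor of a deleted path. The sketch below assumes string paths separated by `/` with no leading slash; `with_ancestors` is an illustrative name, not an existing function.

```rust
use std::collections::HashSet;

/// Returns the deleted paths plus all of their ancestor paths.
/// Assumption: paths look like "world/points/left" (no leading '/').
fn with_ancestors(deleted: &HashSet<String>) -> HashSet<String> {
    let mut widened = HashSet::new();
    for path in deleted {
        widened.insert(path.clone());
        // Every '/' marks the end of one ancestor prefix.
        for (idx, _) in path.match_indices('/') {
            widened.insert(path[..idx].to_owned());
        }
    }
    widened
}
```

With such a set, a pure container like `world` would be re-checked for emptiness whenever one of its descendants lost data, while unrelated subtrees would still skip the costly check.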

Review comment (Member):

> a few thousand is_empty() calls, on the other hand, are extremely costly

Got it -- that wasn't obvious at all and definitely warrants a comment then. I wonder if is_empty() should be renamed to something like check_if_empty() to better imply there's an active cost to be paid and it's not just accessing some pre-computed state.


-            !entity.is_empty(engine)
+            let has_deletion_events = entity_paths_with_deletions.contains(&entity.path);
+
+            !has_deletion_events || !entity.is_empty(engine)
         });
     }
