EntityTree: only check for entity deletions when necessary #8103

Merged: teh-cmc merged 4 commits into main from cmc/entity_tree_ddos on Nov 12, 2024

teh-cmc (Member) commented Nov 12, 2024

Before:
image

After:
image

Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!
  • I have noted any breaking changes to the log API in CHANGELOG.md and the migration guide

To run all checks from main, comment on the PR with @rerun-bot full-check.

@teh-cmc teh-cmc added the following labels on Nov 12, 2024: 🪳 bug (Something isn't working), 🚀 performance (Optimization, memory use, etc), 🦟 regression (A thing that used to work in an earlier release), include in changelog
@@ -171,9 +172,11 @@ impl EntityTree {

         self.children.retain(|_, entity| {
             // this is placed first, because we'll only know if the child entity is empty after telling it to clear itself.
-            entity.on_store_deletions(engine, events);
+            entity.on_store_deletions(engine, entity_paths_with_deletions, events);
Member

I'm not totally following how this improves performance.

The inclusion of entity_paths_with_deletions doesn't change the fact that on_store_deletions does an entire tree-walk.

Is the whole point of this optimization to bypass the overhead of the is_empty() call in cases where we know that the intermediate child couldn't have been deleted?

That said, this seems like this will no longer successfully delete intermediate children that only existed as containers for other entities but don't have their own data.

Member Author

> The inclusion of entity_paths_with_deletions doesn't change the fact that on_store_deletions does an entire tree-walk.
>
> Is the whole point of this optimization to bypass the overhead of the is_empty() call in cases where we know that the intermediate child couldn't have been deleted?

Yeah, the tree walk itself is imperceptible (it's just a few thousand recursions in the worst case, which is barely measurable) -- a few thousand is_empty() calls, on the other hand, are extremely costly.

This PR basically brings the latency down from several seconds (ever increasing) to a constant 10ms (using the benchmark script in the issue).
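A minimal sketch of the idea described above, not the actual Rerun code: `Node`, `costly_is_empty`, `on_deletions`, and `build_flat_tree` are hypothetical stand-ins, and the `gated` flag exists only to contrast the two behaviours. A counter records how often the emptiness check runs, so the full tree walk stays while the check is paid only for paths that actually saw a deletion.

```rust
use std::cell::Cell;
use std::collections::{BTreeMap, HashSet};

/// Hypothetical stand-in for a tree node; this is not the real `EntityTree`.
struct Node {
    path: String,
    has_data: bool,
    children: BTreeMap<String, Node>,
}

impl Node {
    fn new(path: &str, has_data: bool) -> Self {
        Node { path: path.to_string(), has_data, children: BTreeMap::new() }
    }

    /// Stand-in for the expensive emptiness check: every call is counted.
    fn costly_is_empty(&self, calls: &Cell<usize>) -> bool {
        calls.set(calls.get() + 1);
        !self.has_data && self.children.is_empty()
    }

    /// Walks the whole tree and drops empty children.
    /// With `gated == true`, the check only runs for paths listed in
    /// `paths_with_deletions`; with `gated == false`, it runs for every child.
    fn on_deletions(
        &mut self,
        paths_with_deletions: &HashSet<String>,
        gated: bool,
        calls: &Cell<usize>,
    ) {
        self.children.retain(|_, child| {
            // Recurse first: a child only knows whether it is empty
            // after it has pruned its own children.
            child.on_deletions(paths_with_deletions, gated, calls);
            let needs_check = !gated || paths_with_deletions.contains(&child.path);
            !(needs_check && child.costly_is_empty(calls))
        });
    }
}

/// Builds a root with `n` direct children, all of which hold data.
fn build_flat_tree(n: usize) -> Node {
    let mut root = Node::new("/", true);
    for i in 0..n {
        let path = format!("/e{}", i);
        root.children.insert(path.clone(), Node::new(&path, true));
    }
    root
}
```

With 1000 children and a single deleted path, the gated walk runs the check once, while the ungated walk runs it 1000 times; both visit every node.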

> That said, this seems like this will no longer successfully delete intermediate children that only existed as containers for other entities but don't have their own data.

Haa! I didn't even know that that was the point of this thing. We need to tweak this slightly then.
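One possible shape for such a tweak, sketched under assumptions and not necessarily what was merged: besides checking paths that had deletions, also re-check any node that lost a child during the same pass, so that data-less container nodes still get cleaned up. `Node`, `node`, and `prune` are hypothetical names used only for illustration.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical stand-in for a tree node; this is not the real `EntityTree`.
struct Node {
    path: String,
    has_data: bool,
    children: BTreeMap<String, Node>,
}

/// Convenience constructor that keys each child by its full path.
fn node(path: &str, has_data: bool, children: Vec<Node>) -> Node {
    Node {
        path: path.to_string(),
        has_data,
        children: children.into_iter().map(|c| (c.path.clone(), c)).collect(),
    }
}

impl Node {
    /// Stand-in for the emptiness check (assumed to be expensive in practice).
    fn is_empty(&self) -> bool {
        !self.has_data && self.children.is_empty()
    }

    /// Prunes empty descendants and returns `true` if this node lost
    /// at least one direct child during this pass.
    fn prune(&mut self, paths_with_deletions: &HashSet<String>) -> bool {
        let before = self.children.len();
        self.children.retain(|_, child| {
            // Recurse first so the child's own children are already pruned.
            let lost_children = child.prune(paths_with_deletions);
            // Check the child if it had a deletion itself, or if it just
            // lost a child and may now be an empty container.
            let needs_check =
                lost_children || paths_with_deletions.contains(&child.path);
            !(needs_check && child.is_empty())
        });
        self.children.len() < before
    }
}
```

For a tree where `/a` holds no data and only contains `/a/b`, a deletion on `/a/b` removes `/a/b` and then `/a` as well, while untouched subtrees are never checked.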

Member

> a few thousand is_empty() calls, on the other hand, are extremely costly

Got it -- that wasn't obvious at all and definitely warrants a comment then. I wonder if is_empty() should be renamed to something like check_if_empty() to better imply there's an active cost to be paid and it's not just accessing some pre-computed state.

@teh-cmc teh-cmc marked this pull request as draft November 12, 2024 17:19
@teh-cmc teh-cmc marked this pull request as ready for review November 12, 2024 17:54
@teh-cmc teh-cmc merged commit f9eb660 into main Nov 12, 2024
36 checks passed
@teh-cmc teh-cmc deleted the cmc/entity_tree_ddos branch November 12, 2024 18:14
Development

Successfully merging this pull request may close these issues.

Chunk ingestion performance regression because of compaction logic
2 participants