feat(sqlite): Improve throughput of SqliteEventCacheStore::load_all_chunks_metadata by 1140% #5411
Merged
bnjbvr merged 2 commits into matrix-org:main on Jul 21, 2025
Conversation
This patch changes the query used by `SqliteEventCacheStore::load_all_chunks_metadata`, which was the cause of severe slowness (see #5407). The new query improves the throughput by +1140% and reduces the time by 91.916%. The benchmark will follow in the next patch.

Metrics for 10'000 events (with 1 gap every 80 events):

- Before:
  - throughput: 20.686 Kelem/s,
  - time: 483.43 ms.
- After:
  - throughput: 253.52 Kelem/s,
  - time: 39.478 ms.

The new query visits all chunks of a linked chunk with ID `hashed_linked_chunk_id`. For each chunk, it collects its ID (`ChunkIdentifier`), its previous chunk, its next chunk, and its number of events (`num_events`). If the chunk is a gap, `num_events` is 0; otherwise it counts the number of events in `event_chunks` where `event_chunks.chunk_id = linked_chunks.id`.

Why not use a `(LEFT) JOIN` + `COUNT`? Because for gaps, the entire `event_chunks` table would be traversed every time, only to count zero events (a gap has no events), which is extremely inefficient. We could speed that up with an `INDEX`, but it would consume more storage space, and traversing an `INDEX` boils down to traversing a B-tree, which is O(log n), whilst this `CASE` approach is O(1). This solution is a nice trade-off and offers great performance.

Note that the more gaps there were, the slower the old query was; now the number of gaps has no impact. This change doesn't involve a new schema, so no migration is necessary.
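The patch's exact SQL isn't quoted in this conversation, but a minimal sketch of the `CASE`-based shape described above could look like the following. The `linked_chunks` and `event_chunks` names and the `chunk_id` column come from the description; the `previous`, `next`, and `type` columns, the `'gap'` discriminator value, and the `linked_chunk_id` filter column are hypothetical stand-ins for illustration.

```sql
-- Minimal sketch, not the actual query from the patch. The `previous`,
-- `next`, `type`, and `linked_chunk_id` columns and the 'gap' value are
-- assumptions; only the table names and `chunk_id` come from the PR text.
SELECT
    lc.id,
    lc.previous,
    lc.next,
    CASE lc.type
        -- A gap holds no events: return 0 directly, in O(1), without
        -- ever touching `event_chunks`.
        WHEN 'gap' THEN 0
        -- Otherwise, a correlated subquery counts this chunk's events.
        ELSE (SELECT COUNT(*)
              FROM event_chunks AS ec
              WHERE ec.chunk_id = lc.id)
    END AS num_events
FROM linked_chunks AS lc
WHERE lc.linked_chunk_id = ?
```

For contrast, the rejected `LEFT JOIN` + `COUNT` shape would pay the cost of joining against `event_chunks` for every chunk, gaps included:

```sql
-- The rejected alternative (same assumed schema): gaps still pay for the
-- join against `event_chunks`, only to count zero rows.
SELECT lc.id, lc.previous, lc.next, COUNT(ec.chunk_id) AS num_events
FROM linked_chunks AS lc
LEFT JOIN event_chunks AS ec ON ec.chunk_id = lc.id
WHERE lc.linked_chunk_id = ?
GROUP BY lc.id
```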
Codecov Report

All modified and coverable lines are covered by tests ✅

✅ All tests successful. No failed tests found.

Additional details and impacted files:

```
@@            Coverage Diff            @@
##             main    #5411     +/-   ##
=========================================
  Coverage   88.80%   88.81%
=========================================
  Files         334      334
  Lines       91256    91266      +10
  Branches    91256    91266      +10
=========================================
+ Hits        81044    81059      +15
+ Misses       6399     6393       -6
- Partials     3813     3814       +1
```

☔ View full report in Codecov by Sentry.