-
Notifications
You must be signed in to change notification settings - Fork 270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
indexeddb: Shrink values in inbound_group_sessions store, improving performance all around #3073
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #3073 +/- ##
==========================================
- Coverage 83.72% 83.68% -0.05%
==========================================
Files 222 222
Lines 23357 23389 +32
==========================================
+ Hits 19556 19573 +17
- Misses 3801 3816 +15 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, thx!
LGTM to me. I agree that it's worth it to include the future backed_up_to
field at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks excellent to me. I've a couple of really tiny feedback, but overall, great work!
crates/matrix-sdk-indexeddb/src/crypto_store/indexeddb_serializer.rs
Outdated
Show resolved
Hide resolved
64fb436
to
56ac6df
Compare
56ac6df
to
e3da325
Compare
Oh no: #718. |
Great work with the speed-up! I'm a bit late to the party, but what role does base64-encoding the payload serve in this? It seems that the trick is to replace an array with lots of values with a single value (a string), which could've just as well been a JSON string. Which would mean the base64 step only serves to add overhead on top of this? The only way the base64 encoding would make sense is if we opted for a more compact binary representation of the array of numbers (which I think was the ultimate goal of #718 there which poljar linked). |
@dkasak the base64 encoding is to transform an array of bytes into a string, because I found that strings are much faster than arrays of numbers to store inside Indexed DB. When you say a JSON string, do you mean treat the bytes as characters directly? That would result in invalid UTF-8, which is not allowed in |
What I mean is using |
We are already encoding to JSON using |
If I remember correctly, the |
Just to write down the conclusion: after a quick chat with @andybalaam, only the unencrypted case unnecessarily base64-encodes the payload before storing. The encrypted case a couple of lines above does what is expected, so that's why we were talking past each other. |
Fixes #3057.
Change the way we store values in Indexed DB for the crypto store, specifically in the
inbound_group_sessions
store. Reduce the size of the stored values and replace arrays of numbers with base 64 strings. The rationale and some implementation ideas are documented in #3057. (TL;DR: small values are faster, base64 is faster.)This change:
MaybeEncrypted
type inmatrix_sdk_indexeddb::crypto_store::indexeddb_serializer
that encapsulates the fact that we serialize differently if the session is encrypted with a store cipher. (Previously, we just serialized an array of numbers in both cases. This was actually quite confusing since these arrays of numbers were holding encoded versions of very different values.) This enum isuntagged
in Serde to avoid increasing the size of Indexed DB records.matrix_sdk_store_encryption::EncryptedValueBase64
, which works the same way asEncryptedValue
, but stores arrays of numbers as base 64.matrix_sdk_indexeddb::crypto_store::migrations
, including adding thebacked_up_to
property, so we can support a fix for #26892 (see this comment) without doing another migration.I also wrote a performance test that runs in a web browser. In the checked-in version, it runs for 2K records, but when I manually increased the number on my machine to 200K, I got these results:
So the speedup is impressive.