Reduce session size with encoding and compression improvements#7703
Reduce session size with encoding and compression improvements#7703mitchellhenke merged 10 commits intomainfrom
Conversation
zachmargolis
left a comment
There was a problem hiding this comment.
LGTM! Is there any good spot to log what version a session is when we decrypt it? would be good to get a sense of how much pii bundle is v2 vs v3, how many sessions are v2 vs v3 etc
ea9458f to
ccfa365
Compare
There was a problem hiding this comment.
PII::FIngerprinter.fingerprint returns a hexdigested fingerprint. We could save even more space using bytes here
There was a problem hiding this comment.
Oh, that's a good catch! I tried this a bit, and the improvements are relatively constant. It's 20-32 bytes saved regardless of the size of the total payload, so the savings may not necessarily scale. I think the only place it is used is once in the AES encryption, which would track with that.
244c4c063 is my draft attempt at this, I'm kind of on the fence on the tradeoff. If it's useful for the future, it would be more worthwhile though.
|
It looks like we get msgpack as a side-effect of having bootsnap installed. It doesn't look like a dep of rails. We should probably explicitly add the gem to the Gemfile |
Ah good catch, added |
83aadcc to
5c54a88
Compare
895a776 to
2bde069
Compare
80ed2f4 to
37df769
Compare
changelog: Internal, Performance, Reduce session size
Co-authored-by: Zach Margolis <zachmargolis@users.noreply.github.com>
37df769 to
15ed9b1
Compare
🛠 Summary of changes
This set of changes follows a similar path to #6315 to update the structure and processing of session data.
Currently, our sessions are encrypted and encoded in a few layers:
The shape of this process has existed since near the beginning of the project, was conservative in its approach and worked reliably. It also provides a lot of flexibility by returning encoded data at any step, which meant it could be saved to a given data store without worrying about whether it would accept unencoded binary data.
The current process traded some overhead on session size for flexibility. We do not anticipate significant changes to how or where we'll save session data, and the volume to Redis is a potential bottleneck going forward. Redis and KMS support storing and encrypting binary data, so we can trim that overhead out while trying to retain some of the flexibility and have a path towards iterating on the structure if desired.
One of the larger complications in this refactor is in the
AesCipherandAesEncryptorclasses, which are used in a variety of places for encryption, some of which we cannot easily convert into using updated formats. The most difficult is profile PII which we'd only be able to re-encrypt once the user has decrypted it with their password. This PR moves the existing classes and replaces them with theLegacyprefix in the class name. One pattern I'd like to establish in this is that the Encryptor classes should minimize the overhead of processing and storage size and allow the caller to do any encoding or post-processing if needed. I've intentionally not renamed the new classes over the old ones for now to avoid collisions with any ongoing work. I've also not modified theKmsClient. It does include one extraneous Base64 encoding that we could drop, but the gains in other areas were relatively significant already. We could certainly explore updating that in the future.Two significant changes in this is compression, which we will now do above a defined threshold and using MessagePack for encoding rather than JSON. The primary reason for this is JSON does not support binary data natively.
The new process is:
Preliminary testing suggests the we could save 80+% in the common cases and potentially more. This has been running in the load testing environment.