RCORE-2141 RCORE-2142 Clean up a bunch of old encryption cruft #7698
Conversation
I applaud the wisdom found here :-)
flush();
sync();
do_flush();
What is the reasoning for removing the call to sync? Is it because we can rely on the IV un-bumping strategy for consistency?
Removing a map is the wrong granularity for syncing. Either we need to be syncing between the IV write and the data write for every page, or we need to be syncing once (or twice) per transaction as part of committing. This was making us sync at fairly random times in the middle of the commit which weren't connected to anything logical, and it's why some of the tests had to do a lot less work on Windows to not be unreasonably slow (FlushFileBuffers() is closer to F_FULLFSYNC than fsync()).
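For readers following along, here is a minimal sketch of what a once-per-commit durability barrier can look like on the platforms mentioned; it is illustrative only, and commit_barrier is a hypothetical name rather than realm-core's actual sync code:

```cpp
#include <stdexcept>

#ifdef _WIN32
#include <windows.h>

// FlushFileBuffers() flushes both the OS file buffers and the drive's write
// cache, which is why it behaves more like macOS F_FULLFSYNC than plain fsync().
void commit_barrier(HANDLE h)
{
    if (!FlushFileBuffers(h))
        throw std::runtime_error("FlushFileBuffers() failed");
}
#else
#include <fcntl.h>
#include <unistd.h>

void commit_barrier(int fd)
{
#ifdef F_FULLFSYNC
    // On Apple platforms plain fsync() does not force data out of the drive's
    // cache; F_FULLFSYNC is needed for a real durability barrier.
    if (::fcntl(fd, F_FULLFSYNC) == 0)
        return;
#endif
    if (::fsync(fd) != 0)
        throw std::runtime_error("fsync() failed");
}
#endif
```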
static void memcpy_if_changed(void* dst, const void* src, size_t n)
{
#if REALM_SANITIZE_THREAD
    // Because our copying is page-level granularity, we have some benign races
Can we remove the suppression for this case from test/tsan.suppress?
Looks like we can.
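For context, a rough sketch of what a compare-before-copy helper along these lines typically looks like; this is illustrative only, and the real function and its TSAN handling may differ:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Illustrative sketch, not the actual realm-core implementation: copy src over
// dst in small blocks, skipping blocks that are already identical so that
// unchanged memory is never written to.
static void memcpy_if_changed(void* dst, const void* src, std::size_t n)
{
    auto* d = static_cast<char*>(dst);
    auto* s = static_cast<const char*>(src);
    constexpr std::size_t block = 64;
    for (std::size_t i = 0; i < n; i += block) {
        std::size_t len = std::min(block, n - i);
        if (std::memcmp(d + i, s + i, len) != 0)
            std::memcpy(d + i, s + i, len);
    }
}
```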
@@ -1649,23 +1488,6 @@ FileDesc File::dup_file_desc(FileDesc fd)
    return fd_duped;
}

File::UniqueID File::get_unique_id()
Great to finally be rid of this 💯
        if (n == 0)
            break;
        used_size += n;
    }
    return std::string(buffer.data(), used_size); // Throws
}

std::string util::load_file_and_chomp(const std::string& path)
Could you move the simple deletions/cleanups like this one into a separate PR? A lot of the file/mapping stuff is probably too interconnected to be worth extracting, but it would be nice to reduce the number of non-encryption-related clean-ups here. (These are all great though!)
There are a few steps in between, but this is actually connected to the encryption changes. Making File::seek() work for encrypted files was previously done with a global mutex which I wanted to kill. Rather than trying to figure out some locking scheme that would make seeking work, I updated all of our uses of File to not rely on seeking and instead issue atomic read/write calls at specific offsets. For each thing I had to update for this I first checked if it was actually still used and just deleted it if not.
Ideally the File changes would all be a separate commit that goes before the encryption changes but they are pretty entangled, largely because of the map_flags thing that was passed around through every layer without ever being used for anything.
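For readers unfamiliar with the pattern, a minimal sketch of seek-free positioned I/O using POSIX pread()/pwrite(); write_at/read_at are hypothetical names and not the actual File API, and on Windows the equivalent would be ReadFile()/WriteFile() with an OVERLAPPED offset:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <unistd.h>

// Illustrative sketch: positioned I/O carries the offset in the call itself,
// so concurrent users of the same fd never race on a shared file position the
// way lseek() followed by read()/write() would.
void write_at(int fd, const void* data, std::size_t size, std::int64_t offset)
{
    ssize_t n = ::pwrite(fd, data, size, static_cast<off_t>(offset));
    if (n < 0 || static_cast<std::size_t>(n) != size)
        throw std::runtime_error("pwrite() failed or was short");
}

void read_at(int fd, void* data, std::size_t size, std::int64_t offset)
{
    ssize_t n = ::pread(fd, data, size, static_cast<off_t>(offset));
    if (n < 0 || static_cast<std::size_t>(n) != size)
        throw std::runtime_error("pread() failed or was short");
}
```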
Nice improvements across the board 👍 I'm glad to see that the encryption code has been simplified as well. I gave this a fairly detailed review, but given the large scope of changes it may be best to get @finnschiermer to review as well in case I missed something.
I did a bit of benchmarking of this and concluded that while there are some things that are faster, it was really hard to actually trigger the cases where the performance pitfalls of the old code show up, which is a good thing I guess. As a result the only real performance change from this is that operations on unrelated encrypted files no longer sometimes block each other.
Very, very nice.
The global shared cache of encrypted file maps was originally required because we actually opened Realm files multiple times in normal usage, so each of the open files had to know about each other to copy things around. #4839 made it so that in normal usage we only ever have one DB instance per file per process, so it became dead code. Multiprocess encryption made it unnecessary even when the one-DB-per-process rule is violated, as the multiprocess code path covers that.
This eliminates our last reliance on file UniqueIDs, so it lets us get rid of hacks related to that.
The encryption page reclaimer mostly never actually worked. It used a very conservative page reclamation rule that meant that pages would never be reclaimed if there was a long-lived Transaction, even if it was frozen or kept refreshed. This is very common in practice, and when it doesn't happen the DB usually isn't kept open either, making it redundant.
Encryption used to rely on handling BAD_ACCESS signals (or mach exceptions) rather than explicit barriers, so it had to read and write in page-sized chunks. That's no longer the case, so we can eliminate a lot of complexity by always reading and writing in 4k blocks.
Our use of off_t meant that on Windows we didn't support >2GB files because off_t is 32-bit even on x64 Windows. The encryption layer now theoretically supports files up to 8 TB on 32-bit (which isn't relevant because SlabAlloc doesn't).
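As background (not from the PR itself): the 2GB cap comes from MSVC defining off_t as a 32-bit long even on x64; a rough sketch of the distinction, where FileOffset is a hypothetical alias rather than the actual type used:

```cpp
#include <cstdint>
#include <sys/types.h>

// Illustrative only: on MSVC, off_t is a 32-bit long even when targeting x64,
// so any file offset stored in an off_t silently caps files at 2 GB. Doing all
// offset arithmetic in a fixed 64-bit type avoids that cap on every platform.
using FileOffset = std::int64_t;

static_assert(sizeof(FileOffset) == 8, "offsets are 64-bit everywhere");
// The equivalent assertion on off_t fails when built with MSVC:
// static_assert(sizeof(off_t) == 8, "off_t is only 32 bits on MSVC");
```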
This makes it so that the multiprocess encryption codepaths can be tested in a single process, and in fact UNITTEST_ENCRYPT_ALL=1 will incidentally test it in a bunch of places. This revealed a preexisting bug:
When copying data to a StaleIV page we need to copy the entire page rather than just the modified bytes. We can't just mark the page as Clean because while it's fine for the reader mapping to see the data on disk rather than the newly written data while the write is still in progress, we wouldn't know when to actually reread the page.
We no longer use file seeking anywhere, and instead use explicit position offsets. File seeking is spooky when multiple threads are involved, and it required a lot of extra syscalls.
IV refreshing now involves fewer read() calls. I don't think this is actually a meaningful perf gain since they would all have been warm cache hits anyway. Might be faster, though.
The global mapping_mutex is gone and encryption operations on two different DBs can now happen concurrently.
The error messages when decryption fails now include a little more information.
Fixes #7743. Fixes #7744.