-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Secure Cell passphrase API: ThemisPP #588
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Lagovas
reviewed
Feb 3, 2020
Lagovas
approved these changes
Feb 3, 2020
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
This is how you say "AsRef<[u8]>" in C++. Yes, really. We are going to use this bunch of templates to accept anything that can be converted into a byte slice in new Secure Cell interface. Public interface: - themispp::input_buffer() templates Application code is allowed to invoke these freely. They are resilient to silly values, but until C++20 it is impossible to detect contiguous iterators reliably. Therefore is is technically possible to pass, say, std::deque<uint8_t> with inevitable probable segfault. Don't do that. (Thankfully, std::list and everything non-random-access is ruled out.) Private interface: - themispp::impl::input_buffer struct - themispp::impl::input_bytes() templates - themispp::impl::input_string() templates These functions are going to be used by ThemisPP internally. There are dummy identity conversion to handle themispp::input_buffer() results. String functions are used only by passphrase API constructors. They are not available to general users so that they don't pass std::string wherever they want. However, it is possible for users to provide specializations of all template functions in order to support their own custom containers. Tests provide an example of how this can be done. Note that the implementation is hidden in themispp::impl namespace which itself is mirrored in "themispp/impl" directory. This should prevent users from accidentally including and using these definitions. Also note the "compat" headers which are used to polyfill newer features of C++ when available. They are meant for internal use and should be used in pairs to avoid leaking magic macros to application code.
We are going to use this little class to store secret data inside Secure Cell. It is a wrapper over std::vector equipped with autowiping. The interface is restricted by design, limiting out own stupidity when coding ThemisPP. Use of soter_wipe() in ThemisPP in any form has a drawback: it makes application code dependent on Soter library directly. Previously only themispp::secure_rand_t (mostly unused) caused this, but now all new Secure Cell classes will trigger this. It's not an issue on Linux with its flat linker spaace, but virtually every other OS has nested linkage.
Implement Seal mode of Secure Cell, in both master key and passphrase flavors. The implementation is more or less straightforward, but there is one thing which must be commented. Initially I expected that allocator-aware templates can be placed directly into "themispp" namespace: auto cell = themispp::secure_cell_seal_with_passphrase("secret"); However, it turned out that C++ grammar requires (until C++17) the template brackets to be present at all times, even if all template arguments can be inferred. This results in silly-looking: auto cell = themispp::secure_cell_seal_with_passphrase<>("secret"); As a result, the actual implementation goes into "impl" subnamespace and proper name is reexported via a typedef. Well... it's C++, what can I do? Assuming that we do want to keep allocator awareness. New implementation gets a bunch of new test which verify both API and some behavior. In particular, they can be somewhat of a usage guide now. Though, they are still very much incomplete. Note that some tests are templated over "master key"/"passphrase" type. I really don't want to duplicate this code (it's enough KLOC here) *and* this ensures that both Secure Cell flavors have identical API. Win-win. (Actual implementation could be templated too, but that does not save much lines of code while making the implementation *much* more complex.) Also note a whole lot of "// NOLINT" comments. If we were not bound by C++03 compatibility we could have used modern alternatives, but C++03 makes it hard to write code that compiles with every standard so we use the greatest common denominator. Hence we need clang-tidy to shut up.
This one is easy to do: there are no overloads, simple API, there will never be a passphrase API for Context Imprint. So it's mostly copy-paste of the Seal mode. Just pay attention to the argument order in Core API. It's slightly different for Context Imprint mode. Since Context Imprint mode does not verify message validity, there is not much that we can verify in tests. But at least we check the GIGO behavior.
Ah! Token Protect API! The foster child of APIs in Themis which usually does not get enough attention and ends up as a second-class citizen. ThemisPP has specially obnoxious API for it, which is unique across all Themis wrappers. Well, not anymore. Instead of doing all that "set_token()"/"get_token()" we now return the encryption results as std::pair (with a couple of better accessors bolted onto it). This also gives us nice destructuring API which has finally arrived in C++17. Other than that, the implementation is pretty unremarkable. Mind the argument order and you'll be fine. The tests are templatized like Seal mode to make it easier to add passphrase support later. Note that we very message and token separately.
Yes, placement of attributes is... questionable but that's how you do it in C++. Don't ask me, I have no clue and don't want to know. There is probably a 75-page paper in C++ committee proceedings which explains in great detail why it must be done this way and absolutely cannot be done in a different, more sane way. clang-format does not make it look pretty either. Oh well... At least this code looks ugly and *definitely* deprecated.
- Autodetect C++ standard version to prevent clang-format from fixing "> >" into ">>" in template usage like "foo<bar = <baz> >" which does not parse correctly in C++03
Replace whatever we were calling 'passwords' with something that looks like a key and is named like one.
Well... project... The code style here leaves much to be desired, but I'm not going to fix it up right now. Just update it to use the latest API which is appropriate here.
PR #577 has been merged. I've rebased and retargeted this PR to master with no changes. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Add support of Secure Cell passphrase API to ThemisPP. The API
draftis described in RFC 3.1.User notes
Passphrase-based interface of Secure Cell allows you to use short and memorable passphrases to secure your data. While symmetric keys are more secure, they are also longer and much harder for humans to remember.
Here is how you can use passphrases with Secure Cell:
Passphrase API accepts passphrases as relatively short strings, suitable for human memory. Master key API uses randomly generated, long binary keys, which are more suitable for machines to remember. However, they are also more efficient and generally more secure due to considerable length. You should prefer to use keys over passphrases if there are no humans involved. The interface is almost the same:
Other breaking changes
New Token Protect API
Previously Token Protect API was used like this:
This API was not thread-safe and prone to mistakes such as forgetting to set (or update) the authentication token before decryption.
New API provides much simpler and safer interface:
Note that APIs are incompatible and you will need to update encryption/decryption call sites during migration. Simply renaming Secure Cell classes will not be enough.
Iterator pair support
New API does not support iterator pairs (aka ranges or spans) directly for maintainability reasons. Avoiding direct iterator pair support allows us to have much fewer method overloads and keeps the code signficantly easier to follow.
If you have been using iterator pairs:
then you need to wrap them in
themispp::input_buffer()
with new API:auto cell = themispp::secure_cell_seal_with_key(themispp::input_buffer(key_begin, key_end));
This is actual for all Secure Cell modes and all methods (construction, encryption, decryption).
Technical notes
As the saying goes, no battle plan survives contact with the enemy and this case is no exception. The implementation looks a little bit different from what is prescribed by the RFCs. However, user API says the same. Well, in fact, I consider C++ as a language to be hostile actor so this is more or less expected outcome.
This pull request is the most complex, the most unreadable, ridden with bugs and undefined behavior, and unpleasant in general of them all. In short, your typical C++ code. I started with C++ to run away from it as fast possible (as a side effect, it should run as fast as possible too).
I do not expect you to review it quickly. Take your time. Just open it up when you want to have your fix of suffering for the day or something.
Rant full of profanity
Why do you have to write all those
input_buffer
again and again – or pull in Boost monsters to avoid copying the code – instead of simply sayingimpl AsRef<[u8]>
? *weeps in Rust*Why do you have to have a PhD to write generic library code in C++? And it’s not PhD in math as you‘d probably need for Haskell. A PhD in law and criminal psychopathology will be much more appropriate here.
This is what happens to a language if a committee starts piling features onto features without ever deprecating anything. And in case your opponent attempts a counterargument you simply need to throw you hands up in the air and repeat “...but backwards compatibility!” and “undefined behavior!” until they go away.
Bjarne said that “there are two kinds of languages: the ones people complain about and the ones nobody uses”. So apparently they decided to design a language as bitchworthy as possible to get the adoption as high as C++ has.
But at least I‘m happy with this code. I really am. This is probably some kind of a Stockholm syndrome. Like, ‘clever’ languages like C++ reward ‘clever’ code. You get an immense sense of accomplishment once your tests actually compile and work the way you want to. This is where ‘boring’ languages like, say, Go or Java are different. C++ rewards you for writing clever, beautiful code. The only moment a program in Go is going to reward you is when the damn code does what it‘s supposed to be doing. Unfulfilling, boring, ugly as hell, but that‘s what people need – working code – as opposed to beautiful code that pleases the compiler. I‘m sorry I‘m weak before the machines but that‘s how it is.
I hope I don‘t have to touch this code again with a three-meter pole for the next year.
Whew! I feel a little bit better. Let’s do business now...
There‘s like 2.6k lines of code here. A significant part of them is documentation comments, copypasta in tests, dumb glue FFI code to interface with Themis Core, and a whole bunch of C++ magic to make all this work somehow. You‘ll probably have a better code review experience if you walk through it commit by commit.
Organizational notes
🚧 This branch depends on #577 so it has to wait before it is merged. To get better diffs the PR is targeted to a temporary branch which has to be removed before this PR is merged.
Once dependencies are merged, do the following:
Follow the order. If the base branch is deleted early then this PR might get autoclosed.
Checklist
Benchmark results are attached(pls no, I don‘t want to bench wrappers this month)