Secure Cell passphrase API: ThemisPP #588

ilammy · 2020-01-31T21:45:46Z

Add support for the Secure Cell passphrase API to ThemisPP. The API ~~draft~~ is described in RFC 3.1.

User notes

Passphrase-based interface of Secure Cell allows you to use short and memorable passphrases to secure your data. While symmetric keys are more secure, they are also longer and much harder for humans to remember.

Here is how you can use passphrases with Secure Cell:

#include <themispp/secure_cell.hpp>

auto cell = themispp::secure_cell_seal_with_passphrase("secret");

uint8_t message[] = "precious message";

std::vector<uint8_t> encrypted = cell.encrypt(message);
std::vector<uint8_t> decrypted = cell.decrypt(encrypted);

EXPECT_EQ(decrypted, message);

The passphrase API accepts passphrases as relatively short strings, suitable for human memory. The master key API uses long, randomly generated binary keys, which are more suitable for machines to remember. Keys are also more efficient and generally more secure due to their considerable length. You should prefer keys over passphrases if there are no humans involved. The interface is almost the same:

#include <themispp/secure_keygen.hpp>

// Get a new key if you don't have one already:
std::vector<uint8_t> master_key = themispp::gen_sym_key();
// Or use an existing value that you store somewhere:
std::vector<uint8_t> master_key = base64::decode("b0gyNlM4LTFKRDI5anFIRGJ4SmQyLGE7MXN5YWUzR2U=");

auto cell = themispp::secure_cell_seal_with_key(master_key);

uint8_t message[] = "precious message";

std::vector<uint8_t> encrypted = cell.encrypt(message);
std::vector<uint8_t> decrypted = cell.decrypt(encrypted);

EXPECT_EQ(decrypted, message);

⚠️ NOTE: themispp::secure_cell_seal_t types are deprecated. Please use less ambiguous themispp::secure_cell_seal_with_key instead.

Other breaking changes

New Token Protect API

Previously Token Protect API was used like this:

auto cell = themispp::secure_cell_token_protect_t(master_key);

auto encrypted = cell.encrypt(message);
auto token = cell.get_token();

// ...

cell.set_token(token);
auto decrypted = cell.decrypt(encrypted);

This API was not thread-safe and prone to mistakes such as forgetting to set (or update) the authentication token before decryption.

The new API provides a much simpler and safer interface:

auto cell = themispp::secure_cell_token_protect_with_key(master_key);

// Since C++17:
auto [encrypted, token] = cell.encrypt(message);

// Since C++11:
std::vector<uint8_t> encrypted, token;
std::tie(encrypted, token) = cell.encrypt(message);

// C++03:
typedef themispp::secure_cell_token_protect_with_key::output_pair token_pair;
token_pair result = cell.encrypt(message);
// Access result.encrypted() and result.token()

// ...

auto decrypted = cell.decrypt(encrypted, token);

Note that the old and new APIs are incompatible, so you will need to update encryption/decryption call sites during migration. Simply renaming Secure Cell classes will not be enough.
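To illustrate how a single return value can serve the C++03, C++11, and C++17 call styles, here is a minimal self-contained sketch. It is not the ThemisPP implementation: sketch::output_pair and sketch::fake_encrypt are hypothetical names, and the "encryption" only copies bytes so that the example runs without Themis.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>
#include <utility>
#include <vector>

namespace sketch {

typedef std::vector<uint8_t> bytes;

// Hypothetical result type: a std::pair with named accessors bolted on.
// Because it derives from std::pair, it can still be assigned to std::tie().
struct output_pair : std::pair<bytes, bytes> {
    output_pair(const bytes& encrypted, const bytes& token)
        : std::pair<bytes, bytes>(encrypted, token)
    {
    }

    bytes& encrypted() { return first; }
    bytes& token() { return second; }
};

// Stand-in for cell.encrypt(). This is NOT cryptography: it returns the
// message unchanged plus a made-up one-byte "token" holding its length.
inline output_pair fake_encrypt(const bytes& message)
{
    bytes token(1, static_cast<uint8_t>(message.size()));
    return output_pair(message, token);
}

} // namespace sketch
```

Since both outputs travel together in the return value, there is no per-object token state left to forget or to race on, which is the point of the redesign described above.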

Iterator pair support

The new API does not support iterator pairs (aka ranges or spans) directly, for maintainability reasons. Avoiding direct iterator pair support allows us to have far fewer method overloads and keeps the code significantly easier to follow.

If you have been using iterator pairs:

auto cell = themispp::secure_cell_seal_t(key_begin, key_end);

then you need to wrap them in themispp::input_buffer() with new API:

auto cell = themispp::secure_cell_seal_with_key(themispp::input_buffer(key_begin, key_end));

This applies to all Secure Cell modes and all methods (construction, encryption, decryption).
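The idea behind wrapping iterator pairs can be sketched without Themis. In the sketch below, byte_view, the two input_buffer overloads, and checksum are hypothetical stand-ins in a sketch namespace, not the real themispp definitions. It assumes the iterators point into contiguous storage, which the code cannot check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Hypothetical borrowed view of bytes: pointer plus length, no ownership.
struct byte_view {
    const uint8_t* data;
    std::size_t size;

    byte_view(const uint8_t* d, std::size_t s)
        : data(d)
        , size(s)
    {
    }
};

// Wrap a whole contiguous container.
inline byte_view input_buffer(const std::vector<uint8_t>& v)
{
    return byte_view(v.empty() ? 0 : &v[0], v.size());
}

// Wrap an iterator pair. ASSUMPTION: the iterators are random-access and
// refer to contiguous memory; nothing here can verify the latter.
template <typename Iterator>
inline byte_view input_buffer(Iterator begin, Iterator end)
{
    if (begin == end) {
        return byte_view(0, 0);
    }
    return byte_view(reinterpret_cast<const uint8_t*>(&*begin),
                     static_cast<std::size_t>(end - begin));
}

// The consumer needs only one signature, whatever the caller started with.
inline unsigned checksum(byte_view input)
{
    unsigned sum = 0;
    for (std::size_t i = 0; i < input.size; ++i) {
        sum += input.data[i];
    }
    return sum;
}

} // namespace sketch
```

With this shape, each consuming method takes a single view parameter instead of one overload per input form.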

Technical notes

As the saying goes, no battle plan survives contact with the enemy, and this case is no exception. The implementation looks a little bit different from what is prescribed by the RFCs. However, the user API stays the same. Well, in fact, I consider C++ as a language to be a hostile actor, so this is more or less the expected outcome.

This pull request is the most complex, the most unreadable, ridden with bugs and undefined behavior, and unpleasant in general of them all. In short, your typical C++ code. I started with C++ to run away from it as fast as possible (as a side effect, it should run as fast as possible too).

I do not expect you to review it quickly. Take your time. Just open it up when you want to have your fix of suffering for the day or something.

Rant full of profanity

Why do you have to write all those input_buffer again and again – or pull in Boost monsters to avoid copying the code – instead of simply saying impl AsRef<[u8]>? *weeps in Rust*

Why do you have to have a PhD to write generic library code in C++? And it’s not a PhD in math, as you’d probably need for Haskell. A PhD in law and criminal psychopathology will be much more appropriate here.

This is what happens to a language if a committee starts piling features onto features without ever deprecating anything. And in case your opponent attempts a counterargument, you simply need to throw your hands up in the air and repeat “...but backwards compatibility!” and “undefined behavior!” until they go away.

Bjarne said that “there are two kinds of languages: the ones people complain about and the ones nobody uses”. So apparently they decided to design a language as bitchworthy as possible to get the adoption as high as C++ has.

But at least I’m happy with this code. I really am. This is probably some kind of a Stockholm syndrome. Like, ‘clever’ languages like C++ reward ‘clever’ code. You get an immense sense of accomplishment once your tests actually compile and work the way you want them to. This is where ‘boring’ languages like, say, Go or Java are different. C++ rewards you for writing clever, beautiful code. The only moment a program in Go is going to reward you is when the damn code does what it’s supposed to be doing. Unfulfilling, boring, ugly as hell, but that’s what people need – working code – as opposed to beautiful code that pleases the compiler. I’m sorry I’m weak before the machines but that’s how it is.

I hope I don’t have to touch this code again with a three-meter pole for the next year.

Whew! I feel a little bit better. Let’s do business now...

There’s like 2.6k lines of code here. A significant part of them is documentation comments, copypasta in tests, dumb glue FFI code to interface with Themis Core, and a whole bunch of C++ magic to make all this work somehow. You’ll probably have a better code review experience if you walk through it commit by commit.

Organizational notes

🚧 This branch depends on #577 so it has to wait before it is merged. To get better diffs the PR is targeted to a temporary branch which has to be removed before this PR is merged.

Once dependencies are merged, do the following:

rebase ilammy:kdf/c++ onto cossacklabs:master
retarget this PR to master
delete ilammy/kdf/core

Follow the order. If the base branch is deleted early then this PR might get autoclosed.

Checklist

Change is covered by automated tests
~~Benchmark results are attached~~ (pls no, I don’t want to bench wrappers this month)
The coding guidelines are followed
Public API has proper documentation
Example projects and code samples are updated
Changelog is updated

src/wrappers/themis/themispp/impl/input_buffer.hpp

src/wrappers/themis/themispp/secure_cell_seal.hpp

Lagovas

lgtm

This is how you say "AsRef<[u8]>" in C++. Yes, really. We are going to use this bunch of templates to accept anything that can be converted into a byte slice in the new Secure Cell interface.

Public interface:

- themispp::input_buffer() templates

Application code is allowed to invoke these freely. They are resilient to silly values, but until C++20 it is impossible to detect contiguous iterators reliably. Therefore it is technically possible to pass, say, std::deque<uint8_t>, with a probable segfault as the result. Don't do that. (Thankfully, std::list and everything non-random-access is ruled out.)

Private interface:

- themispp::impl::input_buffer struct
- themispp::impl::input_bytes() templates
- themispp::impl::input_string() templates

These functions are going to be used by ThemisPP internally. There are dummy identity conversions to handle themispp::input_buffer() results. String functions are used only by passphrase API constructors. They are not available to general users so that they don't pass std::string wherever they want. However, it is possible for users to provide specializations of all template functions in order to support their own custom containers. Tests provide an example of how this can be done.

Note that the implementation is hidden in the themispp::impl namespace, which itself is mirrored in the "themispp/impl" directory. This should prevent users from accidentally including and using these definitions.

Also note the "compat" headers which are used to polyfill newer features of C++ when available. They are meant for internal use and should be used in pairs to avoid leaking magic macros to application code.
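The std::deque caveat can be seen with a small C++11 check. The sketch below is an illustration, not the ThemisPP code: sketch::is_random_access is a hypothetical trait built on the iterator category tag, which is the only information available before C++20.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

namespace sketch {

// Hypothetical trait: true when the iterator category is random-access.
// This is the strongest portable check before C++20, and it cannot tell
// contiguous storage (std::vector, raw pointers) apart from segmented
// storage (std::deque).
template <typename Iterator>
struct is_random_access
    : std::is_base_of<std::random_access_iterator_tag,
                      typename std::iterator_traits<Iterator>::iterator_category> {
};

} // namespace sketch
```

With this trait, std::vector<uint8_t>::iterator and const uint8_t* are accepted and std::list<uint8_t>::iterator is rejected, but std::deque<uint8_t>::iterator is accepted as well, even though taking the address of its first element and reading onward runs past the first storage segment.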

We are going to use this little class to store secret data inside Secure Cell. It is a wrapper over std::vector equipped with autowiping. The interface is restricted by design, limiting our own stupidity when coding ThemisPP.

Use of soter_wipe() in ThemisPP in any form has a drawback: it makes application code dependent on the Soter library directly. Previously only themispp::secure_rand_t (mostly unused) caused this, but now all new Secure Cell classes will trigger this. It's not an issue on Linux with its flat linker space, but virtually every other OS has nested linkage.
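An autowiping wrapper of this kind might look roughly as follows. This is a sketch under assumptions, not the class from this pull request: sketch::secret_buffer and sketch::wipe are hypothetical names, and wipe() is a plain volatile-store loop standing in for soter_wipe() so that the example has no Soter dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Stand-in for soter_wipe(). Writing through a volatile pointer makes it
// less likely that the compiler drops the stores, but this is a
// simplification and not a hardened wipe.
inline void wipe(uint8_t* data, std::size_t size)
{
    volatile uint8_t* p = data;
    for (std::size_t i = 0; i < size; ++i) {
        p[i] = 0;
    }
}

// Hypothetical secret storage: a std::vector that zeroes its contents
// before releasing them. The interface is intentionally tiny.
class secret_buffer {
public:
    secret_buffer(const uint8_t* data, std::size_t size)
        : m_bytes(data, data + size)
    {
    }

    ~secret_buffer() { clear(); }

    const uint8_t* data() const { return m_bytes.empty() ? 0 : &m_bytes[0]; }
    std::size_t size() const { return m_bytes.size(); }

    // Zero the bytes first, then drop them.
    void clear()
    {
        if (!m_bytes.empty()) {
            wipe(&m_bytes[0], m_bytes.size());
        }
        m_bytes.clear();
    }

private:
    std::vector<uint8_t> m_bytes;
};

} // namespace sketch
```

Exposing only data(), size(), and clear() means there is no resize or append path through which a reallocation could leave an unwiped copy behind.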

Implement Seal mode of Secure Cell, in both master key and passphrase flavors.

The implementation is more or less straightforward, but there is one thing which must be commented on. Initially I expected that allocator-aware templates can be placed directly into the "themispp" namespace:

auto cell = themispp::secure_cell_seal_with_passphrase("secret");

However, it turned out that C++ grammar requires (until C++17) the template brackets to be present at all times, even if all template arguments can be inferred. This results in the silly-looking:

auto cell = themispp::secure_cell_seal_with_passphrase<>("secret");

As a result, the actual implementation goes into the "impl" subnamespace and the proper name is reexported via a typedef. Well... it's C++, what can I do? Assuming that we do want to keep allocator awareness.

The new implementation gets a bunch of new tests which verify both the API and some behavior. In particular, they can be somewhat of a usage guide now. Though, they are still very much incomplete.

Note that some tests are templated over the "master key"/"passphrase" type. I really don't want to duplicate this code (it's enough KLOC here) *and* this ensures that both Secure Cell flavors have identical API. Win-win. (The actual implementation could be templated too, but that does not save many lines of code while making the implementation *much* more complex.)

Also note a whole lot of "// NOLINT" comments. If we were not bound by C++03 compatibility we could have used modern alternatives, but C++03 makes it hard to write code that compiles with every standard, so we use the greatest common denominator. Hence we need clang-tidy to shut up.
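The typedef re-export trick can be shown in isolation. In this sketch, basic_cell_with_passphrase and cell_with_passphrase are hypothetical names, and the class only stores the passphrase bytes; it illustrates the naming pattern, not the real Secure Cell.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

namespace sketch {
namespace impl {

// Allocator-aware template with a default argument. Before C++17 it must be
// spelled basic_cell_with_passphrase<> even when the default is wanted.
template <typename Alloc = std::allocator<uint8_t> >
class basic_cell_with_passphrase {
public:
    explicit basic_cell_with_passphrase(const std::string& passphrase)
        : m_passphrase(passphrase.begin(), passphrase.end())
    {
    }

    std::size_t passphrase_length() const { return m_passphrase.size(); }

private:
    std::vector<uint8_t, Alloc> m_passphrase;
};

} // namespace impl

// Re-export the default instantiation under a bracket-free name.
typedef impl::basic_cell_with_passphrase<> cell_with_passphrase;

} // namespace sketch
```

Callers can then write sketch::cell_with_passphrase cell("secret"); in any standard, while code that needs a custom allocator can still name the template in impl directly.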

This one is easy to do: there are no overloads, simple API, there will never be a passphrase API for Context Imprint. So it's mostly copy-paste of the Seal mode. Just pay attention to the argument order in Core API. It's slightly different for Context Imprint mode. Since Context Imprint mode does not verify message validity, there is not much that we can verify in tests. But at least we check the GIGO behavior.

Ah! Token Protect API! The foster child of APIs in Themis which usually does not get enough attention and ends up as a second-class citizen. ThemisPP has an especially obnoxious API for it, which is unique across all Themis wrappers. Well, not anymore. Instead of doing all that "set_token()"/"get_token()" we now return the encryption results as std::pair (with a couple of better accessors bolted onto it). This also gives us a nice destructuring API which has finally arrived in C++17. Other than that, the implementation is pretty unremarkable. Mind the argument order and you'll be fine. The tests are templatized like Seal mode to make it easier to add passphrase support later. Note that we verify message and token separately.

Yes, placement of attributes is... questionable but that's how you do it in C++. Don't ask me, I have no clue and don't want to know. There is probably a 75-page paper in C++ committee proceedings which explains in great detail why it must be done this way and absolutely cannot be done in a different, more sane way. clang-format does not make it look pretty either. Oh well... At least this code looks ugly and *definitely* deprecated.

- Autodetect C++ standard version to prevent clang-format from fixing "> >" into ">>" in template usage like "foo<bar = <baz> >" which does not parse correctly in C++03

Replace whatever we were calling 'passwords' with something that looks like a key and is named like one.

Well... project... The code style here leaves much to be desired, but I'm not going to fix it up right now. Just update it to use the latest API which is appropriate here.

ilammy · 2020-02-12T20:04:33Z

PR #577 has been merged. I've rebased and retargeted this PR to master with no changes.

ilammy added the W-ThemisPP ⚔️ Wrapper: ThemisPP, C++ API label Jan 31, 2020

ilammy requested review from Lagovas, shadinua and vixentael as code owners January 31, 2020 21:45

Lagovas reviewed Feb 3, 2020

Lagovas approved these changes Feb 3, 2020

ilammy added 10 commits February 12, 2020 22:01

Update clang-format rules

9cd1d2f

- Autodetect C++ standard version to prevent clang-format from fixing "> >" into ">>" in template usage like "foo<bar = <baz> >" which does not parse correctly in C++03

Note new API in changelog

7cb57fa

Use wording "key" with old master key API

228eb8f

Replace whatever we were calling 'passwords' with something that looks like a key and is named like one.

Use passphrase API in example project

353491e

Well... project... The code style here leaves much to be desired, but I'm not going to fix it up right now. Just update it to use the latest API which is appropriate here.

ilammy force-pushed the kdf/c++ branch from c20816b to 353491e Compare February 12, 2020 20:01

ilammy requested review from ignatk and storojs72 as code owners February 12, 2020 20:01

ilammy changed the base branch from ilammy/kdf/core to master February 12, 2020 20:02

ilammy merged commit e3f3b5b into cossacklabs:master Feb 27, 2020

ilammy deleted the kdf/c++ branch March 5, 2020 12:53