Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Secure Cell passphrase API: ThemisPP #588

Merged
merged 10 commits into from
Feb 27, 2020
Merged

Conversation

ilammy
Copy link
Collaborator

@ilammy ilammy commented Jan 31, 2020

Add support of Secure Cell passphrase API to ThemisPP. The API draft is described in RFC 3.1.

User notes

Passphrase-based interface of Secure Cell allows you to use short and memorable passphrases to secure your data. While symmetric keys are more secure, they are also longer and much harder for humans to remember.

Here is how you can use passphrases with Secure Cell:

#include <themispp/secure_cell.hpp>

auto cell = themispp::secure_cell_seal_with_passphrase("secret");

uint8_t message[] = "precious message";

std::vector<uint8_t> encrypted = cell.encrypt(message);
std::vector<uint8_t> decrypted = cell.decrypt(encrypted);

EXPECT_EQ(decrypted, message);

Passphrase API accepts passphrases as relatively short strings, suitable for human memory. Master key API uses randomly generated, long binary keys, which are more suitable for machines to remember. However, they are also more efficient and generally more secure due to considerable length. You should prefer to use keys over passphrases if there are no humans involved. The interface is almost the same:

#include <themispp/secure_keygen.hpp>

// Get a new key if you don't have one already:
std::vector<uint8_t> master_key = themispp::gen_sym_key();
// Or use an existing value that you store somewhere:
std::vector<uint8_t> master_key = base64::decode("b0gyNlM4LTFKRDI5anFIRGJ4SmQyLGE7MXN5YWUzR2U=");

auto cell = themispp::secure_cell_seal_with_key(master_key);

uint8_t message[] = "precious message";

std::vector<uint8_t> encrypted = cell.encrypt(message);
std::vector<uint8_t> decrypted = cell.decrypt(encrypted);

EXPECT_EQ(decrypted, message);

⚠️ NOTE: themispp::secure_cell_seal_t types are deprecated. Please use less ambiguous themispp::secure_cell_seal_with_key instead.

Other breaking changes

New Token Protect API

Previously Token Protect API was used like this:

auto cell = themispp::secure_cell_token_protect_t(master_key);

auto encrypted = cell.encrypt(message);
auto token = cell.get_token();

// ...

cell.set_token(token);
auto decrypted = cell.decrypt(encrypted);

This API was not thread-safe and prone to mistakes such as forgetting to set (or update) the authentication token before decryption.

New API provides much simpler and safer interface:

auto cell = themispp::secure_cell_token_protect_with_key(master_key);

// Since C++17:
auto [encrypted, token] = cell.encrypt(message);

// Since C++11:
std::vector<uint8_t> encrypted, token;
std::tie(encrypted, token) = cell.encrypt(message);

// C++03:
typedef themispp::secure_cell_token_protect_with_key::output_pair token_pair;
token_pair result = cell.encrypt(message);
// Access result.encrypted() and result.token()

// ...

auto decrypted = cell.decrypt(encrypted, token);

Note that APIs are incompatible and you will need to update encryption/decryption call sites during migration. Simply renaming Secure Cell classes will not be enough.

Iterator pair support

New API does not support iterator pairs (aka ranges or spans) directly for maintainability reasons. Avoiding direct iterator pair support allows us to have much fewer method overloads and keeps the code signficantly easier to follow.

If you have been using iterator pairs:

auto cell = themispp::secure_cell_seal_t(key_begin, key_end);

then you need to wrap them in themispp::input_buffer() with new API:

auto cell = themispp::secure_cell_seal_with_key(themispp::input_buffer(key_begin, key_end));

This is actual for all Secure Cell modes and all methods (construction, encryption, decryption).

Technical notes

As the saying goes, no battle plan survives contact with the enemy and this case is no exception. The implementation looks a little bit different from what is prescribed by the RFCs. However, user API says the same. Well, in fact, I consider C++ as a language to be hostile actor so this is more or less expected outcome.

This pull request is the most complex, the most unreadable, ridden with bugs and undefined behavior, and unpleasant in general of them all. In short, your typical C++ code. I started with C++ to run away from it as fast possible (as a side effect, it should run as fast as possible too).

I do not expect you to review it quickly. Take your time. Just open it up when you want to have your fix of suffering for the day or something.

Rant full of profanity

Why do you have to write all those input_buffer again and again – or pull in Boost monsters to avoid copying the code – instead of simply saying impl AsRef<[u8]>? *weeps in Rust*

Why do you have to have a PhD to write generic library code in C++? And it’s not PhD in math as you‘d probably need for Haskell. A PhD in law and criminal psychopathology will be much more appropriate here.

This is what happens to a language if a committee starts piling features onto features without ever deprecating anything. And in case your opponent attempts a counterargument you simply need to throw you hands up in the air and repeat “...but backwards compatibility!” and “undefined behavior!” until they go away.

Bjarne said that “there are two kinds of languages: the ones people complain about and the ones nobody uses”. So apparently they decided to design a language as bitchworthy as possible to get the adoption as high as C++ has.

But at least I‘m happy with this code. I really am. This is probably some kind of a Stockholm syndrome. Like, ‘clever’ languages like C++ reward ‘clever’ code. You get an immense sense of accomplishment once your tests actually compile and work the way you want to. This is where ‘boring’ languages like, say, Go or Java are different. C++ rewards you for writing clever, beautiful code. The only moment a program in Go is going to reward you is when the damn code does what it‘s supposed to be doing. Unfulfilling, boring, ugly as hell, but that‘s what people need – working code – as opposed to beautiful code that pleases the compiler. I‘m sorry I‘m weak before the machines but that‘s how it is.

I hope I don‘t have to touch this code again with a three-meter pole for the next year.

Whew! I feel a little bit better. Let’s do business now...

There‘s like 2.6k lines of code here. A significant part of them is documentation comments, copypasta in tests, dumb glue FFI code to interface with Themis Core, and a whole bunch of C++ magic to make all this work somehow. You‘ll probably have a better code review experience if you walk through it commit by commit.

Organizational notes

🚧 This branch depends on #577 so it has to wait before it is merged. To get better diffs the PR is targeted to a temporary branch which has to be removed before this PR is merged.

Once dependencies are merged, do the following:

  • rebase ilammy:kdf/c++ onto cossacklabs:master
  • retarget this PR to master
  • delete ilammy/kdf/core

Follow the order. If the base branch is deleted early then this PR might get autoclosed.

Checklist

  • Change is covered by automated tests
  • Benchmark results are attached (pls no, I don‘t want to bench wrappers this month)
  • The coding guidelines are followed
  • Public API has proper documentation
  • Example projects and code samples are updated
  • Changelog is updated

@ilammy ilammy added the W-ThemisPP ⚔️ Wrapper: ThemisPP, C++ API label Jan 31, 2020
Copy link
Collaborator

@Lagovas Lagovas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

This is how you say "AsRef<[u8]>" in C++. Yes, really.

We are going to use this bunch of templates to accept anything that
can be converted into a byte slice in new Secure Cell interface.

Public interface:

- themispp::input_buffer() templates

Application code is allowed to invoke these freely. They are resilient
to silly values, but until C++20 it is impossible to detect contiguous
iterators reliably. Therefore is is technically possible to pass, say,
std::deque<uint8_t> with inevitable probable segfault. Don't do that.
(Thankfully, std::list and everything non-random-access is ruled out.)

Private interface:

- themispp::impl::input_buffer struct
- themispp::impl::input_bytes() templates
- themispp::impl::input_string() templates

These functions are going to be used by ThemisPP internally. There are
dummy identity conversion to handle themispp::input_buffer() results.
String functions are used only by passphrase API constructors. They are
not available to general users so that they don't pass std::string
wherever they want.

However, it is possible for users to provide specializations of all
template functions in order to support their own custom containers.
Tests provide an example of how this can be done.

Note that the implementation is hidden in themispp::impl namespace
which itself is mirrored in "themispp/impl" directory. This should
prevent users from accidentally including and using these definitions.

Also note the "compat" headers which are used to polyfill newer features
of C++ when available. They are meant for internal use and should be
used in pairs to avoid leaking magic macros to application code.
We are going to use this little class to store secret data inside
Secure Cell. It is a wrapper over std::vector equipped with autowiping.
The interface is restricted by design, limiting out own stupidity
when coding ThemisPP.

Use of soter_wipe() in ThemisPP in any form has a drawback: it makes
application code dependent on Soter library directly. Previously only
themispp::secure_rand_t (mostly unused) caused this, but now all new
Secure Cell classes will trigger this. It's not an issue on Linux with
its flat linker spaace, but virtually every other OS has nested linkage.
Implement Seal mode of Secure Cell, in both master key and passphrase
flavors. The implementation is more or less straightforward, but there
is one thing which must be commented. Initially I expected that
allocator-aware templates can be placed directly into "themispp"
namespace:

    auto cell = themispp::secure_cell_seal_with_passphrase("secret");

However, it turned out that C++ grammar requires (until C++17) the
template brackets to be present at all times, even if all template
arguments can be inferred. This results in silly-looking:

    auto cell = themispp::secure_cell_seal_with_passphrase<>("secret");

As a result, the actual implementation goes into "impl" subnamespace and
proper name is reexported via a typedef. Well... it's C++, what can I do?
Assuming that we do want to keep allocator awareness.

New implementation gets a bunch of new test which verify both API and
some behavior. In particular, they can be somewhat of a usage guide now.
Though, they are still very much incomplete.

Note that some tests are templated over "master key"/"passphrase" type.
I really don't want to duplicate this code (it's enough KLOC here) *and*
this ensures that both Secure Cell flavors have identical API. Win-win.
(Actual implementation could be templated too, but that does not save
much lines of code while making the implementation *much* more complex.)

Also note a whole lot of "// NOLINT" comments. If we were not bound by
C++03 compatibility we could have used modern alternatives, but C++03
makes it hard to write code that compiles with every standard so we use
the greatest common denominator. Hence we need clang-tidy to shut up.
This one is easy to do: there are no overloads, simple API, there will
never be a passphrase API for Context Imprint. So it's mostly copy-paste
of the Seal mode. Just pay attention to the argument order in Core API.
It's slightly different for Context Imprint mode.

Since Context Imprint mode does not verify message validity, there is
not much that we can verify in tests. But at least we check the GIGO
behavior.
Ah! Token Protect API! The foster child of APIs in Themis which usually
does not get enough attention and ends up as a second-class citizen.
ThemisPP has specially obnoxious API for it, which is unique across all
Themis wrappers. Well, not anymore.

Instead of doing all that "set_token()"/"get_token()" we now return the
encryption results as std::pair (with a couple of better accessors
bolted onto it). This also gives us nice destructuring API which has
finally arrived in C++17.

Other than that, the implementation is pretty unremarkable. Mind the
argument order and you'll be fine.

The tests are templatized like Seal mode to make it easier to add
passphrase support later. Note that we very message and token
separately.
Yes, placement of attributes is... questionable but that's how you do it
in C++. Don't ask me, I have no clue and don't want to know. There is
probably a 75-page paper in C++ committee proceedings which explains in
great detail why it must be done this way and absolutely cannot be done
in a different, more sane way.

clang-format does not make it look pretty either. Oh well... At least
this code looks ugly and *definitely* deprecated.
- Autodetect C++ standard version to prevent clang-format from fixing
  "> >" into ">>" in template usage like "foo<bar = <baz> >" which
  does not parse correctly in C++03
Replace whatever we were calling 'passwords' with something that looks
like a key and is named like one.
Well... project... The code style here leaves much to be desired, but
I'm not going to fix it up right now. Just update it to use the latest
API which is appropriate here.
@ilammy ilammy changed the base branch from ilammy/kdf/core to master February 12, 2020 20:02
@ilammy
Copy link
Collaborator Author

ilammy commented Feb 12, 2020

PR #577 has been merged. I've rebased and retargeted this PR to master with no changes.

@ilammy ilammy merged commit e3f3b5b into cossacklabs:master Feb 27, 2020
@ilammy ilammy deleted the kdf/c++ branch March 5, 2020 12:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
W-ThemisPP ⚔️ Wrapper: ThemisPP, C++ API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants