Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update header parser in Secure Cell master key API #592

Merged
merged 9 commits into from
Mar 5, 2020

Conversation

ilammy
Copy link
Collaborator

@ilammy ilammy commented Feb 12, 2020

Passphrase API added in #577 has added quite a few new utilities and more robust and readable approach to parsing Secure Cell headers. Fuzzing with AFL in #579 and #583 has uncovered a bunch of sleeper issues in parsing code. This PR incorporates those improvements and updates master key API implementation to resolve all those issues.

First bunch of commits unifies master key implementation with passphrase one, now they are more or less similar, with differences only in KDF processing and header handling. Maybe later we will be able to reduce code duplication even further.

Some next commits resolve miscellaneous issues found in #583 for passphrase API which were also actual for master key API. This should unbreak fuzzing tests performed by #579.

And there are some minor commits with “collateral damage”: extracted common code from passphrase API implementation and some fixups in tests to expect different error codes.

Checklist

  • Change is covered by automated tests
  • Benchmark results are attached (manually compare with master)
  • The coding guidelines are followed
  • Public API has proper documentation (no changes)
  • Changelog is updated (will update in bulk before release)

Just like with passphrase API, introduce structures, serializers and
parsers for header used by master key API. It's mostly identical but
lacks KDF context.
Make it read like the current passphrase API using the same parsing
techniques. This resolves a bunch of possible vulnerabilities found
by AFL fuzzer which could be triggered by specific corruptions in
the header.

We reuse some utilities from passphrase API files so move them into
common internal header.

Note how we treat the compatibility path. Themis 0.9.6 used to use
incompatble format for one of the context fields used for Soter KDF
which resulted in encrypted data that cannot be decrypted by modern
implementation. We still support decryption of these 'broken' cells
by trying compatibility KDF if the initial decryption fails.
Similarly to decryption, unify the code with passphrase implementation.
Since we use new definitions now, those are no longer necessary.
We have added a new accessor to KDF in Soter algorithm ID, use it.
Just like with passphrase API, we need to pay close attention to the key
length field which we use to determine the length of the derived key.
Accept only formats currently produced by Themis. It's unlikely that we
update them, and Soter cannot handle other formats anyway right now.
Similar to passphrase API, master key header can encode messages with
total length exceeding UINT32_MAX which can cause overflows on 32-bit
platforms (with 32-bit size_t).

Avoid overflows in computations by judiciously using uint64_t and
breaking up intermediate computations to ensure uint64_t is used.
Changes in master key implementations unify error reporting with
passphrase API as well. In particular, we consistently report detected
data corruption with THEMIS_FAIL error code now.

THEMIS_INVALID_PARAMETER is used to indicate programming errors which
could have been prevented by a better type system. Attempting to decrypt
corrupted data is not such an issue.

Some tests that verify corrupted data processing expect particular error
codes to be reported. Update them to expect THEMIS_FAIL now.
@ilammy ilammy added the core Themis Core written in C, its packages label Feb 12, 2020
@ilammy ilammy mentioned this pull request Feb 12, 2020
6 tasks
@ilammy
Copy link
Collaborator Author

ilammy commented Feb 12, 2020

Since I did not teach our CI to compare benchmark results yet, I'm filling it for this job.

Here are reports for master and this PR, they check only processing of 4 KB of data.

Encryption Decryption
Before 4.38 µs ± 0.73 µs 3.10 µs ± 0.26 µs
After 4.20 µs ± 0.18 µs 3.01 µs ± 0.28 µs

So it’s even a slight improvement but within the variance so I doubt it is statistically significant.

Performance optimization is not a goal of this PR so I won‘t be doing measurements manually. I’m fine with not having any obvious degradation.

src/themis/secure_cell_alg.h Show resolved Hide resolved
switch (soter_alg_kdf(hdr->alg)) {
case SOTER_SYM_NOKDF: {
/*
* Themis 0.9.6 incorrectly used 64-bit size_t (as uint64_) for this field.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just wrap this small piece of code with #define instead copy/paste 70% of code?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just wrap this small piece of code with #define instead copy/paste 70% of code?

I vaguely recall there was a reason but I can’t remember it right now. I’ll see what if this can be cleaned up. Maybe it has something to do with the structure of code.

While we’re here, I’d like @vixentael to chime in for a second opinion. How about keeping this compatibility shim enabled permanently, removing all this #ifdef magic around it, and making NO_SCELL_COMPAT a no-op? I’ve been thinking about it for some time and can’t find any reasons why we would like to have this as opt-out. “Hey, psst, user! Do want to try turning this switch off? It will not improve performance on correct data, but can make some old data to be undecryptable. Wanna try?”

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I’ve extracted the varying parts into themis_auth_sym_kdf_context() and themis_auth_sym_kdf_context_compat(). Now there is less code duplication. Ironically, this is longer by 5 lines because of error handling (which is going to be optimized out).


return THEMIS_SUCCESS;
if (*auth_token_length < themis_scell_auth_token_key_size(&hdr)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we check do all checks before encryption? check size of buffer for token before encryption above?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we check do all checks before encryption?

Yes, we probably can, but arguably we will need to check after encryption too. We do early checks without encryption when the caller code expects THEMIS_BUFFER_TOO_SMALL to be returned, using some hardcoded expected value for maximum possible buffer size. This check compares the buffer length with actual token length value returned by Soter after encryption. It may be possible to safely use a buffer of smaller size than maximum.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we can calculate final size of encrypted ciphertext before encryption. if not, it bad that may be case when user should check error value and compare with buffer_to_small more then once because it require to check it in some loop and probably re-allocate buffer. can we do something and allow to check error on buffer_to_small only once before encryption/decryption?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can (and do) conservatively estimate the maximum size of the encrypted message and return than with “buffer too small” error.

In order to accurately tell the encrypted message size we need to execute everything up to and including soter_sym_encrypt_update() and soter_sym_decrypt_final() calls which return authoritative length of encrypted message and authentication token, based on the used algorithm, key, IV, and input. Those functions can return SOTER_BUFFER_TOO_SMALL, but they need a valid soter_sym_ctx_t (that is, we need to have performed KDF and generated IV to do this). In short, with current Soter API we need to basically successfully perform encryption to get accurate output size.

There is a way to avoid actually encrypting data but it's inconvenient to implement. We will need to duplicate code to get correct measurements of key length, IV, etc., initialize a temporary soter_sym_ctx_t, ask Soter what it thinks about expected output length, and return that value. Soter will effectively encrypt a zero-length message to ask cryptographic backend about the length. Though, I'm not sure this will work for decryption path.

What we do instead is give the user maximum size that Themis expects Soter to generate (because Themis has an idea what values cryptographic backend will return). This makes it highly likely that the buffer will have sufficient (and exact) size. However, it is also possible that the actual length as returned by Soter will be smaller (because Themis makes a conservative estimate). It is also possible that Soter will return “buffer too small” if Themis returns an incorrect maximum buffer size and the user believes it.

This is how it worked pretty much since the beginning, this refactoring does not change this estimation logic. Passphrase API behaves in the same way as it ultimately uses the same Soter API.

We still need to maintain compatibility KDF implementation in order to
be able to decrypt data encrypted by Themis 0.9.6 which used incorrect
Soter KDF context computation, incompatible with 32-bit platforms.

Refactor the compatibility shim code to extact the context computation
and avoid duplicating the entire KDF processing, with all its length and
algorithm verification.
Copy link
Collaborator

@Lagovas Lagovas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ilammy ilammy merged commit 2f19dfe into cossacklabs:master Mar 5, 2020
@ilammy ilammy deleted the update-scell-master-key branch March 5, 2020 12:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Themis Core written in C, its packages
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants