-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update header parser in Secure Cell master key API #592
Conversation
Just like with passphrase API, introduce structures, serializers and parsers for header used by master key API. It's mostly identical but lacks KDF context.
Make it read like the current passphrase API using the same parsing techniques. This resolves a bunch of possible vulnerabilities found by AFL fuzzer which could be triggered by specific corruptions in the header. We reuse some utilities from passphrase API files so move them into common internal header. Note how we treat the compatibility path. Themis 0.9.6 used to use incompatble format for one of the context fields used for Soter KDF which resulted in encrypted data that cannot be decrypted by modern implementation. We still support decryption of these 'broken' cells by trying compatibility KDF if the initial decryption fails.
Similarly to decryption, unify the code with passphrase implementation.
Since we use new definitions now, those are no longer necessary.
We have added a new accessor to KDF in Soter algorithm ID, use it.
Just like with passphrase API, we need to pay close attention to the key length field which we use to determine the length of the derived key. Accept only formats currently produced by Themis. It's unlikely that we update them, and Soter cannot handle other formats anyway right now.
Similar to passphrase API, master key header can encode messages with total length exceeding UINT32_MAX which can cause overflows on 32-bit platforms (with 32-bit size_t). Avoid overflows in computations by judiciously using uint64_t and breaking up intermediate computations to ensure uint64_t is used.
Changes in master key implementations unify error reporting with passphrase API as well. In particular, we consistently report detected data corruption with THEMIS_FAIL error code now. THEMIS_INVALID_PARAMETER is used to indicate programming errors which could have been prevented by a better type system. Attempting to decrypt corrupted data is not such an issue. Some tests that verify corrupted data processing expect particular error codes to be reported. Update them to expect THEMIS_FAIL now.
Since I did not teach our CI to compare benchmark results yet, I'm filling it for this job. Here are reports for master and this PR, they check only processing of 4 KB of data.
So it’s even a slight improvement but within the variance so I doubt it is statistically significant. Performance optimization is not a goal of this PR so I won‘t be doing measurements manually. I’m fine with not having any obvious degradation. |
src/themis/sym_enc_message.c
Outdated
switch (soter_alg_kdf(hdr->alg)) { | ||
case SOTER_SYM_NOKDF: { | ||
/* | ||
* Themis 0.9.6 incorrectly used 64-bit size_t (as uint64_) for this field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just wrap this small piece of code with #define instead copy/paste 70% of code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just wrap this small piece of code with #define instead copy/paste 70% of code?
I vaguely recall there was a reason but I can’t remember it right now. I’ll see what if this can be cleaned up. Maybe it has something to do with the structure of code.
While we’re here, I’d like @vixentael to chime in for a second opinion. How about keeping this compatibility shim enabled permanently, removing all this #ifdef
magic around it, and making NO_SCELL_COMPAT
a no-op? I’ve been thinking about it for some time and can’t find any reasons why we would like to have this as opt-out. “Hey, psst, user! Do want to try turning this switch off? It will not improve performance on correct data, but can make some old data to be undecryptable. Wanna try?”
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I’ve extracted the varying parts into themis_auth_sym_kdf_context()
and themis_auth_sym_kdf_context_compat()
. Now there is less code duplication. Ironically, this is longer by 5 lines because of error handling (which is going to be optimized out).
|
||
return THEMIS_SUCCESS; | ||
if (*auth_token_length < themis_scell_auth_token_key_size(&hdr)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we check do all checks before encryption? check size of buffer for token before encryption above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we check do all checks before encryption?
Yes, we probably can, but arguably we will need to check after encryption too. We do early checks without encryption when the caller code expects THEMIS_BUFFER_TOO_SMALL to be returned, using some hardcoded expected value for maximum possible buffer size. This check compares the buffer length with actual token length value returned by Soter after encryption. It may be possible to safely use a buffer of smaller size than maximum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we can calculate final size of encrypted ciphertext before encryption. if not, it bad that may be case when user should check error value and compare with buffer_to_small more then once because it require to check it in some loop and probably re-allocate buffer. can we do something and allow to check error on buffer_to_small only once before encryption/decryption?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can (and do) conservatively estimate the maximum size of the encrypted message and return than with “buffer too small” error.
In order to accurately tell the encrypted message size we need to execute everything up to and including soter_sym_encrypt_update()
and soter_sym_decrypt_final()
calls which return authoritative length of encrypted message and authentication token, based on the used algorithm, key, IV, and input. Those functions can return SOTER_BUFFER_TOO_SMALL, but they need a valid soter_sym_ctx_t
(that is, we need to have performed KDF and generated IV to do this). In short, with current Soter API we need to basically successfully perform encryption to get accurate output size.
There is a way to avoid actually encrypting data but it's inconvenient to implement. We will need to duplicate code to get correct measurements of key length, IV, etc., initialize a temporary soter_sym_ctx_t
, ask Soter what it thinks about expected output length, and return that value. Soter will effectively encrypt a zero-length message to ask cryptographic backend about the length. Though, I'm not sure this will work for decryption path.
What we do instead is give the user maximum size that Themis expects Soter to generate (because Themis has an idea what values cryptographic backend will return). This makes it highly likely that the buffer will have sufficient (and exact) size. However, it is also possible that the actual length as returned by Soter will be smaller (because Themis makes a conservative estimate). It is also possible that Soter will return “buffer too small” if Themis returns an incorrect maximum buffer size and the user believes it.
This is how it worked pretty much since the beginning, this refactoring does not change this estimation logic. Passphrase API behaves in the same way as it ultimately uses the same Soter API.
We still need to maintain compatibility KDF implementation in order to be able to decrypt data encrypted by Themis 0.9.6 which used incorrect Soter KDF context computation, incompatible with 32-bit platforms. Refactor the compatibility shim code to extact the context computation and avoid duplicating the entire KDF processing, with all its length and algorithm verification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Passphrase API added in #577 has added quite a few new utilities and more robust and readable approach to parsing Secure Cell headers. Fuzzing with AFL in #579 and #583 has uncovered a bunch of sleeper issues in parsing code. This PR incorporates those improvements and updates master key API implementation to resolve all those issues.
First bunch of commits unifies master key implementation with passphrase one, now they are more or less similar, with differences only in KDF processing and header handling. Maybe later we will be able to reduce code duplication even further.
Some next commits resolve miscellaneous issues found in #583 for passphrase API which were also actual for master key API. This should unbreak fuzzing tests performed by #579.
And there are some minor commits with “collateral damage”: extracted common code from passphrase API implementation and some fixups in tests to expect different error codes.
Checklist
Public API has proper documentation(no changes)Changelog is updated(will update in bulk before release)