Passphrase-based API of Secure Cell #577

Merged 16 commits on Feb 12, 2020

Commits on Jan 24, 2020

  1. Define passphrase API

    Add API definitions to <themis/secure_cell.h>. This is new public API
    for passphrase-secured data where we apply a *password* key-derivation
    function -- existing master key API uses ZRTP KDF which cannot be
    securely used with passwords.
    ilammy committed Jan 24, 2020
    9740893
  2. Tests for passphrase API

    Here are some basic tests that are going to hold for the new API.
    
    We verify a bunch of things:
    
    - ability to encrypt and decrypt data
    - reaction to invalid argument values
    - incompatibility of passphrase and master-key API
    - compatibility with older formats
    
    We currently do *not* verify that Secure Cell is able to detect data
    corruptions, invalid passphrases, etc. These tests will be added later.
    
    Note the stability test suite. Some of the language wrappers have something
    like this but Themis Core generally did not verify that old data formats
    are still supported. These tests verify that we are able to decrypt data
    encrypted by previous versions of Themis. Passphrase API is going to
    evolve and use different KDF configurations, effectively changing the
    encryption algorithm. We need to be sure that we are still able to make
    sense of the data after we change the implementation.
    
    The tests also verify that we understand various AES key lengths that
    may be used by different Themis builds. Existing Secure Cell code for
    master key API handles them incorrectly. We'll fix it later.
    
    And while we're here, move ARRAY_SIZE utility into common header file so
    that it can be used across all tests.
    ilammy committed Jan 24, 2020
    f84a0f7
  3. Define authentication token format

    Describe data structures used for Secure Cell authentication tokens.
    
    Here we use a much more explicit method than the existing master key
    definitions. If we were to use the old approach, these structs would
    look like
    
        struct themis_scell_auth_token_passphrase {
            uint32_t alg;
            uint32_t iv_length;
            uint32_t auth_tag_length;
            uint32_t message_length;
            uint32_t kdf_context_length;
        };
        // followed by IV data, auth tag data, KDF context
    
        struct themis_scell_pbkdf2_context {
            uint32_t iteration_count;
            uint16_t salt_length;
        };
        // followed by salt data
    
    And then we'd go and cast byte buffers into these structs:
    
        const struct themis_scell_auth_token_passphrase* hdr =
            (const struct themis_scell_auth_token_passphrase*)buffer;
        uint32_t iv_length = hdr->iv_length;
    
    While these structure definitions are definitely more compact, 'parsing'
    code ends up scattered around the file and riddled with hard-to-reason
    pointer magic. Furthermore, such pointer casts do not handle endianness
    and pointer alignment correctly, leading to undefined behavior on some
    ARM systems (and warnings from undefined behavior sanitizer -- that we
    currently squelch because of such not-exactly-correct code).
    
    New code is much more verbose -- in this file -- but the 'end-user' code
    in Secure Cell implementation looks much cleaner and way easier to read
    with all this complexity hidden elsewhere. The compiler does a good job
    at inlining all of the parsing code and it is as efficient as possible.
    (Though yeah, it is *not* zero-copy as it was before.)
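
    A minimal sketch of the explicit style, not the code from this commit.
    It parses the PBKDF2 context shown above byte by byte, so it depends on
    neither pointer alignment nor host byte order. The example_ names, the
    little-endian layout, and the 0 / -1 error convention are assumptions
    made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parsed form of the PBKDF2 context; names are illustrative. */
struct example_pbkdf2_context {
    uint32_t iteration_count;
    uint16_t salt_length;
    const uint8_t* salt; /* points into the input buffer */
};

/* Assemble integers from individual bytes: no casts of the buffer pointer. */
static inline uint32_t example_load_u32le(const uint8_t* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16)
           | ((uint32_t)p[3] << 24);
}

static inline uint16_t example_load_u16le(const uint8_t* p)
{
    return (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
}

/* Returns 0 on success, -1 if the buffer is shorter than it claims to be. */
static inline int example_parse_pbkdf2_context(const uint8_t* buffer,
                                               size_t length,
                                               struct example_pbkdf2_context* ctx)
{
    assert(buffer != NULL && ctx != NULL);
    /* Fixed part: 4 bytes of iteration count + 2 bytes of salt length. */
    if (length < 6) {
        return -1;
    }
    ctx->iteration_count = example_load_u32le(buffer);
    ctx->salt_length = example_load_u16le(buffer + 4);
    /* Variable part: the salt must fit into what is left of the buffer. */
    if (length - 6 < ctx->salt_length) {
        return -1;
    }
    ctx->salt = buffer + 6;
    return 0;
}
```

    All the bounds checks live in one place, so the caller only has to test
    a single return value before touching any of the parsed fields.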
    
    We put definitions in a private header file and make all parsing and
    serialization functions "static inline" so that they do not cause
    duplicate symbol errors and are always available to the compiler.
    
    All of these are internal helpers and they are not intended to be reused
    for other encryption modes or for any other purposes. They have a rather
    idiosyncratic API that might appear weird at first. Let it sink in.
    ilammy committed Jan 24, 2020
    107cf81
  4. Prepend authentication token in Seal mode

    Similar to master key API, introduce internal helpers that actually
    implement encryption and leave only seal mode handling in secure_cell.c.
    
    Just like with master key API, the helpers can be shared between Seal
    and Token Protect modes (though we do not implement Token Protect now).
    Seal mode has to do a bit of trickery to prepend the authentication
    token and to detect its length relative to the message for decryption.
    
    So it's mostly the same ideas, but there are some differences:
    
    - First of all, we use way fewer magical macros in the implementation.
    
    - THEMIS_FAIL is returned when encrypted message buffer size does not
      match the size expected from the header. Master key API returns
      THEMIS_INVALID_PARAMETER in this case which is not entirely correct.
      New API treats this situation as data corruption.
    
      THEMIS_INVALID_PARAMETER should be used to indicate programming
      errors, such as passing a NULL pointer for a required argument.
    
    - We also take care to not overwrite out-parameters for message length
      unless we are returning a 'success': either successful decryption
      indicated by THEMIS_SUCCESS, or a successful measurement of required
      output buffer size indicated by THEMIS_BUFFER_TOO_SMALL.
    
      Also note that the user may pass a buffer larger than necessary. For
      this we need to update the out pointer even in THEMIS_SUCCESS case
      so that the user knows the actual output size.
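
    The status code and out-parameter rules from the list above can be
    sketched on a toy format: a 4-byte little-endian length followed by
    exactly that many payload bytes. This is an illustration under stated
    assumptions, not the Secure Cell code. The EXAMPLE_ status codes stand
    in for the THEMIS_ ones and example_unwrap() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative status codes; the real THEMIS_* values are not reproduced. */
enum example_status {
    EXAMPLE_SUCCESS,
    EXAMPLE_FAIL,
    EXAMPLE_INVALID_PARAMETER,
    EXAMPLE_BUFFER_TOO_SMALL
};

static inline enum example_status example_unwrap(const uint8_t* in,
                                                 size_t in_length,
                                                 uint8_t* out,
                                                 size_t* out_length)
{
    uint32_t claimed;

    /* Programming error: a required argument is missing. */
    if (in == NULL || out_length == NULL) {
        return EXAMPLE_INVALID_PARAMETER;
    }
    /* Data corruption: the input does not match its own header. */
    if (in_length < 4) {
        return EXAMPLE_FAIL;
    }
    claimed = (uint32_t)in[0] | ((uint32_t)in[1] << 8) | ((uint32_t)in[2] << 16)
              | ((uint32_t)in[3] << 24);
    if (in_length - 4 != claimed) {
        return EXAMPLE_FAIL;
    }
    /* Measurement: report the required size, write no payload. */
    if (out == NULL || *out_length < claimed) {
        *out_length = claimed;
        return EXAMPLE_BUFFER_TOO_SMALL;
    }
    memcpy(out, in + 4, claimed);
    /* The buffer may be larger than needed, so report the actual size. */
    *out_length = claimed;
    return EXAMPLE_SUCCESS;
}
```

    Note that *out_length is written only on the two 'success' paths. On
    EXAMPLE_FAIL and EXAMPLE_INVALID_PARAMETER the caller's value is left
    untouched.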
    
    Also, expose "plain" encryption functions used by master key API so that
    we can avoid dealing with Soter API intricacies and reuse the same code.
    ilammy committed Jan 24, 2020
    ce273c0
  5. Secure Cell encryption

    Implement Secure Cell (Seal) encryption with passphrases as described in
    Themis RFC 2.
    
    We use the same pattern as with master key code:
    
    - themis_auth_sym_encrypt_message_with_passphrase() does all parameter
      validation, returns expected output buffer size if requested, and
      hands off processing to...
    
    - themis_auth_sym_encrypt_message_with_passphrase_() which does all
      preparatory tasks for encryption: fills in the message header,
      generates derived key and ancillary data, calls encryption function,
      and then finally writes the header into the output buffer.
    
    - themis_auth_sym_plain_encrypt() does actual encryption using Soter.
      Note that we have to clear the KDF field in algorithm to prevent
      Soter from using its own PBKDF2 implementation.
    
    There are other points to note in the code:
    
    - Before writing output buffer we verify its size again, now that we
      really know all the component sizes.
    
    - We always wipe intermediate data that we store on the stack. Derived
      key is never stored anywhere else. Other byte buffers are copied into
      output buffer.
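
    A sketch of the wiping discipline, assuming a hypothetical
    example_wipe() helper and callback types that stand in for the real KDF
    and cipher calls. Writing through a volatile pointer is a common
    portable approximation of a secure wipe. It is not necessarily the
    routine Themis uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_KEY_LENGTH 32

/* Zero a buffer through a volatile pointer so the stores are not elided. */
static inline void example_wipe(void* data, size_t length)
{
    volatile uint8_t* bytes = (volatile uint8_t*)data;
    size_t i;
    for (i = 0; i < length; i++) {
        bytes[i] = 0;
    }
}

/* Hypothetical callback shape: fills or consumes a key, returns 0 on success. */
typedef int (*example_key_fn)(uint8_t* key, size_t key_length);

static inline int example_with_derived_key(example_key_fn derive, example_key_fn use)
{
    uint8_t derived_key[EXAMPLE_KEY_LENGTH] = {0};
    int res;

    assert(derive != NULL && use != NULL);

    res = derive(derived_key, sizeof(derived_key));
    if (res != 0) {
        goto cleanup;
    }
    res = use(derived_key, sizeof(derived_key));

cleanup:
    /* Single exit path: the key is wiped on success and on every failure. */
    example_wipe(derived_key, sizeof(derived_key));
    return res;
}
```

    Funnelling every return through one cleanup label means an early error
    cannot leave the derived key on the stack.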
    ilammy committed Jan 24, 2020
    ba8645b
  6. Secure Cell decryption

    Decryption is more complex than encryption. Here we need to tread
    veeeery carefully, treating input data as actively hostile.
    
    Function split is the same as with encryption. Notable points:
    
    - Derived key length depends on the algorithm stored in the message.
      Current implementation of master key API makes a mistake of using
      the default key length always.
    
    - Algorithm field has some reserved bits that are set to zero by our
      implementation. We check that they are indeed set to zero.
    
    - Since we support only PBKDF2, we can cheat a little and keep
      everything in one function. I guess it will have to be rewritten once
      we migrate from PBKDF2. But it's good enough for now.
    
    - There are around 4 or 5 length checks in total for the output message
      because I really don't want to get a buffer overflow here.
    
    - Don't forget to wipe the derived key from memory. Always.
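
    The first two checks can be sketched as follows. The bit layout of the
    algorithm field used here is an assumption for illustration only: the
    EXAMPLE_ masks are not the real Soter constants, and the function names
    are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout of the algorithm field; not the real Soter values. */
#define EXAMPLE_ALG_MASK      UINT32_C(0xF0000000)
#define EXAMPLE_KDF_MASK      UINT32_C(0x0F000000)
#define EXAMPLE_RESERVED_MASK UINT32_C(0x00FFF000)
#define EXAMPLE_KEY_BITS_MASK UINT32_C(0x00000FFF)

/* Hostile input may set bits that our own encryption never sets. */
static inline bool example_reserved_bits_clear(uint32_t alg)
{
    return (alg & EXAMPLE_RESERVED_MASK) == 0;
}

/* Key length in bytes as encoded in the message, or 0 if unsupported. */
static inline size_t example_key_length_bytes(uint32_t alg)
{
    switch (alg & EXAMPLE_KEY_BITS_MASK) {
    case 128:
        return 16;
    case 192:
        return 24;
    case 256:
        return 32;
    default:
        return 0;
    }
}

/* Returns 0 and sets *key_length on success, -1 on a malformed field. */
static inline int example_check_alg(uint32_t alg, size_t* key_length)
{
    size_t length;

    assert(key_length != NULL);
    if (!example_reserved_bits_clear(alg)) {
        return -1;
    }
    length = example_key_length_bytes(alg);
    if (length == 0) {
        return -1;
    }
    *key_length = length;
    return 0;
}
```

    The key length comes from the message header rather than from a
    compile-time default, so data produced by a build with a different AES
    key size would still be handled.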
    ilammy committed Jan 24, 2020
    863c14e
  7. Note new arrivals in CHANGELOG

    ilammy committed Jan 24, 2020
    afc4195

Commits on Jan 28, 2020

  1. Avoid incorrect terminology

    Do not use wording "password" or "passphrase" with master key API of
    Secure Cell. Instead, either use a generated symmetric key, or something
    that looks like a key and does not look like a passphrase.
    ilammy committed Jan 28, 2020
    34e2470
  2. fixup! Avoid incorrect terminology

    Okay, sure, clang-format, that makes it much more readable.
    ilammy committed Jan 28, 2020
    e8e8c39

Commits on Feb 5, 2020

  1. Improve naming of new parsing utilities

    Add a "stream" prefix so that they are better associated with file
    streams and similar wStream utilities from WinPR.
    
    Unify read and write API to use in/out pointers in last position, with
    first argument always being the stream to modify, with its new value
    returned from the function.
    
    The behavior is the same, but the API is neater and easier to read.
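
    A sketch of the calling convention described above: the stream comes
    first, the in/out value comes last, and the advanced stream pointer is
    returned. The example_stream_ names and the little-endian encoding are
    assumptions for illustration. Bounds checking is left to the caller in
    this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Write a 32-bit value in little-endian order, return the advanced stream. */
static inline uint8_t* example_stream_write_uint32LE(uint8_t* stream, uint32_t value)
{
    stream[0] = (uint8_t)(value & 0xFF);
    stream[1] = (uint8_t)((value >> 8) & 0xFF);
    stream[2] = (uint8_t)((value >> 16) & 0xFF);
    stream[3] = (uint8_t)((value >> 24) & 0xFF);
    return stream + 4;
}

/* Read a 32-bit little-endian value into *value, return the advanced stream. */
static inline const uint8_t* example_stream_read_uint32LE(const uint8_t* stream,
                                                          uint32_t* value)
{
    *value = (uint32_t)stream[0] | ((uint32_t)stream[1] << 8)
             | ((uint32_t)stream[2] << 16) | ((uint32_t)stream[3] << 24);
    return stream + 4;
}
```

    Because each call returns the new position, reads and writes chain
    naturally: stream = example_stream_write_uint32LE(stream, value); and
    the same for reads.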
    ilammy committed Feb 5, 2020
    782b446
  2. Relax restrictions on context buffer values

    Allow non-NULL pointer to associated context buffer when its length is
    set to zero. This plays nicely with languages that do not return NULL
    for their empty byte containers. (And it was not tested either...)
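
    The relaxed rule amounts to a check like the one below. This is a
    minimal sketch with a hypothetical function name, not the validation
    code from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* An empty context is valid with any pointer; a non-empty one needs data. */
static inline bool example_context_args_valid(const uint8_t* context,
                                              size_t context_length)
{
    if (context_length == 0) {
        return true;
    }
    return context != NULL;
}
```

    The only rejected combination is a NULL pointer with a non-zero length.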
    ilammy committed Feb 5, 2020
    f409539
  3. Simplify auth tag length output

    Encryption paths of master key and passphrase API have an unnecessary
    and weird check of auth tag length output. Strictly speaking, we should
    measure the size of the tag but we just know that it's 12 bytes for AES-GCM.
    
    Tweak the pointer type in internal helper function so that it is able to
    directly write the auth tag length into the header struct. This way we
    don't have to check the length is unchanged.
    
    However, since Soter API outputs size_t, we do need to check that it's
    safe to truncate the value returned by Soter.
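
    The truncation check can be sketched as a small checked conversion.
    The helper name is hypothetical and the bool return convention is an
    assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Store value into *out only if it fits into 32 bits without loss. */
static inline bool example_size_to_u32(size_t value, uint32_t* out)
{
    assert(out != NULL);
    if (value > UINT32_MAX) {
        return false;
    }
    *out = (uint32_t)value;
    return true;
}
```

    On platforms where size_t is 32 bits wide the comparison is always
    false and the compiler can drop it, so the check costs nothing there.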
    ilammy committed Feb 5, 2020
    32ddb96
  4. Generalize KDF handling in decryption path

    Move KDF context parsing into separate helper function which takes in
    Secure Cell header and a passphrase, and outputs a derived key. This
    will make it easier to extend the decryption code path to handle other
    KDFs when we add them.
    ilammy committed Feb 5, 2020
    a0ead0a

Commits on Feb 10, 2020

  1. Generalize KDF handling in encryption path

    Similar to decryption, make KDF computation dependent on whatever we
    write into hdr->alg so that it stays consistent. This should make adding
    new KDFs later much easier: you only need to write a new function which
    computes derived key and add it to the KDF switch.
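
    A sketch of what such a KDF switch might look like. The masks and the
    enum are illustrative assumptions, not the real Soter constants, and
    the actual key derivation is left out on purpose.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative KDF bits of the algorithm field; not the real Soter values. */
#define EXAMPLE_KDF_MASK   UINT32_C(0x0F000000)
#define EXAMPLE_KDF_PBKDF2 UINT32_C(0x01000000)

enum example_kdf {
    EXAMPLE_USE_UNSUPPORTED,
    EXAMPLE_USE_PBKDF2
};

/* Pick the KDF from the same field that gets written into the header. */
static inline enum example_kdf example_kdf_from_alg(uint32_t alg)
{
    switch (alg & EXAMPLE_KDF_MASK) {
    case EXAMPLE_KDF_PBKDF2:
        return EXAMPLE_USE_PBKDF2;
    /* A new KDF would add its own case here. */
    default:
        return EXAMPLE_USE_UNSUPPORTED;
    }
}
```

    Encryption and decryption can both consult this one function, so the
    KDF that derives the key always matches the one recorded in the header.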
    ilammy committed Feb 10, 2020
    77e3830
  2. 212bd4b

Commits on Feb 11, 2020

  1. fixup! Simplify auth tag length output

    Whoops, this is a good typo. clang-analyzer catches it, by the way.
    ilammy committed Feb 11, 2020
    f9db138