Skip to content

rjrelyea/SHELib

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

60 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Simple Homomorphic Encryption (SHE)

Basics

Homomorphic Encryption allows you to send encrypted data to a third party
and that third party can do computations on the data and return it to you
will all the computations happenning on the encrypted data itself. This
library uses the helib's binArithmetic library to implement as full a C++
semantic as possible. Helib uses the BGV homomorphic encryption system. This
system can do primitive operations of bit level addition and bit level
multiplication. Each operation adds some noise to the result, so only
a limitted number of operations can be done before the noise overwhelms our
actual data. Part of BGV is a way to set the noise level to a lower value
by decrypting our the encrypted value homophorphically. The details aren't
important for the use of SHE, but this affects the performance of any
calculations you do.

The goals of this library.

Helib as been available for about 10 years, and is already used in systems
to provide homomorphic services. It is the "assembler" level of doing
homomorphic encryption, with a few higher level features like binArithmetic.
The latter provides basic functionality like shift, bitwise operations,
addition, subraction, and multiplication. Using helib requires finding
correct parameters, and initializing crypto systems with those parameters,
which can be tricky.

The goals of this library is to reduce the complexity of initializing your
the system and provide C++ like access to underlying homomorphic operations.

Overall the goals, in order, are:

1. Ease of Use: Encrypted values are represented by C++ classes that mimic
the C++ builtin types, with operator overrides so you an do math on them
just like the native types. The compiler precidences work, non-bool value
will take on bool values in logical expression (1=non-zero, 0=0).
  - this means much of the details of helib are hidden, and while you
    can get the underlying helib data structures, you usually don't have
    to.
  - as much as possible, your standard programming will work (with the
    notable exception of comparison, which returns an encrypted value
    that cannot affect your program flow (see more below)).

2. Correct and fully functioning:  The goal is that anything that you can possibly
do with the builtins, you should be able to do with the Encrypted values,
except you can't cast encrypted values to unencrypted values (so you can't
get a bool, or instance from a comparision).
  - this means that integer division is included, even if it could take
    hours or days to complete.
  - encrypted int64 types are allowed, even though they would take hours
    to do basic operations.
  - bootstrapping (see below) is implemented automatically so you don't
    have to worry about it in your initial implementation (even if it
    slows down performance).

3. Perfomance: This is goal 3. This means if there are two ways of doing
   something, one way is significantly slower, but produces the correct
   results and the other way is faster, but sometimes hits corner cases.
   I've chosen correctness. That being said, I will usually try to chose
   the method that provides optimal performance. I've also provided
   access to methods the application programmer can use to optimize their
   implementation.

------------------------- Using SHELib -------------------------------------

            KeyGeneration and management.

To use SHE, you first need to get a keypair. This keyPair is generated using
the SHEGenerate_KeyPair(), which takes a type, a security level, and an
operation level. The operation level sets how many operations you can do before
you need to bootstrap again. The higher the operation level, the slower each
operation will be for a given security level. Security level is the strength
of the system in bits. Several levels are provided that are not really secure,
but allow you to experiment before cranking up to a higher security level.

Once you have your keyPair, you can save them to a file, or stream using the
C++ << operators. In a real system, you will want to send the publickey
along with your encrypted data, so that they server can operate on your
encrypted data.

NOTE: in production, only the client needs to generate a keypair, the server
just accepts publicKey along with the encrypted data.

            Integer operations.

Encrypted data is stored in various SHEInt classes: SHEInt8, SHEInt16,
SHEInt32, SHEInt64, SHEUInt8, SHEUInt16, SHEUInt32, SHEUInt64and SHEBool. These
correspond the the C++ types int8_t, int16_t, int32_t, int64_t, uint8_t,
uint16_t, uint32_t, uint64_t, and bool respectively

These classes are all subclasses of SHEInt, so you can implement generic
integer libraries that work on any of these native types by using SHEInt.
All SHEInt constructs must have a publicKey, either explicitly, or implicity.
Simply declare a SHEInt variable, and start using it.

SHEInt16 a(publicKey);
SHEInt16 b(publicKey);
SHEInt16 r(publicKey);

a = 5;
b = -7;

r = a*a + b;

All the basic integer operators are supported except  the ? operator (a?b:c).
Instead you can use a.select(b:c), or select(a,b,c);

Logical and comparison operations return SHEInt and not bool, and there is
no cast from SHEInt to bool. This is because the results of these operations
are encrypted, so they cannot be examined directly in your program. This means
they cannont be used in a if, or as an exit to a loop condition.

  if (a == b)  // This will not work.

This is because the result of the comparison is itself encrypted. So you can't
query the result (unless you decrypt it). What you can do is use the result
to select a value:
  r = (a==b).select(10, 5);

or
  r = select(a==b, 10, 5);

r will get the selected encrypted value (either 10 or 5) depending on the state
of the comparision. You can use integer constants or SHEInt values or a mix of
values in the select.

To implement a while loop with an encrypted index, you can't use and
encrypted bool result to exit the loop. Instead you need to use a
for loop to loop over all the possibilities and stop updating your variables
when the loop condition 'ends':

   // unencrypted a and b:
   while (a < b) {
    a += 1;
    b >> 1;
  }

  // encrypted a && b, assumes a is not negative...
  SHEBool lbreak(publicKey, false);
  for (int i=0; i < b.getSize(); i++) {
    // lbreak is zero until the condition is reached
    lbreak= lbreak || (a < b);
    // update variables until the lbreak condition occurs
    a = lbreak.select(a, a+1);
    b = lbreak.select(b, b>>1);
  }

Note: it's important to know the actual maximum bound of the while loop, if,
in the above example, a is -b.getSize(), then this loop won't execute as
long as expected an will produce incorrect results.

            SHEVector operations.

SHEVector allows you to store a vector or array of encrypted values and access
that array with an encrypted index. In all other ways it functions as a normal
std::vector. When using an encrypted index, you can't assign to it, but you can
use the assign() method:

    SHEInt16 model(publicKey);
    SHEVector<SHEInt16> array(model,1);
    SHEInt16 r(publicKey,5);
    SHEInt16 r2(publicKey);
    SHEInt8 index(publicKey, 3);

    array[1] = SHEInt16(publicKey, 5); // this works because the index is
                                       // unencrypted
    array[3] = r;
    r2 = array[index];                // r2 will get an encrypted value of r
    array[index] = SHEInt16(publicKey, 7) // this will not work.
    array.assign(index,SHEInt16(publicKey, 7)) // use this instead

SHEVectors can store and index any value that has a
<type>select(const SHEInt &, const <type>&, const <type>&); function.
Indexes are type SHEInt.

            Using encrypted indicies on unencrypted arrays, vectors and maps.

SHEVector.h also defines 3 template functions: getArray, getVector, and getMap
which takes normal arrays, vectors and maps and returns encrypted values based
on encrypted indicies. These functions also take a default encrypted value to
use if the index is out of bounds.

            SHEFp operations

Like SHEInt, SHEFp is a base class used to implement SHEHalfFloat,
SHEBFloat16, SHEFloat, SHEDouble, SHEExtendedFloat, and SHELongDouble.
Like SHEInt, in needs to be initialized with a publicKey either implicitly
or explicitly. Once you have a float variable, you can assign to it, operate,
on it, etc. Basic operators +, -, *, and / are defined (returning SHEFp).
Comparision operators return SHEBool and can be used in a select function:

SHEFp select(&SHEBool sel,const SHEFp &a_true, const SHEFp &a_false);
version also allow mixing in floating point constants as well.

A special SHEFpBool class is defined which allows syntax like:
    SHEFp a = SHEFPBool(c>b).select(c,b);
This is equivalent to a = select(c>b,c,b);

           SHEString operations

SHEString provides an encrypted version of std::string. Constructors take
either a model string or a public key, just like SHEInt and SHEFp. Strings
are implemented as a vector of SHEChar. The length of the string may or may
not be encrypted. You can use the hasEncryptedLength flag to specify
encryptedLength in the constructor. You can also call the encryptLength()
methode to force a string to have an encrypted length. Any operation that uses
an encrypted length itself, and encrypted position, or another string with an
encrypted length will force the resulting string to have an encrypted length.
The only operation that will cause a string with an encrypted length to have an
unencrypted length is a resize with an unencrypted length value.

Strings with encrypted lengths have 2 different modes, set by global class
variables. The modes are fixed and non-fixed. Fixed strings all have the same
length, set by the maxStringLength variable. If a string result would be
larger than maxStringLength, then that result is truncated. Non-fixed strings
have variable lengths which are guaranteed to be the same or longer than the
encrypted length. These strings are still limited by the
maxEncryptedStringLength because the length is stored in a SHEUInt8 variable.
Applications can set the behavior with SHEString::SetFixedSize(), and query it
with SHEString::getFixedSize() and SHEString::getHasFixedSize().

           Performance and bootstrapping.

While pretty much anything you want to do with Integers and Doubles is support,
not everything can be done in finite time. While the library 'hides' the
details of underlying bootstrapping operations, they will quickly become
appearant in your application. For any real security level, bootstrapping
large integers, like 64 bits, can take 3-8 hours to complete. So while you
may be able to do some operation in multiple seconds, when you run close to
the noise limit, they can suddenly take hours. There are a couple of ways to
mitigate this issue.

   1) you can select a high noise tolerance context and try to complete your
   entire calculation before bootstrapping becomes necessary.
   2) you can use a lower noise tolerance to get faster performance for the
   same security level.
   3) you can reduce your security level.
   4) you can reduce your variable size.

For almost all usable security levels, SHEInt64 will have extremely poor
performance. SHEInt16 will have the best compromise of performance and
functionality. If you do have to boostrap, you can use the .bitCapacity()
method to see how many more operations your variable can go before it needs
to bootstrap. There is also a useful method (.verifyArgs()) that will
preemptively bootstrap those variables before you execute a particular
function. This is important because you may find that while SHELib can
automatically bootstrap, it may bootstrap temparary copies, leaving the
original variable unupdated. Also bootstrapping a variable that may be
used in multiple locations helps keep all the variables that depend
on it from loosing capacity.

           Debugging

You can get logging output from the internals of each component of SHELib
by providing a stream to the SHExxx::setLog function. Setting the logger
will cause the function to output various calls to that function. You
can also output summary information for SHEInt and SHEFp types using
the SHEIntSummary and SHEFpSummary casts before outputting them to a stream:

   std::cout << "My encrypted vars: " << (SHEIntSummary) myint
             << ", " << (SHEFpSummary) myfloat
             << ", " << (SHEStringSummary) mystring << std::endl

For SHEInts you'll get:
  SHEInt(label, bitSize, isUnsigned, isZero, capacity[:value])
  where:
    -label      is the label you set on the variable (temporaries get their own
                unique 't' label which is set when they are first printed).
    -bitSize    is the number of encrypted bits of this SHEInt (same as
                returned from myint.getSize()).
    -isUnsigned is 'U' for unsigned values and 'S' for signed values.
    -isZero     is 'Z' if this value is an unencrypted explicit zero, and
                'E' if this is a properly encrypted value.
    -capacity   how many 'level's are left before this variable is too
                noisy to be usefull, MAX means the value is effectively
                unencrypted and has no noise.
    -value      is the decrypted value. It's only included if you have set
                the private key with SHEInt::setDebugPrivateKey(SHEPrivateKey&)
For SHEFps you'll get:
  SHEFp(label, expBitSize, mantissaBitSize, capacity[:value])
  where:
    -label       is the label you set on the variable (temporaries get their
                 own unique 't' label which is set when they are first
                 printed).
    -expBitSize  is the number of encrypted bits of exponent (same as
                 returned from myfloat.getExp().getSize()).
    -mantBitSize is the number of encrypted bits mantissa (same as
                 returned from myfloat.getMatissa().getSize()).
    -capacity    how many 'level's are left before this variable is too
                 noisy to be useful, MAX means the value is effectively
                 unencrypted and has no noise.
    -value       is the decrypted value. It's only included if you have set
                 the private key with SHEFp::setDebugPrivateKey(SHEPrivateKey&)
For SHEStrings you'll get:
  SHEString(label, encryptedLength, size, capacity[:”value”])
  where:
    -label       is the label you set on the variable (temporaries get their
                 own unique 't' label which is set when they are first
                 printed).
    -encryptedLength  is ‘El’ if the length is encrypted and ‘Ul’
                 for unencrypted length.
    -size        size of the string buffer, for unencrypted lengths this is
                 the string size, for encrypted lengths it’s always bigger
                 than equal to the real length.
    -capacity	 how many 'level's are left before this variable is too noisy
                 to be useful, MAX means the value is effectively unencrypted
                 and has no noise.
    -value	 is the decrypted value. It's only included if you have set the
                 private key with SHEString::setDebugPrivateKey(SHEPrivateKey&)

If you don't cast the variables, you will get the object encoded as json, which
is suitable for communicating to other locations, but not all that usable for
debugging.

If you set some radix other than std::dec, all the above values will still be
printed in radix std::dec except the SHEInt values, which will use the radix
you supplied (std::hex or std::octal)