See also the accompanying WhiteNoise-System and WhiteNoise-Samples repositories for this system.
Differential privacy is the gold standard definition of privacy protection. The WhiteNoise project aims to connect theoretical solutions from the academic community with the practical lessons learned from real-world deployments, to make differential privacy broadly accessible to future deployments. Specifically, we provide several basic building blocks that can be used by people involved with sensitive data, with implementations based on vetted and mature differential privacy research. In WhiteNoise Core, we provide a pluggable open-source library of differentially private algorithms and mechanisms for releasing privacy-preserving queries and statistics. We also provide APIs for defining an analysis, and a validator for evaluating these analyses and composing the total privacy loss on a dataset.
The mechanisms library provides a fast, memory-safe native runtime for validating and running differentially private analyses. The runtime and validator are built in Rust, while Python support is available and R support is forthcoming.
Differentially private computations are specified as an analysis graph that can be validated and executed to produce differentially private releases of data. Releases include metadata about accuracy of outputs and the complete privacy cost of the analysis.
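As a rough illustration of the idea (not the WhiteNoise API), an analysis graph can be sketched in plain Python as named components that are evaluated once their inputs are ready, with the privacy cost of the whole analysis obtained by summing the budget of each component. The names `Node`, `execute`, `total_epsilon` and `build_graph` are hypothetical, and the floating-point Laplace sampler is for illustration only.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """One component in a toy analysis graph (hypothetical, not the WhiteNoise API)."""
    op: Callable[..., object]
    inputs: List[str] = field(default_factory=list)
    epsilon: float = 0.0  # budget spent by this component; 0.0 for non-private steps


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    # Plain floating-point sampling like this is for illustration only.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def total_epsilon(graph: Dict[str, Node]) -> float:
    """Basic sequential composition: per-component privacy losses add up."""
    return sum(node.epsilon for node in graph.values())


def execute(graph: Dict[str, Node], data: List[float]) -> Dict[str, object]:
    """Evaluate each node once all of its inputs are available."""
    values: Dict[str, object] = {"data": data}
    pending = dict(graph)
    while pending:
        ready = [name for name, node in pending.items()
                 if all(i in values for i in node.inputs)]
        if not ready:
            raise ValueError("graph has a cycle or refers to a missing input")
        for name in ready:
            node = pending.pop(name)
            values[name] = node.op(*(values[i] for i in node.inputs))
    return values


def build_graph(lower: float, upper: float, epsilon: float,
                rng: random.Random) -> Dict[str, Node]:
    # Adding or removing one clamped record changes the sum by at most this much.
    sensitivity = max(abs(lower), abs(upper))
    return {
        "clamped": Node(lambda xs: [min(max(x, lower), upper) for x in xs], ["data"]),
        "sum": Node(sum, ["clamped"]),
        "dp_sum": Node(lambda s: s + laplace_noise(sensitivity / epsilon, rng),
                       ["sum"], epsilon),
    }


graph = build_graph(0.0, 10.0, 0.5, random.Random(0))
values = execute(graph, [3.0, 12.0, -1.0, 7.0])
# Only the noisy value and the privacy cost would be released.
release = {"dp_sum": values["dp_sum"], "epsilon": total_epsilon(graph)}
```

Only the noised node carries a nonzero `epsilon`, so the reported cost of this toy analysis is 0.5; the intermediate `clamped` and `sum` values would never leave the runtime.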
- More about WhiteNoise Core
- Installation
- Getting Started
- Communication
- Releases and Contributing
- Contributing Team
The primary releases available in the library, and the mechanisms for generating these releases, are enumerated below. For a full listing of the extensive set of components available in the library see this documentation.
Statistics | Mechanisms | Utilities |
---|---|---|
Count | Gaussian | Cast |
Histogram | Geometric | Clamping |
Mean | Laplace | Digitize |
Quantiles | | Filter |
Sum | | Imputation |
Variance/Covariance | | Transform |
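To give a feel for how a statistic and a mechanism from the table combine, here is a minimal sketch of a differentially private count that uses two-sided geometric noise. It is a textbook construction written with the Python standard library, not the library's implementation; `geometric_noise` and `dp_count` are hypothetical names, and the output is left unbounded for simplicity.

```python
import math
import random
from typing import Callable, Iterable


def geometric_noise(epsilon: float, rng: random.Random, sensitivity: int = 1) -> int:
    """Sample integer noise with P(k) proportional to exp(-epsilon * |k| / sensitivity)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    alpha = math.exp(-epsilon / sensitivity)

    def one_sided() -> int:
        u = 1.0 - rng.random()  # uniform on (0, 1], which avoids log(0)
        return int(math.floor(math.log(u) / math.log(alpha)))

    # The difference of two i.i.d. one-sided geometric draws is two-sided geometric.
    return one_sided() - one_sided()


def dp_count(records: Iterable[object], predicate: Callable[[object], bool],
             epsilon: float, rng: random.Random) -> int:
    """Count matching records, then add noise calibrated to a sensitivity of 1."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + geometric_noise(epsilon, rng)


rng = random.Random(42)
noisy = dp_count(range(100), lambda r: r % 2 == 0, epsilon=1.0, rng=rng)
print("noisy count of even records:", noisy)
```

A count changes by at most 1 when one record is added or removed, which is why the sensitivity defaults to 1 here.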
There are three sub-projects that address individual architectural concerns. These sub-projects communicate via protobuf messages that encode a graph description of an arbitrary computation, called an analysis.
The core library is the validator, which provides a suite of utilities for checking and deriving necessary conditions for an analysis to be differentially private. This includes checking whether sufficient properties have been met for each component; deriving sensitivities, noise scales and accuracies for various definitions of privacy; building reports; and dynamically validating individual components. This library is written in Rust.
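The kind of derivation described here can be sketched for one simple case: a mean over clamped data with a known record count, released with Laplace noise. The formulas are the standard ones for the Laplace mechanism; the `Properties` class and function names are hypothetical and do not mirror the validator's actual interfaces.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Properties:
    """Facts a validator might track about a column (hypothetical structure)."""
    lower: Optional[float] = None  # set once the data has been clamped
    upper: Optional[float] = None
    n: Optional[int] = None        # known number of records


def mean_sensitivity(props: Properties) -> float:
    """Sensitivity of a mean when one of n clamped records is replaced."""
    if props.lower is None or props.upper is None:
        raise ValueError("data must be clamped before a private mean")
    if props.n is None or props.n <= 0:
        raise ValueError("a positive record count is required")
    return (props.upper - props.lower) / props.n


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Noise scale b = sensitivity / epsilon for the Laplace mechanism."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def laplace_accuracy(scale: float, alpha: float) -> float:
    """Bound t such that |noise| <= t with probability 1 - alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be in (0, 1)")
    return scale * math.log(1.0 / alpha)


props = Properties(lower=0.0, upper=100.0, n=1000)
sensitivity = mean_sensitivity(props)
scale = laplace_scale(sensitivity, epsilon=0.5)
accuracy = laplace_accuracy(scale, alpha=0.05)
print(f"sensitivity={sensitivity}, scale={scale}, accuracy={accuracy:.3f}")
```

The check in `mean_sensitivity` is the interesting part: without known bounds the sensitivity is unbounded, so the analysis is rejected before any data is touched.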
There must also be a medium to execute the analysis, called a runtime. There is a reference runtime written in Rust, but runtimes may be written using any computation framework, such as SQL, Spark or Dask, to address your individual data needs.
Finally, there are helper libraries for building analyses, called bindings. Bindings may be written for any language, and are thin wrappers over the validator and/or runtime(s). Language bindings are currently available for Python, with support for at minimum R and SQL forthcoming.
All projects implement protobuf code generation, protobuf serialization/deserialization, and communication over FFI; handle distributable packaging; and have at some point compiled cross-platform (more testing needed). Communication among projects is handled via proto definitions from the prototypes directory. The validator and reference runtime compile to standalone libraries that may be linked into your project, allowing communication over C foreign function interfaces.
- (forthcoming PyPI binaries via milksnake)
-
Clone the repository
git clone $REPOSITORY_URI
-
Install system dependencies (rust, gcc, protoc, python 3.6+ for bindings)
Mac:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
xcode-select --install
brew install protobuf python
You can test with
cargo build
in a new terminal.
Linux:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
sudo apt-get install diffutils gcc make m4 python
sudo snap install protobuf --classic
You can test with
cargo build
in a new terminal.
Windows:
choco install rust msys2 protoc python
For non-Chocolatey users: download and install the latest build of rust, msys2, protobuf and python
- https://forge.rust-lang.org/infra/other-installation-methods.html
- https://github.com/protocolbuffers/protobuf/releases/latest
- https://www.msys2.org/
- https://www.python.org/downloads/windows/
Then install gcc under MSYS2
refreshenv
reg Query "HKLM\Hardware\Description\System\CentralProcessor\0" | find /i "x86" > NUL && setx WN_SYS_ARCH=i686 || setx WN_SYS_ARCH=x86_64
bash -xlc "pacman --noconfirm -S --needed pacman-mirrors"
bash -xlc "pacman --noconfirm -S --needed diffutils make mingw-w64-%WN_SYS_ARCH%-gcc"
You can test with
bash -xc "cargo build"
The bash prefix ensures that gmp and mpfr build with the GNU/gcc/mingw toolchain. The quotes pass the whole command to bash as a single string.
-
Install the python bindings
cd bindings-python
pip install -e ".[test,plotting]"
If you are doing package development, we recommend using
bindings-python/debug_*.sh
for debugging.
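Before running the steps above, it can help to confirm that the prerequisites are on your PATH. The following is a small, optional helper sketch that uses only the Python standard library; the tool list mirrors the dependencies named above (rust, gcc, protoc, python 3.6+), and the function names are hypothetical rather than part of this repository.

```python
import shutil
import sys
from typing import Iterable, List, Tuple

# Executables assumed to correspond to the listed system dependencies.
REQUIRED_TOOLS = ("cargo", "rustc", "gcc", "protoc")


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(minimum: Tuple[int, int] = (3, 6)) -> bool:
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    if not python_ok():
        print("Python 3.6 or newer is required for the bindings.")
    if not missing and python_ok():
        print("All prerequisites found.")
```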
First install system libs (GMP version 6.2, MPFR version 4.0.2-p1)
Mac:
brew install gmp mpfr
Linux:
Build gmp and mpfr from source. Then set the environment variable:
export DEP_GMP_OUT_DIR=/path/to/folder/containing/lib/and/includes
Windows:
This is not fully tested. Build gmp and mpfr from source. Then set the environment variable and also switch the rust toolchain:
setx DEP_GMP_OUT_DIR=/path/to/folder/containing/lib/and/includes
rustup toolchain install stable-%WN_SYS_ARCH%-pc-windows-gnu
rustup default stable-%WN_SYS_ARCH%-pc-windows-gnu
To install the python bindings, set the variable
export WN_USE_SYSTEM_LIBS=True
To build the runtime, set the feature flag
cd runtime-rust; cargo build --features use-system-libs
Provide an alternative openssl installation, either via directions in the automatic or manual section:
Otherwise, please open an issue.
We have numerous Jupyter notebooks demonstrating the use of the WhiteNoise library and validator through our Python bindings. These are in our accompanying WhiteNoise-Samples repository which has exemplars, notebooks and sample code demonstrating most facets of this project.
The Rust documentation includes full documentation on all pieces of the library and validator, including extensive component by component descriptions with examples.
(In process.)
Please let us know if you encounter a bug by creating an issue.
We appreciate all contributions. We welcome pull requests with bug fixes without prior discussion.
If you plan to contribute new features, utility functions or extensions to the core, please first open an issue and discuss the feature with us.
- A PR sent without discussion may be rejected, because we may be taking the core in a different direction than you are aware of.
Joshua Allen, Christian Covington, Eduardo de Leon, Ira Globus-Harris, James Honaker, Jason Huang, Saniya Movahed, Michael Phelan, Raman Prasad, Michael Shoemate, You?