Merged
137 changes: 137 additions & 0 deletions crypto/schemes/enc/v1/README.md
@@ -0,0 +1,137 @@
# Dapr encryption scheme v1: `dapr.io/enc/v1`

This document contains the reference for the **Dapr encryption scheme v1**, identified by the header `dapr.io/enc/v1`.

The Dapr encryption scheme is optimized for processing data as a stream. Data is chunked into multiple parts, which are encrypted independently. This lets Dapr return data to callers as a stream, even when decrypting messages, while guaranteeing that no unverified data is flushed to the client.

> **Sources:** The encryption scheme that Dapr uses is heavily inspired by the [Tink wire format](https://developers.google.com/tink/wire-format) (from the Tink library maintained by Google), as well as by Filippo Valsorda's [age](https://age-encryption.org/v1), and Minio's [DARE](https://github.com/minio/sio).

## Key

Each message is encrypted with a 256-bit symmetric **File Key (FK)** that is randomly generated by Dapr for each new message. The key must be generated as 32 bytes of output from a CSPRNG (such as Go's `crypto/rand.Reader`) and must not be reused for other files.

Dapr wraps the FK using a **Key Encryption Key (KEK)** stored in a key vault. The result of the wrapping operation is the **Wrapped File Key (WFK)**. The algorithm used depends on the type of the KEK as well as the algorithms supported by the component:

- For symmetric keys:
  - AES-KW ([RFC 3394](https://www.rfc-editor.org/rfc/rfc3394.html)): `AES-KW`
- For RSA keys:
  - RSA OAEP with SHA-256: `RSA-OAEP-256`

> Other key wrapping algorithms can be implemented in the future.

## Ciphertext format

The ciphertext is formatted as:

```text
header || binary_payload
```

## Header

The **header** is human-readable and contains 3 items, each terminated by a line feed (`0x0A`) character:

1. Name and version of the encryption scheme used. In this version of the spec, this is always `dapr.io/enc/v1`.
2. The manifest, which is a JSON object.
3. The MAC for the header, base64-encoded.

> Base64 encoding follows [RFC 4648 §4](https://datatracker.ietf.org/doc/html/rfc4648#section-4) ("standard" format, with padding included)

For example:

```text
dapr.io/enc/v1
{"k":"mykey","kw":1,"wfk":"hGYjwDpWEXEymSTFZ95zgX8krElb3Gqyls67R8zJA3k=","cph":1,"np":"Y3J5cHRvIQ=="}
pBDKLrhAWL7IAvDKBV/v7lmbTG6AEZbf3srUN0Pnn30=
```

### Manifest

The second line in the header is the **manifest**, which is a compact JSON object.

Its corresponding Go struct is:

```go
type Manifest struct {
	// Name of the key that can be used to decrypt the message.
	// This is optional, and if specified can be in the format `key` or `key/version`.
	KeyName string `json:"k,omitempty"`
	// ID of the wrapping algorithm used.
	// 0x01 = AES-KW
	// 0x02 = RSA-OAEP-256
	KeyWrappingAlgorithm int `json:"kw"`
	// The Wrapped File Key
	WFK []byte `json:"wfk"`
	// ID of the cipher used.
	// 0x01 = AES-GCM
	// 0x02 = ChaCha20-Poly1305
	Cipher int `json:"cph"`
	// Random sequence of 7 bytes generated by a CSPRNG.
	NoncePrefix []byte `json:"np"`
}
```

- **`KeyName`** is the name of the key that can be used to decrypt the message.
  Usually this is the same as the name of the key used to encrypt the message, but when asymmetric ciphers are used, it could be different.
  Including a `KeyName` in the manifest is not required, but when it's present, it's used as the default value for the key name while decrypting the document (however, users can override this value by passing a custom one while decrypting the document).
- **`Cipher`** indicates the cipher used to encrypt the actual data, and it must be an [AEAD](https://en.wikipedia.org/wiki/Authenticated_encryption#Authenticated_encryption_with_associated_data_(AEAD)) symmetric cipher.
  - Dapr will choose AES-GCM as cipher by default.
  - ChaCha20-Poly1305 is offered as an option for users that work with hardware that doesn't support AES-NI (such as Raspberry Pi), and needs to be enabled explicitly.
  - Other AEAD ciphers can be supported in the future if needed.
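
The manifest from the example header can be decoded with `encoding/json`, which maps base64 strings to `[]byte` fields automatically. The sketch below repeats the struct so that it compiles on its own; `parseManifest` and its length check on the nonce prefix are illustrative additions, not something this spec prescribes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Manifest mirrors the struct defined in this document.
type Manifest struct {
	KeyName              string `json:"k,omitempty"`
	KeyWrappingAlgorithm int    `json:"kw"`
	WFK                  []byte `json:"wfk"`
	Cipher               int    `json:"cph"`
	NoncePrefix          []byte `json:"np"`
}

// parseManifest (hypothetical) decodes the manifest line of the header.
func parseManifest(line string) (Manifest, error) {
	var m Manifest
	if err := json.Unmarshal([]byte(line), &m); err != nil {
		return Manifest{}, fmt.Errorf("invalid manifest: %w", err)
	}
	// The nonce prefix is defined as a 7-byte random sequence.
	if len(m.NoncePrefix) != 7 {
		return Manifest{}, fmt.Errorf("nonce prefix must be 7 bytes, got %d", len(m.NoncePrefix))
	}
	return m, nil
}

func main() {
	line := `{"k":"mykey","kw":1,"wfk":"hGYjwDpWEXEymSTFZ95zgX8krElb3Gqyls67R8zJA3k=","cph":1,"np":"Y3J5cHRvIQ=="}`
	m, err := parseManifest(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("key=%s kw=%d cph=%d np=%q\n", m.KeyName, m.KeyWrappingAlgorithm, m.Cipher, m.NoncePrefix)
}
```

In the example manifest, `np` is the base64 encoding of the 7 ASCII bytes `crypto!`.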

### MAC

The third and final line of the plaintext header is the MAC for the header, which is the HMAC-SHA-256 hash computed over the previous 2 lines (including the final newline character).

The HMAC key is derived from the (plain-text) File Key with HKDF-SHA-256:

```text
mac-key = HKDF-SHA-256(ikm = file key, salt = empty, info = "header")
MAC = HMAC-SHA-256(key = mac-key, message = first 2 lines of the header, including the trailing newline character)
```

> HKDF-SHA-256 is a key derivation function based on HMAC with SHA-256. See [RFC 5869 ("HMAC-based Extract-and-Expand Key Derivation Function (HKDF)")](https://www.rfc-editor.org/rfc/rfc5869.html). Being based on HMAC, it's not vulnerable to length-extension attacks, so we do not consider using SHA-512 and truncating the output to 256 bits necessary.

Note that there's one newline character (`0x0A`) at the end of the MAC, which concludes the header.

> Because each JSON encoder could produce a slightly different output, when verifying the manifest the MAC should be computed on the exact manifest string as included in the header. Verifiers should not re-encode the manifest as JSON themselves.

## Binary payload

The binary payload begins immediately after the header (after the 3rd newline character) and contains each encrypted segment of data:

```text
segment_0 || segment_1 || ... || segment_k
```

### Segments

The plaintext is chunked into segments of 64KB (65,536 bytes) each; the last segment may be shorter. Segments must never be empty, unless the entire file is empty.

> Because segments are 64KB each, and we can have up to 2^32 segments, the maximum size of a message is 2^32 × 64KB = 256TB of plaintext.

Each segment of plaintext is encrypted independently and stored together with its authentication tag at the end:

```text
encrypted_chunk || tag
```

> Tag size is 16 bytes for AES-GCM and ChaCha20-Poly1305, so each encrypted segment has an overhead of 16 bytes.
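
The segment count and the resulting payload size follow directly from the numbers above. The sketch below works through that arithmetic; `payloadSize` is a hypothetical helper, and it assumes that an empty message is stored as a single empty segment consisting of the tag only, which is one reading of the rule on empty segments.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	segmentSize = 64 << 10 // 65,536 bytes of plaintext per segment
	tagSize     = 16       // tag size for AES-GCM and ChaCha20-Poly1305
	maxSegments = 1 << 32  // the sequence number is a 32-bit counter
)

// payloadSize (hypothetical) returns the number of segments and the size in
// bytes of the binary payload, header excluded, for a given plaintext length.
// Assumption: an empty plaintext is stored as one empty segment (tag only).
func payloadSize(plaintextLen uint64) (segments, size uint64, err error) {
	if plaintextLen > maxSegments*segmentSize {
		return 0, 0, errors.New("plaintext exceeds the 256TB limit")
	}
	// Round up: the last segment may be shorter than 64KB.
	segments = (plaintextLen + segmentSize - 1) / segmentSize
	if segments == 0 {
		segments = 1
	}
	return segments, plaintextLen + segments*tagSize, nil
}

func main() {
	for _, n := range []uint64{0, 100_000, segmentSize, maxSegments * segmentSize} {
		segments, size, err := payloadSize(n)
		if err != nil {
			panic(err)
		}
		fmt.Printf("plaintext=%d segments=%d payload=%d\n", n, segments, size)
	}
}
```

For instance, a 100,000-byte message needs two segments, so its payload is 100,000 + 2 × 16 = 100,032 bytes.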

Segments are encrypted with a **Payload Key (PK)** that is derived from the plain-text File Key and the nonce prefix:

```text
payload-key = HKDF-SHA-256(ikm = file key, salt = nonce prefix, info = "payload")
```

Each segment is encrypted using a different 12-byte nonce:

```text
nonce_prefix || i || last_segment
```

Where:

- `nonce_prefix` (7 bytes) is the nonce prefix from the header.
- `i` (4 bytes) is the sequence number: a 32-bit unsigned integer counter, encoded as big-endian. The first segment has sequence number 0, and the counter increases by 1 for each subsequent segment.
- `last_segment` (1 byte) is `0x01` if this is the last segment, or `0x00` otherwise.
99 changes: 99 additions & 0 deletions crypto/schemes/enc/v1/algorithms.go
@@ -0,0 +1,99 @@
/*
Copyright 2023 The Dapr Authors
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package v1

import (
	"errors"
	"strconv"
)

// KeyAlgorithm is the algorithm used to wrap the file key.
type KeyAlgorithm string

const (
	KeyAlgorithmAESKW      KeyAlgorithm = "AES-KW"
	KeyAlgorithmRSAOAEP256 KeyAlgorithm = "RSA-OAEP-256"

	KeyAlgorithmAES KeyAlgorithm = "AES" // Alias for AES-KW
	KeyAlgorithmRSA KeyAlgorithm = "RSA" // Alias for RSA-OAEP-256
)

// Validate validates the passed algorithm and resolves aliases.
func (a KeyAlgorithm) Validate() (KeyAlgorithm, error) {
	switch a {
	// Valid algorithms, not aliased
	case KeyAlgorithmAESKW, KeyAlgorithmRSAOAEP256:
		return a, nil

	// Alias for AES-KW
	case KeyAlgorithmAES:
		return KeyAlgorithmAESKW, nil

	// Alias for RSA-OAEP-256
	case KeyAlgorithmRSA:
		return KeyAlgorithmRSAOAEP256, nil

	default:
		return a, errors.New("algorithm " + string(a) + " is not supported")
	}
}

// ID returns the numeric ID for the algorithm.
func (a KeyAlgorithm) ID() int {
	switch a {
	case KeyAlgorithmAESKW, KeyAlgorithmAES:
		return 1
	case KeyAlgorithmRSAOAEP256, KeyAlgorithmRSA:
		return 2
	default:
		return 0
	}
}

// NewKeyAlgorithmFromID returns a KeyAlgorithm from its ID.
func NewKeyAlgorithmFromID(id int) (KeyAlgorithm, error) {
	switch id {
	case 1:
		return KeyAlgorithmAESKW, nil
	case 2:
		return KeyAlgorithmRSAOAEP256, nil
	default:
		return "", errors.New("algorithm ID " + strconv.Itoa(id) + " is not supported")
	}
}

// MarshalJSON implements json.Marshaler.
func (a KeyAlgorithm) MarshalJSON() ([]byte, error) {
	return []byte(strconv.Itoa(a.ID())), nil
}

// UnmarshalJSON implements json.Unmarshaler.
func (a *KeyAlgorithm) UnmarshalJSON(dataB []byte) error {
	data := string(dataB)
	if data == "" || data == "null" {
		return errors.New("value is empty")
	}

	id, err := strconv.Atoi(data)
	if err != nil {
		return errors.New("failed to parse value as number")
	}

	newA, err := NewKeyAlgorithmFromID(id)
	if err != nil {
		return err
	}
	*a = newA
	return nil
}
108 changes: 108 additions & 0 deletions crypto/schemes/enc/v1/algorithms_test.go
@@ -0,0 +1,108 @@
/*
Copyright 2023 The Dapr Authors
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package v1

import (
	"encoding/json"
	"testing"
)

func TestKeyAlgorithmValidate(t *testing.T) {
	tests := []struct {
		name    string
		a       KeyAlgorithm
		want    KeyAlgorithm
		wantErr bool
	}{
		{name: string(KeyAlgorithmAESKW), a: KeyAlgorithmAESKW, want: KeyAlgorithmAESKW},
		{name: string(KeyAlgorithmAES) + " alias", a: KeyAlgorithmAES, want: KeyAlgorithmAESKW},
		{name: string(KeyAlgorithmRSAOAEP256), a: KeyAlgorithmRSAOAEP256, want: KeyAlgorithmRSAOAEP256},
		{name: string(KeyAlgorithmRSA) + " alias", a: KeyAlgorithmRSA, want: KeyAlgorithmRSAOAEP256},
		{name: "invalid algorithm", a: "foo", wantErr: true},
		{name: "empty algorithm", a: "", wantErr: true},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got, err := tt.a.Validate()
			if tt.wantErr {
				if err == nil {
					t.Errorf("KeyAlgorithm.Validate() error = %v, wantErr %v", err, tt.wantErr)
				}
				return
			} else if err != nil {
				t.Errorf("KeyAlgorithm.Validate() error = %v, wantErr %v", err, tt.wantErr)
			}
			if got != tt.want {
				t.Errorf("KeyAlgorithm.Validate() = %v, want %v", got, tt.want)
			}
		})
	}
}

func TestKeyAlgorithmMarshalJSON(t *testing.T) {
	tests := []struct {
		name    string
		a       KeyAlgorithm
		want    string
		wantErr bool
	}{
		{name: string(KeyAlgorithmAESKW), a: KeyAlgorithmAESKW, want: "1"},
		{name: string(KeyAlgorithmAES) + " alias", a: KeyAlgorithmAES, want: "1"},
		{name: string(KeyAlgorithmRSAOAEP256), a: KeyAlgorithmRSAOAEP256, want: "2"},
		{name: string(KeyAlgorithmRSA) + " alias", a: KeyAlgorithmRSA, want: "2"},
		{name: "invalid algorithm", a: "foo", want: "0"},
		{name: "empty algorithm", a: "", want: "0"},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got, err := json.Marshal(tt.a)
			if (err != nil) != tt.wantErr {
				t.Errorf("KeyAlgorithm.MarshalJSON() error = %v, wantErr %v", err, tt.wantErr)
				return
			}
			if string(got) != tt.want {
				// Print the JSON as a string rather than as a raw byte slice.
				t.Errorf("KeyAlgorithm.MarshalJSON() = %v, want %v", string(got), tt.want)
			}
		})
	}
}

func TestKeyAlgorithmUnmarshalJSON(t *testing.T) {
	tests := []struct {
		name    string
		message string
		want    KeyAlgorithm
		wantErr bool
	}{
		{name: string(KeyAlgorithmAESKW), message: "1", want: KeyAlgorithmAESKW},
		{name: string(KeyAlgorithmRSAOAEP256), message: "2", want: KeyAlgorithmRSAOAEP256},
		{name: "invalid ID", message: "99", wantErr: true},
		{name: "empty", message: "", wantErr: true},
		{name: "JSON null", message: "null", wantErr: true},
		{name: "JSON string", message: `"AES"`, wantErr: true},
		{name: "JSON object", message: `{"foo":1}`, wantErr: true},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			var a KeyAlgorithm
			err := json.Unmarshal([]byte(tt.message), &a)
			if (err != nil) != tt.wantErr {
				t.Errorf("KeyAlgorithm.UnmarshalJSON() error = %v, wantErr %v", err, tt.wantErr)
			}
			if a != tt.want {
				t.Errorf("KeyAlgorithm.UnmarshalJSON() = %v, want %v", a, tt.want)
			}
		})
	}
}