-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Manifest list encryption #7770
base: main
Are you sure you want to change the base?
Manifest list encryption #7770
Conversation
aba7650
to
3162f9e
Compare
@@ -162,6 +162,15 @@ default Iterable<DeleteFile> removedDeleteFiles(FileIO io) { | |||
*/ | |||
String manifestListLocation(); | |||
|
|||
/** | |||
* Return the size of this snapshot's manifest list. For encrypted tables, a verified plaintext |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix comment
3162f9e
to
a2d7b10
Compare
@@ -162,6 +162,25 @@ default Iterable<DeleteFile> removedDeleteFiles(FileIO io) { | |||
*/ | |||
String manifestListLocation(); | |||
|
|||
/** | |||
* Return the size of this snapshot's manifest list file. Must be a verified value, taken from a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confused here, we have a default of -1 as well set in base Snapshot which seemed to also be allowed as an "unset". Should we mention that here or is it always required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can define this field to be required only for encrypted tables. It will be not set in the snapshot file for unencrypted tables - where this method can return 0 (or -1, I'll make it consistent across all implementation classes).
if (manifestListKeyMetadata != null) { // encrypted manifest list file | ||
Preconditions.checkArgument( | ||
fileIO instanceof EncryptingFileIO, | ||
"No encryption in FileIO class " + fileIO.getClass()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cannot read manifest list (%s) because it is encrypted but the configured FileIO (%s) does not implement EncryptingFileIO)
EncryptingFileIO encryptingFileIO = (EncryptingFileIO) fileIO; | ||
Preconditions.checkArgument( | ||
encryptingFileIO.encryptionManager() instanceof StandardEncryptionManager, | ||
"Encryption manager for encrypted manifest list files can currently only be an instance of " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cannot decrypt manifest list (%s) because the encryption manager (%s) does not implement StandardEncryptionManager
generator.writeStringField(MANIFEST_LIST_KEY_METADATA, snapshot.manifestListKeyMetadata()); | ||
} | ||
|
||
// TODO discuss: do we need to sign the size value? Or sign the whole snapshot? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would this attack work? Wouldn't the user also need the key to encrypt the replacement files? I thought we were storing the metadata.json key in the catalog so an attacker could replace everything but still not be able to trick a client using the catalog.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the question. Some thoughts on the scenarios and protection options:
- currently, we don't have a metadata.json key. We have only a key for snapshot's manifest list file. Besides using it for encrypting the manifest list file, we can also use this key for signing snapshot's sensitive parts like the manifest list size field. Or for signing the whole metadata.json file (should be possible with some effort) - then we also protect the integrity of e.g. the table properties (like the table key id).
- snapshot (metadata.json file) doesn't keep secret values, so encrypting it might not be required. The signatures, mentioned above, would be kept in added snapshot fields - sufficient for detecting the file modification attacks.
- these protection techniques are not required with the REST catalog - because we trust the catalog service (we don't trust the storage service). Since the whole snapshot is stored in the REST catalog, we don't need to sign anything.
- the manifest list key is not stored in the catalog. Instead, it is wrapped in a KMS with the table master key, and stored in the snapshot MANIFEST_LIST_KEY_METADATA field. Only the KMS-authorized (for the table key) users/processes will be able to get the manifest list key.
- In catalogs other than the REST, the signatures provide a partial protection - because the metadata.json is kept in the untrusted storage. With the signatures, it can't be modified. But the whole folder can be replaced (e.g. a replay attack - where all table files are removed, and replaced with files of an older version of the table). To prevent this attack in non-REST catalogs, we will have to update the catalog per each table snapshot (setting eg the latest table version/sequence number, or a random AAD prefix)
&& encryptedManifestList.keyMetadata().buffer() != null) { | ||
Preconditions.checkArgument( | ||
encryptionManager instanceof StandardEncryptionManager, | ||
"Encryption manager for encrypted manifest list files can currently only be an instance of " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar comment as above. "Cannot X because Y"
@@ -85,7 +85,7 @@ public class TestManifestEncryption { | |||
|
|||
private static final DataFile DATA_FILE = | |||
new GenericDataFile( | |||
0, | |||
SPEC.specId(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
completing the previous patch #8252 (comment)
private static final long EXISTING_ROWS = 857273L; | ||
private static final int DELETED_FILES = 1; | ||
private static final long DELETED_ROWS = 22910L; | ||
private static final List<ManifestFile.PartitionFieldSummary> PARTITION_SUMMARIES = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably need a test example which has a non-empty list of partition field summaries
|
||
@Test | ||
public void testV2Write() throws IOException { | ||
ManifestFile manifest = writeAndReadManifestList(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: writeAndReadEncryptedManifestList
public void testV2Write() throws IOException { | ||
ManifestFile manifest = writeAndReadManifestList(); | ||
|
||
// all v2 fields should be read correctly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assert J has some helper for this, Not sure if it is correct
assertThat(actual).usingRecursiveComparison().isEqualTo(expected);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This unitest/code is mostly copy/paste from https://github.com/apache/iceberg/blob/main/core/src/test/java/org/apache/iceberg/TestManifestListVersions.java#L137
|
||
// TODO discuss: do we need to sign the size value? Or sign the whole snapshot? | ||
// Or rely on REST catalog? - the only option that prevents "full folder replacement" attack. | ||
if (snapshot.manifestListSize() >= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another small question here, we essentially are doing a transform here
manifestlists sizes < 0 become 0.
Also nit: we are also ignoring 0's that get passed through although we will read this as 0 if it is missing.
Just wondering what the intent here is. I think it may be better to have a defined missing value? Not sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, part of this thread #7770 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it important that we store the manifest list size? Won't the encryption be enough to prove the file is the right one?
Yep, this is due to https://github.com/apache/iceberg/blob/main/format/gcm-stream-spec.md#file-length . There are options for table modification attacks if this field is not (safely) stored. |
this.v1ManifestLocations = v1ManifestLocations; | ||
this.manifestListKeyMetadata = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as comment above (group manifest vars)
* In encrypted tables, return the size of this snapshot's manifest list file. Must be a verified | ||
* value, taken from a trusted source. In unencrypted tables, can return 0. | ||
*/ | ||
default long manifestListSize() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than adding new methods for each piece of new information we want to pass, what about adding a ManifestList
object that contains the location, size, and key metadata?
6c7f5a1
to
5486f6f
Compare
@@ -162,6 +162,16 @@ default Iterable<DeleteFile> removedDeleteFiles(FileIO io) { | |||
*/ | |||
String manifestListLocation(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is String manifestListLocation()
same as what is stored in ManifestListFile? If so, do we still need it as a separate field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rdblue @RussellSpitzer what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are dozens of calls to this method today (tests etc), so we'll likely need to keep it for a while.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I saw that. We can consolidate in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree. We need to keep this because it is part of the public API and used in many cases.
ByteBuffer.wrap( | ||
Base64.getDecoder().decode(snapshot.manifestListFile().wrappedKeyMetadata())); | ||
|
||
NativeEncryptionKeyMetadata wrappedKeyMetadata = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this unused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I'll remove these lines.
ByteBuffer manifestListKeyMetadata = null; | ||
ByteBuffer wrappedManifestListKeyMetadata = null; | ||
String wrappedKeyEncryptionKey = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Declare these inside if (node.has(MANIFEST_LIST_KEY_METADATA))
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are used later, in the BaseSnapshot constructor. We can use a different constructor (if no manifest list encryption), but the code won't be more compact.
return size; | ||
} | ||
|
||
public String wrappedKeyEncryptionKey() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add javadoc: "Manifest list keys are encrypted with a table "key encryption key". This function returns a KMS wrap of the key encryption key.
@@ -143,7 +172,13 @@ private void cacheManifests(FileIO fileIO) { | |||
|
|||
if (allManifests == null) { | |||
// if manifests isn't set, then the snapshotFile is set and should be read to get the list | |||
this.allManifests = ManifestLists.read(fileIO.newInputFile(manifestListLocation)); | |||
InputFile manifestListInputFile = fileIO.newInputFile(manifestListFile.location()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll move this under if else (called only for unencrypted)
return keyBytes; | ||
} | ||
|
||
public void setUnwrappedKey(ByteBuffer key) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not contain a key. It's fine to cache, but cache in unwrapKey
instead.
1ffa46c
to
684e5e3
Compare
update snapshot producer update the patch clean up fix for previous patch address review comments move key wrapping to metadata encryption encrypt manifest list key metadata new aad util null key needs no encryption comment; clearer method/var names use key encryption key for manifest list keys add encryption util changes update EncryptionTestHelpers handle api change remove unused lines revert revapi.yml KEK cache unitest update rename var address review comments fix timeout default change writer kek timeout default Updates from review. cache unwrapped keys
684e5e3
to
e39e740
Compare
Co-authored-by: Ryan Blue <[email protected]>
@rdblue thanks for the caffeine-based cache for unwrapped keys, I've applied the patch. |
Regarding the caching limit - 1 minute might be too short. If a reader (or writer) refreshes a table each few minutes, there could be many KMS unwrap calls. In the new commit, I've changed this to a combined limit of 1000 entries or 1 day. Each entry is comprised of a "wrapped key" (could be up to a few hundred bytes, depending on KMS), and "unwrapped key" (a few dozen bytes), so the 1000 entries / max size is something like 0.5MB. EDT: actually, we might be able to make the cache ~x10 smaller, if we use the "key id" as the cache key. Its size is a few dozen bytes. |
.palantir/revapi.yml
Outdated
new: "method java.lang.Long org.apache.iceberg.encryption.NativeEncryptionKeyMetadata::fileLength()" | ||
justification: "New method in interface" | ||
- code: "java.method.addedToInterface" | ||
new: "method org.apache.iceberg.encryption.NativeEncryptionKeyMetadata org.apache.iceberg.encryption.NativeEncryptionKeyMetadata::copyWithLength(long)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than revapi exceptions, we should add default implementations.
} | ||
|
||
@Override | ||
public ByteBuffer decryptKeyMetadata(EncryptionManager em) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When was this added? I don't think this should be here because it just calls a method on EncryptionManager
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was added in PR11 (these lines: interface, implementation ). I think the main is reason is, the class EncryptingFileIO
is in the api module, so it cannot access StandardEncryptionManager and EncryptionUtil classes.
@@ -160,4 +160,10 @@ private CatalogProperties() {} | |||
|
|||
public static final String ENCRYPTION_KMS_TYPE = "encryption.kms-type"; | |||
public static final String ENCRYPTION_KMS_IMPL = "encryption.kms-impl"; | |||
public static final String WRITER_KEK_TIMEOUT_SEC = "encryption.kek-timeout-sec"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid using kek
because it is not clear. Is there a good term for the key you're referring to here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we have a reasonably well defined value of this parameter in the NIST spec, maybe we don't need to make it configurable in an initial version of table encryption. This parameter is also not easy to explain. So I'll remove this from the CatalogProperties
for now. If, for some reason, a requirement comes later to make this configurable, we can add this back, with a proper name and documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds great!
} | ||
|
||
protected abstract ManifestFile prepare(ManifestFile manifest); | ||
|
||
protected abstract FileAppender<ManifestFile> newAppender( | ||
OutputFile file, Map<String, String> meta); | ||
OutputFile outputFile, Map<String, String> meta); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like file
was accidentally renamed to outputFile
. Can you revert that change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverting it triggers the following build message:
> Task :iceberg-core:checkstyleMain
[ant:checkstyle] [ERROR] /<path>/iceberg/core/src/main/java/org/apache/iceberg/ManifestListWriter.java:66:18:
'file' hides a field. [HiddenField]
> Task :iceberg-core:checkstyleMain FAILED
FAILURE: Build failed with an exception.
@@ -27,4 +27,15 @@ public interface NativeEncryptionKeyMetadata extends EncryptionKeyMetadata { | |||
|
|||
/** Additional authentication data as a {@link ByteBuffer} */ | |||
ByteBuffer aadPrefix(); | |||
|
|||
/** Encrypted file length */ | |||
Long fileLength(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From revapi changes, I think these need default implementations.
import java.io.Serializable; | ||
import java.nio.ByteBuffer; | ||
|
||
public class WrappedEncryptionKey implements Serializable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: needs Javadoc to explain what "Wrapped" means in this context.
@ggershinsky, I'm not too concerned with the size of the cache. I'm okay with 1 day, but that seems like a long time to have unencrypted key material in memory. I'll defer to your judgement here. |
Ok. We don't have clear guidelines on key caching in memory (key copies are spread all over the process memory - cache, plug-in KMS client code, an HTTP library in the KMS client code; the Java GC - so there are no guarantees for when an uncached key is deleted from memory, if ever, before the process stops). But I agree a day could be too long. I'll change it to 1 hour - might be a reasonable trade-off between performance requirements (KMS call overhead), and safety requirements (there is a chance a key will be deleted from the memory within a business day). |
Hi @rdblue , I've built the integration code with the latest version of this patch, works ok. Can we merge this PR? |
@rdblue Did you have any more comments on this one? I can do another pass as well but I'd like to finish this up as well soon |
@rdblue @RussellSpitzer All current comments should have been addressed in this thread and in the last commit. An additional review round is always welcome, I too would like to complete this feature. |
No description provided.