Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Initial implementation #1

Merged
merged 41 commits into from
Nov 21, 2024
Merged

Conversation

jachym-tousek-keboola
Copy link
Collaborator

@jachym-tousek-keboola jachym-tousek-keboola commented Oct 30, 2024

Jira: https://keboola.atlassian.net/browse/PSGO-909

The implementation is inspired by https://github.com/keboola/object-encryptor, however I have chosen somewhat different approach in order to avoid some aspects I didn't like.

I didn't want the library to be aware of keboola specific things such as project, branch, etc. So instead the encryptors just accept generic metadata which need to match between encrypt and decrypt.

I also disliked the code duplication regarding how the cloud provider wrappers generated a secret key which they then ask the cloud provider to encrypt and encrypt the actual data using https://github.com/defuse/php-encryption. This dual encryption is good, as it prevents size limits, I just wanted to deduplicate the code and provide a way to not use any cloud provider - can be useful to make keboola-as-code CI pipelines cheaper by not using the cloud providers unnecessarily. This is why native and dual encryptors exist.

Similarly caching and logging were implemented as separate proxy encryptors. This way we can freely combine cloud provider with caching and native implementation to achieve similar results to object encryptor while keeping each class very simple and focused on its single responsibility.

For in-memory caching I've chosen ristretto library as it is still developed unlike go-cache and their benchmarks claim that it's faster than bigcache.

Copy link
Contributor

@Matovidlo Matovidlo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some code ajustements and spliting into internal vs pkg modules. Basically all importable functions are inside pkg and all hidden utility or necessary functions that are non importable are inside internal folder

Makefile Outdated Show resolved Hide resolved
build/ci/golangci.yml Outdated Show resolved Hide resolved
cloudencrypt/aws.go Outdated Show resolved Hide resolved
cloudencrypt/aws.go Outdated Show resolved Hide resolved
cloudencrypt/aws.go Outdated Show resolved Hide resolved
cloudencrypt/dual.go Outdated Show resolved Hide resolved
cloudencrypt/encode.go Outdated Show resolved Hide resolved
cloudencrypt/encryptor.go Outdated Show resolved Hide resolved
cloudencrypt/log.go Outdated Show resolved Hide resolved
cloudencrypt/metadata.go Outdated Show resolved Hide resolved
@odinuv
Copy link
Member

odinuv commented Oct 31, 2024

Well "how the cloud provider wrappers generated a secret key which they then ask the cloud provider to encrypt and encrypt" is not a "code duplication". It's "envelope encryption" and it's recommended way of encrypting data at rest. Not doing it may be a valid choice, but it is not a question of liking, but security compliance.

Also If there is a choice - I would refrain from using Azure secrets and use Key Encrypt too - the downside is that the length is limited, but if it suffices, it will be faster and is a lot easier to mange (otherwise the secrets accumulate in the key store).

@jachym-tousek-keboola
Copy link
Collaborator Author

@odinuv

Well "how the cloud provider wrappers generated a secret key which they then ask the cloud provider to encrypt and encrypt" is not a "code duplication". It's "envelope encryption" and it's recommended way of encrypting data at rest. Not doing it may be a valid choice, but it is not a question of liking, but security compliance.

Fair, we will of course use both. But the code is easier to understand and cover with tests this way.

Also If there is a choice - I would refrain from using Azure secrets and use Key Encrypt too - the downside is that the length is limited, but if it suffices, it will be faster and is a lot easier to mange (otherwise the secrets accumulate in the key store).

I was considering that. The main issue is that Azure's Encrypt/Decrypt endpoints don't seem to have anything similar to the authentication field the other providers have (where we pass the metadata like project id). How should we solve that part for Azure then?

@odinuv
Copy link
Member

odinuv commented Nov 5, 2024

How should we solve that part for Azure then?

Envelope encryption :) Provided that envelope fits into Azure encrypt limits (I don't know, it may or may not)

@jachym-tousek-keboola jachym-tousek-keboola force-pushed the jt-psgo-909-cloud-encrypt branch 2 times, most recently from b37c981 to f14585a Compare November 6, 2024 09:59
@jachym-tousek-keboola jachym-tousek-keboola force-pushed the jt-psgo-909-cloud-encrypt branch 2 times, most recently from caabe21 to 4a8a8c3 Compare November 7, 2024 10:44
@jachym-tousek-keboola jachym-tousek-keboola force-pushed the jt-psgo-909-cloud-encrypt branch 4 times, most recently from 9fb2f13 to ac43ec4 Compare November 8, 2024 11:56
Copy link
Contributor

@Matovidlo Matovidlo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some comments, I'll try to merge docker-compose but maybe it is not possible.

Copy link
Contributor

@Matovidlo Matovidlo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I closed some of the comments of dynamo, S3 and asymetric cryptography to have call with creator of object encryptor library.
Btw, if you move internal in top level it should works (I have tried it), as I am not sure whether the go module importer works also on pkg/internal level.

@jachym-tousek-keboola
Copy link
Collaborator Author

Btw, if you move internal in top level it should works (I have tried it), as I am not sure whether the go module importer works also on pkg/internal level.

@Matovidlo That's better, thanks for the suggestion!

@jachym-tousek-keboola
Copy link
Collaborator Author

jachym-tousek-keboola commented Nov 20, 2024

@Matovidlo AWS and GCP now work using OIDC. I tried it for Azure as well 06a5c33 but it didn't work https://github.com/keboola/go-cloud-encrypt/actions/runs/11931172541/job/33253483907. According to Martin it's because Azure doesn't support wildcards in subjects so it's essentially impossible (or very hacky). I subscribed to Azure/azure-workload-identity#373 where the limitation is tracked.

@jachym-tousek-keboola jachym-tousek-keboola merged commit 4f21343 into main Nov 21, 2024
1 check passed
@jachym-tousek-keboola jachym-tousek-keboola deleted the jt-psgo-909-cloud-encrypt branch November 21, 2024 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants