feat: Initial implementation #1
Some code adjustments and splitting into internal vs pkg modules. Basically, all importable functions are inside pkg, and all hidden utility or otherwise necessary functions that are not importable are inside the internal folder.
Well, "how the cloud provider wrappers generated a secret key which they then ask the cloud provider to encrypt and encrypt" is not "code duplication". It's "envelope encryption", and it's the recommended way of encrypting data at rest. Not doing it may be a valid choice, but it is not a question of liking but of security compliance. Also, if there is a choice, I would refrain from using Azure secrets and use Key Encrypt too. The downside is that the length is limited, but if it suffices, it will be faster and a lot easier to manage (otherwise the secrets accumulate in the key store).
Fair, we will of course use both. But the code is easier to understand and cover with tests this way.
I was considering that. The main issue is that Azure's Encrypt/Decrypt endpoints don't seem to have anything similar to the authentication field the other providers have (where we pass metadata such as the project id). How should we solve that part for Azure then?
Envelope encryption :) Provided that the envelope fits into Azure's encrypt limits (I don't know, it may or may not).
I added some comments. I'll try to merge docker-compose, but it may not be possible.
LGTM. I closed some of the comments on dynamo, S3 and asymmetric cryptography in order to have a call with the creator of the object encryptor library.
Btw, if you move internal to the top level it should work (I have tried it), as I am not sure whether the Go module importer also works at the pkg/internal level.
@Matovidlo That's better, thanks for the suggestion!
@Matovidlo AWS and GCP now work using OIDC. I tried it for Azure as well in 06a5c33, but it didn't work (https://github.com/keboola/go-cloud-encrypt/actions/runs/11931172541/job/33253483907). According to Martin, it's because Azure doesn't support wildcards in subjects, so it's essentially impossible (or very hacky). I subscribed to Azure/azure-workload-identity#373, where the limitation is tracked.
Jira: https://keboola.atlassian.net/browse/PSGO-909
The implementation is inspired by https://github.com/keboola/object-encryptor; however, I have chosen a somewhat different approach in order to avoid some aspects I didn't like.
I didn't want the library to be aware of Keboola-specific things such as project, branch, etc. So instead the encryptors just accept generic metadata, which needs to match between encrypt and decrypt.
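One common way to make metadata "need to match" is to pass it as additional authenticated data (AAD) to an AEAD cipher. The following is a minimal sketch of that idea, assuming AES-256-GCM from the Go standard library; the `Metadata` type and the `encodeMetadata`, `encrypt` and `decrypt` functions are hypothetical names for illustration and are not taken from this PR.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
	"sort"
	"strings"
)

// Metadata is a hypothetical name for the generic key-value pairs.
type Metadata map[string]string

// encodeMetadata serializes metadata deterministically (sorted keys,
// quoted values) so the same map always yields the same bytes.
func encodeMetadata(m Metadata) []byte {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%q=%q;", k, m[k])
	}
	return []byte(b.String())
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt seals plaintext and authenticates the metadata as AAD.
// The random nonce is prepended to the returned ciphertext.
func encrypt(key, plaintext []byte, meta Metadata) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, encodeMetadata(meta)), nil
}

// decrypt fails unless the metadata equals what was used in encrypt.
func decrypt(key, ciphertext []byte, meta Metadata) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(ciphertext) < n {
		return nil, errors.New("ciphertext too short")
	}
	return gcm.Open(nil, ciphertext[:n], ciphertext[n:], encodeMetadata(meta))
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	ct, err := encrypt(key, []byte("secret"), Metadata{"projectId": "123"})
	if err != nil {
		panic(err)
	}
	_, err = decrypt(key, ct, Metadata{"projectId": "456"})
	fmt.Println("decrypt with different metadata failed:", err != nil)
}
```

With this approach the metadata is never stored in the ciphertext; it only has to be supplied again, unchanged, at decrypt time.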
I also disliked the code duplication in how the cloud provider wrappers generated a secret key, asked the cloud provider to encrypt it, and then encrypted the actual data using https://github.com/defuse/php-encryption. This dual encryption is good, as it avoids size limits; I just wanted to deduplicate the code and provide a way to not use any cloud provider at all, which can be useful to make keboola-as-code CI pipelines cheaper by not using the cloud providers unnecessarily. This is why the native and dual encryptors exist.
Similarly, caching and logging were implemented as separate proxy encryptors. This way we can freely combine a cloud provider with caching and the native implementation to achieve results similar to object encryptor while keeping each class very simple and focused on its single responsibility.
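The dual (envelope) scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `KeyWrapper` is a hypothetical interface standing in for a cloud provider's encrypt/decrypt call, `localWrapper` is a demo-only stand-in that wraps keys with a static master key, and the length-prefixed envelope layout is made up for the example. Metadata handling is left out for brevity.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/binary"
	"errors"
	"fmt"
)

// KeyWrapper is a hypothetical stand-in for a cloud KMS call that
// encrypts and decrypts a small data key.
type KeyWrapper interface {
	Wrap(key []byte) ([]byte, error)
	Unwrap(wrapped []byte) ([]byte, error)
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts data with AES-GCM and prepends the random nonce.
func seal(key, data []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, data, nil), nil
}

// open reverses seal.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("sealed data too short")
	}
	return gcm.Open(nil, sealed[:n], sealed[n:], nil)
}

// localWrapper wraps data keys with a static master key (demo only).
type localWrapper struct{ master []byte }

func (w localWrapper) Wrap(key []byte) ([]byte, error)   { return seal(w.master, key) }
func (w localWrapper) Unwrap(wrapped []byte) ([]byte, error) { return open(w.master, wrapped) }

// envelopeEncrypt encrypts the data locally with a fresh data key and
// asks the wrapper to encrypt only that key, so payload size is not
// bound by the wrapper's limits.
func envelopeEncrypt(w KeyWrapper, plaintext []byte) ([]byte, error) {
	dataKey := make([]byte, 32)
	if _, err := rand.Read(dataKey); err != nil {
		return nil, err
	}
	body, err := seal(dataKey, plaintext)
	if err != nil {
		return nil, err
	}
	wrapped, err := w.Wrap(dataKey)
	if err != nil {
		return nil, err
	}
	if len(wrapped) > 0xFFFF {
		return nil, errors.New("wrapped key too long")
	}
	// Layout (assumed): 2-byte length | wrapped key | encrypted body.
	out := make([]byte, 2, 2+len(wrapped)+len(body))
	binary.BigEndian.PutUint16(out, uint16(len(wrapped)))
	out = append(out, wrapped...)
	return append(out, body...), nil
}

// envelopeDecrypt unwraps the data key and decrypts the body with it.
func envelopeDecrypt(w KeyWrapper, envelope []byte) ([]byte, error) {
	if len(envelope) < 2 {
		return nil, errors.New("envelope too short")
	}
	n := int(binary.BigEndian.Uint16(envelope))
	if len(envelope) < 2+n {
		return nil, errors.New("envelope truncated")
	}
	dataKey, err := w.Unwrap(envelope[2 : 2+n])
	if err != nil {
		return nil, err
	}
	return open(dataKey, envelope[2+n:])
}

func main() {
	master := make([]byte, 32)
	if _, err := rand.Read(master); err != nil {
		panic(err)
	}
	w := localWrapper{master: master}
	env, err := envelopeEncrypt(w, []byte("large payload"))
	if err != nil {
		panic(err)
	}
	pt, err := envelopeDecrypt(w, env)
	fmt.Println(string(pt), err)
}
```

Because the wrapper is an interface, the same envelope code could sit on top of any provider, or on top of a purely local key when no cloud provider is wanted.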
For in-memory caching I've chosen the ristretto library, as it is still actively developed, unlike go-cache, and its benchmarks claim that it's faster than bigcache.
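To illustrate the proxy idea, here is a minimal sketch of a caching encryptor that wraps another encryptor. It is an assumption-laden example, not the PR's code: the `Encryptor` interface is hypothetical and omits metadata, the cache is a plain mutex-guarded map rather than ristretto, and `countingEncryptor` is a fake backend (hex encoding, not encryption) used only to show how many calls reach the wrapped encryptor.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"sync"
)

// Encryptor is a hypothetical, simplified interface (no metadata).
type Encryptor interface {
	Encrypt(plaintext []byte) ([]byte, error)
	Decrypt(ciphertext []byte) ([]byte, error)
}

// countingEncryptor is a fake backend: it hex-encodes instead of
// encrypting and counts how often Decrypt is called.
type countingEncryptor struct{ decryptCalls int }

func (c *countingEncryptor) Encrypt(p []byte) ([]byte, error) {
	return []byte(hex.EncodeToString(p)), nil
}

func (c *countingEncryptor) Decrypt(ct []byte) ([]byte, error) {
	c.decryptCalls++
	return hex.DecodeString(string(ct))
}

// cachedEncryptor is a proxy: it satisfies Encryptor itself and
// delegates to the wrapped encryptor only on a cache miss.
type cachedEncryptor struct {
	inner Encryptor
	mu    sync.Mutex
	cache map[string][]byte
}

func newCachedEncryptor(inner Encryptor) *cachedEncryptor {
	return &cachedEncryptor{inner: inner, cache: make(map[string][]byte)}
}

// Encrypt is passed through unchanged.
func (c *cachedEncryptor) Encrypt(p []byte) ([]byte, error) {
	return c.inner.Encrypt(p)
}

// Decrypt returns a cached plaintext when the ciphertext was seen before.
func (c *cachedEncryptor) Decrypt(ct []byte) ([]byte, error) {
	c.mu.Lock()
	cached, ok := c.cache[string(ct)]
	c.mu.Unlock()
	if ok {
		// Return a copy so callers cannot mutate the cached value.
		return append([]byte(nil), cached...), nil
	}
	pt, err := c.inner.Decrypt(ct)
	if err != nil {
		return nil, err
	}
	c.mu.Lock()
	c.cache[string(ct)] = append([]byte(nil), pt...)
	c.mu.Unlock()
	return pt, nil
}

func main() {
	backend := &countingEncryptor{}
	enc := newCachedEncryptor(backend)
	ct, _ := enc.Encrypt([]byte("value"))
	for i := 0; i < 3; i++ {
		if _, err := enc.Decrypt(ct); err != nil {
			panic(err)
		}
	}
	fmt.Println("backend decrypt calls:", backend.decryptCalls)
}
```

If metadata is part of the interface, the cache key would presumably need to include it as well, so that a cached plaintext is never returned for a request whose metadata does not match.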