Skip to content
This repository has been archived by the owner on Apr 17, 2024. It is now read-only.

Hashicorp Vault Integration - Python #711

Merged
merged 16 commits into from
Jan 26, 2024

Conversation

JordanStopford
Copy link
Contributor

Hi all,

This is my first pull request for this project so please bear with me...

This pull requests includes Hashicorp Vault integration, similar to the implementation already available for go.

If there's any other information needed or if I've missed anything please let me know.

Regards,

Jordan

Tests implemented but not yet running
Tests all running and passing
Fixed mount points
Fixed encrypt/decrypt
Added ability to specify vault client arguments from register_client
@google-cla
Copy link

google-cla bot commented Aug 1, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Doc fix
Fixed cert parameter
@chuckx chuckx self-assigned this Aug 10, 2023
@juergw
Copy link
Contributor

juergw commented Aug 17, 2023

Thanks for doing this! This is certainly something that we need.

I'm commenting first here and not on the Java implementation. Many comments will be similar for the Java implementation. Let me know if you prefer to do the Java version first, then we could continue the discussion also there.

We are currently in the process of changing how we use the KMS integration in Tink. The current global KMS registration has many problems and is a bit confusing. We therefore want to deprecate it. Instead, the user should just create an AEAD object, and then use that to encrypt a keyset or to implement envelope encryption or do whatever they want to do.

For this new integration here, I think we should therefore avoid defining a registration function. There is also no need to implement a KMS client anymore, all we really need is a function that returns the AEAD object. Also, I think it would be easiest if the user constructs the hvac.Client themselves, and then passes that to the function that creates the AEAD, together with the key_uri, for example like this

def create_aead(client: hvac.Client, key_uri: str) -> aead.Aead:

This also makes it easier to test, because we can pass in a fake hvac.Client.

For the implementation of _HcVaultKmsAead: The function get_endpoint_paths should not be part of _HcVaultKmsAead, it should be a normal function and probably be private. And we usually don't write the "get", so I think it should be

def _endpoint_paths(self, key_uri: str) -> Tuple[str, str]:

This function should be tested with the same test cases as in the Go implementation in TestGetEndpointPaths.

The _HcVaultKmsAead implementation is currently not really tested. Could you add some tests? I think it should be possible to make a simple fake implementation of hvac.Client, and use that. Or do the tests similar as in the go implementation, by running a fake HTTP server.

@morambro morambro requested a review from juergw August 17, 2023 09:39
@JordanStopford
Copy link
Contributor Author

Thanks for your comments - I'll look into actioning those now. In terms of the architecture, how would you expect a user to call create_aead()? Are you suggesting we keep the KMS client class but just remove the requirement to register it?

@juergw
Copy link
Contributor

juergw commented Aug 23, 2023

Let me explain the API I have in mind a bit better.

Yes, removing the "register_client" from the HcVaultKmsClient class is the most important change. I'm deprecating all these existing functions and we don't want new functions that do this.

Then, the user would create a kms aead object like this:
remote_aead = hcvault.HcVaultKmsClient(KEY_URI, TOKEN).get_aead(KEY_URI)

But this could be simplified by just providing a function that does not require the user to create a client, that directly create an aead:
remote_aead = hcvault.create_aead(KEY_URI, TOKEN)

In fact, the whole "HcVaultKmsClient" class is not really needed anymore if we don't want the user to register. And it is preferable to not add this class, because the less we have to add to the public interface, the better.

Now this create_aead function has the problem that we need to add a lot of optional parameters, like this:
def create_aead(key_uri: str, token: str, https=True, verify=True, client_certs: Tuple[str, str] = None, namespace: Optional[str] = None) -> aead.Aead
...
So instead, I think it would be simpler and preferable to just define it like this:
def create_aead(key_uri: str, client: hvac.Client) -> aead.Aead
...
Now, the user has to create the "hvac.Client" object themselves. This makes it more flexible for the user and easier to implement and maintain for us.

@JordanStopford
Copy link
Contributor Author

I've updated the code with the fixes from your comments - let me know if this is OK!

@juergw
Copy link
Contributor

juergw commented Aug 28, 2023

Thanks for the changes.

Some more comments:
_endpoint_paths: I think the comment does not match what the implementation does. In the golang implementation, it returns a encryptPath and a decryptPath, but in your implementation it returns a mount point and a key id. Please update the name and the documentation. And I think this function should be tested with the same uri test cases as in https://github.com/google/tink/blob/master/go/integration/hcvault/hcvault_aead_internal_test.go

The code now adds a new dependency on flask. We really would like to avoid that. can this be implemented using https://docs.python.org/3/library/http.server.html?

@juergw
Copy link
Contributor

juergw commented Aug 28, 2023

I think _endpoint_paths doesn't return correct mount point. Maybe it's best to just start with only accepting the default "transit" as mount point.

To do that, you just have to check that the path in the key URI has the form:

transit/keys/

and then don't set the mount point parameter in encrypt and decrypt, which will use the default "transit" mount point.

@JordanStopford
Copy link
Contributor Author

I'll look into those changes - I've done the testing with Flask as it's something I already know so was quick to do, it's only needed for the tests but I couldn't see a way of only adding test dependencies? Is there a way of doing that? I think it will help keep the tests simple and reduce boilerplate code

JordanStopford and others added 2 commits August 29, 2023 13:58
Added extra tests for endpoint path calculation
Fixed issues for endpoint path calculation as a result
@JordanStopford
Copy link
Contributor Author

I've added some new tests - there is one exclusion from what is represented in the go tests for the endpoints paths which is the "this and+that" path. I've tested that locally and it's an unsupported path in vault so I have removed it from the Python tests.

@juergw
Copy link
Contributor

juergw commented Sep 6, 2023

I think the code now looks fine, but I'm not sure about the tests. I really don't want to add a dependency on Flask. Maybe it would be better to test this using a mock client?

But, more importantly, I think we (the Tink team) really should add test using the real vault. But this is something we need to set up ourselves, starting with the go implementation. I'll look into this. I'm really busy this week, so it may take some time.

@JordanStopford
Copy link
Contributor Author

Is there a way we can add the dependency only for test runs? I'm not overly familiar with bazel but I'm sure it should be possible, most other systems have test dependencies independent of implementation dependencies.

When I was writing the tests I did them against a real vault server - once they all worked I then ensured that the mock server behaved in a similar fashion so that they pass against both a real server and a mock server.

@JordanStopford
Copy link
Contributor Author

I've now refactored the tests to not use flask - does this all look OK?

@juergw
Copy link
Contributor

juergw commented Oct 3, 2023

Great, thanks.

Could you also change the date in the files from 2019 to 2023?

I'll try to get this integrated this or next week. Sorry for the delay.

Remove my local testing config
@JordanStopford
Copy link
Contributor Author

All done - I'm on holiday for a couple of weeks now but when I get back I'll make the necessary changes to the Java side so we can get that integrated too.

@juergw
Copy link
Contributor

juergw commented Nov 28, 2023

Before integrating this pull request, I added some more tests, to make sure that this this implementation and the golang implementation are compatible. I found out that associated data is not working as expected in the golang implementation. So, I first wanted to fix that. I have now added a new API for golang where associated data works correctly, tink-crypto/tink-go-hcvault@2304be5. It uses the "associated_data" parameter of the encrypt/decrypt requests, instead of "context", see https://developer.hashicorp.com/vault/api-docs/secret/transit#decrypt-data.
Now I wanted to see if this pull request works and is compatible with the golang implementation if I use the "associated_data" parameter, and realized that the hvac library does not (yet?) support this: https://hvac.readthedocs.io/en/stable/usage/secrets_engines/transit.html#decrypt-data. So I think we have to wait with this until hvac support this feature.

@juergw
Copy link
Contributor

juergw commented Nov 28, 2023

I opened an issue for this: hvac/hvac#1107.

@morambro morambro self-requested a review January 17, 2024 10:26
@morambro
Copy link
Contributor

Hi @JordanStopford I'll work on merging this PR.

Could you please sync with master and resolve conflicts? AFACT this is what should be done:

  • creating a requirements_hcvault.in file
  • update setup.py's extras_require field
  • update requirements_all.txt with
    pip-compile --all-extras --generate-hashes --output-file=requirements_all.txt setup.py

@JordanStopford
Copy link
Contributor Author

Hi @morambro apologies for the delay but I believe I've made the changes specified in your message and synced my fork with master. Hopefully should be possible for you to merge now, thanks!

@morambro
Copy link
Contributor

Thanks! No worries :) I'll give it a try

@morambro morambro merged commit 33c4286 into tink-crypto:master Jan 26, 2024
6 checks passed
copybara-service bot pushed a commit to tink-crypto/tink-py that referenced this pull request Jan 31, 2024
This was originally added to github.com/tink-crypto/tink in tink-crypto/tink#711.

#tinkApiChange

PiperOrigin-RevId: 603075572
Change-Id: Ibae1059d7cd739252ecee34f97087d244debb496
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants