-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
‼️ (iam): OpenIdConnectProvider defaults to first thumbprint instead of root CA thumbprint #8607
Comments
Same problem observed in CLI Version: 1.46.0 |
Hello @eladb , That seem to work perfectly 👍 Tested with alb-ingress-controller: But i don't understand why the thumbprint on the identity provider is false.
|
Glad to hear that |
Our EKS code currently has the following comment:
|
Routing to IAM |
It looks like the lambda is using the *.execute-api.us-west-2.amazonaws.com -> Amazon -> Amazon Root CA 1 -> Starfield Services Root Certificate Authority - G2 The last giving the expected thumbprint of This is an issue beyond just retrieving the EKS issuer thumbprint. For any issuer, if the first certificate in the chain is self-signed, it'll be the root, but in any other case the intermediate is retrieved instead. The Node.js TLS documentation for getPeerCertificate gives more info that could be useful in implementing this. |
@bleish wow thanks so much for the detailed explanation. Would you say the preferred behavior for the OIDC provider resource is to always use the root CA thumbprint? |
@eladb
One gotcha I ran into when testing the |
It looks like openssl s_client doesn't always return the full certificate chain. The documentation for the
Using |
My organization is also having this same issue. Can this get prioritized? |
We're working on a fix for this issue right now. If you can't upgrade to the CDK version that contains the fix, you can workaround this issue on an old version by changing any
Or, if you prefer to use aws-iam, you can hardcode the thumbprint to that of the root certificate (which is exactly what the
|
Signed-off-by: Vinayak Kukreja <[email protected]>
…rtificate (#22509) Currently, the IAM OIDC Provider is retrieving leaf certificates for a given url. The validity for these certificates is not that long. This can cause an outage for the customer since they might not be aware of when the certificate is going to expire. We have seen an [outage](#8607) in EKS due to this issue. This change will help retrieving root certificates instead of leaf certificates. The validity of root certificate is much more than the leaf certificates. I am also adding validations for the certificate and also informing the customer if there retrieved certificate is going to expire within six months when they do a new deployment. Fixes #8607 Signed-off-by: Vinayak Kukreja <[email protected]> ### All Submissions: * [x] Have you followed the guidelines in our [Contributing guide?](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) ### Adding new Unconventional Dependencies: * [ ] This PR adds new unconventional dependencies following the process described [here](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md/#adding-new-unconventional-dependencies) ### New Features * [x] Have you added the new feature to an [integration test](https://github.com/aws/aws-cdk/blob/main/INTEGRATION_TESTS.md)? * [x] Did you use `yarn integ` to deploy the infrastructure and generate the snapshot (i.e. `yarn integ` without `--dry-run`)? *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
|
…rtificate (#22509) Currently, the IAM OIDC Provider is retrieving leaf certificates for a given url. The validity for these certificates is not that long. This can cause an outage for the customer since they might not be aware of when the certificate is going to expire. We have seen an [outage](#8607) in EKS due to this issue. This change will help retrieving root certificates instead of leaf certificates. The validity of root certificate is much more than the leaf certificates. I am also adding validations for the certificate and also informing the customer if there retrieved certificate is going to expire within six months when they do a new deployment. Fixes #8607 Signed-off-by: Vinayak Kukreja <[email protected]> ### All Submissions: * [x] Have you followed the guidelines in our [Contributing guide?](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) ### Adding new Unconventional Dependencies: * [ ] This PR adds new unconventional dependencies following the process described [here](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md/#adding-new-unconventional-dependencies) ### New Features * [x] Have you added the new feature to an [integration test](https://github.com/aws/aws-cdk/blob/main/INTEGRATION_TESTS.md)? * [x] Did you use `yarn integ` to deploy the infrastructure and generate the snapshot (i.e. `yarn integ` without `--dry-run`)? *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
Reopening. See #22802 |
Adding a notice for IAM/EKS OIDC Issue where the OIDC provider currently is retrieving short lived leaf certificates instead of root certificates which have a longer expiration date.
…f root (#22802) Currently, the implementation of the OIDC provider custom resource fetches the certificate chain using: https://github.com/aws/aws-cdk/blob/9bde9f3149cbfa6e7b97204f54e7cef5c9127971/packages/%40aws-cdk/aws-iam/lib/oidc-provider/external.ts#L40 It then extracts the root certificate by detecting a circular reference in the `cert.issuerCertificate` property. https://github.com/aws/aws-cdk/blob/9bde9f3149cbfa6e7b97204f54e7cef5c9127971/packages/%40aws-cdk/aws-iam/lib/oidc-provider/external.ts#L46 As it turns out, this results in the wrong certificate being extracted. I observed this while running an [EKS integration test](https://github.com/aws/aws-cdk/blob/main/packages/%40aws-cdk/aws-eks/test/integ.eks-service-account-sdk-call.ts). The current certificate thumbprint that is extracted is: `AD7E1C28B064EF8F6003402014C3D0E3370EB58A`. While the expected thumbprint is: `9E99A48A9960B14926BB7F3B02E22DA2B0AB7280`. (this is the value we used to [hardcode](https://github.com/aws/aws-cdk/blob/v2.50.0/packages/%40aws-cdk/aws-eks/lib/oidc-provider.ts#L49)) The [recommended way](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_create_oidc_verify-thumbprint.html) for extracting the correct thumbprint according to AWS is using `openssl s_client -showcerts`. When I tried this, I did in fact see that the last certificate returned by this command has the correct thumbprint. During investigation, I noticed that `socket.getPeerCertificate(true)` returns another additional certificate, acting as the root one. This is aligned with what the [comment](#8607 (comment)) made in the original issue. This additional certificate is not the correct one, and we should be using the one just before it in the chain. There is however no way to detect that "second to last" certificate in this way, because it doesn't resolve to a circular reference. After some digging, I switched the implementation to use `socket.getPeerX509Certificate()`, a new method that only exists in Node16. This method skips over the incorrect certificate, and results in the correct thumbprint. <img width="539" alt="Screen Shot 2022-11-06 at 10 04 51 PM" src="https://user-images.githubusercontent.com/1428812/200195623-6735377b-a82f-472f-884d-7bec450c32c6.png"> Fixes #8607 ---- ### All Submissions: * [X] Have you followed the guidelines in our [Contributing guide?](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) ### Adding new Unconventional Dependencies: * [ ] This PR adds new unconventional dependencies following the process described [here](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md/#adding-new-unconventional-dependencies) ### New Features * [ ] Have you added the new feature to an [integration test](https://github.com/aws/aws-cdk/blob/main/INTEGRATION_TESTS.md)? * [ ] Did you use `yarn integ` to deploy the infrastructure and generate the snapshot (i.e. `yarn integ` without `--dry-run`)? *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
|
…f root (#22802) Currently, the implementation of the OIDC provider custom resource fetches the certificate chain using: https://github.com/aws/aws-cdk/blob/9bde9f3149cbfa6e7b97204f54e7cef5c9127971/packages/%40aws-cdk/aws-iam/lib/oidc-provider/external.ts#L40 It then extracts the root certificate by detecting a circular reference in the `cert.issuerCertificate` property. https://github.com/aws/aws-cdk/blob/9bde9f3149cbfa6e7b97204f54e7cef5c9127971/packages/%40aws-cdk/aws-iam/lib/oidc-provider/external.ts#L46 As it turns out, this results in the wrong certificate being extracted. I observed this while running an [EKS integration test](https://github.com/aws/aws-cdk/blob/main/packages/%40aws-cdk/aws-eks/test/integ.eks-service-account-sdk-call.ts). The current certificate thumbprint that is extracted is: `AD7E1C28B064EF8F6003402014C3D0E3370EB58A`. While the expected thumbprint is: `9E99A48A9960B14926BB7F3B02E22DA2B0AB7280`. (this is the value we used to [hardcode](https://github.com/aws/aws-cdk/blob/v2.50.0/packages/%40aws-cdk/aws-eks/lib/oidc-provider.ts#L49)) The [recommended way](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_create_oidc_verify-thumbprint.html) for extracting the correct thumbprint according to AWS is using `openssl s_client -showcerts`. When I tried this, I did in fact see that the last certificate returned by this command has the correct thumbprint. During investigation, I noticed that `socket.getPeerCertificate(true)` returns another additional certificate, acting as the root one. This is aligned with what the [comment](#8607 (comment)) made in the original issue. This additional certificate is not the correct one, and we should be using the one just before it in the chain. There is however no way to detect that "second to last" certificate in this way, because it doesn't resolve to a circular reference. After some digging, I switched the implementation to use `socket.getPeerX509Certificate()`, a new method that only exists in Node16. This method skips over the incorrect certificate, and results in the correct thumbprint. <img width="539" alt="Screen Shot 2022-11-06 at 10 04 51 PM" src="https://user-images.githubusercontent.com/1428812/200195623-6735377b-a82f-472f-884d-7bec450c32c6.png"> Fixes #8607 ---- * [X] Have you followed the guidelines in our [Contributing guide?](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) * [ ] This PR adds new unconventional dependencies following the process described [here](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md/#adding-new-unconventional-dependencies) * [ ] Have you added the new feature to an [integration test](https://github.com/aws/aws-cdk/blob/main/INTEGRATION_TESTS.md)? * [ ] Did you use `yarn integ` to deploy the infrastructure and generate the snapshot (i.e. `yarn integ` without `--dry-run`)? *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
if this is Closed, shouldn't notice 8607 be removed ? or do i still have to acknowledge it even thoug its fixed by a release? |
Apologies for this. We had an issue that wrongly showed to notice for non affected versions. This should be fixed now. |
@fadinasr Thanks for letting me know. I can't reproduce this. |
Please add your +1 👍 to let us know you have encountered this
Status: IN-PROGRESS
Overview:
The
iam.OpenIdConnectProvider
resource contains logic that dynamically fetches the certificate thumbprint required to create an OpenID Connect provider. However, as of now, it mistakenly fetches the leaf certificate of the provider, instead of the root one.As long as the leaf certificate is valid, this doesn't have an impact your applications. However, once the certificate is rotated, your application will fail to use the provider to authenticate against AWS services. Since leaf certificates are rotated frequently, you are in danger of disruption. This will probably manifest in
Access Denied
errors.Workaround:
If you are using the
iam.OpenIdConnectProvider
construct in conjunction with an EKS cluster:Switch to use the
OpenIdConnectProvider
construct from the EKS library:The reason this works is because the
eks.OpenIdConnectProvider
hardcodes the correct thumbprint for EKS.If you are using the
iam.OpenIdConnectProvider
in conjunction with other services, make sure you pass the thumbprint explicitly to the construct, instead of relying on its dynamic fetching capabilities.To obtain the correct thumbprint for your provider, follow these instructions.
Solution:
We are working on a fix to the
iam.OpenIdConnectProvider
construct so that if correctly fetches the root certificate thumbprint. See PR. Once it is merged, the fix will be available in the following CDK release, at which point a simple deployment will fix the issue in your environment.Originally reported as
When deploying an OpenIdConnectProvider construct using the oidc issuer url retrieved from an EKS cluster (the domain is oidc.eks.us-west-2.amazonaws.com) and no value for the
thumbprints
property, the resulting auto-obtained thumbprint doesn't match the one I get from following the steps provided here.Reproduction Steps
Error Log
See Other for a related error.
Environment
Other
If I try to deploy a cluster autoscaler to my EKS cluster using a service account role tied to that provider, the pod enters a CrashLoopBackOff state with the error message:
If I instead follow the guide to retrieve the correct thumbprint via openssl and provide that to the OpenIdConnectProvider construct, the cluster autoscaler successfully deploys.
This is 🐛 Bug Report
The text was updated successfully, but these errors were encountered: