OCPBUGS-6661: Handle mTLS CRLs in the router #891
rfredette wants to merge 2 commits into openshift:master
@rfredette: This pull request references Jira Issue OCPBUGS-6661, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
cbb9efe to 5d33116
37f4516 to 6d756cf
|
I've pushed a temporary commit that makes the router image the latest commit in openshift/router#447 so that the new tests work. Once that router PR merges, I will revert the router image change. /hold |
e0bf8e9 to 7242c1c
7242c1c to c5fb8b4
gcs278 left a comment
still reviewing, but adding a couple of comments
test/e2e/client_tls_test.go
Outdated
// service. Returns the HTTP status code returned from curl, and an error either if there is an HTTP error, or if
// there's another error in running the command. If the error was not an HTTP error, the HTTP status code returned will
// be -1.
func curl(t *testing.T, clientPod *corev1.Pod, certName, endpoint, ingressControllerIP string, verbose bool) (int64, error) {
nit: should the name of this function reflect that it's getting a status code? e.g. curlHttpStatus or getHttpStatusCode?
Ehh sorry, I see you didn't write this, but adapted it. Never mind this.
I think it's still reasonable to try to convey what's being returned. I'm not sure I love either of those names, but I'll see if I can come up with something that will work.
test/e2e/client_tls_test.go
Outdated
{
	// This test case has CA certificates including a CRL distribution point (CDP) for the CRL that they
	// generate and sign. This is the default way to distribute CRLs according to RFC-5280
	Name: "subject-crl",
[from meeting] - better name.
I'm not sure I understand why "subject" is in the name here, but I could see something like: crls-included-in-ca-certificates.
test/e2e/client_tls_test.go
Outdated
// This test case has certificates including the CRL distribution point of their signer (i.e. intermediate
// CA is signed by root CA, and includes the URL for root's CRL). In this case, neither of the certificates
// in the CA bundle include the intermediate CRL, so connections that rely on it will be rejected.
Name: "signer-crl",
I understand the word "signer" better than "subject", but the name could be more explicit: crls-included-in-signer-certs or something.
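For readers following the naming discussion, here is a minimal sketch of the two layouts the test-case comments describe, using only the Go standard library. The helper `issueCA`, the common names, and the `crl.example.test` URLs are hypothetical and are not taken from the PR; the PR's own certificate helpers may work differently.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// issueCA creates a CA certificate whose CRL distribution point (CDP)
// extension lists cdp. With a nil parent the certificate is self-signed.
// Hypothetical helper for illustration; it is not part of the PR.
func issueCA(cn string, serial int64, parent *x509.Certificate, parentKey *ecdsa.PrivateKey, cdp []string) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(serial),
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageCRLSign,
		CRLDistributionPoints: cdp,
	}
	signer, signerKey := tmpl, key // self-signed unless a parent is given
	if parent != nil {
		signer, signerKey = parent, parentKey
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

func main() {
	// Placeholder URLs (assumptions), not real distribution points.
	const rootCRL = "http://crl.example.test/root.crl"
	const interCRL = "http://crl.example.test/intermediate.crl"

	root, rootKey, err := issueCA("root", 1, nil, nil, []string{rootCRL})
	if err != nil {
		panic(err)
	}
	// "subject" layout: the intermediate advertises the CRL that it issues itself.
	subjectStyle, _, err := issueCA("intermediate", 2, root, rootKey, []string{interCRL})
	if err != nil {
		panic(err)
	}
	// "signer" layout: the intermediate advertises the CRL of the CA that signed it.
	signerStyle, _, err := issueCA("intermediate", 3, root, rootKey, []string{rootCRL})
	if err != nil {
		panic(err)
	}
	fmt.Println("subject layout CDP:", subjectStyle.CRLDistributionPoints)
	fmt.Println("signer layout CDP: ", signerStyle.CRLDistributionPoints)
}
```

In the signer layout, nothing in a bundle of root plus intermediate points at the intermediate's own CRL, which matches the situation the "signer-crl" comment describes.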
	},
},
{
	// This test case has certificates including the CRL distribution point of their signer. In this case, a
Is it worth referencing the bug in this comment? For additional supporting info.
test/e2e/client_tls_test.go
Outdated
	t.Fatalf("Failed to create client certificate: %v", err)
}

_, rootCRLPem, err := CreateCRL(nil, rootCA, time.Now(), time.Now().Add(10*time.Minute), RevokeCertificates(time.Now(), revokedByRoot))
Is there anything significant about this 10-minute next-update value? Is there any chance that these tests could take 10 minutes and something would break? Should it be something like an hour to be safe? Does it matter?
		// TLS/SSL verification failures result in a 0 http status code (no connection is made to the backend, so no http status code is returned).
		continue
	}
	t.Errorf("Unexpected error from curl for cert %q: %v", certName, err)
Would it be beneficial to add logging of the httpStatusCode value for future debugging?
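As a rough illustration of what logging the status code could look like, the sketch below turns the text that `curl -w '%{http_code}'` prints into a number and formats a debug message around it. Both function names are hypothetical, and this thread does not show how the PR's `curl` helper obtains its status code, so this is only one possible shape.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCurlStatus converts the text printed by `curl -w '%{http_code}'` into
// a status code. curl prints 000 when it received no HTTP response, which
// parses to 0. On unparsable input it returns -1 and an error, mirroring the
// "-1 for non-HTTP errors" convention described in the reviewed code comment.
func parseCurlStatus(out string) (int64, error) {
	code, err := strconv.ParseInt(strings.TrimSpace(out), 10, 64)
	if err != nil {
		return -1, fmt.Errorf("unexpected curl output %q: %w", out, err)
	}
	return code, nil
}

// describeStatus builds a log line that includes the status code, so a
// failing run shows what curl actually reported for each certificate.
func describeStatus(certName string, code int64) string {
	switch {
	case code == 0:
		return fmt.Sprintf("cert %q: no HTTP response (status 0), for example a rejected TLS handshake", certName)
	case code < 0:
		return fmt.Sprintf("cert %q: curl failed before reporting a status (status %d)", certName, code)
	default:
		return fmt.Sprintf("cert %q: HTTP status %d", certName, code)
	}
}

func main() {
	for _, out := range []string{"200\n", "000\n", "garbage"} {
		code, err := parseCurlStatus(out)
		if err != nil {
			fmt.Println("parse error:", err)
		}
		fmt.Println(describeStatus("example-cert", code))
	}
}
```

In a test, the string from `describeStatus` could be passed to `t.Logf` or appended to the existing `t.Errorf` message.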
test/e2e/client_tls_test.go
Outdated
	t.Fatalf("timed out waiting for pod %q to become ready: %v", clientPodName, err)
}

// We need an IP address to which to send requests. The test client
nit: I struggled a bit with this comment, but I realized it's because the curl function doesn't directly use the DNS name. Is there a better way we can describe this?
- // We need an IP address to which to send requests. The test client
+ // We need an IP address to which to send requests (in lieu of using a DNS domain name). The test client
if err := waitForIngressControllerCondition(t, kclient, 5*time.Minute, icName, availableConditionsForPrivateIngressController...); err != nil {
	t.Fatalf("failed to observe expected conditions: %v", err)
}
nit: it's a nitpick, but it's probably worth a block comment here describing what clientCerts are going to do. I tend to read comments first when deciphering code; it feels more efficient:
// Create client certificate bundle as a configmap which we will mount into our test client pod
gcs278 left a comment
Generally, I agree with these changes. Most comments are documentation related, but I'm being a bit more nit-picky than usual, since this is a complex E2E.
ad3b9b1 to 8eb9601
|
/jira refresh |
|
@rfredette: This pull request references Jira Issue OCPBUGS-6661, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker.
8eb9601 to e28f320
Leave a stub of the CRL controller to clean up any existing configmaps. The stub controller will need to be removed in a future release.
Use cluster-wide proxy for CRL downloads when available.
Add a test with several test cases to test CRL management.
e28f320 to 8bc1171
8bc1171 to 46b92e6
|
/retest |
|
Test failures seem unrelated; Even the operator suite failure had /retest |
|
more unrelated failures. /retest |
|
@rfredette: The following tests failed, say
	return false, nil, ctx, fmt.Errorf("failed to build configmap: %w", err)
}
// The CRL management code has been moved into the router, so the CRL configmap is no longer necessary.
// TODO: Remove this whole controller after 4.13
- // TODO: Remove this whole controller after 4.13
+ // TODO: Remove this whole controller after 4.14.
|
Closing this PR in favor of the combined fix in #930 |
|
@rfredette: This pull request references Jira Issue OCPBUGS-6661. The bug has been updated to no longer refer to the pull request using the external bug tracker.
CRL lifecycle management is being moved to the router pod to avoid hitting the configmap max size (1MB).
This relies on openshift/router#447 to function.