@stlaz stlaz commented Feb 11, 2020

Uses kube serving-cert reloader

/cc @marun

@openshift-ci-robot openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Feb 11, 2020
@stlaz stlaz changed the title Reload serving certs Bug 1801573: Reload serving certs Feb 11, 2020
@openshift-ci-robot openshift-ci-robot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Feb 11, 2020
@openshift-ci-robot

@stlaz: This pull request references Bugzilla bug 1801573, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.


@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 11, 2020
@stlaz stlaz force-pushed the reload_serving_cert branch from 0c2065f to f9d15a9 Compare February 11, 2020 17:06
@s-urbaniak

@stlaz thanks a lot for the fix 🙌 I believe this is worth backporting as far back as 4.2.

@marun

marun commented Feb 11, 2020

@stlaz Would it make sense to ensure test coverage for rotation compatibility?

@s-urbaniak

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 12, 2020
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: s-urbaniak, stlaz


// this disregards information from ClientHello but we're not doing SNI anyway
cert, key := servingCertProvider.CurrentCertKeyContent()

certKeyPair, err := tls.X509KeyPair(cert, key)


Note though that this will be executed at every TLS handshake so maybe we want some optimization here and return early if the certificate didn't change.

Author


The code refreshes the certs each minute, but this read is from memory, not from the files, so this should be safe.
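
For reference, the "return early if the certificate didn't change" idea raised above could look roughly like the sketch below. It is not what this PR does, since the PR parses the in-memory content on every handshake. Everything except the `CurrentCertKeyContent()` method shape and the standard-library calls is hypothetical: the `certKeyProvider` interface, `cachingGetter`, `memProvider`, and `selfSignedPEM` are illustration-only names, and the throwaway self-signed certificate exists only to make the sketch runnable.

```go
package main

import (
	"bytes"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"sync"
	"time"
)

// certKeyProvider is a hypothetical stand-in for servingCertProvider; only the
// CurrentCertKeyContent method shape is taken from the snippet under review.
type certKeyProvider interface {
	CurrentCertKeyContent() (cert, key []byte)
}

// cachingGetter re-parses the key pair only when the PEM bytes change.
type cachingGetter struct {
	provider certKeyProvider

	mu                sync.Mutex
	lastCert, lastKey []byte
	parsed            *tls.Certificate
	parses            int // how often tls.X509KeyPair ran; kept for illustration only
}

// GetCertificate has the signature expected by tls.Config.GetCertificate.
func (g *cachingGetter) GetCertificate(_ *tls.ClientHelloInfo) (*tls.Certificate, error) {
	cert, key := g.provider.CurrentCertKeyContent()
	g.mu.Lock()
	defer g.mu.Unlock()
	// Return early if the in-memory content is unchanged since the last parse.
	if g.parsed != nil && bytes.Equal(cert, g.lastCert) && bytes.Equal(key, g.lastKey) {
		return g.parsed, nil
	}
	pair, err := tls.X509KeyPair(cert, key)
	if err != nil {
		return nil, err
	}
	g.parses++
	g.lastCert = append([]byte(nil), cert...)
	g.lastKey = append([]byte(nil), key...)
	g.parsed = &pair
	return g.parsed, nil
}

// memProvider serves fixed PEM bytes from memory (demo only).
type memProvider struct{ cert, key []byte }

func (p *memProvider) CurrentCertKeyContent() ([]byte, []byte) { return p.cert, p.key }

// selfSignedPEM creates a throwaway self-signed ECDSA certificate (demo only).
func selfSignedPEM(cn string) (certPEM, keyPEM []byte, err error) {
	priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: cn},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &priv.PublicKey, priv)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(priv)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

func main() {
	cert, key, err := selfSignedPEM("example")
	if err != nil {
		panic(err)
	}
	g := &cachingGetter{provider: &memProvider{cert: cert, key: key}}
	cfg := &tls.Config{GetCertificate: g.GetCertificate}
	for i := 0; i < 3; i++ { // simulate three handshakes
		if _, err := cfg.GetCertificate(nil); err != nil {
			panic(err)
		}
	}
	fmt.Println("key pair parsed", g.parses, "time(s) for 3 handshakes")
}
```

The trade-off is a mutex and two byte comparisons per handshake in exchange for skipping the PEM decode and key parse. Whether that is worth the extra state depends on handshake volume, which is why parsing on every handshake from memory may be perfectly acceptable here.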

@openshift-merge-robot openshift-merge-robot merged commit 3d0621e into openshift:master Feb 12, 2020
@openshift-ci-robot

@stlaz: All pull requests linked via external trackers have merged. Bugzilla bug 1801573 has been moved to the MODIFIED state.



config.GetCertificate = func(_ *tls.ClientHelloInfo) (*tls.Certificate, error) {
// this disregards information from ClientHello but we're not doing SNI anyway
cert, key := servingCertProvider.CurrentCertKeyContent()


question: as this will be invoked with every TLS handshake, we are relying here on the underlying caching mechanism of the client implementation, correct?

Author


Which client caching mechanisms do you have in mind? Does #152 (comment) answer your question?


Sounds good, the above comment answers it.

@s-urbaniak

@marun (cc @stlaz) as discussed with @sttts: I firmly believe this is worth back-porting to 4.3 at least. @sttts mentioned you are also backporting potential rotation logic of ca-operator into 4.2, hence a backport to 4.2 is also viable (being a critical/release-blocking fix) after/concurrently with that.

@marun

marun commented Feb 14, 2020

@s-urbaniak I wouldn't classify this as release-critical. The service CA and the certificates it issues are being given long enough durations (26 and 24 months respectively) to ensure that an upgrade which restarts pods happens after rotation and before key material expiry. Anything less has the potential to break user workloads.

If this was actually a release critical fix, I would expect automated testing to ensure that cert reloading didn't regress when backported or in future releases.

@marun

marun commented Feb 17, 2020

@s-urbaniak @stlaz @sttts Mea culpa, this should be backported to 4.3 and 4.2 as proposed.

I recall that oauth-proxy is one of the only critical-path components using service ca key material for something other than securing a metrics endpoint. Without this backported fix, there's a non-zero chance of control plane degradation if rotation isn't followed by pod restart before expiry of the pre-rotation CA. For new clusters this won't be a problem, but for existing clusters that were deployed with only a 12 month CA duration we need all the help we can get in keeping a control plane healthy after rotation.

@s-urbaniak

@marun I agree this should definitely be backported to 4.3 and 4.2.

@marun

marun commented Mar 2, 2020

/cherrypick release-4.3
/cherrypick release-4.2

@openshift-cherrypick-robot

@marun: new pull request created: #159


wking added a commit to wking/cincinnati-graph-data that referenced this pull request Mar 18, 2020
…1810036

The bugs were introduced by the [1] series, and fixed by the
combination of [2,3].  This commit also tombstones affected releases
to avoid further channel promotion.  Details on the bug:

* 4.5: Introduced by [1] (no PR?).  Fixed by [2], service-ca-operator
  74b5ce2 [4], which included library-go d9c73bb [5].

  Also fixed by [3], oauth-proxy 3d0621e [6], which landed before the
  4.4/4.5 split.

* 4.4: Introduced by [1] (no PR?).  Fixed by [7], service-ca-operator
  e5a04d6 [8], which included library-go 3c25293 [9].

  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.4.0-rc.0-x86_64 | grep service-ca-operator
    service-ca-operator                            https://github.com/openshift/service-ca-operator                            094a9ad02dbe3bcb57d5fbad301cfcfcd48bd2ed
  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.4.0-rc.1-x86_64 | grep service-ca-operator
    service-ca-operator                            https://github.com/openshift/service-ca-operator                            094a9ad02dbe3bcb57d5fbad301cfcfcd48bd2ed
  $ git --no-pager log -2 --first-parent --oneline origin/release-4.4
  e5a04d6a (origin/release-4.4) Merge pull request openshift#111 from marun/4.4-unique-ca-serial
  094a9ad0 Merge pull request #95 from vareti/signer-ca-metrics

  So both RCs are affected.

  Also fixed by [3], oauth-proxy 3d0621e [6], which landed before the
  4.4/4.5 split.

  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.4.0-rc.0-x86_64 | grep oauth-proxy
    oauth-proxy                                    https://github.com/openshift/oauth-proxy                                    3d0621eb72c9dd1c036505363032468a9016f381
  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.4.0-rc.1-x86_64 | grep oauth-proxy
    oauth-proxy                                    https://github.com/openshift/oauth-proxy                                    3d0621eb72c9dd1c036505363032468a9016f381

  So both RCs have OAuth fix, but neither has the service-ca-operator
  fix.

* 4.3: Introduced by [10], service-ca-operator 8395d65 [11]. Fixed by
  [12], service-ca-operator dd7235b [13], which includes library-go
  5844159 [14].

  Fix has not been released yet.

  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.3.3-x86_64 | grep service-ca-operator
    service-ca-operator                           https://github.com/openshift/service-ca-operator                           774c394da334dec446703545d4baaf89611ccb9d
  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.3.5-x86_64 | grep service-ca-operator
    service-ca-operator                           https://github.com/openshift/service-ca-operator                           8395d65888b0a4249277989f18ee03f45383e409

  So this was introduced in 4.3.5 (there was no 4.3.4).

  Fix also requires the OAuth proxy fix [15,16], which is still in
  flight.

* 4.2: Introduced by [17], service-ca-operator 0324055 [18], which
  includes library-go 2cf86bb [19] and API 8ce0047 [20].  Fix in
  flight with [21,22].  [23] has already landed with library-go
  d58edcb.

  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.2.21-x86_64 | grep service-ca-operator
    service-ca-operator                           https://github.com/openshift/service-ca-operator                           f6720573b9b63147436374e51e6fda44683b1e9f
  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.2.22-x86_64 | grep service-ca-operator
    service-ca-operator                           https://github.com/openshift/service-ca-operator                           0324055c3bad3a857dcf3471c024bf42c20d549e

  So this was introduced in 4.2.22.

  Fix also requires the OAuth proxy fix [24,25], which is still in
  flight.

* 4.1: Backport stream introducing the bug is still ASSIGNED [26], so
  no 4.1 impact yet.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1774121
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1810036
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1801573
[4]: openshift/service-ca-operator#110 (comment)
[5]: openshift/library-go#726 (comment)
[6]: openshift/oauth-proxy#152 (comment)
[7]: https://bugzilla.redhat.com/show_bug.cgi?id=1810418
[8]: openshift/service-ca-operator#111 (comment)
[9]: openshift/library-go#728 (comment)
[10]: https://bugzilla.redhat.com/show_bug.cgi?id=1788179
[11]: openshift/service-ca-operator#104 (comment)
[12]: https://bugzilla.redhat.com/show_bug.cgi?id=1810420
[13]: openshift/service-ca-operator#112 (comment)
[14]: openshift/library-go#729 (comment)
[15]: https://bugzilla.redhat.com/show_bug.cgi?id=1809253
[16]: openshift/oauth-proxy#160
[17]: https://bugzilla.redhat.com/show_bug.cgi?id=1774156
[18]: openshift/service-ca-operator#105 (comment)
[19]: openshift/library-go#684 (comment)
[20]: openshift/api#577 (comment)
[21]: https://bugzilla.redhat.com/show_bug.cgi?id=1810421
[22]: openshift/service-ca-operator#113
[23]: openshift/library-go#730 (comment)
[24]: https://bugzilla.redhat.com/show_bug.cgi?id=1809258
[25]: openshift/oauth-proxy#164
[26]: https://bugzilla.redhat.com/show_bug.cgi?id=1774157
wking added a commit to wking/cincinnati-graph-data that referenced this pull request Mar 18, 2020
…1810036

The bugs were introduced by the [1] series, and fixed by the
combination of [2,3].  This commit also tombstones affected releases
to avoid further channel promotion.  Details on the bug:

* 4.5: Introduced by [1] (no linked PR, so not sure exactly when it
  was introduced).  Fixed by [2], service-ca-operator 74b5ce2 [4],
  which included library-go d9c73bb [5].

  Also fixed by [3], oauth-proxy 3d0621e [6], which landed before the
  4.4/4.5 split.

* 4.4: Introduced by [1] (no linked PR, so not sure exactly when it
  was introduced).  Fixed by [7], service-ca-operator e5a04d6 [8],
  which included library-go 3c25293 [9].

  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.4.0-rc.0-x86_64 | grep service-ca-operator
    service-ca-operator                            https://github.com/openshift/service-ca-operator                            094a9ad02dbe3bcb57d5fbad301cfcfcd48bd2ed
  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.4.0-rc.1-x86_64 | grep service-ca-operator
    service-ca-operator                            https://github.com/openshift/service-ca-operator                            094a9ad02dbe3bcb57d5fbad301cfcfcd48bd2ed
  $ git --no-pager log -2 --first-parent --oneline origin/release-4.4
  e5a04d6a (origin/release-4.4) Merge pull request openshift#111 from marun/4.4-unique-ca-serial
  094a9ad0 Merge pull request #95 from vareti/signer-ca-metrics

  So both RCs are affected.

  Also fixed by [3], oauth-proxy 3d0621e [6], which landed before the
  4.4/4.5 split.

  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.4.0-rc.0-x86_64 | grep oauth-proxy
    oauth-proxy                                    https://github.com/openshift/oauth-proxy                                    3d0621eb72c9dd1c036505363032468a9016f381
  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.4.0-rc.1-x86_64 | grep oauth-proxy
    oauth-proxy                                    https://github.com/openshift/oauth-proxy                                    3d0621eb72c9dd1c036505363032468a9016f381

  So both RCs have OAuth fix, but neither has the service-ca-operator
  fix.

* 4.3: Introduced by [10], service-ca-operator 8395d65 [11]. Fixed by
  [12], service-ca-operator dd7235b [13], which includes library-go
  5844159 [14].

  Fix has not been released yet.

  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.3.3-x86_64 | grep service-ca-operator
    service-ca-operator                           https://github.com/openshift/service-ca-operator                           774c394da334dec446703545d4baaf89611ccb9d
  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.3.5-x86_64 | grep service-ca-operator
    service-ca-operator                           https://github.com/openshift/service-ca-operator                           8395d65888b0a4249277989f18ee03f45383e409

  So this was introduced in 4.3.5 (there was no 4.3.4).

  Fix also requires the OAuth proxy fix [15,16], which is still in
  flight.

* 4.2: Introduced by [17], service-ca-operator 0324055 [18], which
  includes library-go 2cf86bb [19] and API 8ce0047 [20].  Fix in
  flight with [21,22].  [23] has already landed with library-go
  d58edcb.

  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.2.21-x86_64 | grep service-ca-operator
    service-ca-operator                           https://github.com/openshift/service-ca-operator                           f6720573b9b63147436374e51e6fda44683b1e9f
  $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.2.22-x86_64 | grep service-ca-operator
    service-ca-operator                           https://github.com/openshift/service-ca-operator                           0324055c3bad3a857dcf3471c024bf42c20d549e

  So this was introduced in 4.2.22.

  Fix also requires the OAuth proxy fix [24,25], which is still in
  flight.

* 4.1: Backport stream introducing the bug is still ASSIGNED [26], so
  no 4.1 impact yet.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1774121
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1810036
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1801573
[4]: openshift/service-ca-operator#110 (comment)
[5]: openshift/library-go#726 (comment)
[6]: openshift/oauth-proxy#152 (comment)
[7]: https://bugzilla.redhat.com/show_bug.cgi?id=1810418
[8]: openshift/service-ca-operator#111 (comment)
[9]: openshift/library-go#728 (comment)
[10]: https://bugzilla.redhat.com/show_bug.cgi?id=1788179
[11]: openshift/service-ca-operator#104 (comment)
[12]: https://bugzilla.redhat.com/show_bug.cgi?id=1810420
[13]: openshift/service-ca-operator#112 (comment)
[14]: openshift/library-go#729 (comment)
[15]: https://bugzilla.redhat.com/show_bug.cgi?id=1809253
[16]: openshift/oauth-proxy#160
[17]: https://bugzilla.redhat.com/show_bug.cgi?id=1774156
[18]: openshift/service-ca-operator#105 (comment)
[19]: openshift/library-go#684 (comment)
[20]: openshift/api#577 (comment)
[21]: https://bugzilla.redhat.com/show_bug.cgi?id=1810421
[22]: openshift/service-ca-operator#113
[23]: openshift/library-go#730 (comment)
[24]: https://bugzilla.redhat.com/show_bug.cgi?id=1809258
[25]: openshift/oauth-proxy#164
[26]: https://bugzilla.redhat.com/show_bug.cgi?id=1774157