Adding certificate docs #18254
7998814 to
83b0b10
Compare
60c4d11 to
ae29cbf
Compare
modules/monitoring-certificates.adoc
Outdated
@deads2k Can you please help me build out this section? The customer is asking for the information outlined here.
@deads2k Can you please provide this information by end of week? The information we need for monitoring certificates includes:
- The purpose
- File path
- Default expiration term
- How to set custom expiration term
- How to specify the expiration date of all certificates used in OpenShift when installing OpenShift
- Which services use it
- How to update/extend it
- Will certificates that are about to expire be automatically renewed by the operator?
I don't know anything about any monitoring certificates. Perhaps @s-urbaniak ?
There is some amount of client cert monitoring done by the kube api and we have alerts on this: https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/a7ee9d1abe1b1a3670a02ede1135cadb660b9d0c/alerts/kube_apiserver.libsonnet#L125-L148
I don't believe there is any monitoring of serving certificates. That is typically done with blackbox probes, which the monitoring team does not offer as part of the monitoring framework today (unlike scraping and alerting, which teams can self-serve with the Prometheus Operator CRDs).
@brancz how do we document that? Is there a way to pull or show which Prometheus Operator CRD rules the individual operators provide?
@brancz Can you please comment?
ae29cbf to
20d670a
Compare
modules/node-certificates.adoc
Outdated
@rphillips @sjenning Can you please review what is here for node certs so far? Thanks!
@rphillips @sjenning Can you please provide feedback by end of week?
@sanchezl Can you please review what I have in this section so far?
@sanchezl Can you please review by end of week?
64ca725 to
47a8df6
Compare
adellape
left a comment
A few minor drive-by comments from a cursory read.
modules/node-certificates.adoc
Outdated
Should we talk about the CAs and their validity? How long they are valid for, how to rotate them, etc.?
@rphillips @sjenning can either of you comment on this?
@rphillips @sjenning Can you please comment?
@ahardin-rh this is correct. We might want to add that once the cluster is installed the Node certificates are auto-rotated.
@ahardin-rh what do you mean by "the latest changes"? Do you mean the change for #18254 (comment)? If yes, I reviewed http://file.rdu.redhat.com/~ahardin/12052019/OCP-certificates/authentication/certificate-types-descriptions.html#control-plane-certificates and it LGTM. The rest of the PR content, as in #18254 (comment), still needs review by QE from the other subteams who are familiar with the areas they own. Update: @ahardin-rh, I already @'ed them in Slack, and they acknowledged that they will review as soon as they can.
be3a153 to
9d8ea1f
Compare
modules/etcd-certificates.adoc
Outdated
Regarding 'metric-signer': I can't find it in the expected directory. Please correct me if I have misunderstood:
sh-4.2# cd /etc/ssl/
sh-4.2# ls
certs etcd
sh-4.2# cd etcd/
sh-4.2# ls
ca.crt system:etcd-peer:etcd-0.xxx.qe.gcp.devcluster.openshift.com.crt
metric-ca.crt system:etcd-peer:etcd-0.xx.qe.gcp.devcluster.openshift.com.key
root-ca.crt system:etcd-server:etcd-0.xx.qe.gcp.devcluster.openshift.com.crt
system:etcd-metric:etcd-0.xxx.qe.gcp.devcluster.openshift.com.crt system:etcd-server:etcd-0.xxx.qe.gcp.devcluster.openshift.com.key
system:etcd-metric:etcd-0.xxx.qe.gcp.devcluster.openshift.com.key
sh-4.2#
@hexfusion Can you please review?
modules/etcd-certificates.adoc
Outdated
Is this doc for OCP 4.3 only, or also for 4.4? If it is only for 4.3, it should be OK. In 4.4, there is no /etc/ssl/etcd directory:
sh-4.2# cd /etc/ssl
sh-4.2# ls
certs
sh-4.2# cd certs/
sh-4.2# ls
ca-bundle.crt ca-bundle.trust.crt
The Node certificates section LGTM for 4.4: http://file.rdu.redhat.com/~ahardin/12052019/OCP-certificates/authentication/certificate-types-descriptions.html#node-certificates_ocp-certificates @ahardin-rh Bug 1800636 targets the 4.3.z release, but as far as I can see auto-rotation is implemented starting in 4.4. Please help clarify.
modules/olm-certificates.adoc
Outdated
modules/proxy-certificates.adoc
Outdated
The proxy certificates should be provided by the user. It is not clear what "managed by the system" means here.
@danehans Can you please confirm? I may have gotten details mixed up.
@ahardin-rh the RHCOS trust bundle is managed by the system, and the Proxy resource is used to add user-provided certs to the trust bundle. Cluster Network Operator merges the two into a combined bundle, and operators mount the bundle into their trust store.
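To make the merge step concrete, here is a minimal sketch of combining a system trust bundle with a user-provided one. It is an illustration only and not the Cluster Network Operator's actual implementation: the function names `split_pem` and `merge_bundles` are hypothetical, and the de-duplication of identical blocks is an assumption made for the example.

```python
import re
from typing import List

# Matches one PEM-encoded certificate block, including its BEGIN/END markers.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def split_pem(bundle: str) -> List[str]:
    """Return the individual certificate blocks found in a PEM bundle."""
    return [block.strip() for block in PEM_BLOCK.findall(bundle)]


def merge_bundles(system_bundle: str, user_bundle: str) -> str:
    """Concatenate two PEM bundles, system certs first.

    Assumption for this sketch: a block that appears more than once
    (byte-for-byte identical) is kept only the first time it is seen.
    """
    seen = set()
    merged = []
    for block in split_pem(system_bundle) + split_pem(user_bundle):
        if block not in seen:
            seen.add(block)
            merged.append(block)
    return "\n".join(merged) + "\n" if merged else ""
```

Calling `merge_bundles(rhcos_pem, user_pem)` with two PEM strings returns a single string that could serve as the content of a combined `ca-bundle.crt`.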
modules/proxy-certificates.adoc
Outdated
The detailed renewal steps for the proxy certificate are already described in the Customization section above:
Updating the user-provided trust bundle consists of either:
updating the PEM-encoded certificates in the ConfigMap referenced by trustedCA, or
...
@danehans Can you please confirm?
@sunilcio Thank you! It is my understanding that all of the content in this PR is applicable to 4.3.z. For example, I know that service CA auto-rotation is available as of 4.3.5. cc @pweil-
9d8ea1f to
a8fcdf6
Compare
s/openshift-ingress-it /openshift-ingress-operator/
modules/proxy-certificates.adoc
Outdated
I'm confused by the rest of the text in this section. This looks like how MCO implements proxy trustedCA. Maybe the text should go into a new section such as modules/machine-certificates.adoc? cc: @runcom
The mechanism operators use for writing the trust bundle consists of:
- The operator requests trust bundle injection by creating a ConfigMap in the operator's namespace with label config.openshift.io/inject-trusted-cabundle: "true". Here's an example of the Ingress Operator:
  apiVersion: v1
  kind: ConfigMap
  metadata:
    annotations:
      release.openshift.io/create-only: "true"
    labels:
      config.openshift.io/inject-trusted-cabundle: "true"
    name: trusted-ca
    namespace: openshift-ingress-operator
- Cluster Network Operator injects the trusted CA bundle into this ConfigMap:
  kind: ConfigMap
  metadata:
    annotations:
      release.openshift.io/create-only: "true"
    labels:
      config.openshift.io/inject-trusted-cabundle: "true"
    name: trusted-ca
    namespace: openshift-ingress-operator
  apiVersion: v1
  data:
    ca-bundle.crt: |
      <PEM_ENCODED_TRUSTED_CA_CERTS>
  ca-bundle.crt contains either the RHCOS trust bundle or the merged RHCOS/user-provided bundle.
- If the operator makes egress requests, it will typically mount this ConfigMap to /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem. If the operand makes egress requests, the operator will plumb the contents of ca-bundle.crt into the operand's trust store, typically /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem.
- The operator watches the ConfigMap for changes and updates the trust bundle accordingly.
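The last step of the mechanism above (watching the ConfigMap and reacting to changes) can be sketched as follows. This is a simplified illustration that works on a ConfigMap already loaded as a Python dict. It does not talk to a cluster, and the helper names `wants_injection`, `bundle_digest`, and `bundle_changed` are hypothetical. Comparing SHA-256 digests is an assumed approach to change detection, not a description of what any particular operator does.

```python
import hashlib
from typing import Optional

# Label from the example ConfigMap above.
INJECT_LABEL = "config.openshift.io/inject-trusted-cabundle"


def wants_injection(configmap: dict) -> bool:
    """True if the ConfigMap carries the injection label set to "true"."""
    labels = (configmap.get("metadata") or {}).get("labels") or {}
    return labels.get(INJECT_LABEL) == "true"


def bundle_digest(configmap: dict) -> str:
    """SHA-256 of the ca-bundle.crt key (empty string if not yet injected)."""
    bundle = (configmap.get("data") or {}).get("ca-bundle.crt", "")
    return hashlib.sha256(bundle.encode("utf-8")).hexdigest()


def bundle_changed(previous_digest: Optional[str], configmap: dict) -> bool:
    """True if the bundle content differs from the last digest we recorded."""
    return bundle_digest(configmap) != previous_digest
```

An operator-like loop would record `bundle_digest(cm)` after each reload and rebuild its trust store only when `bundle_changed(last_digest, cm)` returns true.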
awgreene
left a comment
2 small notes.
modules/olm-certificates.adoc
Outdated
Is this targeted at 4.4? This is changing with the introduction of OLM support for admission webhooks in 4.5
modules/olm-certificates.adoc
Outdated
I am unsure if this is necessary to call out, but OLM will not update the certificates of operators that it manages in proxy environments. These certificates must be managed by the user via the subscription config.
As a user, the existence of a feature called the “cluster-wide proxy” makes an unambiguous promise that components managed by the cluster will use those settings. Cluster-wide proxy settings include a field for trust settings.
So this isn’t a minor point. Anything that breaks the promise made by calling the settings “cluster-wide” should be fixed. If it’s not going to work as advertised, that should be called out as clearly as possible to help set users’ expectations.
Or rename the feature to better indicate purpose. As it stands very few features that are documented as being part of an OpenShift Container Platform cluster actually follow these settings.
obockows
left a comment
It's great that you have created this. It is long awaited by customers and support engineers. I would like to add my feedback.
Not long ago I was studying for the CKA (Certified Kubernetes Administrator) and used an outstanding course:
https://www.udemy.com/course/certified-kubernetes-administrator-with-practice-tests/
After that course, I fully understood all the relationships between the certs in the Kubernetes control plane.
What was very helpful was the author's spreadsheet:
https://github.com/mmumshad/kubernetes-the-hard-way/tree/master/tools
I understand that for OCP it would require a lot of effort, but maybe someday we could create a similar table to better visualize these relationships.
Do I understand it correctly that the alerting framework gives alerts when we have less than 5 minutes to the expiration?
"User-provided certificates " - are we talking here about "certificate-authority-data" from kubeconfig? or we are talking about 3rd party certificates we can add here?
maybe some link to the place where it's described how to do that?
modules/service-ca-certificates.adoc
Outdated
I understand here it's secret/signing-key in namespace openshift-service-ca ?
modules/service-ca-certificates.adoc
Outdated
ca-bundle.crt -> are we talking about configmap/signing-cabundle ?
modules/node-certificates.adoc
Outdated
I believe this part should be the most robust, because customers are very concerned about it (most likely due to the OCP 3.x expirations).
So it should describe that first there is the bootstrap, and then after ~24 hours we have new certs that are valid for 20 days (OCP 4.2?) and that are auto-rotated... when? If >= 20% of the lifetime is left?
We should also mention who signed them (I see Issuer: CN = kube-csr-signer_@1586557208) and where the signer is located (which namespace and which secrets/configmaps).
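To make the question about rotation timing concrete, here is a small sketch of the arithmetic. The 20-day validity and the 20% figure come from the question above and are not confirmed in this thread. The sketch also assumes that rotation is triggered once the remaining lifetime drops to the given fraction, which is only one reading of the question. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def rotation_due_at(not_before: datetime, not_after: datetime,
                    remaining_fraction: float) -> datetime:
    """Moment at which only `remaining_fraction` of the lifetime is left."""
    lifetime = not_after - not_before
    return not_after - lifetime * remaining_fraction


def needs_rotation(now: datetime, not_before: datetime, not_after: datetime,
                   remaining_fraction: float) -> bool:
    """True once the remaining lifetime has dropped to the threshold."""
    return now >= rotation_due_at(not_before, not_after, remaining_fraction)


# Unconfirmed example values taken from the question: 20 days, 20%.
issued = datetime(2020, 4, 10, tzinfo=timezone.utc)
expires = issued + timedelta(days=20)
due = rotation_due_at(issued, expires, 0.2)
```

Under these assumed values, `due` falls 16 days after issuance, that is, 4 days before expiry. The actual thresholds would need to be confirmed by the node team.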
modules/etcd-certificates.adoc
Outdated
I would describe the purpose of the client and server certs here.
@ahardin-rh I remember that in #18254 (comment) sanchezl requested changing openshift-ingress to openshift-config. Not sure why I am still seeing openshift-ingress :)
This PR has become hard to follow with so many conversations. I suggest dividing the content into separate PRs and mentioning the corresponding team owners (QE, Dev, or whoever) to review their team's certificate content.
@ahardin-rh, I didn't see the correction in your new update. Please change openshift-ingress to openshift-config here, thanks.
Sorry about that! I thought I saved the change, but I guess not. It's updated now!
a8fcdf6 to
f15a40d
Compare
f15a40d to
82552fe
Compare
82552fe to
ad6dc48
Compare
/cherrypick enterprise-4.4
@ahardin-rh: new pull request created: #22061
/cherrypick enterprise-4.5
@ahardin-rh: new pull request created: #22062
https://jira.coreos.com/browse/OSDOCS-804
https://bugzilla.redhat.com/show_bug.cgi?id=1800636
Preview Build: http://file.rdu.redhat.com/~ahardin/12052019/OCP-certificates/authentication/certificate-types-descriptions.html (updated May 8)