OCPBUGS-3783: Update dnsPolicy to allow consistent resolution of the internal LB #920
Conversation
The kubelet consistently resolves the name. This change allows the CVO to use the kubelet's DNS configuration.
```diff
-      dnsPolicy: ClusterFirstWithHostNet
+      # this pod is hostNetwork and uses the internal LB DNS name when possible, which the kubelet also uses.
+      # this dnsPolicy allows us to use the same dnsConfig as the kubelet, without access to read it ourselves.
+      dnsPolicy: Default
```
Nitpick: This doesn't cause it to use the kubelet's DNS config, it causes it to use 127.0.0.1:53 as its DNS resolver (regardless of what kubelet does). This only works if we run a recursive DNS resolver in the host network ns on nodes, which I guess we must...
Just leaving dnsPolicy unset would do the same thing, for a hostNetwork pod.
The naming of the dnsPolicy options is completely broken and led the author of this yaml file astray...
/hold

will cause

also, that appears to be missing an e2e test.

now allowing access to internal PromQL

/hold cancel
Globals cause problems for unit tests (demonstrated in some of the unit tests) and prevent general reuse. The anti-pattern is avoided in Kubernetes code and should be avoided in OCP as well. This updates to providing a configured condition registry without init blocks, and provides a standard clustercondition registry for each consumer.
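To make the shape of the refactor concrete, here is a minimal Go sketch of a registry that is constructed and passed explicitly rather than mutated by package init() blocks; the names are illustrative, not the actual CVO API:

```go
package clusterconditions

// Condition is whatever a cluster-condition implementation must provide;
// reduced to a single method here for brevity.
type Condition interface {
	Match(conditionType string) (bool, error)
}

// ConditionRegistry maps condition type names to implementations.
type ConditionRegistry struct {
	registry map[string]Condition
}

// NewConditionRegistry returns an empty registry; callers register exactly
// the conditions they need, which keeps unit tests isolated from each other.
func NewConditionRegistry() *ConditionRegistry {
	return &ConditionRegistry{registry: map[string]Condition{}}
}

// Register adds or replaces a condition implementation by type name.
func (r *ConditionRegistry) Register(name string, c Condition) {
	r.registry[name] = c
}

// Lookup mirrors the `condition, ok := r.registry[config.Type]` access
// seen in the diff hunks later in this thread.
func (r *ConditionRegistry) Lookup(name string) (Condition, bool) {
	c, ok := r.registry[name]
	return c, ok
}
```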
We do this so that our host-network pod can use the node's resolv.conf to resolve the internal load balancer name before the DNS pods are available and before the service network is available. The side effect is that the CVO cannot resolve service DNS names.
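For illustration, the same pod settings expressed with the upstream corev1 Go types (a sketch; the real change is in the deployment YAML shown above):

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	spec := corev1.PodSpec{
		// The CVO pod shares the host's network namespace so it can reach
		// the internal load balancer before the cluster network exists.
		HostNetwork: true,
		// Despite the name, DNSDefault does not mean "cluster default": it
		// means "inherit the node's /etc/resolv.conf", i.e. the same
		// resolver configuration the kubelet itself uses.
		DNSPolicy: corev1.DNSDefault,
	}
	fmt.Println(spec.DNSPolicy) // prints: Default
}
```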
wking
left a comment
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, wking

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Approvers can indicate their approval by writing `/approve` in a comment.
/test ci/prow/e2e-agnostic-ovn-upgrade-out-of-change

@deads2k: The specified target(s) for `/test` were not found.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/retest
/hold

upgrade out of change is failing consistently. It's upgrading just fine, but at the start of the upgrade there are dial failures to routes. I don't know how this change causes that, but it apparently does. Even stranger, the installation itself is working (see the agnostic ovn job).
/retest
Actually, this is a general Azure regression. I've scheduled a Slack message for TRT in the morning and left details in the thread.

/hold cancel
/retest
petr-muller
left a comment
Big fan of getting rid of the global registry 🤮. I am not sure whether the name-resolving commit is an ugly hack or not: it looks that way to me, but /shrug.
```diff
 // NewClient creates a new Cincinnati client with the given client identifier.
-func NewClient(id uuid.UUID, transport *http.Transport, userAgent string) Client {
+func NewClient(id uuid.UUID, transport *http.Transport, userAgent string, conditionRegistry clusterconditions.ConditionRegistry) Client {
```
Just a note: this is a really good example of why the global sucked. I would never expect OSUS client code to depend on the condition registry data; this way it is at least explicit (although it still seems fishy and wrong to me).
| "reflect" | ||
| "testing" | ||
|
|
||
| "github.com/openshift/cluster-version-operator/pkg/clusterconditions/standard" |
Can we come up with some better names here? Both clusterconditions and clusterconditions/standard have a ConditionRegistry type and a NewConditionRegistry constructor, so at call sites they will look like clusterconditions.NewConditionRegistry and standard.NewConditionRegistry, and it is not clear how they differ or when to use which.
```diff
 	defer ts.Close()
 
-	c := NewClient(clientID, nil, "")
+	c := NewClient(clientID, nil, "", standard.NewConditionRegistry(nil))
```
nit: I'd prefer if the registry had a WithClient variant or a fluent method but I guess it's just a cosmetic thing
```diff
 	for _, config := range matchingRules {
-		condition, ok := Registry[config.Type]
+		condition, ok := r.registry[config.Type]
```
OK, this is code I dislike (not your fault, another artifact of the global): the actual registry is needed here only to validate types, and the remaining code IMO does not belong in a registry. I believe this code should live elsewhere, and the registry should only provide an IsValid(type) method, or maybe a method that returns a validator func to be used here (probably the latter, because registration can allow dynamic behavior, even if it is not dynamically changed right now).

I don't think this needs to be addressed in this PR; I'll probably want to clean this up as a followup.
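A minimal sketch of that suggested follow-up shape, assuming a map-backed registry as in this PR; names are illustrative:

```go
package clusterconditions

// matcher stands in for whatever a registered condition evaluates.
type matcher func() (bool, error)

// ConditionRegistry maps condition type names to implementations.
type ConditionRegistry struct {
	registry map[string]matcher
}

// IsValid reports whether a condition type has been registered, which is
// all the rule-matching loop actually needs from the registry; the loop
// itself can then live outside the registry type.
func (r *ConditionRegistry) IsValid(conditionType string) bool {
	_, ok := r.registry[conditionType]
	return ok
}
```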
```diff
 	for _, config := range matchingRules {
-		condition, ok := Registry[config.Type]
+		condition, ok := r.registry[config.Type]
```
ditto
```diff
 	})
 }
 
-	delete(clusterconditions.Registry, "Error")
```
❤️
```go
	svc, err := p.kubeClient.CoreV1().Services("openshift-monitoring").Get(ctx, "thanos-querier", metav1.GetOptions{})
	if err != nil {
		return "", err
	}
```
Do we have any expectations about how reliable this is? Rock-solid, flakey, no idea? I'm just curious.
> Do we have any expectations about how reliable this is? Rock-solid, flakey, no idea? I'm just curious.

Very reliable. It's what drives DNS.
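For context, a hedged sketch of the pattern the hunk above feeds into: read the Service via the kube-apiserver and talk to its ClusterIP directly, so no DNS lookup is needed. The function name and the 9091 port are assumptions for illustration:

```go
package promql

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// thanosQuerierURL resolves the PromQL endpoint through the kube-apiserver
// instead of DNS: fetch the Service object and use its ClusterIP, which
// kube-proxy routes without any name lookup.
func thanosQuerierURL(ctx context.Context, client kubernetes.Interface) (string, error) {
	svc, err := client.CoreV1().Services("openshift-monitoring").Get(ctx, "thanos-querier", metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	// The 9091 port is assumed here for illustration.
	return fmt.Sprintf("https://%s:9091", svc.Spec.ClusterIP), nil
}
```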
I'm going for it. It's https://issues.redhat.com/browse/OCPBUGS-11308

/override ci/prow/e2e-agnostic-ovn-upgrade-out-of-change
@deads2k: Overrode contexts on behalf of deads2k: ci/prow/e2e-agnostic-ovn-upgrade-out-of-change
@deads2k: all tests passed! Full PR test history. Your PR dashboard.
/shrug
/retitle OCPBUGS-3783: Update dnsPolicy to allow consistent resolution of the internal LB
@deads2k: Jira Issue OCPBUGS-3783: All pull requests linked via external trackers have merged. Jira Issue OCPBUGS-3783 has been moved to the MODIFIED state.
/cherry-pick release-4.13
@petr-muller: new pull request created: #923
This commit will introduce new flags and logic to the CVO regarding its
PromQL target for risk evaluation of conditional updates with the CVO
being run in a hosted control plane in mind.
For the CVO to successfully access a service that provides metrics
(in the case of the CVO in a standalone OpenShift cluster it's the
thanos-querier service), it needs three things. It needs the service's
address, a CA bundle to verify the certificate provided by the service
to allow secure communication using TLS between the actors [1], and the
authorization credentials of the CVO. Currently, the CVO hardcodes the
address, the path to the CA bundle, and the path to the credentials file.
This is not ideal, as the CVO is starting to be used in other
repositories, such as HyperShift [2].
A path to the service CA bundle file is added to allow explicitly
setting the CA bundle file of the given query service.
Currently, the CVO uses the kube-apiserver to resolve the IP address
of a specific service in the cluster [4]. New flags were added to
specify the service name and its namespace, as the PromQL target may
reside in a different namespace or have a different name.
In HyperShift, multiple kube-apiservers are in use. A management
cluster creates a hosted control plane where a hosted CVO resides. This
hosted CVO is then given a kubeconfig file for the cluster where it
reconciles manifests. However, because the CVO cannot resolve internal
DNS names and, in this case, is given a kubeconfig for a remote
cluster, it is not capable of communicating with internal services on
the management cluster. Thus a new flag is added to specify the path to
the management cluster kubeconfig file. When the management cluster
kubeconfig is not specified, the CVO defaults to using the
`serviceaccount` directory, as is the behavior in a standalone OCP
cluster.
A flag to specify the path to the credentials file was not added
because the CVO hardcodes the same credentials-file path that other
Kubernetes service accounts use [3].
[1] https://docs.openshift.com/container-platform/4.12/security/certificate_types_descriptions/service-ca-certificates.html
[2] https://github.com/openshift/hypershift
[3] https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/#serviceaccount-admission-controller
[4] openshift#920
This commit also adds a new flag to specify whether the CVO resides in
a hosted control plane. In this case, the CVO will inject its cluster
ID into PromQL queries to differentiate between multiple time series
belonging to different hosted clusters [5].
[5] openshift/cincinnati-graph-data#3591
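A sketch of the flag surface this message describes, using Go's standard flag package; every flag name here is hypothetical, not necessarily what the commit adds:

```go
package main

import "flag"

// Hypothetical flag names, for illustration of the commit message only.
var (
	metricsCABundleFile  = flag.String("metrics-ca-bundle-file", "", "path to the CA bundle used to verify the query service's serving certificate")
	metricsServiceName   = flag.String("metrics-service-name", "thanos-querier", "name of the PromQL query Service")
	metricsServiceNS     = flag.String("metrics-service-namespace", "openshift-monitoring", "namespace of the PromQL query Service")
	managementKubeconfig = flag.String("management-kubeconfig", "", "kubeconfig for the management cluster; empty falls back to the in-cluster serviceaccount directory")
	hostedControlPlane   = flag.Bool("hypershift", false, "when set, inject the cluster ID into PromQL queries")
)

func main() {
	flag.Parse()
	_ = *metricsCABundleFile
	_ = *metricsServiceName
	_ = *metricsServiceNS
	_ = *managementKubeconfig
	_ = *hostedControlPlane
}
```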
This commit will enable the Cluster Version Operator (CVO) to evaluate conditional updates. To evaluate conditional updates, the CVO accesses a PromQL query service holding the appropriate information. In standalone OpenShift, it is the thanos-querier service in the openshift-monitoring namespace.

We are granting the CVO the `get` permission on the namespaces resource because, to access the Thanos Querier service APIs, the requesting account must have `get` permission on the namespaces resource [1].

The CVO lost the ability to resolve service DNS names [2]. The CVO now uses the kube-apiserver to get a specific service, and then uses the IP address specified in the resource. This commit will give the CVO permission to `get` service resources. This will enable the CVO to contact the appropriate query service, as well as any potential services in the future.

This commit also adds more volumes to the CVO, such as the service CA bundle and the /var/run/secrets/kubernetes.io/serviceaccount directory, so that the CVO is able to communicate with the query service. The volumes were added similarly to how they are set in the OpenShift manifest file [3].

The hosted control plane's scraped data can be queried either by accessing the thanos-querier service or by accessing the hypershift-monitoring-stack-prometheus service, in the case the ObO (Observability Operator) is responsible for scraping metrics [4]. The CPO (Control Plane Operator) will pass this information to the CVO in the case RHOBS monitoring is enabled.

[1] https://docs.openshift.com/container-platform/4.13/monitoring/accessing-third-party-monitoring-apis.html
[2] openshift/cluster-version-operator#920
[3] https://github.com/openshift/cluster-version-operator/blob/216683cfc09eebefd87bed66849d52b6a357c5ab/install/0000_00_cluster-version-operator_03_deployment.yaml
[4] https://gitlab.cee.redhat.com/service/rhobs-rules-and-dashboards/-/blob/main/docs/rules/hypershift-monitoring-architecture-introduction.md#the-big-picture
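The two permission grants described above, expressed as rbacv1 rules in Go for illustration (the actual change is a YAML RBAC manifest in the repo):

```go
package manifests

import rbacv1 "k8s.io/api/rbac/v1"

// The grants the message describes: `get` on namespaces (required by the
// Thanos Querier tenancy API) and `get` on services (so the CVO can read a
// Service's ClusterIP instead of resolving its DNS name).
var cvoQueryServiceRules = []rbacv1.PolicyRule{
	{APIGroups: []string{""}, Resources: []string{"namespaces"}, Verbs: []string{"get"}},
	{APIGroups: []string{""}, Resources: []string{"services"}, Verbs: []string{"get"}},
}
```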
This commit will introduce new flags and logic to the CVO regarding its
PromQL target for risk evaluation of conditional updates with the CVO
being run in a hosted control plane in mind.
For the CVO to successfully access a service that provides metrics
(in the case of the CVO in a standalone OpenShift cluster it's the
thanos-querier service), it needs three things. It needs the service's
address, a CA bundle to verify the certificate provided by the service
to allow secure communication using TLS between the actors [1], and the
authorization credentials of the CVO. Currently, the CVO hardcodes the
address, the path to the CA bundle, and the path to the credentials file.
This is not ideal, as the CVO is starting to be used in other
repositories, such as HyperShift [2].
A path to the service CA bundle file is added to allow explicitly
setting the CA bundle file of the given query service.
Currently, the CVO uses the kube-apiserver to resolve the IP address
of a specific service in the cluster [4]. New flags were added to
specify the service name and its namespace, as the PromQL target may
reside in a different namespace or have a different name. The port of
the service can also differ, as in the case of managed HyperShift
monitoring. This commit ensures the port number is not hard-coded
but is instead looked up dynamically by the port's name.
In HyperShift, multiple kube-apiservers are in use. A management
cluster creates a hosted control plane where a hosted CVO resides. This
hosted CVO is then given a kubeconfig file for the cluster where it
reconciles manifests. However, because the CVO cannot resolve internal
DNS names and, in this case, is given a kubeconfig for a remote
cluster, it is not capable of communicating with internal services on
the management cluster. Thus a new flag is added to specify the path to
the management cluster kubeconfig file. When the management cluster
kubeconfig is not specified, the CVO defaults to using the
`serviceaccount` directory, as is the behavior in a standalone OCP
cluster.
A flag to specify the path to the credentials file was not added
because the CVO hardcodes the same credentials-file path that other
Kubernetes service accounts use [3].
[1] https://docs.openshift.com/container-platform/4.12/security/certificate_types_descriptions/service-ca-certificates.html
[2] https://github.com/openshift/hypershift
[3] https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/#serviceaccount-admission-controller
[4] openshift#920
This commit also adds a new flag to specify whether the CVO resides in
a hosted control plane. In this case, the CVO will inject its cluster
ID into PromQL queries to differentiate between multiple time series
belonging to different hosted clusters [5].
[5] openshift/cincinnati-graph-data#3591
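A minimal sketch of the port-by-name lookup the message describes; the helper name is illustrative:

```go
package promql

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// portByName avoids a hard-coded port number by looking the port up by name
// on the Service spec, so a differently configured target (for example the
// managed HyperShift monitoring stack) still works.
func portByName(svc *corev1.Service, name string) (int32, error) {
	for _, p := range svc.Spec.Ports {
		if p.Name == name {
			return p.Port, nil
		}
	}
	return 0, fmt.Errorf("service %s/%s has no port named %q", svc.Namespace, svc.Name, name)
}
```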
This commit will introduce new flags and logic to the CVO regarding its PromQL target for risk evaluation of conditional updates, with the CVO being run in a hosted control plane in mind.

For the CVO to successfully access a service that provides metrics (in the case of the CVO in a standalone OpenShift cluster, it's the thanos-querier service), it needs three things: the service's address, a CA bundle to verify the certificate provided by the service to allow secure TLS communication between the actors [1], and the authorization credentials of the CVO. Currently, the CVO hardcodes the address, the path to the CA bundle, and the path to the credentials file. This is not ideal, as the CVO is starting to be used in other repositories, such as HyperShift [2].

A path to the service CA bundle file is added to allow explicitly setting the CA bundle file of the given query service.

Currently, the CVO uses the kube-apiserver to resolve the IP address of a specific service in the cluster [3]. Add new flags that allow configuring the CVO to resolve the IP address via DNS when DNS is available to the CVO. This is the case for hosted CVOs in HyperShift. The alternative in HyperShift would be to use the management cluster kube-apiserver and give the hosted CVOs additional permissions.

Add flags to specify the PromQL target URL. This URL contains the scheme, the port, and the server name used for TLS configuration. When DNS is enabled, the URL is also used to query the Prometheus server. When DNS is disabled, the server can be specified via the Kubernetes service in the cluster that exposes the server.

A flag to specify the path to the credentials file was added for more customizability. This flag also enables the CVO to not set the token when it's not needed. A CVO can communicate with a Prometheus server over HTTP; when the token is not needed, it would be undesirable for the CVO to send its token over HTTP without reason.

This commit also adds a new flag to specify whether the CVO resides in a hosted control plane. In this case, the CVO will inject its cluster ID into PromQL queries to differentiate between multiple time series belonging to different hosted clusters [4].

[1] https://docs.openshift.com/container-platform/4.12/security/certificate_types_descriptions/service-ca-certificates.html
[2] https://github.com/openshift/hypershift
[3] openshift#920
[4] openshift/cincinnati-graph-data#3591
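A hedged sketch of the DNS-versus-service selection this message describes; the function signature and names are illustrative:

```go
package promql

import (
	"context"
	"fmt"
	"net/url"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// queryTarget picks the PromQL endpoint: when DNS is usable (the hosted CVO
// in HyperShift), the configured target URL is used as-is; otherwise the
// URL's host is rewritten to the Service's ClusterIP, looked up via the
// kube-apiserver.
func queryTarget(ctx context.Context, client kubernetes.Interface, target *url.URL, useDNS bool, namespace, name string) (*url.URL, error) {
	if useDNS {
		return target, nil
	}
	svc, err := client.CoreV1().Services(namespace).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	// Keep the scheme and port from the configured URL; swap in the IP.
	resolved := *target
	resolved.Host = fmt.Sprintf("%s:%s", svc.Spec.ClusterIP, target.Port())
	return &resolved, nil
}
```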