-
Notifications
You must be signed in to change notification settings - Fork 82
Bug 1805168: inject global proxy envs,ca into OCM daemonset #145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1805168: inject global proxy envs,ca into OCM daemonset #145
Conversation
|
@gabemontero: This pull request references Bugzilla bug 1805168, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
d8d367e to
518421c
Compare
|
existing e2e's are green at least /assign @bparees |
|
I don't think this is the right solution. the CVO injection is only for operators. As in "I need the CVO to inject these env vars into my operators' deployment because the CVO owns my deployment and only it can touch it". You(the OCM operator) own the OCM's deployment, you should be setting the proxy envs on the OCM's deployment and reconciling them as they change (i.e. OCM operator needs to watch the proxy config and propagate that config to the OCM deployment) |
|
/hold |
OK a bigger change but certainly not onerours. @adambkaplan 's suggestion of the annotation was trivial enough that at least proposing it to a bigger audience for evaluation seemed fine. There is also the possibility of the OCM, which is already watching proxy config, setting the envs on itself. While less expensive, that probably subverts our "what an operator does and what an operand does" conventions. But certainly voice any opinions there @bparees if you like. |
the resource needs to be owned/written to by a single writer. for the OCM daemonset resource, that's the OCM operator. Otherwise you risk two things fighting w/ each other about what the correct state of the resource should be. |
Well I was actually thinking it would be golang But again, probably too off the beaten path ... I just mentioned it cause it popped into my head I'll start down the path in this repo |
the drawbacks w/ that approach are:
(2) is why we had to make the CVO able to inject proxy config on operator resources, because operators could not solve the problem themselves by dynamically setting their own env vars. You're basically in the same boat, one level deeper. |
|
Note: i think i may have seen another devex PR recently that was adding the CVO proxy injection annotation. I didn't look at it closely... but if it was doing the same thing as this one (adding the injection annotation to an operand resource, not an operator resource) it should be reverted and redone per the approach i've outlined in the discussion here. |
since @adambkaplan suggested the annotation to me, maybe he know which PR you are referring to @bparees otherwise, I'll refactor this PR to have OCM-O watch global proxy and update the OCM daemonset's env as you previously mentioned in the classic operator/operand mgmt pattern |
|
btw if the operand has to consume the proxy it also needs to consume the proxy CAs, so there's more that needs to be done here. the operand has to mount a configmap that's annotated for the proxy CA injection, and the operator has to watch that configmap and redeploy the operand if the configmap is updated w/ new CA content. |
yep true as well ... so yeah the OCM-O is already creating a proxy injected CM for the CAs and is watching it as part of the build support ... the daemonset update would have to include mounting that CA in addition to the envs And I suppose the OCM would have to be updated to copy the cert from the k8s mount point to the well known location under /etc just like the openshift/builder image does today |
yes, or mount the CM there directly. (The CM should contain all the CAs you need, including system trusts, so it should be ok to just mount over top of the image's own CAs) |
I concur that mounting directly is the way to go for OCM. Builds are kind of a special case because we have |
|
I think you need something similar to openshift/cluster-kube-controller-manager-operator#285 and openshift/cluster-kube-controller-manager-operator#325 to be able to consume and use the proxy as far as I can tell. |
thanks for the pointers @soltysh I've got some changes on my laptop that share some commonality with those PRs I'm wrapping up some unit tests and should have a PR up before my EOB close today. |
518421c to
0319d9a
Compare
|
@gabemontero: This pull request references Bugzilla bug 1805168, which is valid. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/hold cancel PR switched over to a proxy watch, config observation, and injection of envs/ca into OCM daemonset via operator ptal |
0319d9a to
4bce0da
Compare
soltysh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This lgtm
|
@gabemontero looks good to me, squash commits and we can make this official. |
|
10-4 @adambkaplan @soltysh thanks .. |
7d2648d to
9c985c9
Compare
|
... and commits squashed/pushed @adambkaplan @soltysh and thanks @bparees for all the help |
|
/retest |
|
green tests @adambkaplan @soltysh @bparees can somebody lgtm ? |
| configobservation.Listers{ | ||
| ImageConfigLister: configInformers.Config().V1().Images().Lister(), | ||
| BuildConfigLister: configInformers.Config().V1().Builds().Lister(), | ||
| ProxyLister: configInformers.Config().V1().Proxies().Lister(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we install event handlers to notice changes and update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are absolutely right @sttts
though this particular piece is a leftover from the confusion over when to use observe_config_controller.go vs. operator.go and its calls to the sync_*.go file
so I think we need
a) cleanup and full revert of the changes here in observe_config_controller.go
b) and install the event handlers as you say around here:
| proxyLister proxyvclient1.ProxyLister, | |
| kubeInformersForOpenshiftControllerManager informers.SharedInformerFactory, | |
| operatorConfigClient operatorclientv1.OperatorV1Interface, | |
| kubeClient kubernetes.Interface, | |
| recorder events.Recorder, | |
| ) *OpenShiftControllerManagerOperator { | |
| c := &OpenShiftControllerManagerOperator{ | |
| targetImagePullSpec: targetImagePullSpec, | |
| operatorConfigClient: operatorConfigClient, | |
| proxyLister: proxyLister, | |
| kubeClient: kubeClient, | |
| queue: workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), "KubeApiserverOperator"), | |
| rateLimiter: flowcontrol.NewTokenBucketRateLimiter(0.05 /*3 per minute*/, 4), | |
| recorder: recorder, | |
| } | |
| operatorConfigInformer.Informer().AddEventHandler(c.eventHandler()) | |
| kubeInformersForOpenshiftControllerManager.Core().V1().ConfigMaps().Informer().AddEventHandler(c.eventHandler()) | |
| kubeInformersForOpenshiftControllerManager.Core().V1().ServiceAccounts().Informer().AddEventHandler(c.eventHandler()) | |
| kubeInformersForOpenshiftControllerManager.Core().V1().Services().Informer().AddEventHandler(c.eventHandler()) | |
| kubeInformersForOpenshiftControllerManager.Apps().V1().Deployments().Informer().AddEventHandler(c.eventHandler()) |
thanks ... getting started
| return required, false, err | ||
| } | ||
| } | ||
| if err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else {
| } | ||
| if httpsProxySet && !hasHTTPSProxy { | ||
| newEnvs = append(newEnvs, corev1.EnvVar{Name: "HTTPS_PROXY", Value: proxyCfg.Status.HTTPSProxy}) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this?
newEnvs := []corev1.EnvVar{}
for _, env := range c.Env {
name := strings.TrimSpace(env.Name)
switch name {
case "HTTPS_PROXY":
if len(proxyCfg.Status.HTTPSProxy) == 0 {
continue
}
env.Value = proxyCfg.Status.HTTPSProxy
case ...:
case ....:
}
newEnvs = append(newEnvs, env)
}
forceRollout = forceRollout || !reflect.DeepEqual(newEnvs, c.Env)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to tweak it slightly to deal with an empty c.Env but I agree your form is better ... will be pushing new commit soon
9c985c9 to
f0cdb02
Compare
|
updates pushed @stts thanks again |
|
broad failures across all suites waiting on expected SAs in test namespaces... as an example: The CVO has a nasty looking extra condition on it: punting /test e2e-aws |
|
I see a lot of throttling notifications trying to access the api server in the OCM-O pod logs around the times of the e2e-aws-operator failures ... my assumption is that messed with those e2e's but am getting up to speed on those tests, looking around the artifacts some more before retesting that one. |
Yeah even the basic, very first test that just reads the OCM clusteroperator suffered a TO, but if you look at the artifacts, that object showed everything was well, and was last updated before the test started. /test e2e-aws-operator |
|
/retest |
1 similar comment
|
/retest |
|
/test e2e-aws |
f0cdb02 to
84afeea
Compare
|
pushed fix for e2e-aws-operator that should better account for a false positive we were getting with the |
|
CI hiccup across most of the board /retest |
|
green tests bump - any of the previous reviewers ready to LGTM ? |
adambkaplan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: adambkaplan, gabemontero, soltysh The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@gabemontero: All pull requests linked via external trackers have merged: openshift/cluster-openshift-controller-manager-operator#145. Bugzilla bug 1805168 has been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
per recent discussion between @dmage, @adambkaplan, and myself
the image signature controller in OCM is reaching out to the registries like registry.redhat.io and not using the global proxy if set
see https://github.com/openshift/cluster-version-operator/blob/611490ff448962c70ab29a90764a6c30d4ee87dc/lib/resourcebuilder/apps.go#L55
the bug originator manually set the proxy ENVs on the OCM and saw things works
@bparees @mfojtik @soltysh FYI