@danehans
Contributor

Previously, MergeUserSystemNoProxy() included specific api-server and etcd DNS names when creating the default noProxy list. This PR instead uses the cluster's baseDomain from the OpenShift config.DNS API as a noProxy entry. For example:

$ oc get proxy/cluster -o yaml
<SNIP>
status:
  httpProxy: http://admin:admin@35.196.128.173:3128
  noProxy: .dhansen.devcluster.openshift.com,10.0.0.0/16,10.128.0.0/14,127.0.0.1,169.254.169.254,172.30.0.0/16,example.com,localhost

This approach covers the etcd and api-server names, as well as any additional names added under the cluster's base domain. For example, the cluster-ingress-operator creates a wildcard alias record for routes:

*.apps.dhansen.devcluster.openshift.com.

These routes should not be proxied.
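To illustrate why a single dotted base-domain entry covers both the API names and the wildcard route names, here is a minimal sketch of suffix matching. It is a simplification, not the matching logic of any particular HTTP client: `bypassesProxy` is a hypothetical helper, the host names are illustrative, and CIDR entries and ports are ignored. Real clients may treat entries without a leading dot differently.

```go
package main

import (
	"fmt"
	"strings"
)

// bypassesProxy reports whether host is covered by one of the noProxy entries.
// Simplified rules (an assumption for this sketch): an entry with a leading
// dot matches any host ending in that suffix; any other entry matches only
// the exact host name.
func bypassesProxy(host string, noProxy []string) bool {
	host = strings.ToLower(strings.TrimSuffix(host, "."))
	for _, entry := range noProxy {
		entry = strings.ToLower(strings.TrimSpace(entry))
		if entry == "" {
			continue
		}
		if strings.HasPrefix(entry, ".") {
			if strings.HasSuffix(host, entry) {
				return true
			}
			continue
		}
		if host == entry {
			return true
		}
	}
	return false
}

func main() {
	noProxy := []string{".dhansen.devcluster.openshift.com", "example.com", "localhost"}
	hosts := []string{
		"api.dhansen.devcluster.openshift.com",          // illustrative API name
		"console.apps.dhansen.devcluster.openshift.com", // illustrative route under *.apps
		"registry.example.org",                          // illustrative external host
	}
	for _, h := range hosts {
		fmt.Printf("%-48s bypass=%v\n", h, bypassesProxy(h, noProxy))
	}
}
```

Under these assumptions, anything provisioned under the base domain bypasses the proxy, which is also why the entry is broader than a list of specific names.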

/assign @bparees @knobunc

@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 20, 2019
@bparees

bparees commented Aug 20, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 20, 2019
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bparees, danehans
To complete the pull request process, please assign knobunc
You can assign the PR to them by writing /assign @knobunc in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@danehans danehans changed the title from "Refactors noPorxy merge func" to "Refactors noProxy merge func" Aug 20, 2019
@danehans
Contributor Author

/test e2e-aws
/test e2e-aws-upgrade
/test e2e-aws-ovn-kubernetes

}

if len(dns.Spec.BaseDomain) > 0 {
	set.Insert("." + dns.Spec.BaseDomain)

@deads2k here's where we're adding the cluster domain to noproxy. I believe we made this choice based on what @sdodson found we were doing in 3.x.

Member

Would this be .cluster.local or .dhansen.devcluster.openshift.com? The former is what we added in 3.x, but in 4.x, since we're provisioning hosts into the .dhansen.devcluster.openshift.com zone, it probably makes sense to add that as well.

.dhansen.devcluster.openshift.com, I believe. .cluster.local is hardcoded elsewhere.

@spadgett I would expect this routing/proxy question to have been an issue in 3.x as well, so either:

  1. we were noproxying the domain in 3.x
  2. customers were manually adding it
  3. console traffic was going through the customer proxies without issue (which presumably means they were doing so without any additional CAs)
  4. ??

Member

It would've been either 2 or 3.

Member

For 3.x, the browser talked directly to the OAuth server. We didn't make requests to the OAuth server from our pod.

The exception is the 3.11 admin console, which is the same code base as the 4.x console. We let users customize the CA for talking to the OAuth server in the admin console, though. I'm not familiar with how proxy worked in 3.x, but presumably it was possible to add HTTP proxy env vars directly to the admin console deployment.

Contributor Author

@sdodson @bparees That is correct: this PR uses .${CLUSTER_NAME}.${BASE_DOMAIN} as a default noProxy entry. In my cluster, this is .dhansen.devcluster.openshift.com. #295 added .svc and .cluster.local as noProxy defaults.
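For readers following the thread, a rough sketch of how the defaults, the dotted base domain, and user-supplied entries could be merged into one sorted, comma-separated string. This is not the operator's actual code: `mergeNoProxy` and its parameters are hypothetical, and the default list passed in `main` is only an illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// mergeNoProxy is a hypothetical stand-in for the merge step. It combines
// system defaults, the dotted base domain, and the user's comma-separated
// noProxy value into a de-duplicated, sorted, comma-separated string.
func mergeNoProxy(baseDomain string, systemDefaults []string, userNoProxy string) string {
	set := map[string]struct{}{}
	for _, entry := range systemDefaults {
		set[entry] = struct{}{}
	}
	// Mirrors the check in the diff: only add the base domain when it is set.
	if len(baseDomain) > 0 {
		set["."+baseDomain] = struct{}{}
	}
	for _, entry := range strings.Split(userNoProxy, ",") {
		if entry = strings.TrimSpace(entry); entry != "" {
			set[entry] = struct{}{}
		}
	}
	entries := make([]string, 0, len(set))
	for entry := range set {
		entries = append(entries, entry)
	}
	sort.Strings(entries)
	return strings.Join(entries, ",")
}

func main() {
	// Illustrative defaults only; the real default list is longer.
	defaults := []string{".svc", ".cluster.local", "127.0.0.1", "localhost"}
	fmt.Println(mergeNoProxy("dhansen.devcluster.openshift.com", defaults, "example.com, localhost"))
}
```

Using a set keeps duplicates out when a user entry repeats a default, and sorting keeps the resulting string stable between reconciliations.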

@danehans per the email thread discussion, let's drop the cluster domain from the noproxy autogenerated list and move forward w/ this PR.

we'll resolve the fallout (e.g. console being broken) on a case by case basis.

@spadgett
Member

cc @enj

@deads2k
Contributor

deads2k commented Aug 22, 2019

/hold

This is being discussed. It's not clear that the ingress subdomain can be assumed to be in-cluster. Just don't want to merge in the interim.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 22, 2019
@danehans
Contributor Author

"Cluster did not complete upgrade: timed out waiting for the condition: Working towards registry.svc.ci.openshift.org/ci-op-v419r356/release@sha256:3387479f940a92d4b2ef992fc34d7bbef8f54ebfbb15c312b327e1857a77aa0d: downloading update",

I believe a PR landed to fix this issue.
/test e2e-aws-upgrade

@openshift-ci-robot
Contributor

openshift-ci-robot commented Aug 22, 2019

@danehans: The following test failed, say /retest to rerun them all:

Test name | Commit | Details | Rerun command
ci/prow/e2e-aws-ovn-kubernetes | 8f1f6fe | link | /test e2e-aws-ovn-kubernetes

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

@danehans
Contributor Author

/cc @jcpowermac @abhinavdahiya as the installer will need to update if this merges.

@danehans
Contributor Author

/test e2e-aws-proxy

@danehans
Contributor Author

level=error msg="  on ../tmp/openshift-install-065943236/vpc/master-elb.tf line 1, in resource \"aws_lb\" \"api_internal\":"
level=error msg="   1: resource \"aws_lb\" \"api_internal\" {"

/test e2e-aws-upgrade

// provided, a comma-separated string of cluster-wide noProxy settings
// are returned.
func MergeUserSystemNoProxy(proxy *configv1.Proxy, infra *configv1.Infrastructure, network *configv1.Network, cluster *corev1.ConfigMap) (string, error) {
func MergeUserSystemNoProxy(proxy *configv1.Proxy, infra *configv1.Infrastructure, network *configv1.Network,

@danehans
Contributor Author

danehans commented Sep 16, 2019

@deads2k I'm seeing the following auth operator error while trying to create a proxy-enabled cluster using IPI:

E0916 21:24:49.420417       1 controller.go:129] {AuthenticationOperator2 AuthenticationOperator2} failed with: error checking current version: unable to check route health: failed to GET route: proxyconnect tcp: x509: certificate signed by unknown authority

openshift/cluster-authentication-operator#194 attempts to fix this error. However, I don't understand why this call should be proxied. Are there any updates on #296 (comment)? Otherwise, should this PR be refactored to only include noProxy for specific routes or the apps subdomain instead of the entire cluster domain?

@danehans
Contributor Author

Per #296 (comment), the ingress domain will not be added to the default noProxy list.

@danehans danehans closed this Sep 26, 2019