
kube_proxy_and_dns: add role that runs standalone kube-proxy and DNS#9621

Merged
openshift-merge-robot merged 1 commit into openshift:release-3.10 from dcbw:3.10-standalone-kube-proxy
Aug 31, 2018

Conversation

@dcbw
Contributor

@dcbw dcbw commented Aug 16, 2018

And depend on that role in Calico.

@smarterclayton @sdodson @knobunc @squeed 3rd party plugins might depend on kube-proxy, which as of 3.10 is part of the SDN daemonset that those 3rd party plugins probably don't want to select/run. Including an ansible role for it is one option; another would be to ask each 3rd party plugin that needs kube-proxy to add a container for it to the pod definition in its daemonset. Thoughts?

(also my Ansible is weak, I have no idea if this is the correct way to do it)
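For concreteness, here is a rough sketch of the second option, a kube-proxy sidecar added to a 3rd-party plugin's own daemonset. The container name, image, and command are illustrative, borrowed from snippets later in this thread, and the pod would also need hostNetwork plus the node config mounted; this is not a definitive implementation:

containers:
- name: proxy                     # matches the container name the image trigger later in this thread updates
  image: openshift/node:v3.10.0   # illustrative; openshift-ansible would normally manage this image reference
  securityContext:
    privileged: true              # kube-proxy needs to program iptables on the host
  command:
  - /bin/bash
  - -c
  - exec openshift start network --disable=dns,plugins --enable=proxy --config=/etc/origin/node/node-config.yaml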

@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 16, 2018
@dcbw dcbw force-pushed the 3.10-standalone-kube-proxy branch from dc7c79e to 0fe2381 on August 16, 2018 03:18
@vrutkovs
Contributor

This PR (along with fixes from #9435) seems to fix Calico on 3.11. Could you resubmit it for master as well?

@dcbw
Contributor Author

dcbw commented Aug 16, 2018

Could you resubmit it for master as well?

@vrutkovs Yes, I can do that as well. The immediate issue we had was the lack of kube-proxy in 3.10 due to the daemonset requirements, hence the 3.10 submission as a first pass. I'd like to get some comments on it too. Are you able to test this PR out and verify that it correctly starts the kube-proxy daemonset on calico nodes?

@vrutkovs
Contributor

@dcbw yep, kube-proxy daemonset started and calico worked fine (with several fixes from #9435 - would be great if you could have a look at those if possible). Haven't tried other custom SDNs.

The Ansible part looks great; let's get another lgtm on the approach and we'll merge this.

I wonder if running kube-proxy in a daemonset would change the metrics fix in #9592?


@spuranam spuranam left a comment


It's best not to hardcode the version; instead, something like this should do the trick: s/v3.10/${version}/

apiVersion: image.openshift.io/v1
kind: ImageStreamTag
metadata:
  name: node:v3.10


It's best not to hardcode the version; instead, something like this should do the trick: s/v3.10/${version}/
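For illustration only (the replies below explain why the tag is deliberately hardcoded here), a templated variant along the lines of the suggestion might look like this; the openshift_image_tag variable is an assumption, and the actual fact used by openshift-ansible may differ:

apiVersion: image.openshift.io/v1
kind: ImageStreamTag
metadata:
  name: node:{{ openshift_image_tag }}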

Contributor


This is the name of the imagestream; it needs to be hardcoded so that an upgraded cluster will have separate imagestreams for each major version.

Contributor Author


Yeah, for a PR against 'git master' I would expect the tag to be different, but I was just copying what some of the existing ImageStream tags already were, thus "v3.10".

reference: true
from:
  kind: DockerImage
  name: openshift/node:v3.10.0


It's best not to hardcode the version; instead, something like this should do the trick: s/v3.10/${version}/

Contributor


This daemonset launches kube-proxy.
image.openshift.io/triggers: |
  [
    {"from":{"kind":"ImageStreamTag","name":"node:v3.10"},"fieldPath":"spec.template.spec.containers[?(@.name==\"proxy\")].image"}


It's best not to hardcode the version; instead, something like this should do the trick: s/v3.10/${version}/

@squeed

squeed commented Aug 17, 2018

Should this use the openshift binary or the hyperkube binary? My preference would be for hyperkube.

@smarterclayton
Contributor

smarterclayton commented Aug 17, 2018 via email

# Point /tmp/kubeconfig at the pod's service account token for authentication
oc config --config=/tmp/kubeconfig set-credentials sa "--token=$( cat /var/run/secrets/kubernetes.io/serviceaccount/token )"
oc config --config=/tmp/kubeconfig set-context "$( oc config --config=/tmp/kubeconfig current-context )" --user=sa
# Launch the kube-proxy process
exec openshift start network --disable=dns,plugins --enable=proxy --config=/etc/origin/node/node-config.yaml --kubeconfig=/tmp/kubeconfig --loglevel=${DEBUG_LOGLEVEL:-2}
Contributor

@kprabhak kprabhak Aug 17, 2018


You might want to add "--enable=dns", otherwise cluster-internal service lookups may not work.


@dcbw @vrutkovs @kprabhak I can confirm that this PR, along with @kprabhak's suggestion and the fixes from #9435, fixes Calico on OpenShift Enterprise 3.10.

Contributor Author


@kprabhak yes, good point, which brings up the question: would everyone who uses a standalone kube-proxy role also use the DNS, and if not, should DNS be individually selectable from kube-proxy?
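If DNS did need to be individually selectable, a minimal sketch of one way to do it, assuming a hypothetical role variable that this PR does not define, toggling between the two command forms seen in this thread:

# hypothetical default, e.g. in the role's defaults/main.yml
kube_proxy_and_dns_enable_dns: true

# in the daemonset command template, enable dns only when requested
{% if kube_proxy_and_dns_enable_dns | bool %}
exec openshift start network --disable=plugins --enable=proxy,dns --config=/etc/origin/node/node-config.yaml --kubeconfig=/tmp/kubeconfig
{% else %}
exec openshift start network --disable=dns,plugins --enable=proxy --config=/etc/origin/node/node-config.yaml --kubeconfig=/tmp/kubeconfig
{% endif %}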

Contributor


For what it is worth, I had:

 exec openshift start network --disable=plugins --enable=proxy,dns --config=/etc/origin/node/node-config.yaml --kubeconfig=/tmp/kubeconfig --loglevel=${DEBUG_LOGLEVEL:-2}

just work for me in an install.

dependencies:
- role: openshift_facts
- role: container_runtime
- role: kube_proxy
Contributor

@mgleung mgleung Aug 19, 2018


In the release-3.10 branch, this role dependency should go in the calico_master role. If this is still a thing going forward, we should leave it in the calico role (we will rename the roles going forward as per #9435). I made a PR for the appropriate calico fixes in the release-3.10 branch here: #9657, but it means this role will need to be a dependency of the calico_master role instead of the calico role.

Contributor


Exactly what @mgleung said; for the 3.10 release branch this should be under the calico_master role.
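In other words, the suggested placement would look roughly like this; a sketch only, extrapolated from the dependencies block above, so the exact contents of the calico_master meta file in release-3.10 may differ:

# roles/calico_master/meta/main.yml (sketch)
dependencies:
- role: openshift_facts
- role: kube_proxy    # renamed to kube_proxy_and_dns later in this PR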

@dcbw dcbw force-pushed the 3.10-standalone-kube-proxy branch from 0fe2381 to 67e911f on August 30, 2018 20:40
@dcbw
Contributor Author

dcbw commented Aug 30, 2018

@mgleung @dmmcquay @vrutkovs @kprabhak @spuranam

Updated the role dependency to calico_master and added DNS to the service, hence the updated name. PTAL, thanks!

@dcbw dcbw changed the title from "[RFC] kube_proxy: add role that runs standalone kube-proxy" to "kube_proxy_and_dns: add role that runs standalone kube-proxy and DNS" Aug 31, 2018
@dcbw
Contributor Author

dcbw commented Aug 31, 2018

/retest

@kprabhak
Contributor

kprabhak commented Aug 31, 2018

Looks like one PR that was previously merged into 3.9 but left behind in the new kube-proxy structure in 3.10 is #8099.

Without this, nodePorts only work in cases where the endpoint happens to be on the same node.

In 3.10, the location for roles/openshift_node/templates/node.yaml.v1.j2 appears to have changed to roles/openshift_node_group/templates/node-config.yaml.j2

I think @mgleung is working on a separate PR for this - I can confirm that nodePorts work correctly from all nodes once the following lines are added to roles/openshift_node_group/templates/node-config.yaml.j2:
{% if openshift_use_calico | default(False) | bool %}
proxyArguments:
  cluster-cidr:
  - {{ osm_cluster_network_cidr }}
{% endif %}

@mgleung
Contributor

mgleung commented Aug 31, 2018

I added #9863 and #9866 for the changes @kprabhak described.

The changes now look good to me. Thanks @dcbw!

@sdodson
Member

sdodson commented Aug 31, 2018

Ansible code looks fine to me. Just need to make sure to forward port this.
/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 31, 2018
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dcbw, sdodson

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 31, 2018
@openshift-merge-robot openshift-merge-robot merged commit eeb4272 into openshift:release-3.10 Aug 31, 2018
mgleung pushed a commit to mgleung/openshift-ansible that referenced this pull request Aug 31, 2018
kube_proxy_and_dns: add role that runs standalone kube-proxy and DNS
@mgleung
Contributor

mgleung commented Aug 31, 2018

I created a forward port of this, with the changes from #9871 added on as well, here: #9872

@vrutkovs
Contributor

vrutkovs commented Sep 3, 2018

/cherrypick master

@openshift-cherrypick-robot

@vrutkovs: new pull request created: #9878


In response to this:

/cherrypick master

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
