Upgrade to kubernetes 1.11 #473

Merged: 36 commits merged into master from feature/kubernetes-1.11 on Aug 7, 2018

Conversation

@SpComb (Contributor) commented Jul 4, 2018

Fixes #419
Fixes #472

Changes

  • Install kubeadm binary separately for master upgrades, to avoid breaking the kubelet systemd unit dropins
  • Switch from kube-dns to coredns
  • Switch to the new kubeadm.k8s.io/v1alpha2 kubeadm config format (see the sketch after this list)
  • Remove kubelet --cluster-dns configuration now that kubeadm handles it via the kubelet config
  • Upgrade to etcd 3.2
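
For reference, a minimal sketch of what the new v1alpha2 config could look like on a master node. The apiVersion/kind/kubernetesVersion fields match kubeadm 1.11; the file path and the advertiseAddress/podSubnet values are illustrative placeholders, not what pharos-cluster actually generates.

    # hedged sketch only: write a minimal kubeadm.k8s.io/v1alpha2 config and point kubeadm at it
    cat <<'EOF' > /etc/kubernetes/kubeadm.yaml
    apiVersion: kubeadm.k8s.io/v1alpha2
    kind: MasterConfiguration
    kubernetesVersion: v1.11.0
    api:
      advertiseAddress: 10.0.0.10
    networking:
      podSubnet: 10.32.0.0/16
    EOF
    kubeadm init --config /etc/kubernetes/kubeadm.yaml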

TODO

  • master node upgrades

  • worker node upgrades

  • Fix

    • kubeadm reset prompt
    • Don't use /usr/local/bin/kubeadm
    • Our custom cri-o package should no longer include crictl
  • Build new packages/images: https://github.com/kontena/pharos-kube-zipper/pull/21

    • https://dl.bintray.com/kontena/pharos-debian kubelet/kubectl/kubeadm 1.11.0
    • https://dl.bintray.com/kontena/pharos-debian cri-tools 1.11.0 (this is now a kubeadm dependency)
    • https://dl.bintray.com/kontena/pharos-rpm 1.11.0
    • quay.io/kontena kube-apiserver/kube-controllermanager/kube-proxy/kube-scheduler 1.11.0
    • quay.io/kontena etcd 3.2.18
    • quay.io/kontena coredns 1.1.3
  • Test

    • Initial install across different OS platforms
    • Upgrade from pharos 1.2
    • Optional configurations: external etcd, cri-o, ...

@SpComb added the enhancement (New feature or request) label on Jul 4, 2018
@SpComb (Contributor, Author) commented Jul 4, 2018

kubeadm upgrade now requires cri-tools:

    $ sudo VERSION=1.11.0 ARCH=amd64 sh -x < upgrade-kubeadm.sh
    + set -ex
    + kubeadm version -o short
    + [ v1.10.4 = v1.11.0 ]
    + cd /tmp
    + export DEBIAN_FRONTEND=noninteractive
    + apt-get download kubeadm=1.11.0-00
    Get:1 https://packages.cloud.google.com/apt kubernetes-xenial/main amd64 kubeadm amd64 1.11.0-00 [9,422 kB]
    Fetched 9,422 kB in 1s (6,268 kB/s)
    + dpkg -i --ignore-depends=kubelet kubeadm_1.11.0-00_amd64.deb
    (Reading database ... 83419 files and directories currently installed.)
    Preparing to unpack kubeadm_1.11.0-00_amd64.deb ...
    Unpacking kubeadm (1.11.0-00) over (1.10.4-00) ...
    dpkg: dependency problems prevent configuration of kubeadm:
     kubeadm depends on cri-tools (>= 1.11.0); however:
      Package cri-tools is not installed.
    
    dpkg: error processing package kubeadm (--install):
     dependency problems - leaving unconfigured
    Errors were encountered while processing:
     kubeadm
    ! 1
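
One way around this in the packaged-kubeadm path is to install cri-tools before unpacking the new kubeadm. A sketch (the downloaded package filename glob is an assumption):

    # fetch and install cri-tools first so dpkg no longer complains about the missing dependency
    apt-get download cri-tools
    dpkg -i cri-tools_*_amd64.deb
    dpkg -i --ignore-depends=kubelet kubeadm_1.11.0-00_amd64.deb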

@SpComb (Contributor, Author) commented Jul 4, 2018

Upgrading the system kubeadm package in order to run the kubeadm upgrade breaks the kubelet: the new kubeadm package updates the /etc/systemd/system/kubelet.service.d/10-kubeadm.conf dropin to run the kubelet with --config=/var/lib/kubelet/config.yaml, but the package upgrade does not write out that config file, so the kubelet gets stuck in a restart loop on the missing config.
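
A quick way to see the broken state on such a node (a sketch; the exact dropin contents can differ between package versions):

    # the upgraded package's dropin now points the kubelet at a kubeadm-managed config file...
    grep -e '--config' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
    # ...which nothing has written yet, so the kubelet keeps crash-looping until it appears
    ls -l /var/lib/kubelet/config.yaml
    systemctl status kubelet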

This is a documented limitation for the kubeadm upgrade path: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-1-11/#upgrade-the-control-plane

Note that upgrading the kubeadm package on your system prior to upgrading the control plane causes a failed upgrade. Even though kubeadm ships in the Kubernetes repositories, it’s important to install it manually. The kubeadm team is working on fixing this limitation.


The kubeadm upgrade updates the kubelet config on the master node at the end, so there is no need for a separate kubeadm upgrade node config run for the master node, AFAICT:

    [apiclient] Found 1 Pods for label selector component=kube-apiserver
    [upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
    [upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2018-07-04-13-58-03/kube-controller-manager.yaml"
    [upgrade/staticpods] Waiting for the kubelet to restart the component
    Static pod: kube-controller-manager-terom-pharos-master hash: 6c1e40c591159ead0cb40bfed474e0f3
    Static pod: kube-controller-manager-terom-pharos-master hash: 01e7100e5550f41e40cc34eb55f9a7fa
    [apiclient] Found 1 Pods for label selector component=kube-controller-manager
    [upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
    [upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2018-07-04-13-58-03/kube-scheduler.yaml"
    [upgrade/staticpods] Waiting for the kubelet to restart the component
    Static pod: kube-scheduler-terom-pharos-master hash: 2ec65d6c3ad7f10608bdfd93016abe03
    Static pod: kube-scheduler-terom-pharos-master hash: 31eabaff7d89a40d8f7e05dfc971cdbd
    [apiclient] Found 1 Pods for label selector component=kube-scheduler
    [upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
    [uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
    [kubelet] Creating a ConfigMap "kubelet-config-1.11" in namespace kube-system with the configuration for the kubelets in the cluster
    [kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.11" ConfigMap in the kube-system namespace
    [kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "terom-pharos-master" as an annotation
    [bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
    [bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
    [bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
    [addons] Applied essential addon: CoreDNS
    [addons] Applied essential addon: kube-proxy
    
    [upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.11.0". Enjoy!
    
    [upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.
    ! 0
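
For context, the log above is the tail end of the control-plane upgrade; the flow driving it is roughly the following (a sketch, run with the separately installed kubeadm 1.11 binary):

    # preview and then apply the control-plane upgrade on the master
    kubeadm upgrade plan
    kubeadm upgrade apply v1.11.0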

@SpComb added this to the 1.3.0 milestone on Jul 4, 2018
@SpComb (Contributor, Author) commented Jul 4, 2018

Initial install and upgrade from pharos 1.2 should work now, tested with custom hacks to use upstream repos instead of the pharos ones.

@SpComb (Contributor, Author) commented Jul 5, 2018

Further testing/work pending on updated pharos images/packages.

@SpComb (Contributor, Author) commented Jul 13, 2018

kubeadm reset now prompts:

+ kubeadm reset
[reset] WARNING: changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] are you sure you want to proceed? [y/N]: Aborted reset operation
/app/lib/pharos/ssh/remote_command.rb:66:in `run!'
/app/lib/pharos/ssh/client.rb:60:in `exec!'
/app/lib/pharos/ssh/client.rb:72:in `exec_script!'
/app/lib/pharos/host/configurer.rb:74:in `exec_script'
/app/lib/pharos/host/el7/el7.rb:82:in `reset'
/app/lib/pharos/phases/reset_host.rb:10:in `call'
/app/lib/pharos/phase_manager.rb:70:in `block in apply'
/app/lib/pharos/phase_manager.rb:26:in `block (2 levels) in run_parallel'
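
For the non-interactive pharos-cluster flow, the straightforward fix is to skip the prompt explicitly, assuming the --force flag that ships alongside the new prompt in kubeadm 1.11:

    # skip the interactive confirmation added in kubeadm 1.11
    kubeadm reset --force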

Review thread on the kubeadm binary install script:

    gpg --verify /tmp/kubeadm.gz.asc /tmp/kubeadm.gz
    gunzip /tmp/kubeadm.gz
    install -o root -g root -m 0755 -t /usr/local/bin /tmp/kubeadm # XXX: overrides package version?

@SpComb (Contributor, Author) commented on this diff, Jul 13, 2018:

This should be installed somewhere temporarily only for the duration of the upgrade, and removed once the kubeadm package has been upgraded... leaving behind a version of kubeadm in /usr/local/bin is bad.

@SpComb (Contributor, Author) added:

This also gets left behind on pharos-cluster reset and causes the next pharos-cluster up to run using the wrong version of kubeadm.
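
A minimal sketch of the cleanup that would avoid both problems: drop the temporary binary as soon as the packaged kubeadm has been upgraded (and again on reset):

    # remove the temporary binary so later runs don't pick up a stale kubeadm from /usr/local/bin
    rm -f /usr/local/bin/kubeadm
    command -v kubeadm   # should resolve to the packaged /usr/bin/kubeadm again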

@SpComb (Contributor, Author) commented Jul 13, 2018

pharos-cluster reset also does not remove the cri-tools package, leaving crictl installed and breaking kubeadm join on kube 1.10 for the default docker CRI per kubernetes/kubeadm#657
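
A sketch of the extra reset step this implies (which package manager applies depends on the host OS):

    # remove cri-tools on reset so a later kube 1.10 kubeadm join doesn't trip over crictl
    apt-get purge -y cri-tools 2>/dev/null || yum remove -y cri-tools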

@SpComb (Contributor, Author) commented Jul 13, 2018

With kubeadm now depending on the separate cri-tools package, our own cri-o package should no longer include /usr/local/bin/crictl?

@SpComb (Contributor, Author) commented Jul 25, 2018

Upgrading a CentOS 7 worker node from 1.10 -> 1.11 with kubeadm upgrade node config leaves the kubelet configured with the default cgroupDriver: cgroupfs:

Jul 25 11:36:10 terom-centos-test kubelet[5939]: F0725 11:36:10.541309    5939 server.go:262] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

Per https://kubernetes.io/docs/setup/independent/install-kubeadm/#configure-cgroup-driver-used-by-kubelet-on-master-node kubeadm should be detecting it and configuring it automatically, but this isn't happening:

When using Docker, kubeadm will automatically detect the cgroup driver for the kubelet and set it in the /var/lib/kubelet/kubeadm-flags.env file during runtime.

[root@terom-centos-test ~]# cat /var/lib/kubelet/kubeadm-flags.env
cat: /var/lib/kubelet/kubeadm-flags.env: No such file or directory
[root@terom-centos-test ~]# grep cgroupDriver /var/lib/kubelet/config.yaml
cgroupDriver: cgroupfs

EDIT: seems like this might be a kubeadm upgrade node config bug - after a pharos-cluster reset and fresh kubeadm join, the cgroup-driver configuration is different:

[root@terom-centos-test ~]# cat /var/lib/kubelet/kubeadm-flags.env 
KUBELET_KUBEADM_ARGS=--cgroup-driver=systemd --cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni

The workaround is to have pharos-cluster itself configure --cgroup-driver=systemd via the systemd unit's KUBELET_EXTRA_ARGS, as sketched below.
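
A sketch of that workaround as a systemd dropin (the dropin filename is an assumption; the flag value comes from the error above):

    # pass the detected cgroup driver to the kubelet via KUBELET_EXTRA_ARGS
    cat <<'EOF' > /etc/systemd/system/kubelet.service.d/05-pharos.conf
    [Service]
    Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=systemd"
    EOF
    systemctl daemon-reload && systemctl restart kubelet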

@SpComb (Contributor, Author) commented Jul 25, 2018

Oops, the default EnvironmentFile=-/etc/sysconfig/kubelet (or /etc/default/kubelet) shipping an empty KUBELET_EXTRA_ARGS= now overrides the Environment="KUBELET_EXTRA_ARGS=..." setting in the systemd dropin.
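
Given that override, a sketch of a placement that survives the packaged defaults is to write the flag into the env file the unit already reads (paths as in the comment above; whether pharos-cluster ends up doing exactly this is not settled here):

    # EL7 path shown; a deb-based host would use /etc/default/kubelet instead
    echo 'KUBELET_EXTRA_ARGS=--cgroup-driver=systemd' > /etc/sysconfig/kubelet
    systemctl restart kubelet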

@jakolehm (Contributor) commented Aug 3, 2018

CoreDNS being a multiarch image is causing headaches. First I tried to build coredns into separate repos for each architecture (like every other k8s image), but kubeadm wants to use the multiarch image, which breaks the upgrade because the image pull does not work.

Then I tried to actually create a multiarch repo (with https://github.com/estesp/manifest-tool), but it seems that quay.io does not support v2.2 manifests (they are currently implementing it, see moby/buildkit#409 (comment)).
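
For reference, the attempted multiarch push would look roughly like this (a sketch; the exact manifest-tool invocation and the arm64 repository name are assumptions, and quay.io currently rejects the resulting v2.2 manifest list):

    # push a manifest list that points at the per-arch images under a single name
    manifest-tool push from-args \
      --platforms linux/amd64,linux/arm64 \
      --template quay.io/kontena/coredns-ARCH:1.1.3 \
      --target quay.io/kontena/coredns:1.1.3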

@jakolehm (Contributor) commented Aug 6, 2018

kubeadm wants to use multiarch -> breaks upgrade because image pull does not work.

Not completely sure if this is actually true. Based on the logs it seems that the CoreDNS addon is applied, but then the upgrade hangs when it tries to remove the kube-dns deployment.

    [bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
    [bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
    [bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
    [addons] Applied essential addon: CoreDNS
... HANGS ...

@jakolehm (Contributor) commented Aug 6, 2018

The reason it halts there is that kubeadm waits for the coredns replicas to become ready... which cannot happen because of the invalid image name: https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/phases/upgrade/postupgrade.go#L141-L149

Review thread on the ConfigureDNS phase diff:

    @@ -6,7 +6,8 @@ class ConfigureDNS < Pharos::Phase
       title "Configure DNS"

       def call
    -    patch_kubedns(
    +    patch_deployment(
    +      'coredns',

A contributor commented on the diff:

Does this kube-dns -> coredns change cause downtime when upgrading from Pharos 1.2?

Reply: Nope, it should be a smooth ride (the kube-dns deployment is removed after coredns is running).

Review thread on the coredns deployment patch call:

    }
    ]
    }
    Pharos::Kube.session(@master.api_address).resource_client('apps/v1').patch_deployment(

A contributor asked on the diff:

Patch should be ok here, right?

@SpComb (Contributor, Author) replied:

Yes, the PATCH matches what kubectl set image does:

$ kubectl -v8 -n kube-system set image deployments/coredns coredns=quay.io/kontena/coredns-amd64:1.1.3
...
I0807 11:11:05.700941    8436 request.go:874] Request Body: {"spec":{"template":{"spec":{"$setElementOrder/containers":[{"name":"coredns"}],"containers":[{"image":"quay.io/kontena/coredns-amd64:1.1.3","name":"coredns"}]}}}}
I0807 11:11:05.700985    8436 round_trippers.go:383] PATCH https://167.99.39.233:6443/apis/extensions/v1beta1/namespaces/kube-system/deployments/coredns
...

Not entirely sure what the $setElementOrder is doing, but it seems unnecessary. Apart from that the container image part is identical.

A contributor replied:

Just a note that this also changes how the dns replicas/maxSurge/etc. are sent.

@SpComb (Contributor, Author) replied:

Yes, those are simple object-level PATCHes, no array merge needed. Verified that the end result is as intended.

@jakolehm (Contributor) commented Aug 7, 2018

@SpComb I think we should merge this and fix any remaining issues in separate PRs.

Review thread on the podAntiAffinity selector:

    {
      key: "k8s-app",
      operator: "In",
      values: [name]

@SpComb (Contributor, Author) commented on this diff, Aug 7, 2018:

The podAntiAffinity is broken; the pods get scheduled on the same node:

NAME                                          READY     STATUS    RESTARTS   AGE       IP               NODE
coredns-5c7c9977c-7dww4                       1/1       Running   0          4m        10.32.0.216      terom-pharos-master
coredns-5c7c9977c-pn5k4                       1/1       Running   0          4m        10.32.0.215      terom-pharos-master

The coredns deployment uses a k8s-app: kube-dns label, so the value here shouldn't be coredns.
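
A quick check of both the label and the spread after fixing the selector (a sketch):

    # coredns pods carry the k8s-app=kube-dns label, so the anti-affinity selector must match that;
    # the two replicas should land on different nodes once the selector is fixed
    kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide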

@SpComb (Contributor, Author) replied:

Fixed:

NAME                                          READY     STATUS    RESTARTS   AGE       IP               NODE
coredns-dcb4c7ddd-pn96m                       1/1       Running   0          1m        10.32.1.8        terom-xenial-test
coredns-dcb4c7ddd-tk2dm                       1/1       Running   0          1m        10.32.2.29       terom-bionic-test

@SpComb (Contributor, Author) commented Aug 7, 2018

A quick smoke test of this is passing, so it should be good enough to merge; we can continue testing in master together with the other changes.

CI needs #504 to pass.

The CoreDNS image hack is unfortunate, but I don't see how to avoid it.

@jakolehm changed the title from "[WiP] Upgrade to kubernetes 1.11" to "Upgrade to kubernetes 1.11" on Aug 7, 2018
@jakolehm merged commit 71bb4de into master on Aug 7, 2018
@jakolehm deleted the feature/kubernetes-1.11 branch on August 7, 2018 at 13:23
@jakolehm mentioned this pull request on Aug 7, 2018
@jakolehm mentioned this pull request on Aug 30, 2018