Add Typhoon for Fedora Atomic as an alpha #199

dghubble · 2018-04-26T15:20:41Z

https://typhoon.psdn.io/announce/#april-26-2018

* Update manifests for Kubernetes v1.10.0
* Update etcd from v3.3.2 to v3.3.3
* Add disk_type optional variable on AWS
* Remove redundant kubeconfig copy on AWS
* Distribute etcd secrets only to controllers
* Organize module variables and ssh steps

* Mount /opt/cni/bin in kubelet system container so CNI plugin binaries can be found. Before, flannel worked because the kubelet falls back to the flannel plugin baked into the hyperkube (undesired)
* Move the CNI bin install location later, since /opt changes may be lost between ostree rebases

* Fix kubelet port-forward on Google Cloud / Fedora Atomic
* Mount the host's /etc/hosts in kubelet system containers
* Problem: kubelet runc system containers on Atomic were not mounting the host's /etc/hosts, like rkt-fly does on Container Linux. `kubectl port-forward` calls socat with localhost. DNS servers on AWS, DO, and in many bare-metal environments resolve localhost to the caller as a convenience. Google Cloud notably does not, nor is it required to, and this surfaced the missing /etc/hosts in runc kubelet namespaces.
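
To make the failure mode above concrete, here is a minimal sketch of a hosts-file lookup: if no /etc/hosts entry maps localhost to a loopback address, the name falls through to DNS, which may or may not answer. The function names are hypothetical and this is not code from the PR; it only illustrates the check one might run inside the kubelet's mount namespace.

```python
import ipaddress


def resolve_from_hosts(hosts_text, name):
    """Return the addresses that hosts(5)-style text maps to `name`."""
    addresses = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        address, *names = entry.split()
        if name in names:
            addresses.append(address)
    return addresses


def localhost_is_loopback(hosts_text):
    """True if localhost has at least one entry and all entries are loopback."""
    addresses = resolve_from_hosts(hosts_text, "localhost")
    return bool(addresses) and all(
        ipaddress.ip_address(a).is_loopback for a in addresses
    )
```

Feeding it the contents of the container's /etc/hosts (for example via `open("/etc/hosts").read()`) would report whether socat's `localhost` target can be resolved without relying on the platform's DNS servers.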

* Use the upstream bootkube image packaged with the required metadata to be usable as a system container under systemd
* Run bootkube with runc so no host-level components use Docker anymore. Docker is still the runtime
* Remove bootkube script and old systemd unit

* http://www.projectatomic.io/blog/2018/04/fedora-atomic-20-apr-18/
* Atomic publishes nightly AMIs which sometimes don't boot or have issues. Until there is a source of reliable AMIs, pin the best known working AMI
* Rel 66a66f0d18544591ffdbf8fae9df790113c93d72

* (containerized) kube-proxy warns that it is unable to load the ip_vs kernel module despite having the correct mounts. Atomic uses an xz-compressed module, and modprobe in the container was not compiled with compression support
* Work around the issue for now by always loading ip_vs on-host
* kubernetes/kubernetes#60
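
A rough sketch of how the on-host workaround could be verified: parse /proc/modules to see whether ip_vs is already loaded, and render a modules-load.d style file (one module name per line) for anything that is missing. The commit does not say how ip_vs is loaded on the host, so the modules-load.d approach and all function names here are assumptions for illustration.

```python
def loaded_modules(proc_modules_text):
    """Parse /proc/modules content into a set of loaded module names."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def missing_modules(required, proc_modules_text):
    """Return the required modules that are not currently loaded, in order."""
    loaded = loaded_modules(proc_modules_text)
    return [name for name in required if name not in loaded]


def modules_load_conf(modules):
    """Render a modules-load.d style file: one module name per line."""
    return "".join(f"{name}\n" for name in modules)
```

Note that modules compiled into the kernel do not appear in /proc/modules, so an entry reported as missing is only a hint, not proof that IPVS is unavailable.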

* Observed frequent kube-scheduler and controller-manager restarts with Calico as the CNI provider. Root cause was unclear since the control plane was functional and tests of pod-to-pod network connectivity passed
* Root cause: Calico sets up cali* and tunl* network interfaces for containers on hosts. NetworkManager tries to manage these interfaces and periodically disconnected veth pairs. Logs did not surface this issue since it's not an error per se, just Calico and NetworkManager dueling for control. Kubernetes correctly restarted pods failing health checks and ensured 2 replicas were running, so the control plane functioned mostly normally. Pod-to-pod connectivity was only affected occasionally. Pain to debug.
* Solution: Configure NetworkManager to ignore the Calico ifaces per Calico's recommendation. Cloud-init writes files after NetworkManager starts, so a restart is required on first boot. On subsequent boots, the file is present so no restart is needed
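
The first-boot versus later-boot behavior in the solution above can be sketched as an idempotent write that reports whether a NetworkManager restart is needed. The file content follows Calico's documented recommendation for marking cali* and tunl* interfaces as unmanaged; the function names and the use of a temporary directory are assumptions for illustration (on a real host the file would typically live under /etc/NetworkManager/conf.d/), and the PR itself does this through cloud-init rather than Python.

```python
import os
import tempfile

# Tell NetworkManager to leave Calico's interfaces alone.
CALICO_NM_CONF = (
    "[keyfile]\n"
    "unmanaged-devices=interface-name:cali*;interface-name:tunl*\n"
)


def ensure_unmanaged_conf(path, content=CALICO_NM_CONF):
    """Write the config if absent or different; return True if a restart is needed."""
    try:
        with open(path) as existing:
            if existing.read() == content:
                return False  # Already in place, e.g. on a subsequent boot.
    except FileNotFoundError:
        pass
    with open(path, "w") as conf:
        conf.write(content)
    return True


def demo():
    """Simulate a first boot and a later boot against a temporary directory."""
    with tempfile.TemporaryDirectory() as conf_dir:
        path = os.path.join(conf_dir, "calico.conf")
        first_boot = ensure_unmanaged_conf(path)
        later_boot = ensure_unmanaged_conf(path)
    return first_boot, later_boot
```

Under these assumptions, `demo()` reports that only the first call requires a restart, which mirrors the restart-once behavior the commit describes.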

dghubble added 30 commits April 21, 2018 18:46

Add fedora-cloud module for Digital Ocean

485586e

Add fedora-cloud module for AWS

3610da8

Add bare-metal Fedora Atomic module

ddc75e9

* Several known hacks and broken areas
* Download v1.10 Kubelet from release tarball
* Install flannel CNI binaries to /opt/cni
* Switch SELinux to Permissive
* Disable firewalld service
* port-forward won't work, socat missing

Change DO Fedora module to fedora-atomic

4e43b2f

Change AWS Fedora module to fedora-atomic

9969c35

Use etcd system container on fedora-atomic

8d7cfc1

* Use the upstream etcd image packaged with the required metadata to be usable as a system container (runc) under systemd

Use kubelet system container on fedora-atomic

19bc5ae

* Use the upstream hyperkube image packaged with the required metadata to be usable as a system container under systemd
* Fix port-forward since socat is included

Update control plane manifests and add etcd metrics

f990473

* Enable etcd v3.3 metrics to expose metrics for scraping by Prometheus
* Use k8s.gcr.io instead of gcr.io/google_containers
* Add flexvolume plugin mount to controller manager
* Update kube-dns from v1.14.8 to v1.14.9

Temporarily pin Fedora Atomic AMI

5212684

* Atomic has published AMI images that shut down immediately after being powered on

Update Fedora Atomic modules to Kubernetes v1.10.1

b3cf950

Name ostree remote repo fedora-atomic across platforms

cf22e70

Add cloud-metadata.service on AWS fedora-atomic

24d2305

Add Google Cloud fedora-atomic module

2b74aba

* Network load balancer for ingress doesn't work yet because Compute Engine packages are missing
* port-forward / socat is broken

Enable kubelet allocatable enforcement and QoS cgroup hierarchy

e148552

* Change kubelet system image to use --cgroups-per-qos=true (default) instead of false
* Change kubelet system image to use --enforce-node-allocatable=pods instead of an empty string

Mount host's /etc/os-release in kubelet system containers

3dde4ba

* Fix `kubectl describe node` to reflect the host's operating system

Add atomic_assets_endpoint var for fedora-atomic bare-metal

3f29788

Fix ostree repo to be called fedora-atomic on bare-metal

f36c890

* atomic host updates were fetching updates from the repo cache fedora-atomic-27, instead of from upstream

Update Calico from v3.0.4 to v3.1.1 for Atomic

7198b90

Organize docs by operating system

af54efe

Write documentation for Fedora Atomic

cd91398

Switch to quay.io/poseidon tagged system containers

d784b0f

Add architecture docs on operating systems

b6a51d0

Add Fedora Atomic announcement and improve docs

2e4bf4d

dghubble merged commit 2e4bf4d into master Apr 26, 2018

dghubble deleted the add-fedora-atomic branch April 28, 2018 19:34
