
Multi-platform clusters #163


Closed
malaskim opened this issue Feb 13, 2017 · 15 comments

Comments

@malaskim

Hi everyone,

I'm trying to set up a multi-platform cluster with a master node on Ubuntu (amd64) and some worker nodes on Raspberry Pi 3 (ARM, with HypriotOS).
I can get the cluster set up with the tutorial here (https://kubernetes.io/docs/getting-started-guides/kubeadm/, awesome work by the way!!)

However, when I tried to start a container with
kubectl run hypriot --image=hypriot/rpi-busybox-httpd --replicas=1 --port=80
I got the following error: Error syncing pod, skipping: failed to "SetupNetwork" for "hypriot-1452852107-f11tt_default" with SetupNetworkError: "Failed to setup network for pod "hypriot-1452852107-f11tt_default(527e63f1-f1f4-11e6-b5e5-080027eb4e93)" using network plugins "cni": cni config unintialized; Skipping pod"

Meanwhile, I've successfully started flannel on the master node with this command:
kubectl create -f https://rawgit.com/coreos/flannel/master/Documentation/kube-flannel.yml
All the services (DNS, flannel, proxy, etc.) are running on the master node.

If I reproduce this setup with the master node on a Raspberry Pi (so the whole cluster is on ARM), everything works :)

As suggested in the "Multi-platform clusters" section of https://porter.io/github.com/luxas/kubernetes-on-arm, I've also tried to run the command for every architecture, but I got the following errors:
Error from server (AlreadyExists): error when creating "STDIN": serviceaccounts "flannel" already exists
Error from server (AlreadyExists): error when creating "STDIN": configmaps "kube-flannel-cfg" already exists
Error from server (AlreadyExists): error when creating "STDIN": daemonsets.extensions "kube-flannel-ds" already exists

I also figured out that after running the "kubeadm join" command on one of my Raspberry Pis, no kube-proxy or kube-flannel pod is created on the RPi3 when the master node is amd64, whereas they are indeed created if the master node is on the same platform (i.e. another RPi3).

Looks like I'm missing something about the multi-architecture configuration here... does anyone have any idea?

Thank you for your help!

@GheRivero

The main issue is that the flannel manifest has the amd64 arch hard-coded into it:

  • quay.io/coreos/flannel:v0.7.0-amd64

I know there are some efforts to support multi-platform images (also called "manifest lists" in Docker) that will allow the proper image to be downloaded using a single common name, so there is no need for one manifest per arch... but it's a WIP.

As a workaround: modify the flannel manifest locally to match the arm64 arch and see what happens (it should fail on the master node, though...)
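For reference, a minimal sketch of the edit (only the image line is shown; the surrounding DaemonSet fields are omitted, and the v0.7.0 tags are the ones mentioned in this thread):

```yaml
# kube-flannel.yml excerpt (sketch) -- the DaemonSet pins an arch-specific
# image, so workers of another architecture pull a binary they cannot run.
containers:
- name: kube-flannel
  # image: quay.io/coreos/flannel:v0.7.0-amd64   # original: runs only on amd64 nodes
  image: quay.io/coreos/flannel:v0.7.0-arm       # swapped in for 32-bit ARM workers
```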

@malaskim
Author

Thanks for your answer!

I manually changed the flannel manifest to start one instance on the master node (amd64, with quay.io/coreos/flannel:v0.7.0-amd64)
and one on the worker (ARM, with quay.io/coreos/flannel:v0.7.0-arm), but I still get an "Error syncing pod" error.

I find it weird that when I do a join from an ARM node to an amd64 master node, no kube-proxy is started on the worker node, whereas when I repeat the same on a single-architecture setup (both on ARM or both on amd64), a kube-proxy is always started on the joining worker node.
Is it possible that there is a similar hard-coding problem with the kube-proxy service when trying to deploy cross-platform?

Thanks again for your help!

@GheRivero

GheRivero commented Feb 14, 2017 via email

@malaskim
Author

Ah yes, understandable...
I can change the flannel one manually, since I can directly specify the .yml file, but the kube-proxy one is automated in the kubeadm join process, I guess. If you have any idea how I could change it, even manually, just to make it work... I'm buying :)
Thx!

PS: should I understand that this issue is related to #51, which should be solved in the 1.6 version? Apologies in advance if I'm mistaken; I'm still discovering the GitHub world...

@GheRivero

GheRivero commented Feb 14, 2017 via email

@sudhagarc

+1

Looks like support for this was pushed out of the 1.6 release. Is there any update on this issue? If anyone has solved it manually, that works for me.

@sudhagarc

Found a hack/workaround for this issue:

Before running kubeadm join from a node with a different architecture, I pulled the correct Docker image for that architecture and renamed (re-tagged) it to match the master's architecture. Then the kube-proxy container started fine.

E.g., in my case the node is a Raspberry Pi 3 (arm) and my master is x86 (amd64). I executed the following command on my node before invoking kubeadm join:
docker tag gcr.io/google_containers/kube-proxy-arm:v1.6.0 gcr.io/google_containers/kube-proxy-amd64:v1.6.0

Hope this helps someone keep going. Hoping for an official fix soon :)

@codesnk

codesnk commented May 3, 2017

@sudhagarc Is the hack still working for you? The false tagging on the Pi 3 does not work for me: the proxy and Weave Net pods go into CrashLoopBackOff. For me, the master still downloads the original kube-proxy-amd64 image, as its image ID differs from the re-tagged ARM image.

UPDATE: It works, although a bit inconsistently. I was making a mistake in tagging. The amd64 version image has an underscore (_) in its tag while the arm version has a minus (-).

@aitorhh

aitorhh commented May 3, 2017

I can summarize the problems and how I solved them to get a multi-platform cluster:

  • the architecture of the images (kube-proxy) is set based on the master node (cmd/kubeadm/app/images/images.go)
  • multi-arch images are not supported by Docker yet, but are available in the registry

So, to solve the multi-platform problem with kubeadm:

  • use a manifest tool to support Docker images with manifest schema V2; a prototype tool is available here
  • create manifests for all the kube images ("etcd" "k8s-dns-sidecar" "k8s-dns-kube-dns" "k8s-dns-dnsmasq-nanny" "kubedns" "dnsmasq-metrics" "kube-dnsmasq" "flannel" "defaultbackend" "kubernetes-dashboard" "kube-apiserver" "kube-controller-manager" "kube-scheduler" "kube-proxy") and push them to a private registry
  • modify kubeadm (cmd/kubeadm/app/images/images.go) to remove the architecture dependency, and compile it
  • use the newly compiled kubeadm (only on the master) and execute it with KUBE_REPO_PREFIX=<url-private-registry> ./kubeadm init ... (a sketch follows this list)
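As a rough sketch of the manifest step, assuming the estesp/manifest-tool prototype and a placeholder registry (the per-arch images must already be pushed there, and the version tag is illustrative):

```sh
# Sketch: stitch the per-arch kube-proxy images into one multi-arch name with
# manifest-tool (https://github.com/estesp/manifest-tool). ARCH in --template
# is expanded once per platform; myregistry.example.com is a placeholder.
manifest-tool push from-args \
  --platforms linux/amd64,linux/arm \
  --template myregistry.example.com/kube-proxy-ARCH:v1.6.0 \
  --target myregistry.example.com/kube-proxy:v1.6.0

# Then, with the patched kubeadm from the previous step:
KUBE_REPO_PREFIX=myregistry.example.com ./kubeadm init ...
```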

In the end I ended up using Weave as the virtual network backend instead of flannel; it is multi-platform out of the box.

@luxas
Member

luxas commented May 29, 2017

Please see https://github.com/luxas/kubeadm-workshop.
There's a lot of information about this there.

Also see #51, which is tracking multi-platform support for kubeadm.

@squidpickles

I was able to make this work without needing a custom image for kube-proxy, using the steps here.
Briefly, it involves modifying the DaemonSet containing kube-proxy so that it applies only to hosts matching the master node's architecture, then creating a duplicate DaemonSet that targets the worker nodes' architecture with the correct image.

It's definitely a workaround, but I like being able to use the official images for kube-proxy.
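The core of that approach is a nodeSelector on the architecture label kubelets populated on clusters of this era (beta.kubernetes.io/arch). A minimal sketch, with image names taken from earlier in this thread:

```yaml
# Sketch: one kube-proxy DaemonSet per architecture, each pinned to matching
# nodes via nodeSelector. The master-arch copy uses "amd64" and the amd64 image.
spec:
  template:
    spec:
      nodeSelector:
        beta.kubernetes.io/arch: arm
      containers:
      - name: kube-proxy
        image: gcr.io/google_containers/kube-proxy-arm:v1.6.0
```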

@sbisiaux

sbisiaux commented May 4, 2018

Any updates on this? I tried the above, editing the kube-proxy DaemonSet, but I keep getting a kernel: [ 126.609412] Internal error: Oops: 80000007 [#1] SMP ARM when the Pod starts up.

Linux kb06 4.14.34-v7+ #1110 SMP Mon Apr 16 15:18:51 BST 2018 armv7l GNU/Linux

Client:
 Version: 18.04.0-ce
 API version: 1.37
 Go version: go1.9.4
 Git commit: 3d479c0
 Built: Tue Apr 10 18:25:24 2018
 OS/Arch: linux/arm
 Experimental: false
 Orchestrator: swarm

Server:
 Engine:
  Version: 18.04.0-ce
  API version: 1.37 (minimum version 1.12)
  Go version: go1.9.4
  Git commit: 3d479c0
  Built: Tue Apr 10 18:21:25 2018
  OS/Arch: linux/arm
  Experimental: false

kubeadm version: &version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2", GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean", BuildDate:"2018-04-27T09:10:24Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/arm"}

@drake7707

I circumvented this issue by deploying an additional DaemonSet for each architecture, taken from https://gist.github.com/ssplatt/3d2f68a42e619f88dbed3244ad447708

Make sure to change the version in the image to match your deployment; a sketch follows.
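For example, something along these lines, assuming the two YAML files have been downloaded from the gist under the names used later in this thread (the sed pattern and the v1.11.0 tag are illustrative):

```sh
# Sketch: bump the kube-proxy image tag in the per-arch DaemonSets to match
# the cluster version, then apply both. Adjust file names and version as needed.
sed -i 's|\(kube-proxy-[a-z0-9]*\):v[0-9.]*|\1:v1.11.0|' kube-proxy-amd64.yaml kube-proxy-arm.yaml
kubectl apply -f kube-proxy-amd64.yaml -f kube-proxy-arm.yaml
```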

@mrpaws

mrpaws commented Jul 9, 2018

This is a known limitation, and the current official solution is documented in the bootstrapping documentation for a single-master cluster at https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#instructions :

If you join a node with a different architecture to your cluster, create a separate Deployment or DaemonSet for kube-proxy and kube-dns on the node. This is because the Docker images for these components do not currently support multi-architecture.

This will manifest itself for any non-multi-arch container image, but Kubernetes provides the flexibility to work around the problem without a hack.

You can find info on writing DaemonSets below; I'm also reposting @drake7707's gist link, which helps make the documentation more applicable to this context:

https://gist.github.com/ssplatt/3d2f68a42e619f88dbed3244ad447708
https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/#writing-a-daemonset-spec

There are a few ways to handle this, but here are the step-by-step instructions I just performed to support both arm64 and amd64 using DaemonSets and the pre-populated labels:

  1. Save the currently running kube-proxy DaemonSet configuration:
    $ kubectl get ds --namespace=kube-system kube-proxy --export -o yaml > ~/original_kube-proxy.yaml

  2. Download both the kube-proxy-amd64.yaml and kube-proxy-arm.yaml files from the gist: https://gist.github.com/ssplatt/3d2f68a42e619f88dbed3244ad447708

  3. Update the image specification line in each file to the latest version (currently 1.9.9), respectively:

image: gcr.io/google_containers/kube-proxy-arm:v1.9.9
image: gcr.io/google_containers/kube-proxy-amd64:v1.9.9

  4. Create both DaemonSets:
    $ kubectl create -f ~/kube-proxy-amd64.yaml
    $ kubectl create -f ~/kube-proxy-arm.yaml

  5. If the last step succeeded, delete the original kube-proxy DaemonSet:
    $ kubectl delete daemonset kube-proxy --namespace=kube-system

  6. If you need to roll back, recreate the original kube-proxy DaemonSet:
    $ kubectl create -f ~/original_kube-proxy.yaml

So we have basically replaced the existing kube-proxy DaemonSet, which was tied to the master's architecture, with architecture-specific kube-proxy DaemonSets.
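A quick way to sanity-check the result (standard kubectl, nothing specific to this setup): the pod counts of the two DaemonSets should add up to your node count, and each pod should land on a node of the matching architecture.

```sh
# List the DaemonSets and see where the kube-proxy pods were scheduled.
kubectl get daemonsets --namespace=kube-system -o wide
kubectl get pods --namespace=kube-system -o wide | grep kube-proxy
```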

@drake7707

drake7707 commented Jul 9, 2018

A couple of things I encountered after applying this in my v1.11.0 cluster (both are sketched below):

  • The kube-proxy DaemonSet from the gist does not set priorityClassName: system-node-critical, so it can get evicted when resource pressure is present (which I encountered on a small SD card in an RPi).

  • When I compared the latest kube-proxy deployed in my cluster by kubeadm with the one from the gist, I noticed that the --config=/var/lib/kube-proxy/config.conf argument was present in the original kube-proxy but missing from the gist version. I'm not entirely sure whether it is necessary, but during networking issues I saw the error 'can't distinguish internal and external traffic' in the kube-proxy container logs, which went away when I added this arg. My networking issues turned out to be caused by stale CNI configuration files, so I don't know whether this had any effect.
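For what it's worth, a sketch of those two additions merged into the gist's DaemonSet pod spec (field placement follows a kubeadm-generated kube-proxy of roughly this era; treat it as illustrative):

```yaml
spec:
  template:
    spec:
      priorityClassName: system-node-critical   # protects the pod from eviction under pressure
      containers:
      - name: kube-proxy
        command:
        - /usr/local/bin/kube-proxy
        - --config=/var/lib/kube-proxy/config.conf   # config mounted from the kubeadm ConfigMap
```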
