
Working with Hypriot 1.0 and kubernetes-on-arm 0.7.x #118

Open
saturnism opened this issue Aug 25, 2016 · 17 comments
saturnism commented Aug 25, 2016

A few things I found that needed to change when working w/ Hypriot 1.0:

  1. The hostname needs to be updated in /boot/device-init.yaml
  2. The docker-flannel overlay was not installed by default
  3. flannel.service needs After=system-docker.service to survive restarts
  4. I still need to run kubelet with --containerized to use NFS PVs, with the docker volume mount -v /:/rootfs:ro
  5. Need to append cgroup_enable=cpuset to /boot/cmdline.txt
  6. Need to enable MTU probing to avoid docker pull problems: append net.ipv4.tcp_mtu_probing=1 to /etc/sysctl.conf (optionally tune swappiness? see http://a.frtzlr.com/kubernetes-on-raspberry-pi-3-the-missing-troubleshooting-guide/)
  7. PV recycling doesn't work, since it uses a non-ARM busybox image by default to recycle a volume. Still figuring out how to replace the image.
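Items 5 and 6 above can be sketched as a small shell helper. This is a rehearsal-friendly sketch, not the project's own tooling (enable_k8s_tweaks is a hypothetical name); the demo at the end runs against scratch copies under /tmp so nothing real is touched:

```shell
# Hypothetical helper covering items 5 and 6. On a real node you would run
# enable_k8s_tweaks /boot/cmdline.txt /etc/sysctl.conf as root, then reboot.
enable_k8s_tweaks() {
  cmdline="$1"; sysctl_conf="$2"
  # 5. /boot/cmdline.txt is a single line of kernel parameters; append once.
  grep -q 'cgroup_enable=cpuset' "$cmdline" || \
    sed -i 's/$/ cgroup_enable=cpuset/' "$cmdline"
  # 6. MTU probing works around docker pulls hanging when the path MTU shrinks.
  grep -q 'net.ipv4.tcp_mtu_probing' "$sysctl_conf" || \
    echo 'net.ipv4.tcp_mtu_probing=1' >> "$sysctl_conf"
}

# Dry run against scratch copies:
printf 'console=tty1 root=/dev/mmcblk0p2 rootwait\n' > /tmp/cmdline.txt
printf '# sysctl defaults\n' > /tmp/sysctl.conf
enable_k8s_tweaks /tmp/cmdline.txt /tmp/sysctl.conf
cat /tmp/cmdline.txt
# → console=tty1 root=/dev/mmcblk0p2 rootwait cgroup_enable=cpuset
```

Both edits are guarded by grep, so re-running the helper (e.g. from a provisioning script) won't duplicate the settings.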
ghost commented Aug 31, 2016

Thanks for this. I'm having issues running a mixed Pi 2/Pi 3 cluster: the OS is fine, but hyperkube is the only container running, and there's an issue connecting to the API server.

Can you please explain how to implement 2.?

luxas (Owner) commented Aug 31, 2016

skipping pod synchronization - [Failed to start ContainerManager system validation failed - Following Cgroup subsystem not mounted: [cpuset] container runtime is down]

@pakeha-kiwi You have to set cgroup_enable=cpuset in /boot/cmdline.txt

erikthorselius commented Aug 31, 2016

How do i fix the "docker-flannel overlay was not installed by default" problem?

It looks like they are installed:

systemd-delta --type=extended
[EXTENDED]   /lib/systemd/system/docker.service → /etc/systemd/system/docker.service.d/overlay.conf
[EXTENDED]   /lib/systemd/system/docker.service → /usr/lib/systemd/system/docker.service.d/docker-flannel.conf

But I can't see any trace of it when the process is running, and the network layer doesn't route correctly.

ps waux |grep fd
root     18232  1.3  4.1 1001548 36484 ?       Ssl  23:03   0:04 /usr/bin/dockerd --storage-driver overlay -H fd://
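For comparison, when the flannel drop-in works, it typically sources flannel's subnet file and passes --bip/--mtu to dockerd, which is why the daemon line above (with neither flag) suggests the drop-in never took effect. The exact mechanism depends on the kubernetes-on-arm version, so treat this as a sketch of the idea, not the project's actual drop-in; parse_flannel_env is a hypothetical helper, exercised here on a fake copy of the file:

```shell
# flanneld writes its allocation to /run/flannel/subnet.env (FLANNEL_SUBNET,
# FLANNEL_MTU). A docker drop-in usually sources it so dockerd starts like:
#   dockerd --storage-driver overlay -H fd:// --bip="$FLANNEL_SUBNET" --mtu="$FLANNEL_MTU"
# Minimal reader for that file:
parse_flannel_env() {
  . "$1"                              # sets FLANNEL_SUBNET and FLANNEL_MTU
  echo "$FLANNEL_SUBNET $FLANNEL_MTU"
}
printf 'FLANNEL_SUBNET=10.1.86.1/24\nFLANNEL_MTU=1472\n' > /tmp/subnet.env
parse_flannel_env /tmp/subnet.env     # → 10.1.86.1/24 1472
```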

@erikthorselius

Looks like the drop-in fails when removing the interface.

systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; enabled)
  Drop-In: /usr/lib/systemd/system/docker.service.d
           └─docker-flannel.conf
        /etc/systemd/system/docker.service.d
           └─overlay.conf
   Active: active (running) since Wed 2016-08-31 23:47:48 EEST; 2min 33s ago
     Docs: https://docs.docker.com
  Process: 3091 ExecStartPre=/bin/sh -c ifconfig docker0 down; brctl delbr docker0 (code=exited, status=1/FAILURE)

luxas (Owner) commented Aug 31, 2016

Is brctl installed?

@erikthorselius

yes

$ brctl
Usage: brctl [commands]
commands:
    addbr       <bridge>        add bridge
    delbr       <bridge>        delete bridge
    addif       <bridge> <device>   add interface to bridge
    delif       <bridge> <device>   delete interface from bridge
    hairpin     <bridge> <port> {on|off}    turn hairpin on/off
    setageing   <bridge> <time>     set ageing time
    setbridgeprio   <bridge> <prio>     set bridge priority
    setfd       <bridge> <time>     set bridge forward delay
    sethello    <bridge> <time>     set hello time
    setmaxage   <bridge> <time>     set max message age
    setpathcost <bridge> <port> <cost>  set path cost
    setportprio <bridge> <port> <prio>  set port priority
    show        [ <bridge> ]        show a list of bridges
    showmacs    <bridge>        show a list of mac addrs
    showstp     <bridge>        show bridge stp info
    stp         <bridge> {on|off}   turn stp on/off

@erikthorselius

I'm new to Kubernetes and flannel, but from what I can see in ifconfig it looks like the usual docker0 interface.

ifconfig
docker0   Link encap:Ethernet  HWaddr 02:42:62:14:64:0f
          inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

but the containers look like they are alive and kicking

docker -H unix:///var/run/system-docker.sock ps
CONTAINER ID        IMAGE                     COMMAND                  CREATED             STATUS              PORTS               NAMES
d33d6ae229b3        kubernetesonarm/etcd      "/usr/local/bin/etcd "   55 minutes ago      Up 55 minutes                           k8s-etcd
fd596c1f48f9        kubernetesonarm/flannel   "/flanneld --etcd-end"   55 minutes ago      Up 55 minutes                           k8s-flannel

ghost commented Sep 1, 2016

thanks @luxas, that got etcd/flannel going, so my cluster works now

  1. it seems I don't have the kubectl binary, but that was mentioned in another thread I think
  2. if you're using a remote kubectl, what are the creds to interrogate the API server on https://
  3. how do you figure out what port the Dashboard is NATed to?

luxas (Owner) commented Sep 1, 2016

The kubectl binary is downloadable with:

curl -sSL https://storage.googleapis.com/kubernetes-release/release/v1.2.0/bin/linux/arm/kubectl > /usr/local/bin/kubectl

To use remote kubectl, just set KUBERNETES_MASTER=http://{rpi-ip}:8080 or use -s http://{rpi-ip}:8080

For dashboard, visit http://{master-ip}:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
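Two small notes on the curl command above: the downloaded binary still needs the execute bit, and the release URL follows a predictable pattern if you need a different version. The kubectl_url helper below is purely illustrative (not part of any tool), with the pattern taken from the command above:

```shell
# Build the download URL for a given release and architecture
# (hypothetical helper; pattern from the curl command above):
kubectl_url() {
  echo "https://storage.googleapis.com/kubernetes-release/release/$1/bin/linux/$2/kubectl"
}
kubectl_url v1.2.0 arm
# → https://storage.googleapis.com/kubernetes-release/release/v1.2.0/bin/linux/arm/kubectl

# On the node (needs network access), don't forget the execute bit:
#   curl -sSL "$(kubectl_url v1.2.0 arm)" > /usr/local/bin/kubectl
#   chmod +x /usr/local/bin/kubectl
```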

@erikthorselius

I solved my problem with flannel by changing the start of docker-flannel.conf to

[Unit]
After=flannel.service
Requires=flannel.service

Don't know why it didn't work before, but now it works...

@erikthorselius

After doing it on the rest of the cluster I realized I was wrong.

sudo systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; enabled)
  Drop-In: /usr/lib/systemd/system/docker.service.d
           └─docker-flannel.conf
        /etc/systemd/system/docker.service.d
           └─overlay.conf
   Active: active (running) since Wed 2016-08-31 17:39:56 EEST; 17h ago
     Docs: https://docs.docker.com
  Process: 455 ExecStartPre=/bin/sh -c ifconfig docker0 down; brctl delbr docker0 (code=exited, status=1/FAILURE)
 Main PID: 471 (dockerd)
....
HypriotOS/armv7: pirate@cluster-node02 in ~
$ sudo rm /etc/systemd/system/docker.service.d/overlay.conf
HypriotOS/armv7: pirate@cluster-node02 in ~
$ sudo systemctl daemon-reload
HypriotOS/armv7: pirate@cluster-node02 in ~
$ sudo ifconfig docker0
docker0   Link encap:Ethernet  HWaddr 02:42:ff:18:2d:84
          inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::42:ffff:fe18:2d84/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:31 errors:0 dropped:0 overruns:0 frame:0
          TX packets:36 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2865 (2.7 KiB)  TX bytes:5462 (5.3 KiB)

HypriotOS/armv7: pirate@cluster-node02 in ~
$ sudo systemctl restart docker
HypriotOS/armv7: pirate@cluster-node02 in ~
$ ifconfig docker0
docker0   Link encap:Ethernet  HWaddr 02:42:93:2b:eb:d7
          inet addr:10.1.86.1  Bcast:0.0.0.0  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

Now the network is handed over between docker and flannel.
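That handover can be expressed as a quick check: docker0 should share its first three octets with flannel's allocation instead of the default 172.17.0.1. in_flannel_subnet is a hypothetical helper that deliberately assumes a /24 mask, as in the ifconfig output above:

```shell
# True when addr sits in the /24 flannel allocated (prefix comparison only;
# this assumes a /24 mask, matching the Mask:255.255.255.0 shown above).
in_flannel_subnet() {
  addr="$1"; subnet="$2"            # e.g. 10.1.86.1 and 10.1.86.0/24
  [ "${addr%.*}" = "${subnet%.*}" ] # drop the last dot-component from each
}
in_flannel_subnet 10.1.86.1 10.1.86.0/24 && echo "bridge readdressed by flannel"
in_flannel_subnet 172.17.0.1 10.1.86.0/24 || echo "still on docker's default bridge"
```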

bialad commented Sep 1, 2016

Will the fixes from this issue be included in v0.8.0?

luxas (Owner) commented Sep 2, 2016

@bialad v0.8.0 will use my "official" code, docker-multinode.

You may test that as well if you want to.

bialad commented Sep 2, 2016

@luxas I've actually been running my RPi cluster using docker-multinode so far, but since that repo doesn't use a release schedule, I've had some issues with bugs because I'm always pulling the latest commit when booting my RPis. I figured I'd use this repo for setting up my core RPi cluster, and docker-multinode as a way to create amd64 worker nodes as VMs on a Windows server. Don't know if that's even possible, but time will tell. ;)

What do you mean by "my official" though? I've viewed this repo as a wrapper for kube-deploy, with stable releases. That's why I'm hoping for Hypriot 1.0 stability in the v0.8.0 release.

ebagdasa commented Oct 9, 2016

I didn't find brctl in the .deb installation; should I install it manually?
Same goes for wirte.sh

@MathiasRenner (Contributor)

For me, after a reboot of a worker node, the routing is broken (all the K8s containers that were running before the reboot come up again, so that part is fine).

Of the list in the first post of this thread, flannel seems to be the only thing that matters for me (I don't mount anything, cgroup_enable=cpuset is set, docker pull works fine, etc.). How do I get 2. and 3. implemented? ping @saturnism

@erikthorselius Where does the docker-flannel.conf you mention reside? I can't find it in /etc/systemd/system/, and there's no flannel container running.

Here are some logs:

saturnism changed the title from "Working with Hypriot 1.0" to "Working with Hypriot 1.0 and kubernetes-on-arm 0.7.x" on Nov 16, 2016
@saturnism (Author)

Ah, I think I know where some of the confusion w/ flannel is coming from. I was working w/ kubernetes-on-arm 0.7. The flannel configuration was here:
https://github.com/luxas/kubernetes-on-arm/blob/release-0.7/sdcard/rootfs/kube-systemd/etc/kubernetes/dropins/docker-flannel.conf

I think it's changed since 0.8
