
Minikube does not work on Travis arm64 with LXD #11845

Open
briandealwis opened this issue Jun 30, 2021 · 5 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done.

Comments

@briandealwis
Contributor

briandealwis commented Jun 30, 2021

I've been struggling to get Minikube to work with Travis's arm64 runners. These are run within LXD images on Packet.net (now Equinix) machines. I have no ability to change the configuration of these runners.

The runners do have Docker installed.
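
As a quick sanity check (a sketch, not taken from the original build logs), the LXD environment can be confirmed from inside a runner:

systemd-detect-virt --container       # prints "lxc" inside an LXD container
grep -a container= /proc/1/environ    # typically shows container=lxc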

Using the docker driver fails

The docker driver (GoogleContainerTools/container-debug-support@ff8acb3) fails (full log) due to #6411:

X Exiting due to GUEST_PROVISION_EXIT_UNEXPECTED: Failed to start host: creating host: create: creating: create kic node: container name "minikube": log: 2021-06-28T13:39:59.446787233Z  + echo 'INFO: remounting /sys read-only'
2021-06-28T13:39:59.446796083Z  INFO: remounting /sys read-only
2021-06-28T13:39:59.446829357Z  + mount -o remount,ro /sys
2021-06-28T13:39:59.449462857Z  mount: /sys: permission denied.: container exited unexpectedly
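
One way to confirm this is an LXD restriction rather than anything minikube-specific (a sketch, not from the original logs) is to run the failing step from the KIC entrypoint directly on the runner:

docker run --rm --privileged ubuntu:20.04 sh -c 'mount -o remount,ro /sys'
# should fail the same way on these runners: mount: /sys: permission denied.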

Using the none driver fails

After a lot of trial and error (GoogleContainerTools/container-debug-support@ba6472d), my command line is:

TERM=dumb sudo -E $HOME/bin/minikube start --driver=none --alsologtostderr -v=5 --feature-gates="LocalStorageCapacityIsolation=false" --extra-config=kubelet.protect-kernel-defaults=true

But:

  • the apiserver doesn't seem to be starting or listening: the kubelet complains trying to retrieve values,
  • the kubelet dies trying to access files in /proc/sys.

Disabling LocalStorageCapacityIsolation is required as I was getting errors like kubelet.go:1367] Failed to start ContainerManager failed to get rootfs info: failed to get device for dir "/var/lib/kubelet": could not find device with major: 0, minor: 60 in cached partitions map, and turning off the feature gate skips this check.

The --extra-config=kubelet.protect-kernel-defaults=true was an attempt to work around errors from the kubelet trying to set specific sysctl values via /proc/sys. With that option the kubelet no longer tries to write the values, but it now fails because the values it reads back don't match what it expects.
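
For reference, the three values the kubelet checks can be read with sysctl (a sketch; the actual values shown are taken from the kubelet error further down):

sysctl kernel.panic_on_oops kernel.panic vm.overcommit_memory
# kernel.panic_on_oops = 0
# kernel.panic = 0
# vm.overcommit_memory = 0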

At this point, the kubelet fails like the following (full log):

Jun 30 15:15:01 travis-job-googlecontaine-container-debu-520557790 kubelet[21048]: I0630 15:15:01.432806   21048 apiserver.go:43] Waiting for node sync before watching apiserver pods
Jun 30 15:15:01 travis-job-googlecontaine-container-debu-520557790 kubelet[21048]: E0630 15:15:01.433777   21048 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://control-plane.minikube.internal:8443/api/v1/services?limit=500&resourceVersion=0": dial tcp 192.168.0.14:8443: connect: connection refused
Jun 30 15:15:01 travis-job-googlecontaine-container-debu-520557790 kubelet[21048]: E0630 15:15:01.433857   21048 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://control-plane.minikube.internal:8443/api/v1/nodes?fieldSelector=metadata.name%3Dtravis-job-googlecontaine-container-debu-520557790&limit=500&resourceVersion=0": dial tcp 192.168.0.14:8443: connect: connection refused
...
Jun 30 15:09:13 travis-job-googlecontaine-container-debu-520557790 kubelet[11498]: I0630 15:09:13.177298   11498 cpu_manager.go:193] [cpumanager] starting with none policy
Jun 30 15:09:13 travis-job-googlecontaine-container-debu-520557790 kubelet[11498]: I0630 15:09:13.177332   11498 cpu_manager.go:194] [cpumanager] reconciling every 10s
Jun 30 15:09:13 travis-job-googlecontaine-container-debu-520557790 kubelet[11498]: I0630 15:09:13.177477   11498 state_mem.go:36] [cpumanager] initializing new in-memory state store
Jun 30 15:09:13 travis-job-googlecontaine-container-debu-520557790 kubelet[11498]: I0630 15:09:13.178262   11498 policy_none.go:43] [cpumanager] none policy: Start
Jun 30 15:09:13 travis-job-googlecontaine-container-debu-520557790 kubelet[11498]: F0630 15:09:13.179273   11498 kubelet.go:1367] Failed to start ContainerManager [invalid kernel flag: kernel/panic_on_oops, expected value: 1, actual value: 0, invalid kernel flag: vm/overcommit_memory, expected value: 1, actual value: 0, invalid kernel flag: kernel/panic, expected value: 10, actual value: 0]
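
On an unrestricted host the usual fix for that last failure would be to set the expected values before starting minikube (a sketch using the flags and values from the error above); in this LXD environment the writes themselves are refused:

sudo sysctl -w kernel.panic_on_oops=1
sudo sysctl -w vm.overcommit_memory=1
sudo sysctl -w kernel.panic=10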
@afbjorklund
Collaborator

See #7957 for some earlier discussion; supporting LXC/LXD is not a top priority at the moment.

@afbjorklund afbjorklund added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 30, 2021
@medyagh
Member

medyagh commented Jun 30, 2021

See #7957 for some earlier discussion; supporting LXC/LXD is not a top priority at the moment.

@afbjorklund While we don't support LXD drivers, maybe we could support the none driver on LXD? I wonder what the missing pieces are for the none driver.

@afbjorklund
Collaborator

The none driver is unlikely to work on a fake node without some hacks similar to those in the KIC image.
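
For context, the "hacks" in the KIC image are environment fix-ups performed by its entrypoint, such as the /sys remount quoted in the failing log above; a minimal illustration of that kind of step (which LXD refuses here):

echo 'INFO: remounting /sys read-only'
mount -o remount,ro /sys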

@spowelljr spowelljr added the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Jul 13, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 11, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 10, 2021
@spowelljr spowelljr added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Nov 17, 2021