add support for kvm on arm64 #9418

Closed
sgermanserrano opened this issue Oct 8, 2020 · 8 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. os/linux priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@sgermanserrano

sgermanserrano commented Oct 8, 2020

Hi, I'm trying to run minikube v1.13.1 on arm64 hardware and getting the following errors:

Steps to reproduce the issue:

  1. minikube start
😄  minikube v1.13.1 on Ubuntu 18.04 (arm64)
✨  Automatically selected the kvm2 driver
💾  Downloading driver docker-machine-driver-kvm2:
    > docker-machine-driver-kvm2.sha256: 65 B / 65 B [-------] 100.00% ? p/s 0s
    > docker-machine-driver-kvm2: 13.81 MiB / 13.81 MiB  100.00% 2.77 MiB p/s 6
💿  Downloading VM boot image ...
    > minikube-v1.13.1.iso.sha256: 65 B / 65 B [-------------] 100.00% ? p/s 0s
    > minikube-v1.13.1.iso: 173.91 MiB / 173.91 MiB [] 100.00% 5.67 MiB p/s 31s
👍  Starting control plane node minikube in cluster minikube
🔥  Creating kvm2 VM (CPUs=2, Memory=2200MB, Disk=20000MB) ...
E1008 13:41:57.019740    5494 cache.go:63] save image to file "k8s.gcr.io/etcd-arm64:3.4.13-0" -> "/home/user/.minikube/cache/images/k8s.gcr.io/etcd-arm64_3.4.13-0" failed: nil image for k8s.gcr.io/etcd-arm64:3.4.13-0: GET https://k8s.gcr.io/v2/etcd-arm64/manifests/3.4.13-0: MANIFEST_UNKNOWN: Failed to fetch "3.4.13-0" from request "/v2/etcd-arm64/manifests/3.4.13-0".
E1008 13:41:57.184743    5494 cache.go:63] save image to file "gcr.io/k8s-minikube/storage-provisioner-arm64:v3" -> "/home/user/.minikube/cache/images/gcr.io/k8s-minikube/storage-provisioner-arm64_v3" failed: nil image for gcr.io/k8s-minikube/storage-provisioner-arm64:v3: GET https://gcr.io/v2/k8s-minikube/storage-provisioner-arm64/manifests/v3: MANIFEST_UNKNOWN: Failed to fetch "v3" from request "/v2/k8s-minikube/storage-provisioner-arm64/manifests/v3".
🤦  StartHost failed, but will try again: new host: Error attempting to get plugin server address for RPC: Failed to dial the plugin server in 10s
🔥  Creating kvm2 VM (CPUs=2, Memory=2200MB, Disk=20000MB) ...
😿  Failed to start kvm2 VM. Running "minikube delete" may fix it: new host: Error attempting to get plugin server address for RPC: Failed to dial the plugin server in 10s

❌  Exiting due to DRV_CORRUPT: Failed to start host: new host: Error attempting to get plugin server address for RPC: Failed to dial the plugin server in 10s
💡  Suggestion: The VM driver exited with an error, and may be corrupt. Run 'minikube start' with --alsologtostderr -v=8 to see the error
📘  Documentation: https://minikube.sigs.k8s.io/docs/reference/drivers/

😿  If the above advice does not help, please let us know: 
👉  https://github.com/kubernetes/minikube/issues/new/choose

It looks like it is failing to download k8s.gcr.io/etcd-arm64:3.4.13-0 and gcr.io/k8s-minikube/storage-provisioner-arm64:v3 images.

It looks like this is the same problem as #9060

I'm also getting the following in the error log:

Error starting plugin binary: fork/exec /home/user/.minikube/bin/docker-machine-driver-kvm2: exec format error

Is there a docker-machine-driver-kvm2 built for arm64 architectures?

@afbjorklund
Collaborator

afbjorklund commented Oct 10, 2020

Is there a docker-machine-driver-kvm2 built for arm64 architectures?

We don't have a kvm2 driver for arm64 yet, mostly because we don't have an ISO for arm64...

Initially the only driver supported on arm64 was "none"; "docker" is around the corner, and eventually maybe "kvm2".

I think the "-arm64" in etcd is a bug (?) and the storage-provisioner should be fixed by #9334

$ kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.19.2
k8s.gcr.io/kube-controller-manager:v1.19.2
k8s.gcr.io/kube-scheduler:v1.19.2
k8s.gcr.io/kube-proxy:v1.19.2
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.13-0
k8s.gcr.io/coredns:1.7.0

(there is no "etcd-arm64" image, just the multi-arch manifest under "etcd")

@afbjorklund afbjorklund added co/kvm2-driver KVM2 driver related issues os/linux labels Oct 10, 2020
@afbjorklund afbjorklund changed the title minikube 1.13.1 error on arm64 minikube 1.13.1 error with kvm2 on arm64 Oct 10, 2020
@afbjorklund
Collaborator

$ docker pull k8s.gcr.io/etcd-arm64:3.4.13-0
Error response from daemon: manifest for k8s.gcr.io/etcd-arm64:3.4.13-0 not found: manifest unknown: Failed to fetch "3.4.13-0" from request "/v2/etcd-arm64/manifests/3.4.13-0".
$ docker pull --platform linux/arm64 k8s.gcr.io/etcd:3.4.13-0
3.4.13-0: Pulling from etcd
4000adbbc3eb: Pull complete 
c01a58061c3d: Pull complete 
7e982895c1fc: Pull complete 
3c187ad2b50e: Pull complete 
aedf21830123: Pull complete 
Digest: sha256:4ad90a11b55313b182afc186b9876c8e891531b8db4c9bf1541953021618d0e2
Status: Downloaded newer image for k8s.gcr.io/etcd:3.4.13-0
k8s.gcr.io/etcd:3.4.13-0
$ docker run k8s.gcr.io/etcd:3.4.13-0 etcd --version
running etcd on unsupported architecture "arm64" since ETCD_UNSUPPORTED_ARCH is set
etcd Version: 3.4.13
Git SHA: ae9734ed2
Go Version: go1.13.5
Go OS/Arch: linux/arm64

So I guess they just stopped providing the "compatibility" image tags?

Now the individual architectures are only known by their digests in the manifest:

{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1372,
         "digest": "sha256:bd4d2c9a19be8a492bc79df53eee199fd04b415e9993eb69f7718052602a147a",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1372,
         "digest": "sha256:4cadaf5f37998d038861c66215809eee8316c7604934e03dd86015a0f6704cd3",
         "platform": {
            "architecture": "arm",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1372,
         "digest": "sha256:cab4dc43598ed9b5c6a524dfea1abf4772a5e164c330451e5ca2635a95675aa8",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1373,
         "digest": "sha256:03d9db03b83a5991ede76dd53e19a1e46e6bf3c498ae8ace2e1be8f3d1d03ad6",
         "platform": {
            "architecture": "ppc64le",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1372,
         "digest": "sha256:dd9baad37b716232a737cd781cf65ceb9f7c541264cdde9ad3870d84e8644a49",
         "platform": {
            "architecture": "s390x",
            "os": "linux"
         }
      }
   ]
}
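
To illustrate how a client resolves one architecture from a manifest list like the one above, here is a small sketch that decodes the JSON and returns the digest for a requested platform. The type and function names are assumptions made for this example, and the embedded sample is a trimmed copy (amd64 and arm64 entries only) of the manifest shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal subset of the manifest list schema; only the fields used here.
type platform struct {
	Architecture string `json:"architecture"`
	OS           string `json:"os"`
}

type manifestEntry struct {
	Digest   string   `json:"digest"`
	Platform platform `json:"platform"`
}

type manifestList struct {
	Manifests []manifestEntry `json:"manifests"`
}

// digestFor returns the digest of the entry matching goos/arch.
// Hypothetical helper for illustration, not a registry client.
func digestFor(raw []byte, goos, arch string) (string, error) {
	var ml manifestList
	if err := json.Unmarshal(raw, &ml); err != nil {
		return "", err
	}
	for _, m := range ml.Manifests {
		if m.Platform.OS == goos && m.Platform.Architecture == arch {
			return m.Digest, nil
		}
	}
	return "", fmt.Errorf("no manifest for %s/%s", goos, arch)
}

// sample is a trimmed copy of the etcd:3.4.13-0 manifest list quoted above.
const sample = `{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    {"digest": "sha256:bd4d2c9a19be8a492bc79df53eee199fd04b415e9993eb69f7718052602a147a",
     "platform": {"architecture": "amd64", "os": "linux"}},
    {"digest": "sha256:cab4dc43598ed9b5c6a524dfea1abf4772a5e164c330451e5ca2635a95675aa8",
     "platform": {"architecture": "arm64", "os": "linux"}}
  ]
}`

func main() {
	d, err := digestFor([]byte(sample), "linux", "arm64")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(d)
}
```

This is the lookup that `docker pull --platform linux/arm64` performs on the client's behalf, which is why a single `etcd` tag is enough for every architecture.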

So we should stop adding the architecture, at least for later Kubernetes versions:

func etcd(v semver.Version, mirror string) string {
        needsArchSuffix := false
        ancient := semver.MustParseRange("<1.12.0")
        if ancient(v) {
                needsArchSuffix = true
        }
...
        return path.Join(kubernetesRepo(mirror), "etcd"+archTag(needsArchSuffix)+ev)
}

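Unfortunately, archTag hardcodes "amd64" as the only architecture that can go without a suffix, so every other architecture always gets one: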

// archTag returns a CPU architecture suffix for images
func archTag(hasTag bool) string {
        if runtime.GOARCH == "amd64" && !hasTag {
                return ":"
        }
        return "-" + runtime.GOARCH + ":"
}
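
One possible shape for a fix, as a sketch only: make the suffix depend solely on whether the Kubernetes version is old enough to need it, not on the host architecture. `archSuffix` and `etcdImage` are hypothetical names for this illustration, the architecture is passed in explicitly instead of read from `runtime.GOARCH`, and the caller is assumed to have already decided `legacy` from the version range (the "<1.12.0" check above).

```go
package main

import (
	"fmt"
	"path"
)

// archSuffix returns the separator placed between image name and tag.
// Hypothetical replacement for archTag: a "-<arch>" suffix is added only
// for legacy Kubernetes versions, on every architecture alike.
func archSuffix(arch string, legacy bool) string {
	if legacy {
		return "-" + arch + ":"
	}
	return ":"
}

// etcdImage builds the etcd image reference for a repository and tag.
func etcdImage(repo, arch, tag string, legacy bool) string {
	return path.Join(repo, "etcd"+archSuffix(arch, legacy)+tag)
}

func main() {
	// Modern versions: rely on the multi-arch manifest, no suffix.
	fmt.Println(etcdImage("k8s.gcr.io", "arm64", "3.4.13-0", false))
	// Legacy naming, shown with a tag from this thread purely for illustration.
	fmt.Println(etcdImage("k8s.gcr.io", "arm64", "3.4.3-0", true))
}
```

With this split, amd64 and arm64 hosts request the same `k8s.gcr.io/etcd:3.4.13-0` reference and the registry's manifest list selects the right image.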

@afbjorklund
Collaborator

afbjorklund commented Oct 10, 2020

This is related to not running any tests on other architectures, like arm64: #9205

We are also duplicating a fair share of the image lists in minikube, from kubeadm...

kubernetes/kubernetes@14dbfdc "kubeadm: Drop arch suffixes" (this happened in 1.12)

We fixed the kubernetes components [1] and coredns [2] earlier, but not etcd and pause:

  1. aeb1605

  2. 9e317ac

This change was new in Kubernetes 1.19, which upgraded to etcd 3.4.13 (published without the -arm64 tag).

The previous etcd version (kubernetes 1.18 and earlier) still had both tags (also with arch):

$ docker pull k8s.gcr.io/etcd-arm64:3.4.3-0
3.4.3-0: Pulling from etcd-arm64
9f9ba9541db2: Pull complete 
6feb97f21dc3: Pull complete 
de473e163c10: Pull complete 
Digest: sha256:fbc0f8b4861d23c9989edf877df7ae2533083e98c05687eb22b00422b9825c2f
Status: Downloaded newer image for k8s.gcr.io/etcd-arm64:3.4.3-0
k8s.gcr.io/etcd-arm64:3.4.3-0
$ docker images k8s.gcr.io/etcd-arm64:3.4.3-0
REPOSITORY              TAG                 IMAGE ID            CREATED             SIZE
k8s.gcr.io/etcd-arm64   3.4.3-0             ab707b0a0ea3        11 months ago       363MB
$ docker pull --platform=linux/arm64 k8s.gcr.io/etcd:3.4.3-0
3.4.3-0: Pulling from etcd
Digest: sha256:4afb99b4690b418ffc2ceb67e1a17376457e441c1f09ab55447f0aaf992fa646
Status: Downloaded newer image for k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/etcd:3.4.3-0
$ docker images k8s.gcr.io/etcd:3.4.3-0
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
k8s.gcr.io/etcd     3.4.3-0             ab707b0a0ea3        11 months ago       363MB

And "pause" still has both, since it has only been updated (at all) once.

$ docker pull k8s.gcr.io/pause-arm64:3.2
3.2: Pulling from pause-arm64
Digest: sha256:31d3efd12022ffeffb3146bc10ae8beb890c80ed2f07363515580add7ed47636
Status: Downloaded newer image for k8s.gcr.io/pause-arm64:3.2
k8s.gcr.io/pause-arm64:3.2
$ docker pull --platform=linux/arm64 k8s.gcr.io/pause:3.2
3.2: Pulling from pause
Digest: sha256:927d98197ec1141a368550822d18fa1c60bdae27b78b0c004f705f548c07814f
Status: Image is up to date for k8s.gcr.io/pause:3.2
k8s.gcr.io/pause:3.2
$ docker images | grep pause
k8s.gcr.io/pause-arm64                                             3.2                                2a060e2e7101        7 months ago        484kB
k8s.gcr.io/pause                                                   3.2                                2a060e2e7101        7 months ago        484kB

But we shouldn't need to use the arch suffixes anywhere anymore (1.12+)

@medyagh
Member

medyagh commented Oct 14, 2020

Better support for arm64 is on our roadmap, and we will be working on it in early 2021.
@ilya-zuyev is working on adding integration tests for arm64

@medyagh medyagh added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed co/kvm2-driver KVM2 driver related issues labels Oct 14, 2020
@medyagh medyagh changed the title minikube 1.13.1 error with kvm2 on arm64 add support for kvm on arm64 Oct 14, 2020
@medyagh medyagh added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Oct 14, 2020
@afbjorklund
Collaborator

afbjorklund commented Oct 15, 2020

There is #9228 for the new arm64 ISO, but no issue yet for building docker-machine-driver-kvm2 for arm64.

Most likely we need to add an "amd64" suffix somewhere in the names of the existing ISO and driver binary, as is already done for minikube?

boot2docker.iso (minikube.iso)
docker-machine-driver-kvm2

minikube-linux-amd64
minikube-linux-arm64

There is no preload support either, but that should still work by loading the images from the normal cache.

However, there is no architecture support in the current cache so it will only work for the "native" arch.

├── gcr.io
│   └── k8s-minikube
│       └── storage-provisioner_v3
├── k8s.gcr.io
│   ├── coredns_1.7.0
│   ├── etcd_3.4.13-0
│   ├── kube-apiserver_v1.19.2
│   ├── kube-controller-manager_v1.19.2
│   ├── kube-proxy_v1.19.2
│   ├── kube-scheduler_v1.19.2
│   └── pause_3.2
└── kubernetesui
    ├── dashboard_v2.0.3
    └── metrics-scraper_v1.0.4

4 directories, 10 files

@medyagh medyagh added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 15, 2020
@afbjorklund
Collaborator

However, there is no architecture support in the current cache so it will only work for the "native" arch.

Added #9593

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 13, 2021
@priyawadhwa

I'm going to close this issue as a dupe of the following, so that we can centralize the discussion around these things there:
