
Minikube installation on Azure VM - NV6 #4804

Closed
ashwini-git opened this issue Jul 18, 2019 · 15 comments
Labels
cause/nested-vm-config When nested VM's appear to play a role co/kvm2-driver KVM2 driver related issues help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/support Categorizes issue or PR as a support question.

Comments

@ashwini-git

I am working on installing Minikube on an Azure VM (Ubuntu, NV6 size).
I installed the KVM2 driver. As mentioned in the documentation, it is required to have VT-x/AMD-v virtualization enabled in the BIOS.

Question - How do I know whether the Azure NV6 VM supports VT-x/AMD-v virtualization? If it is not enabled by default, how can I enable it?

I also installed Minikube using curl, but it throws an error when I try to start Minikube.

Any help would be much appreciated.

@afbjorklund
Collaborator

Apparently it is supported on some sizes (Dv3/Ev3) of virtual machines:
https://azure.microsoft.com/sv-se/blog/nested-virtualization-in-azure/

If it is supported, then the normal VT-x/AMD-v detection should see it...
I still think we need some better documentation for this cloud setup (#4730)

@afbjorklund afbjorklund added cause/nested-vm-config When nested VM's appear to play a role kind/support Categorizes issue or PR as a support question. co/kvm2-driver KVM2 driver related issues labels Jul 18, 2019
@tstromberg
Contributor

I don't have Azure access to help, but do you mind sharing the output of:

minikube start --alsologtostderr -v=1

It'd be good to see what error shows up currently. I know that minikube w/ kvm2 and virtualbox works really well under GCE, so hopefully Azure will work similarly well.

@tstromberg tstromberg added triage/needs-information Indicates an issue needs more information in order to work on it. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. labels Jul 18, 2019
@ashwini-git
Author

WS-DSVMUser@WorkplaceSafety-KubeVM:~$ minikube start --alsologtostderr -v=1
I0723 05:00:50.074572 75943 notify.go:128] Checking for updates...

• minikube v1.2.0 on linux (amd64)
I0723 05:00:50.426594 75943 downloader.go:60] Not caching ISO, using https://storage.googleapis.com/minikube/iso/minikube-v1.2.0.iso
I0723 05:00:50.426688 75943 start.go:753] Saving config:
{
    "MachineConfig": {
        "KeepContext": false,
        "MinikubeISO": "https://storage.googleapis.com/minikube/iso/minikube-v1.2.0.iso",
        "Memory": 2048,
        "CPUs": 2,
        "DiskSize": 20000,
        "VMDriver": "kvm2",
        "ContainerRuntime": "docker",
        "HyperkitVpnKitSock": "",
        "HyperkitVSockPorts": [],
        "XhyveDiskDriver": "ahci-hd",
        "DockerEnv": null,
        "InsecureRegistry": null,
        "RegistryMirror": null,
        "HostOnlyCIDR": "192.168.99.1/24",
        "HypervVirtualSwitch": "",
        "KvmNetwork": "default",
        "DockerOpt": null,
        "DisableDriverMounts": false,
        "NFSShare": [],
        "NFSSharesRoot": "/nfsshares",
        "UUID": "",
        "GPU": false,
        "Hidden": false,
        "NoVTXCheck": false
    },
    "KubernetesConfig": {
        "KubernetesVersion": "v1.15.0",
        "NodeIP": "",
        "NodePort": 8443,
        "NodeName": "minikube",
        "APIServerName": "minikubeCA",
        "APIServerNames": null,
        "APIServerIPs": null,
        "DNSDomain": "cluster.local",
        "ContainerRuntime": "docker",
        "CRISocket": "",
        "NetworkPlugin": "",
        "FeatureGates": "",
        "ServiceCIDR": "10.96.0.0/12",
        "ImageRepository": "",
        "ExtraOptions": null,
        "ShouldLoadCachedImages": true,
        "EnableDefaultCNI": false
    }
}
I0723 05:00:50.426895 75943 cluster.go:95] Skipping create...Using existing machine configuration
I0723 05:00:50.512931 75943 cache_images.go:285] Attempting to cache image: gcr.io/k8s-minikube/storage-provisioner:v1.8.1 at /home/WS-DSVMUser/.minikube/cache/images/gcr.io/k8s-minikube/storage-provisioner_v1.8.1
I0723 05:00:50.512959 75943 cache_images.go:285] Attempting to cache image: k8s.gcr.io/kube-controller-manager:v1.15.0 at /home/WS-DSVMUser/.minikube/cache/images/k8s.gcr.io/kube-controller-manager_v1.15.0
I0723 05:00:50.512983 75943 cache_images.go:285] Attempting to cache image: k8s.gcr.io/etcd:3.3.10 at /home/WS-DSVMUser/.minikube/cache/images/k8s.gcr.io/etcd_3.3.10
I0723 05:00:50.513010 75943 cache_images.go:285] Attempting to cache image: k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64:1.14.13 at /home/WS-DSVMUser/.minikube/cache/images/k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64_1.14.13
I0723 05:00:50.513029 75943 cache_images.go:285] Attempting to cache image: k8s.gcr.io/k8s-dns-kube-dns-amd64:1.14.13 at /home/WS-DSVMUser/.minikube/cache/images/k8s.gcr.io/k8s-dns-kube-dns-amd64_1.14.13
I0723 05:00:50.513055 75943 cache_images.go:285] Attempting to cache image: k8s.gcr.io/coredns:1.3.1 at /home/WS-DSVMUser/.minikube/cache/images/k8s.gcr.io/coredns_1.3.1
I0723 05:00:50.513072 75943 cache_images.go:285] Attempting to cache image: k8s.gcr.io/k8s-dns-sidecar-amd64:1.14.13 at /home/WS-DSVMUser/.minikube/cache/images/k8s.gcr.io/k8s-dns-sidecar-amd64_1.14.13
I0723 05:00:50.512974 75943 cache_images.go:285] Attempting to cache image: k8s.gcr.io/kube-proxy:v1.15.0 at /home/WS-DSVMUser/.minikube/cache/images/k8s.gcr.io/kube-proxy_v1.15.0
I0723 05:00:50.512959 75943 cache_images.go:285] Attempting to cache image: k8s.gcr.io/kube-scheduler:v1.15.0 at /home/WS-DSVMUser/.minikube/cache/images/k8s.gcr.io/kube-scheduler_v1.15.0
I0723 05:00:50.513037 75943 cache_images.go:285] Attempting to cache image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1 at /home/WS-DSVMUser/.minikube/cache/images/k8s.gcr.io/kubernetes-dashboard-amd64_v1.10.1
I0723 05:00:50.512996 75943 cache_images.go:285] Attempting to cache image: k8s.gcr.io/kube-apiserver:v1.15.0 at /home/WS-DSVMUser/.minikube/cache/images/k8s.gcr.io/kube-apiserver_v1.15.0
I0723 05:00:50.512934 75943 cache_images.go:285] Attempting to cache image: k8s.gcr.io/pause:3.1 at /home/WS-DSVMUser/.minikube/cache/images/k8s.gcr.io/pause_3.1
I0723 05:00:50.513086 75943 cache_images.go:285] Attempting to cache image: k8s.gcr.io/kube-addon-manager:v9.0 at /home/WS-DSVMUser/.minikube/cache/images/k8s.gcr.io/kube-addon-manager_v9.0
I0723 05:00:50.513194 75943 cache_images.go:82] Successfully cached all images.
• Tip: Use 'minikube start -p ' to create a new cluster, or 'minikube delete' to delete this one.
I0723 05:00:55.152900 75943 cluster.go:114] Machine state: Error
E0723 05:00:55.152933 75943 start.go:559] StartHost: Error getting state for host: getting connection: looking up domain: virError(Code=42, Domain=10, Message='Domain not found: no domain with matching name 'minikube'')
I0723 05:00:55.153230 75943 utils.go:123] non-retriable error: Error getting state for host: getting connection: looking up domain: virError(Code=42, Domain=10, Message='Domain not found: no domain with matching name 'minikube'')
W0723 05:00:55.153297 75943 exit.go:100] Unable to start VM: Error getting state for host: getting connection: looking up domain: virError(Code=42, Domain=10, Message='Domain not found: no domain with matching name 'minikube'')

X Unable to start VM: Error getting state for host: getting connection: looking up domain: virError(Code=42, Domain=10, Message='Domain not found: no domain with matching name 'minikube'')

@ashwini-git
Author

@tstromberg above is the output when I run "minikube start --alsologtostderr -v=1".
Let me know if you can help.
Thanks

@laozc
Contributor

laozc commented Jul 24, 2019

Domain not found: no domain with matching name 'minikube' indicates that the VM was not created.
Could you run egrep --color 'vmx|svm' /proc/cpuinfo on the VM and check if nested virtualization is supported on the guest type?

I was able to run minikube on a Standard D4s v3 (4 vcpus, 16 GiB memory) VM with kvm2 driver.
https://www.brianlinkletter.com/create-a-nested-virtual-machine-in-a-microsoft-azure-linux-vm/

But for this scenario it seems a lot better to provision an Azure VM for cluster workload instead of nested virtualization.
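The cpuinfo check above can be wrapped in a small helper; this is a minimal sketch, and the name count_virt_flags is mine, not from this thread:

```shell
# Minimal sketch of the nested-virtualization check discussed above.
# vmx = Intel VT-x, svm = AMD-V; a count of 0 means the kvm2 driver has
# no hardware virtualization to work with on this guest.
count_virt_flags() {
  # $1: a cpuinfo-style file; defaults to the real /proc/cpuinfo
  grep -c -E 'vmx|svm' "${1:-/proc/cpuinfo}" || true
}
```

On a guest type without nested virtualization this prints 0, matching the "no result" that plain egrep gives.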

@ashwini-git
Author

@laozc
I get no result from egrep --color 'vmx|svm' /proc/cpuinfo, which suggests nested virtualization is not supported on the Azure NV6 GPU VM.
Do you have any details on how to provision an Azure VM for the cluster workload? Please share if you do.
Thanks for your response.
Regards

@blueelvis
Contributor

@ashwini-git - Only a limited set of VM series in Azure support nested virtualization. The following page lists them; anything marked with three stars (***) supports nested virtualization -

https://docs.microsoft.com/en-us/azure/virtual-machines/windows/acu

Can you try creating a VM from one of those series and see if it works?

@ashwini-git
Author

@blueelvis I require GPU-enabled computing on my Kubernetes cluster. Since the D and E series Azure VMs are not GPU-enabled, how am I supposed to achieve that if I install Minikube there?

Azure does provide a managed Kubernetes service, but we are trying to avoid it since it is a bit costly.
Thanks

@afbjorklund
Collaborator

Probably better to avoid nested virtualization wherever possible, and look into other options...

Such as https://github.com/kubernetes/minikube/blob/master/docs/vmdriver-none.md

@laozc
Contributor

laozc commented Jul 25, 2019

+1 for none driver.
You may install docker on the VM and then run minikube start --vm-driver=none, which will provision the kubernetes cluster directly on your VM.
You may need to open port 8443 for access to the kubernetes API server.
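A dry-run sketch of those steps, assuming the Azure CLI is available on the workstation; the resource-group and NSG names below are placeholders I made up, not values from this thread:

```shell
# Prints (does not execute) the none-driver setup steps described above.
print_none_driver_steps() {
  rg="my-resource-group"   # placeholder: your VM's resource group
  nsg="my-vm-nsg"          # placeholder: the NSG attached to the VM's NIC
  cat <<EOF
# 1. Install docker on the VM
curl -fsSL https://get.docker.com | sudo sh
# 2. Provision Kubernetes directly on the VM (no nested VM required)
sudo minikube start --vm-driver=none
# 3. Open inbound 8443 so the Kubernetes API server is reachable
az network nsg rule create --resource-group $rg --nsg-name $nsg --name allow-k8s-apiserver --priority 1001 --access Allow --protocol Tcp --destination-port-ranges 8443
EOF
}
```

Reviewing the printed commands before running them keeps the NSG change deliberate rather than accidental.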

@medyagh medyagh removed the triage/needs-information Indicates an issue needs more information in order to work on it. label Jul 25, 2019
@ashwini-git
Author

Hi, I used the none driver approach and kubectl is now configured. I can see a master node created.
How do I set up nvidia.com/gpu capacity now? There is documentation for the kvm2 driver -
https://github.com/kubernetes/minikube/blob/master/docs/gpu.md

Are there any details on setting up the GPU plugin in this case?

@afbjorklund
Collaborator

@ashwini-git
Author

I followed the instructions at https://github.com/kubernetes/minikube/blob/master/docs/gpu.md#using-nvidia-gpu-on-minikube-on-linux-with---vm-drivernone

When I try to view the status of the deployment, as described at https://docs.microsoft.com/en-us/azure/aks/gpu-cluster#view-the-status-and-output-of-the-gpu-enabled-workload, I get this error:

Warning FailedScheduling 54s (x2 over 56s) default-scheduler 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.

Below is the Node description FYR.

root@WorkplaceSafety-KubeVM:~# kubectl describe node
Name: minikube
Roles: master
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=minikube
kubernetes.io/os=linux
node-role.kubernetes.io/master=
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Mon, 29 Jul 2019 15:22:27 +0000
Taints:
Unschedulable: false
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message


MemoryPressure False Wed, 31 Jul 2019 08:39:07 +0000 Mon, 29 Jul 2019 15:22:26 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 31 Jul 2019 08:39:07 +0000 Mon, 29 Jul 2019 15:22:26 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Wed, 31 Jul 2019 08:39:07 +0000 Mon, 29 Jul 2019 15:22:26 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Wed, 31 Jul 2019 08:39:07 +0000 Mon, 29 Jul 2019 15:22:26 +0000 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 10.0.0.4
Hostname: minikube
Capacity:
cpu: 6
ephemeral-storage: 30308240Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 57688264Ki
pods: 110
Allocatable:
cpu: 6
ephemeral-storage: 27932073938
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 57585864Ki
pods: 110
System Info:
Machine ID: faa2bfb8a7f940ab8eef65b090ed7e05
System UUID: 94489ace-3f42-3542-966c-21d73cf959af
Boot ID: edef2cb8-6237-467a-b013-7ec305fa01f0
Kernel Version: 4.18.0-1024-azure
OS Image: Ubuntu 18.04.2 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://18.9.7
Kubelet Version: v1.15.0
Kube-Proxy Version: v1.15.0
Non-terminated Pods: (11 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE


default hello-minikube-64c7df9db-5d6rl 0 (0%) 0 (0%) 0 (0%) 0 (0%) 24h
kube-system coredns-5c98db65d4-4m2dk 100m (1%) 0 (0%) 70Mi (0%) 170Mi (0%) 41h
kube-system coredns-5c98db65d4-c487m 100m (1%) 0 (0%) 70Mi (0%) 170Mi (0%) 41h
kube-system etcd-minikube 0 (0%) 0 (0%) 0 (0%) 0 (0%) 41h
kube-system kube-addon-manager-minikube 5m (0%) 0 (0%) 50Mi (0%) 0 (0%) 41h
kube-system kube-apiserver-minikube 250m (4%) 0 (0%) 0 (0%) 0 (0%) 41h
kube-system kube-controller-manager-minikube 200m (3%) 0 (0%) 0 (0%) 0 (0%) 41h
kube-system kube-proxy-ccq5t 0 (0%) 0 (0%) 0 (0%) 0 (0%) 41h
kube-system kube-scheduler-minikube 100m (1%) 0 (0%) 0 (0%) 0 (0%) 41h
kube-system nvidia-device-plugin-daemonset-9tzbk 0 (0%) 0 (0%) 0 (0%) 0 (0%) 88m
kube-system storage-provisioner 0 (0%) 0 (0%) 0 (0%) 0 (0%) 41h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits


cpu 755m (12%) 0 (0%)
memory 190Mi (0%) 340Mi (0%)
ephemeral-storage 0 (0%) 0 (0%)
Events:

@laozc
Contributor

laozc commented Aug 11, 2019

Could you check if the node has the nvidia.com/gpu capability?

$ kubectl get nodes -ojson | jq .items[].status.capacity

{
  "cpu": "6",
  "ephemeral-storage": "30308240Ki",
  "hugepages-1Gi": "0",
  "hugepages-2Mi": "0",
  "memory": "57688264Ki",
  "nvidia.com/gpu": "1",
  "pods": "110"
}

If it doesn't show in the output, you need to install the NVIDIA CUDA driver and container runtime:

  1. Follow this guide to install the CUDA driver:
     http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#package-manager-installation

  2. Configure the nvidia container runtime if needed:
     https://github.com/NVIDIA/nvidia-container-runtime
     https://github.com/NVIDIA/nvidia-docker
     Note: you should set the default runtime to nvidia in docker
     $ cat /etc/docker/daemon.json

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

  3. Restart the docker engine if needed:
     $ sudo systemctl restart docker

  4. Start your minikube cluster with the none driver:
     $ sudo minikube start --vm-driver=none

  5. Install the device plugin:
     $ kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.10/nvidia-device-plugin.yml

Check the node capacity again and you should be able to use GPU in your minikube cluster.
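That capacity check can be spot-checked without jq using a grep-based helper; this is a sketch, gpu_capacity is my own name for it, and a real script might still prefer jq:

```shell
# Reads `kubectl get nodes -ojson` output on stdin and prints the
# nvidia.com/gpu capacity, or 0 when the capability is absent.
gpu_capacity() {
  count=$(grep -o '"nvidia.com/gpu": *"[0-9]*"' | grep -o -E '[0-9]+' | head -n 1 || true)
  echo "${count:-0}"
}
```

Usage: kubectl get nodes -ojson | gpu_capacity (prints 0 until the device plugin has registered the GPU).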

@tstromberg
Contributor

The new documentation gives more guidance as to what is required for minikube to run in the cloud:

https://minikube.sigs.k8s.io/docs/start/linux/

minikube v1.4 also gives better error behavior for these cases. Closing as apparently a workaround has been found.
