hyperkit: start hangs after shutdown: SSH getsockopt: operation timed out (stale pid) #2994

Closed
hrqiang opened this issue Jul 18, 2018 · 12 comments
Labels
cause/vm-networking Startup failures due to VM networking co/hyperkit Hyperkit related issues co/sshd ssh related issues ev/hung-start help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

@hrqiang

hrqiang commented Jul 18, 2018

Is this a BUG REPORT or FEATURE REQUEST? (choose one):

Please provide the following details:

Environment:

Minikube version (use minikube version): v0.28.1

  • OS (e.g. from /etc/os-release): OSX High Sierra
  • VM Driver (e.g. cat ~/.minikube/machines/minikube/config.json | grep DriverName): hyperkit
  • ISO version (e.g. cat ~/.minikube/machines/minikube/config.json | grep -i ISO or minikube ssh cat /etc/VERSION): minikube-v0.28.0.iso
  • Install tools:
  • Others:
    The above can be generated in one go with the following commands (can be copied and pasted directly into your terminal):
minikube version
echo "";
echo "OS:";
cat /etc/os-release
echo "";
echo "VM driver": 
grep DriverName ~/.minikube/machines/minikube/config.json
echo "";
echo "ISO version";
grep -i ISO ~/.minikube/machines/minikube/config.json

What happened:

minikube start gets stuck at the SSH step after running minikube stop or rebooting my Mac.

What you expected to happen:

minikube should be able to connect over SSH and start the system.

How to reproduce it (as minimally and precisely as possible):

$ minikube start --vm-driver=hyperkit --memory 4096 --v 9 --alsologtostderr
Wait until it is running, then reboot the Mac or run $ minikube stop.
Run the same command again:
$ minikube start --vm-driver=hyperkit --memory 4096 --v 9 --alsologtostderr
It gets stuck here:

Waiting for SSH to be available...
Getting to WaitForSSH function...
(minikube) Calling .GetSSHHostname
(minikube) Calling .GetSSHPort
(minikube) Calling .GetSSHKeyPath
(minikube) Calling .GetSSHKeyPath
(minikube) Calling .GetSSHUsername
Using SSH client type: native
&{{{<nil> 0 [] [] []} docker [0x14354b0] 0x1435460  [] 0s} 192.168.64.6 22 <nil> <nil>}
About to run SSH command:
exit 0


Error dialing TCP: dial tcp 192.168.64.6:22: getsockopt: operation timed out

Output of minikube logs (if applicable):
minikube logs hangs too.

Anything else do we need to know:
Attaching with screen tty gives no output, so linuxkit is not running.

Please let me know how to debug this.
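One way to narrow this down is to probe the VM's SSH port with a bounded number of attempts instead of waiting on minikube start. The following is a minimal sketch, not part of minikube: the retry helper is a hypothetical name introduced here, and the example call assumes an nc that accepts -z and -w, as the one shipped with macOS does.

```shell
#!/bin/sh
# retry MAX DELAY CMD...: run CMD up to MAX times, sleeping DELAY seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
# (Hypothetical helper for illustration; not a minikube command.)
retry() {
  max=$1; delay=$2; shift 2
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt/$max failed: $*" >&2
    attempt=$((attempt + 1))
    # Only sleep when another attempt is still coming.
    if [ "$attempt" -le "$max" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (not run here): probe port 22 on the address from the log above,
# five times, one second apart, with a 2-second timeout per attempt.
#   retry 5 1 nc -z -w 2 192.168.64.6 22 || echo "sshd unreachable"
```

If every attempt fails, the VM is probably not up at that address at all, which points at the hypervisor rather than at sshd.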

@InbarRose

InbarRose commented Jul 18, 2018

I am also having this problem (on Windows).

Error getting ssh command 'exit 0' : IP not found

Full Trace:

Getting to WaitForSSH function...
[executing ==>] : C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NonInteractive ( Hyper-V\Get-VM minikube ).state
[stdout =====>] : Running

[stderr =====>] :
[executing ==>] : C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NonInteractive (( Hyper-V\Get-VM minikube ).networkadapters[0]).ipaddresses[0]
[stdout =====>] :
[stderr =====>] :
Error getting ssh command 'exit 0' : IP not found

But just earlier in the logs I can see:

Getting to WaitForSSH function...
[executing ==>] : C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NonInteractive ( Hyper-V\Get-VM minikube ).state
[stdout =====>] : Running

[stderr =====>] :
[executing ==>] : C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NonInteractive (( Hyper-V\Get-VM minikube ).networkadapters[0]).ipaddresses[0]
[stdout =====>] : 192.168.0.13

[stderr =====>] :
Using SSH client type: native
&{{{<nil> 0 [] [] []} docker [0x8427f0] 0x8427a0  [] 0s} 192.168.0.13 22 <nil> <nil>}
About to run SSH command:
exit 0
SSH cmd err, output: <nil>:
Detecting the provisioner...
[executing ==>] : C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NonInteractive ( Hyper-V\Get-VM minikube ).state
[stdout =====>] : Running

[stderr =====>] :
[executing ==>] : C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NonInteractive (( Hyper-V\Get-VM minikube ).networkadapters[0]).ipaddresses[0]
[stdout =====>] : 192.168.0.13

[stderr =====>] :
Using SSH client type: native
&{{{<nil> 0 [] [] []} docker [0x8427f0] 0x8427a0  [] 0s} 192.168.0.13 22 <nil> <nil>}
About to run SSH command:
cat /etc/os-release
SSH cmd err, output: <nil>: NAME=Buildroot
VERSION=2018.05
ID=buildroot
VERSION_ID=2018.05
PRETTY_NAME="Buildroot 2018.05"

found compatible host: buildroot
setting hostname "minikube"

This indicates that it did find an IP.

minikube version: v0.28.0

@hrqiang
Author

hrqiang commented Jul 18, 2018

I even upgraded to the latest hyperkit Go driver and recompiled it, and it still gets stuck at the same point. Please tell me how to debug the hyperkit boot-up issue.

@balopat
Contributor

balopat commented Jul 23, 2018

@hrqiang try removing the ~/.minikube/machines/minikube/hyperkit.pid file.
@InbarRose I think that is a different issue. This one is very specific to hyperkit (macOS).
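The suggestion above can be made a little safer by deleting the file only when the process it records is gone. This is a minimal sketch under stated assumptions: it assumes hyperkit.pid holds a single decimal PID, and the function names pid_alive and remove_stale_pidfile are hypothetical helpers introduced here, not part of minikube.

```shell
#!/bin/sh
# pid_alive PID: succeed if a process with this PID exists.
# `kill -0` sends no signal; it only performs the existence/permission check.
pid_alive() {
  err=$(LC_ALL=C kill -0 "$1" 2>&1) && return 0
  # "Operation not permitted" means the process exists but belongs to
  # another user (hyperkit may run as root), so treat it as alive.
  case "$err" in
    *[Pp]ermitted*) return 0 ;;
  esac
  return 1
}

# remove_stale_pidfile FILE: delete FILE only if the PID it records is dead.
# Assumption: FILE contains one decimal PID and nothing else.
remove_stale_pidfile() {
  file=$1
  [ -f "$file" ] || { echo "no pid file at $file"; return 0; }
  pid=$(tr -d '[:space:]' < "$file")
  case "$pid" in
    ''|0|*[!0-9]*)
      echo "unexpected contents in $file; leaving it alone"
      return 1 ;;
  esac
  if pid_alive "$pid"; then
    echo "pid $pid is still running; leaving $file alone"
  else
    rm -f "$file" && echo "removed stale pid file $file (pid $pid)"
  fi
}

# Example (not run here):
#   remove_stale_pidfile ~/.minikube/machines/minikube/hyperkit.pid
```

A PID that was recycled by an unrelated process after a reboot would still look alive to this check. Comparing the command name, for example with ps -p "$pid" -o comm=, would be a stricter test.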

@hrqiang
Author

hrqiang commented Jul 28, 2018

@balopat, it works. Thanks.
After a restart, there is a small problem with the ssh command, though:

$ minikube ssh
E0727 23:11:57.537430    3355 ssh.go:53] Error attempting to ssh/run-ssh-command: Error: Cannot run ssh command: Host "minikube" is not running

@balopat
Contributor

balopat commented Aug 13, 2018

@hrqiang unfortunately with unclean shutdowns it's hard to tell what the issue is. Based on this command minikube is not running - are you sure it's running?

@tstromberg tstromberg changed the title minikube cann't startup hyperkit VM after stop. minikube start hangs after shutdown: Waiting for SSH to be available... Sep 18, 2018
@tstromberg tstromberg added os/macos co/hyperkit Hyperkit related issues failed/local-networking startup failures due to networking issues kind/bug Categorizes issue or PR as related to a bug. co/sshd ssh related issues labels Sep 18, 2018
@tstromberg tstromberg changed the title minikube start hangs after shutdown: Waiting for SSH to be available... start hangs after shutdown: Waiting for SSH to be available... (getsockopt: operation timed out) Sep 20, 2018
@onpaws

onpaws commented Oct 8, 2018

Thanks for your efforts on minikube, it's a nice project and I'm intending to teach my colleagues about Kubernetes by pointing them here first.

Just wanted to share that this issue appears to still be happening on Mojave aka the latest macOS on the hyperkit driver.
minikube start --vm-driver hyperkit -v 10

My feeble 'workaround' has been to rm -r .minikube in between sessions, but it's not great to have to start from scratch every time.

For me, rm -f /Users/paws/.minikube/machines/minikube/hyperkit.pid didn't make a difference. Curious if anyone knows a workaround for this issue?

$ minikube version
minikube version: v0.29.0
$ sw_vers 
ProductName:	Mac OS X
ProductVersion:	10.14
BuildVersion:	18A391
$ brew info docker-machine-driver-hyperkit 
docker-machine-driver-hyperkit: stable 1.0.0 (bottled)
Docker Machine driver for hyperkit
https://github.com/machine-drivers/docker-machine-driver-hyperkit
/usr/local/Cellar/docker-machine-driver-hyperkit/1.0.0 (5 files, 13.5MB) *
  Poured from bottle on 2018-10-01 at 22:06:47
From: https://github.com/Homebrew/homebrew-core/blob/master/Formula/docker-machine-driver-hyperkit.rb
==> Dependencies
Build: dep ✘, go ✘
Required: docker-machine ✔
==> Requirements
Required: macOS >= 10.10 ✔

@sam2013

sam2013 commented Oct 9, 2018

I am getting a similar error on macOS High Sierra (10.13.6).

Sams-MacBook-Pro:minikube samxxxx$ minikube start -v10
Aliases: map[string]string{}
Override: map[string]interface {}{"v":"10"}
PFlags: map[string]viper.FlagValue{"nfs-shares-root":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b80a0)}, "apiserver-ips":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b83c0)}, "apiserver-names":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8320)}, "docker-opt":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b81e0)}, "iso-url":viper.pflagValue{flag:(*pflag.Flag)(0xc42001ba40)}, "keep-context":viper.pflagValue{flag:(*pflag.Flag)(0xc42001b7c0)}, "mount-string":viper.pflagValue{flag:(*pflag.Flag)(0xc42001b900)}, "cache-images":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b88c0)}, "kubernetes-version":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b86e0)}, "xhyve-disk-driver":viper.pflagValue{flag:(*pflag.Flag)(0xc42001bf40)}, "profile":viper.pflagValue{flag:(*pflag.Flag)(0xc42001adc0)}, "apiserver-name":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8280)}, "insecure-registry":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8500)}, "memory":viper.pflagValue{flag:(*pflag.Flag)(0xc42001bb80)}, "uuid":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8a00)}, "cpus":viper.pflagValue{flag:(*pflag.Flag)(0xc42001bc20)}, "dns-domain":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8460)}, "gpu":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8be0)}, "hyperv-virtual-switch":viper.pflagValue{flag:(*pflag.Flag)(0xc42001be00)}, "network-plugin":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8780)}, "feature-gates":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8820)}, "hyperkit-vpnkit-sock":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8aa0)}, "mount":viper.pflagValue{flag:(*pflag.Flag)(0xc42001b860)}, "registry-mirror":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b85a0)}, "container-runtime":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8640)}, "disk-size":viper.pflagValue{flag:(*pflag.Flag)(0xc42001bcc0)}, "docker-env":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8140)}, "hyperkit-vsock-ports":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8b40)}, "bootstrapper":viper.pflagValue{flag:(*pflag.Flag)(0xc42001ae60)}, "extra-config":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8960)}, "kvm-network":viper.pflagValue{flag:(*pflag.Flag)(0xc42001bea0)}, "nfs-share":viper.pflagValue{flag:(*pflag.Flag)(0xc4204b8000)}, "vm-driver":viper.pflagValue{flag:(*pflag.Flag)(0xc42001bae0)}, "disable-driver-mounts":viper.pflagValue{flag:(*pflag.Flag)(0xc42001b9a0)}, "host-only-cidr":viper.pflagValue{flag:(*pflag.Flag)(0xc42001bd60)}}
Env: map[string]string{}
Key/Value Store: map[string]interface {}{}
Config: map[string]interface {}{}
Defaults: map[string]interface {}{"alsologtostderr":"false", "wantreporterror":false, "wantnonedriverwarning":true, "v":"0", "wantkubectldownloadmsg":true, "showdriverdeprecationnotification":true, "showbootstrapperdeprecationnotification":true, "log_dir":"", "wantupdatenotification":true, "reminderwaitperiodinhours":24, "wantreporterrorprompt":true}
Starting local Kubernetes v1.10.0 cluster...
Starting VM...
Found binary path at /usr/local/bin/docker-machine-driver-hyperkit
Launching plugin server for driver hyperkit
Plugin server listening at address 127.0.0.1:54004
() Calling .GetVersion
Using API Version 1
() Calling .SetConfigRaw
() Calling .GetMachineName
(minikube) Calling .GetState
(minikube) Calling .Start
(minikube) Using UUID 0f8850ed-cbb3-11e8-b1f0-c4b301d0a7d1
(minikube) Generated MAC 82:88:60:45:9a:22
(minikube) Starting with cmdline: loglevel=3 user=docker console=ttyS0 console=tty0 noembed nomodeset norestore waitusb=10 systemd.legacy_systemd_cgroup_controller=yes base host=minikube
(minikube) Calling .GetConfigRaw
(minikube) Calling .DriverName
Waiting for SSH to be available...
Getting to WaitForSSH function...
(minikube) Calling .GetSSHHostname
(minikube) Calling .GetSSHPort
(minikube) Calling .GetSSHKeyPath
(minikube) Calling .GetSSHKeyPath
(minikube) Calling .GetSSHUsername
Using SSH client type: native
&{{{<nil> 0 [] [] []} docker [0x140f940] 0x140f910 [] 0s} 192.168.64.3 22 <nil> <nil>}
About to run SSH command:
exit 0
Error dialing TCP: dial tcp 192.168.64.3:22: connect: operation timed out
Error dialing TCP: dial tcp 192.168.64.3:22: connect: operation timed out
Error dialing TCP: dial tcp 192.168.64.3:22: connect: operation timed out

@fyuan1316

@balopat
Thanks a lot! After deleting ~/.minikube/machines/minikube/hyperkit.pid, everything works fine.
@hrqiang
I ran the ssh command ('minikube ssh') without an error. The minikube version I am using is 0.3.0.

@dguendisch

@balopat thank you! Deleting ~/.minikube/machines/minikube/hyperkit.pid helps me as well to recover.
Isn't this something minikube could check on its own and eventually delete on its own?

@steliosAnastasakis

I had to:
$ rm ~/.minikube/machines/minikube/hyperkit.pid
$ minikube stop
$ minikube delete
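The three steps above can be wrapped in one function with a dry-run switch, so the sequence can be reviewed before anything is deleted. This is only a sketch: reset_minikube, run_step, DRY_RUN and STATE_DIR are hypothetical names introduced for illustration, and the default path is the one quoted in this thread. Note that minikube delete removes the VM, so any cluster state is lost.

```shell
#!/bin/sh
# run_step CMD...: execute CMD, or just print it when DRY_RUN=1.
# (Hypothetical helper; DRY_RUN is not a minikube setting.)
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# reset_minikube: remove the pid file, stop the VM, then delete it.
# STATE_DIR is a hypothetical override for the ~/.minikube directory.
reset_minikube() {
  dir=${STATE_DIR:-$HOME/.minikube}
  run_step rm -f "$dir/machines/minikube/hyperkit.pid"
  # `minikube stop` may fail when the VM is already down; carry on anyway.
  run_step minikube stop || true
  run_step minikube delete
}

# Example (not run here): show what would be executed.
#   DRY_RUN=1 reset_minikube
```

Running it first with DRY_RUN=1 prints the three commands without touching the machine directory.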

@jski

jski commented Jan 10, 2019

Same issue on OSX 10.14.2, minikube 0.32.0.

@tstromberg tstromberg changed the title start hangs after shutdown: Waiting for SSH to be available... (getsockopt: operation timed out) hyperkit: start hangs after shutdown: SSH getsockopt: operation timed out Jan 24, 2019
@tstromberg tstromberg added cause/vm-networking Startup failures due to VM networking help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed failed/local-networking startup failures due to networking issues os/macos labels Jan 24, 2019
@tstromberg tstromberg added this to the v1.0.0-candidate milestone Jan 24, 2019
@tstromberg tstromberg changed the title hyperkit: start hangs after shutdown: SSH getsockopt: operation timed out hyperkit: start hangs after shutdown: SSH getsockopt: operation timed out (stale pid) Jan 28, 2019
@tstromberg tstromberg modified the milestones: v1.0.0-candidate, v0.34.0 Jan 28, 2019
@tstromberg
Contributor

tstromberg commented May 22, 2019

If you run into this, please upgrade to the latest hyperkit driver we provide:

curl -LO https://storage.googleapis.com/minikube/releases/latest/docker-machine-driver-hyperkit && sudo install -o root -g wheel -m 4755 docker-machine-driver-hyperkit /usr/local/bin/

Additionally, you may have to run minikube delete to remove the corrupt state.
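The install command above sets mode 4755, which includes the setuid bit. After upgrading, a quick check that the binary is executable and still carries that bit might look like the sketch below. The function name check_driver is a hypothetical helper, and the check does not verify that the file is owned by root.

```shell
#!/bin/sh
# check_driver PATH: report whether the file at PATH is executable and has
# the setuid bit that `install -m 4755` is meant to set.
# (Hypothetical helper for illustration; ownership is not checked.)
check_driver() {
  path=$1
  if [ ! -x "$path" ]; then
    echo "missing or not executable: $path"
    return 1
  fi
  if [ ! -u "$path" ]; then
    echo "setuid bit not set: $path"
    return 1
  fi
  echo "ok: $path"
}

# Example (not run here):
#   check_driver /usr/local/bin/docker-machine-driver-hyperkit
```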
