KVM integration tests failing with "machine didn't return an IP after 120 seconds" #5927

Closed
tstromberg opened this issue Nov 15, 2019 · 4 comments
Labels
area/testing kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@tstromberg
Contributor

Example: https://storage.googleapis.com/minikube-builds/logs/5916/KVM_Linux.txt

                I1115 19:25:30.929899    8093 main.go:110] libmachine: Launching plugin server for driver kvm2
                I1115 19:25:31.004338    8093 main.go:110] libmachine: Plugin server listening at address 127.0.0.1:33605
                I1115 19:25:31.005685    8093 main.go:110] libmachine: () Calling .GetVersion
                I1115 19:25:31.009125    8093 main.go:110] libmachine: Using API Version  1
                I1115 19:25:31.009150    8093 main.go:110] libmachine: () Calling .SetConfigRaw
                I1115 19:25:31.011731    8093 main.go:110] libmachine: () Calling .GetMachineName
                I1115 19:25:31.017180    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) Calling .GetState
                I1115 19:25:31.029372    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) Calling .DriverName
                I1115 19:25:31.029846    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) Calling .Remove
                I1115 19:25:31.030513    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Removing machine...
                I1115 19:25:31.041158    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Trying to delete the networks (if possible)
                I1115 19:25:31.053214    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Checking if network minikube-net exists...
                I1115 19:25:31.053256    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Network minikube-net exists
                I1115 19:25:31.053291    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Trying to list all domains...
                I1115 19:25:31.053900    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Listed all domains: total of 9 domains
                I1115 19:25:31.053922    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Trying to get name of domain...
                I1115 19:25:31.053940    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Got domain name: cni-20191115T192103.770067553-5420
                I1115 19:25:31.053952    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Skipping domain as it is us...
                I1115 19:25:31.053972    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Trying to get name of domain...
                I1115 19:25:31.053989    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Got domain name: docker-flags-20191115T191903.76998124-5420
                I1115 19:25:31.054008    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Getting XML for domain docker-flags-20191115T191903.76998124-5420...
                I1115 19:25:31.055486    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Got XML for domain docker-flags-20191115T191903.76998124-5420
                I1115 19:25:31.056155    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Unmarshaled XML for domain docker-flags-20191115T191903.76998124-5420: kvm.result{Name:"docker-flags-20191115T191903.76998124-5420", Interfaces:[]kvm.iface{kvm.iface{Source:kvm.source{Network:"default"}}, kvm.iface{Source:kvm.source{Network:"minikube-net"}}}}
                I1115 19:25:31.056200    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | domain docker-flags-20191115T191903.76998124-5420 does not use network minikube-net
                I1115 19:25:31.056231    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | domain docker-flags-20191115T191903.76998124-5420 DOES use network minikube-net, aborting...
                I1115 19:25:31.056247    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) DBG | Checking if the domain needs to be deleted
                I1115 19:25:31.056281    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) Deleting of networks failed: network still in use at least by domain 'docker-flags-20191115T191903.76998124-5420',
                I1115 19:25:31.059614    8093 main.go:110] libmachine: (cni-20191115T192103.770067553-5420) Domain cni-20191115T192103.770067553-5420 exists, removing...
                W1115 19:25:32.410392    8093 exit.go:101] Unable to start VM: create: Error creating machine: Error in driver during machine creation: machine didn't return an IP after 120 seconds
                * 
                X Unable to start VM: create: Error creating machine: Error in driver during machine creation: machine didn't return an IP after 120 seconds
                * 
                * Sorry that minikube crashed. If this was unexpected, we would love to hear from you:
                  - https://github.com/kubernetes/minikube/issues/new/choose
                
                ** /stderr **
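
The two lines that matter in a log like the one above are the failed network deletion and the final IP timeout. The following is a minimal sketch of pulling them out automatically. It is not part of minikube: `parse_line`, `summarize`, and both regular expressions are hypothetical helpers, and the line layout is assumed from the excerpt itself.

```python
import re
from typing import NamedTuple, Optional

# Assumed prefix layout, taken from the excerpt:
# severity letter, MMDD, hh:mm:ss.uuuuuu, numeric id, file:line], message
KLOG_LINE = re.compile(
    r"^\s*(?P<sev>[IWEF])(?P<mmdd>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[^\s\]]+)\] (?P<msg>.*)$"
)

FAILURE = "machine didn't return an IP after 120 seconds"
BLOCKER = re.compile(r"network still in use at least by domain '([^']+)'")


class Entry(NamedTuple):
    severity: str
    mmdd: str
    time: str
    source: str
    message: str


def parse_line(line: str) -> Optional[Entry]:
    """Parse one log line; return None for lines without the expected prefix."""
    m = KLOG_LINE.match(line)
    if m is None:
        return None
    return Entry(m["sev"], m["mmdd"], m["time"], m["src"], m["msg"])


def summarize(log_text: str) -> dict:
    """Report whether the IP timeout appears and which domains blocked network deletion."""
    failed = False
    blockers = []
    for raw in log_text.splitlines():
        entry = parse_line(raw)
        if entry is None:
            continue  # banner lines such as "* " or "** /stderr **"
        if FAILURE in entry.message:
            failed = True
        hit = BLOCKER.search(entry.message)
        if hit:
            blockers.append(hit.group(1))
    return {"failed": failed, "blocking_domains": blockers}
```

Calling `summarize` on the text of a saved test log returns a dictionary with a `failed` flag and the list of domain names that kept `minikube-net` in use, which makes it easier to check whether the two symptoms tend to appear together.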

Digging around jenkins:/var/lib/jenkins/jobs:

Of the 375 KVM tests we've run in the last 30 days, 86 have failed with this message (roughly 23%). The failure occurs on many machines:

 82 "kvm-integration-slave"
 25 "kvm-integration-slave2"
 23 "kvm-integration-slave3"
  8 "kvm-integration-slave4"
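
A tally like the one above can be reproduced with a few lines of Python. This is only a sketch: `tally_by_host` and `failure_rate` are hypothetical helpers, and how the `(host, log_text)` pairs are read from the Jenkins job directory is left out because that layout is not shown here.

```python
from collections import Counter
from typing import Iterable, Tuple

FAILURE = "machine didn't return an IP after 120 seconds"


def tally_by_host(runs: Iterable[Tuple[str, str]]) -> Counter:
    """Count runs whose log contains the failure message, keyed by host name."""
    counts = Counter()
    for host, log_text in runs:
        if FAILURE in log_text:
            counts[host] += 1
    return counts


def failure_rate(failed: int, total: int) -> float:
    """Fraction of runs that failed; 0.0 when there were no runs."""
    return failed / total if total else 0.0


# 86 failures out of 375 runs, the figures quoted in this comment.
print(f"{failure_rate(86, 375):.1%}")
```

`Counter.most_common()` on the result gives the per-machine ranking in descending order.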
@tstromberg
Contributor Author

Related: #3566

@tstromberg
Contributor Author

Occurrences per date:

      2 0807
      6 0808
      4 0809
      1 0819
      1 0820
     34 0821
      3 0822
      4 0823
      2 0904
      2 0909
      1 0914
      1 0915
      2 1014
      4 1015
      2 1017
      3 1018
      2 1019
      1 1022
      4 1023
      2 1028
      5 1030
      9 1031
      1 1101
      2 1102
      3 1103
      2 1105
      5 1106
      4 1107
      3 1108
      1 1109
      6 1110
      9 1111
     19 1112
     12 1113
     19 1114
      6 1115

I imagine this has to be related to parallel runs. It is curious, though, that we saw no errors in the second half of September.
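
The quiet period can be checked directly against the dates in the tally. The sketch below assumes all dates fall in 2019 (the year the issue was opened); `to_date` and `longest_gap` are hypothetical helper names.

```python
from datetime import date

# Dates (MMDD) with at least one occurrence, copied from the tally above.
DATES = (
    "0807 0808 0809 0819 0820 0821 0822 0823 "
    "0904 0909 0914 0915 "
    "1014 1015 1017 1018 1019 1022 1023 1028 1030 1031 "
    "1101 1102 1103 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115"
).split()


def to_date(mmdd: str, year: int = 2019) -> date:
    """Convert an MMDD string to a date; the year is an assumption."""
    return date(year, int(mmdd[:2]), int(mmdd[2:]))


def longest_gap(mmdd_values, year: int = 2019):
    """Return (gap, last_day_before, first_day_after) for the longest quiet period."""
    days = sorted(to_date(v, year) for v in mmdd_values)
    gaps = [(later - earlier, earlier, later) for earlier, later in zip(days, days[1:])]
    return max(gaps)


gap, start, end = longest_gap(DATES)
print(f"longest gap: {gap.days} days, from {start} to {end}")
```

Under that assumption the longest stretch without a failure runs from September 15 to October 14, which matches the quiet second half of September noted above.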

@tstromberg tstromberg added area/testing kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Nov 20, 2019
@tstromberg tstromberg added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Dec 9, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 8, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 7, 2020

3 participants