Running 'apt upgrade' breaks the VM #1313

Closed
Sangeppato opened this issue Jan 22, 2023 · 11 comments · Fixed by #1315

@Sangeppato

Description

I tried running the docker-rootful example with a VZ VM and noticed that a simple apt update followed by apt upgrade completely breaks the VM: the boot process gets stuck at Waiting for the essential requirement 1 of 3: "ssh" with CPU usage pinned at 100% (ssh probably broke during the upgrade).

If this is the expected behaviour, I think the user should be at least warned or, even better, such an action should be blocked.

@AkihiroSuda
Member

> expected

No.

Any error in $HOME/.lima/<INSTANCE>/serial.log ?

@Sangeppato
Author

> > expected
>
> No.
>
> Any error in $HOME/.lima/<INSTANCE>/serial.log ?

I can only reproduce this with VZ as the backend, so serial.log is empty.
Could this be a duplicate of #1200?

In any case, here is the template I'm using:
docker.yaml.zip

@balajiv113
Member

@Sangeppato
Is this an existing instance or a newly created one?

@Sangeppato
Author

> @Sangeppato Is this an existing instance or a newly created one?

Newly created. I can reproduce it consistently.

@balajiv113
Member

> serial.log is empty

It looks like something is wrong with starting the VM itself. On a successful boot, this file should contain the login prompt.

If possible, you could try this approach to get more info:

  • Start the VM with vzNAT: true
  • Run arp -a and wait for an IP to appear under bridge100 (it might also be bridge101; look for anything with the bridge prefix)
  • If you get an IP from the previous step, you can use that IP to ssh into the VM and get the kernel logs to see if something is wrong
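
The second step above can be sketched as a small filter over the arp -a output. This is only an illustration: the helper name bridge_ips is made up here, and it assumes the usual macOS arp -a line layout of `? (IP) at MAC on IFACE ...`.

```shell
# Hypothetical helper: print the IP addresses that `arp -a` lists on any
# bridge* interface (bridge100, bridge101, ...).
# It reads arp output on stdin, so it also works on saved output.
# Assumes lines shaped like: ? (192.168.105.2) at 52:55:55:aa:bb:cc on bridge100 ...
bridge_ips() {
  awk '/ on bridge[0-9]+/ { gsub(/[()]/, "", $2); print $2 }'
}

# Usage on the macOS host while the instance is starting:
#   arp -a | bridge_ips
```

If this prints nothing even after waiting, the guest most likely never brought up its network interface.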

@balajiv113
Member

balajiv113 commented Jan 22, 2023

I just tried the same template. It works for me on both my Intel and M1 machines.

There is surely some network issue, but I can't find the cause as it's not reproducible for me yet :(

Edit: I was able to reproduce it. I had missed running apt upgrade, my bad.

@Sangeppato
Author

> serial.log is empty
>
> It looks like something is wrong with starting the VM itself. On a successful boot, this file should contain the login prompt.
>
> If possible, you could try this approach to get more info:
>
> * Start the VM with vzNAT: true
> * Run `arp -a` and wait for an IP to appear under bridge100 (it might also be bridge101; look for anything with the bridge prefix)
> * If you get an IP from the previous step, you can use that IP to ssh into the VM and get the kernel logs to see if something is wrong

Thank you.

I'm already using vzNAT: true and after the first boot serial.log does in fact contain the login prompt. The problem only occurs after upgrading Ubuntu. Unfortunately, in those cases I don't get any IP for a bridge interface on the host.

@balajiv113
Member

It looks like something happens to GRUB after the upgrade. Instead of booting, the VM drops to the grub prompt.

I tried booting the broken VZ disk via QEMU and noticed the following in serial.log:

   Minimal BASH-like line editing is supported. For the first word, TAB   
   lists possible command completions. Anywhere else TAB lists possible   
   device or file completions.                                            


grub>     

FYI: with a working VZ disk, I was able to boot that same disk with QEMU.

@AkihiroSuda added the bug and priority/high labels and removed the status/more-info-needed label on Jan 22, 2023
@balajiv113
Member

@Sangeppato
Can you try the following:

  • Create a new VM
  • Run sudo apt upgrade
  • Trigger a shutdown manually from inside the VM using sudo shutdown now
  • Start the VM again (this should now work fine)
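
The steps above can be sketched as a script for an instance that already exists and is running. This is an illustration, not a tested fix: the function name and the LIMACTL override are made up here, the -y flag and the apt update call are added so the upgrade runs unattended, and "docker" is just the instance name used elsewhere in this thread.

```shell
# Command used to talk to Lima; override with LIMACTL=echo for a dry run.
LIMACTL="${LIMACTL:-limactl}"

# Hypothetical helper: upgrade the guest, then shut it down from the inside.
upgrade_then_guest_shutdown() {
  instance="$1"
  # Refresh package lists and upgrade inside the guest (-y added for unattended use).
  "$LIMACTL" shell "$instance" sudo apt update
  "$LIMACTL" shell "$instance" sudo apt upgrade -y
  # Shut down from inside the guest instead of using `limactl stop`.
  # The shutdown cuts the SSH session, so a non-zero exit is ignored here.
  "$LIMACTL" shell "$instance" sudo shutdown now || true
}

# Usage:
#   upgrade_then_guest_shutdown docker
#   limactl start docker    # once the instance is reported as stopped
```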

What's happening?
This is only a guess: GRUB/disk corruption seems to happen if the Apple Virtualization framework issues a shutdown after a kernel upgrade (without a reboot from inside the VM).

@balajiv113
Member

The guess looks valid now.

I compared the stop behaviour of VZ and QEMU (i.e. when QEMU/VZ stops the VM).

Steps followed

  • Start the VM: limactl start docker
  • Shell into the VM: limactl shell docker
  • Run last -x shutdown to view recent shutdowns
  • From another terminal, stop the Lima instance: limactl stop docker
  • Run last -x shutdown again to view recent shutdowns (check whether the previous shutdown is displayed)
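
To make the before/after comparison in these steps less error-prone, the shutdown records can be counted instead of read by eye. A minimal sketch, assuming that each record printed by last -x shutdown starts with the word "shutdown"; the helper name count_shutdowns is made up for illustration.

```shell
# Hypothetical helper: count shutdown records in `last -x shutdown` output
# read from stdin. Assumes each record line begins with "shutdown".
count_shutdowns() {
  # grep -c prints 0 but exits non-zero when nothing matches, hence `|| true`.
  grep -c '^shutdown' || true
}

# Usage (instance name "docker" as in the steps above):
#   limactl shell docker last -x shutdown | count_shutdowns
```

If the count does not grow across a limactl stop, the guest did not record that stop as a shutdown.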

QEMU
All stops (from within the VM and from QEMU) are displayed

VZ
Only shutdowns from within the VM are shown

From this, we can more or less conclude that Virtualization.framework does not perform a proper shutdown on stop. This might be the cause of the corruption.

@Sangeppato
Author

Thank you @balajiv113!
I can confirm that when the shutdown is requested from inside the guest, the VM remains bootable.
