shell connections are bounced ~30 seconds into debian boot cycle subsequent to first boot/provision #1941

Closed
mhio opened this issue Oct 24, 2023 · 7 comments · Fixed by #2048

Comments

@mhio
Contributor

mhio commented Oct 24, 2023

Description

After the first boot/provision process has completed on a Debian 12 VM, a lima shell opened early in any subsequent boot is bounced. The bounce happens after roughly the same amount of time that the initial limactl start creation command takes.

limactl version 0.18.0

Watching the systemd journal before the shell session is bounced, it always exits here

$ lima sudo journalctl -f
<snip>
Oct 24 00:51:23 debvm cloud-init[1583]: Cloud-init v. 22.4.2 running 'modules:final' at Tue, 24 Oct 2023 00:51:23 +0000. Up 36.11 seconds.

Looking through the journal around that time, I believe the boot.sh script output isn't flushed before the termination, so the session is actually terminated slightly after the last line shown above:

Oct 24 00:51:23 debvm cloud-init[1583]: + command -v loginctl
Oct 24 00:51:23 debvm cloud-init[1583]: + loginctl terminate-user mhio
Oct 24 00:51:23 debvm systemd[1]: Stopping session-2.scope - Session 2 of User mhio...
Oct 24 00:51:23 debvm sshd[1271]: pam_unix(sshd:session): session closed for user mhio
Oct 24 00:51:23 debvm sudo[1286]: pam_unix(sudo:session): session closed for user root

Which led me to

loginctl terminate-user "${LIMA_CIDATA_USER}" || true

I believe the purpose here is to inject environment before the lima provisioning starts. I assume some config changes might also need to be reflected here after first provision/boot?

Could the loginctl terminate-user call be guarded on the content of /etc/environment changing? Something like

# Initialize so the comparison below is defined even when /etc/environment does not exist yet
etc_environment_sum_pre=""
if [ -e /etc/environment ]; then
  etc_environment_sum_pre="$(md5sum /etc/environment)"
  sed -i '/#LIMA-START/,/#LIMA-END/d' /etc/environment
fi
cat "${LIMA_CIDATA_MNT}/etc_environment" >>/etc/environment
etc_environment_sum_post="$(md5sum /etc/environment)"

# Only bounce the user session when the file content actually changed
if command -v loginctl >/dev/null 2>&1 && [ "$etc_environment_sum_pre" != "$etc_environment_sum_post" ]; then
  loginctl terminate-user "${LIMA_CIDATA_USER}" || true
fi
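
For anyone who wants to try the guard without touching a real guest, here is a minimal sketch of the same idea wrapped in a function that works on arbitrary files. The function name refresh_lima_env and its two arguments are hypothetical and not part of lima. It also swaps in cksum and a temporary file in place of md5sum and sed -i, purely so the sketch runs on both GNU and BSD userlands; that is an assumption of this example, not what boot.sh does.

```shell
#!/bin/sh
# Hypothetical helper (not part of lima): rewrite the Lima-managed block of
# an environment file and report whether the file content changed.
# Returns 0 when the content changed, 1 when it is identical.
refresh_lima_env() {
  env_file="$1"   # stand-in for /etc/environment
  fragment="$2"   # stand-in for ${LIMA_CIDATA_MNT}/etc_environment
  sum_pre=""      # stays empty when the file does not exist yet
  if [ -e "$env_file" ]; then
    sum_pre="$(cksum <"$env_file")"
    # Drop the previous Lima block; a temp file avoids non-portable `sed -i`
    tmp="$(mktemp)"
    sed '/#LIMA-START/,/#LIMA-END/d' "$env_file" >"$tmp"
    cat "$tmp" >"$env_file"
    rm -f "$tmp"
  fi
  cat "$fragment" >>"$env_file"
  sum_post="$(cksum <"$env_file")"
  [ "$sum_pre" != "$sum_post" ]
}

# Possible use in a boot script (assumes the variables used in this issue):
# if refresh_lima_env /etc/environment "${LIMA_CIDATA_MNT}/etc_environment" &&
#   command -v loginctl >/dev/null 2>&1; then
#   loginctl terminate-user "${LIMA_CIDATA_USER}" || true
# fi
```

Running the function twice with the same fragment should report a change only the first time, which is the behaviour wanted on a second boot.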

lima.yaml

images:
  - location: 'https://cloud.debian.org/images/cloud/bookworm/20231013-1532/debian-12-genericcloud-amd64-20231013-1532.qcow2'
    arch: 'x86_64'
    digest: 'sha512:b2ddc01e8d13dabbcfde6661541aae92219be2d442653950f0e44613ddebaeb80dc7a83e0202c5509c5e72f4bd1f4edee4c83f35191f2562b3f31e20e9e87ec2'
  - location: 'https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-genericcloud-amd64.qcow2'
    arch: 'x86_64'
vmType: 'vz'
mountType: 'virtiofs'
mounts:
  - location: '/tmp/lima'
    writable: true
networks:
  - socket: '/private/var/run/socket_vmnet'
containerd:
  system: false
  user: false
hostResolver:
  enabled: true
@mhio
Contributor Author

mhio commented Oct 24, 2023

Also, the lima guest agent socket removal past first provisioning would be removing the already started lima-guestagent.service socket.

# Make sure the guestagent socket from a previous boot is removed before we open the "lima-ssh-ready" gate.
rm -f /run/lima-guest-agent.sock

Additionally, the path to the socket is /run/lima-guestagent.sock on this Debian VM, so maybe the rm isn't doing anything?

$ ls -l /run/lima-guestagent.sock 
srwxrwxrwx 1 root root 0 Oct 24 00:51 /run/lima-guestagent.sock

@AkihiroSuda
Member

This seems from:

@jandubois Could you take a look?

@mhio
Contributor Author

mhio commented Nov 23, 2023

The disconnect seems to be highlighted due to my use case of not using limactl start and waiting, but running the vm's command /usr/local/bin/limactl hostagent --pidfile /Users/kimi/.lima/default/ha.pid --socket /Users/kimi/.lima/default/ha.sock default directly via a launchd agent.

The vm also has a slightly longer systemd startup time, waiting for all the containers to be running before lima's cloud init steps run.

@jandubois
Member

@jandubois Could you take a look?

Sure. In general I would say that this is expected if you connect to an instance before all the "requirements" have been completed that are part of the regular limactl start process.

However, I think we should implement the change proposed by @mhio that the user session is not bounced when /etc/environment hasn't changed.

@mhio Let me know if you want to create a PR, or if I should do it based on your suggestion above?

mhio added a commit to mhio/lima-pr that referenced this issue Nov 26, 2023
mhio added a commit to mhio/lima-pr that referenced this issue Nov 26, 2023
@mhio
Contributor Author

mhio commented Nov 26, 2023

It looks like the cidata ISO is built entirely in-process. I don't have a build environment to test it right now, but the commits are there if needed.

@jandubois
Member

commits are there if needed

I see the commits in your fork, and I was tempted to create the PR on your behalf, to start the code review process. But then I noticed that you haven't signed the commits with a Signed-off-by line yet, so the PR would not pass the DCO check.

I think it is better if you create the PR yourself anyway, so I am waiting for that to happen.

I see that you are removing the rm -f /run/lima-guest-agent.sock line. I cannot remember why I added it originally, and maybe it is indeed not needed. But does it get in the way of your use case, or is it just cleanup because you think it isn't needed at all?

@mhio
Contributor Author

mhio commented Dec 2, 2023

PR created

The socket removal can be left behind if needed. I kept it in a second commit, as it was just cleanup and doesn't affect this issue.
That socket file on my system is /run/lima-guestagent.sock. In normal operation, after first boot provisioning, removing the corrected socket path would remove the already running lima-guestagent.service's socket. So I don't think it's fixing anything in provisioning at the moment, and it would only degrade the normal boot-up situation if the rm was updated to remove the correct path.

mhio added a commit to mhio/lima-pr that referenced this issue Dec 4, 2023
DennisRasey pushed a commit to DennisRasey/lima that referenced this issue Jan 11, 2024

3 participants