Bump alpine-lima iso from v0.2.1 → v0.2.2 #1014

Closed
wants to merge 2 commits into from

Conversation

jandubois
Member

  • Bumps Alpine from 3.13.5 → 3.14.3

  • Installs qemu-aarch64 from tonistiigi/binfmt instead of Alpine repo
    to include additional patches.

  • Installs qemu-x86_64 in the aarch64 iso

@jandubois jandubois added this to the v0.7.0 milestone Nov 26, 2021
Contributor

@ericpromislow ericpromislow left a comment

Upgrades from the main branch don't work on macOS (Monterey 12.0.1 / Intel)

@jandubois
Member Author

jandubois commented Nov 26, 2021

Upgrades from the main branch don't work on macOS (Monterey 12.0.1 / Intel)

Please provide more information! How does it fail? What is the error message?

I did.... don't know what happened there. Stay tuned

@ericpromislow
Contributor

In k3s.log, stuck at

time="2021-11-26T11:57:28-08:00" level=info msg="[hostagent] Waiting for the essential requirement 1 of 5: \"ssh\""

This is what I see repeatedly in lima/0/ha.stderr.log:


{
  "level": "debug",
  "msg": "executing script \"ssh\"",
  "time": "2021-11-26T11:59:19-08:00"
}
{
  "level": "debug",
  "msg": "executing ssh for script \"ssh\": /usr/bin/ssh [ssh -F /dev/null -o IdentityFile=\"/Users/ericp/Library/Application Support/rancher-desktop/lima/_config/user\" -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o NoHostAuthenticationForLocalhost=yes -o GSSAPIAuthentication=no -o PreferredAuthentications=publickey -o Compression=no -o BatchMode=yes -o IdentitiesOnly=yes -o Ciphers=\"^[email protected],[email protected]\" -o User=ericp -o ControlMaster=auto -o ControlPath=\"/Users/ericp/Library/Application Support/rancher-desktop/lima/0/ssh.sock\" -o ControlPersist=5m -p 62348 127.0.0.1 -- /bin/bash]",
  "time": "2021-11-26T11:59:19-08:00"
}
{
  "level": "debug",
  "msg": "stdout=\"\", stderr=\"kex_exchange_identification: read: Connection reset by peer\\r\\nConnection reset by 127.0.0.1 port 62348\\r\\n\", err=failed to execute script \"ssh\": stdout=\"\", stderr=\"kex_exchange_identification: read: Connection reset by peer\\r\\nConnection reset by 127.0.0.1 port 62348\\r\\n\": exit status 255",
  "time": "2021-11-26T11:59:19-08:00"
}

To reproduce:

pushd $HOME/Library
rm -fr "Application Support/rancher-desktop"
rm -fr Preferences/rancher-desktop
rm -fr Caches/rancher-desktop
popd
git co main
rm -r resources/ && git co resources && npm run postinstall && npm run dev
# Verify rancher-desktop works, then shut down
git co alpine-3.14
rm -r resources/ && git co resources && npm run postinstall && npm run dev

This happened in 3 out of 3 runs. I did a similar run with release builds and got the same results.
For the release builds, both of the lines above that end in npm run dev ended with this instead:
npm run build && rm -fr /Application/Rancher\ Desktop && open dist/Rancher*.dmg && open /Application/Rancher\ Desktop

@jandubois
Member Author

@ericpromislow The upgrade failed because our mechanism for persisting /etc on the data volume was not applying some required updates from the new ISO; for some reason the timestamps of the newer files were older than on the previous ISO. This included the /etc/init.d/sshd script, which prevented sshd from starting.
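
The failure mode described above can be illustrated with a minimal sketch. It assumes the date-based mechanism amounted to "update the persisted file only if the ISO's file is newer"; the thread does not show the actual rule. The directory names `iso_etc` and `data_etc` and the helper `copy_if_newer` are hypothetical stand-ins, not names from the real scripts.

```shell
#!/bin/sh
# Sketch only: iso_etc stands in for /etc on the new ISO, data_etc for the
# persisted copy on the data volume. Both names are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/iso_etc" "$work/data_etc"

echo "old sshd script" > "$work/data_etc/sshd"
echo "new sshd script" > "$work/iso_etc/sshd"

# Reproduce the situation from the comment: the file shipped on the newer ISO
# carries an older timestamp than the persisted one.
touch -t 202101010000 "$work/iso_etc/sshd"
touch -t 202106010000 "$work/data_etc/sshd"

# Assumed rule: update the persisted file only when the ISO file is newer.
copy_if_newer() {
  if [ "$1" -nt "$2" ]; then
    cp "$1" "$2"
  fi
}

copy_if_newer "$work/iso_etc/sshd" "$work/data_etc/sshd"

# The persisted file still has the old content, so the update was skipped.
result=$(cat "$work/data_etc/sshd")
echo "$result"
rm -rf "$work"
```

Under that assumed rule the script prints `old sshd script`: the required update is silently skipped because the comparison looks only at timestamps, never at content.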

@mook-as Please review the new upgrade logic! I've tested it, and it seems to work fine so far, but this part should have multiple reviewers. Is there anything else we want to preserve from the previous /etc?

* Bumps Alpine from 3.13.5 → 3.14.3

* Installs qemu-aarch64 from tonistiigi/binfmt instead of Alpine repo
  to include additional patches.

* Installs qemu-x86_64 in the aarch64 iso

Signed-off-by: Jan Dubois <[email protected]>
The date based copy mechanism was unreliable, and other data (like apk
status) was incorrect after an update.

Signed-off-by: Jan Dubois <[email protected]>
Contributor

@mook-as mook-as left a comment

Looks fine to me, though I thought we moved to copying all of /etc because we wanted to support people adding extra files?

@jandubois
Member Author

Looks fine to me, though I thought we moved to copying all of /etc because we wanted to support people adding extra files?

Yes, but we can't really. The only viable alternative would be to copy all files from the new ISO into /mnt/data/etc, except for those protected by a block-list (essentially the files this PR copies over from old to new).
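
A rough sketch of the block-list alternative mentioned above might look like the following. All paths and file names (`iso_etc`, `data_etc`, `motd`, `hostname`) and the contents of the block-list are hypothetical; the thread does not name the protected files.

```shell
#!/bin/sh
# Sketch only: all paths and file names here are hypothetical.
work=$(mktemp -d)
iso_etc="$work/iso_etc"      # stands in for /etc on the new ISO
data_etc="$work/data_etc"    # stands in for /mnt/data/etc
mkdir -p "$iso_etc" "$data_etc"

echo "new default" > "$iso_etc/motd"
echo "new default" > "$iso_etc/hostname"
echo "old default" > "$data_etc/motd"
echo "user value"  > "$data_etc/hostname"

# Files the overlay must not overwrite (hypothetical block-list).
block_list="hostname"

# Return success when the given file name appears in the block-list.
is_blocked() {
  for blocked in $block_list; do
    [ "$blocked" = "$1" ] && return 0
  done
  return 1
}

# Copy every ISO file over the persisted copy unless it is protected.
for src in "$iso_etc"/*; do
  name=$(basename "$src")
  if is_blocked "$name"; then
    continue
  fi
  cp "$src" "$data_etc/$name"
done

motd=$(cat "$data_etc/motd")
hostname_value=$(cat "$data_etc/hostname")
echo "motd: $motd"
echo "hostname: $hostname_value"
rm -rf "$work"
```

Here `motd` is replaced by the ISO's version while the protected `hostname` keeps its persisted value.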

Lmk if you think that is preferable!

@jandubois
Member Author

The only viable alternative would be [...]

On further reflection, this isn't really a good option either, because it would not remove files that may need to be removed. For example, a service that has been renamed would still be included under its old name, along with its config files and run-level entries.
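
The stale-file problem can be sketched as follows, comparing an overlay of the new ISO onto the old /etc with a fresh copy that carries over only selected files. The service names `oldsvc` and `newsvc`, the preserved file `custom.conf`, and all directory names are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sketch only: every path and file name below is hypothetical.
work=$(mktemp -d)
iso_etc="$work/iso_etc"          # /etc as shipped on the new ISO
old_etc="$work/old_etc"          # /etc persisted from the previous ISO
overlay_etc="$work/overlay_etc"  # result of overlaying new onto old
fresh_etc="$work/fresh_etc"      # result of starting from the new ISO
mkdir -p "$iso_etc/init.d" "$old_etc/init.d"

# The old image shipped a service as "oldsvc"; the new ISO renamed it.
echo "old" > "$old_etc/init.d/oldsvc"
echo "new" > "$iso_etc/init.d/newsvc"
echo "keep me" > "$old_etc/custom.conf"

# Strategy 1: overlay the new ISO's files onto the persisted /etc.
cp -R "$old_etc" "$overlay_etc"
cp -R "$iso_etc/." "$overlay_etc/"

# Strategy 2: start from the new ISO and carry over only selected files.
cp -R "$iso_etc" "$fresh_etc"
for name in custom.conf; do
  cp "$old_etc/$name" "$fresh_etc/$name"
done

overlay_has_stale=no
if [ -e "$overlay_etc/init.d/oldsvc" ]; then
  overlay_has_stale=yes
fi
fresh_has_stale=no
if [ -e "$fresh_etc/init.d/oldsvc" ]; then
  fresh_has_stale=yes
fi
kept=$(cat "$fresh_etc/custom.conf")

echo "overlay keeps stale service: $overlay_has_stale"
echo "fresh copy keeps stale service: $fresh_has_stale"
echo "carried over: $kept"
rm -rf "$work"
```

The overlay keeps the renamed service under its old name, while the fresh copy contains only what the new ISO ships plus the files explicitly carried over.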

So I think the proper way to add/modify files persistently under /etc should be by provisioning scripts.

Note that lima-vm/lima#436 should help in the future to keep user-supplied provisioning scripts separate from the ones managed by Rancher Desktop, and therefore more likely to survive app updates, or config changes.

@mook-as
Contributor

mook-as commented Nov 29, 2021

I'm totally happy to declare that nothing in /etc will be persisted across; we (effectively) already do that on Windows (when the base image needs upgrades).

Contributor

@ericpromislow ericpromislow left a comment

I ran the following commands on the latest commit, and the k3s API failed to start up.

Workflow:

git co main
# clear the cache
npm run dev # and shutdown
git co alpine-3-13
\rm -fr resources && git co resources && npm run postinstall
npm run dev

Contents of k3s.log:

tail: can't open '/var/log/k3s.log': Permission denied

Running limactl shell 0 to look at the log file:

/var/log/k3s.log is owned by root, perms 600, so a non-root process can't see it.

There are 60 Error lines in the log file. Let me know if there's anything specific I should report.

@jandubois
Member Author

/var/log/k3s.log is owned by root, perms 600, so a non-root process can't see it.

@mook-as Is this due to your recent changes on Windows as well? Any suggestion how to modify the Lima codepath to match what Windows is doing?

@mook-as
Contributor

mook-as commented Nov 29, 2021

I also see the issue about /var/log/k3s.log permissions on main; I don't think that was introduced by this PR. I believe we should fix that separately.

@jandubois
Member Author

I also see the issue about /var/log/k3s.log permissions on main; I don't think that was introduced by this PR. I believe we should fix that separately.

Ok, so let's wait for that fix, and then I'll rebase this PR on top of that fix, so that we can still test the upgrade scenario before merging.

Is there a bug for the permission issue? And/or is somebody already working on it?

@mook-as
Contributor

mook-as commented Nov 29, 2021

Eh, pretty sure it's my fault so I'll work on it now :D

@mook-as
Contributor

mook-as commented Nov 29, 2021

This shouldn't have been closed by #1023.
