
Qemu wrapper #2194

Merged: 2 commits merged into bottlerocket-os:develop on Jul 6, 2022

Conversation

markusboehme (Member)

Issue number: #834

Description of changes:

tools: Add start-local-vm script

Starts a virtual machine running a locally built Bottlerocket image via
QEMU and KVM. This is meant to ease development and experimentation for
situations that don't call for integration into a Kubernetes cluster or
other amenities provided in a cloud VM.

This does the minimal amount of work needed to meaningfully interact with
the launched VM: it configures the serial console for the direct login
available in the -dev variants, and forwards host TCP port 2222 to VM TCP
port 22 so that login via SSH works if the network is configured and the
admin container is enabled.

Users can inject files into the private partition of a Bottlerocket
image before it is launched to simulate the presence of user data and
other configuration. For example, "--inject-file ipv4-net.toml:net.toml"
adds the file "ipv4-net.toml" to the private partition as "net.toml".

Potentially helpful future work includes running images where host and
guest architecture differ, made possible by QEMU's TCG, as well as
generating and automatically injecting bare-bones variants of "net.toml"
and "user-data.toml" files to kick-start exploration.
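The launch described above boils down to a QEMU command line along these lines. This is a sketch: the flags are standard QEMU options, but the image path and the exact invocation used by start-local-vm are assumptions, and the arguments are only assembled and printed here, not executed.

```shell
#!/usr/bin/env bash
# Sketch of the kind of QEMU invocation the description implies; not the
# actual command line built by start-local-vm.
image=build/images/x86_64-metal-dev/latest/bottlerocket.img   # assumed path
qemu_args=(
    -nographic -enable-kvm        # KVM acceleration, serial console on stdio
    -smp 2 -m 4G
    -drive "file=${image},format=raw,if=virtio"
    # Forward host TCP 2222 to guest TCP 22 so SSH reaches the admin container.
    -netdev user,id=net0,hostfwd=tcp::2222-:22
    -device virtio-net-pci,netdev=net0
)
printf 'qemu-system-x86_64 %s\n' "${qemu_args[*]}"
```

With the -dev serial console enabled, this gives a login prompt directly in the launching terminal.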

Testing done:

Booted various VMs and configurations on both a c5.metal (x86_64) and c6g.metal (aarch64) EC2 instance. Testing is most fruitful with the metal-dev variant, since e.g. aws-dev seems to require the presence of IMDS.

Further thoughts:

  • Do we want to lift the architecture restriction on the metal-dev variant? It works well on virtualized Arm and is pretty convenient for local testing since it doesn't have any dependencies on external infrastructure.
  • Where might be a good place to put a pointer to this script and a usage example? I've been thinking of a QUICKSTART-QEMU.md file, possibly linked from BUILDING.md's "Use your image" section.

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.


@webern webern left a comment


This is cool.

    (may be given multiple times)
    --help shows this usage text

    The virtual machine's port 22 (SSH) will be exposed via the local port 2222,
Contributor


Should the SSH host port mapping be an argument?

Member Author


In case the host's port 2222 is already taken, this would be helpful, yes. On its own it won't make it possible to run multiple VMs from the script, since other work is needed for that (e.g. switching to a separate layer for disk writes so VMs don't share their images).

I think a separate --host-port-forwards option defaulting to tcp::2222:22 is most generic and avoids tying this option to SSH too much while still keeping the base case simple.
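A minimal sketch of how such an option value could be translated into QEMU's user-mode networking "hostfwd" syntax. The proto:host_addr:host_port:guest_port value format is inferred from the tcp::2222:22 default mentioned above; the parsing function itself is hypothetical, not taken from the merged script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: translate a --host-port-forwards value like
# "tcp::2222:22" (optionally comma-separated) into QEMU hostfwd rules.
build_netdev() {
    local spec=$1 netdev="user,id=net0" fwd
    local proto host_addr host_port guest_port
    IFS=, read -ra fwds <<<"${spec}"
    for fwd in "${fwds[@]}"; do
        IFS=: read -r proto host_addr host_port guest_port <<<"${fwd}"
        # QEMU expects proto:[hostaddr]:hostport-[guestaddr]:guestport
        netdev+=",hostfwd=${proto}:${host_addr}:${host_port}-:${guest_port}"
    done
    printf '%s\n' "${netdev}"
}

build_netdev "tcp::2222:22"
```

The result would be passed to QEMU as `-netdev` with a matching virtio-net device.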

Member Author


Converted to --host-port-forwards with the default as described.


    if [[ ${arch} = aarch64 ]]; then
        qemu_args+=( -machine virt )
        qemu_args+=( -bios /usr/share/edk2/aarch64/QEMU_EFI.silent.fd )
Contributor


Is this provided by all distros? Or is distro specific, and the file is installed in different locations depending on the distro?

Member Author


Totally decided by a distro's packagers. On current Fedora, this is where the edk2-arm package (pulled in by qemu) puts it. On an Ubuntu machine it might be provided by ovmf and only pulled in as a recommended package. For a start, I wanted to go with the fixed location and see whether there's actually any breakage.
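A possible follow-up, sketched here as an assumption rather than what the merged script does: probe a caller-supplied override first, then a few known distro locations, before giving up. The Fedora path is the one from the diff above; the Debian/Ubuntu paths are the usual locations for the qemu-efi-aarch64 firmware package.

```shell
#!/usr/bin/env bash
# Sketch: locate the aarch64 UEFI firmware instead of hard-coding one path.
find_uefi_firmware() {
    local candidates=(
        /usr/share/edk2/aarch64/QEMU_EFI.silent.fd   # Fedora, edk2-arm
        /usr/share/qemu-efi-aarch64/QEMU_EFI.fd      # Debian/Ubuntu, qemu-efi-aarch64
        /usr/share/AAVMF/AAVMF_CODE.fd               # Debian/Ubuntu, AAVMF
    )
    local path
    # Check any caller-supplied overrides before the built-in candidates.
    for path in "$@" "${candidates[@]}"; do
        if [[ -e ${path} ]]; then
            printf '%s\n' "${path}"
            return 0
        fi
    done
    echo "no aarch64 UEFI firmware found; install edk2-arm or qemu-efi-aarch64" >&2
    return 1
}
```

The result would feed straight into the `-bios` argument shown in the diff.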

    --vm-cpus        number of CPUs to spawn for VM (default is ${vm_cpus})
    --force-extract  force recreation of the extracted Bottlerocket image,
                     e.g. to incorporate new --inject-file options
    --inject-file    adds a local file to the private partition of the
Contributor


I really liked this :D! One potential addition to this might be to let the users attach their own NICs. I know that I would use that since I run a bunch of VMs in metal variants to test OS specifics, with an ENI attached to them which allows me to use SSM to connect to the VMs instead of SSH.

Member Author


Passing through a secondary ENI on an EC2 instance? I haven't tried that myself yet, but it's definitely one way to extend this and improve the iteration time!

    --help shows this usage text

    The virtual machine's port 22 (SSH) will be exposed via the local port 2222,
    i.e. if the Bottlerocket admin container has been enabled via user-data, it can
Contributor


For lack of a better place to ask: what happens to the control container? Does it just fail?

Member Author


I don't know! :-) The control container isn't active in the metal variants, and with, for example, aws-dev early-boot-config fails for lack of access to the IMDS (which likely has knock-on effects on the control container). Here's where the extension idea of ENI pass-through comes into play.

Contributor

@bcressey bcressey left a comment


Cool!

(Four inline review threads on tools/start-local-vm were resolved.)
@bcressey (Contributor)

Do we want to lift the architecture restriction on the metal-dev variant? It works well on virtualized Arm and is pretty convenient for local testing since it doesn't have any dependencies on external infrastructure.

+1

Where might be a good place to put a pointer to this script and a usage example? I've been thinking of a QUICKSTART-QEMU.md file, possibly linked from BUILDING.md's "Use your image" section.

That makes sense to me. We also need to document the Fedora / Ubuntu package requirements.

It might be an interesting experiment to see if we could install QEMU in the SDK and invoke it via docker run instead of needing it to be installed by developers. If that worked it could pave the way for Makefile.toml integration (cargo make qemu-run ...).
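A sketch of what @bcressey's docker idea could look like. The SDK image tag, the mount paths, and the assumption that QEMU would be baked into the SDK image are all hypothetical; the command line is only assembled and printed here, never run.

```shell
#!/usr/bin/env bash
# Hypothetical: run QEMU from inside an SDK container so developers don't
# need a local QEMU install.
sdk_image=bottlerocket-sdk-with-qemu   # hypothetical image tag
image_dir=${PWD}/build/images          # assumed build output location

docker_args=(
    run --rm -it
    --device /dev/kvm                    # hand KVM through to the container
    -p 2222:2222                         # publish the guest SSH forward
    -v "${image_dir}:/images:ro"         # expose the built Bottlerocket image
    "${sdk_image}"
    qemu-system-x86_64 -nographic -enable-kvm
    -drive file=/images/bottlerocket.img,format=raw,if=virtio
    -netdev user,id=net0,hostfwd=tcp::2222-:22
    -device virtio-net-pci,netdev=net0
)
printf 'docker %s\n' "${docker_args[*]}"
```

Wrapped in a Makefile.toml task, this would become something like `cargo make qemu-run`.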


@zmrow zmrow left a comment


Neat! 📸

empty-bootconfig.data will soon get company. Create a dedicated
directory for pre-assembled bootconfig initrds and move the file there.

Signed-off-by: Markus Boehme <[email protected]>
tools: Add start-local-vm script

(The full commit message is identical to the PR description above.)

Signed-off-by: Markus Boehme <[email protected]>
@markusboehme (Member, Author)

Ah, rebasing wasn't a great idea. Now the comparison between revisions is tedious at best. The GitHub workflow is new to me; I assume we don't rebase our forks while we have open PRs on them?

@markusboehme (Member, Author)

@bcressey:

It might be an interesting experiment to see if we could install QEMU in the SDK and invoke it via docker run instead of needing it to be installed by developers. If that worked it could pave the way for Makefile.toml integration (cargo make qemu-run ...).

It's a fun idea. I don't see why it shouldn't work. Comes with the benefit of only having one environment to adapt to, too. I'll play with this.

@bcressey (Contributor)

bcressey commented Jul 4, 2022

Ah, rebasing wasn't a great idea. Now the comparison between revisions is tedious at best. The GitHub workflow is new to me; I assume we don't rebase our forks while we have open PRs on them?

If you don't need to rebase, it's generally easier to avoid it.

The convention we've adopted in cases where it's necessary:

  1. rebase the branch with no changes to the review code
  2. force push and add a "rebase, no changes" comment to the PR
  3. make changes to the review code
  4. force push and add a summary comment to the PR

You have to wait 2-3 minutes between the two force pushes, otherwise GitHub will coalesce them into a single timeline note and the force push diff link is useless.

It isn't the end of the world if something goes wrong - your "oops, rebase" note is fine and people can just re-review the full changes.
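The convention above can be replayed with plain git commands. The demo below uses a scratch repository in a temp dir so no real remote is touched; branch names are placeholders, and the two force pushes are only indicated in comments since there is no remote here.

```shell
#!/usr/bin/env bash
# Self-contained demo of the two-step rebase convention described above.
set -euo pipefail
work=$(mktemp -d)
g() { git -C "${work}" -c user.email=dev@example.com -c user.name=dev "$@"; }

g init -q -b develop
g commit -q --allow-empty -m "base"
g checkout -q -b my-feature
g commit -q --allow-empty -m "feature work"
g checkout -q develop
g commit -q --allow-empty -m "unrelated change"   # develop moved ahead

# Step 1: rebase the branch with no changes to the review code.
g checkout -q my-feature
g rebase -q develop
# Step 2: git push --force-with-lease, then comment "rebase, no changes".
# Steps 3-4: make the actual review changes, force push again with a summary.
g log --oneline
```

Using `--force-with-lease` rather than `--force` refuses to overwrite the remote branch if someone else pushed to it in the meantime.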

@markusboehme markusboehme merged commit 6f08bf2 into bottlerocket-os:develop Jul 6, 2022