Conversation

@iximeow iximeow commented Nov 11, 2025

This follows on turning the crank to max vCPUs in Helios and Propolis; if the hardware has so many vCPUs available, what's to stop someone from allocating them all for a single VM?

Similar to creating a VM requiring more memory than is available, one can create (or resize) a VM to a size much larger than any hardware has, or than is available at runtime. Attempting to run such an instance will fail with an error because the instance can't be placed.
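The failure mode can be sketched as a first-fit placement check. Everything below (`Sled`, `PlacementError`, `place_instance`) is a hypothetical illustration, not the actual Nexus placement logic:

```rust
// Hypothetical sketch of a placement-time capacity check: a request
// for more vCPUs than any sled has free fails before the instance
// ever runs. Names and types are illustrative, not Omicron's.
#[derive(Debug, PartialEq)]
enum PlacementError {
    NoCapacity { requested_vcpus: u32 },
}

struct Sled {
    free_vcpus: u32,
}

fn place_instance(sleds: &[Sled], requested_vcpus: u32) -> Result<usize, PlacementError> {
    // First-fit: index of the first sled with enough free vCPUs.
    sleds
        .iter()
        .position(|s| s.free_vcpus >= requested_vcpus)
        .ok_or(PlacementError::NoCapacity { requested_vcpus })
}

fn main() {
    let sleds = [Sled { free_vcpus: 128 }, Sled { free_vcpus: 64 }];
    // A modest request lands on the first sled with room...
    assert_eq!(place_instance(&sleds, 64), Ok(0));
    // ...but a 254-vCPU request can't be placed anywhere here.
    assert!(place_instance(&sleds, 254).is_err());
}
```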

One could imagine a future operator control to limit the maximum VM size for a silo; larger VMs are more difficult to migrate and can be more difficult to place. Without something like "anti-fragmentation" to group smaller VMs together, it's quite possible for a sled to have 255 CPUs, 2 vCPUs allocated to one small VM, 253 CPUs not spoken for, and still be unable to fit a 254-vCPU VM.
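The fragmentation arithmetic above, spelled out (the helper is hypothetical; the numbers are the example's):

```rust
// One 2-vCPU VM on a 255-CPU sled leaves 253 CPUs unallocated --
// one short of a 254-vCPU request, even though almost the whole
// sled is idle.
fn unallocated_cpus(sled_total: u32, allocated: u32) -> u32 {
    sled_total.saturating_sub(allocated)
}

fn main() {
    let free = unallocated_cpus(255, 2);
    println!("{free} CPUs unallocated");
    // 253 < 254: the large VM does not fit on this sled.
    assert!(free < 254);
}
```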

Further, 254 busy vCPUs leave zero or one CPUs available for Propolis, driving emulated hardware, processing I/O, co-located Crucible, sled-agent, other services, etc. There is no mechanism to earmark CPUs for control plane and I/O purposes, so this isn't any worse than the status quo. But when such a mechanism comes to exist, we'll need to gracefully tolerate the prior existence of sled-sized-or-larger VMs.
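One hypothetical shape for such an earmarking mechanism: reserve some host CPUs for Propolis, sled-agent, and I/O, offer only the remainder to guests, and flag (rather than reject) VMs created before the reservation existed. The reserved count and function names below are assumptions, not a real setting:

```rust
// Assumed reservation: 2 host CPUs held back for control plane and
// I/O work. This value is illustrative, not a real configuration.
const RESERVED_HOST_CPUS: u32 = 2;

// Guests may only be offered what's left after the reservation.
fn guest_vcpu_budget(total_host_cpus: u32) -> u32 {
    total_host_cpus.saturating_sub(RESERVED_HOST_CPUS)
}

// A pre-existing VM larger than the budget is flagged, not rejected:
// it predates the mechanism and must keep working.
fn exceeds_budget(vm_vcpus: u32, total_host_cpus: u32) -> bool {
    vm_vcpus > guest_vcpu_budget(total_host_cpus)
}

fn main() {
    // On a 255-CPU sled, 253 vCPUs remain for guests...
    assert_eq!(guest_vcpu_budget(255), 253);
    // ...so a grandfathered 254-vCPU VM exceeds the new budget.
    assert!(exceeds_budget(254, 255));
    assert!(!exceeds_budget(128, 255));
}
```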

Note that Helios is fine with being asked to oversubscribe hardware threads to vCPUs, and that's how I'd tested that a 254-vCPU VM works reasonably (on a 32-thread CPU). test_cannot_provision_instance_beyond_cpu_capacity is the demonstration that the control plane isn't willing to oversubscribe hardware in practice.

(Dan pointed out to me a bit ago that we could allow 255 vCPUs - my choice of 254 on the Helios side was really a fencepost error on my part. But I'd like to disallow odd vCPU counts in the first place (related to Propolis#940), so 254 is fine.)
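A validation following that rule might look like the sketch below; the limit of 254 and the even-count restriction come from the discussion above, while the function itself is hypothetical:

```rust
// Hypothetical vCPU-count validation: at most 254 vCPUs, and odd
// counts rejected (per the Propolis#940 discussion). Not real
// Omicron code.
const MAX_VCPUS: u32 = 254;

fn validate_vcpu_count(ncpus: u32) -> Result<(), String> {
    if ncpus == 0 || ncpus > MAX_VCPUS {
        return Err(format!("vCPU count must be between 2 and {MAX_VCPUS}"));
    }
    if ncpus % 2 != 0 {
        return Err("odd vCPU counts are not allowed".to_string());
    }
    Ok(())
}

fn main() {
    assert!(validate_vcpu_count(254).is_ok());
    assert!(validate_vcpu_count(255).is_err()); // over the cap
    assert!(validate_vcpu_count(3).is_err()); // odd count
    assert!(validate_vcpu_count(0).is_err());
}
```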

@iximeow iximeow added the virtualization (Propolis Integration & VM Management) and release notes (reminder to include this in the release notes) labels Nov 11, 2025
@iximeow iximeow merged commit 2f7f807 into main Nov 13, 2025
16 checks passed
@iximeow iximeow deleted the ixi/max_vcpu_crank_turn branch November 13, 2025 21:55