Conversation

@jmpesp (Contributor) commented Oct 23, 2025

In the same way that a higher level type was added to SiloUser and SiloGroup, add a type to Disks. Currently this is only Crucible, but that will change imminently to include local storage.

The main driver for this change was for different types of disks to occupy the same PCI BDF space. Having a higher level "Disk" that includes the runtime state for a disk (with information like what instance it's attached to, what slot it occupies) be separate from where the disk's blocks are coming from achieves this without any complicated database or diesel magic.

Call sites that used the Disk db model now take either a Disk datastore type, or directly take a CrucibleDisk (for functions that should only be performed on Crucible disks). Several sagas have changed to only accept CrucibleDisk in the parameters as well.

In this commit there's also a small change to how the instance spec is computed with respect to the storage that provides a new option for instances to have file backed storage. This is currently unused but the framework for creating different components depending on this new Disk type is there.

Other stuff in this commit:

  • fix a bug where `project_create_disk` would insert a disk into the database, and _then_ perform some checks. If this failed in the context of a saga it would leave behind the inserted disk.

  • delete `disk_refetch`, fulfilling a `TODO-cleanup`. Similarly, `disk_set_runtime` is deleted. Both of these will need to be revisited for hotplug, but the whole redesign of how sled agent and Nexus work with VMMs obviates the old code.

@morlandi7 requested a review from hawkw on October 27, 2025 at 18:45.
```rust
    device_path,
}

pub fn slot(&self) -> Option<u8> {
    self.slot.map(|x| x.into())
```
Member:

style nit, considered unimportant: could be

Suggested change:

```rust
-self.slot.map(|x| x.into())
+self.slot.map(Into::into)
```
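As a standalone illustration (using `u16`/`u32` stand-ins, not the PR's actual types), the two forms are equivalent; `map(Into::into)` is simply the point-free spelling:

```rust
// Standalone illustration (u16 -> u32 stand-ins, not the PR's types):
// `map(Into::into)` passes the trait method directly instead of
// wrapping it in a closure.
fn closure_form(slot: Option<u16>) -> Option<u32> {
    slot.map(|x| x.into())
}

fn path_form(slot: Option<u16>) -> Option<u32> {
    slot.map(Into::into)
}

fn main() {
    assert_eq!(closure_form(Some(3)), path_form(Some(3)));
    assert_eq!(closure_form(None), path_form(None));
    println!("both forms agree");
}
```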

Contributor Author:

done in f4425c6

```rust
    Crucible => b"crucible"
);

/// A Disk, where how the blocks are stored depends on the disk_type.
```
Member:

Maybe it would be worthwhile saying something here (or on the disk_type field?) that data specific to the particular disk type is stored in another table, depending on the value of disk_type. Maybe we also ought to explicitly list those tables here?

Contributor Author:

Good idea, done in 227268d

Comment on lines +68 to +70
```rust
pub fn pantry_address(&self) -> Option<SocketAddrV6> {
    self.pantry_address.as_ref().map(|x| x.parse().unwrap())
}
```
Member:

I realize that this was here before, and we haven't changed this code, but, is the rationale for storing this as a string rather than using Postgres' inet type plus a port is that we want to enforce that this is only ever an IPv6 address?

Contributor Author:

I believe so yeah, it's been a while.
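For context, a standalone sketch (with a made-up address) of the parse the accessor performs: `SocketAddrV6`'s `FromStr` only accepts `[v6-addr]:port` strings, so the text representation itself enforces "IPv6 only", and the `unwrap()` is safe as long as writers only ever store IPv6 text.

```rust
// Sketch of what `pantry_address()` does with the stored text column.
// The address below is made up for illustration.
use std::net::SocketAddrV6;

fn main() {
    let stored = "[fd00:1122:3344:101::4]:17000";
    let addr: SocketAddrV6 = stored.parse().unwrap();
    assert_eq!(addr.port(), 17000);

    // An IPv4 socket address fails to parse as SocketAddrV6, so a
    // stray v4 value would panic at the unwrap() rather than slip by.
    assert!("127.0.0.1:17000".parse::<SocketAddrV6>().is_err());
    println!("parsed {addr}");
}
```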

Comment on lines +61 to +160
```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum Disk {
    Crucible(CrucibleDisk),
}

impl Disk {
    pub fn model(&self) -> &model::Disk {
        match &self {
            Disk::Crucible(disk) => disk.model(),
        }
    }

    pub fn id(&self) -> Uuid {
        self.model().id()
    }

    pub fn name(&self) -> &api::external::Name {
        self.model().name()
    }

    pub fn time_deleted(&self) -> Option<DateTime<Utc>> {
        self.model().time_deleted()
    }

    pub fn project_id(&self) -> Uuid {
        self.model().project_id
    }

    pub fn runtime(&self) -> DiskRuntimeState {
        self.model().runtime()
    }

    pub fn state(&self) -> DiskState {
        self.model().state()
    }

    pub fn size(&self) -> model::ByteCount {
        self.model().size
    }

    pub fn slot(&self) -> Option<u8> {
        self.model().slot()
    }

    pub fn block_size(&self) -> model::BlockSize {
        self.model().block_size
    }
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CrucibleDisk {
    pub disk: model::Disk,
    pub disk_type_crucible: DiskTypeCrucible,
}

impl CrucibleDisk {
    pub fn model(&self) -> &model::Disk {
        &self.disk
    }

    pub fn id(&self) -> Uuid {
        self.model().id()
    }

    pub fn name(&self) -> &api::external::Name {
        self.model().name()
    }

    pub fn time_deleted(&self) -> Option<DateTime<Utc>> {
        self.model().time_deleted()
    }

    pub fn project_id(&self) -> Uuid {
        self.model().project_id
    }

    pub fn runtime(&self) -> DiskRuntimeState {
        self.model().runtime()
    }

    pub fn state(&self) -> DiskState {
        self.model().state()
    }

    pub fn size(&self) -> model::ByteCount {
        self.model().size
    }

    pub fn slot(&self) -> Option<u8> {
        self.model().slot()
    }

    pub fn volume_id(&self) -> VolumeUuid {
        self.disk_type_crucible.volume_id()
    }

    pub fn pantry_address(&self) -> Option<SocketAddrV6> {
        self.disk_type_crucible.pantry_address()
    }
}
```
Member:

Hmm, this representation is interesting. It seems like storing the model within the types that represent different disk types and providing an accessor for the shared model results in a certain amount of boilerplate that has to be duplicated into every new disk type.

Is there a reason we've arrived at this representation rather than one where the shared disk model is present on the type that represents every kind of disk, like:

```rust
pub struct Disk {
    model: model::Disk,
    type_specific: DiskType,
}

enum DiskType {
    Crucible(model::DiskTypeCrucible),
}
```

Or, is this intended to allow us to always have the cross-model state accessible regardless of the disk type?

If we need that, might also consider a representation that's like:

```rust
pub struct Disk<T> {
    pub model: model::Disk,
    pub type_specific: T,
}
```

so that the shared code need not be repeated every time?

For what it's worth, I'm certainly not trying to argue that the way you've done it here is wrong, just curious about what led us to settle on this way of representing it.
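For illustration, a compilable sketch of the generic representation described above, using stand-in model types (the fields here are illustrative, not the real nexus db-model schema):

```rust
// Stand-in model types for illustration only.
struct DiskModel {
    id: u64,
    size: u64,
}

struct DiskTypeCrucible {
    volume_id: u64,
}

struct Disk<T> {
    model: DiskModel,
    type_specific: T,
}

// Accessors over the shared model are written once, for every T.
impl<T> Disk<T> {
    fn id(&self) -> u64 {
        self.model.id
    }
    fn size(&self) -> u64 {
        self.model.size
    }
}

// Type-specific accessors exist only when T is the matching type.
impl Disk<DiskTypeCrucible> {
    fn volume_id(&self) -> u64 {
        self.type_specific.volume_id
    }
}

fn main() {
    let d = Disk {
        model: DiskModel { id: 1, size: 4096 },
        type_specific: DiskTypeCrucible { volume_id: 7 },
    };
    assert_eq!((d.id(), d.size(), d.volume_id()), (1, 4096, 7));
    println!("shared accessors defined once; volume_id only on Crucible");
}
```

This still lets functions accept only `Disk<DiskTypeCrucible>` where Crucible is required, at the cost of a generic parameter on every signature that handles any disk.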

Contributor Author:

It shook out this way because I wanted not to have to deal with any Options (beyond what the actual db model fields have), have things like volume_id and pantry_address only be defined for a CrucibleDisk, and have the ability for functions to accept the specific type of CrucibleDisk instead of Disk. There is a regrettable amount of copy and paste of that boilerplate though.

```rust
    Ok(disk)
}

/// Return all the Crucible Disks matching a list of volume ids. Currently
```
Member:

nit:

Suggested change:

```rust
-/// Return all the Crucible Disks matching a list of volume ids. Currently
+/// Return all the Crucible Disks matching a list of volume IDs. Currently
```

though I presume this was already there?

```rust
if self.slot_usage.contains(&slot) {
    return Err(Error::internal_error(&format!(
        "disk PCI slot {slot} used more than once",
```
Member:

nit: should this mention which disk we were trying to attach in that slot?
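A minimal sketch of what the nit suggests; `disk_name` and the helper are hypothetical, the real message is built inline in the slot-allocation code:

```rust
// Hypothetical helper: include the disk's name in the duplicate-slot
// error so the failure is actionable. Not the PR's actual code.
fn duplicate_slot_error(disk_name: &str, slot: u8) -> String {
    format!("disk {disk_name}: PCI slot {slot} used more than once")
}

fn main() {
    let msg = duplicate_slot_error("db-data", 3);
    assert!(msg.contains("db-data"));
    println!("{msg}");
}
```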

Comment on lines +505 to +506
```rust
// XXX here is where we would restrict the type of disk used for
// a boot disk. should we?
```
Member:

using a file-backed disk as the boot device would mean that the instance can never be restarted anywhere else, and is perma-wasted if the physical disk that boot device file was on goes away...

Contributor Author:

Yeah, using any local storage restricts the VMM to only ever appearing on one sled. I'm inclined to let users select local storage as a boot volume, but for it to work they would need to:

  • boot first from a Crucible volume made from an image or snapshot
  • prepare the local storage disk they want to use by installing an OS and/or formatting it however they want
  • then stop the instance, detach the Crucible volume, and start the instance again, selecting the local disk as the boot disk

It's not impossible to do, just a bit annoying. Maybe possible to automate with Packer? Coming back to this after a few days, I can't come up with a reason for restricting boot disk type.

Comment on lines +1477 to +1479
```sql
CREATE TYPE IF NOT EXISTS omicron.public.disk_type AS ENUM (
    'crucible'
);
```
Member:

it would be nice to have a comment here saying that this indicates which other table to query in order to look up disk-type-specific information...

Contributor Author:

Would the new comment for disk_type in nexus/db-model/src/disk.rs suffice? :)

Member:

It occurs to me that we might be able to reduce some of the boilerplate in code that queries the database for specific disk types if we made a SQL view for each disk type that's defined like

```sql
CREATE VIEW IF NOT EXISTS
    omicron.public.crucible_disk
AS SELECT
    -- ...
FROM
    omicron.public.disk
INNER JOIN
    omicron.public.disk_type_crucible
ON
    disk.id = disk_type_crucible.disk_id
WHERE
    disk.type = 'crucible'
```

or whatever?

Contributor Author:


hah, I did this exact thing first! When asking about it, we decided it was better that Nexus be aware of the different underlying tables rather than trying to mask that here. I also couldn't insert into a view so the disk slot allocation query didn't work.

```sql
 * table?
 */
/* Runtime state */
-- disk_state omicron.public.DiskState NOT NULL, /* TODO see above */
```
Member:

nit: this line now references "TODO see above" and there is no longer an "above" comment. should we just delete this?

Contributor Author:

I think the above comment this is referencing starts on line 1451?
