Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scsi rebase #301

Merged
merged 13 commits into from
Jun 5, 2023
Merged

scsi rebase #301

merged 13 commits into from
Jun 5, 2023

Conversation

Ablu
Copy link
Contributor

@Ablu Ablu commented Mar 7, 2023

Summary of the PR

This is a rebase of #4.

Requirements

Before submitting your PR, please make sure you addressed the following
requirements:

  • All commits in this PR are signed (with git commit -s), and the commit
    message has max 60 characters for the summary and max 75 characters for each
    description line.
  • All added/changed functionality has a corresponding unit/integration
    test.
  • All added/changed public-facing functionality has entries in the "Upcoming
    Release" section of CHANGELOG.md (if no such section exists, please create one).
  • Any newly added unsafe code is properly documented.

@Ablu
Copy link
Contributor Author

Ablu commented Mar 7, 2023

  • All commits in this PR are signed (with git commit -s), and the commit
    message has max 60 characters for the summary and max 75 characters for each
    description line.

Is there documentation of what signing means? I assume DCO, but I find it a bit weird to sign things without that having specified somewhere :)

Copy link
Contributor Author

@Ablu Ablu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some comments for myself

crates/scsi/rustfmt.toml Outdated Show resolved Hide resolved
crates/scsi/src/main.rs Outdated Show resolved Hide resolved
crates/scsi/src/main.rs Outdated Show resolved Hide resolved
crates/scsi/src/main.rs Outdated Show resolved Hide resolved
crates/scsi/src/main.rs Outdated Show resolved Hide resolved
crates/scsi/src/main.rs Outdated Show resolved Hide resolved
crates/scsi/src/main.rs Outdated Show resolved Hide resolved
crates/scsi/src/scsi/emulation/block_device.rs Outdated Show resolved Hide resolved
crates/scsi/src/scsi/emulation/block_device.rs Outdated Show resolved Hide resolved
crates/scsi/src/virtio.rs Outdated Show resolved Hide resolved
@Ablu
Copy link
Contributor Author

Ablu commented Mar 8, 2023

Rough changelog:

  • squashed things, splitted into (hopefully sensible) chunks
  • moved tests under emulation (thats what they were testing)
  • implemented idx
  • fixed most previous comments

TODOs:

  • fix author ship... Somewhere during rebase I lost the authorship
  • review pubs. How do we want this crate to be consumed? @stsquad: Do we want this to just be a daemon or shall the interfaces be exported as lib (assumed the latter for now)?
  • throughout self-review

@Ablu
Copy link
Contributor Author

Ablu commented Mar 8, 2023

@Gaelan: Did I get the authorship right? Do we need to credit other people/projects from which things were adopted?

@Gaelan
Copy link
Contributor

Gaelan commented Mar 8, 2023

Many thanks for picking this back up!

Did I get the authorship right? Do we need to credit other people/projects from which things were adopted?

All the code here was written by me (as best I recall; I'm generally pretty good about noting if it isn't). The work was done as part of GSoC, but I don't believe any copyright assignment takes place there.

EDIT: I do, at various points, quote the SCSI spec, which is an ISO standard that isn't (officially) freely available. I'm assuming short excerpts fall under fair use and it should be fine?

Incidentally, I had a scsi-write branch that may be of interest; there was sufficient write support to boot and run Debian, though I can't vouch for its correctness in the face of writeback failures - IIRC there's a tricky mismatch here between the level of detail SCSI expects and what the POSIX file APIs give us. I think I might just fsync everything, which is probably slow-but-good-enough?

@Gaelan
Copy link
Contributor

Gaelan commented Mar 8, 2023

Please let me know if you have any questions, by the way! This was a couple years ago now, so no promises, but happy to try and remember what I was thinking.

@Ablu
Copy link
Contributor Author

Ablu commented Mar 9, 2023

All the code here was written by me (as best I recall; I'm generally pretty good about noting if it isn't). The work was done as part of GSoC, but I don't believe any copyright assignment takes place there.

Yeah, you had sign-offs on the relevant commits, mostly tried to make sure that the way I split up stuff is ok with you (since I put your name as main author). That said, we might be missing SPDX-License-Identifiers (at least current code has it, do not find anything in https://github.com/rust-vmm/community/blob/main/CONTRIBUTING.md). I would add: Apache-2.0 or BSD-3-Clause to all files, ok?

Incidentally, I had a scsi-write branch that may be of interest; there was sufficient write support to boot and run Debian, though I can't vouch for its correctness in the face of writeback failures - IIRC there's a tricky mismatch here between the level of detail SCSI expects and what the POSIX file APIs give us. I think I might just fsync everything, which is probably slow-but-good-enough?

I guess for now, I would aim at getting the readonly state merged. So I have not spent any thoughts on writes. Will check it out when I do so!

Please let me know if you have any questions, by the way! This was a couple years ago now, so no promises, but happy to try and remember what I was thinking.

Sure. Right now I focused on getting stuff work properly. Will let you know when I stumble over something. Feel free to provide review on the stuff that I changed!

@Ablu Ablu force-pushed the scsi-rebase branch 2 times, most recently from 4097016 to f0eab05 Compare March 13, 2023 08:21
@Ablu Ablu marked this pull request as ready for review March 13, 2023 08:26
@Ablu
Copy link
Contributor Author

Ablu commented Mar 13, 2023

Changelog (in order of significance):

  • Split the emulation interface into handling LUN-independent and LUN-specific messages
    • This allows the LUN-specific part to be oblivious to the target
    • Allowed for some de-duplication too
  • Create own trait to decouple from io::File
  • De-boilerplated the daemon by moving parsing to virtio.rs
    • Still requires some boilerplate to copy between virtio and scsi datastructures
    • Could be further simplified if we are fine with scsi-virtio-specific stuff returning datastructures of src/scsi/ (currently they were kept separated on purpose)
  • Removed no longer needed suppressing of linting (either due to simplified code or newer linter version)
  • Random cleanups here and there

@Ablu
Copy link
Contributor Author

Ablu commented Mar 13, 2023

CI seems to be broken by dirty permissions from left-overs?

@Ablu
Copy link
Contributor Author

Ablu commented Mar 13, 2023

Open clarification items:

  • SPDX identifiers?
  • abstraction-levels: What should be public? Which parts do we want to allow other people to use?
    • "Just the daemon?" vs "daemon, virtio-scsi abstraction, target interface, inner-emulation interfaces (LogicalUnit)"?
  • How to test the daemon?
  • Meaning of signoff is not documented.
  • No CHANGELOG available.

@Ablu Ablu force-pushed the scsi-rebase branch 3 times, most recently from c8577be to 7c45dd6 Compare March 13, 2023 09:50
#[derive(Debug)]
pub enum Command {
LunIndependentCommand(LunIndependentCommand),
LunSpecificCommand(LunSpecificCommand),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth noting that several of the "LUN-specific" commands here are used for multiple LUN types, not just block devices. (TEST UNIT READY, REQUEST SENSE, REPORT SUPPORTED OPERATION CODES, and INQUIRY are required for all LUN types; commands like READ CAPACITY and READ are used by several storage-like devices.)

So if we ever wanted to do more than just block devices, we'd need to do some more thinking about how we want to handle that. Unfortunately we can't just parse a command without knowing the intended device type, as sometimes operation codes are reused.

Anyway, cross that bridge when we get to it - just potentially useful background.

@Ablu
Copy link
Contributor Author

Ablu commented Mar 13, 2023

abstraction-levels: What should be public? Which parts do we want to allow other people to use?

  • "Just the daemon?" vs "daemon, virtio-scsi abstraction, target interface, inner-emulation interfaces (LogicalUnit)"?

Currently, we have multiple layers of abstraction in here:

  • the daemon that allows connection to the socket, parses things using the virtio module and forwards things to the EmulatedTarget after registering a BlockDevice for each image.
  • virtio that abstracts some of the virtio-scsi related parsing and provides some convenience features
  • scsi::Target that abstracts over the actual backend (only consists of execute_command)
  • scsi::emulation::target::EmulationTarget that implements scsi::Target by maintaining a list of LogicalUnits, handling REPORT_LUNS and forwarding other commands to the LogicalUnits.
  • scsi::emulation::block_device::BlockDevice that emulates an actual drive against a BlockDeviceBackend
  • scsi::emulation::block_device::FileBackend that implements BlockDeviceBackend

Comparing that against https://github.com/rust-vmm/vm-virtio/tree/main/crates/devices/virtio-blk/src, my feeling is that there the virtio and high-level device-specific abstractions are sharing the same data structures.

With that, there are a series of choices that could be made:

  1. About virtio parsing + scsi::Target:
  2. About the emulation-specific abstraction
    • Exposing these traits probably only makes sense if we also provide enough other support for the virtio layers in order to allow people to write their own daemon.
    • If we do not do that, we should probably just mark that API as crate-private.

I got no strong feelings about any of these options and are open to opinions.

Copy link
Contributor Author

@Ablu Ablu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @vireshk.

Fixed/reacted to all comments.

While rebasing I noticed missing tests. While adding them I went into a rabbit hole of trying to fix some API weirdness around the <W, R> generics. I ended up dropping them in favor of dynamic dispatch. That simplifies the code quite a bit. And I think spending too much time thinking through the monomorphization without any benchmarks in place is not worth scarifying code clarity.

Changelog

  • Dropped scsi-specific .lock file
  • Match the gpiod Cargo.toml fields
  • Fixed invocation in docs
  • Merged backend.rs into main.rs again
  • Switched from generics over Read/Write to dynamic dispatch
  • Dropped data_in/data_out from Request and turned them into ordinary function arguments
  • Minor API cleanup (some pub/pub(crate), &mut changes)
  • Made newlines more consistent
  • Dropped (for now) unused unmap field
  • Removed some left-over comment
  • Added tests for Write(10), WriteSame(16)
  • Dropped test_all* tests that would require annoying updates every time a new command is added
  • Reordered usings (cargo fmt -- --config group_imports=StdExternalCrate)

@@ -0,0 +1,2 @@
results/
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test results will be downloaded to results/, but we do not want to have those in SCM. The .containerignore prevents sending the (potentially large) results/ and test-data/ folders when building the container (though I am not sure whether this is actually still as relevant with root-less containers, probably no copy happens under the hood anymore...) Especially test-data/ might become "large-ish" with the Fedora image and test blockdevice.

crates/scsi/Cargo.lock Outdated Show resolved Hide resolved
crates/scsi/src/main.rs Outdated Show resolved Hide resolved
crates/scsi/src/backend.rs Outdated Show resolved Hide resolved
crates/scsi/src/backend.rs Outdated Show resolved Hide resolved
crates/scsi/src/backend.rs Outdated Show resolved Hide resolved
crates/scsi/README.md Outdated Show resolved Hide resolved
crates/scsi/Cargo.toml Outdated Show resolved Hide resolved
crates/scsi/README.md Outdated Show resolved Hide resolved
crates/scsi/README.md Outdated Show resolved Hide resolved
info!("vhost-user connection closed with partial message. If the VM is shutting down, this is expected behavior; otherwise, it might be a bug.");
}
Err(e) => {
warn!("Error running daemon: {:?}", e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a warning or should it be log::error! instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is copied from the other daemons, that also log warn!() here. I agree that error!() is probably better. But I think this may be better fixed once we get rid of the boilerplate code (https://linaro.atlassian.net/browse/ORKO-32).

crates/scsi/src/scsi/emulation/block_device.rs Outdated Show resolved Hide resolved
// TODO: Ideally, this would be a read_vectored directly into guest
// address space. Instead, we have an allocation and several copies.

let mut ret = vec![0; (blocks * u64::from(self.backend.block_size())) as usize];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Early optimization) this buffer (and the one in write_blocks()) looks like it's "one use" only. It could be an internal field of BlockDevice that gets cleared every time it's reused (which is constant time) to avoid reallocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code certainly is not very optimal. However, when moving to a single buffer shared across read, we will keep the memory of a single large read allocated until we tear down the device. I think I would rather spend the time on implementing a proper vectored read (that avoids doing the extra copies altogether). Before investing more time around the Read/Write DescriptorChains implementations I would try to think about a way to combine that with rust-vmm/vm-virtio#33.

crates/scsi/src/scsi/emulation/block_device.rs Outdated Show resolved Hide resolved
crates/scsi/src/scsi/emulation/command.rs Outdated Show resolved Hide resolved
Copy link
Contributor Author

@Ablu Ablu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@epilys: Thanks!

Changelog

  • Introduced new-types for ByteOffset, BlockOffset, BlockSize

// TODO: Ideally, this would be a read_vectored directly into guest
// address space. Instead, we have an allocation and several copies.

let mut ret = vec![0; (blocks * u64::from(self.backend.block_size())) as usize];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code certainly is not very optimal. However, when moving to a single buffer shared across read, we will keep the memory of a single large read allocated until we tear down the device. I think I would rather spend the time on implementing a proper vectored read (that avoids doing the extra copies altogether). Before investing more time around the Read/Write DescriptorChains implementations I would try to think about a way to combine that with rust-vmm/vm-virtio#33.

info!("vhost-user connection closed with partial message. If the VM is shutting down, this is expected behavior; otherwise, it might be a bug.");
}
Err(e) => {
warn!("Error running daemon: {:?}", e);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is copied from the other daemons, that also log warn!() here. I agree that error!() is probably better. But I think this may be better fixed once we get rid of the boilerplate code (https://linaro.atlassian.net/browse/ORKO-32).

crates/scsi/src/scsi/emulation/block_device.rs Outdated Show resolved Hide resolved
@Ablu
Copy link
Contributor Author

Ablu commented May 17, 2023

CI reports dropped coverage. But that seems to happen because the CI reports slightly different percentages compared to what my local toolchain reports. I will adjust the number to make CI happy once this is reviewed. Overall the coverage improved, the failure is only happening because I bumped the percentage to what the script tells me locally.

@Ablu Ablu force-pushed the scsi-rebase branch 2 times, most recently from a862ab8 to e7ac0c2 Compare June 2, 2023 07:52
@Ablu
Copy link
Contributor Author

Ablu commented Jun 2, 2023

Changelog

  • Rebased on top of main
  • Bumped dependencies to match other crates
  • Dropped unneeded dependency on queue test-utils

Gaelan and others added 13 commits June 5, 2023 08:56
Co-developed-by: Erik Schilling <[email protected]>
Signed-off-by: Erik Schilling <[email protected]>
Signed-off-by: Gaelan Steele <[email protected]>
This defines the basic interface that any scsi device will have to
implement (along with some sensing constants that may be useful to share).

The vast majority of this work was done by Gaelan Steele as part of a
GSoC project [1][2].

[1] rust-vmm#4
[2] https://gist.github.com/Gaelan/febec4e4606e1320026a0924c3bf74d0

Co-developed-by: Erik Schilling <[email protected]>
Signed-off-by: Erik Schilling <[email protected]>
Signed-off-by: Gaelan Steele <[email protected]>
This implements the previously defined interface by emulating the commands
against a file-backed block device.

The vast majority of this work was done by Gaelan Steele as part of a
GSoC project [1][2].

[1] rust-vmm#4
[2] https://gist.github.com/Gaelan/febec4e4606e1320026a0924c3bf74d0

Co-developed-by: Erik Schilling <[email protected]>
Signed-off-by: Erik Schilling <[email protected]>
Signed-off-by: Gaelan Steele <[email protected]>
The vast majority of this work was done by Gaelan Steele as part of a
GSoC project [1][2].

[1] rust-vmm#4
[2] https://gist.github.com/Gaelan/febec4e4606e1320026a0924c3bf74d0

Co-developed-by: Erik Schilling <[email protected]>
Signed-off-by: Erik Schilling <[email protected]>
Signed-off-by: Gaelan Steele <[email protected]>
This adds the virtio-specific parts that use the previously formed
interfaces and scsi emulation in order to build a daemon that offers files
from the host system as drives to the guest.

The vast majority of this work was done by Gaelan Steele as part of a
GSoC project [1][2].

[1] rust-vmm#4
[2] https://gist.github.com/Gaelan/febec4e4606e1320026a0924c3bf74d0

Co-developed-by: Erik Schilling <[email protected]>
Signed-off-by: Erik Schilling <[email protected]>
Signed-off-by: Gaelan Steele <[email protected]>
Co-developed-by: Erik Schilling <[email protected]>
Signed-off-by: Erik Schilling <[email protected]>
Signed-off-by: Gaelan Steele <[email protected]>
The config that we send is based on the current QEMU defaults
(as of 60ca584b8af0de525656f959991a440f8c191f12).

This allows testing using Alex Bennee's vhost-user-generic series and
will be required for hypervisors that do not come with the stubs that
QEMU has (eg: Xen).

Signed-off-by: Erik Schilling <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
This adds write support. Being very similar to READ(10) in structure,
much of the code is very similar.

Signed-off-by: Erik Schilling <[email protected]>
WRITE SAME allows writing a block for a repeated number of times.

Mostly, it can also be used to deallocate parts of the block device
(the fstrim functionality uses this). We do not support that aspect
yet. Instead, we will just stupidly repeat the pattern as many times
as we are told.

A future, smarter implementation could just punch a hole into the
backend instead of filling it with zeros.

Signed-off-by: Erik Schilling <[email protected]>
While the command also allows syncing individual regions of a LUN, we
do not support that here and simply sync the entire file.

Signed-off-by: Erik Schilling <[email protected]>
This provides some tooling for running blktests. The README.md contains
documentation about the architecture.

I am seeing some race-conditions that sometimes lead to boot
freezes [1], so this is not really ready for automatic evaluation during
a CI pipeline.

[1] https://linaro.atlassian.net/browse/ORKO-37

Signed-off-by: Erik Schilling <[email protected]>
Link: https://linaro.atlassian.net/browse/ORKO-17
Polishing this up for inclusion is currently not high on the priority
list, but we can at least link it.

Signed-off-by: Erik Schilling <[email protected]>
@Ablu
Copy link
Contributor Author

Ablu commented Jun 5, 2023

Changelog

  • Hopefully fixed coverage

@vireshk
Copy link
Collaborator

vireshk commented Jun 5, 2023

This is just too much (and complex) code to review. I have looked at it from the perspective of vhost-device stuff and that it is implemented in a way similar to other crates in the workspace.

Approving it now.

@stsquad @mathieupoirier @stefano-garzarella

@vireshk vireshk enabled auto-merge (rebase) June 5, 2023 09:14
@vireshk vireshk merged commit f50e714 into rust-vmm:main Jun 5, 2023
@stsquad
Copy link
Collaborator

stsquad commented Jun 5, 2023

This is just too much (and complex) code to review. I have looked at it from the perspective of vhost-device stuff and that it is implemented in a way similar to other crates in the workspace.

Approving it now.

@stsquad @mathieupoirier @stefano-garzarella

I think its had some good cycles of review but there is a trade off to a growing stack of commits vs incremental updates. I'm happy with merging this so follow on enhancements aren't forever blocked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants