Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

event ABI issues #1

Open
Wenzel opened this issue Jan 18, 2017 · 1 comment
Open

event ABI issues #1

Wenzel opened this issue Jan 18, 2017 · 1 comment

Comments

@Wenzel
Copy link
Member

Wenzel commented Jan 18, 2017

Copy pasting this issue which @tklengyel opened in Wenzel/kvm-vmi :

The current event ABI looks as follows:

struct event {
    bool present;
   enum syscall_direction direction;
   enum syscall_type type;
};

There are multiple issues with this design.

  • Using enum is not OK when making an ABI as different compilers may disagree on what the underlying storage for the enum should be. For example the kernel and the userspace can (and most likely will) be compiled with different compilers. Same principle stands for using storage types such as int, long and bool. It is best to use storage types that explicitly specify their widths, such as uint32_t.

  • In similar vein, structure padding has to be explicitly carried out. Different compilers may pad structures differently, so to avoid any misalignment, always pad structure members to 64-bit width.

  • Adding the event ABI version field to the structure will allow future expansion of the structure while providing downstream clients a direct way to verify that their tool can cope with the running ABI version.

  • This struct is being filled in by the kernel module for each vCPU. This design requires the userside to query the kernel for each vCPU to see if there are any pending events. On VMs with many vCPUs this will require many cycles even if there are no pending events on most of the vCPUs. On Xen the way we handle this is by using a queue to push events into on the hypervisor side and have the userside pop them. The nfqueue library already implements something akin to this - transferring information to userspace to be processed via a queue- so I would look there for inspiration.

  • Getting registers is performed in a separate system-call currently. However, the event struct could very well just carry that information by default. We already do this with Xen. See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/include/public/vm_event.h;h=b7487a12f3920d30b948ce9a996fb3eb9e8773da;hb=refs/heads/master#l160

After some discussion with Jonas, what will make most sense actually is simply expanding kvm_run to provide a VMI exit-code for events we trigger for monitoring.

@Wenzel
Copy link
Member Author

Wenzel commented Jan 18, 2017

I agree with all the issues that you listed above !

Adding the registers directly in the event will improve performance, since we need the registers info anyway.

An yes, expanding kvm_run will make the design cleaner, and move the code to the QEMU side.
We already discussed the design of Nitro with @GabrielL and we had the same conclusion.

This is a draft of how this new design could be implemented:

KVM

  • open a new ioctl to update the bitmap exit fields, to force VM_EXIT when a GP or a UD is triggered
  • open a new ioctl to queue an exception in kvm
  • open a new ioctl to query the registers
  • add a new exit reason KVM_EXIT_VMI

QEMU

  • add the set_msr command, to call the KVM_SET_MSRS ioctl, and set the traps
  • add the set_debug_exception command, to call our ioctl

How it should work:

  1. pause the VM using pause command
  2. set traps on MSRs using our set_msr command
  3. update the bitmap exit fields using our set_debug_exception command
  4. continue the VM execution using cont command

At this point, the traps are set.

When an exception is triggered, and we go back to QEMU exit handler:

case KVM_EXIT_VMI:
if (exception == Undefined_Opcode)
    if (nitro_enabled == True && cur_instruction == syscall/sysret)
        # this is a nitro event, a syscall has been trapped
        query the registers
        report the event
    else
        # this is a natural UD, it should be queue back to KVM
        queue_exception # using our ioctl

    disable the traps # set_msr and set_debug_exception
    singlestep
    enable_traps
    continue VM

And the code is the same for a GP.
Therefore we keep the modifications as small as possible on the KVM side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant