-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding Hyper-V Nested Virt and Guest UEFI Support #729
base: master
Are you sure you want to change the base?
Conversation
@@ -1,48 +1,8 @@ | |||
################################################################################ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you removing this header?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mistake - this patch was giving me some trouble. Will add it back in.
Fails to build (xen patchqueue doesn't apply): http://openxt-builder.ainfosec.com:8010/builders/openxt/builds/1042/steps/Build/logs/stdio |
I have done a cursory read of the change set and will provide those comments later. Before that, I would note that this adds two major features in a single PR that is one commit. The single PR is fine since it is related featues, but I would suggest that this at least gets broken into two commits, one that adds the guest UEFI support and one builds on it to add the nested virtualization support. |
if ( !++port ) | ||
nvcpu->nv_vmexit_pending = 1; | ||
} while ( !nvcpu->nv_vmexit_pending ); | ||
+#if 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of just disabling, why not test for DEBUG. That way it is easier to turn on when troubleshooting.
+} | ||
+ | ||
+ | ||
+static int paging_read_l2_entry(struct vcpu *v, unsigned long l2_pa, uint64_t * entry) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please be consistent with spacing and honor spacing conventions of existing code you are patching.
+ return 0; | ||
+} | ||
+ | ||
+static unsigned int paging_get_l2_pa_from_l2_va(struct vcpu *v,unsigned long l2_va, unsigned long *l2_pa) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spacing here as well
+ | ||
+ uint64_t pml4e_addr; | ||
+ uint64_t pml4e; | ||
+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: Is this extra white space needed?
+ | ||
+ uint64_t l2_pa; | ||
+ | ||
+if (paging_get_l2_pa_from_l2_va(v,va,&l2_pa)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation
|
||
if ( is_hvm_vcpu(v) && paging_mode_hap(v->domain) && nestedhvm_is_n2(v) ) | ||
{ | ||
+#if 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't just turn off code blocks, if replacing then replace. If it is being left to provide understanding, then add a comment within the block to let people know why its still here.
if ( (vendor_id == 0xffff) && (device_id == 0xffff) ) | ||
continue; | ||
|
||
+#if 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again don't #if out replaced code.
Device (ISA) | ||
{ | ||
- Name (_ADR, 0x00010000) /* device 1, fn 0 */ | ||
+ //Name (_ADR, 0x00010000) /* device 1, fn 0 */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just like #if, if you are leaving for understanding, then explain otherwise just delete the line
+ if (b_info->stubdomain_version == LIBXL_STUBDOMAIN_VERSION_LINUX) { | ||
+ drive = libxl__sprintf | ||
+ (gc, "file=%s%c,if=scsi,bus=0,unit=%d,format=%s,cache=writeback", | ||
+ "/dev/xvd", 'a'+disk, disk, format); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way not to have to hard code the device path? Maybe make it a #define variable?
libxl__xs_get_dompath(gc, dm_domid)), | ||
"%d", guest_domid); | ||
if (guest_config->b_info.stubdomain_version == LIBXL_STUBDOMAIN_VERSION_LINUX) { | ||
+#if 0 /* LIES */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is wrong, then just replace it.
(AHCI_NUM_COMMAND_SLOTS << 8) | | ||
(AHCI_SUPPORTED_SPEED_GEN1 << AHCI_SUPPORTED_SPEED) | | ||
- HOST_CAP_NCQ | HOST_CAP_AHCI; | ||
+ /*HOST_CAP_NCQ |*/ HOST_CAP_AHCI; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just delete and if you feel its needed, add a comment that it was removed and why
return; | ||
} | ||
- dma_blk_unmap(dbs); | ||
+ //dma_blk_unmap(dbs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just delete it
#include "migration/migration.h" | ||
#include "kvm_i386.h" | ||
|
||
+extern int xen_q35; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not like blind accessible global state, they are too easily abused and leaves yourself open to potential exploit. Make it local and export wrappers to set and get that guard for correct values, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like it's how QEMU defines a bunch of other Xen variables too in include/hw/xen/xen.h
, so I would just place this next to the other ones:
extern uint32_t xen_domid;
extern enum xen_mode xen_mode;
extern bool xen_allowed;
+ fatal_errmsg = g_strdup_printf("failed to get backing file size"); | ||
+ } else if (size == 0) { | ||
+ fatal_errmsg = g_strdup_printf("PC system firmware (pflash) " | ||
+ "cannot have zero size"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am all about observing 80 char, and this looks like it would have been fine. Please don't unnecessarily split.
+ } | ||
+ | ||
+ | ||
+ blk = blk_by_legacy_dinfo(pflash_drv); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation consistency
+ | ||
+ | ||
+ if (size < 0) { | ||
+ fatal_errmsg = g_strdup_printf("failed to get backing file size"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation
pc_memory_init(pcms, get_system_memory(), | ||
rom_memory, &ram_memory); | ||
+ } else { | ||
+ xen_map_efi_var_rom(rom_memory); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this logic trigger for non-uefi guest on xen and if so, is it handled gracefully?
pc_q35_machine_options(m); | ||
- m->alias = "q35"; | ||
+ if (xen_q35) | ||
+ m->alias = "pc"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation
IN EFI_HANDLE Controller | ||
) | ||
{ | ||
+#if 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace, not #if 0
|
||
#include "AcpiPlatform.h" | ||
|
||
+#define FISH \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get this for development, but maybe a more meaningful debug augmentation would be better.
EFI_STATUS Status; | ||
VOID *Interface; | ||
EFI_EVENT PciEnumerated; | ||
+#if 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just remove, leave comment if necessary
return Status; | ||
} | ||
|
||
+#if 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again
|
||
EFI_ACPI_2_0_ROOT_SYSTEM_DESCRIPTION_POINTER *XenAcpiRsdpStructurePtr = NULL; | ||
|
||
+#define FISH \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
debug again
SETTINGS Settings; | ||
|
||
Status = GetSettings (&Settings); | ||
+#if 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove
AddReservedMemoryBaseSizeHob (0xFC000000, 0x1000000, FALSE); | ||
|
||
- PcdSetBool (PcdPciDisableBusEnumeration, TRUE); | ||
+ //PcdSetBool (PcdPciDisableBusEnumeration, TRUE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove
// 0xFEE00000 LAPIC 1 MB | ||
// | ||
PciSize = 0xFC000000 - PciBase; | ||
+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is returns necessary? would reduce diff churn
+ PcdSet64 (PcdPciMmio32Base, 0x80000000); | ||
+ PcdSet64 (PcdPciMmio32Size, 0x7c000000); | ||
+ | ||
+# if 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
turned off code you added, should it just not be added?
XENIO_PROTOCOL *XenIo; | ||
EFI_DEVICE_PATH_PROTOCOL *DevicePath; | ||
|
||
+ return EFI_UNSUPPORTED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of just shorting out this function, can you not go to the invocations of this function and implement a check whether to even call?
EFI_STATUS Status; | ||
XENBUS_PROTOCOL *XenBusIo; | ||
|
||
+ return EFI_UNSUPPORTED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
short out again
XEN_BLOCK_FRONT_DEVICE *Dev; | ||
EFI_BLOCK_IO_MEDIA *Media; | ||
|
||
+ return EFI_UNSUPPORTED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
short out again
I'm very pleased to see this progress on nested virtualization -- thanks for the PR. We need a detailed and clear set of test validation steps for this feature, or set of features, in this PR so that we will be able to verify that the new functionality has been preserved correctly when OpenXT uprevs to a newer version of the Xen hypervisor and toolstack software. We need to know exactly what software, and what features of that software, this introduces a requirement to remain compatible with, please. |
A few updates:
|
I have attached two documents here that may be freely distributed that describe how to go about testing out the new features. I hope these documents are helpful. I will be periodically keeping track of this PR and hope to contribute more in the future, but for now the remaining work will be picked up by someone else at AIS. There is, however, one unfortunate thing to note. In my latest round of testing I've found that enabling Windows Hyper-V causes a triple fault upon boot as both a standard and UEFI guest. Having glanced through the Xen 4.6.6 changeset, I am not convinced that the problem was caused there. It's possible that other changes have come in recently that may have had an effect. Fortunately, the surface where this could have happened is relatively small. In addition, the original document we received suggests that all of this functionality (and more) was working in April 2017, thus the time frame for such a change is relatively small. This is the only issue I've found in my (limited) testing since merging many commits from master today. |
You may wish to leave the "#if 0"s in place as this functionality is likely to be merged upstream (Citrix are also instrested in q35, ovmf and Credential guard and HVCI), and it will make unpicking your PQ from the the upstream changes easier. |
Okay, then don't just #if 0 out the code. Use a descriptive flag and possible comments that a) makes the code easily locatable and b) helps reviewers understand why the code is disabled. |
Now OVMF won't build. Does it need |
You should obtain the original documentation provided with this patch series, it explains in detail what each part of the patches do and why. How you wish to modify it an maintain them in this PR is entirely up to you: but each modification that diverges significantly from upstream is something you're going to either have to merge in later or for which a PQ must be maintained - eg it's quite likely that xen upstream will implement q35 and ovmf support itself soon. |
@jean-edouard I did not experience any issues building OVMF, but judging from that error message it looks like nasm is indeed needed to build OVMF. My build machine must've had it, which is why I did not make any note of it. The only original documentation given is the 'readme' posted above, which gives some additional background/details. |
Thanks @willronchetti. In the meantime, I will manually add it to the AIS builders to be able to build and test this PR. |
It also requires adding a device to qemu, which is not covered by this PR: |
I ran a build with a qemu fix here: http://openxt-builder.ainfosec.com:8010/builders/openxt/builds/1077 I really feel that this PR should be split into multiple ones. I was able to smoke-test the guest UEFI support (ovmf), and it actually works pretty well. It's also optional and not enabled by default, so I would be willing to merge it pretty quickly if it was in its own PR. The nested-virt support is a more important feature that needs more reviewing, and not having guest UEFI support mixed-in would also help. |
The nesting support is dependent on a work-around in this PR. Quoting readme.pdf that @willronchetti added:
See also comments inline:
Until a proper fix is available, this is a -1 from me. @jean-edouard +1 on Splitting the PR to get the guest UEFI support merged independently. |
What are the test cases for guest UEFI support and nested virt in this PR? In today's community call, it was stated that new developers would be working on these features. Should we wait for them to submit a new PR? |
@eric-ch the fix is already upstream, and I beleive there's a 4.8 backport. |
- outb(0x4d0, (uint8_t)(PCI_ISA_IRQ_MASK >> 0)); | ||
- outb(0x4d1, (uint8_t)(PCI_ISA_IRQ_MASK >> 8)); | ||
+ if ((pci_readw(PCI_ISA_DEVFN, PCI_VENDOR_ID) == 0x8086) && | ||
+ (pci_readw(PCI_ISA_DEVFN, PCI_VENDOR_ID) == 0x7000)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This here should be PCI_DEVICE_ID
Relevant discussion on xen-devel https://lists.xenproject.org/archives/html/xen-devel/2018-06/msg00379.html The same conclusion has been reached about this approach I shared during the meeting we had in the fall: "While Secure Boot can be enabled with this implementation, it is not sufficiently secure because the guest is able to write anything it wants to the emulated flash. KVM solved this problem with SMM mode but I don't like that solution either." |
The correct solution is to implement efi variable reads and writes with a hypercall. There are only 3 functions. GetNextVariableName only needs 3 arguments, Get and Set both take 5 arguments, Since the interface is not often used, and performance isn't an issue, you could easily
This would be easy to implement, the only subtlety is that the code needs to register |
This patch set integrates nested virtualization and guest UEFI support into OpenXT. These patches have been tested on OpenXT. The following tests have been carried out. Areas where testing can be improved are highlighted.
This is a very large patch set that will require thorough testing beyond what I've done before being merged into master.