
Passing a GPU Through to a Virtual Machine on Ubuntu 22.04

李小明 edited this page Apr 18, 2023 · 16 revisions

Prerequisites

Reference:

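  1. Two GPUs are required (a CPU integrated GPU plus a discrete card, or two discrete cards): one for the host and one for the virtual machine.
  2. The motherboard firmware must have CPU virtualization and IOMMU enabled.
  3. It is recommended to also enable Above 4G Decoding and Resize BAR support (with an AMD GPU, disabling Resize BAR is recommended).
  4. To keep the host from claiming the card intended for the VM, remove that card while installing the operating system and put it back once installation has finished.
  5. In use, the passed-through card must be connected to a monitor (or a dummy display emulator); otherwise it produces no video output by default. With an NVIDIA card a connected monitor shows the picture; with an Intel card the monitor shows nothing (an IDD driver must be installed to emulate a display device).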
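
Before rebooting into the firmware setup, the CPU virtualization requirement from item 2 can be checked from a running Linux system. The following is a minimal sketch, not part of the original procedure; the function name count_virt_cpus and its optional file argument are hypothetical and exist only to make the check reusable.

```shell
# Count the logical CPUs whose flags advertise hardware virtualization
# (vmx = Intel VT-x, svm = AMD-V). The optional argument is a cpuinfo-style
# file and defaults to /proc/cpuinfo (hypothetical helper, for illustration).
count_virt_cpus() {
    cpuinfo="${1:-/proc/cpuinfo}"
    grep -E '^flags' "$cpuinfo" | grep -Ewc 'vmx|svm'
}
```

Running count_virt_cpus with no argument prints the number of logical CPUs that report the flag. A result of 0 suggests that virtualization is either unsupported or disabled in the firmware.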

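Note: some laptops appear to have both an integrated and a discrete GPU, yet passthrough still does not work on them. Related discussion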

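Case study: when first using an RTX 3090 Ti 24GB with Resize BAR support enabled on the motherboard, the VM failed to boot with the error Guest has not initialized the display (yet). A forum post traced the cause to OVMF's default MMIO address space being too small; adding the qemu parameter -fw_cfg opt/ovmf/X-PciMmio64Mb,string=262144 resolved the problem.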
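
The value 262144 is easier to reason about as a size. Assuming, as the Mb suffix of the option name suggests, that the string is an aperture size in MiB, it corresponds to 256 GiB. The helper name mmio64_mb below is hypothetical; it only shows the arithmetic.

```shell
# Convert an aperture size in GiB to the MiB figure passed in the fw_cfg string.
# Assumption: X-PciMmio64Mb is interpreted as a size in MiB.
mmio64_mb() {
    echo $(( $1 * 1024 ))
}

mmio64_mb 256   # prints 262144, the value used in the case above
```

A card with a smaller BAR may work with a smaller value; the figure above is simply the one that worked in this case.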

Find the discrete GPU's device IDs

sudo lspci -nn

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation TU116 High Definition Audio Controller [10de:1aeb] (rev a1)
01:00.2 USB controller [0c03]: NVIDIA Corporation TU116 USB 3.1 Host Controller [10de:1aec] (rev a1)
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 [GeForce GTX 1650 SUPER] [10de:1aed] (rev a1)

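Find your GPU model, for example GeForce GTX 1660, and use the NVIDIA Corporation TU116 in front of it to identify the device's internal name. The trailing [10de:2184] is the vendor:device ID; record all 4 IDs. Only a self-contained PCI device can be passed through, because passthrough requires every device in the same iommu_group (here, all functions sharing the address prefix 01:00) to be passed together; otherwise starting the VM fails with Please ensure all devices within the iommu_group are bound to their vfio bus driver. The motherboard's onboard wired NIC, for example, cannot be passed through.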
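
Collecting the IDs by hand is error-prone when a card exposes several functions. The following sketch extracts them from lspci -nn output and joins them with commas, which is the form the vfio-pci.ids kernel parameter takes. The function name ids_for_slot is hypothetical, and the sketch assumes lowercase hexadecimal IDs as printed by lspci.

```shell
# Read `lspci -nn` output on stdin and print the vendor:device IDs of every
# function at the given bus:device prefix (e.g. 01:00), comma-separated.
ids_for_slot() {
    slot="$1"
    grep "^$slot\." \
        | grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
        | tr -d '[]' \
        | paste -sd, -
}
```

Usage: sudo lspci -nn | ids_for_slot 01:00. For the listing above this yields 10de:2184,10de:1aeb,10de:1aec,10de:1aed.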

Modify GRUB

sudo nano /etc/default/grub

Append the parameters to the value of GRUB_CMDLINE_LINUX:

GRUB_CMDLINE_LINUX="intel_iommu=on iommu=pt vfio-pci.ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed video=efifb:off"

Note: with an AMD CPU, change intel_iommu to amd_iommu. vfio-pci.ids must be set here; otherwise vfio cannot claim these devices.

Additional note: with an AMD GPU, also add pcie_acs_override=downstream,multifunction video=vesafb:off to the parameters.
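
The choice between intel_iommu and amd_iommu depends only on the CPU vendor, so it can be derived rather than remembered. A minimal sketch follows; the function name iommu_param and its optional file argument are hypothetical.

```shell
# Print the IOMMU kernel parameter that matches the host CPU vendor.
# The optional argument is a cpuinfo-style file (default: /proc/cpuinfo).
iommu_param() {
    cpuinfo="${1:-/proc/cpuinfo}"
    if grep -q 'AuthenticAMD' "$cpuinfo"; then
        echo "amd_iommu=on"
    elif grep -q 'GenuineIntel' "$cpuinfo"; then
        echo "intel_iommu=on"
    else
        echo "unknown CPU vendor" >&2
        return 1
    fi
}
```

The printed value replaces the first parameter in the GRUB_CMDLINE_LINUX line shown earlier.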

After saving, run:

sudo update-grub

Blacklist the Nouveau driver

echo "
blacklist nouveau
blacklist nvidia
" | sudo tee -a /etc/modprobe.d/blacklist.conf

NVIDIA: keep an application crash from crashing the VM

echo "options kvm ignore_msrs=1 report_ignored_msrs=0" | sudo tee /etc/modprobe.d/kvm.conf

Configure the vfio module load order

echo "vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd" | sudo tee -a /etc/modules

Apply the settings:

sudo update-initramfs -u -k all

Reboot the operating system.

Verify the configuration

Confirm that IOMMU is enabled:

sudo dmesg | grep "IOMMU"

An AMD processor prints AMD-Vi: AMD IOMMUv2 loaded and initialized; an Intel processor prints DMAR: IOMMU enabled.
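
The exact dmesg wording can differ between kernel versions, so a second check that does not depend on log text may help. The sketch below counts the groups under /sys/kernel/iommu_groups, the same directory the troubleshooting section reads. The function name count_iommu_groups and its optional directory argument are hypothetical.

```shell
# Print the number of IOMMU groups the kernel has created.
# The optional argument overrides the sysfs directory (default shown below).
count_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    find "$root" -mindepth 1 -maxdepth 1 -type d 2>/dev/null | wc -l | tr -d ' '
}
```

A result of 0 suggests that the IOMMU is not active, in which case the firmware setting and the kernel command line are the first things to re-check.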

Confirm that vfio is enabled:

sudo dmesg | grep -i vfio
Example output:
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-40-generic root=UUID=3c2c2b13-a9b8-44a0-82d3-51ec314ac486 ro intel_iommu=on vfio-pci.ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed quiet splash vt.handoff=7
[    0.027793] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-40-generic root=UUID=3c2c2b13-a9b8-44a0-82d3-51ec314ac486 ro intel_iommu=on vfio-pci.ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed quiet splash vt.handoff=7
[    2.914490] VFIO - User Level meta-driver version: 0.3
[    2.914559] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[    2.934677] vfio_pci: add [10de:2184[ffffffff:ffffffff]] class 0x000000/00000000
[    2.954602] vfio_pci: add [10de:1aeb[ffffffff:ffffffff]] class 0x000000/00000000
[    2.974601] vfio_pci: add [10de:1aec[ffffffff:ffffffff]] class 0x000000/00000000
[    2.994618] vfio_pci: add [10de:1aed[ffffffff:ffffffff]] class 0x000000/00000000
[    7.188782] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none

Seeing 4 lines containing vfio_pci: add [10de: in the output means the configuration succeeded.

sudo lspci -nnk -d 10de:2184

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1)
	Subsystem: Gigabyte Technology Co., Ltd TU116 [GeForce GTX 1660] [1458:3fc7]
	Kernel driver in use: vfio-pci
	Kernel modules: nvidiafb, nouveau

sudo lspci -nnk -d 10de:1aeb

01:00.1 Audio device [0403]: NVIDIA Corporation TU116 High Definition Audio Controller [10de:1aeb] (rev a1)
	Subsystem: Gigabyte Technology Co., Ltd TU116 High Definition Audio Controller [1458:3fc7]
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel

sudo lspci -nnk -d 10de:1aec

01:00.2 USB controller [0c03]: NVIDIA Corporation TU116 USB 3.1 Host Controller [10de:1aec] (rev a1)
	Subsystem: Gigabyte Technology Co., Ltd TU116 USB 3.1 Host Controller [1458:3fc7]
	Kernel driver in use: vfio-pci
	Kernel modules: xhci_pci

sudo lspci -nnk -d 10de:1aed

01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 USB Type-C UCSI Controller [10de:1aed] (rev a1)
	Subsystem: Gigabyte Technology Co., Ltd TU116 USB Type-C UCSI Controller [1458:3fc7]
	Kernel driver in use: vfio-pci
	Kernel modules: i2c_nvidia_gpu

Kernel driver in use: vfio-pci in the output means the configuration succeeded.
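
Checking the four functions one by one can be condensed into a single pass. The following sketch reads lspci -nnk output and reports any device whose driver in use is not vfio-pci. The function name not_bound_to_vfio is hypothetical.

```shell
# Read `lspci -nnk` output on stdin and print the address of every device
# whose "Kernel driver in use" is missing or is something other than vfio-pci.
# No output means every listed device is bound to vfio-pci.
not_bound_to_vfio() {
    awk '
        /^[0-9a-f]/ {
            if (addr != "" && drv != "vfio-pci") print addr
            addr = $1; drv = ""
        }
        /Kernel driver in use:/ { drv = $NF }
        END { if (addr != "" && drv != "vfio-pci") print addr }
    '
}
```

Usage: sudo lspci -nnk -s 01:00 | not_bound_to_vfio, where 01:00 is the bus:device prefix of the card found earlier.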

Usage

Add these parameters to qemu:

-fw_cfg opt/ovmf/X-PciMmio64Mb,string=262144 \
-device pcie-root-port,id=rp1,port=0,chassis=0,slot=0,hotplug=off,multifunction=on \
-device vfio-pci,bus=rp1,host=01:00.0 \
-device vfio-pci,bus=rp1,host=01:00.1 \
-device vfio-pci,bus=rp1,host=01:00.2 \
-device vfio-pci,bus=rp1,host=01:00.3 \

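Note: during passthrough the NVIDIA Control Panel shows no display options, which means the physical GPU is not doing any work. Without a monitor attached to the physical GPU, remote applications stay on a black screen. Once a monitor is connected directly to the physical GPU, the picture goes to that monitor and remote applications work. No software-level virtual display like the one in vGPU (VGX) has been found so far; the likely workarounds are a hardware dummy display plug or an HDMI-to-VGA adapter.

Additional note: with an AMD GPU, a pcie-root-port must be configured; otherwise the card fails with error code 43.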
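
The vfio-pci arguments follow one pattern per function, so they can be generated instead of typed. The sketch below prints one -device line for each host address, all attached to the same root port id. The function name vfio_device_args is hypothetical, and the output is meant to be pasted into the qemu command line, not executed on its own.

```shell
# Print one "-device vfio-pci" line per host PCI address.
# $1 is the id of the pcie-root-port; the remaining arguments are addresses.
vfio_device_args() {
    bus="$1"; shift
    for addr in "$@"; do
        printf -- '-device vfio-pci,bus=%s,host=%s \\\n' "$bus" "$addr"
    done
}
```

Usage: vfio_device_args rp1 01:00.0 01:00.1 01:00.2 01:00.3 reproduces the four vfio-pci lines shown above.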

NVIDIA example:

export TPM_PATH=/tmp/tpm_1 ; mkdir -p ${TPM_PATH} ; \
swtpm socket --tpm2 --tpmstate dir=${TPM_PATH} --ctrl type=unixio,path=${TPM_PATH}/swtpm-sock --log level=20 -d ; \
sudo qemu-system-x86_64 -nodefaults -no-user-config -rtc base=localtime,clock=host -machine q35,accel=kvm,vmport=off,dump-guest-core=off \
-chardev socket,id=chrtpm,path=${TPM_PATH}/swtpm-sock -tpmdev emulator,id=tpm0,chardev=chrtpm -device tpm-tis,tpmdev=tpm0 \
-bios OVMF.fd \
-smbios type=0,vendor=lilu.red,version=1.0.0,date=2022-10-05,uefi=on \
-smbios type=1,manufacturer=lilu.red,product=dev,serial=LILU-DEV \
-smbios type=2,manufacturer=Gigabyte,product=H370,version=1.0 \
-cpu host,kvm=off,hypervisor=off,hv-time -smp cores=4 \
-m 8G \
-netdev bridge,id=net0,br=b0 -device virtio-net-pci,mq=on,packed=on,netdev=net0,mac=00:00:00:00:00:01 \
-drive file=w-1.qcow2,format=qcow2,l2-cache-size=8M,cache-clean-interval=60,if=virtio,aio=io_uring,cache=writethrough \
-drive file=/media/m/archive-b/game.qcow2,format=qcow2,l2-cache-size=8M,cache-clean-interval=60,if=virtio,aio=io_uring,cache=writethrough \
-nographic -display none -vga virtio \
-device virtio-serial-pci -chardev spicevmc,id=spicechannel0,name=vdagent,debug=0 -device virtserialport,chardev=spicechannel0,name=com.redhat.spice.0 -spice image-compression=off,disable-ticketing=true,port=5001 \
-fw_cfg opt/ovmf/X-PciMmio64Mb,string=262144 \
-device pcie-root-port,id=rp1,port=0,chassis=0,slot=0,hotplug=off,multifunction=on \
-device vfio-pci,bus=rp1,host=01:00.0 \
-device vfio-pci,bus=rp1,host=01:00.1 \
-device vfio-pci,bus=rp1,host=01:00.2 \
-device vfio-pci,bus=rp1,host=01:00.3 \
-daemonize

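Testing showed that, under passthrough, some games cannot use the GPU's full performance because of compatibility problems. For example, PUBG with DirectX 12 under its usual graphics settings never reaches high GPU utilization or power draw, and even an RTX 3090 Ti delivers a poor frame rate. It is not yet clear whether the AMD 5600G platform is the cause, since a GTX 1660 on an i5-8600k platform behaves relatively normally. What is known so far is that DirectX 11 gives a better frame rate than DirectX 12, about 20 FPS higher.

AMD example: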

export TPM_PATH=/tmp/tpm_1 ; mkdir -p ${TPM_PATH} ; \
swtpm socket --tpm2 --tpmstate dir=${TPM_PATH} --ctrl type=unixio,path=${TPM_PATH}/swtpm-sock --log level=20 -d ; \
sudo qemu-system-x86_64 -nodefaults -no-user-config -rtc base=localtime,clock=host -machine q35,accel=kvm,vmport=off,dump-guest-core=off \
-chardev socket,id=chrtpm,path=${TPM_PATH}/swtpm-sock -tpmdev emulator,id=tpm0,chardev=chrtpm -device tpm-tis,tpmdev=tpm0 \
-bios OVMF.fd \
-smbios type=0,vendor=lilu.red,version=1.0.0 -smbios type=1,manufacturer=lilu.red,product=dev \
-cpu host,-hypervisor,hv-passthrough -smp cores=4 \
-m 16G \
-netdev bridge,id=net0,br=b0 -device virtio-net-pci,mq=on,packed=on,netdev=net0,mac=00:00:00:00:00:01 \
-drive file=amd.qcow2,format=qcow2,l2-cache-size=8M,cache-clean-interval=60,if=virtio,aio=io_uring,cache=writethrough \
-drive file=game.qcow2,format=qcow2,l2-cache-size=8M,cache-clean-interval=60,if=virtio,aio=io_uring,cache=writethrough \
-nographic -display none -vga virtio \
-device virtio-serial-pci -chardev spicevmc,id=spicechannel0,name=vdagent,debug=0 -device virtserialport,chardev=spicechannel0,name=com.redhat.spice.0 -spice image-compression=off,disable-ticketing=true,port=5001 \
-device pcie-root-port,id=root_port1,hotplug=off,multifunction=on,chassis=6,addr=1c.0,slot=2,bus=pcie.0 \
-device vfio-pci,bus=root_port1,addr=00.0,multifunction=on,x-vga=on,host=03:00.0 \
-device vfio-pci,bus=root_port1,addr=00.1,host=03:00.1 \
-daemonize

USB devices

A host USB device, such as a mouse, keyboard, or game controller, can be handed to the VM. First determine the device's Bus and Port by running:

lsusb -tvv

The output looks like this:

/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/8p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
    /sys/bus/usb/devices/usb2  /dev/bus/usb/002/001
    |__ Port 5: Dev 2, If 0, Class=Mass Storage, Driver=uas, 5000M
        ID 152d:0578 JMicron Technology Corp. / JMicron USA Technology Corp. JMS578 SATA 6Gb/s
        /sys/bus/usb/devices/2-5  /dev/bus/usb/002/002
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/16p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
    /sys/bus/usb/devices/usb1  /dev/bus/usb/001/001
    |__ Port 8: Dev 2, If 1, Class=Human Interface Device, Driver=usbhid, 12M
        ID 24ae:2000 Shenzhen Rapoo Technology Co., Ltd. 2.4G Wireless Device Serial
        /sys/bus/usb/devices/1-8  /dev/bus/usb/001/002
    |__ Port 8: Dev 2, If 0, Class=Human Interface Device, Driver=usbhid, 12M
        ID 24ae:2000 Shenzhen Rapoo Technology Co., Ltd. 2.4G Wireless Device Serial
        /sys/bus/usb/devices/1-8  /dev/bus/usb/001/002
    |__ Port 10: Dev 3, If 0, Class=Human Interface Device, Driver=usbhid, 12M
        ID 046d:c534 Logitech, Inc. Unifying Receiver
        /sys/bus/usb/devices/1-10  /dev/bus/usb/001/003
    |__ Port 10: Dev 3, If 1, Class=Human Interface Device, Driver=usbhid, 12M
        ID 046d:c534 Logitech, Inc. Unifying Receiver
        /sys/bus/usb/devices/1-10  /dev/bus/usb/001/003
    |__ Port 14: Dev 4, If 0, Class=Wireless, Driver=btusb, 12M
        ID 8087:0aaa Intel Corp. Bluetooth 9460/9560 Jefferson Peak (JfP)
        /sys/bus/usb/devices/1-14  /dev/bus/usb/001/004
    |__ Port 14: Dev 4, If 1, Class=Wireless, Driver=btusb, 12M
        ID 8087:0aaa Intel Corp. Bluetooth 9460/9560 Jefferson Peak (JfP)
        /sys/bus/usb/devices/1-14  /dev/bus/usb/001/004

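The output shows each device's Bus and Port. To locate a particular USB device quickly, run the command once, plug the device in, and run it again; the new entry is the device. In this example the target is the Rapoo USB mouse, listed as Shenzhen Rapoo Technology Co., Ltd. 2.4G Wireless Device Serial, on Bus 1, Port 8. Once the Bus and Port are known, add them to the VM start command as parameters: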

-usb \
-device usb-host,hostbus=1,hostport=8  \
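
Matching by Bus and Port ties the VM to a physical socket. As an alternative that is not the approach used above, the usb-host device can also be selected by vendor and product ID, so the device is found whichever port it is plugged into. The sketch below builds that argument from the ID shown by lsusb; the function name usb_host_arg is hypothetical.

```shell
# Build a usb-host argument from a vendor:product ID such as 24ae:2000.
usb_host_arg() {
    id="$1"
    printf -- '-device usb-host,vendorid=0x%s,productid=0x%s\n' "${id%%:*}" "${id##*:}"
}
```

Usage: usb_host_arg 24ae:2000 prints the argument for the Rapoo receiver from the listing above. If two identical devices are attached, matching by Bus and Port remains the less ambiguous choice.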

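Troubleshooting

IOMMU Group

Reference:

When the GPU sits in the secondary PCIe x16 slot, it may be placed in the same group as other devices. Passing it through then fails with the following error: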

qemu-system-x86_64: -device vfio-pci,host=07:00.0: vfio 0000:07:00.0: group 8 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.

In that case, inspect the IOMMU groups:

for g in `find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V`; do \
    echo "IOMMU Group ${g##*/}:"; \
        for d in $g/devices/*; do \
        echo -e "\t$(lspci -nns ${d##*/})"; \
        done; \
done;

The output looks like this:

...
IOMMU Group 8:
	01:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
	01:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
	01:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01)
	02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	02:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	02:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	04:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller [10ec:8125] (rev 05)
	05:00.0 PCI bridge [0604]: Intel Corporation Device [8086:4910]
	06:01.0 PCI bridge [0604]: Intel Corporation Device [8086:490f]
	06:04.0 PCI bridge [0604]: Intel Corporation Device [8086:490f]
	06:05.0 PCI bridge [0604]: Intel Corporation Device [8086:490f]
	07:00.0 VGA compatible controller [0300]: Intel Corporation DG1 [Iris Xe Graphics] [8086:4908] (rev 01)
	08:00.0 Audio device [0403]: Intel Corporation Device [8086:490d]
	09:00.0 Memory controller [0580]: Intel Corporation Device [8086:490e]
...

The Intel DG1 card is grouped with other devices, and those devices cannot all be handed to the VM.
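
When only one device is of interest, listing every group is more than needed. The following sketch prints just the members of the group that a given device belongs to, using the iommu_group link that sysfs exposes for each PCI device. The function name group_members and its optional sysfs-root argument are hypothetical.

```shell
# Print the PCI addresses that share an IOMMU group with the given device.
# $1 is the full address including the domain, e.g. 0000:07:00.0.
# $2 optionally overrides the sysfs root (default: /sys).
group_members() {
    dev="$1"
    sys="${2:-/sys}"
    ls "$sys/bus/pci/devices/$dev/iommu_group/devices"
}
```

Usage: group_members 0000:07:00.0. If the list contains anything beyond the card's own functions, moving the card to a different slot is worth trying before anything else.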