GPU passthrough to a virtual machine on Ubuntu 22.04
References:
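- Two GPUs are needed (CPU integrated graphics plus a discrete card, or two discrete cards): one for the host and one for the virtual machine.
- Must be enabled in the motherboard firmware: CPU virtualization, IOMMU
- Recommended to enable in the motherboard firmware: Above 4G Decoding, Resize BAR support (for AMD GPUs, disabling Resize BAR is recommended)
- To keep the host from claiming the GPU intended for the virtual machine, remove that card while installing the operating system and plug it back in once installation has finished.
- In use, the passed-through GPU must be connected to a monitor (or a display emulator), otherwise the card produces no video output by default. With an NVIDIA card the monitor shows a picture; with an Intel card the monitor stays blank (an IDD driver must be installed to emulate a display device).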
Note: some laptops appear to have both integrated and discrete graphics, yet passthrough still fails on them. Related discussion
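Case: when I first used an RTX 3090 Ti 24GB with Resize BAR support enabled on the motherboard, the virtual machine failed to boot with the error "Guest has not initialized the display (yet)". The cause, found later in a forum post, is that OVMF's default MMIO address space is too small. Adding the tuning parameter -fw_cfg opt/ovmf/X-PciMmio64Mb,string=262144 to qemu resolved the problem.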
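The value given to X-PciMmio64Mb is expressed in megabytes, so 262144 corresponds to a 256 GiB aperture. As a small sketch (the helper name mmio64_mb is hypothetical, and the size a particular card needs is an assumption to confirm by testing), the value can be derived from a size in GiB:

```shell
# Convert a 64-bit MMIO aperture size in GiB into the megabyte value
# passed to -fw_cfg opt/ovmf/X-PciMmio64Mb (hypothetical helper).
mmio64_mb() {
  echo $(( $1 * 1024 ))
}

# 256 GiB is the size used in this guide.
echo "-fw_cfg opt/ovmf/X-PciMmio64Mb,string=$(mmio64_mb 256)"
```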
sudo lspci -nn
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation TU116 High Definition Audio Controller [10de:1aeb] (rev a1)
01:00.2 USB controller [0c03]: NVIDIA Corporation TU116 USB 3.1 Host Controller [10de:1aec] (rev a1)
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 [GeForce GTX 1650 SUPER] [10de:1aed] (rev a1)
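Find your GPU model, for example GeForce GTX 1660; the NVIDIA Corporation TU116 in front of it is the device's internal name. The [10de:2184] at the end is the vendor:device ID. Record all four IDs. Only devices that sit in an IOMMU group of their own can be passed through, because every device in the same iommu_group must be handed to the virtual machine together (here the four functions of the card share the address prefix 01:00). Otherwise the virtual machine fails to start with the error "Please ensure all devices within the iommu_group are bound to their vfio bus driver." For example, the wired NIC on the motherboard typically cannot be passed through.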
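Collecting the four IDs by hand is error-prone. The sketch below reads lspci -nn output on standard input and joins the vendor:device IDs of one slot into the comma-separated form used later for vfio-pci.ids. The function name vfio_ids_for_slot is hypothetical, and the slot prefix 01:00 is taken from the example output above.

```shell
# Build a comma-separated ID list from `lspci -nn` output on stdin.
# $1: bus:device prefix of the card, e.g. 01:00 (hypothetical helper).
vfio_ids_for_slot() {
  slot="$1"
  grep "^${slot}\." |
    # keep the last [xxxx:xxxx] on each line, which is the vendor:device ID
    sed -nE 's/.*\[([0-9a-f]{4}:[0-9a-f]{4})\].*/\1/p' |
    paste -sd, -
}

# Usage on a real host:
#   lspci -nn | vfio_ids_for_slot 01:00
```

With the four lines shown above as input, this prints 10de:2184,10de:1aeb,10de:1aec,10de:1aed.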
sudo nano /etc/default/grub
Append the following parameters to the value of GRUB_CMDLINE_LINUX:
GRUB_CMDLINE_LINUX="intel_iommu=on iommu=pt vfio-pci.ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed video=efifb:off"
Note: with an AMD CPU, change intel_iommu to amd_iommu. vfio-pci.ids must be set here, otherwise vfio cannot claim these devices.
Note: for an AMD GPU, also add pcie_acs_override=downstream,multifunction video=vesafb:off to the parameters.
After saving, run:
sudo update-grub
echo "
blacklist nouveau
blacklist nvidia
" | sudo tee -a /etc/modprobe.d/blacklist.conf
echo "options kvm ignore_msrs=1 report_ignored_msrs=0" | sudo tee /etc/modprobe.d/kvm.conf
echo "vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd" | sudo tee -a /etc/modules
Apply the settings:
sudo update-initramfs -u -k all
Reboot the operating system.
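After the reboot, one quick way to check that the GRUB parameters actually reached the kernel is to compare /proc/cmdline against the parameters you expect. This is only a sketch; the function name missing_params is hypothetical, and the parameter list should match whatever you put in GRUB_CMDLINE_LINUX.

```shell
# Print every expected kernel parameter that is absent from a command line.
# $1: command line text; remaining arguments: parameters to look for.
missing_params() {
  cmdline=" $1 "
  shift
  for p in "$@"; do
    case "$cmdline" in
      *" $p "*) ;;                  # present as a whole word
      *) echo "missing: $p" ;;
    esac
  done
}

# Intel host shown here; use amd_iommu=on on an AMD host.
if [ -r /proc/cmdline ]; then
  missing_params "$(cat /proc/cmdline)" intel_iommu=on iommu=pt
fi
```

No output means every listed parameter was found.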
Confirm that IOMMU is enabled:
sudo dmesg | grep "IOMMU"
AMD processors print AMD-Vi: AMD IOMMUv2 loaded and initialized; Intel processors print DMAR: IOMMU enabled.
Confirm that vfio is enabled:
sudo dmesg | grep -i vfio
Sample output:
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-40-generic root=UUID=3c2c2b13-a9b8-44a0-82d3-51ec314ac486 ro intel_iommu=on vfio-pci.ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed quiet splash vt.handoff=7
[ 0.027793] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-40-generic root=UUID=3c2c2b13-a9b8-44a0-82d3-51ec314ac486 ro intel_iommu=on vfio-pci.ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed quiet splash vt.handoff=7
[ 2.914490] VFIO - User Level meta-driver version: 0.3
[ 2.914559] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[ 2.934677] vfio_pci: add [10de:2184[ffffffff:ffffffff]] class 0x000000/00000000
[ 2.954602] vfio_pci: add [10de:1aeb[ffffffff:ffffffff]] class 0x000000/00000000
[ 2.974601] vfio_pci: add [10de:1aec[ffffffff:ffffffff]] class 0x000000/00000000
[ 2.994618] vfio_pci: add [10de:1aed[ffffffff:ffffffff]] class 0x000000/00000000
[ 7.188782] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
Four lines containing vfio_pci: add [10de: in the output indicate that the configuration succeeded.
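Rather than counting lines by eye, the kernel log can be checked against the ID list. The sketch below assumes the log lines have the vfio_pci: add [vendor:device[ form shown in the sample output; the function name unclaimed_ids is hypothetical.

```shell
# Read kernel log text on stdin and print each ID from a comma-separated
# list that has no matching "vfio_pci: add [vendor:device[" line.
unclaimed_ids() {
  log="$(cat)"
  for id in $(echo "$1" | tr ',' ' '); do
    case "$log" in
      *"vfio_pci: add [$id["*) ;;   # this ID was registered with vfio-pci
      *) echo "not claimed: $id" ;;
    esac
  done
}

# Usage on a real host:
#   sudo dmesg | unclaimed_ids 10de:2184,10de:1aeb,10de:1aec,10de:1aed
```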
sudo lspci -nnk -d 10de:2184
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1)
Subsystem: Gigabyte Technology Co., Ltd TU116 [GeForce GTX 1660] [1458:3fc7]
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau
sudo lspci -nnk -d 10de:1aeb
01:00.1 Audio device [0403]: NVIDIA Corporation TU116 High Definition Audio Controller [10de:1aeb] (rev a1)
Subsystem: Gigabyte Technology Co., Ltd TU116 High Definition Audio Controller [1458:3fc7]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
sudo lspci -nnk -d 10de:1aec
01:00.2 USB controller [0c03]: NVIDIA Corporation TU116 USB 3.1 Host Controller [10de:1aec] (rev a1)
Subsystem: Gigabyte Technology Co., Ltd TU116 USB 3.1 Host Controller [1458:3fc7]
Kernel driver in use: vfio-pci
Kernel modules: xhci_pci
sudo lspci -nnk -d 10de:1aed
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 USB Type-C UCSI Controller [10de:1aed] (rev a1)
Subsystem: Gigabyte Technology Co., Ltd TU116 USB Type-C UCSI Controller [1458:3fc7]
Kernel driver in use: vfio-pci
Kernel modules: i2c_nvidia_gpu
Kernel driver in use: vfio-pci in the output indicates that the configuration succeeded.
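The four checks can also be folded into one pass. The sketch below reads lspci -nnk output on standard input and lists any device whose driver is not vfio-pci; the function name not_on_vfio is hypothetical. A device with no driver bound at all has no "Kernel driver in use" line and is therefore not reported.

```shell
# Read `lspci -nnk` output on stdin and print "<address> <driver>" for
# every device bound to a driver other than vfio-pci.
not_on_vfio() {
  awk '
    /^[0-9a-f]/ { dev = $1 }        # unindented line: start of a new device
    /Kernel driver in use:/ && $NF != "vfio-pci" { print dev, $NF }
  '
}

# Usage on a real host, for the card at 01:00:
#   lspci -nnk -s 01:00 | not_on_vfio
```

No output means every function of the card that has a driver is bound to vfio-pci.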
Add these parameters to qemu:
-fw_cfg opt/ovmf/X-PciMmio64Mb,string=262144 \
-device pcie-root-port,id=rp1,port=0,chassis=0,slot=0,hotplug=off,multifunction=on \
-device vfio-pci,bus=rp1,host=01:00.0 \
-device vfio-pci,bus=rp1,host=01:00.1 \
-device vfio-pci,bus=rp1,host=01:00.2 \
-device vfio-pci,bus=rp1,host=01:00.3 \
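Note: under passthrough the NVIDIA Control Panel shows no display options, which means the physical GPU is not doing any work! If no monitor is attached to the physical GPU, remote applications do not work and stay on a black screen. If a monitor is attached directly to the physical GPU, the picture goes to that monitor and remote applications then work. I have not yet found a software-level virtual display like the one available with vGPU (VGX); the possible workarounds found so far are a hardware display emulator or an HDMI-to-VGA adapter.
Note: for an AMD GPU, pcie-root-port must be set, otherwise the card fails with error code 43.
NVIDIA example for reference: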
export TPM_PATH=/tmp/tpm_1 ; mkdir -p ${TPM_PATH} ; \
swtpm socket --tpm2 --tpmstate dir=${TPM_PATH} --ctrl type=unixio,path=${TPM_PATH}/swtpm-sock --log level=20 -d ; \
sudo qemu-system-x86_64 -nodefaults -no-user-config -rtc base=localtime,clock=host -machine q35,accel=kvm,vmport=off,dump-guest-core=off \
-chardev socket,id=chrtpm,path=${TPM_PATH}/swtpm-sock -tpmdev emulator,id=tpm0,chardev=chrtpm -device tpm-tis,tpmdev=tpm0 \
-bios OVMF.fd \
-smbios type=0,vendor=lilu.red,version=1.0.0,date=2022-10-05,uefi=on \
-smbios type=1,manufacturer=lilu.red,product=dev,serial=LILU-DEV \
-smbios type=2,manufacturer=Gigabyte,product=H370,version=1.0 \
-cpu host,kvm=off,hypervisor=off,hv-time -smp cores=4 \
-m 8G \
-netdev bridge,id=net0,br=b0 -device virtio-net-pci,mq=on,packed=on,netdev=net0,mac=00:00:00:00:00:01 \
-drive file=w-1.qcow2,format=qcow2,l2-cache-size=8M,cache-clean-interval=60,if=virtio,aio=io_uring,cache=writethrough \
-drive file=/media/m/archive-b/game.qcow2,format=qcow2,l2-cache-size=8M,cache-clean-interval=60,if=virtio,aio=io_uring,cache=writethrough \
-nographic -display none -vga virtio \
-device virtio-serial-pci -chardev spicevmc,id=spicechannel0,name=vdagent,debug=0 -device virtserialport,chardev=spicechannel0,name=com.redhat.spice.0 -spice image-compression=off,disable-ticketing=true,port=5001 \
-fw_cfg opt/ovmf/X-PciMmio64Mb,string=262144 \
-device pcie-root-port,id=rp1,port=0,chassis=0,slot=0,hotplug=off,multifunction=on \
-device vfio-pci,bus=rp1,host=01:00.0 \
-device vfio-pci,bus=rp1,host=01:00.1 \
-device vfio-pci,bus=rp1,host=01:00.2 \
-device vfio-pci,bus=rp1,host=01:00.3 \
-daemonize
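Testing showed that under GPU passthrough some games cannot use the card's full performance because of compatibility issues. For example, in PUBG with DirectX 12 selected under the general graphics settings, GPU utilization and power draw stay low, and even an RTX 3090 Ti delivers disappointing FPS. It is not yet clear whether the AMD 5600G platform is the cause, since a GTX 1660 on an i5-8600K platform behaves relatively normally. What is known so far is that DirectX 11 gives better FPS than DirectX 12, by roughly 20.
AMD example for reference: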
export TPM_PATH=/tmp/tpm_1 ; mkdir -p ${TPM_PATH} ; \
swtpm socket --tpm2 --tpmstate dir=${TPM_PATH} --ctrl type=unixio,path=${TPM_PATH}/swtpm-sock --log level=20 -d ; \
sudo qemu-system-x86_64 -nodefaults -no-user-config -rtc base=localtime,clock=host -machine q35,accel=kvm,vmport=off,dump-guest-core=off \
-chardev socket,id=chrtpm,path=${TPM_PATH}/swtpm-sock -tpmdev emulator,id=tpm0,chardev=chrtpm -device tpm-tis,tpmdev=tpm0 \
-bios OVMF.fd \
-smbios type=0,vendor=lilu.red,version=1.0.0 -smbios type=1,manufacturer=lilu.red,product=dev \
-cpu host,-hypervisor,hv-passthrough -smp cores=4 \
-m 16G \
-netdev bridge,id=net0,br=b0 -device virtio-net-pci,mq=on,packed=on,netdev=net0,mac=00:00:00:00:00:01 \
-drive file=amd.qcow2,format=qcow2,l2-cache-size=8M,cache-clean-interval=60,if=virtio,aio=io_uring,cache=writethrough \
-drive file=game.qcow2,format=qcow2,l2-cache-size=8M,cache-clean-interval=60,if=virtio,aio=io_uring,cache=writethrough \
-nographic -display none -vga virtio \
-device virtio-serial-pci -chardev spicevmc,id=spicechannel0,name=vdagent,debug=0 -device virtserialport,chardev=spicechannel0,name=com.redhat.spice.0 -spice image-compression=off,disable-ticketing=true,port=5001 \
-device pcie-root-port,id=root_port1,hotplug=off,multifunction=on,chassis=6,addr=1c.0,slot=2,bus=pcie.0 \
-device vfio-pci,bus=root_port1,addr=00.0,multifunction=on,x-vga=on,host=03:00.0 \
-device vfio-pci,bus=root_port1,addr=00.1,host=03:00.1 \
-daemonize
Host USB devices, such as a mouse, keyboard or game controller, can be assigned to the virtual machine. First determine the device's Bus and Port by running:
lsusb -tvv
The output looks like this:
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/8p, 10000M
ID 1d6b:0003 Linux Foundation 3.0 root hub
/sys/bus/usb/devices/usb2 /dev/bus/usb/002/001
|__ Port 5: Dev 2, If 0, Class=Mass Storage, Driver=uas, 5000M
ID 152d:0578 JMicron Technology Corp. / JMicron USA Technology Corp. JMS578 SATA 6Gb/s
/sys/bus/usb/devices/2-5 /dev/bus/usb/002/002
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/16p, 480M
ID 1d6b:0002 Linux Foundation 2.0 root hub
/sys/bus/usb/devices/usb1 /dev/bus/usb/001/001
|__ Port 8: Dev 2, If 1, Class=Human Interface Device, Driver=usbhid, 12M
ID 24ae:2000 Shenzhen Rapoo Technology Co., Ltd. 2.4G Wireless Device Serial
/sys/bus/usb/devices/1-8 /dev/bus/usb/001/002
|__ Port 8: Dev 2, If 0, Class=Human Interface Device, Driver=usbhid, 12M
ID 24ae:2000 Shenzhen Rapoo Technology Co., Ltd. 2.4G Wireless Device Serial
/sys/bus/usb/devices/1-8 /dev/bus/usb/001/002
|__ Port 10: Dev 3, If 0, Class=Human Interface Device, Driver=usbhid, 12M
ID 046d:c534 Logitech, Inc. Unifying Receiver
/sys/bus/usb/devices/1-10 /dev/bus/usb/001/003
|__ Port 10: Dev 3, If 1, Class=Human Interface Device, Driver=usbhid, 12M
ID 046d:c534 Logitech, Inc. Unifying Receiver
/sys/bus/usb/devices/1-10 /dev/bus/usb/001/003
|__ Port 14: Dev 4, If 0, Class=Wireless, Driver=btusb, 12M
ID 8087:0aaa Intel Corp. Bluetooth 9460/9560 Jefferson Peak (JfP)
/sys/bus/usb/devices/1-14 /dev/bus/usb/001/004
|__ Port 14: Dev 4, If 1, Class=Wireless, Driver=btusb, 12M
ID 8087:0aaa Intel Corp. Bluetooth 9460/9560 Jefferson Peak (JfP)
/sys/bus/usb/devices/1-14 /dev/bus/usb/001/004
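The output shows each device's Bus and Port. To locate a particular USB device quickly, run the command once, plug the device in, and run it again. In this example the target is the Rapoo USB mouse, listed as Shenzhen Rapoo Technology Co., Ltd. 2.4G Wireless Device Serial, on Bus 1, Port 8. Once the Bus and Port are known, add them to the virtual machine's start command as parameters: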
-usb \
-device usb-host,hostbus=1,hostport=8 \
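The lsusb -tvv output also prints a sysfs path for each device, such as /sys/bus/usb/devices/1-8, which already encodes the bus and port. As a sketch, that name can be turned into the usb-host argument directly. The function name usb_host_arg is hypothetical, and the dotted form for devices behind a hub (for example 1-10.2) is an assumption to verify on your own setup.

```shell
# Turn a sysfs USB device name such as 1-8 (bus 1, port 8) into a
# QEMU usb-host argument. A full /sys/bus/usb/devices/... path also works.
usb_host_arg() {
  name="${1##*/}"          # strip any leading directory part
  bus="${name%%-*}"        # text before the first dash
  port="${name#*-}"        # text after the first dash
  echo "-device usb-host,hostbus=${bus},hostport=${port}"
}

usb_host_arg /sys/bus/usb/devices/1-8
```

For the Rapoo mouse above this prints -device usb-host,hostbus=1,hostport=8.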
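References:
When the GPU sits in a secondary PCIe x16 slot, it may be placed in the same group as other devices, and passthrough then fails with the following error: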
qemu-system-x86_64: -device vfio-pci,host=07:00.0: vfio 0000:07:00.0: group 8 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.
In that case, inspect the IOMMU groups:
for g in `find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V`; do \
echo "IOMMU Group ${g##*/}:"; \
for d in $g/devices/*; do \
echo -e "\t$(lspci -nns ${d##*/})"; \
done; \
done;
The output looks like this:
...
IOMMU Group 8:
01:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
01:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
01:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01)
02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
02:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
02:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
04:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller [10ec:8125] (rev 05)
05:00.0 PCI bridge [0604]: Intel Corporation Device [8086:4910]
06:01.0 PCI bridge [0604]: Intel Corporation Device [8086:490f]
06:04.0 PCI bridge [0604]: Intel Corporation Device [8086:490f]
06:05.0 PCI bridge [0604]: Intel Corporation Device [8086:490f]
07:00.0 VGA compatible controller [0300]: Intel Corporation DG1 [Iris Xe Graphics] [8086:4908] (rev 01)
08:00.0 Audio device [0403]: Intel Corporation Device [8086:490d]
09:00.0 Memory controller [0580]: Intel Corporation Device [8086:490e]
...
Here the Intel DG1 GPU is grouped with other devices, and those devices cannot all be handed to the virtual machine.
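When only one card is of interest, listing every group is more than needed. The sketch below prints just the devices that share a group with a given PCI address, using the iommu_group link that sysfs exposes for each device. The function name iommu_group_peers and the optional second argument (an alternative sysfs root, useful for dry runs) are hypothetical.

```shell
# List the PCI addresses that share an IOMMU group with the given device.
# $1: full PCI address, e.g. 0000:07:00.0
# $2: sysfs root (defaults to /sys)
iommu_group_peers() {
  dev="$1"
  root="${2:-/sys}"
  for peer in "$root/bus/pci/devices/$dev/iommu_group/devices"/*; do
    [ -e "$peer" ] || continue          # no group directory or no entries
    [ "${peer##*/}" = "$dev" ] || echo "${peer##*/}"
  done
}

# Usage on a real host:
#   iommu_group_peers 0000:07:00.0
```

An empty result suggests the device is alone in its group; any address printed belongs to a device that would have to be passed through along with it.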