- How to set up Windows10 Virtual Machine with GPU passthrough via Qemu/VFIO/OVMF (with minimal system changes)
- And install the Oculus Rift.
- -------------------------------------------------------------------------------------------------------------
- Date: 12/07/2019 Author: larsupilami73
- Goals:
- a. GPU passthrough with Windows10 in a virtual machine for running games, Unigine benchmarks, Oculus Rift etc.,
- b. Works with both identical and different Nvidia GPUs (no clue if it works for AMD GPUs)
- c. Minimal changes to the system: no Grub config, initramfs, /etc/modules changes,
- d. After shutdown of the VM, the 2nd GPU can be reclaimed for CUDA use.
- 1.1 Hardware:
- -------------
- AMD 1950x Threadripper (non overclocked),
- AsRock Taichi X399 Motherboard, bios version 3.30, agesa sp3r2-1.1.0.1,
- 32 GB RAM,
- Two identical Asus ROG Strix 1080Ti-11G-GAMING (vbios 86.02.39.00.54), not in SLI,
- One extra Samsung 860 SSD 250GB for the Windows10 installation (unformatted, not used by the host Debian OS)
- One extra Logitech K400+ wireless keyboard with touchpad for the virtual machine
- 1.2 Host:
- ---------
- CrunchBangPlusPlus 9, which is Debian Stretch with Openbox desktop (changed from 'stable' release to 'testing', see https://wiki.debian.org/DebianTesting)
- Kernel 4.19.0-5-amd64
- Nvidia drivers 418.56
- 2 Method outline:
- -----------------
- The problem:
- Qemu needs the 2nd GPU (the one to be passed through) to be bound to the VFIO driver for passthrough.
- Dynamic rebinding of the 2nd GPU from the Nvidia driver to VFIO and back is possible, however the GPU needs to be free of processes using it.
- Now, once the displaymanager (lxdm, slim etc.) is started, X is started too. X then grabs *all* GPUs it can find that are bound to the nvidia driver.
- You can check this by typing in a terminal:
- sudo lsof /dev/nvidia*
- resulting in:
- ...
- COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
- Xorg 1133 root mem CHR 195,1 46269 /dev/nvidia1 <---1st GPU, running desktop
- ...
- Xorg 1133 root mem CHR 195,0 46276 /dev/nvidia0 <---2nd GPU, X still occupies it, even if it is not connected to a monitor
- ...
- Also, trying to reset the 2nd GPU:
- sudo nvidia-smi -i 0 --gpu-reset
- results in:
- GPU 00000000:09:00.0 is currently in use by another process.
- 1 device is currently being used by one or more other processes
- ...
- As far as I know, there is no way to tell X (xserver-xorg-video-nvidia) to leave a certain GPU alone.
- So to do dynamic rebinding of the 2nd GPU to the VFIO driver, X first needs to be stopped, which in turn requires the displaymanager to be stopped,
- which requires you to log out, drop to a terminal, log in again, manually call an unbinding script, and so on.
- This is annoying. We can avoid X grabbing the 2nd GPU by binding it to the VFIO driver early on, at system boot.
- This is the approach followed in this excellent Arch wiki:
- https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF
- However, it requires a change of initramfs, Grub config, kernel module parameters etc., parts of my system that I generally don't like touching,
- for fear an update will mess up my settings. Worse yet, for two identical GPUs, it gets even more complicated, needing a 'hook' script:
- https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#Using_identical_guest_and_host_GPUs
- There is a *less invasive* way, which is outlined in the rest of this text.
- The recipe goes like this:
- -Boot with normal initramfs. The kernel binds both nvidia GPUs to the nvidia driver.
- -When the cron service is started, a cron job with the '@reboot' setting calls a shell script that
- unbinds the nvidia driver from the 2nd GPU and binds it to VFIO.
- -Then the displaymanager starts X, ignoring the GPU bound to VFIO.
- -Login to normal desktop using 1st GPU.
- -Now another script can start windows10 VM or rebind the 2nd GPU to the nvidia driver for CUDA or to do whatever,
- because your at-present-running X server is only concerned with the 1st GPU.
- This method was hinted at by 'TheCakeIsNaOH' here:
- https://forum.level1techs.com/t/identical-gpu-passthrough-ubuntu/138843/14
- The purpose of this text is to write this all out a bit more, step-by-step.
- 3 Let's do it:
- -------------
- 3.1 IOMMU groups:
- -----------------
- Follow: https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF
- up to and including section 2.3.1 'Isolating the GPU'. At this point, your IOMMU groups should be sane, meaning
- your 2nd GPU is in its own IOMMU group.
- In my system, the 2nd GPU and its audio interface (the card I want to pass through) are in group 16,
- as given by the script 'Ensuring that the groups are valid' in section 2.2 of the Archwiki article:
- ----------------------------------------------
- #!/bin/bash
- shopt -s nullglob
- for g in /sys/kernel/iommu_groups/*; do
- echo "IOMMU Group ${g##*/}:"
- for d in $g/devices/*; do
- echo -e "\t$(lspci -nns ${d##*/})"
- done;
- done;
- ----------------------------------------------
- This outputs:
- ...
- IOMMU Group 16 09:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev a1)
- IOMMU Group 16 09:00.1 Audio device [0403]: NVIDIA Corporation GP102 HDMI Audio Controller [10de:10ef] (rev a1)
- ...
- while my 1st GPU (the one running my Linux desktop) is in group 32:
- ...
- IOMMU Group 32 41:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev a1)
- IOMMU Group 32 41:00.1 Audio device [0403]: NVIDIA Corporation GP102 HDMI Audio Controller [10de:10ef] (rev a1)
- ...
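- Tip: save the IOMMU script above as 'checkiommugroups.sh' and make it executable; it is called again under that name in Section 3.7.2. Filtering its output makes finding your cards quick (10de is the NVIDIA vendor ID):
- chmod +x checkiommugroups.sh
- ./checkiommugroups.sh | grep -i nvidia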
- Note: strangely enough, nvidia-smi indexes the 2nd GPU as '0' while the 1st one is '1':
- nvidia-smi
- Thu May 30 12:06:19 2019
- +-----------------------------------------------------------------------------+
- | NVIDIA-SMI 418.56 Driver Version: 418.56 CUDA Version: 10.1 |
- |-------------------------------+----------------------+----------------------+
- | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
- | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
- |===============================+======================+======================|
- | 0 GeForce GTX 108... Off | 00000000:09:00.0 Off | N/A |
- | 0% 32C P8 11W / 250W | 2MiB / 11178MiB | 0% Default |
- +-------------------------------+----------------------+----------------------+
- | 1 GeForce GTX 108... Off | 00000000:41:00.0 On | N/A |
- | 0% 35C P8 12W / 250W | 153MiB / 11178MiB | 6% Default |
- +-------------------------------+----------------------+----------------------+
- +-----------------------------------------------------------------------------+
- | Processes: GPU Memory |
- | GPU PID Type Process name Usage |
- |=============================================================================|
- | 1 1138 G /usr/lib/xorg/Xorg 148MiB |
- | 1 1683 G compton 3MiB |
- +-----------------------------------------------------------------------------+
- Also, note that nvidia-smi does not report every process that has opened /dev/nvidia0:
- sudo lsof /dev/nvidia0
- will report several opened files.
- 3.2 Disable nvidia-persistenced:
- --------------------------------
- The nvidia-persistenced daemon must be disabled to prevent it from (re-)initializing the 2nd GPU.
- For more info, see: https://docs.nvidia.com/deploy/driver-persistence/index.html
- Do once:
- sudo systemctl disable nvidia-persistenced
- Reboot. Check with:
- nvidia-smi -i 0 -q
- ==============NVSMI LOG==============
- Timestamp : Thu May 30 12:36:45 2019
- Driver Version : 418.56
- CUDA Version : 10.1
- Attached GPUs : 2
- GPU 00000000:09:00.0
- Product Name : GeForce GTX 1080 Ti
- Product Brand : GeForce
- Display Mode : Enabled
- Display Active : Disabled
- Persistence Mode : Disabled <----OK!
- ...
- My conky script uses nvidia-smi to obtain GPU memory, fan state, etc.
- For some reason, disabling nvidia-persistenced slows down nvidia-smi (and my desktop),
- but only AFTER the first time a Windows virtual machine has been shut down from my normal desktop environment.
- To avoid this, the VM start script re-enables nvidia persistence mode after the VM is shut down (see Section 3.5.2).
- If needed, persistence can be controlled manually, per GPU, by doing:
- sudo nvidia-smi -i {0,1} -pm {DISABLED,ENABLED}
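- A quick way to verify the persistence state of both GPUs at once (using nvidia-smi's query interface):
- nvidia-smi --query-gpu=index,pci.bus_id,persistence_mode --format=csv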
- 3.3 Install the unbind script:
- -------------------------------
- 3.3.1 Create a script called 'unbind_nvidia_bind_vfio.sh':
- ----------------------------------------------------------
- #!/bin/sh
- #place in /usr/local/bin
- #unbinds GPU from Nvidia driver, bind to VFIO
- /sbin/modprobe vfio
- /sbin/modprobe vfio_pci
- # VGA
- echo '0000:09:00.0' > /sys/bus/pci/devices/0000:09:00.0/driver/unbind
- echo '10de 1b06' > /sys/bus/pci/drivers/vfio-pci/new_id
- echo '0000:09:00.0' > /sys/bus/pci/devices/0000:09:00.0/driver/bind
- echo '10de 1b06' > /sys/bus/pci/drivers/vfio-pci/remove_id
- # Audio
- echo '0000:09:00.1' > /sys/bus/pci/devices/0000:09:00.1/driver/unbind
- echo '10de 10ef' > /sys/bus/pci/drivers/vfio-pci/new_id
- echo '0000:09:00.1' > /sys/bus/pci/devices/0000:09:00.1/driver/bind
- echo '10de 10ef' > /sys/bus/pci/drivers/vfio-pci/remove_id
- #this is a kvm option, needed to avoid blue-screen-of-death at ovmf uefi boot of the windows10 iso.
- #see: https://forum.level1techs.com/t/windows-10-1803-as-guest-with-qemu-kvm-bsod-under-install/127425/9
- echo 1 > /sys/module/kvm/parameters/ignore_msrs
- exit 0
- In the above script, adjust the PCI addresses to your 2nd GPU (the one to be passed through) and its vendor and device IDs,
- as reported by the IOMMU script in 3.1. Note that device IDs and PCI addresses are different for the VGA and Audio sections!
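- If you are unsure of the addresses and IDs, lspci can list all NVIDIA functions together with their [vendor:device] IDs:
- lspci -nn -d 10de: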
- Copy the script to /usr/local/bin and adjust the execution permissions and owner (if needed):
- sudo chmod 755 unbind_nvidia_bind_vfio.sh
- sudo chown root unbind_nvidia_bind_vfio.sh
- 3.3.2 Check that the unbinding script works:
- --------------------------------------------
- In this section, we check that the above script works by calling it manually from the command line.
- As explained in Section 2, rebinding does not work until all processes using the 2nd GPU (X servers etc.) are shut down,
- so we first need to drop to a plain-text terminal and stop all X servers.
- -Exit your desktop environment. This returns you to the display manager (lxdm for me).
- -Go to a plain-text terminal session (ctrl-alt-F5 etc.).
- -Log in as a normal user.
- -Then stop the displaymanager service:
- sudo service lxdm stop
- -You will be prompted to log in again.
- -Run the unbind script from section 3.3.1 manually:
- sudo /usr/local/bin/unbind_nvidia_bind_vfio.sh
- -Check for the display part of the 2nd GPU (change vendor and device id as in the unbind script to that of your card):
- lspci -nnk -d 10de:1b06
- which for me returns:
- 09:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev a1)
- Subsystem: ASUSTeK Computer Inc. GP102 [GeForce GTX 1080 Ti] [1043:85f1]
- Kernel driver in use: vfio-pci <---2nd GPU, OK!
- Kernel modules: nvidia
- 41:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev a1)
- Subsystem: ASUSTeK Computer Inc. GP102 [GeForce GTX 1080 Ti] [1043:85f1]
- Kernel driver in use: nvidia <---1st GPU, OK!
- Kernel modules: nvidia
- -Check for the audio part of the 2nd GPU:
- lspci -nnk -d 10de:10ef
- which for me returns:
- 09:00.1 Audio device [0403]: NVIDIA Corporation GP102 HDMI Audio Controller [10de:10ef] (rev a1)
- Subsystem: ASUSTeK Computer Inc. GP102 HDMI Audio Controller [1043:85f1]
- Kernel driver in use: vfio-pci <---OK!
- Kernel modules: snd_hda_intel
- 41:00.1 Audio device [0403]: NVIDIA Corporation GP102 HDMI Audio Controller [10de:10ef] (rev a1)
- Subsystem: ASUSTeK Computer Inc. GP102 HDMI Audio Controller [1043:85f1]
- Kernel driver in use: snd_hda_intel
- Kernel modules: snd_hda_intel
- 'Kernel driver in use' should now be 'vfio-pci' for the passthrough card.
- -Restart the displaymanager service:
- sudo service lxdm start
- -The display manager will start. Log in to a normal desktop environment.
- At this moment, X is not using the 2nd GPU and a Qemu command to start the Windows VM (see further, Section 3.5)
- can be called and should work. Also, rebinding to the nvidia driver (see further, Section 3.4) should work while X is running.
- Unfortunately, the situation as it is now does not survive a reboot.
- As explained in the outline (Section 2), to avoid having to stop X and do this manual unbinding every time,
- the unbinding script will be called by cron at boot. This is explained in the next section.
- 3.3.3 Make the script execute at boot:
- --------------------------------------
- -To make cron execute the unbind_nvidia_bind_vfio.sh script at boot, do in a terminal:
- sudo crontab -e
- -Select an editor like nano and add to the file:
- @reboot /usr/local/bin/unbind_nvidia_bind_vfio.sh 2>&1 | /usr/bin/logger -t unbind_nvidia_bind_vfio
- The last part causes the output and errors of the unbind script to be logged.
- -Save, reboot, log in to desktop environment and check with:
- sudo cat /var/log/syslog | grep unbind_nvidia_bind_vfio
- This returns:
- Jun 3 19:50:42 home CRON[955]: (root) CMD (/usr/local/bin/unbind_nvidia_bind_vfio.sh 2>&1 | /usr/bin/logger -t unbind_nvidia_bind_vfio)
- Jun 3 19:50:42 home unbind_nvidia_bind_vfio: /usr/local/bin/unbind_nvidia_bind_vfio.sh: 11: echo: echo: I/O error
- Jun 3 19:50:43 home unbind_nvidia_bind_vfio: /usr/local/bin/unbind_nvidia_bind_vfio.sh: 17: echo: echo: I/O error
- I am not 100% sure why the I/O errors happen; most likely, writing the ID to 'new_id' already makes vfio-pci claim the device, so the explicit 'bind' that follows fails. Either way, it seems harmless.
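- On systemd-based systems the same messages can also be read back with journalctl, filtered on the logger tag (assuming journald collects syslog messages, as it does by default on Debian):
- sudo journalctl -b -t unbind_nvidia_bind_vfio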
- Check that the 2nd GPU is bound to the VFIO driver:
- lspci -nnk -d 10de:1b06
- 09:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev a1)
- Subsystem: ASUSTeK Computer Inc. GP102 [GeForce GTX 1080 Ti] [1043:85f1]
- Kernel driver in use: vfio-pci <---2nd GPU, OK!
- Kernel modules: nvidia
- 41:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev a1)
- Subsystem: ASUSTeK Computer Inc. GP102 [GeForce GTX 1080 Ti] [1043:85f1]
- Kernel driver in use: nvidia <---1st GPU, OK!
- Kernel modules: nvidia
- And similarly for the audio part of the 2nd GPU:
- lspci -nnk -d 10de:10ef
- 09:00.1 Audio device [0403]: NVIDIA Corporation GP102 HDMI Audio Controller [10de:10ef] (rev a1)
- Subsystem: ASUSTeK Computer Inc. GP102 HDMI Audio Controller [1043:85f1]
- Kernel driver in use: vfio-pci <---OK!
- Kernel modules: snd_hda_intel
- ...
- 3.4 Install the rebind script:
- ------------------------------
- Create the following script 'unbind_vfio_bind_nvidia.sh' in /usr/local/bin:
- #!/bin/sh
- #place in /usr/local/bin
- #unbind vfio and rebind 2nd GPU to nvidia
- # Unbind the GPU from vfio-pci
- echo -n "0000:09:00.0" > /sys/bus/pci/drivers/vfio-pci/unbind || echo "Failed to unbind gpu from vfio-pci"
- echo -n "0000:09:00.1" > /sys/bus/pci/drivers/vfio-pci/unbind || echo "Failed to unbind gpu-audio from vfio-pci"
- # Remove GPU from vfio-pci
- echo -n "10de 1b06" > /sys/bus/pci/drivers/vfio-pci/remove_id
- echo -n "10de 10ef" > /sys/bus/pci/drivers/vfio-pci/remove_id
- # Remove vfio driver (is this needed?)
- /sbin/modprobe -r vfio-pci
- # Bind the GPU to its drivers
- echo -n "0000:09:00.0" > /sys/bus/pci/drivers/nvidia/bind || echo "Failed to bind nvidia"
- echo -n "0000:09:00.1" > /sys/bus/pci/drivers/snd_hda_intel/bind || echo "Failed to bind snd_hda_intel"
- exit 0
- -Change permissions and if needed root ownership:
- sudo chmod 755 unbind_vfio_bind_nvidia.sh
- sudo chown root unbind_vfio_bind_nvidia.sh
- -Check that the script works:
- sudo unbind_vfio_bind_nvidia.sh
- lspci -nnk -d 10de:1b06
- lspci -nnk -d 10de:10ef
- -Check that nvidia-smi can access the 2nd GPU again:
- nvidia-smi
- sudo nvidia-smi -i 0 --gpu-reset
- -->
- GPU 00000000:09:00.0 is currently in use by another process.
- Important note: nvidia-smi seems to renumber the GPUs according to their PCI address,
- so the 1st GPU that WAS number 1 (address 41:00.x) becomes GPU 0 as long as it is the only
- one bound to the nvidia driver. All terribly confusing :-/
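- To keep track of which index is which, you can ask nvidia-smi to print its index next to the PCI bus id:
- nvidia-smi --query-gpu=index,pci.bus_id,name --format=csv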
- Before continuing, make sure the 2nd GPU is bound to vfio by calling the unbind_nvidia_bind_vfio.sh script.
- 3.5 Qemu Windows10 booting:
- ---------------------------
- 3.5.1 Preparations:
- -------------------
- -Download the windows10 .iso from:
- https://www.microsoft.com/en-us/software-download/windows10ISO
- I used Win10_1903_V1_EnglishInternational_x64.iso
- Note:
- you can install the iso without a product key.
- Some features like choosing the wallpaper will be disabled.
- For more info, see: https://www.howtogeek.com/244678/you-dont-need-a-product-key-to-install-and-use-windows-10/
- -Download the Virtio .iso drivers that windows10 will use to access the SSD in the virtual environment:
- https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/latest-virtio/virtio-win.iso
- I have virtio-win-0.1.171.iso
- More info: https://passthroughpo.st/disk-passthrough-explained/
- -Follow the steps of "Configuring libvirt" from the Archwiki:
- https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#Setting_up_an_OVMF-based_guest_VM
- -Copy /usr/share/OVMF/OVMF_VARS.fd to (a directory in) your home. This enables storing changed virtualized UEFI boot parameters.
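- For example (assuming the ~/windows_vm directory that the Qemu script below uses):
- mkdir -p ~/windows_vm
- cp /usr/share/OVMF/OVMF_VARS.fd ~/windows_vm/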
- -Obviously, Qemu needs to be installed too:
- qemu-system-x86_64 --version
- QEMU emulator version 3.1.0 (Debian 1:3.1+dfsg-7)
- Copyright (c) 2003-2018 Fabrice Bellard and the QEMU Project developers
- (alternatively, go the 'virt-manager route'. See the Archwiki)
- 3.5.2 The Qemu script:
- ----------------------
- Create a script 'start_windows.sh' in your home directory and modify to your system specifics (see below):
- #!/bin/sh
- #watch out: no spaces allowed between ',' and options of qemu!
- #if not done so already, disable nvidia persistence mode
- nvidia-smi -i 0 -pm DISABLED
- nvidia-smi -i 1 -pm DISABLED
- #if 2nd GPU is not bound to vfio-pci driver, call unbinding script
- if [ ! -e /sys/bus/pci/drivers/vfio-pci/0000:09:00.0 ]; then
- /usr/local/bin/unbind_nvidia_bind_vfio.sh
- sleep 1
- fi
- export QEMU_AUDIO_DRV=alsa QEMU_AUDIO_TIMER_PERIOD=0
- qemu-system-x86_64 \
- -machine q35,accel=kvm \
- -enable-kvm -m 16384 -cpu host,kvm=off -smp 8,sockets=1,cores=8,threads=1 \
- -vga none \
- -nographic \
- -rtc base=localtime,clock=vm \
- -device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \
- -device piix4-ide,bus=pcie.0,id=piix4-ide \
- -device vfio-pci,host=09:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on \
- -device vfio-pci,host=09:00.1,bus=pcie.0 \
- -device nec-usb-xhci \
- -device usb-host,hostbus=5,hostaddr=2 \
- -drive file=/usr/share/OVMF/OVMF_CODE.fd,if=pflash,format=raw,readonly \
- -drive file=/home/lars/windows_vm/OVMF_VARS.fd,if=pflash,format=raw,unit=1 \
- -boot order=dc \
- -drive if=virtio,id=disk0,cache=none,format=raw,file=/dev/sda \
- -drive file=/home/lars/windows_vm/Win10_1903_V1_EnglishInternational_x64.iso,index=1,media=cdrom \
- -drive file=/home/lars/windows_vm/virtio-win-0.1.171.iso,index=2,media=cdrom
- #rebind to nvidia driver
- /usr/local/bin/unbind_vfio_bind_nvidia.sh
- #re-enable nvidia persistence mode otherwise nvidia-smi runs slow, slowing down conky and my desktop.
- sleep 1
- nvidia-smi -i 0 -pm ENABLED
- nvidia-smi -i 1 -pm ENABLED
- exit 0
- Note: jump ahead to Section 3.7.2 for the *final* script including the USB passthrough needed for the Oculus Rift.
- Qemu script modifications:
- --------------------------
- -Change the PCI address of the 2nd GPU (0000:09:00.0) to that of your 2nd GPU. Likewise with the audio part (0000:09:00.1).
- -Change '/home/lars' to the location where you keep your scripts.
- -'/dev/sda' is the SSD to install Windows10 on.
- Find out the device name with 'lsblk'. Before the 1st boot, remove any existing partitions with gparted.
- -Make sure you, the regular user, have permissions to read/write this disk.
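- For example, to identify the disk and double-check that the host has nothing mounted on it (the column selection is just a suggestion):
- lsblk -o NAME,SIZE,MODEL,MOUNTPOINT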
- -'hostbus=5,hostaddr=2' is the address of my Logitech K400+ wireless keyboard/touchpad that the windows10 guest will use.
- Other (extra) usb ports can be passed through in the same manner.
- Find out the hostbus and hostaddr values with:
- lsusb
- ...
- Bus 005 Device 002: ID 046d:c52b Logitech, Inc. Unifying Receiver
- ...
- For more info, see: https://unix.stackexchange.com/questions/452934/can-i-pass-through-a-usb-port-via-qemu-command-line
- -Make the script executable:
- chmod 755 start_windows.sh
- 3.5.3 First time booting and installing Windows10:
- --------------------------------------------------
- -Execute the script as root (how to avoid running as root?):
- sudo ./start_windows.sh
- -After a few seconds, the monitor connected to the 2nd GPU will display OVMF UEFI booting.
- -If you drop into the OVMF shell, type 'exit' and press Enter.
- -In the boot menu, select one of the Qemu CDROMs to boot the Windows iso.
- The first time, I got a blue screen saying 'SYSTEM THREAD EXCEPTION NOT HANDLED'.
- This line, added to unbind_nvidia_bind_vfio.sh, solved it:
- echo 1 > /sys/module/kvm/parameters/ignore_msrs
- For more info, see: https://forum.level1techs.com/t/windows-10-1803-as-guest-with-qemu-kvm-bsod-under-install/127425/9
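- You can verify that the setting took effect; ignore_msrs is a boolean module parameter, so reading it back should print Y:
- cat /sys/module/kvm/parameters/ignore_msrs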
- -If Windows asks for a key, click 'I don't have a key'.
- -When asked where to install, you will see an empty list. On the lower left, you can choose 'Load driver'. Select the 'CDROM' with the Virtio drivers and install them.
- The SSD that Windows will be installed on will then appear in the list. Select it and continue the installation.
- For some nice screenshots about installing the Virtio drivers, see:
- http://www.zeta.systems/blog/2018/07/03/Installing-Virtio-Drivers-In-Windows-On-KVM/
- -I chose regular Windows 10 Home.
- For differences between the versions, see:
- https://answers.microsoft.com/en-us/windows/forum/windows_10-other_settings/whats-the-difference-between-windows-10-education/f05e202f-815a-47dc-a641-e3a85e974a0b
- -Install Nvidia drivers (download and execute .exe from card manufacturer site). Reboot.
- -(Specific for my Asus GPU) Install 'GPUTweakII' bloatware to control GPU overclocking and 'AURA_RGBLightingControl' for rainbow-unicorn-barf LEDs of the GPU.
- Warning: rainbow-barf settings survive reboot.
- -Install the Unigine Heaven benchmark:
- https://benchmark.unigine.com/heaven
- so you look happy like Wendell:
- https://www.youtube.com/watch?v=UD4BxGNShw8
- If, when first starting Heaven, you get an error saying msvcp100.dll is not found, then search for it and
- copy and paste it to C:\Windows\System32 and C:\Windows\SysWOW64\, as explained here:
- https://www.reddit.com/r/Windows10/comments/3ulr79/msvcp100dll_missing_for_unigine_valley_benchmark/
- Heaven benchmark score with resolution 1680x1050, antialiasing x8, details on Ultra: 3274.
- *** Yeey! You did it! ***
- 3.6 Tweaks:
- -----------
- 3.6.1 CPU 'pinning' with taskset:
- ---------------------------------
- -Enable NUMA (Non Uniform Memory Access) and then assign CPUs to the VM:
- This is specific to the AsRock Taichi X399 Motherboard. For others, the menus and/or settings may be different.
- At host boot, go into the UEFI menu by pressing F2.
- Then follow the menus: Advanced --> AMD CBS --> DF Common options --> set option 'Memory interleaving' to 'Channel'.
- -Check with:
- lstopo
- The machine layout should resemble the image at: https://imgur.com/a/frnUq with two NUMA nodes.
- -Find out to which NUMA node the passthrough GPU is connected:
- lstopo --verbose
- ...
- NUMANode L#0 (P#0 local=16382396KB total=16382396KB) <--- node 0
- ...
- PCI 10de:1b06 (P#36864 busid=0000:09:00.0 class=0300(VGA) link=4.00GB/s PCIVendor="NVIDIA Corporation") "NVIDIA Corporation" <--- the passthrough GPU
- GPU L#6 "renderD128"
- GPU L#7 "card0"
- ...
- -From the drawing lstopo gave above, you can now find out which CPUs are in the same NUMA node as the passthrough GPU.
- Note that lscpu can also give this information:
- lscpu
- ...
- NUMA node0 CPU(s): 0-7,16-23 <---where the passthrough GPU is at
- NUMA node1 CPU(s): 8-15,24-31
- ...
- We will now use the taskset command to 'pin' Qemu to those CPUs in the same NUMA node as the passthrough GPU.
- Qemu also needs some threads of its own, next to the ones that make up the VM.
- See: https://www.reddit.com/r/VFIO/comments/4vqnnv/qemu_command_line_cpu_pinning/
- -Adjust the Qemu script of Section 3.5.2 as follows:
- ...
- taskset --cpu-list --all-tasks 0-7,16-23 qemu-system-x86_64 \
- ...
- - Alternatively, after the Qemu script is started, do from a terminal in the host:
- QEMUPID=$(pidof -s qemu-system-x86_64)
- taskset --cpu-list --all-tasks --pid 0-7,16-23 $QEMUPID
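- In both cases, the resulting CPU affinity can be checked with taskset as well (passing only --pid reads the affinity instead of setting it):
- taskset --cpu-list --pid $QEMUPID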
- 3.7 Steps to install the Oculus Rift:
- -------------------------------------
- The Oculus Rift (CV1) needs 3 USB 3.0 ports: 2 for the position sensor thingies and
- one that goes to the helmet itself, along with an HDMI for video.
- I have a USB 3.0 controller in a single IOMMU group (see script in Section 3.1):
- ...
- IOMMU Group 35 42:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]
- ...
- This controller is bound to xhci_hcd driver:
- ls -la /sys/bus/pci/devices/0000:42:00.3/
- ...
- driver -> ../../../../bus/pci/drivers/xhci_hcd
- ...
- 3.7.1 Stuff that doesn't work (and why):
- ----------------------------------------
- (see 3.7.2 for what DOES work, this section is left in to show what I tried out)
- It seems simple to add all ports needed to the Qemu script:
- -device nec-usb-xhci,id=xhci3,multifunction=on \
- -device usb-host,bus=xhci3.0,port=1,vendorid=0x2833,productid=0x3031 \
- -device usb-host,bus=xhci3.0,port=2,vendorid=0x2833,productid=0x0031 \
- -device usb-host,bus=xhci3.0,port=3,vendorid=0x2833,productid=0x2031 \
- -device nec-usb-xhci,id=xhci2,multifunction=on \
- -device usb-host,bus=xhci2.0,port=1,vendorid=0x046d,productid=0xc52b \ <-- for keyboard
- -device usb-host,bus=xhci2.0,port=2,vendorid=0x2833,productid=0x0211 \
- -device usb-host,bus=xhci2.0,port=3,vendorid=0x2833,productid=0x0211 \
- In the terminal that calls qemu, the guest-initiated resets of the Oculus devices give:
- libusb: error [_open_sysfs_attr] open /sys/bus/usb/devices/5-2.1/bConfigurationValue failed ret=-1 errno=2
- libusb: error [_get_usbfs_fd] File doesn't exist, wait 10 ms and try again
- libusb: error [_get_usbfs_fd] libusb couldn't open USB device /dev/bus/usb/005/046: No such file or directory
- libusb: error [udev_hotplug_event] ignoring udev action bind
- libusb: error [udev_hotplug_event] ignoring udev action bind
- However, even though this detects all parts of the Oculus (including the 2 sensors), it fails to update the Oculus firmware
- or link the controllers: the guest resets the devices, the host passes the resets through,
- and udev then renumbers the devices, which confuses qemu.
- After some googling:
- https://www.reddit.com/r/VFIO/comments/97dhbw/qemu_w10_xbox_one_controller/
- ---> Same problem here, the host xhci keep resetting the device to a new address.
- I've google around but only find people choose to pass the entire usb controller, which works,
- but not actually solving this problem (and I don't have any usb controller to spare).
- --> https://patchwork.ozlabs.org/patch/1031919/
- With certain USB devices passed through via usb-host, a guest attempting to
- reset a usb-host device can trigger a reset loop that renders the USB device
- unusable. In my use case, the device was an iPhone XR that was passed through to
- a Mac OS X Mojave guest. Upon connecting the device, the following happens:
- 1) Guest recognizes new device, sends reset to emulated USB host
- 2) QEMU's USB host sends reset to host kernel
- 3) Host kernel resets device
- 4) After reset, host kernel determines that some part of the device descriptor
- has changed ("device firmware changed" in dmesg), so host kernel decides to
- re-enumerate the device.
- 5) Re-enumeration causes QEMU to disconnect and reconnect the device in the
- guest.
- 6) goto 1)
- Same kind of problem reported here:
- https://www.redhat.com/archives/vfio-users/2016-February/msg00034.html
- So: in my case a reset loop is not initiated, but the Oculus firmware can't be updated since the device disappears; only unplugging/replugging works.
- The option to suppress guest resets introduced by that patch only landed in qemu 4.0 (April 2019). Mine is 3.1 :-/
- 3.7.2 Passthrough of the entire USB controller:
- -------------------------------------------
- The advice is always the same: pass through the entire usb controller!
- -First check which one it is:
- ./checkiommugroups.sh | grep USB
- IOMMU Group 14 01:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] X399 Series Chipset USB 3.1 xHCI Controller [1022:43ba] (rev 02)
- IOMMU Group 19 0a:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]
- IOMMU Group 35 42:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c] <--- this one!
- ./show_iommu.sh
- IOMMU group 35
- 42:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]
- Driver: xhci_hcd
- Usb bus:
- Bus 005 Device 049: ID 2833:1031 <---2833=Oculus stuff
- Bus 005 Device 048: ID 2833:2031
- Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
- Usb bus:
- Bus 006 Device 005: ID 2833:0211
- Bus 006 Device 006: ID 2833:0211
- Bus 006 Device 012: ID 2833:3031
- Bus 006 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
- ...
- IOMMU group 19
- 0a:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]
- Driver: xhci_hcd
- Usb bus:
- Bus 003 Device 003: ID 25a7:fa23
- Bus 003 Device 005: ID 046d:c52b Logitech, Inc. Unifying Receiver <----wireless keyboard for the VM
- Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
- Usb bus:
- Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
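- Note: the 'show_iommu.sh' helper used above is not listed in this text. A minimal sketch that produces similar output (group, PCI device, bound driver, and the USB devices on each bus of a controller) could look like this:
- ----------------------------------------------
- #!/bin/sh
- #sketch of 'show_iommu.sh': for every IOMMU group, list its PCI devices,
- #the driver bound to each, and (for USB controllers) the devices per USB bus
- for g in /sys/kernel/iommu_groups/*; do
-     echo "IOMMU group ${g##*/}"
-     for d in "$g"/devices/*; do
-         lspci -nns "${d##*/}"
-         drv=$(readlink "$d/driver" 2>/dev/null)
-         [ -n "$drv" ] && echo "    Driver: ${drv##*/}"
-         #a USB controller exposes one sysfs dir per bus it provides
-         for b in "$d"/usb*/busnum; do
-             [ -e "$b" ] || continue
-             echo "    Usb bus:"
-             lsusb -s "$(cat "$b"):"
-         done
-     done
- done
- ----------------------------------------------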
- So we need to pass through the USB controller that is in IOMMU group 35.
- In the same way as explained for the passthrough GPU, the entire usb controller of IOMMU group 35 can be bound to the VFIO driver:
- #!/bin/sh
- #place in /usr/local/bin
- #unbind usb controller in iommu group 35 and bind to vfio for passthrough
- #if not already done..
- /sbin/modprobe vfio
- /sbin/modprobe vfio_pci
- #
- echo '0000:42:00.3' > /sys/bus/pci/devices/0000:42:00.3/driver/unbind
- echo '1022 145c' > /sys/bus/pci/drivers/vfio-pci/new_id
- echo '0000:42:00.3' > /sys/bus/pci/devices/0000:42:00.3/driver/bind
- echo '1022 145c' > /sys/bus/pci/drivers/vfio-pci/remove_id
- sleep 1
- #check the driver now associated with the usb controller
- lspci -nnk -d 1022:145c
- exit 0
- -Place the script in /usr/local/bin as 'unbind_usb_controller_bind_vfio.sh'.
- -Check with:
- lspci -nnk -d 1022:145c
- 0a:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]
- Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:d102]
- Kernel driver in use: xhci_hcd
- Kernel modules: xhci_pci
- 42:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]
- Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]
- Kernel driver in use: vfio-pci <---OK!
- Kernel modules: xhci_pci
- -Now update the start_windows.sh script to the *grand total* of:
- #!/bin/sh
- #watch out: no spaces allowed between ',' and options of qemu!
- #if not done so already, disable nvidia persistence mode
- nvidia-smi -i 0 -pm DISABLED
- nvidia-smi -i 1 -pm DISABLED
- #if 2nd GPU is not bound to vfio-pci driver, call unbinding script
- if [ ! -e /sys/bus/pci/drivers/vfio-pci/0000:09:00.0 ]; then
- /usr/local/bin/unbind_nvidia_bind_vfio.sh
- sleep 1
- fi
- #bind usb controller to vfio
- if [ ! -e /sys/bus/pci/drivers/vfio-pci/0000:42:00.3 ]; then
- /usr/local/bin/unbind_usb_controller_bind_vfio.sh
- sleep 1
- fi
- export QEMU_AUDIO_DRV=alsa QEMU_AUDIO_TIMER_PERIOD=0
- taskset --cpu-list --all-tasks 0-7,16-23 qemu-system-x86_64 \
- -machine q35,accel=kvm \
- -enable-kvm -m 16384 \
- -cpu host,kvm=off,check,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_vendor_id=whatever \
- -smp 8,sockets=1,cores=8,threads=1 \
- -vga none \
- -nographic \
- -rtc base=localtime,clock=vm \
- -device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \
- -device piix4-ide,bus=pcie.0,id=piix4-ide \
- -device vfio-pci,host=09:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on \
- -device vfio-pci,host=09:00.1,bus=pcie.0 \
- -device vfio-pci,host=42:00.3,multifunction=on \
- -device nec-usb-xhci,id=xhci2,multifunction=on \
- -device usb-host,bus=xhci2.0,port=1,vendorid=0x046d,productid=0xc52b \
- -drive file=/usr/share/OVMF/OVMF_CODE.fd,if=pflash,format=raw,readonly \
- -drive file=/home/lars/windows_vm/OVMF_VARS.fd,if=pflash,format=raw,unit=1 \
- -boot order=dc \
- -drive if=virtio,id=disk0,cache=none,format=raw,file=/dev/sda \
- -drive file=/home/lars/windows_vm/Win10_1903_V1_EnglishInternational_x64.iso,index=1,media=cdrom \
- -drive file=/home/lars/windows_vm/virtio-win-0.1.171.iso,index=2,media=cdrom
- #rebind to nvidia driver
- /usr/local/bin/unbind_vfio_bind_nvidia.sh
- #re-enable nvidia persistence mode otherwise nvidia-smi runs slow, slowing down conky.
- sleep 1
- nvidia-smi -i 0 -pm ENABLED
- nvidia-smi -i 1 -pm ENABLED
- exit 0
- *** That's it! ***
- Make sure to put the wireless USB keyboard dongle in a port that belongs to the USB controller that is passed through.
- Wait! Don't we need a script to rebind the USB controller to xhci_hcd?
- No. The line that says:
- /sbin/modprobe -r vfio-pci
- in 'unbind_vfio_bind_nvidia.sh' unloads the vfio driver and apparently that
- makes the usb controller automagically rebind to its original driver. Amazing, isn't it?
- We're done. Passthrough with identical GPUs and the Oculus Rift working *flawlessly*.
- Happy Robo-Recalling!!
- 4. TODOs:
- --------
- Adapt so that sudo isn't necessary to start the VM,
- See: https://www.evonide.com/non-root-gpu-passthrough-setup/
- V. Version history:
- ------------------
- 29/05/2019 Initial.
- 12/07/2019 Added -rtc base=localtime,clock=vm to the qemu command line, as a correct clock is needed to make a RecRoom VR account.
- R. Useful links in no particular order:
- ---------------------------------------
- https://www.reddit.com/r/VFIO/comments/8jreon/help_with_using_oculus_rift_in_windows_10_kvm_vm/
- https://turlucode.com/qemu-kvm-installing-windows-10-client/
- https://www.reddit.com/r/VFIO/comments/7avvwx/qemuaffinity_pin_qemu_kvm_cores_to_host_cpu_cores/
- https://imgur.com/a/frnUq
- https://forum.level1techs.com/t/enable-numa-on-threadripper/123544
- https://www.reddit.com/r/Amd/comments/6vrcq0/psa_threadripper_umanuma_setting_in_bios/
- https://devtalk.nvidia.com/default/topic/1016989/cuda-setup-and-installation/nvidia-smi-is-slow-and-hangs-after-sometime-with-1080ti/
- https://www.reddit.com/r/VFIO/comments/991qzz/solutions_for_bindingunbinding_gpu_from_host/
- https://wiki.debian.org/DebianTesting
- https://forum.level1techs.com/t/identical-gpu-passthrough-ubuntu/138843/14
- https://forum.level1techs.com/t/the-vfio-and-gpu-passthrough-beginners-resource/129897
- https://docs.nvidia.com/deploy/driver-persistence/index.html#persistence-daemon
- https://devtalk.nvidia.com/default/topic/1051170/cuda-setup-and-installation/nvidia-persistenced-failed-to-initialize-check-syslog-for-more-details-/
- https://www.linux-kvm.org/page/Virtio
- https://www.howtogeek.com/244678/you-dont-need-a-product-key-to-install-and-use-windows-10/
- https://passthroughpo.st/disk-passthrough-explained/
- https://www.reddit.com/r/Windows10/comments/3ulr79/msvcp100dll_missing_for_unigine_valley_benchmark/
- https://ritsch.io/2017/08/02/execute-script-at-linux-startup.html
- https://forum.level1techs.com/t/windows-10-1803-as-guest-with-qemu-kvm-bsod-under-install/127425/9
- https://forum.level1techs.com/t/gpu-passthrough-vfio-blue-screen/132808
- https://www.reddit.com/r/VFIO/comments/9pc0j7/dynamically_bindingunbinding_an_nvidia_card_from/
- https://gitlab.com/YuriAlek/vfio/blob/master/scripts/windows-basic.sh
- https://turlucode.com/qemu-kvm-installing-windows-10-client/
- https://www.reddit.com/r/VFIO/comments/708uur/nvidia_switching_gpu_between_vm_and_host/
- https://www.reddit.com/r/VFIO/comments/8q9923/looking_for_tutorial_linux_kvm_qemu_ssd/
- https://dennisnotes.com/note/20180614-ubuntu-18.04-qemu-setup/
- https://unix.stackexchange.com/questions/452934/can-i-pass-through-a-usb-port-via-qemu-command-line
- https://www.reddit.com/r/VFIO/comments/4vqnnv/qemu_command_line_cpu_pinning/
- https://www.evonide.com/non-root-gpu-passthrough-setup/
- ---------------------------------------------------------------------------------------------------------------------------------