docs: Add doc for VM interfaces and networks #23

Merged 1 commit on Aug 11, 2022
**README.md** (3 additions, 3 deletions)
- [x] VM lifecycle management
- [x] Container disks
- [x] Direct kernel boot with container rootfs
- [x] [Pod network](docs/interfaces_and_networks.md#pod-network)
- [x] [Multus CNI networks](docs/interfaces_and_networks.md#multus-network)
- [x] Persistent volumes
- [x] CDI data volumes
- [x] ARM64 support
- [ ] VM live migration
- [ ] [SR-IOV NIC passthrough](docs/interfaces_and_networks.md#sriov-mode)
- [ ] GPU passthrough
- [ ] Dedicated vCPU allocation
- [ ] VM devices hot-plug
**docs/interfaces_and_networks.md** (196 additions)
# Interfaces and Networks

Connecting a VM to a network consists of two parts. First, networks are specified in `spec.networks`. Then, interfaces backed by the networks are added to the VM by specifying them in `spec.instance.interfaces`. Each interface must have a corresponding network with the same name.

An interface defines a virtual network interface of a VM. A network specifies the backend of an interface and declares which logical or physical device it is connected to.

There are multiple ways of configuring an interface as well as a network.

## Networks

Networks are configured in `spec.networks`. Each network should declare its type by defining one of the following fields:

| Type | Description |
| -------- | ------------------------------------------------------------------------------------------------- |
| `pod` | Default Kubernetes network |
| `multus` | Secondary network provided using [Multus CNI](https://github.com/k8snetworkplumbingwg/multus-cni) |

### `pod` Network

A `pod` network represents the default pod `eth0` interface, which is configured by the cluster's network solution and present in each pod.

```yaml
apiVersion: virt.virtink.smartx.com/v1alpha1
kind: VirtualMachine
spec:
  instance:
    interfaces:
      - name: pod
        bridge: {}
  networks:
    - name: pod
      pod: {}
```

### `multus` Network

It is also possible to connect VMs to secondary networks using [Multus CNI](https://github.com/k8snetworkplumbingwg/multus-cni). This assumes that Multus CNI is installed across your cluster and that a corresponding `NetworkAttachmentDefinition` has been created.

The following example defines a network which uses the [Open vSwitch CNI plugin](https://github.com/k8snetworkplumbingwg/ovs-cni), which will connect the VM to Open vSwitch's bridge `br1` on the host. Other CNI plugins such as [bridge](https://www.cni.dev/plugins/current/main/bridge/) or [macvlan](https://www.cni.dev/plugins/current/main/macvlan/) might be used as well. For their installation and usage refer to the respective project documentation.

First, create the `NetworkAttachmentDefinition`:

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: ovs-br1
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "ovs",
      "bridge": "br1"
    }
```

With the following definition, the VM will be connected to the secondary Open vSwitch network.

```yaml
apiVersion: virt.virtink.smartx.com/v1alpha1
kind: VirtualMachine
spec:
  instance:
    interfaces:
      - name: ovs
        bridge: {}
  networks:
    - name: ovs
      multus:
        networkName: ovs-br1
```

## VM Network Interfaces

VM network interfaces are configured in `spec.instance.interfaces`. They describe properties of virtual interfaces as "seen" inside guest instances. The same network may be connected to a VM in multiple different ways, each with their own connectivity guarantees and characteristics.

Each interface should declare its type by defining one of the following fields:

| Type | Description |
| -------- | ---------------------------------------- |
| `bridge` | Connect using a Linux bridge               |
| `sriov`  | Pass through an SR-IOV PCI device via VFIO |
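
Interface types can also be mixed on a single VM, as long as each interface has a same-named network. As a sketch (assuming a `NetworkAttachmentDefinition` named `ovs-br1`, as in the earlier example), a VM with both a bridged pod interface and a secondary Multus interface could look like:

```yaml
apiVersion: virt.virtink.smartx.com/v1alpha1
kind: VirtualMachine
spec:
  instance:
    interfaces:
      - name: pod
        bridge: {}
      - name: ovs
        bridge: {}
  networks:
    - name: pod
      pod: {}
    - name: ovs
      multus:
        networkName: ovs-br1
```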

### `bridge` Mode

In `bridge` mode, VMs are connected to the network through a Linux bridge. The pod network IPv4 address is delegated to the VM via DHCPv4. The VM should be configured to use DHCP to acquire IPv4 addresses.

```yaml
apiVersion: virt.virtink.smartx.com/v1alpha1
kind: VirtualMachine
spec:
  instance:
    interfaces:
      - name: pod
        bridge: {}
  networks:
    - name: pod
      pod: {}
```

At this time, `bridge` mode doesn't support additional configuration fields.

> **Note**: due to IPv4 address delegation, in `bridge` mode the pod itself doesn't have an IP address configured, which may break third-party solutions that rely on it. For example, Istio may not work in this mode.
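
Inside the guest, this typically just means enabling DHCP on the primary interface. A minimal sketch for a netplan-based guest image (the interface name `enp1s0` and the file path are assumptions and may differ on your image):

```yaml
# /etc/netplan/01-netcfg.yaml (hypothetical path)
network:
  version: 2
  ethernets:
    enp1s0:
      dhcp4: true
```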

### `sriov` Mode

In `sriov` mode, VMs are directly exposed to an SR-IOV PCI device, usually allocated by the [SR-IOV Network Device Plugin](https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin). The device is passed through into the guest operating system as a host device, using the [VFIO](https://www.kernel.org/doc/html/latest/driver-api/vfio.html) userspace interface, to maintain high networking performance.

#### How to Expose SR-IOV VFs to Virtink

First you should have [Multus CNI](https://github.com/k8snetworkplumbingwg/multus-cni), [SR-IOV CNI](https://github.com/k8snetworkplumbingwg/sriov-cni) and [SR-IOV Network Device Plugin](https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin) installed across the cluster. For their installation and usage refer to the respective project documentation.

Then you should create some VFs on the host's SR-IOV-capable device. Please consult your device vendor's documentation for how to do so.
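
For many NICs, VFs can also be created generically through sysfs. A minimal sketch, assuming a PF interface named `enp88s0f0` (a hypothetical name; replace it with your device, and pick a VF count your NIC supports):

```bash
# Create 4 VFs on the PF (hypothetical interface name and count)
echo 4 > /sys/class/net/enp88s0f0/device/sriov_numvfs
# List the PCI addresses of the resulting VFs
ls -l /sys/class/net/enp88s0f0/device | grep virtfn
```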

To expose SR-IOV VFs to Virtink, each VF's driver should be changed to `vfio-pci`. Below is an example of how to do so:

```bash
export VF_ADDR=0000:58:01.2 # change it to your VF's PCI address
modprobe vfio_pci # load the VFIO PCI driver
# Unbind the VF from its current driver
export DRIVER=$(lspci -s $VF_ADDR -k | grep driver | awk '{print $5}')
echo $VF_ADDR > /sys/bus/pci/drivers/$DRIVER/unbind
# Extract the VF's vendor and device IDs and register them with vfio-pci
export VENDOR_ID=$(lspci -s $VF_ADDR -Dn | awk '{split($3,a,":"); print a[1]}')
export DEVICE_ID=$(lspci -s $VF_ADDR -Dn | awk '{split($3,a,":"); print a[2]}')
echo $VENDOR_ID $DEVICE_ID > /sys/bus/pci/drivers/vfio-pci/new_id
```
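
To illustrate the ID extraction above, here is how the `awk` split behaves on a sample `lspci -Dn` style line (the Mellanox IDs `15b3:101e` are only an example; your device's IDs will differ):

```shell
# A sample line of `lspci -s $VF_ADDR -Dn` output: address, class, vendor:device
LINE="0000:58:01.2 0200: 15b3:101e"
# $3 is "15b3:101e"; split on ":" yields the vendor and device IDs
VENDOR_ID=$(echo "$LINE" | awk '{split($3,a,":"); print a[1]}')
DEVICE_ID=$(echo "$LINE" | awk '{split($3,a,":"); print a[2]}')
echo "$VENDOR_ID $DEVICE_ID" # prints "15b3 101e"
```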

Now create a config for the SR-IOV device plugin to capture this VF:

```bash
echo $VENDOR_ID # make sure it's the vendor ID of your VF
echo $DEVICE_ID # make sure it's the device ID of your VF
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: sriovdp-config
  namespace: kube-system
data:
  config.json: |
    {
      "resourceList": [{
        "resourceName": "mellanox_SRIOV_25G",
        "selectors": {
          "vendors": ["$VENDOR_ID"],
          "devices": ["$DEVICE_ID"],
          "drivers": ["vfio-pci"]
        }
      }]
    }
EOF
```

This will expose the VF as a node resource named `intel.com/mellanox_SRIOV_25G`. You can check whether the VF was successfully exposed with the following command:

```bash
kubectl get nodes -o=jsonpath-as-json="{.items[*]['status.capacity']}" | grep mellanox_SRIOV_25G
```

Finally, create a `NetworkAttachmentDefinition` for the VF:

```bash
cat <<EOF | kubectl apply -f -
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: mellanox-sriov-25g
  annotations:
    k8s.v1.cni.cncf.io/resourceName: intel.com/mellanox_SRIOV_25G
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "sriov"
    }
EOF
```

This makes the VF available as a `multus` network for Virtink to use.

#### Start an SR-IOV VM

To create a VM that will attach to the aforementioned network, refer to the following VM spec:

```yaml
apiVersion: virt.virtink.smartx.com/v1alpha1
kind: VirtualMachine
spec:
  instance:
    interfaces:
      - name: sriov
        sriov: {}
  networks:
    - name: sriov
      multus:
        networkName: mellanox-sriov-25g
```