Interfaces and Networks¶
Connecting a virtual machine to a network consists of two parts. First, networks are specified in spec.networks. Then, interfaces backed by the networks are added to the VM by specifying them in spec.domain.devices.interfaces.
Each interface must have a corresponding network with the same name.
An interface defines a virtual network interface of a virtual machine (also called a frontend). A network specifies the backend of an interface and declares which logical or physical device it is connected to (also called a backend).
There are multiple ways of configuring an interface as well as a network.
All possible configuration options are available in the Interface API Reference and Network API Reference.
Backend¶
Network backends are configured in spec.networks. A network must have a unique name. Additional fields declare which logical or physical device the network relates to.
Each network should declare its type by defining one of the following fields:
Type | Description |
---|---|
pod | Default Kubernetes network |
multus | Secondary network provided using Multus |
pod¶
A pod network represents the default pod eth0 interface, configured by the cluster network solution, that is present in each pod.
kind: VM
spec:
  domain:
    devices:
      interfaces:
        - name: default
          masquerade: {}
  networks:
    - name: default
      pod: {} # Stock pod network
multus¶
It is also possible to connect VMIs to secondary networks using Multus. This assumes that multus is installed across your cluster and a corresponding NetworkAttachmentDefinition CRD was created.
The following example defines a network which uses the bridge CNI plugin, which will connect the VMI to Linux bridge br1. Other CNI plugins such as ptp, ovs-cni, or Flannel might be used as well. For their installation and usage refer to the respective project documentation.
First the NetworkAttachmentDefinition needs to be created. That is usually done by an administrator. Users can then reference the definition.
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: bridge-test
spec:
  config: '{
      "cniVersion": "0.3.1",
      "name": "bridge-test",
      "type": "bridge",
      "bridge": "br1",
      "disableContainerInterface": true
    }'
With the following definition, the VMI will be connected to the default pod network and to the secondary Open vSwitch network.
kind: VM
spec:
  domain:
    devices:
      interfaces:
        - name: default
          masquerade: {}
          bootOrder: 1   # attempt to boot from an external tftp server
          dhcpOptions:
            bootFileName: default_image.bin
            tftpServerName: tftp.example.com
        - name: ovs-net
          bridge: {}
          bootOrder: 2   # if the first attempt failed, try to PXE-boot from this L2 network
  networks:
    - name: default
      pod: {} # Stock pod network
    - name: ovs-net
      multus: # Secondary multus network
        networkName: ovs-vlan-100
It is also possible to define a Multus network as the default pod network. A version of Multus after this Pull Request is required (currently master).
Note the following:
- A multus default network and a pod network type are mutually exclusive.
- The virt-launcher pod that starts the VMI will not have the pod network configured.
- The multus delegate chosen as default must return at least one IP address.
Create a NetworkAttachmentDefinition with IPAM.
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: bridge-test
spec:
  config: '{
      "cniVersion": "0.3.1",
      "name": "bridge-test",
      "type": "bridge",
      "bridge": "br1",
      "ipam": {
        "type": "host-local",
        "subnet": "10.250.250.0/24"
      }
    }'
Define a VMI with a Multus network as the default.
kind: VM
spec:
  domain:
    devices:
      interfaces:
        - name: test1
          bridge: {}
  networks:
    - name: test1
      multus: # Multus network as default
        default: true
        networkName: bridge-test
Invalid CNIs for secondary networks¶
The following list of CNIs is known not to work for bridge interfaces - which are most common for secondary interfaces.
The reason is similar: the bridge interface type moves the pod interface MAC address to the VM, leaving the pod interface with a different address. The aforementioned CNIs require the pod interface to have the original MAC address.
These issues are tracked individually:
Feel free to discuss and/or propose fixes for them; we'd like to have these plugins as valid options in our ecosystem.
Frontend¶
Network interfaces are configured in spec.domain.devices.interfaces. They describe properties of virtual interfaces as "seen" inside guest instances. The same network backend may be connected to a virtual machine in multiple different ways, each with its own connectivity guarantees and characteristics.
Each interface should declare its type by defining one of the following fields:
Type | Description |
---|---|
bridge | Connect using a Linux bridge |
slirp | Connect using QEMU user networking mode |
sriov | Pass through an SR-IOV PCI device via vfio |
masquerade | Connect using iptables rules to NAT the traffic |
Each interface may also have additional configuration fields that modify properties "seen" inside guest instances, as listed below:
Name | Format | Default value | Description |
---|---|---|---|
model | One of: | | NIC type |
macAddress | | | MAC address as seen inside the guest system, for example: |
ports | | empty | List of ports to be forwarded to the virtual machine. |
pciAddress | | | Set network interface PCI address, for example: |
kind: VM
spec:
  domain:
    devices:
      interfaces:
        - name: default
          model: e1000 # expose e1000 NIC to the guest
          masquerade: {} # connect through a masquerade
          ports:
            - name: http
              port: 80
  networks:
    - name: default
      pod: {}
Note: For secondary interfaces, when a MAC address is specified for a virtual machine interface, it is passed to the underlying CNI plugin which is, in turn, expected to configure the backend to allow for this particular MAC. Not every plugin has native support for custom MAC addresses.
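As an illustration of the note above, a secondary interface with a custom MAC address could be declared roughly as in the following sketch. The network name ovs-net, the NetworkAttachmentDefinition name ovs-vlan-100, and the MAC value itself are placeholders; whether the address is honored depends on the CNI plugin behind the network.

```yaml
kind: VM
spec:
  domain:
    devices:
      interfaces:
        - name: ovs-net
          bridge: {}
          # Placeholder MAC address; the CNI plugin is assumed to accept it.
          macAddress: "de:ad:00:00:be:af"
  networks:
    - name: ovs-net
      multus:
        networkName: ovs-vlan-100 # placeholder NetworkAttachmentDefinition name
```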
Note: For some CNI plugins without native support for custom MAC addresses, there is a workaround, which is to use the tuning CNI plugin to adjust the pod interface MAC address. This can be used as follows:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ptp-mac
spec:
  config: '{
      "cniVersion": "0.3.1",
      "name": "ptp-mac",
      "plugins": [
        {
          "type": "ptp",
          "ipam": {
            "type": "host-local",
            "subnet": "10.1.1.0/24"
          }
        },
        {
          "type": "tuning"
        }
      ]
    }'
This approach may not work for all plugins. For example, OKD SDN is not compatible with the tuning plugin.
- Plugins that handle custom MAC addresses natively: ovs, bridge.
- Plugins that are compatible with the tuning plugin: flannel, ptp.
- Plugins that don't need special MAC address treatment: sriov (in vfio mode).
Ports¶
Declare the ports that the virtual machine listens on.
Note: When using the slirp interface, only the configured ports will be forwarded to the virtual machine.
Name | Format | Required | Description |
---|---|---|---|
name | | no | Name |
port | 1 - 65535 | yes | Port to expose |
protocol | TCP,UDP | no | Connection protocol |
Tip: Use the e1000 model if your guest image doesn't ship with virtio drivers.
If spec.domain.devices.interfaces is omitted, the virtual machine is connected using the default pod network interface of bridge type. If you'd like to have a virtual machine instance without any network connectivity, you can use the autoattachPodInterface field as follows:
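A minimal sketch of such a definition, assuming the field is set under spec.domain.devices and that no interfaces or networks are declared elsewhere in the spec:

```yaml
kind: VM
spec:
  domain:
    devices:
      # Do not attach the default pod network interface to the guest.
      autoattachPodInterface: false
```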
MTU¶
There are two methods for the MTU to be propagated to the guest interface.
- Libvirt - for this the guest machine needs a new enough virtio network driver that understands the data passed into the guest via a PCI config register in the emulated device.
- DHCP - for this the guest DHCP client should be able to read the MTU from the DHCP server response.
On Windows guests with non-virtio interfaces, the MTU has to be set manually using netsh or another tool, since the Windows DHCP client doesn't request/read the MTU.
The table below summarizes the MTU propagation to the guest.
NIC model | masquerade | bridge with CNI IP | bridge with no CNI IP | Windows |
---|---|---|---|---|
virtio | DHCP & libvirt | DHCP & libvirt | libvirt | libvirt |
non-virtio | DHCP | DHCP | X | X |
- bridge with CNI IP - means the CNI gives an IP to the pod interface and bridge binding is used to bind the pod interface to the guest.
bridge¶
In bridge mode, virtual machines are connected to the network backend through a Linux "bridge". The pod network IPv4 address (if it exists) is delegated to the virtual machine via DHCPv4. The virtual machine should be configured to use DHCP to acquire IPv4 addresses.
Note: If a specific MAC address is not configured in the virtual machine interface spec, the MAC address from the relevant pod interface is delegated to the virtual machine.
kind: VM
spec:
  domain:
    devices:
      interfaces:
        - name: red
          bridge: {} # connect through a bridge
  networks:
    - name: red
      multus:
        networkName: red
At this time, bridge mode doesn't support additional configuration fields.
Note: due to IPv4 address delegation, in bridge mode the pod doesn't have an IP address configured, which may introduce issues with third-party solutions that may rely on it. For example, Istio may not work in this mode.
Note: an admin can forbid using the bridge interface type for pod networks via a designated configuration flag. To achieve this, the admin should set the following option to false:
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  configuration:
    network:
      permitBridgeInterfaceOnPodNetwork: false
Note: binding the pod network using the bridge interface type may cause issues. Other than the third-party issue mentioned in the above note, live migration is not allowed with a pod network binding of bridge interface type, and some CNI plugins might not allow the use of a custom MAC address for your VM instances. If you think you may be affected by any of the issues mentioned above, consider changing the default interface type to masquerade, and disabling the bridge type for the pod network, as shown in the example above.
slirp¶
In slirp mode, virtual machines are connected to the network backend using QEMU user networking mode. In this mode, QEMU allocates internal IP addresses to virtual machines and hides them behind NAT.
kind: VM
spec:
  domain:
    devices:
      interfaces:
        - name: red
          slirp: {} # connect using SLIRP mode
  networks:
    - name: red
      pod: {}
At this time, slirp mode doesn't support additional configuration fields.
Note: in slirp mode, the only supported protocols are TCP and UDP. ICMP is not supported.
More information about SLIRP mode can be found in QEMU Wiki.
Note: Since v1.1.0, KubeVirt delegates Slirp network configuration to the Slirp network binding plugin by default. In case the binding plugin is not registered, KubeVirt will use the following default image: quay.io/kubevirt/network-slirp-binding:20230830_638c60fc8.
Note: In the next release (v1.2.0) no default image will be set by KubeVirt; registering an image will be mandatory.
Note: On disconnected clusters it will be necessary to mirror the Slirp binding plugin image to the cluster registry.
masquerade¶
In masquerade mode, KubeVirt allocates internal IP addresses to virtual machines and hides them behind NAT. All the traffic exiting virtual machines is "source NAT'ed" using pod IP addresses; thus, cluster workloads should use the pod's IP address to contact the VM over this interface. This IP address is reported in the VMI's status.interfaces. A guest operating system should be configured to use DHCP to acquire IPv4 addresses.
To allow the VM to live-migrate or hard restart (both cause the VM to run on a different pod, with a different IP address) and still be reachable, it should be exposed by a Kubernetes service.
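A Service for this purpose might look like the sketch below. The Service name and the app: red-vm label are hypothetical; the example assumes that the same label is set on the VMI (or on the VM template metadata) so that it is present on the virt-launcher pod the Service selects.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: red-vm-http # hypothetical Service name
spec:
  selector:
    app: red-vm # hypothetical label, assumed to be set on the VMI
  ports:
    - protocol: TCP
      port: 80       # port exposed by the Service
      targetPort: 80 # port the guest listens on
```

Because the Service selects by label rather than by pod IP, it keeps pointing at the VM after a migration or restart moves it to a new pod.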
To allow traffic on specific ports into virtual machines, the ports section of the interface should be configured as follows. If the ports section is missing, all ports are forwarded into the VM.
kind: VM
spec:
  domain:
    devices:
      interfaces:
        - name: red
          masquerade: {} # connect using masquerade mode
          ports:
            - port: 80 # allow incoming traffic on port 80 to get into the virtual machine
  networks:
    - name: red
      pod: {}
Note: Masquerade is only allowed to connect to the pod network.
Note: The network CIDR can be configured in the pod network section using the vmNetworkCIDR attribute.
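For instance, the internal range could be overridden as in this sketch. The CIDR value is an arbitrary example, not a recommended range:

```yaml
kind: VM
spec:
  domain:
    devices:
      interfaces:
        - name: red
          masquerade: {}
  networks:
    - name: red
      pod:
        vmNetworkCIDR: 10.11.12.0/24 # example value, chosen arbitrarily
```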
masquerade - IPv4 and IPv6 dual-stack support¶
masquerade mode can be used in IPv4 and IPv6 dual-stack clusters to provide a VM with IP connectivity over both protocols.
As with the IPv4 masquerade mode, the VM can be contacted using the pod's IP address, which in this case will be two IP addresses, one IPv4 and one IPv6. Outgoing traffic is also "NAT'ed" to the pod's respective IP address from the given family.
Unlike in IPv4, the configuration of the IPv6 address and the default route is not automatic; it should be configured via cloud-init, as shown below:
kind: VM
spec:
  domain:
    devices:
      disks:
        - disk:
            bus: virtio
          name: cloudinitdisk
      interfaces:
        - name: red
          masquerade: {} # connect using masquerade mode
          ports:
            - port: 80 # allow incoming traffic on port 80 to get into the virtual machine
  networks:
    - name: red
      pod: {}
  volumes:
    - cloudInitNoCloud:
        networkData: |
          version: 2
          ethernets:
            eth0:
              dhcp4: true
              addresses: [ fd10:0:2::2/120 ]
              gateway6: fd10:0:2::1
        userData: |-
          #!/bin/bash
          echo "fedora" |passwd fedora --stdin
      name: cloudinitdisk # must match the disk name above
Note: The IPv6 address for the VM and default gateway must be the ones shown above.
masquerade - IPv6 single-stack support¶
masquerade mode can be used in IPv6 single-stack clusters to provide a VM with IPv6-only connectivity.
As with the IPv4 masquerade mode, the VM can be contacted using the pod's IP address, which in this case will be the IPv6 one. Outgoing traffic is also "NAT'ed" to the pod's respective IPv6 address.
As with the dual-stack cluster, the configuration of the IPv6 address and the default route is not automatic; it should be configured via cloud-init, as shown in the dual-stack section.
Unlike the dual-stack cluster, which has a DHCP server for IPv4, the IPv6 single-stack cluster has no DHCP server at all. Therefore, the VM won't have the search domain information, and reaching a destination using its FQDN is not possible. Tracking issue - https://github.com/kubevirt/kubevirt/issues/7184
passt¶
Warning: The core binding is being deprecated and targeted for removal in v1.3. As an alternative, the same functionality is introduced and available as a binding plugin.
passt is a new approach for user-mode networking which can be used as a simple replacement for Slirp (which is practically dead).
passt is a universal tool which implements a translation layer between a Layer-2 network interface and native Layer-4 sockets (TCP, UDP, ICMP/ICMPv6 echo) on a host.
Its main benefits are:
- doesn't require extra network capabilities such as CAP_NET_RAW and CAP_NET_ADMIN.
- allows integration with service meshes (which expect applications to run locally) out of the box.
- supports IPv6 out of the box (in contrast to the existing bindings which require configuring IPv6 manually).
Feature | Masquerade | Bridge | Passt |
---|---|---|---|
Supports migration | Yes | No | No (will be supported in the future) |
VM uses Pod IP | No | Yes | Yes (in the future it will be possible to configure the VM IP. Currently the default is the pod IP) |
Service Mesh out of the box | No (only Istio is supported; adjustments on both Istio and KubeVirt had to be done to make it work) | No | Yes |
Doesn't require extra capabilities on the virt-launcher pod | Yes (multiple workarounds had to be added to KubeVirt to make it work) | No (multiple workarounds had to be added to KubeVirt to make it work) | Yes |
Doesn't require extra network devices on the virt-launcher pod | No (bridge and tap device are created) | No (bridge and tap device are created) | Yes |
Supports IPv6 | Yes (requires manual configuration on the VM) | No | Yes |
kind: VM
spec:
  domain:
    devices:
      interfaces:
        - name: red
          passt: {} # connect using passt mode
          ports:
            - port: 8080 # allow incoming traffic on port 8080 to get into the virtual machine
  networks:
    - name: red
      pod: {}
Requirements/Recommendations:¶
- To get better performance the node should be configured with:
- To run multiple passt VMs with no explicit ports, the node's fs.file-max should be increased (for a VM that forwards all IPv4 and IPv6 ports, for TCP and UDP, passt needs to create ~2^18 sockets):
NOTE: To achieve optimal memory consumption with Passt binding, specify ports required for your workload. When no ports are explicitly specified, all ports are forwarded, leading to memory overhead of up to 800 Mi.
Temporary restrictions:¶
passt is currently only supported as the primary network and doesn't allow extra multus networks to be configured on the VM.
passt interfaces are feature gated; to enable the feature, follow these instructions, in order to activate the Passt feature gate (case sensitive).
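As a rough sketch, activating the gate on the KubeVirt custom resource usually looks like the following. The resource name and namespace (kubevirt/kubevirt) are assumptions that match a default installation; the linked instructions remain the authoritative procedure.

```yaml
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt      # assumed default resource name
  namespace: kubevirt # assumed default namespace
spec:
  configuration:
    developerConfiguration:
      featureGates:
        - Passt # case sensitive
```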
More information about passt mode can be found in passt Wiki.
virtio-net multiqueue¶
Setting networkInterfaceMultiqueue to true will enable the multi-queue functionality, increasing the number of vhost queues, for interfaces configured with a virtio model.
Users of a Virtual Machine with multiple vCPUs may benefit from increased network throughput and performance.
Currently, the number of queues is being determined by the number of vCPUs of a VM. This is because multi-queue support optimizes RX interrupt affinity and TX queue selection in order to make a specific queue private to a specific vCPU.
Without enabling the feature, network performance does not scale as the number of vCPUs increases. Guests cannot transmit or retrieve packets in parallel, as virtio-net has only one TX and RX queue.
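A minimal sketch of a spec with the feature turned on is shown below. The vCPU count and the interface and network names are illustrative only; the field is assumed to sit under spec.domain.devices.

```yaml
kind: VM
spec:
  domain:
    cpu:
      cores: 4 # illustrative; the queue count follows the number of vCPUs
    devices:
      networkInterfaceMultiqueue: true
      interfaces:
        - name: default
          model: virtio # multiqueue applies to virtio interfaces
          masquerade: {}
  networks:
    - name: default
      pod: {}
```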
Virtio interfaces advertise a field named queueCount on their status.interfaces entry. The queueCount field indicates how many queues were assigned to the interface. The queue count value is derived from the domain XML. In case the number of queues can't be determined (e.g., for an interface that is reported by the guest agent only), the field is omitted.
NOTE: Although the virtio-net multiqueue feature provides a performance benefit, it has some limitations and therefore should not be unconditionally enabled.
Some known limitations¶
- Guest OS is limited to ~200 MSI vectors. Each NIC queue requires an MSI vector, as does any virtio device or assigned PCI device. Defining an instance with multiple virtio NICs and vCPUs might lead to hitting the guest MSI limit.
- virtio-net multiqueue works well for incoming traffic, but can occasionally cause a performance degradation for outgoing traffic. Specifically, this may occur when sending packets under 1,500 bytes over a Transmission Control Protocol (TCP) stream.
- Enabling virtio-net multiqueue increases the total network throughput, but it also increases the CPU consumption.
- Enabling virtio-net multiqueue in the host QEMU config does not enable the functionality in the guest OS. The guest OS administrator needs to manually turn it on for each guest NIC that requires this feature, using ethtool.
- MSI vectors are still consumed (wasted) if multiqueue is enabled in the host but has not been enabled in the guest OS by the administrator.
- In case the number of vNICs in a guest instance is proportional to the number of vCPUs, enabling the multiqueue feature is less important.
- Each virtio-net queue consumes 64 KiB of kernel memory for the vhost driver.
NOTE: Virtio-net multiqueue should be enabled in the guest OS manually, using ethtool. For example:
ethtool -L <NIC> combined #num_of_queues
For more information, please refer to KVM/QEMU MultiQueue.
sriov¶
In sriov mode, virtual machines are directly exposed to an SR-IOV PCI device, usually allocated by the Intel SR-IOV device plugin. The device is passed through into the guest operating system as a host device, using the vfio userspace interface, to maintain high networking performance.
How to expose SR-IOV VFs to KubeVirt¶
To simplify the procedure, please use the OpenShift SR-IOV operator to deploy and configure SR-IOV components in your cluster. On how to use the operator, please refer to its documentation.
Note: KubeVirt relies on the VFIO userspace driver to pass PCI devices into the VMI guest. Because of that, when configuring SR-IOV operator policies, make sure you define a pool of VF resources that uses driver: vfio.
Once the operator is deployed, an SriovNetworkNodePolicy must be provisioned, in which the list of SR-IOV devices to expose (with respective configurations) is defined.
Please refer to the following SriovNetworkNodePolicy for an example:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-1
  namespace: sriov-network-operator
spec:
  deviceType: vfio-pci
  mtu: 9000
  nicSelector:
    pfNames:
      - ens1f0
  nodeSelector:
    sriov: "true"
  numVfs: 8
  priority: 90
  resourceName: sriov-nic
The policy above will configure the SR-IOV device plugin, allowing the PF named ens1f0 to be exposed in the SR-IOV capable nodes as a resource named sriov-nic.
Start an SR-IOV VM¶
Once all the SR-IOV components are deployed, you need to define how the SR-IOV network is configured. Refer to the following SriovNetwork for an example:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: sriov-net
  namespace: sriov-network-operator
spec:
  ipam: |
    {}
  networkNamespace: default
  resourceName: sriov-nic
  spoofChk: "off"
Finally, to create a VM that will attach to the aforementioned Network, refer to the following VMI spec:
---
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  labels:
    special: vmi-perf
  name: vmi-perf
spec:
  domain:
    cpu:
      sockets: 2
      cores: 1
      threads: 1
      dedicatedCpuPlacement: true
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "4Gi"
    devices:
      disks:
        - disk:
            bus: virtio
          name: containerdisk
        - disk:
            bus: virtio
          name: cloudinitdisk
      interfaces:
        - masquerade: {}
          name: default
        - name: sriov-net
          sriov: {}
      rng: {}
    machine:
      type: ""
  networks:
    - name: default
      pod: {}
    - multus:
        networkName: default/sriov-net
      name: sriov-net
  terminationGracePeriodSeconds: 0
  volumes:
    - containerDisk:
        image: docker.io/kubevirt/fedora-cloud-container-disk-demo:latest
      name: containerdisk
    - cloudInitNoCloud:
        userData: |
          #!/bin/bash
          echo "centos" |passwd centos --stdin
          dhclient eth1
      name: cloudinitdisk
Note: for some NICs (e.g. Mellanox), the kernel module needs to be installed in the guest VM.
Note: Placement on dedicated CPUs can only be achieved if the Kubernetes CPU manager is running on the SR-IOV capable workers. For further details please refer to the dedicated cpu resources documentation.
Macvtap¶
Note: The core binding will be deprecated soon. As an alternative, the same functionality is introduced and available as a binding plugin.
In macvtap mode, virtual machines are directly exposed to the Kubernetes nodes' L2 network. This is achieved by 'extending' an existing network interface with a virtual device that has its own MAC address.
Macvtap interfaces are feature gated; to enable the feature, follow these instructions, in order to activate the Macvtap feature gate (case sensitive).
Note: On KinD clusters, the user needs to adjust the cluster configuration, mounting dev of the running host onto the KinD nodes, because of a known issue.
Limitations¶
- Live migration is not seamless, see issue #5912
How to expose host interface to the macvtap device plugin¶
To simplify the procedure, please use the Cluster Network Addons Operator to deploy and configure the macvtap components in your cluster.
The aforementioned operator effectively deploys the macvtap-cni cni / device plugin combo.
There are two different alternatives to configure which host interfaces get exposed to the user, enabling them to create macvtap interfaces on top of:
- select the host interfaces: indicates which host interfaces are exposed.
- expose all interfaces: all interfaces of all hosts are exposed.
Both options are configured via the macvtap-deviceplugin-config ConfigMap, and more information on how to configure it can be found in the macvtap-cni repo.
Below is a minimal example in which the eth0 interface of the Kubernetes nodes is exposed via the lowerDevice attribute.
kind: ConfigMap
apiVersion: v1
metadata:
  name: macvtap-deviceplugin-config
data:
  DP_MACVTAP_CONF: |
    [
      {
        "name" : "dataplane",
        "lowerDevice": "eth0",
        "mode" : "bridge",
        "capacity" : 50
      }
    ]
This step can be omitted, since the default configuration of the aforementioned ConfigMap is to expose all host interfaces (which is represented by the following configuration):
kind: ConfigMap
apiVersion: v1
metadata:
  name: macvtap-deviceplugin-config
data:
  DP_MACVTAP_CONF: '[]'
Start a VM with macvtap interfaces¶
Once the macvtap components are deployed, you need to define how the macvtap network is configured. Refer to the following NetworkAttachmentDefinition for a simple example:
---
kind: NetworkAttachmentDefinition
apiVersion: k8s.cni.cncf.io/v1
metadata:
  name: macvtapnetwork
  annotations:
    k8s.v1.cni.cncf.io/resourceName: macvtap.network.kubevirt.io/eth0
spec:
  config: '{
      "cniVersion": "0.3.1",
      "name": "macvtapnetwork",
      "type": "macvtap",
      "mtu": 1500
    }'
The k8s.v1.cni.cncf.io/resourceName annotation must point to an exposed host interface (via the lowerDevice attribute, on the macvtap-deviceplugin-config ConfigMap).
Finally, to create a VM that will attach to the aforementioned Network, refer to the following VMI spec:
---
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  labels:
    special: vmi-host-network
  name: vmi-host-network
spec:
  domain:
    devices:
      disks:
        - disk:
            bus: virtio
          name: containerdisk
        - disk:
            bus: virtio
          name: cloudinitdisk
      interfaces:
        - macvtap: {}
          name: hostnetwork
      rng: {}
    machine:
      type: ""
    resources:
      requests:
        memory: 1024M
  networks:
    - multus:
        networkName: macvtapnetwork
      name: hostnetwork
  terminationGracePeriodSeconds: 0
  volumes:
    - containerDisk:
        image: docker.io/kubevirt/fedora-cloud-container-disk-demo:devel
      name: containerdisk
    - cloudInitNoCloud:
        userData: |-
          #!/bin/bash
          echo "fedora" |passwd fedora --stdin
      name: cloudinitdisk
The multus networkName (here, macvtapnetwork) must match the name of the provisioned NetworkAttachmentDefinition.
Note: VMIs with macvtap interfaces can be migrated, but their MAC addresses must be statically set.
Security¶
MAC spoof check¶
MAC spoofing refers to the ability to generate traffic with an arbitrary source MAC address. An attacker may use this option to generate attacks on the network.
In order to protect against such scenarios, it is possible to enable the mac-spoof-check support in CNI plugins that support it.
The pod primary network which is served by the cluster network provider is not covered by this documentation. Please refer to the relevant provider to check how to enable spoofing check. The following text refers to the secondary networks, served using multus.
There are two known CNI plugins that support mac-spoof-check:
- sriov-cni: Through the spoofchk parameter.
- bridge-cni: Through the macspoofchk parameter.
The configuration is done on the NetworkAttachmentDefinition by the operator, and any interface that refers to it will have this feature enabled.
Below is an example of using the bridge CNI with macspoofchk enabled:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: br-spoof-check
spec:
  config: '{
      "cniVersion": "0.3.1",
      "name": "br-spoof-check",
      "type": "bridge",
      "bridge": "br10",
      "disableContainerInterface": true,
      "macspoofchk": true
    }'
On the VMI, the network section should point to this NetworkAttachmentDefinition by name:
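A sketch of such a spec follows, assuming the NetworkAttachmentDefinition above lives in the same namespace as the VMI. The interface and network name br10 is a hypothetical choice; only networkName has to match the definition's name.

```yaml
kind: VM
spec:
  domain:
    devices:
      interfaces:
        - name: default
          masquerade: {}
        - name: br10 # hypothetical interface name
          bridge: {}
  networks:
    - name: default
      pod: {}
    - name: br10
      multus:
        networkName: br-spoof-check # name of the NetworkAttachmentDefinition
```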
Limitations¶
- The bridge CNI supports mac-spoof-check through nftables, therefore the node must support nftables and have the nft binary deployed.