
Running and Verifying XPF

  1. Set the huge page memory and edit the startup items.
    vim /etc/grub2-efi.cfg
    

    Append default_hugepagesz=512M hugepagesz=512M hugepages=64 to the kernel boot arguments in the file.

  2. Enable IOMMU and CPU isolation.
    1. Open the /etc/grub2-efi.cfg file.
      vim /etc/grub2-efi.cfg
      
    2. Append isolcpus=0-5 iommu.passthrough=1 to the kernel boot arguments in the file.

  3. Restart the host for the configuration to take effect.

    To enable IOMMU, you also need to configure the BIOS in addition to configuring the startup items. For details, see BIOS Settings.
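After the restart, the boot parameters and the reserved huge pages can be verified from the running kernel; a minimal check (the values shown in the comments match the configuration above):

```shell
# The running kernel command line should contain default_hugepagesz=512M,
# isolcpus=0-5 and iommu.passthrough=1 after the restart.
cat /proc/cmdline

# The huge page pool should show HugePages_Total as 64 and
# Hugepagesize as 524288 kB on a correctly configured host.
grep -i huge /proc/meminfo
```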

  4. Start OVS.
    1. Create the OVS working directory.
      mkdir -p /var/run/openvswitch
      mkdir -p /var/log/openvswitch
      
    2. Create the OVS database file.
      ovsdb-tool create /etc/openvswitch/conf.db
      
    3. Start the ovsdb-server program.
      ovsdb-server --remote=punix:/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach --log-file
      
    4. Set OVS startup parameters.
      ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true other_config:dpdk-socket-mem="4096" other_config:dpdk-lcore-mask="0x1F" other_config:pmd-cpu-mask="0x1E"
      
    5. Start OVS.
      ovs-vswitchd --pidfile --detach --log-file
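Once ovs-vswitchd is running, DPDK initialization can be confirmed; a minimal sketch, assuming an OVS version that exposes the dpdk_initialized column in the Open_vSwitch table:

```shell
if command -v ovs-vsctl >/dev/null 2>&1; then
    # "true" means the dpdk-init step above completed successfully.
    ovs-vsctl get Open_vSwitch . dpdk_initialized
    # The log file is the place to look for DPDK EAL errors
    # (default path when --log-file is given without an argument).
    tail -n 20 /var/log/openvswitch/ovs-vswitchd.log 2>/dev/null || true
else
    echo "ovs-vsctl not found"
fi
```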
      

  5. Bind the NIC to the user-mode DPDK.
    1. Load the igb_uio driver.
      modprobe igb_uio
      

      Before loading the driver for the first time, run the depmod command so that the system can resolve driver dependencies. The driver is provided by DPDK and is installed in /lib/modules/4.14.0-115.el7a.0.1.aarch64/extra/dpdk/igb_uio.ko by default. The driver must be loaded again after each system restart.
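Whether the module actually loaded can be confirmed from the loaded-module list; a minimal sketch:

```shell
# igb_uio should appear in the loaded-module list after modprobe.
if lsmod | grep -q '^igb_uio'; then
    echo "igb_uio loaded"
else
    echo "igb_uio not loaded - run depmod and retry modprobe igb_uio"
fi
```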

    2. View network port information.
      dpdk-devbind -s
      

      Find the PCI address of the network port to be bound.

      The network port to be bound must be in the down state. Otherwise, the binding fails.

    3. Bind the NIC to the user-mode DPDK.
      dpdk-devbind --bind=igb_uio 0000:05:00.0
      dpdk-devbind --bind=igb_uio 0000:06:00.0
      

      To roll back the operation, run the following command:

      dpdk-devbind -u 0000:05:00.0
      
    4. Check whether the binding is successful.
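One way to check is to rerun the status command and confirm that both PCI addresses now appear under the DPDK-compatible-driver section; a sketch:

```shell
if command -v dpdk-devbind >/dev/null 2>&1; then
    # The two bound ports (0000:05:00.0 and 0000:06:00.0) should be listed
    # under "Network devices using DPDK-compatible driver".
    dpdk-devbind -s | grep -B1 -A4 'DPDK-compatible driver' || true
else
    echo "dpdk-devbind not found"
fi
```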

  6. Create a network.

    The network to be verified is a typical OVS network, as shown in Figure 1.

    Figure 1 OVS networking

    The following lists the commands run on Host 1. The commands run on Host 2 are similar, except that the IP address of br-dpdk on Host 2 is different.

    1. Add and configure the br-dpdk bridge.
      ovs-vsctl add-br br-dpdk -- set bridge br-dpdk datapath_type=netdev
      ovs-vsctl add-bond br-dpdk dpdk-bond p0 p1 -- set Interface p0 type=dpdk options:dpdk-devargs=0000:05:00.0 -- set Interface p1 type=dpdk options:dpdk-devargs=0000:06:00.0
      ovs-vsctl set port dpdk-bond bond_mode=balance-tcp
      ovs-vsctl set port dpdk-bond lacp=active
      ifconfig br-dpdk 192.168.2.1/24 up
      

      The ifconfig br-dpdk 192.168.2.1/24 up command assigns the IP address used as the local virtual extensible LAN (VXLAN) tunnel endpoint (192.168.2.1 is the IP address of the br-dpdk bridge). Run the ifconfig br-dpdk 192.168.2.2/24 up command on Host 2. The network segment of the tunnel must be different from that of the VM.
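The state of the dpdk-bond created above (bond mode, LACP negotiation, member link status) can be inspected through ovs-appctl; a minimal sketch:

```shell
if command -v ovs-appctl >/dev/null 2>&1; then
    # Shows bond mode (balance-tcp) and member link state for the bond
    # configured above, and the LACP negotiation status.
    ovs-appctl bond/show dpdk-bond
    ovs-appctl lacp/show dpdk-bond
else
    echo "ovs-appctl not found"
fi
```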

    2. Add and configure the br-int bridge.
      ovs-vsctl add-br br-int -- set bridge br-int datapath_type=netdev
      ovs-vsctl add-port br-int vxlan0 -- set Interface vxlan0 type=vxlan options:local_ip=192.168.2.1 options:remote_ip=192.168.2.2
      

      In this networking, br-int has a VXLAN port, and the VXLAN header is added to all outgoing traffic of the host. local_ip of the VXLAN port is set to the IP address of the local br-dpdk, and remote_ip is set to the IP address of the peer br-dpdk.

    3. Add and configure the br-ply1 bridge.
      ovs-vsctl add-br br-ply1 -- set bridge br-ply1 datapath_type=netdev
      ovs-vsctl add-port br-ply1 tap1 -- set Interface tap1 type=dpdkvhostuserclient options:vhost-server-path=/var/run/openvswitch/tap1
      ovs-vsctl add-port br-ply1 p-tap1-int -- set Interface p-tap1-int type=patch options:peer=p-tap1
      ovs-vsctl add-port br-int p-tap1 -- set Interface p-tap1 type=patch options:peer=p-tap1-int
      

      In this networking, a br-ply bridge is added each time a VM is added. Each bridge has a dpdkvhostuserclient port for its VM and is connected to the br-int bridge through a pair of patch ports.
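For example, a second VM would get its own bridge following the same pattern (the names br-ply2, tap2, and p-tap2 are illustrative, not part of the original configuration):

```shell
if command -v ovs-vsctl >/dev/null 2>&1; then
    # Per-VM bridge with a dpdkvhostuserclient port for the VM ...
    ovs-vsctl add-br br-ply2 -- set bridge br-ply2 datapath_type=netdev
    ovs-vsctl add-port br-ply2 tap2 -- set Interface tap2 type=dpdkvhostuserclient options:vhost-server-path=/var/run/openvswitch/tap2
    # ... and a patch-port pair connecting it to br-int.
    ovs-vsctl add-port br-ply2 p-tap2-int -- set Interface p-tap2-int type=patch options:peer=p-tap2
    ovs-vsctl add-port br-int p-tap2 -- set Interface p-tap2 type=patch options:peer=p-tap2-int
else
    echo "ovs-vsctl not found"
fi
```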

    4. Verify the networking.
      ovs-vsctl show
      

    5. Check whether the br-dpdk bridge at the local end is connected to the br-dpdk bridge at the peer end.
      ping 192.168.2.2
      

  7. Start the VM.
    When configuring a VM, pay attention to the huge page memory and network port configuration. The following VM configuration file is for reference:
    <domain type='kvm'>
      <name>VM1</name>
      <uuid>fb8eb9ff-21a7-42ad-b233-2a6e0470e0b5</uuid>
      <memory unit='KiB'>2097152</memory>
      <currentMemory unit='KiB'>2097152</currentMemory>
      <memoryBacking>
        <hugepages>
          <page size='524288' unit='KiB' nodeset='0'/>
        </hugepages>
        <locked/>
      </memoryBacking>
      <vcpu placement='static'>4</vcpu>
      <cputune>
        <vcpupin vcpu='0' cpuset='6'/>
        <vcpupin vcpu='1' cpuset='7'/>
        <vcpupin vcpu='2' cpuset='8'/>
        <vcpupin vcpu='3' cpuset='9'/>
        <emulatorpin cpuset='0-3'/>
      </cputune>
      <numatune>
        <memory mode='strict' nodeset='0'/>
      </numatune>
      <os>
        <type arch='aarch64' machine='virt-rhel7.6.0'>hvm</type>
        <loader readonly='yes' type='pflash'>/usr/share/AAVMF/AAVMF_CODE.fd</loader>
        <nvram>/var/lib/libvirt/qemu/nvram/VM1_VARS.fd</nvram>
        <boot dev='hd'/>
      </os>
      <features>
        <acpi/>
        <gic version='3'/>
      </features>
      <cpu mode='host-passthrough' check='none'>
        <topology sockets='1' cores='4' threads='1'/>
        <numa>
          <cell id='0' cpus='0-3' memory='2097152' unit='KiB' memAccess='shared'/>
        </numa>
      </cpu>
      <clock offset='utc'/>
      <on_poweroff>destroy</on_poweroff>
      <on_reboot>restart</on_reboot>
      <on_crash>destroy</on_crash>
      <devices>
        <emulator>/usr/libexec/qemu-kvm</emulator>
        <disk type='file' device='disk'>
          <driver name='qemu' type='qcow2'/>
          <source file='/home/kvm/images/1.img'/>
          <target dev='vda' bus='virtio'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
        </disk>
        <disk type='file' device='cdrom'>
          <driver name='qemu' type='raw'/>
          <target dev='sda' bus='scsi'/>
          <readonly/>
          <address type='drive' controller='0' bus='0' target='0' unit='0'/>
        </disk>
        <controller type='usb' index='0' model='qemu-xhci' ports='8'>
          <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
        </controller>
        <controller type='scsi' index='0' model='virtio-scsi'>
          <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
        </controller>
        <controller type='pci' index='0' model='pcie-root'/>
        <controller type='pci' index='1' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='1' port='0x8'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
        </controller>
        <controller type='pci' index='2' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='2' port='0x9'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
        </controller>
        <controller type='pci' index='3' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='3' port='0xa'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
        </controller>
        <controller type='pci' index='4' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='4' port='0xb'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
        </controller>
        <controller type='pci' index='5' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='5' port='0xc'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
        </controller>
        <interface type='vhostuser'>
          <source type='unix' path='/var/run/openvswitch/tap1' mode='server'/>
          <target dev='tap1'/>
          <model type='virtio'/>
          <driver name='vhost' queues='4' rx_queue_size='1024' tx_queue_size='1024'/>
        </interface>
        <serial type='pty'>
          <target type='system-serial' port='0'>
            <model name='pl011'/>
          </target>
        </serial>
        <console type='pty'>
          <target type='serial' port='0'/>
        </console>
      </devices>
    </domain>
    
    • The memoryBacking section specifies the huge page size that the VM requests. The host is configured with 512 MB huge pages, so 512 MB (524288 KiB) is specified in this section.
    • The numatune section specifies the NUMA node from which the VM memory is allocated. This must be the same NUMA node as the one the NIC belongs to.
    • The numa subsection specifies the VM memory access mode. Because a vhostuser virtual network port is configured, the VM memory must be shared huge page memory (memAccess='shared').
    • The interface section specifies the virtual network port of the VM.
      • In the source section, path specifies the location of the socket file used for communication between the host and the VM, and mode specifies the role of the VM socket. OVS is configured with a dpdkvhostuserclient port, which acts as the client, so mode is set to server in this example.
      • The target section specifies the name of the socket used by the VM.
      • The driver section specifies the driver used by the VM as well as the number of queues and the queue depth. In this example, the vhost driver is used with four queues.
  8. Verify cross-host VM communication.

    Check whether the VM on Host 1 can communicate with the VM on Host 2.

    ping 192.168.1.21
    

    The VXLAN header adds extra bytes to each packet. Because the default maximum transmission unit (MTU) of the host is 1500 bytes, the MTU of the VM must be reduced so that encapsulated packets still fit; an MTU of 1400 bytes is recommended for the VM network port.
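Inside the guest, the MTU can be lowered with the ip command; a minimal sketch (the interface name eth0 is an assumption, so check the actual name with `ip link` first):

```shell
IF=eth0   # assumption: replace with the VM's actual network port name
if ip -o link show dev "$IF" >/dev/null 2>&1; then
    # Reduce the port MTU so that payload plus VXLAN encapsulation
    # fits within the 1500-byte host MTU.
    ip link set dev "$IF" mtu 1400 || echo "setting MTU requires root"
    # Confirm the configured value.
    ip -o link show dev "$IF"
else
    echo "interface $IF not present"
fi
```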