Mirantis OpenStack 7.0: NFVI Deployment Guide — NUMA/CPU pinning

https://www.mirantis.com/blog/mirantis-openstack-7-0-nfvi-deployment-guide-numacpu-pinning/

Compute hosts configuration

To enable CPU pinning, perform the following steps on every compute host where you want it enabled.

  1. Upgrade QEMU to 2.1 to use NUMA CPU pinning (see Appendix A1, “Installing qemu 2.1”).
  2. Get the NUMA topology for the node:
    # lscpu  | grep NUMA
    NUMA node(s):          2
    NUMA node0 CPU(s):     0-5,12-17
    NUMA node1 CPU(s):     6-11,18-23
  3. Add the following to /etc/default/grub to tell the system which cores may be used only by virtual machines and not by the host OS:
    GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX isolcpus=1-5,7-23"
  4. Add the same list to vcpu_pin_set in /etc/nova/nova.conf:
    vcpu_pin_set=1-5,7-23

    In this example we ensured that cores 0 and 6 will be dedicated to the host system. Virtual machines will use cores 1-5 and 12-17 on NUMA cell 1, and cores 7-11 and 18-23 on NUMA cell 2.

  5. Update the boot record and reboot the compute node (a quick verification sketch follows this list):
    update-grub
    reboot
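
After the node comes back up, it is worth confirming that the isolation actually took effect. The following commands are a verification sketch rather than part of the original procedure; with the example configuration, host processes should be confined to cores 0 and 6:

    # grep -o 'isolcpus=[0-9,-]*' /proc/cmdline
    isolcpus=1-5,7-23
    # taskset -pc 1
    pid 1's current affinity list: 0,6
    # grep ^vcpu_pin_set /etc/nova/nova.conf
    vcpu_pin_set=1-5,7-23

The first command checks that the new kernel command line includes the isolcpus list, the second shows that PID 1 (and therefore everything it spawns) is restricted to the reserved cores, and the third confirms the matching nova setting.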

Nova configuration

  1. Create host aggregates for instances with and without CPU pinning:
    # nova aggregate-create performance
    # nova aggregate-set-metadata performance pinned=true
    # nova aggregate-create normal
    # nova aggregate-set-metadata normal pinned=false
  2. Add one or more hosts to the new aggregates:
    # nova aggregate-add-host performance node-9.domain.tld
    # nova aggregate-add-host normal node-10.domain.tld
  3. Create a new flavor for VMs that require CPU pinning:
    # nova flavor-create m1.small.performance auto 2048 20 2
    # nova flavor-key m1.small.performance set hw:cpu_policy=dedicated
    # nova flavor-key m1.small.performance set aggregate_instance_extra_specs:pinned=true
  4. To be thorough, you should also update all other flavors so that they start only on hosts without CPU pinning:
    # openstack flavor list -f csv | grep -v performance | cut -f1 -d, |
      tail -n +2 | xargs -I% -n 1 nova flavor-key % set aggregate_instance_extra_specs:pinned=false
  5. On every controller, add NUMATopologyFilter and AggregateInstanceExtraSpecsFilter to the scheduler_default_filters parameter in /etc/nova/nova.conf:
    scheduler_default_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,NUMATopologyFilter,AggregateInstanceExtraSpecsFilter
  6. Restart the nova-scheduler service on all controllers (a quick sanity check follows this list):
    restart nova-scheduler
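
Before booting any test instances, it can help to confirm that the aggregate metadata, flavor extra specs, and scheduler filters all line up; a mismatched pinned value is the most common reason the scheduler refuses to place a pinned VM. This is an optional sanity check, not part of the original procedure:

    # nova aggregate-details performance
    # nova flavor-show m1.small.performance | grep extra_specs
    # grep scheduler_default_filters /etc/nova/nova.conf

The aggregate should show pinned=true in its metadata, the flavor should list hw:cpu_policy=dedicated and aggregate_instance_extra_specs:pinned=true, and the filter list should contain both NUMATopologyFilter and AggregateInstanceExtraSpecsFilter.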

Using CPU pinning

Once you’ve done this configuration, using CPU pinning is straightforward. Follow these steps:

  1. Start a new VM with a flavor that requires pinning …
    # nova boot --image TestVM --nic net-id=`openstack network show net04 -f value -c id` --flavor m1.small.performance test1

    … and check its vcpu configuration:

    # hypervisor=`nova show test1 | grep OS-EXT-SRV-ATTR:host | cut -d '|' -f3`
    # instance=`nova show test1 | grep OS-EXT-SRV-ATTR:instance_name | cut -d '|' -f3`
    # ssh $hypervisor virsh dumpxml $instance | awk '/vcpu placement/ {p=1}; p; /\/numatune/ {p=0}'
      <vcpu placement='static'>2</vcpu>
      <cputune>
        <shares>2048</shares>
        <vcpupin vcpu='0' cpuset='16'/>
        <vcpupin vcpu='1' cpuset='4'/>
        <emulatorpin cpuset='4,16'/>
      </cputune>
      <numatune>
        <memory mode='strict' nodeset='0'/>
        <memnode cellid='0' mode='strict' nodeset='0'/>
      </numatune>

    You should see that each vCPU is pinned to a dedicated CPU core, which is not used by the host operating system, and that these cores are inside the same host NUMA cell (in our example it’s cores 4 and 16 in NUMA cell 1).

  2. Repeat the test for an instance that spans two NUMA cells (the hw:numa_nodes=2 key below pins the instance across both cells):
    # nova flavor-create m1.small.performance-2 auto 2048 20 2
    # nova flavor-key m1.small.performance-2 set hw:cpu_policy=dedicated
    # nova flavor-key m1.small.performance-2 set aggregate_instance_extra_specs:pinned=true
    # nova flavor-key m1.small.performance-2 set hw:numa_nodes=2
    # nova boot --image TestVM --nic net-id=`openstack network show net04 -f value -c id` --flavor m1.small.performance-2 test2
    # hypervisor=`nova show test2 | grep OS-EXT-SRV-ATTR:host | cut -d '|' -f3`
    # instance=`nova show test2 | grep OS-EXT-SRV-ATTR:instance_name | cut -d '|' -f3`
    # ssh $hypervisor virsh dumpxml $instance | awk '/vcpu placement/ {p=1}; p; /\/numatune/ {p=0}'
      <vcpu placement='static'>2</vcpu>
      <cputune>
        <shares>2048</shares>
        <vcpupin vcpu='0' cpuset='2'/>
        <vcpupin vcpu='1' cpuset='10'/>
        <emulatorpin cpuset='2,10'/>
      </cputune>
      <numatune>
        <memory mode='strict' nodeset='0-1'/>
        <memnode cellid='0' mode='strict' nodeset='0'/>
        <memnode cellid='1' mode='strict' nodeset='1'/>
      </numatune>

    You should see that each vCPU is pinned to a dedicated CPU core, which is not used by the host operating system, and that these cores are in different host NUMA cells. In our example it’s core 2 in NUMA cell 1 and core 10 in NUMA cell 2. As you may remember, in our configuration cores 1-5 and 12-17 from cell 1 and cores 7-11 and 18-23 from cell 2 are available to virtual machines.
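
If you prefer not to parse the domain XML, libvirt can report the same information directly. The commands below are an optional cross-check, not part of the original guide; they reuse the $hypervisor and $instance variables set above, and the output should look roughly like this for the two-cell instance:

    # ssh $hypervisor virsh vcpupin $instance
    VCPU: CPU Affinity
    ----------------------------------
       0: 2
       1: 10
    # ssh $hypervisor virsh numatune $instance
    numa_mode      : strict
    numa_nodeset   : 0-1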

Troubleshooting

You might run into the following errors:

  • internal error: No PCI buses available

    In this case, you’ve specified the wrong hw_machine_type in /etc/nova/nova.conf.

  • libvirtError: unsupported configuration

    Per-node memory binding is not supported by this version of QEMU. You may have an older version of qemu, or a stale libvirt cache (see the checks after this list).

  • For more details on how Nova handles libvirt NUMA placement, see http://docs.openstack.org/developer/nova/testing/libvirt-numa.html
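
For both errors, a few checks on the affected compute host can narrow things down. These commands are a suggestion rather than part of the original guide; the qemu binary name and the libvirt-bin service name assume the Ubuntu 14.04 base used by Mirantis OpenStack 7.0 and may differ elsewhere:

    # grep hw_machine_type /etc/nova/nova.conf
    # qemu-system-x86_64 --version
    # virsh version
    # service libvirt-bin restart

The first command shows which machine type nova asks libvirt for (relevant to the PCI bus error); the next two confirm that the upgraded QEMU is the one libvirt actually sees; restarting libvirt forces it to re-probe QEMU capabilities instead of using a stale cache.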