Configuring VXLAN Encap/Decap Offload Using tc

 
 

OVS Hardware Offloads Configuration

OVS-Kernel Hardware Offloads

Configuring Uplink Representor Mode

This step is optional. If you do configure the uplink representor mode, do so before configuring SwitchDev.

The following uplink representor modes are available for configuration:

  • new_netdev: the default mode, in which the uplink representor is created as a new netdevice
  • nic_netdev: in this mode, the NIC netdevice acts as the uplink representor device

Example

echo nic_netdev > /sys/class/net/ens1f0/compat/devlink/uplink_rep_mode

Notes:

  • The mode can only be changed while the device is in Legacy mode
  • The mode is not saved when mlx5_core is reloaded
  • When two PFs in the same bonding device need to enter SwitchDev mode, the uplink representor mode of both PFs should be the same (either nic_netdev or new_netdev)
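
The example and notes above can be wrapped in a small helper that rejects unknown mode names before writing. This is only a sketch: the function name and the SYSFS_ROOT variable are assumptions made here so the helper can be tried against a scratch directory, and it does not check that the device is in Legacy mode.

```shell
# Hypothetical helper: set the uplink representor mode of one PF.
# SYSFS_ROOT is an assumption added for dry runs; it defaults to /sys.
set_uplink_rep_mode() {
    local pf="$1" mode="$2"
    local root="${SYSFS_ROOT:-/sys}"
    local node="$root/class/net/$pf/compat/devlink/uplink_rep_mode"

    # Only the two modes listed above are accepted.
    case "$mode" in
        new_netdev|nic_netdev) ;;
        *) echo "unknown mode: $mode" >&2; return 2 ;;
    esac
    if [ ! -e "$node" ]; then
        echo "no such attribute: $node" >&2
        return 1
    fi
    echo "$mode" > "$node"
}
```

For a bond, the helper would be called once per PF with the same mode, for example set_uplink_rep_mode ens1f0 nic_netdev followed by the same call for the second PF.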
 
 

mlx5_core

A new mlx5_core module parameter, "num_of_groups", controls the number of large groups in the FDB flow table.

Note: In MLNX_OFED v4.6-3.1.9.0.14, the default value of num_of_groups was 15, while in the current MLNX_OFED v4.7-3 the default value is 4. To get the same out-of-box (OOB) behavior, set the num_of_groups module parameter to 15 before the driver is loaded. For further information, refer to the "Performance Tuning Based on Traffic Patterns" section in the MLNX_OFED User Manual.
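
One common way to set a module parameter before the driver loads is an options line under /etc/modprobe.d. The sketch below only prints that line; the function name and the target file name in the comment are assumptions, not something the release note prescribes.

```shell
# Hypothetical helper: print a modprobe options line for mlx5_core.
mlx5_groups_option() {
    local groups="$1"
    # Accept only a non-negative integer.
    case "$groups" in
        ''|*[!0-9]*) echo "num_of_groups must be a non-negative integer" >&2; return 2 ;;
    esac
    printf 'options mlx5_core num_of_groups=%s\n' "$groups"
}

# Example (as root, before mlx5_core is loaded; the file name is an assumption):
#   mlx5_groups_option 15 > /etc/modprobe.d/mlx5_core.conf
```

If the driver is loaded from an initramfs, the image may also need to be rebuilt for the option to take effect.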
 
 
Setup
Software Version
OFED: 4.6-1.0.1.1
Firmware: 16.25.1020
SUT server
The ConnectX-5 should already be configured in switchdev mode. See Section ‘1.2 Setting up SR-IOV’ in the ASAP User Manual for the configuration steps.
ens1f0 is the uplink representor, while ens1f2 is the VF interface and ens1f0_0 is the corresponding VF representor.
A VXLAN interface vxlan16 will be created over the uplink representor.
Peer Server
ens1f0 is the physical interface. A VXLAN interface vxlan16 will be created over the physical interface as the peer interface.
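
Before continuing, the switchdev prerequisite on the SUT can be checked with devlink. The sketch below parses the output of devlink dev eswitch show; the PCI address in the comment is a placeholder, and the sample line format is an assumption based on typical iproute2 output, which may differ between versions.

```shell
# Hypothetical helper: read `devlink dev eswitch show pci/<addr>` output on
# stdin and print the word that follows the "mode" keyword.
eswitch_mode_from_output() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "mode") { print $(i + 1); exit } }'
}

# Example (0000:06:00.0 is a placeholder PCI address):
#   devlink dev eswitch show pci/0000:06:00.0 | eswitch_mode_from_output
```

The expected result on the SUT is switchdev; legacy means the SR-IOV setup steps from the manual still need to be done.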
 
 
 
 

Configuring the Peer

 
  1. Set the IP address of the physical interface ens1f0 and bring it up.
# ifconfig ens1f0 2.2.2.3/24 up
 
  2. Create a VXLAN interface vxlan16 with vni 16 over the ens1f0 interface.
# ip link add name vxlan16 type vxlan id 16 dev ens1f0 remote 2.2.2.2 dstport 4789
  3. Set the IP address of the vxlan16 interface and bring it up.
# ifconfig vxlan16 1.1.1.3/24 up
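
On systems without ifconfig, the same peer setup can be expressed with iproute2 only. The sketch below prints the commands instead of running them, so they can be reviewed first; the function name and its argument order are assumptions made for illustration, and ip addr add is only roughly equivalent to the ifconfig calls above.

```shell
# Hypothetical helper: print iproute2 commands for one VXLAN endpoint.
# Nothing is executed; pipe the output to `sh` as root to apply it.
vxlan_endpoint_cmds() {
    local dev="$1" underlay_cidr="$2" remote="$3" vni="$4" overlay_cidr="${5:-}"
    local vx="vxlan$vni"

    printf 'ip addr add %s dev %s\n' "$underlay_cidr" "$dev"
    printf 'ip link set dev %s up\n' "$dev"
    printf 'ip link add name %s type vxlan id %s dev %s remote %s dstport 4789\n' \
        "$vx" "$vni" "$dev" "$remote"
    # The overlay address is optional: the SUT's vxlan16 carries none.
    if [ -n "$overlay_cidr" ]; then
        printf 'ip addr add %s dev %s\n' "$overlay_cidr" "$vx"
    fi
    printf 'ip link set dev %s up\n' "$vx"
}

# Peer values from the steps above:
#   vxlan_endpoint_cmds ens1f0 2.2.2.3/24 2.2.2.2 16 1.1.1.3/24
```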
 
 

Configuring the SUT

 
  1. Set the IP address of the VF interface ens1f2 and the uplink representor interface ens1f0, then bring them up.
Also bring the VF representor interface ens1f0_0 up.
# ifconfig ens1f2 1.1.1.2/24 up   
# ifconfig ens1f0 2.2.2.2/24 up  
# ifconfig ens1f0_0 up
  2. Create a VXLAN interface vxlan16 with vni 16 over the uplink representor interface ens1f0, then bring it up.
# ip link add name vxlan16 type vxlan id 16 dev ens1f0 remote 2.2.2.3 dstport 4789
# ifconfig vxlan16 up
 
  3. Reduce the MTU of the VF and VF representor interfaces so that packets can be encapsulated without fragmentation. The MTU should be lowered to 1450 for IPv4 and 1430 for IPv6.
# ifconfig ens1f2 mtu 1450
# ifconfig ens1f0_0 mtu 1450
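
The two MTU values follow from the VXLAN overhead. The sketch below reproduces the arithmetic under two assumptions that the text does not state explicitly: the underlay MTU is 1500, and "IPv4" and "IPv6" refer to the outer (tunnel) IP header. The function name is hypothetical.

```shell
# Hypothetical helper: largest inner MTU that fits in one underlay packet.
vxlan_inner_mtu() {
    local underlay_mtu="$1" outer_family="$2" ip_hdr
    case "$outer_family" in
        ipv4) ip_hdr=20 ;;   # outer IPv4 header without options
        ipv6) ip_hdr=40 ;;   # outer IPv6 fixed header
        *) echo "outer family must be ipv4 or ipv6" >&2; return 2 ;;
    esac
    # Overhead: outer IP + UDP (8) + VXLAN (8) + inner Ethernet (14).
    echo $(( underlay_mtu - ip_hdr - 8 - 8 - 14 ))
}

# vxlan_inner_mtu 1500 ipv4   prints 1450
# vxlan_inner_mtu 1500 ipv6   prints 1430
```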
 
  4. Enable TC hardware offload on the uplink representor and the VF representor.
# ethtool -K ens1f0 hw-tc-offload on
# ethtool -K ens1f0_0 hw-tc-offload on
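
To confirm that the setting took effect, the feature list from ethtool -k (lowercase) can be inspected. The parser below is a sketch with a hypothetical name; it assumes the feature is reported on a line of the form "hw-tc-offload: on".

```shell
# Hypothetical helper: read `ethtool -k <dev>` output on stdin and print the
# hw-tc-offload state. Returns non-zero if the feature line is missing.
hw_tc_offload_state() {
    awk '$1 == "hw-tc-offload:" { print $2; found = 1; exit }
         END { if (!found) exit 1 }'
}

# Example:
#   ethtool -k ens1f0   | hw_tc_offload_state
#   ethtool -k ens1f0_0 | hw_tc_offload_state
```

Both interfaces should report on before the TC rules are added.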
 
  5. Enable TC ingress on the uplink representor, the VF representor and the VXLAN interface.
# tc qdisc add dev ens1f0 ingress
# tc qdisc add dev ens1f0_0 ingress
# tc qdisc add dev vxlan16 ingress
 
  6. Add TC rules to offload the egress datapath and the VXLAN encapsulation actions.
# tc filter add dev ens1f0_0 protocol ip parent ffff: prio 1 \
    flower \
    dst_mac e4:11:22:33:45:61 \
    src_mac e4:11:22:33:44:51 \
    action tunnel_key set \
    src_ip 2.2.2.2 \
    dst_ip 2.2.2.3 \
    dst_port 4789 \
    id 16 \
    action mirred egress redirect dev vxlan16

# tc filter add dev ens1f0_0 protocol arp parent ffff: prio 2 \
    flower \
    dst_mac e4:11:22:33:45:61 \
    src_mac e4:11:22:33:44:51 \
    action tunnel_key set \
    src_ip 2.2.2.2 \
    dst_ip 2.2.2.3 \
    dst_port 4789 \
    id 16 \
    action mirred egress redirect dev vxlan16

# tc filter add dev ens1f0_0 protocol arp parent ffff: prio 3 \
    flower \
    dst_mac ff:ff:ff:ff:ff:ff \
    src_mac e4:11:22:33:44:51 \
    action tunnel_key set \
    src_ip 2.2.2.2 \
    dst_ip 2.2.2.3 \
    dst_port 4789 \
    id 16 \
    action mirred egress redirect dev vxlan16
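
The three egress rules differ only in protocol, priority and destination MAC, so they can be generated from one template. The sketch below prints the commands rather than running them. The function names are hypothetical, and reading e4:11:22:33:44:51 as the local VF MAC and e4:11:22:33:45:61 as the peer MAC is an interpretation of the rules above.

```shell
# Hypothetical helper: print one VXLAN encap rule for a VF representor.
encap_rule_cmd() {
    local rep="$1" proto="$2" prio="$3" dst_mac="$4" src_mac="$5"
    # Tunnel parameters copied from the rules above.
    local tun_src=2.2.2.2 tun_dst=2.2.2.3 tun_port=4789 vni=16 vxlan=vxlan16

    printf 'tc filter add dev %s protocol %s parent ffff: prio %s flower' \
        "$rep" "$proto" "$prio"
    printf ' dst_mac %s src_mac %s' "$dst_mac" "$src_mac"
    printf ' action tunnel_key set src_ip %s dst_ip %s dst_port %s id %s' \
        "$tun_src" "$tun_dst" "$tun_port" "$vni"
    printf ' action mirred egress redirect dev %s\n' "$vxlan"
}

# Print the same three rules as above: unicast IP, unicast ARP, broadcast ARP.
print_encap_rules() {
    local vf_mac=e4:11:22:33:44:51 peer_mac=e4:11:22:33:45:61
    encap_rule_cmd ens1f0_0 ip  1 "$peer_mac" "$vf_mac"
    encap_rule_cmd ens1f0_0 arp 2 "$peer_mac" "$vf_mac"
    encap_rule_cmd ens1f0_0 arp 3 ff:ff:ff:ff:ff:ff "$vf_mac"
}
```

Running print_encap_rules and reviewing the output before piping it to a root shell keeps the MAC addresses in one place when the setup is repeated for another VF.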


  7. Add TC rules to offload the ingress datapath and the VXLAN decapsulation actions.
# tc filter add dev vxlan16 protocol ip parent ffff: prio 1 \
    flower \
    dst_mac e4:11:22:33:44:51 \
    src_mac e4:11:22:33:45:61 \
    enc_dst_ip 2.2.2.2 \
    enc_src_ip 2.2.2.3 \
    enc_dst_port 4789 \
    enc_key_id 16 \
    action tunnel_key unset \
    action mirred egress redirect dev ens1f0_0

# tc filter add dev vxlan16 protocol arp parent ffff: prio 2 \
    flower \
    dst_mac e4:11:22:33:44:51 \
    src_mac e4:11:22:33:45:61 \
    enc_dst_ip 2.2.2.2 \
    enc_src_ip 2.2.2.3 \
    enc_dst_port 4789 \
    enc_key_id 16 \
    action tunnel_key unset \
    action mirred egress redirect dev ens1f0_0

# tc filter add dev vxlan16 protocol arp parent ffff: prio 3 \
    flower \
    dst_mac ff:ff:ff:ff:ff:ff \
    src_mac e4:11:22:33:45:61 \
    enc_dst_ip 2.2.2.2 \
    enc_src_ip 2.2.2.3 \
    enc_dst_port 4789 \
    enc_key_id 16 \
    action tunnel_key unset \
    action mirred egress redirect dev ens1f0_0
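
The flower classifier also accepts the skip_sw and skip_hw flags. With skip_sw, a rule that cannot be offloaded is rejected instead of silently falling back to software, which can make offload problems visible at install time. The sketch below prints a decap rule with an optional flag; the function name is hypothetical, and whether skip_sw is accepted for these rules on a given driver and firmware version is an assumption to verify.

```shell
# Hypothetical helper: print one VXLAN decap rule on vxlan16, optionally with
# a flower offload flag (skip_sw or skip_hw) placed right after "flower".
decap_rule_cmd() {
    local proto="$1" prio="$2" dst_mac="$3" src_mac="$4" hw_flag="${5:-}"

    case "$hw_flag" in
        ''|skip_sw|skip_hw) ;;
        *) echo "flag must be skip_sw or skip_hw" >&2; return 2 ;;
    esac
    printf 'tc filter add dev vxlan16 protocol %s parent ffff: prio %s flower' \
        "$proto" "$prio"
    if [ -n "$hw_flag" ]; then
        printf ' %s' "$hw_flag"
    fi
    printf ' dst_mac %s src_mac %s' "$dst_mac" "$src_mac"
    # Outer-header match values copied from the rules above.
    printf ' enc_dst_ip 2.2.2.2 enc_src_ip 2.2.2.3 enc_dst_port 4789 enc_key_id 16'
    printf ' action tunnel_key unset action mirred egress redirect dev ens1f0_0\n'
}

# Example: the first ingress rule, hardware only.
#   decap_rule_cmd ip 1 e4:11:22:33:44:51 e4:11:22:33:45:61 skip_sw
```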


Verification
Make sure the rules are offloaded: every filter should be flagged in_hw.
For egress rules:
# tc -s filter show dev ens1f0_0 root
filter protocol ip pref 1 flower
filter protocol ip pref 1 flower handle 0x1
  dst_mac e4:11:22:33:45:61
  src_mac e4:11:22:33:44:51
  eth_type ipv4
  in_hw
        action order 1: tunnel_key set
        src_ip 2.2.2.2
        dst_ip 2.2.2.3
        key_id 16
        dst_port 4789 pipe
        index 1 ref 1 bind 1 installed 452 sec used 452 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: mirred (Egress Redirect to device vxlan16) stolen
        index 1 ref 1 bind 1 installed 452 sec used 0 sec
        Action statistics:
        Sent 148 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

filter protocol arp pref 2 flower
filter protocol arp pref 2 flower handle 0x1
  dst_mac e4:11:22:33:45:61
  src_mac e4:11:22:33:44:51
  eth_type arp
  in_hw
        action order 1: tunnel_key set
        src_ip 2.2.2.2
        dst_ip 2.2.2.3
        key_id 16
        dst_port 4789 pipe
        index 2 ref 1 bind 1 installed 452 sec used 452 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: mirred (Egress Redirect to device vxlan16) stolen
        index 2 ref 1 bind 1 installed 452 sec used 20 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

filter protocol arp pref 3 flower
filter protocol arp pref 3 flower handle 0x1
  dst_mac ff:ff:ff:ff:ff:ff
  src_mac e4:11:22:33:44:51
  eth_type arp
  in_hw
        action order 1: tunnel_key set
        src_ip 2.2.2.2
        dst_ip 2.2.2.3
        key_id 16
        dst_port 4789 pipe
        index 3 ref 1 bind 1 installed 452 sec used 452 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: mirred (Egress Redirect to device vxlan16) stolen
        index 3 ref 1 bind 1 installed 452 sec used 93 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

For ingress rules:
# tc -s filter show dev vxlan16 root
filter protocol ip pref 1 flower
filter protocol ip pref 1 flower handle 0x1
  dst_mac e4:11:22:33:44:51
  src_mac e4:11:22:33:45:61
  eth_type ipv4
  enc_dst_ip 2.2.2.2
  enc_src_ip 2.2.2.3
  enc_key_id 16
  enc_dst_port 4789
  in_hw
        action order 1: tunnel_key unset pipe
        index 4 ref 1 bind 1 installed 30 sec used 30 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: mirred (Egress Redirect to device ens1f0_0) stolen
        index 4 ref 1 bind 1 installed 30 sec used 0 sec
        Action statistics:
        Sent 2156 bytes 22 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

filter protocol arp pref 2 flower
filter protocol arp pref 2 flower handle 0x1
  dst_mac e4:11:22:33:44:51
  src_mac e4:11:22:33:45:61
  eth_type arp
  enc_dst_ip 2.2.2.2
  enc_src_ip 2.2.2.3
  enc_key_id 16
  enc_dst_port 4789
  in_hw
        action order 1: tunnel_key unset pipe
        index 5 ref 1 bind 1 installed 30 sec used 30 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: mirred (Egress Redirect to device ens1f0_0) stolen
        index 5 ref 1 bind 1 installed 30 sec used 17 sec
        Action statistics:
        Sent 84 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

filter protocol arp pref 3 flower
filter protocol arp pref 3 flower handle 0x1
  dst_mac ff:ff:ff:ff:ff:ff
  src_mac e4:11:22:33:45:61
  eth_type arp
  enc_dst_ip 2.2.2.2
  enc_src_ip 2.2.2.3
  enc_key_id 16
  enc_dst_port 4789
  in_hw
        action order 1: tunnel_key unset pipe
        index 6 ref 1 bind 1 installed 30 sec used 30 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: mirred (Egress Redirect to device ens1f0_0) stolen
        index 6 ref 1 bind 1 installed 30 sec used 30 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
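
Scanning long listings for the in_hw flag by eye is error-prone, so the check can be scripted. The sketch below counts the in_hw and not_in_hw markers in tc filter show output; the function name is hypothetical, and it assumes each marker is the first word on its line, as in the listings above.

```shell
# Hypothetical helper: read `tc -s filter show dev <dev> root` output on stdin
# and report how many filters are flagged in_hw and not_in_hw.
count_offload_flags() {
    awk '$1 == "in_hw"     { hw++ }
         $1 == "not_in_hw" { sw++ }
         END { printf "in_hw=%d not_in_hw=%d\n", hw, sw }'
}

# Example:
#   tc -s filter show dev ens1f0_0 root | count_offload_flags
#   tc -s filter show dev vxlan16 root  | count_offload_flags
```

For the setup in this document, each of the two commands would be expected to report three filters in hardware and none in software.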

Checking the traffic
  1. Keep pinging the peer. The results should be successful.
# ping 1.1.1.3
PING 1.1.1.3 (1.1.1.3) 56(84) bytes of data.
64 bytes from 1.1.1.3: icmp_seq=1 ttl=64 time=0.090 ms
64 bytes from 1.1.1.3: icmp_seq=2 ttl=64 time=0.088 ms
64 bytes from 1.1.1.3: icmp_seq=3 ttl=64 time=0.048 ms
64 bytes from 1.1.1.3: icmp_seq=4 ttl=64 time=0.052 ms
64 bytes from 1.1.1.3: icmp_seq=5 ttl=64 time=0.047 ms
 
  2. Verify the datapath is offloaded.
Run tcpdump on the uplink representor ens1f0, the VF representor ens1f0_0 and the VXLAN interface vxlan16 while the ping is running. No packets should be seen on any of them.
# tcpdump -i ens1f0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens1f0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel

# tcpdump -i ens1f0_0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens1f0_0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel

# tcpdump -i vxlan16
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vxlan16, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
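
The same check can be made non-interactive by bounding the capture time and reading the closing summary. The parser below is a sketch with a hypothetical name; the usage line assumes tcpdump writes its summary to stderr and still prints it when timeout terminates the capture.

```shell
# Hypothetical helper: read tcpdump's closing summary on stdin and print N
# from the "N packets captured" line. Returns non-zero if the line is missing.
captured_count() {
    awk '$2 == "packets" && $3 == "captured" { print $1; found = 1 }
         END { if (!found) exit 1 }'
}

# Example (10-second capture on each interface, run while the ping is active):
#   for dev in ens1f0 ens1f0_0 vxlan16; do
#       printf '%s: ' "$dev"
#       timeout 10 tcpdump -nn -i "$dev" 2>&1 | captured_count
#   done
```

A count of 0 on all three interfaces matches the expectation stated above.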
 
  3. Verify the VXLAN encap/decap actions.

Capture the ingress traffic on the VF interface ens1f2.
# tcpdump -i ens1f2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens1f2, link-type EN10MB (Ethernet), capture size 262144 bytes
13:57:48.991271 IP 1.1.1.2 > 1.1.1.3: ICMP echo request, id 18323, seq 153, length 64
13:57:48.991371 IP 1.1.1.3 > 1.1.1.2: ICMP echo reply, id 18323, seq 153, length 64
13:57:49.991268 IP 1.1.1.2 > 1.1.1.3: ICMP echo request, id 18323, seq 154, length 64
13:57:49.991310 IP 1.1.1.3 > 1.1.1.2: ICMP echo reply, id 18323, seq 154, length 64
We can see that the packets sent and received by the VF interface are not encapsulated.
Capture the ingress traffic on the physical interface ens1f0 on the peer server.
# tcpdump -i ens1f0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens1f0, link-type EN10MB (Ethernet), capture size 262144 bytes
13:59:24.991245 IP 2.2.2.2.58743 > 2.2.2.3.4789: VXLAN, flags [I] (0x08), vni 16
IP 1.1.1.2 > 1.1.1.3: ICMP echo request, id 18323, seq 249, length 64
13:59:24.991266 IP 2.2.2.3.51556 > 2.2.2.2.4789: VXLAN, flags [I] (0x08), vni 16
IP 1.1.1.3 > 1.1.1.2: ICMP echo reply, id 18323, seq 249, length 64

The first packet captured is an ICMP request encapsulated in a VXLAN packet, which came from the SUT.
The second packet is the ICMP reply, encapsulated in a VXLAN packet and sent to the SUT.
Together with the empty captures on the representors, this shows that the VXLAN encapsulation/decapsulation actions are performed by the hardware.
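
The "flags [I] (0x08), vni 16" part of the capture corresponds to the 8-byte VXLAN header that follows the outer UDP header. As a sketch, the header bytes for a given VNI can be reconstructed as below and compared against a hex dump (for example from tcpdump -xx); the function name is hypothetical.

```shell
# Hypothetical helper: print the 8-byte VXLAN header for a VNI as hex.
# Layout: flags (0x08 = VNI present), 24 reserved bits, 24-bit VNI, 8 reserved bits.
vxlan_header_hex() {
    local vni="$1"
    if [ "$vni" -lt 0 ] || [ "$vni" -gt 16777215 ]; then
        echo "VNI must be in the range 0..16777215" >&2
        return 2
    fi
    printf '08000000%06x00\n' "$vni"
}

# vxlan_header_hex 16   prints 0800000000001000
```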
 
Original article: https://www.cnblogs.com/dream397/p/14464784.html