1. Install the dependency packages

yum install epel-release -y  
yum install gcc gcc-c++ bc openssl-devel automake ncurses-devel libibverbs -y  
yum install libibverbs-devel libibverbs-utils librdmacm librdmacm-devel librdmacm-utils perl-Switch elfutils-libelf-devel  -y

2. Download librxe-dev and rxe-dev

Download locations

Github: https://github.com/SoftRoCE/rxe-dev.git  
Github: https://github.com/SoftRoCE/librxe-dev.git

Note: for rxe-dev, download the v18 release, i.e. rxe-dev-rxe_submission_v18.

3. Install rxe-dev

unzip rxe-dev-rxe_submission_v18.zip
cd rxe-dev-rxe_submission_v18/
cp /boot/config-3.10.0-514.el7.x86_64 .config

Note: run the following commands as root.

make menuconfig

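A selection screen appears (if it does not, install ncurses-devel).
Type "/", then type rxe and press Enter to search for the rxe-related options.
Press 1 to jump to the "Software RDMA over Ethernet (ROCE) driver" option, then press "M" to build the RDMA driver as a module. If "M" does not work, press the space bar instead.
Move to the Save button and press Enter to write the settings to .config, then leave the configuration screen (Exit).
Then open .config with vi to confirm that
CONFIG_RDMA_RXE is m
CONFIG_INFINIBAND_ADDR_TRANS and CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS are y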
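
The manual check in vi can also be scripted. The following is only a sketch: the helper name check_rxe_config is made up for illustration, and it assumes the three options are spelled CONFIG_RDMA_RXE, CONFIG_INFINIBAND_ADDR_TRANS and CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS with the values listed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of rxe-dev): report whether the kernel
# options needed for Soft-RoCE have the expected values in a config file.
check_rxe_config() {
    cfg=${1:-.config}          # default to .config in the current directory
    rc=0
    for want in CONFIG_RDMA_RXE=m \
                CONFIG_INFINIBAND_ADDR_TRANS=y \
                CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS=y; do
        # -x: match the whole line, -F: treat the pattern as a fixed string
        if grep -qxF "$want" "$cfg"; then
            echo "ok      $want"
        else
            echo "MISSING $want"
            rc=1
        fi
    done
    return $rc
}
```

Running check_rxe_config in the kernel source directory prints one line per option and returns non-zero if any of them is missing or has a different value.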

make -j 4  
make modules_install   # may report some missing modules partway through; this is harmless  
make install  
make headers_install INSTALL_HDR_PATH=/usr

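Confirm that the new kernel appears in the GRUB boot menu; it is listed in /etc/grub2.cfg (a symlink to /boot/grub2/grub.cfg on CentOS 7). You can then select the new kernel at boot time.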
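
To see the boot entries without reading the whole GRUB file, something like the sketch below may help. The function name list_grub_entries is hypothetical, and the default path /boot/grub2/grub.cfg is an assumption that holds for a BIOS-booted CentOS 7 system; pass a different path as the first argument if yours differs.

```shell
#!/bin/sh
# Hypothetical helper: print the top-level GRUB menu entries with their index.
# Assumes titles are single-quoted, as in the grub.cfg generated on CentOS 7.
list_grub_entries() {
    # Split on single quotes: field 2 is the title on each "menuentry" line.
    awk -F"'" '/^menuentry /{ printf "%d: %s\n", n++, $2 }' \
        "${1:-/boot/grub2/grub.cfg}"
}
```

If the new kernel shows up in the list, it can be chosen at the boot menu. On CentOS 7 the printed index can also be passed to grub2-set-default, assuming the entries are not nested in submenus.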

4. Install librxe-dev

cd librxe-dev  
./configure --libdir=/usr/lib64/ --prefix=  
make  
make install

Here is the issue:

checking for ibv_get_device_list in -libverbs... yes
checking infiniband/driver.h usability... no
checking infiniband/driver.h presence... no
checking for infiniband/driver.h... no
configure: error: <infiniband/driver.h> not found. librxe requires libibverbs.

How to fix?

https://wangmingjun.com/2018/09/03/how-to-build-the-development-environment-of-software-rdma-over-converged-ethernet-roce/

The rxe-dev and librxe-dev repositories are no longer maintained, and rdma-core already contains all the RXE utilities, so use rdma-core instead of librxe-dev.

Reboot the operating system and select the 4.7.0-rc3 kernel at the boot menu.
After booting, check the kernel version:

uname -r

5. Verify RDMA

[root@aboss ~]# rxe_cfg start 
  Name        Link  Driver  Speed  NMTU  IPv4_addr  RDEV  RMTU  
  ens33       yes   e1000                                       
  virbr0      no    bridge                                      
  virbr0-nic  no    tun                                         
[root@aboss ~]# rxe_cfg add ens33
[root@aboss ~]# rxe_cfg status 
  Name        Link  Driver  Speed  NMTU  IPv4_addr  RDEV  RMTU          
  ens33       yes   e1000                           rxe0  1024  (3)  
  virbr0      no    bridge                                              
  virbr0-nic  no    tun
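
The status table can be checked from a script as well. This is a sketch under the assumption that the RDEV column always holds a name of the form rxeN, as in the output above; rxe_dev_for_nic is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `rxe_cfg status` output on stdin and print the
# rxe device bound to the given NIC. Exits non-zero if none is bound.
rxe_dev_for_nic() {
    # Empty columns shift field positions, so scan the row for an rxeN name
    # instead of relying on a fixed column number.
    awk -v nic="$1" '
        $1 == nic {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^rxe[0-9]+$/) { print $i; found = 1 }
        }
        END { exit !found }'
}
```

For example, rxe_cfg status | rxe_dev_for_nic ens33 should print rxe0 on the machine shown above.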

List the rxe devices
ibv_devices lists all the devices currently present on the system, while ibv_devinfo prints detailed information about each device.

[root@aboss ~]# ibv_devices
    device          	   node GUID
    ------          	----------------
    rxe0            	020c29fffe55c818
[root@aboss ~]# ibv_devinfo  rxe0
hca_id:	rxe0
transport:	InfiniBand (0)
fw_ver:	0.0.0
node_guid:	020c:29ff:fe55:c818
sys_image_guid:	0000:0000:0000:0000
vendor_id:	0x0000
vendor_part_id:	0
hw_ver:	0x0
phys_port_cnt:	1
port:	1
state:	PORT_ACTIVE (4)
max_mtu:	4096 (5)
active_mtu:	1024 (3)
sm_lid:	0
port_lid:	0
port_lmc:	0x00
link_layer:	Ethernet
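
Before running the connectivity tests it can be useful to confirm that every port reports PORT_ACTIVE. The sketch below parses ibv_devinfo output from stdin; port_is_active is a hypothetical name, and it assumes the "state:" lines look like the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the ibv_devinfo output on stdin has
# at least one "state:" line and all of them report PORT_ACTIVE.
port_is_active() {
    awk '
        $1 == "state:" { seen = 1; if ($2 != "PORT_ACTIVE") bad = 1 }
        END { exit !(seen && !bad) }'
}
```

A possible use is ibv_devinfo -d rxe0 | port_is_active && echo "rxe0 is up".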

6. Soft-RoCE connectivity test

Server

rping -s -a 192.168.1.133 -v -C 10

Client

rping -c -a 192.168.1.133 -v -C 10

Test connectivity.

  • On the server:
ibv_rc_pingpong -d rxe0 -g 0
  • On the client:
ibv_rc_pingpong -d rxe0 -g 0 <server_management_ip>

For example, on the client:

kevin@ubuntu:~$ ibv_rc_pingpong -g 0 -d rxe0 -i 1 192.168.188.129
local address: LID 0x0000, QPN 0x000011, PSN 0x2cd726, GID fe80::20c:29ff:febd:5e22
remote address: LID 0x0000, QPN 0x000011, PSN 0x767a62, GID fe80::20c:29ff:fe44:4345

Switch to root before running the tests.

Server:

ib_send_bw -a

Client:

ib_send_bw 192.168.46.132 -a

7. Notes on building librdmacm

git clone https://github.com/ofiwg/librdmacm.git
cd librdmacm
yum install autoconf automake gettext gettext-devel libtool -y
./autogen.sh 
./configure 
make 
make install

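8. Common problems

(1) If you clone the virtual machine, you need to resolve the resulting network interface problems.
(2) To use RDMA, turn off the firewall and SELinux.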
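
For item (2), the commands usually used on CentOS 7 are sketched below. This assumes firewalld is the active firewall. The script only prints the commands by default; set RUN=1 and run it as root to actually execute them. The names run and relax_host_security are made up for illustration.

```shell
#!/bin/sh
# Print a command, or execute it when RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Hypothetical helper: turn off firewalld and put SELinux in permissive mode.
relax_host_security() {
    run systemctl stop firewalld      # stop the firewall for this boot
    run systemctl disable firewalld   # keep it from starting at the next boot
    run setenforce 0                  # permissive mode, lasts until reboot
}
```

Calling relax_host_security without RUN=1 is a dry run, which makes it easy to review what would change on a shared test machine.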

How to build a development environment for software RDMA over Converged Ethernet (RoCE)

Soft-RoCE (RXE)

To study RDMA programming, I needed to build a suitable environment. Lacking RDMA hardware, I concluded after some research that Soft-RoCE was the best first choice.

Most of the information centers on two repositories: [rxe-dev] and [librxe-dev]. The remaining material explains how to build the virtual RDMA device, named RXE, and how to use it.

Struggle Against RXE

From what I found online, RXE needs both kernel support and user-space code.

First, clone [rxe-dev], then compile and install the new kernel so that it supports RoCE, and reboot into it. Finally, compile [librxe-dev] to get the Soft-RoCE utilities.

When I switched to the newly compiled kernel, the system failed to boot. When I ran configure for [librxe-dev], it reported “configure: error: <infiniband/driver.h> not found. librxe requires libibverbs”.

This page shows the same issue, and several other people are stuck on this error as well.

Sudden Inspiration

MosesAlexander’s comment, “I just found that the rxe functionality is all in rdma-core now.”, gave me a sudden inspiration. rdma-core already contains all the RXE utilities, and the two repositories [rxe-dev] and [librxe-dev] appear to be unmaintained.

Solution

Simply running “yum -y install libibverbs libibverbs-devel libibverbs-utils librdmacm librdmacm-devel librdmacm-utils” is enough.

Note: I wrote this post on 2018/09/03, using CentOS 7 (3.10.0-862.el7.x86_64), whose stock kernel already supports RDMA-related technologies.
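
After installing the packages, a quick way to see whether the tools used in this post are on the PATH is sketched below. The helper name missing_tools is hypothetical; it only looks commands up and does not check that RDMA actually works.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not on the PATH.
# Returns non-zero if at least one command is missing.
missing_tools() {
    rc=0
    for t in "$@"; do
        command -v "$t" >/dev/null 2>&1 || { echo "$t"; rc=1; }
    done
    return $rc
}
```

For instance, missing_tools rxe_cfg ibv_devices ibv_devinfo ibv_rc_pingpong rping prints nothing when every tool is available.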

Verification

Run some commands to verify:

[root@localhost ~]# rxe_cfg start
Name Link Driver Speed NMTU IPv4_addr RDEV RMTU 
ens33 yes e1000 
virbr0 no bridge 
virbr0-nic no tun 
[root@localhost ~]# rxe_cfg add ens33
[root@localhost ~]# rxe_cfg status
Name Link Driver Speed NMTU IPv4_addr RDEV RMTU 
ens33 yes e1000 rxe0 1024 (3) 
virbr0 no bridge 
virbr0-nic no tun 
[root@localhost ~]# ibv_devices 
device node GUID
------ ----------------
rxe0 020c29fffe495c4d

You can also run the example code from the-geek-in-the-corner.

References:

http://blog.sina.com.cn/s/blog_6de3aa8a0102wr14.html

http://www.unjeep.com/article/23742.html (rping test; Soft-RoCE/RDMA installation and testing)

https://github.com/SoftRoCE/rxe-dev/wiki/Validate-that-RXE-is-working (validating that RXE is working)