Using NetApp NFS as the Backend Storage for OpenStack Cinder

Our company uses a NetApp FAS8040 as NFS storage for the test environment, which gave me a chance to test integrating OpenStack Cinder with NetApp storage.

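Notes:

1. Mounting NetApp NFS exports directly from OpenStack hosts works without problems. It is already used in production and is fairly stable, with measured throughput of 160-220 MB/s.

2. Cinder cannot use the storage the way a Linux host mounts an ordinary NFS share; it has to call the NetApp API. With the generic NFS driver, cinder-volume looks healthy at first, and after roughly ten minutes its State changes to down. (No screenshot available.)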

The incorrect configuration was:

[DEFAULT]
enabled_backends = nfs

[nfs]

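volume_backend_name = nfs               //Keep the three names consistent (enabled_backends, the section header, volume_backend_name); the name is unrelated to the protocol in use, e.g. netapp_nfs in the working configuration below
volume_driver = cinder.volume.drivers.nfs.NfsDriver    //Driver type: the generic NFS backend uses this one; each third-party vendor has its own driver setting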
nfs_sparsed_volumes = True
nfs_shares_config = /etc/cinder/nfs_shares
nfs_mount_point_base = $state_path/mnt
nfs_mount_options = v3
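
The naming rule noted in the configuration can be checked mechanically. The following is a minimal sketch, assuming the file is plain INI without the trailing `//` annotations used in this post; `check_backends` is a hypothetical helper written for illustration, not part of Cinder.

```python
import configparser


def check_backends(conf_text):
    """Return names listed in enabled_backends that have no matching [section]."""
    # interpolation=None so values such as $state_path or % are read literally
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    enabled = parser.get("DEFAULT", "enabled_backends", fallback="")
    names = [n.strip() for n in enabled.split(",") if n.strip()]
    return [n for n in names if not parser.has_section(n)]


# Deliberately inconsistent sample: the backend is enabled as netapp_nfs,
# but the section is still called [nfs].
SAMPLE = """
[DEFAULT]
enabled_backends = netapp_nfs

[nfs]
volume_backend_name = nfs
"""

missing = check_backends(SAMPLE)
print(missing)
```

An empty list means every enabled backend has its own section; it says nothing about whether the driver options inside that section are correct.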

[root@controller1 cinder]# vim nfs_shares

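172.16.5.242:/vol/sqmgtvm02/nfs      //NetApp storage IP:/exported path. The correct entry is 172.16.5.xx:/vol/sqmgtvm02, because the export is the volume itself, not a folder; the nfs subdirectory was added to keep this separate from production, and it caused error 2 below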
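
The share-path mistake described above is easy to catch before restarting the service. This is a rough sketch under the assumption that 7-Mode exports always look like `/vol/<volume>`; `parse_share` and `is_volume_root` are hypothetical names used only for this example.

```python
def parse_share(entry):
    """Split 'host:/export/path' into (host, list of path components)."""
    host, _, path = entry.strip().partition(":")
    parts = [p for p in path.split("/") if p]
    return host, parts


def is_volume_root(entry):
    """True when the export is exactly /vol/<volume>, with no subdirectory."""
    _, parts = parse_share(entry)
    return len(parts) == 2 and parts[0] == "vol"


print(is_volume_root("172.16.5.242:/vol/sqmgtvm02/nfs"))
print(is_volume_root("172.16.5.242:/vol/sqmgtvm02"))
```

The first entry is the one from the incorrect shares file and is rejected; the second is the volume-level export and is accepted.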

The error log in /var/log/cinder/volume.log was:

2017-09-07 22:07:58.983 16612 ERROR oslo_service.service [req-37e3e47a-e1cb-47b8-a950-73374fd8713b - - - - -] Error starting thread.
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service Traceback (most recent call last):
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service File "/usr/lib/python2.7/site-packages/oslo_service/service.py", line 708, in run_service
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service service.start()
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service File "/usr/lib/python2.7/site-packages/cinder/service.py", line 234, in start
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service self.manager.init_host(added_to_cluster=self.added_to_cluster)
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service File "/usr/lib/python2.7/site-packages/cinder/volume/manager.py", line 425, in init_host
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service self.driver.init_capabilities()
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service File "/usr/lib/python2.7/site-packages/cinder/volume/driver.py", line 704, in init_capabilities
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service stats = self.get_volume_stats(True)
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/remotefs.py", line 512, in get_volume_stats
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service self._update_volume_stats()
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/nfs.py", line 448, in _update_volume_stats
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service provisioned_capacity = self._get_provisioned_capacity()
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/remotefs.py", line 212, in _get_provisioned_capacity
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service run_as_root=True)
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service File "/usr/lib/python2.7/site-packages/cinder/utils.py", line 123, in execute
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service return processutils.execute(*cmd, **kwargs)
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service File "/usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py", line 389, in execute
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service cmd=sanitized_cmd)
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service ProcessExecutionError: Unexpected error while running command.
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf du --bytes /var/lib/cinder/mnt/3d59744e62f876bf5171140e3a723d34
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service Exit code: 1
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service Stdout: u'4096 /var/lib/cinder/mnt/3d59744e62f876bf5171140e3a723d34/.snapshot 8268 /var/lib/cinder/mnt/3d59744e62f876bf5171140e3a723d34 '
2017-09-07 22:07:58.983 16612 ERROR oslo_service.service Stderr: '/bin/du: WARNING: Circular directory structure. This almost certainly means that you have a corrupted file system. NOTIFY YOUR SYSTEM MANAGER. The following directory is part of the cycle: ‘/var/lib/cinder/mnt/3d59744e62f876bf5171140e3a723d34/.snapshot/sv_nightly.0’ '
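
The traceback above ends with `du` exiting with code 1 after it walks into the NetApp `.snapshot` directory. Purely as an illustration of the idea (this is not how Cinder computes provisioned capacity), the sketch below sums file sizes while refusing to descend into `.snapshot`. Unlike `du --bytes`, it counts only regular files, not directory entries, and the demo directory is a temporary stand-in for the real mount point.

```python
import os
import tempfile


def used_bytes(root, skip=(".snapshot",)):
    """Sum apparent file sizes under root without descending into skipped directories."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from entering those directories
        dirnames[:] = [d for d in dirnames if d not in skip]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


# Demo on a throwaway directory that mimics a mounted share with a snapshot folder
with tempfile.TemporaryDirectory() as mnt:
    os.mkdir(os.path.join(mnt, ".snapshot"))
    with open(os.path.join(mnt, "volume-1"), "wb") as f:
        f.write(b"x" * 100)
    with open(os.path.join(mnt, ".snapshot", "old"), "wb") as f:
        f.write(b"x" * 50)
    demo_total = used_bytes(mnt)

print(demo_total)
```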

2017-09-09 21:33:28.066 154678 WARNING oslo_reports.guru_meditation_report [-] Guru meditation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports.
2017-09-09 21:33:28.500 154678 WARNING cinder.keymgr.conf_key_mgr [req-ba7d370f-a96c-4b3f-95fa-c6234277766e - - - - -] This key manager is insecure and is not recommended for production deployments
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume [req-ba7d370f-a96c-4b3f-95fa-c6234277766e - - - - -] Volume service controller2@nfs failed to start.
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume Traceback (most recent call last):
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/cinder/cmd/volume.py", line 99, in main
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume cluster=cluster)
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/cinder/service.py", line 382, in create
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume cluster=cluster)
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/cinder/service.py", line 202, in __init__
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume *args, **kwargs)
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/cinder/volume/manager.py", line 242, in __init__
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume active_backend_id=curr_active_backend_id)
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/oslo_utils/importutils.py", line 44, in import_object
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume return import_class(import_str)(*args, **kwargs)
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/netapp/common.py", line 75, in __new__
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume na_utils.check_flags(NetAppDriver.REQUIRED_FLAGS, config)
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/netapp/utils.py", line 79, in check_flags
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume raise exception.InvalidInput(reason=msg)
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume InvalidInput: Invalid input received: Configuration value netapp_storage_protocol is not set.
2017-09-09 21:33:28.513 154678 ERROR cinder.cmd.volume
2017-09-09 21:33:28.517 154678 ERROR cinder.cmd.volume [req-ba7d370f-a96c-4b3f-95fa-c6234277766e - - - - -] No volume service(s) started successfully, terminating.
2017-09-09 21:33:30.401 154691 WARNING oslo_reports.guru_meditation_report [-] Guru meditation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports.
2017-09-09 21:33:33.308 154691 WARNING cinder.keymgr.conf_key_mgr [req-44d8acf3-246c-4efb-aaaf-00d092a68f40 - - - - -] This key manager is insecure and is not recommended for production deployments
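
The second traceback fails in `check_flags` because `netapp_storage_protocol` is not set. A small pre-flight check along the same lines might look like the sketch below. The list of required option names is an assumption based on that error message and the NetApp driver documentation, and `missing_flags` is a hypothetical helper, not the driver's own function.

```python
# Assumed list of mandatory options; adjust to what your driver version requires
REQUIRED = ("netapp_storage_family", "netapp_storage_protocol")


def missing_flags(options, required=REQUIRED):
    """Return the required option names that are absent or empty."""
    return [name for name in required if not options.get(name)]


# Backend section as it looked when the service refused to start
backend = {
    "volume_driver": "cinder.volume.drivers.netapp.common.NetAppDriver",
    "netapp_storage_family": "ontap_7mode",
}

print(missing_flags(backend))
```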

The working configuration:

[netapp_nfs]
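volume_backend_name = netapp_nfs
volume_driver = cinder.volume.drivers.netapp.common.NetAppDriver
netapp_storage_family = ontap_7mode          //NetApp's product line currently has two operating modes
netapp_storage_protocol = nfs                        //Protocol in use
netapp_server_hostname = sqmgtvm02         //Define the hostname-to-IP mapping in /etc/hosts (the 172.16.X.X address used for the NFS export gave an authentication error; switching to the NetApp management address fixed it)
netapp_server_port = 80
netapp_transport_type = http                          //Transport: both https and http are supported; I used http here (https is more involved, see the reference link below)
netapp_login = root                                         //Login user; needs administrator privileges, i.e. the account and password used to log in to OnCommand
netapp_password = netappxxx                       //Login password
#netapp_vserver = svm_name                       //Not verified; judging from the official docs it should be sqmgtvm02
nfs_shares_config = /etc/cinder/nfs_shares   //File listing the NetApp NFS exports; showmount -e 172.16.5.xxx shows what the storage exports
nfs_mount_point_base = $state_path/mnt     //Base directory for local mounts; with this setting the share is mounted at /var/lib/cinder/mnt/6ff41da189e9ce5bfc54af3394adbcd8
#max_over_subscription_ratio = 1.0              //Presumably the disk over-subscription ratio
#reserved_percentage = 5                             //Percentage of space held in reserve so the volume never fills completely; Ceph has a similar option, and the reserved space can be released for emergency deletes or migrations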
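
The 32-character directory name under `nfs_mount_point_base` is not random. To my understanding, the remote filesystem code derives it from an MD5 hash of the share string; treat that as an assumption and verify it against your Cinder version. The sketch below shows the idea, with `mount_point` as a hypothetical helper and the share address taken from this post.

```python
import hashlib
import os


def mount_point(share, base="/var/lib/cinder/mnt"):
    """Build the expected local mount path for a share (assumes MD5-of-share naming)."""
    digest = hashlib.md5(share.encode("utf-8")).hexdigest()
    return os.path.join(base, digest)


path = mount_point("172.16.5.242:/vol/sqmgtvm02")
print(path)
```

Comparing the printed path with the output of `mount` is a quick way to tell which line of the shares file a given mount directory belongs to.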

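My troubleshooting journey, pitfalls included:

Too young, too simple

After installing Cinder I assumed it would be as simple as mounting NFS. With the incorrect configuration above the share mounted, I could create volumes and attach them successfully, a quick dd test looked perfect, and I was ready to hand it over.

Wasting away over it

I restarted services on and off and tweaked all sorts of parameters, but cinder-volume still never stayed up for long.

1. Checked whether time and time zone were in sync on every node. The NetApp clock was off by a wide margin, and I nearly adjusted it.

2. Checked the NetApp configuration and found NFS v3 enabled, so I set nfs_mount_options = v3 in [nfs]. By default the client tries v4.1, then v4.0, then v3 in turn; in production, pin the option to the NFS version the backend actually serves.

3. A search turned up the content in the screenshot below, and it clicked: big international vendors like Cisco heavily customize common protocols, so perhaps NetApp does something similar.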

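Source: http://www.cnblogs.com/liaojiafa/p/6392684.html

Checking confirmed that Cinder does ship NetApp-specific driver code.

Red Hat documentation supports the same conclusion. (Embarrassingly, I found it on Cisco's site.)

Screenshot source: https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/UCS_CVDs/flexpod_openstack_osp6_design.html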
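
Returning to step 2 above: one way to confirm which NFS version a share was actually mounted with is to read the `vers=` option from `/proc/mounts`. The sketch below parses text in that format. The sample line is made up for illustration, and `nfs_versions` is a hypothetical helper.

```python
def nfs_versions(mounts_text):
    """Map each NFS mount point to the value of its vers= mount option."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or not fields[2].startswith("nfs"):
            continue
        opts = dict(o.partition("=")[::2] for o in fields[3].split(","))
        result[fields[1]] = opts.get("vers")
    return result


# Illustrative content; on a real host read open("/proc/mounts").read() instead
SAMPLE = (
    "172.16.5.242:/vol/sqmgtvm02 /var/lib/cinder/mnt/6ff41da189e9ce5bfc54af3394adbcd8 "
    "nfs rw,relatime,vers=3,proto=tcp 0 0\n"
    "/dev/sda1 / xfs rw 0 0\n"
)

versions = nfs_versions(SAMPLE)
print(versions)
```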

 

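Assorted depressing error screenshots:

With a direction to follow, I went through the OpenStack documentation and learned that NetApp storage comes in different families, such as ontap_7mode and ontap_cluster. Our array runs 7-Mode; other NetApp models may need to be checked individually.

Newton: https://docs.openstack.org/newton/config-reference/block-storage/drivers/netapp-volume-driver.html

Ocata: https://docs.openstack.org/ocata/config-reference/block-storage/drivers/netapp-volume-driver.html

After changing the parameters per the documents above and restarting the service, it still went down shortly afterwards. Happiness that arrives too fast is never real.

This time, though, the error message considerately told me the service would soon show as down. Thanks a lot...

The error said the service was not sending heartbeat. A heartbeat implies a coupled call chain that feeds the status monitoring. I found no related option in the OpenStack documentation, so I went to NetApp's official GitHub to look for ideas.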
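
To make the heartbeat idea concrete: a service is shown as down once its last reported heartbeat is older than a threshold. The sketch below models only that comparison. The 60-second threshold is an assumption meant to mirror Cinder's `service_down_time` option, and `is_up` is a hypothetical function, not Cinder's code.

```python
from datetime import datetime, timedelta

# Assumed threshold in seconds, mirroring the service_down_time option
SERVICE_DOWN_TIME = 60


def is_up(last_heartbeat, now, down_after=SERVICE_DOWN_TIME):
    """A service counts as up while its last heartbeat is recent enough."""
    return (now - last_heartbeat) <= timedelta(seconds=down_after)


now = datetime(2017, 9, 9, 21, 40, 0)
print(is_up(now - timedelta(seconds=10), now))
print(is_up(now - timedelta(minutes=10), now))
```

When the driver stops reporting because it cannot talk to the storage, the heartbeat goes stale and the State column flips to down even though the process is still running.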

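Official configuration docs: http://netapp.io/openstack/

GitHub: https://github.com/NetApp/cinder

NFS 7-Mode reference: http://netapp.github.io/openstack-deploy-ops-guide/liberty/content/cinder.7mode.nfs.configuration.html

netapp_transport_type = http is marked Required, yet none of the examples include it. After setting it the fault persisted, and the service still went down shortly after starting.

Links found while searching for the error: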

http://community.netapp.com/t5/OpenStack-Discussions/Cinder-driver-netapp-problem-KILO-Release/td-p/115209

https://community.netapp.com/t5/OpenStack-Discussions/cinder-iscsi-driver-initialization-failed/td-p/131503

https://platform9.com/support/openstack-cinder-integration-with-netapp-cluster-nfs/

https://review.openstack.org/#/c/499148/

https://bugs.launchpad.net/cinder/+bug/1660870

https://bugs.launchpad.net/cinder/+bug/1705738

https://bugs.launchpad.net/cinder/+bug/1694579

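The XML errors in the NetApp log pointed to an HTTP authentication failure. After ruling out the account and password, I found that SSL is enabled by default on the NetApp. With SSL disabled, authentication succeeded, and the storage controller showed OpenStack Cinder connecting normally.

This step is questionable: re-enabling SSL afterwards did not produce a failure, so it still needs testing.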
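
Before changing SSL settings on the controller, it can help to see which management ports accept connections at all. The sketch below is a plain TCP reachability probe. It does not authenticate or speak the NetApp API, and `port_open` is a hypothetical helper.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False
```

For the setup in this post one would call `port_open("sqmgtvm02", 80)` and `port_open("sqmgtvm02", 443)` from the Cinder node and set `netapp_server_port` and `netapp_transport_type` to match whichever port responds.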

For HTTPS authentication, this article is a very good reference:

http://netapp.io/2017/02/15/use-certificate-verification-netapp-ontap-openstack-cinder-driver/

Error 2:

 

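Problems encountered in use:

I recommend disabling nfs_sparsed_volumes = True in production and never changing it casually. Note the semantics: with True, volumes are created as sparse files that take no space up front, so the share can be over-committed; with False, each volume is a regular, fully allocated file that consumes its full size immediately and takes longer to create.

Once Cinder works, every instance launch first consumes a storage volume of the selected size. Setting "Create New Volume" to No in the launch dialog skips creating the volume for the time being.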

Original post: https://www.cnblogs.com/nakky/p/7501440.html