Handling Ceph OSD startup failures, and normal start/stop operations

Machine role: host for CloudStack virtual machines; also a Ceph storage node.

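Incident: a physical Ceph storage machine had faulty memory and had to be powered off for a replacement. The only preparation was to migrate the virtual machines off that host and put it into maintenance mode; it was then shut down directly. After the reboot, Ceph was in an abnormal state.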

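Cause: because of the abrupt shutdown, the Ceph daemons were not stopped cleanly, and their state (pid files and similar) was not synced to the file system.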
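For contrast, a clean stop before maintenance could look roughly like the sketch below. It is not taken from the original incident: it assumes the sysvinit script /etc/init.d/ceph used elsewhere in this post and a working ceph CLI, and the helper names run, prepare_maintenance and finish_maintenance are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only, not from the original article. Assumes the sysvinit script
# /etc/init.d/ceph and a working `ceph` CLI on the storage host.

# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on the storage host before powering it off.
prepare_maintenance() {
    # noout keeps the cluster from marking the stopped OSDs out and rebalancing.
    run ceph osd set noout &&
    # Stop every OSD daemon on this host so each one shuts down cleanly.
    run /etc/init.d/ceph stop osd
}

# Run on the storage host after it is back up.
finish_maintenance() {
    # Start every OSD daemon on this host.
    run /etc/init.d/ceph start osd &&
    # Restore normal out-marking once the OSDs have rejoined.
    run ceph osd unset noout
}
```

Calling `DRY_RUN=1 prepare_maintenance` prints the commands without touching the cluster, which is a cheap way to review the sequence before running it for real.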

Symptoms and attempted fixes:

1) Check the overall OSD status

[root@haha1 ~]# ceph osd tree

ID WEIGHT    TYPE NAME             UP/DOWN REWEIGHT PRIMARY-AFFINITY

-1 101.91998 root default                                           

-2  25.48000     host haha-50                                  

 1   3.64000         osd.1              up  1.00000          1.00000

 2   3.64000         osd.2              up  1.00000          1.00000

 3   3.64000         osd.3              up  1.00000          1.00000

 4   3.64000         osd.4              up  1.00000          1.00000

 5   3.64000         osd.5              up  1.00000          1.00000

 6   3.64000         osd.6              up  1.00000          1.00000

 0   3.64000         osd.0              up  1.00000          1.00000

-3  25.48000     host XKDHhost1-51                                  

 7   3.64000         osd.7              up  1.00000          1.00000

 9   3.64000         osd.9              up  1.00000          1.00000

10   3.64000         osd.10           down        0          1.00000

11   3.64000         osd.11           down        0          1.00000

12   3.64000         osd.12             up  1.00000          1.00000

13   3.64000         osd.13             up  1.00000          1.00000 
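On a larger cluster it can help to pull only the down OSDs out of this listing. The sketch below assumes the column layout shown above (ID, WEIGHT, name, UP/DOWN, ...); other Ceph releases print the tree differently. The function name list_down_osds is hypothetical.

```shell
# Hypothetical helper: read `ceph osd tree` output on stdin and print the
# names of the OSDs marked down. Assumes the name is in column 3 and the
# up/down state is in column 4, as in the listing above.
list_down_osds() {
    awk '$3 ~ /^osd\.[0-9]+$/ && $4 == "down" { print $3 }'
}
```

Usage: `ceph osd tree | list_down_osds`. Against the listing above this would print osd.10 and osd.11.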

2) The OSD is shown as down, but the init script reports it as running:

[root@haha1 ~]# /etc/init.d/ceph status osd.11

=== osd.11 ===

osd.11: running {"version":"0.94.2"}

 

3) Try restarting osd.11 to fix it

[root@haha1 ~]# /etc/init.d/ceph restart osd.11

=== osd.11 ===

=== osd.11 ===

Stopping Ceph osd.11 on haha1...kill 7330...kill 7330...done # the kill went through, so the OSD can be restarted normally

=== osd.11 ===

create-or-move updated item name 'osd.11' weight 3.64 at location {host=XKDHhost1-51,root=default} to crush map

Starting Ceph osd.11 on haha1...

Running as unit run-35058.service.

 

4) osd.10 fails to start

[root@haha1 ~]# /etc/init.d/ceph start osd.10

=== osd.10 ===

create-or-move updated item name 'osd.10' weight 3.64 at location {host=haha1,root=default} to crush map

Starting Ceph osd.10 on haha1...

Running as unit run-36525.service.

[root@haha1 ~]# /etc/init.d/ceph status osd.10

=== osd.10 ===

osd.10: not running.
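Since the cause above points at leftover state such as pid files, one possible next step is to check whether the pid file for osd.10 refers to a live process. This is a sketch, not something the original troubleshooting did. The function name pid_file_state is hypothetical, and the paths in the usage note are the usual defaults, assumed here; ceph.conf may override them.

```shell
# Hypothetical helper: classify a pid file as "missing", "stale" or "alive".
# Empty or non-numeric content is treated as stale. Run as root: kill -0 also
# fails for processes owned by another user, which would be reported as stale.
pid_file_state() {
    local pid_file="$1" pid
    if [ ! -f "$pid_file" ]; then
        echo missing
        return
    fi
    pid="$(tr -d '[:space:]' < "$pid_file")"
    case "$pid" in
        ''|*[!0-9]*) echo stale; return ;;
    esac
    # kill -0 sends no signal; it only tests whether the process exists.
    if kill -0 "$pid" 2>/dev/null; then
        echo alive
    else
        echo stale
    fi
}
```

Usage, with assumed default paths: `pid_file_state /var/run/ceph/osd.10.pid`, followed by `tail -n 50 /var/log/ceph/ceph-osd.10.log` to see why the daemon exited right after starting.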


Original article: https://www.cnblogs.com/hixiaowei/p/8327724.html