Ceph Highly Available Distributed Storage Cluster 06 - Highly Available Erasure-Coded Object Storage in Practice: Virtual Failure Domains

Network setup
# ip a 
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether fa:16:3e:83:fc:7e brd ff:ff:ff:ff:ff:ff
    inet 10.2.110.120/24 brd 10.2.110.255 scope global dynamic eth0
       valid_lft 79948sec preferred_lft 79948sec
    inet6 fe80::f816:3eff:fe83:fc7e/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc mq state UP group default qlen 1000
    link/ether fa:16:3e:7d:3c:56 brd ff:ff:ff:ff:ff:ff
    inet 172.16.88.120/24 brd 172.16.88.255 scope global dynamic eth1
       valid_lft 65007sec preferred_lft 65007sec
    inet6 fe80::f816:3eff:fe7d:3c56/64 scope link
       valid_lft forever preferred_lft forever
 
Machine placement: we have 6 racks, with 2 machines in each rack.
10.2.110.120 ceph-osd-120
10.2.110.126 ceph-osd-126
 
10.2.110.121 ceph-osd-121
10.2.110.127 ceph-osd-127
 
10.2.110.122 ceph-osd-122
10.2.110.128 ceph-osd-128
 
10.2.110.123 ceph-osd-123
10.2.110.129 ceph-osd-129
 
10.2.110.124 ceph-osd-124
10.2.110.130 ceph-osd-130
 
10.2.110.125 ceph-osd-125
10.2.110.131 ceph-osd-131
 
Erasure-code high-availability design:
 
Since we are using erasure coding, we want K=4, M=2.
We then add an intermediate virtual-domain layer, rep-region, and inside each virtual domain we define the failure domain osd-region.
Virtual domain 1 holds ceph-osd-120 ceph-osd-121 ceph-osd-122 ceph-osd-123 ceph-osd-124 ceph-osd-125
Virtual domain 2 holds ceph-osd-126 ceph-osd-127 ceph-osd-128 ceph-osd-129 ceph-osd-130 ceph-osd-131
For now each failure domain holds a single machine. As the cluster grows, each failure domain can hold 2 to 4 machines, and once a failure domain exceeds 4 machines we add more virtual domains.
With this layout the risk stays very small even though the cluster uses erasure coding.
In theory, with the 6 racks fully populated with DELL R510s (12 machines per rack), we would form 3 virtual domains and 3*6=18 failure domains. With erasure coding K=4, M=2, K+M=6, so the Ceph cluster stays healthy even if 2 racks run into problems.
 
---------------------------------------------------------------------------------------------------------
Initialize all nodes
yum install -y vim net-tools wget lrzsz deltarpm tree screen lsof tcpdump nmap sysstat iftop
yum install epel-release -y
 
Add the Ceph yum repository
# cat /etc/yum.repos.d/ceph.repo
[Ceph]
name=Ceph packages for $basearch
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
 
[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
 
[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
Edit /etc/hosts
# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.2.110.120 ceph-osd-120
10.2.110.121 ceph-osd-121
10.2.110.122 ceph-osd-122
10.2.110.123 ceph-osd-123
10.2.110.124 ceph-osd-124
10.2.110.125 ceph-osd-125
10.2.110.126 ceph-osd-126
10.2.110.127 ceph-osd-127
10.2.110.128 ceph-osd-128
10.2.110.129 ceph-osd-129
10.2.110.130 ceph-osd-130
10.2.110.131 ceph-osd-131
10.2.110.132 ceph-osd-132
 
Set the hostname on every host, using 10.2.110.120 as the example:
hostnamectl set-hostname ceph-osd-120
 
---------------------------------------------------------------------------------------------------
All of the following commands are run on the Ceph deployment node.
 
Generate an SSH key
ssh-keygen
Copy the key to all nodes
ssh-copy-id ceph-osd-120
ssh-copy-id ceph-osd-121
ssh-copy-id ceph-osd-122
ssh-copy-id ceph-osd-123
ssh-copy-id ceph-osd-124
ssh-copy-id ceph-osd-125
ssh-copy-id ceph-osd-126
ssh-copy-id ceph-osd-127
ssh-copy-id ceph-osd-128
ssh-copy-id ceph-osd-129
ssh-copy-id ceph-osd-130
ssh-copy-id ceph-osd-131
ssh-copy-id ceph-osd-132
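The thirteen ssh-copy-id calls can also be written as a loop in the same style as the install loops used later (a small convenience sketch; each host still prompts for its password unless passwordless auth is already in place):
for i in `seq 120 132`; do ssh-copy-id ceph-osd-${i}; done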
 
Install ceph-deploy
yum install ceph-deploy  python-setuptools python2-subprocess32 -y
 
Generate the ceph.conf file
mkdir -pv ceph-cluster
cd ceph-cluster/
ceph-deploy new  ceph-osd-120 ceph-osd-121 ceph-osd-122
 
Edit the configuration file
# vim ceph.conf   
[global]
fsid = 5b833752-2922-41b5-a3d4-f654a995af04
mon_initial_members = ceph-osd-120, ceph-osd-121, ceph-osd-122
mon_host = 10.2.110.120,10.2.110.121,10.2.110.122
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
 
mon clock drift allowed = 2
mon clock drift warn backoff = 30
 
public_network = 10.2.110.0/24
cluster_network = 172.16.88.0/24
 
max_open_files = 131072
mon_pg_warn_max_per_osd = 1000
mon_max_pg_per_osd = 1000
osd pool default pg num = 64
osd pool default pgp num = 64
osd pool default size = 3
osd pool default min size = 1
 
mon_osd_full_ratio = .95
mon_osd_nearfull_ratio = .80
osd_deep_scrub_randomize_ratio = 0.01
 
[mon]
mon_allow_pool_delete = true
mon_osd_down_out_interval = 600
mon_osd_min_down_reporters = 3
[mgr]
mgr modules = dashboard
[osd]
osd_journal_size = 20480
osd_max_write_size = 1024
osd_recovery_op_priority = 1
osd_recovery_max_active = 1
osd_recovery_max_single_start = 1
osd_recovery_threads = 1
osd_recovery_max_chunk = 1048576
osd_max_backfills = 1
osd_scrub_begin_hour = 22
osd_scrub_end_hour = 7
osd_recovery_sleep = 0
osd crush update on start = false
 
[client]
rbd_cache = true
rbd_cache_writethrough_until_flush = true
rbd_concurrent_management_ops = 10
rbd_cache_size = 67108864
rbd_cache_max_dirty = 50331648
rbd_cache_target_dirty = 33554432
rbd_cache_max_dirty_age = 2
rbd_default_format = 2
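Note (not part of the original steps): whenever ceph.conf is edited again after deployment, the updated file has to be redistributed. ceph-deploy can push it to every node; run this only after the packages below are installed so that /etc/ceph already exists on each target:
for i in `seq 120 132`; do ceph-deploy --overwrite-conf config push ceph-osd-${i}; done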
 
Install the Ceph packages on all nodes
for i in `seq 120 132` ;do ceph-deploy install  --no-adjust-repos   ceph-osd-${i};done
Bootstrap the initial monitors
ceph-deploy mon create-initial
Distribute the ceph admin keyring
for i in `seq 120 132` ;do ceph-deploy admin  ceph-osd-${i};done
Create the managers
ceph-deploy mgr create  ceph-osd-120 ceph-osd-121 ceph-osd-122
Add the first batch of OSDs
for i in `seq 120 125`;do for x in {b,c} ;do ceph-deploy osd create --data /dev/vd${x} ceph-osd-${i};done;done
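At this point a quick sanity check is worth doing before touching the CRUSH map (standard commands, not from the original article). Because the configuration sets osd crush update on start = false, the new OSDs come up but are not automatically placed under any host or region bucket, so the tree will look sparse until the manual CRUSH work below is done:
ceph -s
ceph osd tree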
 
-----------------------------------------------------------------------------------------------------
Modify the CRUSH map
ceph osd getcrushmap -o crushmap
crushtool -d crushmap -o crushmap.txt
 
Add two bucket types (osd-region, rep-region) to crushmap.txt
# types
type 0 osd
type 1 host
type 2 chassis
type 3 osd-region
type 4 rep-region
type 5 rack
type 6 row
type 7 pdu
type 8 pod
type 9 room
type 10 datacenter
type 11 zone
type 12 region
type 13 root
 
Modify/add the rules in crushmap.txt. There are two rules here: one replicated rule and one erasure-code rule.
# rules
rule replicated_rule {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step choose firstn 1 type rep-region
        step choose firstn 0 type osd-region
        step choose firstn 1 type osd
        step emit
}
 
rule erasure-code {
        id 1
        type erasure
        min_size 3
        max_size 20
        step set_chooseleaf_tries 5
        step set_choose_tries 150
        step take default
        step choose indep 1 type rep-region
        step choose indep 0 type osd-region
        step choose firstn 1 type osd
        step emit
}
 
crushtool -c crushmap.txt -o crushmap.bin
ceph osd setcrushmap -i crushmap.bin
 
---------------------------------------------------------------------------------------------------------------------------------------------
 
Because osd crush update on start = false was added to the configuration file, we have to run the following commands ourselves to shape the CRUSH map the way we want.
ceph osd crush add-bucket rep01 rep-region
ceph osd crush add-bucket rep02 rep-region
 
ceph osd crush add-bucket osdreg01 osd-region
ceph osd crush add-bucket osdreg02 osd-region
ceph osd crush add-bucket osdreg03 osd-region
ceph osd crush add-bucket osdreg04 osd-region
ceph osd crush add-bucket osdreg05 osd-region
ceph osd crush add-bucket osdreg06 osd-region
 
ceph osd crush add-bucket osdreg07 osd-region
ceph osd crush add-bucket osdreg08 osd-region
ceph osd crush add-bucket osdreg09 osd-region
ceph osd crush add-bucket osdreg10 osd-region
ceph osd crush add-bucket osdreg11 osd-region
ceph osd crush add-bucket osdreg12 osd-region
 
ceph osd crush move osdreg01 rep-region=rep01
ceph osd crush move osdreg02 rep-region=rep01
ceph osd crush move osdreg03 rep-region=rep01
ceph osd crush move osdreg04 rep-region=rep01
ceph osd crush move osdreg05 rep-region=rep01
ceph osd crush move osdreg06 rep-region=rep01
 
ceph osd crush move osdreg07 rep-region=rep02
ceph osd crush move osdreg08 rep-region=rep02
ceph osd crush move osdreg09 rep-region=rep02
ceph osd crush move osdreg10 rep-region=rep02
ceph osd crush move osdreg11 rep-region=rep02
ceph osd crush move osdreg12 rep-region=rep02
 
ceph osd crush move rep01 root=default
ceph osd crush move rep02 root=default
 
ceph osd crush move osd.0 osd-region=osdreg01
ceph osd crush move osd.1 osd-region=osdreg01
ceph osd crush move osd.2 osd-region=osdreg02
ceph osd crush move osd.3 osd-region=osdreg02
ceph osd crush move osd.4 osd-region=osdreg03
ceph osd crush move osd.5 osd-region=osdreg03
ceph osd crush move osd.6 osd-region=osdreg04
ceph osd crush move osd.7 osd-region=osdreg04
ceph osd crush move osd.8 osd-region=osdreg05
ceph osd crush move osd.9 osd-region=osdreg05
ceph osd crush move osd.10 osd-region=osdreg06
ceph osd crush move osd.11 osd-region=osdreg06
 
ceph osd crush reweight osd.0  0.03999 
ceph osd crush reweight osd.1  0.03999    
ceph osd crush reweight osd.2  0.03999     
ceph osd crush reweight osd.3  0.03999 
ceph osd crush reweight osd.4  0.03999   
ceph osd crush reweight osd.5  0.03999     
ceph osd crush reweight osd.6  0.03999    
ceph osd crush reweight osd.7  0.03999  
ceph osd crush reweight osd.8  0.03999     
ceph osd crush reweight osd.9  0.03999    
ceph osd crush reweight osd.10  0.03999    
ceph osd crush reweight osd.11  0.03999  
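The repetitive move and reweight commands above can also be generated with small loops. The sketch below assumes the OSD IDs were assigned in creation order (osd.0/osd.1 on ceph-osd-120, osd.2/osd.3 on ceph-osd-121, and so on); check the real IDs with ceph osd tree before running anything like this:
for n in `seq 0 5`; do reg=$(printf "osdreg%02d" $((n+1))); ceph osd crush move osd.$((2*n)) osd-region=${reg}; ceph osd crush move osd.$((2*n+1)) osd-region=${reg}; done
for i in `seq 0 11`; do ceph osd crush reweight osd.${i} 0.03999; done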
Check the result
# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME                   STATUS REWEIGHT PRI-AFF
-1       0.47974 root default                                        
-3       0.47974     rep-region rep01                                
-5       0.07996         osd-region osdreg01                         
  0   hdd 0.03998             osd.0               up  1.00000 1.00000
  1   hdd 0.03998             osd.1               up  1.00000 1.00000
-6       0.07996         osd-region osdreg02                         
  2   hdd 0.03998             osd.2               up  1.00000 1.00000
  3   hdd 0.03998             osd.3               up  1.00000 1.00000
-7       0.07996         osd-region osdreg03                         
  4   hdd 0.03998             osd.4               up  1.00000 1.00000
  5   hdd 0.03998             osd.5               up  1.00000 1.00000
-8       0.07996         osd-region osdreg04                         
  6   hdd 0.03998             osd.6               up  1.00000 1.00000
  7   hdd 0.03998             osd.7               up  1.00000 1.00000
-9       0.07996         osd-region osdreg05                         
  8   hdd 0.03998             osd.8               up  1.00000 1.00000
  9   hdd 0.03998             osd.9               up  1.00000 1.00000
-10       0.07996         osd-region osdreg06                         
10   hdd 0.03998             osd.10              up  1.00000 1.00000
11   hdd 0.03998             osd.11              up  1.00000 1.00000
-4             0     rep-region rep02                                
-19             0         osd-region osdreg07                         
-20             0         osd-region osdreg08                         
-21             0         osd-region osdreg09                         
-22             0         osd-region osdreg10                         
-23             0         osd-region osdreg11                         
-24             0         osd-region osdreg12
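Before creating the erasure-coded pool, the placement logic can be simulated offline with crushtool (a hedged check that is not part of the original article). Rule 1 is the erasure-code rule defined earlier; with --num-rep 6 every mapping should return 6 OSDs, each in a different osd-region inside a single rep-region, and --show-bad-mappings prints any input that could not be mapped to the full 6 OSDs. At this stage rep02 is still empty, so repeat the test after the remaining OSDs are added if you want both virtual domains exercised:
ceph osd getcrushmap -o crushmap.bin
crushtool -i crushmap.bin --test --rule 1 --num-rep 6 --show-mappings | head
crushtool -i crushmap.bin --test --rule 1 --num-rep 6 --show-bad-mappings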
 
Create the erasure-code profile used by the pool
ceph osd erasure-code-profile set erasurek4m2  k=4 m=2 crush-failure-domain='osd-region' crush-root=default
 
Note: in production, on servers that all have Intel CPUs, the ISA-L plugin is recommended for erasure coding; the command would be:
ceph osd erasure-code-profile set erasurek4m2  k=4 m=2 plugin=isa crush-failure-domain='osd-region' crush-root=default
 
Check the resulting profile
ceph osd erasure-code-profile ls
default
erasurek4m2
ceph osd erasure-code-profile get erasurek4m2
crush-device-class=
crush-failure-domain=osd-region
crush-root=default
jerasure-per-chunk-alignment=false
k=4
m=2
plugin=jerasure
technique=reed_sol_van
w=8
Create the pools needed by object storage. The data pool of the default placement is an erasure-coded pool that uses the erasurek4m2 profile and the erasure-code rule defined above.
ceph osd pool create .rgw.root 64
ceph osd pool create default.rgw.control 64
ceph osd pool create default.rgw.meta 64   
ceph osd pool create default.rgw.log 64
ceph osd pool create default.rgw.buckets.index 64
ceph osd pool create default.rgw.buckets.data 128 128 erasure erasurek4m2 erasure-code
ceph osd pool create default.rgw.buckets.non-ec 128
 
ceph osd pool application enable .rgw.root rgw
ceph osd pool application enable default.rgw.control rgw
ceph osd pool application enable default.rgw.log rgw   
ceph osd pool application enable default.rgw.meta rgw
ceph osd pool application enable default.rgw.buckets.index rgw
ceph osd pool application enable default.rgw.buckets.data rgw
 
The default min_size of a k=4 m=2 erasure-coded pool is 5 (k+1), so we lower it to 4 here so that the cluster can tolerate the loss of more failure domains.
ceph osd pool set default.rgw.buckets.data min_size 4
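The pool's effective erasure-code settings can be confirmed with ceph osd pool get (a quick check; the expected values follow from the commands above: profile erasurek4m2, rule erasure-code, size 6, min_size 4):
ceph osd pool get default.rgw.buckets.data erasure_code_profile
ceph osd pool get default.rgw.buckets.data crush_rule
ceph osd pool get default.rgw.buckets.data size
ceph osd pool get default.rgw.buckets.data min_size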
Note: the full syntax of the pool-create command is:
ceph osd pool create <poolname> <int[0-]> {<int[0-]>} {replicated|erasure} {<erasure_code_profile>} {<rule>} {<int>} {<int>} {<int[0-]>} {<int[0-]>} {<float[0.0-1.0]>}
 
Add a cache tier to the erasure-coded pool
ceph osd pool create default.rgw.buckets.data_cache 64
ceph osd tier add default.rgw.buckets.data default.rgw.buckets.data_cache
ceph osd tier cache-mode default.rgw.buckets.data_cache writeback
ceph osd tier set-overlay default.rgw.buckets.data default.rgw.buckets.data_cache
 
Configure the cache pool to use a bloom filter, and set the hit set count (hit_set_count), the hit set period (hit_set_period), and the maximum amount of cached data (target_max_bytes):
ceph osd pool set default.rgw.buckets.data_cache hit_set_type bloom
ceph osd pool set default.rgw.buckets.data_cache hit_set_count 1
ceph osd pool set default.rgw.buckets.data_cache hit_set_period 3600
ceph osd pool set default.rgw.buckets.data_cache target_max_bytes 107374182400 
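Whether the tiering took effect can be verified from the pool metadata (a hedged check, not in the original article): ceph osd dump lists tier_of, read_tier/write_tier and the cache mode for the pools involved, and ceph osd pool get returns the individual hit-set settings:
ceph osd dump | grep buckets.data
ceph osd pool get default.rgw.buckets.data_cache hit_set_period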
 
Note: use cache tiering with caution. In production, to improve performance and keep costs down, we did not use cache tiering; instead we put default.rgw.buckets.data on HDDs and the other pools on SSDs.
 
Deploy the RGW
ceph-deploy rgw create ceph-osd-120 ceph-osd-121 ceph-osd-122
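Once the gateways are up, each one should answer on the default port 7480; an anonymous request normally returns a small ListAllMyBucketsResult XML document (a quick check, not part of the original steps):
curl -s http://ceph-osd-120:7480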
 
Create a test account
# radosgw-admin user create --uid="test" --display-name="test"      
{
    "user_id": "test",
    "display_name": "test",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "subusers": [],
    "keys": [
        {
            "user": "test",
            "access_key": "47DLTATT3ZS76KPU140J",
            "secret_key": "uhesPGUtaDravGDmprB6sVrJBqCSQvEppLGz5dT4"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "default_storage_class": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}
 
 
Create a test bucket
s3cmd --access_key=47DLTATT3ZS76KPU140J --secret_key=uhesPGUtaDravGDmprB6sVrJBqCSQvEppLGz5dT4  --host=test2.zhazhahui.com:7480 --host-bucket="%(bucket).s3.zhazhahui.com:7480" --no-ssl --no-check-certificate mb s3://test
Bucket 's3://test/' created
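A quick round-trip test with the same credentials and endpoint options (a sketch; /etc/hosts is just an arbitrary small file to upload):
s3cmd --access_key=47DLTATT3ZS76KPU140J --secret_key=uhesPGUtaDravGDmprB6sVrJBqCSQvEppLGz5dT4 --host=test2.zhazhahui.com:7480 --host-bucket="%(bucket).s3.zhazhahui.com:7480" --no-ssl put /etc/hosts s3://test/
s3cmd --access_key=47DLTATT3ZS76KPU140J --secret_key=uhesPGUtaDravGDmprB6sVrJBqCSQvEppLGz5dT4 --host=test2.zhazhahui.com:7480 --host-bucket="%(bucket).s3.zhazhahui.com:7480" --no-ssl ls s3://test/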
 
 
After writing some data into the bucket, let's look at some information
# ceph df
RAW STORAGE:
    CLASS     SIZE       AVAIL      USED       RAW USED     %RAW USED
    hdd       58 TiB     58 TiB     23 GiB       31 GiB          0.05
    TOTAL     58 TiB     58 TiB     23 GiB       31 GiB          0.05
POOLS:
    POOL                               ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    .rgw.root                           1     1.2 KiB           4     768 KiB         0        18 TiB
    default.rgw.control                 2         0 B           8         0 B         0        18 TiB
    default.rgw.meta                    3     1.5 KiB          10     1.5 MiB         0        18 TiB
    default.rgw.log                     4         0 B         175         0 B         0        18 TiB
    default.rgw.buckets.index           5         0 B           2         0 B         0        18 TiB
    default.rgw.buckets.data            6     8.6 GiB       2.69k      13 GiB      0.02        37 TiB
    default.rgw.buckets.non-ec          7        27 B           1     192 KiB         0        18 TiB
    default.rgw.buckets.data_cache      8     3.3 GiB       1.02k      10 GiB      0.02        18 TiB
 
Check the attributes of a specific bucket
#  radosgw-admin bucket stats --bucket=zhazhahui
{
    "bucket": "zhazhahui",
    "num_shards": 0,
    "tenant": "",
    "zonegroup": "6fb3c20a-ee52-45f2-83e8-0cb4b038bbac",
    "placement_rule": "default-placement",
    "explicit_placement": {
        "data_pool": "",
        "data_extra_pool": "",
        "index_pool": ""
    },
    "id": "1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1",
    "marker": "1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1",
    "index_type": "Normal",
    "owner": "qf",
    "ver": "0#33481",
    "master_ver": "0#0",
    "mtime": "2021-03-15 09:33:36.164454Z",
    "max_marker": "0#",
    "usage": {
        "rgw.main": {
            "size": 401363173919,
            "size_actual": 401374625792,
            "size_utilized": 401363173919,
            "size_kb": 391956225,
            "size_kb_actual": 391967408,
            "size_kb_utilized": 391956225,
            "num_objects": 3803
        },
        "rgw.multimeta": {
            "size": 0,
            "size_actual": 0,
            "size_utilized": 216,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 1,
            "num_objects": 8
        }
    },
.
.
.
}
This command shows the bucket name, its placement rule (which determines its data pool and index pool), and the bucket ID.
Check whether the bucket exists in the index pool
#  rados -p default.rgw.buckets.index ls | grep 1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1
.dir.1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1
Note: you must prefix the bucket ID with .dir. to get its index object name in the index pool.
List the keys recorded in the corresponding index object
# rados -p default.rgw.buckets.index listomapkeys .dir.1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1
34123.md5
34124.apk
34124.md5
34125.apk
34125.md5
34130.apk
34130.md5
34131.apk
34131.md5
34132.apk
34132.md5
34133.apk
34133.md5
34134.apk
34134.md5
34135.apk
34135.md5
34136.apk
34136.md5
34144.apk
34144.md5
34147.apk
34147.md5
34149.apk
34149.md5
34150.apk
34150.md5
34151.apk
34151.md5
34152.md5
Count the number of objects
# rados -p default.rgw.buckets.index listomapkeys .dir.1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1 | wc -l
3811
Find the physical location of the index object
# ceph osd map default.rgw.buckets.index .dir.1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1
osdmap e250 pool 'default.rgw.buckets.index' (5) object '.dir.1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1' -> pg 5.31e62451 (5.11) -> up ([14,16,20], p14) acting ([14,16,20], p14)
From this output we can see that the index of bucket zhazhahui lands on OSDs 14, 16 and 20, with osd.14 as the primary OSD.
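To confirm which host and osd-region each of those OSDs belongs to, ceph osd find prints the CRUSH location (a small sketch using the OSD IDs from the output above):
for o in 14 16 20; do ceph osd find ${o}; done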
 
So how is the data distributed inside the erasure-coded pool?
rados -p default.rgw.buckets.data ls
1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1_31250.md5
1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1_33077.md5
1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1_31879.apk
ceph osd map default.rgw.buckets.data 1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1_31879.apk
osdmap e250 pool 'default.rgw.buckets.data' (8) object '1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1_31879.apk' -> pg 8.8484bfff (8.7f) -> up ([20,13,14,17,19], p20) acting ([20,13,14,17,19,7], p20)
 
 
We can see that different objects are distributed evenly by the CRUSH algorithm across the different virtual domains.
# ceph osd map default.rgw.buckets.data .dir.1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1/28640.apk
osdmap e252 pool 'default.rgw.buckets.data' (8) object '.dir.1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1/28640.apk' -> pg 8.c2a33311 (8.11) -> up ([5,9,7,3,1,10], p5) acting ([5,9,7,3,1,10], p5)
 
# ceph osd map default.rgw.buckets.data .dir.1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1/27833.apk
osdmap e252 pool 'default.rgw.buckets.data' (8) object '.dir.1c1013dc-c276-41b3-bdcb-aed676ea064c.5619.1/27833.apk' -> pg 8.bab507d5 (8.55) -> up ([18,16,13,15,20], p18) acting ([18,16,13,15,20,2], p18)
 
Final layout
# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME                   STATUS REWEIGHT PRI-AFF
-1       0.87978 root default                                        
-3       0.48000     rep-region rep01                                
-5       0.07999         osd-region osdreg01                         
  0   hdd 0.03999             osd.0               up  1.00000 1.00000
  1   hdd 0.03999             osd.1               up  1.00000 1.00000
-6       0.07999         osd-region osdreg02                         
  2   hdd 0.03999             osd.2               up  1.00000 1.00000
  3   hdd 0.03999             osd.3               up  1.00000 1.00000
-7       0.07999         osd-region osdreg03                         
  4   hdd 0.03999             osd.4               up  1.00000 1.00000
  5   hdd 0.03999             osd.5               up  1.00000 1.00000
-8       0.07999         osd-region osdreg04                         
  6   hdd 0.03999             osd.6               up  1.00000 1.00000
  7   hdd 0.03999             osd.7               up  1.00000 1.00000
-9       0.07999         osd-region osdreg05                         
  8   hdd 0.03999             osd.8               up  1.00000 1.00000
  9   hdd 0.03999             osd.9               up  1.00000 1.00000
-10       0.07999         osd-region osdreg06                         
10   hdd 0.03999             osd.10              up  1.00000 1.00000
11   hdd 0.03999             osd.11              up  1.00000 1.00000
-4       0.39978     rep-region rep02                                
-19       0.07996         osd-region osdreg07                         
12   hdd 0.03998             osd.12              up  1.00000 1.00000
13   hdd 0.03998             osd.13              up  1.00000 1.00000
-20       0.07996         osd-region osdreg08                         
14   hdd 0.03998             osd.14              up  1.00000 1.00000
15   hdd 0.03998             osd.15              up  1.00000 1.00000
-21       0.07996         osd-region osdreg09                         
16   hdd 0.03998             osd.16              up  1.00000 1.00000
17   hdd 0.03998             osd.17              up  1.00000 1.00000
-22             0         osd-region osdreg10                         
-23       0.07996         osd-region osdreg11                         
18   hdd 0.03998             osd.18              up  1.00000 1.00000
19   hdd 0.03998             osd.19              up  1.00000 1.00000
-24       0.07996         osd-region osdreg12                         
20   hdd 0.03998             osd.20              up  1.00000 1.00000
21   hdd 0.03998             osd.21              up  1.00000 1.00000
 
 
Summary: in production, M=2 still carries some risk; M=3 is better. NetEase reportedly runs K=8, M=3. Our production cluster currently runs K=5, M=2; once the cluster grows we can increase M to reduce the risk.
 
Author: Dexter_Wang   Role: senior cloud computing and storage engineer at an Internet company   Contact: 993852246@qq.com
 
 
Original article: https://www.cnblogs.com/dexter-wang/p/14962400.html