Setting up a highly available Redis environment

1. Environment preparation

My environment: six machines running Fedora 25 Server (64-bit):

192.168.10.204 

192.168.10.205  

192.168.10.206

192.168.10.203

192.168.10.207

192.168.10.208

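Redis version: 3.2.9

2. Installing a single Redis instance

For convenience, I do everything as the root user.

Official download page: https://redis.io/download

I install under /usr/local/redis, following the steps from the official site: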

cd /usr/local
mkdir redis
cd redis
wget http://download.redis.io/releases/redis-3.2.9.tar.gz
tar xzf redis-3.2.9.tar.gz
cd redis-3.2.9
make

Running make failed with two errors:

cd src && make all
make[1]: Entering directory '/usr/local/redis/redis-3.2.9/src'
rm -rf redis-server redis-sentinel redis-cli redis-benchmark redis-check-rdb redis-check-aof *.o *.gcda *.gcno *.gcov redis.info lcov-html
(cd ../deps && make distclean)
make[2]: Entering directory '/usr/local/redis/redis-3.2.9/deps'
(cd hiredis && make clean) > /dev/null || true
(cd linenoise && make clean) > /dev/null || true
(cd lua && make clean) > /dev/null || true
(cd geohash-int && make clean) > /dev/null || true
(cd jemalloc && [ -f Makefile ] && make distclean) > /dev/null || true
(rm -f .make-*)
make[2]: Leaving directory '/usr/local/redis/redis-3.2.9/deps'
(rm -f .make-*)
echo STD=-std=c99 -pedantic -DREDIS_STATIC='' >> .make-settings
echo WARN=-Wall -W >> .make-settings
echo OPT=-O2 >> .make-settings
echo MALLOC=jemalloc >> .make-settings
echo CFLAGS= >> .make-settings
echo LDFLAGS= >> .make-settings
echo REDIS_CFLAGS= >> .make-settings
echo REDIS_LDFLAGS= >> .make-settings
echo PREV_FINAL_CFLAGS=-std=c99 -pedantic -DREDIS_STATIC='' -Wall -W -O2 -g -ggdb   -I../deps/geohash-int -I../deps/hiredis -I../deps/linenoise -I../deps/lua/src -DUSE_JEMALLOC -I../deps/jemalloc/include >> .make-settings
echo PREV_FINAL_LDFLAGS=  -g -ggdb -rdynamic >> .make-settings
(cd ../deps && make hiredis linenoise lua geohash-int jemalloc)
make[2]: Entering directory '/usr/local/redis/redis-3.2.9/deps'
(cd hiredis && make clean) > /dev/null || true
(cd linenoise && make clean) > /dev/null || true
(cd lua && make clean) > /dev/null || true
(cd geohash-int && make clean) > /dev/null || true
(cd jemalloc && [ -f Makefile ] && make distclean) > /dev/null || true
(rm -f .make-*)
(echo "" > .make-cflags)
(echo "" > .make-ldflags)
MAKE hiredis
cd hiredis && make static
make[3]: Entering directory '/usr/local/redis/redis-3.2.9/deps/hiredis'
gcc -std=c99 -pedantic -c -O3 -fPIC  -Wall -W -Wstrict-prototypes -Wwrite-strings -g -ggdb  net.c
make[3]: gcc: Command not found
Makefile:118: recipe for target 'net.o' failed
make[3]: *** [net.o] Error 127
make[3]: Leaving directory '/usr/local/redis/redis-3.2.9/deps/hiredis'
Makefile:46: recipe for target 'hiredis' failed
make[2]: *** [hiredis] Error 2
make[2]: Leaving directory '/usr/local/redis/redis-3.2.9/deps'
Makefile:156: recipe for target 'persist-settings' failed
make[1]: [persist-settings] Error 2 (ignored)
    CC adlist.o
/bin/sh: cc: command not found
Makefile:211: recipe for target 'adlist.o' failed
make[1]: *** [adlist.o] Error 127
make[1]: Leaving directory '/usr/local/redis/redis-3.2.9/src'
Makefile:6: recipe for target 'all' failed
make: *** [all] Error 2

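The first is "gcc: Command not found" (the later "cc: command not found" has the same cause: no C compiler is installed), so install gcc: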

dnf -y install gcc
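
Before running make, it can help to check which build tools are missing in one go. A minimal sketch; the missing_tools helper and the tool list are my own illustration, not part of Redis:

```shell
# Print every tool from the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools the steps in this post rely on; adjust the list to taste.
missing=$(missing_tools gcc make wget tar)
if [ -n "$missing" ]; then
    echo "missing build tools:" $missing
else
    echo "all build tools found"
fi
```

Anything it reports can then be installed with dnf as above.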

Running make again produces another error:

[root@localhost redis-3.2.9]# make
cd src && make all
make[1]: Entering directory '/usr/local/redis/redis-3.2.9/src'
    CC adlist.o
In file included from adlist.c:34:0:
zmalloc.h:50:31: fatal error: jemalloc/jemalloc.h: No such file or directory
 #include <jemalloc/jemalloc.h>
                               ^
compilation terminated.
Makefile:211: recipe for target 'adlist.o' failed
make[1]: *** [adlist.o] Error 1
make[1]: Leaving directory '/usr/local/redis/redis-3.2.9/src'
Makefile:6: recipe for target 'all' failed
make: *** [all] Error 2

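Many people have run into this error, and answers are easy to find online.

The build takes a MALLOC parameter that selects the memory allocator. The official recommendation is jemalloc, which is said to have fewer fragmentation problems than libc malloc. The README.md in the Redis source package has this passage: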

Allocator
---------

Selecting a non-default memory allocator when building Redis is done by setting
the `MALLOC` environment variable. Redis is compiled and linked against libc
malloc by default, with the exception of jemalloc being the default on Linux
systems. This default was picked because jemalloc has proven to have fewer
fragmentation problems than libc malloc.

To force compiling against libc malloc, use:

    % make MALLOC=libc

To compile against jemalloc on Mac OS X systems, use:

    % make MALLOC=jemalloc


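In other words, Redis uses jemalloc for memory management by default on Linux, but the compiler could not find jemalloc/jemalloc.h. Redis does bundle jemalloc under deps/, and the first make aborted before building it, so the make distclean followed by make that the README suggests for stale build state is another way out. I am not very familiar with this library, but since it is the recommended allocator, I installed it separately.

Create the install directory first, then download the source. jemalloc's source is hosted on GitHub (https://github.com/jemalloc/jemalloc), but I could not reach the Amazon-hosted download from my network, so I fetched it from another site, which does not carry the latest 5.0 release.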

cd /usr/local
mkdir jemalloc
cd jemalloc
wget http://www.canonware.com/download/jemalloc/jemalloc-4.2.1.tar.bz2 

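Unpack it:

tar xjf jemalloc-4.2.1.tar.bz2

Next, build it under /usr/local/jemalloc. Enter the unpacked directory and run:

./configure

make && make install

After this finishes, the unpacked directory contains a new lib directory, which shows that the jemalloc library was built successfully.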

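With that installed, switch back to the unpacked Redis directory and run make again, pointing MALLOC at the library location:

make MALLOC=/usr/local/jemalloc/jemalloc-4.2.1/lib

Then wait for the build to finish. Note that the README only documents allocator names such as libc and jemalloc as MALLOC values, so it is worth checking the malloc= field printed by ./src/redis-server -v to see which allocator was actually linked.

Alternatively, if jemalloc feels like too much trouble, tell the build to use libc for memory management: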

make MALLOC=libc

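Either of the two works. The console then fills with build output, and if it finishes without further errors the build succeeded.

That completes the single-machine Redis installation.

Repeat the same steps on each of the other machines.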
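
To confirm which allocator a finished build ended up with, the version line of redis-server can be inspected. A small sketch, assuming the line contains a malloc= field (as 3.2-era builds print); the malloc_of helper and the sample line are illustrative:

```shell
# Extract the value of the malloc= field from a redis-server version line.
malloc_of() {
    for field in $1; do
        case $field in
            malloc=*) echo "${field#malloc=}" ;;
        esac
    done
}

# Illustrative sample line; in practice pass "$(./redis-server -v)" instead.
sample='Redis server v=3.2.9 sha=00000000:0 malloc=libc bits=64 build=0000000000000000'
malloc_of "$sample"
```

Running it from the src directory against the real binary shows whether the build linked jemalloc or libc.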

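3. Building the multi-machine cluster

First, create a directory on every machine to hold the nodes.conf file that each Redis instance in the cluster generates. To keep everything in one place, the redis.conf used to start Redis, the log files, and the data files go there too.

I created the directory on each of the six machines and copied redis.conf from the source directory into it: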

cd /usr/local/redis
mkdir redis-cluster
cd redis-cluster 
cp /usr/local/redis/redis-3.2.9/redis.conf ./

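Edit redis.conf and strip the settings you don't need. For example, the config file on my 204 machine looks like this:

# Disable protected mode (required). The default is yes, which rejects redis-cli and other clients connecting from other hosts when no bind address or password is set
protected-mode no
# Client-facing port (required)
port 7001
# Run as a daemon (optional); similar to appending & to the command line
daemonize yes
# Path of the pid file (optional)
pidfile /var/run/redis_7001.pid

# Enable cluster support (required)
cluster-enabled yes
# Stored in the directory created above (required); Redis generates nodes.conf automatically at startup
cluster-config-file /usr/local/redis/redis-cluster/nodes.conf
# Timeout in milliseconds after which an unreachable node is considered failing (required)
cluster-node-timeout 15000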

Edit the config on the other machines the same way, changing only the port and the pid file path. My ports are 7001, 7002, 7003, 7004, 7005 and 7006.
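
Since only the port and pid path differ between machines, the file can also be generated. A sketch under the assumption that every node uses the same directories as in this post; the make_conf helper is my own name:

```shell
# Print a minimal cluster redis.conf for the given port.
make_conf() {
    port=$1
    cat <<EOF
protected-mode no
port $port
daemonize yes
pidfile /var/run/redis_$port.pid
cluster-enabled yes
cluster-config-file /usr/local/redis/redis-cluster/nodes.conf
cluster-node-timeout 15000
EOF
}

# Preview the file for the node on port 7002; redirect the output to
# /usr/local/redis/redis-cluster/redis.conf on that machine to use it.
make_conf 7002
```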

With the config files in place, start Redis on each machine:

cd /usr/local/redis/redis-3.2.9/src

./redis-server /usr/local/redis/redis-cluster/redis.conf

# Check that it started:
ps -eaf | grep redis

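The process listing differs from a standalone Redis: the entry ends with [cluster] and shows port 7001, which means the instance started in cluster mode: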

ps -eaf | grep redis
root       1246      1  0 00:36 ?        00:00:00 ./redis-server *:7001 [cluster]
root       1250   1083  0 00:37 pts/0    00:00:00 grep --color=auto redis

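Start the other five machines the same way.

Now test whether the port is reachable: from the 208 machine, connect to the server on 204 with the client that ships with Redis:

cd /usr/local/redis/redis-3.2.9/src
# -h is the IP of the Redis server host to connect to, -p is the port that server listens on
./redis-cli -h 192.168.10.204 -p 7001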

The result:

# ./redis-cli -h 192.168.10.204 -p 7001
Could not connect to Redis at 192.168.10.128:6379: No route to host
Could not connect to Redis at 192.168.10.128:6379: No route to host

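So port 7001 on 204 cannot be reached from 208. Check whether ping works:

ping 192.168.10.204

It does. Since protected-mode was already turned off in the config file, my guess was that the firewall on 204 was not allowing port 7001. Run the following on 204 to open it: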

iptables -I INPUT -p tcp --dport 7001 -j ACCEPT

Connecting from the client on 208 to port 7001 on 204 again now works.
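
The same reachability check can be done without redis-cli. A rough sketch that relies on bash's /dev/tcp pseudo-device and the coreutils timeout command, both of which I am assuming are available; port_open is a hypothetical helper name:

```shell
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
# Relies on bash's /dev/tcp pseudo-device and the coreutils timeout command.
port_open() {
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Host and port from this post; run this from another node such as 208.
if port_open 192.168.10.204 7001; then
    echo "7001 reachable"
else
    echo "7001 blocked or not listening"
fi
```

It cannot tell a firewall block from a server that is not running, so it complements ps and ping rather than replacing them.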

Finally, join these nodes into a cluster, again following the official documentation:

Creating the cluster
Now that we have a number of instances running, we need to create our cluster by writing some meaningful configuration to the nodes.
This is very easy to accomplish as we are helped by the Redis Cluster command line utility called redis-trib, a Ruby program executing special commands on instances in order to create new clusters,
check or reshard an existing cluster, and so forth. The redis-trib utility is in the src directory of the Redis source code distribution. You need to install redis gem to be able to run redis-trib.
gem install redis

This says that creating a Redis cluster requires Ruby: redis-trib is a Ruby script that depends on the redis gem. So install Ruby and RubyGems, then the redis gem:

dnf -y install ruby rubygems

gem install redis

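Once that is installed, go to any one of the machines, enter the src directory of the Redis source tree, and run the command below.

According to the official documentation, the 1 after --replicas gives every master one slave: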

./redis-trib.rb create --replicas 1 192.168.10.204:7001 192.168.10.205:7002 192.168.10.206:7003 192.168.10.203:7004 192.168.10.207:7005 192.168.10.208:7006

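The tool prints the proposed layout, which by default makes the first three servers masters.

Type yes and wait for the nodes to join the cluster.

At this point it just kept waiting: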

>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join.....................................................

The cause is that the firewalls on the Redis servers do not allow the port the cluster needs. The official tutorial does indeed have this passage:

Redis Cluster TCP ports
Every Redis Cluster node requires two TCP connections open. The normal Redis TCP port used to serve clients, for example 6379, 
plus the port obtained by adding 10000 to the data port, so 16379 in the example. This second high port is used for the Cluster bus, that is a node-to-node communication channel using a binary protocol.
The Cluster bus is used by nodes for failure detection, configuration update, failover authorization and so forth.
Clients should never try to communicate with the cluster bus port, but always with the normal Redis command port,
however make sure you open both ports in your firewall, otherwise Redis cluster nodes will be not able to communicate. The command port and cluster bus port offset is fixed and is always 10000. Note that for a Redis Cluster to work properly you need, for each node: The normal client communication port (usually 6379) used to communicate with clients to be open to all the clients that need to reach the cluster,
plus all the other cluster nodes (that use the client port for keys migrations). The cluster bus port (the client port + 10000) must be reachable from all the other cluster nodes. If you don't open both TCP ports, your cluster will not work as expected. The cluster bus uses a different, binary protocol, for node to node data exchange, which is more suited to exchange information between nodes using little bandwidth and processing time.

In other words, in a cluster every node has to open not only the port clients connect to but also a second port for cluster communication, which is the client port plus 10000. My node's client port is 7001, so port 17001 must be opened as well. Add the rule to the firewall.

Run the matching command on each node.

On 192.168.10.204, run:

iptables -I INPUT -p tcp --dport 17001 -j ACCEPT

On 192.168.10.205, run:

iptables -I INPUT -p tcp --dport 17002 -j ACCEPT

On 192.168.10.206, run:

iptables -I INPUT -p tcp --dport 17003 -j ACCEPT

On 192.168.10.203, run:

iptables -I INPUT -p tcp --dport 17004 -j ACCEPT

On 192.168.10.207, run:

iptables -I INPUT -p tcp --dport 17005 -j ACCEPT

On 192.168.10.208, run:

iptables -I INPUT -p tcp --dport 17006 -j ACCEPT
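
The per-node rules follow one pattern: the client port and the client port plus 10000. A small dry-run sketch that prints both commands for a given port; firewall_rules is my own helper name, and nothing is executed:

```shell
# Print the iptables commands that open a node's client port and its
# cluster bus port (client port + 10000). Printing instead of executing
# keeps this a dry run; pipe the output to sh as root to apply it.
firewall_rules() {
    client_port=$1
    bus_port=$(( client_port + 10000 ))
    for p in "$client_port" "$bus_port"; do
        echo "iptables -I INPUT -p tcp --dport $p -j ACCEPT"
    done
}

firewall_rules 7001
```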

Go back to one of the machines, enter the src directory, and run the command again:

./redis-trib.rb create --replicas 1 192.168.10.204:7001 192.168.10.205:7002 192.168.10.206:7003 192.168.10.203:7004 192.168.10.207:7005 192.168.10.208:7006

This time the output is:

>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
192.168.10.204:7001
192.168.10.205:7002
192.168.10.206:7003
Adding replica 192.168.10.203:7004 to 192.168.10.204:7001
Adding replica 192.168.10.207:7005 to 192.168.10.205:7002
Adding replica 192.168.10.208:7006 to 192.168.10.206:7003
M: 3335ad139839a644ea593942fd02c2df1dca3d24 192.168.10.204:7001
   slots:0-5460 (5461 slots) master
M: e359b1d676fafc49de2c2ce22ad8052cbc4c8c34 192.168.10.205:7002
   slots:5461-10922 (5462 slots) master
M: 727ddc5d4349007824cb8a382feeda699a8727f8 192.168.10.206:7003
   slots:10923-16383 (5461 slots) master
S: e99c70d79d1ecc062e0f0beec6c413b0cb679d1a 192.168.10.203:7004
   replicates 3335ad139839a644ea593942fd02c2df1dca3d24
S: 57479c4575f718417e68aee3d25ae75fcfc2cfdb 192.168.10.207:7005
   replicates e359b1d676fafc49de2c2ce22ad8052cbc4c8c34
S: b19f429b3815ca6e69115e1d40c380e09908a3c4 192.168.10.208:7006
   replicates 727ddc5d4349007824cb8a382feeda699a8727f8
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join....
>>> Performing Cluster Check (using node 192.168.10.204:7001)
M: 3335ad139839a644ea593942fd02c2df1dca3d24 192.168.10.204:7001
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: b19f429b3815ca6e69115e1d40c380e09908a3c4 192.168.10.208:7006
   slots: (0 slots) slave
   replicates 727ddc5d4349007824cb8a382feeda699a8727f8
S: e99c70d79d1ecc062e0f0beec6c413b0cb679d1a 192.168.10.203:7004
   slots: (0 slots) slave
   replicates 3335ad139839a644ea593942fd02c2df1dca3d24
M: e359b1d676fafc49de2c2ce22ad8052cbc4c8c34 192.168.10.205:7002
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 57479c4575f718417e68aee3d25ae75fcfc2cfdb 192.168.10.207:7005
   slots: (0 slots) slave
   replicates e359b1d676fafc49de2c2ce22ad8052cbc4c8c34
M: 727ddc5d4349007824cb8a382feeda699a8727f8 192.168.10.206:7003
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
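
The layout in this output can be reproduced with a little arithmetic: six nodes with one replica each leave three masters, and the 16384 slots are split as evenly as possible among them. The sketch below is my own reconstruction that matches the ranges printed above, not the code redis-trib.rb actually runs:

```shell
# Number of masters for a given node count and replicas-per-master.
master_count() {
    echo $(( $1 / ($2 + 1) ))
}

# Print "first-last" slot ranges for n masters, splitting 16384 slots
# as evenly as possible (boundaries rounded to the nearest slot).
slot_ranges() {
    n=$1
    first=0
    i=1
    while [ "$i" -le "$n" ]; do
        if [ "$i" -eq "$n" ]; then
            last=16383
        else
            # floor(i * 16384 / n - 0.5) in integer arithmetic
            last=$(( (2 * i * 16384 - n) / (2 * n) ))
        fi
        echo "$first-$last"
        first=$(( last + 1 ))
        i=$(( i + 1 ))
    done
}

slot_ranges "$(master_count 6 1)"
```

For three masters this prints 0-5460, 5461-10922 and 10923-16383, the same ranges as in the output above.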

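After about 2 to 3 seconds the output above appears: the 16384 slots are distributed across the three masters, and the cluster is up. To check its state, pick any server, connect with the redis-cli that ships with Redis, and look at cluster info and cluster nodes:

./redis-cli -h 192.168.10.207 -p 7005
192.168.10.207:7005> cluster info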
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:2
cluster_stats_messages_sent:994
cluster_stats_messages_received:994
192.168.10.207:7005> cluster nodes
57479c4575f718417e68aee3d25ae75fcfc2cfdb 192.168.10.207:7005 myself,slave e359b1d676fafc49de2c2ce22ad8052cbc4c8c34 0 0 5 connected
3335ad139839a644ea593942fd02c2df1dca3d24 192.168.10.204:7001 master - 0 1497864277901 1 connected 0-5460
e99c70d79d1ecc062e0f0beec6c413b0cb679d1a 192.168.10.203:7004 slave 3335ad139839a644ea593942fd02c2df1dca3d24 0 1497864275258 4 connected
727ddc5d4349007824cb8a382feeda699a8727f8 192.168.10.206:7003 master - 0 1497864275867 3 connected 10923-16383
b19f429b3815ca6e69115e1d40c380e09908a3c4 192.168.10.208:7006 slave 727ddc5d4349007824cb8a382feeda699a8727f8 0 1497864278307 6 connected
e359b1d676fafc49de2c2ce22ad8052cbc4c8c34 192.168.10.205:7002 master - 0 1497864278918 2 connected 5461-10922
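
The cluster nodes output is whitespace-separated, so a short filter can pull out the masters and their slots. A sketch assuming the field order seen above (id, address, flags, master id, ping-sent, pong-recv, epoch, link state, slots) and a single slot range per master; list_masters is a hypothetical helper:

```shell
# Read CLUSTER NODES output on stdin and print "address slots" for each master.
# Only the first slot range (field 9) is printed.
list_masters() {
    awk '$3 ~ /master/ { print $2, $9 }'
}

# Two lines copied from the output above, used here as sample input.
list_masters <<'EOF'
3335ad139839a644ea593942fd02c2df1dca3d24 192.168.10.204:7001 master - 0 1497864277901 1 connected 0-5460
e99c70d79d1ecc062e0f0beec6c413b0cb679d1a 192.168.10.203:7004 slave 3335ad139839a644ea593942fd02c2df1dca3d24 0 1497864275258 4 connected
EOF
```

Against a live node, the input would come from ./redis-cli -h 192.168.10.207 -p 7005 cluster nodes piped into list_masters.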

So the cluster is up. To be safe, write some data as a test:

192.168.10.207:7005> set testKey testValue
(error) MOVED 5203 192.168.10.204:7001
192.168.10.207:7005> set a b 
(error) MOVED 15495 192.168.10.206:7003
192.168.10.207:7005> 

Being cautious was justified: both writes failed. Back to the official site:

$ redis-cli -c -p 7000
redis 127.0.0.1:7000> set foo bar
-> Redirected to slot [12182] located at 127.0.0.1:7002
OK
redis 127.0.0.1:7002> set hello world
-> Redirected to slot [866] located at 127.0.0.1:7000
OK
redis 127.0.0.1:7000> get foo
-> Redirected to slot [12182] located at 127.0.0.1:7002
"bar"
redis 127.0.0.1:7000> get hello
-> Redirected to slot [866] located at 127.0.0.1:7000
"world"

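There is an extra -c flag. It enables cluster mode in redis-cli, so the client follows MOVED redirections to the node that owns the key's slot instead of reporting them as errors. Try again: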

 ./redis-cli -c -h 192.168.10.203 -p 7004
192.168.10.203:7004> set a a.value
-> Redirected to slot [15495] located at 192.168.10.206:7003
OK
192.168.10.206:7003> get a
"a.value"
192.168.10.206:7003> 
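
The slot numbers in the MOVED and Redirected replies come from hashing the key: Redis Cluster maps a key to CRC16(key) mod 16384. Below is a bit-by-bit sketch of that calculation in plain shell. It assumes ASCII keys and ignores hash tags ({...}); crc16 and key_slot are my own helper names:

```shell
# CRC16-CCITT, XMODEM variant (polynomial 0x1021, initial value 0),
# computed one bit at a time over the characters of the key.
crc16() {
    key=$1
    crc=0
    while [ -n "$key" ]; do
        rest=${key#?}            # key without its first character
        ch=${key%"$rest"}        # the first character itself
        key=$rest
        byte=$(printf '%d' "'$ch")
        crc=$(( crc ^ (byte << 8) ))
        i=0
        while [ "$i" -lt 8 ]; do
            if [ $(( crc & 0x8000 )) -ne 0 ]; then
                crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
            else
                crc=$(( (crc << 1) & 0xFFFF ))
            fi
            i=$(( i + 1 ))
        done
    done
    echo "$crc"
}

# Hash slot of a key: CRC16 modulo the 16384 slots of the cluster.
key_slot() {
    echo $(( $(crc16 "$1") % 16384 ))
}

key_slot a
```

For the key a this gives 15495, the slot shown in the replies above, which belongs to the 10923-16383 range served by 192.168.10.206:7003.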

Now I really am reassured. The highly available cluster environment is complete.

Original article: https://www.cnblogs.com/blentle/p/6918104.html