[心得]redis集群环境搭建的错误

安装redis集群需要版本号在3.0以上

redis-cluster安装前需要安装ruby环境

搭建集群需要使用到官方提供的ruby脚本。

需要安装ruby的环境。

yum -y install ruby

yum -y install rubygems

 

redis集群管理工具redis-trib.rb

[root@bogon ~]# cd redis-3.0.0

[root@bogon redis-3.0.0]# cd src

[root@bogon src]# ll *.rb

-rwxrwxr-x.1 root root 48141 Apr  1 07:01 redis-trib.rb

 

脚本需要的ruby包:redis-3.0.0.gem

安装 gem install  redis-3.0.0.gem

 

[root@bogon ~]# gem install redis-3.0.0.gem

Successfully installed redis-3.0.0

1 gem installed

Installing ri documentation forredis-3.0.0...

Installing RDoc documentation forredis-3.0.0...

 

集群的搭建

需求,创建6台redis服务器,虚拟机模拟端口号为7001-7006

第二步:修改redis的配置文件

1、修改端口号
第三步:把创建集群的ruby脚本复制到redis-cluster目录下。
第四步:启动6个redis实例

第五步:创建集群。

./redis-trib.rb create --replicas 1 192.168.25.153:7001 192.168.25.153:7002 192.168.25.153:7003 192.168.25.153:7004 192.168.25.153:7005  192.168.25.153:7006

错误一
 
创建过程中会遇到错误
 
[ERR] Node 172.168.63.202:7001 is not empty. Either the nodealready knows other nodes (check with CLUSTER NODES) or contains some key in database 0.
解决办法:

解决方法:

1)、将需要新增的节点下aof、rdb等本地备份文件删除;

2)、同时将新Node的集群配置文件删除,即:删除你redis.conf里面cluster-config-file所在的文件;

3)、再次添加新节点如果还是报错,则登录新Node,./redis-cli–h x –p对数据库进行清除:

172.168.63.201:7001>  flushdb      #清空当前数据库

错误二

redis.clients.jedis.exceptions.JedisClusterException:CLUSTERDOWN The cluster is down
at redis.clients.jedis.Protocol.processError(Protocol.java:115)
at redis.clients.jedis.Protocol.process(Protocol.java:142)
at redis.clients.jedis.Protocol.read(Protocol.java:196)
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:288)
at redis.clients.jedis.Connection.getBinaryBulkReply(Connection.java:207)
at redis.clients.jedis.Connection.getBulkReply(Connection.java:196)
at redis.clients.jedis.Jedis.get(Jedis.java:98)
at JedisTest.testJedisPool(JedisTest.java:30)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
    
错误至今没有找到什么原因

使用redis-3.0.0目录下src下的redis-trib.rb check 192.168.218.128:6379进行检查和进行redis-trib.rb fix127.0.0.1:6380进行修复时

The folowing uncovered slots have no keys across the cluster:
./redis-trib.rb:412:in `fix_slots_coverage': undefined method `keys' for #<Array:0xb76ec21c> (NoMethodError)
        from ./redis-trib.rb:354:in `check_slots_coverage'
        from ./redis-trib.rb:333:in `check_cluster'
        from ./redis-trib.rb:847:in `fix_cluster_cmd'
        from ./redis-trib.rb:1373:in `send'
        from ./redis-trib.rb:1373

在用check检查集群运行状态时,遇到错误;最终我知道那里错了,是我把单机版的redis.conf配置文件开启了cluster-enable

所以总是提示CLUSTERDOWN The cluster is down的错误,修改单机版redis配置文件,关闭cluster-enable后正常。

 

错误三

[root@node01 src]# ./redis-trib.rb check 172.168.63.202:7000

 

Connecting to node 172.168.63.202:7000: OK

Connecting to node 172.168.63.203:7000: OK

Connecting to node 172.168.63.201:7000: OK

>>> Performing Cluster Check(using node 172.168.63.202:7000)

M: 449de2d2a4b799ceb858501b5b78ab91504c72e0172.168.63.202:7000

  slots: (0 slots) master

   0additional replica(s)

M: db9d26b1d15889ad2950382f4f32639606f9a94b172.168.63.203:7000

  slots: (0 slots) master

   0additional replica(s)

M: f90924f71308eb434038fc8a5f481d3661324792172.168.63.201:7000

  slots: (0 slots) master

   0additional replica(s)

[OK] All nodes agree about slotsconfiguration.

>>> Check for open slots...

>>> Check slots coverage...

[ERR] Not all 16384 slots are covered by nodes.

原因:

这个往往是由于主node移除了,但是并没有移除node上面的slot,从而导致了slot总数没有达到16384,其实也就是slots分布不正确。所以在删除节点的时候一定要注意删除的是否是Master主节点。

1)、官方是推荐使用redis-trib.rb fix 来修复集群…. ….  通过cluster nodes看到7001这个节点被干掉了… 那么

[root@node01 src]#  ./redis-trib.rb fix 172.168.63.201:7001

 

修复完成后再用check命令检查下是否正确

[root@node01 src]# ./redis-trib.rb check172.168.63.202:7000

只要输入任意集群中节点即可,会自动检查所有相关节点。可以查看相应的输出看下是否是每个Master都有了slots,如果分布不均匀那可以使用下面的方式重新分配slot:

 

[root@node01 src]#  ./redis-trib.rb reshard 172.168.63.201:7001







原文地址:https://www.cnblogs.com/bangiao/p/12418839.html