Mycat实战之连续分片

1 按照日期(天)分片:

 从开始日期算起,按照天数来分片 例如,从2017-11-01,每10天一个分片且可以指定结束日期
 注意事项:需要提前将分片规划好,建好,否则有可能日期超出实际配置分片数

1.1 修改配置文件

 #修改rule.xml 添加按日期分片的的分配规则 
 vi rule.xml
<function name="sharding-by-date" class="org.opencloudb.route.function.PartitionByDate"> 
		<property name="dateFormat">yyyy-MM-dd</property> <!--日期格式--> 
		<property name="sBeginDate">2017-11-01</property> <!--开始日期--> 
	   <property name="sPartionDay">10</property> <!--每分片天数--> 
</function>

 <tableRule name="sharding-by-date"> 
	 <rule> <columns>create_date</columns> 
	 <algorithm>sharding-by-date</algorithm> 
	</rule>
  </tableRule>


#修改schema.xml 添加逻辑表   
vi schema.xml 
<table name="mycatbydate" primaryKey="ID" dataNode="dn1,dn2,dn3" rule="sharding-by-date"/>


#重新加载配置文件
[root@localhost conf]# mysql -h 192.168.2.130 -P9066 -utest -ptest
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or g.
Your MySQL connection id is 20
Server version: 5.5.8-mycat-1.5.1-RELEASE-20161130213509 MyCat Server (monitor)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.

mysql> 
mysql> 
mysql> reload @@config;
Query OK, 1 row affected (0.08 sec)
Reload config success 

1.2 创建表后插入数据并分析日志

mysql> create table mycatbydate(id int not null auto_increment primary key,
       create_date datetime,datanode varchar(10));
   

mysql> insert into mycatbydate(create_date,datanode) values('2017-06-01',database());

分析mycat.log日志

1.3 查询语句1:分片字段是等值运算,分析mycat.log

mysql> select * from mycatbydate where create_date='2017-11-01';
+----+---------------------+----------+
| id | create_date         | datanode |
+----+---------------------+----------+
|  1 | 2017-11-01 00:00:00 | db1      |
+----+---------------------+----------+
1 row in set (0.03 sec)

分析mycat.log日志

1.4 查询语句2:分片字段范围查询,分析explain和mycat.log

这里分别用 where 跟 betwwen and 的方式来测试一下范围查询

where

mysql> explain select * from mycatbydate 
		   where create_date >='2017-11-01' and create_date <'2017-11-10';
+-----------+-------------------------------------------------------------+
| DATA_NODE | SQL                                                         |                                            
+-----------+-------------------------------------------------------------+
| dn1       | SELECT * FROM mycatbydate WHERE create_date >= '2017-11-01' |                                          
|           |               AND create_date < '2017-11-10' LIMIT 100      |                                          
| dn2       | SELECT * FROM mycatbydate WHERE create_date >= '2017-11-01' |                                          
|           |               AND create_date < '2017-11-10' LIMIT 100      |                                          
|           | SELECT * FROM mycatbydate WHERE create_date >= '2017-11-01' |                                                                                                     |
| dn3       |               AND create_date < '2017-11-10' LIMIT 100      |                                           
+-----------+-------------------------------------------------------------+
3 rows in set (0.00 sec)

分析mycat.log日志

between and

mysql> explain select * from mycatbydate where create_date 
                   BETWEEN '2017-11-01' AND '2017-11-10';
+-----------+---------------------------------------------------+
| DATA_NODE | SQL                                               |
+-----------+---------------------------------------------------+
| dn1       | SELECT * FROM mycatbydate WHERE create_date       |
|           |  BETWEEN '2017-11-01' AND '2017-11-10' LIMIT 100  | 
+-----------+---------------------------------------------------+
1 row in set (0.00 sec)

分析mycat.log日志

总结:对于范围查询,where语句是没法优化,从而路由到相应的节点上,但是between and 可以优化,这个有咨询过 lead us,答复是目前代码做不到where优化。。。

1.5 插入限制以及解决方案

插入数据,如果超过30天数据提示ERROR 1064 (HY000): Index: 6, Size: 3

由于配置3个dn,每10天一个分片,从7月1日数据就无法插入,这种配置就存在数据超过实际分配个数,底层dn和实际数据要规划好

mysql> insert into mycatbydate(id,create_date,datanode) values(2,'2017-12-01',database());
ERROR 1064 (HY000): Index: 6, Size: 3

mysql> insert into mycatbydate(id,create_date,datanode) values(2,'2017-11-10',database());
Query OK, 1 row affected (0.01 sec)

mysql> insert into mycatbydate(create_date,datanode) values('2017-11-20',database());
Query OK, 1 row affected (0.00 sec)

mysql> insert into mycatbydate(create_date,datanode) values('2017-11-30',database());
Query OK, 1 row affected (0.00 sec)

#查询数据分布:
mysql> select * from mycatbydate order by datanode;
+----+---------------------+----------+
| id | create_date         | datanode |
+----+---------------------+----------+
|  1 | 2017-11-01 00:00:00 | db1      |
|  2 | 2017-11-10 00:00:00 | db1      |
|  2 | 2017-11-20 00:00:00 | db2      |
|  2 | 2017-11-30 00:00:00 | db3      |
+----+---------------------+----------+
4 rows in set (0.08 sec)

解决方案:指定分区结束时间,这样分区可以重复使用

vi rule.xml
<function name="sharding-by-date" class="org.opencloudb.route.function.PartitionByDate"> 
        <property name="dateFormat">yyyy-MM-dd</property> <!--日期格式--> 
        <property name="sBeginDate">2017-06-01</property> <!--开始日期--> 
        <property name="sEndDate">2017-06-30</property> <!--结束日期-->
        <property name="sPartionDay">10</property> <!--每分片天数--> 
</function>

@@reload_config



#重新加载配置文件后可以正常插入数据:

mysql> insert into mycatbydate(id,create_date,datanode) values(2,'2017-12-01',database());
ERROR 1062 (23000): Duplicate entry '2' for key 'PRIMARY'
mysql> insert into mycatbydate(id,create_date,datanode) values(5,'2017-12-01',database());
Query OK, 1 row affected (0.18 sec)

mysql> insert into mycatbydate(id,create_date,datanode) values(6,'2018-01-01',database());
Query OK, 1 row affected (0.03 sec)

mysql> insert into mycatbydate(id,create_date,datanode) values(7,'2018-02-01',database());
Query OK, 1 row affected (0.03 sec)

mysql> insert into mycatbydate(id,create_date,datanode) values(8,'2018-03-01',database());
Query OK, 1 row affected (0.04 sec)

mysql> insert into mycatbydate(id,create_date,datanode) values(9,'2018-04-01',database());
Query OK, 1 row affected (0.04 sec)

mysql> 
mysql> 
mysql> select * from mycatbydate order by datanode;
+----+---------------------+----------+
| id | create_date         | datanode |
+----+---------------------+----------+
|  5 | 2017-12-01 00:00:00 | db1      |
|  2 | 2017-11-10 00:00:00 | db1      |
|  1 | 2017-11-01 00:00:00 | db1      |
|  6 | 2018-01-01 00:00:00 | db1      |
|  7 | 2018-02-01 00:00:00 | db1      |
|  8 | 2018-03-01 00:00:00 | db1      |
|  9 | 2018-04-01 00:00:00 | db1      |
|  2 | 2017-11-20 00:00:00 | db2      |
|  2 | 2017-11-30 00:00:00 | db3      |
+----+---------------------+----------+
9 rows in set (0.14 sec)

2 按照自然月分片

每个自然月一个分片,需要提前将分片数规划好,建好,否则有可能日期超出实际配置分片数

2.1修改配置文件

#新增分配规则
vi  rule.xml   
<function name="partbymonth"  class="org.opencloudb.route.function.PartitionByMonth">
			<property name="dateFormat">yyyy-MM-dd</property>
			<property name="sBeginDate">2017-11-01</property>
</function>

<tableRule name="sharding-by-month">
		<rule>
				<columns>create_date</columns>
				<algorithm>partbymonth</algorithm>
		</rule>
</tableRule>

#schema.xml增加逻辑表

<table name="mycatbymonth" primaryKey="ID" dataNode="dn$1-12" rule="sharding-by-month" />
    <dataNode name="dn1" dataHost="192.168.124.55" database="db1" />
    <dataNode name="dn2" dataHost="192.168.124.55" database="db2" />
    <dataNode name="dn3" dataHost="192.168.124.55" database="db3" />
    <dataNode name="dn4" dataHost="192.168.124.55" database="db4" />
    <dataNode name="dn5" dataHost="192.168.124.55" database="db5" />
    <dataNode name="dn6" dataHost="192.168.124.55" database="db6" />
    <dataNode name="dn7" dataHost="192.168.124.55" database="db7" />
    <dataNode name="dn8" dataHost="192.168.124.55" database="db8" />
    <dataNode name="dn9" dataHost="192.168.124.55" database="db9" />
    <dataNode name="dn10" dataHost="192.168.124.55" database="db10" />
    <dataNode name="dn11" dataHost="192.168.124.55" database="db11" />
    <dataNode name="dn12" dataHost="192.168.124.55" database="db12" />

2.2 mycat重新加载配置文件

mysql> reload @@config;
Query OK, 1 row affected (0.10 sec)
Reload config success


#查看逻辑表
mysql> show tables;
+------------------+
| Tables in TESTDB |
+------------------+
| customer         |
| customer_addr    |
| mycatbydate      |
| mycatbymonth     |
| orders           |
| order_items      |
| travelrecord     |
| t_vote           |
+------------------+
8 rows in set (0.00 sec)

2.3 创建表后插入数据

mysql>create table mycatbymonth(id int not null auto_increment primary key, create_date datetime,datanode varchar(10));
Query OK, 0 rows affected (0.20 sec)
	-> ;
mysql> explain select * from mycatbymonth;
+-----------+--------------------------------------+
| DATA_NODE | SQL                                  |
+-----------+--------------------------------------+
| dn1       | SELECT * FROM mycatbymonth LIMIT 100 |
| dn10      | SELECT * FROM mycatbymonth LIMIT 100 |
| dn11      | SELECT * FROM mycatbymonth LIMIT 100 |
| dn12      | SELECT * FROM mycatbymonth LIMIT 100 |
| dn2       | SELECT * FROM mycatbymonth LIMIT 100 |
| dn3       | SELECT * FROM mycatbymonth LIMIT 100 |
| dn4       | SELECT * FROM mycatbymonth LIMIT 100 |
| dn5       | SELECT * FROM mycatbymonth LIMIT 100 |
| dn6       | SELECT * FROM mycatbymonth LIMIT 100 |
| dn7       | SELECT * FROM mycatbymonth LIMIT 100 |
| dn8       | SELECT * FROM mycatbymonth LIMIT 100 |
| dn9       | SELECT * FROM mycatbymonth LIMIT 100 |
+-----------+--------------------------------------+
12 rows in set (0.01 sec)


mysql> insert into mycatbymonth(id,create_date,datanode) values(1,'2017-11-24',database());
Query OK, 1 row affected (0.01 sec)

查看mycat日志

2.4 批量插入,验证插入分区,不能超过一年的数据。

#批量插入数据:
insert into mycatbymonth(id,create_date,datanode) values(1,'2017-01-01',database());
insert into mycatbymonth(id,create_date,datanode) values(2,'2017-01-24',database()); 
insert into mycatbymonth(id,create_date,datanode) values(3,'2017-02-01',database());
insert into mycatbymonth(id,create_date,datanode) values(4,'2017-03-01',database());
insert into mycatbymonth(id,create_date,datanode) values(5,'2017-04-01',database());
insert into mycatbymonth(id,create_date,datanode) values(6,'2017-05-01',database());
insert into mycatbymonth(id,create_date,datanode) values(7,'2017-06-01',database());
insert into mycatbymonth(id,create_date,datanode) values(8,'2017-07-01',database());
insert into mycatbymonth(id,create_date,datanode) values(9,'2017-08-01',database());
insert into mycatbymonth(id,create_date,datanode) values(12,'2017-12-01',database());



#报错的插入        
mysql> insert into mycatbymonth(id,create_date,datanode) values(13,'2019-01-01','db1');
ERROR 1064 (HY000): Can't find a valid data node for specified 
node index :MYCATBYMONTH -> CREATE_DATE -> 2019-01-01 -> Index : 24

2.5 查询验证,查询分区字段等值 以及范围查询同 按照日期(天)分片的方式 不赘述!

连续分片总结:

1、针对时间范围查询,采用between可以直接路由到对应分片,从而避免查询所有dn,只查询特定分片(具体原因还没有分
  析),这个问题问过leaderUS,回复时,目前技术上还没法做到普通范围查询的优化。
2、插入数据存在超过分片情况,针对月分区,插入数据不能超过一年,针对天的,也存在日期超过分区问题
原文地址:https://www.cnblogs.com/chinesern/p/7800954.html