第四章 -- 索引及执行计划管理

索引的作用

提供了类似于书中目录的作用，目的是为了优化查询

索引的类型(算法)

BTREE索引
RTREE索引
Hash索引
Full text/全文索引

BTREE的细分类(算法)

B-TREE
B+TREE
B*TREE/默认

B树是如何构建的

辅助索引

-- 创建辅助索引
alter table t1 add index idx_name(name);

1. 将name列的所有值取出来,进行自动排序
2. 将排完序的值均匀的落到16KB叶子节点数据页中,并将索引键值所对应的数据行的聚集索引列值
3. 向上生成枝节点和根节点

聚集索引(IOT)

1. 默认是按照主键生成聚集索引.没有主键,存储引擎会使用唯一键
   如果都没有,会自动生成隐藏的聚集索引.
2. 数据在存储时,就会按照聚集索引的顺序存储到磁盘的数据页.
3. 由于本身数据就是有序的,所以聚集索引构建时,不需要进行排序.
4. 聚集索引直接将磁盘的数据页,作为叶子节点.
5. 枝节点和根节点只会调取下层节点主键的最小值

辅助索引与聚集索引区别

1）辅助索引：

叶子节点只保存主键值+索引键值的有序存储
对索引键值会自动排序
需要手工创建
辅助索引可以有多个，任何列都可以创建
2）聚集索引
只能在主键列上创建，唯一且非空
数据存储时，就会按照聚集索引顺序进行排序
叶子节点保存的是整个有序的数据行
子夜节点不需要单独生成

辅助索引细分

单列辅助索引
联合索引
唯一索引

索引树的高度(越低越好)

表的数据量级大(导致索引树高度)

分区表(了解、古老)
分库分表(分布式架构、主流)

索引键值的长度

尽可能选择列值短的列创建索引
采用前缀索引

数据类型选择

varchar 和char的选择，尽量使用varchar
固定范围的只尽量使用 enum

索引管理

索引命令操作

查询索引

KEY：PRI(主键)、UNI(唯一索引)、MUL(辅助索引)

mysql[school]>use school
mysql[school]>desc student;
mysql[school]>show index from studentG

创建索引

-- 创建单列索引 *
mysql[school]>alter table student add index idx_name(sname);

-- 创建联合索引 **
mysql[school]>alter table student add index idx_sname_sage_ssex(sname,sage,ssex);

-- 创建前缀索引 *
mysql[school]>alter table student add index idx(sname(5));

-- 创建唯一索引			
mysql[school]>alter table student add unique index idx_tel(telnum);

删除索引

mysql[school]>alter table student drop index idx;
mysql[school]>alter table student drop index idx_name;

压力测试

100W数据准备

create database test charset utf8mb4;
use test;
create table t100w (id int,num int,k1 char(2),k2 char(4),dt timestamp);
delimiter //
create  procedure rand_data(in num int)
begin
declare str char(62) default 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
declare str2 char(2);
declare str4 char(4);
declare i int default 0;
while i<num do
set str2=concat(substring(str,1+floor(rand()*61),1),substring(str,1+floor(rand()*61),1));
set str4=concat(substring(str,1+floor(rand()*61),2),substring(str,1+floor(rand()*61),2));
set i=i+1;
insert into t100w values (i,floor(rand()*num),str2,str4,now());
end while;
end;
//
delimiter ;

插入100w条数据：

call rand_data(1000000);

commit;

查询生成结果

# 查询
MySQL [test]>select count(*) from t100w;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.47 sec)

压测命令

mysqlslap --defaults-file=/etc/my.cnf --concurrency=100 --iterations=1 --create-schema='test' --query="select * from test.t100w where k2='VWtu'" engine=innodb --number-of-queries=2000 -uroot -p123456 -verbose

没有索引前,压测结果

索引优化,创建索引

use test;
show tables;
desc t100w;
alter table t100w add index idx_k2(k2);  #优化

优化索引后，压测结果

explain(desc)

抓取优化器优化过的执行计划

语法

explain select * from test.t100w where k2='VWtu';
or
desc select * from test.t100w where k2='VWtu';

执行计划分析

MySQL [test]>explain select * from test.t100w where k2='VWtu';
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key    | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t100w | NULL       | ref  | idx_k2        | idx_k2 | 17      | const | 1120 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+

-- 重点关注
table	        -- 以上SQL语句设计到的表 ***
type 	        -- 查询的类型(全表扫描-ALL，索引扫描，查不到数据-NULL) *****
possible_keys	-- 可能会用到的索引 ***
key             -- 用到的索引 ****
key_len	        -- 索引的覆盖长度 *****
Extra           -- 额外的信息 ****

type 详细说明

ALL : 全表扫描 , 不会走任何索引

查询条件，没建索引

use test;
MySQL [test]>explain  select * from test.t100w where k2='VWtu';
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key    | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t100w | NULL       | ref  | idx_k2        | idx_k2 | 17      | const | 1120 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

2）有索引不走

desc select * from t100w where k2 != 'asdf';
desc select * from t100w where k2 like '%aa%';
desc select * from t100w where k2 not in ('asda','asas');
desc select * from t100w;

index 全索引扫描

MySQL [test]>desc select k2 from t100w;

range 索引范围查询

所有索引：
>, <, >=, <=, like, between and
mysql[world]>desc select * from city where id<10;
mysql[world]>desc select * from city where countrycode like 'CH%'

in()， or 
mysql[world]>desc select * from city where countrycode in ('CHN','USA');

聚集索引：
!= not in
mysql[test]>desc select * from world.city where id != 10;
mysql[test]>desc select * from world.city where id not in (10,20);
========================================
说明：
B+tree 索引能额外优化到:> ,<, >=, <= ,like , between and
in 和 or 享受不到b+tree额外的优化效果的,所以我一般情况会将in , or 进行改性
mysql[world]>desc select * from city where countrycode = 'CHN' union all  select * from city where countrycode = 'USA';

ref 辅助索引等值查询

mysql[world]>desc select * from city where countrycode ='CHN';

eq_ref 多表连接查询中，非驱动表on的条件列是主键或者唯一键

mysql[world]>desc select a.name,b.name from city as a join country as b on a.countrycode=b.code where a.population<100;

const(system) 主键或唯一键的等值

mysql[world]>desc select * from city where id=10;

NULL : 获取不到数据

mysql[world]>desc select * from city where id=100000000000000;

possible_keys：可能会用到的索引

NULL：没有和查询条件配置的索引条目
有值：有和查询条件匹配的索引条目,但是没走，大部分原因是语句查询方式不符合索引应用条件

key：使用到的索引

最终使用的索引,可以帮助我们判断是否走了合适的索引.

key_len：索引的覆盖长度

在联合索引应用的判断时，会经常看

对于单列索引:
字符集：

utf8mb4 : 一个字符最大是4个字节
utf8 ：一个字符最大是3个字节

以utf8mb4为例
| 数据类型 | not null | 没有指定not null |
| ---------- | -------- | ---------------- |
| int | 4 | 4+1 |
| tinyint | 1 | 1+1 |
| char(2) | 24 | 24+1 |
| varchar(2) | 24+2 | 24+2+1 |

建立索引的原则(DBA运维规范)

(1) 必须要有主键,如果没有可以做为主键条件的列,创建无关列
(2) 经常做为where条件列 order by group by join on, distinct 的条件(业务:产品功能+用户行为)
(3) 最好使用唯一值多的列作为索引,如果索引列重复值较多,可以考虑使用联合索引
(4) 列值长度较长的索引列,我们建议使用前缀索引.
(5) 降低索引条目,一方面不要创建没用索引,不常使用的索引清理,percona toolkit(xxxxx)
(6) 索引维护要避开业务繁忙期