day 45索引

一、索引原理

本质都是：通过不断地缩小想要获取数据的范围来筛选出最终想要的结果，同时把随机的事件变成顺序的事件，也就是说，有了这种索引机制，我们可以总是用同一种查找方式来锁定数据

二磁盘IO与预读

当一次IO时，不光把当前磁盘地址的数据，而是把相邻的数据也都读取到内存缓冲区内，

三索引的数据结构

1.索引字段要尽量的小

2.索引的最左匹配特性：

四聚集索引与辅助索引

在数据库中，B+树的高度一般都在2~4层，这也就是说查找某一个键值的行记录时最多只需要2到4次IO，这倒不错。因为当前一般的机械硬盘每秒至少可以做100次IO，2~4次的IO意味着查询时间只需要0.02~0.04秒。

数据库中的B+树索引可以分为聚集索引（clustered index）和辅助索引（secondary index），

聚集索引与辅助索引相同的是：不管是聚集索引还是辅助索引，其内部都是B+树的形式，即高度是平衡的，叶子结点存放着所有的数据。

聚集索引与辅助索引不同的是：叶子结点存放的是否是一整行的信息

1、聚集索引

#InnoDB存储引擎表示索引组织表，即表中数据按照主键顺序存放。而聚集索引（clustered index）就是按照每张表的主键构造一棵B+树，同时叶子结点存放的即为整张表的行记录数据，也将聚集索引的叶子结点称为数据页。聚集索引的这个特性决定了索引组织表中数据也是索引的一部分。同B+树数据结构一样，每个数据页都通过一个双向链表来进行链接。
    
#如果未定义主键，MySQL取第一个唯一索引（unique）而且只含非空列（NOT NULL）作为主键，InnoDB使用它作为聚簇索引。
    
#如果没有这样的列，InnoDB就自己产生一个这样的ID值，它有六个字节，而且是隐藏的，使其作为聚簇索引。

#由于实际的数据页只能按照一棵B+树进行排序，因此每张表只能拥有一个聚集索引。在多少情况下，查询优化器倾向于采用聚集索引。因为聚集索引能够在B+树索引的叶子节点上直接找到数据。此外由于定义了数据的逻辑顺序，聚集索引能够特别快地访问针对范围值得查询。

聚集索引的好处之一：它对主键的排序查找和范围查找速度非常快，叶子节点的数据就是用户所要查询的数据。如用户需要查找一张表，查询最后的10位用户信息，由于B+树索引是双向链表，所以用户可以快速找到最后一个数据页，并取出10条记录

#参照第六小结测试索引的准备阶段来创建出表s1
mysql> desc s1; #最开始没有主键
+--------+-------------+------+-----+---------+-------+
| Field  | Type        | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| id     | int(11)     | NO   |     | NULL    |       |
| name   | varchar(20) | YES  |     | NULL    |       |
| gender | char(6)     | YES  |     | NULL    |       |
| email  | varchar(50) | YES  |     | NULL    |       |
+--------+-------------+------+-----+---------+-------+
rows in set (0.00 sec)

mysql> explain select * from s1 order by id desc limit 10; #Using filesort，需要二次排序
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra          |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+
|  1 | SIMPLE      | s1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 2633472 |   100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+
row in set, 1 warning (0.11 sec)

mysql> alter table s1 add primary key(id); #添加主键
Query OK, 0 rows affected (13.37 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain select * from s1 order by id desc limit 10; #基于主键的聚集索引在创建完毕后就已经完成了排序，无需二次排序
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------+
|  1 | SIMPLE      | s1    | NULL       | index | NULL          | PRIMARY | 4       | NULL |   10 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------+
row in set, 1 warning (0.04 sec)

View Code

聚集索引的好处之二：范围查询（range query），即如果要查找主键某一范围内的数据，通过叶子节点的上层中间节点就可以得到页的范围，之后直接读取数据页即可

mysql> alter table s1 drop primary key;
Query OK, 2699998 rows affected (24.23 sec)
Records: 2699998  Duplicates: 0  Warnings: 0

mysql> desc s1;
+--------+-------------+------+-----+---------+-------+
| Field  | Type        | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| id     | int(11)     | NO   |     | NULL    |       |
| name   | varchar(20) | YES  |     | NULL    |       |
| gender | char(6)     | YES  |     | NULL    |       |
| email  | varchar(50) | YES  |     | NULL    |       |
+--------+-------------+------+-----+---------+-------+
rows in set (0.12 sec)

mysql> explain select * from s1 where id > 1 and id < 1000000; #没有聚集索引，预估需要检索的rows数如下
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | s1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 2690100 |    11.11 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
row in set, 1 warning (0.00 sec)

mysql> alter table s1 add primary key(id);
Query OK, 0 rows affected (16.25 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain select * from s1 where id > 1 and id < 1000000; #有聚集索引，预估需要检索的rows数如下
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | s1    | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 1343355 |   100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
row in set, 1 warning (0.09 sec)

View Code

2、辅助索引

表中除了聚集索引外其他索引都是辅助索引（Secondary Index，也称为非聚集索引），与聚集索引的区别是：辅助索引的叶子节点不包含行记录的全部数据。

五 MySQL索引管理

一功能

#1. 索引的功能就是加速查找
#2. mysql中的primary key，unique，联合唯一也都是索引，这些索引除了加速查找以外，还有约束的功能

二 MySQL常用的索引

索引分类：
普通索引INDEX：加速查找

唯一索引：
    -主键索引PRIMARY KEY：加速查找+约束（不为空、不能重复）
    -唯一索引UNIQUE:加速查找+约束（不能重复）

联合索引：
    -PRIMARY KEY(id,name):联合主键索引
    -UNIQUE(id,name):联合唯一索引
    -INDEX(id,name):联合普通索引

索引分类 **

普通索引INDEX：加速查找

唯一索引：
    -主键索引PRIMARY KEY：加速查找+约束（不为空、不能重复）
    -唯一索引UNIQUE:加速查找+约束（不能重复）

联合索引：
    -PRIMARY KEY(id,name):联合主键索引
    -UNIQUE(id,name):联合唯一索引
    -INDEX(id,name):联合普通索引

举个例子来说，比如你在为某商场做一个会员卡的系统。

这个系统有一个会员表
有下列字段：
会员编号 INT
会员姓名 VARCHAR(10)
会员身份证号码 VARCHAR(18)
会员电话 VARCHAR(10)
会员住址 VARCHAR(50)
会员备注信息 TEXT

那么这个 会员编号，作为主键，使用 PRIMARY
会员姓名 如果要建索引的话，那么就是普通的 INDEX
会员身份证号码 如果要建索引的话，那么可以选择 UNIQUE （唯一的，不允许重复）

#除此之外还有全文索引，即FULLTEXT
会员备注信息 ， 如果需要建索引的话，可以选择全文搜索。
用于搜索很长一篇文章的时候，效果最好。
用在比较短的文本，如果就一两行字的，普通的 INDEX 也可以。
但其实对于全文搜索，我们并不会使用MySQL自带的该索引，而是会选择第三方软件如Sphinx，专门来做全文搜索。

#其他的如空间索引SPATIAL，了解即可，几乎不用

各个索引的应用场景

应用场景

三索引的两大类型hash与btree

#我们可以在创建上述索引的时候，为其指定索引类型，分两类
hash类型的索引：查询单条快，范围查询慢
btree类型的索引：b+树，层数越多，数据量指数级增长（我们就用它，因为innodb默认支持它）

#不同的存储引擎支持的索引类型也不一样
InnoDB 支持事务，支持行级别锁定，支持 B-tree、Full-text 等索引，不支持 Hash 索引；
MyISAM 不支持事务，支持表级别锁定，支持 B-tree、Full-text 等索引，不支持 Hash 索引；
Memory 不支持事务，支持表级别锁定，支持 B-tree、Hash 等索引，不支持 Full-text 索引；
NDB 支持事务，支持行级别锁定，支持 Hash 索引，不支持 B-tree、Full-text 等索引；
Archive 不支持事务，支持表级别锁定，不支持 B-tree、Hash、Full-text 等索引；

View Code

四创建/删除索引的语法

#方法一：创建表时
    　　CREATE TABLE 表名 (
                字段名1  数据类型 [完整性约束条件…],
                字段名2  数据类型 [完整性约束条件…],
                [UNIQUE | FULLTEXT | SPATIAL ]   INDEX | KEY
                [索引名]  (字段名[(长度)]  [ASC |DESC]) 
                );


#方法二：CREATE在已存在的表上创建索引
        CREATE  [UNIQUE | FULLTEXT | SPATIAL ]  INDEX  索引名 
                     ON 表名 (字段名[(长度)]  [ASC |DESC]) ;


#方法三：ALTER TABLE在已存在的表上创建索引
        ALTER TABLE 表名 ADD  [UNIQUE | FULLTEXT | SPATIAL ] INDEX
                             索引名 (字段名[(长度)]  [ASC |DESC]) ;
                             
#删除索引：DROP INDEX 索引名 ON 表名字;

#方式一
create table t1(
    id int,
    name char,
    age int,
    sex enum('male','female'),
    unique key uni_id(id),
    index ix_name(name) #index没有key
);


#方式二
create index ix_age on t1(age);

#方式三
alter table t1 add index ix_sex(sex);

#查看
mysql> show create table t1;
| t1    | CREATE TABLE `t1` (
  `id` int(11) DEFAULT NULL,
  `name` char(1) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  `sex` enum('male','female') DEFAULT NULL,
  UNIQUE KEY `uni_id` (`id`),
  KEY `ix_name` (`name`),
  KEY `ix_age` (`age`),
  KEY `ix_sex` (`sex`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

示范

创建删除索引的方法

1 创建索引
    - 在创建表时就创建
        create table s1(
        id int,
        name char(6),
        age int,
        email varchar(30),
        index(id)
        );
    - 在创建表后创建
        create index name on s1(name);#添加普通索引
        create unique index age on s1(age);#添加唯一索引
        alter table s1 add primary key(id);#添加主键索引
        create index name on s1(id,name);#添加联合普通索引

2 删除索引
    drop index id on s1;
    drop index name on s1;
    alter table s1 drop primary key;#添加主键索引

创建和删除

建立索引：

六测试索引

一准备

#1. 准备表
create table s1(
id int,
name varchar(20),
gender char(6),
email varchar(50)
);

#2. 创建存储过程，实现批量插入记录
delimiter $$ #声明存储过程的结束符号为$$
create procedure auto_insert1()
BEGIN
    declare i int default 1;
    while(i<3000000)do
        insert into s1 values(i,'egon','male',concat('egon',i,'@oldboy'));
        set i=i+1;
    end while;
END$$ #$$结束
delimiter ; #重新声明分号为结束符号

#3. 查看存储过程
show create procedure auto_insert1G 

#4. 调用存储过程
call auto_insert1();

View Code

二在没有索引的前提下测试查询速度

#无索引：mysql根本就不知道到底是否存在id等于333333333的记录，只能把数据表从头到尾扫描一遍，此时有多少个磁盘块就需要进行多少IO操作，所以查询速度很慢
mysql> select * from s1 where id=333333333;
Empty set (0.33 sec)

三在表中已经存在大量数据的前提下，为某个字段段建立索引，建立速度会很慢

mysql> select count(*) from s1 where id=1000;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.12 sec)

mysql> select count(*) from s1 where id>1000;
+----------+
| count(*) |
+----------+
|   298999 |
+----------+
1 row in set (0.12 sec)

1未加索引之前

mysql> create index a on s1(id)
    -> ;
Query OK, 0 rows affected (3.21 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select count(*) from s1 where id=1000;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from s1 where id>1000;
+----------+
| count(*) |
+----------+
|   298999 |
+----------+
1 row in set (0.12 sec)

mysql> select count(*) from s1 where id>1000 and id < 2000;
+----------+
| count(*) |
+----------+
|      999 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from s1 where id>1000 and id < 300000;
+----------+
| count(*) |
+----------+
|   298999 |
+----------+
1 row in set (0.13 sec)

2、加索引提速

mysql> select count(*) from s1 where name='xxx';
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from s1 where name='egon';
+----------+
| count(*) |
+----------+
|   299999 |
+----------+
1 row in set (0.19 sec)


mysql> select count(*) from s1 where name='egon' and age=123123123123123;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.45 sec)

mysql> create index c on s1(age);
Query OK, 0 rows affected (3.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select count(*) from s1 where name='egon' and age=123123123123123;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from s1 where name='egon' and age=10;
+----------+
| count(*) |
+----------+
|   299999 |
+----------+
1 row in set (0.35 sec)


mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and id < 4000;
+----------+
| count(*) |
+----------+
|      999 |
+----------+
1 row in set (0.00 sec)


mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx';
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.47 sec)

mysql> create index d on s1(email);
Query OK, 0 rows affected (4.83 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx';
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

mysql> drop index a on s1;
Query OK, 0 rows affected (0.10 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> drop index b on s1;
Query OK, 0 rows affected (0.09 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> drop index c on s1;
Query OK, 0 rows affected (0.09 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> desc s1;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int(11)     | NO   |     | NULL    |       |
| name  | char(20)    | YES  |     | NULL    |       |
| age   | int(11)     | YES  |     | NULL    |       |
| email | varchar(30) | YES  | MUL | NULL    |       |
+-------+-------------+------+-----+---------+-------+
4 rows in set (0.00 sec)

mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx';
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

3 区分度低的字段不能加索引

 select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx';
index(name,email,age,id)

 select count(*) from s1 where name='egon' and age> 10 and id=3000 and email='xxxx';
index(name,email,id,age)

 select count(*) from s1 where name like 'egon' and age= 10 and id=3000 and email='xxxx';
index(email,id,age,name)


mysql> desc s1;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int(11)     | NO   |     | NULL    |       |
| name  | char(20)    | YES  |     | NULL    |       |
| age   | int(11)     | YES  |     | NULL    |       |
| email | varchar(30) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+
4 rows in set (0.00 sec)

mysql> create index xxx on s1(age,email,name,id);
Query OK, 0 rows affected (6.89 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select count(*) from s1 where name='egon' and age=10 and id>3000 and email='xxxx';
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

4、增加联合索引，关于范围查询的字段要放到后面

index(id,age,email,name)
#条件中一定要出现id
id
id age
id email
id name

email #不行
mysql> select count(*) from s1 where id=3000;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.11 sec)

mysql> create index xxx on s1(id,name,age,email);
Query OK, 0 rows affected (6.44 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql>  select count(*) from s1 where id=3000;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)

mysql>  select count(*) from s1 where name='egon';
+----------+
| count(*) |
+----------+
|   299999 |
+----------+
1 row in set (0.16 sec)

mysql>  select count(*) from s1 where email='egon3333@oldboy.com';
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.15 sec)

mysql>  select count(*) from s1 where id=1000 and email='egon3333@oldboy.com';
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

mysql>  select count(*) from s1 where email='egon3333@oldboy.com' and id=3000;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

5、最左端前缀匹配

mysql> select count(*) from s1 where id=3000;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.11 sec)

mysql> create index xxx on s1(id,name,age,email);
Query OK, 0 rows affected (6.44 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql>  select count(*) from s1 where id=3000;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)

mysql>  select count(*) from s1 where name='egon';
+----------+
| count(*) |
+----------+
|   299999 |
+----------+
1 row in set (0.16 sec)

mysql>  select count(*) from s1 where email='egon3333@oldboy.com';
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.15 sec)

mysql>  select count(*) from s1 where id=1000 and email='egon3333@oldboy.com';
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

mysql>  select count(*) from s1 where email='egon3333@oldboy.com' and id=3000;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

6.索引列不能参与计算，保持列“干净”

day 45索引

五 MySQL索引管理

六 测试索引

六测试索引