索引优化

一、单表优化

建表sql

create table if not exists article (
  id int unsigned not null primary key auto_increment,
  author_id int unsigned not null,
  category_id int unsigned not null,
  views int unsigned not null,
  comments int unsigned not null,
  title varbinary(255) not null,
  content text not null
);
 
insert into article(author_id, category_id, views, comments, title, content) values
  (1, 1, 1, 1, '1', '1'),
  (2, 2, 2, 2, '2', '2'),
  (1, 1, 3, 3, '3', '3');

示例

查询category_id为1且comments大于1的情况下,views最多的article_id。

在不使用索引的情况下,执行查询语句, 查看执行计划。

mysql> explain select id,author_id from article where category_id = 1 and comments > 1 order by views desc limit 1;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                       |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
|  1 | SIMPLE      | article | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    3 |    33.33 | Using where; Using filesort |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
1 row in set (0.02 sec)

可以很清楚的看到查询的type是all,即全表扫描,此外,还出现了filesort这种文件排序,这是不能容忍的,因此必须建立索引来解决这种情况。

首先尝试为使用到的三个字段创建索引.

create index idx_article_ccv on article(category_id, comments, views); 

# 重新执行explain查看执行计划

mysql> explain select id,author_id from article where category_id = 1 and comments > 1 order by views desc limit 1;
+----+-------------+---------+------------+-------+-----------------+-----------------+---------+------+------+----------+---------------------------------------+
| id | select_type | table   | partitions | type  | possible_keys   | key             | key_len | ref  | rows | filtered | Extra                                 |
+----+-------------+---------+------------+-------+-----------------+-----------------+---------+------+------+----------+---------------------------------------+
|  1 | SIMPLE      | article | NULL       | range | idx_article_ccv | idx_article_ccv | 8       | NULL |    1 |   100.00 | Using index condition; Using filesort |
+----+-------------+---------+------------+-------+-----------------+-----------------+---------+------+------+----------+---------------------------------------+
1 row in set (0.02 sec)

可以看到, 虽然使用到了索引, 但是因为使用到了>,导致后面的排序索引失效, 最后还是使用到了文件排序, 这也是不太能够接受的, 所以需要继续重新修改索引。

# 尝试删除索引, 重新建立,这次只建立分类和点击量的索引关系。
drop index idx_article_ccv on article;
create index idx_article_ccv on article(category_id, views);

# 重新执行上面的操作,发现期待的一幕出现了,不但没有文件排序, 且索引的类型还是ref,效率不错, 如下所示。
mysql> explain select id,author_id from article where category_id = 1 and comments > 1 order by views desc limit 1;
+----+-------------+---------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys   | key             | key_len | ref   | rows | filtered | Extra       |
+----+-------------+---------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | article | NULL       | ref  | idx_article_ccv | idx_article_ccv | 4       | const |    2 |    33.33 | Using where |
+----+-------------+---------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------------+
1 row in set (0.04 sec)

二、双表优化

建表sql

create table if not exists class (
  id int unsigned not null auto_increment,
  card int unsigned not null,
  primary key(id)
);
 
create table if not exists book (
  bookid int unsigned not null auto_increment primary key,
  card int unsigned not null
);
 
insert into class(card) values
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20));
 
insert into book(card) values
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20));

示例:

class表左连接book表, 根据card字段

# 首先尝试非索引下的连接

mysql> explain select * from class left join book on class.card = book.card;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                              |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | class | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   22 |   100.00 | NULL                                               |
|  1 | SIMPLE      | book  | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   32 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
2 rows in set (0.04 sec)

结果显而易见, 两次全表连接,效果非常差, 现在需要为该次查询建立索引来进行优化.

首先尝试在book表的关联字段上创建索引.

# 创建索引
create index idx_book_card on book(card);
# 重新查看sql执行计划
mysql> explain select * from class left join book on class.card = book.card;
+----+-------------+-------+------------+------+---------------+---------------+---------+----------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key           | key_len | ref            | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+---------------+---------+----------------+------+----------+-------------+
|  1 | SIMPLE      | class | NULL       | ALL  | NULL          | NULL          | NULL    | NULL           |   22 |   100.00 | NULL        |
|  1 | SIMPLE      | book  | NULL       | ref  | idx_book_card | idx_book_card | 4       | emp.class.card |    1 |   100.00 | Using index |
+----+-------------+-------+------------+------+---------------+---------------+---------+----------------+------+----------+-------------+
2 rows in set (0.05 sec)

结果显示优化效果很明显, 被驱动表book被索引的行直接变为1行, 直接少了一半多. 这说明将索引添加到book中是正确的. 那如果现在将索引天添加到驱动表class中呢?

drop index idx_book_card on book;
create index idx_class_card on class(card);

mysql> explain select * from class left join book on class.card = book.card;
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key            | key_len | ref  | rows | filtered | Extra                                              |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | class | NULL       | index | NULL          | idx_class_card | 4       | NULL |   22 |   100.00 | Using index                                        |
|  1 | SIMPLE      | book  | NULL       | ALL   | NULL          | NULL           | NULL    | NULL |   32 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+----------------------------------------------------+
2 rows in set (0.05 sec)

从上面的结果可以得出结论, 如果两表使用左连接的话, 那么索引最好加到被驱动表中, 因为驱动表无论如何都会是全表扫描的, 加到被驱动表可以完美的利用到索引的优化功能. 同理右外连接也一样是添加到被驱动表中.

三、三表优化

建表sql

# 在上面双表的基础上再添加上一个表。
create table if not exists phone (
  id int unsigned not null primary key auto_increment,
  card int unsigned not null  
);
 
insert into phone(card) values
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20)),
  (floor(1 + rand() * 20));

示例:

使用左连接连接class, book, phone表。

mysql> explain select * from class left join book on book.card = class.card left join phone on book.card = phone.card;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                              |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | class | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   22 |   100.00 | NULL                                               |
|  1 | SIMPLE      | book  | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   32 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | phone | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   30 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
3 rows in set (0.03 sec)

同样均使用到了全表连接,接下来在被驱动的两张表中使用索引。

create index idx_book_card on book(card);
create index idx_phone_card on class(phone);

mysql> explain select * from class left join book on book.card = class.card left join phone on book.card = phone.card;
+----+-------------+-------+------------+------+----------------+----------------+---------+----------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys  | key            | key_len | ref            | rows | filtered | Extra       |
+----+-------------+-------+------------+------+----------------+----------------+---------+----------------+------+----------+-------------+
|  1 | SIMPLE      | class | NULL       | ALL  | NULL           | NULL           | NULL    | NULL           |   22 |   100.00 | NULL        |
|  1 | SIMPLE      | book  | NULL       | ref  | idx_book_card  | idx_book_card  | 4       | emp.class.card |    1 |   100.00 | Using index |
|  1 | SIMPLE      | phone | NULL       | ref  | idx_phone_card | idx_phone_card | 4       | emp.book.card  |    2 |   100.00 | Using index |
+----+-------------+-------+------------+------+----------------+----------------+---------+----------------+------+----------+-------------+
3 rows in set (0.04 sec)

结果基本能够满足要求。

四、join优化小总结

  • 尽量减少join连接内层循环的次数, 永远用“小表驱动大表的结果集”。
  • 优先优化join连接的内层循环
  • 保证join语句中被驱动表的join字段已经用上了索引。
  • 当无法保证被驱动表的Join条件字段内存是否充足的前提下, 加大Join Buffer的设置。

五、索引优化的总结

这里仍然以一个例子来进行说明。

建表sql:

简要说明:创建你staffs表, 为name,age,pos三个字段创建一个联合索引。

create table if not exists staffs (
  id int primary key auto_increment,
  name varchar(24) not null default '' comment '姓名',
  age int not null default 0 comment '年龄',
  pos varchar(20) not null default '' comment '职位',
  add_time timestamp not null default current_timestamp comment '入职时间'
)charset utf8mb4 comment '员工记录表';
 
desc staffs;  
 
insert into staffs(name, age , pos, add_time) values
  ('z3', 22, 'manageer', now()),
  ('July', 23, 'dev', now()),
  ('2000', 23, 'dev', now());
select * from staffs;
 
alter table staffs add index idx_staffs_nameAgePos(name, age, pos);

优化

1 全值匹配

使用全值匹配能够完美的使用到索引,效率也非常好,此外联合索引中多个索引调换顺序不会影响索引的使用,在优化器那会做一层优化。

mysql> explain select * from staffs where name = 'July';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys         | key                   | key_len | ref   | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | staffs | NULL       | ref  | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 98      | const |    1 |   100.00 | NULL  |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
1 row in set (0.04 sec)

mysql> explain select * from staffs where name = 'July' and age = 22;
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys         | key                   | key_len | ref         | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
|  1 | SIMPLE      | staffs | NULL       | ref  | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 102     | const,const |    1 |   100.00 | NULL  |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
1 row in set (0.04 sec)

# 调换顺序不会影响索引。
mysql> explain select * from staffs where age = 22 and name = 'Joly';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys         | key                   | key_len | ref         | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
|  1 | SIMPLE      | staffs | NULL       | ref  | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 102     | const,const |    1 |   100.00 | NULL  |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
1 row in set (0.03 sec)

2 最佳左前缀法则

如果索引了多列,要遵守最左前缀法则。指的是查询从联合索引的最左前列开始并且不跳过索引中的列。如果跳过了中间列, 则会导致索引失效。

# 正常使用了索引
mysql> explain select * from staffs where age = 22 and name = 'Joly' and pos = 'a';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------------+------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys         | key                   | key_len | ref               | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------------+------+----------+-------+
|  1 | SIMPLE      | staffs | NULL       | ref  | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 184     | const,const,const |    1 |   100.00 | NULL  |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------------+------+----------+-------+
1 row in set (0.04 sec)

# 丢失索引
mysql> explain select * from staffs where age = 22 and pos = 'a';
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    3 |    33.33 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set (0.05 sec)

上面第二种情况直接走了全表扫描, 因为mysql是按照联合索引定义的顺序在索引树上排序的,现在丢失了第一个排序手段namemysql就不知道从哪里开始找起,因此直接走全表。

下面是一种只用到了部分索引,导致后续索引失效的情况。

mysql> explain select * from staffs where name = 'zhang' and pos = 'a';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-----------------------+
| id | select_type | table  | partitions | type | possible_keys         | key                   | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | ref  | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 98      | const |    1 |    33.33 | Using index condition |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-----------------------+
1 row in set (0.03 sec)

从ref这个字段的结果可以看出来, 这里只用到一个索引,后面那个索引并没有起到作用。

3 不在索引列上做任何操作

不要在索引列上做任何操作(计算、函数、(自动、手动)类型转换),会导致索引失效而转向全表扫描。

例如下面两种本质上是相同的操作,但是第二种方式直接导致扫描全表。

# 第一种
mysql> explain select * from staffs where name = 'zhang';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys         | key                   | key_len | ref   | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | staffs | NULL       | ref  | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 98      | const |    1 |   100.00 | NULL  |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
1 row in set (0.05 sec)

# 第二种

mysql> explain select * from staffs where concat(name, '') = 'zhang';
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    3 |   100.00 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set (0.04 sec)

4 存储引擎不能使用索引中范围条件右边的列

索引中包含比较运算符, 那么除了当列,后面的索引列都会失效。比较下面两条sql,可以通过key_len这个参数看到第二条参数最后一个索引没有被使用到。

mysql> explain select * from staffs where name = 'zhang' and age = 11 and pos = '';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------------+------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys         | key                   | key_len | ref               | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------------+------+----------+-------+
|  1 | SIMPLE      | staffs | NULL       | ref  | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 184     | const,const,const |    1 |   100.00 | NULL  |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------------+------+----------+-------+
1 row in set (0.04 sec)

mysql> explain select * from staffs where name = 'zhang' and age > 11 and pos = '';
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 102     | NULL |    1 |    33.33 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set (0.04 sec)

5 尽量使用覆盖索引

我们在业务中,尽量做到用到什么变量取什么变量,最好都使用覆盖索引(只访问索引的查询,索引列和查询列一致),减少select*的查询语句,select * 这样的操作会多出很多无意义的回表操作,甚至会让原本走索引的查询语句放弃走索引,反而去走全表扫描。

它们的详细差别可以通过下面几个例子明显的感受出来。

mysql> explain select * from staffs where name in ('zhang', 'li') and age = 11 and pos = '';
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys         | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | idx_staffs_nameAgePos | NULL | NULL    | NULL |    3 |    66.67 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set (0.06 sec)

mysql> explain select id,name,age from staffs where name in ('zhang', 'li') and age = 11 and pos = '';
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 184     | NULL |    2 |   100.00 | Using where; Using index |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
1 row in set (0.04 sec)

6 不等于操作(<>、!=)的理解误区

mysql在使用不等于(!=或者<>)的时候无法使用索引,进而会导致全表扫描。网上大多的结论都是如此,可事实真是如此吗?就以上面的初始例子来看。

下面是初始数据

mysql> select * from staffs;
+----+------+-----+----------+---------------------+
| id | name | age | pos      | add_time            |
+----+------+-----+----------+---------------------+
|  1 | z3   |  22 | manageer | 2019-12-14 07:40:22 |
|  2 | July |  23 | dev      | 2019-12-14 07:40:22 |
|  3 | 2000 |  23 | dev      | 2019-12-14 07:40:22 |
+----+------+-----+----------+---------------------+

接下来我们执行如下操作。

mysql> explain select * from staffs where name != 'July';
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys         | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | idx_staffs_nameAgePos | NULL | NULL    | NULL |    3 |    66.67 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set (0.03 sec)

一切看起来确实如同网上所总结的那样,!=操作确实会导致走全表,放弃走索引。那接下来看下面这条sql

mysql> explain select id,name from staffs where name != 'July';
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 98      | NULL |    2 |   100.00 | Using where; Using index |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
1 row in set (0.04 sec)

只是使用到了覆盖索引,但是我们也使用了!=,但是mysql还是走了索引。可能我们会觉得这是覆盖索引导致的。那么我们在原先表的3条数据的基础上,重新添加4条数据,然后重新执行第一条sql

insert into staffs(name, age, pos) values ('July', 23, 'dev');
insert into staffs(name, age, pos) values ('July', 23, 'dev');
insert into staffs(name, age, pos) values ('July', 23, 'dev');
insert into staffs(name, age, pos) values ('July', 23, 'dev');

mysql> select * from staffs;
+----+------+-----+----------+---------------------+
| id | name | age | pos      | add_time            |
+----+------+-----+----------+---------------------+
|  1 | z3   |  22 | manageer | 2019-12-14 07:40:22 |
|  2 | July |  23 | dev      | 2019-12-14 07:40:22 |
|  3 | 2000 |  23 | dev      | 2019-12-14 07:40:22 |
|  4 | July |  23 | dev      | 2019-12-16 22:42:36 |
|  5 | July |  23 | dev      | 2019-12-16 22:42:36 |
|  6 | July |  23 | dev      | 2019-12-16 22:42:36 |
|  7 | July |  23 | dev      | 2019-12-16 22:42:36 |
+----+------+-----+----------+---------------------+
mysql> explain select * from staffs where name != 'July';
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 98      | NULL |    2 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set (0.03 sec)

结果显而易见,网上总结的所谓!=就会导致索引一定失效,是站不住脚的。

7 is null, is not null 的理解误区

is null,is not null 在where子句中就无法使用索引.

先看下面两个例子.

mysql> explain select * from staffs where name is null;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra            |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
|  1 | SIMPLE      | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL | NULL     | Impossible WHERE |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
1 row in set (0.03 sec)

mysql> explain select * from staffs where name is not null;
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys         | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | idx_staffs_nameAgePos | NULL | NULL    | NULL |    7 |    85.71 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set (0.05 sec)

上面的看起来确实可以简单的得出无论是is null还是is not null都无法运用索引。但是这里需要注意到staffs表的name字段的约束条件,它是not null,这说明这个字段无法存储null值,因此当mysql优化器去判断name is null的时候,直接判断为Impossible WHERE,这样当然就无法走索引,而且这种写法也完全没有意义。

因此这里我们只要修改一下表的结构,把name这个字段的not null约束改成允许为空,再来执行上述命令,我们就能再次清楚的看出mysql走不走索引与where子句是否is null, is not null, !=, ...等关键字没有直接的关系。

mysql> alter table staffs modify column name varchar(50) null;
Query OK, 0 rows affected (0.06 sec)
Records: 0  Duplicates: 0  Warnings: 0


mysql> explain select * from staffs where name is null;
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-----------------------+
| id | select_type | table  | partitions | type | possible_keys         | key                   | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | ref  | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 203     | const |    1 |   100.00 | Using index condition |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-----------------------+
1 row in set (0.06 sec)

只需要一个简单的例子的校验,就能看出即使条件使用的是is null仍然可以走索引。

8 like的优化

​ like以通配符开头('%abc...')可能会导致mysql索引失效进而变成全表扫描操作.

mysql> explain select * from staffs where name like '%zhang';
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    7 |    14.29 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set (0.04 sec)

所以有时候如果可以做到的话, 尽量在模糊匹配的时候把%写在末尾.

mysql> explain select * from staffs where name like 'zhang%';
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 203     | NULL |    1 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set (0.05 sec)

但是业务逻辑很少会出现开头字符串是固定的情况, 这时候就需要用到覆盖索引的理念来优化.

mysql> explain select name from staffs where name like '%zhang';
+----+-------------+--------+------------+-------+---------------+-----------------------+---------+------+------+----------+--------------------------+
| id | select_type | table  | partitions | type  | possible_keys | key                   | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+--------+------------+-------+---------------+-----------------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | staffs | NULL       | index | NULL          | idx_staffs_nameAgePos | 289     | NULL |    7 |    14.29 | Using where; Using index |
+----+-------------+--------+------------+-------+---------------+-----------------------+---------+------+------+----------+--------------------------+
1 row in set (0.05 sec)

像上面的情况, 如果需要的字段正好是索引字段,那么最好不要使用select *, 虽然优化后的结果type的类型是index, 也是扫描索引列全表, 但是索引文件一般都比全表数据小,所以这样还是能够优化,

9 字符串不加单引号索引失效

mysql> explain select * from staffs where name = '2000';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys         | key                   | key_len | ref   | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | staffs | NULL       | ref  | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 203     | const |    1 |   100.00 | NULL  |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
1 row in set (0.05 sec)

当不加引号, mysql会做一个隐式装换, 然后做匹配.如下所示, 这样索引就会失效

mysql> explain select * from staffs where name = 2000;
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys         | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | idx_staffs_nameAgePos | NULL | NULL    | NULL |    7 |    14.29 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set (0.03 sec)

10 or的使用

or的使用和前面的null不等于这些判断条件一样, 并不是根据是否使用这些运算符导致不走索引, 最后还是mysql的优化器根据总的代价计算后,然后决定是否走索引.

就如下面的例子.

mysql> select * from staffs;
+----+------+-----+----------+---------------------+
| id | name | age | pos      | add_time            |
+----+------+-----+----------+---------------------+
|  1 | z3   |  22 | manageer | 2019-12-14 07:40:22 |
|  2 | July |  23 | dev      | 2019-12-14 07:40:22 |
|  3 | 2000 |  23 | dev      | 2019-12-14 07:40:22 |
|  8 | July |  23 | dev      | 2019-12-16 22:58:45 |
|  9 | July |  23 | dev      | 2019-12-16 22:58:45 |
| 10 | July |  23 | dev      | 2019-12-16 22:58:45 |
| 11 | July |  23 | dev      | 2019-12-16 22:58:45 |
+----+------+-----+----------+---------------------+
7 rows in set (0.04 sec)
mysql> explain select * from staffs where name = 'July' or name = 'z3';
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys         | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | idx_staffs_nameAgePos | NULL | NULL    | NULL |    7 |    85.71 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set (0.04 sec)

在这种情况下优化方式还是使用覆盖索引.

mysql> explain select name from staffs where name = 'July' or name = 'z3';
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 203     | NULL |    6 |   100.00 | Using where; Using index |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+--------------------------+
1 row in set (0.03 sec)

当像下面这样查询, 最后的执行方案又是走索引.

mysql> explain select * from staffs where name in ('2000', 'z3');
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 203     | NULL |    2 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set (0.05 sec)

六、查询优化的一般性建议

  • 对于单键索引,尽量选择针对当前query过滤性更好的索引
  • 在选择组合索引的时候,当前Query中过滤性最好的字段在索引字段顺序中,位置越靠前越好。
  • 在选择组合索引的时候,尽量选择可以能包含当前query中的where子句中更多字段的索引
  • 尽可能通过分析统计信息和调整query的写法来达到选择合适索引的目的
  • 使用join连表时,永远小表驱动大表。
  • order by子句,尽量使用Index方式排序,避免使用FileSort方式排序
  • 尽可能在索引列上完成排序操作,遵照索引建的最佳左前缀, order by中不能既有asc也有desc
  • having只会在检索出所有记录之后才对结果集进行过滤。这个处理需要排序,总计等操作。如果能通过where子句限制记录的数目,那就能减少这方面的开销。
  • 当orderby 或 group by无法使用索引列,可以选择增大max_length_for_sort_data参数的设置+增大sort_buffer_size参数的设置
  • 除非确实需要去掉重复的行,否则尽量使用union all而不是union。因为union会自带distinct操作,代价很大

七、总结

其实看了这么多例子,是否走索引还是mysql优化器经过自己的优化算法计算,然后确认它认为的cost(代价)最少的执行方案,这与数据的分布,用到索引的列数,最后筛选的数据的比例,回表的次数等等都有关系,并不能单纯的通过某些条件就轻易下结论。

参考

1 MySQL中IS NULL、IS NOT NULL、!=不能用索引?胡扯!

2 官网

原文地址:https://www.cnblogs.com/yscl/p/12056507.html