MySQL: SQL Optimization


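SQL query performance

Show Profile is disabled by default and must be enabled before use. Note that SHOW PROFILE and SHOW PROFILES are deprecated in recent MySQL versions in favor of the Performance Schema.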

mysql> show variables like 'profiling';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| profiling     | OFF   |
+---------------+-------+
# Enable profiling for the current session
mysql> set profiling=on;

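Querying resource usage

Common options for show profile:

① ALL: shows all cost information.

② BLOCK IO: shows counts of block input and output operations.

③ CONTEXT SWITCHES: shows counts of voluntary and involuntary context switches.

④ CPU: shows user and system CPU time.

⑤ IPC: shows counts of messages sent and received.

⑥ MEMORY: shows memory cost information.

⑦ PAGE FAULTS: shows counts of major and minor page faults.

⑧ SOURCE: shows the Source_function, Source_file, and Source_line columns.

⑨ SWAPS: shows swap counts.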

mysql> show profiles;
+----------+------------+------------------------+
| Query_ID | Duration   | Query                  |
+----------+------------+------------------------+
|        1 | 0.00022875 | select * from sys_user |
+----------+------------+------------------------+

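show profile cpu,block io for query Query_ID;
# Query_ID is the Query_ID value from the show profiles list above

# With no options, only the Status and Duration columns are shown; add options such as cpu or block io to see other costs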
mysql> show profile for query 1;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000043 |
| checking permissions | 0.000006 |
| Opening tables       | 0.000057 |
| init                 | 0.000019 |
| System lock          | 0.000007 |
| optimizing           | 0.000004 |
| statistics           | 0.000009 |
| preparing            | 0.000007 |
| executing            | 0.000002 |
| Sending data         | 0.000047 |
| end                  | 0.000003 |
| query end            | 0.000005 |
| closing tables       | 0.000006 |
| freeing items        | 0.000010 |
| cleaning up          | 0.000007 |
+----------------------+----------+

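Statuses that call for optimization:

① converting HEAP to MyISAM: the result set is too large for memory, so the data is moved to disk.

② Creating tmp table: a temporary table is created. Data is copied into it and the table is dropped after use.

③ Copying to tmp table on disk: an in-memory temporary table is being copied to disk. Dangerous!

④ locked.

If any of these four statuses appears in the show profile output, the SQL statement needs optimizing.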
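The check above can be sketched in Python. This is a minimal illustration, not part of MySQL: the helper names `find_warning_stages` and `slowest_stage` are hypothetical, and the rows are assumed to be `(Status, Duration)` pairs already fetched from `show profile for query N`.

```python
# Status substrings that the note lists as warning signs (compared in lower case).
WARNING_STATUSES = (
    "converting heap to myisam",
    "creating tmp table",
    "copying to tmp table on disk",
    "locked",
)


def find_warning_stages(profile_rows):
    """Return the (status, duration) rows whose status contains a warning sign."""
    flagged = []
    for status, duration in profile_rows:
        lowered = status.lower()
        if any(marker in lowered for marker in WARNING_STATUSES):
            flagged.append((status, duration))
    return flagged


def slowest_stage(profile_rows):
    """Return the (status, duration) row with the largest duration."""
    return max(profile_rows, key=lambda row: row[1])


# A few rows copied from the sample output above, plus one hypothetical bad stage.
healthy = [
    ("starting", 0.000043),
    ("Opening tables", 0.000057),
    ("System lock", 0.000007),
    ("Sending data", 0.000047),
]
unhealthy = healthy + [("Copying to tmp table on disk", 0.5)]
```

Here "System lock" is not flagged, because only the literal text "locked" is treated as a warning sign.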

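Use the performance schema to monitor MySQL

Details: https://www.processon.com/mindmap/5e3c0e9de4b05c26bea043a8

Use show processlist to see how many connection threads exist and to check whether many threads are in an abnormal state or show other abnormal signs

Details: https://www.processon.com/mindmap/5e3c0e9de4b05c26bea043a8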

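Sort performance

A full sort first loads the rows to be sorted into memory (spilling to temporary files on disk when they do not fit in the sort buffer) and then sorts them. For large data sets, create an index on the sort column so that rows can be read in index order.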

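Execution plan

EXPLAIN output columns: https://dev.mysql.com/doc/refman/8.0/en/explain-output.html

Column	JSON Name	Meaning
`id`	`select_id`	The `SELECT` identifier; rows with a larger value execute first, rows with the same value execute from top to bottom
`select_type`	None	The `SELECT` type: a simple query, a subquery, an outer query, and so on
`table`	`table_name`	The table for the output row
`partitions`	`partitions`	The matching partitions
`type`	`access_type`	The join (access) type
`possible_keys`	`possible_keys`	The possible indexes to choose
`key`	`key`	The index actually chosen
`key_len`	`key_length`	The length of the chosen key
`ref`	`ref`	The columns compared to the index
`rows`	`rows`	Estimate of rows to be examined
`filtered`	`filtered`	Percentage of rows filtered by table condition
`Extra`	None	Additional information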

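id

The sequence number of the select: a set of numbers that indicates the order in which the select clauses or table operations in the query are executed.

There are three cases for id:

1. If the ids are the same, execution proceeds from top to bottom.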

explain select * from emp e join dept d on e.deptno = d.deptno join salgrade sg on e.sal between sg.losal and sg.hisal;

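2. If the ids differ, as with a subquery, the id increases with nesting; the larger the id, the higher the priority and the earlier that row is executed.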

explain select * from emp e where e.deptno in (select d.deptno from dept d where d.dname = 'SALES');

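3. If equal and different ids both appear, rows with the same id form a group and execute from top to bottom; across groups, the larger the id, the higher the priority and the earlier the group executes.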

explain select * from emp e join dept d on e.deptno = d.deptno join salgrade sg on e.sal between sg.losal and sg.hisal where e.deptno in (select d.deptno from dept d where d.dname = 'SALES');
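The three id rules can be expressed as a single stable sort. The sketch below is only an illustration of the rule of thumb in this note: `execution_order` is a hypothetical helper, each row is assumed to be a dict with an integer `id` and a `table` name, and rows such as a union result (whose id is NULL) are not handled. The optimizer may also rewrite a subquery into a join, in which case all rows share one id.

```python
def execution_order(explain_rows):
    """Order EXPLAIN rows by the rule: larger id first, equal ids top to bottom.

    sorted() is stable, so rows with the same id keep their original
    (top to bottom) order.
    """
    ordered = sorted(explain_rows, key=lambda row: -row["id"])
    return [row["table"] for row in ordered]


# Hypothetical plan: a three-table join (id 1) plus a subquery on dept (id 2).
plan = [
    {"id": 1, "table": "e"},
    {"id": 1, "table": "d"},
    {"id": 1, "table": "sg"},
    {"id": 2, "table": "dept_sub"},
]

print(execution_order(plan))
```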

select_type

Distinguishes the kind of query: an ordinary query, a union query, or a subquery.

`select_type` Value	Meaning
SIMPLE	Simple SELECT (not using UNION or subqueries)
PRIMARY	Outermost SELECT
UNION	Second or later SELECT statement in a UNION
DEPENDENT UNION	Second or later SELECT statement in a UNION, dependent on outer query
UNION RESULT	Result of a UNION.
SUBQUERY	First SELECT in subquery
DEPENDENT SUBQUERY	First SELECT in subquery, dependent on outer query
DERIVED	Derived table
UNCACHEABLE SUBQUERY	A subquery for which the result cannot be cached and must be re-evaluated for each row of the outer query
UNCACHEABLE UNION	The second or later select in a UNION that belongs to an uncacheable subquery (see UNCACHEABLE SUBQUERY)

-- simple: a simple query without subqueries or union
explain select * from emp;

-- primary: if the query contains any complex subparts, the outermost query is marked primary
explain select staname,ename supname from (select ename staname,mgr from emp) t join emp on t.mgr=emp.empno;

-- union: a second or later select that appears after union is marked union
explain select * from emp where deptno = 10 union select * from emp where sal > 2000;

-- dependent union: like union, but dependent means the selects combined by union or union all depend on the outer query
explain select * from emp e where e.empno in (select empno from emp where deptno = 10 union select empno from emp where sal > 2000);

-- union result: the select that reads the result of the union
explain select * from emp where deptno = 10 union select * from emp where sal > 2000;

-- subquery: a subquery in the select list or in the where clause
explain select * from emp where sal > (select avg(sal) from emp);

-- dependent subquery: a subquery that depends on the outer query
explain select * from emp e where e.deptno in (select distinct deptno from dept);

-- derived: a subquery in the from clause, also called a derived table
explain select staname,ename supname from (select ename staname,mgr from emp) t join emp on t.mgr=emp.empno;

-- uncacheable subquery: the result of the subquery cannot be cached
explain select * from emp where empno = (select empno from emp where deptno=@@sort_buffer_size);

-- uncacheable union: the result of the union cannot be cached (no verified SQL example)

table

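Which table the row is accessing: a table name or alias, possibly a temporary table or a union result set.

1. A concrete table name (or its alias) means the data is read from an actual physical table.

2. A name of the form <derivedN> means the row uses the derived table produced by the query whose id is N.

3. For a union result, the name has the form <unionM,N>, where M and N are the ids of the selects that take part in the union.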

type

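type shows the access type, that is, how MySQL reaches the rows. The most obvious approach is a full table scan, which walks the whole table looking for the needed rows and is very inefficient. There are many access types; from best to worst they are: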

system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL

As a rule, a query should reach at least the range level, and preferably ref.
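The ranking above can be turned into a simple check. This is a sketch under the note's own ordering; `meets_target` is a hypothetical helper name, and the input is assumed to be the `type` value copied from an EXPLAIN row.

```python
# Access types from best to worst, in the order given above.
ACCESS_TYPES = [
    "system", "const", "eq_ref", "ref", "fulltext", "ref_or_null",
    "index_merge", "unique_subquery", "index_subquery", "range",
    "index", "ALL",
]


def rank(access_type):
    """Position in the best-to-worst list (0 is best)."""
    return ACCESS_TYPES.index(access_type)


def meets_target(access_type, target="range"):
    """True if access_type is at least as good as the target level."""
    return rank(access_type) <= rank(target)


print(meets_target("ref"))    # ref is better than range
print(meets_target("index"))  # a full index scan is worse than range
```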

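-- all: full table scan; when this appears on a large table, the statement usually needs optimizing
explain select * from emp;

-- index: full index scan, which is better than all. It occurs in two main cases: the query is covered by an index, so the needed data can be read from the index alone, or the index is used for ordering, which avoids re-sorting the data
explain select empno from emp;

-- range: the index is used to search within a limited range, which avoids the full index scan of index. Applicable operators: =, <>, >, >=, <, <=, IS NULL, BETWEEN, LIKE, or IN()
explain select * from emp where empno between 7000 and 7500;

-- index_subquery: the subquery is resolved through an index lookup instead of a full table scan
explain select * from emp where emp.job in (select job from t_job);

-- unique_subquery: similar to index_subquery, but uses a unique index
explain select * from emp e where e.deptno in (select distinct deptno from dept);

-- index_merge: several indexes are combined during the query (not reproduced here)

-- ref_or_null: chosen when a column needs both an equality match and a check for null values
explain select * from emp e where e.mgr is null or e.mgr=7369;

-- ref: rows are found through a non-unique index
create index idx_3 on emp(deptno);
explain select * from emp e,dept d where e.deptno =d.deptno;

-- eq_ref: rows are found through a unique index
explain select * from emp,emp2 where emp.empno = emp2.empno;

-- const: the table has at most one matching row
explain select * from emp where empno = 7369;

-- system: the table has only one row (a system table); a special case of const that rarely appears in practice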

possible_keys

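Shows the indexes that could be applied to this table, one or more. An index on a column involved in the query is listed here, but the query does not necessarily use it.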

explain select * from emp,dept where emp.deptno = dept.deptno and emp.deptno = 10;

key

The index actually used; null means no index was used. If the query uses a covering index, that index covers the columns in the select list.

explain select * from emp,dept where emp.deptno = dept.deptno and emp.deptno = 10;

key_len

The number of bytes of the index that are used; key_len tells how much of the index the query uses. Provided accuracy is not lost, shorter is better.

explain select * from emp,dept where emp.deptno = dept.deptno and emp.deptno = 10;
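As a rough aid for reading key_len, the sketch below estimates the value for a single indexed column. It rests on commonly cited rules that are assumptions here, not something stated in this note: the column's storage size, plus 2 length bytes for varchar, plus 1 byte if the column is nullable, with character columns counted at the character set's maximum bytes per character. The helper `key_len` and its parameters are hypothetical, and only a few types are covered.

```python
# Maximum bytes per character for a few common character sets (assumed values).
BYTES_PER_CHAR = {"latin1": 1, "utf8mb3": 3, "utf8mb4": 4}

# Storage size in bytes of a few fixed-width integer types.
FIXED_WIDTH = {"tinyint": 1, "smallint": 2, "int": 4, "bigint": 8}


def key_len(col_type, length=0, charset="utf8mb4", nullable=False):
    """Estimate key_len in bytes for one indexed column."""
    col_type = col_type.lower()
    if col_type in FIXED_WIDTH:
        size = FIXED_WIDTH[col_type]
    elif col_type == "char":
        size = length * BYTES_PER_CHAR[charset]
    elif col_type == "varchar":
        # varchar stores 2 extra bytes for the value length
        size = length * BYTES_PER_CHAR[charset] + 2
    else:
        raise ValueError(f"unsupported column type: {col_type}")
    # a nullable column needs 1 extra byte for the NULL flag
    return size + (1 if nullable else 0)


print(key_len("int", nullable=True))      # nullable int column
print(key_len("varchar", 20))             # varchar(20), utf8mb4, not null
```

For a composite index, summing the estimates of the leading columns and comparing with the reported key_len suggests how many columns of the index the query actually uses.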

ref

Shows which columns or constants are compared against the index named in key; a constant where possible.

explain select * from emp,dept where emp.deptno = dept.deptno and emp.deptno = 10;

rows

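An estimate, based on table statistics and index usage, of how many rows must be read to find the required records. This value matters a great deal because it directly reflects how much data the SQL examines; the smaller it is while still producing the required result, the better.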

explain select * from emp;

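filtered: before MySQL 5.7 this column appears only with explain extended; from 5.7 on it is shown by default and explain extended is no longer needed. It gives the share of rows returned by the storage engine that remain after filtering in the server layer. Note that it is a percentage, not a row count.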
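Because filtered is a percentage, multiplying it by rows gives the estimated number of rows that survive the table condition. A minimal sketch, with `rows_after_filter` as a hypothetical helper name and both inputs taken as the values shown in an EXPLAIN row:

```python
def rows_after_filter(rows, filtered):
    """Estimated rows remaining after the table condition is applied.

    rows is the EXPLAIN row estimate; filtered is a percentage (0 to 100).
    """
    return rows * filtered / 100


# 1000 estimated rows with filtered = 50.00 leaves about 500 rows.
print(rows_after_filter(1000, 50.0))
```

Both inputs are estimates from table statistics, so the result is only a rough guide.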

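Extra

Contains additional information.

distinct: the distinct keyword is used in the select part.
no tables used: a query without a from clause, or a from dual query.

A join written with a not in() subquery or the not exists operator is called an anti-join: it returns the rows of the outer table that have no matching row in the inner table.

using filesort: appears when sorting cannot use an index; common with order by and group by.
using index: the query reads its data directly from the index without going back to the table rows.
using union: when conditions that each use an index are joined with or, the results are combined as a union.
using intersect: when conditions that each use an index are joined with and, the results are combined as an intersection.
using sort_union and using sort_intersection: like the previous two, but used when the and/or conditions match a large amount of data; the primary keys are fetched first, then sorted and merged, and only then are the rows read and returned.
using where: not every row returned by the storage engine satisfies the query, so the server layer must filter. Conditions divide into limiting conditions and check conditions. Before 5.6 the storage engine could scan only by the limiting conditions, and the server layer then applied the check conditions before returning the rows that really match. From 5.6 on, index condition pushdown (ICP) can push the check conditions down to the storage engine as well, so rows that fail either kind of condition are not read at all, which greatly reduces the number of rows the storage engine reads; in that case the Extra column shows using index condition.
using temporary: a temporary table holds intermediate results. It may be an in-memory or an on-disk temporary table; the plan does not say which, so check the status variables Created_tmp_tables and Created_tmp_disk_tables.
firstmatch(tbl_name): one of the subquery optimizations introduced in 5.6, common when the where clause contains an in() subquery; it may appear when the inner table is large.
loosescan(m..n): one of the subquery optimizations introduced in 5.6; it may appear for an in() subquery whose result can contain duplicate rows.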

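-- using filesort: mysql cannot use an index for ordering and must run a sort algorithm, which takes extra space
explain select * from emp order by sal;

-- using temporary: a temporary table is created to hold intermediate results and is dropped when the query finishes
explain select ename,count(*) from emp where deptno = 10 group by ename;

-- using index: the query is covered by an index and reads from the index without accessing the table. If using where also appears, the index is used to look up key values; if not, the index is used only to read data, not to perform a lookup
explain select deptno,count(*) from emp group by deptno limit 10;

-- using where: rows are filtered by a where condition
explain select * from t_user where id = 1;

-- using join buffer: a join buffer is used (not reproduced here)

-- impossible where: the where clause always evaluates to false
explain select * from emp where empno = 7469;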
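When reviewing many plans, the two Extra values that most often signal extra work (a filesort and a temporary table) can be picked out mechanically. The sketch below is illustrative only: `costly_markers` is a hypothetical helper, and it assumes the Extra text is a semicolon-separated string such as the one MySQL prints in the tabular EXPLAIN output.

```python
# Extra values treated as signs of extra work in this sketch.
COSTLY_MARKERS = ("Using filesort", "Using temporary")


def costly_markers(extra):
    """Return the costly markers found in an Extra string, in order of appearance."""
    if not extra:
        return []
    parts = [part.strip() for part in extra.split(";")]
    return [part for part in parts if part in COSTLY_MARKERS]


print(costly_markers("Using where; Using temporary; Using filesort"))
print(costly_markers("Using index"))
```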

SQL execution order

(1) from
(2) on
(3) join
(4) where
(5) group by (from this step on, aliases defined in select can be used)
(6) avg, sum, ...
(7) having
(8) select
(9) distinct
(10) order by
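The logical order above can be traced by hand with a small in-memory model. The sketch below is not how MySQL executes a query; it only walks the same steps in the same order for one hypothetical statement over made-up emp rows (there is no join or distinct in this example, so steps 2, 3, and 9 are skipped).

```python
from collections import defaultdict

# Hypothetical rows standing in for the emp table.
emp = [
    {"ename": "SMITH", "deptno": 20, "sal": 800},
    {"ename": "ALLEN", "deptno": 30, "sal": 1600},
    {"ename": "WARD", "deptno": 30, "sal": 1250},
    {"ename": "JONES", "deptno": 20, "sal": 2975},
    {"ename": "KING", "deptno": 10, "sal": 5000},
]


def run_query(rows):
    """Model of: select deptno, avg(sal) as avg_sal from emp where sal > 1000
    group by deptno having avg(sal) > 2000 order by avg_sal desc"""
    source = list(rows)                                    # (1) from
    kept_rows = [r for r in source if r["sal"] > 1000]     # (4) where
    groups = defaultdict(list)                             # (5) group by
    for r in kept_rows:
        groups[r["deptno"]].append(r["sal"])
    averages = {d: sum(s) / len(s) for d, s in groups.items()}       # (6) avg
    kept_groups = {d: a for d, a in averages.items() if a > 2000}    # (7) having
    projected = [{"deptno": d, "avg_sal": a} for d, a in kept_groups.items()]  # (8) select
    return sorted(projected, key=lambda r: r["avg_sal"], reverse=True)  # (10) order by


for row in run_query(emp):
    print(row)
```

Filtering with where before grouping is what keeps SMITH's salary out of the average for department 20, while having can only remove whole groups after the averages exist.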