求学生单科流水表中单科最近/最新的考试成绩表的四种方案(解释计划分析篇)

在拙文 https://www.cnblogs.com/xiandedanteng/p/12327809.html 中,我提到过求学生单科最新成绩的四种sql语句,它们是下面四种:

--Rank--
select * from (select tb_scoreflow.*,rank() over(partition by stuid,sbjid order by cdate desc) as seq from tb_scoreflow) a where a.seq=1

--Left join--
SELECT  a.* from tb_scoreflow a left JOIN tb_scoreflow b on a.stuid = b.stuid and a.sbjid = b.sbjid and b.cdate > a.cdate where b.cdate IS NULL order by a.stuid,a.sbjid

--group by--
select a.* from tb_scoreflow a , (select stuid,sbjid,max(cdate) as cdate from tb_scoreflow group by stuid,sbjid) b where a.stuid=b.stuid and a.sbjid=b.sbjid and a.cdate=b.cdate order by a.stuid,a.sbjid

--Exist anti join--
select a.* from tb_scoreflow a where not exists ( select null from tb_scoreflow b where b.stuid=a.stuid and b.sbjid=a.sbjid and b.cdate>a.cdate) order by a.stuid,a.sbjid

如果让Oracle解释计划来评价这四种SQL,哪一种会得到较高的评价呢,让我们看看:

首先是采用分析函数Rank方案的:

SQL> select * from (select tb_scoreflow.*,rank() over(partition by stuid,sbjid order by cdate desc) as seq from tb_scoreflow) a where a.seq=1;
已用时间:  00: 00: 00.04

执行计划
----------------------------------------------------------
Plan hash value: 2391432598

-----------------------------------------------------------------------------------------
| Id  | Operation                | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |              |    50 |  3700 |     4  (25)| 00:00:01 |
|*  1 |  VIEW                    |              |    50 |  3700 |     4  (25)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|              |    50 |  1000 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | TB_SCOREFLOW |    50 |  1000 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"."SEQ"=1)
   2 - filter(RANK() OVER ( PARTITION BY "STUID","SBJID" ORDER BY
              INTERNAL_FUNCTION("CDATE") DESC )<=1)

左联结的:

SQL> SELECT  a.* from tb_scoreflow a left JOIN tb_scoreflow b on a.stuid = b.stuid and a.sbjid = b.sbjid and b.cdate > a.cdate where b.cdate IS NULL order by a.stuid,a.sbjid;
已用时间:  00: 00: 00.01

执行计划
----------------------------------------------------------
Plan hash value: 2451855310

-------------------------------------------------------------------------------------
| Id  | Operation            | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |              |     1 |    34 |     8  (25)| 00:00:01 |
|   1 |  SORT ORDER BY       |              |     1 |    34 |     8  (25)| 00:00:01 |
|*  2 |   FILTER             |              |       |       |            |          |
|*  3 |    HASH JOIN OUTER   |              |     1 |    34 |     7  (15)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| TB_SCOREFLOW |    50 |  1000 |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| TB_SCOREFLOW |    50 |   700 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("B"."CDATE" IS NULL)
   3 - access("A"."SBJID"="B"."SBJID"(+) AND "A"."STUID"="B"."STUID"(+))
       filter("B"."CDATE"(+)>"A"."CDATE")

group by的:

SQL> select a.* from tb_scoreflow a , (select stuid,sbjid,max(cdate) as cdate from tb_scoreflow group by stuid,sbjid) b where a.stuid=b.stuid and a.sbjid=b.sbjid and a.cdate=b.cdate order by a.stuid,a.sbjid;
已用时间:  00: 00: 00.00

执行计划
----------------------------------------------------------
Plan hash value: 3190377201

-------------------------------------------------------------------------------------
| Id  | Operation            | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |              |     1 |    34 |     8  (25)| 00:00:01 |
|*  1 |  FILTER              |              |       |       |            |          |
|   2 |   SORT GROUP BY      |              |     1 |    34 |     8  (25)| 00:00:01 |
|*  3 |    HASH JOIN         |              |    50 |  1700 |     7  (15)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| TB_SCOREFLOW |    50 |  1000 |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| TB_SCOREFLOW |    50 |   700 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"."CDATE"=MAX("CDATE"))
   3 - access("A"."STUID"="STUID" AND "A"."SBJID"="SBJID")

反联结的:

SQL> select a.* from tb_scoreflow a where not exists ( select null from tb_scoreflow b where b.stuid=a.stuid and b.sbjid=a.sbjid and b.cdate>a.cdate) order by a.stuid,a.sbjid;
已用时间:  00: 00: 00.00

执行计划
----------------------------------------------------------
Plan hash value: 3319093165

------------------------------------------------------------------------------------
| Id  | Operation           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |              |     1 |    34 |     8  (25)| 00:00:01 |
|   1 |  SORT ORDER BY      |              |     1 |    34 |     8  (25)| 00:00:01 |
|*  2 |   HASH JOIN ANTI    |              |     1 |    34 |     7  (15)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| TB_SCOREFLOW |    50 |  1000 |     3   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| TB_SCOREFLOW |    50 |   700 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("B"."STUID"="A"."STUID" AND "B"."SBJID"="A"."SBJID")
       filter("B"."CDATE">"A"."CDATE")

从最重要的Cost指标来看,Rank方案以4胜出,其它三种方案以8不分伯仲。

但是,这是在目前50条记录,且Oracle假定数据均衡下做出的判断,在真实数据下跑产生的速度差异可能和上面的结论大相径庭。

下面是tb_scoreflow表的信息:

SQL> desc tb_scoreflow;
 名称                                                  是否为空? 类型
 ----------------------------------------------------- -------- ------------------------------------
 ID                                                    NOT NULL NUMBER(6)
 STUID                                                 NOT NULL NUMBER(6)
 SBJID                                                 NOT NULL NUMBER(6)
 SCORE                                                 NOT NULL NUMBER(3)
 CDATE                                                          DATE

SQL> select count(*) from tb_scoreflow;

  COUNT(*)
----------
        50

已用时间:  00: 00: 00.04

表格和数据的产生请参考:https://www.cnblogs.com/xiandedanteng/p/12327809.html

--2020年3月15日--

原文地址:https://www.cnblogs.com/heyang78/p/12495883.html