Optimizing SQL when a correlated subquery contains an aggregate function

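Test data:
create table test1 as select * from dba_objects where rownum<=10000; -- 10,000 rows
create table test2 as select * from dba_objects; -- 13,438 rows
Note: the queries below read from a table named test (10 rows, per the execution plans), whose creation is not shown here.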

Analyzing the execution plans:
SQL1:
SQL> select *
  2    from test
  3   where object_id =
  4         (select max(object_id)
  5            from test1
  6           where test1.object_name = test.object_name);

10 rows selected.

Elapsed: 00:00:00.07

Execution Plan
----------------------------------------------------------
Plan hash value: 2637409915

-----------------------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |   961 |   194K|    43   (0)| 00:00:01 |
|*  1 |  FILTER             |       |       |       |            |          |
|   2 |   TABLE ACCESS FULL | TEST  |    10 |  2070 |     3   (0)| 00:00:01 |
|   3 |   SORT AGGREGATE    |       |     1 |    79 |            |          |
|*  4 |    TABLE ACCESS FULL| TEST1 |    96 |  7584 |    40   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID"= (SELECT MAX("OBJECT_ID") FROM "TEST1"
              "TEST1" WHERE "TEST1"."OBJECT_NAME"=:B1))
   4 - filter("TEST1"."OBJECT_NAME"=:B1)

Note
-----
   - dynamic sampling used for this statement (level=4)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1344  consistent gets
          0  physical reads
          0  redo size
       1710  bytes sent via SQL*Net to client
        415  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed
SQL2:
SQL> select *
  2    from test
  3   where exists (select 1
  4            from (select distinct object_name,
  5                                  max(object_id) over(partition by test1.object_name) object_id
  6                    from test1) t
  7           where t.object_name = test.object_name
  8             and test.object_id = t.object_id);

10 rows selected.

Elapsed: 00:00:00.06

Execution Plan
----------------------------------------------------------
Plan hash value: 918945524

---------------------------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 |   286 |       |   405   (1)| 00:00:05 |
|*  1 |  HASH JOIN SEMI       |       |     1 |   286 |       |   405   (1)| 00:00:05 |
|   2 |   TABLE ACCESS FULL   | TEST  |    10 |  2070 |       |     3   (0)| 00:00:01 |
|   3 |   VIEW                |       |  9606 |   741K|       |   401   (1)| 00:00:05 |
|   4 |    HASH UNIQUE        |       |  9606 |   741K|   848K|   401   (1)| 00:00:05 |
|   5 |     WINDOW SORT       |       |  9606 |   741K|   848K|   401   (1)| 00:00:05 |
|   6 |      TABLE ACCESS FULL| TEST1 |  9606 |   741K|       |    40   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T"."OBJECT_NAME"="TEST"."OBJECT_NAME" AND
              "TEST"."OBJECT_ID"="T"."OBJECT_ID")

Note
-----
   - dynamic sampling used for this statement (level=4)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        137  consistent gets
          0  physical reads
          0  redo size
       1710  bytes sent via SQL*Net to client
        415  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         10  rows processed

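The execution plans show the following:
SQL1: the FILTER operation decides how many times test1 is accessed from the number of rows returned by test, much like a nested loop. Note that test1 has no index on object_name in this test, so every probe is a full table scan. Logical reads are correspondingly high at 1344, which is roughly ten full scans of test1.
SQL2: the subquery is treated as a "table" and hash joined to test, so each table is accessed only once. Logical reads drop to 137.
The more rows test returns, the more pronounced the inefficiency of SQL1 becomes.
This is a problem of how the SQL is written, so statements of this shape deserve care.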
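
To make the difference in access pattern concrete, here is a toy model in Python. It is only a sketch: plain lists stand in for test and test1, the function names and data are invented for illustration, and it does not reproduce Oracle's actual FILTER or HASH JOIN implementation (for example, it ignores any caching of subquery results). It simply counts how many inner rows each plan shape has to visit.

```python
# Toy model of the two plan shapes; plain Python lists stand in for the tables.
# All names and data are hypothetical and are not Oracle internals.

def filter_style(outer, inner):
    """Correlated subquery: rescan `inner` once for every outer row."""
    visited = 0
    result = []
    for name, obj_id in outer:
        best = None
        for inner_name, inner_id in inner:      # full scan of the inner table
            visited += 1
            if inner_name == name and (best is None or inner_id > best):
                best = inner_id
        if best is not None and obj_id == best:
            result.append((name, obj_id))
    return result, visited


def hash_style(outer, inner):
    """Aggregate `inner` once, then probe the result for every outer row."""
    visited = 0
    max_id = {}
    for inner_name, inner_id in inner:          # single scan of the inner table
        visited += 1
        if inner_name not in max_id or inner_id > max_id[inner_name]:
            max_id[inner_name] = inner_id
    result = [(n, i) for n, i in outer if max_id.get(n) == i]
    return result, visited


# 10,000 inner rows spread over 5,000 names; the max id for OBJ<k> is 5000 + k.
inner = [(f"OBJ{i % 5000}", i) for i in range(10000)]
# 10 outer rows, each holding the max id for its name.
outer = [(f"OBJ{k}", 5000 + k) for k in range(10)]

rows_filter, visited_filter = filter_style(outer, inner)
rows_hash, visited_hash = hash_style(outer, inner)
print(visited_filter, visited_hash)
```

Both functions return the same 10 rows, but the first visits 100,000 inner rows and the second 10,000. That tenfold gap mirrors the ratio between 1344 and 137 consistent gets above.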
        
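Alternatively, turn the subquery into a "table" (an inline view) and join it to the main table test. Instead of the analytic function used in SQL2, the max aggregate is computed here with GROUP BY: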
select *
  from test
  join (select object_name, max(object_id) object_id
          from test1
         group by object_name) t
    on test.object_name = t.object_name
   and test.object_id = t.object_id;
This is equivalent to the rewrite above, except that select * here also returns the two columns of t.
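
The equivalence of the three forms can be checked on a small data set. The sketch below uses SQLite through Python's sqlite3 module purely as a convenient stand-in for Oracle (window functions need SQLite 3.25 or later). The two-column tables and their rows are made up for illustration, and the join selects test.* so that all three queries return the same columns.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table test  (object_name text, object_id integer);
    create table test1 (object_name text, object_id integer);
    insert into test1 values ('A', 1), ('A', 5), ('B', 2), ('B', 3), ('C', 7);
    insert into test  values ('A', 5), ('A', 1), ('B', 3), ('D', 9);
""")

# SQL1: correlated subquery with an aggregate.
correlated = """
    select * from test
     where object_id = (select max(object_id) from test1
                         where test1.object_name = test.object_name)
"""

# SQL2: exists against an inline view built with an analytic max.
exists_analytic = """
    select * from test
     where exists (select 1
                     from (select distinct object_name,
                                  max(object_id) over (partition by test1.object_name) as object_id
                             from test1) t
                    where t.object_name = test.object_name
                      and test.object_id = t.object_id)
"""

# Join against an inline view built with group by.
join_group_by = """
    select test.*
      from test
      join (select object_name, max(object_id) as object_id
              from test1
             group by object_name) t
        on test.object_name = t.object_name
       and test.object_id = t.object_id
"""

# Sort each result so the comparison does not depend on row order.
results = [sorted(conn.execute(q).fetchall())
           for q in (correlated, exists_analytic, join_group_by)]
print(results[0] == results[1] == results[2])
```

Only the rows of test that carry the largest object_id for their name in test1 survive. A name that is missing from test1, such as 'D' here, is dropped by all three forms.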