ITPUB: HAVING COUNT 的妙用

帮忙分析一个sql

-- Create table
CREATE TABLE TMP_TB
(
TID VARCHAR2(10) NOT NULL,
ORDID VARCHAR2(10),
STATUS VARCHAR2(10)
);
-- Create/Recreate primary, unique and foreign key constraints
ALTER TABLE TMP_TB
ADD CONSTRAINT TMP_TB_PK PRIMARY KEY (TID)
USING INDEX;
INSERT INTO TMP_TB (TID, ORDID, STATUS) VALUES ('1', '100', 'O');
INSERT INTO TMP_TB (TID, ORDID, STATUS) VALUES ('2', '111', 'R');
INSERT INTO TMP_TB (TID, ORDID, STATUS) VALUES ('3', '100', 'R');
INSERT INTO TMP_TB (TID, ORDID, STATUS) VALUES ('4', '100', 'V');
INSERT INTO TMP_TB (TID, ORDID, STATUS) VALUES ('5', '111', 'L');
INSERT INTO TMP_TB (TID, ORDID, STATUS) VALUES ('6', '110', 'O');
最近看到一个sql,要求检索status是'R',但不能是'O'的ordid,根据上面的数据检索的结果:
ordid
111
我的两种写法:
写法1:
SELECT m.ordid
FROM (SELECT t.ordid,
t.status,
SUM(CASE
WHEN t.status = 'O' THEN
1
ELSE
0
END) OVER(PARTITION BY t.ordid) cnt
FROM tmp_tb t) m
WHERE m.status = 'R'
AND m.cnt = 0

実行計画
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=4 Card=6 Bytes=162
)
1 0 VIEW (Cost=4 Card=6 Bytes=162)
2 1 WINDOW (SORT) (Cost=4 Card=6 Bytes=84)
3 2 TABLE ACCESS (FULL) OF 'TMP_TB' (TABLE) (Cost=3 Card=6
Bytes=84)
統計
----------------------------------------------------------
4 recursive calls
0 db block gets
15 consistent gets
0 physical reads
0 redo size
433 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
写法2:直接用minus
SELECT ordid
FROM tmp_tb
WHERE status = 'R'
MINUS (SELECT ordid
FROM tmp_tb
WHERE status = 'O')

実行計画
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=8 Card=2 Bytes=56)
1 0 MINUS
2 1 SORT (UNIQUE) (Cost=4 Card=2 Bytes=28)
3 2 TABLE ACCESS (FULL) OF 'TMP_TB' (TABLE) (Cost=3 Card=2
Bytes=28)
4 1 SORT (UNIQUE) (Cost=4 Card=2 Bytes=28)
5 4 TABLE ACCESS (FULL) OF 'TMP_TB' (TABLE) (Cost=3 Card=2
Bytes=28)
統計
----------------------------------------------------------
7 recursive calls
0 db block gets
30 consistent gets
0 physical reads
0 redo size
433 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
我的问题:数据量小的时候,两种写法效率差不多,当数据量在几百万的时候,我的理解minus会排序,使用minus效率会下降,但是结果是两个sql效率还是差不多,什么原因呢?有没有更效率的写法呢?谢谢~~~

------------------------------------------------------------------------

两种写法不等价,除非你 (ordid,status) 是唯一的。
你在大数据量的情况下,MINUS的两个集合分别多大?两次都是全表扫描吗?
SELECT ordid
FROM TMP_TB
WHERE status IN ('R','O')
GROUP BY ordid
HAVING COUNT(CASE WHEN status = 'O' THEN 1 END)=0;

原文地址:https://www.cnblogs.com/tracy/p/2055659.html