oracle 优化之组合索引

组合索引适用场景：

1.适用在单独查询返回记录很多，组合查询后忽然返回记录很少的情况:

比如where 学历=硕士以上返回不少的记录
比如where 职业=收银员同样返回不少的记录
于是无论哪个条件查询做索引，都不合适。
可是，如果学历为硕士以上，同时职业又是收银员的，返回的就少之又少了。
于是联合索引就可以这么开始建了。

2.组合查询的组合顺序，要考虑单独的前缀查询情况（否则单独前缀查询的索引不能生效或者只能用到跳跃索引）
比如你在建id,object_type的联合索引时，要看考虑是单独where id=xxx查询的多，还是单独where object_type查询的多。
这里细节就暂时略去了，在案例的部分中还有描述

结论就是：

组合索引过滤条件跟顺序有关系（当没有严格按照过略条件建索引时，某些情况下还是会利用组合索引，因为成本低）

3.仅等值无范围查询时，组合索引顺序不影响性能（比如where col1=xxx and col2=xxx，无论COL1+COL2组合还是COL2+COL1组合）

drop table t purge;
create table t as select * from dba_objects;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
update t set object_id=rownum ;
commit;

create index idx_id_type on t(object_id,object_type);
create index idx_type_id on t(object_type,object_id);
set autotrace off
alter session set statistics_level=all ;
set linesize 200

select /*+index(t,idx_id_type)*/ * from t where object_id=20 and object_type='TABLE';
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 7qydm6x641kx4, child number 1
-------------------------------------
select /*+index(t,idx_id_type)*/ * from t where object_id=20 and
object_type='TABLE'

Plan hash value: 1470938839

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 1 | 1 |00:00:00.01 | 5 |
|* 2 | INDEX RANGE SCAN | IDX_ID_TYPE | 1 | 1 | 1 |00:00:00.01 | 4 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("OBJECT_ID"=20 AND "OBJECT_TYPE"='TABLE')

Note
-----

24 rows selected.

可以看出走的是索引范围扫描，并且回表。

过略的条件不同，走的执行计划是不是也不一样？

当不用hints,不同的过略条件是不是走的执行计划不一样？

SQL> select * from t where object_id=20 and object_type='TABLE';

OWNER OBJECT_NAME SUBOBJECT_NAME
------------------------------ -------------------------------------------------------------------------------------------------------------------------------- ------------------------------
OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE CREATED LAST_DDL_ TIMESTAMP STATUS T G S NAMESPACE EDITION_NAME
---------- -------------- ------------------- --------- --------- ------------------- ------- - - - ---------- ------------------------------
SYS IND$
20 2 TABLE 24-AUG-13 24-AUG-13 2013-08-24:11:37:35 VALID N N N 1

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID ab3ng9dr3vkfw, child number 1
-------------------------------------
select * from t where object_id=20 and object_type='TABLE'

Plan hash value: 1470938839

-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 5 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 1 | 1 |00:00:00.01 | 5 |
|* 2 | INDEX RANGE SCAN | IDX_ID_TYPE | 1 | 1 | 1 |00:00:00.01 | 4 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("OBJECT_ID"=20 AND "OBJECT_TYPE"='TABLE')

Note
-----
- cardinality feedback used for this statement

23 rows selected.

很明显，两个过滤条件顺序不一样，执行计划走的索引也不一样，不过后面都是索引范围扫面在通过行id 得出结果。

删除索引后，然后执行不同的过滤条件，结果有什么不一样？

SQL> drop index IDX_ID_TYPE;

Index dropped.

SQL>
SQL>
SQL>
SQL> select * from t where object_id=20 and object_type='TABLE';

SQL>
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID ab3ng9dr3vkfw, child number 1
-------------------------------------
select * from t where object_id=20 and object_type='TABLE'

Plan hash value: 3420768628

-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 5 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 1 | 1 |00:00:00.01 | 5 |
|* 2 | INDEX RANGE SCAN | IDX_TYPE_ID | 1 | 1 | 1 |00:00:00.01 | 4 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("OBJECT_TYPE"='TABLE' AND "OBJECT_ID"=20)

Note
-----
- dynamic sampling used for this statement (level=2)

23 rows selected.

发现也会走索引扫面，只是用的另外一个顺序不一样的索引（因为cbo成本比全表扫描低，还是会走另一个成本低的组合索引）

--4.组合索引最佳顺序一般是将列等值查询的列置前。

（测试组合索引在条件是不等的情况下的情况，条件经常是不等的，要放在后面，让等值的在前面）

Execution Plan
----------------------------------------------------------
Plan hash value: 1470938839

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 16 | 3312 | 52 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 16 | 3312 | 52 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_ID_TYPE | 50 | | 51 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("OBJECT_ID">=20 AND "OBJECT_TYPE"='TABLE' AND "OBJECT_ID"<2000)
filter("OBJECT_TYPE"='TABLE')

Note
-----
- dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
89 consistent gets
0 physical reads
0 redo size
51596 bytes sent via SQL*Net to client
875 bytes received via SQL*Net from client
34 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
493 rows processed

通过索引快速扫面获得行ID进行回表。

5.注意组合索引与组合条件中关于IN 的优化

案例1

UPDATE t SET OBJECT_ID=20 WHERE ROWNUM<=26000;
UPDATE t SET OBJECT_ID=21 WHERE OBJECT_ID<>20;
COMMIT;
set linesize 1000
set pagesize 1
alter session set statistics_level=all ;
select /*+index(t,idx1_object_id)*/ * from t where object_TYPE='TABLE' AND OBJECT_ID >= 20 AND OBJECT_ID<= 21;

Execution Plan
----------------------------------------------------------
Plan hash value: 3420768628

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20755 | 4195K| 1186 (1)| 00:00:15 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 20755 | 4195K| 1186 (1)| 00:00:15 |
|* 2 | INDEX RANGE SCAN | IDX_TYPE_ID | 20755 | | 218 (0)| 00:00:03 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("OBJECT_TYPE"='TABLE' AND "OBJECT_ID">=20 AND "OBJECT_ID"<=21)

Note
-----
- dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
5059 consistent gets
0 physical reads
0 redo size
1191984 bytes sent via SQL*Net to client
18464 bytes received via SQL*Net from client
1633 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
24472 rows processed

select /*+index(t,idx1_object_id)*/ * from t where object_TYPE='TABLE' AND OBJECT_ID in (20,21);

以上两个语句：

select /*+index(t,idx1_object_id)*/ * from t where object_TYPE='TABLE' AND OBJECT_ID >= 20 AND OBJECT_ID<= 21; --CPU cost消耗1186

select /*+index(t,idx1_object_id)*/ * from t where object_TYPE='TABLE' AND OBJECT_ID in (20,21); -- CPU cost消耗仅仅为5

为何消耗的CPU成本差距这么大？

在人为的思考中，我们人为认为 (OBJECT_ID >= 20 AND OBJECT_ID<= 21) = in (20,21),而其实oracle 不这么认为in (20,21) 只有两个值，而(OBJECT_ID >= 20 AND OBJECT_ID<= 21)走索引中间有无穷个键值。所以第二句消耗的CPU COST仅仅为5。

6.依然是关于IN的优化 (col1,col2,col3的索引情况，如果没有为COL2赋予查询条件时，COL3只能起到检验作用)

drop table t purge;
create table t as select * from dba_objects;
UPDATE t SET OBJECT_ID=20 WHERE ROWNUM<=26000;
UPDATE t SET OBJECT_ID=21 WHERE OBJECT_ID<>20;
Update t set object_id=22 where rownum<=10000;
COMMIT;

create index idx_union on t(object_type,object_id,owner);
set autotrace traceonly

select * from t where object_type='VIEW' and OWNER='SYS';
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

Execution Plan
----------------------------------------------------------
Plan hash value: 1570829420

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3369 | 681K| 20 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 3369 | 681K| 20 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_UNION | 14 | | 19 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("OBJECT_TYPE"='VIEW' AND "OWNER"='SYS')
filter("OWNER"='SYS')

Note
-----
- dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
686 consistent gets
0 physical reads
0 redo size
157650 bytes sent via SQL*Net to client
3405 bytes received via SQL*Net from client
264 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3938 rows processed

SQL>

23 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3713220770

----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8168 | 16336 | 29 (0)| 00:00:01 |
| 1 | COLLECTION ITERATOR PICKLER FETCH| DISPLAY_CURSOR | 8168 | 16336 | 29 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
27 recursive calls
0 db block gets
136 consistent gets
0 physical reads
0 redo size
1925 bytes sent via SQL*Net to client
534 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
23 rows processed

select /*+INDEX(T,idx_union)*/ * from t T where object_type='VIEW' and OBJECT_ID IN (20,21,22) AND OWNER='SYS';
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

Execution Plan
----------------------------------------------------------
Plan hash value: 306189815

------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3369 | 681K| 6 (0)| 00:00:01 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T | 3369 | 681K| 6 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IDX_UNION | 1 | | 5 (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("OBJECT_TYPE"='VIEW' AND ("OBJECT_ID"=20 OR "OBJECT_ID"=21 OR
"OBJECT_ID"=22) AND "OWNER"='SYS')

Note
-----
- dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
687 consistent gets
0 physical reads
0 redo size
157650 bytes sent via SQL*Net to client
3405 bytes received via SQL*Net from client
264 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3938 rows processed

SQL>
23 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3713220770

Statistics
----------------------------------------------------------
15 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
1862 bytes sent via SQL*Net to client
534 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
23 rows processed

可以看出，select * from t where object_type='VIEW' and OWNER='SYS'; 这一条语句跟select /*+INDEX(T,idx_union)*/ * from t T where object_type='VIEW' and OBJECT_ID IN (20,21,22) AND OWNER='SYS';这一条语句代价等价，因为此案例中object_type='VIEW' 且OBJECT_ID 只有20,21,22 这三条记录，所以如果没有为COL2赋予查询条件时，COL3只能起到检验作用。