Oracle Index Problem Diagnosis and Optimization

I. Experiment
create table s1 as select * from SH.SALES;
create table s2 as select * from SH.SALES;

No index is created on s1.
An index is created on s2.
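
The statement that builds the index on s2 is not shown in these notes. A minimal sketch, assuming the index is on prod_id (the column filtered in the timing queries) and using a hypothetical index name, idx_s2_prod_id:

```sql
-- Assumption: the index covers prod_id; the name idx_s2_prod_id is illustrative only.
CREATE INDEX idx_s2_prod_id ON s2 (prod_id);

-- Refresh optimizer statistics so the cost-based optimizer can evaluate the new index.
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'S2', cascade => TRUE);
```

Gathering statistics is optional for the experiment, but it makes the comparison less dependent on whatever defaults the optimizer would otherwise assume.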

set timing on;
select * from s1 where prod_id=1;
2.45s
select * from s2 where prod_id=1;
0.59s

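With the index, the same query ran roughly four times faster (0.59s versus 2.45s), which shows how much an index matters for query speed.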

II. Index Performance Testing and Diagnosis

1. View the index information in the database:
SELECT A.OWNER, A.TABLE_OWNER, A.TABLE_NAME, A.INDEX_NAME, A.INDEX_TYPE,
   B.COLUMN_POSITION, B.COLUMN_NAME,
   C.TABLESPACE_NAME AS TABLE_TABLESPACE,
   A.TABLESPACE_NAME AS INDEX_TABLESPACE, A.UNIQUENESS
FROM DBA_INDEXES A, DBA_IND_COLUMNS B, DBA_TABLES C
WHERE A.OWNER = UPPER ('hr')
AND A.OWNER = B.INDEX_OWNER
-- join to the table through TABLE_OWNER: an index may live in a different schema than its table
AND A.TABLE_OWNER = C.OWNER
AND A.TABLE_NAME LIKE UPPER ('DEPARTMENTS')
AND A.TABLE_NAME = B.TABLE_NAME
AND A.TABLE_NAME = C.TABLE_NAME
AND A.INDEX_NAME = B.INDEX_NAME
ORDER BY A.OWNER, A.TABLE_OWNER, A.TABLE_NAME, A.INDEX_NAME, B.COLUMN_POSITION;

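2. Find tables that have no index:
SELECT OWNER, TABLE_NAME
FROM ALL_TABLES
WHERE OWNER NOT IN ('SYS','SYSTEM','OUTLN','DBSNMP') AND OWNER = UPPER ('scott')
MINUS
-- compare on TABLE_OWNER, not on the index OWNER, so the two sides describe the same tables
SELECT TABLE_OWNER, TABLE_NAME
FROM ALL_INDEXES
WHERE TABLE_OWNER NOT IN ('SYS','SYSTEM','OUTLN','DBSNMP');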

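3. Find tables that have too many indexes:
SELECT TABLE_OWNER, TABLE_NAME, COUNT (*) "count"
FROM ALL_INDEXES
WHERE TABLE_OWNER NOT IN ('SYS','SYSTEM','OUTLN','DBSNMP') AND TABLE_OWNER = UPPER ('hr')
GROUP BY TABLE_OWNER, TABLE_NAME
-- more than 5 indexes, the guideline stated below
HAVING COUNT (*) > 5;

A table can have hundreds of indexes, but on a table with frequent inserts and updates, every extra index adds CPU and I/O load. As a guideline, keep each table to no more than 5 indexes.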

Experiment:
create table table1 as select * from SH.SALES;
create table table2 as select * from SH.SALES;

table1 has an index on the prod_id column only.
table2 has an index on every column.
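
The index definitions for this experiment are not shown. The following sketch is one way to set them up, assuming the standard SH.SALES column layout and using hypothetical index names:

```sql
-- table1: a single index on prod_id (all index names here are illustrative).
CREATE INDEX idx_t1_prod_id ON table1 (prod_id);

-- table2: one single-column index per column of the standard SH.SALES layout.
CREATE INDEX idx_t2_prod_id       ON table2 (prod_id);
CREATE INDEX idx_t2_cust_id       ON table2 (cust_id);
CREATE INDEX idx_t2_time_id       ON table2 (time_id);
CREATE INDEX idx_t2_channel_id    ON table2 (channel_id);
CREATE INDEX idx_t2_promo_id      ON table2 (promo_id);
CREATE INDEX idx_t2_quantity_sold ON table2 (quantity_sold);
CREATE INDEX idx_t2_amount_sold   ON table2 (amount_sold);
```

If "an index on every column" instead meant one wide composite index, the maintenance cost measured in the experiment would differ.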

SELECT count(*) FROM table1 where prod_id=30;
29282

set timing on;
update table1 set cust_id=1 where prod_id=30;
10.56s
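update table2 set cust_id=1 where prod_id=30;
11.35s

The update on table2 is slower because its cust_id index must be maintained as well. The gap is modest here since the statement changes only cust_id; an INSERT or DELETE would have to maintain every index on the table.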

4. Find the SID and SQL of full table scans
A full table scan occurs when every block is read from a table. Full table scans are often a preferred performance option in batch-style applications, such as decision support. We have seen some excellent run-time improvements in decision support systems that use the parallel query option, which relies on full table scans to operate. However, full table scans at an OLTP site during prime online usage times can create havoc with response times. Full table scans, even on small tables, can degrade response times, particularly when the small table drives the query and a full scan is not the most efficient access path.

The following query reports how many full table scans are taking place:
SELECT name, value
FROM v$sysstat
WHERE name LIKE '%table %'
ORDER BY name;

The values relating to the full table scans are:
table scans (long tables) - a scan of a table that has more than five database blocks
table scans (short tables) - a count of full table scans with five or fewer blocks
If the number of long table scans is significant, there is a strong possibility that SQL statements in your application need tuning or indexes need to be added.

To get an appreciation of how many rows and blocks are being accessed on average for the long full table scans, use this calculation (the sample data comes from an OLTP application):

Average Long Table Scan Blocks
= (table scan blocks gotten - (short table scans * 5))
/ long table scans
= (3,540,450 - (160,618 * 5)) / 661
= (3,540,450 - (803,090)) / 661
= 4,141 blocks read per full table scan

In our example, 4141 average disk reads performed on an OLTP application 661 times in the space of a few short hours is not a healthy situation.
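
The same calculation can be run directly against v$sysstat. This is a sketch rather than part of the original script; it assumes the three statistic names used in the formula above and the same five-block estimate for short tables:

```sql
-- Average blocks read per long full table scan, following the formula above.
-- NULLIF avoids a divide-by-zero error when no long scans have been recorded.
SELECT ROUND (
         (  MAX (DECODE (name, 'table scan blocks gotten', value))
          - MAX (DECODE (name, 'table scans (short tables)', value)) * 5)
         / NULLIF (MAX (DECODE (name, 'table scans (long tables)', value)), 0)
       ) AS avg_long_scan_blocks
FROM v$sysstat
WHERE name IN ('table scan blocks gotten',
               'table scans (short tables)',
               'table scans (long tables)');
```

The values in v$sysstat are cumulative since instance startup, so the result is an average over that whole period, not over a recent window.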

If you can identify the users who are experiencing the full table scans, you can find out what they were running to cause these scans. Below is a script that allows you to do this:

REM FILE NAME: fullscan.sql
REM LOCATION: Database Tuning\File I/O Reports
REM FUNCTION: Identifies users of full table scans
REM TESTED ON: 7.3.3.5, 8.0.4.1, 8.1.5, 8.1.7, 9.0.1, 9.2.0.2
REM PLATFORM: non-specific
REM REQUIRES: v$session, v$sesstat, v$statname
REM This view is used by the fscanavg.sql script
REM
REM This is a part of the Knowledge Xpert for Oracle Administration
REM library.
REM Copyright (C) 2001 Quest Software
REM All rights reserved.
REM
REM************ Knowledge Xpert for Oracle Administration *************

DROP VIEW full_table_scans;

CREATE VIEW full_table_scans
AS
   SELECT      ss.username
            || '('
            || se.sid
            || ') ' "User Process",
SUM (DECODE (NAME, 'table scans (short tables)', VALUE)) "Short Scans",
SUM (DECODE (NAME, 'table scans (long tables)', VALUE)) "Long Scans",
SUM (DECODE (NAME, 'table scan rows gotten', VALUE)) "Rows Retrieved"
          FROM v$session ss, v$sesstat se, v$statname sn
         WHERE se.statistic# = sn.statistic#
           AND ( NAME LIKE '%table scans (short tables)%'
               OR NAME LIKE '%table scans (long tables)%'
               OR NAME LIKE '%table scan rows gotten%'
               )
           AND se.sid = ss.sid
           AND ss.username IS NOT NULL
      GROUP BY ss.username
               || '('
               || se.sid
               || ') ';

COLUMN "User Process" FORMAT a20;
COLUMN "Long Scans" FORMAT 999,999,999;
COLUMN "Short Scans" FORMAT 999,999,999;
COLUMN "Rows Retrieved" FORMAT 999,999,999;
COLUMN "Average Long Scan Length" FORMAT 999,999,999;
TTITLE ' Table Access Activity By User '
SELECT "User Process", "Long Scans", "Short Scans", "Rows Retrieved"
FROM full_table_scans
ORDER BY "Long Scans" DESC;

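Find the SQL statements that may be causing full table scans (136 and 135 are example SIDs; substitute the ones reported by the script above):
select s.sid, q.sql_text
  from v$session s, v$sql q
 where s.sid in (136, 135)
   and (q.sql_id = s.sql_id or q.sql_id = s.prev_sql_id);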
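
The session-based query only shows what a session is running now or ran last. As a complementary sketch, assuming a release where v$sql_plan exposes sql_id (10g or later), cached statements whose plans contain a full table scan can be listed directly:

```sql
-- Cached statements whose execution plan includes TABLE ACCESS FULL,
-- excluding the same system schemas filtered out elsewhere in these notes.
SELECT DISTINCT p.object_owner, p.object_name, q.sql_id, q.sql_text
FROM v$sql_plan p, v$sql q
WHERE p.sql_id = q.sql_id
  AND p.child_number = q.child_number
  AND p.operation = 'TABLE ACCESS'
  AND p.options = 'FULL'
  AND p.object_owner NOT IN ('SYS', 'SYSTEM', 'OUTLN', 'DBSNMP')
ORDER BY p.object_owner, p.object_name;
```

This only covers statements still in the shared pool, so statements that have aged out will not appear.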

The Active Analysis feature of Knowledge Xpert for Oracle Administration can help with index analysis.
Indexes exist primarily to improve the performance of SQL statements. In many cases, establishing good indexes is the best path to optimal performance.
Active Analysis - Show indexes by owner & table
Active Analysis - Show index statistics
Active Analysis - Show tables without indexes
Active Analysis - Show tables with excessive indexes
Active Analysis - Show similar indexes
Active Analysis - Show foreign keys missing indexes
Active Analysis - Show partitioned indexes
Knowing When to Rebuild Indexes

III. Index Tuning Recommendations:
1. Index design optimization
The way columns are indexed affects their efficiency. The order in which columns are specified should reflect the way a SELECT will retrieve them. The first column should be the one that is accessed most often.

Oracle recommends that you do not explicitly define UNIQUE indexes on tables (CREATE UNIQUE INDEX). Uniqueness is strictly a logical concept and should be associated with the definition of a table, so it is better to declare a UNIQUE integrity constraint on the desired columns. When the constraint is created, Oracle creates a unique index to enforce it; by default that index takes the name of the constraint.
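
A short illustration of declaring uniqueness as a constraint instead of using CREATE UNIQUE INDEX. The table dept_demo and the constraint uk_dept_demo_name are hypothetical names chosen for this sketch:

```sql
-- Hypothetical table and constraint names, for illustration only.
CREATE TABLE dept_demo (
  dept_id   NUMBER       NOT NULL,
  dept_name VARCHAR2(30) NOT NULL
);

-- Declare uniqueness as a constraint; Oracle builds the supporting index itself.
ALTER TABLE dept_demo
  ADD CONSTRAINT uk_dept_demo_name UNIQUE (dept_name);

-- The supporting index should be listed here, named after the constraint by default.
SELECT index_name, uniqueness
FROM user_indexes
WHERE table_name = 'DEPT_DEMO';
```

Keeping the rule in the table definition means the uniqueness requirement stays visible in the data dictionary as a constraint and not only as a side effect of an index.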

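When several columns are frequently used together to retrieve rows, a composite index is more effective than separate single-column indexes.
Put the most frequently used column first. For example, with the index dx_groupid_serv_id(groupid,serv_id), a query whose WHERE clause uses groupid, or groupid and serv_id, can use the index; a query that uses only serv_id cannot use it for a range scan.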

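Experiment (a composite index is more effective than a single-column index):
create table table3 as select * from SH.SALES;
Create a composite index on prod_id and cust_id on table3.

create table table4 as select * from SH.SALES;
Create an index on prod_id on table4.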
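
The index statements themselves are not shown. A sketch using the index names that appear in the execution plans in this document; the column order (prod_id, cust_id) is an assumption, chosen because a query on prod_id alone gets a range scan on this index:

```sql
-- Composite index on table3; prod_id is assumed to be the leading column.
CREATE INDEX idx_prod_id_cust_id ON table3 (prod_id, cust_id);

-- Single-column index on table4.
CREATE INDEX idx1_table4 ON table4 (prod_id);
```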

set autotrace on;
SQL> select count(*) from table3 where prod_id = 30 or cust_id = 25939;
COUNT(*)
----------
     29553
Elapsed: 00:00:01.71
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=528 Card=1 Bytes=2
          6)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (FAST FULL SCAN) OF 'IDX_PROD_ID_CUST_ID' (INDEX)
          (Cost=528 Card=422 Bytes=10972)
Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
       2407 consistent gets
       2318 physical reads
          0 redo size
        395 bytes sent via SQL*Net to client
        512 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed
SQL> select count(*) from table4 where prod_id = 30 or cust_id = 25939;
COUNT(*)
----------
     29553
Elapsed: 00:00:02.12
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=995 Card=1 Bytes=2
          6)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'TABLE4' (TABLE) (Cost=995 Card=3
          5494 Bytes=922844)
Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
       4435 consistent gets
       4246 physical reads
          0 redo size
        395 bytes sent via SQL*Net to client
        512 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

Experiment (how a composite index is used in the WHERE clause):
set autotrace on;

Using prod_id only:
SQL> select count(*) from table3 where prod_id=30;
COUNT(*)
----------
     29282
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=11 Card=1 Bytes=13
          )
   1    0   SORT (AGGREGATE)
   2    1     INDEX (RANGE SCAN) OF 'IDX_PROD_ID_CUST_ID' (INDEX) (Cos
          t=11 Card=2500 Bytes=32500)
Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         78 consistent gets
          0 physical reads
          0 redo size
        395 bytes sent via SQL*Net to client
        512 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

Using prod_id and cust_id together:
SQL> select count(*) from table3 where prod_id = 30 and cust_id = 25939;
COUNT(*)
----------
        12
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3 Card=1 Bytes=26)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (RANGE SCAN) OF 'IDX_PROD_ID_CUST_ID' (INDEX) (Cos
          t=3 Card=12 Bytes=312)
Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          3 consistent gets
          0 physical reads
          0 redo size
        393 bytes sent via SQL*Net to client
        512 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

Using cust_id only:
SQL> select count(*) from table3 where cust_id = 25939;
COUNT(*)
----------
       283
Elapsed: 00:00:01.64
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=528 Card=1 Bytes=1
          3)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (FAST FULL SCAN) OF 'IDX_PROD_ID_CUST_ID' (INDEX)
          (Cost=528 Card=422 Bytes=5486)
Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
       2407 consistent gets
       2318 physical reads
          0 redo size
        394 bytes sent via SQL*Net to client
        512 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

The four types of Oracle index scan:
http://blog.csdn.net/tianlesoftware/archive/2010/08/31/5852106.aspx

2. Index usage optimization
Avoid accidental table scans
One of the most fundamental SQL tuning problems is the accidental table scan. Accidental table scans usually occur when the SQL programmer tries to perform a search on an indexed column that can’t be supported by an index. This can occur when:

Using != (not equals to). Even if the not equals condition satisfies only a small number of rows, Oracle does not use an index to satisfy such a condition. Often, you can re-code these queries using > or IN conditions, which can be supported by index lookups.

Searching for NULLs. Oracle won’t use an index to find null values, since null values are not usually stored in an index (the exception is a concatenated index entry where only some of the values are NULL). If you’re planning to search for values that are logically missing, consider changing the column to NOT NULL with a DEFAULT clause. For example, you could set a default value of UNKNOWN and use the index to find these values. Interestingly, recent versions of Oracle can use an index to find values that are NOT NULL, if the cost-based optimizer determines that such an approach is cost-effective.

Using functions on indexed columns. Any function or operation on an indexed column prevents Oracle from using a regular index on that column. For instance, Oracle can’t use an index to find SUBSTR(SURNAME,1,4)='SMIT'. Instead of manipulating the column, try to manipulate the search condition. In the previous example, a better formulation would be SURNAME LIKE 'SMIT%'.
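
The NOT NULL with DEFAULT suggestion above can be sketched as follows. The table customers_demo, the column status, and the index name are hypothetical, and status is assumed to be a character column wide enough to hold 'UNKNOWN':

```sql
-- Backfill existing NULLs first, otherwise the NOT NULL change fails.
UPDATE customers_demo SET status = 'UNKNOWN' WHERE status IS NULL;
COMMIT;

-- New rows now get 'UNKNOWN' instead of NULL when no status is supplied.
ALTER TABLE customers_demo MODIFY (status DEFAULT 'UNKNOWN' NOT NULL);

-- An ordinary B-tree index now contains an entry for every row.
CREATE INDEX idx_customers_demo_status ON customers_demo (status);

-- "Logically missing" rows can be found with an equality predicate
-- that the index can support, replacing the former IS NULL search.
SELECT COUNT(*) FROM customers_demo WHERE status = 'UNKNOWN';
```

Whether the optimizer actually picks the index still depends on how selective 'UNKNOWN' is; if most rows carry the default value, a full scan may remain the cheaper plan.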

Experiment (use != with caution):

With !=, the optimizer uses a Fast Full Scan:
SQL> select count(*) from table4 where prod_id != 30;
COUNT(*)
----------
    889561
Elapsed: 00:00:00.79
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=409 Card=1 Bytes=1
          3)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (FAST FULL SCAN) OF 'IDX1_TABLE4' (INDEX) (Cost=40
          9 Card=840758 Bytes=10929854)

Whereas <, > and IN use a Range Scan:
SQL> select count(*) from table4 where prod_id < 30;
COUNT(*)
----------
    182690
Elapsed: 00:00:00.04
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=284 Card=1 Bytes=1
          3)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (RANGE SCAN) OF 'IDX1_TABLE4' (INDEX) (Cost=284 Ca
          rd=131223 Bytes=1705899)

Experiment (use functions with caution):
create table emp as select * from scott.emp;
Create a separate index on each of the empno and ename columns of emp.
SQL> select Count(*) from emp where substr(ename,1,4) = 'smit';
COUNT(*)
----------
         0
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1 Bytes=7)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=2 Card=1 Byte
          s=7)

SQL> select Count(*) from emp where ename like 'smit%';
COUNT(*)
----------
         0
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=0 Card=1 Bytes=7)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (RANGE SCAN) OF 'IDX2_EMP' (INDEX)