一千六百万单表建联合索引对查询效率的提升

表结构：

create table hy_emp(
    id integer,
    name nvarchar2(20) not null,
    age integer not null,
    salary integer not null,
    cdate date not null)

注意这里没有设定主键，目的是插值时提高效率。

插入基础值：

insert into hy_emp
select 1,dbms_random.string('*',dbms_random.value(1,20)),dbms_random.value(18,80),dbms_random.value(1,100000),sysdate
from dual
connect by level<1000001
order by dbms_random.random;

现在表中有了一百万数据，现在重复执行以下语句四次，表中就有了1600万数据：

insert into hy_emp select * from hy_emp;

然后给id，cdate设上值，由于需要一条条设值，这一步比较，需要十多分钟。

update hy_emp set id=rownum , cdate=to_date('2000-07-02','yyyy-MM-dd')+rownum/100;

最后给表设上主键：

alter table hy_emp add constraint hy_emp_pk primary key (id) enable validate;

先查查下面SQL的cost：

EXPLAIN PLAN FOR
select name from hy_emp where name like 'A%' and age>65 and salary>20000
select * from table(dbms_xplan.display);

结果：

Plan hash value: 910676026
 
----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |   113K|  5307K| 27763   (1)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| HY_EMP |   113K|  5307K| 27763   (1)| 00:00:02 |
----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("AGE">65 AND "SALARY">20000 AND "NAME" LIKE U'A%')
 
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

感觉耗时有点长，给条件和显示列的字段加上索引看看。

create index idx_emp_name_age_sal on hy_emp(name,age,salary);

再查一遍：

EXPLAIN PLAN FOR
select name from hy_emp where name like 'A%' and age>65 and salary>20000
select * from table(dbms_xplan.display);

结果：

-----------------------------------------------------------------------------------------
| Id  | Operation        | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                      |   113K|  5307K|  3340   (1)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| IDX_EMP_NAME_AGE_SAL |   113K|  5307K|  3340   (1)| 00:00:01 |
-----------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("NAME" LIKE U'A%' AND "AGE">65 AND "SALARY">20000)
       filter("AGE">65 AND "SALARY">20000 AND "NAME" LIKE U'A%')
 
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

Cost直接缩减为原有方案的一成，效果不错。

--2020-04-03--

以上用到的全部SQL：

create table hy_emp(
    id integer,
    name nvarchar2(20) not null,
    age integer not null,
    salary integer not null,
    cdate date not null)
    
insert into hy_emp
select 1,dbms_random.string('*',dbms_random.value(1,20)),dbms_random.value(18,80),dbms_random.value(1,100000),sysdate
from dual
connect by level<1000001
order by dbms_random.random;

insert into hy_emp select * from hy_emp;

select count(*) from hy_emp;

update hy_emp set id=rownum , cdate=to_date('2000-07-02','yyyy-MM-dd')+rownum/100;

alter table hy_emp add constraint hy_emp_pk primary key (id) enable validate;

commit;

select * from hy_emp where rownum<20

EXPLAIN PLAN FOR
select name from hy_emp where name like 'A%' and age>65 and salary>20000
select * from table(dbms_xplan.display);

create index idx_emp_name_age_sal on hy_emp(name,age,salary);