不能自动生成snapshot,不能生成awr

近期一个库需要生成AWR时发现近期没有可选择的快照

检查快照生成
select snap_id,instance_number,begin_interval_time,end_interval_time from dba_hist_snapshot order by instance_number,begin_interval_time
截取部分行发现快照生成时间不对
286 28054 1 13-APR-20 06.00.13.172 AM 13-APR-20 07.00.01.661 AM
287 28055 1 13-APR-20 07.00.01.661 AM 13-APR-20 08.00.18.495 AM
288 28056 1 13-APR-20 08.00.18.495 AM 13-APR-20 09.00.06.341 AM
289 28057 1 13-APR-20 09.00.06.341 AM 13-APR-20 10.00.18.680 AM
290 28058 1 13-APR-20 10.00.18.680 AM 13-APR-20 11.00.06.287 AM
291 28060 1 13-APR-20 12.00.06.152 PM 13-APR-20 01.00.30.297 PM
292 28061 1 13-APR-20 01.00.30.297 PM 13-APR-20 02.00.01.093 PM

862 28231 2 07-MAY-20 08.00.26.682 PM 07-MAY-20 09.00.52.925 PM
863 28232 2 07-MAY-20 09.00.52.925 PM 07-MAY-20 10.04.14.955 PM
864 28233 2 07-MAY-20 10.04.14.955 PM 08-MAY-20 05.16.24.751 PM
865 28234 2 08-MAY-20 05.16.24.751 PM 08-MAY-20 05.25.23.908 PM
866 28235 2 08-MAY-20 05.25.23.908 PM 08-MAY-20 06.08.14.022 PM
867 28236 2 08-MAY-20 06.08.14.022 PM 08-MAY-20 07.00.40.240 PM
868 28237 2 08-MAY-20 07.00.40.240 PM 08-MAY-20 08.00.05.696 PM
869 28238 2 08-MAY-20 08.00.05.696 PM 08-MAY-20 09.00.03.040 PM
870 28239 2 08-MAY-20 09.00.03.040 PM 08-MAY-20 10.00.14.741 PM
871 28240 2 08-MAY-20 10.00.14.741 PM 08-MAY-20 11.00.11.279 PM
872 28241 2 08-MAY-20 11.00.11.279 PM 09-MAY-20 12.00.59.568 AM<<<<<<<<<<<<<<<<<<<


1443 28234 3 08-MAY-20 05.16.24.744 PM 08-MAY-20 05.25.23.905 PM
1444 28235 3 08-MAY-20 05.25.23.905 PM 08-MAY-20 06.08.13.950 PM
1445 28236 3 08-MAY-20 06.08.13.950 PM 08-MAY-20 07.00.40.167 PM
1446 28237 3 08-MAY-20 07.00.40.167 PM 08-MAY-20 08.00.05.664 PM
1447 28238 3 08-MAY-20 08.00.05.664 PM 08-MAY-20 09.00.03.039 PM
1448 28239 3 08-MAY-20 09.00.03.039 PM 08-MAY-20 10.00.16.037 PM
1449 28240 3 08-MAY-20 10.00.16.037 PM 08-MAY-20 11.00.11.235 PM
1450 28241 3 08-MAY-20 11.00.11.235 PM 09-MAY-20 12.00.59.561 AM<<<<<<<<<<<<<<<<<<<


2021 28234 4 08-MAY-20 05.16.24.741 PM 08-MAY-20 05.25.23.908 PM
2022 28235 4 08-MAY-20 05.25.23.908 PM 08-MAY-20 06.08.14.004 PM
2023 28236 4 08-MAY-20 06.08.14.004 PM 08-MAY-20 07.00.40.250 PM
2024 28237 4 08-MAY-20 07.00.40.250 PM 08-MAY-20 08.00.05.695 PM
2025 28238 4 08-MAY-20 08.00.05.695 PM 08-MAY-20 09.00.02.990 PM
2026 28239 4 08-MAY-20 09.00.02.990 PM 08-MAY-20 10.00.14.785 PM
2027 28240 4 08-MAY-20 10.00.14.785 PM 08-MAY-20 11.00.11.281 PM
2028 28241 4 08-MAY-20 11.00.11.281 PM 09-MAY-20 12.00.59.575 AM<<<<<<<<<<<<<<<<<<<


snapshot设置也没问题
SQL> SELECT * FROM DBA_HIST_WR_CONTROL

DBID SNAP_INTERVAL RETENTION TOPNSQL
---------- -------------------------------- -------------------------- ----------
1073293429 +00000 01:00:00.0 +00045 00:00:00.0 DEFAULT

SQL>


检查trace

alert定期出现如下报错
Thu May 07 22:09:20 2020
Suspending MMON slave action kewrmafsa_ for 82800 seconds

MMON进程介绍
MMON进程是数据库10g版本引入的进程,是数据库的很多自我监视和自我调整
功能的支持进程。
此数据库实例收集有关活动和性能的大量统计数据。这些统计数据收集到SGA中,
通过发出sql查询,可以询问他们的当前值。
MMON从SGA定期捕获统计数据(默认是每小时一次),并将它们写入到数据字典中,在数据字典中,可以无限期的存储它们(不过,默认方式是只存储8天)。
每当MMON收集一组统计数据(称为快照)时,它还启动ADDM。ADDM工具使用由多名
DBA历经多年开发的专家系统来分析数据库活动。它观察两个快照(默认是当前快照和先前的快照),并得出有关性能的观察结果和建议。
除了收集快照外,MMON还持续监视数据库和实例,来确定是否应该发出任何警报。


M000trace报错
not in wait at each sample
----- END DDE Action: 'ORA_12751_DUMP' (SUCCESS, 3 csec) -----
----- END DDE Actions Dump (total 3 csec) -----
*** KEWROCISTMTEXEC - encountered error: (ORA-12751: cpu time or run time policy violation<<<<<<<<<<<<<<<<<<<<<<<<<<<
)
*** SQLSTR: total-len=295, dump-len=240,
STR={insert into WRH$_SERVICE_STAT (snap_id, dbid, instance_number, service_name_hash, stat_id, value) select :snap_id, :dbid, :instance_number, stat.service_name_hash, stat.stat_id, stat.value from v$active_services asvc, v$service_st}
DDE rules only execution for: ORA 12751
----- START Event Driven Actions Dump ----
---- END Event Driven Actions Dump ----
----- START DDE Actions Dump -----
Executing SYNC actions
Executing ASYNC actions
----- START DDE Action: 'ORA_12751_DUMP' (Sync) -----
CPU time exceeded 300 seconds <<<<<<<<<<<<<<<<<<<<<<<<<<<
Time limit violation detected at:
ksedsts()+465<-kspol_12751_dump()+145<-dbgdaExecuteAction()+1065<-dbgerRunAction()+109<-dbgerRunActions()+4134<-dbgexPhaseII()+1873<-dbgexProcessError()+2680<-dbgeExecuteForError()+88<-dbgePostErrorKGE()+2136<-dbkePostKGE_kgsf()+71<-kgeade()+351<-kgerelv()+140
<-kgerev()+34<-kserec1()+170<-OCIKSEC()+189<-kewrose_oci_stmt_exec()+292<-kewrgwxf1_gwrsql_exft_1()+317<-kewrgwxf_gwrsql_exft()+496<-kewrews_execute_wr_sql()+52<-kewrftbs_flush_table_by_sql()+180<-kewrft_flush_table()+264<-kewrftec_flush_table_ehdlcx()+766
<-kewrfsvc_flush_svcstat()+45<-kewrft_flush_table()+397<-kewrftec_flush_table_ehdlcx()+766<-kewrfat_flush_all_tables()+898<-kewrfos_flush_onesnap()+170<-kewrfsc_flush_snapshot_c()+644<-kewrafs_auto_flush_slave()+769<-kebm_slave_main()+586<-ksvrdp()+1766<-opirip()+674
<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201<-__libc_start_main()+230Current Wait Stack:
Not in wait; last wait ended 5 min 5 sec ago
There are 3 sessions blocked by this session.


*** 2020-05-09 00:06:05.637
*** SESSION ID:(609.4217) 2020-05-09 00:06:05.637
*** CLIENT ID:() 2020-05-09 00:06:05.637
*** SERVICE NAME:(SYS$BACKGROUND) 2020-05-09 00:06:05.637
*** MODULE NAME:(MMON_SLAVE) 2020-05-09 00:06:05.637
*** ACTION NAME:(Auto-Flush Slave Action) 2020-05-09 00:06:05.637<<<<<<<<<<<<<<<<<<

DDE rules only execution for: ORA 12751
----- START Event Driven Actions Dump ----
---- END Event Driven Actions Dump ----
----- START DDE Actions Dump -----
Executing SYNC actions
Executing ASYNC actions
----- START DDE Action: 'ORA_12751_DUMP' (Sync) -----
CPU time exceeded 300 seconds<<<<<<<<<<<<<<<<<<
Time limit violation detected at:
<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201<-__libc_start_main()+230Current Wait Stack:
Not in wait; last wait ended 5 min 5 sec ago <<<<<<<<<<<<<<<<<<
There are 3 sessions blocked by this session.
Dumping one waiter:
inst: 3, sid: 1713, ser: 32257
wait event: 'enq: WF - contention'
p1: 'name|mode'=0x57460006
p2: '0'=0x4a
p3: '0'=0x0
row_wait_obj#: 4294967295, block#: 0, row#: 0, file# 0
min_blocked_time: 288 secs, waiter_cache_ver: 23457
Wait State:
fixed_waits=0 flags=0x20 boundary=(nil)/-1


mmon trace
*** 2020-05-09 07:20:31.882
Unable to schedule a MMON slave at: Auto Flush Main 1
Slave action has been temporarily suspended
- Slave action had prior policy violations.
Unknown return code: 101

手动发起快照并使用10046进行跟踪,尝试几次有出现快照未生成当前session就被kill的现象
文档 2043531.1 有介绍
Taking an AWR snapshot executes many SQL statements to generate a new snapshot and has a built-in timeout mechanism, which causes timeouts between 300 and 900 seconds by default. As a result, if snapshot generation stalls, "Time limit violation" or "Runtime exceeded" messages and associated ORA-12751 errors are raised.
获取AWR快照将执行许多SQL语句来生成新的快照,并具有内置超时机制,该机制在默认情况下会导致300到900秒的超时。因此,如果快照生成停止,就会引发“时间限制违规”或“运行时超限”消息和相关的ORA-12751错误。


SQL> oradebug setmypid;
Statement processed.
SQL> oradebug unlimit;
Statement processed.
SQL> oradebug session_event 10046 trace name context forever ,level 4
Statement processed.
SQL> oradebug tracefile_name;
/oracle/app/oracle/diag/rdbms/cdr/cdr1/trace/cdr1_ora_19411.trc
SQL> EXECUTE DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT();
BEGIN DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT(); END;

*
ERROR at line 1:
ORA-00028: your session has been killed
ORA-00028: your session has been killed
ORA-06512: at "SYS.DBMS_WORKLOAD_REPOSITORY", line 99
ORA-06512: at "SYS.DBMS_WORKLOAD_REPOSITORY", line 122
ORA-06512: at line 1

分析10046 trace找到耗时的sql_id
=====================
PARSING IN CURSOR #140236033454680 len=295 dep=1 uid=0 oct=2 lid=0 tim=1588989110811688 hv=1114153516 ad='be2a17d0' sqlid='7g732rx16j8jc'
insert into WRH$_SERVICE_STAT (snap_id, dbid, instance_number, service_name_hash, stat_id, value) select :snap_id, :dbid, :instance_number, stat.service_name_hash, stat.stat_id, stat.value from v$active_services asvc, v$service_stats stat where asvc.name_hash = stat.service_name_hash
END OF STMT
PARSE #140236033454680:c=0,e=500,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,plh=2550932738,tim=1588989110811688
BINDS #140236033454680:
Bind#0
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=00 fl2=0000 frm=00 csi=00 siz=72 off=0
kxsbbbfp=7f8b3edc6518 bln=22 avl=04 flg=05
value=28245
Bind#1
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=00 fl2=0000 frm=00 csi=00 siz=0 off=24
kxsbbbfp=7f8b3edc6530 bln=22 avl=06 flg=01
value=1073293429
Bind#2
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=00 fl2=0000 frm=00 csi=00 siz=0 off=48
kxsbbbfp=7f8b3edc6548 bln=22 avl=02 flg=01
value=1

*** 2020-05-09 09:59:32.945
EXEC #140236033454680:c=457832613,e=462133450,p=0,cr=0,cu=0,mis=0,r=174,dep=1,og=1,plh=2550932738,tim=1588989572945189
ERROR #140236033454680:err=28 tim=1588989572945334
STAT #140236033454680 id=1 cnt=0 pid=0 pos=1 obj=0 op='LOAD TABLE CONVENTIONAL (cr=0 pr=0 pw=0 time=13 us)'
STAT #140236033454680 id=2 cnt=174 pid=1 pos=1 obj=0 op='NESTED LOOPS (cr=0 pr=0 pw=0 time=445643146 us cost=0 size=130 card=1)'
STAT #140236033454680 id=3 cnt=175 pid=2 pos=1 obj=0 op='MERGE JOIN CARTESIAN (cr=0 pr=0 pw=0 time=1791 us cost=0 size=78 card=1)'
STAT #140236033454680 id=4 cnt=7 pid=3 pos=1 obj=0 op='FIXED TABLE FULL X$KSWSASTAB (cr=0 pr=0 pw=0 time=84 us cost=0 size=39 card=1)'
STAT #140236033454680 id=5 cnt=175 pid=3 pos=2 obj=0 op='BUFFER SORT (cr=0 pr=0 pw=0 time=972 us cost=0 size=39 card=1)'
STAT #140236033454680 id=6 cnt=28 pid=5 pos=1 obj=0 op='FIXED TABLE FULL X$KEWSSMAP (cr=0 pr=0 pw=0 time=69 us cost=0 size=39 card=1)'
STAT #140236033454680 id=7 cnt=174 pid=2 pos=2 obj=0 op='FIXED TABLE FIXED INDEX X$KEWSSVCV (ind:2) (cr=0 pr=0 pw=0 time=462106244 us cost=0 size=52 card=1)'
*** KEWROCISTMTEXEC - encountered error: (ORA-00028: your session has been killed
)
*** SQLSTR: total-len=295, dump-len=240,
STR={insert into WRH$_SERVICE_STAT (snap_id, dbid, instance_number, service_name_hash, stat_id, value) select :snap_id, :dbid, :instance_number, stat.service_name_hash, stat.stat_id, stat.value from v$active_services asvc, v$service_st}
*** KEWRAFM1: Error=13509 encountered by kewrfteh
CLOSE #140236033454680:c=0,e=11,dep=1,type=0,tim=1588989572945875
*** KEWROCISTMTEXEC - encountered error: (ORA-00028: your session has been killed
)
*** SQLSTR: total-len=395, dump-len=240,
STR={insert into WRH$_SERVICE_WAIT_CLASS (snap_id, dbid, instance_number, service_name_hash, wait_class_id, wait_class, total_waits, time_waited) select :snap_id, :dbid, :instance_number, stat.service_name_hash, stat.wait_class_id, s}
*** KEWRAFM1: Error=13509 encountered by kewrfteh
CLOSE #140236033454680:c=0,e=4,dep=1,type=0,tim=1588989572946282
PARSE #140236033662208:c=0,e=20,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,plh=0,tim=1588989572946322
BINDS #140236033662208:


格式化后
insert into WRH$_SERVICE_STAT
(snap_id, dbid, instance_number, service_name_hash, stat_id, value)
select :snap_id,
:dbid,
:instance_number,
stat.service_name_hash,
stat.stat_id,
stat.value
from v$active_services asvc, v$service_st


查找MOS 2294282.1


发现该sql多个执行计划有特别长耗时的

SQL> @xi 7g732rx16j8jc %
eXplain the execution plan for sqlid 7g732rx16j8jc child %...

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 7g732rx16j8jc, child number 0
-------------------------------------
insert into WRH$_SERVICE_STAT (snap_id, dbid, instance_number,
service_name_hash, stat_id, value) select :snap_id, :dbid,
:instance_number, stat.service_name_hash, stat.stat_id, stat.value
from v$active_services asvc, v$service_stats stat where
asvc.name_hash = stat.service_name_hash

Plan hash value: 2550932738

--------------------------------------------------------------------------------------------
| Id | Operation | Name | E-Rows | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | | |
| 1 | LOAD TABLE CONVENTIONAL | | | | | |
| 2 | NESTED LOOPS | | 1 | | | |
| 3 | MERGE JOIN CARTESIAN | | 1 | | | |
|* 4 | FIXED TABLE FULL | X$KSWSASTAB | 1 | | | |
| 5 | BUFFER SORT | | 1 | 73728 | 73728 | |
|* 6 | FIXED TABLE FULL | X$KEWSSMAP | 1 | | | |
|* 7 | FIXED TABLE FIXED INDEX| X$KEWSSVCV (ind:2) | 1 | | | |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - filter(("KSWSASTABACT"=1 AND "INST_ID"=USERENV('INSTANCE')))
6 - filter("M"."AGGID"=3)
7 - filter(("S"."INST_ID"=USERENV('INSTANCE') AND "KSWSASTABNMH"="S"."SVCHSH"
AND "S"."KEWSOFF"="M"."OFFST"))

Note
-----
- Warning: basic plan statistics not available. These are only collected when:
* hint 'gather_plan_statistics' is used for the statement or
* parameter 'statistics_level' is set to 'ALL', at session or system level

SQL_ID 7g732rx16j8jc, child number 1
-------------------------------------
insert into WRH$_SERVICE_STAT (snap_id, dbid, instance_number,
service_name_hash, stat_id, value) select :snap_id, :dbid,
:instance_number, stat.service_name_hash, stat.stat_id, stat.value
from v$active_services asvc, v$service_stats stat where
asvc.name_hash = stat.service_name_hash

Plan hash value: 2550932738

--------------------------------------------------------------------------------------------
| Id | Operation | Name | E-Rows | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | | |
| 1 | LOAD TABLE CONVENTIONAL | | | | | |
| 2 | NESTED LOOPS | | 1 | | | |
| 3 | MERGE JOIN CARTESIAN | | 1 | | | |
|* 4 | FIXED TABLE FULL | X$KSWSASTAB | 1 | | | |
| 5 | BUFFER SORT | | 1 | 73728 | 73728 | |
|* 6 | FIXED TABLE FULL | X$KEWSSMAP | 1 | | | |
|* 7 | FIXED TABLE FIXED INDEX| X$KEWSSVCV (ind:2) | 1 | | | |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - filter(("KSWSASTABACT"=1 AND "INST_ID"=USERENV('INSTANCE')))
6 - filter("M"."AGGID"=3)
7 - filter(("S"."INST_ID"=USERENV('INSTANCE') AND "KSWSASTABNMH"="S"."SVCHSH"
AND "S"."KEWSOFF"="M"."OFFST"))

Note
-----
- Warning: basic plan statistics not available. These are only collected when:
* hint 'gather_plan_statistics' is used for the statement or
* parameter 'statistics_level' is set to 'ALL', at session or system level

SQL_ID 7g732rx16j8jc, child number 2
-------------------------------------
insert into WRH$_SERVICE_STAT (snap_id, dbid, instance_number,
service_name_hash, stat_id, value) select :snap_id, :dbid,
:instance_number, stat.service_name_hash, stat.stat_id, stat.value
from v$active_services asvc, v$service_stats stat where
asvc.name_hash = stat.service_name_hash

Plan hash value: 2550932738

--------------------------------------------------------------------------------------------
| Id | Operation | Name | E-Rows | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | | |
| 1 | LOAD TABLE CONVENTIONAL | | | | | |
| 2 | NESTED LOOPS | | 1 | | | |
| 3 | MERGE JOIN CARTESIAN | | 1 | | | |
|* 4 | FIXED TABLE FULL | X$KSWSASTAB | 1 | | | |
| 5 | BUFFER SORT | | 1 | 73728 | 73728 | |
|* 6 | FIXED TABLE FULL | X$KEWSSMAP | 1 | | | |
|* 7 | FIXED TABLE FIXED INDEX| X$KEWSSVCV (ind:2) | 1 | | | |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - filter(("KSWSASTABACT"=1 AND "INST_ID"=USERENV('INSTANCE')))
6 - filter("M"."AGGID"=3)
7 - filter(("S"."INST_ID"=USERENV('INSTANCE') AND "KSWSASTABNMH"="S"."SVCHSH"
AND "S"."KEWSOFF"="M"."OFFST"))

Note
-----
- Warning: basic plan statistics not available. These are only collected when:
* hint 'gather_plan_statistics' is used for the statement or
* parameter 'statistics_level' is set to 'ALL', at session or system level

SQL_ID 7g732rx16j8jc, child number 3
-------------------------------------
insert into WRH$_SERVICE_STAT (snap_id, dbid, instance_number,
service_name_hash, stat_id, value) select :snap_id, :dbid,
:instance_number, stat.service_name_hash, stat.stat_id, stat.value
from v$active_services asvc, v$service_stats stat where
asvc.name_hash = stat.service_name_hash

Plan hash value: 2550932738

---------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | | 0 |00:00:00.01 | | | |
| 1 | LOAD TABLE CONVENTIONAL | | 1 | | 0 |00:00:00.01 | | | |
| 2 | NESTED LOOPS | | 1 | 1 | 174 |00:07:25.64 | | | | <<<<<<<<<<<<<<<<<<<<<<<<<<<<
| 3 | MERGE JOIN CARTESIAN | | 1 | 1 | 175 |00:00:00.01 | | | |
|* 4 | FIXED TABLE FULL | X$KSWSASTAB | 1 | 1 | 7 |00:00:00.01 | | | |
| 5 | BUFFER SORT | | 7 | 1 | 175 |00:00:00.01 | 73728 | 73728 | |
|* 6 | FIXED TABLE FULL | X$KEWSSMAP | 1 | 1 | 28 |00:00:00.01 | | | |
|* 7 | FIXED TABLE FIXED INDEX| X$KEWSSVCV (ind:2) | 175 | 1 | 174 |00:07:42.11 | | | | <<<<<<<<<<<<<<<<<<<<<<<<<<<<
---------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - filter(("KSWSASTABACT"=1 AND "INST_ID"=USERENV('INSTANCE')))
6 - filter("M"."AGGID"=3)
7 - filter(("S"."INST_ID"=USERENV('INSTANCE') AND "KSWSASTABNMH"="S"."SVCHSH" AND "S"."KEWSOFF"="M"."OFFST"))


141 rows selected.


可以看到这三个表无统计信息,特别是X$KEWSSVCV的NUM_ROWS与实际行数有显著差异。
因此,优化器在X$KEWSSMAP、X$KEWSSVCV和X$KSWSASTAB之间选择了不合适的连接顺序,并选择了无效索引来访问X$KEWSSVCV。
SQL> select * from DBA_TAB_STATISTICS where table_name in ('X$KSWSASTAB','X$KEWSSMAP','X$KEWSSVCV');

OWNER TABLE_NAME PARTITION_NAME PARTITION_POSITION SUBPARTITION_NAME SUBPARTITION_POSITION OBJECT_TYPE NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE CHAIN_CNT AVG_ROW_LEN AVG_SPACE_FREELIST_BLOCKS NUM_FREELIST_BLOCKS AVG_CACHED_BLOCKS AVG_CACHE_HIT_RATIO SAMPLE_SIZE LAST_ANALYZED GLO USE STATT STA
------------------------------ ------------------------------ ------------------------------ ------------------ ------------------------------ --------------------- ------------ ---------- ---------- ------------ ---------- ---------- ----------- ------------------------- ------------------- ----------------- ------------------- ----------- ----------------- --- --- ----- ---
SYS X$KSWSASTAB FIXED TABLE
SYS X$KEWSSMAP FIXED TABLE
SYS X$KEWSSVCV FIXED TABLE

SQL>

SQL> select count(*) from X$KSWSASTAB;

COUNT(*)
----------
8

SQL> select count(*) from X$KEWSSMAP;

COUNT(*)
----------
265

SQL> select count(*) from X$KEWSSVCV;

COUNT(*)
----------
2576336

SQL>

####生产环境x$表查的时候要慎重
SQL> select * from X$KEWSSVCV where rownum<10;

ADDR INDX INST_ID SVCNAM SVCID SVCHSH KEWSOFF KEWSVAL
---------------- ---------- ---------- ---------------------------------------------------------------- ---------- ---------- ---------- ----------
00007F618F9DA658 0 1 --UNKNOWN-- 151 2985209977 0 0
00007F618F9DA658 1 1 --UNKNOWN-- 151 2985209977 1 0
00007F618F9DA658 2 1 --UNKNOWN-- 151 2985209977 2 0
00007F618F9DA658 3 1 --UNKNOWN-- 151 2985209977 3 0
00007F618F9DA658 4 1 --UNKNOWN-- 151 2985209977 4 0
00007F618F9DA658 5 1 --UNKNOWN-- 151 2985209977 5 0
00007F618F9DA658 6 1 --UNKNOWN-- 151 2985209977 6 0
00007F618F9DA658 7 1 --UNKNOWN-- 151 2985209977 7 0
00007F618F9DA658 8 1 --UNKNOWN-- 151 2985209977 8 0

9 rows selected.

alter中有大量ALTER SYSTEM SET service_names 这是datapump发起时会有的操作
Sat May 09 10:21:20 2020
Thread 1 advanced to log sequence 152367 (LGWR switch)
Current log# 3 seq# 152367 mem# 0: +DATADG/cdr/redo03.log
Sat May 09 10:22:16 2020
ALTER SYSTEM SET service_names='SYS$SYS.KUPC$S_1_20200413152721.CDR.HEBEI.MOBILE.COM','SYS$SYS.KUPC$C_1_20200413152721.CDR.HEBEI.MOBILE.COM','SYS$SYS.KUPC$S_1_20200413145208.CDR.HEBEI.MOBILE.COM','SYS$SYS.KUPC$C_1_20200413145208.CDR.H
EBEI.MOBILE.COM','SYS$SYS.KUPC$C_1_20200509102215.CDR.HEBEI.MOBILE.COM' SCOPE=MEMORY SID='cdr1';
ALTER SYSTEM SET service_names='SYS$SYS.KUPC$C_1_20200509102215.CDR.HEBEI.MOBILE.COM','SYS$SYS.KUPC$S_1_20200413152721.CDR.HEBEI.MOBILE.COM','SYS$SYS.KUPC$C_1_20200413152721.CDR.HEBEI.MOBILE.COM','SYS$SYS.KUPC$S_1_20200413145208.CDR.H
EBEI.MOBILE.COM','SYS$SYS.KUPC$C_1_20200413145208.CDR.HEBEI.MOBILE.COM','SYS$SYS.KUPC$S_1_20200509102215.CDR.HEBEI.MOBILE.COM' SCOPE=MEMORY SID='cdr1';
Sat May 09 10:22:17 2020
DM09 started with pid=364, OS id=11990, job DBMT.SYS_EXPORT_TABLE_08
Sat May 09 10:22:45 2020
DW04 started with pid=136, OS id=23074, wid=1, job DBMT.SYS_EXPORT_TABLE_08
Sat May 09 10:23:05 2020
DW01 started with pid=947, OS id=27865, wid=2, job DBMT.SYS_EXPORT_TABLE_08
Sat May 09 10:25:29 2020
Thread 1 advanced to log sequence 152368 (LGWR switch)
Current log# 4 seq# 152368 mem# 0: +DATADG/cdr/redo04.log
Sat May 09 10:31:15 2020
Thread 1 advanced to log sequence 152369 (LGWR switch)
Current log# 1 seq# 152369 mem# 0: +DATADG/cdr/redo01.log
Sat May 09 10:35:26 2020
ALTER SYSTEM SET service_names='SYS$SYS.KUPC$S_1_20200509102215.CDR.HEBEI.MOBILE.COM','SYS$SYS.KUPC$S_1_20200413152721.CDR.HEBEI.MOBILE.COM','SYS$SYS.KUPC$C_1_20200413152721.CDR.HEBEI.MOBILE.COM','SYS$SYS.KUPC$S_1_20200413145208.CDR.H
EBEI.MOBILE.COM','SYS$SYS.KUPC$C_1_20200413145208.CDR.HEBEI.MOBILE.COM' SCOPE=MEMORY SID='cdr1';
ALTER SYSTEM SET service_names='SYS$SYS.KUPC$S_1_20200413152721.CDR.HEBEI.MOBILE.COM','SYS$SYS.KUPC$C_1_20200413152721.CDR.HEBEI.MOBILE.COM','SYS$SYS.KUPC$S_1_20200413145208.CDR.HEBEI.MOBILE.COM','SYS$SYS.KUPC$C_1_20200413145208.CDR.H
EBEI.MOBILE.COM' SCOPE=MEMORY SID='cdr1';
Sat May 09 10:35:54 2020


实例1服务中出现大量的unknown
在数据泵处理过程中,“——UNKNOWN——”服务名称将随着创建/删除服务的增加而增加,话单系统每月会调用大量的expdp进行表级备份任务发给nbu处理,吊起进程皆在节点1发起,所以导致节点1有这么多的UNKNOWN
SQL> select inst_id,service_name,count(1) from gv$service_stats
2 group by inst_id,service_name
3 order by 1,2,count(1) desc;

INST_ID SERVICE_NAME COUNT(1)
---------- ---------------------------------------------------------------- ----------
1 --UNKNOWN-- 2576100 <<<<<<<<<<<<<<<<<<<<<<
1 SYS$BACKGROUND 28
1 SYS$USERS 28
1 SYS.KUPC$C_1_20200413145208 27
1 SYS.KUPC$C_1_20200413152721 28
1 SYS.KUPC$C_1_20200509102215 7
1 SYS.KUPC$S_1_20200413145208 28
1 SYS.KUPC$S_1_20200413152721 27
1 SYS.KUPC$S_1_20200509102215 8
1 cdr.hebei.mobile.com 28
1 cdrXDB 27
2 --UNKNOWN-- 56
2 SYS$BACKGROUND 28
2 SYS$USERS 28
2 cdr.hebei.mobile.com 28
2 cdrXDB 28
3 --UNKNOWN-- 56
3 SYS$BACKGROUND 28
3 SYS$USERS 28
3 cdr.hebei.mobile.com 28
3 cdrXDB 28
4 --UNKNOWN-- 112
4 SYS$BACKGROUND 28
4 SYS$USERS 28
4 cdr.hebei.mobile.com 28
4 cdrXDB 28

26 rows selected.

SQL> select * from v$fixed_table where name='V$SERVICE_STATS';

NAME OBJECT_ID TYPE TABLE_NUM
------------------------------ ---------- ----- ----------
V$SERVICE_STATS 4294952576 VIEW 65537

SQL>
SQL> select view_definition from v$fixed_view_definition where view_name='GV$SERVICE_STATS';

VIEW_DEFINITION
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
select s.inst_id, s.svchsh, s.svcnam, m.extid, m.sname, s.kewsval from x$kewssvcv s, x$kewssmap m where s.kewsoff = m.offst and m.aggid = 3

SQL>

##X$KEWSSVCV是内存表也是gv$service_stats的基表,生成snapshot的时候慢的那条insert into WRH$_SERVICE_STAT 也使用到了这个表且耗时巨长

收集相关表统计信息
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SYS', 'X$KSWSASTAB', no_invalidate=>false);

PL/SQL procedure successfully completed.

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SYS', 'X$KEWSSMAP', no_invalidate=>false);

PL/SQL procedure successfully completed.

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SYS', 'X$KEWSSVCV', no_invalidate=>false);

PL/SQL procedure successfully completed.


SQL> select * from DBA_TAB_STATISTICS where table_name in ('X$KSWSASTAB','X$KEWSSMAP','X$KEWSSVCV');

OWNER TABLE_NAME PARTITION_NAME PARTITION_POSITION SUBPARTITION_NAME SUBPARTITION_POSITION OBJECT_TYPE NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE CHAIN_CNT AVG_ROW_LEN AVG_SPACE_FREELIST_BLOCKS NUM_FREELIST_BLOCKS AVG_CACHED_BLOCKS AVG_CACHE_HIT_RATIO SAMPLE_SIZE LAST_ANALYZED GLO USE STATT STA
------------------------------ ------------------------------ ------------------------------ ------------------ ------------------------------ --------------------- ------------ ---------- ---------- ------------ ---------- ---------- ----------- ------------------------- ------------------- ----------------- ------------------- ----------- ----------------- --- --- ----- ---
SYS X$KSWSASTAB FIXED TABLE 8 112 8 20200509 10:48:26 YES NO
SYS X$KEWSSMAP FIXED TABLE 265 58 265 20200509 10:48:33 YES NO
SYS X$KEWSSVCV FIXED TABLE 2576336 46 2576336 20200509 10:51:12 YES NO

SQL>


检查mmon挂起状态
SQL> oradebug unit_test kebm_dmp_slv_attrs kewrmafsa_
Status: 3
Flags: 0
Runtime limit: 900
CPU time limit: 300
Violations: 28
Suspended until: 1589036765 <<<<<不为0代表挂起


再次实验AWR可以手动生成,观察后续
SQL> EXECUTE DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT();

PL/SQL procedure successfully completed.

SQL>

SQL> oradebug unit_test kebm_dmp_slv_attrs kewrmafsa_
Status: 3
Flags: 0
Runtime limit: 900
CPU time limit: 300
Violations: 28
Suspended until: 1589036765
SQL>

并在必要时执行以下命令以解除MMON挂起状态。

SQL> oradebug unit_test kebm_set_slv_attrs kewrmafsa_ retain retain retain retain 0 0
Modified attributes of kewrmafsa_ (slave id 13)
SQL>
SQL> oradebug unit_test kebm_dmp_slv_attrs kewrmafsa_
Status: 3
Flags: 0
Runtime limit: 900
CPU time limit: 300
Violations: 0
Suspended until: 0
SQL>


经过几个小时快照周期查看已经可以在正确时间间隔自动生成快照

with t as (
select snap_id,
instance_number,
begin_interval_time,
end_interval_time,
snap_flag,
row_number() over(partition by instance_number order by begin_interval_time desc) num
from (select snap_id,
instance_number,
begin_interval_time,
end_interval_time,
snap_flag
from dba_hist_snapshot
order by instance_number, begin_interval_time))
select snap_id,instance_number,begin_interval_time,end_interval_time,snap_flag from t where num<=10 order by instance_number,begin_interval_time;

当前数据库设置的快照生成时间为每小时一个,仔细对比上边的4个节点的快照生成时间,不难发现,节点1 从13-APR-20开始已经没有快照生成了,其他节点也出现了相隔2小时or >2小时才生成一次快照的现象
,回想上边抓到的sql 共有多个游标,也就是说没能生成快照的时间点该sql必定走了不良的执行计划,所以导致超时被kill

SNAP_ID INSTANCE_NUMBER BEGIN_INTERVAL_TIME END_INTERVAL_TIME SNAP_FLAG
28057 1 13-APR-20 09.00.06.341 AM 13-APR-20 10.00.18.680 AM 0
28058 1 13-APR-20 10.00.18.680 AM 13-APR-20 11.00.06.287 AM 0
28060 1 13-APR-20 12.00.06.152 PM 13-APR-20 01.00.30.297 PM 0
28061 1 13-APR-20 01.00.30.297 PM 13-APR-20 02.00.01.093 PM 0
28233 1 07-MAY-20 10.04.14.903 PM 08-MAY-20 05.16.24.675 PM 1
28234 1 08-MAY-20 05.16.24.675 PM 08-MAY-20 05.25.23.869 PM 1
28246 1 09-MAY-20 09.51.50.009 AM 09-MAY-20 10.53.05.153 AM 1
28247 1 09-MAY-20 10.53.05.153 AM 09-MAY-20 12.00.06.767 PM 0
28248 1 09-MAY-20 12.00.06.767 PM 09-MAY-20 01.00.18.207 PM 0
28249 1 09-MAY-20 01.00.18.207 PM 09-MAY-20 02.00.29.186 PM 0
28240 2 08-MAY-20 10.00.14.741 PM 08-MAY-20 11.00.11.279 PM 0
28241 2 08-MAY-20 11.00.11.279 PM 09-MAY-20 12.00.59.568 AM 0
28242 2 09-MAY-20 12.00.59.568 AM 09-MAY-20 09.06.20.210 AM 0
28243 2 09-MAY-20 09.06.20.210 AM 09-MAY-20 09.20.47.483 AM 0
28244 2 09-MAY-20 09.20.47.483 AM 09-MAY-20 09.33.38.493 AM 0
28245 2 09-MAY-20 09.33.38.493 AM 09-MAY-20 09.51.50.048 AM 0
28246 2 09-MAY-20 09.51.50.048 AM 09-MAY-20 10.53.05.214 AM 0
28247 2 09-MAY-20 10.53.05.214 AM 09-MAY-20 12.00.06.849 PM 0
28248 2 09-MAY-20 12.00.06.849 PM 09-MAY-20 01.00.18.605 PM 0
28249 2 09-MAY-20 01.00.18.605 PM 09-MAY-20 02.00.29.250 PM 0
28240 3 08-MAY-20 10.00.16.037 PM 08-MAY-20 11.00.11.235 PM 0
28241 3 08-MAY-20 11.00.11.235 PM 09-MAY-20 12.00.59.561 AM 0
28242 3 09-MAY-20 12.00.59.561 AM 09-MAY-20 09.06.20.189 AM 0
28243 3 09-MAY-20 09.06.20.189 AM 09-MAY-20 09.20.47.480 AM 0
28244 3 09-MAY-20 09.20.47.480 AM 09-MAY-20 09.33.38.491 AM 0
28245 3 09-MAY-20 09.33.38.491 AM 09-MAY-20 09.51.50.049 AM 0
28246 3 09-MAY-20 09.51.50.049 AM 09-MAY-20 10.53.05.207 AM 0
28247 3 09-MAY-20 10.53.05.207 AM 09-MAY-20 12.00.06.828 PM 0
28248 3 09-MAY-20 12.00.06.828 PM 09-MAY-20 01.00.18.663 PM 0
28249 3 09-MAY-20 01.00.18.663 PM 09-MAY-20 02.00.29.388 PM 0
28240 4 08-MAY-20 10.00.14.785 PM 08-MAY-20 11.00.11.281 PM 0
28241 4 08-MAY-20 11.00.11.281 PM 09-MAY-20 12.00.59.575 AM 0
28242 4 09-MAY-20 12.00.59.575 AM 09-MAY-20 09.06.20.189 AM 0
28243 4 09-MAY-20 09.06.20.189 AM 09-MAY-20 09.20.47.482 AM 0
28244 4 09-MAY-20 09.20.47.482 AM 09-MAY-20 09.33.38.497 AM 0
28245 4 09-MAY-20 09.33.38.497 AM 09-MAY-20 09.51.50.047 AM 0
28246 4 09-MAY-20 09.51.50.047 AM 09-MAY-20 10.53.05.203 AM 0
28247 4 09-MAY-20 10.53.05.203 AM 09-MAY-20 12.00.06.865 PM 0
28248 4 09-MAY-20 12.00.06.865 PM 09-MAY-20 01.00.18.609 PM 0
28249 4 09-MAY-20 01.00.18.609 PM 09-MAY-20 02.00.29.628 PM 0

总结:因为频繁的datapump备份导致X$KEWSSVCV基表更新频繁,且X$KEWSSVCV表无统计信息(统计信息自动收集已经关闭),导致创建snapshot语句执行计划不对耗时长被自动kill
X$KEWSSVCV表是内存表一般需要重启实例释放
所以,收集这几个表的统计信息,让创建snapshot语句根据CBO自动走正确的执行计划即可

流年笑掷 未来可期
原文地址:https://www.cnblogs.com/nadian-li/p/12858525.html