oracle TBL$OR$IDX$PART$NUM BUG案例处理过程

image

image

image

通过AWR报告发现数据库cpu告警的问题,定位产生问题的sql语句

首先查看SQL的执行计划

SELECT count(1)
FROM cmds.psn_cntr_holder a
    ,cmds.pol_attrib c
    ,cmds.std_contract b
INNER JOIN cmds.CUSTOMINFO_E_SINGLE_CNTR k ON k.cntr_no = b.cntr_no
    AND k.e_flag = 'Y'
    AND k.branch_src = b.branch_src
    AND b.sys_src IN (
        'NBPS'
        ,'OBPS'
        )
WHERE 1 = 1
    AND (
        1 = 0
        OR a.hldr_cust_no = '2004140804000044007' AND a.branch_src = '140000' AND a.sys_src = 'CBPS8'
        OR a.hldr_cust_no = '2004140804000044007' AND a.branch_src = '000002' AND a.sys_src = 'UBPS'
        )
    AND a.cntr_id = b.cntr_id
    AND b.pol_code = c.pol_cbps_code
    AND b.cntr_stat IN ('K')
    AND c.pol_term_code = 'L'
    AND b.sys_src IN (
        SELECT syssrc
        FROM cmds.paramSystem p
        WHERE 1 = 1
            AND p.orisys = 'ECSS'
        );

执行计划没有问题

--------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                          |     1 |   169 |    71   (0)| 00:00:01 |       |       |
|   1 |  SORT AGGREGATE                           |                          |     1 |   169 |            |          |       |       |
|*  2 |   HASH JOIN SEMI                          |                          |     1 |   169 |    71   (0)| 00:00:01 |       |       |
|   3 |    NESTED LOOPS                           |                          |     1 |   155 |    68   (0)| 00:00:01 |       |       |
|   4 |     NESTED LOOPS                          |                          |     2 |   155 |    68   (0)| 00:00:01 |       |       |
|   5 |      NESTED LOOPS                         |                          |     1 |   148 |    64   (0)| 00:00:01 |       |       |
|   6 |       MERGE JOIN CARTESIAN                |                          |     1 |    85 |    60   (0)| 00:00:01 |       |       |
|   7 |        INLIST ITERATOR                    |                          |       |       |            |          |       |       |
|*  8 |         TABLE ACCESS BY GLOBAL INDEX ROWID| PSN_CNTR_HOLDER          |     1 |    53 |     6   (0)| 00:00:01 | ROWID | ROWID |
|*  9 |          INDEX RANGE SCAN                 | IDX_PSN_CNTR_HOLDERNO    |     3 |       |     4   (0)| 00:00:01 |       |       |
|  10 |        BUFFER SORT                        |                          |    67 |  2144 |    54   (0)| 00:00:01 |       |       |
|  11 |         PARTITION RANGE ALL               |                          |    67 |  2144 |    54   (0)| 00:00:01 |     1 |    35 |
|  12 |          PARTITION LIST ALL               |                          |    67 |  2144 |    54   (0)| 00:00:01 |     1 |     2 |
|* 13 |           TABLE ACCESS FULL               | CUSTOMINFO_E_SINGLE_CNTR |    67 |  2144 |    54   (0)| 00:00:01 |     1 |    70 |
|* 14 |       TABLE ACCESS BY GLOBAL INDEX ROWID  | STD_CONTRACT             |     1 |    63 |     4   (0)| 00:00:01 | ROWID | ROWID |
|* 15 |        INDEX RANGE SCAN                   | IDX_STD_CONTACT_CNTRID   |     1 |       |     3   (0)| 00:00:01 |       |       |
|* 16 |      INDEX RANGE SCAN                     | POL_ATTRIB_CBPS_CODE     |     2 |       |     1   (0)| 00:00:01 |       |       |
|* 17 |     TABLE ACCESS BY INDEX ROWID           | POL_ATTRIB               |     1 |     7 |     4   (0)| 00:00:01 |       |       |
|* 18 |    TABLE ACCESS FULL                      | PARAMSYSTEM              |     2 |    28 |     3   (0)| 00:00:01 |       |       |
--------------------------------------------------------------------------------------------------------------------------------------

检查下面的SQL的执行的session,因为比较奇怪,所以看看是谁执行的

SELECT DISTINCT TBL$OR$IDX$PART$NUM("STD_CONTRACT", 0, 2, 0, "SYS_SRC")
FROM (
    SELECT "P"."SYSSRC" "SYS_SRC"
    FROM "PARAMSYSTEM" "P"
    WHERE "P"."ORISYS" = 'ECSS'
        AND (
            "P"."SYSSRC" = 'NBPS'
            OR "P"."SYSSRC" = 'OBPS'
            )
    )
ORDER BY 1

无论是在当前的session视图还是历史的session视图都查不到这个sql的任何信息,定位sql里面的TBL$OR$IDX$PART$NUM 函数,oracle通过TBL$OR$IDX$PART$NUM函数来判断是否普通表满足分区置换的条件。

检查第一个sql的内容,发现INNER JOIN 部分存在一定的问题,建议开发调整该部分

INNER JOIN cmds.CUSTOMINFO_E_SINGLE_CNTR k ON k.cntr_no = b.cntr_no
    AND k.e_flag = 'Y'
    AND k.branch_src = b.branch_src
    AND b.sys_src IN (
        'NBPS'
        ,'OBPS'
        )

再看看sql的统计信息

image

可以看出,该sql返回一样产生的逻辑读非常高,不合理,但是sql从10046时间里面看,使用的执行计划是正确的,跟估算的是一样的。

这时猜测应该是触发了ORACLE bug,该sql引起了oracle内部非常高频率的反复运行分区置换条件的判断。

在oracle support里面确定BUG编号为:17572606

BUG处理方式:

_subquery_pruning_enabled=FALSE on a session or system level. OR use the no_subquery_pruning hint on the problematic queries.
It's a table-level hint so it would be "/*+ no_subquery_pruning(t1) ..

OR

Apply Patch 19889960 if available for your version and platform.

原文地址:https://www.cnblogs.com/wangrongxin/p/6256815.html