ORA00600 [4400][48]错误一例

5月26日某客户告警日志中出现的ORA-00600 [4400],[48]错误记录,并产生了trace文件:,
*** SERVICE NAME:(ETL) 2010-05-26 16:45:45.930
*** SESSION ID:(262.12024) 2010-05-26 16:45:45.930
*** 2010-05-26 16:45:45.930
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4400], [48], [], [], [], [], [], []
ORA-10387: parallel query server interrupt (normal)
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              0FFFFFFFF ? 000000020 ?
ksedmp+0290          bl       ksedst               1048DFFC0 ?
ksfdmp+0018          bl       03F5B014
kgeriv+0108          bl       _ptrgl
kgeasi+0118          bl       kgeriv               10496FEC8 ? 000000002 ?
                                                   700000010008000 ? 110000AD0 ?
                                                   110190290 ?
ktcddt+012c          bl       kgeasi               110190110 ? 110450040 ?
                                                   113000001130 ? 200000002 ?
                                                   100000001 ? 000000000 ?
                                                   000000030 ? 000000013 ?
ktcsod+0384          bl       ktcddt               0000003C0 ? 000000018 ?
                                                   1048D4808 ?
kssdch_stage+0758    bl       _ptrgl
....................................................
----- End of Call Stack Trace -----
******************* Dumping process map ****************
377144 : ora_p001_ETL
100000000      93710K  read/exec         oracle
1100007ef       2372K  read/write        oracle
9fffffff0000000        44K  read/exec         /usr/ccs/bin/usla64
9fffffff000b30a         0K  read/write        /usr/ccs/bin/usla64
900000000375880       484K  read/exec         /usr/lib/liblvm.a[shr_64.o]
9001000a0121ac8       135K  read/write        /usr/lib/liblvm.a[shr_64.o]
900000000426180        74K  read/exec         /usr/lib/libcfg.a[shr_64.o]
9001000a00f5ed0        26K  read/write        /usr/lib/libcfg.a[shr_64.o]
90000000034e280         2K  read/exec         /usr/lib/libcrypt.a[shr_64.o]
9001000a00d4760         0K  read/write        /usr/lib/libcrypt.a[shr_64.o]
90000000069a7b0         4K  read/exec         /usr/lib/libc.a[aio_64.o]
9001000a022a340         0K  read/write        /usr/lib/libc.a[aio_64.o]
90000000035f300        85K  read/exec         /usr/lib/libodm.a[shr_64.o]
9001000a00d5c08        35K  read/write        /usr/lib/libodm.a[shr_64.o]
9000000003ef080        83K  read/exec         /usr/lib/libperfstat.a[shr_64.o]
9001000a01d0818         9K  read/write        /usr/lib/libperfstat.a[shr_64.o]
900000000358000         0K  read/exec         /usr/lib/libdl.a[shr_64.o]
9001000a0219000         0K  read/write        /usr/lib/libdl.a[shr_64.o]
9000000007ff100      8588K  read/exec         /oracle/product/10.2.0/lib/libjox10.a[shr.o]
8001000a0000038       585K  read/write        /oracle/product/10.2.0/lib/libjox10.a[shr.o]
9000000004a0000       228K  read/exec         /usr/lib/libpthreads.a[shr_xpg5_64.o]
9001000a0144000       558K  read/write        /usr/lib/libpthreads.a[shr_xpg5_64.o]
900000000045500      2966K  read/exec         /usr/lib/libc.a[shr_64.o]
9001000a0000788       844K  read/write        /usr/lib/libc.a[shr_64.o]
   Total      110843K
******************* End of process map dump ****************
===================================================
PROCESS STATE
-------------
Process global information:
     process: 7000000cf48bf98, call: 7000000b40ebdf0, xact: 7000000cdece7a8, curses: 7000000cf666540, usrses: 7000000cf666540
  ----------------------------------------
  SO: 7000000cf48bf98, type: 2, owner: 0, flag: INIT/-/-/0x00
  (process) Oracle pid=43, calls cur/top: 7000000b40ebdf0/7000000b40ebdf0, flag: (0) -
            int error: 0, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 115 0 4
              last post received-location: kslpsr
              last process to post me: 7000000cf479c38 1 6
              last post sent: 0 0 251
              last post sent-location: kxfpqr: QC
              last process posted by me: 7000000cf48a7f8 10 0
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 7000000cf50bd50
    O/S info: user: oracle, term: UNKNOWN, ospid: 377144
    OSD pid info: Unix process pid: 377144, image: oracle@etl_a (P001)
   SO: 7000000cf666540, type: 4, owner: 7000000cf48bf98, flag: INIT/-/-/0x00
    (session) sid: 262 trans: 7000000cdece7a8, creator: 7000000cf48bf98, flag: (c0100041) USR/- BSY/-/-/DEL/-/-
              DID: 0001-0028-000FFB1E, short-term DID: 0001-002B-00151DA1
              txn branch: 7000000cdf412c8
              oct: 0, prv: 0, sql: 0, psql: 0, user: 26/ETL
    O/S info: user: bo, term: pts/0, ospid: 377144, machine: etl_a
              program: oracle@etl_a (P001)
    last wait for 'SQL*Net message from dblink' blocking sess=0x0 seq=45 wait_time=476 seconds since wait started=0
                driver id=0, #bytes=1, =0
    Dumping Session Wait History
     for 'SQL*Net message from dblink' count=1 wait_time=476
                driver id=0, #bytes=1, =0
     for 'SQL*Net message to dblink' count=1 wait_time=1
                driver id=0, #bytes=1, =0
     for 'log file sync' count=1 wait_time=7483
                buffer#=ca9, =0, =0
     for 'log file sync' count=1 wait_time=358
                buffer#=ca9, =0, =0
     for 'PX Deq: Execution Msg' count=1 wait_time=15527
                sleeptime/senderid=1001ffff, passes=1, =0
     for 'log file sync' count=1 wait_time=11855
                buffer#=c57, =0, =0
     for 'PX Deq: Execution Msg' count=1 wait_time=277142
                sleeptime/senderid=1001ffff, passes=1, =0
     for 'direct path write' count=1 wait_time=25
                file number=5, first dba=103ca, block cnt=3
     for 'direct path write' count=1 wait_time=6
                file number=4, first dba=1abe1, block cnt=8
     for 'row cache lock' count=1 wait_time=3307
                cache id=5, mode=0, request=5
    temporary object counter: 0
      ----------------------------------------
      Virtual Thread:
      kgskvt: 7000000c0e86b58, sess: 7000000cf666540, vc: 0, proc: 7000000cf48bf98
      consumer group cur: OTHER_GROUPS (upd? 0), mapped: DEFAULT_CONSUMER_GROUP, orig:
      vt_state: 0x200, vt_flags: 0x30, blkrun: 0
      is_assigned: 1, in_sched: 0 (0)
      vt_active: 0 (pending: 0)
      used quanta: 0 (cg: 0)
      cpu start time: 0, quantum status: 0x0
      quantum checks to skip: 0, check thresh: 0
      idle time: 0, active time: 0 (cg: 0)
      cpu yields: 0 (cg: 0), waits: 0 (cg: 0), wait time: 0 (cg: 0)
      queued time outs: 0, time: 0 (cur 0, cg 0)
      calls aborted: 0, num est exec limit hit: 0
      undo current: 0k max: 0k
      ----------------------------------------
      UOL used : 0 locks(used=0, free=2)
      KGX Atomic Operation Log 7000000bdcf0568
       Mutex 0(0, 0) idn 0 oper NONE
       Cursor Parent uid 262 efd 9 whr 11 slp 0
       oper=NONE pt1=7000000c69d0878 pt2=7000000bef37090 pt3=7000000b47119c8
       pt4=0 u41=1 stt=0
      KGX Atomic Operation Log 7000000bdcf05b0
       Mutex 7000000bdc2d028(0, 2) idn 0 oper NONE
       Cursor Stat uid 262 efd 12 whr 1 slp 0
       oper=NONE pt1=7000000bdc2cef8 pt2=0 pt3=0
       pt4=0 u41=0 stt=0
      KGX Atomic Operation Log 7000000bdcf05f8
       Mutex 0(0, 0) idn 0 oper NONE
       Library Cache uid 262 efd 0 whr 0 slp 0
      ----------------------------------------
      SO: 7000000cf7e3ec8, type: 5, owner: 7000000cf666540, flag: INIT/-/-/0x00
      (enqueue) DX-00000010-00000000	DID: 0001-002B-00151DA9
      lv: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  res_flag: 0x2
      res: 0x7000000cf8bd0b8, mode: X, lock_flag: 0x0
      own: 0x7000000cf666540, sess: 0x7000000cf666540, proc: 0x7000000cf48bf98, prv: 0x7000000cf8bd0c8
从trace文件名就可以看出出错的是某个并行子进程,其最近调用堆栈为ktcsod->ktcddt->kgeasi(报错),ktcsod与ktcddt均为Oracle中内核事务控制模块函数 (Kernel Transaction Control),又该并行子进程最近等待为"SQL*Net message from dblink"事件,共等待476s(wait_time=476 seconds)。trace显示该并行会话包含分布式事务锁:
      SO: 7000000cf7e3ec8, type: 5, owner: 7000000cf666540, flag: INIT/-/-/0x00
      (enqueue) DX-00000010-00000000	DID: 0001-002B-00151DA9
      lv: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  res_flag: 0x2
      res: 0x7000000cf8bd0b8, mode: X, lock_flag: 0x0
      own: 0x7000000cf666540, sess: 0x7000000cf666540, proc: 0x7000000cf48bf98, prv: 0x7000000cf8bd0c8
即DX(Distributed transaction entry)锁;可以猜测该并行会话在等待远程事务被清理,但等待超时,故报错。 查询metalink可以发现Bug 5223587与该错误一致,该Bug的特征为: 1. 出现ORA-600[4400],且调用栈为ktcddt ,说明涉及到了分布式事务 2. trace显示当时无打开的游标 3. 可以看到当时的等待为'SQL*Net message from dblink'事件 Metalink文档[ID  444108.1]描 述该错误会在涉及远程数据库的并行DML操作中出现,但该错误仅会在清理阶段出现,故实际无影响(It is not critical as such, since it is happening during the cleanup of the operation.)。 该Bug可以通过实施Patch 5223587修复, 目前该补丁只有对应基础版本10.2.0.2和10.2.0.3的版本,该补丁在 10.2.0.3 上有AIX power(64 bit)和Solaris Sparc(64 bit)的版本,10.2.0.2上只有linux-64 与HP-UX的版本。因为该Bug一般不会产生不良影响,故实际上不建议应用补丁。
原文地址:https://www.cnblogs.com/macleanoracle/p/2967439.html