Customer Introduction to ORA-7445 Errors (文档 ID 211909.1)

In this Document

  Purpose
  Scope
  Details
  References

Applies to:

Oracle Database - Enterprise Edition - Version 7.0.16.0 to 11.2.0.2 [Release 7.0 to 11.2] Information in this document applies to any platform. Add ***Checked for relevance on 14-Mar-2013***

Purpose

The purpose of this note is to provide an introduction to ORA-7445 errors, and to suggest the next step in diagnosing the error.  The note is written with a UNIX bias, although the concepts are generic.

Scope

To be used by the DBA when reporting ORA-7445 errors to Oracle Support Services.

Details

What is an ORA-7445 error?
An ORA-7445 error is raised by an Oracle server process when it has received a fatal signal from the operating system. The error may be raised in either a foreground or background process.  The process will normally:
    - write an error to the alert log     - write a trace file in either the user_dump_dest or background_dump_dest or, from 11g onwards, the diagnostic_dest     - create a core dump in core_dump_dest
There are many 'illegal' operations that the operating system can trap.  A common example is a process writing to an invalid memory location, and so to protect the system the offending process will be sent a fatal signal.
Typically, the signals seen are SIGBUS (signal 10, bus error) and SIGSEGV (signal 11, segmentation violation).  There are other UNIX signals and exceptions that may happen, however, they are likely caused by OS problems rather than an Oracle problem.  Examples of other signals are: SIGINT, SIGKILL, SIGSYS.  A complete list is available in Note:1038055.6.
An ORA-7445 is a generic error, and can occur anywhere in the Oracle code.  The precise location of the error is identified by the trace file it produces.
What Does an ORA-7445 Error Look Like?
The appearance of an ORA-7445 error varies slightly from platform to platform.  The two general forms are illustrated here.
Example 1 ORA-07445: exception encountered: core dump [run_some_SQL()+268] [SIGBUS] [Invalid address alignment] [] [] []
Example 2
ORA-07445: exception encountered: core dump [10] [2122262800] [261978112] [] [] []
Example 1 tells us:
    - The error occurred in function run_some_SQL()     - The process received signal SIGBUS     - Some additional information pertinent to the signal
Example 2 is less descriptive, although is exactly the same:
    - The failing function is not given, but rather its memory address is, e.g.  2122262800     - The signal the process received is given as its numeric value rather than signal name, e.g. 10 - SIGBUS     - Some additional information, which is not useful at this time.
What Information Should be Collected?
The following should always be collected:
1.  The alert log file, stretching back to at least the previous instance startup prior to the      error, so that the startup init.ora parameters are included.
2.  All ORA-7445 and ORA-600 trace files since the last instance startup.  This is to      establish if the ORA-7445 being investigated is related to or a symptom of another error.      Note:  If a lot of trace files exist only collect the first 2 to 3 and last 2 to 3 for initial                  investigation.  Oracle Support will then be able to tell from the alertlog file if more                  trace files are required.

3. From 11g onwards the ADR Incident package should be uploaded (see Note 411.1 for instructions on packaging incidents and forwarding them to support).
In some cases, a trace file is not generated.  If this happens, the core dump should be located and a stack trace taken from that using an operating system debugger, as discussed in Note: 1812.1.
This information should be uploaded at the time an SR is created.
Self Service Diagnosis and Initial Interrogation of the Trace File
MyOracle Support (Metalink) provides the following tool:

      Note:153788.1 Troubleshoot an ORA-600 or ORA-7445 Error Using the Error Lookup Tool

 To use this tool effectively, customers will need to isolate:
    - The failing function, if given in the ORA-7445 error message, or     - The stack trace from the Oracle trace file
The following snippet shows key information taken from a trace file.

[*** 2002-05-08 23:35:18.224                              <--- timestamp
*** SESSION ID:(194.14075) 2002-05-08 23:35:18.202
Exception signal: 10 (SIGBUS), code: 1 (Invalid address alignment), addr: 0x41e7, PC: kjrfnd()+44
*** 2002-05-08 23:35:19.404 ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [kjrfnd()+44] [SIGBUS] [Invalid address alignment] [16871] [] []   <---- the errror
Current SQL statement for this session:                   <--- the current SQL statement
DELETE FROM MY_TABLE WHERE COL1 < :b1 ----- PL/SQL Call Stack -----
object line object handle number name e560c680 35 anonymous block
e560c680 290 anonymous block
----- Call Stack Trace -----                              <----Stack trace starts here
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp()+168         CALL     ksedst()+0           540 ? 0 ? FFBE4F98 ?
                                                   FFBE4A3C ? FFBE4A20 ? 0 ?
ssexhd()+380         CALL     ksedmp()+0           3 ? 0 ? 1 ? FFBE56B8 ? 1 ?
                                                   6 ?
sigacthandler()+40   PTR_CALL 00000000             A ? FFBE5F10 ? 19FE000 ?
                                                   19FE000 ? 0 ? 0 ? 
kjrfnd()+44          PTR_CALL 00000000             A ? FFBE5F10 ? FFBE5C58 ?
kjrref()+176         CALL     kjrfnd()+0           4177 ? F6A7F020 ? 0 ? 41DF ?
kjuocl()+732         CALL     kjrref()+0           FFBE63AC ? 19FA400 ?
kjusuc()+1260        CALL     kjuocl()+0           FFBE6218 ? EB5FB9A8 ?
                                                   EB5FB9A8 ? 5 ? 5 ? 0 ?
ksipget()+832        CALL     kjusuc()+0           19FA400 ? FFBE63AC ? 0 ?
                                                   E2A2ED40 ? 19FA400 ? 8 ?
ksqcmi()+3356        CALL     ksipget()+0          10020 ? FFBE6648 ? EE15430C ?
                                                   0 ? 0 ? 0 ?
ksqgtl()+944         CALL     ksqcmi()+0           FFBE65A8 ? 1 ? EDEB4C90 ?
                                                   EE1542D4 ? 1 ? 0 ?
<... lots of stuff deleted here ...> 
sou2o()+20           CALL     opidrv()+0           3C ? FFBEF784 ? 19F8000 ?
                                                   2F6C6F67 ? 0 ? 0 ?
main()+160           CALL     sou2o()+0            FFBEFA80 ? 3C ? 4 ?
                                                   FFBEFA70 ? 1746CF4 ?
                                                   1A06318 ?
_start()+220         CALL     main()+0             0 ? FFBEFC2C ? 1A1D478 ?
                                                   19F8000 ? 0 ? 0 ?
----- Argument/Register Address Dump -----    <----Stack trace ends here

The ORA-7445 error message and 'Current SQL statement' are self explanatory. The stack is the first column on each line, and reads:
ksedmp <- ssexhd <- sigacthandler <- kjrfnd <- kjrref .....
The topmost routines are for error handling; this is why the failing function (kjrfnd()) is not at the top of the stack.
For the purposes of the of the ORA-600/ORA-7445 tool, customers can simply cut and paste from the trace file, starting with the first line of the stack trace (in this case 'ksedmp....'). Customers should paste the first 15-20 lines, or until the line "Argument/Register Dump", whichever is shorter.
When the third argument is "Object specific hardware error", this indicates that the error was caused by the lack of some operating system resource, usually memory or swap space. In this case, please check your operating systems errorlog file for any accompanying error message, eg.:
Linux: /var/log/messages, Solaris, HP Tru64: /var/adm/messages, HP-UX: /var/adm/syslog/syslog.log, AIX: /bin/errpt
This will then typically show a message of the form:
Apr 7 15:30:00 aemhrsPD1 genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 15415 (cron).
Does The Problem Reproduce?
The time to resolution of an ORA-7745 error can be shortened considerably if the customer is able to reproduce the error at will, or better still, provide a reproducible testcase (see Note:232963.1).
The first step is to identify the SQL being run at the time of the error. Note: 154170.1 discusses how to extract the currently running SQL statement from trace files. It is important to remember that the currently running SQL maybe the victim of some previous problem, and may not reproduce the error when run in isolation.
Often, it is useful to build up a pattern of when the error occurs, for example:
    - does the error only occur for certain bind variables in the current SQL?     - does the error always occur around a particular time of day?     - is the error timing related to other activity on the database, such as a backup or high load?     - is the error always from a given application, or user?     - when did the error first start to occur? What changed around this time?     - are there any other errors (especially ORA-7445 or ORA-600 errors) around the same time?

Video - Customer Introduction to ORA-7445 Errors (04:27)

References

NOTE:153788.1 - ORA-600/ORA-7445/ORA-700 Error Look-up Tool NOTE:154170.1 - How to Find the Offending SQL from a Trace File NOTE:1812.1 - TECH: Getting a Stack Trace from a CORE file on Unix
NOTE:232963.1 - How to Build a Testcase for Oracle Data Server Support to Reproduce ORA-600 and ORA-7445 Errors NOTE:268451.1 - ORA-7445: exception encountered: core dump [...] [SIGBUS] [Object specific hardware error] NOTE:411.1 - ADR Different Methods to Create IPS Package

原文地址:https://www.cnblogs.com/bwdata/p/3895316.html