PMON failed to acquire latch, see PMON dump

Symptom:

The database listener crashed.

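Troubleshooting:

The database started misbehaving. As soon as the alert came in I logged in to the server, ran lsnrctl status, and found that the listener was dead. I restarted it immediately, but it took about a minute to come up.

After that the database accepted connections again and service was back to normal.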

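Next I looked for the root cause and found this error in the alert log: PMON failed to acquire latch, see PMON dump

A search online showed that this is, once again, a bug.

Resolution: the team will discuss whether to apply the patch.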

"Pmon Failed To Acquire Latch" Messages in Alert Log - Database Hung [ID 468740.1]


This document is being delivered to you via Oracle Support's Rapid Visibility (RaV) process and therefore has not been subject to an independent technical review.

Applies to:

Oracle Database - Enterprise Edition - Version 10.2.0.1 to 11.1.0.7 [Release 10.2 to 11.1]
Information in this document applies to any platform.
***Checked for relevance on 26-Apr-2012***

Symptoms

The database instance hangs, and new connections to the database using 'sqlplus' also fail.

The alert.log shows the following messages:

PMON failed to acquire latch, see PMON dump
Fri Oct 5 10:33:00 2007
PMON failed to acquire latch, see PMON dump
Fri Oct 5 10:34:05 2007
PMON failed to acquire latch, see PMON dump
Errors in file /dwrac/BDUMP/dwhp_pmon_1912834.trc:
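
To see how often the message repeats and when, the alert log can be scanned with a few lines of Python. This is only a sketch: the function name find_pmon_latch_failures is made up for illustration, and it assumes the 10g-style layout shown above, where a timestamp line precedes the messages it applies to.

```python
from datetime import datetime

PMON_MSG = "PMON failed to acquire latch"
# Assumed timestamp layout, e.g. "Fri Oct 5 10:33:00 2007"
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def find_pmon_latch_failures(lines):
    """Return one entry per PMON latch message: the most recent
    timestamp seen before it, or None if no timestamp preceded it."""
    hits = []
    last_ts = None
    for raw in lines:
        line = raw.strip()
        try:
            # A line that parses as a timestamp updates the current time.
            last_ts = datetime.strptime(line, TS_FORMAT)
            continue
        except ValueError:
            pass  # not a timestamp line
        if PMON_MSG in line:
            hits.append(last_ts)
    return hits


sample_lines = """PMON failed to acquire latch, see PMON dump
Fri Oct 5 10:33:00 2007
PMON failed to acquire latch, see PMON dump
Fri Oct 5 10:34:05 2007
PMON failed to acquire latch, see PMON dump
Errors in file /dwrac/BDUMP/dwhp_pmon_1912834.trc:""".splitlines()

hits = find_pmon_latch_failures(sample_lines)
for ts in hits:
    print(ts if ts is not None else "(no timestamp seen yet)")
```

For a real log, pass the open file object instead of sample_lines. Messages that recur roughly once a minute, as in the excerpt, suggest PMON is retrying against a latch that is never released.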

PMON also writes a systemstate dump; its location is recorded in alert.log.

At the OS level, MMAN is also consuming a lot of CPU.

The systemstate dump shows that MMAN is holding the shared pool latch, and that the location from which the latch is held is kgh_next_free or quiesce extents:

SO: 3df6835b8, type: 2, owner: 0, flag: INIT/-/-/0x00
(process) Oracle pid=4, calls cur/top: 3dfa94c48/3dfa94c48, flag: (6)
SYSTEM
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 0 0 24
(latch info) wait_event=0 bits=80
holding (efd=3) 3800db408 Child shared pool level=7 child#=1
Location from where latch is held: kgh_next_free:
Context saved from call: 0
state=busy, wlstate=free
waiter count=3
Process Group: DEFAULT, pseudo proc: 3df77a4b8
O/S info: user: oracle10, term: UNKNOWN, ospid: 25931
OSD pid info: Unix process pid: 25931, image: oracle@dbname (MMAN)
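
Systemstate dumps are long, so it can help to pull the interesting fields out of a process block automatically. The sketch below is an assumption-laden illustration: the regular expressions are modeled only on the excerpt above, the helper name summarize_latch_holder is hypothetical, and other Oracle versions may print these lines differently.

```python
import re

# Patterns modeled on the excerpt above (assumed, not an official format).
HOLDING_RE = re.compile(
    r"holding\s+\(efd=\d+\)\s+(?P<addr>[0-9a-f]+)\s+(?P<name>.+?)\s+level=(?P<level>\d+)"
)
LOCATION_RE = re.compile(r"Location from where latch is held:\s*(?P<loc>[^:\s]+)")
WAITERS_RE = re.compile(r"waiter count=(?P<count>\d+)")
IMAGE_RE = re.compile(r"image:\s*\S+\s+\((?P<proc>\w+)\)")


def summarize_latch_holder(block):
    """Extract latch name, level, hold location, waiter count and
    background process name from one process block, where present."""
    summary = {}
    m = HOLDING_RE.search(block)
    if m:
        summary["latch"] = m.group("name")
        summary["level"] = int(m.group("level"))
    m = LOCATION_RE.search(block)
    if m:
        summary["location"] = m.group("loc")
    m = WAITERS_RE.search(block)
    if m:
        summary["waiters"] = int(m.group("count"))
    m = IMAGE_RE.search(block)
    if m:
        summary["process"] = m.group("proc")
    return summary


sample_block = """\
(latch info) wait_event=0 bits=80
holding (efd=3) 3800db408 Child shared pool level=7 child#=1
Location from where latch is held: kgh_next_free:
Context saved from call: 0
state=busy, wlstate=free
waiter count=3
OSD pid info: Unix process pid: 25931, image: oracle@dbname (MMAN)
"""

summary = summarize_latch_holder(sample_block)
print(summary)
```

A block where the process is MMAN, the latch is a shared pool child latch, and the location is kgh_next_free matches the signature described in this note.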

The short stack looks like this:
kghquiesce_regular_extent kgh_next_free ksmc_next_free kmgs_extract_mem_from_granule @kmgs_check_inuse_lists

Cause

Oracle Development worked on this issue in:

Bug 6488694 - DATABASE HUNG WITH PMON FAILED TO ACQUIRE LATCH MESSAGE

Bug 6488694 was closed as a duplicate of Bug 7039896.

Solution

Apply patch 7039896 for your version and operating system.

The issue is fixed in:

11.2.0.1 (Base Release)
10.2.0.5 (Server Patch Set)
10.2.0.4.1 (Patch Set Update)
10.2.0.4 Patch 22 on Windows Platforms
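
The "Applies to" range and the fixed-in list above can be turned into a rough version check. This is a sketch under stated assumptions: the helper names are hypothetical, the caller supplies the version string (including any PSU level), and the check cannot see one-off patches or the Windows patch bundle, so a True result only means "possibly exposed".

```python
def parse_version(text):
    """Turn '10.2.0.4.1' into (10, 2, 0, 4, 1)."""
    return tuple(int(part) for part in text.split("."))


def pad(version, length=5):
    """Right-pad a version tuple with zeros so tuples compare evenly."""
    return version + (0,) * (length - len(version))


def likely_affected(version_text):
    """True if the version falls in the note's 10.2.0.1 to 11.1.0.7 range
    and is below the 10.2 releases listed as containing the fix."""
    v = pad(parse_version(version_text))
    in_range = (10, 2, 0, 1) <= v[:4] <= (11, 1, 0, 7)
    # 10.2.0.4.1 (PSU) and 10.2.0.5 are listed as fixed.
    fixed_in_10_2 = v[:2] == (10, 2) and v >= (10, 2, 0, 4, 1)
    return in_range and not fixed_in_10_2


for candidate in ("10.2.0.4", "10.2.0.4.1", "10.2.0.5", "11.1.0.7", "11.2.0.1"):
    print(candidate, likely_affected(candidate))
```

Versions in the 11.1 line stay flagged because the list names no 11.1 patch set with the fix; there the one-off patch or a workaround would apply.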

Please refer to
Note 7039896.8 - Bug 7039896 - Spin under kghquiesce_regular_extent holding shared pool latch with AMM

Workarounds that can be used:

Disable Automatic Shared Memory Management (ASMM), i.e. set SGA_TARGET=0

- OR -
Set the init.ora parameter _enable_shared_pool_durations=false
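
For instances that use an spfile, the two workarounds roughly correspond to the ALTER SYSTEM statements below. They are kept as strings in a small Python helper so they can be reviewed or logged before a DBA runs them. The helper name and the SCOPE choices are assumptions, not part of the note; the hidden parameter is written to the spfile only here, so that variant assumes an instance restart.

```python
def workaround_statements():
    """Return candidate SQL for each workaround (a sketch, not verified
    against any particular instance)."""
    return {
        # Workaround 1: disable ASMM by setting SGA_TARGET to 0.
        "disable_asmm": "ALTER SYSTEM SET sga_target=0 SCOPE=BOTH;",
        # Workaround 2: hidden parameters must be double-quoted.
        "disable_shared_pool_durations": (
            'ALTER SYSTEM SET "_enable_shared_pool_durations"=FALSE '
            "SCOPE=SPFILE;"
        ),
    }


statements = workaround_statements()
for name, sql in statements.items():
    print(f"{name}: {sql}")
```

Only one of the two is needed. With SGA_TARGET=0 the individual pool sizes are no longer tuned automatically, so check the explicit pool settings before making that change permanent.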

References

BUG:6488694 - DATABASE HUNG WITH PMON FAILED TO ACQUIRE LATCH MESSAGE

NOTE:7039896.8 - Bug 7039896 - Spin under kghquiesce_regular_extent holding shared pool latch with AMM