Oracle-DG疑问,什么情况下主库会发出一个会话连接备库

一、疑问,什么情况下主库会发出一个会话连接备库

可以根据上述问题,进行一定猜测。

例如,主库启动时,远程归档线程自动发出远程连接dg备库的请求;

以及远程归档线程状态参数,defer,enable切换时会尝试发起新的连接;

二、测试

2.1 主备连接正常,主库shutdown immediate

Primary
SQL> shutdown immediate;

Standby
ps -ef|grep LOCAL=NO  之前有来自主库远程发起的5个session,均断开
dg_alert
2021-04-20T07:53:39.624556+08:00
RFS[13]: Assigned to RFS process (PID:51957)
RFS[13]: Possible network disconnect with primary database
其中51957 是其中一个主库远程发起session 在备库主机上的pid

2.2 主备连接正常,主库startup

Primary

SQL> startup

DB_alert

2021-04-20T08:00:48.636757+08:00
TT02: Standby redo logfile selected for thread 1 sequence 98 for destination LOG_ARCHIVE_DEST_2
2021-04-20T08:00:48.761592+08:00
ARC1: Standby redo logfile selected for thread 1 sequence 97 for destination LOG_ARCHIVE_DEST_2



Standby

$ ps -efmT|grep LOCAL=NO
oracle 54578
oracle 54580
oracle 54582

DG_ALERT

2021-04-20T08:00:48.599128+08:00
Primary database is in MAXIMUM PERFORMANCE mode
RFS[14]: Assigned to RFS process (PID:54578)
RFS[14]: Selected log 4 for T-1.S-98 dbid 1594123398 branch 1066342665
2021-04-20T08:00:48.745723+08:00
RFS[15]: Assigned to RFS process (PID:54580)
RFS[15]: Selected log 5 for T-1.S-97 dbid 1594123398 branch 1066342665
2021-04-20T08:00:48.851883+08:00
Recovery: global flush, serial recovery

主库启动时,远程归档线程会连接备库,备库RFS进程接收归档信息进行备库本地化存储归档日志。

2.3 备库kill远程会话,关闭dg监听

备库
$ ps -ef|grep LOCAL=NO
oracle     3097      1  0 05:07 ?        00:00:00 oraclec12 (LOCAL=NO)
oracle     3099      1  0 05:07 ?        00:00:00 oraclec12 (LOCAL=NO)
oracle     3101      1  0 05:07 ?        00:00:00 oraclec12 (LOCAL=NO)
oracle     3103      1  0 05:07 ?        00:00:00 oraclec12 (LOCAL=NO)


SQL> select s.sid,s.serial#,p.program,s.username,p.username,p.background,s.program,s.LAST_CALL_ET,s.LOGON_TIME,s.status 
 from v$process p,v$session s where p.addr=s.paddr and p.spid in(3097,3099,3101,3103);
       SID    SERIAL# PROGRAM                   USERNAME   USERNAME   BA PROGRAM                   LAST_CALL_ET LOGON_TIME   STATUS
---------- ---------- ------------------------- ---------- ---------- -- ------------------------- ------------ ------------ ----------------
        51      18042 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)               9 09-MAR-21    INACTIVE
        49      51581 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)             853 09-MAR-21    INACTIVE
        50      45678 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)             849 09-MAR-21    INACTIVE
        47       5930 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)               0 09-MAR-21    INACTIVE

主库
SQL> shutdown immediate;
备库
远程session 都断开连接!
备库$lsnrctl stop

主库
SQL> startup
select  DEST_NAME , STATUS , RECOVERY_MODE from v$archive_dest_status where dest_name='LOG_ARCHIVE_DEST_2';
DEST_NAME            STATUS             RECOVERY_MODE
-------------------- ------------------ ---------------
LOG_ARCHIVE_DEST_2   ERROR              IDLE


备库
SQL> host lsnrctl start
$ ps -ef|grep -v grep|grep LOCAL=NO|wc -l
0

alter system set log_archive_dest_state_2=defer;
alter system set log_archive_dest_state_2=enable;


备库listener
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
WARNING: Subscription for node down event still pending
09-MAR-2021 05:28:48 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=c12)(USER=oracle))(COMMAND=status)(ARGUMENTS=64)(SERVICE=LISTENER)(VERSION=203424000)) * status * 0
2021-03-09T05:31:30.969614+08:00
09-MAR-2021 05:31:30 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=orcldg)(CID=(PROGRAM=oracle@c11)(HOST=c11)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.0.0.98)(PORT=54711)) * establish * orcldg * 0
09-MAR-2021 05:31:30 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=orcldg)(CID=(PROGRAM=oracle@c11)(HOST=c11)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.0.0.98)(PORT=54712)) * establish * orcldg * 0
2021-03-09T05:31:32.030043+08:00
09-MAR-2021 05:31:32 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=orcldg)(CID=(PROGRAM=oracle)(HOST=c11)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.0.0.98)(PORT=54713)) * establish * orcldg * 0
09-MAR-2021 05:31:32 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=orcldg)(CID=(PROGRAM=oracle)(HOST=c11)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.0.0.98)(PORT=54714)) * establish * orcldg * 0
2021-03-09T05:31:33.086332+08:00
09-MAR-2021 05:31:33 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=orcldg)(CID=(PROGRAM=oracle)(HOST=c11)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.0.0.98)(PORT=54715)) * establish * orcldg * 0
09-MAR-2021 05:31:33 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=orcldg)(CID=(PROGRAM=oracle)(HOST=c11)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.0.0.98)(PORT=54716)) * establish * orcldg * 0
09-MAR-2021 05:31:33 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=orcldg)(CID=(PROGRAM=oracle@c11)(HOST=c11)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.0.0.98)(PORT=54717)) * establish * orcldg * 0
09-MAR-2021 05:31:33 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=orcldg)(CID=(PROGRAM=oracle)(HOST=c11)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.0.0.98)(PORT=54719)) * establish * orcldg * 0
09-MAR-2021 05:31:33 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=orcldg)(CID=(PROGRAM=oracle@c11)(HOST=c11)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.0.0.98)(PORT=54718)) * establish * orcldg * 0

$ ps -ef|grep LOCAL=NO|grep -v grep
oracle     4874      1  0 05:31 ?        00:00:00 oraclec12 (LOCAL=NO)
oracle     4876      1  0 05:31 ?        00:00:00 oraclec12 (LOCAL=NO)
oracle     4878      1  0 05:31 ?        00:00:00 oraclec12 (LOCAL=NO)
oracle     4880      1  0 05:31 ?        00:00:00 oraclec12 (LOCAL=NO)
oracle     4882      1  0 05:31 ?        00:00:00 oraclec12 (LOCAL=NO)
也就是说,12C 主备之间存在多个session
select s.sid,s.serial#,p.program,s.username,p.username,p.background,s.program,s.LAST_CALL_ET,s.LOGON_TIME,s.status 
 from v$process p,v$session s where p.addr=s.paddr and p.spid in(4874,4876,4878,4880,4882);
       SID    SERIAL# PROGRAM                   USERNAME   USERNAME   BA PROGRAM                   LAST_CALL_ET LOGON_TIME   STATUS
---------- ---------- ------------------------- ---------- ---------- -- ------------------------- ------------ ------------ ----------------
        27      44814 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)              55 09-MAR-21    INACTIVE
        56      62720 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)               1 09-MAR-21    INACTIVE
        55      38475 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)             115 09-MAR-21    INACTIVE
        28       3511 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)             115 09-MAR-21    INACTIVE
        51      40708 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)             115 09-MAR-21    INACTIVE
        
可以发现备库关闭监听,也就代表拒绝新的连接。
而手工在dg备库kill远程会话,就是将已有的连接kill,因此主备之间连通性断开!

然后重置远程归档线程参数的状态defer,enable本质是让主库重新发起一次连接dg备库的请求!

2.4 dg备库取消MRP进程

alter database recover managed standby database cancel;
会话还在,主备连通性无改变,只是dg同步中断。

2.5 dg kill 远程会话

SQL> alter system kill session '56,62720' immediate;  

主库报Alert
2021-03-09T05:37:27.839588+08:00
TT02: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (3135)
TT02: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
2021-03-09T05:37:27.853485+08:00
Errors in file /picclife/app/oracle/diag/rdbms/orcl/c11/trace/c11_tt02_4865.trc:
ORA-03135: 连接失去联系
Error 3135 for archive log file 2 to 'tns_standby'
2021-03-09T05:37:27.855995+08:00
Errors in file /picclife/app/oracle/diag/rdbms/orcl/c11/trace/c11_tt02_4865.trc:
ORA-03135: 连接失去联系
2021-03-09T05:37:27.856095+08:00
Errors in file /picclife/app/oracle/diag/rdbms/orcl/c11/trace/c11_tt02_4865.trc:
ORA-03135: 连接失去联系



alter system set log_archive_dest_state_2=defer;
alter system set log_archive_dest_state_2=enable;

SQL> select s.sid,s.serial#,p.program,s.username,p.username,p.background,s.program,s.LAST_CALL_ET,s.LOGON_TIME,s.status from v$process p,v$session s where p.addr=s.paddr and s.PROGRAM='oracle@c11 (TNS V1-V3)';
       SID    SERIAL# PROGRAM                   USERNAME   USERNAME   BA PROGRAM                   LAST_CALL_ET LOGON_TIME   STATUS
---------- ---------- ------------------------- ---------- ---------- -- ------------------------- ------------ ------------ ----------------
        60      28802 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)              34 09-MAR-21    INACTIVE
        56      17436 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)               1 09-MAR-21    INACTIVE
        55      38475 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)             481 09-MAR-21    INACTIVE
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE archivelog DISCONNECT FROM SESSION;
alter database recover managed standby database cancel;


recover managed standby database disconnect from session;

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING ARCHIVED LOGFILE DISCONNECT;

MRP0            WAIT_FOR_LOG      

       SID    SERIAL# PROGRAM                   USERNAME   USERNAME   BA PROGRAM                   LAST_CALL_ET LOGON_TIME   STATUS
---------- ---------- ------------------------- ---------- ---------- -- ------------------------- ------------ ------------ ----------------
        55      57549 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)              13 09-MAR-21    INACTIVE
        56      17436 oracle@c12                PUBLIC     oracle        oracle@c11 (TNS V1-V3)               1 09-MAR-21    INACTIVE

上述问题,是我们重置参数修复的,新的问题又来了,备库kill 远程会话,Oracle能自动解决这个问题吗? 会话断开意味着主备连通性不正常,无法传输主库redo!



1) dg kill session ,primary db alert

2021-04-20T08:21:50.887515+08:00
TT02: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (3135)
TT02: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
2021-04-20T08:21:50.887683+08:00
Errors in file /picclife/app/oracle/diag/rdbms/orcl/c11/trace/c11_tt02_10879.trc:
ORA-03135: 连接失去联系
Error 3135 for archive log file 2 to 'tns_standby'
2021-04-20T08:21:50.891152+08:00
Errors in file /picclife/app/oracle/diag/rdbms/orcl/c11/trace/c11_tt02_10879.trc:
ORA-03135: 连接失去联系
2021-04-20T08:21:50.892138+08:00
Errors in file /picclife/app/oracle/diag/rdbms/orcl/c11/trace/c11_tt02_10879.trc:
ORA-03135: 连接失去联系

2) 查询dg连通性状态

SQL> select dest_id,error from v$archive_dest;

DEST_ID ERROR
---------- -----------------------------------------------------------------
2 ORA-03135: connection lost contact

3)主库dml测试!

SQL> insert into b values(1);

1 row created.

SQL> commit;

dg 无法查询数据!

4)primary switch archive log 

SQL> alter system switch logfile;

SQL> alter system archive log current;

主库多次切换归档,但是由于远程归档报错,并未重新发起连接dg的请求!

SQL> select max(sequence#),thread# from v$archived_log group by thread#;

MAX(SEQUENCE#) THREAD#
-------------- ----------
102 1

并且dg配置了fal server 追gap的情况未出现,因为陷入了一个思维陷阱!

dg   sequence 最大 97, gap查询不出数据!

MAX(SEQUENCE#) THREAD#
-------------- ----------
97 1    

5) wait for 6 minute, 主库再次尝试发出请求,dg 会话,连通性自动恢复正常,请注意关注时间点顺序!

Primary

ORA-03135: 连接失去联系
2021-04-20T08:21:50.892138+08:00


2021-04-20T08:26:20.347305+08:00
ALTER SYSTEM ARCHIVE LOG
2021-04-20T08:26:20.349898+08:00
Thread 1 cannot allocate new log, sequence 101
Checkpoint not complete
Current log# 1 seq# 100 mem# 0: /picclife/app/oracle/oradata/orcl/redo01.log
2021-04-20T08:26:22.826568+08:00
Thread 1 advanced to log sequence 101 (LGWR switch)
Current log# 2 seq# 101 mem# 0: /picclife/app/oracle/oradata/orcl/redo02.log
2021-04-20T08:26:22.838124+08:00
Archived Log entry 192 added for T-1.S-100 ID 0x5f054486 LAD:1
2021-04-20T08:26:22.843289+08:00
ALTER SYSTEM ARCHIVE LOG
2021-04-20T08:26:22.848645+08:00
Thread 1 advanced to log sequence 102 (LGWR switch)
Current log# 3 seq# 102 mem# 0: /picclife/app/oracle/oradata/orcl/redo03.log
2021-04-20T08:26:22.851459+08:00
Archived Log entry 193 added for T-1.S-101 ID 0x5f054486 LAD:1
2021-04-20T08:26:23.299206+08:00
ALTER SYSTEM ARCHIVE LOG
2021-04-20T08:26:23.301270+08:00
Thread 1 cannot allocate new log, sequence 103
Checkpoint not complete
Current log# 3 seq# 102 mem# 0: /picclife/app/oracle/oradata/orcl/redo03.log
2021-04-20T08:26:25.863465+08:00
Thread 1 advanced to log sequence 103 (LGWR switch)
Current log# 1 seq# 103 mem# 0: /picclife/app/oracle/oradata/orcl/redo01.log
2021-04-20T08:26:25.866709+08:00
Archived Log entry 194 added for T-1.S-102 ID 0x5f054486 LAD:1
2021-04-20T08:27:49.298684+08:00
TT04: Standby redo logfile selected for thread 1 sequence 103 for destination LOG_ARCHIVE_DEST_2
2021-04-20T08:28:49.448793+08:00
ARC3: Standby redo logfile selected for thread 1 sequence 98 for destination LOG_ARCHIVE_DEST_2

Standby

2021-04-20T08:27:49.287742+08:00
Primary database is in MAXIMUM PERFORMANCE mode
RFS[16]: Assigned to RFS process (PID:56268)
RFS[16]: Selected log 5 for T-1.S-103 dbid 1594123398 branch 1066342665
2021-04-20T08:28:49.348064+08:00
RFS[17]: Assigned to RFS process (PID:56328)
RFS[17]: Opened log for T-1.S-100 dbid 1594123398 branch 1066342665
2021-04-20T08:28:49.353374+08:00
RFS[18]: Assigned to RFS process (PID:56330)
RFS[18]: Opened log for T-1.S-99 dbid 1594123398 branch 1066342665
2021-04-20T08:28:49.361200+08:00
Archived Log entry 17 added for thread 1 sequence 99 rlc 1066342665 ID 0x5f054486 LAD2 :
2021-04-20T08:28:49.391236+08:00
Archived Log entry 18 added for thread 1 sequence 100 rlc 1066342665 ID 0x5f054486 LAD2 :
RFS[18]: Opened log for T-1.S-101 dbid 1594123398 branch 1066342665
RFS[17]: Opened log for T-1.S-102 dbid 1594123398 branch 1066342665
2021-04-20T08:28:49.403222+08:00
Archived Log entry 19 added for thread 1 sequence 101 rlc 1066342665 ID 0x5f054486 LAD2 :
2021-04-20T08:28:49.407801+08:00
Archived Log entry 20 added for thread 1 sequence 102 rlc 1066342665 ID 0x5f054486 LAD2 :
2021-04-20T08:28:49.434004+08:00
RFS[19]: Assigned to RFS process (PID:56332)
RFS[19]: Selected log 4 for T-1.S-98 dbid 1594123398 branch 1066342665
2021-04-20T08:28:49.462006+08:00

原文地址:https://www.cnblogs.com/lvcha001/p/14679523.html