三节点DG环境主库单机转RAC(DG主备切换)

三节点DG环境主库单机转RAC(DG主备切换)

DBhanG 2020-08-24 09:03:22 164 收藏
文章标签: linux 数据库 oracle sql 运维
版权
生产三节点DG环境主库单机转RAC(DG主备切换)
所有数据均以脱敏
prod主库(15.91)有关DG环境的参数:

*.db_unique_name='prod'
*.fal_client='prod'
*.fal_server='proddg'
*.log_archive_config='DG_CONFIG=(prod,proddg,proddg2)'
*.log_archive_dest_1='location=/data/zjcbss/arch'
*.log_archive_dest_2='service=proddg arch async VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=proddg'
*.log_archive_dest_3='SERVICE=proddg2 LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=proddg2'
*.log_archive_dest_state_2='ENABLE'
*.standby_file_management='AUTO'
1
2
3
4
5
6
7
8
9
proddg备库1(15.55)有关DG环境的参数:

*.db_file_name_convert='/data/prod2/data','/oradata/rman'
*.log_file_name_convert='/data/prod2/data','/oradata/rman'
*.db_unique_name='proddg'
*.fal_client='proddg'
*.fal_server='prod'
*.log_archive_config='DG_CONFIG=(prod,proddg)'
*.log_archive_dest_1='location=/oradata/oracle/arch'
*.log_archive_dest_2='SERVICE=prod LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=prod'
*.log_archive_dest_state_2='ENABLE'
*.standby_file_management='AUTO'
1
2
3
4
5
6
7
8
9
10
proddg2备库2(15.101)有关DG环境的参数:

*.db_file_name_convert='/data/prod2/data','/oradata/zjcbss'
*.log_file_name_convert='/data/prod2/data','/oradata/zjcbss/prod2/data'
*.fal_client='proddg2'
*.fal_server='prod'
*.log_archive_config='DG_CONFIG=(prod,proddg,proddg2)'
*.log_archive_dest_1='location=/oradata/zjcbss/arch'
*.log_archive_dest_2='SERVICE=prod LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=prod'
*.log_archive_dest_state_2='ENABLE'
*.standby_file_management='AUTO'
1
2
3
4
5
6
7
8
9
切换后的环境为:

主库proddg2 192.168.15.101
备库proddg 192.168.15.55
备库prod 192.168.15.91

//先切换,后修改参数
//切换成功后,proddg2为主库,prod为备库,proddg不属于该DG环境中,后续修改参数将proddg加入,将prod剔除
1
2
3
4
5
6
一:切换前准备:

1.本地杀会话
杀客户端进程:
ps -ef|grep "LOCAL=NO"|grep -v grep|awk '{print "kill -9 " $2}'| sh --->直接删

查询连接会话数量:
ps -ef|grep "LOCAL=NO"| wc -l --> 生产中是杀不干净的,几秒钟连接数又到了几十个

select count(*) from v$session where username is not null;

2.修改crontab删除归档脚本
三节点关闭归档日志删除计划任务//注释掉crontab
cronbab -l

3.在主库切换归档日志(多次切换,确保备库正常接收)
alter system switch logfile

4.standby redolog数据量配置正确:
standby redolog=redolog + 1
(经检查现环境符合要求,无需修改)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
二:开始进行switchover切换

1.主库prod切为备库:

select name,database_role,switchover_status from v$database;

NAME DATABASE_ROLE SWITCHOVER_STATUS
--------- ---------------- --------------------
prod PRIMARY SESSIONS ACTIVE

如果switchover_status状态是SESSIONS ACTIVE --> 生产下百分百是session active

使用下面命令切换
alter database commit to switchover to physical standby with session shutdown;
1
2
3
4
5
6
7
8
9
10
2.切换完成后数据库会被关闭,重新启动查看切换完的状态:

select name,database_role,switchover_status from v$database;

NAME DATABASE_ROLE SWITCHOVER_STATUS
--------- ---------------- --------------------
prod PHYSICAL STANDBY TO PRIMARY
//如果此时switchover_status显示为recovery needed,则启动日志应用服务alter database recover managed standby database disconnect from session后,再次查看

//在生产上我切换完,他的switchover_status依旧为session active,没有去过多在意

select database_role,protection_mode,protection_level from v$database;

DATABASE_ROLE PROTECTION_MODE PROTECTION_LEVEL
---------------- -------------------- --------------------
PHYSICAL STANDBY MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
3.备库proddg2切为主库:

//查看切换状态
select database_role,switchover_status from v$database;

DATABASE_ROLE SWITCHOVER_STATUS
---------------- --------------------
PHYSICAL STANDBY TO PRIMARY
(如果此时switchover_status为not allowed,可以尝试进行切换,按错误提示进行处理,一般为ORA-16139: media recovery required
尝试进行recover:SQL>recover managed standby database using current logfile disconnect from session;)

//生产上,备库竟然也是session active,并且该库也不是被用于查询使用。

//备库切换为主库
alter database commit to switchover to primary **with session shutdown;**
1
2
3
4
5
6
7
8
9
10
11
12
13
4.查看切换后的状态:

SQL> select name,database_role,protection_mode,protection_level from v$database;

NAME DATABASE_ROLE PROTECTION_MODE PROTECTION_LEVEL
--------- ---------------- -------------------- --------------------
proddg2 PRIMARY MAXIMUM PERFORMANCE UNPROTECTED

SQL> alter database open;
1
2
3
4
5
6
7
三:切换后的参数修改

proddg2:
alter system set log_archive_config='DG_CONFIG=(proddg2,proddg)' scope=both;
alter system set log_archive_dest_2='SERVICE=proddg LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=proddg' scope=both;
alter system set log_archive_dest_state_2='ENABLE' scope=both;
alter system set fal_server='proddg' scope=both;
alter system set fal_client='proddg2' scope=both;
alter system set db_file_name_convert='/oradata/rman','/oradata/zjcbss' scope=spfile;
alter system set log_file_name_convert='/oradata/rman','/oradata/zjcbss/prod2/data' scope=spfile;
1
2
3
4
5
6
7
8
proddg:

alter system set log_archive_config='DG_CONFIG=(proddg2,proddg)' scope=both;
alter system set log_archive_dest_2='SERVICE=proddg2 LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=proddg2' scope=both;
alter system set log_archive_dest_state_2='ENABLE' scope=both;
alter system set fal_server='proddg2' scope=both;
alter system set fal_client='proddg' scope=both;
alter system set db_file_name_convert='/oradata/zjcbss','/oradata/rman' scope=spfile;
alter system set log_file_name_convert='/oradata/zjcbss/prod2/data','/oradata/rman' scope=spfile;
1
2
3
4
5
6
7
prod无需修改参数,该节点已从DG环境中剔除

四:切换后测试

1.启动proddg的日志应用服务:
SQL> alter database recover managed standby database disconnect from session;

2.在新主库proddg2进行几次日志切换,查看alert日志,判断日志是否正常传输
SQL> alter system switch logfile;

//如果日志没有正常传输,则

alter system set log_archive_dest_state_2='defer' scope=both;

alter system set log_archive_dest_state_2='enable' scope=both;

重新开启一下service传输,再次查看alert日志,判断日志是否正常传输
1
2
3
4
5
6
7
8
9
10
11
12
13
3.检查日志应用情况:

SQL> select max(sequence#) from v$archived_log where applied='YES'; //备库查询
SQL> select max(sequence#) from v$archived_log; //主库查询
1
2
五:其他配置

1.在新的DG环境下重新布置归档日志删除脚本
将prod的归档日志删除脚本拷贝到proddg2相同目录下,并配置crontab计划任务
0,10,20,30,40,50 * * * * /home/oracle/delArch/delArchivelog.sh >> /home/oracle/delArch/delArchivelog.log

//注意修改脚本中的SID

2.将proddg的归档日志删除脚本注释打开

3.为了业务不修改service_name即可连接
在proddg库中再添加一个service_name=prod
alter system set service_names=proddg2,prod scope=both;

4.系统业务进行测试是否正常
1
2
3
4
5
6
7
8
9
10
11
12
13
六:遇到的问题汇总:

//主库切换为备库切换完后,他的switchover_status依旧为session active,正常情况应该为to primary
个人理解:切换为备库后,数据库会被关闭,再次启动后,会启动到read only状态,启动后,还是有新会话进行连接查询,所以该状态为session active

当DG切换完成后,日志正常传输以及日志正常应用,由于db_file_name_convert与log_file_name_convert参数被修改 ,并且该参数无法动态修改,只能使用scope=spfile,所以需要重新启动数据库,当重新启动到mount状态时,启动MRP进行日志应用,前台提示启动MRP成功,后台警告日志滚动将MRP强制关闭,并且后台日志中还出现许多许多找不到数据文件的错误(mount阶段会检测控制文件中记录的数据文件,如果数据文件不存在会在后台日志中报错,当open时,才会在前台报错,无法锁定数据文件。)
排查后最终发现,控制文件中记录的数据文件路径不正确,将228个数据文件进行重命名后,错误消失,日志传输正常,MRP进程启动正常。


————————————————
版权声明:本文为CSDN博主「DBhanG」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
原文链接:https://blog.csdn.net/baoyuhang0/article/details/108192683

原文地址:https://www.cnblogs.com/yaoyangding/p/14933270.html