gen_server terminate and trap_exit

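Newcomers and experienced developers alike are often puzzled that a gen_server's terminate/2 runs in some cases but not in others.
One example is the Stack Overflow question "Handling the cleanup of the gen_server state":
the documentation for terminate is rather vague and does not spell out how to make sure terminate/2 gets called.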

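To sort out the different cases I ran a small experiment. The conclusions are as follows.
A process can be made to exit in two ways:

  • Internally: it runs to completion, or it crashes on an exception.
  • Externally: another process forces the running process to exit with erlang:exit/2.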
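
Cause of gen_server exit                 Start function          trap_exit   terminate/2
Internal crash                           any                     any         runs
exit(P, kill)                            any                     any         does not run
exit(P, Reason) sent by the parent       gen_server:start_link   true        runs
exit(P, Reason)                          any                     false       does not run
Pid ! {'EXIT', F, Reason}, F = parent    gen_server:start_link   any         runs
Pid ! {'EXIT', F, Reason}                gen_server:start        any         does not run (goes to handle_info/2)

In the experiment the shell is the parent of every server started with start_link. With trap_exit set to true, an exit signal from a process that is not the parent arrives as an {'EXIT', From, Reason} message in handle_info/2 and does not trigger terminate/2 by itself.
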
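  1. Note that kill is a brutal exit signal: it cannot be trapped, the process dies immediately and terminate/2 does not run. This is also what a supervisor uses to stop children started with the brutal_kill shutdown strategy.
  2. The most common way to end up with terminate/2 not running:
        1. The process is attached to a supervision tree under an Application (started with gen_server:start_link/3,4).
        2. When the Application stops, the supervisor shuts its children down one after another, the same way supervisor:terminate_child/2 does.
        3. The supervisor stops a worker by sending exit(Pid, shutdown).
        4. So if the worker has trap_exit set to false, terminate/2 does not run.
  3. To make sure terminate/2 runs in that case, call process_flag(trap_exit, true) in init/1.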
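
The recommendation above can be sketched as a minimal worker. This is only an illustration: the module name trap_exit_sketch and the empty map used as state are assumptions, not code from the experiment.

```erlang
-module(trap_exit_sketch).
-behaviour(gen_server).

-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

init([]) ->
    %% Turn exit signals (except kill) into {'EXIT', From, Reason} messages,
    %% so that a shutdown sent by the parent leads to terminate/2.
    process_flag(trap_exit, true),
    {ok, #{}}.

handle_call(_Request, _From, State) -> {reply, ignored, State}.

handle_cast(_Msg, State) -> {noreply, State}.

handle_info({'EXIT', _From, _Reason}, State) ->
    %% 'EXIT' messages from linked processes other than the parent land here.
    {noreply, State};
handle_info(_Info, State) ->
    {noreply, State}.

terminate(Reason, _State) ->
    %% Put the real cleanup here (close files, sockets and so on).
    io:format("cleaning up, reason: ~p~n", [Reason]),
    ok.
```

Once a process traps exits, 'EXIT' messages from any linked process other than the parent show up in handle_info/2, so that callback needs a clause for them.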

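Next, let's go through the cases one by one.

What trap_exit does

erlang:process_flag(trap_exit, true).
  • When set to false:
    If a linked process exits abnormally (exit(whatever)), this process exits abnormally as well.
    If a linked process exits normally (it runs to completion or calls exit(normal)), this process is not affected at all: it receives no message and does not exit.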
1> erlang:process_flag(trap_exit, false), self().
<0.64.0>
2> erlang:spawn_link(fun() -> exit(whatever) end). %% the child exits with a reason other than normal
** exception exit: whatever
3> self().  %% the parent exited abnormally as well; this is a new shell process
<0.68.0>
4> erlang:process_flag(trap_exit,false),self().
<0.68.0>
5> erlang:spawn_link(fun() -> exit(normal) end). %% the child exits with reason normal
<0.79.0>
6> erlang:spawn_link(fun() -> {ok, true} end). %% the child finishes normally, equivalent to 5)
<0.80.0>
11> flush(). %% no message was received
ok
12> self(). %% the process did not exit
<0.68.0>
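  • When set to true:
    If a linked process exits abnormally (for example exit(whatever)), this process does not exit; it just receives an {'EXIT', FromPid, whatever} message.
    If a linked process exits normally (it simply finishes), this process receives {'EXIT', FromPid, normal}.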
13> erlang:process_flag(trap_exit, true),self().
<0.68.0>
14> erlang:spawn_link(fun() -> exit(whatever) end).
<0.72.0>
15> flush().
Shell got {'EXIT',<0.72.0>,whatever}
ok
16> erlang:spawn_link(fun() -> {ok, true} end).
<0.90.0>
17> flush().
Shell got {'EXIT',<0.90.0>,normal}

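Summary:

  • trap_exit defaults to false. If a linked process crashes, the exit signal propagates over the link and this process exits too; if the linked process exits normally, this process is unaffected and receives no message.
  • With trap_exit set to true, this process receives an {'EXIT', FromPid, Reason} message whenever a linked process exits, whether normally or by crashing.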
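
The trapping behaviour summarised above can also be checked outside the shell. The following is a small sketch; the module name trap_exit_demo and the function watch/1 are made up for illustration. It runs the experiment in a throwaway process so the caller's own trap_exit flag is left untouched.

```erlang
-module(trap_exit_demo).
-export([watch/1]).

%% Spawns a helper that traps exits, links it to a child that exits with
%% Reason, and returns the reason carried by the resulting 'EXIT' message.
watch(Reason) ->
    Caller = self(),
    Tag = make_ref(),
    spawn(fun() ->
        process_flag(trap_exit, true),
        Child = spawn_link(fun() -> exit(Reason) end),
        receive
            {'EXIT', Child, Why} -> Caller ! {Tag, Why}
        end
    end),
    receive
        {Tag, Got} -> {exit_message, Got}
    after 1000 ->
        timeout
    end.
```

Both trap_exit_demo:watch(whatever) and trap_exit_demo:watch(normal) yield an exit_message tuple, which matches the observation that a trapping process is notified about normal exits too.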

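So what happens when a standalone gen_server process runs into an error?

What happens when a gen_server fails internally?

If a bug in the gen_server's own logic crashes it, for example a division by zero or applying ++ to atoms, does terminate/2 still run?
The answer: yes, it always does.
Here is a simple gen_server to verify this: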

-module(gen_server_test).
-behaviour(gen_server).

-export([start_link/1, start/1]).
-export([divide/2, stop/1, crash/1]).

%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2, code_change/3]).

%% API
start(TrapExit) ->
    gen_server:start({local, ?MODULE}, ?MODULE, [TrapExit], []).

start_link(TrapExit) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [TrapExit], []).

stop(Reason) ->
    gen_server:call(?MODULE, {stop, Reason}).

crash(Reason) ->
    gen_server:call(?MODULE, {crash, Reason}).

divide(X, Y) ->
    gen_server:call(?MODULE, {divide, X, Y}).

init([TrapExit]) ->
    erlang:process_flag(trap_exit, TrapExit),
    {ok, undefined}.

handle_call({divide, X, Y}, _From, State) ->
    io:format("[line:~p] Divide:~p/~p ~n", [?LINE, X, Y]),
    {reply, X/Y, State};
handle_call({stop, Reason}, _From, State) ->
    io:format("[line:~p] Got stop by ~p ~n", [?LINE, Reason]),
    {stop, Reason, ok, State};
handle_call({crash, Reason}, _From, State) ->
    io:format("[line:~p] Got crash: error(~p).~n", [?LINE, Reason]),
    erlang:error(Reason),
    {reply, ok, State};
handle_call(_Msg, _From, State) -> {reply, ignore, State}.

handle_info(Msg, State) ->
    io:format("[line:~p] Got ~p~n", [?LINE, Msg]),
    {noreply, State}.

terminate(Reason, _State) ->
    io:format("[line:~p] Terminate reason: ~p~n", [?LINE, Reason]),
    ok.

handle_cast(_Msg, State) -> {noreply, State}.

code_change(_Old, State, _Extra) -> {ok, State}.

  • Result when the server crashes internally:
1> c(gen_server_test).
{ok,gen_server_test}
2> gen_server_test:start_link(false).
{ok,<0.71.0>}
3> gen_server_test:divide(1,0).
[line:31] Divide:1/0
%% after the crash, the terminate/2 callback runs
[line:47] Terminate reason: {badarith,
                                [{gen_server_test,handle_call,3,
                                     [{file,"gen_server_test.erl"},{line,32}]},
                                 {gen_server,try_handle_call,4,
                                     [{file,"gen_server.erl"},{line,636}]},
                                 {gen_server,handle_msg,6,
                                     [{file,"gen_server.erl"},{line,665}]},
                                 {proc_lib,init_p_do_apply,3,
                                     [{file,"proc_lib.erl"},{line,247}]}]}
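%% The shell process is linked to gen_server_test because it called start_link.
%% gen_server:call/2 itself only monitors the server, it does not link to it.
%% When gen_server_test exits abnormally, the exit signal reaches the shell through the link;
%% the shell does not trap exits (trap_exit defaults to false), so it goes down too.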
** exception exit: badarith
     in function  gen_server_test:handle_call/3 (gen_server_test.erl, line 32)
     in call from gen_server:try_handle_call/4 (gen_server.erl, line 636)
     in call from gen_server:handle_msg/6 (gen_server.erl, line 665)
     in call from proc_lib:init_p_do_apply/3 (proc_lib.erl, line 247)
4>
%% When a gen_server process ends with a Reason other than normal, shutdown or {shutdown, Term}, it logs a report through error_logger by default.
=ERROR REPORT==== 21-May-2018::17:03:34 ===
** Generic server gen_server_test terminating
** Last message in was {divide,1,0}
** When Server state == undefined
** Reason for termination ==
** {badarith,[{gen_server_test,handle_call,3,
                               [{file,"gen_server_test.erl"},{line,32}]},
              {gen_server,try_handle_call,4,
                          [{file,"gen_server.erl"},{line,636}]},
              {gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,665}]},
              {proc_lib,init_p_do_apply,3,
                        [{file,"proc_lib.erl"},{line,247}]}]}
** Client <0.64.0> stacktrace
** [{gen,do_call,4,[{file,"gen.erl"},{line,169}]},
    {gen_server,call,2,[{file,"gen_server.erl"},{line,202}]},
    {erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]},
    {shell,exprs,7,[{file,"shell.erl"},{line,687}]},
    {shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]},
    {shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}]
  • Result when the process stops itself through its own callback:
4> gen_server_test:start_link(false).
{ok,<0.75.0>}
5> gen_server_test:stop(normal).
[line:34] Got stop by normal
[line:47] Terminate reason: normal
ok
6> gen_server_test:start_link(false).
{ok,<0.78.0>}
7> gen_server_test:stop(whatever).
[line:34] Got stop by whatever
[line:47] Terminate reason: whatever
%% A regular stop whose reason is not normal is still reported through error_logger
=ERROR REPORT==== 21-May-2018::17:07:02 ===
** Generic server gen_server_test terminating
** Last message in was {stop,whatever}
** When Server state == undefined
** Reason for termination ==
** whatever
** Client <0.73.0> stacktrace
** [{gen,do_call,4,[{file,"gen.erl"},{line,169}]},
    {gen_server,call,2,[{file,"gen_server.erl"},{line,202}]},
    {erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]},
    {shell,exprs,7,[{file,"shell.erl"},{line,687}]},
    {shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]},
    {shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}]
** exception exit: whatever

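Summary:

  • Whenever a gen_server exits from the inside, by stopping itself or by crashing, terminate/2 runs.
  • If the stop reason is not normal (or shutdown, {shutdown, Term}), error_logger records the exit.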
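
To check this without reading shell output, one option is to have terminate/2 report back to the caller. The sketch below is an assumption-laden illustration: the module terminate_probe, the function crash_and_observe/0 and the {terminated, Pid, Reason} message are all invented for this example.

```erlang
-module(terminate_probe).
-behaviour(gen_server).

-export([crash_and_observe/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

crash_and_observe() ->
    %% start/3 creates no link, so the crash cannot take the caller down.
    {ok, Pid} = gen_server:start(?MODULE, self(), []),
    gen_server:cast(Pid, crash),
    receive
        {terminated, Pid, Reason} -> {terminate_called, Reason}
    after 1000 ->
        terminate_not_called
    end.

%% The state is simply the pid that wants to hear about terminate/2.
init(Watcher) -> {ok, Watcher}.

handle_call(_Request, _From, Watcher) -> {reply, ok, Watcher}.

handle_cast(crash, _Watcher) ->
    %% Crash inside a callback; trap_exit is left at its default (false).
    erlang:error(boom).

handle_info(_Info, Watcher) -> {noreply, Watcher}.

terminate(Reason, Watcher) ->
    Watcher ! {terminated, self(), Reason},
    ok.
```

Calling terminate_probe:crash_and_observe() should return {terminate_called, {boom, Stacktrace}}, and the server also emits the usual error report because the reason is not normal.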

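What happens when a gen_server is forced to exit from outside?

  • exit(Pid, kill): terminate/2 does not run.
  • trap_exit false and exit(Pid, Reason): terminate/2 does not run.
  • trap_exit true and exit(Pid, Reason) sent by the parent (the process that called start_link): terminate/2 runs. The same signal from any other process is handed to handle_info/2 as an {'EXIT', From, Reason} message.
  • An {'EXIT', Pid, Reason} message sent to a process started with gen_server:start/3 is treated as an ordinary message and handled by handle_info/2.
  • An {'EXIT', Pid, Reason} message sent to a process started with gen_server:start_link/3, where Pid is the parent, is treated as an exit request and leads to terminate/2.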
8> {ok, Pid} = gen_server_test:start_link(false). %% trap_exit false
{ok,<0.86.0>}
9> erlang:exit(Pid, whatever).
** exception exit: whatever
10> {ok, Pid1} = gen_server_test:start_link(true). %% trap_exit true
{ok,<0.90.0>}
11> erlang:exit(Pid1, whatever).
[line:47] Terminate reason: whatever
true

=ERROR REPORT==== 21-May-2018::17:10:24 ===
** Generic server gen_server_test terminating
** Last message in was {'EXIT',<0.88.0>,whatever}
** When Server state == undefined
** Reason for termination ==
** whatever
** exception exit: whatever

12> gen_server_test:start(true).  %% started with gen_server:start/3, so the shell is not its parent and {'EXIT', self(), whatever} is treated as an ordinary message and handled by handle_info/2
{ok,<0.94.0>}
13> gen_server_test ! {'EXIT',self(), whatever}.
[line:43] Got {'EXIT',<0.92.0>,whatever}
{'EXIT',<0.92.0>,whatever}
14> gen_server_test:stop(normal).
[line:34] Got stop by normal
[line:47] Terminate reason: normal
ok
15> gen_server_test:start_link(false).  %% started with gen_server:start_link/3 from the shell, so {'EXIT', self(), whatever} looks like an exit message from the parent and leads to terminate/2
{ok,<0.98.0>}
16> gen_server_test ! {'EXIT',self(), whatever}.
[line:47] Terminate reason: whatever
{'EXIT',<0.92.0>,whatever}

=ERROR REPORT==== 21-May-2018::17:14:56 ===
** Generic server gen_server_test terminating
** Last message in was {'EXIT',<0.92.0>,whatever}
** When Server state == undefined
** Reason for termination ==
** whatever
** exception exit: whatever

17> gen_server_test:start_link(false).
{ok,<0.102.0>}
18> exit(pid(0,102,0), whatever).
** exception exit: whatever
19> gen_server_test:start_link(true).
{ok,<0.111.0>}
20> exit(pid(0,111,0), kill).
** exception exit: killed
21> gen_server_test:start_link(true).
{ok,<0.106.0>}
22> exit(pid(0,106,0), whatever).
[line:47] Terminate reason: whatever
true

=ERROR REPORT==== 21-May-2018::17:20:24 ===
** Generic server gen_server_test terminating
** Last message in was {'EXIT',<0.104.0>,whatever}
** When Server state == undefined
** Reason for termination ==
** whatever
** exception exit: whatever
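
The external cases above can be reproduced with a small probe. This is a hedged sketch, not the code used for the transcripts: the module exit_signal_probe, the function probe/2 and the {terminated, Pid} message are hypothetical. A helper process plays the role of the parent, because the exit signal only leads to terminate/2 when it comes from the process that called start_link.

```erlang
-module(exit_signal_probe).
-behaviour(gen_server).

-export([probe/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

%% Starts a linked server with the given trap_exit flag, sends it
%% exit(Pid, Reason) from its parent and reports whether terminate/2 ran.
probe(TrapExit, Reason) ->
    Caller = self(),
    Tag = make_ref(),
    spawn(fun() ->
        %% The helper traps exits so the linked server's death becomes a message.
        process_flag(trap_exit, true),
        {ok, Pid} = gen_server:start_link(?MODULE, {TrapExit, self()}, []),
        exit(Pid, Reason),
        Outcome =
            receive
                {'EXIT', Pid, _} ->
                    %% terminate/2 sends its message before the process dies,
                    %% so it is already in the mailbox if it was sent at all.
                    receive
                        {terminated, Pid} -> terminate_called
                    after 0 ->
                        terminate_skipped
                    end
            after 1000 ->
                %% For example exit(Pid, normal) sent to a non-trapping process.
                exit(Pid, kill),
                still_alive
            end,
        Caller ! {Tag, Outcome}
    end),
    receive
        {Tag, Result} -> Result
    after 3000 ->
        timeout
    end.

init({TrapExit, Watcher}) ->
    process_flag(trap_exit, TrapExit),
    {ok, Watcher}.

handle_call(_Request, _From, Watcher) -> {reply, ok, Watcher}.
handle_cast(_Msg, Watcher) -> {noreply, Watcher}.
handle_info(_Info, Watcher) -> {noreply, Watcher}.

terminate(_Reason, Watcher) ->
    Watcher ! {terminated, self()},
    ok.
```

With this probe, probe(true, whatever) is expected to report terminate_called, while probe(false, whatever) and probe(true, kill) report terminate_skipped.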

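What happens if terminate itself crashes?

The crash is propagated upward: the process exits with that reason, and in most cases the exit ends up at the supervisor, which deals with it. The gen_server source handles it like this: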

terminate(ExitReason, ReportReason, Name, Msg, Mod, State, Debug) ->
    Reply = try_terminate(Mod, ExitReason, State),
    case Reply of
        {'EXIT', ExitReason1, ReportReason1} ->
            FmtState = format_status(terminate, Mod, get(), State),
            error_info(ReportReason1, Name, Msg, FmtState, Debug),
            exit(ExitReason1);
        _ ->
            case ExitReason of
                normal ->
                    exit(normal);
                shutdown ->
                    exit(shutdown);
                {shutdown,_}=Shutdown ->
                    exit(Shutdown);
                _ ->
                    FmtState = format_status(terminate, Mod, get(), State),
                    error_info(ReportReason, Name, Msg, FmtState, Debug),
                    exit(ExitReason)
            end
    end.

try_terminate(Mod, Reason, State) ->
    try
        {ok, Mod:terminate(Reason, State)}
    catch
        throw:R ->
            {ok, R};
        error:R ->
            Stacktrace = erlang:get_stacktrace(),
            {'EXIT', {R, Stacktrace}, {R, Stacktrace}};
        exit:R ->
            Stacktrace = erlang:get_stacktrace(),
            {'EXIT', R, {R, Stacktrace}}
    end.
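
The effect of the source above can be observed from outside with a monitor. The sketch below uses invented names (terminate_crash_probe, observe/0): the server is asked to stop with reason normal, but its terminate/2 raises an error, so the exit reason seen by the monitor should be the error from terminate/2 rather than normal.

```erlang
-module(terminate_crash_probe).
-behaviour(gen_server).

-export([observe/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

%% Returns the exit reason of a server whose terminate/2 crashes.
observe() ->
    {ok, Pid} = gen_server:start(?MODULE, [], []),
    Ref = monitor(process, Pid),
    gen_server:cast(Pid, stop),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> Reason
    after 1000 ->
        timeout
    end.

init([]) -> {ok, undefined}.

handle_call(_Request, _From, State) -> {reply, ok, State}.

%% A perfectly normal stop request...
handle_cast(stop, State) -> {stop, normal, State}.

handle_info(_Info, State) -> {noreply, State}.

%% ...but the cleanup itself fails.
terminate(_Reason, _State) ->
    erlang:error(cleanup_failed).
```

Here observe() should return {cleanup_failed, Stacktrace}, which corresponds to the error:R branch of try_terminate/3 shown above.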

Not knowing yourself is the worst state to be in. --Bruce Lee

Original article: https://www.cnblogs.com/zhongwencool/p/gen_server_terminate.html