pgpool-II: health_check_period and health_check_timeout

The official documentation describes health_check_period and health_check_timeout as follows:

health_check_period
This parameter specifies the interval between the health checks in seconds. Default is 0, which means health check is disabled. You need to reload pgpool.conf if you change health_check_period.
health_check_timeout
pgpool-II periodically tries to connect to the backends to detect any error on the servers or networks. This error check procedure is called "health check". If an error is detected, pgpool-II tries to perform failover or degeneration. This parameter serves to prevent the health check from waiting for a long time in a case such as un unplugged network cable. The timeout value is in seconds. Default value is 20. 0 disables timeout (waits until TCP/IP timeout). This health check requires one extra connection to each backend, so max_connections in the postgresql.conf needs to be incremented as needed. You need to reload pgpool.conf if you change this value.

So how exactly are the two related?

A look at the source code makes it clear:

int main(int argc, char **argv)                                        
{                                        
    ……                                    
    /*                                    
     * This is the main loop                                    
     */                                    
    for (;;)                                    
    {                                    
        CHECK_REQUEST;                 
        /* do we need health checking for PostgreSQL? */  
        if (pool_config->health_check_period > 0)                                
        {                                
            ……                            
            if (pool_config->health_check_timeout > 0)                            
            {                            
                /*                        
                 * set health checker timeout. we want to detect  
                 * communication path failure much earlier before 
                 * TCP/IP stack detects it.                        
                 */                        
                pool_signal(SIGALRM, health_check_timer_handler);                        
                alarm(pool_config->health_check_timeout);                        
            }                            
                                        
            /*                            
             * do actual health check. trying to connect to the backend 
             */                            
            errno = 0;                            
            health_check_timer_expired = 0;                            
            POOL_SETMASK(&UnBlockSig);                            
            sts = health_check();                            
            POOL_SETMASK(&BlockSig);                            
            if (pool_config->parallel_mode || pool_config->enable_query_cache)                            
                sys_sts = system_db_health_check();                        
                                        
            /* The logic that performs failover based on the result is in here; omitted */
            if ((sts > 0 || sys_sts < 0) &&
                (errno != EINTR ||
                 (errno == EINTR && health_check_timer_expired)))
            {
                ……
            }

            if (pool_config->health_check_timeout > 0)
            {
                /* seems ok. cancel health check timer */
                pool_signal(SIGALRM, SIG_IGN);
            }

            /* Note the sleep handling here */
            sleep_time = pool_config->health_check_period;
            pool_sleep(sleep_time);
        }
        else
        {
            for (;;)
            {
                int r;
                struct timeval t = {3, 0};

                POOL_SETMASK(&UnBlockSig);
                r = pool_pause(&t);
                POOL_SETMASK(&BlockSig);
                if (r > 0)
                    break;
            }
        }
    }
    pool_shmem_exit(0);
}

In other words, whether a health check happens at all depends first on whether health_check_period is greater than 0.

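Given that, if health_check_timeout is also greater than zero, a timer is armed with alarm() just before the check starts.
If the check is still blocked when health_check_timeout seconds have elapsed, SIGALRM is delivered to health_check_timer_handler and the blocked call is interrupted; the timer does not start health_check itself. This is why the code afterwards tests errno == EINTR together with health_check_timer_expired. If the check returns in time, the timer is cancelled by setting SIGALRM to SIG_IGN.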

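Then, back in the main loop, once health_check has run and the result is OK, the loop does some cleanup and goes to sleep. The length of that sleep is health_check_period. When it wakes up, it returns to the top of the loop and carries on.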
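
Because the sleep only begins after the check has returned, the time between the start of one check and the start of the next is roughly the duration of the check plus health_check_period, with the duration capped by health_check_timeout when that is non-zero. The sketch below works through that arithmetic. The helper next_check_start is hypothetical and not part of pgpool-II, and it ignores the time spent in the elided failover branch.

```c
#include <assert.h>

/*
 * Hypothetical helper: given the time (in seconds) at which a health check
 * starts and how long the check would block, return the time at which the
 * next check starts, following the loop structure shown above.
 *
 *   period  : health_check_period, must be > 0 (0 disables health checks)
 *   timeout : health_check_timeout, 0 means "no cap on the check"
 */
long next_check_start(long start, long check_duration, long period, long timeout)
{
    long elapsed = check_duration;

    assert(period > 0);
    assert(check_duration >= 0 && timeout >= 0);

    /* The alarm interrupts a check that blocks longer than the timeout. */
    if (timeout > 0 && elapsed > timeout)
        elapsed = timeout;

    /* The loop sleeps for the full period only after the check returns. */
    return start + elapsed + period;
}
```

With a period of 10 and a timeout of 20, a check that takes 2 seconds leads to the next check starting at second 12. A check that would block for 60 seconds is cut off at 20, so the next one starts at second 30. With the timeout set to 0, that same check runs its full 60 seconds and the next one starts at second 70.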

Original article: https://www.cnblogs.com/gaojian/p/2620070.html