health_check_period and health_check_timeout in pgpool-II

The official documentation describes health_check_period and health_check_timeout as follows:

health_check_period
This parameter specifies the interval between the health checks in seconds. Default is 0, which means health check is disabled. You need to reload pgpool.conf if you change health_check_period.

health_check_timeout
pgpool-II periodically tries to connect to the backends to detect any error on the servers or networks. This error check procedure is called "health check". If an error is detected, pgpool-II tries to perform failover or degeneration. This parameter serves to prevent the health check from waiting for a long time in a case such as an unplugged network cable. The timeout value is in seconds. Default value is 20. 0 disables timeout (waits until TCP/IP timeout). This health check requires one extra connection to each backend, so max_connections in the postgresql.conf needs to be incremented as needed. You need to reload pgpool.conf if you change this value.
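
As a concrete illustration, a pgpool.conf fragment that enables both settings might look like the following. The values are arbitrary examples chosen for this post, not recommendations.

```
health_check_period = 10     # after each check, wait 10 seconds before the next one
health_check_timeout = 20    # give up on a single check after 20 seconds
```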

So how exactly are the two related?

A look at the source code makes it clear:

int main(int argc, char **argv)                                        
{                                        
    ……                                    
    /*                                    
     * This is the main loop                                    
     */                                    
    for (;;)                                    
    {                                    
        CHECK_REQUEST;                 
        /* do we need health checking for PostgreSQL? */  
        if (pool_config->health_check_period > 0)                                
        {                                
            ……                            
            if (pool_config->health_check_timeout > 0)                            
            {                            
                /*                        
                 * set health checker timeout. we want to detect  
                 * communication path failure much earlier before 
                 * TCP/IP stack detects it.                        
                 */                        
                pool_signal(SIGALRM, health_check_timer_handler);                        
                alarm(pool_config->health_check_timeout);                        
            }                            
                                        
            /*                            
             * do actual health check. trying to connect to the backend 
             */                            
            errno = 0;                            
            health_check_timer_expired = 0;                            
            POOL_SETMASK(&UnBlockSig);                            
            sts = health_check();                            
            POOL_SETMASK(&BlockSig);                            
            if (pool_config->parallel_mode || pool_config->enable_query_cache)                            
                sys_sts = system_db_health_check();                        
                                        
            /* the logic that handles failover according to the result lives in here; omitted */
            if ((sts > 0 || sys_sts < 0) 
               && (errno != EINTR || 
                 (errno == EINTR && health_check_timer_expired)))                  
            {                            
                ……                        
            }                            
                                        
            if (pool_config->health_check_timeout > 0)                            
            {                            
                /* seems ok. cancel health check timer */                        
                pool_signal(SIGALRM, SIG_IGN);                        
            }                            
             
            /* note the sleep here */
            sleep_time = pool_config->health_check_period; 
            pool_sleep(sleep_time);                            
        }                                
        else                                
        {                                
            for (;;)                            
            {                            
                int r;                        
                struct timeval t = {3, 0};                        
                                        
                POOL_SETMASK(&UnBlockSig);                        
                r = pool_pause(&t);                        
                POOL_SETMASK(&BlockSig);                        
                if (r > 0)                        
                    break;                    
            }                            
        }                                
    }                                    
                                        
    pool_shmem_exit(0);                                    
}

In other words, whether a health check happens at all depends first on whether health_check_period is greater than 0.

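Given that, if health_check_timeout is also greater than 0, a SIGALRM timer is armed with alarm() just before health_check() is called.
The timer does not start the health check; it bounds it. If the check is still blocked after health_check_timeout seconds, health_check_timer_handler runs and the blocked call is interrupted, which is why the code afterwards tests errno == EINTR together with health_check_timer_expired.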

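Then, once the main loop has run health_check() and the result is OK, it does some cleanup (cancelling the timer) and sleeps for health_check_period seconds. When it wakes up, it returns to the top of the loop and repeats. health_check_period is therefore the pause between the end of one check and the start of the next, while health_check_timeout is the upper bound on how long a single check may take.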
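
The timing that follows from this loop can be written down as simple arithmetic. The sketch below is only a model under simplifying assumptions: next_check_start is a hypothetical helper, and it ignores time spent in failover handling, in system_db_health_check(), and any early wake-up inside pool_sleep().

```c
/*
 * Earliest start time (in seconds) of the next health check, given when the
 * current one started and how long the backend would take to answer.
 * Returns -1 when health checking is disabled (period <= 0).
 */
long next_check_start(long start, long check_duration, long timeout, long period)
{
    long spent = check_duration;

    if (period <= 0)
        return -1;                   /* health_check_period = 0: no checks */

    /* A positive timeout caps how long a single check may block. */
    if (timeout > 0 && spent > timeout)
        spent = timeout;

    /* The loop sleeps for the full period only after the check is done. */
    return start + spent + period;
}
```

With a period of 10 and a timeout of 20, a check that answers in 2 seconds leads to the next check starting at second 12, while a check that hangs is cut off at 20 seconds and the next one starts at second 30.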