Linux SO_KEEPALIVE option and heartbeats

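For a connection-oriented TCP socket, real applications usually need to detect whether the peer is still connected. A connection can be lost in two ways:
1. The connection is closed normally: the peer calls close() or shutdown() and the connection is torn down gracefully. recv() then returns 0 (end of stream) right away and select() reports the socket as readable; once the peer has fully closed, later send() calls fail with an error such as EPIPE or ECONNRESET;
2. The peer goes away abnormally, for example the network drops or the machine suddenly loses power.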
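The first case can be checked without consuming any data. The following is a minimal sketch, not code from the original article: peer_closed() is a hypothetical helper name, and it assumes a platform that supports the MSG_PEEK and MSG_DONTWAIT flags (Linux does).

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the peer has closed the connection
 * in an orderly way, 0 if the connection still looks open, and -1 on a
 * socket error. */
int peer_closed(int fd)
{
    char byte;
    /* MSG_PEEK leaves pending data in the receive buffer;
     * MSG_DONTWAIT keeps the call from blocking when nothing is pending. */
    ssize_t n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);

    if (n == 0) return 1;   /* end of stream: the peer closed */
    if (n > 0) return 0;    /* data is pending, connection is open */
    if (errno == EAGAIN || errno == EWOULDBLOCK) return 0; /* nothing to read yet */
    return -1;              /* some other socket error */
}
```

Note that this only catches the orderly close. If the peer loses power, nothing arrives at all and the function keeps returning 0, which is exactly why the second case needs a heartbeat or keepalive.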
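For the second case, there are several ways to determine whether the connection has been lost:
1. Write your own heartbeat. Put simply, the program adds a thread that periodically sends a packet to the peer and checks whether a reply (ACK) comes back, then manages the connection according to the replies. This approach is the more general one. The heartbeat is usually handled at the application layer, which is flexible and controllable, but it changes the existing protocol;
2. Use TCP's keepalive mechanism. UNIX Network Programming does not recommend using SO_KEEPALIVE for heartbeat detection (why?).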
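One round of the application-level heartbeat from the first method could look roughly like the sketch below. Everything in it is an assumption made for illustration: the one-byte ping and reply values, the heartbeat_once() name, and the use of poll() for the timeout are not from the original article, and a real protocol would define its own heartbeat message.

```c
#include <poll.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define HB_PING 'P'   /* hypothetical heartbeat request byte */
#define HB_PONG 'A'   /* hypothetical heartbeat reply byte */

#ifdef MSG_NOSIGNAL
#define HB_SEND_FLAGS MSG_NOSIGNAL  /* avoid SIGPIPE on a dead socket */
#else
#define HB_SEND_FLAGS 0
#endif

/* Sends one ping and waits up to timeout_ms for the reply.
 * Returns 1 if the reply arrived, 0 on timeout, -1 on error or close. */
int heartbeat_once(int fd, int timeout_ms)
{
    char ping = HB_PING;
    char reply;
    struct pollfd pfd;
    int rc;
    ssize_t n;

    if (send(fd, &ping, 1, HB_SEND_FLAGS) != 1) return -1;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    rc = poll(&pfd, 1, timeout_ms);
    if (rc == 0) return 0;    /* no reply within the timeout */
    if (rc < 0) return -1;    /* poll() itself failed */

    n = recv(fd, &reply, 1, 0);
    if (n == 1 && reply == HB_PONG) return 1;
    return -1;                /* peer closed or sent something unexpected */
}
```

A heartbeat thread would call this on a timer and drop the connection after a few consecutive timeouts. The peer has to answer each ping, which is the protocol change mentioned above.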


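How keepalive works: TCP has a built-in heartbeat. Taking the server side as an example, when the server detects that no data has been transferred for longer than a certain time (/proc/sys/net/ipv4/tcp_keepalive_time, 7200 by default, i.e. 2 hours), it sends a keepalive packet to the client. The client then reacts in one of three ways:
1. The client's connection is fine and it returns an ACK. On receiving the ACK the server resets its timer and sends the next probe 2 hours later. If data is transferred on the connection within those 2 hours, the next probe is pushed back to 2 hours after that transfer;
2. The client has closed abnormally or the network is down. The client does not respond and the server receives no ACK, so after a certain time (/proc/sys/net/ipv4/tcp_keepalive_intvl, 75 by default, i.e. 75 seconds) it resends the keepalive packet, up to a fixed number of times (/proc/sys/net/ipv4/tcp_keepalive_probes, 9 by default, i.e. 9 probes). If none of them is answered, the server treats the connection as dead;
3. The client crashed earlier but has since rebooted. The probe response the server receives is a reset (RST), and the server terminates the connection.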
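To see what these numbers mean for the second reaction, the approximate time from the last traffic until a silent peer is declared dead is the idle time plus one interval per probe. The function below is only a back-of-the-envelope sketch; keepalive_detect_seconds() is a hypothetical name and it ignores timer granularity.

```c
/* Approximate number of seconds between the last traffic on a connection
 * and the moment an unresponsive peer is declared dead:
 * idle time before the first probe + one interval per unanswered probe. */
long keepalive_detect_seconds(long idle, long intvl, long probes)
{
    return idle + intvl * probes;
}
```

With the defaults quoted above, keepalive_detect_seconds(7200, 75, 9) gives 7875 seconds, a little over 2 hours and 11 minutes, which is far too slow for most applications.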

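Changing the system defaults of the three parameters
Temporary method: write the values directly into the three files; they have to be set again after the system reboots;
Temporary method: sysctl -w net.ipv4.tcp_keepalive_intvl=20
Global setting: edit /etc/sysctl.conf and add: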
net.ipv4.tcp_keepalive_intvl = 20
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_time = 60
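To check which values are currently in effect from inside a program, the three /proc files can be read as plain text. This is a minimal sketch; read_int_file() is a hypothetical helper, not part of any library.

```c
#include <stdio.h>

/* Hypothetical helper: reads a single decimal integer from a text file.
 * Returns 0 and stores the value in *out on success, -1 on failure. */
int read_int_file(const char *path, int *out)
{
    FILE *fp = fopen(path, "r");
    int ok;

    if (fp == NULL) return -1;
    ok = (fscanf(fp, "%d", out) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

For example, read_int_file("/proc/sys/net/ipv4/tcp_keepalive_time", &v) stores the current idle time in v on a Linux system. With the settings above, a dead peer would be detected after roughly 60 + 20 * 3 = 120 seconds.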

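#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Note: anetSetError(), ANET_OK and ANET_ERR are not defined in this
 * article; they are assumed to be provided by the project this function
 * is taken from. */

/* Set TCP keep alive option to detect dead peers. The interval option
 * is only used for Linux as we are using Linux-specific APIs to set
 * the probe send time, interval, and count. */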
int anetKeepAlive(char *err, int fd, int interval)
{
    int val = 1;
    /* Enable the keepalive mechanism. */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, sizeof(val)) == -1)
    {
        anetSetError(err, "setsockopt SO_KEEPALIVE: %s", strerror(errno));
        return ANET_ERR;
    }

#ifdef __linux__
    /* Default settings are more or less garbage, with the keepalive time
     * set to 7200 by default on Linux. Modify settings to make the feature
     * actually useful. */

    /* Send first probe after interval. */
    val = interval;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &val, sizeof(val)) < 0) {
        anetSetError(err, "setsockopt TCP_KEEPIDLE: %s\n", strerror(errno));
        return ANET_ERR;
    }

    /* Send next probes after the specified interval. Note that we set the
     * delay as interval / 3, as we send three probes before detecting
     * an error (see the next setsockopt call). */
    val = interval/3;
    if (val == 0) val = 1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &val, sizeof(val)) < 0) {
        anetSetError(err, "setsockopt TCP_KEEPINTVL: %s\n", strerror(errno));
        return ANET_ERR;
    }

    /* Consider the socket in error state after we send three ACK
     * probes without getting a reply. */
    val = 3;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &val, sizeof(val)) < 0) {
        anetSetError(err, "setsockopt TCP_KEEPCNT: %s\n", strerror(errno));
        return ANET_ERR;
    }
#endif

    return ANET_OK;
}
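To confirm that the options set by the function above actually took effect, they can be read back with getsockopt(). The helpers below are a sketch with hypothetical names (keepalive_enabled() and keepalive_param()); the per-socket TCP options are only read on Linux, matching the #ifdef in the original function.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 if SO_KEEPALIVE is enabled on fd,
 * 0 if it is disabled, -1 on error. */
int keepalive_enabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) == -1)
        return -1;
    return val != 0;
}

#ifdef __linux__
/* Hypothetical helper: reads back TCP_KEEPIDLE, TCP_KEEPINTVL or
 * TCP_KEEPCNT (passed as optname). Returns the value, or -1 on error. */
int keepalive_param(int fd, int optname)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, optname, &val, &len) == -1)
        return -1;
    return val;
}
#endif
```

Going by the code above, after a successful anetKeepAlive(err, fd, 60) one would expect keepalive_param() to report 60 for TCP_KEEPIDLE, 20 for TCP_KEEPINTVL and 3 for TCP_KEEPCNT.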


 

Original article: https://www.cnblogs.com/lcchuguo/p/5340856.html