socket编程实战-connect超时问题

https://www.cnblogs.com/rockyching2009/p/11032229.html

一、背景

connect()是会阻塞的。

这意味着,作为客户端去连服务器等了好久都得不到相应,业务处理被推迟,更有甚者等到黄花谢了等来个失败ETIMEDOUT

二、分析及方案

除了超时,其他connect()异常基本上立刻就可以得到反馈,这种处理起来也容易。

超时异常之所以让人头疼是因为超时时间太长,在默认配置的Linux下这个时间有一分多钟。

看下『UNIX Network Programming Volume 1』中4.3节对connect()的超时异常描述:

In the case of a TCP socket, the connect function initiates TCP's three-way handshake.
The function returns only when the connection is established or an error occurs. There
are several different error returns possible.
1. If the client TCP receives no response to its SYN segment, ETIMEDOUT is returned.
4.4 BSD, for example, sends one SYN when connect is called, another 6 seconds later,
and another 24 seconds later. If no response is received after a total of 75 seconds,
the error is returned.

1. 如果TCP客户没有收到SYN分节的响应,则返回ETIMEDOUT。例如在4.4BSD中,当调用函数 connect 时,发出一个SYN,
若无响应,等待6秒之后再发一个;若仍无响应,24秒钟之后再发一个。若总共等待了75秒钟之后仍未响应,则返回错误。

如果可以随意设置超时时间就好了。

现实是美好的,确实可以做到随心所欲,那就是非阻塞调用。

对于非阻塞socket,超时到来后connect()返回EINPROGRESS异常。此外在connect()的man page上还提供了一个极好的处理方式:

The socket is nonblocking and the connection cannot be completed immediately. It is possible
to select(2) or poll(2) for completion by selecting the socket for writing. After select(2) indicates
writability, use getsockopt(2) to read the SO_ERROR option at level SOL_SOCKET to determine
whether connect() completed successfully (SO_ERROR is zero) or unsuccessfully (SO_ERROR is one of
the usual error codes listed here, explaining the reason for the failure).

三、方案实现

非阻塞+getsockopt(),具体实现可以参看Java的底层代码,贴下来:

openjdk-8u40-src-b25-10_feb_2015openjdkjdksrcsolaris
ativejava
etPlainSocketImpl.c
/*
 * inetAddress is the address object passed to the socket connect
 * call.
 *
 * Class:     java_net_PlainSocketImpl
 * Method:    socketConnect
 * Signature: (Ljava/net/InetAddress;I)V
 */
JNIEXPORT void JNICALL
Java_java_net_PlainSocketImpl_socketConnect(JNIEnv *env, jobject this,
                                            jobject iaObj, jint port,
                                            jint timeout)
{
    jint localport = (*env)->GetIntField(env, this, psi_localportID);
    int len = 0;

    /* fdObj is the FileDescriptor field on this */
    jobject fdObj = (*env)->GetObjectField(env, this, psi_fdID);

    jclass clazz = (*env)->GetObjectClass(env, this);

    jobject fdLock;

    jint trafficClass = (*env)->GetIntField(env, this, psi_trafficClassID);

    /* fd is an int field on iaObj */
    jint fd;

    SOCKADDR him;
    /* The result of the connection */
    int connect_rv = -1;

    if (IS_NULL(fdObj)) {
        JNU_ThrowByName(env, JNU_JAVANETPKG "SocketException", "Socket closed");
        return;
    } else {
        fd = (*env)->GetIntField(env, fdObj, IO_fd_fdID);
    }
    if (IS_NULL(iaObj)) {
        JNU_ThrowNullPointerException(env, "inet address argument null.");
        return;
    }

    /* connect */
    if (NET_InetAddressToSockaddr(env, iaObj, port, (struct sockaddr *)&him, &len, JNI_TRUE) != 0) {
      return;
    }
    setDefaultScopeID(env, (struct sockaddr *)&him);

#ifdef AF_INET6
    if (trafficClass != 0 && ipv6_available()) {
        NET_SetTrafficClass((struct sockaddr *)&him, trafficClass);
    }
#endif /* AF_INET6 */
    if (timeout <= 0) {
        connect_rv = NET_Connect(fd, (struct sockaddr *)&him, len);
#ifdef __solaris__
        if (connect_rv == JVM_IO_ERR && errno == EINPROGRESS ) {

            /* This can happen if a blocking connect is interrupted by a signal.
             * See 6343810.
             */
            while (1) {
#ifndef USE_SELECT
                {
                    struct pollfd pfd;
                    pfd.fd = fd;
                    pfd.events = POLLOUT;

                    connect_rv = NET_Poll(&pfd, 1, -1);
                }
#else
                {
                    fd_set wr, ex;

                    FD_ZERO(&wr);
                    FD_SET(fd, &wr);
                    FD_ZERO(&ex);
                    FD_SET(fd, &ex);

                    connect_rv = NET_Select(fd+1, 0, &wr, &ex, 0);
                }
#endif

                if (connect_rv == JVM_IO_ERR) {
                    if (errno == EINTR) {
                        continue;
                    } else {
                        break;
                    }
                }
                if (connect_rv > 0) {
                    int optlen;
                    /* has connection been established */
                    optlen = sizeof(connect_rv);
                    if (JVM_GetSockOpt(fd, SOL_SOCKET, SO_ERROR,
                                        (void*)&connect_rv, &optlen) <0) {
                        connect_rv = errno;
                    }

                    if (connect_rv != 0) {
                        /* restore errno */
                        errno = connect_rv;
                        connect_rv = JVM_IO_ERR;
                    }
                    break;
                }
            }
        }
#endif
    } else {
        /*
         * A timeout was specified. We put the socket into non-blocking
         * mode, connect, and then wait for the connection to be
         * established, fail, or timeout.
         */
        SET_NONBLOCKING(fd);

        /* no need to use NET_Connect as non-blocking */
        connect_rv = connect(fd, (struct sockaddr *)&him, len);

        /* connection not established immediately */
        if (connect_rv != 0) {
            int optlen;
            jlong prevTime = JVM_CurrentTimeMillis(env, 0);

            if (errno != EINPROGRESS) {
                NET_ThrowByNameWithLastError(env, JNU_JAVANETPKG "ConnectException",
                             "connect failed");
                SET_BLOCKING(fd);
                return;
            }

            /*
             * Wait for the connection to be established or a
             * timeout occurs. poll/select needs to handle EINTR in
             * case lwp sig handler redirects any process signals to
             * this thread.
             */
            while (1) {
                jlong newTime;
#ifndef USE_SELECT
                {
                    struct pollfd pfd;
                    pfd.fd = fd;
                    pfd.events = POLLOUT;

                    errno = 0;
                    connect_rv = NET_Poll(&pfd, 1, timeout);
                }
#else
                {
                    fd_set wr, ex;
                    struct timeval t;

                    t.tv_sec = timeout / 1000;
                    t.tv_usec = (timeout % 1000) * 1000;

                    FD_ZERO(&wr);
                    FD_SET(fd, &wr);
                    FD_ZERO(&ex);
                    FD_SET(fd, &ex);

                    errno = 0;
                    connect_rv = NET_Select(fd+1, 0, &wr, &ex, &t);
                }
#endif

                if (connect_rv >= 0) {
                    break;
                }
                if (errno != EINTR) {
                    break;
                }

                /*
                 * The poll was interrupted so adjust timeout and
                 * restart
                 */
                newTime = JVM_CurrentTimeMillis(env, 0);
                timeout -= (newTime - prevTime);
                if (timeout <= 0) {
                    connect_rv = 0;
                    break;
                }
                prevTime = newTime;

            } /* while */

            if (connect_rv == 0) {
                JNU_ThrowByName(env, JNU_JAVANETPKG "SocketTimeoutException",
                            "connect timed out");

                /*
                 * Timeout out but connection may still be established.
                 * At the high level it should be closed immediately but
                 * just in case we make the socket blocking again and
                 * shutdown input & output.
                 */
                SET_BLOCKING(fd);
                JVM_SocketShutdown(fd, 2);
                return;
            }

            /* has connection been established */
            optlen = sizeof(connect_rv);
            if (JVM_GetSockOpt(fd, SOL_SOCKET, SO_ERROR, (void*)&connect_rv,
                               &optlen) <0) {
                connect_rv = errno;
            }
        }

        /* make socket blocking again */
        SET_BLOCKING(fd);

        /* restore errno */
        if (connect_rv != 0) {
            errno = connect_rv;
            connect_rv = JVM_IO_ERR;
        }
    }

    /* report the appropriate exception */
    if (connect_rv < 0) {

#ifdef __linux__
        /*
         * Linux/GNU distribution setup /etc/hosts so that
         * InetAddress.getLocalHost gets back the loopback address
         * rather than the host address. Thus a socket can be
         * bound to the loopback address and the connect will
         * fail with EADDRNOTAVAIL. In addition the Linux kernel
         * returns the wrong error in this case - it returns EINVAL
         * instead of EADDRNOTAVAIL. We handle this here so that
         * a more descriptive exception text is used.
         */
        if (connect_rv == JVM_IO_ERR && errno == EINVAL) {
            JNU_ThrowByName(env, JNU_JAVANETPKG "SocketException",
                "Invalid argument or cannot assign requested address");
            return;
        }
#endif
        if (connect_rv == JVM_IO_INTR) {
            JNU_ThrowByName(env, JNU_JAVAIOPKG "InterruptedIOException",
                            "operation interrupted");
#if defined(EPROTO)
        } else if (errno == EPROTO) {
            NET_ThrowByNameWithLastError(env, JNU_JAVANETPKG "ProtocolException",
                           "Protocol error");
#endif
        } else if (errno == ECONNREFUSED) {
            NET_ThrowByNameWithLastError(env, JNU_JAVANETPKG "ConnectException",
                           "Connection refused");
        } else if (errno == ETIMEDOUT) {
            NET_ThrowByNameWithLastError(env, JNU_JAVANETPKG "ConnectException",
                           "Connection timed out");
        } else if (errno == EHOSTUNREACH) {
            NET_ThrowByNameWithLastError(env, JNU_JAVANETPKG "NoRouteToHostException",
                           "Host unreachable");
        } else if (errno == EADDRNOTAVAIL) {
            NET_ThrowByNameWithLastError(env, JNU_JAVANETPKG "NoRouteToHostException",
                             "Address not available");
        } else if ((errno == EISCONN) || (errno == EBADF)) {
            JNU_ThrowByName(env, JNU_JAVANETPKG "SocketException",
                            "Socket closed");
        } else {
            NET_ThrowByNameWithLastError(env, JNU_JAVANETPKG "SocketException", "connect failed");
        }
        return;
    }

    (*env)->SetIntField(env, fdObj, IO_fd_fdID, fd);

    /* set the remote peer address and port */
    (*env)->SetObjectField(env, this, psi_addressID, iaObj);
    (*env)->SetIntField(env, this, psi_portID, port);

    /*
     * we need to initialize the local port field if bind was called
     * previously to the connect (by the client) then localport field
     * will already be initialized
     */
    if (localport == 0) {
        /* Now that we're a connected socket, let's extract the port number
         * that the system chose for us and store it in the Socket object.
         */
        len = SOCKADDR_LEN;
        if (JVM_GetSockName(fd, (struct sockaddr *)&him, &len) == -1) {
            NET_ThrowByNameWithLastError(env, JNU_JAVANETPKG "SocketException",
                           "Error getting socket name");
        } else {
            localport = NET_GetPortFromSockaddr((struct sockaddr *)&him);
            (*env)->SetIntField(env, this, psi_localportID, localport);
        }
    }
}

参考资料

1、Linux man

2、《UNIX网络编程(卷1)》

原文地址:https://www.cnblogs.com/rockyching2009/p/11032229.html