Heritrix 3.1.0 Source Code Analysis (23)

The previous post analyzed how Heritrix 3.1.0 extends the HttpClient component's ProtocolSocketFactory interface to create the Socket objects for HTTP and HTTPS connections.

Next we look at how Heritrix 3.1.0 extends the HttpClient component's HttpConnection class, which is the class that establishes the socket connection.

First, the member variables of the HttpConnection class:

// ----------------------------------------------------- Instance Variables
    
    /** My host. */
    private String hostName = null;
    
    /** My port. */
    private int portNumber = -1;
    
    /** My proxy host. */
    private String proxyHostName = null;
    
    /** My proxy port. */
    private int proxyPortNumber = -1;
    
    /** My client Socket. */
    private Socket socket = null;
    
    /** My InputStream. */
    private InputStream inputStream = null;

    /** My OutputStream. */
    private OutputStream outputStream = null;
    
    /** An {@link InputStream} for the response to an individual request. */
    private InputStream lastResponseInputStream = null;
    
    /** Whether or not the connection is connected. */
    protected boolean isOpen = false;
    
    /** the protocol being used */
    private Protocol protocolInUse;
    
    /** Collection of HTTP parameters associated with this HTTP connection*/
    private HttpConnectionParams params = new HttpConnectionParams();
    
    /** flag to indicate if this connection can be released, if locked the connection cannot be 
     * released */
    private boolean locked = false;
    
    /** Whether or not the socket is a secure one. */
    private boolean usingSecureSocket = false;
    
    /** Whether the connection is open via a secure tunnel or not */
    private boolean tunnelEstablished = false;
    
    /** the connection manager that created this connection or null */
    private HttpConnectionManager httpConnectionManager;
    
    /** The local interface on which the connection is created, or null for the default */
    private InetAddress localAddress;

These fields are the parameters and objects needed to create the Socket, plus the socket's input and output streams. So how does Heritrix 3.1.0 create an HttpConnection object?

It does so in the HttpConnection getConnectionWithTimeout(HostConfiguration hostConfiguration, long timeout) method of the SingleHttpConnectionManager class:

public HttpConnection getConnectionWithTimeout(
        HostConfiguration hostConfiguration, long timeout) {

        HttpConnection conn = new HttpConnection(hostConfiguration);
        conn.setHttpConnectionManager(this);
        conn.getParams().setDefaults(this.getParams());
        return conn;
    }

Now let's look at the constructors of the HttpConnection class:

/**
     * Creates a new HTTP connection for the given host configuration.
     * 
     * @param hostConfiguration the host/proxy/protocol to use
     */
    public HttpConnection(HostConfiguration hostConfiguration) {
        this(hostConfiguration.getProxyHost(),
             hostConfiguration.getProxyPort(),
             hostConfiguration.getHost(),
             hostConfiguration.getPort(),
             hostConfiguration.getProtocol());
        this.localAddress = hostConfiguration.getLocalAddress();
    }
/**
     * Creates a new HTTP connection for the given host with the virtual 
     * alias and port via the given proxy host and port using the given 
     * protocol.
     * 
     * @param proxyHost the host to proxy via
     * @param proxyPort the port to proxy via
     * @param host the host to connect to. Parameter value must be non-null.
     * @param port the port to connect to
     * @param protocol The protocol to use. Parameter value must be non-null.
     */
    public HttpConnection(
        String proxyHost,
        int proxyPort,
        String host,
        int port,
        Protocol protocol) {

        if (host == null) {
            throw new IllegalArgumentException("host parameter is null");
        }
        if (protocol == null) {
            throw new IllegalArgumentException("protocol is null");
        }

        proxyHostName = proxyHost;
        proxyPortNumber = proxyPort;
        hostName = host;
        portNumber = protocol.resolvePort(port);
        protocolInUse = protocol;
    }

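The HttpConnection constructors do little more than initialize member variables. Note the initialization of the Protocol protocolInUse field: the socket factory used later is obtained through this field (as mentioned in the previous post, the HTTP and HTTPS socket factories are registered with the Protocol class).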
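The call to protocol.resolvePort(port) in the constructor is what turns an unspecified port into the scheme's default. The following is a minimal sketch of that idea, not the library's code: SimpleProtocol is a hypothetical stand-in for Protocol, and treating any non-positive port as "unspecified" is an assumption made here for illustration (the portNumber field above does default to -1).

```java
// Hypothetical stand-in for commons-httpclient's Protocol class.
// Only the default-port fallback is modelled here.
final class SimpleProtocol {
    private final String scheme;
    private final int defaultPort;

    SimpleProtocol(String scheme, int defaultPort) {
        this.scheme = scheme;
        this.defaultPort = defaultPort;
    }

    String getScheme() {
        return scheme;
    }

    /** Returns the given port, or the scheme default when no port was specified. */
    int resolvePort(int port) {
        // Assumption: a non-positive value means "no port given".
        return port <= 0 ? defaultPort : port;
    }
}

final class PortResolutionDemo {
    public static void main(String[] args) {
        SimpleProtocol http = new SimpleProtocol("http", 80);
        SimpleProtocol https = new SimpleProtocol("https", 443);
        System.out.println(http.getScheme() + " -> " + http.resolvePort(-1));
        System.out.println(https.getScheme() + " -> " + https.resolvePort(8443));
    }
}
```

Resolving the port once in the constructor means the rest of the connection code can use portNumber directly without checking for the "unspecified" case again.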

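The void open() method below first creates the Socket through the socket factory of protocolInUse (for a secure connection that goes through a proxy, it uses the plain "http" factory instead), then sets the socket options and wraps the socket's InputStream and OutputStream: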

/**
     * Establishes a connection to the specified host and port
     * (via a proxy if specified).
     * The underlying socket is created from the {@link ProtocolSocketFactory}.
     *
     * @throws IOException if an attempt to establish the connection results in an
     *   I/O error.
     */
    public void open() throws IOException {
        LOG.trace("enter HttpConnection.open()");

        final String host = (proxyHostName == null) ? hostName : proxyHostName;
        final int port = (proxyHostName == null) ? portNumber : proxyPortNumber;
        assertNotOpen();
        
        if (LOG.isDebugEnabled()) {
            LOG.debug("Open connection to " + host + ":" + port);
        }
        
        try {
            if (this.socket == null) {
                usingSecureSocket = isSecure() && !isProxied();
                // use the protocol's socket factory unless this is a secure
                // proxied connection
                ProtocolSocketFactory socketFactory = null;
                if (isSecure() && isProxied()) {
                    Protocol defaultprotocol = Protocol.getProtocol("http");
                    socketFactory = defaultprotocol.getSocketFactory();
                } else {
                    socketFactory = this.protocolInUse.getSocketFactory();
                }
                this.socket = socketFactory.createSocket(
                            host, port, 
                            localAddress, 0,
                            this.params);
            }

            /*
            "Nagling has been broadly implemented across networks, 
            including the Internet, and is generally performed by default 
            - although it is sometimes considered to be undesirable in 
            highly interactive environments, such as some client/server 
            situations. In such cases, nagling may be turned off through 
            use of the TCP_NODELAY sockets option." */

            socket.setTcpNoDelay(this.params.getTcpNoDelay());
            socket.setSoTimeout(this.params.getSoTimeout());
            
            int linger = this.params.getLinger();
            if (linger >= 0) {
                socket.setSoLinger(linger > 0, linger);
            }
            
            int sndBufSize = this.params.getSendBufferSize();
            if (sndBufSize >= 0) {
                socket.setSendBufferSize(sndBufSize);
            }        
            int rcvBufSize = this.params.getReceiveBufferSize();
            if (rcvBufSize >= 0) {
                socket.setReceiveBufferSize(rcvBufSize);
            }        
            int outbuffersize = socket.getSendBufferSize();
            if ((outbuffersize > 2048) || (outbuffersize <= 0)) {
                outbuffersize = 2048;
            }
            int inbuffersize = socket.getReceiveBufferSize();
            if ((inbuffersize > 2048) || (inbuffersize <= 0)) {
                inbuffersize = 2048;
            }
            
            // START IA/HERITRIX change
            Recorder httpRecorder = Recorder.getHttpRecorder();
            if (httpRecorder == null || (isSecure() && isProxied())) {
                // no recorder, OR defer recording for pre-tunnel leg
                inputStream = new BufferedInputStream(
                    socket.getInputStream(), inbuffersize);
                outputStream = new BufferedOutputStream(
                    socket.getOutputStream(), outbuffersize);
            } else {
                inputStream = httpRecorder.inputWrap((InputStream)
                        (new BufferedInputStream(socket.getInputStream(),
                        inbuffersize)));
                outputStream = httpRecorder.outputWrap((OutputStream)
                        (new BufferedOutputStream(socket.getOutputStream(), 
                        outbuffersize)));
            }
            // END IA/HERITRIX change

            isOpen = true;
        } catch (IOException e) {
            // Connection wasn't opened properly
            // so close everything out
            closeSocketAndStreams();
            throw e;
        }
    }
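The socket tuning steps in open() can be tried in isolation with a plain java.net.Socket. The sketch below is an illustration under assumptions, not HttpClient code: SocketTuningDemo, tune and clampBufferSize are hypothetical names, and the parameters stand in for the values that open() reads from HttpConnectionParams.

```java
import java.io.IOException;
import java.net.Socket;

// Hypothetical helper that mirrors the option handling in open().
final class SocketTuningDemo {
    static final int MAX_STREAM_BUFFER = 2048;

    /** Same rule as in open(): stream buffers are capped at 2048 bytes. */
    static int clampBufferSize(int socketBufferSize) {
        if (socketBufferSize > MAX_STREAM_BUFFER || socketBufferSize <= 0) {
            return MAX_STREAM_BUFFER;
        }
        return socketBufferSize;
    }

    /** Applies the options; a negative linger leaves the platform default untouched. */
    static void tune(Socket socket, boolean tcpNoDelay, int soTimeoutMillis,
                     int lingerSeconds) throws IOException {
        socket.setTcpNoDelay(tcpNoDelay);      // true disables Nagle's algorithm
        socket.setSoTimeout(soTimeoutMillis);  // read timeout, 0 means wait forever
        if (lingerSeconds >= 0) {
            socket.setSoLinger(lingerSeconds > 0, lingerSeconds);
        }
    }

    public static void main(String[] args) throws IOException {
        // An unconnected socket is enough to set and read back options.
        try (Socket socket = new Socket()) {
            tune(socket, true, 5000, -1);
            System.out.println("soTimeout = " + socket.getSoTimeout());
            System.out.println("stream buffer = "
                    + clampBufferSize(socket.getReceiveBufferSize()));
        }
    }
}
```

Note that the 2048-byte cap applies to the BufferedInputStream and BufferedOutputStream sizes only; the socket's own send and receive buffers keep whatever size was configured.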

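Note the block marked IA/HERITRIX change: Heritrix 3.1.0 wraps the socket's input and output streams with the Recorder returned by Recorder.getHttpRecorder(), so the system can later obtain the connection's input and output streams through that Recorder. The wrapping is skipped when no Recorder is available, or when the connection is a secure one through a proxy, in which case recording is deferred past the pre-tunnel leg.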
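To make the wrapping idea concrete, here is a minimal sketch of a stream wrapper that keeps a copy of everything read through it. It is not Heritrix's Recorder implementation; RecordingInputStream and getRecordedBytes are hypothetical names, and the copy is kept in memory purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical recording wrapper: copies every byte read into a side buffer. */
final class RecordingInputStream extends FilterInputStream {
    private final ByteArrayOutputStream recorded = new ByteArrayOutputStream();

    RecordingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            recorded.write(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            recorded.write(buf, off, n);
        }
        return n;
    }

    @Override
    public boolean markSupported() {
        // Disallow mark/reset so that no byte is recorded twice.
        return false;
    }

    byte[] getRecordedBytes() {
        return recorded.toByteArray();
    }
}

final class RecordingDemo {
    public static void main(String[] args) throws IOException {
        byte[] response = "HTTP/1.1 200 OK\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        RecordingInputStream in =
                new RecordingInputStream(new ByteArrayInputStream(response));
        byte[] buf = new byte[8];
        while (in.read(buf) != -1) {
            // The caller consumes the stream as usual.
        }
        System.out.println(in.getRecordedBytes().length + " bytes recorded");
    }
}
```

The caller reads from the wrapper exactly as it would from the raw socket stream, which is why the wrapping can be slotted into open() without changing the rest of the connection code.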

Heritrix 3.1.0 also wraps the class that builds the HttpConnection objects described above. It does this by extending the HttpClient component's SimpleHttpConnectionManager class.

SimpleHttpConnectionManager itself implements the HttpConnectionManager interface, which declares the methods for obtaining and releasing HttpConnection objects:

public interface HttpConnectionManager {
    
    HttpConnection getConnection(HostConfiguration hostConfiguration);

    HttpConnection getConnection(HostConfiguration hostConfiguration, long timeout)
        throws HttpException;
    
    HttpConnection getConnectionWithTimeout(HostConfiguration hostConfiguration, long timeout)
        throws ConnectionPoolTimeoutException;
    
    void releaseConnection(HttpConnection conn);
    
    void closeIdleConnections(long idleTimeout);    
    
    HttpConnectionManagerParams getParams();
    
    void setParams(final HttpConnectionManagerParams params);
}

The SingleHttpConnectionManager class that Heritrix 3.1.0 adds by extension is as follows:

public class SingleHttpConnectionManager extends SimpleHttpConnectionManager {
    public SingleHttpConnectionManager() {
        super();
    }
    @Override
    public HttpConnection getConnectionWithTimeout(
        HostConfiguration hostConfiguration, long timeout) {

        HttpConnection conn = new HttpConnection(hostConfiguration);
        conn.setHttpConnectionManager(this);
        conn.getParams().setDefaults(this.getParams());
        return conn;
    }
    @Override
    public void releaseConnection(HttpConnection conn) {
        // ensure connection is closed
        conn.close();
        finishLast(conn);
    }
    
    static void finishLast(HttpConnection conn) {
        // copied from superclass because it wasn't made available to subclasses
        InputStream lastResponse = conn.getLastResponseInputStream();
        if (lastResponse != null) {
            conn.setLastResponseInputStream(null);
            try {
                lastResponse.close();
            } catch (IOException ioe) {
                //FIXME: badness - close to force reconnect.
                conn.close();
            }
        }
    }

}

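It overrides the methods that obtain an HttpConnection and that release its resources: every call to getConnectionWithTimeout creates a new HttpConnection, and releaseConnection always closes the connection before finishing off the last response stream.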
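The policy in SingleHttpConnectionManager (a new connection for every request, always closed on release, nothing pooled) can be sketched without the HttpClient library. The types below are hypothetical stand-ins written for this illustration, not the real HttpConnection or HttpConnectionManager.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for HttpConnection; tracks only open/closed state. */
final class FakeConnection {
    private static final AtomicInteger NEXT_ID = new AtomicInteger();

    final int id = NEXT_ID.incrementAndGet();
    private boolean open;

    void open() {
        open = true;
    }

    void close() {
        open = false;
    }

    boolean isOpen() {
        return open;
    }
}

/** One new connection per request; release always closes, nothing is reused. */
final class OneShotConnectionManager {
    FakeConnection getConnection() {
        return new FakeConnection();
    }

    void releaseConnection(FakeConnection conn) {
        // Ensure the connection is closed instead of keeping it for reuse.
        conn.close();
    }
}

final class OneShotDemo {
    public static void main(String[] args) {
        OneShotConnectionManager manager = new OneShotConnectionManager();
        FakeConnection first = manager.getConnection();
        first.open();
        manager.releaseConnection(first);
        FakeConnection second = manager.getConnection();
        System.out.println("reused: " + (first == second));
        System.out.println("first still open: " + first.isOpen());
    }
}
```

Giving up reuse costs an extra connection setup per request, but it keeps each fetch isolated from the state of earlier ones, which seems a reasonable trade-off for a crawler that records every exchange separately.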

---------------------------------------------------------------------------

This Heritrix 3.1.0 source code analysis series is my own original work.

When reposting, please credit the source: 博客园 (cnblogs), 刺猬的温驯

Link to this post: http://www.cnblogs.com/chenying99/archive/2013/04/27/3047510.html

Original address: https://www.cnblogs.com/chenying99/p/3047510.html