Tornado源码分析之HTTP服务请求解析

代码来源

tornado1.2版本下httpserver.py

listen fd的读事件回调

Tornado定义类HTTPServer来表示一个HTTP服务器,该类在构造函数中会传入事件循环ioloop,和Application对象。同时该HTTPServer提供了如下几种方法:

  • listen() 表示该Server的监听方法,调用该方法时,通过调用bind已经将套接字设置成non-blocking,并使用socket.SO_REUSEADDR。
  • bind() 方法表示Server绑定端口,并设置socket 为non-blocking.
  • start() 方法则是在ioloop中启动该服务器。在不给start()方法传入任何参数的情况下,使用单进程模型的IOLoop。

在start()方法启动服务器时,要向IOLoop中注册对listen fd的可读事件的回调,listen fd可读,表示有新的客户接入到HTTPServer中。我们来看看接入HTTPServer的事件回调_handle_events

def _handle_events(self, fd, events):
    while True:
        try:
            connection, address = self._socket.accept()
        except socket.error, e:
            if e.args[0] in (errno.EWOULDBLOCK, errno.EAGAIN):
                return
            raise
        if self.ssl_options is not None:
            assert ssl, "Python 2.6+ and OpenSSL required for SSL"
            try:
                connection = ssl.wrap_socket(connection,
                                             server_side=True,
                                             do_handshake_on_connect=False,
                                             **self.ssl_options)
            except ssl.SSLError, err:
                if err.args[0] == ssl.SSL_ERROR_EOF:
                    return connection.close()
                else:
                    raise
            except socket.error, err:
                if err.args[0] == errno.ECONNABORTED:
                    return connection.close()
                else:
                    raise
        try:
            if self.ssl_options is not None:
                stream = iostream.SSLIOStream(connection, io_loop=self.io_loop)
            else:
                stream = iostream.IOStream(connection, io_loop=self.io_loop)
            HTTPConnection(stream, address, self.request_callback,
                           self.no_keep_alive, self.xheaders)
        except:
            logging.error("Error in connection callback", exc_info=True)

不看ssl处理部分。首先基于listen fd调用accept函数,获取connection和address,注意到tornado处理了spurious wakeup的情况:

if e.args[0] in (errno.EWOULDBLOCK, errno.EAGAIN):
    return

也就是说,当前listen fd,没有数据可以读了,表示没有新的客户连接过来,那么就应该返回。回调函数_handle_events,创建了IOStream对象stream

  stream = iostream.SSLIOStream(connection, io_loop=self.io_loop)

注意到创建IOStream对象会给单进程的IOLoop添加新的回调函数,该函数是用来处理accept fd的读事件,此时,在有客户接入过来,那么IOLoop中的handler有如下几个:

其中,

  • 描述符4是listen fd 读事件的处理回调
  • 描述符7 是用来唤醒IOLoop _read_waker
  • 描述符9 是新accept文件描述符读事件的回调.

accept fd 回调函数

 def _handle_events(self, fd, events):
        # 确保该连接还存在
        if not self.socket:
            logging.warning("Got events for closed stream %d", fd)
            return
        try:
            # 如果该连接的读事件产生了,调用读回调
            if events & self.io_loop.READ:
                self._handle_read()
            if not self.socket:
                return
            # 如果是写事件
            if events & self.io_loop.WRITE:
                # 如果该socket是客户端创建的socket, 其已经被服务器处理
                if self._connecting:
                    # 连接服务器端
                    self._handle_connect()
                # 普通的写事件回调
                self._handle_write()
            if not self.socket:
                return
            # 错误
            if events & self.io_loop.ERROR:
                # IOLoop中删除该文件描述符对应的handler
                # 并关闭连接.
                self.close()
                return
            state = self.io_loop.ERROR
            # 如果正在读
            if self.reading():
                # 添加状态继续读
                state |= self.io_loop.READ
            if self.writing():
               # 如果正在写,添加状态写
                state |= self.io_loop.WRITE
            if state != self._state:
                self._state = state
               # 更新对应fd对应的关注状态
                self.io_loop.update_handler(self.socket.fileno(), self._state)
        except:
            logging.error("Uncaught exception, closing connection.",
                          exc_info=True)
            self.close()
            raise

HTTPConnection对象的创建

接着我们将思路放回到listen fd 的回调函数_handle_events中,在IOStream对象stream创建后,_handle_events将创建HTTPConnection

HTTPConnection(stream, address, self.request_callback,
                               self.no_keep_alive, self.xheaders)

在HTTPConnnection构造函数中,可以看到起调用了

self.stream.read_until("

", self._header_callback)

此时,HTTPConnection中的成员stream,表示的是accept fd中的streamread_until函数将会从accept fd中一直读到 ,然后调用_header_callback来解析HTTP的头部字段。

read_until函数

read_until从套接字中就是读到指定的分隔符为止

 def read_until(self, delimiter, callback):
        """Call callback when we read the given delimiter."""
        assert not self._read_callback, "Already reading"
        self._read_delimiter = delimiter
        self._read_callback = stack_context.wrap(callback)
        while True:
            # See if we've already got the data from a previous read
            if self._read_from_buffer():
                return
            self._check_closed()
            if self._read_to_buffer() == 0:
                break
        self._add_io_state(self.io_loop.READ)

readl_until有两种退出方式:一种是从buffer中读到数据后,直接返回,另一种是将数据读到buffer,_ ead_to_buffer就是将数据读到buffer,如果没有数据,则将该socket的读事件添加进来。

accept fd获取后,显然该fd对应的IOStream中的缓冲区为0,所以_read_from_buffer返回False,流程将执行_read_to_buffer

从缓冲区读_read_from_buffer

  def _read_from_buffer(self):
        """Attempts to complete the currently-pending read from the buffer.
        Returns True if the read was completed.
        """
        if self._read_bytes:
            if self._read_buffer_size() >= self._read_bytes:
                num_bytes = self._read_bytes
                callback = self._read_callback
                self._read_callback = None
                self._read_bytes = None
                self._run_callback(callback, self._consume(num_bytes))
                return True
        elif self._read_delimiter:
            _merge_prefix(self._read_buffer, sys.maxint)
            loc = self._read_buffer[0].find(self._read_delimiter)
            if loc != -1:
                callback = self._read_callback
                delimiter_len = len(self._read_delimiter)
                self._read_callback = None
                self._read_delimiter = None
                self._run_callback(callback,
                                   self._consume(loc + delimiter_len))
                return True
        return False

read_from_buffer有两种形式的读:

  • 一种是读指定字节的数据
  • 一种是读到指定的分隔符

应该注意的是_read_from_buffer中指定了回调函数,意思是从socket中读的数据后,使用回调函数来消费。应该注意的是_read_from_buffer并跟socket打交道,其假设所有的数据已经在buffer中了。

将数据读到缓冲区

从socket中读数据到缓冲区,使用的是_read_to_buffer。read_to_buffer实际调用的是_read_from_socket,其从non-blocking中读一次,最多读4096个字节。注意错误的处理,当我们发现从socket中读,发生了其他的错误(除了EAGAIN)的时候,就应该关闭连接。

def _read_to_buffer(self):
        """Reads from the socket and appends the result to the read buffer.
        Returns the number of bytes read.  Returns 0 if there is nothing
        to read (i.e. the read returns EWOULDBLOCK or equivalent).  On
        error closes the socket and raises an exception.
        """
        try:
            chunk = self._read_from_socket()
        except socket.error, e:
            # ssl.SSLError is a subclass of socket.error
            logging.warning("Read error on %d: %s",
                            self.socket.fileno(), e)
            self.close()
            raise
        if chunk is None:
            return 0
        self._read_buffer.append(chunk)
        if self._read_buffer_size() >= self.max_buffer_size:
            logging.error("Reached maximum read buffer size")
            self.close()
            raise IOError("Reached maximum read buffer size")
        return len(chunk)

该函数总体来说就是讲将据读到bufferIOStream中的_read_buffer中,并返回实际读的大小。注意错误的处理:如果所读的数据过大,那么也应该关闭连接。

从socket读一次数据_read_from_socket

 def _read_from_socket(self):
        """Attempts to read from the socket
        Returns the data read or None if there is nothing to read.
        May be overridden in subclasses.
        """
        try:
            # 尽可能读得多
            chunk = self.socket.recv(self.read_chunk_size)
        except socket.error, e:
            if e.args[0] in (errno.EWOULDBLOCK, errno.EAGAIN):
      # 没有数据      
                return None
            else:
                raise
        if not chunk:
    # 客户端数据没有了,那么表示客户端关闭了连接
            self.close()
            return None
        return chunk

LT模式下从socket中读数据_handle_read()

def _handle_read(self):
        while True:
            try:
                # Read from the socket until we get EWOULDBLOCK or equivalent.
                # SSL sockets do some internal buffering, and if the data is
                # sitting in the SSL object's buffer select() and friends
                # can't see it; the only way to find out if it's there is to
                # try to read it.
                result = self._read_to_buffer()
            except Exception:
                self.close()
                return
            if result == 0:
                break
            else:
                if self._read_from_buffer():
                    return

在LT模式下从socket中读分成了两步:

  • 从socket中读数据读到_read_to_buffer
  • 然后从buffer中读。

从socket读,要一直读到出现EWOULDBLOCK或者是EAGAIN。注意到while True,就是不停的读,读到没有数据。

原文地址:https://www.cnblogs.com/bofengqiye/p/7353069.html