Studying the Python SimpleHTTPServer source code

SimpleHTTPServer.SimpleHTTPRequestHandler inherits from BaseHTTPServer.BaseHTTPRequestHandler.

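The module mainly implements do_HEAD() and do_GET(), the methods that BaseHTTPServer.BaseHTTPRequestHandler calls while handling a request. BaseHTTPRequestHandler invokes them after it has received the request and read the command (the HTTP method) from the request line.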

    def handle_one_request(self):
        # ... ...
        mname = 'do_' + self.command
        if not hasattr(self, mname):
            self.send_error(501, "Unsupported method (%r)" % self.command)
            return
        method = getattr(self, mname)
        method()
        # ... ...

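So almost every web request handled by SimpleHTTPServer ends up going through this method() call. The exceptions are error cases, such as an unsupported command, which is answered with 501 before method() is reached.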
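The name-based dispatch above can be tried in isolation. This is a minimal sketch, not the library's code: TinyDispatcher and its return strings are hypothetical stand-ins for a real request handler.

```python
class TinyDispatcher:
    """Hypothetical stand-in for a request handler (not part of the standard library)."""

    def __init__(self, command):
        self.command = command  # the HTTP verb, e.g. "GET"

    def do_GET(self):
        return "handled GET"

    def do_HEAD(self):
        return "handled HEAD"

    def handle(self):
        # Build the method name from the verb, as handle_one_request() does.
        mname = "do_" + self.command
        if not hasattr(self, mname):
            # The real handler calls send_error(501, ...) at this point.
            return "501 Unsupported method (%r)" % self.command
        return getattr(self, mname)()


get_result = TinyDispatcher("GET").handle()
post_result = TinyDispatcher("POST").handle()
```

Supporting a new verb only requires adding a do_<VERB> method to the subclass; the dispatch code itself never changes.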

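By default, SimpleHTTPServer.SimpleHTTPRequestHandler works as follows: if the directory the script is run from contains index.html or index.htm, that file is served as the home page. If neither exists, the handler lists the contents of the current directory and builds an HTML page for the listing internally.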

    def list_directory(self, path):
        """Helper to produce a directory listing (absent index.html).

        Return value is either a file object, or None (indicating an
        error).  In either case, the headers are sent, making the
        interface the same as for send_head().

        """
        try:
            list = os.listdir(path)
        except os.error:
            self.send_error(404, "No permission to list directory")
            return None
        list.sort(key=lambda a: a.lower())
        f = StringIO()
        displaypath = cgi.escape(urllib.unquote(self.path))
        f.write('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">')
        f.write("<html>\n<title>Directory listing for %s</title>\n" % displaypath)
        f.write("<body>\n<h2>Directory listing for %s</h2>\n" % displaypath)
        f.write("<hr>\n<ul>\n")
        for name in list:
            fullname = os.path.join(path, name)
            displayname = linkname = name
            # Append / for directories or @ for symbolic links
            if os.path.isdir(fullname):
                displayname = name + "/"
                linkname = name + "/"
            if os.path.islink(fullname):
                displayname = name + "@"
                # Note: a link to a directory displays with @ and links with /
            f.write('<li><a href="%s">%s</a>\n'
                    % (urllib.quote(linkname), cgi.escape(displayname)))
        f.write("</ul>\n<hr>\n</body>\n</html>\n")
        length = f.tell()
        f.seek(0)
        self.send_response(200)
        encoding = sys.getfilesystemencoding()
        self.send_header("Content-type", "text/html; charset=%s" % encoding)
        self.send_header("Content-Length", str(length))
        self.end_headers()
        return f

list_directory() in SimpleHTTPRequestHandler
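The listing logic can also be run on its own. The sketch below is a simplified Python 3 rewrite, assuming html.escape and urllib.parse.quote as replacements for the Python 2 calls cgi.escape and urllib.quote. render_listing is a hypothetical helper that returns the page as a string instead of sending headers.

```python
import html
import os
import tempfile
import urllib.parse


def render_listing(path, request_path):
    """Return an HTML listing for `path` (hypothetical helper, no headers sent)."""
    names = sorted(os.listdir(path), key=str.lower)
    title = html.escape(urllib.parse.unquote(request_path))
    lines = ["<html>", "<title>Directory listing for %s</title>" % title,
             "<body>", "<ul>"]
    for name in names:
        fullname = os.path.join(path, name)
        displayname = linkname = name
        # Append / for directories or @ for symbolic links, as the original does.
        if os.path.isdir(fullname):
            displayname = name + "/"
            linkname = name + "/"
        if os.path.islink(fullname):
            displayname = name + "@"
        # quote() makes the link URL-safe; escape() makes the label HTML-safe.
        lines.append('<li><a href="%s">%s</a></li>'
                     % (urllib.parse.quote(linkname), html.escape(displayname)))
    lines += ["</ul>", "</body>", "</html>"]
    return "\n".join(lines)


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a b.txt"), "w").close()
    os.mkdir(os.path.join(tmp, "docs"))
    page = render_listing(tmp, "/")
```

The link target and the visible label are escaped differently on purpose: a file named "a b.txt" needs %20 in the href but a plain space in the text.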

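In fact, the socket-level handling of a request is done in SocketServer.TCPServer, and the request line and headers are parsed in BaseHTTPServer.BaseHTTPRequestHandler, which deals with the request type, the protocol version and so on. The response itself is produced in the subclass SimpleHTTPServer.SimpleHTTPRequestHandler.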
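To see the three layers cooperate, here is a small sketch using the Python 3 names (socketserver and http.server, into which the Python 2 modules were merged). fetch_root is a hypothetical helper; the directory keyword of SimpleHTTPRequestHandler is assumed to be available (Python 3.7 or later).

```python
import functools
import http.client
import http.server
import tempfile
import threading
from pathlib import Path


def fetch_root(directory):
    """Serve `directory` on an ephemeral local port and return the body of GET /."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    # HTTPServer is a socketserver.TCPServer subclass: it owns the socket.
    server = http.server.HTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        body = conn.getresponse().read()
        conn.close()
        return body
    finally:
        server.shutdown()
        server.server_close()
        thread.join()


with tempfile.TemporaryDirectory() as tmp:
    listing = fetch_root(tmp)  # no index file yet: a directory listing
    Path(tmp, "index.html").write_bytes(b"<h1>home</h1>")
    home = fetch_root(tmp)     # index.html is now served as the home page
```

The same request returns a generated listing before index.html exists and the file's bytes afterwards, which matches the default behavior described earlier.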

So how does SimpleHTTPServer.SimpleHTTPRequestHandler produce the responses described above?

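First, send_head() parses the URL path of the request and joins it with the current directory to get the absolute path being requested. If that path is a directory containing index.html or index.htm, send_head() opens the file and sets the response headers, writing the content type and the content length. If there is no such file, it reads the contents of the directory and writes an HTML page into an in-memory file buffer. The page lists the entries of the directory as hyperlinks, so that when the user clicks one, the server returns the corresponding content.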
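The decision made inside send_head() can be sketched as a pure function. This is a simplified approximation written for Python 3, not the library's code: translate_path here is a reduced version of the real method, and choose_response with its tuple return values is hypothetical.

```python
import os
import posixpath
import tempfile
import urllib.parse


def translate_path(root, url_path):
    """Map a URL path onto the local filesystem below `root` (simplified)."""
    # Drop query string and fragment, decode %xx escapes, collapse "..".
    url_path = url_path.split("?", 1)[0].split("#", 1)[0]
    url_path = posixpath.normpath(urllib.parse.unquote(url_path))
    parts = [p for p in url_path.split("/")
             if p and p not in (os.curdir, os.pardir)]
    return os.path.join(root, *parts)


def choose_response(root, url_path):
    """Return (kind, path): what a send_head()-style handler would serve."""
    path = translate_path(root, url_path)
    if os.path.isdir(path):
        for index in ("index.html", "index.htm"):
            candidate = os.path.join(path, index)
            if os.path.exists(candidate):
                return ("file", candidate)   # serve the index file
        return ("listing", path)             # fall back to a directory listing
    if os.path.isfile(path):
        return ("file", path)
    return ("not found", path)               # the real handler sends 404


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "sub", "index.htm"), "w").close()
    root_kind = choose_response(tmp, "/")[0]
    sub_kind, sub_path = choose_response(tmp, "/sub/?x=1")
    sub_is_index = sub_path.endswith("index.htm")
    missing_kind = choose_response(tmp, "/nope.txt")[0]
```

Stripping os.pardir components is what keeps a request such as /../secret from escaping the served directory.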

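Note that the response headers sent by send_head() are set according to the content being returned. In other words, by the time send_head() returns, SimpleHTTPRequestHandler already has the response data ready, so all that remains is to call self.copyfile(f, self.wfile) to write the contents of the file object or buffer object to the output stream.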
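This split between headers and body is also what separates GET from HEAD. The following sketch uses in-memory streams to mimic it. respond is a hypothetical helper, and the two io.BytesIO objects stand in for the object returned by send_head() and for self.wfile.

```python
import io
import shutil


def respond(body, include_body=True):
    """Return the raw response bytes (hypothetical helper for illustration)."""
    source = io.BytesIO(body)  # stands in for the object send_head() returns
    wfile = io.BytesIO()       # stands in for self.wfile
    # Header phase: everything here depends only on the prepared content.
    wfile.write(b"HTTP/1.0 200 OK\r\n")
    wfile.write(b"Content-Length: %d\r\n\r\n" % len(body))
    # Body phase: GET copies the data, HEAD skips this step.
    if include_body:
        shutil.copyfileobj(source, wfile)
    source.close()
    return wfile.getvalue()


get_bytes = respond(b"hello")
head_bytes = respond(b"hello", include_body=False)
```

Both calls produce identical headers; only the GET variant appends the five body bytes.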

The remaining functions all exist to support these steps.

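*Note that a local file returned to the client must be opened in rb (binary) mode. This avoids newline translation in text mode and ensures that the length of the stream is computed correctly (the length is sent as part of the response headers).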
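A quick experiment shows why the mode matters. This sketch, written for Python 3, stores a file with Windows-style line endings and compares what the two modes return. The file name and variable names are made up for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "wb") as f:
        f.write(b"line1\r\nline2\r\n")

    size_on_disk = os.path.getsize(path)   # what Content-Length should report

    with open(path, "rb") as f:
        raw = f.read()                     # bytes exactly as stored

    with open(path, "r") as f:
        text = f.read()                    # text mode turns \r\n into \n

raw_length = len(raw)
text_length = len(text)
```

Here raw_length equals the size on disk, while text_length is two characters shorter. A body read in text mode would no longer match the advertised Content-Length.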