TIME_WAIT Accumulation and Port Exhaustion

How the client keeps each connection unique

HTTP: The Definitive Guide

4.2.7 TIME_WAIT Accumulation and Port Exhaustion
TIME_WAIT port exhaustion is a serious problem that affects performance
benchmarking but is relatively uncommon in real deployments. It warrants special attention because
most people involved in performance benchmarking eventually run into this problem and get
unexpectedly poor performance.
When a TCP endpoint closes a TCP connection, it maintains in memory a small control block
recording the IP addresses and port numbers of the recently closed connection. This information is
maintained for a short time, typically around twice the estimated maximum segment lifetime (called
"2MSL"; often two minutes [8]), to make sure a new TCP connection with the same addresses and port
numbers is not created during this time. This prevents any stray duplicate packets from the previous
connection from accidentally being injected into a new connection that has the same addresses and
port numbers. In practice, this algorithm prevents two connections with the exact same IP addresses
and port numbers from being created, closed, and recreated within two minutes.
[8] The 2MSL value of two minutes is historical. Long ago, when routers were much slower, it was estimated
that a duplicate copy of a packet might be able to remain queued in the Internet for up to a minute before
being destroyed. Today, the maximum segment lifetime is much smaller.
Today's higher-speed routers make it extremely unlikely that a duplicate packet will show up on a
server's doorstep minutes after a connection closes. Some operating systems set 2MSL to a smaller
value, but be careful about overriding this value. Packets do get duplicated, and TCP data will be
corrupted if a duplicate packet from a past connection gets inserted into a new stream with the same
connection values.
The 2MSL connection close delay normally is not a problem, but in benchmarking situations, it can
be. It's common that only one or a few test load-generation computers are connecting to a system
under benchmark test, which limits the number of client IP addresses that connect to the server.
Furthermore, the server typically is listening on HTTP's default TCP port, 80. These circumstances
limit the available combinations of connection values, at a time when port numbers are blocked from
reuse by TIME_WAIT.
In a pathological situation with one client and one web server, of the four values that make up a TCP
connection:
<source-IP-address, source-port, destination-IP-address, destination-port>
three of them are fixed—only the source port is free to change:
<client-IP, source-port, server-IP, 80>
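
The effect of fixing three of the four values can be illustrated with a toy model. This is only a sketch, not how any real TCP stack is implemented: the names ConnTuple and TimeWaitTable are hypothetical, the IP addresses are made up, and the clock is passed in explicitly so the example is deterministic.

```python
from typing import Dict, NamedTuple


class ConnTuple(NamedTuple):
    """The four values that identify a TCP connection."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class TimeWaitTable:
    """Hypothetical toy model: remembers closed connections for 2MSL seconds."""

    def __init__(self, two_msl: float = 120.0) -> None:
        self.two_msl = two_msl
        self._closed_at: Dict[ConnTuple, float] = {}

    def close(self, conn: ConnTuple, now: float) -> None:
        # Record when the connection was closed (the "control block").
        self._closed_at[conn] = now

    def can_open(self, conn: ConnTuple, now: float) -> bool:
        # The same four values may be reused only after 2MSL has elapsed.
        closed = self._closed_at.get(conn)
        return closed is None or now - closed >= self.two_msl


table = TimeWaitTable(two_msl=120.0)
first = ConnTuple("10.0.0.1", 50000, "10.0.0.2", 80)  # made-up addresses
table.close(first, now=0.0)

same_port_blocked = table.can_open(first, now=60.0)   # still inside 2MSL
same_port_later = table.can_open(first, now=120.0)    # 2MSL has elapsed
other_port = table.can_open(first._replace(src_port=50001), now=60.0)
```

In this model, only changing the source port yields a connection that can be opened immediately, which is why the client burns through source ports.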
Each time the client connects to the server, it gets a new source port in order to have a unique
connection. But because a limited number of source ports are available (say, 60,000) and no
source port can be reused toward the same server for 2MSL seconds (say, 120 seconds), this limits
the connect rate to 60,000 / 120 = 500 transactions/sec. If you keep making optimizations and your
server doesn't get faster than about 500 transactions/sec, make sure you are not experiencing
TIME_WAIT port exhaustion. You can fix this problem by using more client load-generator machines or
by making sure the client and server rotate through several virtual IP addresses to add more
connection combinations.
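
The arithmetic above, and the effect of adding client or server addresses, can be sketched as a small calculation. The function name max_connect_rate and its parameters are hypothetical, and the result is only a rough upper bound: it assumes every address combination can use the full range of source ports.

```python
def max_connect_rate(source_ports: int, two_msl_seconds: float,
                     client_ips: int = 1, server_ips: int = 1) -> float:
    """Rough upper bound on new connections per second.

    Each <client-IP, source-port, server-IP, port> combination can be used
    once per 2MSL interval, so the bound is combinations / 2MSL.
    """
    combinations = source_ports * client_ips * server_ips
    return combinations / two_msl_seconds


single_client = max_connect_rate(60_000, 120)               # the case in the text
four_clients = max_connect_rate(60_000, 120, client_ips=4)  # four load generators
```

Under these assumptions a single client is capped at 500 connections/sec, and each additional client or server address multiplies that ceiling.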
Even if you do not suffer port exhaustion problems, be careful about having large numbers of open
connections or large numbers of control blocks allocated for connections in wait states. Some operating
systems slow down dramatically when there are numerous open connections or control blocks.
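
One way to see how many control blocks are sitting in wait states is to count sockets by state. The sketch below assumes a Linux system, where /proc/net/tcp lists IPv4 sockets with a hexadecimal state code in the fourth column (06 is TIME_WAIT). The function name count_states is hypothetical and the sample rows are made up for illustration.

```python
from collections import Counter
from typing import Iterable

# Subset of Linux TCP state codes as they appear in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}


def count_states(lines: Iterable[str]) -> Counter:
    """Count sockets per TCP state from /proc/net/tcp-style lines."""
    counts: Counter = Counter()
    for line in lines:
        fields = line.split()
        # Skip the header row and anything that is not a socket entry.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        state = fields[3]
        counts[TCP_STATES.get(state, state)] += 1
    return counts


# Made-up sample in the /proc/net/tcp layout (not captured from a real host).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1
   1: 0100000A:C350 0200000A:0050 06 00000000:00000000 03:00000A00 00000000     0        0 0
   2: 0100000A:C351 0200000A:0050 06 00000000:00000000 03:00000A00 00000000     0        0 0
"""

sample_counts = count_states(SAMPLE.splitlines())
```

On a Linux load generator, passing the open file object for /proc/net/tcp to count_states during a benchmark run gives a quick check: a TIME_WAIT count approaching the size of the ephemeral port range suggests port exhaustion. IPv6 sockets are listed separately in /proc/net/tcp6.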