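Load Balancing with HAProxy: A Tour

HAProxy Principles and Configuration

1. Introduction to HAProxy

(1) HAProxy is proxy software that provides high availability and load balancing for TCP (layer 4) and HTTP (layer 7) applications. It supports virtual hosts and is a free, fast, and reliable solution. HAProxy is particularly suited to very heavily loaded web sites that also need session persistence or layer 7 processing. On current hardware it can comfortably support tens of thousands of concurrent connections. Its mode of operation makes it simple and safe to integrate into an existing architecture, and it keeps your web servers from being exposed directly to the network.

(2) HAProxy implements an event-driven, single-process model, which supports a very large number of concurrent connections. Multi-process or multi-threaded models are constrained by memory limits, the system scheduler, and pervasive locking, and can rarely handle thousands of concurrent connections. An event-driven model avoids these problems because it performs all of these tasks in user space, where resource and time management are finer-grained. The drawback is that such programs usually scale poorly on multi-core systems, which is why they must be optimized to do more work per CPU cycle.

(3) HAProxy supports connection rejection. Because keeping a connection open is cheap, it is sometimes necessary to limit attack bots, that is, to restrict the connections they can open and so limit the damage they do. This feature was developed for a web site under a small DDoS attack and has since rescued many sites; it is described as an advantage that other load balancers lack.

(4) HAProxy supports fully transparent proxying (a typical feature of hardware firewalls): it can connect to backend servers using the client's IP address or any other address. This feature is only available on Linux 2.4/2.6 kernels patched with cttproxy. It also makes it possible to handle part of the traffic for a particular server without changing that server's address.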

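Performance

HAProxy relies on several well-known OS techniques to maximize performance.

1. A single-process, event-driven model considerably reduces the cost of context switches and the memory usage.

2. An O(1) event checker allows instantaneous detection of any event on any connection, even among very large numbers of concurrent connections.

3. Single buffering, wherever possible, completes reads and writes without copying any data, which saves a lot of CPU cycles and memory bandwidth.

4. Zero-copy forwarding is possible using the splice() system call on Linux 2.6 (>= 2.6.27.19), and results in real zero-copy starting with Linux 3.5.

5. A memory allocator using fixed-size memory pools provides immediate allocation, which significantly shortens the time needed to create a session.

6. Tree-based storage: heavy use of the elastic binary tree the author developed years ago keeps timers ordered, keeps the run queue ordered, and manages the round-robin and least-connection queues, all at a low O(log(N)) cost.

7. Optimized HTTP header analysis: the parser avoids re-reading any memory area while analyzing HTTP headers.

8. Expensive system calls are carefully reduced; most of the work is done in user space, such as time reading, buffer aggregation, and file descriptor enabling and disabling.

All of these micro-optimizations result in very low CPU usage even at moderate loads. Even at very high loads, it is common to see 5% of CPU time spent in user space and 95% in the system, which means the HAProxy process consumes roughly 20 times less than its system counterpart. OS tuning is therefore very important. Even if user-space usage doubles, it still accounts for only about 10% of CPU time, which explains why layer 7 processing has a limited impact on performance. On high-end systems, HAProxy's layer 7 performance can therefore easily exceed that of hardware load balancers.

In production, it is also common to see HAProxy used at layer 7 as an emergency replacement when an expensive high-end hardware load balancer fails. Hardware load balancers process requests at the packet level, which makes it hard for them to support requests that span multiple packets; because they do no buffering at all, they also cope poorly with high response times. Software load balancers, by contrast, use TCP buffering and are insensitive to long requests and high response times.

At the time of writing there were three main HAProxy versions: 1.3, 1.4, and 1.5. The RPM package shipped with CentOS 6.6 is 1.5.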

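LB Cluster:
Layer 4:
lvs, nginx(stream), haproxy(mode tcp)
Layer 7:
http: nginx(http, ngx_http_upstream_module), haproxy(mode http), httpd, ats, perlbal, pound…

How it works

Layer 4 proxying: load balancing based on "IP + port", done by inspecting traffic at the IP layer and the TCP/UDP layer.

Layer 7: the backend server is chosen from the request content combined with a load-balancing algorithm. Besides "IP + port", traffic can be split by URL, requested domain, browser type, language, and so on.

In layer 7 mode the load balancer establishes one TCP connection with the client and a separate one with the backend server. In layer 4 mode (for example LVS DR), only a single TCP connection is established, between the client and the backend server. Layer 7 load balancing places higher demands on the load-balancing device, and its throughput is lower than that of layer 4.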

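HAProxy features:

1. Excellent reliability and stability, comparable to hardware appliances.
2. Supports connection rejection, which can help mitigate DDoS attacks.
3. Supports long-lived connections, short-lived connections, and logging, all configurable as needed.
4. Routes HTTP requests to backend servers with cookie-based session binding, and can check backend health by fetching a specified URL.
5. Powerful ACL support allows flexible routing, such as separating static and dynamic content, which simplifies architecture design and implementation.
6. Supports both layer 4 and layer 7 load balancing, covering almost all common services.
7. A full-featured web page for monitoring backend server status shows the state of each server in real time and allows simple operations such as taking servers in and out of rotation.
8. Supports multiple load-balancing algorithms as well as session persistence.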

HAProxy：
http://www.haproxy.org
http://www.haproxy.com

Documentation:
http://cbonte.github.io/haproxy-dconv/

HAProxy is a TCP/HTTP reverse proxy which is particularly suited for high availability environments. Indeed, it can:
- route HTTP requests depending on statically assigned cookies
- spread load among several servers while assuring server persistence through the use of HTTP cookies
- switch to backup servers in the event a main server fails
- accept connections to special ports dedicated to service monitoring
- stop accepting connections without breaking existing ones
- add, modify, and delete HTTP headers in both directions
- block requests matching particular patterns
- report detailed status to authenticated users from a URI intercepted by the application

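Versions: 1.4, 1.5, 1.6, 1.7

Program environment:
Main program: /usr/sbin/haproxy
Main configuration file: /etc/haproxy/haproxy.cfg
Unit file: /usr/lib/systemd/system/haproxy.service

Configuration sections:
global: global settings
    process and security parameters
    performance tuning parameters
    debug parameters
    user lists
    peers
proxies: proxy settings
    defaults: default settings for frontend, listen, and backend
    frontend: the front end, comparable to server {} in nginx
    backend: the back end, comparable to upstream {} in nginx
    listen: a front end and a back end combined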

2. Installing and configuring haproxy

Installing haproxy: the RPM package shipped with CentOS 7.4 is version 1.5.

yum install haproxy -y

The default haproxy configuration file:

[root@HAProxy ~]# cat /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Example configuration for a possible web application.  See the
# full configuration options online.
#
#   http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global                                                                # global settings
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events.  This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #   file. A line like the following can be added to
    #   /etc/sysconfig/syslog
    #
    #    local2.*                       /var/log/haproxy.log
    #
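    log         127.0.0.1 local2        # send all logs to the local syslog server, facility local2

    chroot      /var/lib/haproxy        # chroot() into this directory before dropping privileges; this hardens haproxy, but the directory must be empty and not writable by any user
    pidfile     /var/run/haproxy.pid    # path of the pid file
    maxconn     4000                    # maximum number of concurrent connections per process
    user        haproxy                 # user the service runs as
    group       haproxy                 # group the service runs as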
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
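    mode                    http     # default mode {http|tcp|health}: http = layer 7, tcp = layer 4, health = only answer "OK"
    log                     global   # use the log settings from the global section
    option                  httplog     # log HTTP requests in detail
    option                  dontlognull  # do not log connections that carried no data
    option http-server-close             # close the server-side connection after each response, keep client-side keep-alive
    option forwardfor       except 127.0.0.0/8  # add X-Forwarded-For, except for requests coming from 127.0.0.0/8
    option                  redispatch  # if the server designated by a persistence cookie is down, redispatch to another healthy server
    retries                 3     # retry a failed connection to a server up to 3 times
    timeout http-request    10s   # maximum time to wait for a complete HTTP request
    timeout queue           1m    # maximum time a request may wait in the queue
    timeout connect         10s   # maximum time to wait for a connection to a server to succeed
    timeout client          1m    # maximum client-side inactivity
    timeout server          1m     # maximum server-side inactivity
    timeout http-keep-alive 10s    # maximum time to wait for a new request on a keep-alive connection
    timeout check           10s    # additional timeout for health checks, applied once the check connection is established (not the check interval)
    maxconn                 3000   # maximum concurrent connections per frontend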

#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend  main *:5000
    acl url_static       path_beg       -i /static /images /javascript /stylesheets
    acl url_static       path_end       -i .jpg .gif .png .css .js
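      # both url_static lines define one ACL: the path begins with a listed prefix or ends with a listed suffix; -i ignores case
    use_backend static          if url_static  # use the "static" backend when the url_static ACL matches
    default_backend             app            # otherwise use the "app" backend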

#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
backend static                                   # define a backend
    balance     roundrobin                       # scheduling algorithm: weighted round robin
    server      static 127.0.0.1:4331 check      # check: enable health checks for this server

   
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend app
    balance     roundrobin
    server  app1 127.0.0.1:5001 check
    server  app2 127.0.0.1:5002 check
    server  app3 127.0.0.1:5003 check
    server  app4 127.0.0.1:5004 check

IaaS, PaaS, SaaS
LBaaS, DBaaS, FWaaS, FaaS(Serverless), …
OpenShift(PaaS): HAProxy, Ingress Controller

A detailed description of the configuration follows.

global parameters:
Process and security management: chroot, daemon, user, group, uid, gid
log: defines a global syslog server; at most two can be defined.
log <address> [len <length>] <facility> [max level [min level]]
nbproc <number>: number of haproxy processes to start.
ulimit-n <number>: maximum number of file descriptors each haproxy process may open.

Performance tuning:
maxconn <number>: Sets the maximum per-process number of concurrent connections to <number>.
Overall concurrent connections: nbproc * maxconn
maxconnrate <number>: Sets the maximum per-process number of connections per second to <number>.
maxsessrate <number>: Sets the maximum per-process number of sessions per second to <number>.
maxsslconn <number>: Sets the maximum per-process number of concurrent SSL connections to <number>.
spread-checks <0..50, in percent>: adds some randomness to the health check interval so that checks are not all sent at the same instant.

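proxies parameters

Proxy sections:
- defaults <name>
- frontend <name>
- backend <name>
- listen <name>

frontend section: settings for the listening sockets that accept client connections.
backend section: settings for forwarding connection requests to the backend servers.
listen section: a complete proxy with frontend and backend combined; generally useful for TCP-only traffic.
Proxy names: may contain letters, digits, '-', '_', '.' and ':', and are case-sensitive.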

bind configuration

Parameters:

bind: defines one or more listening addresses and ports; valid only in frontend and listen sections.
bind [<address>]:<port_range> [, …] [param*]

listen http_proxy
         bind :80,:443
         bind 10.0.0.1:10080,10.0.0.1:10443
         bind /var/run/ssl-frontend.sock user root mode 600 accept-proxy

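balance configuration

balance: the scheduling algorithm used among the servers of a backend.
balance <algorithm> [ <arguments> ]
balance url_param <param> [check_post]

haproxy also divides its algorithms into dynamic and static ones, but the distinction differs from nginx. nginx calls an algorithm dynamic when it takes the current load of the backend servers into account; haproxy calls an algorithm dynamic when weight changes take effect immediately at run time.

Algorithms:
roundrobin: Each server is used in turns, according to their weights. (Server option: weight <n>.) Dynamic: weights can be adjusted at run time and slow start is supported; limited to 4095 servers per backend.
static-rr: Static: no run-time weight adjustment and no slow start; no limit on the number of servers.
leastconn: recommended where long sessions are expected, such as MySQL or LDAP.
first: servers are used in the order they are listed, top to bottom; a new request goes to the next server only once the previous one has reached its connection limit.
source: hash of the source address. Two placement methods:
    modulo (map-based)
    consistent hashing
uri: hashes the left part of the URI and divides the hash by the total weight of the servers to pick a server.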
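
To make "used in turns, according to their weights" concrete, here is a minimal Python sketch of a smooth weighted round robin. It is an illustration only, not HAProxy's internal implementation, and the server names and weights are hypothetical (borrowed from the srv1/srv2 examples used later in these notes).

```python
def smooth_weighted_rr(weights):
    """Yield server names forever, in proportion to their weights.

    Illustrative algorithm only; HAProxy's own roundrobin is implemented differently.
    """
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    while True:
        for name, w in weights.items():
            current[name] += w                   # every server earns its weight each round
        chosen = max(current, key=current.get)   # the largest running score wins
        current[chosen] -= total                 # the winner pays back the total weight
        yield chosen

# Hypothetical backend: srv1 has weight 2, srv2 has weight 1.
picker = smooth_weighted_rr({"srv1": 2, "srv2": 1})
first_six = [next(picker) for _ in range(6)]
print(first_six)
```

Over any window of three requests, srv1 is picked twice and srv2 once, and the picks are interleaved rather than sent in bursts.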

Whether a hash-based algorithm (source, uri, and so on) behaves as static or dynamic depends on hash-type:

hash-type
    map-based
    consistent
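Components of a URL
<scheme>://<user>:<password>@<host>:<port>/<path>;<params>?<query>#<frag>

scheme: the protocol used to access the server and fetch the resource
user: the user name some schemes require to access the resource
password: the password for that user, separated from the user name by ":"
host: host name or IP address of the server hosting the resource
port: port the hosting server listens on; many schemes have a default port
path: local name of the resource on the server, separated from the preceding URL components by "/"
params: input parameters as name/value pairs; multiple parameters are separated by ";"
query: parameters passed to the application (a database, for example), introduced by "?"; multiple queries are separated by "&"
frag: name of a piece or part of the resource; used only on the client side, introduced by "#"

Left part: /<path>;<params>
Whole URI: /<path>;<params>?<query>#<frag>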
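
The component breakdown above can be checked with Python's standard urllib.parse module. This is only a sketch: the URL is a made-up example (host, credentials, and parameters are assumptions), and left_part is a name introduced here for the path-plus-params portion.

```python
from urllib.parse import urlparse

# Hypothetical URL that exercises every component named above.
url = ("http://jerry:secret@www.example.com:8080"
       "/images/logo.jpg;type=a?username=jerry&lang=en#top")

parts = urlparse(url)

# The "left part" is the path plus its params, i.e. everything before "?".
left_part = parts.path + (";" + parts.params if parts.params else "")

print("scheme:  ", parts.scheme)
print("user:    ", parts.username)
print("password:", parts.password)
print("host:    ", parts.hostname)
print("port:    ", parts.port)
print("left:    ", left_part)
print("query:   ", parts.query)
print("frag:    ", parts.fragment)
```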

username=jerry

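url_param: looks up the named parameter in the query string of each HTTP GET request, hashes its value, and divides the hash by the total weight of the servers to pick a server. Typically used to track users so that requests from the same user always reach the same backend server.

hdr(<name>): for each HTTP request, the header named <name> is extracted and hashed, and the hash is divided by the total weight of the servers to pick a server; requests without a usable value fall back to round robin.
hdr(Cookie)

rdp-cookie
rdp-cookie(<name>)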

hash-type: how hashes are mapped to servers
hash-type <method> <function> <modifier>
map-based: modulo placement; the hash table is a static array.
consistent: consistent hashing; the hash table is a tree.

<function> is the hash function to be used:
sdbm
djb2
wt6
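
The practical difference between map-based and consistent placement shows up when the server list changes. The Python sketch below compares how many keys move to a different server when one of four servers is removed. It is a simplified model under stated assumptions: MD5 stands in for HAProxy's hash functions, the ring uses 100 points per server, weights are ignored, and the server and client names are hypothetical.

```python
import bisect
import hashlib

def h(key: str) -> int:
    """Stable 32-bit hash (MD5-based stand-in, not sdbm/djb2/wt6)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")

def map_based(key, servers):
    """Modulo placement: hash the key and index into a static array."""
    return servers[h(key) % len(servers)]

def build_ring(servers, points=100):
    """Consistent hashing: place several points per server on a sorted ring."""
    return sorted((h(f"{s}#{i}"), s) for s in servers for i in range(points))

def consistent(key, ring):
    """Pick the first ring point at or after the key's hash, wrapping around."""
    idx = bisect.bisect_left(ring, (h(key), ""))
    return ring[idx % len(ring)][1]

def moved_fraction(pick_before, pick_after, keys):
    """Share of keys whose server differs between two server sets."""
    return sum(pick_before(k) != pick_after(k) for k in keys) / len(keys)

keys = [f"client-{i}" for i in range(2000)]
before = ["srv1", "srv2", "srv3", "srv4"]
after = ["srv1", "srv2", "srv3"]            # srv4 removed

ring_before, ring_after = build_ring(before), build_ring(after)

moved_map = moved_fraction(lambda k: map_based(k, before),
                           lambda k: map_based(k, after), keys)
moved_ring = moved_fraction(lambda k: consistent(k, ring_before),
                            lambda k: consistent(k, ring_after), keys)

print(f"map-based:  {moved_map:.0%} of keys moved")
print(f"consistent: {moved_ring:.0%} of keys moved")
```

With modulo placement most keys are remapped, because the divisor changes for everyone. With the ring, only the keys that were on the removed server have to move.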

default_backend <backend>
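Sets the default backend; used in a frontend.

default-server [param*]
Sets default options for the servers of a backend.

server <name> <address>[:[port]] [param*]
Declares a server in a backend, together with its options.

server <name> <address>[:port] [settings …]
default-server [settings …]

<name>: internal name of the server within haproxy; it appears in logs and alerts.
<address>: server address; a host name may be used.
[:[port]]: port mapping; when omitted, the port the client connected to (the bind port) is used.
[param*]: parameters
maxconn <maxconn>: maximum number of concurrent connections to this server.
backlog <backlog>: length of the backlog queue used once this server has reached its connection limit.
backup: marks this server as a backup server.
check: enables health checks on this server.
addr: IP address to use for health checks.
port: port to use for health checks.
inter <delay>: interval between two consecutive checks; defaults to 2000 ms.
rise <count>: number of consecutive successful checks before the server is marked up; defaults to 2.
fall <count>: number of consecutive failed checks before the server is marked down; defaults to 3.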

Note: "option httpchk", "smtpchk", "mysql-check", "pgsql-check" and "ssl-hello-chk" define application-layer check methods.

Cookie-based session binding

cookie <value>: sets the cookie value for this server, used to implement cookie-based session stickiness.
disabled: marks the server as disabled.
on-error <mode>: action to take when the backend service fails.
- fastinter: force fastinter
- fail-check: simulate a failed check, also forces fastinter (default)
- sudden-death: simulate a pre-fatal failed health check, one more failed check will mark a server down, forces fastinter
- mark-down: mark the server immediately down and force fastinter
redir <prefix>: redirects all GET and HEAD requests destined for this server to the given URL prefix.
weight <weight>: the server's weight; defaults to 1.

OK –> PROBLEM
OK –> PROBLEM –> PROBLEM –> PROBLEM
PROBLEM –> OK
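
The OK/PROBLEM transitions above follow the rise and fall counters. As a rough illustration, here is a small Python state machine with the default values rise=2 and fall=3. The class and method names are invented for this sketch, and it models only the counting logic, not fastinter or the on-error modes.

```python
class ServerHealth:
    """Tracks UP/DOWN state from consecutive check results (rise/fall semantics)."""

    def __init__(self, rise=2, fall=3):
        self.rise, self.fall = rise, fall
        self.up = True      # assume the server starts in the UP state
        self.streak = 0     # consecutive results that contradict the current state

    def record(self, check_ok: bool) -> bool:
        """Feed one check result and return the resulting state (True = UP)."""
        if check_ok == self.up:
            self.streak = 0             # result agrees with the state: reset the counter
        else:
            self.streak += 1
            needed = self.fall if self.up else self.rise
            if self.streak >= needed:   # enough contradicting results in a row: flip
                self.up = not self.up
                self.streak = 0
        return self.up

srv = ServerHealth(rise=2, fall=3)
# Three failures in a row take the server down; two successes bring it back.
history = [srv.record(ok) for ok in (False, False, False, True, True)]
print(history)
```

A single failed check followed by a success leaves the server up, which is the point of requiring several consecutive results before changing state.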

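cookie <name> [ rewrite | insert | prefix ] [ indirect ] [ nocache ] [ postonly ] [ preserve ] [ httponly ] [ secure ] [ domain <domain> ]* [ maxidle <idle> ] [ maxlife <life> ]
<name>: is the name of the cookie which will be monitored, modified or inserted in order to bring persistence.
rewrite: the cookie is set by the server and haproxy rewrites its value.
insert: haproxy inserts the cookie into the response.
prefix: haproxy prefixes the value of an existing cookie with the server's identifier.

Implementing cookie-based session stickiness: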

backend websrvs
cookie WEBSRV insert nocache indirect
server srv1 172.16.100.6:80 weight 2 check rise 1 fall 2 maxconn 3000 cookie srv1
server srv2 172.16.100.7:80 weight 1 check rise 1 fall 2 maxconn 3000 cookie srv2
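
What the insert mode in this backend does can be sketched in a few lines of Python. This is a simplified model, not HAProxy's code: it ignores weights, health checks, and the indirect and nocache options, and the route function is a name made up for the illustration.

```python
import itertools

# Cookie value -> server address, mirroring the "cookie srv1" / "cookie srv2" settings.
SERVERS = {"srv1": "172.16.100.6:80", "srv2": "172.16.100.7:80"}
COOKIE_NAME = "WEBSRV"
_fallback = itertools.cycle(SERVERS)   # plain round robin for clients without a cookie

def route(request_cookies: dict):
    """Return (server address, Set-Cookie value or None) for one request."""
    value = request_cookies.get(COOKIE_NAME)
    if value in SERVERS:
        # Known cookie: stick to the same server, nothing to set again.
        return SERVERS[value], None
    # No cookie or unknown value: balance, then tell the client which server it got.
    value = next(_fallback)
    return SERVERS[value], f"{COOKIE_NAME}={value}; path=/"

# First request has no cookie; the second replays the cookie it was given.
addr1, set_cookie = route({})
cookie_value = set_cookie.split(";")[0].split("=")[1]
addr2, again = route({COOKIE_NAME: cookie_value})
print(addr1, set_cookie)
print(addr2, again)
```

The second request lands on the same server as the first and receives no new cookie, which is the stickiness the configuration above is after.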

Parameters for the statistics interface

stats enable
Enables the stats page with default settings:
- stats uri : /haproxy?stats
- stats realm : "HAProxy Statistics"
- stats auth : no authentication
- stats scope : no restriction

stats auth <user>:<passwd>  account and password for authentication; may be used several times.
stats realm <realm>  the authentication realm.
stats uri <prefix>  custom URI for the stats page.
stats refresh <delay>  automatic refresh interval.
stats admin { if | unless } <cond>  enables the admin functions of the stats page.

Configuration example:

listen stats
    bind :9099
    stats enable
    stats realm HAProxy\ Stats\ Page
    stats auth admin:admin
    stats admin if TRUE

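maxconn <conns>: sets the maximum number of concurrent connections for a frontend; defaults to 2000.
Fix the maximum number of concurrent connections on a frontend.

haproxy operating modes

mode { tcp|http|health }  sets the operating mode of a proxy.
tcp: proxies at layer 4; can carry protocols such as mysql, pgsql, ssh, and ssl.
http: use only when the proxied protocol is HTTP.
health: health-check responder mode; answers "OK" to each incoming connection and then closes it.

Example: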

listen ssh
bind :22022
balance leastconn
mode tcp
server sshsrv1 172.16.100.6:22 check
server sshsrv2 172.16.100.7:22 check

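forwardfor configuration

option forwardfor [ except <network> ] [ header <name> ] [ if-none ]
Enable insertion of the X-Forwarded-For header to requests sent to servers
Adds an "X-Forwarded-For" header to requests haproxy sends to the backend servers; its value is the client's address, so the backend can see the real client IP.

[ except <network> ]: do not add the header when the request comes from this network.
[ header <name> ]: use a custom header name instead of "X-Forwarded-For".

[ if-none ]: add the header only if the request does not already carry one; an existing header is left untouched.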
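
The interaction of the three options can be modeled in Python as below. This is a sketch of the decision logic as described above, not HAProxy source code; apply_forwardfor and its parameters are hypothetical names, and headers are represented as a list of (name, value) pairs.

```python
import ipaddress

def apply_forwardfor(headers, client_ip, except_net=None,
                     header="X-Forwarded-For", if_none=False):
    """Return a copy of `headers` with the client IP appended when the options allow it."""
    out = list(headers)
    if except_net and ipaddress.ip_address(client_ip) in ipaddress.ip_network(except_net):
        return out                       # except <network>: leave the request untouched
    already = any(name.lower() == header.lower() for name, _ in out)
    if if_none and already:
        return out                       # if-none: keep the header set by an upstream proxy
    out.append((header, client_ip))      # otherwise add a new header line at the end
    return out

req = [("Host", "www.example.com")]
print(apply_forwardfor(req, "10.0.0.5"))
print(apply_forwardfor(req, "127.0.0.1", except_net="127.0.0.0/8"))

proxied = req + [("X-Forwarded-For", "203.0.113.9")]
print(apply_forwardfor(proxied, "10.0.0.5", if_none=True))
```

Because the header is appended rather than merged, a backend behind several proxies may see more than one X-Forwarded-For line and should be configured with that in mind.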

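Enabling compression for given MIME types
compression algo <algorithm> …: enables HTTP compression and selects the algorithm, for example gzip or deflate.
compression type <mime type> …: the MIME types to compress.

Error page configuration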

errorfile <code> <file>
Return a file contents instead of errors generated by HAProxy

<code>：is the HTTP status code. Currently, HAProxy is capable of generating codes 200, 400, 403, 408, 500, 502, 503, and 504.
<file>：designates a file containing the full HTTP response.

Example:

errorfile 400 /etc/haproxy/errorfiles/400badreq.http
errorfile 408 /dev/null # workaround Chrome pre-connect bug
errorfile 403 /etc/haproxy/errorfiles/403forbid.http
errorfile 503 /etc/haproxy/errorfiles/503sorry.http

errorloc <code> <url>
errorloc302 <code> <url>

errorloc 403 http://www.magedu.com/error_pages/403.html

Modifying message headers

reqadd <string> [{if | unless} <cond>] Add a header at the end of the HTTP request
rspadd <string> [{if | unless} <cond>] Add a header at the end of the HTTP response

Example

rspadd X-Via:\ HAProxy    # spaces inside the string must be escaped

reqdel <search> [{if | unless} <cond>]
reqidel <search> [{if | unless} <cond>] (ignore case) Delete all headers matching a regular expression in an HTTP request
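reqidel removes, case-insensitively, every request header that matches the regular expression.
rspdel <search> [{if | unless} <cond>]
rspidel <search> [{if | unless} <cond>] (ignore case) Delete all headers matching a regular expression in an HTTP response
rspidel removes, case-insensitively, every response header that matches the regular expression.

Example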

rspidel Server.*

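Session persistence

haproxy can keep a client bound to the same server in three ways:

1. By source IP

haproxy hashes the client IP and maps it to a fixed real server (similar to the ip_hash directive in nginx).

Directive: balance source

backend www
    mode http
    balance source
    server web1 192.168.0.150:80 check inter 1500 rise 3 fall 3
    server web2 192.168.0.151:80 check inter 1500 rise 3 fall 3

2. By cookie

haproxy inserts its own server identifier into the cookies the web server sends to the client, either as a separate cookie or as a prefix to an existing one.

Example directive: cookie SESSION_COOKIE insert indirect nocache

With a browser debugging tool such as Firebug, the request's Cookie header looks something like:

Cookie: jsessionid=0bc588656ca05ecf7588c65f9be214f5; SESSION_COOKIE=app1

SESSION_COOKIE=app1 is the part added by haproxy.

backend COOKIE_srv
    mode http
    cookie SESSION_COOKIE insert indirect nocache
    server web1 192.168.0.150:80 cookie 1 check inter 1500 rise 3 fall 3
    server web2 192.168.0.151:80 cookie 2 check inter 1500 rise 3 fall 3

3. By application session

haproxy stores the session generated by the backend server, together with the server's identifier, in a table held by haproxy, and looks the session up in that table when a client request arrives. Note that appsession was removed in HAProxy 1.6, where stick tables take its place, so this method applies to 1.5 and earlier.

Directive: appsession <cookie> len <length> timeout <holdtime>

Example directive: appsession JSESSIONID len 64 timeout 5h request-learn

Configuration example:

backend APPSESSION_srv
    mode http
    appsession JSESSIONID len 64 timeout 5h request-learn
    server web1 192.168.0.150:80 cookie 1 check inter 1500 rise 3 fall 3
    server web2 192.168.0.151:80 cookie 2 check inter 1500 rise 3 fall 3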

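Logging

log:
log global
log <address> [len <length>] <facility> [<level> [<minlevel>]]
no log

Notes:
By default logs are sent to the syslog server on the local host, which needs the following in its rsyslog configuration:
(1) local2.* /var/log/local2.log
(2) $ModLoad imudp
$UDPServerRun 514

log-format <string>:
Exercise: following the documentation, produce log records in combined format.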

capture cookie <name> len <length>
Capture and log a cookie in the request and in the response.

capture request header <name> len <length>
Capture and log the last occurrence of the specified request header.

capture request header X-Forwarded-For len 15

capture response header <name> len <length>
Capture and log the last occurrence of the specified response header.

capture response header Content-length len 9
capture response header Location len 15

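Enabling compression for given MIME types
compression algo <algorithm> …: enables HTTP compression and selects the algorithm, for example gzip or deflate.
compression type <mime type> …: the MIME types to compress; text types are the usual candidates.

HTTP health checks against backend servers:
option httpchk
option httpchk <uri>
option httpchk <method> <uri>
option httpchk <method> <uri> <version>
Defines a layer 7 health check based on HTTP.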

http-check expect [!] <match> <pattern>
Make HTTP health checks consider response contents or specific status codes.

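Connection timeouts:
timeout client <timeout>
Set the maximum inactivity time on the client side. The default unit is milliseconds.

timeout server <timeout>
Set the maximum inactivity time on the server side.

timeout http-keep-alive <timeout>
How long an idle keep-alive connection is held open while waiting for a new request.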

timeout http-request <timeout>
Set the maximum allowed time to wait for a complete HTTP request

timeout connect <timeout>
Set the maximum time to wait for a connection attempt to a server to succeed.

timeout client-fin <timeout>
Set the inactivity timeout on the client side for half-closed connections.

timeout server-fin <timeout>
Set the inactivity timeout on the server side for half-closed connections.

use_backend <backend> [{if | unless} <condition>]
Switch to a specific backend if/unless an ACL-based condition is matched.
Uses the given backend when the condition is met.

block { if | unless } <condition>
Block a layer 7 request if/unless a condition is matched

acl invalid_src src 172.16.200.2
block if invalid_src
errorfile 403 /etc/fstab

http-request { allow | deny } [ { if | unless } <condition> ]
Access control for Layer 7 requests

tcp-request connection {accept|reject} [{if | unless} <condition>]
Perform an action on an incoming connection depending on a layer 4 condition

Example:
listen ssh
bind :22022
balance leastconn
acl invalid_src src 172.16.200.2
tcp-request connection reject if invalid_src
mode tcp
server sshsrv1 172.16.100.6:22 check
server sshsrv2 172.16.100.7:22 check backup

acl

The use of Access Control Lists (ACL) provides a flexible solution to perform content switching and generally to take decisions based on content extracted from the request, the response or any environmental status.

acl <aclname> <criterion> [flags] [operator] [<value>] …
<aclname>: ACL names must be formed from upper and lower case letters, digits, '-' (dash), '_' (underscore), '.' (dot) and ':' (colon). ACL names are case-sensitive.
Types of <value>:
- boolean
- integer or integer range
- IP address / network
- string (exact, substring, suffix, prefix, subdir, domain)
- regular expression
- hex block

<flags>
-i : ignore case during matching of all subsequent patterns.
-m : use a specific pattern matching method
-n : forbid the DNS resolutions
-u : force the unique id of the ACL
-- : force end of flags. Useful when a string looks like one of the flags.

[operator]
Integer matching: eq, ge, gt, le, lt

String matching:
- exact match (-m str) : the extracted string must exactly match the patterns ;
- substring match (-m sub) : the patterns are looked up inside the extracted string, and the ACL matches if any of them is found inside ;
- prefix match (-m beg) : the patterns are compared with the beginning of the extracted string, and the ACL matches if any of them matches.
- suffix match (-m end) : the patterns are compared with the end of the extracted string, and the ACL matches if any of them matches.
- subdir match (-m dir) : the patterns are looked up inside the extracted string, delimited with slashes ("/"), and the ACL matches if any of them matches.
- domain match (-m dom) : the patterns are looked up inside the extracted string, delimited with dots ("."), and the ACL matches if any of them matches.

Logical relations between ACLs used as conditions:
- AND (implicit)
- OR (explicit with the "or" keyword or the "||" operator)
- Negation with the exclamation mark ("!")

if invalid_src invalid_port
if invalid_src || invalid_port
if ! invalid_src invalid_port

<criterion> ：
dst : ip
dst_port : integer
src : ip
src_port : integer

acl invalid_src src 172.16.200.2

path : string
This extracts the request’s URL path, which starts at the first slash and ends before the question mark (without the host part).
/path;<params>

path : exact string match
path_beg : prefix match
path_dir : subdir match
path_dom : domain match
path_end : suffix match
path_len : length match
path_reg : regex match
path_sub : substring match

path_beg /images/
path_end .jpg .jpeg .png .gif
path_reg ^/images.*\.jpeg$
path_sub image 
path_dir jpegs
path_dom ilinux

/images/jpegs/20180312/logo.jpg
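
To see which of the sample patterns above match the sample path, the matching methods can be approximated in Python. These helper functions are stand-ins written for this illustration: they ignore the -i flag, path_dir is a simplified reading of the subdir rule, and path_dom is left out.

```python
import re

def path_beg(path, *pats):
    """Prefix match: the path starts with any of the patterns."""
    return any(path.startswith(p) for p in pats)

def path_end(path, *pats):
    """Suffix match: the path ends with any of the patterns."""
    return any(path.endswith(p) for p in pats)

def path_sub(path, *pats):
    """Substring match: any pattern occurs anywhere in the path."""
    return any(p in path for p in pats)

def path_reg(path, *pats):
    """Regex match: any pattern matches somewhere in the path."""
    return any(re.search(p, path) for p in pats)

def path_dir(path, *pats):
    """Subdir match (approximation): the pattern equals whole slash-delimited components."""
    padded = "/" + path.strip("/") + "/"
    return any("/" + p.strip("/") + "/" in padded for p in pats)

sample = "/images/jpegs/20180312/logo.jpg"
results = {
    "path_beg /images/": path_beg(sample, "/images/"),
    "path_end .jpg .jpeg .png .gif": path_end(sample, ".jpg", ".jpeg", ".png", ".gif"),
    "path_reg ^/images.*\\.jpeg$": path_reg(sample, r"^/images.*\.jpeg$"),
    "path_sub image": path_sub(sample, "image"),
    "path_dir jpegs": path_dir(sample, "jpegs"),
}
for rule, matched in results.items():
    print(f"{rule:32} -> {matched}")
```

Only the path_reg rule fails here, because the sample file ends in .jpg rather than .jpeg.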

url : string
This extracts the request’s URL as presented in the request. A typical use is with prefetch-capable caches, and with portals which need to aggregate multiple information from databases and keep them in caches.

url : exact string match
url_beg : prefix match
url_dir : subdir match
url_dom : domain match
url_end : suffix match
url_len : length match
url_reg : regex match
url_sub : substring match

req.hdr([<name>[,<occ>]]) : string
This extracts the last occurrence of header <name> in an HTTP request.

hdr([<name>[,<occ>]]) : exact string match
hdr_beg([<name>[,<occ>]]) : prefix match
hdr_dir([<name>[,<occ>]]) : subdir match
hdr_dom([<name>[,<occ>]]) : domain match
hdr_end([<name>[,<occ>]]) : suffix match
hdr_len([<name>[,<occ>]]) : length match
hdr_reg([<name>[,<occ>]]) : regex match
hdr_sub([<name>[,<occ>]]) : substring match

Example:
acl bad_curl hdr_sub(User-Agent) -i curl
block if bad_curl

status : integer
Returns an integer containing the HTTP status code in the HTTP response.

Pre-defined ACLs
ACL name        Equivalent to                 Usage
FALSE           always_false                  never match
HTTP            req_proto_http                match if protocol is valid HTTP
HTTP_1.0        req_ver 1.0                   match HTTP version 1.0
HTTP_1.1        req_ver 1.1                   match HTTP version 1.1
HTTP_CONTENT    hdr_val(content-length) gt 0  match an existing content-length
HTTP_URL_ABS    url_reg ^[^/:]*://            match absolute URL with scheme
HTTP_URL_SLASH  url_beg /                     match URL beginning with "/"
HTTP_URL_STAR   url *                         match URL equal to "*"
LOCALHOST       src 127.0.0.1/8               match connection from local host
METH_CONNECT    method CONNECT                match HTTP CONNECT method
METH_GET        method GET HEAD               match HTTP GET or HEAD method
METH_HEAD       method HEAD                   match HTTP HEAD method
METH_OPTIONS    method OPTIONS                match HTTP OPTIONS method
METH_POST       method POST                   match HTTP POST method
METH_TRACE      method TRACE                  match HTTP TRACE method
RDP_COOKIE      req_rdp_cookie_cnt gt 0       match presence of an RDP cookie
REQ_CONTENT     req_len gt 0                  match data in the request buffer
TRUE            always_true                   always match
WAIT_END        wait_end                      wait for end of content analysis

HAProxy: global, proxies (frontend, backend, listen, defaults)
balance:
roundrobin, static-rr
leastconn
first
source
hdr(<name>)
uri (hash-type)
url_param

Nginx scheduling algorithms: ip_hash, hash, least_conn
lvs scheduling algorithms:
rr/wrr/sh/dh, lc/wlc/sed/nq/lblc/lblcr

Example: separating static and dynamic content with ACLs
frontend web *:80
    acl url_static path_beg -i /static /images /javascript /stylesheets
    acl url_static path_end -i .jpg .gif .png .css .js .html .txt .htm

    use_backend staticsrvs if url_static
    default_backend appsrvs

backend staticsrvs
    balance roundrobin
    server stcsrv1 172.16.100.6:80 check

backend appsrvs
    balance roundrobin
    server app1 172.16.100.7:80 check
    server app2 172.16.100.7:8080 check

listen stats
    bind :9091
    stats enable
    stats auth admin:admin
    stats admin if TRUE

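Configuring HAProxy for HTTPS:
1. Accept SSL sessions:
bind *:443 ssl crt /PATH/TO/SOME_PEM_FILE

The file given after crt must be in PEM format and contain both the certificate and its matching private key:

cat demo.crt demo.key > demo.pem

2. Redirect requests on port 80 to 443:
bind *:80
redirect scheme https if !{ ssl_fc }

Alternative: redirect any non-SSL request, whatever its URL, to the home page of the HTTPS host:
redirect location https://172.16.0.67/ if !{ ssl_fc }

3. Pass the protocol and port requested by the user to the backend:
http-request set-header X-Forwarded-Port %[dst_port]
http-request add-header X-Forwarded-Proto https if { ssl_fc }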

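Commonly used features:
http -> https

mode http
Compression, conditional forwarding, algorithms, stats page, custom error pages, access control, logging
Maximum concurrent connections:
global, defaults, frontend, listen, server
Cookie-based session stickiness
Backend health checks
Manipulating request and response headers

Practice (blog) assignment:
http:
(1) Deploy wordpress with static and dynamic content separated; both must be load balanced, and session handling needs attention.
(2) Add varnish between haproxy and the backend hosts for caching.
(3) Draw the topology and write it up as a blog post.

(4) Requirements for the haproxy setup:
(a) a stats page whose admin interface is reachable only locally;
(b) static/dynamic separation;
(c) a considered choice of scheduling algorithm for each server group;
(d) compression of suitable content types.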

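haproxy timeout mechanics

option redispatch: whether a session may be redispatched to another server after it fails.

option abortonclose: drop requests that are still waiting in the haproxy queue although the client, tired of waiting, has already closed the connection.
# under high server load, this automatically ends queued connections that have been waiting for a long time
defaults
log global
mode http
option httplog
option dontlognull
retries 3
option redispatch
option abortonclose
maxconn 65535
timeout connect 5000
timeout client 50000
timeout server 50000
timeout check 5s
stats refresh 30s
timeout http-request: closes the client connection when the client connects but does not send a complete request in time.
timeout queue: maximum time a request may wait in the queue.
timeout connect: how long haproxy waits for its connection to a backend server to be established when forwarding a client request.
timeout client: maximum inactivity on the client side.
timeout server: once the connection to the server is established, how long to wait for the server to respond.
timeout http-keep-alive: how long a keep-alive connection is held open.
timeout check: timeout for health checks; too short causes healthy servers to be misjudged as down, too long wastes resources.
The client timeout covers the app-to-haproxy side.
The server timeout covers the haproxy-to-backend side.
It has been observed that after a timeout a single click in the front end causes haproxy to receive two identical requests; the cause has not yet been identified.
With a 5-second timeout configured:
timeout connect 5000
timeout client 50000
timeout server 50000

timeout check 5s
stats refresh 30s

Steps to reproduce:
1) On virtual machines 131 and 132, disable the firewall and run on each: while true; do echo -e 'HTTP/1.0 200 OK server_153' | sudo nc -l -p 80 ; done
2) On another virtual machine with IP 129, run wget http://10.192.74.160 several times.
3) Watch how the packets are handled on 131 and 132.

The script is the loop from step 1: while true; do echo -e 'HTTP/1.0 200 OK server_153' | sudo nc -l -p 80 ; done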