HAProxy Installation, Configuration, and Static/Dynamic Content Separation

///////////////////////////// Contents //////////////////////////////////////////
1. Installing HAProxy
2. Writing an HAProxy Init Script
3. HAProxy Configuration Parameters
4. Reference Configuration for Static/Dynamic Separation

Official reference documentation:
http://www.haproxy.org/#down

Other references:
Mage's Chinese HAProxy documentation tutorial
http://mageedu.blog.51cto.com/4265610/1744420?utm_source=tuicool&utm_medium=referral
HAProxy ACL rules
http://dngood.blog.51cto.com/446195/886547/
http://hao360.blog.51cto.com/5820068/1343422
Active/active HAProxy with KeepAlived, plus static/dynamic resource separation
http://anyisalin.blog.51cto.com/10917514/1764547

//////////////////////////// Let's get started ///////////////////////////////////

1. Installing HAProxy
The official documentation (HAProxy 1.7) says the following:

To build haproxy, you will need :
  -GNU make
  -GCC between 2.95 and 4.8.
  -GNU ld

Also, you might want to build with libpcre support.
If your system supports PCRE (Perl Compatible Regular Expressions), then you really should build with libpcre which is between 2 and 10 times faster than other libc implementations.
- USE_PCRE=1

It is also possible to include native support for zlib to benefit from HTTP compression.
USE_ZLIB=1

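To check the Linux kernel version:

1. uname -a
2. cat /proc/version

The TARGET value must match the kernel; linux2628 is for Linux 2.6.28 and later.
For this machine, the build steps are: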

 make TARGET=linux2628 USE_OPENSSL=1 USE_PCRE=1 USE_ZLIB=1 PREFIX=/usr/local/haproxy
 make install PREFIX=/usr/local/haproxy
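
The choice of TARGET follows from the kernel version reported by uname. As a rough sketch, a small shell function can map a kernel release string to a TARGET name. The function name pick_target is hypothetical, and the mapping only covers the linux24, linux26, and linux2628 targets of the HAProxy 1.x Makefile, falling back to generic for anything else.

```shell
#!/bin/sh
# pick_target RELEASE
# Hypothetical helper: print a TARGET name for the HAProxy 1.x Makefile
# based on a kernel release string such as "2.6.32-431.el6.x86_64".
pick_target() {
    release=$1
    major=${release%%.*}          # "2.6.32-431.el6" -> "2"
    rest=${release#*.}            # -> "6.32-431.el6"
    minor=${rest%%.*}             # -> "6"
    minor=${minor%%[!0-9]*}       # drop any non-numeric suffix
    rest=${rest#*.}               # -> "32-431.el6"
    patch=${rest%%[!0-9]*}        # -> "32"
    patch=${patch:-0}

    if [ "$major" -ge 3 ]; then
        echo linux2628
    elif [ "$major" -eq 2 ] && [ "$minor" -eq 6 ]; then
        if [ "$patch" -ge 28 ]; then echo linux2628; else echo linux26; fi
    elif [ "$major" -eq 2 ] && [ "$minor" -eq 4 ]; then
        echo linux24
    else
        echo generic
    fi
}

# Print the suggested TARGET for the running kernel.
pick_target "$(uname -r)"
```

The result could then be passed to the build, for example make TARGET=$(pick_target "$(uname -r)") with the remaining options unchanged.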

2. Writing an HAProxy Init Script

vi /etc/init.d/haproxy
#!/bin/sh
# chkconfig: 2345 85 15
# description: HAProxy is a TCP/HTTP reverse proxy which is particularly suited for high availability environments.
if [ -f /etc/init.d/functions ]; then
 . /etc/init.d/functions
elif [ -f /etc/rc.d/init.d/functions ] ; then
 . /etc/rc.d/init.d/functions
else
 exit 0
fi

# Source networking configuration.
. /etc/sysconfig/network

# Check that networking is up.
[ "${NETWORKING}" = "no" ] && exit 0
config="/etc/haproxy.cfg"
exec="/usr/local/haproxy/sbin/haproxy"
PID="/var/run/haproxy.pid"
[ -f $config ] || exit 1
RETVAL=0
start() {
  # Validate the configuration before starting.
  $exec -c -q -f $config
  if [ $? -ne 0 ]; then
      echo "Errors found in configuration file."
      return 1
  fi
  echo -n "Starting HAproxy: "
  $exec -D -f $config -p $PID
  RETVAL=$?
  echo
  [ $RETVAL -eq 0 ] && touch /var/lock/subsys/haproxy
  return $RETVAL
}
stop() {
 echo -n "Shutting down HAproxy: "
 killproc haproxy -USR1
 RETVAL=$?
 echo
 [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/haproxy
 [ $RETVAL -eq 0 ] && rm -f $PID
 return $RETVAL
}
restart() {
 # Validate the configuration first so a bad file does not stop a running instance.
 $exec -c -q -f $config
 if [ $? -ne 0 ]; then
     echo "Errors found in configuration file, check it with '$exec -c -f $config'."
     return 1
 fi
 stop
 start
}
rhstatus() {
 status haproxy
}
# See how we were called.
case "$1" in
 start)
        start
        ;;
 stop)
        stop
        ;;
 restart)
        restart
        ;;
 status)
        rhstatus
        ;;
 *)
        echo $"Usage: haproxy {start|stop|restart|status}"
        RETVAL=1
        ;;
esac
exit $RETVAL

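Tested and working. Note that the configuration file path may differ on your system. Make the script executable (chmod +x /etc/init.d/haproxy) and register it with chkconfig --add haproxy.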

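3. HAProxy Configuration Parameters
The configuration file is made up of the following sections:
global    process-wide settings
defaults  default parameters for all sections that follow it
listen    a complete proxy that combines the "frontend" and "backend" parts in one section; generally useful for TCP-only traffic
frontend  a set of listening sockets that accept client connections
backend   a set of "backend servers" to which the proxy forwards the corresponding client requests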

The official documentation reads:

A "defaults" section sets default parameters for all other sections following its declaration. Those default parameters are reset by the next "defaults" section. See below for the list of parameters which can be set in a "defaults" section. The name is optional but its use is encouraged for better readability.

A "frontend" section describes a set of listening sockets accepting client connections.

A "backend" section describes a set of servers to which the proxy will connect to forward incoming connections.

A "listen" section defines a complete proxy with its frontend and backend parts combined in one section. It is generally useful for TCP-only traffic.

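[Table: sections in which commonly used parameters may appear]

For the overall structure of a configuration file, see "4. Reference Configuration for Static/Dynamic Separation".

Other keywords: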

3.1 balance

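The load-balancing algorithm. The available values are:
  roundrobin: weighted round robin. This is the smoothest and fairest algorithm when the servers' processing time stays evenly distributed. It is dynamic, meaning server weights can be adjusted at run time. By design, however, it is limited to 4095 active servers per backend.
  static-rr: weighted round robin, similar to roundrobin but static: changing a server's weight at run time has no effect. In return, it has no design limit on the number of servers.
  leastconn: a new connection goes to the server with the fewest connections. Recommended where sessions are long, such as LDAP or SQL; not well suited to application protocols with short sessions, such as HTTP. It is dynamic, so weights can be adjusted at run time.
  source: the source address of the request is hashed and divided by the total weight of the running servers to choose the server. The same client IP therefore always reaches the same server. When the total weight changes, for example because a server goes down or a new one is added, many clients may be sent to a different server than before. Commonly used to balance TCP-based protocols that cannot carry cookies. It is static by default; this can be changed with hash-type.
  uri: the left part of the URI (before the question mark) or the whole URI is hashed and divided by the total weight of the servers to choose the server. Requests for the same URI always reach the same server unless the total weight changes. Often used with proxy caches or anti-virus proxies to improve the cache hit rate. It applies only to HTTP backends. It is static by default; this can be changed with hash-type.
  url_param: the URL parameter named by <argument> is looked up in each HTTP GET request. If the parameter is found and is given a value with an equals sign ("="), the value is hashed and divided by the total weight of the servers to choose the server. This can track a user identifier in requests so that the same user ID always reaches the same server, unless the total weight changes. If the parameter is missing or has no valid value, round robin is used for that request. It is static by default; this can be changed with hash-type.
  hdr(<name>): the HTTP header named by <name> is looked up in each HTTP request. If the header is missing or has no valid value, round robin is used for that request. The optional "use_domain_only" parameter hashes only the domain part when the header is Host-like (for www.magedu.com, only the string magedu is hashed), which reduces the hashing work. It is static by default; this can be changed with hash-type.
  rdp-cookie
  rdp-cookie(<name>): the RDP cookie <name> (or "mstshash" if the name is omitted) is looked up and hashed for each incoming TCP request.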
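
The hash-based algorithms above (source, uri, url_param, hdr) share one idea: hash a key, then reduce the hash by the total server weight to pick a server. The sketch below illustrates only that idea. It assumes every server has weight 1, uses the CRC printed by cksum as a stand-in hash (not the hash function HAProxy itself uses), and the function name pick_server is hypothetical.

```shell
#!/bin/sh
# pick_server KEY SERVER...
# Hypothetical illustration: hash KEY and pick one of the servers,
# assuming every server has weight 1.
pick_server() {
    key=$1
    shift
    # cksum prints "<crc> <byte count>"; keep only the CRC.
    hash=$(printf '%s' "$key" | cksum | cut -d ' ' -f 1)
    index=$(( hash % $# ))
    if [ "$index" -gt 0 ]; then
        shift "$index"
    fi
    printf '%s\n' "$1"
}

pick_server 192.168.0.10 web1 web2 web3
pick_server 192.168.0.10 web1 web2 web3   # same key, same server
pick_server 192.168.0.10 web1 web2        # server set changed: the mapping may move
```

The last call shows why these algorithms remap clients when a server is added or removed: the divisor changes, so the remainder can change too.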

3.2 mode { tcp|http|health }

3.3 default_backend

default_backend <backend>
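  Specifies the backend to use when no "use_backend" rule matches; it therefore cannot be used in a backend section. When doing content switching between a "frontend" and its backends, "use_backend" rules usually define the matching, and requests not matched by any rule go to the backend named here.

  Example: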
  use_backend     dynamic  if  url_dyn
  use_backend     static   if  url_css url_img extension_img
  default_backend dynamic

3.4 server

server <name> <address>[:port] [param*]
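  Declares a server in a backend; it therefore cannot be used in defaults or frontend sections.
  <name>: the internal name of this server, which appears in logs and alerts; if "http-send-server-name" is set, it is also added to a header of requests sent to this server;
  <address>: the IPv4 address of this server. A resolvable host name is also accepted, but it must resolve to an IPv4 address at startup;
  [:port]: the destination port for connections sent to this server. It is optional; when not set, the same port the client connected to is used;
  [param*]: a list of parameters for this server. Many are available; see the official documentation for details. Only a few common ones are described here:
    backup: marks the server as a backup, used in load balancing only when all other servers are unavailable;
    check: enables health checks on this server. Additional parameters allow finer control, for example:
    inter <delay>: the interval between health checks in milliseconds, 2000 by default; fastinter and downinter can tune this delay according to the server's state;
    rise <count>: the number of consecutive successful health checks needed for an offline server to be considered up again;
    fall <count>: the number of consecutive failed health checks needed for a server to be considered unavailable;
    cookie <value>: sets the cookie value for this server. The value is checked on incoming requests, and the server first selected for this value is selected again for subsequent requests, which provides session persistence;
    maxconn <maxconn>: the maximum number of concurrent connections this server accepts. Connections beyond this number are placed in the request queue until other connections are released;
    maxqueue <maxqueue>: the maximum length of the request queue;
    observe <mode>: judges the server's health by observing its traffic. Disabled by default; the supported modes are "layer4" and "layer7", and "layer7" can only be used with HTTP proxies;
    redir <prefix>: enables redirection: GET and HEAD requests sent to this server are answered with a 302 status. Note that the prefix must not end with "/" and must not be a relative address, to avoid loops. For example:
    server srv1 172.16.100.6:80 redir http://imageserver.magedu.com check
    weight <weight>: the weight, 1 by default and 256 at most; 0 means the server does not take part in load balancing;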
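
To show how these parameters combine on one line, here is a small illustrative backend written to a temporary file from the shell. The server names and the addresses 172.16.1.6 and 172.16.1.7 are made up for the example, and the numeric values are arbitrary rather than recommendations.

```shell
#!/bin/sh
# Write an illustrative backend fragment to a temporary file.
# Server names, addresses, and values are assumptions for the example.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
backend dynamic
    balance roundrobin
    # web1 carries twice the share of web2 (weight 2 vs 1).
    server web1  172.16.1.5:80 check inter 2000 rise 2 fall 3 weight 2 maxconn 500
    server web2  172.16.1.6:80 check inter 2000 rise 2 fall 3 weight 1 maxconn 500
    # spare only receives traffic when web1 and web2 are both down.
    server spare 172.16.1.7:80 check backup
EOF
cat "$cfg"
```

Once merged into a full configuration, the result can be syntax-checked with the same haproxy -c -f call the init script uses.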

3.5 errorfile

errorfile <code> <file>
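  Returns the contents of a file to the client instead of the error response HAProxy would generate itself; it can be used in all sections.
  <code>: the HTTP status code for which the file is returned; the codes available here include 200, 400, 403, 408, 500, 502, 503, and 504;
  <file>: the file to send as the response. It is sent as-is, so it must be a complete HTTP response including the status line and headers;
  For example: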
  errorfile 400 /etc/haproxy/errorpages/400badreq.http
  errorfile 403 /etc/haproxy/errorpages/403forbid.http
  errorfile 503 /etc/haproxy/errorpages/503sorry.http
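
Because an errorfile is sent to the client verbatim, it has to contain a whole HTTP response rather than just an HTML page. The following sketch generates a minimal 503 file from the shell; it writes to a temporary path, and the headers and page body are only an example.

```shell
#!/bin/sh
# Generate a minimal 503 error file. HAProxy sends the file as-is,
# so it needs a status line, headers, a blank line, and then the body.
errfile=$(mktemp)
{
    printf 'HTTP/1.0 503 Service Unavailable\r\n'
    printf 'Cache-Control: no-cache\r\n'
    printf 'Connection: close\r\n'
    printf 'Content-Type: text/html\r\n'
    printf '\r\n'
    printf '<html><body><h1>503 Service Unavailable</h1></body></html>\n'
} > "$errfile"

# Show the status line that clients will receive.
head -n 1 "$errfile"
```

The generated file would then be copied to the path named in the errorfile line, for example /etc/haproxy/errorpages/503sorry.http.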

3.6 errorloc, errorloc302, and errorloc303

3.7 Access Control Lists (ACLs)

Format:
  acl <aclname> hdr_reg(host) -i <regular expression>
                hdr_dom(host) -i <domain name>
                path_beg
                path_end
                base_...
                ...
  An ACL is a match test: if the request matches, <aclname> evaluates to true.

  Worth knowing:
  <hdr>   hdr(<name>) extracts the value of the HTTP request header <name>, for example hdr(host).
  <base>  This returns the concatenation of the first Host header and the path part of
  the request, which starts at the first slash and ends before the question
  mark.
  <path>  This extracts the request's URL path, which starts at the first slash(/) and
  ends before the question mark (without the host part).
  <url>   This extracts the request's URL as presented in the request.
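
As an illustration of the format above, the fragment below defines three ACLs and combines them in a use_backend rule. It is written to a temporary file from the shell; the ACL names, the host img.example.com, and the path prefixes are assumptions made up for the example.

```shell
#!/bin/sh
# Write an illustrative frontend fragment to a temporary file.
# ACL names, host name, and paths are assumptions for the example.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
frontend main
    bind *:80
    acl host_img   hdr_dom(host) -i img.example.com
    acl url_static path_beg      -i /static /images
    acl url_img    path_end      -i .jpg .png .gif
    # ACLs joined with "or": any one match selects the static backend.
    use_backend static if host_img or url_static or url_img
    default_backend dynamic
EOF
cat "$cfg"
```

ACL names listed side by side in a condition are combined with an implicit AND, so the explicit "or" matters here: without it, a request would have to match all three ACLs at once.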

4. Reference Configuration for Static/Dynamic Separation

vi /etc/haproxy.cfg
global
  log     127.0.0.1 local2
  chroot  /var/lib/haproxy
  pidfile /var/run/haproxy.pid
  maxconn 4000    #maximum number of connections
  user    haproxy
  group   haproxy
  daemon

defaults
  mode    http    #http is layer 7
                  #tcp is layer 4
  log     global
  option  httplog #HTTP log format
  option  dontlognull
  option  http-server-close
  option  forwardfor  except 127.0.0.0/8
  option  redispatch  #if the server a session is bound to goes down, send the request to another healthy server
  #option httpclose   #actively close the HTTP connection after each request
  retries 3           #retry a failed connection up to 3 times; can also be set in later sections
  timeout http-request 10s
  timeout queue        1m
  timeout connect      10s
  timeout client       1m
  timeout server       1m
  timeout http-keep-alive 10s
  timeout check        10s
  maxconn 3000    #maximum number of connections
 
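frontend  main    #define the frontend
  bind *:80       #listening address

  ##enable the stats page
  stats   enable
  stats   hide-version                #hide the HAProxy version on the stats page
  stats refresh 30s                   #auto-refresh interval of the stats page
  stats uri /haproxyadmin             #URL of the stats page
  stats realm   Haproxy\ Statistics   #prompt text shown in the password dialog
                                      #note: realm takes a single string, so spaces must be escaped with a backslash
  stats auth    admin:admin           #user name and password for the stats page
  stats admin if TRUE                 #allow enabling/disabling backend servers from the stats page

  default_backend  dynamic            #the default backend is dynamic
  acl url_static   path_end -i .jpg   #ACL matching resources whose path ends in .jpg

  use_backend  static  if url_static  #if the path ends in .jpg, use the static backend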
 
backend dynamic
  balance  roundrobin  #roundrobin load-balancing algorithm
  server   dynamic 172.16.1.5:80 check

backend static
  balance  uri         #the uri algorithm is used here
  server   static  172.16.1.4:80 check

This setup uses three servers: two running httpd and one running HAProxy.
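
To make the routing decision of this configuration concrete, the sketch below models it in shell: the path is the request URI up to the question mark, it is compared case-insensitively (the -i flag), and a .jpg ending selects the static backend. The function name backend_for is hypothetical, and this is only a model of the rule, not something HAProxy runs.

```shell
#!/bin/sh
# backend_for REQUEST_URI
# Hypothetical model of "acl url_static path_end -i .jpg" followed by
# "use_backend static if url_static" and "default_backend dynamic".
backend_for() {
    path=${1%%\?*}                                          # the path ends before the question mark
    lower=$(printf '%s' "$path" | tr '[:upper:]' '[:lower:]')  # -i: ignore case
    case $lower in
        *.jpg) echo static ;;
        *)     echo dynamic ;;
    esac
}

for uri in /index.php /img/logo.jpg '/img/Logo.JPG?v=2' /photo.jpg.html; do
    printf '%s -> %s\n' "$uri" "$(backend_for "$uri")"
done
```

On the real setup, requesting a .jpg file and a non-.jpg page through HAProxy and checking which httpd access log records each request would confirm the same split.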