motan's thread protection strategy

Background:

  In production, the ne-account service has both a high call volume and a high QPS. During a release, the motan log printed the following error:

2018-09-14 12:18:19 [ERROR] ThreadProtectedRequestRouter reject request: request_method=XXXXXX request_counter=76 =76 max_thread=100

  The author's reply on GitHub: https://github.com/weibocom/motan/issues/551

Reproduction:

  Configure the motan service MotanDemoService with 4 methods; the handler for hello1 simply sleeps for 1s

2018-09-14 13:58:27 [INFO] add method sign:hell817ff3733269, methodinfo:MethodInfo [group=motan-demo-rpc, interfaceName=com.weibo.motan.demo.service.MotanDemoService, methodName=hello, paramtersDesc=java.lang.String, version=1.0]
2018-09-14 13:58:27 [INFO] add method sign:helld1f0f2c9182d, methodinfo:MethodInfo [group=motan-demo-rpc, interfaceName=com.weibo.motan.demo.service.MotanDemoService, methodName=hello2, paramtersDesc=java.lang.String, version=1.0]
2018-09-14 13:58:27 [INFO] add method sign:hell52228017a74a, methodinfo:MethodInfo [group=motan-demo-rpc, interfaceName=com.weibo.motan.demo.service.MotanDemoService, methodName=hello4, paramtersDesc=java.lang.String, version=1.0]
2018-09-14 13:58:27 [INFO] add method sign:hellaea7504ac806, methodinfo:MethodInfo [group=motan-demo-rpc, interfaceName=com.weibo.motan.demo.service.MotanDemoService, methodName=hello3, paramtersDesc=java.lang.String, version=1.0]

  Service configuration:

<motan:protocol id="demoMotan" default="true" name="motan"
                maxServerConnection="80000" maxContentLength="1048576"
                maxWorkerThread="100" minWorkerThread="100" threads="100" />

  The motan service is exposed on port 8002. Start a client and send 1000 requests to the server.
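The test client itself is not shown in this post. A minimal sketch of such a load driver follows, assuming a fixed-size thread pool sets the concurrency; the class LoadDriverSketch, the run helper and the stub call are hypothetical and stand in for the real motan referer call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical load driver; not taken from the motan demo client.
class LoadDriverSketch {

    /** Runs `total` calls on `concurrency` threads and returns {success, error}. */
    static int[] run(int concurrency, int total, Callable<?> call) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        AtomicInteger success = new AtomicInteger();
        AtomicInteger error = new AtomicInteger();
        for (int i = 0; i < total; i++) {
            pool.submit(() -> {
                try {
                    call.call();
                    success.incrementAndGet();
                } catch (Exception e) {
                    // A rejected request reaches the client as an exception.
                    error.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);
        return new int[] {success.get(), error.get()};
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the remote call: sleeps briefly instead of hitting a real server.
        Callable<String> stub = () -> {
            Thread.sleep(5);
            return "hello";
        };
        int[] result = run(75, 1000, stub);
        System.out.println("motan demo is finish. success: " + result[0] + " error: " + result[1]);
    }
}
```

Swapping the stub for a call on the real service proxy, and changing the first argument of run, would give the 75 and 80 concurrency scenarios below.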

Summary of results:

    • Test scenario 1
      Call the server with a concurrency of 75, 1000 requests in total. Result:
      motan demo is finish. success: 1000 error: 0
    • Test scenario 2
      Call the server with a concurrency of 80, 1000 requests in total. Result:
      motan demo is finish. success: 150 error: 850
      Server-side error messages:

2018-09-14 14:58:03 [ERROR] ThreadProtectedRequestRouter reject request: request_method=com.weibo.motan.demo.service.MotanDemoService.hello request_counter=76 =76 max_thread=100
2018-09-14 14:58:03 [ERROR] ThreadProtectedRequestRouter reject request: request_method=com.weibo.motan.demo.service.MotanDemoService.hello request_counter=76 =76 max_thread=100
2018-09-14 14:58:03 [ERROR] ThreadProtectedRequestRouter reject request: request_method=com.weibo.motan.demo.service.MotanDemoService.hello request_counter=76 =76 max_thread=100
2018-09-14 14:58:04 [ERROR] ThreadProtectedRequestRouter reject request: request_method=com.weibo.motan.demo.service.MotanDemoService.hello request_counter=76 =76 max_thread=100
2018-09-14 14:58:04 [ERROR] ThreadProtectedRequestRouter reject request: request_method=com.weibo.motan.demo.service.MotanDemoService.hello request_counter=78 =78 max_thread=100
2018-09-14 14:58:04 [ERROR] ThreadProtectedRequestRouter reject request: request_method=com.weibo.motan.demo.service.MotanDemoService.hello request_counter=77 =77 max_thread=100

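  Summary: this matches the author's description. Once concurrency exceeds 3/4 * 100 (the worker thread count) = 75, motan's protection (circuit-breaking) mechanism is triggered and a large number of requests are rejected. The client reports the following error: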

com.weibo.api.motan.exception.MotanServiceException: error_message: RoundRobinLoadBalance No available referers for call : referers_size=1 requestId=1611565139528515715 interface=com.weibo.motan.demo.service.MotanDemoService method=hello(java.lang.String), status: 503, error_code: 10001,r=null
    at com.weibo.api.motan.cluster.loadbalance.AbstractLoadBalance.selectToHolder(AbstractLoadBalance.java:84)
    at com.weibo.api.motan.cluster.ha.FailoverHaStrategy.selectReferers(FailoverHaStrategy.java:90)
    at com.weibo.api.motan.cluster.ha.FailoverHaStrategy.call(FailoverHaStrategy.java:53)
    at com.weibo.api.motan.cluster.support.ClusterSpi.call(ClusterSpi.java:73)
    at com.weibo.api.motan.proxy.RefererInvocationHandler.invoke(RefererInvocationHandler.java:132)
    at com.sun.proxy.$Proxy10.hello(Unknown Source)
    at com.weibo.motan.demo.client.DemoRpcCli  

Code:

  Class ProviderProtectedMessageRouter

protected boolean isAllowRequest(int requestCounter, int totalCounter, int maxThread, Request request) {
    if (methodCounter.get() == 1) {
        return true;
    }
 
    // First in-flight request for this method: allow it directly
    if (requestCounter == 1) {
        return true;
    }
 
    // Do not simply check requestCounter > (maxThread / 2): if 2 or 3 methods are exposed but only one
    // of them is called heavily while the others are idle, a single method may go up to maxThread * 3 / 4
    if (requestCounter > (maxThread / 2) && totalCounter > (maxThread * 3 / 4)) {
        return false;
    }
 
    // If the total thread count exceeds maxThread * 3 / 4 and many methods are exposed, overall load is high,
    // so reject any single method that exceeds maxThread * 1 / 4
    return !(methodCounter.get() >= 4 && totalCounter > (maxThread * 3 / 4) && requestCounter > (maxThread * 1 / 4));
 
}  
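To make the thresholds concrete, here is a standalone restatement of the rule above, evaluated against the counters seen in the test logs. This is only a sketch: the class name ThreadProtectionSketch and the explicit methodCount parameter (replacing the methodCounter field) are illustrative, and it assumes both counters already include the request being checked.

```java
// Standalone restatement of the admission rule quoted above; the class name and
// the methodCount parameter are illustrative, not part of motan's API.
class ThreadProtectionSketch {

    static boolean isAllowRequest(int requestCounter, int totalCounter, int maxThread, int methodCount) {
        if (methodCount == 1) {
            return true; // only one exposed method: no per-method protection
        }
        if (requestCounter == 1) {
            return true; // first in-flight request for this method
        }
        // One hot method: rejected once the total passes 3/4 of the pool.
        if (requestCounter > maxThread / 2 && totalCounter > maxThread * 3 / 4) {
            return false;
        }
        // Four or more methods under overall pressure: cap each method at 1/4 of the pool.
        return !(methodCount >= 4 && totalCounter > maxThread * 3 / 4 && requestCounter > maxThread / 4);
    }

    public static void main(String[] args) {
        int maxThread = 100;
        int methods = 4;
        // Scenario 1: 75 in flight, all on the same method.
        System.out.println(isAllowRequest(75, 75, maxThread, methods)); // true
        // Scenario 2: the counters from the reject log lines.
        System.out.println(isAllowRequest(76, 76, maxThread, methods)); // false
        System.out.println(isAllowRequest(78, 78, maxThread, methods)); // false
        // Load spread over several methods: total 80, this method at 30.
        System.out.println(isAllowRequest(30, 80, maxThread, methods)); // false
        // Same counters with only 3 exposed methods: the last rule does not apply.
        System.out.println(isAllowRequest(30, 80, maxThread, 3));       // true
    }
}
```

With maxThread=100 the integer thresholds are 50, 75 and 25, which is why request_counter=76 is the first value to appear in the reject log.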

Solutions:

  Besides the 3 points mentioned by the author:

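1. Improve business processing efficiency to reduce the time spent on each request;
2. Add server nodes to lower the QPS per server;
3. If the server performs well, increase the number of worker threads, for example to 1500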

  For interfaces that are hard to change, the high-traffic interface needs to be split out onto its own protocol, as follows:

<motan:protocol id="demoMotan" default="true" name="motan"
                maxServerConnection="80000" maxContentLength="1048576"
                maxWorkerThread="100" minWorkerThread="100" threads="100" />
<motan:protocol id="demoMotanSingleMethod" default="true" name="motan"
                maxServerConnection="80000" maxContentLength="1048576"
                maxWorkerThread="100" minWorkerThread="100" threads="100" /> 

The interface is expected to withstand a concurrency of 100. The test result:

motan demo is finish. success: 1000 error: 0
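One plausible reading of this result: if the split-out protocol serves exactly one method, the methodCounter.get() == 1 branch of isAllowRequest admits every request, so only the pool size limits concurrency. That is an assumption about how the split service is exported, not something the logs above confirm. The sketch below restates the rule and computes how far a single hot method can go for a given pool size and method count; ConcurrencyCeilingSketch and ceiling are hypothetical names.

```java
// Hypothetical helper that derives the admitted concurrency of one hot method
// from the rule quoted earlier; not part of motan.
class ConcurrencyCeilingSketch {

    // Same rule as ProviderProtectedMessageRouter.isAllowRequest, with the
    // method count passed in explicitly.
    static boolean isAllowRequest(int requestCounter, int totalCounter, int maxThread, int methodCount) {
        if (methodCount == 1 || requestCounter == 1) {
            return true;
        }
        if (requestCounter > maxThread / 2 && totalCounter > maxThread * 3 / 4) {
            return false;
        }
        return !(methodCount >= 4 && totalCounter > maxThread * 3 / 4 && requestCounter > maxThread / 4);
    }

    /** Largest in-flight count (capped at maxThread) for which a lone hot method is still admitted. */
    static int ceiling(int maxThread, int methodCount) {
        int n = 0;
        // All traffic is on one method, so requestCounter == totalCounter.
        while (n < maxThread && isAllowRequest(n + 1, n + 1, maxThread, methodCount)) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(ceiling(100, 4));  // 75: the shared protocol from the reproduction
        System.out.println(ceiling(100, 1));  // 100: a protocol serving a single method
        System.out.println(ceiling(1500, 4)); // 1125: the author's suggestion of 1500 threads
    }
}
```

Under these assumptions, raising the pool to 1500 threads moves the ceiling to 1125, while isolating the hot method removes the 3/4 limit altogether.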

Original article: https://www.cnblogs.com/jing-yi/p/14169267.html