YARN FairScheduler: Principles and Configuration

MAPREDUCE-3451 brought the FairScheduler into 2.0.2-alpha. This article introduces the FairScheduler in Hadoop 2.0.2-alpha, covering its scheduling algorithm and how to configure it.

Code

The code is in the org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair package. Its main classes, with a brief description of each, are:

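1. AllocationConfigurationException: thrown when the configuration file contains an error.

2. AppSchedulable: extends Schedulable; an entity that can submit tasks.

3. FairScheduler: the main body of the scheduler.

4. FairSchedulerConfiguration: the configuration constants and their default values.

5. FairSchedulerEventLog: wraps a LOG and writes scheduling events to a designated log file.

6. FifoAppComparator: implements Comparator; compares two AppSchedulables, first by Priority, then by startTime, and finally by ApplicationId.

7. FSQueue: holds the information about a queue in the FairScheduler.

8. FSQueueSchedulable: extends Schedulable; an entity that can submit tasks.

9. FSSchedulerApp: extends SchedulerApp; from the scheduler's point of view, every application running in the RM is an instance of this class.

10. NewJobWeightBooster: implements the WeightAdjuster interface; used to update the weight of an AppSchedulable.

11. QueueManager: maintains the set of queues and provides methods to update it; called by the FairScheduler.

12. Schedulable: an entity that can submit tasks.

13. SchedulingAlgorithms: a utility class containing the scheduling algorithms used by the fair scheduler.

14. SchedulingMode: an enum with the values FAIR and FIFO; the scheduling mode used within each queue.

15. WeightAdjuster: the interface for adjusting weights.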
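
The FIFO ordering described in item 6 can be illustrated with a small standalone sketch. It is not the Hadoop FifoAppComparator itself: AppInfo is a hypothetical stand-in for AppSchedulable, the application id is reduced to a plain int, and treating a smaller priority value as more urgent is an assumption made only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for AppSchedulable: only the three fields the
// ordering needs. This is not a Hadoop class.
class AppInfo {
    final int priority;    // assumption: a smaller value sorts first
    final long startTime;  // submission time in milliseconds
    final int appId;       // simplified stand-in for ApplicationId

    AppInfo(int priority, long startTime, int appId) {
        this.priority = priority;
        this.startTime = startTime;
        this.appId = appId;
    }
}

class FifoOrderSketch {
    // Compare by priority first, then start time, then application id.
    static final Comparator<AppInfo> FIFO =
        Comparator.comparingInt((AppInfo a) -> a.priority)
                  .thenComparingLong(a -> a.startTime)
                  .thenComparingInt(a -> a.appId);

    // Returns the application ids in the order the comparator puts them.
    static List<Integer> order(List<AppInfo> apps) {
        List<AppInfo> sorted = new ArrayList<>(apps);
        sorted.sort(FIFO);
        List<Integer> ids = new ArrayList<>();
        for (AppInfo a : sorted) {
            ids.add(a.appId);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<AppInfo> apps = Arrays.asList(
            new AppInfo(1, 200L, 1),
            new AppInfo(1, 100L, 3),
            new AppInfo(0, 300L, 2));
        System.out.println(order(apps)); // prints [2, 3, 1]
    }
}
```

App 2 comes first because of its priority value; apps 3 and 1 share a priority, so the earlier start time decides between them.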

How the FairScheduler works

When an NM (NodeManager) sends a heartbeat to the RM (ResourceManager), the scheduler's nodeUpdate() method is called. The flow is as follows:

1. Process newly launched containers

2. Process completed containers

3. Assign new containers

a) Check for reserved applications

If an application has already reserved a resource on this node, we try to complete the reservation.

b) Schedule if there are no reservations: schedule at the queue that is furthest below its fair share.

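i. First get all the queues (getQueueSchedulables) and sort them with SchedulingAlgorithms.FairShareComparator.

ii. Then, starting from the first queue, offer the resource to it; the queue in turn distributes the resource among its own applications.

iii. If nothing was assigned to the first queue, offer the resource to the next queue. If a queue did receive an assignment, go back to step i. The loop exits when no queue receives an assignment, when a repeated assignment is not allowed, or when the number of assignments exceeds maxAssign.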
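
Steps i to iii can be sketched as a small loop. This is a simplified model, not the nodeUpdate() code: QueueSketch and NodeSketch are hypothetical classes, sorting by memory used stands in for FairShareComparator, every container has the same size, and treating maxAssign <= 0 as "no limit" is an assumption. It models only two of the exit conditions: nothing assigned in a pass, and the maxAssign limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical node: only tracks free memory in MB.
class NodeSketch {
    int freeMb;
    NodeSketch(int freeMb) { this.freeMb = freeMb; }
}

// Hypothetical queue: memory in use and number of containers still wanted.
class QueueSketch {
    final String name;
    int usedMb;
    int pendingContainers;

    QueueSketch(String name, int usedMb, int pendingContainers) {
        this.name = name;
        this.usedMb = usedMb;
        this.pendingContainers = pendingContainers;
    }

    // Try to place one container on the node; true if it was placed.
    boolean tryAssign(NodeSketch node, int containerMb) {
        if (pendingContainers == 0 || node.freeMb < containerMb) {
            return false;
        }
        pendingContainers--;
        usedMb += containerMb;
        node.freeMb -= containerMb;
        return true;
    }
}

class AssignLoopSketch {
    // Returns the queue names in the order they received containers.
    static List<String> assign(List<QueueSketch> queues, NodeSketch node,
                               int containerMb, int maxAssign) {
        List<String> log = new ArrayList<>();
        int assigned = 0;
        while (true) {
            // Step i: re-sort on every pass (stand-in ordering: least used first).
            List<QueueSketch> sorted = new ArrayList<>(queues);
            sorted.sort(Comparator.comparingInt((QueueSketch q) -> q.usedMb));
            boolean assignedThisPass = false;
            // Steps ii and iii: offer to each queue in turn, stop at the first success.
            for (QueueSketch q : sorted) {
                if (q.tryAssign(node, containerMb)) {
                    log.add(q.name);
                    assigned++;
                    assignedThisPass = true;
                    break;
                }
            }
            if (!assignedThisPass) break;                      // nobody could use the node
            if (maxAssign > 0 && assigned >= maxAssign) break; // per-heartbeat limit
        }
        return log;
    }

    public static void main(String[] args) {
        List<QueueSketch> queues = Arrays.asList(
            new QueueSketch("a", 0, 1),
            new QueueSketch("b", 1024, 5));
        NodeSketch node = new NodeSketch(4096);
        System.out.println(assign(queues, node, 1024, 3)); // prints [a, b, b]
    }
}
```

Queue a is served first because it uses the least memory; once it has nothing pending, the remaining passes fall through to queue b until the limit of three assignments is hit.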

The SchedulingAlgorithms.FairShareComparator ordering

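For two queues, the ordering rules are:

1. If one needs resources and the other does not, the one that needs resources comes first.

2. If both need resources, compare the ratio of memory used to minShare; the smaller ratio comes first (that is, try to make sure each queue reaches its minShare).

3. If the ratios are equal, compute the ratio of usage to weight; the smaller comes first, so a larger weight or a smaller usage is favored.

4. If they are still equal, the one submitted earlier comes first, and then the one with the smaller app id.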
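
The four rules can be written down as a comparator. The following is a sketch of the rules as listed here, not a copy of the Hadoop source: SchedSketch is a hypothetical class, "needs resources" is assumed to mean that usage is below the smaller of minShare and demand, and the name is used as the last tie-break in place of an app id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical schedulable with just the fields the rules mention.
class SchedSketch {
    final String name;
    final int usedMb, minShareMb, demandMb;
    final double weight;
    final long startTime;

    SchedSketch(String name, int usedMb, int minShareMb, int demandMb,
                double weight, long startTime) {
        this.name = name;
        this.usedMb = usedMb;
        this.minShareMb = minShareMb;
        this.demandMb = demandMb;
        this.weight = weight;
        this.startTime = startTime;
    }

    // Assumption: minShare is capped by demand before comparing.
    int effectiveMinShare() { return Math.min(minShareMb, demandMb); }
    boolean needy() { return usedMb < effectiveMinShare(); }
    double minShareRatio() { return (double) usedMb / Math.max(effectiveMinShare(), 1); }
    double useToWeightRatio() { return usedMb / weight; }
}

class FairShareOrderSketch {
    static final Comparator<SchedSketch> FAIR_SHARE = (s1, s2) -> {
        boolean n1 = s1.needy();
        boolean n2 = s2.needy();
        int res = 0;
        if (n1 && !n2) {
            res = -1;                                       // rule 1
        } else if (n2 && !n1) {
            res = 1;                                        // rule 1
        } else if (n1 && n2) {
            res = Double.compare(s1.minShareRatio(), s2.minShareRatio()); // rule 2
        }
        if (res == 0) {
            res = Double.compare(s1.useToWeightRatio(), s2.useToWeightRatio()); // rule 3
        }
        if (res == 0) {
            res = Long.compare(s1.startTime, s2.startTime); // rule 4: earlier first
        }
        if (res == 0) {
            res = s1.name.compareTo(s2.name);               // rule 4: final tie-break
        }
        return res;
    };

    // Returns the names in scheduling order.
    static List<String> order(List<SchedSketch> scheds) {
        List<SchedSketch> sorted = new ArrayList<>(scheds);
        sorted.sort(FAIR_SHARE);
        List<String> names = new ArrayList<>();
        for (SchedSketch s : sorted) {
            names.add(s.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<SchedSketch> scheds = Arrays.asList(
            new SchedSketch("d", 2000, 1000, 5000, 1.0, 4L),
            new SchedSketch("c", 3000, 1000, 5000, 2.0, 3L),
            new SchedSketch("a", 500, 1000, 2000, 1.0, 1L),
            new SchedSketch("b", 200, 1000, 2000, 1.0, 2L));
        System.out.println(order(scheds)); // prints [b, a, c, d]
    }
}
```

Queues a and b are below their minShare, so they go ahead of c and d; b is at 20% of its minShare and a at 50%, so b wins. Between c and d, c uses more memory but has twice the weight, giving it the lower usage-to-weight ratio (1500 against 2000).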

Configuration

Add the following property to the yarn-site.xml file in the RM's configuration directory:

<property> 
  <name>yarn.resourcemanager.scheduler.class</name> 
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value> 
</property>

Create a fair-scheduler.xml file in the RM's configuration directory with the following content:

<?xml version="1.0"?> 
<allocations> 
  <queue name="sample_queue"> 
    <minResources>1000</minResources> 
    <maxResources>9000</maxResources> 
    <maxRunningApps>50</maxRunningApps> 
    <weight>2.0</weight> 
    <schedulingMode>fair</schedulingMode> 
    <aclSubmitApps> sample_queue,yuling.sh</aclSubmitApps> 
    <aclAdministerApps> sample_queue,yuling.sh</aclAdministerApps> 
  </queue> 
  <queue name="default"> 
    <minResources>1000</minResources> 
    <maxResources>9000</maxResources> 
    <maxRunningApps>50</maxRunningApps> 
    <weight>2.0</weight> 
    <schedulingMode>fair</schedulingMode> 
    <aclSubmitApps> yuling.sh</aclSubmitApps> 
    <aclAdministerApps> a</aclAdministerApps> 
  </queue>

  <userMaxAppsDefault>5</userMaxAppsDefault> 
</allocations>
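
Before handing such a file to the RM, it can be useful to check that it is well-formed and that the numbers make sense. The sketch below does this with the JDK's own XML parser; it is not the scheduler's loader, AllocationFileCheck is a hypothetical helper, and the rule that minResources must not exceed maxResources is a sanity check assumed for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical standalone checker for an allocations file.
class AllocationFileCheck {
    // Returns queue name -> {minResources, maxResources}, in file order.
    static Map<String, int[]> readQueues(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("allocations file is not well-formed XML", e);
        }
        Element root = doc.getDocumentElement();
        if (!"allocations".equals(root.getTagName())) {
            throw new IllegalArgumentException("root element must be <allocations>");
        }
        Map<String, int[]> result = new LinkedHashMap<>();
        NodeList queues = root.getElementsByTagName("queue");
        for (int i = 0; i < queues.getLength(); i++) {
            Element queue = (Element) queues.item(i);
            int min = intChild(queue, "minResources");
            int max = intChild(queue, "maxResources");
            if (min > max) { // assumed sanity rule, see the note above
                throw new IllegalArgumentException(
                    "queue " + queue.getAttribute("name") + ": minResources > maxResources");
            }
            result.put(queue.getAttribute("name"), new int[] {min, max});
        }
        return result;
    }

    // Reads the integer text of the first child element with the given tag.
    static int intChild(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            throw new IllegalArgumentException("missing <" + tag + ">");
        }
        return Integer.parseInt(nodes.item(0).getTextContent().trim());
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>"
            + "<allocations>"
            + "<queue name=\"sample_queue\">"
            + "<minResources>1000</minResources><maxResources>9000</maxResources>"
            + "</queue>"
            + "<userMaxAppsDefault>5</userMaxAppsDefault>"
            + "</allocations>";
        for (Map.Entry<String, int[]> e : readQueues(xml).entrySet()) {
            // prints sample_queue: 1000..9000
            System.out.println(e.getKey() + ": " + e.getValue()[0] + ".." + e.getValue()[1]);
        }
    }
}
```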

Note that in YARN, the check of which groups may submit a job is now implemented inside the scheduler.

Please credit the source when reposting: http://www.cnblogs.com/shenh062326/archive/2012/12/09/2810010.html