k8s Scheduling Methods

Deployment scheduling

Pods managed by a Deployment or RC controller are scheduled automatically by the system: which node each pod ultimately runs on is decided entirely by the scheduler on the master node through a series of algorithms, and the user cannot influence the scheduling process or its result, so it will not be demonstrated here.

NodeSelector: Targeted Scheduling

In a real production environment we may need a pod to run on a particular node. In that case we use targeted scheduling to pin the pod to a specific node, here node2. The steps are as follows:

Step 1: apply a label to the node2 node

[root@master ~]# kubectl label node node2 app=release
You can verify the label with kubectl get nodes --show-labels

Step 2: write the deploy.yaml file, defining a nodeSelector so that all 3 pods run on nodes labeled app=release

[root@master ~]# vim deploy.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: release
  template:
    metadata:
      name: mypod
      namespace: default
      labels:
        app: release
    spec:
      nodeSelector:
        app: release
      containers:
      - name: mycontainer
        image: liwang7314/myapp:v1
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 80

Step 3: apply the file to create the pods and observe which nodes they run on; we can see all 3 pods running on node2

[root@master ~]# kubectl create -f deploy.yaml 
deployment.apps/myapp created
[root@master ~]# kubectl get pods -o wide
NAME                    READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
myapp-95ff9459c-8g6tc   1/1     Running   0          4s    10.244.2.8   node2   <none>           <none>
myapp-95ff9459c-ghxxx   1/1     Running   0          4s    10.244.2.7   node2   <none>           <none>
myapp-95ff9459c-s5pt9   1/1     Running   0          4s    10.244.2.6   node2   <none>           <none>

NodeAffinity: Node Affinity Scheduling

NodeAffinity is the node-affinity scheduling policy, a newer mechanism intended to replace NodeSelector. There are currently two ways to express node affinity:

  1. requiredDuringSchedulingIgnoredDuringExecution: the specified rules must be satisfied before the Pod can be scheduled onto a Node (similar in function to NodeSelector, but with different syntax); this is a hard constraint
  2. preferredDuringSchedulingIgnoredDuringExecution: the scheduler tries to place the Pod on a Node that satisfies the rules, but does not insist on it; this is a soft constraint. When there are multiple preferred rules, a weight value can be set on each to define their relative priority
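As a sketch of how multiple weighted soft rules can be combined (the label keys disktype and zone here are hypothetical, not from this article):

```yaml
# Hypothetical affinity fragment: two soft rules with different weights.
# The scheduler sums the weights of all rules a node satisfies and
# prefers the node with the highest total score.
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 80                      # higher weight = stronger preference
      preference:
        matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd
    - weight: 20
      preference:
        matchExpressions:
        - key: zone
          operator: In
          values:
          - zone1
```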

We define an affinity.yaml file with 3 pods, using requiredDuringSchedulingIgnoredDuringExecution to force the pods onto node2, and then preferredDuringSchedulingIgnoredDuringExecution to prefer keeping them off node2, as follows:

Step 1: write the yaml file

[root@master ~]# cat affinity.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: release
  template:
    metadata:
      name: mypod
      namespace: default
      labels:
        app: release
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: app
                operator: In
                values: 
                - release
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            preference:
              matchExpressions:
              - key: app
                operator: NotIn
                values: 
                - release
      containers:
      - name: mycontainer
        image: liwang7314/myapp:v1
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 80

Step 2: create the pods, then check which node they land on; all of them are scheduled onto node2

[root@master ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE     IP            NODE    NOMINATED NODE   READINESS GATES
myapp-6fcfb98879-566z5   1/1     Running   0          3m24s   10.244.2.10   node2   <none>           <none>
myapp-6fcfb98879-5r6cm   1/1     Running   0          3m24s   10.244.2.11   node2   <none>           <none>
myapp-6fcfb98879-7kwwq   1/1     Running   0          3m24s   10.244.2.9    node2   <none>           <none>

If we remove the requiredDuringSchedulingIgnoredDuringExecution block and look again, the pods are now scheduled onto node1, although node2 still gets one pod, because preferredDuringSchedulingIgnoredDuringExecution is only a preference, not a requirement:

[root@master ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
myapp-55679db465-h5ftb   1/1     Running   0          11s   10.244.1.76   node1   <none>           <none>
myapp-55679db465-kj58w   1/1     Running   0          11s   10.244.1.75   node1   <none>           <none>
myapp-55679db465-mwfjq   1/1     Running   0          11s   10.244.2.12   node2   <none>           <none>

Notes on NodeAffinity rules:

  1. If both nodeSelector and nodeAffinity are defined, both conditions must be satisfied before the Pod can be scheduled onto a Node
  2. If nodeAffinity specifies multiple nodeSelectorTerms, matching any one of them is enough for scheduling to succeed
  3. If a nodeSelectorTerms entry contains multiple matchExpressions, a node must satisfy all of them for the Pod to run there
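Rules 2 and 3 above can be sketched in one required block (the label keys zone and disktype are hypothetical, used only to illustrate the OR/AND semantics):

```yaml
# Hypothetical required rule: the two nodeSelectorTerms are ORed,
# while the matchExpressions inside one term are ANDed.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:             # term 1: zone=zone1 AND disktype=ssd
        - key: zone
          operator: In
          values:
          - zone1
        - key: disktype
          operator: In
          values:
          - ssd
      - matchExpressions:             # term 2: app=release (matching this
        - key: app                    # term alone is enough to schedule)
          operator: In
          values:
          - release
```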

Taints and Tolerations

The NodeAffinity mechanism described above is a property defined on the pod that draws it toward certain Nodes (as a preference or a hard requirement). A Taint is the opposite: it lets a Node repel Pods.

Taints work together with Tolerations to keep Pods away from unsuitable Nodes. Once one or more Taints are set on a Node, no Pod can run on it unless the Pod explicitly declares that it tolerates those taints. A Toleration is a Pod property that allows (but, note, does not force) the Pod to run on a Node carrying matching Taints.

A taint can be set as follows so that pods will not be scheduled onto the node:

[root@master ~]# kubectl taint node node2 app=release:NoSchedule
node/node2 tainted

Then declare a Toleration on the Pod:

[root@master ~]# cat pod.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: mypod
  namespace: default
spec:
  tolerations:
  - key: "app"
    operator: "Equal"
    value: "release"
    effect: "NoSchedule"
  containers:
  - name: mycontainer
    image: liwang7314/myapp:v1
    imagePullPolicy: IfNotPresent

Checking where the pod runs, we see it can still be scheduled onto node2, because we tolerate this taint:

[root@master ~]# kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
mypod   1/1     Running   0          72s   10.244.2.14   node2   <none>           <none>

We can also apply a NoExecute taint to a node; pods whose tolerations do not cover that taint are then evicted, and new Pods will not be scheduled there either. First check the pod's current state:

[root@master ~]# kubectl get pods -o wide                      
NAME    READY   STATUS              RESTARTS   AGE   IP       NODE    NOMINATED NODE   READINESS GATES
mypod   0/1     ContainerCreating   0          2s    <none>   node1   <none>           <none>

Then apply a NoExecute taint to that node and watch: the pod is evicted:

[root@master ~]# kubectl taint node node1 app=release:NoExecute
node/node1 tainted
[root@master ~]# kubectl get pods -o wide                      
NAME    READY   STATUS        RESTARTS   AGE    IP            NODE    NOMINATED NODE   READINESS GATES
mypod   1/1     Terminating   0          106s   10.244.1.82   node1   <none>           <none>
[root@master ~]# kubectl get pods -o wide
No resources found in default namespace.

To remove a taint, just append a "-" to the key:

[root@master ~]# kubectl taint node node1 app-
node/node1 untainted

Notes:

  • operator can be Exists, in which case no value needs to be specified
  • operator can be Equal, in which case the taint's value must match
  • if operator is not specified, it defaults to Equal
  • effect can be NoSchedule, PreferNoSchedule (meaning "try to avoid scheduling here"), or NoExecute
  • multiple taints and multiple tolerations can be defined
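For example, per the first note above, a toleration using Exists needs no value field. This sketch reuses the app / NoSchedule taint set earlier in this article:

```yaml
# Toleration matching any taint with key "app" and effect NoSchedule,
# regardless of the taint's value ("operator: Exists" omits "value").
tolerations:
- key: "app"
  operator: "Exists"
  effect: "NoSchedule"
```

Omitting effect as well would make the toleration match app taints with any effect.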
Original article: https://www.cnblogs.com/fengzi7314/p/12466738.html