k8s亲和性和反亲和性的理解

https://www.kancloud.cn/pshizhsysu/kubernetes/2511356

topologyKey用于指定调度时作用域,例如:
如果指定为kubernetes.io/hostname,那就是以Node节点为区分范围
如果指定为beta.kubernetes.io/os,则以Node节点的操作系统类型来区分

kubernetes.io/hostname不用的节点名称就是不同的调度区域

我们来看下面的一个案例


**PodAffinity**

PodAffinity主要实现以运行的Pod为参照,实现让新创建的Pod跟参照pod在一个区域的功能。

首先来看一下`PodAffinity`的可配置项:

~~~markdown
pod.spec.affinity.podAffinity
requiredDuringSchedulingIgnoredDuringExecution 硬限制
namespaces 指定参照pod的namespace
topologyKey 指定调度作用域
labelSelector 标签选择器
matchExpressions 按节点标签列出的节点选择器要求列表(推荐)
key 键
values 值
operator 关系符 支持In, NotIn, Exists, DoesNotExist.
matchLabels 指多个matchExpressions映射的内容
preferredDuringSchedulingIgnoredDuringExecution 软限制
podAffinityTerm 选项
namespaces
topologyKey
labelSelector
matchExpressions
key 键
values 值
operator
matchLabels
weight 倾向权重,在范围1-100
~~~

~~~markdown
topologyKey用于指定调度时作用域,例如:
如果指定为kubernetes.io/hostname,那就是以Node节点为区分范围
如果指定为beta.kubernetes.io/os,则以Node节点的操作系统类型来区分
~~~

接下来,演示下`requiredDuringSchedulingIgnoredDuringExecution`,

1)首先创建一个参照Pod,pod-podaffinity-target.yaml:

~~~yaml
apiVersion: v1
kind: Pod
metadata:
name: pod-podaffinity-target
namespace: dev
labels:
podenv: pro #设置标签
spec:
containers:
- name: nginx
image: nginx:1.17.1
nodeName: node1 # 将目标pod名确指定到node1上
~~~

~~~powershell
# 启动目标pod
[root@master ~]# kubectl create -f pod-podaffinity-target.yaml
pod/pod-podaffinity-target created

# 查看pod状况
[root@master ~]# kubectl get pods pod-podaffinity-target -n dev
NAME READY STATUS RESTARTS AGE
pod-podaffinity-target 1/1 Running 0 4s
~~~

2)创建pod-podaffinity-required.yaml,内容如下:

~~~yaml
apiVersion: v1
kind: Pod
metadata:
name: pod-podaffinity-required
namespace: dev
spec:
containers:
- name: nginx
image: nginx:1.17.1
affinity: #亲和性设置
podAffinity: #设置pod亲和性
requiredDuringSchedulingIgnoredDuringExecution: # 硬限制
- labelSelector:
matchExpressions: # 匹配env的值在["xxx","yyy"]中的标签
- key: podenv
operator: In
values: ["xxx","yyy"]
topologyKey: kubernetes.io/hostname
~~~

上面配置表达的意思是:新Pod必须要与拥有标签nodeenv=xxx或者nodeenv=yyy的pod在同一Node上,显然现在没有这样pod,接下来,运行测试一下。

~~~powershell
# 启动pod
[root@master ~]# kubectl create -f pod-podaffinity-required.yaml
pod/pod-podaffinity-required created

# 查看pod状态,发现未运行
[root@master ~]# kubectl get pods pod-podaffinity-required -n dev
NAME READY STATUS RESTARTS AGE
pod-podaffinity-required 0/1 Pending 0 9s

# 查看详细信息
[root@master ~]# kubectl describe pods pod-podaffinity-required -n dev
......
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling <unknown> default-scheduler 0/3 nodes are available: 2 node(s) didn't match pod affinity rules, 1 node(s) had taints that the pod didn't tolerate.

# 接下来修改 values: ["xxx","yyy"]----->values:["pro","yyy"]
# 意思是:新Pod必须要与拥有标签nodeenv=xxx或者nodeenv=yyy的pod在同一Node上
[root@master ~]# vim pod-podaffinity-required.yaml

# 然后重新创建pod,查看效果
[root@master ~]# kubectl delete -f pod-podaffinity-required.yaml
pod "pod-podaffinity-required" deleted
[root@master ~]# kubectl create -f pod-podaffinity-required.yaml
pod/pod-podaffinity-required created

# 发现此时Pod运行正常
[root@master ~]# kubectl get pods pod-podaffinity-required -n dev
NAME READY STATUS RESTARTS AGE LABELS
pod-podaffinity-required 1/1 Running 0 6s <none>
~~~

关于`PodAffinity`的 `preferredDuringSchedulingIgnoredDuringExecution`,这里不再演示。

piVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx-affinity-test
  namespace: dev
spec:
  serviceName: nginx-service
  replicas: 2
  selector:
    matchLabels:
      service: nginx
  template:
    metadata:
      name: nginx222
      labels:
        service: nginx
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: service
                operator: In
                values:
                - nginx
            topologyKey: zone
      containers:
      - name: nginx
        image: nginx:1.17.1
        command:
        - sleep
        - "360000000"

例子:

需求:当前有两个机房( beijing,shanghai),需要部署一个nginx产品,副本为两个,为了保证机房容灾高可用场景,需要在两个机房分别部署一个副本

 我们给两个主机打上标签

解释:两个node上分别有zone标签,来标注自己属于哪个机房,topologyKey定义为zone,pod所以在调度的时候,会根据node上zone标签来区分拓扑域,当前用的上 反亲和性调度 根据拓扑纬度调度,beijing机房调度完一个pod后,然后控制器判断beijing 拓扑域已经有server=nginx标签的pod,就在下一个拓扑域的node上调度了,所以两个niginx pod 分别部署在node1和node2两个节点上面



原文地址:https://www.cnblogs.com/kebibuluan/p/15715371.html