一. topologyKey (Topology Domain)
Anyone familiar with Pod affinity and anti-affinity scheduling will have at least heard of topologyKey. Here we verify topologyKey and see what it actually does.
First, what does topologyKey mean, and what problem does it solve? The official explanation is quoted below.
topologyKey is the key of node labels. If two Nodes are labelled with this key and have identical values for that label, the scheduler treats both Nodes as being in the same topology. The scheduler tries to place a balanced number of Pods into each topology domain.
In other words, nodes whose labels share the same value for this key are treated as being in the same topology domain, and the scheduler tries to place a balanced number of Pods in each domain. The official notion of a topology domain is a bit abstract, so here is an example.
node1 labels
node.role/China=Beijing node.role.China/Beijing=Chaoyang
node2 labels
node.role/China=Beijing node.role.China/Beijing=Changping
node3 labels
node.role/China=Beijing node.role.China/Beijing=Tongzhou
Official example
defaultConstraints:
  - maxSkew: 3
    topologyKey: "kubernetes.io/hostname"
All three nodes carry the same label node.role/China=Beijing, so with topologyKey=node.role/China they are treated as one domain, and when the workload is deployed only one Pod is scheduled into that domain (given the required anti-affinity used in the verification below).
The nodes carry different values for node.role.China/Beijing=XX, so with topologyKey=node.role.China/Beijing there are three domains, and one Pod is scheduled into each of them (with replicas >= 2). The official default uses topologyKey=kubernetes.io/hostname, whose value is the node's hostname (in this cluster, the node IP); node hostnames cannot repeat within a Kubernetes cluster, so by default there are as many domains as there are nodes.
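To make the domain count concrete, here is a minimal sketch (not from the original post) of a hypothetical Deployment named demo that spreads its replicas with a topology spread constraint; the name, the label app: demo and the nginx image are placeholders. With topologyKey: node.role/China the three example nodes form a single domain, so the constraint imposes no real spreading; switching the topologyKey to kubernetes.io/hostname would give each node its own domain and force roughly one replica per node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo                           # hypothetical Deployment, for illustration only
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: node.role/China   # all three example nodes share this value => one domain
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: demo
      containers:
      - name: demo
        image: nginx                   # placeholder image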
A topology domain has no fixed type: geographic location, service, hostname, host IP, node capability, and so on can all define different topology domains. In my understanding, the core value of topology domains is service high availability: deploying multiple Pod replicas across different domains.
二. Verification Cases
2.1 Three Topology Domains (Kafka)
2.1.1 Node Labels
[root@node1 ~]# kubectl get nodes -L node-role.kubernetes.io/node
NAME STATUS ROLES AGE VERSION NODE
10.22.33.31 Ready master 2d15h v1.21.6
10.22.33.32 Ready master 2d15h v1.21.6
10.22.33.33 Ready master 2d15h v1.21.6
10.22.33.34 Ready node 2d15h v1.21.6 node4
10.22.33.35 Ready node 2d15h v1.21.6 node5
10.22.33.36 Ready node 2d15h v1.21.6 node6
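The NODE column above comes from the node-role.kubernetes.io/node label. A label layout like this could have been created roughly as follows (a sketch; node names and values are taken from the output above):
# Give each worker a distinct value, yielding three domains for topologyKey node-role.kubernetes.io/node
kubectl label node 10.22.33.34 node-role.kubernetes.io/node=node4
kubectl label node 10.22.33.35 node-role.kubernetes.io/node=node5
kubectl label node 10.22.33.36 node-role.kubernetes.io/node=node6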
2.1.2 Pod YAML
spec:
  # https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app.deploy.service
            operator: In
            values:
            - zookeeper
        topologyKey: node-role.kubernetes.io/node
---
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app.deploy.service
            operator: In
            values:
            - kafka
        topologyKey: node-role.kubernetes.io/node
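With requiredDuringSchedulingIgnoredDuringExecution and this topologyKey, once a ZooKeeper (or Kafka) Pod is running in one domain, no other Pod matching the same selector may be scheduled onto any node whose node-role.kubernetes.io/node value is the same. For this to work, the Pod templates themselves must carry the matched label; assuming the chart sets it in the StatefulSet template (an assumption, the label key is taken from the selector above), it would look roughly like this:
  template:
    metadata:
      labels:
        app.deploy.service: zookeeper   # must match the labelSelector of the podAntiAffinity rule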
2.1.3 Deploy
[root@node1 charts]# helm install me kafka/
NAME: me
LAST DEPLOYED: Sun May 15 10:51:36 2022
NAMESPACE: default
STATUS: deployed
REVISION: 1
[root@node1 charts]# kubectl get pods -n me-zookeeper -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
me-zookeeper-0 1/1 Running 0 56s 10.244.16.54 10.22.33.34 <none> <none>
me-zookeeper-1 1/1 Running 0 56s 10.244.17.40 10.22.33.36 <none> <none>
me-zookeeper-2 1/1 Running 0 56s 10.244.21.42 10.22.33.35 <none> <none>
[root@node1 charts]# kubectl get pods -n me-kafka -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
me-kafka-0 1/1 Running 3 2m40s 10.244.16.56 10.22.33.34 <none> <none>
me-kafka-1 1/1 Running 3 2m40s 10.244.17.42 10.22.33.36 <none> <none>
me-kafka-2 1/1 Running 3 2m40s 10.244.21.44 10.22.33.35 <none> <none>
Both ZooKeeper and Kafka scheduled one Pod into each domain, which is the expected result. (Kafka restarted 3 times because its readiness window is shorter than ZooKeeper's liveness window, i.e., Kafka finished starting while the ZooKeeper cluster was not yet ready.)
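One possible way (not from the original post) to cross-check which domain each Pod landed in is to look up the node-role.kubernetes.io/node label of every node that hosts a ZooKeeper Pod:
# NODE is the 7th column of `kubectl get pods -o wide`; print each hosting node with its domain label
kubectl get pods -n me-zookeeper -o wide --no-headers | awk '{print $7}' \
  | xargs -I{} kubectl get node {} -L node-role.kubernetes.io/node --no-headers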
2.2 One Topology Domain (Kafka)
2.2.1 Check Node Labels
[root@node1 charts]# kubectl get nodes -L node-role.kubernetes.io/node
NAME STATUS ROLES AGE VERSION NODE
10.22.33.31 Ready master 2d15h v1.21.6
10.22.33.32 Ready master 2d15h v1.21.6
10.22.33.33 Ready master 2d15h v1.21.6
10.22.33.34 Ready node 2d15h v1.21.6 node
10.22.33.35 Ready node 2d15h v1.21.6 node
10.22.33.36 Ready node 2d15h v1.21.6 node
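The single-domain layout above could be produced by overwriting the label with the same value on every worker (a sketch; node names and the value node come from the output above):
# Same value on all workers => a single topology domain for topologyKey node-role.kubernetes.io/node
kubectl label node 10.22.33.34 10.22.33.35 10.22.33.36 node-role.kubernetes.io/node=node --overwrite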
2.2.2 Deploy
[root@node1 charts]# helm install me kafka/
NAME: me
LAST DEPLOYED: Sun May 15 11:05:00 2022
NAMESPACE: default
STATUS: deployed
REVISION: 1
[root@node1 charts]# kubectl get pods -n me-zookeeper -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
me-zookeeper-0 0/1 Running 0 33s 10.244.16.58 10.22.33.34 <none> <none>
me-zookeeper-1 0/1 Pending 0 33s <none> <none> <none> <none>
me-zookeeper-2 0/1 Pending 0 33s <none> <none> <none> <none>
[root@node1 charts]# kubectl get pods -n me-kafka -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
me-kafka-0 0/1 Error 2 37s 10.244.21.45 10.22.33.35 <none> <none>
me-kafka-1 0/1 Pending 0 37s <none> <none> <none> <none>
me-kafka-2 0/1 Pending 0 37s <none> <none> <none> <none>
ZooKeeper and Kafka each got only one Pod scheduled into the single node domain; the remaining replicas could only go to the master nodes, and since the masters are tainted those Pods stay Pending, which is the expected result.
Confirm the master taints:
[root@node1 charts]# for i in 31 32 33;do echo "10.22.33.$i" `kubectl describe node 10.22.33.$i | grep Taint`; done
10.22.33.31 Taints: node-role.kubernetes.io/master:NoSchedule
10.22.33.32 Taints: node-role.kubernetes.io/master:NoSchedule
10.22.33.33 Taints: node-role.kubernetes.io/master:NoSchedule
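To see why the extra replicas stay Pending, kubectl describe can be run against one of them; the Events section reports the nodes filtered out by the anti-affinity rule and by the master taints (the exact wording varies by Kubernetes version):
# Inspect the scheduling events of a Pending replica (namespace/name taken from the output above)
kubectl describe pod me-zookeeper-1 -n me-zookeeper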