Kubernetes kube-scheduler
2022-07-21 01:25:00 【Zhang quandan, Foxconn quality inspector】
The apiserver's main responsibilities are authentication, authorization, and admission. It determines who initiated a request, whether the initiator has the required permissions, and whether the request is valid. If the apiserver decides that some properties of the original request need to be changed, that can also be done at this point.
Once a request has passed all of these checks and is considered valid, the apiserver saves it to etcd.
The apiserver is the only component in a k8s cluster that talks to the etcd database. All other components have to communicate with the apiserver to learn about data changes.
The kubelet consists of its own framework code plus the following interface abstractions: it abstracts the container runtime as CRI, the network as CNI, and storage as CSI.
kube-scheduler
kube-scheduler is responsible for assigning Pods to nodes in the cluster. It watches kube-apiserver for Pods that have not yet been assigned a Node, then assigns a node to each of them according to the scheduling policy (by updating the Pod's NodeName field). The scheduler has to take many factors into account:
- Fair scheduling (when many requests arrive, they should be handled fairly: everyone is equal and scheduling is first come, first served. There are also deliberately unfair factors such as scheduling priority. If one of my applications is particularly important, I want it to jump the queue, and k8s fully supports this. Pods with the same priority are treated fairly, and Pods with a higher priority are placed ahead of Pods with a lower one.)
- Efficient use of resources ( Find the most suitable node and schedule it )
- QoS
- Affinity and anti-affinity
- Data locality
- Inter-workload interference
- Deadlines
The scheduler watches the information of all compute nodes in the cluster. It needs to know how many compute nodes the cluster currently has, how healthy they are, how much of their resources is in use, and how much is still allocatable.
Each compute node reports its own information to the apiserver. The scheduler watches the apiserver to get this node information, which gives it a global view of the cluster.
On the one hand, the scheduler has a global view of the compute resources of every node in the cluster. On the other hand, it receives the user's pod, which from its point of view is a scheduling request, and looks for the best node. Finding the best node means establishing the binding between the pod and a node. In essence, it fills in the pod's nodeName field.
Users do not fill in nodeName when they create a pod, because they do not know where the pod will be scheduled. When the scheduler sees a pod whose nodeName is empty, it knows the pod needs scheduling, so it finds a suitable node and fills in nodeName.
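The "empty nodeName means unscheduled" rule can be illustrated with a minimal sketch. The `Pod` class and the `pending_pods` and `bind` helpers are hypothetical names made up for this illustration, not part of any Kubernetes client library, and the node choice is hard-coded rather than computed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pod:
    name: str
    # Stands in for pod.spec.nodeName; None means the pod is not scheduled yet.
    node_name: Optional[str] = None


def pending_pods(pods: List[Pod]) -> List[Pod]:
    """Return the pods that still need a node (empty nodeName)."""
    return [p for p in pods if not p.node_name]


def bind(pod: Pod, node: str) -> None:
    """Record the scheduling decision by filling in the node name."""
    pod.node_name = node


pods = [Pod("web"), Pod("db", node_name="node-a")]
for pod in pending_pods(pods):
    # "node-b" is a placeholder for whatever node the scheduler selected.
    bind(pod, "node-b")
```

Only `web` is touched here, because `db` already carries a node name and is therefore not a scheduling request.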
Scheduler
kube-scheduler schedules in two stages, predicate and priority:
predicate: filters out nodes that do not qualify (filter).
priority: ranks the remaining nodes and selects the one with the highest score (score).
filter: suppose there are 100 machines and you submit one pod. The scheduler first checks which nodes cannot meet the pod's requirements and filters them out.
After filtering there may still be 10 machines that can meet your needs. These then have to be ranked, and ranking means scoring each node on various factors.
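The two stages can be sketched as a single function: filter first, then score what is left. This is a toy model under stated assumptions, not the scheduler's real code. The `schedule` function, the node dictionaries, and the `free_cpu` key are all invented for the example.

```python
from typing import Callable, Dict, Optional

Node = Dict[str, int]  # hypothetical node record, e.g. {"free_cpu": 4000}


def schedule(
    nodes: Dict[str, Node],
    fits: Callable[[Node], bool],
    score: Callable[[Node], int],
) -> Optional[str]:
    # Stage 1 (predicate / filter): drop nodes that cannot run the pod.
    feasible = {name: node for name, node in nodes.items() if fits(node)}
    if not feasible:
        return None  # no node qualifies, the pod stays pending
    # Stage 2 (priority / score): pick the feasible node with the highest score.
    return max(feasible, key=lambda name: score(feasible[name]))


nodes = {
    "node-a": {"free_cpu": 500},
    "node-b": {"free_cpu": 2000},
    "node-c": {"free_cpu": 4000},
}
# The pod needs 1000 millicores; prefer the node with the most free CPU.
chosen = schedule(
    nodes,
    fits=lambda n: n["free_cpu"] >= 1000,
    score=lambda n: n["free_cpu"],
)
```

With these inputs `node-a` is removed by the filter and `node-c` wins the scoring stage.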
Predicates strategies
Scheduling has to consider many factors, and each factor is a scheduler plugin. Running the predicate stage amounts to traversing these predicate plugins and executing them one by one.
PodFitsResources: checks whether the Node has enough resources, including the allowed number of Pods, CPU, memory, number of GPUs, and other OpaqueIntResources. (First see which nodes cannot satisfy the pod's resource requests. Every machine without enough resources is filtered out.)
PodFitsHostPorts: checks whether there is a HostPort conflict.
PodFitsPorts: same as PodFitsHostPorts. (Some pods want to occupy a host port. During scheduling the scheduler checks whether that port is free. If the port is already taken, the pod cannot be placed on that node.)
HostName: checks whether pod.Spec.NodeName matches the candidate node.
MatchNodeSelector: checks whether the candidate node matches pod.Spec.NodeSelector. (The pod is only scheduled onto matching nodes.)
NoVolumeZoneConflict: checks for volume zone conflicts.
MatchInterPodAffinity: checks whether the Pod's affinity requirements are met.
NoDiskConflict: checks for Volume conflicts. Applies only to GCE PD, AWS EBS, Ceph RBD, and iSCSI.
PodToleratesNodeTaints: checks whether the Pod tolerates the Node's taints.
CheckNodeMemoryPressure: checks whether the Pod can be scheduled onto a node under MemoryPressure.
CheckNodeDiskPressure: checks whether the Pod can be scheduled onto a node under DiskPressure.
NoVolumeNodeConflict: checks whether the node satisfies the conditions of the Volumes referenced by the Pod.
There are many other strategies, and you can also write your own.
How predicate plugins work
When a pod is scheduled, the scheduler runs through the predicate plugins one by one. Each plugin filters out a batch of machines, and the machines left at the end are the ones that meet the scheduling requirements.
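The plugin-by-plugin narrowing described above can be sketched as follows. The sketch follows the article's description rather than the real scheduler's execution order. `run_predicates` and the two toy checks are hypothetical, and they only borrow the names PodFitsResources and PodFitsHostPorts as labels.

```python
from typing import Callable, Dict, List, Tuple

# (plugin name, check(pod, node) -> True if the node is acceptable)
Predicate = Tuple[str, Callable[[dict, dict], bool]]


def run_predicates(
    pod: dict, nodes: Dict[str, dict], predicates: List[Predicate]
) -> Tuple[List[str], Dict[str, str]]:
    candidates = dict(nodes)
    rejected: Dict[str, str] = {}  # node name -> plugin that filtered it out
    for plugin_name, check in predicates:
        # Each plugin only sees the nodes that survived the earlier plugins.
        for node_name in list(candidates):
            if not check(pod, candidates[node_name]):
                rejected[node_name] = plugin_name
                del candidates[node_name]
    return list(candidates), rejected


# Toy stand-ins for two of the predicates listed earlier.
predicates: List[Predicate] = [
    ("PodFitsResources", lambda pod, node: node["free_cpu"] >= pod["cpu"]),
    ("PodFitsHostPorts", lambda pod, node: pod["host_port"] not in node["used_ports"]),
]

pod = {"cpu": 500, "host_port": 8080}
nodes = {
    "node-a": {"free_cpu": 200, "used_ports": set()},
    "node-b": {"free_cpu": 1000, "used_ports": {8080}},
    "node-c": {"free_cpu": 1000, "used_ports": set()},
}
survivors, rejected = run_predicates(pod, nodes, predicates)
```

Keeping the name of the plugin that rejected each node makes it easy to explain afterwards why a pod could not be placed on a given machine.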
Priorities strategies
The priority stage also has many plugins. The scheduler goes through each plugin to compute a score, sums up the scores for each node, and ranks the node with the highest total first.
SelectorSpreadPriority: prefers nodes that run fewer Pods belonging to the same Service or Replication Controller.
InterPodAffinityPriority: prefers scheduling Pods into the same topology domain (such as the same node, rack, or zone).
LeastRequestedPriority: prefers nodes with fewer requested resources.
BalancedResourceAllocation: prefers balancing resource usage across nodes.
NodePreferAvoidPodsPriority: decided by the alpha.kubernetes.io/preferAvoidPods field. It has a weight of 10000 so that other priority strategies cannot outweigh it.
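How per-plugin scores add up to a ranking can be shown with a small weighted-sum sketch. The two scoring functions below are simplified formulas written in the spirit of LeastRequestedPriority and BalancedResourceAllocation on a 0 to 10 scale. They are assumptions for illustration and may not match the exact formulas of any particular Kubernetes release. The node fields and the weights of 1 are also invented.

```python
from typing import Callable, Dict, List, Tuple

Node = Dict[str, float]
Weighted = List[Tuple[Callable[[Node], float], int]]  # (score function, weight)


def least_requested(node: Node) -> float:
    # More free capacity (relative to total) gives a higher score.
    cpu_free = (node["cpu_cap"] - node["cpu_req"]) / node["cpu_cap"]
    mem_free = (node["mem_cap"] - node["mem_req"]) / node["mem_cap"]
    return 10 * (cpu_free + mem_free) / 2


def balanced_allocation(node: Node) -> float:
    # CPU and memory used in similar proportions gives a higher score.
    cpu_frac = node["cpu_req"] / node["cpu_cap"]
    mem_frac = node["mem_req"] / node["mem_cap"]
    return 10 - abs(cpu_frac - mem_frac) * 10


def total_score(node: Node, weighted: Weighted) -> float:
    return sum(weight * fn(node) for fn, weight in weighted)


def rank(nodes: Dict[str, Node], weighted: Weighted) -> List[str]:
    # Highest total score first.
    return sorted(nodes, key=lambda name: total_score(nodes[name], weighted), reverse=True)


weighted: Weighted = [(least_requested, 1), (balanced_allocation, 1)]
nodes = {
    "node-a": {"cpu_cap": 4000, "cpu_req": 3000, "mem_cap": 8, "mem_req": 2},
    "node-b": {"cpu_cap": 4000, "cpu_req": 1000, "mem_cap": 8, "mem_req": 2},
}
ranking = rank(nodes, weighted)
```

Here `node-b` has both more free capacity and a more even CPU-to-memory ratio, so it ranks ahead of `node-a`. A very large weight on one function, like the 10000 mentioned for NodePreferAvoidPodsPriority, would let that function dominate the sum.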
Resource requirements
CPU
requests
When Kubernetes schedules a Pod, it adds up the CPU requests of the Pods already running on the node, adds the CPU request of the Pod being scheduled, and checks whether the total exceeds the node's allocatable CPU.
limits
Used to configure the cgroup that enforces the resource limit.
Memory
requests
Used to check whether the node's remaining memory satisfies the amount of memory the Pod requests, which decides whether the Pod can be scheduled onto this node.
limits
Used to configure the cgroup that enforces the resource limit.
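The request check for CPU and memory can be sketched as a simple sum-and-compare. The `fits` function and the dictionary layout are hypothetical. CPU is expressed in millicores and memory in MiB purely as a convention for this example.

```python
from typing import Dict, List

Resources = Dict[str, int]  # e.g. {"cpu": 500, "memory": 1024}


def fits(allocatable: Resources, running: List[Resources], pod: Resources) -> bool:
    """Check whether the pod's requests still fit next to the running pods' requests."""
    for resource, capacity in allocatable.items():
        # Requests, not actual usage, are what the scheduler adds up.
        already_requested = sum(p.get(resource, 0) for p in running)
        if already_requested + pod.get(resource, 0) > capacity:
            return False
    return True


allocatable = {"cpu": 4000, "memory": 8192}
running = [
    {"cpu": 1500, "memory": 2048},
    {"cpu": 2000, "memory": 4096},
]
small_pod_fits = fits(allocatable, running, {"cpu": 500, "memory": 1024})
large_pod_fits = fits(allocatable, running, {"cpu": 600, "memory": 1024})
```

The first pod brings the CPU requests to exactly 4000 millicores and still fits. The second would push them to 4100 and is rejected, even if the node's real CPU usage is low at that moment.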
Disk resource requirements
A container's ephemeral storage holds its logs and writable-layer data. It is requested through limits.ephemeral-storage and requests.ephemeral-storage in the Pod spec.
After the Pod is scheduled, the compute node does not enforce the ephemeral storage limit through cgroups. Instead, the kubelet periodically measures the disk usage of the container's logs and writable layer, and evicts the Pod if the limit is exceeded.
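The periodic check can be pictured as a comparison of measured usage against the configured limit. This is a simplified sketch of the idea only. `containers_over_limit` and the record fields are hypothetical, sizes are in MiB, and the real kubelet logic covers more cases than logs plus writable layer per container.

```python
from typing import Dict, List, Optional


def containers_over_limit(containers: List[Dict[str, Optional[int]]]) -> List[str]:
    """Return the names of containers whose measured usage exceeds their limit."""
    offenders = []
    for c in containers:
        limit = c["limit"]
        if limit is None:
            continue  # no limits.ephemeral-storage set, nothing to enforce here
        usage = c["logs"] + c["writable_layer"]
        if usage > limit:
            offenders.append(c["name"])
    return offenders


measured = [
    {"name": "app", "logs": 300, "writable_layer": 800, "limit": 1024},
    {"name": "proxy", "logs": 10, "writable_layer": 20, "limit": 1024},
    {"name": "batch", "logs": 5000, "writable_layer": 0, "limit": None},
]
to_evict = containers_over_limit(measured)
```

Because the check runs on a timer rather than at write time, a container can briefly exceed its limit before the eviction happens.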
Init container resource requirements
Besides its main containers, a pod can also have init containers that do initialization work. Istio, for example, has an init container: once started, it configures the local iptables rules and exits when the configuration is done.
As another example, suppose an application needs a JWT token to access other applications because the applications authenticate each other. An init container fits well here because the token is obtained once. The init container fetches the token and stores it on the local disk, the disk is exposed as a volume, and the main container mounts the same volume at the same path so that it can read the token.
Most of the time, init containers preload resources for the main container. Loading configuration is a typical job for them.
- When kube-scheduler schedules a Pod with multiple init containers, it counts only the init container with the largest cpu.request, not the sum of all init containers. (Init containers can also set requests and limits.)
- Because the init containers run sequentially and each one exits as soon as it finishes, the resource requirement of the init container that requests the most covers the requirements of all init containers.
- When kube-scheduler calculates the resources occupied on a node, the init containers' resources are still included, because init containers may be executed again in some situations, for example when changing the image causes the sandbox to be rebuilt.
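The init container rule above can be written as a one-line calculation. Since init containers finish before the main containers start, the pod's effective request is the larger of the biggest init container request and the sum of the main container requests. The `effective_request` function is a hypothetical helper, values are CPU millicores, and the sketch ignores extras such as pod overhead.

```python
from typing import List


def effective_request(init_requests: List[int], app_requests: List[int]) -> int:
    """Effective pod request for one resource, e.g. CPU in millicores."""
    # Init containers run one at a time, so only the largest one matters.
    largest_init = max(init_requests, default=0)
    # Main containers run together, so their requests add up.
    app_total = sum(app_requests)
    return max(largest_init, app_total)


# A heavy init container dominates two light main containers.
heavy_init = effective_request(init_requests=[200, 1000], app_requests=[300, 400])
# A light init container is covered by the main containers' total.
light_init = effective_request(init_requests=[100], app_requests=[300, 400])
```

In the first case the scheduler has to reserve 1000 millicores for the pod even though the main containers together request only 700.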