CN114356543A - A Kubernetes-based Multi-tenant Machine Learning Task Resource Scheduling Method - Google Patents
- Publication number
- CN114356543A (application number CN202111460970.6A)
- Authority
- CN
- China
- Prior art keywords
- node
- resource
- gpu
- cpu
- machine learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a Kubernetes-based multi-tenant machine learning task resource scheduling method. It applies quota management to the computing resources available to each user while monitoring the resource status of every Node in the Kubernetes platform, taking into account the resource utilization of each Node's host machine to avoid inaccurate scheduling results. By monitoring real-time and pre-scheduling request information, it ranks the Nodes by priority according to the scheduling task's resource demands, obtains the host label of the optimal Node, and allocates resources to the various machine learning model training and prediction tasks according to that label. The invention effectively prevents and reduces skewed node resource usage in the Kubernetes platform, achieves load balancing across nodes, and improves node resource utilization.
Description
Technical Field

The invention relates to a Kubernetes-based multi-tenant machine learning task resource scheduling method, and belongs to the technical field of power grid dispatching and control.
Background Art

The application of artificial intelligence in power grid dispatching and control has achieved preliminary results, but computing power management still faces the problem of scattered resources, which constrains application breakthroughs. Each application deploys its own siloed ("chimney-style") AI development and runtime environment, leading to duplicated construction of the underlying hardware, fragmented computing power, and poor scalability.

The IaaS layer of cloud computing platforms mainly relies on virtualization to achieve multi-tenant resource isolation and dynamic allocation. However, traditional virtualization itself consumes a significant share of hardware resources and is ill-suited to the high-utilization scenarios of machine learning model training and prediction; moreover, the complexity of configuring, running, and managing applications on it is high, which hampers unified cluster management.

Kubernetes is popular with developers for its ability to automatically orchestrate, deploy, and schedule services. The present invention performs custom resource orchestration and scheduling based on Kubernetes to support the development of the AI application development and service support platform in the new-generation dispatching technical support system. It is used for resource scheduling of machine learning training and prediction tasks such as grid fault identification and analysis, grid operation forecasting and analysis, and intelligent dispatching decision support; the application results verify the technical approach and reliability of the invention.
SUMMARY OF THE INVENTION

Objective: To overcome the deficiencies of the prior art, the present invention provides a Kubernetes-based multi-tenant machine learning task resource scheduling method. It uses Kubernetes and container technology to uniformly manage the IaaS-layer CPU, GPU, and memory resources and builds a standardized runtime environment for multi-tenant machine learning model training and prediction applications, improving the controllability, elastic scalability, and resource isolation of the power grid dispatching and control system.

Technical solution: To solve the above technical problems, the technical solution adopted by the present invention is as follows:

A Kubernetes-based multi-tenant machine learning task resource scheduling method comprises the following steps:
Compute, for each Node in the cluster, the difference between the Node's used resources and the resources used by its created containers, yielding the resources occupied by the Node operating system's own processes.

Call the Kubernetes API to obtain the resources requested by all machine learning model training and prediction task containers on the Node.

Subtract from the Node's total resource capacity both the resources occupied by the operating system's own processes and the resources requested by all training and prediction containers on the Node, yielding the Node's real-time available resources.

From the Node's real-time available resources and its total resource capacity, compute the availability rates of the Node's CPU, GPU, and memory.

The system cluster resource management service presets a resource threshold percentage; only Nodes whose CPU, GPU, and memory availability rates are all at or above this threshold may be assigned computing resources for training and prediction tasks.

The machine learning task scheduling service sends the CPU, GPU, and memory amounts requested by each user's training and prediction tasks to the system cluster resource management service.

The system cluster resource management service computes the difference between the multi-tenant resource quota table and the user resource usage table to obtain the remaining resources the user may request, and verifies that the CPU, GPU, and memory amounts requested by the task do not exceed that remainder.

If the request does not exceed the user's remaining quota, the system cluster resource management service subtracts the requested CPU, GPU, and memory amounts from each candidate Node's real-time available resources and divides the result by the Node's total capacity, obtaining the percentage of CPU, GPU, and memory that would remain after allocation.

Select the Nodes whose post-allocation remaining percentages of CPU, GPU, and memory all exceed the preset resource threshold percentage, compute a score for each such Node from its remaining percentages, and sort the Nodes by score in descending order.

The system cluster resource management service takes the first Node in the sorted sequence as the optimal node, returns its node name to the machine learning task scheduling service, and persists the allocation in the user resource usage table.

The machine learning task scheduling service dynamically generates a Kubernetes YAML file and calls the Kubernetes API to create a container on the optimal node to run the machine learning model training or prediction task.
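The node-selection steps above can be sketched compactly as follows. This is a minimal illustration, not the patented implementation: per-node metrics are assumed to be already collected, and all node figures, names, and the 10% threshold are made-up assumptions.

```python
# Minimal sketch of the selection steps above: filter out nodes that would
# drop below the preset threshold, score the rest, pick the best.
THRESHOLD = 0.10  # hypothetical preset resource threshold percentage

def schedule(nodes, req):
    """Return the name of the optimal Node, or None if scheduling fails."""
    best = None
    for name, n in nodes.items():
        # Remaining fraction of each resource if the request were allocated.
        rem = {r: (n[r] - req[r]) / n[r + "_total"]
               for r in ("cpu", "gpu", "mem")}
        if min(rem.values()) < THRESHOLD:      # node would be overloaded
            continue
        score = sum(req[r] * rem[r] for r in rem)
        if best is None or score > best[0]:
            best = (score, name)
    return best[1] if best else None

nodes = {
    "node-a": {"cpu": 32, "cpu_total": 64, "gpu": 2, "gpu_total": 4,
               "mem": 32, "mem_total": 64},
    "node-b": {"cpu": 8,  "cpu_total": 64, "gpu": 1, "gpu_total": 4,
               "mem": 16, "mem_total": 64},
}
# node-b would fall to (8-4)/64 ~ 6% remaining CPU, below the threshold,
# so node-a is chosen.
assert schedule(nodes, {"cpu": 4, "gpu": 1, "mem": 8}) == "node-a"
```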
Preferably, a CPU, GPU, and memory usage collection program is deployed on every Kubernetes Node in the cluster.

Preferably, the user ID is used as the Kubernetes namespace to logically partition and isolate the virtual resource pool formed from the Nodes' resource capacities.

Preferably, the multi-tenant resource quota table is as follows:

Preferably, the user resource usage table is as follows:

Preferably, Kubernetes role-based access control grants each user access only to the namespaces that user may operate on.
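Such per-tenant access control could be expressed, for example, as a Role/RoleBinding pair confined to the user's namespace. The sketch below only builds the dicts one would serialize to YAML; the user name, verbs, and resource list are illustrative assumptions, not taken from the patent.

```python
# Illustrative per-tenant RBAC objects: a Role scoped to the tenant's
# namespace and a RoleBinding granting that Role to the tenant's user.
def tenant_rbac(user_id):
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"namespace": user_id, "name": f"{user_id}-role"},
        "rules": [{"apiGroups": [""], "resources": ["pods"],
                   "verbs": ["get", "list", "create", "delete"]}],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"namespace": user_id, "name": f"{user_id}-binding"},
        "subjects": [{"kind": "User", "name": user_id}],
        "roleRef": {"kind": "Role", "name": f"{user_id}-role",
                    "apiGroup": "rbac.authorization.k8s.io"},
    }
    return role, binding

role, binding = tenant_rbac("user1")
assert role["metadata"]["namespace"] == "user1"
assert binding["roleRef"]["name"] == "user1-role"
```

Because both objects are namespaced, a tenant granted only this binding cannot touch pods in another tenant's namespace, which is the isolation property the paragraph above describes.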
Preferably, the Kubernetes cluster comprises the following components: API Server, Controller Manager, Scheduler, Kubelet, Kube-proxy, Etcd, and Container runtime.

Preferably, the score of each Node is computed from its post-allocation remaining percentages of CPU, GPU, and memory as follows:
Score_i = request_cpu × percent_cpu_i + request_gpu × percent_gpu_i + request_mem × percent_mem_i

where Score_i is the score of the i-th Node; percent_cpu_i, percent_gpu_i, and percent_mem_i are the percentages of CPU, GPU, and memory remaining on the i-th Node after allocation; and request_cpu, request_gpu, and request_mem are the CPU, GPU, and memory amounts requested by the task.
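The scoring formula can be written directly as a function. The request amounts and remaining fractions below are illustrative numbers only; note how weighting by the request makes the score favor nodes with headroom in the resources the task actually needs most.

```python
# Score_i = request_cpu*percent_cpu_i + request_gpu*percent_gpu_i
#         + request_mem*percent_mem_i
def score(request_cpu, request_gpu, request_mem,
          percent_cpu, percent_gpu, percent_mem):
    return (request_cpu * percent_cpu
            + request_gpu * percent_gpu
            + request_mem * percent_mem)

# Two candidate nodes for a task requesting 4 CPUs, 1 GPU, 8 units of memory:
s1 = score(4, 1, 8, 0.50, 0.25, 0.40)   # 4*0.50 + 1*0.25 + 8*0.40
s2 = score(4, 1, 8, 0.30, 0.75, 0.20)   # 4*0.30 + 1*0.75 + 8*0.20
assert s1 > s2   # the first node ranks higher: more spare CPU and memory,
                 # which this memory/CPU-heavy request weights most
```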
Beneficial effects: The Kubernetes-based multi-tenant machine learning task resource scheduling method provided by the invention applies quota management to the computing resources available to each user while monitoring the resource status of every Node in the Kubernetes platform, taking into account the resource utilization of each Node's host machine to avoid inaccurate scheduling results. By monitoring real-time and pre-scheduling request information, it ranks the Nodes by priority according to the scheduling task's resource demands, obtains the host label of the optimal Node, and allocates resources to the various machine learning model training and prediction tasks according to that label. It thereby effectively prevents and reduces skewed node resource usage in the Kubernetes platform, achieves load balancing across nodes, and improves node resource utilization.
Description of Drawings

FIG. 1 is a schematic diagram of multi-tenant cluster resource management in an example of the invention.

FIG. 2 is a schematic diagram of the Kubernetes cluster resource management architecture in an example of the invention.

FIG. 3 is a flowchart of machine learning training and prediction task creation in an embodiment of the invention.

Detailed Description

The invention is further described below with reference to specific embodiments.
A Kubernetes-based multi-tenant machine learning task resource scheduling method comprises the following steps:

1) Compute the difference between the resources used by the Node (node_cpu_used_i, node_gpu_used_i, and node_mem_used_i) and the resources used by its created containers (pod_cpu_used_i, pod_gpu_used_i, and pod_mem_used_i) to obtain the resources occupied by the node operating system's own processes.

2) Call the Kubernetes API to obtain the resources requested by all machine learning model training and prediction task containers on the node (pod_cpu_req_i, pod_gpu_req_i, and pod_mem_req_i).

3) Subtract the two values above from the Node's total resource capacity (node_cpu_total_i, node_gpu_total_i, and node_mem_total_i) to obtain the Node's real-time available resources (node_cpu_i, node_gpu_i, and node_mem_i).

4) Compute the availability rates of each Node's CPU, GPU, and memory with the following formulas:
percent_cpu_i = node_cpu_i / node_cpu_total_i

percent_gpu_i = node_gpu_i / node_gpu_total_i

percent_mem_i = node_mem_i / node_mem_total_i
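Steps 1) to 4) amount to a small amount of arithmetic per node. The sketch below shows it for a single resource; quantities are plain numbers in one unit per resource (cores, GPU count, memory units), and all values are illustrative.

```python
# Steps 1-4 for one resource on one node: derive the OS overhead from
# node vs. container usage, then the real-time available amount and the
# availability rate.
def available(total, node_used, pods_used, pods_requested):
    os_overhead = node_used - pods_used   # processes outside any container
    return total - os_overhead - pods_requested

def percent(avail, total):
    return avail / total

# CPU on node i: 64 cores total, 30 in use overall, 26 of those by pods,
# and 40 cores requested by training/prediction containers.
node_cpu = available(64, 30, 26, 40)
assert node_cpu == 20                     # 64 - (30 - 26) - 40
assert percent(node_cpu, 64) == 0.3125    # availability rate ~31%
```

Using the containers' *requested* amounts (rather than their current usage) keeps the availability figure conservative: resources a pod has reserved but is not yet consuming are still treated as taken.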
5) The system cluster resource management service presets a resource threshold percentage; nodes whose availability falls below it are no longer assigned computing resources for training and prediction tasks, ensuring that no Node becomes overloaded.

6) The machine learning task scheduling service sends the CPU, GPU, and memory amounts required by each user's training and prediction tasks (request_cpu, request_gpu, and request_mem) to the system cluster resource management service.

7) The system cluster resource management service computes the difference between the multi-tenant resource quota table and the user resource usage table to obtain the remaining resources the user may request, and verifies that the request_cpu, request_gpu, and request_mem amounts requested by the task do not exceed that remainder.
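The quota check in step 7) can be sketched as follows. The database tables are replaced here by in-memory dicts, and the user name, field names, and numbers are assumptions for illustration.

```python
# Hedged sketch of step 7: remaining quota = per-tenant quota minus recorded
# usage; a request passes only if every resource fits within the remainder.
quota = {"user1": {"cpu": 32, "gpu": 4, "mem": 65536}}   # quota table stand-in
usage = {"user1": {"cpu": 20, "gpu": 1, "mem": 40000}}   # usage table stand-in

def remaining(user):
    return {k: quota[user][k] - usage[user][k] for k in quota[user]}

def check_request(user, request):
    rem = remaining(user)
    return all(request[k] <= rem[k] for k in request)

assert remaining("user1") == {"cpu": 12, "gpu": 3, "mem": 25536}
assert check_request("user1", {"cpu": 8, "gpu": 2, "mem": 20000})       # fits
assert not check_request("user1", {"cpu": 16, "gpu": 1, "mem": 1024})   # CPU over
```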
8) The system cluster resource management service subtracts the requested request_cpu, request_gpu, and request_mem amounts from the Node's real-time available resources (node_cpu_i, node_gpu_i, and node_mem_i) and divides the result by the Node's total capacity, obtaining the percentage of resources remaining after allocation:
percent_cpu_i = (node_cpu_i - request_cpu) / node_cpu_total_i

percent_gpu_i = (node_gpu_i - request_gpu) / node_gpu_total_i

percent_mem_i = (node_mem_i - request_mem) / node_mem_total_i
Nodes whose post-allocation remaining resource percentage is below the preset resource threshold percentage are filtered out; the remaining nodes are then scored and sorted by score:
Score_i = request_cpu × percent_cpu_i + request_gpu × percent_gpu_i + request_mem × percent_mem_i
9) The system cluster resource management service selects the highest-scoring Node as the optimal node, returns its node name to the machine learning task scheduling service, and persists the allocation in the user resource usage table.

10) The machine learning task scheduling service dynamically generates a Kubernetes YAML file and calls the Kubernetes API to create a container on the optimal node to run the machine learning model training or prediction task.
The purpose of the invention is to use Kubernetes and container technology to uniformly manage the IaaS-layer CPU, GPU, memory, and storage resources, build a standardized runtime environment for multi-tenant machine learning model training and prediction applications, and improve the controllability, elastic scalability, and resource isolation of the power grid system. The technical solution in the embodiments of the invention is described clearly and completely below with reference to the accompanying drawings:

The Nodes in the cluster are labeled according to their available CPU, GPU, memory, and storage resources; Kubernetes integrates the cluster's resources into a single resource pool, and the user ID is used as the Kubernetes namespace (Namespace) to logically partition and isolate the virtual resource pool, as shown in FIG. 1.

The system administrator allocates resources to each user through the cluster's multi-tenant resource management interface tool; this information is persisted in the multi-tenant resource quota table, as shown in Table 1. Kubernetes role-based access control (RBAC) grants each user access only to the namespaces that user may operate on, preventing users' resource usage from interfering with one another.

A CPU, GPU, and memory usage collection program is deployed on every Kubernetes Node in the cluster, and from the collected information the available resources and availability rates of all Nodes are computed.
Table 1. Multi-tenant resource quota table

Table 2. User resource usage table
FIG. 2 is a schematic diagram of the Kubernetes cluster resource management architecture of this embodiment, in which the cluster consists of 2 Master nodes and 6 Nodes. The Master node is the cluster's main control unit and is mainly responsible for scheduling and management; to cope with growing project demands and access volume, this embodiment builds a high-availability configuration with dual Master nodes. The Nodes are workload nodes that run the business application containers and are divided into a CPU cluster and a GPU cluster: the CPU cluster mainly creates regular pod tasks, while the GPU cluster mainly creates pod tasks involving image computation. This dual-cluster arrangement lets the deployed applications run more rationally and efficiently.

The Kubernetes cluster comprises seven main components: API Server, Controller Manager, Scheduler, Kubelet, Kube-proxy, Etcd, and Container runtime, which cooperate to run the whole cluster. The scheduling strategy of the invention chiefly acts on the Scheduler by computing an evaluation score for real-time and scheduled tasks on each Node. The score covers two aspects: on one hand the actual usage of the Node's own resources, and on the other the pod's relative demand for CPU, GPU, and memory. The strategy comprehensively evaluates each Node for real-time and scheduled tasks, selects the Node with the highest score as the target scheduling node, bypasses the Scheduler's default predicate and priority policies, and creates the pod directly on the designated Node by setting a unique label. FIG. 3 shows the pod task creation flow of this embodiment; the specific procedure is as follows:

Step 1: Obtain the CPU, GPU, and memory usage of the host machine of each Node in the Kubernetes platform, together with the CPU, GPU, and memory usage and request allocations of the pods on that node, and from this information compute each Node's available resources and availability rates.

First, the host's resource usage outside the pod containers is obtained as the difference between the host's usage and the pods' usage; next, the pods' actual allocated resources are obtained, and subtracting both quantities from the total capacity yields the Node's actual availability. The actual available CPU, GPU, and memory resources of all Nodes are computed with the following formulas:
node_cpu_i = node_cpu_total_i - (host_cpu_used_i - pod_cpu_used_i) - pod_cpu_req_i

node_mem_i = node_mem_total_i - (host_mem_used_i - pod_mem_used_i) - pod_mem_req_i

node_gpu_i = node_gpu_total_i - (host_gpu_used_i - pod_gpu_used_i) - pod_gpu_req_i
where node_cpu_i, node_mem_i, and node_gpu_i are the actual available CPU, memory, and GPU resources of the Node; node_cpu_total_i, node_mem_total_i, and node_gpu_total_i are the Node's total CPU, memory, and GPU capacities; host_cpu_used_i, host_mem_used_i, and host_gpu_used_i are the host machine's CPU, memory, and GPU usage; pod_cpu_used_i, pod_mem_used_i, and pod_gpu_used_i are the CPU, memory, and GPU usage of the pods on the Node; and pod_cpu_req_i, pod_mem_req_i, and pod_gpu_req_i are the resource request allocations of the pods on the Node.

The availability rates of each Node's CPU, GPU, and memory are computed with the following formulas:
percent_cpu_i = node_cpu_i / node_cpu_total_i

percent_mem_i = node_mem_i / node_mem_total_i

percent_gpu_i = node_gpu_i / node_gpu_total_i
Step 2: Compare each Node's CPU, GPU, and memory availability rates against the preset threshold. A node below the threshold is considered overloaded and is filtered out. If no nodes pass the filter, scheduling fails; if at least one node passes, proceed to Step 3.

Step 3: Through the K8s scheduler, obtain the CPU, GPU, and memory requests of the real-time or scheduled task pod (request_cpu, request_gpu, and request_mem) together with the user ID. Look up the user's remaining resources by user ID and compare them with the request to decide whether the pod may be created; if not, scheduling fails, otherwise proceed to the next step.

Step 4: Compare the task resource request obtained in Step 3 against each Node's available resources and filter out the Nodes with insufficient CPU, GPU, or memory. If no nodes remain, scheduling fails; if exactly one node remains, that Node is set as the host for the pod to be created; if more than one node remains, proceed to the next step.

Step 5: Score the remaining Nodes by computing, with the following formulas, the percentage of resources that would remain on each Node after the requested task's resources are allocated:
percent_cpu_i = (node_cpu_i - request_cpu) / node_cpu_total_i

percent_gpu_i = (node_gpu_i - request_gpu) / node_gpu_total_i

percent_mem_i = (node_mem_i - request_mem) / node_mem_total_i
Nodes whose post-allocation remaining resource percentage is below the preset threshold are excluded; for the remaining nodes, the post-allocation remaining percentages of CPU, GPU, and memory are accumulated and the nodes sorted.

The Nodes are prioritized and the optimal node is determined from the ranking: if one node remains, it is the optimal node and its label is obtained; if more than one remains, the optimal Node is selected according to the ranking and its label obtained. Finally, the pod is started by specifying that label in the machine learning task's YAML file.
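Pinning the pod to the labeled node comes down to emitting a manifest with a nodeSelector matching that label. The sketch below builds only the dict one would dump to the task's YAML file; the label key, container image, and resource names are illustrative assumptions (nvidia.com/gpu is the usual extended-resource name for NVIDIA GPUs, but the patent does not specify one).

```python
# Hedged sketch of the final step: a pod manifest that targets the chosen
# node via a label selector, in the tenant's namespace.
def build_pod_manifest(task_name, user_id, node_label, req):
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": task_name, "namespace": user_id},
        "spec": {
            # Hypothetical label key carrying the optimal node's label.
            "nodeSelector": {"scheduler/target": node_label},
            "containers": [{
                "name": task_name,
                "image": "ml-train:latest",   # hypothetical training image
                "resources": {"requests": {
                    "cpu": str(req["cpu"]),
                    "memory": f'{req["mem"]}Gi',
                    "nvidia.com/gpu": str(req["gpu"]),
                }},
            }],
        },
    }

pod = build_pod_manifest("train-job-1", "user1", "node-a",
                         {"cpu": 4, "gpu": 1, "mem": 8})
assert pod["spec"]["nodeSelector"]["scheduler/target"] == "node-a"
assert pod["metadata"]["namespace"] == "user1"
```

Serializing this dict (e.g. with a YAML library) and submitting it through the Kubernetes API reproduces the "generate YAML, create container on the optimal node" step, with the default scheduler constrained to the pre-selected node.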
The above are only preferred embodiments of the invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the invention, and such improvements and refinements shall also be regarded as falling within the protection scope of the invention.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111460970.6A CN114356543B (en) | 2021-12-02 | 2021-12-02 | A multi-tenant machine learning task resource scheduling method based on Kubernetes |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114356543A true CN114356543A (en) | 2022-04-15 |
CN114356543B CN114356543B (en) | 2025-01-28 |
Family
ID=81096598
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111460970.6A Active CN114356543B (en) | 2021-12-02 | 2021-12-02 | A multi-tenant machine learning task resource scheduling method based on Kubernetes |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114356543B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180074855A1 (en) * | 2016-09-14 | 2018-03-15 | Cloudera, Inc. | Utilization-aware resource scheduling in a distributed computing cluster |
CN109885389A (en) * | 2019-02-19 | 2019-06-14 | Shandong Inspur Cloud Information Technology Co., Ltd. | A container-based parallel deep learning scheduling and training method and system
CN113157379A (en) * | 2020-01-22 | 2021-07-23 | Hitachi, Ltd. | Cluster node resource scheduling method and device
US20210365290A1 (en) * | 2020-04-16 | 2021-11-25 | Nanjing University Of Posts And Telecommunications | Multidimensional resource scheduling method in kubernetes cluster architecture system
CN112418438A (en) * | 2020-11-24 | 2021-02-26 | NARI Technology Co., Ltd. | Container-based machine learning procedural training task execution method and system
Non-Patent Citations (1)
Title |
---|
XIE, Wenzhou; SUN, Yanxia: "Research on a Resource Prediction Model Based on Kubernetes Load Characteristics", Network Security Technology and Application, no. 004, 31 December 2018 (2018-12-31) *
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114780245A (en) * | 2022-05-07 | 2022-07-22 | Bank of China Limited | Server resource allocation method and device
CN114661482A (en) * | 2022-05-25 | 2022-06-24 | Chengdu Sobey Digital Technology Co., Ltd. | GPU computing power management method, medium, equipment and system
CN115145704A (en) * | 2022-06-02 | 2022-10-04 | Suzhou Sicui Industrial Internet Technology Research Institute Co., Ltd. | Pod scheduling method and system based on genetic algorithm
CN115098238A (en) * | 2022-07-07 | 2022-09-23 | Beijing Dingcheng Zhizao Technology Co., Ltd. | Application program task scheduling method and device
CN115098238B (en) * | 2022-07-07 | 2023-05-05 | Beijing Dingcheng Zhizao Technology Co., Ltd. | Application program task scheduling method and device
CN115237608A (en) * | 2022-09-21 | 2022-10-25 | Zhejiang Lab | A multi-mode scheduling system and method based on multi-cluster unified computing power
CN115604362A (en) * | 2022-09-30 | 2023-01-13 | Suzhou Inspur Intelligent Technology Co., Ltd. | Scheduling management method and device based on Kubernetes
CN115604362B (en) * | 2022-09-30 | 2024-06-21 | Suzhou Inspur Intelligent Technology Co., Ltd. | Scheduling management method and device based on Kubernetes
CN115373764A (en) * | 2022-10-27 | 2022-11-22 | Zhongcheng Hualong Computer Technology Co., Ltd. | Automatic container loading method and device
WO2024114483A3 (en) * | 2022-11-28 | 2024-08-02 | Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences | Resource allocation method and network based on dynamic programming, and storage medium and processor
CN118069379A (en) * | 2024-04-24 | 2024-05-24 | Kylin Software Co., Ltd. | Scheduling realization method based on GPU resources
CN118069379B (en) * | 2024-04-24 | 2024-06-18 | Kylin Software Co., Ltd. | Scheduling realization method based on GPU resources
CN118502969A (en) * | 2024-07-17 | 2024-08-16 | Beijing Kedong Electric Power Control System Co., Ltd. | K8S platform-based multi-scene training exercise application building method and system
Also Published As
Publication number | Publication date |
---|---|
CN114356543B (en) | 2025-01-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114356543A (en) | A Kubernetes-based Multi-tenant Machine Learning Task Resource Scheduling Method | |
CN113342477B (en) | Container group deployment method, device, equipment and storage medium | |
US7945913B2 (en) | Method, system and computer program product for optimizing allocation of resources on partitions of a data processing system | |
CN104503838B (en) | A virtual CPU scheduling method | |
CN106095569B (en) | An SLA-based cloud workflow engine resource scheduling and control method | |
CN113064712B (en) | Micro-service optimization deployment control method, system and cluster based on cloud edge environment | |
CN105446816B (en) | An energy-optimized scheduling method for heterogeneous platforms | |
CN110221920B (en) | Deployment method, device, storage medium and system | |
CN107346264A (en) | A virtual machine load-balancing scheduling method, apparatus, and server device | |
CN114996018A (en) | Resource scheduling method, node, system, device and medium for heterogeneous computing | |
CN104679594B (en) | A middleware distributed computing method | |
CN102968344A (en) | Method for migration scheduling of multiple virtual machines | |
CN113672391B (en) | Parallel computing task scheduling method and system based on Kubernetes | |
CN102708003A (en) | Method for allocating resources under cloud platform | |
CN114968566A (en) | Container scheduling method and device under shared GPU cluster | |
CN114625500B (en) | Topology-aware microservice application scheduling method and application in cloud environment | |
CN114911613B (en) | Cross-cluster resource high-availability scheduling method and system in inter-cloud computing environment | |
CN106371893A (en) | Cloud computing scheduling system and method | |
CN112559122A (en) | Virtualization instance management and control method and system based on electric power special security and protection equipment | |
CN108694083B (en) | Data processing method and device for server | |
CN116708454A (en) | Multi-cluster cloud computing system and multi-cluster job distribution method | |
CN114968601B (en) | Scheduling method and scheduling system for AI training jobs with resources reserved in proportion | |
CN110084507B (en) | A hierarchical-aware scientific workflow scheduling optimization method in cloud computing environment | |
CN107992351B (en) | Hardware resource allocation method and device and electronic equipment | |
CN112416520B (en) | Intelligent resource scheduling method based on vSphere |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||