CN116501446B - Kubernetes cluster deployment method and system, and electronic device - Google Patents
Kubernetes cluster deployment method and system, and electronic device
- Publication number
- CN116501446B, CN202310686910.9A
- Authority
- CN
- China
- Prior art keywords
- gpu
- target
- kubernetes
- cluster
- virtualized
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
- G06F8/65—Updates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44521—Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading
- G06F9/44526—Plug-ins; Add-ons
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Security & Cryptography (AREA)
- Stored Programmes (AREA)
- Testing And Monitoring For Control Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Technical Field
The present disclosure relates to the field of computer technology, and in particular to a Kubernetes cluster deployment method and system, and an electronic device.
Background
In cloud computing data center scenarios, Kubernetes clusters and graphics processing unit (GPU) devices are generally installed with shell scripts, following the corresponding documentation and guides. However, because a Kubernetes cluster involves many Kubernetes components and must serve multiple GPU application types, a fast and simple Kubernetes cluster deployment method is urgently needed.
Summary of the Invention
The present disclosure proposes a technical solution for a Kubernetes cluster deployment method and system, and an electronic device.
According to one aspect of the present disclosure, a Kubernetes cluster deployment method is provided, including: in Kubernetes, watching, based on a first target Operator, for an updated custom resource configuration file, wherein the custom resource configuration file is used to execute a target Kubernetes cluster operation; and, when an updated custom resource configuration file is detected, executing the target Kubernetes cluster operation according to a predefined task pipeline corresponding to the target Kubernetes cluster operation.
In a possible implementation, a task pipeline includes a plurality of sequentially arranged functional modules, and each functional module includes at least one task; each task specifies a target operation location in the Kubernetes and one operation to be performed at that target operation location.
In a possible implementation, executing the target Kubernetes cluster operation according to the predefined task pipeline corresponding to the target Kubernetes cluster operation includes: when the target Kubernetes cluster operation is a containerized GPU driver operation, determining the GPU application type corresponding to the containerized GPU driver operation, wherein the containerized GPU driver operation is a GPU driver installation operation or a GPU driver upgrade operation performed in a target container; determining the task pipeline corresponding to the containerized GPU driver operation according to that GPU application type; and executing the containerized GPU driver operation in the target container according to that task pipeline.
In a possible implementation, the GPU application types corresponding to the containerized GPU driver operation include an exclusive GPU type and a shared GPU type.
In a possible implementation, executing the target Kubernetes cluster operation according to the predefined task pipeline corresponding to the target Kubernetes cluster operation includes: when the target Kubernetes cluster operation is a virtual-machine-based GPU driver operation, determining the GPU application type corresponding to the virtual-machine-based GPU driver operation, wherein the virtual-machine-based GPU driver operation is a GPU driver installation operation or a GPU driver upgrade operation performed in a target virtual machine; determining the task pipeline corresponding to the virtual-machine-based GPU driver operation according to that GPU application type; and executing the virtual-machine-based GPU driver operation in the target virtual machine according to that task pipeline.
In a possible implementation, the GPU application types corresponding to the virtual-machine-based GPU driver operation include a virtualized GPU and a pass-through GPU.
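The selection of a task pipeline from the driver target (container or virtual machine) and the GPU application type can be illustrated with a small lookup table. This is only a sketch under assumptions: the enum names and the pipeline names are hypothetical, since the disclosure names the four GPU application types but not the pipelines themselves.

```python
from enum import Enum


class DriverTarget(Enum):
    CONTAINER = "container"          # containerized GPU driver operation
    VIRTUAL_MACHINE = "vm"           # virtual-machine-based GPU driver operation


class GpuAppType(Enum):
    EXCLUSIVE = "exclusive"          # container: exclusive GPU type
    SHARED = "shared"                # container: shared GPU type
    VIRTUALIZED = "virtualized"      # virtual machine: virtualized GPU
    PASSTHROUGH = "passthrough"      # virtual machine: pass-through GPU


# Hypothetical pipeline names, one per valid (target, GPU application type) pair.
GPU_PIPELINES = {
    (DriverTarget.CONTAINER, GpuAppType.EXCLUSIVE): "container-exclusive-gpu-driver",
    (DriverTarget.CONTAINER, GpuAppType.SHARED): "container-shared-gpu-driver",
    (DriverTarget.VIRTUAL_MACHINE, GpuAppType.VIRTUALIZED): "vm-virtualized-gpu-driver",
    (DriverTarget.VIRTUAL_MACHINE, GpuAppType.PASSTHROUGH): "vm-passthrough-gpu-driver",
}


def select_gpu_pipeline(target: DriverTarget, app_type: GpuAppType) -> str:
    """Return the pipeline name for a driver install/upgrade operation."""
    try:
        return GPU_PIPELINES[(target, app_type)]
    except KeyError:
        # For example, a pass-through GPU is not a container application type.
        raise ValueError(
            f"GPU application type {app_type.value!r} is not valid "
            f"for target {target.value!r}"
        ) from None
```

Keeping the valid combinations in one table means an unsupported pairing is rejected before any driver work starts on a node.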
In a possible implementation, the target Kubernetes cluster operation includes at least one of the following: a cluster installation operation, a cluster upgrade operation, a cluster deletion operation, a node addition operation, a certificate renewal operation, a GPU driver installation operation for multiple GPU application types, a GPU driver upgrade operation for multiple GPU application types, an installation operation for multiple GPU device plugins, a Kubernetes component installation operation, and a Kubernetes component upgrade operation.
In a possible implementation, the method further includes: determining the updated custom resource configuration file based on a user selection, wherein the user selection includes: version configuration, network plugin configuration, storage configuration, node configuration, and GPU application type configuration.
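As a rough illustration of how a user selection might be turned into a custom resource configuration file, the sketch below builds a manifest as a plain dictionary. The API group, the kind `ClusterDeployment`, and every field name are assumptions made for this example; the disclosure only lists the five categories of user selection.

```python
REQUIRED_SELECTIONS = ("version", "network_plugin", "storage", "nodes", "gpu_app_type")


def build_cluster_resource(name: str, operation: str, selection: dict) -> dict:
    """Build a custom resource manifest from the user's selections."""
    missing = [key for key in REQUIRED_SELECTIONS if key not in selection]
    if missing:
        raise ValueError(f"missing user selections: {', '.join(missing)}")
    return {
        "apiVersion": "example.com/v1alpha1",   # hypothetical group/version
        "kind": "ClusterDeployment",            # hypothetical kind
        "metadata": {"name": name},
        "spec": {
            "operation": operation,                      # target cluster operation
            "version": selection["version"],             # version configuration
            "networkPlugin": selection["network_plugin"],
            "storage": selection["storage"],
            "nodes": list(selection["nodes"]),           # node configuration
            "gpuAppType": selection["gpu_app_type"],
        },
    }
```

A call such as `build_cluster_resource("demo", "install-cluster", {...})` with illustrative values for all five selections returns a dictionary that could be serialized to YAML or JSON and submitted as the updated custom resource.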
In a possible implementation, the method further includes: when an updated custom resource configuration file is detected, sending the custom resource configuration file to the API Server in the Kubernetes; and, after the target Kubernetes cluster operation has been executed, requesting the API Server to update the execution status of the custom resource configuration file.
In a possible implementation, the method further includes: setting up and managing a master node in the Kubernetes based on a second target Operator.
According to one aspect of the present disclosure, a Kubernetes cluster deployment system is provided, including: a first target Operator, configured to watch in Kubernetes for an updated custom resource configuration file, wherein the custom resource configuration file is used to execute a target Kubernetes cluster operation; and an operation execution module, configured to, when an updated custom resource configuration file is detected, execute the target Kubernetes cluster operation according to a predefined task pipeline corresponding to the target Kubernetes cluster operation, wherein a task pipeline includes a plurality of sequentially arranged functional modules.
In a possible implementation, each functional module includes at least one task; each task specifies a target operation location in the Kubernetes and one operation to be performed at that target operation location.
In a possible implementation, the operation execution module is specifically configured to: when the target Kubernetes cluster operation is a containerized GPU driver operation, determine the GPU application type corresponding to the containerized GPU driver operation, wherein the containerized GPU driver operation is a GPU driver installation operation or a GPU driver upgrade operation performed in a target container; determine the task pipeline corresponding to the containerized GPU driver operation according to that GPU application type; and execute the containerized GPU driver operation in the target container according to that task pipeline.
In a possible implementation, the GPU application types corresponding to the containerized GPU driver operation include an exclusive GPU type and a shared GPU type.
In a possible implementation, the operation execution module is specifically configured to: when the target Kubernetes cluster operation is a virtual-machine-based GPU driver operation, determine the GPU application type corresponding to the virtual-machine-based GPU driver operation, wherein the virtual-machine-based GPU driver operation is a GPU driver installation operation or a GPU driver upgrade operation performed in a target virtual machine; determine the task pipeline corresponding to the virtual-machine-based GPU driver operation according to that GPU application type; and execute the virtual-machine-based GPU driver operation in the target virtual machine according to that task pipeline.
In a possible implementation, the GPU application types corresponding to the virtual-machine-based GPU driver operation include a virtualized GPU and a pass-through GPU.
In a possible implementation, the target Kubernetes cluster operation includes at least one of the following: a cluster installation operation, a cluster upgrade operation, a cluster deletion operation, a node addition operation, a certificate renewal operation, a GPU driver installation operation for multiple GPU application types, a GPU driver upgrade operation for multiple GPU application types, an installation operation for multiple GPU device plugins, a Kubernetes component installation operation, and a Kubernetes component upgrade operation.
In a possible implementation, the system further includes: a user configuration module, configured to determine the updated custom resource configuration file based on a user selection, wherein the user selection includes: version configuration, network plugin configuration, storage configuration, node configuration, and GPU application type configuration.
In a possible implementation, the system further includes: a sending module, configured to send the custom resource configuration file to the API Server in the Kubernetes when an updated custom resource configuration file is detected; and a status update module, configured to request the API Server to update the execution status of the custom resource configuration file after the target Kubernetes cluster operation has been executed.
In a possible implementation, the system further includes: a second target Operator, configured to set up and manage a master node in the Kubernetes.
According to one aspect of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to perform the above method.
According to one aspect of the present disclosure, a computer-readable storage medium is provided, on which computer program instructions are stored, wherein the computer program instructions, when executed by a processor, implement the above method.
In the embodiments of the present disclosure, in Kubernetes, a first target Operator is used to watch for an updated custom resource configuration file, wherein the custom resource configuration file is used to execute a target Kubernetes cluster operation; when such a custom resource configuration file is detected, the target Kubernetes cluster operation is executed according to the predefined task pipeline corresponding to the target Kubernetes cluster operation. Based on the first target Operator and the predefined task pipeline corresponding to the target Kubernetes cluster operation, the target Kubernetes cluster operation can be executed quickly, which simplifies the Kubernetes cluster deployment process and effectively improves Kubernetes cluster deployment efficiency.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure. Other features and aspects of the present disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief Description of the Drawings
The accompanying drawings are incorporated into and constitute a part of this specification. They illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions of the present disclosure.
FIG. 1 shows a flowchart of a Kubernetes cluster deployment method according to an embodiment of the present disclosure;
FIG. 2 shows a block diagram of a Kubernetes cluster deployment architecture according to an embodiment of the present disclosure;
FIG. 3 shows a flowchart of deploying GPUs of multiple application types in a Kubernetes cluster according to an embodiment of the present disclosure;
FIG. 4 shows a block diagram of a Kubernetes cluster deployment system according to an embodiment of the present disclosure;
FIG. 5 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features, and aspects of the present disclosure are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise specified.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred over or superior to other embodiments.
The term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" covers three cases: A alone, both A and B, and B alone. In addition, the term "at least one" herein denotes any one of a plurality, or any combination of at least two of a plurality. For example, "including at least one of A, B, and C" means including any one or more elements selected from the set consisting of A, B, and C.
In addition, numerous specific details are given in the following detailed description to better explain the present disclosure. Those skilled in the art should understand that the present disclosure can also be practiced without some of these details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the present disclosure.
In cloud computing data center scenarios, Kubernetes clusters and GPU devices of multiple application types are generally installed with shell scripts, following the process below.
Step 1: Install the Kubernetes cluster. Choose a suitable Kubernetes installation tool (for example, Kubeadm, Kubespray, or Kops), install and configure it according to the corresponding documentation and guides, and make sure that the Kubernetes components on all nodes (for example, kubelet, kube-proxy, and kube-controller-manager) are correctly installed and running.
Step 2: Install GPU drivers for the different application types. Install the appropriate driver for each application type of GPU device, following the documentation and guides for each device's vendor and model, and make sure that the drivers are correctly installed and configured.
Step 3: Install and configure the GPU device plugins (Device Plugin). Install and configure the appropriate GPU device plugin for each application type of GPU device. A GPU device plugin is responsible for detecting and managing GPU devices of its application type and registering them with the Kubernetes application programming interface (API) server. Follow the documentation and examples for each GPU device plugin to make sure that it is correctly installed and configured.
The related-art approach to installing Kubernetes clusters and GPU devices of multiple application types can support workloads in a variety of GPU application scenarios (for example, graphics rendering, image processing, AI, and high-performance computing), but it still has shortcomings and open technical problems.
First, driver and version compatibility: compatibility problems exist between different GPU devices and drivers. Ensuring that the selected GPU driver is compatible with the Kubernetes version and operating system in use is a challenge. The compatibility between GPU driver and Kubernetes version must be considered carefully, and the chosen GPU driver and Kubernetes version must support the required GPU devices.
Second, management and configuration complexity: installing and configuring GPU devices of multiple application types is complex. Each application type of GPU device may require a different driver and GPU device plugin, and these components must be correctly installed and configured. Managing a Kubernetes cluster with GPU devices of multiple application types also requires additional effort and technical knowledge.
Third, Kubernetes cluster installation and upgrade: installing a Kubernetes cluster involves many Kubernetes components and dependencies, each with its own version requirements.
Solving these technical problems requires close attention to the version compatibility, configuration tuning, and performance optimization of key components such as GPU drivers, GPU device plugins, resource schedulers, and container runtimes. In addition, the performance, stability, and security of the Kubernetes system must be continuously monitored and tested.
To deploy Kubernetes clusters efficiently across the different application scenarios of a cloud computing data center, embodiments of the present disclosure provide a Kubernetes cluster deployment method, which is described in detail below.
FIG. 1 shows a flowchart of a Kubernetes cluster deployment method according to an embodiment of the present disclosure. The method can be executed by an electronic device such as a terminal device or a server. The terminal device can be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. The method can be implemented by a processor invoking computer-readable instructions stored in a memory. Alternatively, the method can be executed by a server. As shown in FIG. 1, the method includes:
In step S11, in Kubernetes, the first target Operator is used to watch for an updated custom resource configuration file, wherein the custom resource configuration file is used to execute a target Kubernetes cluster operation.
The first target Operator here can be developed with open-source Operator tooling and is used for Kubernetes deployment. Its specific development process and approach can be chosen according to the actual situation, and the present disclosure does not specifically limit them.
When a Kubernetes deployment is required, a custom resource configuration file (Custom Resource Definition, CRD) can be created or updated as needed; the created or updated custom resource configuration file is used to execute the target Kubernetes cluster operation. How the custom resource configuration file is created or updated is described in detail later in connection with possible implementations of the present disclosure and is not repeated here.
The first target Operator uses the List-Watch mechanism to watch for an updated custom resource configuration file.
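The List-Watch pattern can be illustrated without a real cluster: list once to obtain the current state and a resource version, then ask only for changes newer than that version. The in-memory `FakeApiServer` below is a hypothetical stand-in written for this sketch, not a Kubernetes client, and the disclosure does not describe the Operator's internals at this level.

```python
class FakeApiServer:
    """In-memory stand-in for the API server; NOT a real Kubernetes client."""

    def __init__(self):
        self._version = 0
        self._objects = {}
        self._events = []  # (resource_version, event_type, name, spec)

    def apply(self, name, spec):
        """Create or update a custom resource and record the change event."""
        self._version += 1
        event_type = "MODIFIED" if name in self._objects else "ADDED"
        self._objects[name] = dict(spec)
        self._events.append((self._version, event_type, name, dict(spec)))

    def list_all(self):
        """Return a snapshot of all objects plus the current resource version."""
        return dict(self._objects), self._version

    def watch(self, since_version):
        """Return the change events that are newer than since_version."""
        return [event for event in self._events if event[0] > since_version]


def sync(server, known_version=None):
    """One List-Watch round: returns (changes, new_resource_version)."""
    if known_version is None:
        # Initial LIST: treat every existing object as newly added.
        objects, version = server.list_all()
        changes = [("ADDED", name, spec) for name, spec in sorted(objects.items())]
        return changes, version
    # Subsequent WATCH: only events after the version we already processed.
    events = server.watch(known_version)
    changes = [(etype, name, spec) for _, etype, name, spec in events]
    version = events[-1][0] if events else known_version
    return changes, version
```

Tracking the resource version is what lets the watcher resume after the initial list without missing or reprocessing an update.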
在步骤S12中,在监听到存在更新的自定义资源配置文件的情况下,根据预定义的目标Kubernetes集群操作对应的任务流水线,执行目标Kubernetes集群操作。In step S12, when an updated custom resource configuration file is detected, the target Kubernetes cluster operation is executed according to the predefined task pipeline corresponding to the target Kubernetes cluster operation.
在Kubernetes架构中,可以预先定义不同Kubernetes集群操作对应的任务流水线(Pipeline),以提高Kubernetes集群操作的执行效率。In the Kubernetes architecture, task pipelines corresponding to different Kubernetes cluster operations can be pre-defined to improve the execution efficiency of Kubernetes cluster operations.
在第一目标Operator监听到存在更新的自定义资源配置文件的情况下,可以根据预定义的目标Kubernetes集群操作对应的任务流水线,快速执行目标Kubernetes集群操作,以完成Kubernetes集群部署。后文会结合本公开可能的实现方式,对预先定义不同Kubernetes集群操作对应的任务流水线、以及如何根据预定义的目标Kubernetes集群操作对应的任务流水线,执行目标Kubernetes集群操作的过程进行详细描述,此处不作赘述。When the first target Operator detects that there is an updated custom resource configuration file, the target Kubernetes cluster operation can be quickly executed according to the predefined task pipeline corresponding to the target Kubernetes cluster operation to complete the Kubernetes cluster deployment. The following text will describe in detail the process of pre-defining task pipelines corresponding to different Kubernetes cluster operations and how to execute the target Kubernetes cluster operation according to the predefined task pipeline corresponding to the target Kubernetes cluster operation in combination with possible implementation methods of the present disclosure, which will not be repeated here.
In the embodiments of the present disclosure, the first target Operator and the predefined task pipeline corresponding to the target Kubernetes cluster operation allow the operation to be executed quickly, which simplifies the Kubernetes cluster deployment process and effectively improves deployment efficiency.
In a possible implementation, a task pipeline includes multiple sequentially arranged functional modules, and each functional module includes at least one task; each task includes a target operation location in Kubernetes and one operation to be performed at that location.
In the Kubernetes architecture, task pipelines corresponding to different Kubernetes cluster operations are predefined. A task pipeline includes multiple sequentially arranged functional modules, each of which includes at least one task. Each task includes a target operation location in Kubernetes and one operation to be performed at that location.
With the predefined task pipelines corresponding to different Kubernetes cluster operations, SSH and the Kubernetes API can be used to perform the corresponding operations on the hosts and in the Kubernetes cluster, thereby implementing host resource allocation and configuration management in the Kubernetes cluster.
FIG. 2 shows a block diagram of a Kubernetes cluster deployment architecture according to an embodiment of the present disclosure. As shown in FIG. 2, the architecture has three layers. The top layer includes a command module (Command), the first Operator, a configuration file (ConfigurationFile), and the custom resource configuration file (CRD). The middle layer includes the different Kubernetes cluster operations (as shown in FIG. 2: Install cluster, Add nodes, Upgrade cluster, Delete cluster, Renew certificates, and so on), a task management module (Task manager), and a configuration management module (Configuration manager). The bottom layer includes the predefined task pipelines (Pipeline) corresponding to each Kubernetes cluster operation, SSH, a cache (Cache), logs (Log), and so on.
As shown in FIG. 2, a task pipeline (Pipeline), for example the pipeline corresponding to the cluster installation operation or to the add-nodes operation, includes multiple sequentially arranged functional modules (Module 1, Module 2, through Module n) and covers the complete execution process of one cluster operation. A functional module is a module with a specific and complete function and includes at least one task (Task). A task includes a target operation location in Kubernetes and one operation (Action) to be performed at that location. For example, a task includes fields such as Action, Hosts, Retry, and Parallel: the Action field indicates the Action managed by the task, the Hosts field indicates the target operation location in Kubernetes at which the Action is executed, the Retry field indicates whether to retry, and the Parallel field indicates whether to process in parallel. An operation (Action) is the most basic unit and represents a single operation to be performed at the target operation location.
Besides the form described above, a task pipeline may take other forms according to actual needs; the present disclosure does not specifically limit this.
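The Pipeline, Module, Task, and Action hierarchy described above can be sketched as plain data classes. This is an illustrative model under stated assumptions, not the disclosure's code: the text only says that Retry indicates whether to retry, so representing it as a retry count is an assumption, as are the use of a thread pool for Parallel and the sample actions `download` and `init`.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    action: Callable[[str], str]  # the single Action this task manages
    hosts: List[str]              # target operation locations
    retry: int = 0                # assumption: number of extra attempts
    parallel: bool = False        # run on all hosts concurrently when True

    def run_on(self, host):
        for attempt in range(self.retry + 1):
            try:
                return self.action(host)
            except RuntimeError:
                if attempt == self.retry:
                    raise  # retries exhausted

    def run(self):
        if self.parallel:
            with ThreadPoolExecutor() as pool:
                return list(pool.map(self.run_on, self.hosts))  # keeps host order
        return [self.run_on(host) for host in self.hosts]


@dataclass
class Module:
    name: str
    tasks: List[Task]


@dataclass
class Pipeline:
    name: str
    modules: List[Module]

    def run(self):
        """Run modules in order; within a module, run tasks in order."""
        log = []
        for module in self.modules:
            for task in module.tasks:
                log.extend(task.run())
        return log


def download(host):  # hypothetical action
    return f"{host}: download packages"


def init(host):  # hypothetical action
    return f"{host}: init control plane"


install = Pipeline("install-cluster", [
    Module("prepare", [Task(download, ["node1", "node2"], parallel=True)]),
    Module("bootstrap", [Task(init, ["node1"], retry=1)]),
])
log = install.run()
```

Keeping the Action as the smallest unit means a new cluster operation can be assembled by recombining existing modules and tasks rather than writing a new procedure.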
In a possible implementation, the method further includes: determining the updated custom resource configuration file based on user selections, where the user selections include version configuration, network plug-in configuration, storage configuration, node configuration, and GPU application type configuration.
When a Kubernetes deployment is required, the updated custom resource configuration file is determined from the configuration information the user selects for the specific application scenario and requirements, such as the version, network plug-in, storage, node, and GPU application type configurations. This yields a customized configuration that meets the user's needs and also helps resolve version compatibility issues.
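As a rough illustration of turning user selections into a custom resource, the sketch below builds a manifest-like dictionary. The `apiVersion`, `kind`, and field names are hypothetical; the disclosure lists only the five categories of selection, not a schema.

```python
# The five selection categories named in the text; the key names are assumptions.
REQUIRED = ("version", "network_plugin", "storage", "nodes", "gpu_type")


def build_custom_resource(name, selection):
    """Validate the user's selections and wrap them in a manifest-like dict."""
    missing = [key for key in REQUIRED if key not in selection]
    if missing:
        raise ValueError(f"missing selections: {missing}")
    return {
        "apiVersion": "example.com/v1alpha1",  # hypothetical group/version
        "kind": "ClusterDeployment",           # hypothetical kind
        "metadata": {"name": name},
        "spec": {key: selection[key] for key in REQUIRED},
    }


cr = build_custom_resource("cluster-a", {
    "version": "v1.26.0",
    "network_plugin": "calico",
    "storage": "local",
    "nodes": ["node1", "node2"],
    "gpu_type": "shared",
})
```

Rejecting incomplete selections before the resource is written keeps the Operator from starting a pipeline it cannot finish.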
In a possible implementation, the method further includes: when an updated custom resource configuration file is detected, sending the custom resource configuration file to the API Server in Kubernetes; and after the target Kubernetes cluster operation has been executed, requesting the API Server to update the execution status of the custom resource configuration file.
When the first target Operator detects an updated custom resource configuration file, it sends the file to the API Server in Kubernetes and sets the file's execution status to the initialization state. After the target Kubernetes cluster operation corresponding to the file has been executed, it requests the API Server to update the execution status to the completed state. The first target Operator and the API Server together provide monitoring and status management of custom resource configuration files.
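The status handling described above can be sketched with an in-memory stand-in for the API Server. `FakeApiServer`, `reconcile`, and the status strings "Initializing" and "Completed" are illustrative names for the initialization and completed states; the real API Server is reached over HTTP and is not modeled here.

```python
class FakeApiServer:
    """In-memory stand-in that only stores an execution status per resource."""

    def __init__(self):
        self._status = {}

    def submit(self, name):
        # A newly submitted resource starts in the initialization state.
        self._status[name] = "Initializing"

    def update_status(self, name, status):
        if name not in self._status:
            raise KeyError(name)
        self._status[name] = status

    def status(self, name):
        return self._status[name]


def reconcile(api, name, operation):
    """Submit the resource, run the cluster operation, then mark it completed."""
    api.submit(name)
    operation()
    api.update_status(name, "Completed")


api = FakeApiServer()
observed = []
reconcile(api, "cluster-a", lambda: observed.append(api.status("cluster-a")))
```

While the operation runs, the resource reports the initialization state; the completed state is written only after the operation returns.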
The Kubernetes cluster deployment method of the embodiments of the present disclosure supports GPU deployment for multiple application types. The deployment process for these application types is described in detail below.
In a possible implementation, executing the target Kubernetes cluster operation according to the predefined task pipeline corresponding to that operation includes: when the target Kubernetes cluster operation is a containerized GPU driver operation, determining the GPU application type corresponding to the containerized GPU driver operation, where the containerized GPU driver operation is a GPU driver installation operation or a GPU driver upgrade operation executed in a target container; determining the task pipeline corresponding to the containerized GPU driver operation according to that GPU application type; and executing the containerized GPU driver operation in the target container according to that task pipeline.
Task pipelines corresponding to the containerized GPU driver operations of different GPU application types are predefined, so that containerized GPU devices of different GPU application types can be deployed quickly based on those pipelines.
FIG. 3 shows a flowchart of GPU deployment for multiple application types in a Kubernetes cluster according to an embodiment of the present disclosure. As shown in FIG. 3, a custom resource configuration file (CRD) is created or modified according to the user's requirement to deploy a containerized GPU device; this file is used to perform the containerized GPU driver operation. The first target Operator uses the List-Watch mechanism to monitor in real time whether an updated custom resource configuration file exists and, once one is detected, sends it to the API Server. It is then checked whether the target Kubernetes cluster operation corresponding to the updated file is a containerized GPU driver operation (GPU driver installation or GPU driver upgrade). If so, the GPU application type corresponding to the containerized GPU driver operation is determined, and the task pipeline corresponding to the operation is determined according to that type. The containerized GPU driver operation is then executed quickly in the target container according to that pipeline. Finally, the API Server is requested to update the execution status of the custom resource configuration file to the completed state.
The task pipeline corresponding to the containerized GPU driver operation may include the following steps: driver download, node eviction (the node is the one hosting the target container), marking the node unschedulable, driver installation, and role assignment (that is, setting the GPU application type of the target container).
In a possible implementation, the GPU application types corresponding to the containerized GPU driver operation include an exclusive GPU type and a shared GPU type.
As shown in FIG. 3, the above method makes it possible to deploy, in a Kubernetes cluster, containerized GPU devices of the exclusive GPU type (GPU) and of the shared GPU type (sGPU).
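The containerized driver pipeline and its two application types can be sketched as an ordered list of steps. The step order follows the text above; the function name, the type labels "exclusive" and "shared", and the step strings are assumptions, and no real drain or install commands are issued.

```python
CONTAINER_GPU_TYPES = {"exclusive", "shared"}  # labels assumed for GPU and sGPU


def container_driver_steps(node, gpu_type):
    """Return the ordered steps for a containerized GPU driver operation."""
    if gpu_type not in CONTAINER_GPU_TYPES:
        raise ValueError(f"unsupported containerized GPU type: {gpu_type}")
    return [
        f"download driver for {node}",
        f"evict workloads from {node}",
        f"mark {node} unschedulable",
        f"install driver on {node}",
        f"assign role {gpu_type} on {node}",
    ]


steps = container_driver_steps("node1", "shared")
```

In a real cluster each string would correspond to one Task in the pipeline, with the node name supplied through the task's Hosts field.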
In a possible implementation, executing the target Kubernetes cluster operation according to the predefined task pipeline corresponding to that operation includes: when the target Kubernetes cluster operation is a virtual-machine-based (VM-based) GPU driver operation, determining the GPU application type corresponding to the VM-based GPU driver operation, where the VM-based GPU driver operation is a GPU driver installation operation or a GPU driver upgrade operation executed in a target virtual machine; determining the task pipeline corresponding to the VM-based GPU driver operation according to that GPU application type; and executing the VM-based GPU driver operation in the target virtual machine according to that task pipeline.
Task pipelines corresponding to the VM-based GPU driver operations of different GPU application types are predefined, so that VM-based GPU devices of different GPU application types can be deployed quickly based on those pipelines.
As shown in FIG. 3, a custom resource configuration file is created or modified according to the user's requirement to deploy a VM-based GPU device; this file is used to perform the VM-based GPU driver operation. The first target Operator uses the List-Watch mechanism to monitor in real time whether an updated custom resource configuration file exists. Once one is detected, it sends the updated file to the API Server and sets the file's execution status to the initialization state. It is then checked whether the target Kubernetes cluster operation corresponding to the updated file is a VM-based GPU driver operation (GPU driver installation or GPU driver upgrade). If so, the GPU application type corresponding to the VM-based GPU driver operation is determined, and the task pipeline corresponding to the operation is determined according to that type. The VM-based GPU driver operation is then executed quickly in the target virtual machine according to that pipeline. Finally, the API Server is requested to update the execution status of the custom resource configuration file to the completed state.
The task pipeline corresponding to the VM-based GPU driver operation may include the following steps: driver download, node eviction (the node is the one hosting the target virtual machine), driver installation, and so on.
In a possible implementation, the GPU application types corresponding to the VM-based GPU driver operation include virtualized GPU and passthrough GPU.
As shown in FIG. 3, the above method makes it possible to deploy, in a Kubernetes cluster, VM-based GPU devices of the exclusive GPU type (Passthrough) and of the shared GPU type (vGPU).
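Taken together, the four GPU application types suggest a lookup from deployment form and application type to a predefined pipeline. The sketch below shows one way such a selection could work; the table keys, the pipeline naming scheme, and `select_pipeline` are assumptions rather than names from the disclosure.

```python
# Deployment form -> GPU application type -> label used in FIG. 3.
GPU_TYPES = {
    "container": {"exclusive": "GPU", "shared": "sGPU"},
    "vm": {"passthrough": "Passthrough", "virtualized": "vGPU"},
}


def select_pipeline(form, gpu_type, action):
    """Pick a pipeline name for a driver install or upgrade (naming is hypothetical)."""
    if action not in ("install", "upgrade"):
        raise ValueError(f"unsupported driver action: {action}")
    try:
        label = GPU_TYPES[form][gpu_type]
    except KeyError:
        raise ValueError(f"unsupported combination: {form}/{gpu_type}") from None
    return f"{form}-{label.lower()}-driver-{action}"


name = select_pipeline("vm", "virtualized", "upgrade")
```

Validating the combination up front means a request such as a shared GPU type for a virtual machine is rejected before any node is evicted.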
In a possible implementation, the target Kubernetes cluster operation includes at least one of the following: a cluster installation operation, a cluster upgrade operation, a cluster deletion operation, an add-nodes operation, a certificate renewal operation, GPU driver installation operations for multiple GPU application types, GPU driver upgrade operations for multiple GPU application types, installation operations for multiple GPU device plug-ins, a Kubernetes component installation operation, and a Kubernetes component upgrade operation.
Based on the first target Operator and the predefined task pipelines corresponding to the target Kubernetes cluster operations, operations such as cluster installation, cluster upgrade, cluster deletion, certificate renewal, component installation, and component upgrade can be carried out quickly, achieving fast and unified deployment of the Kubernetes cluster; nodes can also be added to scale out the cluster.
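One way to keep a growing set of cluster operations manageable is a registry that maps each operation name to the function producing its pipeline. The sketch below is an assumption about structure only; the operation names, the `pipeline` decorator, and the step strings are illustrative and do not come from the disclosure.

```python
PIPELINES = {}


def pipeline(operation):
    """Decorator that registers a pipeline builder under an operation name."""
    def register(builder):
        PIPELINES[operation] = builder
        return builder
    return register


@pipeline("install-cluster")
def install_cluster(spec):
    return ["prepare hosts", "init control plane", "join workers"]  # illustrative steps


@pipeline("add-nodes")
def add_nodes(spec):
    return [f"join {node}" for node in spec["nodes"]]


def execute(operation, spec):
    """Look up the predefined pipeline for an operation and return its steps."""
    if operation not in PIPELINES:
        raise ValueError(f"no pipeline defined for: {operation}")
    return PIPELINES[operation](spec)


steps = execute("add-nodes", {"nodes": ["node3", "node4"]})
```

Adding support for another operation, such as certificate renewal, then amounts to registering one more builder.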
In a possible implementation, the method further includes: setting up and managing the master nodes in Kubernetes based on a second target Operator.
The second target Operator may be developed with an open-source Operator tool and is used to set up and manage the master nodes in Kubernetes. Its specific development process and approach can be chosen according to actual conditions; the present disclosure does not specifically limit this.
With the second target Operator, multiple master nodes can be set up in the Kubernetes cluster. By monitoring the health of the current master node, the system can switch quickly to another master node when the current one becomes unhealthy and can no longer work normally, which ensures high availability of the Kubernetes cluster.
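The failover behaviour can be sketched as a simple selection rule. The health probe is passed in as a function because the disclosure does not say how health is checked; `pick_active_master` and the node names are hypothetical.

```python
def pick_active_master(masters, current, is_healthy):
    """Keep the current master while it is healthy; otherwise pick another healthy one."""
    if is_healthy(current):
        return current
    for candidate in masters:
        if candidate != current and is_healthy(candidate):
            return candidate
    raise RuntimeError("no healthy master node available")


masters = ["master1", "master2", "master3"]
down = {"master1"}  # simulated failure of the current master
active = pick_active_master(masters, "master1", lambda node: node not in down)
```

Here the current master has failed, so the first healthy alternative in the list, `master2`, becomes active.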
In the embodiments of the present disclosure, the first target Operator monitors, in Kubernetes, whether an updated custom resource configuration file exists, where the custom resource configuration file is used to perform a target Kubernetes cluster operation; when an updated custom resource configuration file is detected, the target Kubernetes cluster operation is executed according to the predefined task pipeline corresponding to that operation. The first target Operator and the predefined task pipeline allow the target Kubernetes cluster operation to be executed quickly, which simplifies the Kubernetes cluster deployment process and effectively improves deployment efficiency.
It will be understood that the method embodiments described above can be combined with one another to form combined embodiments, provided that doing so does not violate their principles and logic; for brevity, these combinations are not described further. Those skilled in the art will understand that, in the methods of the specific implementations above, the actual order in which the steps are executed should be determined by their functions and possible internal logic.
In addition, the present disclosure provides a Kubernetes cluster deployment system, an electronic device, a computer-readable storage medium, and a program, each of which can be used to implement any of the Kubernetes cluster deployment methods provided by the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding passages of the method section; they are not repeated here.
FIG. 4 shows a block diagram of a Kubernetes cluster deployment system according to an embodiment of the present disclosure. As shown in FIG. 4, the system 40 includes:
a first target Operator 41, configured to monitor, in Kubernetes, whether an updated custom resource configuration file exists, where the custom resource configuration file is used to perform a target Kubernetes cluster operation; and
an operation execution module 42, configured to execute the target Kubernetes cluster operation according to the predefined task pipeline corresponding to that operation when an updated custom resource configuration file is detected, where a task pipeline includes multiple sequentially arranged functional modules.
In a possible implementation, each functional module includes at least one task;
each task includes a target operation location in Kubernetes and one operation to be performed at that location.
In a possible implementation, the operation execution module 42 is specifically configured to:
when the target Kubernetes cluster operation is a containerized GPU driver operation, determine the GPU application type corresponding to the containerized GPU driver operation, where the containerized GPU driver operation is a GPU driver installation operation or a GPU driver upgrade operation executed in a target container;
determine the task pipeline corresponding to the containerized GPU driver operation according to that GPU application type; and
execute the containerized GPU driver operation in the target container according to that task pipeline.
In a possible implementation, the GPU application types corresponding to the containerized GPU driver operation include an exclusive GPU type and a shared GPU type.
In a possible implementation, the operation execution module 42 is specifically configured to:
when the target Kubernetes cluster operation is a virtual-machine-based (VM-based) GPU driver operation, determine the GPU application type corresponding to the VM-based GPU driver operation, where the VM-based GPU driver operation is a GPU driver installation operation or a GPU driver upgrade operation executed in a target virtual machine;
determine the task pipeline corresponding to the VM-based GPU driver operation according to that GPU application type; and
execute the VM-based GPU driver operation in the target virtual machine according to that task pipeline.
In a possible implementation, the GPU application types corresponding to the VM-based GPU driver operation include virtualized GPU and passthrough GPU.
In a possible implementation, the target Kubernetes cluster operation includes at least one of the following: a cluster installation operation, a cluster upgrade operation, a cluster deletion operation, an add-nodes operation, a certificate renewal operation, GPU driver installation operations for multiple GPU application types, GPU driver upgrade operations for multiple GPU application types, installation operations for multiple GPU device plug-ins, a Kubernetes component installation operation, and a Kubernetes component upgrade operation.
In a possible implementation, the system 40 further includes:
a user configuration module, configured to determine the updated custom resource configuration file based on user selections, where the user selections include version configuration, network plug-in configuration, storage configuration, node configuration, and GPU application type configuration.
In a possible implementation, the system 40 further includes:
a sending module, configured to send the custom resource configuration file to the API Server in Kubernetes when an updated custom resource configuration file is detected; and
a status update module, configured to request the API Server to update the execution status of the custom resource configuration file after the target Kubernetes cluster operation has been executed.
In a possible implementation, the system 40 further includes:
a second target Operator, configured to set up and manage the master nodes in Kubernetes.
The method has a specific technical association with the internal structure of a computer system and can solve the technical problem of how to improve hardware computing efficiency or execution performance (including reducing the amount of data stored, reducing the amount of data transmitted, and increasing hardware processing speed), thereby achieving the technical effect of improving the internal performance of the computer system in conformity with the laws of nature.
In some embodiments, the functions or modules of the system provided by the embodiments of the present disclosure can be used to perform the methods described in the method embodiments above. For their specific implementation, refer to the description of those method embodiments; for brevity, it is not repeated here.
An embodiment of the present disclosure further provides a computer-readable storage medium storing computer program instructions that, when executed by a processor, implement the above method. The computer-readable storage medium may be volatile or non-volatile.
An embodiment of the present disclosure further provides an electronic device, including: a processor; and a memory for storing processor-executable instructions; where the processor is configured to invoke the instructions stored in the memory to perform the above method.
An embodiment of the present disclosure further provides a computer program product, including computer-readable code or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs on a processor of an electronic device, the processor performs the above method.
The electronic device may be provided as a terminal, a server, or a device in another form.
FIG. 5 shows a block diagram of an electronic device according to an embodiment of the present disclosure. Referring to FIG. 5, the electronic device 1900 may be provided as a server or a terminal device. The electronic device 1900 includes a processing component 1922, which in turn includes one or more processors, and memory resources, represented by a memory 1932, for storing instructions executable by the processing component 1922, such as application programs. An application program stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. The processing component 1922 is configured to execute the instructions to perform the above method.
The electronic device 1900 may further include a power supply component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output interface 1958. The electronic device 1900 may run an operating system stored in the memory 1932, such as Microsoft's server operating system (Windows Server™), Apple's graphical-user-interface-based operating system (Mac OS X™), the multi-user, multi-process operating system Unix™, the free and open-source Unix-like operating system Linux™, the open-source Unix-like operating system FreeBSD™, or the like.
In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, for example the memory 1932 containing computer program instructions, which can be executed by the processing component 1922 of the electronic device 1900 to perform the above method.
The present disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the present disclosure.
A computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. It may be, for example but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of these. More specific examples (a non-exhaustive list) include: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or raised structures in a groove with instructions stored on them, and any suitable combination of these. A computer-readable storage medium as used here is not to be construed as a transient signal itself, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse through a fiber-optic cable), or an electrical signal transmitted through a wire.
The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to respective computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.
The computer program instructions for carrying out the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized using the state information of the computer-readable program instructions, and the electronic circuit can then execute the computer-readable program instructions to implement various aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium, where they cause a computer, a programmable data processing apparatus, and/or other devices to operate in a particular manner, so that the computer-readable medium storing the instructions constitutes an article of manufacture that includes instructions implementing aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, so that a series of operational steps is performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of an instruction, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
The computer program product may be implemented in hardware, software, or a combination thereof. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).
The above description of the embodiments tends to emphasize the differences between them. For aspects that are the same or similar, the embodiments may be referred to one another; for brevity, these aspects are not repeated herein.
Those skilled in the art will appreciate that, in the methods of the specific embodiments above, the order in which the steps are written does not imply a strict execution order and does not limit the implementation in any way. The actual execution order of the steps should be determined by their functions and any inherent logic.
If the technical solution of this application involves personal information, a product applying the solution clearly informs individuals of the personal information processing rules and obtains their voluntary consent before processing the personal information. If the technical solution involves sensitive personal information, a product applying the solution obtains the individual's separate consent before processing the sensitive personal information and also satisfies the "explicit consent" requirement. For example, at a personal information collection device such as a camera, a clear and prominent sign is posted to inform individuals that they have entered the collection range and that their personal information will be collected; an individual who voluntarily enters the collection range is deemed to consent to the collection. Alternatively, on a device that processes personal information, where the processing rules are communicated through conspicuous signs or notices, the individual's authorization is obtained through a pop-up message or by asking the individual to upload their personal information themselves. The personal information processing rules may include information such as the personal information processor, the purpose of processing, the processing method, and the types of personal information processed.
The embodiments of the present disclosure have been described above. The foregoing description is exemplary rather than exhaustive, and it is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical applications, or their improvements over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (18)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310686910.9A CN116501446B (en) | 2023-06-09 | 2023-06-09 | Kubernetes cluster deployment method and system, and electronic device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310686910.9A CN116501446B (en) | 2023-06-09 | 2023-06-09 | Kubernetes cluster deployment method and system, and electronic device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116501446A (en) | 2023-07-28 |
CN116501446B (en) | 2024-06-07 |
Family ID: 87316733
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202310686910.9A CN116501446B (en) | Active | 2023-06-09 | 2023-06-09 | Kubernetes cluster deployment method and system, and electronic device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116501446B (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110764901A (en) * | 2019-09-17 | 2020-02-07 | 阿里巴巴集团控股有限公司 | Data processing method based on GPU (graphics processing Unit) resources, electronic equipment and system |
WO2021073214A1 (en) * | 2019-10-14 | 2021-04-22 | 支付宝(杭州)信息技术有限公司 | Method and apparatus for running application program, and gpu node |
CN112214330A (en) * | 2020-11-04 | 2021-01-12 | 腾讯科技(深圳)有限公司 | Method and device for deploying master nodes in cluster and computer-readable storage medium |
CN113448686A (en) * | 2021-06-22 | 2021-09-28 | 深信服科技股份有限公司 | Resource deployment method and device, electronic equipment and storage medium |
CN113687912A (en) * | 2021-07-30 | 2021-11-23 | 济南浪潮数据技术有限公司 | Container cluster management method, device and system, electronic equipment and storage medium |
CN115167972A (en) * | 2022-05-30 | 2022-10-11 | 浪潮通信技术有限公司 | Cloud native platform integration method and system |
Non-Patent Citations (1)
Title |
---|
Converged native infrastructure solution and key technologies based on Kubernetes; He Zhenwei et al.; Telecommunications Science (电信科学); pp. 77-78 * |
Also Published As
Publication number | Publication date |
---|---|
CN116501446A (en) | 2023-07-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11029992B2 (en) | Nondisruptive updates in a networked computing environment | |
JP6571161B2 (en) | Method, apparatus, and system for exploring application topology relationships | |
US10394477B2 (en) | Method and system for memory allocation in a disaggregated memory architecture | |
US20190289057A1 (en) | Software version control without affecting a deployed container | |
US10430171B2 (en) | Extensions for deployment patterns | |
US10324754B2 (en) | Managing virtual machine patterns | |
KR20200070085A (en) | Method and apparatus for processing information | |
CN114341810A (en) | Deploying microservices across service infrastructures | |
US9122793B2 (en) | Distributed debugging of an application in a distributed computing environment | |
CN110221910B (en) | Method and apparatus for performing MPI jobs | |
KR20220151585A (en) | Business data processing method, apparatus, electronic apparatus, storage media and computer program | |
CN114116393A (en) | Method, device and equipment for collecting GPU performance data of virtual machine | |
JP2021513137A (en) | Data migration in a tiered storage management system | |
CN115777192B (en) | Managing communication between microservices | |
CN116635834A (en) | Coordinating requests executing at extensible applications | |
CN115965517A (en) | Graphics processor resource management method and device, electronic device and storage medium | |
CN110247801A (en) | A kind of monitoring system and method for pair of cluster host | |
CN112152988B (en) | Method, system, computer device and medium for asynchronous NBMP request processing | |
CN117931097B (en) | Information providing method and device applied to servers of edge computing cluster | |
CN116501446B (en) | Kubernetes cluster deployment method and system, and electronic device | |
CN118152224A (en) | Distributed training method and platform based on GPU cluster, and electronic equipment | |
US20230086195A1 (en) | Efficient and extensive function groups with multi-instance function support for cloud based processing | |
CN117472509A (en) | Non-containerized application management method based on Kubernetes cluster equipment | |
US20160210180A1 (en) | Preventing recurrence of deterministic failures | |
US11954506B2 (en) | Inspection mechanism framework for visualizing application metrics |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CP03 | Change of name, title or address | Address after: B655, 4th Floor, Building 14, Cuiwei Zhongli, Haidian District, Beijing, 100036; Patentee after: Mole Thread Intelligent Technology (Beijing) Co.,Ltd.; Country or region after: China; Address before: 209, 2nd Floor, No. 31 Haidian Street, Haidian District, Beijing; Patentee before: Moore Threads Technology Co., Ltd.; Country or region before: China |