CN112463375A - Data processing method and device
- Publication number
- CN112463375A CN112463375A CN202011348812.7A CN202011348812A CN112463375A CN 112463375 A CN112463375 A CN 112463375A CN 202011348812 A CN202011348812 A CN 202011348812A CN 112463375 A CN112463375 A CN 112463375A
- Authority
- CN
- China
- Prior art keywords
- gpu
- target
- determining
- scheduling request
- resource information
- Prior art date
- Legal status: Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5044—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Transfer Between Computers (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiment of the invention provides a data processing method and a data processing device, wherein the method comprises the following steps: acquiring GPU resource information of a plurality of GPUs, including GPUs to which containers have already been allocated; receiving a GPU scheduling request; determining a target GPU from the plurality of GPUs according to the GPU resource information; and creating a target container for the target GPU to process the GPU scheduling request. The embodiment of the invention thereby realizes shared scheduling of multi-card GPU resources and makes full use of the GPU resource information of cards that already host containers, so that the resources of one or more GPU cards can be shared and the utilization efficiency of GPU resources is improved.
Description
Technical Field
The present invention relates to the field of resource scheduling technologies, and in particular, to a method and an apparatus for data processing.
Background
Container technology can be applied in a Kubernetes cluster. A container image is a lightweight, independently executable software package that contains everything needed to run the container, such as code, runtime, system tools, system libraries, and settings. Containers isolate software from its surroundings and help reduce conflicts between teams running different software on the same infrastructure.
The GPU container scheduling capability provided by a Kubernetes cluster allocates a whole GPU (Graphics Processing Unit) card directly to a container, which ensures that an application using the GPU is not affected by other applications and thus achieves good isolation.
However, Kubernetes only supports GPU scheduling at whole-card granularity and cannot allocate fractional resources. Under the current Kubernetes architecture, when a user's application requests less than a whole GPU card, the cluster cannot perform the corresponding resource allocation and invocation.
Meanwhile, when the application corresponding to a container does not actually occupy a whole card, the one-card-per-container arrangement leaves the resources within that GPU card underutilized, reducing GPU resource utilization.
Disclosure of Invention
In view of the above, a method and apparatus for data processing are proposed that overcome, or at least partially solve, the above-mentioned problems, comprising:
a method of data processing, the method comprising:
acquiring GPU resource information of a plurality of GPUs, including GPUs to which containers have been allocated;
receiving a GPU scheduling request;
determining a target GPU from the plurality of GPUs according to the GPU resource information;
creating a target container for the target GPU to process the GPU scheduling request.
Optionally, the determining a target GPU from the multiple GPUs according to the GPU resource information includes:
determining resource demand information corresponding to the GPU scheduling request;
determining a target computing node from a plurality of computing nodes of a target cluster according to the GPU resource information and the resource demand information;
and determining a target GPU from the target computing node.
Optionally, the determining a target computing node from a plurality of computing nodes of a target cluster according to the GPU resource information and the resource demand information includes:
respectively determining node resource information of a plurality of computing nodes of the target cluster according to the GPU resource information;
and determining a target computing node from a plurality of computing nodes of the target cluster according to the node resource information and the resource demand information.
Optionally, the determining a target GPU from the target computing node includes:
determining candidate GPUs with the GPU resource information matched with the resource demand information from the target computing node;
and determining a target GPU from the candidate GPUs according to the GPU resource information.
Optionally, the creating a target container for the target GPU to process the GPU scheduling request includes:
creating a container group entity for the GPU scheduling request;
determining a target GPU identification of the target GPU;
updating the target GPU identification to an environment variable of the container group entity;
and creating a target container aiming at the target GPU according to the environment variables.
Optionally, the creating a container group entity for the GPU scheduling request includes:
generating container group information corresponding to the GPU scheduling request;
establishing a binding relationship between the container group information and the target computing node;
and creating a container group entity aiming at the GPU scheduling request according to the binding relationship.
Optionally, the GPU resource information is available video memory information of the GPU.
An apparatus for data processing, the apparatus comprising:
the GPU resource information acquisition module is used for acquiring GPU resource information of a plurality of GPUs, including GPUs to which containers have been allocated;
the GPU scheduling request receiving module is used for receiving a GPU scheduling request;
the target GPU determining module is used for determining a target GPU from the multiple GPUs according to the GPU resource information;
a target container creation module to create a target container for the target GPU to process the GPU scheduling request.
A server comprising a processor, a memory and a computer program stored on the memory and capable of running on the processor, the computer program, when executed by the processor, implementing a method of data processing as described above.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method of data processing as described above.
The embodiment of the invention has the following advantages:
the method comprises the steps of receiving a GPU scheduling request by acquiring GPU resource information of a plurality of GPUs including GPUs of allocated containers; and according to the GPU resource information, determining a target GPU from the multiple GPUs, creating a target container aiming at the target GPU to process the GPU scheduling request, realizing the shared scheduling of the multi-card GPU resources, and fully utilizing the GPU resource information of the GPU cards of the allocated container, thereby sharing the resources of one or more GPU cards and improving the utilization efficiency of the GPU resources.
Drawings
In order to illustrate the technical solutions of the present invention more clearly, the drawings needed in the description are briefly introduced below. The drawings described below are only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a flow chart illustrating steps of a method for data processing according to an embodiment of the present invention;
FIG. 2 is a flow chart illustrating steps of another method for data processing according to an embodiment of the present invention;
FIG. 3 is a flow chart of resource scheduling according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features, and advantages of the present invention more comprehensible, embodiments are described in further detail below with reference to the accompanying figures. The embodiments described are only some, not all, embodiments of the present invention. All other embodiments derived by those skilled in the art from the embodiments given herein without creative effort fall within the protection scope of the present invention.
Referring to fig. 1, a flowchart illustrating steps of a data processing method according to an embodiment of the present invention is shown, which may specifically include the following steps:
Step 101, acquiring GPU resource information of a plurality of GPUs, including GPUs to which containers have been allocated;
in an embodiment of the present invention, the GPU resource information is available video memory information of the GPU.
In practical applications, the GPU resource information may be the available video memory of a GPU, that is, memory not occupied by applications or other threads. The available video memory can be determined from the card's total video memory (its capacity irrespective of occupation), the video memory already used (partially or fully occupied), and the video memory that is not yet used but is about to be allocated (memory being assigned in response to a user request).
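The accounting above can be made concrete with a small sketch in Go; the struct and field names are illustrative assumptions, not terms defined by the patent:

```go
package main

import "fmt"

// GPUInfo models one card's video memory accounting as described above.
type GPUInfo struct {
	ID         string
	TotalMiB   int64 // total video memory of the card
	UsedMiB    int64 // memory already occupied by running containers
	PendingMiB int64 // memory not yet used but about to be allocated
}

// AvailableMiB is the video memory a new container could still receive.
func (g GPUInfo) AvailableMiB() int64 {
	return g.TotalMiB - g.UsedMiB - g.PendingMiB
}

func main() {
	g := GPUInfo{ID: "GPU0", TotalMiB: 2048, UsedMiB: 512, PendingMiB: 512}
	fmt.Println(g.ID, "available:", g.AvailableMiB(), "MiB") // GPU0 available: 1024 MiB
}
```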
When there are multiple GPUs, each GPU can report its own resource information at preset intervals, so that the scheduler coordinating GPU resource usage can obtain the video memory information of all GPUs.
The plurality of GPUs may include GPUs to which containers have already been allocated, that is, GPUs whose resources are partly occupied, as well as GPUs for which containers are currently being allocated in response to user requests.
By obtaining the GPU resource information, the GPU resources actually available at present can be determined, so that they can be fully utilized in the subsequent steps.
After the GPU resource information is obtained, a user who needs GPU resources may send a GPU scheduling request to the resource scheduler; the scheduler receives the request and performs resource scheduling according to the user's requirements.
The GPU scheduling request may include the number of GPU cards the user needs to schedule and the conditions each GPU card must meet.
For example, when a user needs two GPU cards with 1024 MiB of video memory each, a corresponding GPU scheduling request can be sent to the scheduler, specifying that two GPU cards are needed and that each card must provide 1024 MiB of video memory.
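Such a request could be modeled as in the following sketch, continuing the Go example above; the struct is a hypothetical illustration, since the patent does not fix a concrete request format:

```go
// GPUScheduleRequest carries the two requirements named in the text:
// how many cards, and how much video memory each card must provide.
type GPUScheduleRequest struct {
	CardCount     int   // number of GPU cards requested
	MemPerCardMiB int64 // video memory required on each card, in MiB
}

// The example from the text: two cards of 1024 MiB each.
var exampleRequest = GPUScheduleRequest{CardCount: 2, MemPerCardMiB: 1024}
```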
After the GPU scheduling request is received, a target GPU that satisfies it can be determined from the plurality of GPUs according to the GPU resource information.
After the target GPU is determined, a target container on the target GPU can be created, completing the resource scheduling so that the target container can run.
In the embodiment of the invention, GPU resource information of a plurality of GPUs, including GPUs to which containers have already been allocated, is acquired; a GPU scheduling request is received; a target GPU is determined from the plurality of GPUs according to the GPU resource information; and a target container is created for the target GPU to process the GPU scheduling request. This realizes shared scheduling of multi-card GPU resources and makes full use of the GPU resource information of cards that already host containers, so that the resources of one or more GPU cards can be shared and the utilization efficiency of GPU resources is improved.
Referring to fig. 2, a flowchart illustrating steps of another data processing method according to an embodiment of the present invention is shown, which may specifically include the following steps:
After a GPU scheduling request is received, it carries corresponding resource demand information, which may include the number of GPU cards the user needs to schedule and the video memory requirements each card must meet; the resource demand information corresponding to the GPU scheduling request can therefore be determined.
For example, when a user needs two GPUs with a video memory size of 1024 MiB each, a corresponding GPU scheduling request can be sent to the scheduler; the resource demand information for this request is that two GPU cards are needed and each card must provide 1024 MiB of video memory.
Once the resource demand information is determined, the available GPU resources are known from the GPU resource information and the resources the user requires are known from the resource demand information.
The target cluster may include a plurality of computing nodes, each containing one or more GPUs; the target computing node can then be determined from the computing nodes of the target cluster according to the GPU resource information and the resource demand information.
In an embodiment of the present invention, the determining a target computing node from a plurality of computing nodes of a target cluster according to the GPU resource information and the resource demand information includes:
respectively determining node resource information of a plurality of computing nodes of the target cluster according to the GPU resource information; and determining a target computing node from a plurality of computing nodes of the target cluster according to the node resource information and the resource demand information.
In practical applications, the GPU resource information includes the available video memory of each GPU, so the GPU resource information of the GPUs on every computing node in the target cluster is known and the node resource information of the computing nodes can be determined. A target computing node, that is, a computing node whose node resource information can satisfy the resource demand information, can then be determined from the computing nodes of the target cluster according to the node resource information and the resource demand information.
For example, suppose the target cluster has three computing nodes, named N1, N2, and N3, with the following resource allocation:
N1: 1 GPU card with 2048 MiB of video memory, currently all 2048 MiB free.
N2: 2 GPU cards with 2048 MiB of video memory each; one card currently has 0 MiB free, the other 2048 MiB free.
N3: 2 GPU cards with 1024 MiB of video memory each, currently each with all 1024 MiB free.
The user applies for 2 GPU cards with a video memory requirement of 1024 MiB per card, 2048 MiB in total.
N1: N1 can satisfy the total amount of video memory requested, but the user requests two cards, so N1 cannot satisfy the request.
N2: N2 can satisfy the requested number of cards (2) and the total remaining video memory, but not the video memory required on each card, so N2 cannot satisfy the request either.
N3: N3 satisfies every requirement of the request; N3 is therefore selected as the target computing node. A sketch of this filtering logic follows the example.
In an embodiment of the present invention, the determining a target GPU from the target computing node includes:
determining candidate GPUs with the GPU resource information matched with the resource demand information from the target computing node; and determining a target GPU from the candidate GPUs according to the GPU resource information.
After the target computing node is determined, it contains a group of GPUs that can satisfy the resource demand information. Within the target computing node, the GPU resource information of each GPU can be matched against the resource demand information of the user request; the GPUs that match successfully serve as candidate GPUs, and the target GPU is then determined from the candidates according to their GPU resource information.
In an example, the GPU resource information includes the amount of remaining video memory, and the one or more candidate GPUs with the least remaining video memory can be selected as the target GPU.
By selecting the candidate GPU with the least remaining video memory as the target GPU, unallocated video memory is used as fully as possible while the resource requirement is still met, maximizing video memory utilization.
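This best-fit selection can be sketched as follows, continuing the same Go example (with "sort" added to its imports); the function name is an assumption:

```go
// pickTargetGPUs returns the requested number of candidate GPUs with the
// least remaining video memory that still satisfies the per-card demand,
// or nil if there are not enough fitting cards.
func pickTargetGPUs(candidates []GPUInfo, req GPUScheduleRequest) []GPUInfo {
	fit := make([]GPUInfo, 0, len(candidates))
	for _, g := range candidates {
		if g.AvailableMiB() >= req.MemPerCardMiB {
			fit = append(fit, g)
		}
	}
	// Sort by smallest available memory first, i.e., best fit.
	sort.Slice(fit, func(i, j int) bool {
		return fit[i].AvailableMiB() < fit[j].AvailableMiB()
	})
	if len(fit) < req.CardCount {
		return nil
	}
	return fit[:req.CardCount]
}
```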
After the target GPU is determined, a target container for the target GPU may be created; the target container accesses the video memory resources of the target GPU according to the GPU scheduling request. Resource sharing is thereby realized and resource utilization improved.
In an embodiment of the present invention, the creating a target container for the target GPU to process the GPU scheduling request includes:
creating a container group entity for the GPU scheduling request; determining a target GPU identification of the target GPU; updating the target GPU identification to an environment variable of the container group entity; and creating a target container aiming at the target GPU according to the environment variables.
In practical applications, a container group entity may be created for the GPU scheduling request; the container group entity can hold one or more containers. After the target GPU is determined, its target GPU identifier, which may be the GPU's ID number, can be determined and updated into an environment variable of the container group entity to indicate that the GPU with that identifier serves the GPU scheduling request.
After updating the environment variables, a target container for the target GPU may be created from the environment variables.
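The identifier-to-environment step can be as small as the sketch below. The patent only speaks of "an environment variable"; NVIDIA_VISIBLE_DEVICES, the variable the NVIDIA container runtime reads, is used here as one plausible choice, not something the patent specifies:

```go
package main

import (
	"fmt"
	"strings"
)

// gpuEnv renders the allocated GPU identifiers as an environment
// variable entry for the container group entity.
func gpuEnv(targetIDs []string) string {
	return "NVIDIA_VISIBLE_DEVICES=" + strings.Join(targetIDs, ",")
}

func main() {
	fmt.Println(gpuEnv([]string{"GPU0"})) // NVIDIA_VISIBLE_DEVICES=GPU0
}
```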
In an embodiment of the present invention, the creating a container group entity for the GPU scheduling request includes:
generating container group information corresponding to the GPU scheduling request; establishing a binding relationship between the container group information and the target computing node; and creating a container group entity aiming at the GPU scheduling request according to the binding relationship.
In practical application, container group information corresponding to the GPU scheduling request may be generated, and then a binding relationship between the container group information and the target computing node may be established, and a container group entity for the GPU scheduling request may be created according to the binding relationship.
In one example, if no GPU resource information has been reported when the user sends the GPU scheduling request, or the reported GPU resource information shows no remaining resources, or the remaining resources are smaller than the amount corresponding to the resource demand information, the scheduler rejects the GPU scheduling request and performs no GPU scheduling.
In the embodiment of the invention, GPU resource information of a plurality of GPUs, including GPUs to which containers have already been allocated, is acquired; a GPU scheduling request is received; the resource demand information corresponding to the request is determined; a target computing node is determined from the computing nodes of the target cluster according to the GPU resource information and the resource demand information; a target GPU is determined from the target computing node; and a target container for the target GPU is created to process the GPU scheduling request. This realizes shared scheduling of multi-card GPU resources and makes full use of the GPU resource information of cards that already host containers, so that the resources of one or more GPU cards can be shared and the utilization efficiency of GPU resources is improved.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
The invention is illustrated below with reference to FIG. 3:
As shown in FIG. 3, the process of scheduling GPU resources in a Kubernetes cluster may involve a GPU sharing device plugin, a Kubelet component, the Kubernetes scheduler, and a GPU sharing scheduling extension, where the device plugin and the Kubelet component are modules on a computing node.
Kubernetes: a portable, extensible, open-source platform for managing containerized workloads and services that facilitates declarative configuration and automation.
GPU sharing scheduling extension: using the Kubernetes scheduler extender mechanism, it judges during the scheduler's Filter (filtering and screening) and Bind (binding) phases whether a computing node can provide enough GPU resources, and during Bind it records the GPU allocation result in the pod (container group) through an annotation so that subsequent Filter calls can check the allocation.
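A minimal sketch of such an extension's Filter endpoint follows. The request and response structs are simplified stand-ins for the Kubernetes extender wire types (ExtenderArgs / ExtenderFilterResult); the JSON field names, URL path, port, and the nodeHasCapacity helper are assumptions for illustration:

```go
package main

import (
	"encoding/json"
	"net/http"
)

// Simplified mirror of the scheduler extender filter protocol.
type filterArgs struct {
	NodeNames []string `json:"nodenames"`
}

type filterResult struct {
	NodeNames   []string          `json:"nodenames"`
	FailedNodes map[string]string `json:"failedNodes"`
}

// nodeHasCapacity stands in for the real check: per-card available memory
// computed from reported GPU resources minus the allocations recorded in
// pod annotations during Bind.
func nodeHasCapacity(name string) bool { return true }

func filterHandler(w http.ResponseWriter, r *http.Request) {
	var args filterArgs
	if err := json.NewDecoder(r.Body).Decode(&args); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	res := filterResult{FailedNodes: map[string]string{}}
	for _, name := range args.NodeNames {
		if nodeHasCapacity(name) {
			res.NodeNames = append(res.NodeNames, name)
		} else {
			res.FailedNodes[name] = "insufficient per-card GPU memory"
		}
	}
	json.NewEncoder(w).Encode(res)
}

func main() {
	http.HandleFunc("/filter", filterHandler)
	http.ListenAndServe(":8888", nil)
}
```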
GPU sharing device plugin: using the Kubernetes device plugin mechanism, it is called by the Kubelet on each computing node, is responsible for reporting the resources of the multi-card GPUs, and performs the final resource allocation and binding based on the allocation result of the GPU sharing scheduling extension.
Kubelet component: a Kubelet service process runs on each computing node, listens on a default port, receives and executes instructions, and manages pods and the containers within them. Each Kubelet process registers the computing node's information with the API Server, including the host name, parameters overriding the host name, and cloud-provider-specific logic; it periodically reports the node's resource usage to the master node and monitors the resources of the computing node and its containers.
Kubernetes scheduler: mainly responsible for scheduling the resources of the whole cluster; it schedules pods to the optimal worker nodes according to specific scheduling algorithms and policies so that cluster resources are used more reasonably and fully.
Kubernetes API Server: provides HTTP REST interfaces for creating, deleting, watching, and otherwise manipulating Kubernetes resource objects (Pod, RC (Replication Controller), Service, and so on), and serves as the data bus and data center of the whole cluster system.
A Pod is the smallest unit that can be created and deployed in Kubernetes; it is an application instance in the Kubernetes cluster, and its containers are deployed on the same computing node. A Pod comprises one or more containers as well as resources shared by those containers, such as storage and networking. Pods support multiple container environments, Docker being a common one.
Suppose a computing node has 2 GPU cards, each with a video memory size of 16276 MiB, with GPU IDs GPU0 and GPU1.
1. Resource reporting: the GPU sharing device plugin queries, through a specified management library, the number of GPU cards on the current computing node and the video memory size of each card, and reports to the Kubelet component through ListAndWatch() the node's GPU card count (2), each card's video memory size (16276 MiB), and the combined total video memory (32552 MiB) as extended resources (see the device plugin sketch after this walkthrough);
2. The Kubelet component further reports these resources to the Kubernetes API Server;
3. A user sends a GPU resource scheduling request (requesting 1 GPU card with a video memory size of 8138 MiB). If the Kubernetes API Server determines that no computing node has GPU resources, or that the GPU resources are insufficient, the request is rejected; when it determines that enough GPU resources exist, it accepts the request, passes it to the Kubernetes scheduler for processing, and pod group information is generated from the GPU resource scheduling request;
4. The Kubernetes scheduler calls the Filter method of the GPU sharing scheduling extension over HTTP according to the GPU resource scheduling request, so that suitable computing nodes can be screened out from each computing node's reported GPU resources and GPU usage.
The reported GPU resource data is provided by the GPU sharing device plugin, which determines each computing node's card count and per-card video memory size; GPU usage is determined from the allocation information recorded in pod annotations, from which the remaining video memory of each card is calculated.
5. A target computing node is determined among the filtered computing nodes, and the container group information is bound to the target computing node;
6. Meanwhile, the target GPU on the target computing node is determined, here GPU0, its identification information is acquired, and the selected GPU's ID is written into the pod's annotation;
7. The GPU sharing device plugin queries the annotation and acquires the GPU ID;
8. When the Kubelet component receives the binding event of the pod and the computing node, it creates the actual pod entity on the node and calls the Allocate method of the GPU sharing device plugin, which writes the allocated GPU IDs into the environment variables of the pod entity;
9. The Kubelet component creates a target container for GPU0 in the pod entity; the container may occupy only the 8138 MiB of video memory allocated in GPU0.
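The device plugin side of steps 1 and 8 might look like the sketch below. The import path and types are the Kubernetes device plugin API as published in recent releases; the per-MiB reporting granularity, the environment variable name, and the hard-coded GPU ID are assumptions made to keep the sketch self-contained:

```go
package gpushare

import (
	"context"
	"fmt"

	pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"
)

// gpuSharePlugin reports multi-card GPU memory and performs the final
// allocation, in the role the GPU sharing device plugin plays above.
type gpuSharePlugin struct {
	cards map[string]int64 // GPU ID -> video memory in MiB, e.g. GPU0 -> 16276
}

// ListAndWatch (step 1): advertise the combined video memory as countable
// pseudo-devices, one per MiB, which is one way to expose per-card and
// total memory through the device plugin interface. A real plugin keeps
// the stream open and re-sends on changes instead of returning.
func (p *gpuSharePlugin) ListAndWatch(_ *pluginapi.Empty,
	s pluginapi.DevicePlugin_ListAndWatchServer) error {
	var devs []*pluginapi.Device
	for id, mib := range p.cards {
		for i := int64(0); i < mib; i++ {
			devs = append(devs, &pluginapi.Device{
				ID:     fmt.Sprintf("%s-mem-%d", id, i),
				Health: pluginapi.Healthy,
			})
		}
	}
	return s.Send(&pluginapi.ListAndWatchResponse{Devices: devs})
}

// Allocate (step 8): write the GPU ID chosen by the scheduling extension
// (read back from the pod annotation in steps 6-7) into the container's
// environment; GPU0 is hard-coded here for illustration only.
func (p *gpuSharePlugin) Allocate(_ context.Context,
	reqs *pluginapi.AllocateRequest) (*pluginapi.AllocateResponse, error) {
	resp := &pluginapi.AllocateResponse{}
	for range reqs.ContainerRequests {
		resp.ContainerResponses = append(resp.ContainerResponses,
			&pluginapi.ContainerAllocateResponse{
				Envs: map[string]string{"NVIDIA_VISIBLE_DEVICES": "GPU0"},
			})
	}
	return resp, nil
}
```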
Referring to fig. 4, a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention is shown, which may specifically include the following modules:
a GPU resource information acquiring module 401, configured to acquire GPU resource information of a plurality of GPUs, including GPUs to which containers have been allocated;
a GPU scheduling request receiving module 402, configured to receive a GPU scheduling request;
a target GPU determining module 403, configured to determine a target GPU from the multiple GPUs according to the GPU resource information;
a target container creation module 404, configured to create a target container for the target GPU to process the GPU scheduling request.
In an embodiment of the present invention, the GPU resource information is available video memory information of the GPU.
In an embodiment of the present invention, the target GPU determining module 403 may include:
the resource demand information determining submodule is used for determining resource demand information corresponding to the GPU scheduling request;
the target computing node determining submodule is used for determining a target computing node from a plurality of computing nodes of a target cluster according to the GPU resource information and the resource demand information;
and the target GPU determining submodule is used for determining a target GPU from the target computing node.
In an embodiment of the present invention, the target computing node determining sub-module may include:
a node resource information determining unit, configured to determine node resource information of multiple computing nodes of the target cluster according to the GPU resource information;
and the target computing node determining unit is used for determining a target computing node from a plurality of computing nodes of the target cluster according to the node resource information and the resource demand information.
In an embodiment of the present invention, the target GPU determining sub-module may include:
a candidate GPU determining unit, configured to determine, from the target computing node, a candidate GPU for which the GPU resource information matches the resource requirement information;
and the target GPU determining unit is used for determining a target GPU from the candidate GPUs according to the GPU resource information.
In an embodiment of the present invention, the target container creating module 404 includes:
a container group entity creating submodule, configured to create a container group entity for the GPU scheduling request;
the target GPU identification determining submodule is used for determining a target GPU identification of the target GPU;
an environment variable updating submodule, configured to update the target GPU identifier to an environment variable of the container group entity;
and the target container creating submodule is used for creating a target container aiming at the target GPU according to the environment variables.
In an embodiment of the present invention, the container group entity creating sub-module may include:
a container group information generating unit, configured to generate container group information corresponding to the GPU scheduling request;
a binding relationship establishing unit, configured to establish a binding relationship between the container group information and the target computing node;
and the container group entity creating unit is used for creating a container group entity aiming at the GPU scheduling request according to the binding relationship.
An embodiment of the present invention also provides a server, which may include a processor, a memory, and a computer program stored on the memory and capable of running on the processor; when executed by the processor, the computer program implements the steps of the data processing method described above.
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps of the above data processing method.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another without necessarily requiring or implying any actual such relationship or order between those entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or terminal that comprises the element.
The method and apparatus for data processing provided above have been described in detail. Specific examples have been used herein to illustrate the principles and embodiments of the present invention, and the above description of the embodiments is only intended to help understand the method and its core ideas. Meanwhile, those skilled in the art may, following the idea of the present invention, vary the specific embodiments and the application scope. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (10)
1. A method of data processing, the method comprising:
acquiring GPU resource information of a plurality of GPUs, including GPUs to which containers have been allocated;
receiving a GPU scheduling request;
determining a target GPU from the plurality of GPUs according to the GPU resource information;
creating a target container for the target GPU to process the GPU scheduling request.
2. The method of claim 1, wherein determining the target GPU from the plurality of GPUs according to the GPU resource information comprises:
determining resource demand information corresponding to the GPU scheduling request;
determining a target computing node from a plurality of computing nodes of a target cluster according to the GPU resource information and the resource demand information;
and determining a target GPU from the target computing node.
3. The method of claim 2, wherein determining a target computing node from a plurality of computing nodes of a target cluster according to the GPU resource information and the resource demand information comprises:
respectively determining node resource information of a plurality of computing nodes of the target cluster according to the GPU resource information;
and determining a target computing node from a plurality of computing nodes of the target cluster according to the node resource information and the resource demand information.
4. The method according to claim 2 or 3, wherein the determining a target GPU from the target computing node comprises:
determining candidate GPUs with the GPU resource information matched with the resource demand information from the target computing node;
and determining a target GPU from the candidate GPUs according to the GPU resource information.
5. The method according to claim 2 or 3, wherein the creating a target container for the target GPU to process the GPU scheduling request comprises:
creating a container group entity for the GPU scheduling request;
determining a target GPU identification of the target GPU;
updating the target GPU identification to an environment variable of the container group entity;
and creating a target container aiming at the target GPU according to the environment variables.
6. The method of claim 5, wherein creating the container group entity for the GPU scheduling request comprises:
generating container group information corresponding to the GPU scheduling request;
establishing a binding relationship between the container group information and the target computing node;
and creating a container group entity aiming at the GPU scheduling request according to the binding relationship.
7. The method of claim 1, wherein the GPU resource information is available video memory information of a GPU.
8. An apparatus for data processing, the apparatus comprising:
the GPU resource information acquisition module is used for acquiring GPU resource information of a plurality of GPUs, including GPUs to which containers have been allocated;
the GPU scheduling request receiving module is used for receiving a GPU scheduling request;
the target GPU determining module is used for determining a target GPU from the multiple GPUs according to the GPU resource information;
a target container creation module to create a target container for the target GPU to process the GPU scheduling request.
9. A server comprising a processor, a memory and a computer program stored on the memory and capable of running on the processor, the computer program, when executed by the processor, implementing a method of data processing according to any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out a method of data processing according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011348812.7A CN112463375B (en) | 2020-11-26 | 2020-11-26 | Data processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112463375A true CN112463375A (en) | 2021-03-09 |
CN112463375B CN112463375B (en) | 2024-07-19 |
Family
ID=74808627
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011348812.7A Active CN112463375B (en) | 2020-11-26 | 2020-11-26 | Data processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112463375B (en) |
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10109030B1 (en) * | 2016-12-27 | 2018-10-23 | EMC IP Holding Company LLC | Queue-based GPU virtualization and management system |
US10275851B1 (en) * | 2017-04-25 | 2019-04-30 | EMC IP Holding Company LLC | Checkpointing for GPU-as-a-service in cloud computing environment |
EP3376361A2 (en) * | 2017-10-19 | 2018-09-19 | Pure Storage, Inc. | Ensuring reproducibility in an artificial intelligence infrastructure |
CN109144734A (en) * | 2018-09-12 | 2019-01-04 | 郑州云海信息技术有限公司 | A kind of container resource quota distribution method and device |
CN109376009A (en) * | 2018-09-26 | 2019-02-22 | 郑州云海信息技术有限公司 | A method and device for sharing resources |
CN111475277A (en) * | 2019-01-23 | 2020-07-31 | 阿里巴巴集团控股有限公司 | Resource allocation method, system, equipment and machine readable storage medium |
CN110457135A (en) * | 2019-08-09 | 2019-11-15 | 重庆紫光华山智安科技有限公司 | A kind of method of resource regulating method, device and shared GPU video memory |
CN110688218A (en) * | 2019-09-05 | 2020-01-14 | 广东浪潮大数据研究有限公司 | Resource scheduling method and device |
CN110941481A (en) * | 2019-10-22 | 2020-03-31 | 华为技术有限公司 | Resource scheduling method, device and system |
CN111506404A (en) * | 2020-04-07 | 2020-08-07 | 上海德拓信息技术股份有限公司 | Kubernetes-based shared GPU (graphics processing Unit) scheduling method |
CN111880936A (en) * | 2020-07-31 | 2020-11-03 | 广州华多网络科技有限公司 | Resource scheduling method and device, container cluster, computer equipment and storage medium |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022188578A1 (en) * | 2021-03-12 | 2022-09-15 | 山东英信计算机技术有限公司 | Method and system for multiple services to share same gpu, and device and medium |
CN113127192A (en) * | 2021-03-12 | 2021-07-16 | 山东英信计算机技术有限公司 | Method, system, device and medium for sharing same GPU by multiple services |
CN113127192B (en) * | 2021-03-12 | 2023-02-28 | 山东英信计算机技术有限公司 | A method, system, device and medium for multiple services to share the same GPU |
CN113110938A (en) * | 2021-05-08 | 2021-07-13 | 网易(杭州)网络有限公司 | Resource allocation method and device, computer equipment and storage medium |
CN113110938B (en) * | 2021-05-08 | 2023-08-29 | 网易(杭州)网络有限公司 | Resource allocation method and device, computer equipment and storage medium |
CN113626134A (en) * | 2021-06-29 | 2021-11-09 | 广东浪潮智慧计算技术有限公司 | Resource replication method, device, equipment and computer readable storage medium |
CN113626134B (en) * | 2021-06-29 | 2024-02-13 | 广东浪潮智慧计算技术有限公司 | A resource copying method, device, equipment and computer-readable storage medium |
CN114217917A (en) * | 2021-06-30 | 2022-03-22 | 山东海量信息技术研究院 | Host scheduling method, device, equipment and storage medium |
CN115080207A (en) * | 2021-07-09 | 2022-09-20 | 北京金山数字娱乐科技有限公司 | Task processing method and device based on container cluster |
CN113900799A (en) * | 2021-09-08 | 2022-01-07 | 北京奇艺世纪科技有限公司 | Computing resource allocation method and device, electronic equipment and storage medium |
CN114565502A (en) * | 2022-03-08 | 2022-05-31 | 重庆紫光华山智安科技有限公司 | GPU resource management method, scheduling method, device, electronic equipment and storage medium |
CN114840344A (en) * | 2022-05-19 | 2022-08-02 | 银河麒麟软件(长沙)有限公司 | GPU equipment resource allocation method and system based on kubernetes |
CN114675976A (en) * | 2022-05-26 | 2022-06-28 | 深圳前海环融联易信息科技服务有限公司 | GPU sharing method, device, equipment and medium based on kubernets |
CN114706690A (en) * | 2022-06-06 | 2022-07-05 | 浪潮通信技术有限公司 | Method and system for sharing GPU (graphics processing Unit) by Kubernetes container |
CN115460075A (en) * | 2022-09-14 | 2022-12-09 | 深圳前海环融联易信息科技服务有限公司 | Multi-network mode implementation method, device, equipment and medium based on cloud-native |
CN116258622A (en) * | 2023-02-16 | 2023-06-13 | 青软创新科技集团股份有限公司 | A container-based GPU allocation method, device, electronic device and medium |
Also Published As
Publication number | Publication date |
---|---|
CN112463375B (en) | 2024-07-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |