CN117687804B - Resource management method, device, system and storage medium - Google Patents
Resource management method, device, system and storage medium
- Publication number
- CN117687804B CN117687804B CN202410156056.XA CN202410156056A CN117687804B CN 117687804 B CN117687804 B CN 117687804B CN 202410156056 A CN202410156056 A CN 202410156056A CN 117687804 B CN117687804 B CN 117687804B
- Authority
- CN
- China
- Prior art keywords
- gpu
- resource
- virtual
- virtual gpu
- target virtual
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000007726 management method Methods 0.000 title claims abstract description 91
- 238000003860 storage Methods 0.000 title claims abstract description 11
- 238000000034 method Methods 0.000 claims abstract description 32
- 238000013508 migration Methods 0.000 claims description 24
- 230000005012 migration Effects 0.000 claims description 24
- 238000004364 calculation method Methods 0.000 claims description 22
- 238000011084 recovery Methods 0.000 claims description 13
- 238000013507 mapping Methods 0.000 claims description 10
- 238000004806 packaging method and process Methods 0.000 claims description 6
- 238000004590 computer program Methods 0.000 claims description 5
- 238000005516 engineering process Methods 0.000 abstract description 5
- 238000010586 diagram Methods 0.000 description 12
- 230000008569 process Effects 0.000 description 11
- 238000004891 communication Methods 0.000 description 7
- 238000013468 resource allocation Methods 0.000 description 5
- 230000006870 function Effects 0.000 description 4
- 238000012545 processing Methods 0.000 description 4
- 230000005236 sound signal Effects 0.000 description 4
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000000638 solvent extraction Methods 0.000 description 2
- 230000003068 static effect Effects 0.000 description 2
- 238000013459 approach Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000004140 cleaning Methods 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 239000002699 waste material Substances 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/485—Task life-cycle, e.g. stopping, restarting, resuming execution
- G06F9/4856—Task life-cycle, e.g. stopping, restarting, resuming execution resumption being on a different machine, e.g. task migration, virtual machine migration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/504—Resource capping
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Stored Programmes (AREA)
Abstract
The present disclosure relates to the field of computer technology, and provides a resource management method, device, system and storage medium. The method includes: receiving a resource expansion instruction, where the resource expansion instruction is used to expand the computing resources of a target virtual GPU, the target virtual GPU being one of a plurality of virtual GPUs configured in a first GPU; migrating the target virtual GPU to a second GPU when the sum of the computing resources corresponding to the plurality of virtual GPUs plus the expansion resource amount indicated by the resource expansion instruction exceeds the computing resource upper limit of the first GPU; and expanding the computing resources of the target virtual GPU in the second GPU based on the resource expansion instruction. In this way, when the computing resources of the first GPU are insufficient to expand the computing resources of the target virtual GPU, the target virtual GPU can be migrated to the second GPU without terminating its computing task, so that its computing resources are flexibly expanded in the second GPU.
Description
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, an apparatus, a system, and a storage medium for resource management.
Background
Currently, a physical GPU may be split into multiple virtual GPUs through GPU (Graphics Processing Unit) virtualization technology, so that the multiple virtual GPUs share the computing resources of the physical GPU. The computing resources include video memory resources and computing power resources.
However, after a physical GPU is split into multiple virtual GPUs, the computing resources that each virtual GPU can use are fixed. If a client program needs to expand the computing resources of a virtual GPU, it is usually necessary to re-divide the resources of the physical GPU and restart the computing task being executed in the virtual GPU. This approach in the related art therefore limits virtual GPU resource allocation and offers low flexibility.
Disclosure of Invention
The present disclosure aims to provide a resource management method, device, system and storage medium that improve the flexibility of virtual GPU resource allocation.
To achieve the above object, a first aspect of embodiments of the present disclosure provides a resource management method, including:
Receiving a resource expansion instruction, wherein the resource expansion instruction is used for expanding the computing resource of a target virtual GPU, and the target virtual GPU is one of a plurality of virtual GPUs configured in a first GPU;
migrating the target virtual GPU to a second GPU under the condition that the sum of the computing resources corresponding to the plurality of virtual GPUs plus the extended resource quantity indicated by the resource expansion instruction exceeds the upper limit of computing resources of the first GPU;
And expanding the computing resources of the target virtual GPU in the second GPU based on the resource expansion instruction.
Optionally, the migrating the target virtual GPU to a second GPU includes:
packaging the context information of the target virtual GPU to generate a virtual device file, and deleting the target virtual GPU in the first GPU;
creating a new virtual GPU in the second GPU;
And configuring the new virtual GPU into the target virtual GPU based on the context information in the virtual device file.
Optionally, the configuring the new virtual GPU to the target virtual GPU based on the context information in the virtual device file includes:
calling a preset recovery interface based on the context information in the virtual device file to configure the new virtual GPU into the target virtual GPU;
The context information comprises operation data representing the operation performed by the target virtual GPU and address data for scheduling the target virtual GPU; the preset recovery interface is used for distributing a corresponding mapping address to the new virtual GPU according to the address data and configuring the new virtual GPU into a corresponding operation state according to the operation data.
Optionally, before migrating the target virtual GPU to the second GPU, the method further comprises:
setting the access state of the target virtual GPU to be inaccessible;
After migrating the target virtual GPU to the second GPU, the method further comprises:
and setting the access state of the target virtual GPU to be accessible.
Optionally, the computing resource includes at least one of a memory resource and a computing power resource, and the resource expansion instruction indicates a memory size for expanding the memory resource and/or a computing power ratio size for expanding the computing power resource.
Optionally, the computing resources include computing power resources, and the extended resource quantity characterizes a computing power proportion size to be extended;
after receiving the resource expansion instruction, the method further comprises:
And adjusting the set computing power threshold of the target virtual GPU based on the resource expansion instruction under the condition that the sum of the computing power resources in use corresponding to the plurality of virtual GPUs plus the computing power ratio indicated by the resource expansion instruction does not exceed the upper limit of computing power resources of the first GPU.
Optionally, the computing resource includes a memory resource, and the extended resource quantity characterizes a memory size to be extended;
after receiving the resource expansion instruction, the method further comprises:
And adjusting the set total video memory of the target virtual GPU based on the resource expansion instruction under the condition that the sum of the video memory resources corresponding to the plurality of virtual GPUs plus the video memory size indicated by the resource expansion instruction does not exceed the upper limit of video memory resources of the first GPU.
A second aspect of the embodiments of the present disclosure provides a resource management device, including:
The system comprises a receiving module, a first GPU and a second GPU, wherein the receiving module is used for receiving a resource expansion instruction, the resource expansion instruction is used for expanding the computing resource of a target virtual GPU, and the target virtual GPU is one of a plurality of virtual GPUs configured in the first GPU;
The migration module is used for migrating the target virtual GPU to a second GPU under the condition that the sum of the computing resources corresponding to the plurality of virtual GPUs plus the value of the expansion resource quantity indicated by the resource expansion instruction exceeds the upper limit of the computing resources of the first GPU;
and the expansion module is used for expanding the computing resources of the target virtual GPU in the second GPU based on the resource expansion instruction.
A third aspect of the disclosed embodiments provides a resource management system, comprising:
A virtual device management module;
A virtual GPU, which is operated based on the computing resources of the first GPU or the second GPU;
the virtual device management module is configured to perform the resource management method provided in any one of the first aspects.
A fourth aspect of embodiments of the present disclosure provides a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the resource management method provided in any of the first aspects of the present disclosure.
According to the technical scheme, when the computing resources of the first GPU are insufficient for expanding the computing resources of the target virtual GPU with the resource expansion requirements, the target virtual GPU can be migrated to the second GPU on the basis of not terminating the computing tasks, and the computing resources of the target virtual GPU are expanded in the second GPU based on the resource expansion instructions, so that flexible expansion of the computing resources of the target virtual GPU is realized.
Additional features and advantages of the present disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate the disclosure and together with the description serve to explain, but do not limit the disclosure. In the drawings:
FIG. 1 is a flow chart illustrating a method of resource management, according to an exemplary embodiment.
Fig. 2 is a flow chart illustrating a method of resource management according to another exemplary embodiment.
FIG. 3 is a schematic diagram illustrating a target virtual GPU migration, according to an example embodiment.
Fig. 4 is a block diagram illustrating a resource management device according to an exemplary embodiment.
FIG. 5 is a schematic diagram of a resource management system, according to an example embodiment.
FIG. 6 is a schematic diagram illustrating a video memory control module controlling video memory resources according to an example embodiment.
FIG. 7 is a schematic diagram illustrating a computing force control module controlling computing force resources according to an example embodiment.
Fig. 8 is a block diagram of an electronic device, according to an example embodiment.
Detailed Description
Specific embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating and illustrating the disclosure, are not intended to limit the disclosure.
As noted in the background, after the physical GPU is split into multiple virtual GPUs, the computational resources that each virtual GPU can use are fixed. If the user end program needs to expand the computing resources of the virtual GPU, it is often necessary to re-divide the resources of the physical GPU and restart the executing computing task in the virtual GPU.
For example, when the computing resources of a physical GPU are allocated evenly, if 10 virtual GPUs are obtained from one physical GPU, each virtual GPU is fixedly allocated 1/10 of the physical GPU's computing resources. In this case, if a client program needs to expand the computing resources of the virtual GPU it uses, the virtual GPU cannot be expanded upward even when the remaining 9/10 of the computing resources sit unused; instead, the physical GPU must be re-partitioned. However, re-partitioning the resources of the physical GPU also means that the computing tasks being executed in the virtual GPUs need to be restarted.
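The fixed-partition limitation in the example above can be sketched as a small feasibility check. This is only an illustration; the names and the normalized capacity of 1.0 are assumptions, not part of the disclosure:

```python
# Illustrative sketch of the fixed 1/10 partitioning described above.
PHYSICAL_TOTAL = 1.0                                  # one physical GPU's compute
allocations = {f"vgpu{i}": 0.1 for i in range(10)}    # ten equal fixed slices

def can_expand_in_place(allocations, extra, total=PHYSICAL_TOTAL):
    # In-place expansion is possible only if the already-allocated sum
    # plus the requested extra still fits within the physical GPU.
    return sum(allocations.values()) + extra <= total

# All slices are already handed out, so no vGPU can grow without
# re-partitioning, even if the other nine slices sit idle:
print(can_expand_in_place(allocations, 0.1))  # False
```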
In practical deployments, GPU computing service providers tend to operate more than one physical GPU. However, computing resources are difficult to share between different physical GPUs, as doing so involves complex address scheduling. On this basis, the same problem remains when a client program needs to expand the computing resources of the virtual GPU it uses: even if other unused physical GPUs exist in the resource pool of the GPU computing service provider, their computing resources cannot simply be allocated to the virtual GPU; instead, the physical GPU resources still need to be re-divided, and the computing task being executed in the virtual GPU must be restarted.
The resource scheduling approach of the related art therefore limits virtual GPU resource allocation, increases the burden of allocation, offers low scheduling flexibility, and cannot fully utilize idle computing resources in the resource pool.
In view of this, embodiments of the present disclosure provide a method, an apparatus, a system, and a storage medium for resource management, where when the computing resources of a first GPU are insufficient for expanding the computing resources of a target virtual GPU having a resource expansion requirement, the target virtual GPU may be migrated to a second GPU without terminating a computing task, and the computing resources of the target virtual GPU are expanded in the second GPU based on a resource expansion instruction, so as to implement flexible expansion of the computing resources of the target virtual GPU.
Referring to fig. 1, fig. 1 is a flowchart illustrating a resource management method according to an exemplary embodiment. As shown in fig. 1, the resource management method may be applied to a computing device, such as a mobile phone, a tablet, a computer, etc., and the resource management method provided by the embodiments of the present disclosure may include steps S101 to S103.
In step S101, a resource expansion instruction is received, where the resource expansion instruction is used to expand a computing resource of a target virtual GPU, and the target virtual GPU is one of multiple virtual GPUs configured in the first GPU.
Wherein the first GPU is a physical GPU. Through the GPU virtualization technology, the first GPU may be split into multiple virtual GPUs, and each virtual GPU is configured with available computing resources. The target virtual GPU is one of a plurality of virtual GPUs configured in the first GPU. In one possible implementation, the target virtual GPU is a virtual GPU used by a client program that sends resource expansion instructions.
In step S102, when the sum of the computing resources corresponding to the plurality of virtual GPUs plus the amount of the extended resources indicated by the resource extension instruction exceeds the upper limit of the computing resources of the first GPU, the target virtual GPU is migrated to the second GPU.
The sum of the computing resources corresponding to the virtual GPUs characterizes the computing resources already allocated in the first GPU; whether or not those allocated resources are currently in use, they cannot be reassigned to the target virtual GPU without re-dividing the resources of the first GPU. The upper limit of computing resources of the first GPU characterizes the total computing resources the first GPU owns. The second GPU is a physical GPU.
In one possible implementation, the second GPU is a physical GPU that is free from the resource pool of the GPU computing service provider. The idle physical GPUs in the resource pool may refer to physical GPUs with computing resources not allocated, or refer to physical GPUs with allocatable computing resources larger than the computing resources corresponding to the target virtual GPU plus the value of the extended resource quantity indicated by the resource extension instruction.
In an embodiment, when the sum of the computing resources corresponding to the plurality of virtual GPUs plus the value of the amount of the extended resources indicated by the resource extension instruction does not exceed the upper limit of the computing resources of the first GPU, it indicates that the first GPU has an allocatable computing resource that can be used to extend the computing resources of the target virtual GPU. In this case, the computing resources of the target virtual GPU may be extended in the first GPU based on the resource extension instruction.
In another embodiment, when the sum of the computing resources corresponding to the plurality of virtual GPUs plus the amount of the extended resources indicated by the resource extension instruction exceeds the upper limit of the computing resources of the first GPU, it indicates that the first GPU has no other allocable computing resources that can be used to extend the computing resources of the target virtual GPU. In this case, the related art often needs to re-perform the resource partitioning of the GPU and restart the computing task being executed in the virtual GPU. In the embodiment of the disclosure, the target virtual GPU is migrated to the second GPU, and the computing resources of the target virtual GPU are extended by using the allocatable computing resources in the second GPU.
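The decision between the two embodiments above can be sketched as follows. The data shapes (each GPU as a dict with an `alloc` map and a `cap` upper limit) are illustrative assumptions, not the disclosed implementation:

```python
def handle_expansion(first_gpu, second_gpu, target, extra):
    """Sketch of the decision in steps S101-S103 under assumed data shapes."""
    allocated = sum(first_gpu["alloc"].values())
    if allocated + extra <= first_gpu["cap"]:
        # The first GPU still has allocatable resources: expand in place.
        first_gpu["alloc"][target] += extra
        return "expanded on first GPU"
    # Otherwise migrate the target vGPU and expand it on the second GPU.
    size = first_gpu["alloc"].pop(target)
    second_gpu["alloc"][target] = size + extra
    return "migrated and expanded on second GPU"
```

Note that the check is made against the *allocated* sum, not the in-use sum, mirroring the point above that allocated-but-idle resources cannot be reassigned without re-partitioning.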
It is worth noting that, in the embodiment of the present disclosure, migration of the target virtual GPU is implemented at the process level, so the computing task being executed in the target virtual GPU does not need to be terminated; this reduces the burden of resource allocation and yields high scheduling flexibility.
In step S103, the computing resources of the target virtual GPU are extended in the second GPU based on the resource extension instruction.
Therefore, when the computing resources of the first GPU are insufficient for expanding the computing resources of the target virtual GPU with the resource expansion requirement, the target virtual GPU can be migrated to the second GPU on the basis of not terminating the computing task, and the computing resources of the target virtual GPU are expanded in the second GPU based on the resource expansion instruction, so that flexible expansion of the computing resources of the target virtual GPU is realized, and the utilization rate of the idle physical GPUs in the resource pool is improved.
In one possible implementation, the computing resources may include at least one of video memory resources and computing power resources. The video memory resource characterizes the video memory usage of the physical GPU; the video memory may be the memory of a graphics card, or a memory space that serves as video memory after unified addressing, such as unified memory. The computing power resource characterizes the computing capability of the device; the total computing power of one physical GPU may be represented as a computing power ratio of 100%, so the computing power resource used by a virtual GPU may be represented as a computing power ratio. In computing operations involving API (Application Programming Interface) calls, the computing power available to the device may be adjusted by adjusting the frequency of API calls.
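The API-call-frequency mechanism mentioned above can be illustrated with a minimal rate limiter. The full-GPU call budget and all names here are assumptions for illustration, not values from the disclosure:

```python
import time

class ApiCallThrottle:
    """Cap a vGPU's computing power by spacing out its API calls.
    `ratio` is the vGPU's computing power ratio (e.g. 0.25 = 25%);
    `full_rate_per_s` is an assumed full-GPU call budget."""
    def __init__(self, ratio, full_rate_per_s=1000.0):
        self.min_interval = 1.0 / (full_rate_per_s * ratio)
        self.next_allowed = 0.0

    def acquire(self):
        # Block until this vGPU's next call slot; a 25% vGPU is thus
        # admitted at one quarter of the full call frequency.
        now = time.monotonic()
        if now < self.next_allowed:
            time.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
```

Raising the set computing power threshold then amounts to shrinking `min_interval`, which is why the expansion can take effect without restarting the task.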
On this basis, the resource expansion instruction may indicate a video memory size for expanding the video memory resource and/or a computing power ratio for expanding the computing power resource. The value represented by the expansion resource amount differs for the different kinds of computing resources. For example, when the resource expansion instruction indicates a video memory size for expanding the video memory resource, the expansion resource amount characterizes the video memory size to be expanded. When the resource expansion instruction indicates a computing power ratio for expanding the computing power resource, the expansion resource amount characterizes the computing power ratio to be expanded. When the resource expansion instruction indicates both a video memory size and a computing power ratio, the expansion resource amount may include a video memory expansion resource amount characterizing the video memory size to be expanded and a computing power expansion resource amount characterizing the computing power ratio to be expanded.
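A possible shape for such an instruction is sketched below; the field names and units are hypothetical, chosen only to mirror the memory-only, compute-only, and combined cases described above:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ResourceExpansionInstruction:
    """Hypothetical resource expansion instruction: one or both of the
    optional fields may be set, matching the three cases in the text."""
    target_vgpu: str
    extra_memory_mb: Optional[int] = None      # video memory size to expand
    extra_compute_pct: Optional[float] = None  # computing power ratio to expand

# A combined instruction carrying both kinds of expansion resource amount:
ins = ResourceExpansionInstruction("vgpu3", extra_memory_mb=2048,
                                   extra_compute_pct=5.0)
```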
In one embodiment, the resource expansion instruction indicates a computing power ratio for expanding the computing power resource, and the expansion resource amount characterizes the computing power ratio to be expanded. On this basis, after step S101, the technical solution provided in the embodiment of the present disclosure may further include:
And adjusting the set power threshold of the target virtual GPU based on the resource expansion instruction under the condition that the sum of the power resources in use corresponding to the plurality of virtual GPUs and the power proportion indicated by the resource expansion instruction do not exceed the power resource upper limit of the first GPU.
The set computing power threshold refers to the amount of computing power resources allocated to the target virtual GPU and characterizes the computing power resources that the target virtual GPU can use. The computing power threshold may be set according to the actual situation, which is not specifically limited in this disclosure.
In another embodiment, the resource expansion instruction indicates a memory size for expanding the memory resource, and the amount of the expanded resource characterizes the memory size to be expanded. On this basis, after the step S101, the technical solution provided in the embodiment of the present disclosure may further include:
and adjusting the set total video memory of the target virtual GPU based on the resource expansion instruction under the condition that the value of the sum of the video memory resources corresponding to the virtual GPUs and the video memory size indicated by the resource expansion instruction does not exceed the upper limit of the video memory resources of the first GPU.
The set total memory refers to the memory resource amount allocated to the target virtual GPU, and represents the memory resource which can be used by the target virtual GPU. The total memory may be set according to actual situations, which is not specifically limited in the present disclosure.
In yet another embodiment, the resource expansion instruction indicates both a video memory size for expanding the video memory resource and a computing power ratio for expanding the computing power resource; the expansion resource amount includes a video memory expansion resource amount characterizing the video memory size to be expanded and a computing power expansion resource amount characterizing the computing power ratio to be expanded. On this basis, after step S101, the technical solution provided in the embodiment of the present disclosure may further include:
And when the value of the sum of the computing power resources corresponding to the plurality of virtual GPUs and the computing power proportion indicated by the resource expansion instruction does not exceed the computing power resource upper limit of the first GPU and the value of the sum of the video memory resources corresponding to the plurality of virtual GPUs and the video memory size indicated by the resource expansion instruction does not exceed the video memory resource upper limit of the first GPU, adjusting the set computing power threshold of the target virtual GPU based on the computing power expansion resource quantity indicated by the resource expansion instruction and adjusting the set video memory total quantity of the target virtual GPU based on the video memory expansion resource quantity indicated by the resource expansion instruction.
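The combined in-place adjustment above can be sketched as a single check over both resource kinds. The caps, units, and dictionary shapes are illustrative assumptions:

```python
def try_adjust_in_place(compute_alloc, memory_alloc, ins,
                        compute_cap=100.0, memory_cap=16384):
    """Raise the set computing power threshold and set total video memory
    in place only when BOTH sums stay within the first GPU's upper limits;
    otherwise signal that migration is required."""
    extra_c = ins.get("compute", 0.0)   # computing power ratio to expand
    extra_m = ins.get("memory", 0)      # video memory (MB) to expand
    if (sum(compute_alloc.values()) + extra_c > compute_cap or
            sum(memory_alloc.values()) + extra_m > memory_cap):
        return "migrate"                # fall back to migration (step S102)
    target = ins["target"]
    compute_alloc[target] = compute_alloc.get(target, 0.0) + extra_c
    memory_alloc[target] = memory_alloc.get(target, 0) + extra_m
    return "adjusted"
```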
Referring to fig. 2, fig. 2 is a flowchart illustrating a resource management method according to another exemplary embodiment. As shown in FIG. 2, the resource management method may include steps S201 to S203.
In step S201, a resource expansion instruction is received, where the resource expansion instruction is used to expand a computing resource of a target virtual GPU, and the target virtual GPU is one of multiple virtual GPUs configured in the first GPU.
The specific embodiment of step S201 may refer to the detailed description of step S101, which is not repeated here. After step S201, if it is determined that the sum of the computing resources corresponding to the virtual GPUs plus the value of the extended resource amount indicated by the resource extension instruction exceeds the upper limit of the computing resources of the first GPU, steps S2021 to S2023 are executed.
In step S2021, the context information of the target virtual GPU is packaged to generate a virtual device file, and the target virtual GPU in the first GPU is deleted.
Context information may refer to the state and parameter information associated with a process and can be used to characterize the running state of that process. For example, the context information of the target virtual GPU may include operation data for the operations the target virtual GPU executes and address data for scheduling the target virtual GPU. The operation data may include the computation data input by the user program, the execution code related to the computing task, and the code execution progress; it can therefore be used to characterize the running state of the computing task being executed in the target virtual GPU.
Thus, by packaging the context information of the target virtual GPU into the virtual device file, the computing task being executed in the target virtual GPU is preserved, and that task can later be restored based on the context information in the virtual device file. In this way, the target virtual GPU and the progress of its computing task can be restored in the second GPU without terminating the computing task, the computing task does not need to be restarted, and flexible expansion of the computing resources of the target virtual GPU is achieved.
In addition, deleting the target virtual GPU in the first GPU can release the computing resources originally used by the target virtual GPU in the first GPU, and the released computing resources can be used for other computing tasks, so that the resource waste is reduced, and the resource utilization rate is improved.
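Step S2021 can be sketched as follows; the JSON file format and all names are assumptions for illustration only, since the disclosure does not specify the virtual device file's layout:

```python
import json
import os
import tempfile

def package_and_delete(vgpus, target):
    """Serialize the target vGPU's context into a 'virtual device file',
    then delete the vGPU from the first GPU so its resources are freed."""
    context = vgpus.pop(target)        # deletion releases the vGPU's resources
    fd, path = tempfile.mkstemp(suffix=".vdev")
    with os.fdopen(fd, "w") as f:
        json.dump({"vgpu": target, "context": context}, f)
    return path
```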
In step S2022, a new virtual GPU is created in the second GPU.
In step S2023, the new virtual GPU is configured as the target virtual GPU based on the context information in the virtual device file.
In step S203, the computing resources of the target virtual GPU are extended in the second GPU based on the resource extension instruction.
The specific embodiment of step S203 may refer to the detailed description of step S103, which is not repeated here.
In one possible implementation manner, the step of configuring the new virtual GPU into the target virtual GPU in step S2023 may include:
A preset recovery interface is called, based on the context information in the virtual device file, to configure the new virtual GPU as the target virtual GPU.
The context information includes operation data representing the operations performed by the target virtual GPU and address data for scheduling the target virtual GPU. The preset recovery interface is used to allocate a corresponding mapping address to the new virtual GPU according to the address data and to configure the new virtual GPU into the corresponding operation state according to the operation data. The mapping addresses include memory mapping addresses and/or video memory mapping addresses.
It will be appreciated that configuring the new virtual GPU into the corresponding operation state according to the operation data may include resuming the computing task that was being executed in the target virtual GPU. Thus, the target virtual GPU and the progress of its computing task can be restored in the second GPU without terminating or restarting the task, realizing flexible expansion of the computing resources of the target virtual GPU.
It should be noted that the preset recovery interface may be configured in the virtual GPU. By calling the preset recovery interface in the new virtual GPU, a memory mapping address and/or a video memory mapping address corresponding to the address data in the context information can be allocated to the new virtual GPU, and the new virtual GPU can be configured into the operation state of the target virtual GPU according to the operation data, so that the new virtual GPU can serve as the target virtual GPU and continue executing the corresponding computing task.
It should also be noted that the target virtual GPU cannot be accessed by the user-side program while it is being migrated. Therefore, before the step of migrating the target virtual GPU to the second GPU in step S102, the technical solution provided by the embodiments of the present disclosure may further include: setting the access state of the target virtual GPU to be inaccessible. After the step of migrating the target virtual GPU to the second GPU in step S102, the technical solution may further include: setting the access state of the target virtual GPU to be accessible.
Thus, during migration, operations from the user-side program on the target virtual GPU are blocked by setting its access state to inaccessible. After migration completes, the user-side program is again allowed to operate on the target virtual GPU by setting its access state to accessible. This prevents user-side operations from interfering with the migration of the target virtual GPU.
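The access-state gating described above can be sketched as a toy model; the class and function names are assumptions of this sketch, not the disclosed implementation.

```python
class VirtualGPU:
    """Toy stand-in for a virtual GPU with an access state flag."""
    def __init__(self, name):
        self.name = name
        self.accessible = True

    def submit(self, op):
        # User-side operations are refused while the vGPU is migrating.
        if not self.accessible:
            raise RuntimeError("vGPU is migrating; access is blocked")
        return f"ran {op} on {self.name}"

def migrate(vgpu, do_migration):
    """Gate client access around the migration steps."""
    vgpu.accessible = False          # before migrating: set inaccessible
    try:
        do_migration(vgpu)           # save, delete, recreate, restore
    finally:
        vgpu.accessible = True       # after migrating: set accessible

gpu = VirtualGPU("vgpu-0")
blocked = []

def fake_migration(v):
    try:
        v.submit("add")              # a client call arriving mid-migration
    except RuntimeError:
        blocked.append(True)         # ...is rejected, as intended

migrate(gpu, fake_migration)
assert blocked == [True]                         # blocked during migration
assert gpu.submit("add") == "ran add on vgpu-0"  # allowed afterwards
```

The `try`/`finally` mirrors the requirement that access must be restored once the migration steps finish.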
Referring to fig. 3, fig. 3 is a schematic diagram illustrating a target virtual GPU migration, according to an example embodiment. As shown in fig. 3, after receiving the resource expansion instruction, if it is determined that the value of the sum of the computing resources corresponding to the plurality of virtual GPUs plus the amount of expansion resources indicated by the resource expansion instruction exceeds the upper limit of the computing resources of the first GPU, the access state of the target virtual GPU is set to be inaccessible, so that operations from the user-side program do not interfere with the migration of the target virtual GPU.
On this basis, the context information of the target virtual GPU can be packaged into a virtual device file by calling a preset recovery interface, the target virtual GPU in the first GPU can be deleted, and a new virtual GPU can be created in the second GPU. The target virtual GPU is then restored in the second GPU based on the virtual device file and the new virtual GPU. Finally, the access state of the target virtual GPU is set to be accessible, allowing the user-side program to operate on it again.
According to the above technical solution, when the computing resources of the first GPU are insufficient to expand the computing resources of the target virtual GPU that has a resource expansion requirement, the target virtual GPU can be migrated to the second GPU without terminating its computing task, and its computing resources can then be expanded in the second GPU based on the resource expansion instruction, realizing flexible expansion of the computing resources of the target virtual GPU.
Based on the same conception, the present disclosure also provides a resource management device. Referring to fig. 4, fig. 4 is a block diagram illustrating a resource management device 400 according to an example embodiment. As shown in fig. 4, the resource management device 400 may include a receiving module 401, a migration module 402, and an expansion module 403.
The receiving module 401 is configured to receive a resource expansion instruction, where the resource expansion instruction is used to expand a computing resource of a target virtual GPU, and the target virtual GPU is one of multiple virtual GPUs configured in the first GPU;
The migration module 402 is configured to migrate the target virtual GPU to the second GPU when the value of the sum of the computing resources corresponding to the plurality of virtual GPUs plus the amount of extended resources indicated by the resource extension instruction exceeds the upper limit of the computing resources of the first GPU;
the expansion module 403 is configured to expand the computing resources of the target virtual GPU in the second GPU based on the resource expansion instruction.
Optionally, the migration module 402 is configured to:
Packaging the context information of the target virtual GPU to generate a virtual device file, and deleting the target virtual GPU in the first GPU;
Creating a new virtual GPU in the second GPU;
based on the context information in the virtual device file, the new virtual GPU is configured as the target virtual GPU.
Optionally, the migration module 402 is configured to:
calling a preset recovery interface based on the context information in the virtual device file to configure the new virtual GPU as the target virtual GPU;
the context information comprises operation data representing the execution operation of the target virtual GPU and address data for scheduling the target virtual GPU; the preset recovery interface is used for distributing a corresponding mapping address for the new virtual GPU according to the address data and configuring the new virtual GPU into a corresponding operation state according to the operation data.
Optionally, the resource management device 400 may further include a setting module, where the setting module is configured to: set the access state of the target virtual GPU to be inaccessible prior to migrating the target virtual GPU to the second GPU. The setting module is further configured to: set the access state of the target virtual GPU to be accessible after migrating the target virtual GPU to the second GPU.
Optionally, the computing resources include at least one of a video memory resource and a computing power resource, and the resource expansion instruction indicates a video memory size for expanding the video memory resource and/or a computing power size for expanding the computing power resource.
Optionally, the computing resources include computing power resources, and the amount of extended resources characterizes a computing power ratio size to be extended. On this basis, the expansion module 403 is configured to:
After receiving the resource expansion instruction, adjust the set computing power threshold of the target virtual GPU based on the resource expansion instruction under the condition that the value of the sum of the computing power resources in use corresponding to the plurality of virtual GPUs plus the computing power proportion indicated by the resource expansion instruction does not exceed the upper limit of the computing power resources of the first GPU.
Optionally, the computing resources include video memory resources, and the amount of expansion resources characterizes the size of the video memory to be expanded. On this basis, the expansion module 403 is configured to:
after the resource expansion instruction is received, adjust the set total video memory of the target virtual GPU based on the resource expansion instruction under the condition that the value of the sum of the video memory resources corresponding to the plurality of virtual GPUs plus the video memory size indicated by the resource expansion instruction does not exceed the upper limit of the video memory resources of the first GPU.
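The two admission checks above (for computing power and for video memory) can be sketched as follows. The function names, the dictionary fields, and the concrete limits are assumptions made for illustration, not the disclosed interfaces.

```python
def can_expand_vram(vgpus, request_mb, gpu_vram_limit_mb):
    """Grant a video-memory expansion only if the total set video memory
    of all vGPUs plus the request stays within the physical GPU's limit."""
    total = sum(v["vram_mb"] for v in vgpus)
    return total + request_mb <= gpu_vram_limit_mb

def can_expand_power(vgpus, request_ratio, power_limit=1.0):
    """Grant a computing-power expansion only if the in-use power ratios
    of all vGPUs plus the requested ratio stay within the limit."""
    total = sum(v["power_in_use"] for v in vgpus)
    return total + request_ratio <= power_limit

# Two vGPUs sharing one physical GPU (illustrative numbers).
vgpus = [{"vram_mb": 4096, "power_in_use": 0.30},
         {"vram_mb": 8192, "power_in_use": 0.40}]

assert can_expand_vram(vgpus, 2048, gpu_vram_limit_mb=16384)       # fits
assert not can_expand_vram(vgpus, 8192, gpu_vram_limit_mb=16384)   # exceeds
assert can_expand_power(vgpus, 0.20)       # 0.70 + 0.20 <= 1.00
assert not can_expand_power(vgpus, 0.40)   # 0.70 + 0.40 >  1.00
```

When a check fails on the first GPU, the disclosed method migrates the target virtual GPU to a second GPU rather than rejecting the expansion outright.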
The specific manner in which the various modules perform operations in the device of the above embodiments has been described in detail in the embodiments of the method and will not be repeated here.
Based on the same conception, the present disclosure also provides a resource management system. Referring to fig. 5, fig. 5 is a schematic diagram of a resource management system 500, according to an example embodiment. As shown in fig. 5, the resource management system 500 may include a virtual device management module 501 and a virtual GPU 502.
The virtual GPU 502 operates based on the computing resources of the first GPU or the second GPU. The first GPU and the second GPU are physical GPUs; through GPU virtualization technology, the first GPU or the second GPU can be divided into a plurality of virtual GPUs, with available computing resources configured for each virtual GPU. Thus, it is to be appreciated that there may be multiple virtual GPUs 502.
The virtual device management module 501 may implement the various embodiments of the resource management methods described above in software, hardware, or a combination of both. In one embodiment, virtual device management module 501 may include an access control module, a resource management module, and a migration management module as shown in FIG. 5.
The access control module is used for controlling whether the virtual GPU can be accessed. Thus, during migration, the access control module sets the access state of the target virtual GPU to inaccessible, blocking operations from the user-side program on it. After migration, the access control module sets the access state back to accessible, allowing the user-side program to operate on the target virtual GPU. This prevents user-side operations from interfering with the migration of the target virtual GPU.
The resource management module is used for managing the upper limit of the computing resources of the virtual GPU. Control of this upper limit can be realized by dynamically setting it and issuing corresponding control instructions to the video memory control module and/or the computing power control module in the virtual GPU.
The migration management module is used for calling preset interfaces to realize migration of the virtual GPU. For example, the preset recovery interface of the foregoing embodiments is called, so that the target virtual device is saved from the first GPU and then restored in the second GPU based on the saved virtual device file, realizing migration of the target virtual GPU.
The specific manner in which the various modules in virtual device management module 501 perform operations has been described in detail in connection with embodiments of resource management methods and will not be described in detail herein.
In one possible implementation, and following the example of FIG. 5, a metadata management module, a video memory control module, a computing power control module, an operation management module, and a virtual address management module are configured in virtual GPU 502.
The metadata management module is used for managing and recording the metadata information of the virtual GPU, where the metadata information includes the video memory resources and computing power resources the virtual GPU is using, the set total video memory, and the set computing power threshold.
Thus, the amount of computing resources that the virtual GPU itself has used can be recorded by the metadata management module and used to support memory and power control.
And the video memory control module is used for controlling the video memory resources of the virtual GPU.
In one embodiment, the video memory control module may count the size of the video memory occupied by the virtual GPU and, based on that size, control the upper limit of the physical video memory the virtual GPU may use, that is, the set total video memory.
Referring to fig. 6, fig. 6 is a schematic diagram illustrating a video memory control module controlling video memory resources according to an exemplary embodiment. After receiving a resource expansion instruction indicating the video memory size for expanding the video memory resources, the virtual device management module 501 sends a corresponding control instruction to the virtual GPU 502 through the resource management module, where the control instruction serves as a new video memory application.
On this basis, the virtual GPU 502 receives the new video memory application and determines whether the value of the sum of the video memory resources corresponding to the plurality of virtual GPUs plus the video memory size indicated by the application exceeds the upper limit of the video memory resources of the first GPU. When it does not, the relevant interface of the first GPU is invoked to allocate the corresponding video memory resources, and the set total video memory in the metadata management module is updated.
In an embodiment, if the hardware device supports unified memory, unified memory may instead be allocated when the value of the sum of the video memory resources corresponding to the plurality of virtual GPUs plus the video memory size indicated by the application exceeds the upper limit of the video memory resources of the first GPU, with host memory and swap-in and swap-out operations used to simulate video memory. If the hardware device does not support unified memory, a resource application failure is returned.
It should be noted that, when the hardware device supports unified memory, device video memory can be simulated by allocating host memory combined with swap-in and swap-out operations. For example, NVIDIA's Managed Memory can support this functionality. However, unified memory management introduces additional performance overhead: data swap-in and swap-out operations consume memory bandwidth, and since bandwidth resources are limited, performance degrades.
Thus, embodiments of the present disclosure provide two ways to raise the resource utilization upper limit of a device. One is to directly expand the video memory upper limit when unified memory is not used, ensuring that subsequent video memory applications succeed. In this case, a new physical GPU, for example the second GPU of the foregoing embodiments, may be introduced through the migration means of the embodiments of the present disclosure to adjust the physical video memory upper limit and increase the proportion of physical video memory within the unified memory, thereby reducing unified memory usage and the number of data swaps into and out of physical video memory, and improving application performance.
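The allocation decision described above (serve from physical video memory, fall back to unified memory where supported, otherwise fail) can be sketched as a small decision function. All names and the string return values are assumptions of this sketch.

```python
def allocate_vram(request_mb, free_physical_mb, supports_unified_memory):
    """Decide how a video-memory request is served when the physical
    video memory may not cover it."""
    if request_mb <= free_physical_mb:
        return "physical"            # normal allocation on the GPU
    if supports_unified_memory:
        # Oversubscribe: back the shortfall with host memory plus
        # swap-in/swap-out, at the cost of extra bandwidth (as with
        # managed memory on hardware that supports it).
        return "unified"
    return "failure"                 # resource application fails

assert allocate_vram(1024, 2048, supports_unified_memory=False) == "physical"
assert allocate_vram(4096, 2048, supports_unified_memory=True) == "unified"
assert allocate_vram(4096, 2048, supports_unified_memory=False) == "failure"
```

Migrating the virtual GPU to a second physical GPU raises `free_physical_mb`, so more requests take the fast "physical" path and fewer rely on unified-memory swapping.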
And the computing power control module is used for controlling the computing power resources being used by the virtual GPU.
In an embodiment, the computing power control module may count the proportion of computing power being used by the virtual GPU and, based on that proportion, control the computing power upper limit of the virtual GPU, that is, the set computing power threshold.
For example, after receiving a resource expansion instruction indicating the computing power proportion for expanding the computing power resources, the virtual device management module 501 sends a corresponding control instruction to the virtual GPU 502 through the resource management module, where the control instruction serves as a computing power adjustment application.
On this basis, the virtual GPU 502 receives the computing power adjustment application and determines whether the value of the sum of the computing power resources in use corresponding to the plurality of virtual GPUs plus the computing power proportion indicated by the application exceeds the upper limit of the computing power resources of the first GPU. When it does not, the relevant interface of the first GPU is invoked to allocate the corresponding computing power resources, and the set computing power threshold recorded in the metadata management module is updated.
In addition, it should be noted that the computing power control module may also adaptively adjust the computing power resources used by the virtual GPU based on the computing operations issued by the user-side program.
Referring to FIG. 7, FIG. 7 is a schematic diagram illustrating a computing power control module controlling computing power resources according to an example embodiment. The virtual GPU 502 receives a computing operation and executes it according to the current scheduling frequency of the virtual GPU, that is, the API call frequency. On this basis, the proportion of computing power currently occupied by the virtual GPU can be queried, and it can be determined whether that proportion exceeds the set computing power threshold. When the occupied proportion does not exceed the set threshold, the scheduling frequency of the virtual GPU is increased; when it exceeds the set threshold, the scheduling frequency of the virtual GPU is reduced.
It will be appreciated that the embodiments of the present disclosure adopt a feedback-based adjustment method. When the user-side program issues a calculation request, the request is dispatched to the virtual GPU at the virtual GPU's current scheduling frequency for execution. On this basis, the embodiments of the present disclosure adaptively adjust the computing capacity of the virtual GPU by querying the proportion of computing power the virtual GPU occupies and comparing it with the set computing power threshold. Flexible, dynamic adjustment of the computing power resources used by the virtual GPU is thus achieved; when the user-side program needs to expand its computing power resources, only the set computing power threshold needs to be adjusted, and the change in execution speed is reflected quickly.
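One feedback step of the threshold-based adjustment can be sketched as follows; the function name, the step size, and the minimum rate are assumptions of this sketch rather than disclosed parameters.

```python
def adjust_dispatch_rate(rate, occupied_ratio, threshold,
                         step=1.0, min_rate=1.0):
    """Feedback step: raise the vGPU's dispatch (API call) frequency while
    its occupied computing-power ratio is at or under the set threshold,
    and lower it once the threshold is exceeded."""
    if occupied_ratio <= threshold:
        return rate + step
    return max(min_rate, rate - step)

rate = 10.0
rate = adjust_dispatch_rate(rate, occupied_ratio=0.35, threshold=0.50)
assert rate == 11.0     # under threshold: speed up
rate = adjust_dispatch_rate(rate, occupied_ratio=0.65, threshold=0.50)
assert rate == 10.0     # over threshold: slow down
```

Expanding the vGPU's computing power then amounts to raising `threshold`; the loop settles at a higher dispatch rate on its own, which is why the adjustment is reflected quickly.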
And the operation management module is used for recording operation data representing the execution operation of the virtual GPU and address data for scheduling the virtual GPU so as to form the context information of the virtual GPU.
It should be noted that the operation management module may screen the operations of the virtual GPU and record all operations that affect the context, where such operations include those involving video memory allocation and/or data object allocation. When recording an operation, the virtual address corresponding to it can be obtained from the virtual address management module, and this address information is packed into the record to form a complete, recoverable operation record. In this way, the operation data representing the operations performed by the virtual GPU and the address data used for scheduling the virtual GPU are recorded, constituting the context information of the virtual GPU.
And the virtual address management module is used for managing the address data of the virtual GPU and associating the address data with the operation executed when the virtual GPU is scheduled.
In one embodiment, management of the virtual GPU's address data may be implemented by hooking the system's memory-mapping call (mmap). By way of example, a switch is added to the hook function: when an operation on the virtual GPU is executed, the hook switch is turned on, the address information allocated by the system is intercepted, and that address information is associated with the operation to form a complete operation record for the virtual GPU. After the operation completes, the switch in the hook function is turned off, ensuring that other address operations of the application are not affected by the hook.
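The switch-gated hook pattern can be sketched in Python using its standard `mmap` module as the wrapped call. This is only an analogy for the described mechanism: the class name and record format are assumptions, and a real implementation would hook the process-level `mmap` system call rather than wrap a library function.

```python
import mmap

class MmapHook:
    """Toggleable hook around mmap: while the switch is on, every mapping
    made for a vGPU operation is recorded so the operation log can later
    be replayed with the same addresses."""
    def __init__(self):
        self.enabled = False
        self.records = []
        self._real_mmap = mmap.mmap      # keep the original entry point

    def mapped(self, length):
        m = self._real_mmap(-1, length)  # anonymous mapping
        if self.enabled:
            self.records.append(("mmap", length))
        return m

hook = MmapHook()

hook.enabled = True                  # vGPU operation begins: open switch
buf = hook.mapped(4096)              # this mapping is intercepted
hook.enabled = False                 # operation done: close the switch
other = hook.mapped(4096)            # unrelated mapping, not recorded

assert hook.records == [("mmap", 4096)]
buf.close()
other.close()
```

Only mappings made while the switch is open enter the record, matching the requirement that the application's other address operations stay unaffected.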
It is understood that the virtual address management module may be configured to ensure that the virtual addresses of the memory and the video memory associated with the virtual GPU do not change during the migration of the virtual GPU.
The virtual device management module 501 is connected to the metadata management module, the video memory control module, the computing power control module, the operation management module, and the virtual address management module through preset interfaces.
The preset interfaces can be used to call the modules in the virtual GPU, realizing the creation, deletion, saving, and recovery functions. The preset interfaces may include, for example, the preset recovery interface of the foregoing embodiments.
It should be noted that the process of creating a virtual GPU may include establishing a mapping relationship between the virtual GPU and a physical GPU and setting the computing power proportion and video memory size available to the virtual GPU. This process may initialize the metadata information of the virtual GPU.
The process of deleting a virtual GPU may include cleaning up the context information created during use of the virtual GPU, removing the mapping relationship between the virtual GPU and the physical GPU, and deleting the virtual GPU object.
The process of saving the virtual GPU may include packaging the context information of the virtual GPU to generate a virtual device file that is used to restore the virtual GPU. For example, the operations recorded in the operation management module, the allocated addresses and the data in the allocated video memory are packed to generate corresponding virtual device files.
The process of restoring a virtual GPU may include using the saved virtual device file to restore the virtual GPU's context information and data. In one embodiment, the recovery process may be implemented by replaying the operation record. During replay, when an operation applies for a virtual address, the recorded address can be assigned through the hooked memory-mapping mechanism, ensuring that the allocated address is consistent with the previous one.
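Replay-based recovery can be sketched as follows; the record format, opcodes, and addresses are invented for illustration and are not the disclosed file layout.

```python
def replay(operation_record):
    """Rebuild a vGPU's state by replaying its recorded operations in
    order. Each record carries the virtual address originally assigned,
    which is reused so addresses stay consistent after migration."""
    state = {}
    for op, addr, payload in operation_record:
        if op == "alloc":
            state[addr] = bytearray(payload)     # payload = size in bytes
        elif op == "write":
            state[addr][:len(payload)] = payload  # restore the old data
    return state

record = [
    ("alloc", 0x1000, 8),         # allocate 8 bytes at the old address
    ("write", 0x1000, b"task"),   # restore the data that lived there
]
state = replay(record)
assert bytes(state[0x1000][:4]) == b"task"
```

Because the record is replayed at the same addresses it was captured at, pointers held by the resumed computing task remain valid on the new physical GPU.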
Therefore, when performing virtual GPU migration, the virtual device management module 501 can migrate the virtual GPU 502 between different physical devices by calling the preset interfaces, and can dynamically adjust the upper limit of the computing resources of the virtual GPU 502 through the resource management module, realizing more flexible resource scheduling.
Through the above technical solution, an in-use virtual GPU can be migrated to a different physical device, the upper limit of its computing resources can be raised, and the virtual GPU can obtain more resources through video memory and computing power adjustment. This achieves the goal of rapid resource expansion, improves resource utilization, and dynamically adjusts the service processing capacity of the application.
Fig. 8 is a block diagram of an electronic device 800, according to an example embodiment. As shown in fig. 8, the electronic device 800 may include: a processor 801, a memory 802. The electronic device 800 may also include one or more of a multimedia component 803, an input/output (I/O) interface 804, and a communication component 805.
The processor 801 is configured to control the overall operation of the electronic device 800 to perform all or part of the steps of the resource management method described above. The memory 802 is used to store various types of data to support operation on the electronic device 800, for example instructions for any application or method operating on the electronic device 800, as well as application-related data such as contact data, sent and received messages, pictures, audio, and video. The memory 802 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk. The multimedia component 803 may include a screen and an audio component, where the screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals; received audio signals may be further stored in the memory 802 or transmitted through the communication component 805. The audio component further comprises at least one speaker for outputting audio signals. The input/output interface 804 provides an interface between the processor 801 and other interface modules, which may be a keyboard, a mouse, buttons, and the like; these buttons may be virtual buttons or physical buttons. The communication component 805 is used for wired or wireless communication between the electronic device 800 and other devices.
Wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, 4G, NB-IoT, eMTC, 5G, or others, or a combination of one or more of them, and is not limited here. Accordingly, the communication component 805 may include a Wi-Fi module, a Bluetooth module, an NFC module, and the like.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application-Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for performing the resource management method described above.
In another exemplary embodiment, a computer readable storage medium is also provided, comprising program instructions which, when executed by a processor, implement the steps of the resource management method described above. For example, the computer readable storage medium may be the memory 802 described above including program instructions executable by the processor 801 of the electronic device 800 to perform the resource management method described above.
In another exemplary embodiment, a computer program product is also provided, comprising a computer program executable by a programmable apparatus, the computer program having code portions for performing the above-described resource management method when executed by the programmable apparatus.
The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings, but the present disclosure is not limited to the specific details of the embodiments described above, and various simple modifications may be made to the technical solutions of the present disclosure within the scope of the technical concept of the present disclosure, and all the simple modifications belong to the protection scope of the present disclosure.
In addition, the specific features described in the foregoing embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, the present disclosure does not further describe various possible combinations.
Moreover, any combination of the various embodiments of the present disclosure is possible as long as it does not depart from the spirit of the present disclosure, and such combinations should likewise be regarded as part of the disclosure of the present disclosure.
Claims (10)
1. A method of resource management, the method comprising:
Receiving a resource expansion instruction, wherein the resource expansion instruction is used for expanding the computing resource of a target virtual GPU, and the target virtual GPU is one of a plurality of virtual GPUs configured in a first GPU;
migrating the target virtual GPU to a second GPU under the condition that the sum of computing resources corresponding to the virtual GPUs and the value of the extended resource quantity indicated by the resource extended instruction exceeds the upper limit of computing resources of the first GPU;
expanding computing resources of the target virtual GPU in the second GPU based on the resource expansion instruction;
The migrating the target virtual GPU to a second GPU includes:
packaging the context information of the target virtual GPU to generate a virtual device file, and deleting the target virtual GPU in the first GPU;
creating a new virtual GPU in the second GPU;
And configuring the new virtual GPU into the target virtual GPU based on the context information in the virtual device file.
2. The resource management method according to claim 1, wherein the configuring the new virtual GPU to the target virtual GPU based on the context information in the virtual device file includes:
calling a preset recovery interface based on the context information in the virtual equipment file to configure the new virtual GPU into the target virtual GPU;
The context information comprises operation data representing the operation performed by the target virtual GPU and address data for scheduling the target virtual GPU; the preset recovery interface is used for distributing a corresponding mapping address to the new virtual GPU according to the address data and configuring the new virtual GPU into a corresponding operation state according to the operation data.
3. The resource management method according to any one of claims 1-2, wherein before migrating the target virtual GPU to the second GPU, the method further comprises:
setting the access state of the target virtual GPU to inaccessible;
and after migrating the target virtual GPU to the second GPU, the method further comprises:
setting the access state of the target virtual GPU to accessible.
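This set-before/restore-after access gating maps naturally onto a context manager. A hedged sketch with a hypothetical `accessible` flag standing in for a real scheduler hook:

```python
from contextlib import contextmanager

@contextmanager
def migration_guard(vgpu):
    """Mark the target vGPU inaccessible for the duration of a migration."""
    vgpu["accessible"] = False     # set before migrating to the second GPU
    try:
        yield vgpu
    finally:
        vgpu["accessible"] = True  # restore access once migration completes
```

The `finally` clause mirrors the claim's ordering guarantee: access is restored even if a migration step fails midway.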
4. The resource management method according to any one of claims 1-2, wherein the computing resources include at least one of a video memory resource and a computing power resource, and the resource expansion instruction indicates a video memory size for expanding the video memory resource and/or a computing power size for expanding the computing power resource.
5. The resource management method according to any one of claims 1-2, wherein the computing resources include computing power resources, and the amount of resource expansion characterizes the proportion of computing power to be expanded;
after receiving the resource expansion instruction, the method further comprises:
adjusting the set computing power threshold of the target virtual GPU based on the resource expansion instruction when the sum of the computing power resources in use by the plurality of virtual GPUs plus the computing power proportion indicated by the resource expansion instruction does not exceed the upper limit of the computing power resources of the first GPU.
6. The resource management method according to any one of claims 1-2, wherein the computing resources include a video memory resource, and the amount of resource expansion characterizes the size of the video memory to be expanded;
after receiving the resource expansion instruction, the method further comprises:
adjusting the set total video memory of the target virtual GPU based on the resource expansion instruction when the sum of the total video memory resources corresponding to the plurality of virtual GPUs plus the video memory size indicated by the resource expansion instruction does not exceed the upper limit of the video memory resources of the first GPU.
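The in-place expansion checks of claims 5 and 6 share one shape: sum the per-vGPU figures, add the requested amount, and compare against the first GPU's limit. A sketch under assumed units (computing power as a fraction of the GPU, video memory in MiB); all names are hypothetical.

```python
def try_expand_power(power_in_use, extra_fraction, power_limit, target, thresholds):
    """Claim 5 style: raise the target's computing power threshold if the GPU has headroom."""
    if sum(power_in_use.values()) + extra_fraction <= power_limit:
        thresholds[target] += extra_fraction
        return True
    return False  # would exceed the first GPU's computing power upper limit

def try_expand_memory(total_mem, extra_mem, mem_limit, target):
    """Claim 6 style: grow the target's set total video memory if the GPU has headroom."""
    if sum(total_mem.values()) + extra_mem <= mem_limit:
        total_mem[target] += extra_mem
        return True
    return False  # would exceed the first GPU's video memory upper limit
```

Note the asymmetry the claims draw: the power check sums resources *in use*, while the memory check sums the *set totals*, since reserved video memory is committed whether or not it is touched.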
7. A resource management device, characterized in that the resource management device comprises:
a receiving module, configured to receive a resource expansion instruction, wherein the resource expansion instruction is used for expanding computing resources of a target virtual GPU, and the target virtual GPU is one of a plurality of virtual GPUs configured in a first GPU;
a migration module, configured to migrate the target virtual GPU to a second GPU when the sum of the computing resources corresponding to the plurality of virtual GPUs plus the amount of resource expansion indicated by the resource expansion instruction exceeds the upper limit of the computing resources of the first GPU; and
an expansion module, configured to expand the computing resources of the target virtual GPU in the second GPU based on the resource expansion instruction;
wherein the migration module is configured to:
package the context information of the target virtual GPU to generate a virtual device file, and delete the target virtual GPU from the first GPU;
create a new virtual GPU in the second GPU; and
configure the new virtual GPU as the target virtual GPU based on the context information in the virtual device file.
8. A resource management system, characterized by comprising:
a virtual device management module; and
a virtual GPU, which runs on the computing resources of the first GPU or the second GPU;
wherein the virtual device management module is configured to perform the resource management method according to any one of claims 1-6.
9. The resource management system according to claim 8, wherein the virtual GPU is configured with a metadata management module, a video memory control module, a computing power control module, an operation management module, and a virtual address management module;
the metadata management module is used for managing and recording metadata information of the virtual GPU, wherein the metadata information comprises the video memory resources in use, the computing power resources in use, the set total video memory, and the set computing power threshold of the virtual GPU;
the video memory control module is used for controlling the video memory resources of the virtual GPU;
the computing power control module is used for controlling the computing power resources being used by the virtual GPU;
the operation management module is used for recording operation data representing the operations performed by the virtual GPU and address data used for scheduling the virtual GPU, so as to form the context information of the virtual GPU;
the virtual address management module is used for managing the address data of the virtual GPU and associating the address data with the operations executed when the virtual GPU is scheduled;
and the virtual device management module is connected with the metadata management module, the video memory control module, the computing power control module, the operation management module, and the virtual address management module through preset interfaces.
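One way to picture these per-vGPU modules is as plain record types: metadata on one side, operation and address records feeding the context information on the other. The dataclass fields below are hypothetical stand-ins for driver-backed state, not the patent's data layout.

```python
from dataclasses import dataclass, field

@dataclass
class Metadata:
    """Metadata management module: the four figures claim 9 enumerates."""
    memory_in_use: int = 0        # video memory resources in use
    power_in_use: float = 0.0     # computing power resources in use
    total_memory: int = 0         # set total video memory
    power_threshold: float = 0.0  # set computing power threshold

@dataclass
class VirtualGPU:
    metadata: Metadata = field(default_factory=Metadata)
    operations: list = field(default_factory=list)  # operation management module
    addresses: dict = field(default_factory=dict)   # virtual address management module

    def record(self, op, addr):
        """Associate an operation with the address used when it was scheduled."""
        self.operations.append(op)
        self.addresses[op] = addr

    def context_info(self):
        """Operation data plus address data, as consumed by the recovery interface."""
        return {"operation_data": list(self.operations),
                "address_data": dict(self.addresses)}
```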
10. A non-transitory computer-readable storage medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the steps of the resource management method according to any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410156056.XA CN117687804B (en) | 2024-02-02 | 2024-02-02 | Resource management method, device, system and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117687804A CN117687804A (en) | 2024-03-12 |
CN117687804B true CN117687804B (en) | 2024-06-14 |
Family
ID=90130491
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107273182A (en) * | 2017-06-06 | 2017-10-20 | 武汉智云方达信息科技有限公司 | A kind of resources of virtual machine dynamic expanding method and system |
CN112825042A (en) * | 2019-11-20 | 2021-05-21 | 上海商汤智能科技有限公司 | Resource management method and device, electronic equipment and storage medium |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012066640A1 (en) * | 2010-11-16 | 2012-05-24 | 株式会社日立製作所 | Computer system, migration method, and management server |
CN111857960A (en) * | 2020-07-27 | 2020-10-30 | 浪潮云信息技术股份公司 | Unified management method and system for computing resources |
CN115437739A (en) * | 2021-06-02 | 2022-12-06 | 伊姆西Ip控股有限责任公司 | Resource management method of virtualization system, electronic device and computer program product |
CN116400999A (en) * | 2023-01-03 | 2023-07-07 | 阿里巴巴(中国)有限公司 | Resource scheduling method, equipment, storage medium and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20220221998A1 (en) | Memory management method, electronic device and non-transitory computer-readable medium | |
CN108536538A (en) | Processor core scheduling method and device, terminal and storage medium | |
EP3631628A1 (en) | Method and apparatus for implementing virtual gpu and system | |
US20160314008A1 (en) | Method for implementing gpu virtualization and related apparatus, and system | |
JP2017514209A (en) | Dynamic resource management for multi-process applications | |
AU2019256257B2 (en) | Processor core scheduling method and apparatus, terminal, and storage medium | |
WO2019228344A1 (en) | Resource configuration method and apparatus, and terminal and storage medium | |
US11886905B2 (en) | Host upgrade method and device | |
JP2022516486A (en) | Resource management methods and equipment, electronic devices, and recording media | |
CN109960579B (en) | Method and device for adjusting service container | |
CN116737080A (en) | Distributed storage system data block management method, system, equipment and storage medium | |
CN112416359A (en) | Dynamic partition customizing method, device, equipment and computer readable storage medium | |
CN107408073B (en) | Reducing memory commit overhead using memory compression | |
CN113986451A (en) | Virtual machine migration method and device, electronic equipment and storage medium | |
WO2023066246A1 (en) | Method and system for installing application on cloudphone, and client cloudphone | |
CN112235132A (en) | Method, device, medium and server for dynamically configuring service | |
US20250021408A1 (en) | Service Start Method and Related Apparatus | |
CN117687804B (en) | Resource management method, device, system and storage medium | |
WO2024148864A1 (en) | Virtual machine memory adjustment method and device, non-volatile readable storage medium, and electronic device | |
US12019909B2 (en) | IO request pipeline processing device, method and system, and storage medium | |
CN115562807A (en) | Method and system for mounting dynamic equipment for android container in kubernets environment | |
WO2021102748A1 (en) | Method and apparatus for downloading application, mobile terminal and storage medium | |
CN117539639B (en) | Video memory resource scheduling method, device, system, storage medium and electronic equipment | |
CN115546008B (en) | GPU (graphics processing Unit) virtualization management system and method | |
CN117193641B (en) | Mirror image cache write-in rate control method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||