
CN106155811A - Graphics processing device, resource service device, resource scheduling method and device


Info

Publication number
CN106155811A
Authority
CN
China
Prior art keywords
gpu
resource
processing facility
graphic processing
logical block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510208923.0A
Other languages
Chinese (zh)
Other versions
CN106155811B (en)
Inventor
孔建钢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201510208923.0A (patent CN106155811B)
Priority to PCT/CN2016/079865 (patent WO2016173450A1)
Publication of CN106155811A
Application granted
Publication of CN106155811B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Transfer Between Computers (AREA)
  • Processing Or Creating Images (AREA)

Abstract

An embodiment of the present application discloses a graphics processing device. In this device, a logical unit is the minimum GPU resource scheduling unit. The graphics processing device maps to at least one GPU multi-process proxy server (GPU-MPS), which is the agent that schedules the graphics processing device. One client of a GPU-MPS can schedule at least one logical unit, and one task process corresponds to one client of a GPU-MPS. The maximum number of logical units that the graphics processing device can contain is M × N × K, where M is the number of logical units schedulable by one client of a GPU-MPS, N is the maximum number of clients of one GPU-MPS, and K is the number of GPU-MPSs mapped by the graphics processing device. The application improves the utilization of GPU resources while also saving the graphics processing device the overhead of establishing and switching GPU contexts. A resource service device and a resource scheduling method and device are also disclosed.

Description

Graphics processing device, resource service device, resource scheduling method and device
Technical field
The present application relates to the field of computer applications, and in particular to a graphics processing device, a resource service device, and a resource scheduling method and device.
Background art
Because graphics processing is increasingly important in modern computers, a core processor dedicated to graphics processing is needed, and the graphics processing unit (GPU) is exactly such a device. Meanwhile, using the powerful computing capability of the GPU for general-purpose computing (GPGPU, General Purpose GPU) has become increasingly popular, and it is used in various high-performance computing clusters.
At present, existing GPU cluster technology mainly uses two methods of scheduling GPU resources when processing a job submitted by a user. In the first method, the resource scheduler schedules one GPU (for example, one GPU card) to the job of only one user. In the second method, the resource scheduler schedules one GPU to the jobs of multiple users at the same time.
In the course of making the present application, the inventor found at least the following problems in the prior art. In the first scheduling method, one GPU is monopolized by the job of a single user, and the job of one user is likely to be unable to make full use of the resources of one GPU, so GPU resource utilization is low. In the second scheduling method, one GPU is shared by the jobs of multiple users, and multiple users are more likely to make full use of the resources of one GPU, so the utilization of GPU resources is improved to some extent.
Although the second scheduling method can improve GPU resource utilization, when the jobs of multiple users share one GPU, the number of processes started by those jobs at the same time may be very large. The GPU must establish a GPU context for each process, so the number of GPU contexts established on the GPU may also be very large, and the GPU must switch among this large number of contexts. Establishing and switching GPU contexts imposes a large overhead on GPU resources, which leads to the problem of excessive GPU sharing.
Summary of the invention
To solve the above technical problem, embodiments of the present application provide a graphics processing device, a resource service device, and a resource scheduling method and device, which improve the utilization of GPU resources while also saving the overhead of establishing and switching GPU contexts. Further, the problem of excessive GPU sharing is avoided as far as possible.
The embodiments of the present application disclose the following technical solutions:
A graphics processing device, in which a logical unit is the minimum graphics processing unit (GPU) resource scheduling unit; the graphics processing device maps to at least one GPU multi-process proxy server (GPU-MPS), the GPU-MPS being the agent that schedules the graphics processing device; one client of the GPU-MPS can schedule at least one logical unit; one task process is one client of the GPU-MPS; and the maximum number of logical units that the graphics processing device can contain is M × N × K;
where M is the number of logical units schedulable by one client of the GPU-MPS, N is the maximum number of clients of one GPU-MPS, K is the number of GPU-MPSs mapped by the graphics processing device, and M, N and K are positive integers.
Preferably, one client of the GPU-MPS can schedule one logical unit.
Preferably, the graphics processing device maps to one GPU multi-process proxy server.
Preferably, the graphics processing device contains M × N × K logical units.
A resource service device, including at least one graphics processing device according to any of the above, a monitoring unit and a first communication unit, wherein:
the monitoring unit is configured to monitor, when a monitoring period arrives, the number of remaining logical units in the graphics processing device in the current period;
the first communication unit is configured to send the monitored data to a monitor node in the cluster, so that the monitor node atomically updates a preset resource dynamic table with the monitored data when an update period arrives;
wherein the resource dynamic table contains at least the number of remaining logical units in the graphics processing device.
Preferably, the resource service device is a slave node in a cluster.
Preferably, the resource dynamic table also contains the actual utilization rate of the graphics processing device, and the monitoring unit is further configured to monitor, when the monitoring period arrives, the actual utilization rate of the local graphics processing device in the current period.
A resource scheduling method, applied to the resource service device according to any of the above, the method including:
receiving a scheduling request to schedule graphics processing unit (GPU) resources for a target job, the scheduling request indicating the number of logical units requested;
in response to the scheduling request, looking up, in a preset resource dynamic table, graphics processing devices whose number of remaining logical units is not zero, and scheduling logical units for the target job from the graphics processing devices found, according to the number indicated by the scheduling request;
wherein the resource dynamic table contains at least the number of remaining logical units in the graphics processing device.
Preferably, the resource dynamic table also contains the actual utilization rate of the graphics processing device;
the step of looking up, in response to the scheduling request, graphics processing devices whose number of remaining logical units is not zero in the preset resource dynamic table, and scheduling logical units for the target job from the graphics processing devices found according to the number indicated by the scheduling request, is:
in response to the scheduling request, looking up, in the preset resource dynamic table, graphics processing devices whose actual utilization rate is less than or equal to a preset maximum threshold and whose number of remaining logical units is not zero, and scheduling logical units for the target job from the graphics processing devices found, according to the number indicated by the scheduling request.
Preferably, the resource dynamic table also contains the working state of the resource service devices in the resource server cluster and the working state of the graphics processing devices in the resource service devices; the method further includes:
when an update period arrives, atomically updating the working state of the resource service devices and the working state of the graphics processing devices in the resource dynamic table, the working state being either working or not working.
A resource scheduling device, applied to the resource service device according to any of the above, including:
a second communication unit, configured to receive a scheduling request to schedule graphics processing unit (GPU) resources for a target job, the scheduling request indicating the number of logical units requested;
a response unit, configured to, in response to the scheduling request, look up, in a preset resource dynamic table, graphics processing devices whose number of remaining logical units is not zero, and schedule logical units for the target job from the graphics processing devices found, according to the number indicated by the scheduling request;
wherein the resource dynamic table contains at least the number of remaining logical units in the graphics processing device.
Preferably, the resource dynamic table also contains the actual utilization rate of the graphics processing device;
the response unit is specifically configured to, in response to the scheduling request, look up, in the preset resource dynamic table, graphics processing devices whose actual utilization rate is less than or equal to a preset maximum threshold and whose number of remaining logical units is not zero, and schedule logical units for the target job from the graphics processing devices found, according to the number indicated by the scheduling request.
Preferably, the resource dynamic table also contains the working state of the resource service devices in the resource server cluster and the working state of the graphics processing devices in the resource service devices; the device further includes:
an updating unit, configured to atomically update, when an update period arrives, the working state of the resource service devices and the working state of the graphics processing devices in the resource dynamic table, the working state being either working or not working.
As can be seen from the above embodiments, compared with the prior art, the present application has the following advantages:
Because the logical unit is the minimum GPU resource scheduling unit, different logical units in one graphics processing device can be scheduled to different task processes, so that different user jobs jointly occupy the same graphics processing device, which ensures the utilization of GPU resources in the device. Meanwhile, the present application uses GPU-MPS technology to make each task process a client of the GPU-MPS, so that the GPU-MPS can manage task processes in the same way it manages clients. Because all clients of one GPU-MPS share one GPU context, the multiple task processes that are clients of one GPU multi-process proxy server need to share only one GPU context.
In addition, scheduling logical units based on the actual utilization rate of each GPU during resource scheduling also avoids the problem of excessive GPU sharing.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 schematically shows a structural diagram of a graphics processing device according to an embodiment of the present application;
Fig. 2 schematically shows a structural diagram of another graphics processing device according to an embodiment of the present application;
Fig. 3 schematically shows a structural diagram of another graphics processing device according to an embodiment of the present application;
Fig. 4 schematically shows a structural diagram of another graphics processing device according to an embodiment of the present application;
Fig. 5 schematically shows a structural diagram of a resource service device according to an embodiment of the present application;
Fig. 6 schematically shows an exemplary application scenario in which embodiments of the present application can be implemented;
Fig. 7 schematically shows a structural block diagram of a resource scheduling device according to an embodiment of the present application;
Fig. 8 schematically shows a flowchart of a resource scheduling method according to an embodiment of the present application.
Detailed description of the invention
To make the above objects, features and advantages of the present application clearer and easier to understand, the embodiments of the present application are described in detail below with reference to the drawings.
A job submitted by a user consists of multiple tasks, and each task is completed by one task process. Therefore, scheduling GPU resources for a user's job actually means scheduling GPU resources for all the task processes that complete the job.
Refer to Fig. 1, which schematically shows a structural diagram of a graphics processing device according to an embodiment of the present application. In this graphics processing device 10, a logical unit 11 is the minimum GPU resource scheduling unit. The graphics processing device maps to one GPU multi-process proxy server (GPU-MPS, Graphics Processing Unit-Multiple Process Server) 20, which has a maximum of 16 clients and is the agent that schedules the graphics processing device 10. One client of the GPU-MPS 20 can schedule one logical unit 11, one task process is one client of the GPU-MPS 20, and the maximum number of logical units that the graphics processing device can contain is 16.
It should be understood that, because the logical unit is the minimum GPU resource scheduling unit, different logical units in one graphics processing device can be scheduled to different task processes, so that different user jobs jointly occupy the same graphics processing device, which ensures the utilization of GPU resources in the device. Meanwhile, the present application uses GPU-MPS technology to make each task process a client of the GPU-MPS, so that the GPU-MPS can manage task processes in the same way it manages clients. Because all clients of one GPU-MPS share one GPU context, the multiple task processes that are clients of that GPU-MPS need to share only one GPU context. For example, when a graphics processing device maps to one GPU-MPS, all task processes scheduled to that device share a single GPU context, and GPU contexts no longer need to be established separately for each process. This reduces the number of GPU contexts and ultimately saves the overhead of establishing and switching them.
In addition, when logical units are configured for the graphics processing device 10, the number of logical units can be set to any value from 1 to 16, inclusive.
Besides scheduling one logical unit, one client of the GPU-MPS 20 can schedule multiple logical units, for example 2, 3 or more. For example, when one client of the GPU-MPS 20 can schedule two logical units and the graphics processing device 10 still maps to one GPU-MPS 20, the maximum number of logical units that the graphics processing device 10 can contain is 32, as shown in Fig. 2. Thus, when the number of GPU-MPSs 20 mapped by the graphics processing device 10 is fixed, the maximum number of logical units that the device can contain is proportional to the number of logical units schedulable by one client of the GPU-MPS 20.
Likewise, the graphics processing device 10 can map to only one GPU-MPS 20 or to multiple GPU-MPSs 20, for example 2, 3 or more. For example, when the graphics processing device 10 maps to two GPU-MPSs 20 and one client of a GPU-MPS 20 can schedule one logical unit, the maximum number of logical units that the graphics processing device can contain is 32, as shown in Fig. 3. Thus, when the number of logical units schedulable by one client of the GPU-MPS 20 is fixed, the maximum number of logical units that the graphics processing device 10 can contain is proportional to the number of GPU-MPSs 20 mapped by the device.
That is, the maximum number of logical units that the graphics processing device 10 can contain is proportional both to the number of logical units schedulable by one client of the GPU-MPS 20 and to the number of GPU-MPSs 20 mapped by the device. For example, when the graphics processing device 10 maps to two GPU-MPSs 20 and one client of a GPU-MPS 20 can schedule two logical units, the maximum number of logical units that the graphics processing device can contain is 64, as shown in Fig. 4.
Therefore, the maximum number of logical units that the graphics processing device 10 can contain is M × N × K, where M is the number of logical units schedulable by one client of the GPU-MPS, N is the maximum number of clients of one GPU-MPS, K is the number of GPU-MPSs mapped by the graphics processing device, and M, N and K are positive integers.
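The M × N × K relationship can be checked against the four configurations described for Figs. 1 to 4. The following is a minimal sketch; the function name `max_logical_units` is illustrative and does not come from the application.

```python
def max_logical_units(m: int, n: int, k: int) -> int:
    """Return M x N x K, the maximum number of logical units of one device.

    m: logical units schedulable by one GPU-MPS client
    n: maximum number of clients of one GPU-MPS
    k: number of GPU-MPSs mapped by the device
    """
    if min(m, n, k) < 1:
        raise ValueError("M, N and K must be positive integers")
    return m * n * k


# The configurations of Figs. 1 to 4, each with N = 16 clients per GPU-MPS.
for m, k in [(1, 1), (2, 1), (1, 2), (2, 2)]:
    print(f"M={m}, N=16, K={k}: {max_logical_units(m, 16, k)} logical units")
```

With N fixed at 16, the four cases give 16, 32, 32 and 64 logical units, matching the values stated for Figs. 1 to 4.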
When the logical units in the graphics processing device 10 are configured, the configuration only needs to stay within the maximum number of logical units that the graphics processing device 10 can contain.
In a preferred embodiment of the present application, the graphics processing device 10 contains M × N × K logical units.
In another preferred embodiment of the present application, one client of the GPU-MPS can schedule one logical unit, and the graphics processing device 10 maps to one GPU-MPS 20. In this preferred embodiment, the maximum number of logical units that the graphics processing device contains equals the maximum number of clients of one GPU-MPS.
It should also be noted that, in physical form, the graphics processing device 10 is a graphics processing unit (GPU).
In addition to the graphics processing device, the embodiments of the present application also provide a resource service device. Refer to Fig. 5, which schematically shows a structural diagram of a resource service device according to an embodiment of the present application. The resource service device 50 includes at least one graphics processing device 51 (for example, two graphics processing devices 511 and 512), a monitoring unit 52 and a first communication unit 53. The graphics processing device 511 maps to GPU-MPS 611, and one client of GPU-MPS 611 can call one logical unit in the graphics processing device 511. The graphics processing device 512 maps to GPU-MPS 612, and one client of GPU-MPS 612 can call one logical unit in the graphics processing device 512. A task process can be a client of GPU-MPS 611 or a client of GPU-MPS 612.
The monitoring unit 52 is configured to monitor, when a monitoring period arrives, the number of remaining logical units in the graphics processing device in the current period.
The first communication unit 53 is configured to send the monitored data to a monitor node in the cluster, so that the monitor node atomically updates a preset resource dynamic table with the monitored data when an update period arrives.
The resource dynamic table contains at least the number of remaining logical units in the graphics processing device.
Each logical unit can have a corresponding PIPE file under a specified path of the resource server: once a logical unit is used, its PIPE file is generated. Therefore, the monitoring unit 11 only needs to monitor the number of PIPE files under this path to determine the number of remaining logical units.
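The PIPE-counting idea can be sketched as follows. This is an illustration under stated assumptions, not the application's implementation: it assumes a POSIX system, that each logical unit in use is represented by exactly one named pipe (FIFO) directly under the monitored directory, and that the total number of configured logical units is known. The names `count_used_units` and `remaining_units` and the directory layout are hypothetical.

```python
import os
import stat


def count_used_units(pipe_dir: str) -> int:
    """Count the named pipes (FIFOs) directly under pipe_dir.

    Assumption: one FIFO exists per logical unit currently in use.
    """
    used = 0
    with os.scandir(pipe_dir) as entries:
        for entry in entries:
            # follow_symlinks=False so a symlink is not mistaken for a pipe.
            if stat.S_ISFIFO(entry.stat(follow_symlinks=False).st_mode):
                used += 1
    return used


def remaining_units(pipe_dir: str, total_units: int) -> int:
    """Remaining logical units = configured total minus units in use."""
    return max(total_units - count_used_units(pipe_dir), 0)
```

A monitoring unit could call `remaining_units` once per monitoring period and hand the result to the communication unit for reporting.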
It should be understood that, because each slave node in the cluster dynamically updates the number of remaining logical units of its local GPUs, this update operation can support offline scheduling, that is, using GPU resources directly on the local node rather than having resources scheduled by the unified scheduler.
It should be noted that the structure of the resource service device shown in Fig. 5 is only an example; the device may also include a greater number of graphics processing devices. Moreover, the present application does not limit the number of GPU-MPSs mapped by each graphics processing device, the number of logical units that one client of a GPU-MPS can call, or the number of logical units that each graphics processing device contains.
In a preferred embodiment of the present application, the resource service device 50 is, in physical form, a resource server.
In another preferred embodiment of the present application, the resource server can be a slave node in a cluster.
For example, refer to Fig. 6, which schematically shows an exemplary application scenario in which embodiments of the present application can be implemented. A cluster includes multiple slave nodes 10 (for convenience of description and illustration, only one slave node is shown in Fig. 6), a monitor node 20 and a monitor node 30. The slave node 10 is a resource server that contains multiple graphics processing units (GPUs); Fig. 6 shows only two, GPU-0 and GPU-1. Each GPU contains 16 logical units. MPS-0 is the agent that schedules GPU-0, and MPS-1 is the agent that schedules GPU-1. MPS-0 and MPS-1 each have 16 clients; one client of MPS-0 can schedule one logical unit of GPU-0, and one client of MPS-1 can schedule one logical unit of GPU-1. A task process in a user job can be a client of MPS-0 or a client of MPS-1.
For example, when a logical unit in GPU-0 is scheduled to a task process of a user job, that task process connects to the agent of the GPU-0 to which the logical unit belongs, that is, to MPS-0.
The monitor node 30 includes a job management device 31 and a resource scheduling device 32. The job management device 31 first receives a request 61, sent by a cluster client 60, to allocate GPU resources to a target user job; the request 61 indicates the number of logical units requested. The job management device 31 forwards the request to the resource scheduling device 32.
As shown in the structural block diagram of the resource scheduling device in Fig. 7, the resource scheduling device 32 includes a second communication unit 321 and a response unit 322. The second communication unit 321 is configured to receive the scheduling request 61 to schedule graphics processing unit (GPU) resources for the target job. The response unit 322 is configured to, in response to the scheduling request, look up, in a preset resource dynamic table, graphics processing devices whose number of remaining logical units is not zero, and schedule logical units for the target job from the graphics processing devices found, according to the number indicated by the scheduling request. The resource dynamic table contains at least the number of remaining logical units in the graphics processing device.
In the present application, the resource scheduling device 32 can use any scheduling method in the prior art to schedule logical units, for example first-fit scheduling, best-fit scheduling, backfill scheduling or CFS scheduling.
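To illustrate how two of the named policies differ, the sketch below applies first-fit and best-fit selection to a map of remaining logical units per GPU. It is a simplified, hypothetical example: it assumes a request must be satisfied by a single GPU, and the function names and the dictionary layout are not taken from the application.

```python
from typing import Dict, Optional


def first_fit(remaining: Dict[str, int], requested: int) -> Optional[str]:
    """Return the first GPU, in table order, with enough remaining units."""
    for gpu_id, free in remaining.items():
        if free >= requested:
            return gpu_id
    return None


def best_fit(remaining: Dict[str, int], requested: int) -> Optional[str]:
    """Return the GPU whose remaining units exceed the request by the least."""
    candidates = [(free, gpu_id) for gpu_id, free in remaining.items()
                  if free >= requested]
    return min(candidates)[1] if candidates else None


remaining = {"GPU-0": 10, "GPU-1": 4}
print(first_fit(remaining, 3))  # GPU-0: first entry that is large enough
print(best_fit(remaining, 3))   # GPU-1: tightest fit, leaves GPU-0 for larger requests
```

First-fit is cheaper per request, while best-fit tends to keep large blocks of free logical units available for later, larger requests.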
The resource scheduling device 32 generates a resource dynamic table, and the slave node 10 dynamically updates the number of remaining logical units in GPU-0 and GPU-1 in that table, so that the resource scheduling device 32 can schedule resources according to the remaining logical units of each GPU. A remaining logical unit is a logical unit that has not been scheduled to a task process.
Of course, if the cluster also includes other slave nodes, the resource dynamic table is also dynamically maintained by those slave nodes and also includes the number of remaining logical units in each GPU on them. That is, the resource dynamic table contains the number of remaining logical units in the GPUs of all slave nodes.
In addition, the resource dynamic table can also include the identifiers of all slave nodes and the identifiers of all GPUs on each slave node, in order to determine the location of each logical unit. For example, the resource dynamic table shown in Fig. 6 contains the identifier of the slave node 10 (for example, the global number of the slave node 10 in the cluster), the identifiers of GPU-0 and GPU-1 contained in the slave node 10, and the number of remaining logical units in GPU-0 and GPU-1.
In addition, the GPU resources that a job actually uses may be greater than the GPU resources it requested, so for a given GPU, the resources actually used may be greater than the resources scheduled. When resources in that GPU are scheduled for different jobs, the problem of excessively sharing the GPU can easily arise.
Therefore, to avoid excessive GPU sharing, the actual utilization rate of each GPU can also be maintained in the resource dynamic table, so that the resource scheduling device schedules the logical units in each GPU according to that GPU's actual utilization rate. That is, the resource dynamic table contains the identifiers of all slave nodes in the cluster, the identifiers of all GPUs on each slave node, the number of remaining logical units in each GPU, and the actual utilization rate of each GPU.
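One possible in-memory shape for such a table is sketched below. The application does not specify a data structure or how atomicity is achieved; this sketch assumes a single-process table keyed by (node identifier, GPU identifier) and uses a lock so that one node's report is applied as a single step. The class and field names are hypothetical.

```python
import threading
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class GpuEntry:
    remaining_units: int  # logical units not yet scheduled to a task process
    utilization: float    # actual utilization rate, from 0.0 to 1.0


class ResourceDynamicTable:
    """Table keyed by (node_id, gpu_id); a lock makes each update atomic."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entries: Dict[Tuple[str, str], GpuEntry] = {}

    def update_node(self, node_id: str, report: Dict[str, GpuEntry]) -> None:
        """Apply one node's monitoring report while holding the lock."""
        with self._lock:
            for gpu_id, entry in report.items():
                self._entries[(node_id, gpu_id)] = entry

    def snapshot(self) -> Dict[Tuple[str, str], GpuEntry]:
        """Return a consistent copy for the scheduler to read."""
        with self._lock:
            return dict(self._entries)


table = ResourceDynamicTable()
table.update_node("node-10", {"GPU-0": GpuEntry(16, 0.0),
                              "GPU-1": GpuEntry(12, 0.35)})
```

Because readers work on a snapshot taken under the same lock, the scheduler never observes a report that has been only partly applied.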
In a preferred embodiment of the present application, the resource dynamic table also contains the actual utilization rates of GPU-0 and GPU-1. In the slave node 10, the monitoring unit 11 is further configured to monitor, when the monitoring period arrives, the actual utilization rates of GPU-0 and GPU-1 in the current period.
Correspondingly, in the monitor node 30, the response unit 322 in the resource scheduling device 32 is specifically configured to, in response to the scheduling request, look up, in the preset resource dynamic table, graphics processing devices whose actual utilization rate is less than or equal to a preset maximum threshold and whose number of remaining logical units is not zero, and schedule logical units for the target job from the graphics processing devices found, according to the number indicated by the scheduling request.
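The lookup condition described here combines two tests per device. A minimal sketch, assuming the table is a mapping from GPU identifier to a (remaining units, actual utilization rate) pair and that utilization is expressed as a fraction; the function name and the threshold value are illustrative.

```python
from typing import Dict, List, Tuple


def eligible_gpus(table: Dict[str, Tuple[int, float]],
                  max_utilization: float) -> List[str]:
    """Return GPUs with remaining units != 0 and utilization <= the threshold."""
    return [gpu_id
            for gpu_id, (remaining, utilization) in table.items()
            if remaining != 0 and utilization <= max_utilization]


table = {
    "GPU-0": (4, 0.95),  # has free units but is already heavily used
    "GPU-1": (2, 0.40),  # has free units and is below the threshold
    "GPU-2": (0, 0.10),  # lightly used but has no free units
}
print(eligible_gpus(table, max_utilization=0.8))  # ['GPU-1']
```

GPU-0 is excluded even though it has free logical units, which is how the threshold guards against excessive sharing when jobs use more than they requested.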
In another preferred embodiment of the present application, the resource dynamic table can also contain the working state of the resource service devices in the resource server cluster and the working state of the graphics processing devices in the resource service devices, and the table is dynamically updated by the resource scheduling device. The resource scheduling device 32 further includes:
an updating unit, configured to atomically update, when an update period arrives, the working state of the resource service devices and the working state and usage state of the graphics processing devices in the resource dynamic table. The working state is either working or not working, and the usage state includes the number of logical units in use and the overall utilization rate.
For example, when a slave node or GPU is deleted, or a slave node or GPU fails, its working state changes from working to not working; when a new slave node or a new GPU is added, its working state is set to working.
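These transitions can be written as a small lookup from event to working state. The sketch below is illustrative only: the event names ("added", "deleted", "failed") and the resource identifier format are assumptions, since the text describes the cases without naming events.

```python
from enum import Enum
from typing import Dict


class WorkingState(Enum):
    WORKING = "working"
    NOT_WORKING = "not working"


# Hypothetical event names; the text describes the cases but names no events.
_EVENT_TO_STATE = {
    "added": WorkingState.WORKING,
    "deleted": WorkingState.NOT_WORKING,
    "failed": WorkingState.NOT_WORKING,
}


def apply_event(states: Dict[str, WorkingState],
                resource_id: str, event: str) -> None:
    """Set the working state of a slave node or GPU according to the event."""
    if event not in _EVENT_TO_STATE:
        raise ValueError(f"unknown event: {event}")
    states[resource_id] = _EVENT_TO_STATE[event]


states: Dict[str, WorkingState] = {}
apply_event(states, "node-10/GPU-0", "added")   # new GPU: working
apply_event(states, "node-10/GPU-0", "failed")  # fault: not working
```

A scheduler would then skip any node or GPU whose state is `NOT_WORKING` when searching the table.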
In this application, updating block 323 can initialize this resource dynamic table when cluster initializes, Or, when job migration, because of job migration failure or the reason of QoS, when needing migration task, Updating block 323 can update this resource dynamic table.It addition, updating block can also be according to scheduling of resource Respond in more new resources dynamic table the quantity of remaining logical block in each GPU.
Corresponding with above-mentioned resource scheduling device, the embodiment of the present application additionally provides resource regulating method. Refer to Fig. 8, Fig. 8 and schematically show a kind of resource regulating method according to presently filed embodiment Flow chart, the method can be performed by resource scheduling device 32, and the method such as may include that
Step 801: receive a scheduling request to schedule graphics processing unit (GPU) resources for a target job, the scheduling request indicating the number of logical blocks requested.
Step 802: in response to the scheduling request, search the preset resource dynamic table for graphic processing facilities whose number of remaining logical blocks is not zero, and, according to the number indicated by the scheduling request, schedule logical blocks for the target job from the graphic processing facilities found.
Here, the resource dynamic table contains at least the number of remaining logical blocks in each graphic processing facility.
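Steps 801 and 802 can be sketched as a small function. This is an illustrative sketch under stated assumptions: the table is reduced to a mapping from a GPU identifier to its number of remaining logical blocks, the function name `schedule` is hypothetical, and the greedy choice of GPUs in table order is an assumption, since the text does not specify how to choose among several eligible graphic processing facilities.

```python
from typing import Dict, List, Tuple


def schedule(table: Dict[str, int], requested: int) -> List[Tuple[str, int]]:
    """Grant `requested` logical blocks from GPUs that still have free blocks.

    `table` maps a GPU identifier to its remaining logical blocks and is
    decremented in place, mirroring the update made after a scheduling response.
    """
    if requested < 1:
        raise ValueError("a scheduling request must ask for at least one block")

    # Step 802, part 1: keep only GPUs whose remaining block count is not zero.
    candidates = {gpu: free for gpu, free in table.items() if free > 0}
    if sum(candidates.values()) < requested:
        raise RuntimeError("not enough remaining logical blocks in the cluster")

    # Step 802, part 2: hand out blocks until the requested number is reached.
    grants: List[Tuple[str, int]] = []
    remaining = requested
    for gpu, free in candidates.items():
        if remaining == 0:
            break
        take = min(free, remaining)
        grants.append((gpu, take))
        table[gpu] -= take
        remaining -= take
    return grants
```

For example, with `{"gpu0": 0, "gpu1": 3, "gpu2": 4}` and a request for five blocks, `gpu0` is skipped, all three blocks of `gpu1` are granted, and the remaining two come from `gpu2`.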
In a preferred implementation of the application, the resource dynamic table also contains the actual utilization rate of each graphic processing facility; step 802 is then:
In response to the scheduling request, search the preset resource dynamic table for graphic processing facilities whose actual utilization rate is less than or equal to a preset maximum threshold and whose number of remaining logical blocks is not zero, and, according to the number indicated by the scheduling request, schedule logical blocks for the target job from the graphic processing facilities found.
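The additional filter on the actual utilization rate can be illustrated as below. The names `GpuRow` and `eligible_gpus` and the representation of utilization as a fraction between 0.0 and 1.0 are assumptions made for this sketch; the text only requires that the rate not exceed a preset maximum threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GpuRow:
    gpu_id: str
    free_blocks: int     # remaining logical blocks
    utilization: float   # actual utilization rate, assumed to be in 0.0..1.0


def eligible_gpus(rows: List[GpuRow], max_utilization: float) -> List[str]:
    """Return GPUs that have free blocks and are not above the threshold."""
    return [
        row.gpu_id
        for row in rows
        if row.utilization <= max_utilization and row.free_blocks > 0
    ]
```

A GPU with free logical blocks but a utilization above the threshold is excluded, which is how this variant keeps an already busy GPU from being shared further.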
In another preferred implementation of the application, the resource dynamic table also contains the working state of each resource service device in the resource server cluster and the working state of each graphic processing facility in a resource service device. The method may further include: when the update cycle arrives, atomically updating the working state of the resource service devices and the working state of the graphic processing facilities in the resource dynamic table, the working state being either working or not working.
As can be seen from the above embodiments, the application has the following advantages over the prior art:
Because the logical block is the minimum GPU resource scheduling unit, different logical blocks in one graphic processing facility can be scheduled to different task processes, so that different user jobs jointly occupy the same graphic processing facility, which ensures the utilization of the GPU resources in the graphic processing facility. Meanwhile, the application uses GPU-MPS technology to make each task process a client of a GPU-MPS, so that the GPU-MPS can manage task processes in the same way that it manages clients. Because all clients of one GPU-MPS share one GPU context, the multiple task processes that are clients of that GPU-MPS also need to share only one GPU context.
In addition, scheduling logical blocks based on the actual utilization rate of each GPU during resource scheduling also avoids excessive sharing of a GPU.
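How several task processes can share one graphic processing facility at logical-block granularity can be sketched as follows. The capacity formula M × N × K is the bound stated in claim 1 (M logical blocks per GPU-MPS client, N clients per GPU-MPS, K GPU-MPS per graphic processing facility); the function names, the contiguous numbering of blocks, and the example values are assumptions for illustration only.

```python
from typing import Dict


def max_logical_blocks(m: int, n: int, k: int) -> int:
    """Upper bound on logical blocks in one graphic processing facility."""
    for name, value in (("M", m), ("N", n), ("K", k)):
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer")
    return m * n * k


def share_device(capacity: int, requests: Dict[str, int]) -> Dict[str, range]:
    """Give each task process a disjoint range of logical block indices."""
    assignments: Dict[str, range] = {}
    next_block = 0
    for task, count in requests.items():
        if count < 1:
            raise ValueError("each task process needs at least one block")
        if next_block + count > capacity:
            raise RuntimeError("graphic processing facility is out of blocks")
        assignments[task] = range(next_block, next_block + count)
        next_block += count
    return assignments
```

With the illustrative values M = 1, N = 4, and K = 2, the facility holds at most eight logical blocks, so two jobs asking for three and five blocks can occupy the same facility without overlapping.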
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
It should be noted that those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be completed by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The graphic processing facility, resource service device, resource scheduling method, and resource scheduling device provided by this application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the application, and the description of the above embodiments is intended only to help in understanding the method of the application and its core idea. Meanwhile, for those of ordinary skill in the art, both the specific implementations and the scope of application may change in accordance with the idea of the application. In summary, the content of this specification should not be construed as limiting the application.

Claims (13)

1. A graphic processing facility, characterized in that, in the graphic processing facility, the logical block is the minimum graphics processing unit (GPU) resource scheduling unit; the graphic processing facility maps at least one GPU multi-process proxy server (GPU-MPS), the GPU-MPS being an agent for scheduling the graphic processing facility; one client of a GPU-MPS can schedule at least one logical block; one task process is one client of a GPU-MPS; and the maximum number of logical blocks that the graphic processing facility can contain is M × N × K;
where M is the number of logical blocks that one client of a GPU-MPS can schedule, N is the maximum number of clients that one GPU-MPS contains, K is the number of GPU-MPS that the graphic processing facility maps, and M, N, and K are positive integers.
2. The graphic processing facility according to claim 1, characterized in that one client of a GPU-MPS can schedule one logical block.
3. The graphic processing facility according to claim 1 or 2, characterized in that the graphic processing facility maps one GPU multi-process proxy server.
4. The graphic processing facility according to claim 1, characterized in that the graphic processing facility contains M × N × K logical blocks.
5. A resource service device, characterized by comprising at least one graphic processing facility according to any one of claims 1 to 4, a monitoring unit, and a first communication unit, wherein:
the monitoring unit is configured to monitor, when the monitoring cycle arrives, the number of remaining logical blocks in the graphic processing facility in the current cycle;
the first communication unit is configured to send the monitored data to a monitor node in the cluster, so that the monitor node atomically updates a preset resource dynamic table with the monitored data when the update cycle arrives;
and the resource dynamic table contains at least the number of remaining logical blocks in the graphic processing facility.
6. The resource service device according to claim 5, characterized in that the resource service device is a slave node in the cluster.
7. The resource service device according to claim 5, characterized in that the resource dynamic table also contains the actual utilization rate of the graphic processing facility, and the monitoring unit is further configured to monitor, when the monitoring cycle arrives, the actual utilization rate of the local graphic processing facility in the current cycle.
8. A resource scheduling method, characterized in that it is applied to the resource service device according to any one of claims 5 to 7, the method comprising:
receiving a scheduling request to schedule graphics processing unit (GPU) resources for a target job, the scheduling request indicating the number of logical blocks requested;
in response to the scheduling request, searching a preset resource dynamic table for graphic processing facilities whose number of remaining logical blocks is not zero, and, according to the number indicated by the scheduling request, scheduling logical blocks for the target job from the graphic processing facilities found;
wherein the resource dynamic table contains at least the number of remaining logical blocks in each graphic processing facility.
9. The method according to claim 8, characterized in that the resource dynamic table also contains the actual utilization rate of each graphic processing facility;
and the step of, in response to the scheduling request, searching the preset resource dynamic table for graphic processing facilities whose number of remaining logical blocks is not zero and, according to the number indicated by the scheduling request, scheduling logical blocks for the target job from the graphic processing facilities found is:
in response to the scheduling request, searching the preset resource dynamic table for graphic processing facilities whose actual utilization rate is less than or equal to a preset maximum threshold and whose number of remaining logical blocks is not zero, and, according to the number indicated by the scheduling request, scheduling logical blocks for the target job from the graphic processing facilities found.
10. The method according to claim 8 or 9, characterized in that the resource dynamic table also contains the working state of each resource service device in the resource server cluster and the working state of each graphic processing facility in a resource service device, the method further comprising:
when the update cycle arrives, atomically updating the working state of the resource service devices and the working state of the graphic processing facilities in the resource dynamic table, the working state being either working or not working.
11. A resource scheduling device, characterized in that it is applied to the resource service device according to any one of claims 5 to 7 and comprises:
a second communication unit, configured to receive a scheduling request to schedule graphics processing unit (GPU) resources for a target job, the scheduling request indicating the number of logical blocks requested;
a response unit, configured to, in response to the scheduling request, search a preset resource dynamic table for graphic processing facilities whose number of remaining logical blocks is not zero, and, according to the number indicated by the scheduling request, schedule logical blocks for the target job from the graphic processing facilities found;
wherein the resource dynamic table contains at least the number of remaining logical blocks in each graphic processing facility.
12. The device according to claim 11, characterized in that the resource dynamic table also contains the actual utilization rate of each graphic processing facility;
and the response unit is specifically configured to, in response to the scheduling request, search the preset resource dynamic table for graphic processing facilities whose actual utilization rate is less than or equal to a preset maximum threshold and whose number of remaining logical blocks is not zero, and, according to the number indicated by the scheduling request, schedule logical blocks for the target job from the graphic processing facilities found.
13. The device according to claim 11 or 12, characterized in that the resource dynamic table also contains the working state of each resource service device in the resource server cluster and the working state of each graphic processing facility in a resource service device, the device further comprising:
an updating unit, configured to atomically update, when the update cycle arrives, the working state of the resource service devices and the working state of the graphic processing facilities in the resource dynamic table, the working state being either working or not working.
CN201510208923.0A 2015-04-28 2015-04-28 Resource service device, resource scheduling method and device Active CN106155811B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201510208923.0A CN106155811B (en) 2015-04-28 2015-04-28 Resource service device, resource scheduling method and device
PCT/CN2016/079865 WO2016173450A1 (en) 2015-04-28 2016-04-21 Graphic processing device, resource service device, resource scheduling method and device thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510208923.0A CN106155811B (en) 2015-04-28 2015-04-28 Resource service device, resource scheduling method and device

Publications (2)

Publication Number Publication Date
CN106155811A true CN106155811A (en) 2016-11-23
CN106155811B CN106155811B (en) 2020-01-07

Family

ID=57198136

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510208923.0A Active CN106155811B (en) 2015-04-28 2015-04-28 Resource service device, resource scheduling method and device

Country Status (2)

Country Link
CN (1) CN106155811B (en)
WO (1) WO2016173450A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107544845B (en) * 2017-06-26 2020-08-11 新华三大数据技术有限公司 GPU resource scheduling method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102541640A (en) * 2011-12-28 2012-07-04 厦门市美亚柏科信息股份有限公司 Cluster GPU (graphic processing unit) resource scheduling system and method
US20120179851A1 (en) * 2010-12-15 2012-07-12 Advanced Micro Devices, Inc. Computer System Interrupt Handling
US20120188259A1 (en) * 2010-12-13 2012-07-26 Advanced Micro Devices, Inc. Mechanisms for Enabling Task Scheduling
CN102959517A (en) * 2010-06-10 2013-03-06 Otoy公司 Allocation of gpu resources accross multiple clients
US20140108915A1 (en) * 2012-10-15 2014-04-17 Famous Industries, Inc. Efficient Manipulation of Surfaces in Multi-Dimensional Space Using Energy Agents
CN104541247A (en) * 2012-08-07 2015-04-22 超威半导体公司 System and method for tuning a cloud computing system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7673304B2 (en) * 2003-02-18 2010-03-02 Microsoft Corporation Multithreaded kernel for graphics processing unit
CN101403983B (en) * 2008-11-25 2010-10-13 北京航空航天大学 Resource monitoring method and system for multi-core processor based on virtual machine
CN104407920B (en) * 2014-12-23 2018-02-09 浪潮(北京)电子信息产业有限公司 A kind of data processing method and system based on interprocess communication

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
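WANG XIAN ET AL: "Multi-GPU performance of incompressible flow computation by lattice Boltzmann method on GPU cluster", 《PARALLEL COMPUTING》 *
ZHANG QINFEI (张勤飞): "Design and Implementation of a General-Purpose Parallel Rendering System Based on GPU Clusters" ("基于GPU集群的通用并行渲染系统设计与实现"), Wanfang Dissertation Database (《万方学位论文库》) *
CHEN QINGKUI (陈庆奎) ET AL: "A Dynamic Task Mapping Strategy for GPU Clusters" ("一种GPU集群的动态任务映射策略"), Computer Engineering (《计算机工程》) *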

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106686352A (en) * 2016-12-23 2017-05-17 北京大学 Real-time processing method of multi-channel video data on multi-GPU platform
CN106686352B (en) * 2016-12-23 2019-06-07 北京大学 The real-time processing method of the multi-path video data of more GPU platforms
WO2018233299A1 (en) * 2017-06-22 2018-12-27 平安科技(深圳)有限公司 Method, apparatus and device for scheduling processor, and medium
CN107247629A (en) * 2017-07-04 2017-10-13 北京百度网讯科技有限公司 Cloud computing system and cloud computing method and device for controlling server
CN107329834A (en) * 2017-07-04 2017-11-07 北京百度网讯科技有限公司 Method and apparatus for performing calculating task
CN109936604A (en) * 2017-12-18 2019-06-25 北京图森未来科技有限公司 A resource scheduling method, device and system
WO2021057405A1 (en) * 2019-09-25 2021-04-01 中兴通讯股份有限公司 Resource sharing method and device
CN110795249A (en) * 2019-10-30 2020-02-14 亚信科技(中国)有限公司 GPU resource scheduling method and device based on MESOS containerized platform
WO2021142614A1 (en) * 2020-01-14 2021-07-22 华为技术有限公司 Chip state determining method and device, and cluster resource scheduling method and device
CN111400051A (en) * 2020-03-31 2020-07-10 京东方科技集团股份有限公司 Resource scheduling method, device and system
CN111400051B (en) * 2020-03-31 2023-10-27 京东方科技集团股份有限公司 Resource scheduling method, device and system

Also Published As

Publication number Publication date
WO2016173450A1 (en) 2016-11-03
CN106155811B (en) 2020-01-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant