CN106155811A - Graphics processing apparatus, resource service apparatus, resource scheduling method and apparatus - Google Patents
Graphics processing apparatus, resource service apparatus, resource scheduling method and apparatus
- Publication number
- CN106155811A (application CN201510208923.0A)
- Authority
- CN
- China
- Prior art keywords
- gpu
- resource
- processing facility
- graphic processing
- logical block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
Abstract
The embodiments of the present application disclose a graphics processing apparatus. In the apparatus, a logical unit is the smallest GPU resource scheduling unit. The graphics processing apparatus is mapped to at least one GPU multi-process proxy server (GPU-MPS), and the GPU-MPS is the agent that schedules the graphics processing apparatus. One client of the GPU-MPS can schedule at least one logical unit, and one task process corresponds to one client of the GPU-MPS. The maximum number of logical units the graphics processing apparatus can contain is M × N × K, where M is the number of logical units one client of the GPU-MPS can schedule, N is the maximum number of clients one GPU-MPS contains, and K is the number of GPU-MPSs to which the graphics processing apparatus is mapped. With the present application, GPU resource utilization is improved while the overhead of creating and switching GPU contexts is saved for the graphics processing apparatus. A resource service apparatus, a resource scheduling method and a resource scheduling apparatus are also disclosed.
Description
Technical field
The present application relates to the field of computer applications, and in particular to a graphics processing apparatus, a resource service apparatus, a resource scheduling method and a resource scheduling apparatus.
Background Art
In modern computers, graphics processing has become increasingly important, so a core processor dedicated to graphics processing is needed; the graphics processing unit (GPU, Graphics Processing Unit) is exactly such a device. Meanwhile, thanks to the powerful computing capability of GPUs, general-purpose computing on GPUs (GPGPU, General Purpose GPU) has become increasingly popular and is used in various high-performance computing clusters.
At present, in existing GPU cluster technology, there are mainly two methods of scheduling GPU resources when processing a job submitted by a user. In the first method, the resource scheduler schedules one GPU (for example, one GPU card) to the job of only one user. In the second method, the resource scheduler schedules one GPU to the jobs of multiple users at the same time.
In the course of realizing the present application, the inventors recognized that the prior art has at least the following problems. In the first scheduling method, because a GPU is exclusively occupied by the job of one user, and the job of a single user often cannot fully utilize the resources of one GPU, GPU resource utilization tends to be low. In the second scheduling method, because a GPU is shared by the jobs of multiple users, multiple users are more likely to fully utilize the resources of one GPU, which improves GPU resource utilization to some extent.
Although the second scheduling method can improve GPU resource utilization, when the jobs of multiple users share one GPU, the number of processes opened simultaneously by those jobs may be very large. For each process, the GPU must create a GPU context, so the number of GPU contexts created on the GPU may also be very large, and the GPU must switch among this large number of contexts. Creating and switching GPU contexts incurs a large overhead on GPU resources, which leads to the problem of over-sharing the GPU.
Summary of the invention
To solve the above technical problems, embodiments of the present application provide a graphics processing apparatus, a resource service apparatus, a resource scheduling method and a resource scheduling apparatus, which improve GPU resource utilization while also saving the overhead of creating and switching GPU contexts, and which avoid over-sharing of the GPU as much as possible.
The embodiments of the present application disclose the following technical solutions:
A graphics processing apparatus, in which a logical unit is the smallest graphics processing unit (GPU) resource scheduling unit. The graphics processing apparatus is mapped to at least one GPU multi-process proxy server (GPU-MPS), and the GPU-MPS is the agent that schedules the graphics processing apparatus. One client of the GPU-MPS can schedule at least one logical unit, and one task process is one client of the GPU-MPS. The maximum number of logical units the graphics processing apparatus can contain is M × N × K;
wherein M is the number of logical units one client of the GPU-MPS can schedule, N is the maximum number of clients one GPU-MPS contains, K is the number of GPU-MPSs to which the graphics processing apparatus is mapped, and M, N and K are all positive integers.
Preferably, one client of the GPU-MPS can schedule one logical unit.
Preferably, the graphics processing apparatus is mapped to one GPU multi-process proxy server.
Preferably, the graphics processing apparatus contains M × N × K logical units.
A resource service apparatus includes at least one graphics processing apparatus as described in any of the above, a monitoring unit and a first communication unit, wherein:
the monitoring unit is configured to monitor, when a monitoring period arrives, the number of remaining logical units in the graphics processing apparatus in the current period;
the first communication unit is configured to send the monitored data to a monitor node in the cluster, so that the monitor node, when an update period arrives, atomically updates a preset resource dynamic table with the monitored data;
wherein the resource dynamic table contains at least the number of remaining logical units in each graphics processing apparatus.
Preferably, the resource service apparatus is a slave node in a cluster.
Preferably, the resource dynamic table also contains the actual utilization of the graphics processing apparatus, and the monitoring unit is further configured to monitor, when the monitoring period arrives, the actual utilization of the local graphics processing apparatus in the current period.
A resource scheduling method, applied to the resource service apparatus described in any of the above, includes:
receiving a scheduling request for scheduling graphics processing unit (GPU) resources for a target job, the scheduling request indicating the number of logical units requested;
in response to the scheduling request, looking up, in a preset resource dynamic table, a graphics processing apparatus whose number of remaining logical units is not zero, and scheduling logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request;
wherein the resource dynamic table contains at least the number of remaining logical units in each graphics processing apparatus.
Preferably, the resource dynamic table also contains the actual utilization of each graphics processing apparatus, and the step of looking up, in a preset resource dynamic table, a graphics processing apparatus whose number of remaining logical units is not zero and scheduling logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request is:
in response to the scheduling request, looking up, in the preset resource dynamic table, a graphics processing apparatus whose actual utilization is less than or equal to a preset maximum threshold and whose number of remaining logical units is not zero, and scheduling logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request.
Preferably, the resource dynamic table also contains the working state of each resource service apparatus in the resource server cluster and the working state of each graphics processing apparatus in the resource service apparatus, and the method further includes:
when an update period arrives, atomically updating, in the resource dynamic table, the working state of the resource service apparatuses and the working state of the graphics processing apparatuses, the working state including working and not working.
A resource scheduling apparatus, applied to the resource service apparatus described in any of the above, includes:
a second communication unit, configured to receive a scheduling request for scheduling graphics processing unit (GPU) resources for a target job, the scheduling request indicating the number of logical units requested;
a response unit, configured to, in response to the scheduling request, look up, in a preset resource dynamic table, a graphics processing apparatus whose number of remaining logical units is not zero, and schedule logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request;
wherein the resource dynamic table contains at least the number of remaining logical units in each graphics processing apparatus.
Preferably, the resource dynamic table also contains the actual utilization of each graphics processing apparatus, and the response unit is specifically configured to, in response to the scheduling request, look up, in the preset resource dynamic table, a graphics processing apparatus whose actual utilization is less than or equal to a preset maximum threshold and whose number of remaining logical units is not zero, and schedule logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request.
Preferably, the resource dynamic table also contains the working state of each resource service apparatus in the resource server cluster and the working state of each graphics processing apparatus in the resource service apparatus, and the apparatus further includes:
an updating unit, configured to, when an update period arrives, atomically update, in the resource dynamic table, the working state of the resource service apparatuses and the working state of the graphics processing apparatuses, the working state including working and not working.
As can be seen from the above embodiments, compared with the prior art, the present application has the following advantages:
Because a logical unit is the smallest GPU resource scheduling unit, different logical units in one graphics processing apparatus can be scheduled to different task processes, so that the jobs of different users jointly occupy the same graphics processing apparatus, which ensures the utilization of GPU resources in the graphics processing apparatus. Meanwhile, the present application uses GPU-MPS technology so that one task process becomes one client of a GPU-MPS; in this way, the GPU-MPS can manage task processes just as it manages clients. Because all clients of one GPU-MPS share one GPU context, the multiple task processes that act as clients of one GPU multi-process proxy server also need to share only one GPU context.
In addition, when scheduling resources, scheduling logical units based on the actual utilization of each GPU also avoids the problem of over-sharing the GPU.
Brief Description of the Drawings
To illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 schematically shows the structure of a graphics processing apparatus according to an embodiment of the present application;
Fig. 2 schematically shows the structure of another graphics processing apparatus according to an embodiment of the present application;
Fig. 3 schematically shows the structure of another graphics processing apparatus according to an embodiment of the present application;
Fig. 4 schematically shows the structure of another graphics processing apparatus according to an embodiment of the present application;
Fig. 5 schematically shows the structure of a resource service apparatus according to an embodiment of the present application;
Fig. 6 schematically shows an exemplary application scenario in which embodiments of the present application can be implemented;
Fig. 7 schematically shows a structural block diagram of a resource scheduling apparatus according to an embodiment of the present application;
Fig. 8 schematically shows a flowchart of a resource scheduling method according to an embodiment of the present application.
Detailed Description of the Embodiments
To make the above objects, features and advantages of the present application clearer and easier to understand, the embodiments of the present application are described in detail below with reference to the accompanying drawings.
A job submitted by a user is composed of multiple tasks, and each task is completed by one task process. Therefore, scheduling GPU resources for a user's job is actually scheduling GPU resources for all the task processes of that job.
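For illustration only, the following minimal Python sketch makes this accounting explicit (the job structure and names are assumptions of this sketch, not part of the disclosure):

```python
# A job consists of tasks; each task is completed by one task process, and
# each task process will become one GPU-MPS client holding its logical unit(s).
job = {"job_id": "job-42", "tasks": ["t0", "t1", "t2"]}  # illustrative job with three tasks

units_per_process = 1  # logical units one client can schedule (M in the formula below)
units_to_request = len(job["tasks"]) * units_per_process

print(units_to_request)  # GPU logical units to schedule for the whole job -> 3
```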
Referring to Fig. 1, which schematically shows the structure of a graphics processing apparatus according to an embodiment of the present application. In this graphics processing apparatus 10, a logical unit 11 is the smallest GPU resource scheduling unit. The graphics processing apparatus is mapped to one GPU multi-process proxy server (GPU-MPS, Graphics Processing Unit - Multiple Process Server) 20. The maximum number of clients of this GPU-MPS 20 is 16, and the GPU-MPS 20 is the agent that schedules the graphics processing apparatus 10. One client of the GPU-MPS 20 can schedule one logical unit 11, one task process is one client of the GPU-MPS 20, and the maximum number of logical units the graphics processing apparatus can contain is 16.
It should be understood that, because a logical unit is the smallest GPU resource scheduling unit, different logical units in one graphics processing apparatus can be scheduled to different task processes, so that the jobs of different users jointly occupy the same graphics processing apparatus, which ensures the utilization of GPU resources in the graphics processing apparatus. Meanwhile, the present application uses GPU-MPS technology so that one task process becomes one client of the GPU-MPS; in this way, the GPU-MPS can manage task processes just as it manages clients. Because all clients of one GPU-MPS share one GPU context, the multiple task processes that act as clients of one GPU-MPS also need to share only one GPU context. For example, when a graphics processing apparatus is mapped to one GPU-MPS, all the task processes to which this graphics processing apparatus is scheduled only need to share one GPU context and no longer need to each create their own GPU context, which reduces the number of GPU contexts and ultimately saves the overhead of creating and switching GPU contexts.
In addition, when configuring logical units for this graphics processing apparatus 10, any number of logical units between 1 and 16 (inclusive) can be configured.
One client of the GPU-MPS 20 can schedule not only one logical unit but also multiple logical units, for example 2, 3 or even more. For example, when one client of the GPU-MPS 20 can schedule two logical units and the graphics processing apparatus 10 is still mapped to one GPU-MPS 20, the maximum number of logical units the graphics processing apparatus 10 can contain is 32, as shown in Fig. 2. It can be seen that, when the number of GPU-MPSs 20 to which the graphics processing apparatus 10 is mapped is fixed, the maximum number of logical units the graphics processing apparatus 10 can contain is related to, and directly proportional to, the number of logical units one client of the GPU-MPS 20 can schedule.
In addition, the graphics processing apparatus 10 can be mapped to only one GPU-MPS 20, or to multiple GPU-MPSs 20, for example 2, 3 or even more. For example, when the graphics processing apparatus 10 is mapped to two GPU-MPSs 20 and one client of a GPU-MPS 20 can schedule one logical unit, the maximum number of logical units the graphics processing apparatus can contain is 32, as shown in Fig. 3. It can be seen that, when the number of logical units one client of a GPU-MPS 20 can schedule is fixed, the maximum number of logical units the graphics processing apparatus 10 can contain is related to, and directly proportional to, the number of GPU-MPSs 20 to which the graphics processing apparatus 10 is mapped.
That is to say, the maximum number of logical units the graphics processing apparatus 10 can contain is related to, and directly proportional to, both the number of logical units one client of the GPU-MPS 20 can schedule and the number of GPU-MPSs 20 to which the graphics processing apparatus 10 is mapped. For example, when the graphics processing apparatus 10 is mapped to two GPU-MPSs 20 and one client of a GPU-MPS 20 can schedule two logical units, the maximum number of logical units the graphics processing apparatus can contain is 64, as shown in Fig. 4.
Therefore, for the graphics processing apparatus 10, the maximum number of logical units it can contain is M × N × K, where M is the number of logical units one client of the GPU-MPS can schedule, N is the maximum number of clients one GPU-MPS contains, K is the number of GPU-MPSs to which the graphics processing apparatus is mapped, and M, N and K are all positive integers.
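For illustration only, the following minimal Python sketch evaluates the formula for the configurations of Figs. 1 to 4 (the function name is an assumption of this sketch, not part of the disclosure):

```python
def max_logical_units(m: int, n: int, k: int) -> int:
    """Maximum number of logical units a graphics processing apparatus can contain.

    m: logical units one GPU-MPS client can schedule
    n: maximum number of clients one GPU-MPS contains
    k: number of GPU-MPSs the apparatus is mapped to
    """
    assert m > 0 and n > 0 and k > 0
    return m * n * k

print(max_logical_units(1, 16, 1))  # Fig. 1 configuration -> 16
print(max_logical_units(2, 16, 1))  # Fig. 2 configuration -> 32
print(max_logical_units(1, 16, 2))  # Fig. 3 configuration -> 32
print(max_logical_units(2, 16, 2))  # Fig. 4 configuration -> 64
```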
When configuring the logical units in the graphics processing apparatus 10, it is sufficient that the configuration stays within the maximum number of logical units the graphics processing apparatus 10 can contain.
In a preferred embodiment of the present application, the graphics processing apparatus 10 contains M × N × K logical units.
In another preferred embodiment of the present application, one client of the GPU-MPS can schedule one logical unit and the graphics processing apparatus 10 is mapped to one GPU-MPS 20. It should be understood that, in this preferred embodiment, the maximum number of logical units the graphics processing apparatus contains is equal to the maximum number of clients one GPU-MPS contains.
In addition, it should be noted that, in physical form, the graphics processing apparatus 10 is a graphics processor.
In addition to the graphics processing apparatus, an embodiment of the present application also provides a resource service apparatus. Referring to Fig. 5, which schematically shows the structure of a resource service apparatus according to an embodiment of the present application, the resource service apparatus 50 includes at least one graphics processing apparatus 51 (for example, two graphics processing apparatuses 511 and 512), a monitoring unit 52 and a first communication unit 53. Further, the graphics processing apparatus 511 is mapped to GPU-MPS 611, and one client of GPU-MPS 611 can call one logical unit in the graphics processing apparatus 511; the graphics processing apparatus 512 is mapped to GPU-MPS 612, and one client of GPU-MPS 612 can call one logical unit in the graphics processing apparatus 512. A task process can be a client of GPU-MPS 611 or a client of GPU-MPS 612.
The monitoring unit 52 is configured to monitor, when a monitoring period arrives, the number of remaining logical units in the graphics processing apparatuses in the current period.
The first communication unit 53 is configured to send the monitored data to a monitor node in the cluster, so that the monitor node, when an update period arrives, atomically updates a preset resource dynamic table with the monitored data.
The resource dynamic table contains at least the number of remaining logical units in each graphics processing apparatus.
Each logical unit can generate a PIPE file under a specified path on the resource server: once a logical unit is used, the corresponding PIPE file is generated. Therefore, the monitoring unit only needs to monitor the number of PIPE files under this path to determine the number of remaining logical units.
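For illustration only, the following minimal Python sketch shows such a monitor; the directory path, helper name and the assumption that remaining = configured total minus PIPE files present are assumptions of this sketch, not part of the disclosure:

```python
import glob
import os

def remaining_logical_units(pipe_dir: str, total_units: int) -> int:
    """Estimate remaining logical units from the PIPE files under pipe_dir.

    Assumes one PIPE file appears for each logical unit that is in use,
    so remaining = configured total - number of PIPE files present."""
    used = len(glob.glob(os.path.join(pipe_dir, "*")))
    return max(total_units - used, 0)

# Example: a GPU configured with 16 logical units, PIPE files kept under a
# hypothetical path such as "/var/run/gpu0_logical_units".
# print(remaining_logical_units("/var/run/gpu0_logical_units", 16))
```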
It should be understood that, when each slave node in the cluster dynamically updates the number of remaining logical units of its local GPUs, this update operation can also support offline scheduling, that is, using GPU resources directly on the local node without scheduling the resources through the unified scheduler.
It should be noted that the structure of the resource service apparatus shown in Fig. 5 is only an example; it can also contain a larger number of graphics processing apparatuses. Moreover, the present application does not limit the number of GPU-MPSs to which each graphics processing apparatus is mapped, the number of logical units one client of a GPU-MPS can call, or the number of logical units each graphics processing apparatus contains.
In a preferred embodiment of the present application, in physical form the resource service apparatus 50 is a resource server.
In another preferred embodiment of the present application, this resource server can be a slave node in a cluster.
For example, referring to Fig. 6, which schematically shows an exemplary application scenario in which embodiments of the present application can be implemented. The cluster includes multiple slave nodes 10 (for convenience of description, only one slave node is shown in Fig. 6), a monitor node 20 and a monitor node 30. The slave node 10 is a resource server that contains multiple graphics processing units (GPUs); only two GPUs, namely GPU-0 and GPU-1, are shown in Fig. 6. Each GPU contains 16 logical units. MPS-0 is the agent that schedules GPU-0 and MPS-1 is the agent that schedules GPU-1, and MPS-0 and MPS-1 each have 16 clients. One client of MPS-0 can schedule one logical unit of GPU-0, and one client of MPS-1 can schedule one logical unit of GPU-1. A task process in a user job can be either a client of MPS-0 or a client of MPS-1.
For example, when a logical unit in GPU-0 is scheduled to a task process of a user job, this task process connects to the agent of the GPU-0 to which the logical unit belongs, that is, to MPS-0.
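For illustration only, the following minimal Python sketch shows this unit-to-agent mapping (the data structures and names are assumptions of this sketch, not part of the disclosure):

```python
from dataclasses import dataclass

@dataclass
class LogicalUnit:
    gpu_id: str  # e.g. "GPU-0"
    index: int   # position of the unit within its GPU, 0..15 in the Fig. 6 example

# Each GPU's scheduling agent, keyed by GPU identifier.
MPS_AGENT = {"GPU-0": "MPS-0", "GPU-1": "MPS-1"}

def agent_for(unit: LogicalUnit) -> str:
    """Return the MPS agent a task process connects to for its scheduled unit."""
    return MPS_AGENT[unit.gpu_id]

# A task process scheduled logical unit 3 of GPU-0 connects to MPS-0.
assert agent_for(LogicalUnit("GPU-0", 3)) == "MPS-0"
```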
The monitor node 30 includes a job management apparatus 31 and a resource scheduling apparatus 32. The job management apparatus 31 first receives a request 61, sent by a cluster client 60, for allocating GPU resources to a target user job; the request 61 indicates the number of logical units requested. The job management apparatus 31 then forwards the request to the resource scheduling apparatus 32.
Fig. 7 shows a structural block diagram of the resource scheduling apparatus. The resource scheduling apparatus 32 includes a second communication unit 321 and a response unit 322. The second communication unit 321 is configured to receive the scheduling request 61 for scheduling GPU resources for the target job. The response unit 322 is configured to, in response to the scheduling request, look up, in a preset resource dynamic table, a graphics processing apparatus whose number of remaining logical units is not zero, and schedule logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request. The resource dynamic table contains at least the number of remaining logical units in each graphics processing apparatus.
In the present application, the resource scheduling apparatus 32 can schedule logical units using any scheduling method of the prior art, for example First Fit scheduling, Best Fit scheduling, Backfill scheduling or CFS scheduling.
The resource scheduling apparatus 32 generates a resource dynamic table, and the slave node 10 dynamically updates in this resource dynamic table the number of remaining logical units in GPU-0 and GPU-1, so that the resource scheduling apparatus 32 can perform resource scheduling according to the remaining logical units of each GPU. A remaining logical unit is a logical unit that has not been scheduled to a task process.
Of course, if the cluster also includes other slave nodes, the resource dynamic table is also maintained dynamically by those slave nodes, and the resource dynamic table also includes the number of remaining logical units in each GPU located on the other slave nodes. That is to say, the resource dynamic table contains the number of remaining logical units in the GPUs of all slave nodes.
In addition, the resource dynamic table can also include the identifiers of all the slave nodes covered by the table and the identifiers of all the GPUs of each slave node, so as to determine the location of each logical unit. For example, for the scenario shown in Fig. 6, the resource dynamic table contains the identifier of slave node 10 (for example, the global number of slave node 10 in the cluster), the identifiers of GPU-0 and GPU-1 contained in slave node 10, and the number of remaining logical units in GPU-0 and GPU-1.
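For illustration only, the following minimal Python sketch shows one possible in-memory layout of such a resource dynamic table (the type and field names are assumptions of this sketch; the disclosure does not prescribe a concrete format):

```python
from dataclasses import dataclass, field

@dataclass
class GpuEntry:
    gpu_id: str               # e.g. "GPU-0"
    remaining_units: int      # logical units not yet scheduled to a task process
    utilization: float = 0.0  # actual utilization, 0.0..1.0 (optional column)
    working: bool = True      # working / not working (optional column)

@dataclass
class NodeEntry:
    node_id: str              # global number of the slave node in the cluster
    working: bool = True
    gpus: list[GpuEntry] = field(default_factory=list)

# Table for the Fig. 6 scenario: one slave node with GPU-0 and GPU-1.
resource_dynamic_table = [
    NodeEntry("node-10", gpus=[GpuEntry("GPU-0", remaining_units=16),
                               GpuEntry("GPU-1", remaining_units=16)]),
]
```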
In addition, considering that the GPU resources actually used by a job may be larger than the GPU resources it requested, the resources a GPU actually uses may be larger than the resources scheduled from it. When the resources of such a GPU are scheduled to different jobs, it is then easy to over-share that GPU.
Therefore, to avoid over-sharing a GPU, the actual utilization of each GPU can also be maintained in the resource dynamic table, so that the resource scheduling apparatus schedules the logical units of each GPU according to its actual utilization. That is to say, the resource dynamic table contains the identifiers of all the slave nodes in the cluster, the identifiers of all the GPUs of each slave node, the number of remaining logical units in each GPU, and the actual utilization of each GPU.
In a preferred embodiment of the present application, the resource dynamic table also contains the actual utilization of GPU-0 and GPU-1, and the monitoring unit in the slave node 10 is further configured to monitor, when the monitoring period arrives, the actual utilization of GPU-0 and GPU-1 in the current period.
Correspondingly, for the monitor node 30, the response unit 322 in the resource scheduling apparatus 32 is specifically configured to, in response to the scheduling request, look up, in the preset resource dynamic table, a graphics processing apparatus whose actual utilization is less than or equal to a preset maximum threshold and whose number of remaining logical units is not zero, and schedule logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request.
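For illustration only, the following minimal first-fit-style Python sketch, which reuses the NodeEntry/GpuEntry types sketched above, shows this selection step; the threshold value is an assumption, and the sketch simplifies by requiring a single GPU to cover the whole request:

```python
MAX_UTILIZATION = 0.9  # preset maximum threshold (value assumed for illustration)

def schedule_units(table, requested):
    """Pick the first working GPU whose actual utilization is within the threshold
    and whose remaining logical units can cover the request, then reserve them."""
    for node in table:
        if not node.working:
            continue
        for gpu in node.gpus:
            if (gpu.working and gpu.utilization <= MAX_UTILIZATION
                    and gpu.remaining_units >= requested):
                gpu.remaining_units -= requested  # the scheduler also updates the table
                return node.node_id, gpu.gpu_id
    return None  # no suitable graphics processing apparatus found

# Example: request 4 logical units for a target job.
# print(schedule_units(resource_dynamic_table, 4))
```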
In another preferred embodiment of the present application, the resource dynamic table can also contain the working state of each resource service apparatus in the resource server cluster and the working state of each graphics processing apparatus in the resource service apparatus, both dynamically updated by the resource scheduling apparatus. The resource scheduling apparatus 32 then also includes:
an updating unit, configured to, when an update period arrives, atomically update, in the resource dynamic table, the working state of the resource service apparatuses, the working state of the graphics processing apparatuses and the usage state; the working state includes working and not working, and the usage state includes the usage amount and the overall utilization of the logical units.
For example, when a slave node or a GPU is removed, or when a slave node or a GPU fails, its working state changes from working to not working; when a new slave node or a new GPU is added, its working state is set to working.
In the present application, the updating unit 323 can initialize the resource dynamic table when the cluster is initialized, and can also update the resource dynamic table during job migration, when a task needs to be migrated because of a job migration failure or for reasons of QoS. In addition, the updating unit can also update the number of remaining logical units in each GPU in the resource dynamic table according to the scheduling responses.
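For illustration only, the following minimal Python sketch, again reusing the NodeEntry/GpuEntry types sketched above, shows one way such updates could be made atomic within a single process (the lock-based approach and names are assumptions of this sketch; the disclosure does not specify a synchronization mechanism):

```python
import threading

_table_lock = threading.Lock()

def atomic_update(table, node_id, gpu_id, **fields):
    """Apply one update to the resource dynamic table under a lock, so that
    concurrent scheduling responses and monitoring reports do not interleave."""
    with _table_lock:
        for node in table:
            if node.node_id != node_id:
                continue
            for gpu in node.gpus:
                if gpu.gpu_id == gpu_id:
                    for name, value in fields.items():
                        setattr(gpu, name, value)
                    return True
    return False

# e.g. atomic_update(resource_dynamic_table, "node-10", "GPU-1",
#                    remaining_units=12, working=True)
```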
Corresponding to the above resource scheduling apparatus, an embodiment of the present application also provides a resource scheduling method. Referring to Fig. 8, which schematically shows a flowchart of a resource scheduling method according to an embodiment of the present application, the method can be executed by the resource scheduling apparatus 32 and may include the following steps.
Step 801: receive a scheduling request for scheduling graphics processing unit (GPU) resources for a target job, the scheduling request indicating the number of logical units requested.
Step 802: in response to the scheduling request, look up, in a preset resource dynamic table, a graphics processing apparatus whose number of remaining logical units is not zero, and schedule logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request.
The resource dynamic table contains at least the number of remaining logical units in each graphics processing apparatus.
In a preferred embodiment of the present application, the resource dynamic table also contains the actual utilization of each graphics processing apparatus, and step 802 is: in response to the scheduling request, look up, in the preset resource dynamic table, a graphics processing apparatus whose actual utilization is less than or equal to a preset maximum threshold and whose number of remaining logical units is not zero, and schedule logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request.
In another preferred embodiment of the present application, the resource dynamic table also contains the working state of each resource service apparatus in the resource server cluster and the working state of each graphics processing apparatus in the resource service apparatus. The method may further include: when an update period arrives, atomically updating, in the resource dynamic table, the working state of the resource service apparatuses and the working state of the graphics processing apparatuses, the working state including working and not working.
As can be seen from the above embodiments, compared with the prior art, the present application has the following advantages:
Because a logical unit is the smallest GPU resource scheduling unit, different logical units in one graphics processing apparatus can be scheduled to different task processes, so that the jobs of different users jointly occupy the same graphics processing apparatus, which ensures the utilization of GPU resources in the graphics processing apparatus. Meanwhile, the present application uses GPU-MPS technology so that one task process becomes one client of a GPU-MPS; in this way, the GPU-MPS can manage task processes just as it manages clients. Because all clients of one GPU-MPS share one GPU context, the multiple task processes that act as clients of one GPU-MPS also need to share only one GPU context.
In addition, when scheduling resources, scheduling logical units based on the actual utilization of each GPU also avoids the problem of over-sharing the GPU.
A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, apparatuses and methods can be implemented in other ways. For example, the apparatus embodiments described above are only schematic; the division of the units is only a logical functional division, and there may be other division modes in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
It should be noted that a person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and when executed, the program may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
The graphics processing apparatus, resource service apparatus, resource scheduling method and resource scheduling apparatus provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present application, and the description of the above embodiments is only intended to help understand the method and core idea of the present application. Meanwhile, a person of ordinary skill in the art may, based on the idea of the present application, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as a limitation of the present application.
Claims (13)
1. A graphics processing apparatus, characterized in that, in the graphics processing apparatus, a logical unit is the smallest graphics processing unit (GPU) resource scheduling unit; the graphics processing apparatus is mapped to at least one GPU multi-process proxy server (GPU-MPS); the GPU-MPS is the agent that schedules the graphics processing apparatus; one client of the GPU-MPS can schedule at least one logical unit; one task process is one client of the GPU-MPS; and the maximum number of logical units the graphics processing apparatus can contain is M × N × K;
wherein M is the number of logical units one client of the GPU-MPS can schedule, N is the maximum number of clients one GPU-MPS contains, K is the number of GPU-MPSs to which the graphics processing apparatus is mapped, and M, N and K are all positive integers.
2. The graphics processing apparatus according to claim 1, characterized in that one client of the GPU-MPS can schedule one logical unit.
3. The graphics processing apparatus according to claim 1 or 2, characterized in that the graphics processing apparatus is mapped to one GPU multi-process proxy server.
4. The graphics processing apparatus according to claim 1, characterized in that the graphics processing apparatus contains M × N × K logical units.
5. A resource service apparatus, characterized by comprising at least one graphics processing apparatus according to any one of claims 1 to 4, a monitoring unit and a first communication unit, wherein:
the monitoring unit is configured to monitor, when a monitoring period arrives, the number of remaining logical units in the graphics processing apparatus in the current period;
the first communication unit is configured to send the monitored data to a monitor node in a cluster, so that the monitor node, when an update period arrives, atomically updates a preset resource dynamic table with the monitored data;
wherein the resource dynamic table contains at least the number of remaining logical units in each graphics processing apparatus.
6. The resource service apparatus according to claim 5, characterized in that the resource service apparatus is a slave node in a cluster.
7. The resource service apparatus according to claim 5, characterized in that the resource dynamic table also contains the actual utilization of the graphics processing apparatus, and the monitoring unit is further configured to monitor, when the monitoring period arrives, the actual utilization of the local graphics processing apparatus in the current period.
8. A resource scheduling method, characterized by being applied to the resource service apparatus according to any one of claims 5 to 7, the method comprising:
receiving a scheduling request for scheduling graphics processing unit (GPU) resources for a target job, the scheduling request indicating the number of logical units requested;
in response to the scheduling request, looking up, in a preset resource dynamic table, a graphics processing apparatus whose number of remaining logical units is not zero, and scheduling logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request;
wherein the resource dynamic table contains at least the number of remaining logical units in each graphics processing apparatus.
9. The method according to claim 8, characterized in that the resource dynamic table also contains the actual utilization of each graphics processing apparatus; and
the looking up, in a preset resource dynamic table, a graphics processing apparatus whose number of remaining logical units is not zero, and scheduling logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request is:
in response to the scheduling request, looking up, in the preset resource dynamic table, a graphics processing apparatus whose actual utilization is less than or equal to a preset maximum threshold and whose number of remaining logical units is not zero, and scheduling logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request.
10. The method according to claim 8 or 9, characterized in that the resource dynamic table also contains the working state of each resource service apparatus in the resource server cluster and the working state of each graphics processing apparatus in the resource service apparatus; and the method further comprises:
when an update period arrives, atomically updating, in the resource dynamic table, the working state of the resource service apparatuses and the working state of the graphics processing apparatuses, the working state including working and not working.
11. A resource scheduling apparatus, characterized by being applied to the resource service apparatus according to any one of claims 5 to 7, and comprising:
a second communication unit, configured to receive a scheduling request for scheduling graphics processing unit (GPU) resources for a target job, the scheduling request indicating the number of logical units requested;
a response unit, configured to, in response to the scheduling request, look up, in a preset resource dynamic table, a graphics processing apparatus whose number of remaining logical units is not zero, and schedule logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request;
wherein the resource dynamic table contains at least the number of remaining logical units in each graphics processing apparatus.
12. The apparatus according to claim 11, characterized in that the resource dynamic table also contains the actual utilization of each graphics processing apparatus; and
the response unit is specifically configured to, in response to the scheduling request, look up, in the preset resource dynamic table, a graphics processing apparatus whose actual utilization is less than or equal to a preset maximum threshold and whose number of remaining logical units is not zero, and schedule logical units for the target job from the found graphics processing apparatus according to the number indicated in the scheduling request.
13. The apparatus according to claim 11 or 12, characterized in that the resource dynamic table also contains the working state of each resource service apparatus in the resource server cluster and the working state of each graphics processing apparatus in the resource service apparatus; and the apparatus further comprises:
an updating unit, configured to, when an update period arrives, atomically update, in the resource dynamic table, the working state of the resource service apparatuses and the working state of the graphics processing apparatuses, the working state including working and not working.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510208923.0A CN106155811B (en) | 2015-04-28 | 2015-04-28 | Resource service device, resource scheduling method and device |
PCT/CN2016/079865 WO2016173450A1 (en) | 2015-04-28 | 2016-04-21 | Graphic processing device, resource service device, resource scheduling method and device thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510208923.0A CN106155811B (en) | 2015-04-28 | 2015-04-28 | Resource service device, resource scheduling method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106155811A true CN106155811A (en) | 2016-11-23 |
CN106155811B CN106155811B (en) | 2020-01-07 |
Family
ID=57198136
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510208923.0A Active CN106155811B (en) | 2015-04-28 | 2015-04-28 | Resource service device, resource scheduling method and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN106155811B (en) |
WO (1) | WO2016173450A1 (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106686352A (en) * | 2016-12-23 | 2017-05-17 | 北京大学 | Real-time processing method of multi-channel video data on multi-GPU platform |
CN107247629A (en) * | 2017-07-04 | 2017-10-13 | 北京百度网讯科技有限公司 | Cloud computing system and cloud computing method and device for controlling server |
CN107329834A (en) * | 2017-07-04 | 2017-11-07 | 北京百度网讯科技有限公司 | Method and apparatus for performing calculating task |
WO2018233299A1 (en) * | 2017-06-22 | 2018-12-27 | 平安科技(深圳)有限公司 | Method, apparatus and device for scheduling processor, and medium |
CN109936604A (en) * | 2017-12-18 | 2019-06-25 | 北京图森未来科技有限公司 | A resource scheduling method, device and system |
CN110795249A (en) * | 2019-10-30 | 2020-02-14 | 亚信科技(中国)有限公司 | GPU resource scheduling method and device based on MESOS containerized platform |
CN111400051A (en) * | 2020-03-31 | 2020-07-10 | 京东方科技集团股份有限公司 | Resource scheduling method, device and system |
WO2021057405A1 (en) * | 2019-09-25 | 2021-04-01 | 中兴通讯股份有限公司 | Resource sharing method and device |
WO2021142614A1 (en) * | 2020-01-14 | 2021-07-22 | 华为技术有限公司 | Chip state determining method and device, and cluster resource scheduling method and device |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107544845B (en) * | 2017-06-26 | 2020-08-11 | 新华三大数据技术有限公司 | GPU resource scheduling method and device |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102541640A (en) * | 2011-12-28 | 2012-07-04 | 厦门市美亚柏科信息股份有限公司 | Cluster GPU (graphic processing unit) resource scheduling system and method |
US20120179851A1 (en) * | 2010-12-15 | 2012-07-12 | Advanced Micro Devices, Inc. | Computer System Interrupt Handling |
US20120188259A1 (en) * | 2010-12-13 | 2012-07-26 | Advanced Micro Devices, Inc. | Mechanisms for Enabling Task Scheduling |
CN102959517A (en) * | 2010-06-10 | 2013-03-06 | Otoy公司 | Allocation of gpu resources accross multiple clients |
US20140108915A1 (en) * | 2012-10-15 | 2014-04-17 | Famous Industries, Inc. | Efficient Manipulation of Surfaces in Multi-Dimensional Space Using Energy Agents |
CN104541247A (en) * | 2012-08-07 | 2015-04-22 | 超威半导体公司 | System and method for tuning a cloud computing system |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7673304B2 (en) * | 2003-02-18 | 2010-03-02 | Microsoft Corporation | Multithreaded kernel for graphics processing unit |
CN101403983B (en) * | 2008-11-25 | 2010-10-13 | 北京航空航天大学 | Resource monitoring method and system for multi-core processor based on virtual machine |
CN104407920B (en) * | 2014-12-23 | 2018-02-09 | 浪潮(北京)电子信息产业有限公司 | A kind of data processing method and system based on interprocess communication |
- 2015
  - 2015-04-28 CN CN201510208923.0A patent/CN106155811B/en active Active
- 2016
  - 2016-04-21 WO PCT/CN2016/079865 patent/WO2016173450A1/en active Application Filing
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102959517A (en) * | 2010-06-10 | 2013-03-06 | Otoy公司 | Allocation of gpu resources accross multiple clients |
US20120188259A1 (en) * | 2010-12-13 | 2012-07-26 | Advanced Micro Devices, Inc. | Mechanisms for Enabling Task Scheduling |
US20120179851A1 (en) * | 2010-12-15 | 2012-07-12 | Advanced Micro Devices, Inc. | Computer System Interrupt Handling |
CN102541640A (en) * | 2011-12-28 | 2012-07-04 | 厦门市美亚柏科信息股份有限公司 | Cluster GPU (graphic processing unit) resource scheduling system and method |
CN104541247A (en) * | 2012-08-07 | 2015-04-22 | 超威半导体公司 | System and method for tuning a cloud computing system |
US20140108915A1 (en) * | 2012-10-15 | 2014-04-17 | Famous Industries, Inc. | Efficient Manipulation of Surfaces in Multi-Dimensional Space Using Energy Agents |
Non-Patent Citations (3)
Title |
---|
WANG XIAN ET AL: "Multi-GPU performance of incompressible flow computation by lattice Boltzmann method on GPU cluster", PARALLEL COMPUTING *
ZHANG QINFEI: "Design and Implementation of a General Parallel Rendering System Based on GPU Clusters", Wanfang Dissertation Database *
CHEN QINGKUI ET AL: "A Dynamic Task Mapping Strategy for GPU Clusters", Computer Engineering *
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106686352A (en) * | 2016-12-23 | 2017-05-17 | 北京大学 | Real-time processing method of multi-channel video data on multi-GPU platform |
CN106686352B (en) * | 2016-12-23 | 2019-06-07 | 北京大学 | The real-time processing method of the multi-path video data of more GPU platforms |
WO2018233299A1 (en) * | 2017-06-22 | 2018-12-27 | 平安科技(深圳)有限公司 | Method, apparatus and device for scheduling processor, and medium |
CN107247629A (en) * | 2017-07-04 | 2017-10-13 | 北京百度网讯科技有限公司 | Cloud computing system and cloud computing method and device for controlling server |
CN107329834A (en) * | 2017-07-04 | 2017-11-07 | 北京百度网讯科技有限公司 | Method and apparatus for performing calculating task |
CN109936604A (en) * | 2017-12-18 | 2019-06-25 | 北京图森未来科技有限公司 | A resource scheduling method, device and system |
WO2021057405A1 (en) * | 2019-09-25 | 2021-04-01 | 中兴通讯股份有限公司 | Resource sharing method and device |
CN110795249A (en) * | 2019-10-30 | 2020-02-14 | 亚信科技(中国)有限公司 | GPU resource scheduling method and device based on MESOS containerized platform |
WO2021142614A1 (en) * | 2020-01-14 | 2021-07-22 | 华为技术有限公司 | Chip state determining method and device, and cluster resource scheduling method and device |
CN111400051A (en) * | 2020-03-31 | 2020-07-10 | 京东方科技集团股份有限公司 | Resource scheduling method, device and system |
CN111400051B (en) * | 2020-03-31 | 2023-10-27 | 京东方科技集团股份有限公司 | Resource scheduling method, device and system |
Also Published As
Publication number | Publication date |
---|---|
WO2016173450A1 (en) | 2016-11-03 |
CN106155811B (en) | 2020-01-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106155811A (en) | Graphic processing facility, resource service device, resource regulating method and device | |
US10963285B2 (en) | Resource management for virtual machines in cloud computing systems | |
US11455193B2 (en) | Method for deploying virtual machines in cloud computing systems based on predicted lifetime | |
CN110427284A (en) | Data processing method, distributed system, computer system and medium | |
JP2018518744A (en) | Automatic scaling of resource instance groups within a compute cluster | |
US9218226B2 (en) | System and methods for remote access to IMS databases | |
CN106933669A (en) | For the apparatus and method of data processing | |
CN106254471A (en) | Resource United Dispatching method and system under a kind of isomery cloud environment | |
KR102338849B1 (en) | Method and system for providing stack memory management in real-time operating systems | |
CN106657314A (en) | Cross-data center data synchronization system and method | |
CN105607950A (en) | Virtual machine resource configuration method and apparatus | |
CN107479984A (en) | Message based distributed space data processing system | |
KR20180038515A (en) | Graphical processing virtualization on the provider network | |
WO2015179509A1 (en) | High-performance computing framework for cloud computing environments | |
CN113835830A (en) | AI-based RPA cluster management method, device and storage medium | |
US8977752B2 (en) | Event-based dynamic resource provisioning | |
AU2018303662A1 (en) | Scalable statistics and analytics mechanisms in cloud networking | |
CN115951974A (en) | Management method, system, device and medium for GPU virtual machine | |
CN111404757A (en) | Cloud-based cross-network application integration system | |
US10802874B1 (en) | Cloud agnostic task scheduler | |
CN105653347B (en) | A kind of server, method for managing resource and virtual machine manager | |
CN107528871A (en) | Data analysis in storage system | |
US11656914B2 (en) | Anticipating future resource consumption based on user sessions | |
CN113726902A (en) | Calling method and system of microservice | |
CN104717269A (en) | Method for monitoring and dispatching cloud public platform computer resources for location-based service |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||