WO2024125341A1 - Task scheduling method, apparatus and system - Google Patents
- Publication number
- WO2024125341A1 (PCT/CN2023/136252; CN2023136252W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- task
- tasks
- blocked
- synchronization
- determined
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources to service a request, considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
- G06F9/52—Program synchronisation; Mutual exclusion, e.g. by means of semaphores
Definitions
- the present application relates to the field of processor technology, and in particular to a task scheduling method, device and system.
- the task scheduler is one of the core components of a processor; it is used to schedule tasks onto processing units so that the processor's hardware resources are fully utilized and tasks are processed efficiently.
- how reasonably the task scheduler schedules tasks is crucial to shortening task waiting time, increasing task parallelism, optimizing resource utilization, and reducing cost while increasing efficiency. How to achieve reasonable scheduling of tasks has therefore become a mainstream research direction in task scheduler design.
- a plurality of task queues are provided in the task scheduler, and the plurality of task queues are respectively used to store tasks belonging to different businesses, and the tasks in the plurality of task queues are executed in parallel, and the tasks in one task queue are executed in series.
- although this solution can process tasks belonging to different businesses in parallel, tasks belonging to the same business must still be processed serially.
- if the previous task in a task queue has not finished, the tasks queued after it cannot use idle hardware resources ahead of time. This solution therefore still leaves the processing unit unloaded at times and cannot make full use of the processing unit's hardware resources to process tasks with high concurrency, which is not conducive to improving the utilization of the processing unit.
- the present application provides a task scheduling method, device and system for improving the utilization rate of a processing unit.
- the present application provides a task scheduling method, which is applicable to a task scheduler, and the task scheduler can be any device, apparatus or equipment with task scheduling capability, or a chip or circuit, without limitation.
- the method comprises: the task scheduler obtains N tasks and the association relationships between the N tasks, determines, according to those association relationships, the tasks that have no dependencies or whose dependencies have been released as schedulable tasks, and schedules the schedulable tasks to the processing unit.
- N is a positive integer.
- N tasks can be exemplarily sent to the task scheduler by the device driver package.
- the scheduling order of N tasks is maintained by the task scheduler from the hardware side, instead of the device driver package specifying the task execution order from the software side.
- the tasks without dependency or whose dependency has been released can be sent to the processing unit in advance for processing, so as to make full use of all the hardware resources of the processing unit to process the tasks with high concurrency, and to prevent the processing unit from being idle as much as possible, thereby effectively improving the utilization rate of the processing unit.
- the task scheduling method does not require the hardware side to send a notification message to the software side after each task is executed to instruct the software side to send a new task, which can effectively reduce the work pressure on the software side, improve the efficiency of task scheduling, and save communication overhead.
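To make the scheduling idea above concrete, the following sketch models it in Python. This is purely illustrative, not the patented hardware implementation; the names `Task` and `schedule`, and the wave-based dispatch model, are invented here. Instead of forcing serial execution within a queue, every task whose dependencies have all completed is dispatched in the current wave:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    tid: int
    deps: set = field(default_factory=set)  # ids of tasks this task depends on

def schedule(tasks):
    """Dispatch tasks in waves: each wave holds every task whose
    dependencies have already completed, maximizing concurrency."""
    done = set()
    pending = {t.tid: t for t in tasks}
    waves = []
    while pending:
        # all tasks with no unfinished dependency are schedulable now
        ready = [t for t in pending.values() if t.deps <= done]
        if not ready:
            raise RuntimeError("dependency cycle among pending tasks")
        waves.append(sorted(t.tid for t in ready))
        for t in ready:
            done.add(t.tid)
            del pending[t.tid]
    return waves
```

With tasks 1 and 2 both depending only on task 0, they run in the same wave rather than serially, which is the utilization gain the method claims.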
- association relationship of the N tasks may include synchronization association and/or dependency association, where:
- Synchronous association is used to indicate that a task is associated with all other tasks that are obtained earlier or later than the task.
- the synchronous association may include forward synchronous association and backward synchronous association.
- Forward synchronous association means that a task depends on all other tasks that are issued earlier than the task.
- the task that has a forward synchronous association with all other tasks is also called a forward synchronous task.
- Backward synchronous association means that all other tasks that are issued later than a task depend on the task.
- the task that has a backward synchronous association with all other tasks is also called a backward synchronous task;
- Dependency associations include partial dependency and serial dependency.
- Partial dependency means that a task depends on the execution results of some other tasks; that is, the task can be executed only after the specific tasks it depends on have all completed.
- Serial dependency means that a task depends on the execution of some or all other tasks in sequence; that is, the task can be executed only after all other tasks it serially depends on have completed.
- the device driver package only needs to configure a task once to indicate the dependency relationship between that task and all other tasks acquired earlier or later than it, without indicating each depended-on task one by one in the task's configuration information.
- Setting the dependency association can also single out tasks that are only loosely related, apart from the synchronization association.
- the complexity of the task data structure can be greatly reduced while indicating the association relationship between all tasks, which helps to alleviate the communication consumption between the device driver package and the task scheduler.
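The data-structure saving described above can be sketched as follows. This is a hypothetical descriptor layout (the names `Sync` and `TaskDesc` are invented here, not from the patent): a forward- or backward-synchronization task carries only a flag, while per-task dependency lists are reserved for the few explicit partial dependencies:

```python
from dataclasses import dataclass, field
from enum import Flag, auto

class Sync(Flag):
    NONE = 0
    FORWARD = auto()   # this task depends on every task issued earlier
    BACKWARD = auto()  # every task issued later depends on this task

@dataclass
class TaskDesc:
    tid: int
    sync: Sync = Sync.NONE
    deps: set = field(default_factory=set)  # explicit partial-dependency ids

# A forward-synchronization task needs no dependency list at all: the
# scheduler derives its predecessors from the issue order alone.
barrier = TaskDesc(7, Sync.FORWARD)
plain = TaskDesc(8, deps={3, 5})
```

One flag thus replaces an arbitrarily long list of predecessor ids, which is why the patent argues the scheme eases communication between the device driver package and the task scheduler.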
- the task scheduler determines the tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks according to the association relationships among the N tasks, including: the task scheduler determines, according to the association relationships, the tasks that have a synchronization association with other tasks, and when the other tasks synchronously associated with such a task have finished executing, determines that task as a schedulable task. In this way, by monitoring the execution status of synchronously associated tasks, a task can be promptly scheduled to the processing unit once the other tasks synchronously associated with it no longer block it.
- the other tasks that have a synchronization association with the task having finished executing includes: the task is a forward synchronization task and all other tasks acquired earlier than it have completed; and/or the task is acquired later than a backward synchronization task and that backward synchronization task has completed.
- once the task is no longer blocked by all other tasks acquired earlier than it, or by a backward synchronization task, it is determined that the task's dependency has been released, so that the task can be scheduled to the processing unit as soon as possible.
- the task scheduler determines the tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks according to the association relationships among the N tasks, including: the task scheduler determines, according to the association relationships, the tasks that have a dependency association with other tasks, and when the other tasks that have a dependency association with such a task have finished executing, determines that task as a schedulable task. In this way, by monitoring the execution status of dependency-associated tasks, a task can be scheduled to the processing unit in a timely manner once the other tasks it depends on no longer block it.
- the task scheduler determines, based on the association relationships among the N tasks, the tasks that are neither synchronously blocked nor dependency-blocked by other tasks as schedulable tasks.
- a task being synchronously blocked by other tasks means the task meets at least one of the following conditions: the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing; or the task is acquired later than a backward synchronization task and that backward synchronization task has not finished executing.
- a task being dependency-blocked by other tasks means the task meets at least one of the following conditions: the task has a partial dependency association with other tasks and the tasks it depends on have not all finished executing; or the task has a serial dependency association with other tasks and the tasks it depends on have not all finished executing.
- the task scheduler determines the tasks among the N tasks that are neither synchronously blocked nor dependency-blocked by other tasks as schedulable tasks according to the association relationships among the N tasks, including: the task scheduler traverses, in the order in which the N tasks were acquired, each task not yet determined to be blocked, and for each such task: if the task is a forward synchronization task and the tasks acquired earlier than it have not all finished executing, the task is determined to be synchronously blocked; if the task is a backward synchronization task, all tasks acquired later than it are determined to be synchronously blocked; if the task partially depends on other tasks that have not all finished executing, the task is determined to be dependency-blocked; if the task serially depends on other tasks that have not all finished executing, the task is determined to be dependency-blocked; and when the task is neither synchronously blocked nor dependency-blocked, the task is determined to be a schedulable task.
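The traversal just described can be sketched in Python as follows. This is an illustrative model, not the patented logic: `T`, `find_schedulable`, and the `fwd`/`bwd` flags are names invented here. A forward flag makes a task wait for all earlier unfinished tasks, and an unfinished backward-sync task raises a barrier that blocks every later task:

```python
from dataclasses import dataclass, field

@dataclass
class T:
    tid: int
    fwd: bool = False          # forward synchronization task
    bwd: bool = False          # backward synchronization task
    deps: set = field(default_factory=set)  # partial/serial dependency ids

def find_schedulable(tasks, done):
    """One traversal pass in acquisition order; returns ids of tasks that
    are neither synchronously blocked nor dependency-blocked."""
    sched = []
    barrier = False  # set once an unfinished backward-sync task is seen
    for i, t in enumerate(tasks):
        if t.tid in done:
            continue  # a completed backward-sync task no longer blocks anyone
        blocked = barrier
        # forward-sync: blocked while any earlier task is unfinished
        if t.fwd and any(p.tid not in done for p in tasks[:i]):
            blocked = True
        # backward-sync: everything after this point is sync-blocked
        if t.bwd:
            barrier = True
        # dependency blocking: any named dependency still unfinished
        if any(d not in done for d in t.deps):
            blocked = True
        if not blocked:
            sched.append(t.tid)
    return sched
```

Note that an unfinished backward-sync task can itself be schedulable; only the tasks acquired after it are held back until it completes.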
- the task being synchronously blocked may include the task being synchronously blocked by the entire group and/or the task being synchronously blocked by a sub-group, in which case:
- the other tasks used to determine whether a task is synchronously blocked by the entire group are all tasks in the entire group acquired earlier or later than the task. For example, when a task is marked as a forward synchronization task of the entire group, as long as the other tasks in the entire group issued earlier than it have not all finished executing, the task is determined to be synchronously blocked by those tasks. When a task is a backward synchronization task of the entire group, all other tasks in the entire group acquired later than it are determined to be synchronously blocked by the task;
- the other tasks used to determine whether a task is synchronously blocked by a subgroup are all tasks in the subgroup to which the task belongs acquired earlier or later than the task. For example, when a task is marked as a forward synchronization task of a subgroup, as long as the other tasks in the subgroup issued earlier than it have not all finished executing, the task is determined to be synchronously blocked by those tasks. When a task is a backward synchronization task of a subgroup, all other tasks in the subgroup acquired later than it are determined to be synchronously blocked by the task.
- sub-groups can be obtained by dividing according to business characteristics. In this way, by grouping tasks according to business characteristics, even if tasks in a group are synchronously blocked, tasks in other groups will not be affected, that is, synchronous blocking of a business will not affect the execution of other businesses. It can be seen that by decoupling the task execution association of the business, mutual interference between businesses can be reduced.
- the sub-groups may be obtained by dividing the tasks with dense association relationships. Tasks with relatively concentrated relationships are divided into a sub-group. By only marking the forward synchronization task or backward synchronization task in the sub-group, the relationship between the tasks in the sub-group can be known without marking the dependency relationship of each task. This can streamline the data structure of each task and reduce the communication overhead between the device driver package and the task scheduler.
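The subgroup decoupling above can be sketched by scoping the synchronization barrier per group. This is again an invented illustration (the names `T`, `find_schedulable_grouped`, and the `group` field are not from the patent): a backward-sync task blocks only later tasks of its own subgroup, so one business's barrier cannot stall another business:

```python
from dataclasses import dataclass, field

@dataclass
class T:
    tid: int
    group: int = 0             # subgroup (e.g. business) this task belongs to
    fwd: bool = False          # forward synchronization task within its group
    bwd: bool = False          # backward synchronization task within its group
    deps: set = field(default_factory=set)

def find_schedulable_grouped(tasks, done):
    """Like a global sync barrier, but scoped: blocking state is tracked
    per subgroup, so groups cannot interfere with one another."""
    sched = []
    barriers = set()   # groups currently behind an unfinished backward-sync task
    unfinished = {}    # group -> ids of earlier, still-unfinished tasks
    for t in tasks:
        if t.tid in done:
            continue   # finished tasks neither block nor get rescheduled
        blocked = t.group in barriers
        if t.fwd and unfinished.get(t.group):
            blocked = True           # earlier tasks of the same group pending
        if t.bwd:
            barriers.add(t.group)    # blocks only later tasks of this group
        if any(d not in done for d in t.deps):
            blocked = True
        if not blocked:
            sched.append(t.tid)
        unfinished.setdefault(t.group, set()).add(t.tid)
    return sched
```

In the test below, group 0's backward-sync task blocks its own successor (task 1) but leaves group 1's task 2 schedulable, which is the cross-business isolation the design aims for.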
- the task scheduler can also store the N tasks in the first waiting queue.
- the task scheduler determines the tasks among the N tasks that are not synchronously blocked by other tasks and are not blocked by other task dependencies as schedulable tasks based on the relationship between the N tasks, including: for any task in the first waiting queue, determine whether the task is synchronously blocked by other tasks, if not, move the task from the first waiting queue to the second waiting queue; and for any task in the second waiting queue, determine whether the task is blocked by other task dependencies, if not, move the task from the second waiting queue to the ready queue. Further, the task scheduler schedules the schedulable tasks to the processing unit, including: scheduling the tasks in the ready queue to the processing unit.
- the task scheduler schedules the tasks in the ready queue to the processing unit, including: the task scheduler monitors the number of tasks currently being executed by the processing unit, and when that number is less than the number of tasks the processing unit can execute in parallel, schedules the tasks in the ready queue to the processing unit in sequence, in the order in which they were acquired.
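The two-stage queue movement and capacity-limited dispatch described above can be sketched as follows (illustrative only; `advance_queues`, `dispatch`, and the predicate callbacks are names invented here). Tasks move from the first waiting queue to the second once sync-unblocked, then to the ready queue once dependency-unblocked, and are dispatched while the processing unit has spare parallel slots:

```python
from collections import deque

def advance_queues(wait1, wait2, ready, is_sync_blocked, is_dep_blocked):
    """One pass of the first-wait -> second-wait -> ready pipeline; blocked
    tasks stay in their queue, preserving relative acquisition order."""
    for _ in range(len(wait1)):
        t = wait1.popleft()
        (wait1 if is_sync_blocked(t) else wait2).append(t)
    for _ in range(len(wait2)):
        t = wait2.popleft()
        (wait2 if is_dep_blocked(t) else ready).append(t)

def dispatch(ready, running, max_parallel):
    """Dispatch ready tasks in acquisition order while the processing unit
    still has spare parallel slots."""
    dispatched = []
    while ready and len(running) < max_parallel:
        t = ready.popleft()
        running.add(t)
        dispatched.append(t)
    return dispatched
```

Splitting the two blocking checks across two queues means the cheap synchronization check filters tasks before the dependency check runs, and the ready queue always holds tasks that can be dispatched immediately.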
- the present application provides a task scheduler, including: a task acquirer, used to acquire N tasks and the association relationships between the N tasks, where N is a positive integer; a blocking manager, used to determine the non-dependent or released dependent tasks among the N tasks as schedulable tasks based on the association relationships among the N tasks; and a task dispatcher, which schedules the schedulable tasks to the processing unit.
- the task acquirer is specifically used to receive N tasks issued by the device driver package and the association relationship between the N tasks.
- the blocking manager is specifically used to: determine, according to the association relationships of the N tasks, the tasks that have a synchronization association with other tasks, and when the other tasks that have a synchronization association with such a task have finished executing, determine the task as a schedulable task.
- the synchronization association with other tasks means that the task has an association relationship with all other tasks that are acquired earlier or later than the task.
- the completion of execution of other tasks corresponding to the task and having synchronization association with the task includes at least one of the following contents: the task is a forward synchronization task, and all other tasks acquired earlier than the task are completed; the task is a task acquired later than the backward synchronization task, and the backward synchronization task is completed.
- the forward synchronization task is used to indicate that the forward synchronization task depends on all other tasks acquired earlier than the forward synchronization task
- the backward synchronization task is used to indicate that all other tasks acquired later than the backward synchronization task depend on the backward synchronization task.
- the blocking manager is specifically used to: determine, based on the association relationships among the N tasks, the tasks that have a dependency association with other tasks, and determine such a task as a schedulable task when the other tasks it depends on have finished executing.
- the blocking manager is specifically used to: determine the tasks among the N tasks that are not synchronously blocked by other tasks and are not blocked by other task dependencies as schedulable tasks based on the association relationship among the N tasks.
- the task being synchronously blocked by other tasks includes: the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing, or the task is acquired later than a backward synchronization task and that backward synchronization task has not finished executing; a forward synchronization task is defined as depending on all other tasks acquired earlier than it, and a backward synchronization task is defined such that all other tasks acquired later than it depend on it.
- the task being dependency-blocked by other tasks includes: the task partially depends on other tasks that have not all finished executing, or the task serially depends on other tasks that have not all finished executing.
- the blocking manager is specifically used to: traverse each task among the N tasks that has not been determined to be blocked in the order in which the N tasks are acquired, and when traversing each task that has not been determined to be blocked: if the task is a forward synchronization task, and other tasks acquired earlier than the task have not been fully executed, then the task is determined to be synchronously blocked; if the task is a backward synchronization task, then all other tasks acquired later than the task are determined to be synchronously blocked; if the task depends on other tasks and the other tasks have not been fully executed, then the task is determined to be dependently blocked; if the task depends on other tasks serially and the other tasks have not been fully executed, then the task is determined to be dependently blocked; when the task is not synchronously blocked and is not dependently blocked, the task is determined to be a schedulable task.
- the task being synchronously blocked includes the task being synchronously blocked by the entire group and/or the task being synchronously blocked by a subgroup, for
- the other tasks used to judge whether a task is synchronously blocked by the entire group are the tasks among the N tasks acquired earlier or later than the task, and the other tasks used to judge whether a task is synchronously blocked by a subgroup are the tasks in the subgroup to which the task belongs acquired earlier or later than the task.
- sub-groups are divided according to business characteristics.
- the task scheduler also includes: a first waiting queue, a second waiting queue and a ready queue.
- the task acquirer is also used to: store the N tasks in the first waiting queue.
- the blocking manager is specifically used to: for any task in the first waiting queue, determine whether the task is synchronously blocked by other tasks, if not, move the task from the first waiting queue to the second waiting queue, and, for any task in the second waiting queue, determine whether the task is blocked by other task dependencies, if not, move the task from the second waiting queue to the ready queue.
- the task dispatcher is specifically used to: schedule tasks in the ready queue to the processing unit.
- the task dispatcher is specifically used to: monitor the number of tasks currently executed by the processing unit, and when the number of tasks is less than the number of parallel tasks of the processing unit, schedule the tasks in the ready queue to the processing unit in sequence according to the order in which the tasks in the ready queue are acquired.
- the present application provides a task scheduler, comprising: an acquisition unit, used to acquire N tasks and the association relationships between the N tasks, where N is a positive integer; a determination unit, used to determine the non-dependent or released dependent tasks among the N tasks as schedulable tasks based on the association relationships among the N tasks; and a scheduling unit, used to schedule the schedulable tasks to the processing unit.
- the determination unit is specifically used to: determine, according to the association relationships of the N tasks, the tasks that have a synchronization association with other tasks, and when the other tasks that have a synchronization association with such a task have completed, determine the task as a schedulable task.
- having synchronization association with other tasks means that the task has an association relationship with all other tasks that are acquired earlier or later than the task.
- the completion of execution of other tasks corresponding to the task and having synchronization association with the task includes at least one of the following contents: the task is a forward synchronization task, and all other tasks acquired earlier than the task are completed; the task is a task acquired later than the backward synchronization task, and the backward synchronization task is completed.
- the forward synchronization task is used to indicate that the forward synchronization task depends on all other tasks acquired earlier than the forward synchronization task
- the backward synchronization task is used to indicate that all other tasks acquired later than the backward synchronization task depend on the backward synchronization task.
- the determination unit is specifically used to: determine, based on the association relationships among the N tasks, a task that has a dependency association with other tasks, and when the other tasks that the task depends on have finished executing, determine the task as a schedulable task.
- the determination unit is specifically used to: determine, according to the association relationships among the N tasks, the tasks that are neither synchronously blocked nor dependency-blocked by other tasks as schedulable tasks.
- the task being synchronously blocked by other tasks includes: the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing, or the task is acquired later than a backward synchronization task and that backward synchronization task has not finished executing; a forward synchronization task is defined as depending on all other tasks acquired earlier than it, and a backward synchronization task is defined such that all other tasks acquired later than it depend on it.
- the task being dependency-blocked by other tasks includes: the task partially depends on other tasks that have not all finished executing, or the task serially depends on other tasks that have not all finished executing.
- the determination unit is specifically used to: traverse each task among the N tasks that has not been determined to be blocked in the order in which the N tasks are acquired, and when traversing each task that has not been determined to be blocked: if the task is a forward synchronization task, and other tasks acquired earlier than the task have not been fully executed, then the task is determined to be synchronously blocked; if the task is a backward synchronization task, then all other tasks acquired later than the task are determined to be synchronously blocked; if the task depends on other tasks and the other tasks have not been fully executed, then the task is determined to be dependently blocked; if the task depends on other tasks serially and the other tasks have not been fully executed, then the task is determined to be dependently blocked; when the task is not synchronously blocked and is not dependently blocked, the task is determined to be a schedulable task.
- the task being synchronously blocked includes the task being synchronously blocked by the entire group and/or by a sub-group; the other tasks used to determine whether the task is synchronously blocked by the entire group are the tasks among the N tasks acquired earlier or later than the task, and the other tasks used to determine whether the task is synchronously blocked by a sub-group are the tasks in the sub-group to which the task belongs acquired earlier or later than the task.
- sub-groups are divided according to business characteristics.
- the acquisition unit is further used to: store the N tasks in the first waiting queue.
- the determination unit is specifically used to: for any task in the first waiting queue, determine whether the task is synchronously blocked by other tasks, if not, move the task from the first waiting queue to the second waiting queue; and, for any task in the second waiting queue, determine whether the task is blocked by other task dependencies, if not, move the task from the second waiting queue to the ready queue.
- the scheduling unit is specifically used to: schedule the tasks in the ready queue to the processing unit.
- the scheduling unit is specifically used to: monitor the number of tasks currently being executed by the processing unit, and when that number is less than the number of tasks the processing unit can execute in parallel, dispatch the tasks in the ready queue to the processing unit in sequence, in the order in which the tasks in the ready queue were acquired.
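The capacity-aware dispatch described above might look like the following sketch (the names `ready_q`, `running`, and `capacity` are illustrative assumptions):

```python
def dispatch(ready_q, running, capacity):
    """Dispatch ready tasks in acquisition order while the processing unit
    has spare parallel slots.

    ready_q  : list of tasks in the order they were acquired
    running  : list of tasks currently executing on the processing unit
    capacity : number of tasks the processing unit can execute in parallel
    Returns the tasks dispatched in this pass.
    """
    dispatched = []
    while ready_q and len(running) < capacity:
        task = ready_q.pop(0)   # oldest ready task first
        running.append(task)
        dispatched.append(task)
    return dispatched
```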
- the present application provides a chip, including a task scheduler, and the task scheduler is used to implement the method described in any one of the designs in the first aspect above.
- the present application provides a processor, including a task scheduler and a processing unit, the task scheduler is used to execute the method described in any one of the designs of the first aspect above, and the processing unit is used to execute the tasks scheduled by the task scheduler.
- the present application provides an electronic device, comprising a processor, wherein the processor is coupled to a memory, and the processor is used to execute a computer program stored in the memory so that the electronic device executes a method as described in any one of the designs in the first aspect above.
- the present application provides a task scheduling system, comprising a device driver package and a processor as described in the fourth aspect above, wherein the device driver package is used to send N tasks to the processor, where N is a positive integer; and the processor is used to process the N tasks.
- the present application provides a computer-readable storage medium storing a computer program.
- when the computer program is executed, the method described in any one of the designs in the first aspect above is implemented.
- the present application provides a computer program product, which, when executed on a processor, implements a method as described in any one of the designs of the first aspect above.
- FIG1 exemplarily shows a system architecture diagram of a task processing system provided by an embodiment of the present application
- FIG2 exemplarily shows a schematic diagram of the structure of a processor provided in an embodiment of the present application
- FIG3 exemplarily shows a flowchart of a task processing method provided by the industry
- FIG4 is a schematic diagram showing a flow chart of a game task processing provided by the industry
- FIG5 exemplarily shows a flowchart corresponding to the task scheduling method provided in Embodiment 1 of the present application
- FIG6 exemplarily shows a schematic diagram of a task layout with synchronization association provided in an embodiment of the present application
- FIG. 7 exemplarily shows a schematic diagram of a task layout with dependency associations provided in an embodiment of the present application
- FIG8 exemplarily shows a flowchart of processing a game task using the task scheduling method in the first embodiment of the present application
- FIG9 exemplarily shows a flowchart corresponding to the task scheduling method provided in Embodiment 2 of the present application.
- FIG10 exemplarily shows a task scheduling process diagram provided in an embodiment of the present application.
- FIG11 exemplarily shows another task scheduling process diagram provided in an embodiment of the present application.
- FIG12 exemplarily shows another task scheduling process diagram provided in an embodiment of the present application.
- FIG. 13 exemplarily shows a structural diagram of a task scheduler provided in an embodiment of the present application.
- the task scheduling scheme disclosed in the present application can be applied to electronic devices with task processing capabilities.
- the task scheduler can be an independent unit embedded in the electronic device, which can assign tasks to a processor core in the electronic device when that core is idle, so as to make maximum use of the spare capacity of the processor core within its processing capability, thereby improving the task processing capability of the processor core.
- the task scheduler can also be a unit encapsulated inside the electronic device, which is used to implement the task scheduling function of the electronic device.
- the electronic device can be a computer device with a processor, such as a desktop computer, a personal computer or a server.
- the electronic device can also be a portable electronic device with a processor, such as a mobile phone, a tablet computer, a wearable device with wireless communication function (such as a smart watch), a vehicle-mounted device, etc.
- portable electronic devices include, but are not limited to, devices equipped with certain operating systems, or portable electronic devices with other operating systems.
- the portable electronic device may also be a laptop computer (Laptop) with a touch-sensitive surface (eg, a touch panel).
- connection can be understood as electrical connection, and the connection between two electrical components can be a direct or indirect connection between the two electrical components.
- A and B being connected can mean either that A and B are directly connected, or that A and B are indirectly connected through one or more other electrical components; for example, if A and C are directly connected and C and B are directly connected, then A and B are connected through C.
- connection can also be understood as coupling, such as electromagnetic coupling between two inductors. In short, the connection between A and B enables the transmission of electrical energy between A and B.
- FIG1 exemplarily shows a system architecture diagram of a task processing system provided by an embodiment of the present application.
- the illustrated task processing system 10 includes an application (APP) 100, a device driver package (also called a device development kit, DDK) 200 and a processor 300, wherein the application 100 is connected to the device driver package 200, and the device driver package 200 is also connected to the processor 300.
- the device driver package 200 is also called driver software, and includes a kernel mode driver (KMD) 210 and a user mode driver (UMD) 220; the user mode driver is also called a user-mode graphics driver. Kernel mode and user mode are two different driver modes, and the device driver package 200 switches between the two driver modes according to the type of code being run.
- Generally, most drivers belong to the kernel mode driver 210 and run in kernel mode, while some drivers belong to the user mode driver 220 and run in user mode. Since the kernel mode driver 210 and the user mode driver 220 are not closely related to the solution of the present application, the embodiments of the present application do not introduce them in detail.
- FIG. 2 exemplarily shows a schematic diagram of the structure of a processor provided in an embodiment of the present application.
- the processor 300 shown in the figure may include one or more chips, for example, may include a system-on-a-chip (SoC) or a chipset formed by multiple chips.
- the processor 300 may include at least one processing unit, such as a neural-network processing unit (NPU), a graphics processing unit (GPU) and a central processing unit (CPU) as shown in FIG. 2, and may also include an application processor (AP), a modem processor, an image signal processor (ISP), a video codec, a digital signal processor (DSP), and/or a baseband processor.
- At least one processing unit is also called at least one processing subsystem; it is a core component of the processor 300 and is used to implement the processing functions of the processor 300.
- Different processing units in at least one processing unit may be dispersed and deployed on different chips, or may be integrated on one chip, without specific limitation.
- the processor 300 may also include non-core components, such as general units (including counters, decoders, and signal generators, etc.), accelerator units, input/output control units, interface units, internal memories, and external buffers, etc.
- the internal memory and the external buffer are collectively referred to as the storage unit of the processor 300, which is used to store instructions and data.
- the instructions and data can be invoked so that, when the processor is processing tasks, a task that is neither synchronously blocked nor dependency-blocked is selected and scheduled to the currently idle processor core.
- the storage unit can be a cache memory.
- the cache memory can save instructions or data that have just been used or are used in a loop. When the instruction or data needs to be used again, it can be directly called from the cache memory, thereby avoiding repeated access, reducing waiting time, and improving the processing efficiency of the task.
- each processing unit in the processor 300 may include one or more processor cores.
- the NPU includes 3 NPU cores, namely NPU core 1 to NPU core 3, the GPU includes 5 GPU cores, namely GPU core 1 to GPU core 5, and the CPU includes 5 CPU cores, namely CPU core 1 to CPU core 5.
- the processor core is also called an execution unit, which is used to execute the entire task or part of the task fragment.
- the multiple processor cores can be divided into one or more voltage domains, and the processor cores located in the same voltage domain have the same operating voltage and the same operating frequency.
- the multiple processor cores located in the same voltage domain can be multi-core heterogeneous, that is, have different structures, and are used to process different tasks or different task fragments respectively, or can be multi-core isomorphic, that is, have the same structure, and are used to jointly process the same task or the same task fragment.
- the embodiment of the present application does not specifically limit this.
- the processor 300 may also include a task scheduler 310, and the task scheduler 310 is connected to each processor core.
- the connection between the task scheduler 310 and each processor core can be realized through a bus system.
- each processor core can publish its idle message to the bus system after processing a task or task fragment.
- the task scheduler 310 can learn the processor core that is currently in an idle state by monitoring the bus system, and then, when there is a task that needs to be scheduled, the task is scheduled to the idle processor core.
- processor 300 shown in the figure is only an example, and the processor 300 may have more or fewer components than those shown in the figure.
- a component may be a combination of two or more components, or may have different component configurations, which is not specifically limited in the embodiments of the present application.
- the above-mentioned application 100 may specifically refer to a program for generating images, such as a mobile phone camera, a camera or a screen recording program.
- the above-mentioned processor 300 may specifically refer to an image processor, which may only include a GPU core, but not other cores, such as an NPU core and a CPU core.
- the tasks scheduled by the task scheduler 310 specifically refer to tasks related to image processing.
- a video is obtained by shooting one frame of image after another, and after shooting, each frame usually needs to undergo Gaussian filtering, white balance, image denoising, image enhancement, image segmentation, and image rendering; the processing operations on different frames are usually performed in shooting order to avoid long-frame and short-frame phenomena.
- the entire processing process of each frame of the image can be used as a task, or the processing operations such as Gaussian filtering, white balance, image denoising, image enhancement, image segmentation or image rendering of each frame of the image during the processing process can also be used as a task.
- Gaussian filtering, white balance, image denoising, image enhancement, and image segmentation are all based on the entire image and can be performed in parallel, that is, these operations are not dependent on each other.
- Image rendering is to render each small image after image segmentation separately, which can only be performed after image segmentation. Therefore, image rendering depends on image segmentation.
- after the application 100 obtains the images frame by frame, it creates an image queue and records the commands, and then sends the image queue to the device driver package 200.
- the device driver package 200 parses the command in the image queue, translates it into a task recognizable by the processor 300, and sends it to the task scheduler 310 in the processor 300 in the form of a task, a task chain, or a command stream.
- the task scheduler 310 monitors the working status of each GPU core in the processor 300, and when it is determined that there is an idle GPU core, it sends the task to be processed to the idle GPU core for processing.
- the task to be processed may be sent to a GPU core for separate processing, or it may be sent to multiple GPU cores for processing together.
- the multiple GPU cores may belong to the same voltage domain or to different voltage domains, which is not specifically limited.
- FIG3 exemplarily shows a flow chart of a task processing method provided by the industry. As shown in FIG3 , the method pre-sets multiple task queues in the task scheduler 310, such as task queue L 1 , task queue L 2 , ..., task queue L m , and m is a positive integer.
- each task queue corresponds to a business, such as task queue L 1 corresponds to Gaussian filtering business, task queue L 2 corresponds to white balance business, ..., task queue L m corresponds to image segmentation and image rendering business.
- the device driver package 200 distributes each task to the corresponding task queue according to the business to which each task belongs.
- the task scheduler 310 calls the idle GPU core to process the tasks in each task queue according to the principle of executing the tasks in multiple task queues in parallel and executing the tasks in one task queue in series.
- FIG4 shows a flow chart of processing a game task using the task scheduling method provided by the industry.
- Binning1 to Binning3 are stored in the same task queue, and Rendering1 to Rendering3 are stored in another task queue.
- the task scheduler calls the GPU core to execute the Binning tasks and Rendering tasks in the two task queues in parallel.
- Binning2 depends on Rendering1, and Rendering1 depends on Binning1; therefore, Binning2 cannot be executed until Rendering1 has finished, and Rendering1 cannot be executed until Binning1 has finished.
- an embodiment of the present application provides a task scheduling method, which maintains the scheduling order of each task from the hardware side through a task scheduler. After determining that a task has no dependency, or that its dependency has been released, the task can be sent in advance to an idle processing unit for processing. This makes full use of the hardware resources of the processing unit to process tasks with high concurrency, keeps the GPU cores from idling as much as possible, and effectively improves the utilization of the processing unit.
- the task scheduler can be the task scheduler 310 in FIG. 2 , or a communication device that can support the processor to implement the functions required by the method, and of course, it can also be other communication devices or communication systems, such as chips, chip systems, circuits or circuit systems, without specific limitation.
- FIG5 exemplarily shows a flow chart of a task scheduling method provided in the first embodiment of the present application, which is applicable to a task scheduler, such as the task scheduler 310 shown in FIG2 .
- the method includes:
- Step 501 The task scheduler obtains N tasks and associations among the N tasks, where N is a positive integer.
- the N tasks can be sent to the task scheduler by the device driver package, and the association relationship of the N tasks can be sent to the task scheduler by the device driver package, or obtained by the task scheduler from other channels, such as accessing the business system, without specific limitation.
- N tasks and the relationship between N tasks can be sent to the task scheduler by the device driver package through tasks, task chains or command streams, and specifically, the relationship between each task and other tasks can be explicitly written into the data structure of the task as configuration information. Therefore, after obtaining N tasks, the task scheduler can parse the data structure of each task to know what kind of relationship each task has with other tasks.
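As an illustration only, the configuration information carried in a task's data structure might be modeled as follows (all field names here are assumptions made for the sketch, not the actual format used by the device driver package):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TaskDescriptor:
    """Hypothetical per-task configuration record parsed by the scheduler."""
    task_id: int
    group_id: Optional[str] = None      # sub-group identifier, if any
    forward_sync: bool = False          # depends on all earlier-issued tasks
    backward_sync: bool = False         # all later-issued tasks depend on it
    partial_deps: List[int] = field(default_factory=list)  # must be finished
    serial_deps: List[int] = field(default_factory=list)   # must have started

def parse_task_stream(raw_tasks):
    """Turn raw dicts (as a driver package might send them) into descriptors,
    numbering tasks in the order they are acquired."""
    return [TaskDescriptor(task_id=i, **raw) for i, raw in enumerate(raw_tasks)]
```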
- association relationship of the N tasks may include a synchronization association and/or a dependency association.
- Synchronous association is used to indicate the association relationship between a task and all other tasks issued earlier or later than the task.
- the synchronous association may include forward synchronous association and backward synchronous association.
- Forward synchronous association means that a task depends on all other tasks issued earlier than the task, that is, the task can only be executed after all other tasks issued earlier than the task are completed.
- a task with forward synchronous association with other tasks is also called a forward synchronous task. As long as there is another task issued earlier than the forward synchronous task that has not been completed, the forward synchronous task will be synchronously blocked by the other task.
- backward synchronous association means that all other tasks issued later than a task depend on the task, that is, only after the task is completed can each other task issued later than the task be executed.
- a task with backward synchronous association with other tasks is also called a backward synchronous task.
- as long as the backward synchronous task has not been completed, all other tasks issued later than the backward synchronous task will be synchronously blocked by it.
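The forward and backward synchronization semantics above can be expressed as two small predicates (a sketch under the assumption that each task records `forward_sync`, `backward_sync`, and `finished` flags):

```python
def forward_sync_blocked(index, tasks):
    """A forward synchronization task is blocked while any task issued
    earlier than it has not finished."""
    return bool(tasks[index].get("forward_sync")) and \
        any(not t["finished"] for t in tasks[:index])

def backward_sync_blocked(index, tasks):
    """A task is blocked while any backward synchronization task issued
    earlier than it has not finished."""
    return any(t.get("backward_sync") and not t["finished"]
               for t in tasks[:index])
```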
- N tasks are located in an entire group, and at least two of the N tasks may also be located in at least one sub-group. Specifically, when there is only an entire group but no sub-group in the task scheduler, all N tasks are located only in the entire group; when there are both an entire group and sub-groups in the task scheduler, each task may be located only in the entire group, or may be located in the entire group and one or more sub-groups at the same time.
- other tasks issued earlier than the forward synchronization task may specifically refer to: when a task is marked as a forward synchronization task in the entire group, that is, other tasks in the entire group that are issued earlier than the forward synchronization task; when a task is marked as a forward synchronization task in a sub-group, that is, other tasks in the sub-group that are issued earlier than the forward synchronization task.
- other tasks issued later than the backward synchronization task may specifically refer to: when a task is marked as a backward synchronization task in the entire group, that is, other tasks in the entire group that are issued later than the backward synchronization task; when a task is marked as a backward synchronization task in a sub-group, that is, other tasks in the sub-group that are issued later than the backward synchronization task.
- the grouping of the N tasks can be performed by the device driver package and carried in the task data structure to notify the task scheduler, or can be performed by the task scheduler itself.
- the grouping basis can be, for example, business characteristics or synchronization associations. Take the grouping of tasks by the device driver package as an example:
- the device driver package determines the tasks belonging to the same service according to the service characteristics of N tasks, then marks the same group identifier in the data structure of these tasks, and sends it to the task scheduler.
- the task scheduler parses the data structure of the N tasks, obtains the tasks with the same group identifier, creates corresponding virtual sub-groups, and stores these tasks in the virtual sub-groups.
- the group identifier can be a service name, service code, group number or other mark that can represent the same service, and is not specifically limited.
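Creating virtual sub-groups from group identifiers might be sketched as follows (the `group_id` key is an assumed field name standing in for whatever mark the data structure actually carries):

```python
from collections import defaultdict

def build_subgroups(tasks):
    """Create virtual sub-groups from the group identifiers carried in the
    task data structures.

    Tasks without a group identifier remain only in the entire group.
    """
    subgroups = defaultdict(list)
    for task in tasks:
        gid = task.get("group_id")
        if gid is not None:
            subgroups[gid].append(task)
    return dict(subgroups)
```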
- the device driver package determines the tasks with relatively concentrated associations based on the associations among the N tasks, marks the same group identifier in the data structures of these tasks, and sends them to the task scheduler, which creates a virtual sub-group according to the method in the previous example. For example, if it is determined that a certain task depends on at least three tasks, the device driver package can determine the task and the at least three tasks it depends on as a sub-group, mark the same group identifier in the data structures of the at least four tasks, and also mark the task as a forward synchronization task in its data structure.
- similarly, if it is determined that at least three tasks depend on a certain task, the device driver package can determine the task and the at least three tasks that depend on it as a sub-group, mark the same group identifier in the data structures of the at least four tasks, and also mark the task as a backward synchronization task in its data structure.
- grouping by the task scheduler please refer to the relevant content of grouping by the above-mentioned device driver package, which will not be repeated here.
- the device driver package or task scheduler can also group tasks according to other characteristics, such as characteristics indicated by the user, and the embodiments of the present application do not make specific limitations on this.
- Dependency associations include partial dependency associations and serial dependency associations.
- a partial dependency association refers to a task that depends on the execution results of one or more other tasks; the task can only be executed after all of the one or more other tasks it depends on have been completed. It should be noted that, since the synchronous association already defines a task's dependency relationship with all other tasks issued earlier or later than it, the partial dependency association, built on top of the synchronous association, only defines a task's dependence on some of the other tasks rather than all of them; this dependency relationship is therefore called a partial dependency association.
- the serial dependency association refers to a task that depends on the execution of one or more other tasks, and the task can only be executed after all the other tasks it depends on have started executing. Moreover, since the serial dependency association does not require the other tasks it depends on to be completed, the tasks that a serial dependency association depends on can be all other tasks or only some other tasks.
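The difference between the two dependency forms can be captured in a single predicate (a sketch; the keys `partial_deps` and `serial_deps` and the status strings are assumptions):

```python
def dependency_blocked(task, state):
    """Check both dependency forms described above.

    state maps a task id to its status: "pending", "running", or "done".
    A partial dependency blocks until the depended-on task is done;
    a serial dependency blocks only until it has started running.
    """
    for dep in task.get("partial_deps", []):
        if state[dep] != "done":
            return True
    for dep in task.get("serial_deps", []):
        if state[dep] == "pending":   # not yet started
            return True
    return False
```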
- when the synchronous association method is adopted, only the forward synchronization task or the backward synchronization task needs to be marked: marking a forward synchronization task implicitly defines its dependence on all other tasks issued earlier than it, and marking a backward synchronization task implicitly defines the dependence of all other tasks issued later than it on that task. To express the same relationships through dependency associations alone, every task that the task depends on would have to be listed explicitly. In other words, compared with the dependency association, the synchronous association can define the dependency relationships among multiple tasks with more streamlined content.
- the above-mentioned dependency association can be identified and marked after the synchronous association has been determined, that is, the synchronous association of the task is first determined, and after the synchronous association determination is completed, the dependent association is marked for the remaining tasks with dependencies.
- all the association relationships can be included in the data structure of the N tasks, and there is no need to describe the association for each task, which helps to simplify the data structure.
- Step 502 The task scheduler determines, according to the association relationship among the N tasks, the tasks that have no dependencies or have been released from dependencies among the N tasks as schedulable tasks.
- the task scheduler can determine, according to the association relationship of the N tasks, a task that has a synchronization association with other tasks, and when the other tasks that have a synchronization association with that task are completed, the task is determined to be a schedulable task.
- the completion of the other tasks that have a synchronization association with the task may exemplarily include at least one of the following: the task is a forward synchronization task and all other tasks acquired earlier than the task have been completed; or the task was acquired later than a backward synchronization task and that backward synchronization task has been completed. And/or,
- the task scheduler can determine, based on the association relationship among the N tasks, a task that has a dependency association with other tasks, and when the other tasks that the task depends on have all started executing (for a serial dependency association) or have all been completed (for a partial dependency association), the task is determined to be a schedulable task.
- the task can be scheduled to the processing unit in a timely manner when other tasks with synchronization association and/or dependency association no longer block the task.
- the task scheduler can determine the tasks among the N tasks that are not blocked by other tasks synchronously and not blocked by other tasks dependencies as schedulable tasks based on the association relationship among the N tasks.
- a task not being synchronously blocked by other tasks means that the task is no longer blocked by tasks that are synchronously associated with it, that is, the other tasks that have a synchronization association with the task have been completed.
- a task not being blocked by dependencies on other tasks means that the task is no longer blocked by tasks that are dependently associated with it, that is, the other tasks that have a dependency association with the task have been completed (or, for a serial dependency, have started executing).
- a task is synchronously blocked by other tasks means that the task meets at least one of the following conditions:
- Condition 1: the task is a forward synchronization task in the entire group, and other tasks in the entire group that were issued earlier than this task have not all been executed.
- Condition 2: the task is a forward synchronization task in a subgroup, and other tasks in the subgroup that were issued earlier than this task have not all been executed.
- Condition 3: the task is issued later than a backward synchronization task in the entire group, and the backward synchronization task has not been completed.
- Condition 4: the task is issued later than a backward synchronization task in its subgroup, and the backward synchronization task has not been completed.
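The four conditions can be modeled together as one check (an illustrative sketch in which each synchronization mark explicitly records whether its scope is the entire group or a sub-group; all names are assumptions):

```python
def sync_blocked(task_id, order, finished, marks, subgroup):
    """Evaluate the four synchronization-blocking conditions.

    order    : task ids in issue order (the entire group)
    finished : set of finished task ids
    marks    : task id -> (kind, scope); kind is "forward" or "backward",
               scope is "whole" (entire group) or "sub" (the task's sub-group)
    subgroup : task id -> sub-group id, or None if the task is in no sub-group
    """
    idx = order.index(task_id)

    def in_scope(scope, other):
        # Whether `other` counts as a peer of task_id under a mark's scope.
        if scope == "whole":
            return True
        return subgroup.get(other) is not None and \
               subgroup.get(other) == subgroup.get(task_id)

    # Conditions 1 and 2: this task is a forward sync task and some earlier
    # peer (entire group, or its sub-group) has not finished.
    if task_id in marks and marks[task_id][0] == "forward":
        scope = marks[task_id][1]
        if any(in_scope(scope, t) and t not in finished for t in order[:idx]):
            return True
    # Conditions 3 and 4: an earlier peer is an unfinished backward sync task
    # whose scope covers this task.
    for t in order[:idx]:
        if t in marks and marks[t][0] == "backward" and t not in finished:
            if in_scope(marks[t][1], t):
                return True
    return False
```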
- FIG6 exemplarily shows a schematic diagram of a task layout with synchronization association provided by an embodiment of the present application.
- there are seven tasks namely, task 0 to task 6, and these seven tasks are sent to the task scheduler in the order of task 0, task 1, task 2, task 3, task 4, task 5 and task 6.
- tasks 0 to 6 are not grouped, that is, tasks 0 to 6 only exist in the whole group.
- in the task queue shown in Figure 6 (A), task 3 is marked as a backward synchronization task, that is, tasks 4, 5, and 6, which are issued later than task 3, can only be executed after task 3 is completed. Therefore, as long as task 3 is not completed, tasks 4, 5, and 6 meet the above condition 3, that is, tasks 4, 5, and 6 are synchronously blocked.
- Figure 6 (B) also does not group tasks 0 to 6, that is, tasks 0 to 6 only exist in the entire group. Moreover, in the task queue shown in Figure 6 (B), task 3 is marked as a forward synchronization task, that is, task 3 can only be executed after tasks 0, 1, and 2 that were issued earlier than task 3 are all completed. Therefore, as long as at least one of task 0, task 1, or task 2 is not completed, task 3 meets the above condition 1, so task 3 is synchronously blocked.
- Figure 6 (C) also does not group Tasks 0 to 6, that is, Tasks 0 to 6 only exist in the whole group. Moreover, in the task queue shown in Figure 6 (C), Task 3 is marked as both a forward synchronization task and a backward synchronization task: Task 3 can only be executed after Tasks 0, 1, and 2, which were issued earlier than Task 3, are all executed, and Tasks 4, 5, and 6, which were issued later than Task 3, can only be executed after Task 3 is executed. Therefore, as long as at least one of Tasks 0, 1, or 2 is not executed, Task 3 meets the above condition 1, and Tasks 4, 5, and 6 meet the above condition 3, so Tasks 3, 4, 5, and 6 will be synchronously blocked.
- Figure 6 (D) divides Task 3, Task 5 and Task 6 into the same subgroup, so Task 3, Task 5 and Task 6 exist in both the whole group and the subgroup, while Task 0, Task 1, Task 2 and Task 4 exist in the whole group but not in the subgroup.
- Task 3 is marked as a backward synchronization task in the subgroup, that is, Task 5 and Task 6 in the subgroup that are issued later than Task 3 need to be executed after Task 3 is completed. Therefore, as long as Task 3 is not completed, Task 5 and Task 6 meet the above condition 4, that is, Task 5 and Task 6 are synchronously blocked.
- in Figure 6 (E), Task 0, Task 1 and Task 3 are divided into the same subgroup. Therefore, Task 0, Task 1 and Task 3 exist in both the whole group and the subgroup, while Task 2, Task 4, Task 5 and Task 6 exist in the whole group but not in the subgroup.
- Task 3 is marked as a forward synchronization task in the subgroup, which means that Task 3 can only be executed after Task 0 and Task 1, which are issued earlier than Task 3, are completed in the subgroup. Therefore, as long as there is at least one task in Task 0 or Task 1 that has not been completed, Task 3 meets the above condition 2, so Task 3 is synchronously blocked.
- in Figure 6 (F), Task 0, Task 1, Task 3, Task 5 and Task 6 are divided into the same subgroup. Therefore, Task 0, Task 1, Task 3, Task 5 and Task 6 exist in both the whole group and the subgroup, while Task 2 and Task 4 exist in the whole group but not in the subgroup.
- Task 3 is marked as both the forward synchronization task and the backward synchronization task in the subgroup, that is, Task 3 can only be executed after Task 0 and Task 1, which are issued earlier than Task 3 in the subgroup, are executed, and Task 5 and Task 6, which are issued later than Task 3 in the subgroup, can only be executed after Task 3 is executed. Therefore, as long as at least one of Task 0 or Task 1 has not been executed, Task 3 meets the above condition 2, and Task 5 and Task 6 meet the above condition 4, so Task 3, Task 5 and Task 6 will be synchronously blocked.
- the "task exists in the whole group but not in the subgroup" described above may mean that the task exists only in the whole group but not in any subgroup, or that the task exists in the whole group and other subgroups, without specific limitation.
- a task is usually placed in at most one subgroup, and will not be placed in two or more subgroups at the same time, that is, when there are multiple subgroups, the tasks in the multiple subgroups are different.
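The four synchronous-blocking conditions illustrated above can be sketched as a single check applied per group (whole group or subgroup). This is an illustrative sketch only; the data model (task ids in issue order, sets of forward/backward synchronization tasks, a set of completed tasks) is assumed here and is not prescribed by the present application.

```python
def is_sync_blocked(task, group, fwd, bwd, done):
    """Return True if `task` is synchronously blocked within `group`.

    group: task ids in the order they were issued (whole group or a subgroup)
    fwd:   tasks marked as forward synchronization tasks in this group
    bwd:   tasks marked as backward synchronization tasks in this group
    done:  tasks that have finished executing
    """
    earlier = group[:group.index(task)]
    # Conditions 1/2: a forward sync task waits for all earlier tasks.
    if task in fwd and any(t not in done for t in earlier):
        return True
    # Conditions 3/4: a task issued later than an unfinished backward
    # sync task must wait for that task to complete.
    if any(t in bwd and t not in done for t in earlier):
        return True
    return False

# Figure 6 (C): whole group 0..6, Task 3 is both forward and backward sync.
group, fwd, bwd = list(range(7)), {3}, {3}
assert is_sync_blocked(3, group, fwd, bwd, done={0, 1})        # condition 1
assert is_sync_blocked(5, group, fwd, bwd, done={0, 1, 2})     # condition 3
assert not is_sync_blocked(3, group, fwd, bwd, done={0, 1, 2})
```

Applying the same check with a subgroup's member list and its own forward/backward markers yields conditions 2 and 4.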
- A task being blocked by the dependencies of other tasks means that the task satisfies at least one of the following conditions:
- Condition 1 The task has a partial dependency association with other tasks, and the other tasks on which it depends have not all been fully executed;
- Condition 2 The task has a serial dependency association with other tasks, and the other tasks on which it depends have not all started to execute.
- FIG7 exemplarily shows a task layout diagram with dependency association provided by an embodiment of the present application.
- there are eight tasks, namely Task 0, Task 1, Task 2, Task 3, Task 4, Task 5, Task 6 and Task 7, and Task 0 to Task 7 are not grouped, that is, Task 0 to Task 7 only exist in the whole group.
- Assuming that Task 0, Task 1, Task 3 and Task 4 have the above-mentioned partial dependency association with Task 6: if Task 6 has not started to execute, or Task 6 has started but has not yet been completed, then Task 0, Task 1, Task 3 and Task 4 will be blocked by the Task 6 dependency.
- Assuming instead that Task 0, Task 1, Task 3 and Task 4 have the above-mentioned serial dependency association with Task 6: if Task 6 has not started executing, then Task 0, Task 1, Task 3 and Task 4 will also be blocked by the Task 6 dependency; as long as Task 6 has started executing, regardless of whether it has been completed, Task 0, Task 1, Task 3 and Task 4 will not be blocked by the Task 6 dependency.
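The distinction between the two dependency conditions can be sketched as follows. This is an illustrative sketch: the dependency maps and the `started`/`done` sets are assumed data structures, not the application's.

```python
def is_dep_blocked(task, partial_deps, serial_deps, started, done):
    """Return True if `task` is blocked by the dependencies of other tasks.

    partial_deps / serial_deps: map each task to the tasks it depends on
    started / done: tasks that have started / finished executing
    """
    # Partial dependency association: blocked until every dependee has
    # finished executing (condition 1).
    if any(d not in done for d in partial_deps.get(task, ())):
        return True
    # Serial dependency association: blocked only until every dependee
    # has *started* executing (condition 2).
    if any(d not in started for d in serial_deps.get(task, ())):
        return True
    return False

# FIG. 7 example: Task 0 depends on Task 6, which has started but not finished.
assert is_dep_blocked(0, {0: [6]}, {}, started={6}, done=set())      # partial: still blocked
assert not is_dep_blocked(0, {}, {0: [6]}, started={6}, done=set())  # serial: released
```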
- the task scheduler can determine the schedulable tasks among the N tasks, that is, the tasks that are neither synchronously blocked by other tasks nor blocked by the dependencies of other tasks, in the following manner:
- After the task scheduler obtains the N tasks, it traverses each of the N tasks that has not been determined to be blocked, in the order in which the N tasks are issued, and when traversing each such task: if the task is a forward synchronization task in the whole group, and the other tasks in the whole group that were issued earlier than the task have not all been executed, it is determined that the task is synchronously blocked by those earlier tasks; if the task is a forward synchronization task in a subgroup, and the other tasks in the subgroup that were issued earlier than the task have not all been executed, it is determined that the task is synchronously blocked by those earlier tasks; if the task is a backward synchronization task in the whole group, it is determined that all tasks in the whole group issued later than the task are synchronously blocked by it; and if the task is a backward synchronization task in a subgroup, it is determined that all tasks in that subgroup issued later than the task are synchronously blocked by it.
- the above two judgment operations of judging whether a task is blocked synchronously and judging whether a task is blocked by a dependency can be executed serially or in parallel. Moreover, as long as one of the judgments is determined to be blocked, the other judgment can be terminated immediately without continuing to execute, thus avoiding unnecessary calculation processes, effectively saving calculation resources, and further improving the efficiency of task scheduling.
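The early-termination behavior described above can be illustrated, in a serial implementation, by short-circuit evaluation: once the first judgment determines the task is blocked, the second judgment is never evaluated. The two judgment functions below are stand-ins, not the application's actual logic.

```python
calls = []

def sync_blocked(task):
    calls.append("sync")
    return True   # pretend the task is synchronously blocked

def dep_blocked(task):
    calls.append("dep")
    return True   # would also report blocked, but is never reached

# `or` short-circuits: the dependency judgment is skipped entirely,
# which is the saved computation the passage describes.
blocked = sync_blocked("task3") or dep_blocked("task3")
assert blocked and calls == ["sync"]
```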
- Step 503 The task scheduler schedules the schedulable tasks to the processing units.
- the task scheduler can monitor the number of tasks currently executed by the GPU core in real time, and when it is determined that the number of tasks is less than the number of parallel tasks of the GPU core, the schedulable tasks are scheduled to the GPU core.
- the task scheduler can schedule multiple schedulable tasks to the GPU core in sequence, according to the order in which the tasks were received from the device driver package, so as to ensure that each task is executed in the order of image processing, prevent a later image frame from being played before an earlier frame, and effectively avoid the long-and-short-frame phenomenon.
- the scheduling order of N tasks is maintained from the hardware side by the task scheduler, rather than being specified by the device driver package from the software side.
- tasks that have no dependencies or have been freed from dependencies can be sent to the processing unit for processing in advance according to the actual execution status of the tasks, thereby effectively improving the utilization rate of the processing unit.
- the hardware side there is no need for the hardware side to send a notification message to the software side after each task is executed to instruct the software side to send a new task. This can greatly reduce the work pressure on the software side, effectively improve the efficiency of task scheduling, and save communication overhead.
- FIG. 8 exemplarily shows a flow chart of processing the game task using the task scheduling method in the first embodiment above, wherein FIG. 8 (A) shows the order and dependency of the six tasks Binning1 to Binning3 and Rendering1 to Rendering3. It can be seen that the six tasks are sent to the task scheduler by the device driver package in the order of Binning1, Rendering1, Binning2, Rendering2, Binning3, and Rendering3.
- the task scheduler stores the six tasks in a whole group without grouping them.
- FIG. 8 (B) shows a possible situation of processing tasks according to the task scheduling method in the first embodiment above. Referring to FIG. 8 (B), the task scheduler can schedule each task according to the following steps:
- Step 1 The task scheduler determines whether Binning1, Rendering1, Binning2, Rendering2, Binning3, and Rendering3 are blocked synchronously or dependently, in the order in which the tasks are issued:
- Binning1 is analyzed to determine that Binning1 does not belong to the forward synchronization task and there is no backward synchronization task in the entire group. Therefore, Binning1 is not blocked by synchronization, and Binning1 does not depend on other tasks. Therefore, Binning1 is not blocked by dependencies. Therefore, Binning1 is determined to be a task without dependencies, and the task scheduler schedules Binning1 to the GPU core.
- Rendering1 is analyzed to determine that Rendering1 does not belong to the forward synchronization task and there is no backward synchronization task in the entire group, so Rendering1 is not blocked by synchronization, but Rendering1 depends on Binning1. Therefore, before Binning1 is executed, Rendering1 is blocked by the Binning1 dependency, and the task scheduler does not schedule Rendering1.
- Binning2 is analyzed to determine that Binning2 does not belong to the forward synchronization task and there is no backward synchronization task in the entire group, so Binning2 is not blocked by synchronization, but Binning2 depends on Rendering1. Therefore, before Rendering1 is executed, Binning2 is blocked by the Rendering1 dependency, and the task scheduler does not schedule Binning2.
- Rendering2 is analyzed to determine that Rendering2 does not belong to the forward synchronization task and there is no backward synchronization task in the entire group, so Rendering2 is not blocked by synchronization, and Rendering2 depends on Binning2. Therefore, before Binning2 is executed, Rendering2 is blocked by Binning2 dependency, so the task scheduler does not schedule Rendering2.
- Binning3 is analyzed and it is determined that Binning3 does not belong to the forward synchronization task and there is no backward synchronization task in the whole group. Therefore, Binning3 is not blocked by synchronization, and Binning3 does not depend on other tasks. Therefore, Binning3 is not blocked by dependencies. Therefore, Binning3 is determined to be a task without dependencies, and the task scheduler schedules Binning3 to the GPU core.
- Rendering3 is analyzed to determine that Rendering3 does not belong to the forward synchronization task and there is no backward synchronization task in the entire group, so Rendering3 is not blocked by synchronization, and Rendering3 depends on Binning3. Therefore, before Binning3 is executed, Rendering3 is blocked by Binning3 dependency, so the task scheduler does not schedule Rendering3.
- the task scheduler first schedules Binning1 and Binning3 to the GPU core for processing.
- Step 2 Assuming Binning1 is completed first, Rendering1, which depends on Binning1, is no longer dependent on it. Therefore, the task scheduler can schedule Rendering1 to the GPU core for processing. At this time, Binning3 and Rendering1 will be processed in parallel in the GPU core.
- Step 3 Assuming Binning3 is completed first, Rendering3, which depends on Binning3, releases the dependency. Therefore, the task scheduler can schedule Rendering3 to the GPU core for processing. At this time, Rendering1 and Rendering3 will be processed in parallel in the GPU core.
- Step 4 Assuming that Rendering1 is completed first, Binning2, which depends on Rendering1, is no longer dependent on it. Therefore, the task scheduler can schedule Binning2 to the GPU core for processing. At this time, Rendering3 and Binning2 will be processed in parallel in the GPU core.
- Step 5 Assuming Binning2 is completed first, Rendering2, which depends on Binning2, releases the dependency. Therefore, the task scheduler can schedule Rendering2 to the GPU core for processing. At this time, Rendering3 and Rendering2 will be processed in parallel in the GPU core.
- Step 6 After Rendering3 and Rendering2 are processed, all six tasks are completed.
- the GPU core can process two tasks in parallel without interruption, and the GPU core is basically not idle, so that the utilization rate of the GPU core is greatly improved, and the performance of task scheduling is also greatly improved.
- (B) in FIG. 8 above only shows a possible scheduling method, and there may be other scheduling methods in actual situations.
- For example, in step 2 or step 3 above, if Rendering1 is instead completed first, Binning2, which depends on Rendering1, is released from its dependency, so the task scheduler can schedule Binning2 to the GPU core for processing.
- In the case of step 2 above, Binning3 and Binning2 will then be processed in parallel in the GPU core; in the case of step 3, Rendering3 and Binning2 will be processed in parallel in the GPU core.
- For another example, in step 4 or step 5 above, assuming that Rendering3 is completed first, none of the unprocessed tasks at this time has been released from its dependencies, so the task scheduler does not schedule a new task, but waits for Rendering1 or Binning2 to be completed and then schedules the released Binning2 or Rendering2 to the GPU core.
- Although the GPU core is idle during this period, compared with the three gaps shown in FIG. 4, this method can still greatly improve the utilization rate of the GPU core. It should be understood that there are many possible scheduling methods, which are not listed one by one in the embodiments of the present application.
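The walkthrough above can be reproduced with a toy discrete-time simulation, assuming unit-length tasks and a GPU core that processes two tasks in parallel. Both assumptions are for illustration only; real task durations determine which of the many valid schedules actually occurs.

```python
def simulate(issue_order, deps, width=2):
    """Greedy schedule: at each step, fill free slots with tasks whose
    dependencies are complete, in issue order; every task runs one step."""
    pending, done, timeline = list(issue_order), set(), []
    while pending:
        running = []
        for task in list(pending):
            if len(running) == width:
                break
            if deps.get(task, set()) <= done:   # all dependencies finished
                pending.remove(task)
                running.append(task)
        timeline.append(tuple(running))
        done.update(running)
    return timeline

order = ["Binning1", "Rendering1", "Binning2", "Rendering2", "Binning3", "Rendering3"]
deps = {"Rendering1": {"Binning1"}, "Binning2": {"Rendering1"},
        "Rendering2": {"Binning2"}, "Rendering3": {"Binning3"}}

timeline = simulate(order, deps)
assert timeline[0] == ("Binning1", "Binning3")   # independent tasks run first
assert len(timeline) == 4                        # four steps instead of six serial ones
```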
- FIG9 exemplarily shows a flow chart of a task scheduling method provided in the second embodiment of the present application, which is applicable to a task scheduler, such as the task scheduler 310 shown in FIG2 .
- the method includes:
- Step 901 The task scheduler obtains N tasks and associations among the N tasks.
- Step 902 the task scheduler selects the tasks that have not been determined to be synchronously blocked from the N tasks, and determines the earliest acquired task among the tasks that have not been determined to be synchronously blocked as the target task according to the acquisition order of the tasks that have not been determined to be synchronously blocked.
- tasks that have not yet been determined to be synchronously blocked refer to tasks that have not yet been analyzed whether they are synchronously blocked
- tasks that have been determined to be synchronously blocked include tasks that have been analyzed and determined to be synchronously blocked, as well as tasks that have not yet been analyzed whether they are synchronously blocked but have been determined to be synchronously blocked in the analysis of other tasks.
- Step 903 the task scheduler determines other tasks that are synchronously associated with the target task based on the association relationship of the N tasks, and determines whether the target task is synchronously blocked by the other tasks based on the execution status of the other tasks. If not, step 904 is executed; if yes, return to step 902.
- the task scheduler may first perform the following judgments 1 to 4:
- Judgment 1 when judging that the target task is a forward synchronization task in the entire group according to the configuration information of the target task, it is determined that other tasks having a synchronization association relationship with the target task are other tasks in the entire group that are issued earlier than the target task, and further, if the other tasks in the entire group that are issued earlier than the target task are not all executed, it is determined that the target task is synchronously blocked by other tasks in the entire group that are issued earlier than the target task;
- Judgment 2 when judging that the target task is a forward synchronization task in a certain subgroup according to the configuration information of the target task, it is determined that other tasks having a synchronization association relationship with the target task are other tasks in the subgroup that are issued earlier than the target task, and further, if the other tasks in the subgroup that are issued earlier than the target task are not all executed, it is determined that the target task is synchronously blocked by other tasks in the subgroup that are issued earlier than the target task;
- Judgment 3 when it is determined that the target task is a task in the entire group that is issued later than a certain backward synchronization task according to the configuration information of the target task, it is determined that the other tasks having a synchronization association relationship with the target task are the backward synchronization task in the entire group, and further, if the backward synchronization task in the entire group has not been executed to completion, it is determined that the target task is synchronously blocked by the backward synchronization task in the entire group;
- Judgment 4 when it is determined that the target task is a task in a subgroup that is issued later than a backward synchronization task based on the configuration information of the target task, it is determined that the other tasks that have a synchronization association relationship with the target task are the backward synchronization task in the subgroup. In this case, if the backward synchronization task in the subgroup has not been executed to completion, it is determined that the target task is synchronously blocked by the backward synchronization task in the subgroup.
- the task scheduler may also perform the following Judgment 5 and Judgment 6:
- Judgment 5 when judging that the target task is a backward synchronization task in the entire group according to the configuration information of the target task, it is determined that all other tasks in the entire group that are issued later than the backward synchronization task are synchronously blocked by the target task;
- Judgment 6 when judging, based on the configuration information of the target task, that the target task is a backward synchronization task in a certain subgroup, it is determined that all other tasks in the subgroup that are issued later than the backward synchronization task are synchronously blocked by the target task.
- the task scheduler can determine whether the target task is synchronously blocked by other tasks, and through the above judgments 5 and 6, the task scheduler can determine whether other tasks that have not been analyzed are synchronously blocked by the target task. It can be seen that this method can determine other tasks that are synchronously blocked by the target task while analyzing whether a target task is synchronously blocked, so that there is no need to perform meaningless analysis on these tasks that have been determined to be synchronously blocked in the future, which helps save computing resources and improves the efficiency of synchronous blocking judgment.
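Judgments 1 to 6 can be combined into one analysis step per target task: Judgments 1 to 4 decide whether the target itself is blocked, while Judgments 5 and 6 mark later tasks as blocked as a side effect. Below is a hedged sketch, with an assumed group model (member list in issue order plus forward/backward sync sets per group); the application does not prescribe these structures.

```python
def analyze_sync(target, groups, done, known_blocked):
    """Judgments 1-4: is `target` synchronously blocked?
    Judgments 5-6: if `target` is a backward sync task, record the later
    tasks of its group(s) in `known_blocked` so they need no analysis."""
    blocked = False
    for members, fwd, bwd in groups:          # whole group and subgroups
        if target not in members:
            continue
        i = members.index(target)
        earlier, later = members[:i], members[i + 1:]
        if target in fwd and any(t not in done for t in earlier):
            blocked = True                    # Judgments 1 and 2
        if any(t in bwd and t not in done for t in earlier):
            blocked = True                    # Judgments 3 and 4
        if target in bwd and target not in done:
            known_blocked.update(later)       # Judgments 5 and 6
    return blocked

# FIG. 10 setup: whole group 0..5, task 3 both forward and backward sync.
groups = [(list(range(6)), {3}, {3})]
known = set()
assert analyze_sync(3, groups, done={0}, known_blocked=known)  # tasks 1, 2 unfinished
assert known == {4, 5}    # tasks 4 and 5 need no separate analysis
```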
- Step 904 the task scheduler determines other tasks that have dependency associations with the target task based on the association relationship among the N tasks, and determines whether the target task is blocked by other task dependencies based on the execution status of other tasks. If not, execute step 905; if yes, execute step 902.
- the task scheduler may first perform the following judgments 1 and 2:
- Judgment 1 when judging that the target task has a partial dependency association with other tasks according to the configuration information of the target task, if the other dependent tasks have not all been fully executed, it is determined that the target task is blocked by the partial dependency of the other tasks;
- Judgment 2 when judging that the target task has a serial dependency association with other tasks according to the configuration information of the target task, if the other dependent tasks have not all started to execute, it is determined that the target task is blocked by the serial dependency of the other tasks.
- Step 905 The task scheduler determines the target task as a schedulable task.
- Step 906 The task scheduler schedules the schedulable tasks to the processing units in sequence according to the order in which the schedulable tasks are acquired.
- first and second embodiments introduce possible implementations of the task scheduling method from the perspective of software.
- the following further introduces possible implementations of the task scheduling method from the perspective of hardware based on the third embodiment.
- the task scheduler 310 may include a first waiting queue, a second waiting queue and a ready queue.
- the first waiting queue is used to store tasks that have not been analyzed for being blocked synchronously, and tasks that have been determined to be blocked synchronously.
- the tasks that have been determined to be blocked synchronously include: tasks that have been analyzed for being blocked synchronously and have been determined to be blocked synchronously, and tasks that have not been analyzed for being blocked synchronously but have been determined to be blocked synchronously in the analysis of other tasks.
- the second waiting queue is used to store tasks that have been determined not to be blocked synchronously and have not been analyzed for being blocked by dependencies, and tasks that have been determined to be blocked by dependencies.
- the tasks that have been determined to be blocked by dependencies include: tasks that have been analyzed for being blocked by dependencies and have been determined to be blocked by dependencies, and tasks that have not been analyzed for being blocked by dependencies but have been determined to be blocked by dependencies in the analysis of other tasks.
- the ready queue is used to store tasks that have been determined not to be blocked synchronously and have been determined not to be blocked by dependencies, that is, schedulable tasks.
- the tasks in the first waiting queue, the second waiting queue and the ready queue can be processed in parallel by different threads.
- the task scheduler 310 can store the task in the first waiting queue in sequence according to the order in which the task is issued.
- the task scheduler 310 traverses each task in the first waiting queue that has not been determined to be synchronously blocked according to the order in which the task is stored in the first waiting queue.
- When traversing each task, the method in embodiment one or embodiment two above is used to determine whether the task is synchronously blocked by other tasks. If so, the next task that has not been determined to be synchronously blocked is traversed; if not, the task is moved from the first waiting queue to the second waiting queue.
- the task scheduler 310 traverses each task in the second waiting queue that has not been determined to be dependently blocked according to the order in which the task is stored in the second waiting queue. When traversing each task, it is determined whether the task is dependently blocked by other tasks according to the above-mentioned method. If so, the next task that has not been determined to be dependently blocked is traversed, and if not, the task is moved from the second waiting queue to the ready queue. In yet another thread, the task scheduler 310 schedules the tasks in the ready queue to the available GPU cores in sequence according to the task processing status of the GPU core and the order in which the tasks are issued.
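One polling round over the two waiting queues can be sketched as follows. This is a single-threaded illustration of the queue movements only; the application runs these stages on separate threads, and the two blocking predicates here are caller-supplied stand-ins.

```python
from collections import deque

def polling_round(wait1, wait2, ready, sync_blocked, dep_blocked):
    """Move tasks that are not synchronously blocked from the first waiting
    queue to the second, then tasks that are not dependency-blocked from the
    second to the ready queue. Blocked tasks stay put, preserving issue order."""
    for queue, nxt, blocked in ((wait1, wait2, sync_blocked),
                                (wait2, ready, dep_blocked)):
        for task in list(queue):
            if not blocked(task):
                queue.remove(task)
                nxt.append(task)

# FIG. 10-like state: task 3 (forward sync) and the later tasks 4, 5 blocked.
wait1, wait2, ready = deque(range(6)), deque(), deque()
polling_round(wait1, wait2, ready,
              sync_blocked=lambda t: t >= 3,   # stand-in predicate
              dep_blocked=lambda t: False)
assert list(wait1) == [3, 4, 5] and list(ready) == [0, 1, 2]
```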
- the task scheduler 310 can integrate all functions on an independent physical device, or can disperse and deploy each function on different physical devices.
- the task scheduler 310 can also include a task acquirer 311, a blocking manager 312 and a task dispatcher 313, the task acquirer 311 can access the first waiting queue, the blocking manager 312 can access the first waiting queue, the second waiting queue and the ready queue, and the task dispatcher 313 can access the ready queue.
- the processing operations of the above-mentioned task scheduler 310 on the tasks in the first waiting queue, the second waiting queue and the ready queue can be realized by the task acquirer 311, the blocking manager 312 and the task dispatcher 313 accessing these three queues.
- the task acquirer 311 is used to receive the tasks sent by the device driver package to the task scheduler 310, and store the relevant data of the tasks in the first waiting queue in the order of sending the tasks.
- the relevant data of each task includes the configuration information of the task, and the configuration information is used to indicate one or more of the following contents: whether the task is a forward synchronization task in the whole group, whether the task is a backward synchronization task in the whole group, the subgroup to which the task belongs, whether the task is a forward synchronization task in the subgroup to which it belongs, whether the task is a backward synchronization task in the subgroup to which it belongs, whether the task has a partial dependency association with other tasks and the other tasks on which it depends, and whether the task has a serial dependency association with other tasks and the other tasks on which it depends.
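The configuration information listed above maps naturally onto a per-task record. The sketch below uses illustrative field names; the application does not define a concrete layout.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

@dataclass(frozen=True)
class TaskConfig:
    """Per-task configuration carried with each task's relevant data.
    Field names are illustrative only."""
    task_id: int
    whole_group_fwd_sync: bool = False       # forward sync task in the whole group?
    whole_group_bwd_sync: bool = False       # backward sync task in the whole group?
    subgroup: Optional[int] = None           # subgroup it belongs to, if any
    subgroup_fwd_sync: bool = False
    subgroup_bwd_sync: bool = False
    partial_deps: FrozenSet[int] = frozenset()   # partial dependency targets
    serial_deps: FrozenSet[int] = frozenset()    # serial dependency targets

# Task 3 of FIG. 10: forward and backward sync in the whole group, no dependencies.
cfg = TaskConfig(3, whole_group_fwd_sync=True, whole_group_bwd_sync=True)
assert cfg.subgroup is None and not cfg.partial_deps
```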
- the blocking manager 312 is used to monitor the state of the first waiting queue. When it is sensed that there are tasks stored in the first waiting queue, each task in the first waiting queue that is not determined to be synchronously blocked is traversed in the order in which the tasks are stored in the first waiting queue. When traversing each task, the following operations are performed:
- Operation 1 based on the configuration information of the task, if it is determined that the task is a forward synchronization task in the entire group, then by querying the task status of the other tasks in the entire group that were issued earlier than the task, it is determined whether those tasks have all been fully executed; and, if it is determined that the task is a forward synchronization task in at least one subgroup, then by querying the task status of the other tasks in each such subgroup that were issued earlier than the task, it is determined whether those tasks have all been fully executed.
- When the above judgment results are all yes, it is determined that the task is not synchronously blocked, and the task is moved from the first waiting queue to the second waiting queue. Conversely, when at least one of the above judgment results is no, it is determined that the task is synchronously blocked, the task remains in the first waiting queue, and the traversal proceeds to the next task that has not been determined to be synchronously blocked;
- Operation 2 based on the configuration information of the task, if it is determined that the task is a backward synchronization task in the entire group, then it is determined that all other tasks in the entire group that are issued later than the task are synchronously blocked, and these tasks remain in the first waiting queue; and, if it is determined that the task is a backward synchronization task in at least one subgroup, then it is determined that all other tasks in each such subgroup that are issued later than the task are synchronously blocked, and these tasks remain in the first waiting queue.
- Operation 1 may be executed first and then Operation 2, or Operation 2 may be executed first and then Operation 1, or Operation 1 and Operation 2 may be executed simultaneously.
- the analysis operation on the first waiting queue adopts a polling method. For example, after one round of analysis, the status of all tasks remaining in the first waiting queue is reset to not yet determined to be synchronously blocked, after which Operation 1 and Operation 2 are executed again in the order in which the tasks are stored.
- the blocking manager 312 is also used to monitor the state of the second waiting queue. When it perceives that there are tasks stored in the second waiting queue, it traverses each task in the second waiting queue that has not been determined to be dependency-blocked, in the order in which the tasks are stored in the second waiting queue. When traversing each task: according to the configuration information of the task, if it is determined that the task has a partial dependency association with other tasks, then by querying the task status of the other tasks on which it depends, it is determined whether those tasks have all been fully executed; and, if it is determined that the task has a serial dependency association with other tasks, then by querying the task status of the other tasks on which it depends, it is determined whether those tasks have all started to execute. If the task is determined not to be blocked by dependencies, it is moved from the second waiting queue to the ready queue; otherwise, the task remains in the second waiting queue.
- the task dispatcher 313 is used to monitor the status of the ready queue and the status of the GPU core. When it is sensed that there are tasks stored in the ready queue and the current task processing volume of the GPU core is less than the parallel task volume, the tasks in the ready queue are dispatched to the GPU core in sequence according to the order in which the device driver package sends the tasks. For example, assuming that task 3, task 1 and task 2 are stored in the ready queue in sequence, and the order of issuance is task 1, task 2 and task 3, and the number of parallel tasks of the GPU core is 2, then: when the number of tasks currently executed by the GPU core is 1, it is determined that the GPU core can currently execute another task.
- the task dispatcher 313 can dispatch task 1, which is the earliest issued in the ready queue, to the GPU core.
- task 2 in the ready queue is dispatched to the GPU core.
- task 3 in the ready queue is dispatched to the GPU core; or, when the number of tasks currently executed by the GPU core is 0, it is determined that the GPU core can currently execute two more tasks.
- the task dispatcher 313 can dispatch task 1 and task 2, which are the earliest issued in the ready queue, to the GPU core.
- task 3 in the ready queue is dispatched to the GPU core.
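The dispatcher example above (the ready queue holds tasks 3, 1 and 2 in storage order, the issue order is 1, 2, 3, and the parallel capacity is 2) can be sketched as follows; the function shape is an assumption for illustration.

```python
def dispatch(ready, issue_order, running, parallel=2):
    """Dispatch up to the core's free capacity from the ready queue,
    always picking the earliest-issued tasks first regardless of the
    order in which they entered the ready queue."""
    free = max(parallel - len(running), 0)
    batch = sorted(ready, key=issue_order.index)[:free]
    for task in batch:
        ready.remove(task)
        running.append(task)
    return batch

ready, running = [3, 1, 2], ["some-task"]          # core already runs one task
assert dispatch(ready, [1, 2, 3], running) == [1]  # earliest-issued goes first
assert dispatch(ready, [1, 2, 3], running) == []   # core is now full
running.remove("some-task")                        # a running task finishes
assert dispatch(ready, [1, 2, 3], running) == [2]  # next in issue order
```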
- For the same task, the three operations of judging whether it is synchronously blocked, judging whether it is blocked by dependencies, and scheduling it to the GPU core are executed in series, while these operations for different tasks can be executed in parallel.
- the same task will only be judged whether it is dependently blocked if it is determined that it is not synchronously blocked, and will only be subsequently scheduled if it is determined that it is not dependently blocked.
- For example, while one task is being judged for synchronous blocking, another task issued earlier than it may be judged for dependency blocking, and tasks issued earlier than both may be scheduled to the GPU core.
- Figure 10 exemplarily shows a task scheduling process diagram provided by an embodiment of the present application, wherein (A) in Figure 10 shows tasks 0 to 5 and their associations issued by the device driver package to the task scheduler, and the tasks 0 to 5 are first acquired by the task acquirer and stored in the first waiting queue. In the association relationship between tasks 0 to 5, tasks 0 to 5 are only located in the entire group, and task 3 belongs to the forward synchronization task and the backward synchronization task in the entire group, which means that task 3 can only be executed after tasks 0, 1 and 2 issued earlier than task 3 are all completed, and tasks 4 and 5 issued later than task 3 can only be executed after task 3 is completed.
- (B) in Figure 10 shows a possible situation of processing tasks according to the task scheduling method in the above embodiment. Referring to (B) in Figure 10, the task scheduler can schedule each task according to the following steps:
- Step 1 First, analyze Task 0.
- Task 0 passes through all processes without being blocked and is scheduled to the GPU core. Specifically:
- the blocking manager traverses the first task 0 issued in the first waiting queue. Since task 0 does not belong to the forward synchronization task and is not a task issued later than the backward synchronization task 3 in the entire group, task 0 is not blocked by synchronization. The blocking manager moves task 0 from the first waiting queue to the second waiting queue.
- the blocking manager traverses the task 0 stored first in the second waiting queue. Since task 0 does not depend on other tasks, task 0 is not blocked by dependencies. The blocking manager moves task 0 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is not currently processing any tasks, that is, the GPU core can currently process two tasks. Therefore, the task dispatcher schedules the first task 0 issued in the ready queue to the GPU core.
- Step 2 Analyze Task 1 again.
- Task 1 passes through all processes without being blocked and is scheduled to the GPU core. Specifically:
- the first waiting queue only contains tasks 1 to 5.
- the blocking manager traverses the first task 1 issued in the first waiting queue. Since task 1 does not belong to the forward synchronization task and is not a task issued later than the backward synchronization task 3 in the whole group, task 1 is not blocked by synchronization. The blocking manager moves task 1 from the first waiting queue to the second waiting queue.
- the blocking manager traverses Task 1 in the second waiting queue. Since Task 1 does not depend on other tasks, Task 1 is not blocked by dependencies. The blocking manager moves Task 1 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is currently only processing task 0, that is, the GPU core can currently process one task. Therefore, the task dispatcher schedules task 1 in the ready queue to the GPU core.
- the GPU core processes Task 0 and Task 1 in parallel.
- Step 3 Analyze Task 2 again.
- Task 2 passes the synchronous blocking judgment process and the dependent blocking judgment process, but needs to wait for Task 0 or Task 1 to be completed in the dispatch process before it can be scheduled to the GPU core.
- the first waiting queue only contains tasks 2 to 5.
- the blocking manager traverses task 2, the first task issued in the first waiting queue. Since task 2 is not a forward synchronization task and was not issued later than backward synchronization task 3 in the entire group, task 2 is not synchronously blocked. The blocking manager moves task 2 from the first waiting queue to the second waiting queue.
- the blocking manager traverses task 2 in the second waiting queue. Since task 2 does not depend on any other task, task 2 is not blocked by dependencies. The blocking manager moves task 2 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is currently processing tasks 0 and 1, that is, the GPU core cannot currently accept a new task. Therefore, the task dispatcher waits for the GPU core to finish one of those tasks before scheduling task 2 in the ready queue to the GPU core. As shown in (B) in Figure 10, assuming that the GPU core completes task 0 first, after the task dispatcher schedules task 2 to the GPU core, the GPU core processes tasks 1 and 2 in parallel.
- Step 4: Task 3 is analyzed next.
- Task 3 is blocked in the synchronous blocking judgment process and can only be executed after Task 1 and Task 2 are completed:
- the first waiting queue only contains tasks 3 to 5.
- the blocking manager traverses task 3, the first task issued in the first waiting queue. Since task 3 is a forward synchronization task and tasks 1 and 2, issued earlier than task 3, have not yet completed, task 3 is synchronously blocked. At the same time, since task 3 is also a backward synchronization task, tasks 4 and 5, issued later than task 3, are also determined to be synchronously blocked. Therefore, until tasks 1 and 2 are both completed, the blocking manager no longer analyzes tasks 3 to 5 in the first waiting queue.
- after tasks 1 and 2 are both completed, task 3 is released from synchronous blocking and the blocking manager moves it from the first waiting queue to the second waiting queue. The blocking manager then traverses task 3 in the second waiting queue. Since task 3 does not depend on any other task, task 3 is not blocked by dependencies. The blocking manager moves task 3 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is not currently processing any tasks, that is, the GPU core can currently process two tasks. Therefore, the task dispatcher directly schedules task 3 in the ready queue to the GPU core, and the GPU core processes only task 3.
- Step 5: Tasks 4 and 5 are blocked in the synchronous blocking judgment process and can only be executed after task 3 completes:
- the first waiting queue only contains tasks 4 and 5, which were determined to be synchronously blocked by task 3 in step 4 above. Therefore, until task 3 completes, the blocking manager no longer analyzes tasks 4 and 5 in the first waiting queue.
- the blocking manager determines that task 3, which synchronously blocks tasks 4 and 5, has completed, so tasks 4 and 5 are released from synchronous blocking. The blocking manager therefore traverses tasks 4 and 5 in sequence and moves them from the first waiting queue to the second waiting queue in sequence.
- the blocking manager traverses tasks 4 and 5 in the second waiting queue in turn. Since tasks 4 and 5 do not depend on other tasks, neither of them is blocked by dependencies. The blocking manager moves tasks 4 and 5 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is not currently processing any tasks, that is, the GPU core can currently process two tasks. Therefore, the task dispatcher schedules tasks 4 and 5 in the ready queue to the GPU core, so that the GPU core processes tasks 4 and 5 in parallel.
- the above Example 1 introduces the scenario where forward synchronization tasks and backward synchronization tasks are defined over the entire group.
- in this scenario, tasks in the entire group that are neither synchronization-blocked nor dependency-blocked can be scheduled as early as possible, reducing GPU core idling and improving GPU core utilization.
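The queue flow of Example 1 can be sketched in Python. This is a minimal illustrative model, not the patented implementation: the instant-completion dispatch, the single `barrier` variable, and all names are assumptions made for the sketch.

```python
from collections import deque

# Tasks 0-5 of Example 1: task 3 is both a forward and a backward
# synchronization task for the entire group; no dependency associations.
tasks = {i: {"fwd_sync": i == 3, "bwd_sync": i == 3} for i in range(6)}

first_wait = deque(range(6))        # tasks in issue order
second_wait = deque()
ready = deque()
completed = set()
schedule_order = []
barrier = None                      # incomplete backward-sync task, if any

def sync_pass():
    """Synchronous-blocking judgment: move unblocked tasks onward."""
    global barrier
    if barrier is not None and barrier not in completed:
        return                      # later tasks stay synchronously blocked
    barrier = None
    while first_wait:
        t = first_wait[0]
        # A forward sync task waits for every earlier task to complete.
        if tasks[t]["fwd_sync"] and any(e not in completed for e in range(t)):
            return
        second_wait.append(first_wait.popleft())
        if tasks[t]["bwd_sync"]:
            barrier = t             # blocks everything issued after it
            return

def dep_pass():
    """Dependency-blocking judgment: no dependencies here, so pass through."""
    while second_wait:
        ready.append(second_wait.popleft())

def dispatch():
    """Dispatch; core parallelism and completion timing are abstracted away."""
    while ready:
        t = ready.popleft()
        schedule_order.append(t)
        completed.add(t)

while first_wait or second_wait or ready:
    sync_pass()
    dep_pass()
    dispatch()

print(schedule_order)   # [0, 1, 2, 3, 4, 5]
```

Tasks 0 to 2 flow straight through, task 3 waits behind them, and tasks 4 and 5 are held until the barrier clears, matching the step-by-step walk-through above.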
- FIG11 exemplarily shows another task scheduling process diagram provided by an embodiment of the present application, wherein FIG11 (A) shows tasks 0 to 4 and their associations issued by the device driver package to the task scheduler, and the tasks 0 to 4 are first acquired by the task acquirer and stored in the first waiting queue.
- tasks 0 to 4 are located in the entire group, while tasks 0, 2 and 4 are located in subgroup 1, and tasks 1 and 3 are located in subgroup 2.
- task 2 is both a forward synchronization task and a backward synchronization task in subgroup 1, which means that task 2 can only be executed after task 0, issued earlier than task 2 in subgroup 1, has completed, and task 4, issued later than task 2 in subgroup 1, can only be executed after task 2 has completed.
- FIG11 (B) shows a possible situation of processing tasks according to the task scheduling method in the above embodiment. As shown in FIG11 (B), the task scheduler can schedule each task according to the following steps:
- Step 1 Task 0 is analyzed first. Task 0 is scheduled to the GPU core without blocking through all processes.
- for the specific implementation process, refer to step 1 of Example 1 above; details are not repeated here.
- Step 2 Task 1 is analyzed again. Task 1 is scheduled to the GPU core without blocking through all processes.
- for the specific implementation process, refer to step 2 of Example 1 above; details are not repeated here.
- Step 3: Task 2 is analyzed next.
- Task 2 is blocked in the synchronous blocking judgment process and can only be executed after task 0 completes:
- the first waiting queue only contains tasks 2 to 4.
- the blocking manager traverses task 2, the first task issued in the first waiting queue. Since task 2 is a forward synchronization task in subgroup 1, and task 0, issued earlier than task 2 in subgroup 1, has not yet completed, task 2 is synchronously blocked. At the same time, since task 2 is also a backward synchronization task in subgroup 1, task 4, issued later than task 2 in subgroup 1, is also determined to be synchronously blocked. Therefore, until task 2 completes, the blocking manager no longer analyzes task 4 in the first waiting queue.
- after task 0 completes, task 2 is released from synchronous blocking and the blocking manager moves it from the first waiting queue to the second waiting queue. The blocking manager then traverses task 2 in the second waiting queue. Since task 2 does not depend on any other task, task 2 is not blocked by dependencies. The blocking manager moves task 2 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is currently processing task 1, that is, the GPU core can currently process a new task. Therefore, the task dispatcher schedules task 2 in the ready queue to the GPU core, so that the GPU core processes task 1 and task 2 in parallel.
- Step 4: Task 3 is analyzed next. Task 3 passes the synchronous blocking judgment process and the dependency blocking judgment process, but must wait for task 1 or task 2 to complete in the dispatch process before it can be scheduled to the GPU core. As shown in (B) in Figure 11, assuming that task 1 completes first, the task dispatcher dispatches task 3 in the ready queue to the GPU core, so that the GPU core processes tasks 2 and 3 in parallel. For the specific implementation process of this step, refer to step 3 of Example 1 above; details are not repeated here.
- Step 5: Task 4 is blocked in the synchronous blocking judgment process and can only be executed after task 2 completes:
- the first waiting queue only contains task 4, which was determined to be synchronously blocked by task 2 in step 3 above. Therefore, until task 2 completes, the blocking manager no longer analyzes task 4 in the first waiting queue.
- the blocking manager determines that task 2, which synchronously blocks task 4, has completed, so task 4 is released from synchronous blocking. The blocking manager therefore moves task 4 from the first waiting queue to the second waiting queue.
- the blocking manager traverses task 4 in the second waiting queue. Since task 4 does not depend on other tasks, task 4 is not blocked by dependencies. The blocking manager moves task 4 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is currently processing task 3, that is, the GPU core can currently process a new task. Therefore, the task dispatcher schedules task 4 in the ready queue to the GPU core, so that the GPU core processes task 3 and task 4 in parallel.
- the above Example 2 introduces the scenario where forward synchronization tasks and backward synchronization tasks are defined within subgroups.
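The subgroup-scoped synchronization check of Example 2 might look as follows. This is an illustrative sketch only; the dictionaries, the function name, and the scoping convention are assumptions, not the claimed implementation.

```python
# Example 2 layout: subgroup 1 holds tasks 0, 2, 4; subgroup 2 holds 1 and 3.
# Task 2 is a forward and backward synchronization task only within subgroup 1.
subgroup = {0: 1, 1: 2, 2: 1, 3: 2, 4: 1}
sync = {2: {"fwd": True, "bwd": True}}

def sync_blocked(task, completed):
    """True if `task` is synchronously blocked; sync is scoped to its subgroup."""
    peers = [t for t in subgroup if subgroup[t] == subgroup[task] and t != task]
    # Forward sync: wait for every earlier task in the same subgroup.
    if sync.get(task, {}).get("fwd") and any(
            p < task and p not in completed for p in peers):
        return True
    # Backward sync: an earlier incomplete backward-sync peer blocks this task.
    return any(p < task and sync.get(p, {}).get("bwd") and p not in completed
               for p in peers)

print(sync_blocked(2, completed=set()))   # True: task 0 not yet complete
print(sync_blocked(3, completed=set()))   # False: subgroup 2 has no sync task
print(sync_blocked(4, completed={0}))     # True: blocked behind task 2
print(sync_blocked(4, completed={0, 2}))  # False: task 2 has completed
```

Because the check only looks at peers in the same subgroup, tasks 1 and 3 in subgroup 2 are never blocked by task 2, which is what allows them to run in parallel with subgroup 1 in the walk-through above.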
- FIG. 12 exemplarily shows another task scheduling process diagram provided in an embodiment of the present application. FIG. 12 (A) shows tasks 0 to 4 and their associations issued by the device driver package to the task scheduler; tasks 0 to 4 are first acquired by the task acquirer and stored in the first waiting queue. In the association relationship, tasks 0 to 4 are in the entire group, and task 3 has a partial dependency association with tasks 0 and 2, which means that task 3 can only be executed after tasks 0 and 2 have both completed.
- FIG12 (B) shows a possible situation of processing tasks according to the task scheduling method in the above embodiment. As shown in FIG12 (B), the task scheduler can schedule each task according to the following steps:
- Step 1 Task 0 is analyzed first. Task 0 is scheduled to the GPU core without blocking through all processes.
- for the specific implementation process, refer to step 1 of Example 1 above; details are not repeated here.
- Step 2 Task 1 is analyzed again. Task 1 is scheduled to the GPU core without blocking through all processes.
- for the specific implementation process, refer to step 2 of Example 1 above; details are not repeated here.
- Step 3: Task 2 is analyzed next. Task 2 passes the synchronous blocking judgment process and the dependency blocking judgment process, but must wait for task 0 or task 1 to complete in the dispatch process before it can be scheduled to the GPU core. As shown in (B) in Figure 12, assuming that the GPU core completes task 0 first, after the task dispatcher schedules task 2 to the GPU core, the GPU core processes tasks 1 and 2 in parallel. For the specific implementation process of this step, refer to step 3 of Example 1 above; details are not repeated here.
- Step 4: Task 3 is analyzed next.
- Task 3 is blocked in the dependency blocking judgment process and can only be executed after Task 2 is completed:
- the first waiting queue only contains Task 3 and Task 4.
- the blocking manager traverses task 3, the first task issued in the first waiting queue. Since task 3 is not a forward synchronization task and there is no backward synchronization task in the entire group, task 3 is not synchronously blocked. The blocking manager moves task 3 from the first waiting queue to the second waiting queue.
- the blocking manager traverses task 3 in the second waiting queue. Task 3 depends on tasks 0 and 2; task 0 has completed, but task 2 has not. Therefore, task 3 is blocked by the dependency, and until task 2 completes, the blocking manager no longer analyzes task 3 in the second waiting queue.
- Step 5: As shown in (B) in Figure 12, assuming that task 1 completes first, since task 2 has not completed, task 3 is not released from dependency blocking; task 4, which is located after task 3 in the first waiting queue, can therefore be analyzed directly. Task 4 passes all processes without being blocked and is scheduled to the GPU core, which then processes tasks 2 and 4 in parallel.
- Step 6: As shown in (B) in Figure 12, when task 2 completes, task 3 in the second waiting queue is released from dependency blocking, so task 3 passes the remaining processes without being blocked and is scheduled to the GPU core, allowing the GPU core to process tasks 4 and 3 in parallel.
- in another possible case, if task 3 has a serial dependency association with tasks 0 and 2, task 3 can be executed once tasks 0 and 2 have started to execute, regardless of whether they have completed.
- task 3 is then released from dependency blocking earlier: it passes the synchronous blocking judgment process and the dependency blocking judgment process, but must wait for task 1 or task 2 to complete in the dispatch process before it can be scheduled to the GPU core.
- the task dispatcher will dispatch Task 3 in the ready queue to the GPU core, so that the GPU core processes Task 2 and Task 3 in parallel.
- the above Example 3 introduces a scenario in which dependency associations are defined.
- in this scenario, when a task is blocked by a dependency, a subsequent task that is not dependency-blocked can be scheduled to the GPU core first, keeping the GPU core busy as much as possible and effectively improving GPU core utilization.
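The overtaking behavior of Example 3, where a later task passes a dependency-blocked one in the second waiting queue, can be sketched as follows. The queue names and the dependency table are illustrative assumptions.

```python
from collections import deque

# Illustrative dependency table for Example 3: task 3 partially depends on
# tasks 0 and 2 (it may run only after both have completed).
deps = {3: {0, 2}}

def dep_pass(second_wait, ready, completed):
    """Move every task whose dependencies are satisfied to the ready queue.
    A dependency-blocked task stays put, but later tasks are still examined,
    so an unblocked later task can overtake it."""
    still_blocked = deque()
    while second_wait:
        t = second_wait.popleft()
        if deps.get(t, set()) <= completed:
            ready.append(t)
        else:
            still_blocked.append(t)
    second_wait.extend(still_blocked)

ready = deque()
q = deque([3, 4])
dep_pass(q, ready, completed={0, 1})   # task 2 not finished yet
print(list(ready), list(q))            # [4] [3]: task 4 overtakes blocked task 3

dep_pass(q, ready, completed={0, 1, 2})
print(list(ready))                     # [4, 3]: task 3 released after task 2
```

The key design point is that the pass examines every queued task rather than stopping at the first blocked one, which is exactly what lets task 4 reach the GPU core while task 3 is still waiting.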
- the above mainly introduces the solution provided by the present application from the perspective of the interaction between various network elements.
- the above-mentioned network elements include hardware structures and/or software modules corresponding to the execution of various functions.
- those skilled in the art should readily appreciate that the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functions in different ways for each particular application, but such implementations should not be considered beyond the scope of the present application.
- FIG13 is a schematic diagram of the structure of a task scheduler provided in an embodiment of the present application.
- the task scheduler 1300 may be a chip or a circuit, such as a chip or a circuit that can be set in a processor.
- the task scheduler 1300 corresponds to the task scheduler in the aforementioned method, such as the task scheduler 310 in FIG2.
- the task scheduler 1300 may implement the steps of any one or more of the corresponding methods shown in FIG5 or FIG9 above.
- the task scheduler 1300 may include an acquisition unit 1301, a determination unit 1302, and a scheduling unit 1303.
- the acquisition unit 1301 may be a receiving unit or a receiver when receiving information, and the receiving unit or the receiver may be a radio frequency circuit.
- the acquisition unit 1301 is used to acquire N tasks and the association relationship of the N tasks.
- the determination unit 1302 is used to determine the tasks among the N tasks that have no dependency or have been released from dependency as schedulable tasks according to the association relationship of the N tasks.
- the scheduling unit 1303 is used to schedule the schedulable tasks to the processing unit.
- N is a positive integer.
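A minimal sketch of how the three units might cooperate is shown below. The class and method names, the callback-style processing unit, and the dependency table are assumptions made for illustration, not the claimed implementation.

```python
# A toy model of task scheduler 1300: acquisition unit 1301 (acquire),
# determination unit 1302 (schedulable), and scheduling unit 1303 (run).
class TaskScheduler:
    def __init__(self, processing_unit):
        self.processing_unit = processing_unit   # e.g. a GPU-core stand-in
        self.tasks = []          # the N acquired tasks
        self.deps = {}           # association relationship: task -> dependencies
        self.completed = set()

    def acquire(self, tasks, deps):              # acquisition unit 1301
        self.tasks = list(tasks)
        self.deps = dict(deps)

    def schedulable(self):                       # determination unit 1302
        """Tasks with no dependency, or whose dependencies are released."""
        return [t for t in self.tasks
                if t not in self.completed
                and self.deps.get(t, set()) <= self.completed]

    def run(self):                               # scheduling unit 1303
        while len(self.completed) < len(self.tasks):
            for t in self.schedulable():
                self.processing_unit(t)          # dispatch to the processing unit
                self.completed.add(t)

order = []
sched = TaskScheduler(processing_unit=order.append)
sched.acquire(tasks=[0, 1, 2, 3], deps={3: {0, 2}})
sched.run()
print(order)   # [0, 1, 2, 3]: task 3 waits until tasks 0 and 2 complete
```

Completion timing and core capacity are abstracted away here; the point is only the division of labor among the three units.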
- the division of the modules of the task scheduler 1300 is merely a division of logical functions; in actual implementation, all or some of them may be integrated into one physical entity, or they may be physically separate.
- the present application also provides a task scheduler, including the aforementioned task acquirer, blocking manager and task dispatcher, and also including a first waiting queue, a second waiting queue and a ready queue.
- the present application also provides a processor, including the aforementioned task scheduler and a processing unit.
- the processing unit may specifically be a processor core, such as the aforementioned GPU core.
- the present application also provides an electronic device, which includes a processor, the processor is coupled to a memory, and the processor is used to execute a computer program stored in the memory, so that the electronic device executes the method of any one of the embodiments shown in Figure 5 or Figure 9.
- the present application also provides a task processing system, including a processor and a device driver package, the device driver package is used to send N tasks to the processor, and the processor is used to process N tasks by executing the method of any one of the embodiments shown in Figure 5 or Figure 9.
- the present application also provides a computer program product, which includes: a computer program code, when the computer program code is run on a computer, the computer executes the method of any one of the embodiments shown in Figure 5 or Figure 9.
- the present application also provides a computer-readable storage medium, which stores a program code.
- the program code runs on a computer, the computer executes the method of any one of the embodiments shown in Figure 5 or Figure 9.
- a component can be, but is not limited to, a process running on a processor, a processor, an object, an executable file, an execution thread, a program and/or a computer.
- applications running on a computing device and a computing device can be components.
- One or more components may reside in a process and/or an execution thread, and a component may be located on a computer and/or distributed between two or more computers.
- these components may be executed from various computer-readable media having various data structures stored thereon.
- Components may, for example, communicate by way of local and/or remote processes according to a signal having one or more data packets (e.g., data from one component interacting with another component in a local system or a distributed system, and/or interacting with other systems by way of the signal across a network such as the Internet).
- the disclosed systems, devices and methods can be implemented in other ways.
- the device embodiments described above are merely illustrative.
- the division of the units is only a logical function division. There may be other division methods in actual implementation, such as multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed.
- In addition, the couplings or direct couplings or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between devices or units may be electrical, mechanical or in other forms.
- the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected to achieve the purpose of the solution of this embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- if the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
- based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product.
- the computer software product is stored in a storage medium and includes several instructions for a computer device (which can be a personal computer, server, or network device, etc.) to perform all or part of the steps of the methods described in each embodiment of the present application.
- the aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc, and other media that can store program code.
Abstract
A task scheduling method, apparatus and system, which are applicable to the technical field of processors and are used for improving the utilization rate of a processing unit. The method comprises: a task scheduler acquiring N tasks and an association relationship between the N tasks (501); according to the association relationship between the N tasks, determining tasks which have no dependency or have been released from dependency among the N tasks to be schedulable tasks (502); and scheduling the schedulable tasks to a processing unit (503), where N is a positive integer. The scheduling sequence of the N tasks is maintained by the task scheduler, and according to the actual execution of the tasks, tasks which have no dependency or have been released from dependency can be issued in advance to a processing unit for processing, so as to make full use of all the hardware resources of the processing unit to process tasks with high concurrency and keep the processing unit from idling to the greatest extent, thereby effectively improving the utilization rate of the processing unit.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to the Chinese patent application filed with the China Patent Office on December 15, 2022, with application number 202211614262.8 and entitled "A task scheduling method, device and system", the entire contents of which are incorporated herein by reference.
The present application relates to the field of processor technology, and in particular to a task scheduling method, device and system.
The task scheduler is one of the core components of a processor and is used to reasonably schedule tasks to processing units, so as to make full use of the processor's hardware resources to process tasks efficiently. The rationality of the task scheduler's scheduling is crucial to shortening task waiting times, increasing task parallelism, optimizing resource utilization, and reducing costs while increasing efficiency. Therefore, how to achieve reasonable task scheduling has become a mainstream research direction in task scheduler design.
In one industry solution for reasonable task scheduling, the task scheduler is provided with multiple task queues, which respectively store tasks belonging to different services; tasks in different task queues are executed in parallel, while tasks within one task queue are executed serially. Although this solution can process tasks belonging to different services in parallel, tasks belonging to the same service must be processed serially. In other words, as long as an earlier task in a task queue has not finished, the tasks behind it cannot use idle hardware resources to be processed in advance. This solution therefore still leaves the processing unit idle at times and cannot make full use of the processing unit's hardware resources to process tasks with high concurrency, which is not conducive to improving the utilization rate of the processing unit.
In summary, a task scheduling method is currently needed to improve the utilization rate of processing units.
Summary of the invention
The present application provides a task scheduling method, device and system for improving the utilization rate of a processing unit.
In a first aspect, the present application provides a task scheduling method applicable to a task scheduler. The task scheduler may be any device, apparatus or equipment with task scheduling capability, or may be a chip or a circuit, without limitation. The method comprises: the task scheduler acquires N tasks and the association relationship of the N tasks; according to the association relationship of the N tasks, determines the tasks among the N tasks that have no dependency or have been released from dependency as schedulable tasks; and schedules the schedulable tasks to a processing unit, where N is a positive integer.
In the above task scheduling method, the N tasks may, for example, be issued to the task scheduler by a device driver package. In this way, the scheduling order of the N tasks is maintained by the task scheduler on the hardware side, rather than the execution order being specified by the device driver package on the software side. According to the actual execution of the tasks, tasks that have no dependency or have been released from dependency can be issued to the processing unit in advance for processing, so as to make full use of all the hardware resources of the processing unit to process tasks with high concurrency and keep the processing unit from idling as much as possible, thereby effectively improving the utilization rate of the processing unit. Moreover, this method does not require the hardware side to send a notification message to the software side after each task completes to instruct the software side to issue a new task, which effectively reduces the workload on the software side, improves the efficiency of task scheduling, and saves communication overhead.
In a possible design, the association relationship of the N tasks may include synchronization association and/or dependency association, where:
Synchronization association is used to indicate that a task is associated with all other tasks acquired earlier or later than the task. Specifically, synchronization association may include forward synchronization association and backward synchronization association. Forward synchronization association means that a task depends on all other tasks issued earlier than it; a task having a forward synchronization association with all other tasks is also called a forward synchronization task. Backward synchronization association means that all other tasks issued later than a task depend on it; a task having a backward synchronization association with all other tasks is also called a backward synchronization task.
Dependency association includes partial dependency and serial dependency. Partial dependency means that a task depends on the execution results of some other tasks, i.e., the task can be executed only after all the tasks it depends on have completed. Serial dependency means that a task depends on the execution of some or all other tasks, i.e., the task can be executed once all the tasks it depends on have been executed (started), regardless of whether they have completed.
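One possible in-memory representation of these association relationships is sketched below; the field names are illustrative only and do not reflect the patent's actual data structure.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    task_id: int
    forward_sync: bool = False      # depends on ALL earlier tasks in its scope
    backward_sync: bool = False     # ALL later tasks in its scope depend on it
    partial_deps: set = field(default_factory=set)   # these must have COMPLETED
    serial_deps: set = field(default_factory=set)    # these must have STARTED

# Task 3 of Example 3: a partial dependency on tasks 0 and 2 is expressed with
# two task ids, instead of flags spread over every other task's configuration.
t3 = Task(task_id=3, partial_deps={0, 2})
print(t3.partial_deps)   # {0, 2}
```

This mirrors the design benefit noted below: a synchronization flag on one task stands in for an explicit edge to every earlier or later task, keeping the per-task configuration small.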
With the above design, by setting a synchronization association, the device driver package only needs to configure one task to indicate its dependency relationship with all other tasks acquired earlier or later than it, without listing in that task's configuration information every other task it depends on. Setting dependency associations can additionally express looser associations between tasks beyond synchronization associations. Clearly, this way of expressing association relationships can indicate the associations among all tasks while greatly reducing the complexity of the task data structure, helping to reduce the communication overhead between the device driver package and the task scheduler.
In a further design, the task scheduler determines, according to the association relationship of the N tasks, the tasks among the N tasks that have no dependency or have been released from dependency as schedulable tasks, including: the task scheduler determines, according to the association relationship of the N tasks, a task having a synchronization association with other tasks; when the other tasks with which the task has a synchronization association have completed, the task is determined as a schedulable task. In this way, by monitoring the execution of tasks with synchronization associations, a task can be promptly scheduled to the processing unit once the other tasks synchronously associated with it no longer block it.
In a further design, the completion of the other tasks with which a task has a synchronization association includes: the task is a forward synchronization task and all other tasks acquired earlier than it have completed; and/or the task is acquired later than a backward synchronization task and that backward synchronization task has completed. In this design, when the task is no longer blocked by all other earlier tasks or by a backward synchronization task, the task is determined to be released from dependency, so that it can be scheduled to the processing unit as early as possible.
In a further design, the task scheduler determines, according to the association relationship of the N tasks, the tasks among the N tasks that have no dependency or have been released from dependency as schedulable tasks, including: the task scheduler determines, according to the association relationship of the N tasks, a task having a dependency association with other tasks; when the other tasks with which the task has a dependency association have been executed or have completed, the task is determined as a schedulable task. In this way, by monitoring the execution of tasks with dependency associations, a task can be promptly scheduled to the processing unit once the other tasks it depends on no longer block it.
In a further design, the task scheduler determines, according to the association relationships among the N tasks, the tasks among the N tasks that are neither synchronization-blocked nor dependency-blocked by other tasks as schedulable tasks. Here, a task being synchronization-blocked by other tasks means that the task satisfies at least one of the following conditions: the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing; or the task is acquired later than a backward synchronization task and that backward synchronization task has not finished executing. A task being dependency-blocked by other tasks means that the task satisfies at least one of the following conditions: the task has a partial dependency association with other tasks and the tasks it depends on have not all finished executing; or the task has a serial dependency association with other tasks and the tasks it depends on have not all been executed.
With the above design, whether a task is synchronization-blocked and whether it is dependency-blocked are determined in separate flows, from the perspectives of synchronization associations and dependency associations respectively, without having to analyze all tasks associated with the task in one sweeping pass based on its dependency relationships. This enables fine-grained management of the task-blocking determination and improves its flexibility.
In a further design, the task scheduler determining, according to the association relationships among the N tasks, the tasks among the N tasks that are neither synchronization-blocked nor dependency-blocked by other tasks as schedulable tasks includes: the task scheduler traverses, in the acquisition order of the N tasks, each task among the N tasks that has not been determined to be blocked, and for each such task: if the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing, the task is determined to be synchronization-blocked; if the task is a backward synchronization task, all other tasks acquired later than it are determined to be synchronization-blocked; if the task depends on other tasks and those tasks have not finished executing, the task is determined to be dependency-blocked; if the task serially depends on other tasks and those tasks have not all been executed, the task is determined to be dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, the task is determined as a schedulable task.
With the above design, as soon as a task is identified as a backward synchronization task, all other tasks issued later than it can be directly determined to be synchronization-blocked. Thus, once one unfinished backward synchronization task is found, none of the tasks issued after it needs any further synchronization-blocking analysis. This greatly simplifies the synchronization-blocking determination, effectively improves its efficiency, and in turn helps improve the efficiency of task scheduling.
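The traversal described above might be sketched as follows. This is a simplified, hypothetical rendering: the plain-dependency and serial-dependency cases are folded into a single `deps` check, and all names are illustrative rather than taken from the application.

```python
def find_schedulable(tasks):
    """tasks: dicts in acquisition order with optional keys
    'done', 'forward_sync', 'backward_sync', 'deps' (earlier indices)."""
    schedulable = []
    for i, t in enumerate(tasks):
        if t.get("done"):
            continue
        earlier_unfinished = any(not u.get("done") for u in tasks[:i])
        # A forward-sync task is blocked while any earlier task is unfinished.
        sync_blocked = t.get("forward_sync") and earlier_unfinished
        # Simplification: blocked while any explicitly listed dependency
        # is unfinished (the serial-dependency variant is not modeled).
        dep_blocked = any(not tasks[d].get("done") for d in t.get("deps", ()))
        if not sync_blocked and not dep_blocked:
            schedulable.append(i)
        if t.get("backward_sync"):
            # All later tasks are synchronization-blocked by this unfinished
            # backward-sync task, so no further analysis is needed.
            break
    return schedulable
```

The early `break` mirrors the efficiency argument above: everything after an unfinished backward synchronization task is skipped without per-task analysis.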
In a further design, a task being synchronization-blocked may include the task being synchronization-blocked at the whole-group level and/or at the subgroup level. In that case:
The other tasks used to determine whether a task is synchronization-blocked at the whole-group level are all tasks in the whole group acquired earlier or later than the task. For example, when a task is marked as a forward synchronization task of the whole group, as long as the other tasks of the whole group issued earlier than it have not all finished executing, the task is determined to be synchronization-blocked by those other tasks. When a task is a backward synchronization task of the whole group, all other tasks of the whole group later than it are determined to be synchronization-blocked by it;
Similarly, the other tasks used to determine whether a task is synchronization-blocked at the subgroup level are all tasks, in the subgroup to which the task belongs, acquired earlier or later than the task. For example, when a task is marked as a forward synchronization task of a subgroup, as long as the other tasks of that subgroup issued earlier than it have not all finished executing, the task is determined to be synchronization-blocked by those other tasks. When a task is a backward synchronization task of a subgroup, all other tasks of that subgroup later than it are determined to be synchronization-blocked by it.
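One possible way to express the two-level test is to run the same scope-local predicate once over the whole group and once over the task's subgroup, with separate whole-group and subgroup synchronization markers. All field names here are invented for illustration, and the exact marker semantics are an assumption.

```python
def sync_blocked_in(task, peers, fwd_key, bwd_key):
    """Blocking test inside one scope (the whole group or one subgroup)."""
    earlier = [t for t in peers if t["seq"] < task["seq"]]
    # Forward-sync marker in this scope: blocked while earlier peers run.
    if task.get(fwd_key) and any(not t["done"] for t in earlier):
        return True
    # An unfinished backward-sync peer issued earlier blocks this task.
    if any(t.get(bwd_key) and not t["done"] for t in earlier):
        return True
    return False

def sync_blocked(task, group):
    subgroup = [t for t in group if t.get("sub") == task.get("sub")]
    return (sync_blocked_in(task, group, "fwd_group", "bwd_group")
            or sync_blocked_in(task, subgroup, "fwd_sub", "bwd_sub"))
```

This also shows the isolation property discussed below: a subgroup-level marker in one subgroup never blocks tasks of another subgroup.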
In a further design, the subgroups may be obtained by partitioning tasks according to service characteristics. By grouping tasks in this way, even if the tasks in one group are synchronization-blocked, the tasks in other groups are unaffected; that is, the synchronization blocking of one service does not affect the execution of other services. Decoupling the task-execution associations between services thus reduces their mutual interference.
Alternatively, in a further design, the subgroups may be obtained by partitioning tasks whose association relationships are relatively dense. By placing tasks with concentrated association relationships into one subgroup, the associations among the tasks in that subgroup can be known by marking only its forward or backward synchronization tasks, without annotating the dependency relationships of every individual task. This streamlines the data structure of each task and reduces the communication overhead between the device driver package and the task scheduler.
In a further design, after acquiring the N tasks and their association relationships, the task scheduler may further store the N tasks in a first waiting queue. In this case, the task scheduler determining, according to the association relationships among the N tasks, the tasks that are neither synchronization-blocked nor dependency-blocked by other tasks as schedulable tasks includes: for any task in the first waiting queue, determining whether the task is synchronization-blocked by other tasks, and if not, moving it from the first waiting queue to a second waiting queue; and for any task in the second waiting queue, determining whether the task is dependency-blocked by other tasks, and if not, moving it from the second waiting queue to a ready queue. Further, the task scheduler scheduling the schedulable tasks to the processing unit includes: scheduling the tasks in the ready queue to the processing unit.
With the above design, the synchronization-blocking determination, the dependency-blocking determination, and the dispatch operation are executed in parallel. This not only ensures that a task is scheduled only when it is neither synchronization-blocked nor dependency-blocked, but also allows the earlier stages of later-issued tasks to be processed in parallel with the later stages of earlier-issued tasks, which helps further improve the efficiency of task scheduling.
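A minimal single-threaded sketch of the three-queue flow might look as follows (in the described design the stages can run concurrently; the queue names and the two blocking predicates are assumptions made for this sketch):

```python
from collections import deque

def pump(wait1, wait2, ready, is_sync_blocked, is_dep_blocked):
    """Run one pass of each stage; a task that clears the synchronization
    check may clear the dependency check within the same call."""
    # Stage 1: synchronization-blocking check on the first waiting queue.
    for _ in range(len(wait1)):
        task = wait1.popleft()
        (wait1 if is_sync_blocked(task) else wait2).append(task)
    # Stage 2: dependency-blocking check on the second waiting queue.
    for _ in range(len(wait2)):
        task = wait2.popleft()
        (wait2 if is_dep_blocked(task) else ready).append(task)
```

Blocked tasks are re-appended to their own queue and re-examined on the next pass, while unblocked tasks advance toward the ready queue for dispatch.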
In a further design, the task scheduler scheduling the tasks in the ready queue to the processing unit includes: the task scheduler monitors the number of tasks currently being executed by the processing unit, and when that number is smaller than the number of tasks the processing unit can execute in parallel, schedules the tasks in the ready queue to the processing unit one by one in the order in which they were acquired.
With the above design, even if a later-issued task is stored in the ready queue before an earlier-issued one, the earliest-issued task is located before actual dispatch instead of dispatching in the queue's storage order. Under the premise of finding all processable tasks in advance, this ensures to the greatest extent that the earliest-issued of the currently processable tasks is processed first.
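The dispatch rule above could be sketched as follows, with hypothetical names: selection uses issue order rather than the ready queue's storage order, and dispatching stops once the processing unit's parallel capacity is reached.

```python
def dispatch(ready, running, max_parallel):
    """ready: list of (issue_seq, task); running: set of in-flight seqs."""
    dispatched = []
    while len(running) < max_parallel and ready:
        # Pick the earliest-issued ready task, not the storage-order head.
        entry = min(ready, key=lambda e: e[0])
        ready.remove(entry)
        running.add(entry[0])
        dispatched.append(entry[1])
    return dispatched
```

In this sketch the caller is assumed to remove entries from `running` when the processing unit reports task completion, freeing slots for the next call.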
In a second aspect, the present application provides a task scheduler, including: a task acquirer, configured to acquire N tasks and the association relationships among the N tasks, where N is a positive integer; a blocking manager, configured to determine, according to the association relationships among the N tasks, the tasks among the N tasks that have no dependency or whose dependency has been released as schedulable tasks; and a task dispatcher, configured to schedule the schedulable tasks to a processing unit.
In a possible design, the task acquirer is specifically configured to receive the N tasks, and the association relationships among the N tasks, issued by a device driver package.
In a possible design, the blocking manager is specifically configured to: determine, according to the association relationships among the N tasks, a task that has a synchronization association with other tasks, and when the other tasks that have a synchronization association with that task have finished executing, determine that task as a schedulable task. Having a synchronization association with other tasks means that the task has an association relationship with all other tasks acquired earlier or later than it.
In a possible design, the completion of the other tasks having a synchronization association with a task includes at least one of the following: the task is a forward synchronization task and all other tasks acquired earlier than it have finished executing; or the task is acquired later than a backward synchronization task and that backward synchronization task has finished executing. A forward synchronization task is used to indicate that the forward synchronization task depends on all other tasks acquired earlier than it; a backward synchronization task is used to indicate that all other tasks acquired later than it depend on the backward synchronization task.
In a possible design, the blocking manager is specifically configured to: determine, according to the association relationships among the N tasks, a task that has a dependency association with other tasks, and when the other tasks that have a dependency association with that task have been executed or have finished executing, determine that task as a schedulable task.
In a possible design, the blocking manager is specifically configured to: determine, according to the association relationships among the N tasks, the tasks among the N tasks that are neither synchronization-blocked nor dependency-blocked by other tasks as schedulable tasks. A task being synchronization-blocked by other tasks includes: the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing; or the task is acquired later than a backward synchronization task and that backward synchronization task has not finished executing. A forward synchronization task is defined to depend on all other tasks acquired earlier than it; a backward synchronization task is defined such that all other tasks acquired later than it depend on it. A task being dependency-blocked by other tasks includes: the task depends on other tasks, and those tasks have not all been executed or have not all finished executing.
In a possible design, the blocking manager is specifically configured to: traverse, in the acquisition order of the N tasks, each task among the N tasks that has not yet been determined to be blocked, and for each such task: if the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing, determine that the task is synchronization-blocked; if the task is a backward synchronization task, determine that all other tasks acquired later than it are synchronization-blocked; if the task depends on other tasks and those tasks have not finished executing, determine that the task is dependency-blocked; if the task serially depends on other tasks and those tasks have not all been executed, determine that the task is dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, determine the task as a schedulable task.
In a possible design, a task being synchronization-blocked includes the task being synchronization-blocked at the whole-group level and/or at the subgroup level. The other tasks used to determine whether a task is synchronization-blocked at the whole-group level are the tasks, among the N tasks, acquired earlier or later than the task; the other tasks used to determine whether a task is synchronization-blocked at the subgroup level are the tasks, in the subgroup to which the task belongs, acquired earlier or later than the task.
In a possible design, the subgroups are obtained by partitioning according to service characteristics.
In a possible design, the task scheduler further includes: a first waiting queue, a second waiting queue, and a ready queue. In this case, after acquiring the N tasks, the task acquirer is further configured to store the N tasks in the first waiting queue. The blocking manager is specifically configured to: for any task in the first waiting queue, determine whether the task is synchronization-blocked by other tasks, and if not, move it from the first waiting queue to the second waiting queue; and for any task in the second waiting queue, determine whether the task is dependency-blocked by other tasks, and if not, move it from the second waiting queue to the ready queue. The task dispatcher is specifically configured to schedule the tasks in the ready queue to the processing unit.
In a possible design, the task dispatcher is specifically configured to: monitor the number of tasks currently being executed by the processing unit, and when that number is smaller than the number of tasks the processing unit can execute in parallel, schedule the tasks in the ready queue to the processing unit one by one in the order in which they were acquired.
In a third aspect, the present application provides a task scheduler, including: an acquisition unit, configured to acquire N tasks and the association relationships among the N tasks, where N is a positive integer; a determination unit, configured to determine, according to the association relationships among the N tasks, the tasks among the N tasks that have no dependency or whose dependency has been released as schedulable tasks; and a scheduling unit, configured to schedule the schedulable tasks to a processing unit.
In a possible design, the determination unit is specifically configured to: determine, according to the association relationships among the N tasks, a task that has a synchronization association with other tasks, and when the other tasks that have a synchronization association with that task have finished executing, determine that task as a schedulable task. Having a synchronization association with other tasks means that the task has an association relationship with all other tasks acquired earlier or later than it.
In a possible design, the completion of the other tasks having a synchronization association with a task includes at least one of the following: the task is a forward synchronization task and all other tasks acquired earlier than it have finished executing; or the task is acquired later than a backward synchronization task and that backward synchronization task has finished executing. A forward synchronization task is used to indicate that the forward synchronization task depends on all other tasks acquired earlier than it; a backward synchronization task is used to indicate that all other tasks acquired later than it depend on the backward synchronization task.
In a possible design, the determination unit is specifically configured to: determine, according to the association relationships among the N tasks, a task that has a dependency association with other tasks, and when the other tasks that have a dependency association with that task have been executed or have finished executing, determine that task as a schedulable task.
In a possible design, the determination unit is specifically configured to: determine, according to the association relationships among the N tasks, the tasks among the N tasks that are neither synchronization-blocked nor dependency-blocked by other tasks as schedulable tasks. A task being synchronization-blocked by other tasks includes: the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing; or the task is acquired later than a backward synchronization task and that backward synchronization task has not finished executing. A forward synchronization task is defined to depend on all other tasks acquired earlier than it; a backward synchronization task is defined such that all other tasks acquired later than it depend on it. A task being dependency-blocked by other tasks includes: the task depends on other tasks, and those tasks have not all been executed or have not all finished executing.
In a possible design, the determination unit is specifically configured to: traverse, in the acquisition order of the N tasks, each task among the N tasks that has not yet been determined to be blocked, and for each such task: if the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing, determine that the task is synchronization-blocked; if the task is a backward synchronization task, determine that all other tasks acquired later than it are synchronization-blocked; if the task depends on other tasks and those tasks have not finished executing, determine that the task is dependency-blocked; if the task serially depends on other tasks and those tasks have not all been executed, determine that the task is dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, determine the task as a schedulable task.
In a possible design, a task being synchronization-blocked includes the task being synchronization-blocked at the whole-group level and/or at the subgroup level. The other tasks used to determine whether a task is synchronization-blocked at the whole-group level are the tasks, among the N tasks, acquired earlier or later than the task; the other tasks used to determine whether a task is synchronization-blocked at the subgroup level are the tasks, in the subgroup to which the task belongs, acquired earlier or later than the task.
In a possible design, the subgroups are obtained by partitioning according to service characteristics.
In a possible design, after acquiring the N tasks and their association relationships, the acquisition unit is further configured to store the N tasks in a first waiting queue. The determination unit is specifically configured to: for any task in the first waiting queue, determine whether the task is synchronization-blocked by other tasks, and if not, move it from the first waiting queue to a second waiting queue; and for any task in the second waiting queue, determine whether the task is dependency-blocked by other tasks, and if not, move it from the second waiting queue to a ready queue. The scheduling unit is specifically configured to schedule the tasks in the ready queue to the processing unit.
In a possible design, the scheduling unit is specifically configured to: monitor the number of tasks currently being executed by the processing unit, and when that number is smaller than the number of tasks the processing unit can execute in parallel, schedule the tasks in the ready queue to the processing unit one by one in the order in which they were acquired.
In a fourth aspect, the present application provides a chip, including a task scheduler, where the task scheduler is configured to implement the method described in any one of the designs of the first aspect above.
In a fifth aspect, the present application provides a processor, including a task scheduler and a processing unit, where the task scheduler is configured to execute the method described in any one of the designs of the first aspect above, and the processing unit is configured to execute the tasks scheduled to it by the task scheduler.
In a sixth aspect, the present application provides an electronic device, including a processor coupled to a memory, where the processor is configured to execute a computer program stored in the memory, so that the electronic device performs the method described in any one of the designs of the first aspect above.
In a seventh aspect, the present application provides a task scheduling system, including a device driver package and the processor described in the fourth aspect above, where the device driver package is configured to send N tasks to the processor, N being a positive integer, and the processor is configured to process the N tasks.
In an eighth aspect, the present application provides a computer-readable storage medium storing a computer program, where the computer program, when run, implements the method described in any one of the designs of the first aspect above.
In a ninth aspect, the present application provides a computer program product, which, when run on a processor, implements the method described in any one of the designs of the first aspect above.
For the beneficial effects of the second to ninth aspects above, refer to the technical effects achievable by the corresponding designs of the first aspect; details are not repeated here.
FIG. 1 is a schematic diagram of the system architecture of a task processing system according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a processor according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a task processing method provided in the industry;
FIG. 4 is a schematic flowchart of game task processing provided in the industry;
FIG. 5 is a schematic flowchart corresponding to the task scheduling method according to Embodiment 1 of the present application;
FIG. 6 is a schematic layout diagram of tasks having synchronization associations according to an embodiment of the present application;
FIG. 7 is a schematic layout diagram of tasks having dependency associations according to an embodiment of the present application;
FIG. 8 is a schematic flowchart of processing game tasks by using the task scheduling method of Embodiment 1 of the present application;
FIG. 9 is a schematic flowchart corresponding to the task scheduling method according to Embodiment 2 of the present application;
FIG. 10 is a schematic diagram of a task scheduling procedure according to an embodiment of the present application;
FIG. 11 is a schematic diagram of another task scheduling procedure according to an embodiment of the present application;
FIG. 12 is a schematic diagram of yet another task scheduling procedure according to an embodiment of the present application;
FIG. 13 is a schematic structural diagram of a task scheduler according to an embodiment of the present application.
The task scheduling solution disclosed in the present application can be applied to electronic devices with task processing capability. In some embodiments of the present application, the task scheduler may be an independent unit embedded in an electronic device, capable of assigning tasks to a processor core of the electronic device when that core is idle, so as to make maximum use of the processor core's power margin within its processing capability and improve its task processing capability. In other embodiments of the present application, the task scheduler may also be a unit packaged inside the electronic device and used to implement the electronic device's task scheduling function. The electronic device may be a computer device with a processor, such as a desktop computer, a personal computer, or a server. It should also be understood that the electronic device may be a portable electronic device with a processor, such as a mobile phone, a tablet computer, a wearable device with a wireless communication function (such as a smart watch), or an in-vehicle device. Exemplary embodiments of portable electronic devices include, but are not limited to, portable electronic devices running … or another operating system. The portable electronic device may also be, for example, a laptop computer with a touch-sensitive surface (such as a touch panel).
In order to make the purpose, technical solutions, and advantages of the present application clearer, the technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the accompanying drawings of the embodiments of the present invention.
It should be understood that the described embodiments are only some of the embodiments of the present invention, rather than all of them. The specific operating methods in the method embodiments can also be applied to the device embodiments or the system embodiments. Moreover, in the description of this application, "at least one" means one or more, and "multiple" means two or more; in view of this, "multiple" can also be understood as "at least two" in the embodiments of the present invention. The term "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate three cases: A exists alone, both A and B exist, and B exists alone. In addition, unless otherwise specified, the character "/" generally indicates an "or" relationship between the associated objects before and after it. It should also be understood that in the description of this application, terms such as "first" and "second" are used only to distinguish between the objects being described, and cannot be understood as indicating or implying relative importance or order.
In addition, in the embodiments of the present application, "connection" can be understood as an electrical connection, and the connection between two electrical components can be direct or indirect. For example, "A is connected to B" can mean either that A is directly connected to B, or that A is indirectly connected to B through one or more other electrical components; for instance, A is directly connected to C, C is directly connected to B, and A is thereby connected to B through C. In some scenarios, "connection" can also be understood as coupling, such as the electromagnetic coupling between two inductors. In short, the connection between A and B enables electrical energy to be transmitted between A and B.
FIG. 1 exemplarily shows a system architecture diagram of a task processing system provided by an embodiment of the present application. The illustrated task processing system 10 includes an application (APP) 100, a device development kit (DDK) 200, and a processor 300; the application 100 is connected to the device driver package 200, and the device driver package 200 is further connected to the processor 300. The device driver package 200, also called driver software, includes a kernel mode driver (KMD) 210 and a user mode driver (UMD) 220; the user mode driver is also called a user-mode graphics driver. Kernel mode and user mode are two different driver modes, and the device driver package 200 switches between them according to the type of code being run. Generally, most drivers are kernel mode drivers 210 and run in kernel mode, while some drivers are user mode drivers 220 and run in user mode. Since the kernel mode driver 210 and the user mode driver 220 are not closely related to the solution of the present application, the embodiments of the present application do not describe them in detail.
Further, FIG. 2 exemplarily shows a schematic structural diagram of a processor provided in an embodiment of the present application. The illustrated processor 300 may include one or more chips, for example a system-on-a-chip (SoC) or a chipset formed by multiple chips. The processor 300 may include at least one processing unit, such as the neural-network processing unit (NPU), graphics processing unit (GPU), and central processing unit (CPU) shown in FIG. 2, and may also include an application processor (AP), a modem processor, an image signal processor (ISP), a video codec, a digital signal processor (DSP), and/or a baseband processor. The at least one processing unit, also called at least one processing subsystem, is a core component of the processor 300 and is used to implement the processing functions of the processor 300. The different processing units of the at least one processing unit may be deployed on different chips or integrated on one chip, which is not specifically limited.
Continuing to refer to FIG. 2, the processor 300 may also include non-core components, such as general-purpose units (including counters, decoders, signal generators, etc.), accelerator units, input/output control units, interface units, internal memories, and external buffers. The internal memory and the external buffer are collectively referred to as the storage unit of the processor 300 and are used to store instructions and data. These instructions and data can be called so that, when the processor is processing tasks, a task that is neither synchronization-blocked nor dependency-blocked is selected and scheduled to a currently idle processor core. In some embodiments, the storage unit can be a cache memory, which can hold instructions or data that have just been used or are used cyclically. When such an instruction or datum is needed again, it can be called directly from the cache memory, thereby avoiding repeated access, reducing waiting time, and improving task processing efficiency.
Further exemplarily, each processing unit in the processor 300 may include one or more processor cores. For example, in the processor 300 shown in FIG. 2, the NPU includes three NPU cores, namely NPU core 1 to NPU core 3; the GPU includes five GPU cores, namely GPU core 1 to GPU core 5; and the CPU includes five CPU cores, namely CPU core 1 to CPU core 5. A processor core, also called an execution unit, is used to execute an entire task or part of a task fragment. When multiple processor cores are included, they can be divided into one or more voltage domains, and the processor cores in the same voltage domain share the same operating voltage and the same operating frequency. Moreover, the multiple processor cores in the same voltage domain can be multi-core heterogeneous, that is, have different structures and be used to process different tasks or different task fragments respectively, or multi-core homogeneous, that is, have the same structure and be used to jointly process the same task or the same task fragment; the embodiments of the present application do not specifically limit this.
Further exemplarily, the processor 300 may also include a task scheduler 310 connected to each processor core. In one example, the connection between the task scheduler 310 and each processor core can be realized through a bus system. In this way, during operation of the processor 300, after finishing a task or task fragment, each processor core can publish an idle message on the bus system; by monitoring the bus system, the task scheduler 310 learns which processor cores are currently idle and, when there is a task to be scheduled, schedules that task to an idle processor core.
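The idle-message mechanism above can be sketched in software as follows. This is a minimal illustrative model, not the patented hardware: bus messages are modeled as method calls, and all names are assumptions.

```python
from collections import deque

class TaskScheduler:
    """Dispatches pending tasks to processor cores that have announced
    themselves idle on the bus (modeled here as simple method calls)."""

    def __init__(self, cores):
        self.idle_cores = deque(cores)   # cores that published an "idle" message
        self.pending = deque()           # tasks waiting to be scheduled
        self.dispatched = []             # (task, core) pairs, for inspection

    def on_core_idle(self, core):
        # A core finished its task or task fragment and published "idle" on the bus.
        self.idle_cores.append(core)
        self._dispatch()

    def submit(self, task):
        self.pending.append(task)
        self._dispatch()

    def _dispatch(self):
        # While there is both work and an idle core, pair them up.
        while self.pending and self.idle_cores:
            task = self.pending.popleft()
            core = self.idle_cores.popleft()
            self.dispatched.append((task, core))

sched = TaskScheduler(["GPU core 1", "GPU core 2"])
sched.submit("task A")
sched.submit("task B")
sched.submit("task C")            # no idle core left: stays pending
sched.on_core_idle("GPU core 1")  # core 1 finishes and picks up task C
print(sched.dispatched)
```

The key property of the scheme is visible here: a task is never assigned eagerly to a busy core; it waits until some core announces itself idle.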
It should be understood that the illustrated processor 300 is only an example; the processor 300 may have more or fewer components than shown in the figure, may combine two or more components, or may have a different component configuration, which is not specifically limited in the embodiments of the present application.
In a specific application scenario, the above-mentioned application 100 may specifically be a program for generating images, such as a mobile phone camera, a camera, or a screen recording program. Correspondingly, the above-mentioned processor 300 may specifically be an image processor, which may contain only GPU cores and no other cores such as NPU cores or CPU cores. Moreover, when the processor 300 is an image processor, the tasks scheduled by the task scheduler 310 are tasks related to image processing. For example, in a video shooting scenario, a video is obtained by shooting and combining images frame by frame; after being captured, each frame usually needs to undergo operations such as Gaussian filtering, white balance, image denoising, image enhancement, image segmentation, and image rendering, and the processing of different frames usually needs to follow the shooting order to avoid long and short frames. In this process, the entire processing of a frame can be treated as one task, or each of the processing operations (Gaussian filtering, white balance, image denoising, image enhancement, image segmentation, or image rendering) can be treated as a separate task. In addition, there are certain association relationships between the tasks. For example, Gaussian filtering, white balance, image denoising, image enhancement, and image segmentation are all performed on the entire image and can be executed in parallel; that is, there are no dependencies among these operations. Image rendering, on the other hand, renders each small tile produced by image segmentation separately and can only be performed after image segmentation; therefore, image rendering depends on image segmentation.
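The per-frame association relationships described above can be written down as a small dependency graph. This is an illustrative sketch only; the task names are not the patent's, and the patent does not prescribe this representation.

```python
# Per-frame dependency relation described above: the five whole-image
# operations have no dependencies among themselves, while rendering
# depends on segmentation.
deps = {
    "gaussian_filter": set(),
    "white_balance":   set(),
    "denoise":         set(),
    "enhance":         set(),
    "segmentation":    set(),
    "rendering":       {"segmentation"},
}

def runnable(done):
    """Tasks not yet done whose dependencies are all satisfied; these can run in parallel."""
    return {t for t, d in deps.items() if t not in done and d <= done}

print(runnable(set()))              # everything except rendering
print(runnable({"segmentation"}))   # rendering is now released
```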
Further, taking the video shooting scenario as an example, after capturing images frame by frame, the application 100 creates an image queue and records commands, and then sends the image queue to the device driver package 200. The device driver package 200 parses the commands in the image queue, translates them into tasks recognizable by the processor 300, and delivers them to the task scheduler 310 in the processor 300 in the form of tasks, task chains, or command streams. The task scheduler 310 monitors the working status of each GPU core in the processor 300 and, when it determines that an idle GPU core exists, sends a pending task to the idle GPU core for processing. A pending task may be sent to a single GPU core for separate processing, or to multiple GPU cores for joint processing; when sent to multiple GPU cores, those cores may belong to the same voltage domain or to different voltage domains, which is not specifically limited.
Furthermore, there are usually dependencies between tasks; for example, a task may be executable only after other tasks have started or finished, which imposes requirements on the execution order of the tasks. However, if the task scheduler 310 executed all received tasks strictly in order, tasks that do not depend on other tasks could be blocked, which is clearly not conducive to improving GPU core utilization. To address this problem, FIG. 3 exemplarily shows a flow chart of a task processing method provided by the industry. As shown in FIG. 3, this method pre-configures multiple task queues in the task scheduler 310, such as task queue L1, task queue L2, ..., task queue Lm, where m is a positive integer. Each task queue corresponds to one service; for example, task queue L1 corresponds to the Gaussian filtering service, task queue L2 corresponds to the white balance service, ..., and task queue Lm corresponds to the image segmentation and image rendering services. Under this configuration, after parsing out multiple tasks, the device driver package 200 distributes each task to its corresponding task queue according to the service to which it belongs. When invoking GPU cores, the task scheduler 310 calls idle GPU cores to process the tasks in the task queues according to the principle of executing tasks from different task queues in parallel and executing tasks within a single task queue serially.
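The per-queue scheduling rule of FIG. 3 can be sketched as follows, assuming one queue per service; queue contents and names are illustrative, not from the patent.

```python
# Industry scheme of FIG. 3: one queue per service. The head task of each
# queue may run in parallel with the heads of other queues, but within a
# queue execution is strictly serial.
queues = {
    "L1": ["gauss_f1", "gauss_f2"],   # Gaussian filtering service
    "L2": ["wb_f1", "wb_f2"],         # white balance service
    "Lm": ["seg_f1", "render_f1"],    # segmentation + rendering service
}

def schedulable_heads(queues):
    """Only the first unfinished task of each queue is eligible for dispatch."""
    return [q[0] for q in queues.values() if q]

print(schedulable_heads(queues))  # exactly one candidate per non-empty queue
```

Note that `gauss_f2` is not a candidate even if it depends on nothing: serial order within a queue hides it behind `gauss_f1`, which is precisely the limitation the next paragraphs analyze.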
With the task processing method illustrated in FIG. 3, although tasks in different task queues can be executed in parallel, tasks in the same task queue must be executed serially. That is, within one task queue, a later task must wait until all tasks ahead of it have finished before it can execute; even if the later task does not depend on the earlier tasks and idle GPU cores exist in the processor, the idle GPU cores cannot be called to process the later task in advance. For example, suppose a game contains a Binning service and a Rendering service, where the Binning service includes the three tasks Binning1, Binning2, and Binning3, the Rendering service includes the three tasks Rendering1, Rendering2, and Rendering3, and the dependencies between the tasks are: Binning2 depends on Rendering1, and Renderingi depends on Binningi, where i takes the value 1, 2, or 3. Then:
FIG. 4 shows a flow chart of processing these game tasks using the task scheduling method provided by the industry. As shown in FIG. 4, Binning1 to Binning3 are stored in one task queue and Rendering1 to Rendering3 in another, and the task scheduler calls GPU cores to execute the Binning tasks and Rendering tasks in these two queues in parallel. Since Binning2 depends on Rendering1, and Rendering1 in turn depends on Binning1, Binning2 cannot execute until Rendering1 has finished, and Rendering1 cannot execute until Binning1 has finished. Therefore, there must be a gap 1 between Binning1 and Binning2, a gap 2 between Rendering1 and Rendering2, and a gap 3 before Rendering1. It can be seen that, in this processing scheme, even though Rendering3 does not depend on Rendering2, Rendering2 does not depend on Rendering1, and Binning3 does not depend on Binning1 or Binning2, tasks within one queue are executed sequentially, so Rendering3, Rendering2, and Binning3 cannot be executed in advance to fill gaps 1, 2, or 3. Obviously, this method cannot effectively improve the utilization of GPU cores.
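The cost of the gaps can be made concrete with a toy unit-time simulation on two cores, comparing the industry scheme (true dependencies plus intra-queue serial order) with scheduling by dependencies alone. This is a minimal sketch under assumed unit task durations, not the patent's method.

```python
def finish_times(preds):
    """Greedy unit-time schedule on two cores honoring predecessor sets."""
    done, t, finish = set(), 0, {}
    while len(done) < len(preds):
        ready = [x for x in preds if x not in done and preds[x] <= done]
        step = ready[:2]          # at most two cores per time step
        for x in step:
            finish[x] = t + 1
        done |= set(step)
        t += 1
    return finish

# True dependencies of the game example: Binning2 depends on Rendering1,
# and Rendering_i depends on Binning_i (i = 1, 2, 3).
deps = {
    "Binning1": set(), "Binning2": {"Rendering1"}, "Binning3": set(),
    "Rendering1": {"Binning1"}, "Rendering2": {"Binning2"},
    "Rendering3": {"Binning3"},
}

# Industry scheme: add intra-queue serial order on top of the dependencies.
serial = {k: set(v) for k, v in deps.items()}
serial["Binning2"] |= {"Binning1"}
serial["Binning3"] |= {"Binning2"}
serial["Rendering2"] |= {"Rendering1"}
serial["Rendering3"] |= {"Rendering2"}

print(max(finish_times(serial).values()))  # 5 steps: gaps cannot be filled
print(max(finish_times(deps).values()))    # 4 steps: Binning3/Rendering3 fill gaps
```

Even in this six-task example the artificial serial order costs one full time step; with longer queues the wasted core time grows accordingly.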
In view of this, an embodiment of the present application provides a task scheduling method in which a task scheduler maintains the scheduling order of the tasks on the hardware side. Once it is determined that a task has no dependencies or that its dependencies have been resolved, the task can be delivered in advance to an idle processing unit for processing, so that all hardware resources of the processing unit are fully used to process tasks with high concurrency, GPU cores are kept from idling as much as possible, and the utilization of the processing unit is effectively improved.
The following describes how the above technical problems are solved through specific embodiments of the present application. It should be noted that, in the following description, the task scheduler can be the task scheduler 310 in FIG. 2, or a communication apparatus capable of supporting the processor in implementing the functions required by the method; it can of course also be another communication apparatus or communication system, such as a chip, a chip system, a circuit, or a circuit system, which is not specifically limited.
Embodiment 1
FIG. 5 exemplarily shows a flow chart corresponding to the task scheduling method provided in Embodiment 1 of the present application; the method is applicable to a task scheduler, such as the task scheduler 310 shown in FIG. 2. As shown in FIG. 5, the method includes:
Step 501: The task scheduler obtains N tasks and the association relationships of the N tasks, where N is a positive integer.
The N tasks may be delivered to the task scheduler by the device driver package, and the association relationships of the N tasks may be delivered to the task scheduler by the device driver package, or may be obtained by the task scheduler from other channels, for example by accessing a service system, which is not specifically limited.
In one example, the N tasks and their association relationships may be delivered to the task scheduler by the device driver package in the form of tasks, task chains, or command streams. Specifically, the association relationship between each task and the other tasks may be explicitly written into the data structure of that task as configuration information. Therefore, after obtaining the N tasks, the task scheduler can parse the data structure of each task to learn what kind of association each task has with the other tasks.
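One way to picture such a per-task data structure is sketched below. All field names are assumptions for illustration; the patent does not disclose a concrete layout.

```python
from dataclasses import dataclass, field

# Hypothetical per-task descriptor: the association configuration is written
# explicitly into each task's data structure, as described above.
@dataclass
class TaskDescriptor:
    task_id: int
    group_id: int = 0                   # whole group by default
    forward_sync: bool = False          # depends on all earlier tasks in its group
    backward_sync: bool = False         # all later tasks in its group depend on it
    depends_on: list = field(default_factory=list)  # explicit partial dependencies

def parse_associations(tasks):
    """What the scheduler learns by parsing each task's data structure."""
    return {t.task_id: {"group": t.group_id,
                        "fwd": t.forward_sync,
                        "bwd": t.backward_sync,
                        "deps": list(t.depends_on)} for t in tasks}

tasks = [TaskDescriptor(1),
         TaskDescriptor(2, forward_sync=True),
         TaskDescriptor(3, depends_on=[1])]
info = parse_associations(tasks)
print(info[2]["fwd"], info[3]["deps"])
```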
In the embodiments of the present application, the association relationships of the N tasks may include synchronization associations and/or dependency associations, which are described in detail below.
Synchronization association
A synchronization association indicates the relationship between a task and all other tasks delivered earlier or later than that task. Specifically, synchronization associations include forward synchronization associations and backward synchronization associations. A forward synchronization association means that a task depends on all other tasks delivered earlier than it; that is, the task can be executed only after every other task delivered earlier has finished. A task with a forward synchronization association to other tasks is also called a forward synchronization task; as long as any task delivered earlier than the forward synchronization task has not finished, the forward synchronization task is synchronization-blocked by that task. Conversely, a backward synchronization association means that all other tasks delivered later than a task depend on that task; that is, each other task delivered later can be executed only after that task has finished. A task with a backward synchronization association to other tasks is also called a backward synchronization task; as long as the backward synchronization task has not finished, all other tasks delivered later than it are synchronization-blocked by the backward synchronization task.
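The two blocking rules above can be sketched as simple predicates. Tasks are identified by integer ids reflecting delivery order within one group; function and variable names are illustrative assumptions.

```python
# Sketch of the synchronization-blocking rules described above.

def forward_sync_blocked(task_id, all_ids, finished):
    """A forward synchronization task is blocked while ANY task delivered
    earlier than it (smaller id in the same group) is unfinished."""
    return any(i not in finished for i in all_ids if i < task_id)

def blocked_by_backward(task_id, backward_sync_ids, finished):
    """A plain task is blocked while some backward synchronization task
    delivered earlier than it has not finished."""
    return any(b < task_id and b not in finished for b in backward_sync_ids)

ids = [1, 2, 3, 4]
print(forward_sync_blocked(3, ids, finished={1}))        # True: task 2 unfinished
print(forward_sync_blocked(3, ids, finished={1, 2}))     # False: all earlier done
print(blocked_by_backward(4, [2], finished=set()))       # True: sync task 2 pending
print(blocked_by_backward(4, [2], finished={2}))         # False: released
```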
It should be noted that, in the embodiments of the present application, the N tasks belong to one whole group, and at least two of the N tasks may additionally belong to at least one subgroup. Specifically, when only the whole group exists in the task scheduler and no subgroup exists, all N tasks belong only to the whole group; when both the whole group and subgroups exist in the task scheduler, a task may belong only to the whole group, or to the whole group and one or more subgroups at the same time. On this basis, in the above synchronization associations, the other tasks delivered earlier than a forward synchronization task specifically refer to: when a task is marked as a forward synchronization task of the whole group, the other tasks in the whole group delivered earlier than that forward synchronization task; when a task is marked as a forward synchronization task of a subgroup, the other tasks in that subgroup delivered earlier than that forward synchronization task. Similarly, the other tasks delivered later than a backward synchronization task specifically refer to: when a task is marked as a backward synchronization task of the whole group, the other tasks in the whole group delivered later than that backward synchronization task; when a task is marked as a backward synchronization task of a subgroup, the other tasks in that subgroup delivered later than that backward synchronization task.
Further, the grouping of the N tasks may be performed by the device driver package and notified to the task scheduler by being carried in the tasks' data structures, or may be performed by the task scheduler itself; the grouping basis may be, for example, service characteristics or synchronization associations. Taking grouping of tasks by the device driver package as an example:
In one example, the device driver package determines, according to the service characteristics of the N tasks, the tasks belonging to the same service, marks the same group identifier in the data structures of those tasks, and sends them to the task scheduler. After receiving the N tasks, the task scheduler parses their data structures, obtains the tasks carrying the same group identifier, creates a corresponding virtual subgroup, and stores those tasks in that virtual subgroup. The group identifier may be a service name, a service code, a group number, or another mark that can identify the same service, which is not specifically limited. In this example, by grouping tasks according to service characteristics, even if the tasks in one group are synchronization-blocked, the tasks in other groups are not affected; that is, the synchronization blocking of one service does not affect the execution of other services. This decouples the task execution associations of the services and reduces the mutual interference between them.
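The virtual-subgroup step above amounts to bucketing tasks by their group identifier. A minimal sketch, with illustrative (task_id, group_id) pairs:

```python
from collections import defaultdict

# (task_id, group_id) pairs as carried in the tasks' data structures;
# group identifiers here are illustrative service names.
tasks = [(1, "white_balance"), (2, "gauss"), (3, "white_balance"),
         (4, "gauss"), (5, "render")]

subgroups = defaultdict(list)
for task_id, group_id in tasks:
    subgroups[group_id].append(task_id)   # one virtual subgroup per identifier

# Synchronization blocking inside one subgroup leaves the others untouched.
print(dict(subgroups))
```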
In another example, the device driver package determines, according to the association relationships of the N tasks, the tasks whose association relationships are relatively concentrated, marks the same group identifier in the data structures of those tasks, and sends them to the task scheduler, which creates a virtual subgroup according to the method in the previous example. For example, if it is determined that a task depends on at least three tasks, the device driver package may determine that task and the at least three tasks it depends on as one subgroup, mark the same group identifier in the data structures of these at least four tasks, and additionally mark the task as a forward synchronization task in its data structure. Alternatively, if it is determined that a task is depended on by at least three tasks, the device driver package may determine that task and the at least three tasks that depend on it as one subgroup, mark the same group identifier in the data structures of these at least four tasks, and additionally mark the task as a backward synchronization task in its data structure. In this example, by dividing the tasks with relatively concentrated association relationships into one subgroup, only the forward synchronization task or the backward synchronization task in that subgroup needs to be marked to clearly indicate the associations among the tasks in the subgroup, without marking the dependency relationship of every task. This streamlines the data structure of the entire task chain and reduces the communication overhead between the device driver package and the task scheduler.
It should be noted that, for the specific implementation of grouping performed by the task scheduler, refer to the above description of grouping by the device driver package, which is not repeated here. In addition, the device driver package or the task scheduler may also group tasks according to other characteristics, such as characteristics indicated by a user, which is not specifically limited in the embodiments of the present application.
Dependency association
Dependency associations include partial dependency associations and serial dependency associations. A partial dependency association means that a task depends on the execution results of one or more other tasks; the task can be executed only after all of the one or more other tasks it depends on have finished. It should be noted that, since a synchronization association already defines a task's dependence on, or being depended on by, all other tasks earlier or later than it, a partial dependency association can, on top of the synchronization associations, only further define that a task depends on some, rather than all, of the other tasks; for this reason, this dependency relationship is called a partial dependency association. Correspondingly, a serial dependency association means that a task depends on the execution of one or more other tasks; the task can be executed only after all of the other tasks it depends on have started executing. Moreover, since a serial dependency association does not require the other tasks it depends on to have finished, the tasks that a serial dependency association depends on may be all of the other tasks or only some of them.
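The distinction between the two dependency kinds reduces to which task state releases the waiter: "finished" for a partial dependency, "started" for a serial dependency. A minimal sketch, with illustrative names:

```python
# Partial dependency: released only when the depended-on tasks have FINISHED
# (their execution results are needed).
def partial_dep_released(dep_ids, finished):
    return all(d in finished for d in dep_ids)

# Serial dependency: released as soon as the depended-on tasks have STARTED.
def serial_dep_released(dep_ids, started):
    return all(d in started for d in dep_ids)

started, finished = {1, 2}, {1}
print(partial_dep_released({1, 2}, finished))  # False: task 2 has only started
print(serial_dep_released({1, 2}, started))    # True: both have begun
```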
In the embodiments of the present application, definition by synchronization association only requires defining the forward synchronization task or the backward synchronization task: this simultaneously defines that the forward synchronization task depends on all other tasks delivered earlier than it, or that all other tasks delivered later than the backward synchronization task depend on it. If the same definition were made by dependency association, all tasks that the task depends on would have to be defined. In other words, compared with dependency associations, synchronization associations can define the dependency relationships among multiple tasks with more compact content. On this basis, the above dependency associations may be identified and marked after the synchronization associations have been determined; that is, the synchronization associations of the tasks are determined first, and after that determination is completed, dependency associations are marked for the remaining tasks that have dependencies. In this way, the data structures of the N tasks contain all the association relationships without an association description being required for each task, which helps simplify the data structures.
Step 502: The task scheduler determines, according to the association relationships among the N tasks, those of the N tasks that have no dependencies, or whose dependencies have been released, as schedulable tasks.
Exemplarily, when the association relationships of the N tasks include the above synchronization association, the task scheduler may determine, according to the association relationships of the N tasks, a task that has a synchronization association with other tasks; when the other tasks having a synchronization association with that task have finished executing, the task is determined as a schedulable task. Here, "the other tasks having a synchronization association with the task have finished executing" may exemplarily include at least one of the following: the task is a forward synchronization task and all other tasks acquired earlier than the task have finished executing; or the task is a task acquired later than a backward synchronization task and the backward synchronization task has finished executing. And/or,
when the association relationships of the N tasks include the above dependency association, the task scheduler may determine, according to the association relationships of the N tasks, a task that has a dependency association with other tasks; when the other tasks having a dependency association with that task have all been executed or have all finished executing, the task is determined as a schedulable task.
In the above examples, by monitoring the execution status of tasks that have synchronization associations and/or dependency associations, a task can be scheduled to the processing unit in a timely manner once the other tasks associated with it no longer block it.
In a specific implementation, the task scheduler may determine, according to the association relationships of the N tasks, those of the N tasks that are neither synchronization-blocked nor dependency-blocked by other tasks as schedulable tasks. A task not being synchronization-blocked by other tasks means that the task is no longer blocked by the tasks that have a synchronization association with it, that is, the other tasks having a synchronization association with the task have finished executing. A task not being dependency-blocked by other tasks means that the task is not blocked by the tasks that have a dependency association with it, that is, the other tasks having a dependency association with the task have finished executing (or, in the case of a serial dependency association, have all started executing).
In the above content, a task being synchronization-blocked by other tasks means that the task satisfies at least one of the following conditions:
Condition 1: the task is a forward synchronization task in the whole group, and not all of the other tasks in the whole group issued earlier than the task have finished executing.
Condition 2: the task is a forward synchronization task in a subgroup, and not all of the other tasks in that subgroup issued earlier than the task have finished executing.
Condition 3: the task is a task in the whole group issued later than a backward synchronization task, and the backward synchronization task has not finished executing.
Condition 4: the task is a task in a subgroup issued later than a backward synchronization task, and the backward synchronization task has not finished executing.
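Conditions 1 to 4 can be condensed into a single predicate. The sketch below is a non-authoritative illustration: each task is modeled as a tuple `(forward_sync, backward_sync, subgroup)` keyed by its issue index, `done` holds the indices of finished tasks, and `subgroup=None` means the task exists only in the whole group.

```python
def in_scope(tasks, j, subgroup):
    """Task j falls in the 'whole group' scope (subgroup None) always,
    and in a subgroup's scope only if it carries the same subgroup."""
    return subgroup is None or tasks[j][2] == subgroup

def sync_blocked(tasks, i, done):
    """True if task i satisfies at least one of Conditions 1 to 4."""
    fwd, _, grp = tasks[i]
    # Conditions 1 and 2: a forward synchronization task is blocked while
    # any earlier task of its scope has not finished executing.
    if fwd and any(j < i and in_scope(tasks, j, grp) and j not in done
                   for j in tasks):
        return True
    # Conditions 3 and 4: an unfinished backward synchronization task
    # blocks every later task of its scope.
    for j, (_, bwd, g) in tasks.items():
        if bwd and j < i and j not in done and (g is None or g == grp):
            return True
    return False
```

Against the layout of FIG. 6(A), where Task 3 is a whole-group backward synchronization task, this predicate reports Tasks 4 to 6 as blocked until Task 3 finishes.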
For example, FIG. 6 exemplarily shows a schematic diagram of a task layout with synchronization associations according to an embodiment of this application. In this example there are seven tasks, Task 0 to Task 6, issued to the task scheduler in the order Task 0, Task 1, Task 2, Task 3, Task 4, Task 5, Task 6. Refer to FIG. 6(A) through FIG. 6(F):
In FIG. 6(A), Task 0 to Task 6 are not grouped; that is, they exist only in the whole group. In the task queue illustrated in FIG. 6(A), Task 3 is marked as a backward synchronization task, meaning that Task 4, Task 5 and Task 6, issued later than Task 3, can be executed only after Task 3 finishes. Therefore, as long as Task 3 has not finished executing, Task 4, Task 5 and Task 6 satisfy Condition 3 above; that is, they are synchronization-blocked.
FIG. 6(B) likewise leaves Task 0 to Task 6 ungrouped; that is, they exist only in the whole group. In the task queue illustrated in FIG. 6(B), Task 3 is marked as a forward synchronization task, meaning that Task 3 can be executed only after Task 0, Task 1 and Task 2, issued earlier than Task 3, have all finished executing. Therefore, as long as at least one of Task 0, Task 1 and Task 2 has not finished executing, Task 3 satisfies Condition 1 above and is synchronization-blocked.
FIG. 6(C) likewise leaves Task 0 to Task 6 ungrouped. In the task queue illustrated in FIG. 6(C), Task 3 is marked as both a forward synchronization task and a backward synchronization task: Task 3 can be executed only after Task 0, Task 1 and Task 2, issued earlier than Task 3, have all finished executing, and Task 4, Task 5 and Task 6, issued later than Task 3, can be executed only after Task 3 finishes. Therefore, as long as at least one of Task 0, Task 1 and Task 2 has not finished executing, Task 3 satisfies Condition 1 above while Task 4, Task 5 and Task 6 satisfy Condition 3 above, so Task 3, Task 4, Task 5 and Task 6 are all synchronization-blocked.
FIG. 6(D) places Task 3, Task 5 and Task 6 in the same subgroup, so these three tasks exist in both the whole group and that subgroup, while Task 0, Task 1, Task 2 and Task 4 exist in the whole group but not in that subgroup. In the task queue illustrated in FIG. 6(D), Task 3 is marked as the backward synchronization task of the subgroup, meaning that Task 5 and Task 6, issued later than Task 3 within the subgroup, can be executed only after Task 3 finishes. Therefore, as long as Task 3 has not finished executing, Task 5 and Task 6 satisfy Condition 4 above; that is, they are synchronization-blocked.
FIG. 6(E) places Task 0, Task 1 and Task 3 in the same subgroup, so these three tasks exist in both the whole group and that subgroup, while Task 2, Task 4, Task 5 and Task 6 exist in the whole group but not in that subgroup. In the task queue illustrated in FIG. 6(E), Task 3 is marked as the forward synchronization task of the subgroup, meaning that Task 3 can be executed only after Task 0 and Task 1, issued earlier than Task 3 within the subgroup, have both finished executing. Therefore, as long as at least one of Task 0 and Task 1 has not finished executing, Task 3 satisfies Condition 2 above and is synchronization-blocked.
FIG. 6(F) places Task 0, Task 1, Task 3, Task 5 and Task 6 in the same subgroup, so these five tasks exist in both the whole group and that subgroup, while Task 2 and Task 4 exist in the whole group but not in that subgroup. In the task queue illustrated in FIG. 6(F), Task 3 is marked as both the forward synchronization task and the backward synchronization task of the subgroup: Task 3 can be executed only after Task 0 and Task 1, issued earlier than Task 3 within the subgroup, have both finished executing, and Task 5 and Task 6, issued later than Task 3 within the subgroup, can be executed only after Task 3 finishes. Therefore, as long as at least one of Task 0 and Task 1 has not finished executing, Task 3 satisfies Condition 2 above while Task 5 and Task 6 satisfy Condition 4 above, so Task 3, Task 5 and Task 6 are all synchronization-blocked.
It should be noted that "a task exists in the whole group but not in the subgroup", as described above, may mean that the task exists only in the whole group and in no subgroup at all, or that the task exists in the whole group and in some other subgroup; this is not specifically limited. In addition, to reduce mutual interference between the execution of tasks in different subgroups, a task is usually placed in at most one subgroup rather than in two or more subgroups at the same time; that is, when multiple subgroups exist, the tasks in the multiple subgroups are all different.
Correspondingly, in the above content, a task being dependency-blocked by other tasks means that the task satisfies at least one of the following conditions:
Condition 1: the task has a partial dependency association with other tasks, and not all of the depended-on tasks have finished executing;
Condition 2: the task has a serial dependency association with other tasks, and not all of the depended-on tasks have started executing.
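The two dependency-blocking conditions reduce to set-membership tests. In this illustrative sketch, `started` and `done` are the sets of issue indices of tasks that have started and finished executing, with `done` a subset of `started`; the names are assumptions, not taken from this application.

```python
def dependency_blocked(partial_deps, serial_deps, started, done):
    """Condition 1: a partial dependency blocks until every depended-on
    task has finished executing. Condition 2: a serial dependency blocks
    only until every depended-on task has started executing."""
    if any(d not in done for d in partial_deps):
        return True
    if any(d not in started for d in serial_deps):
        return True
    return False
```

Note the asymmetry the document describes: under a partial dependency a started-but-unfinished task still blocks, while under a serial dependency it no longer does.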
For example, FIG. 7 exemplarily shows a schematic diagram of a task layout with dependency associations according to an embodiment of this application. In this example there are eight tasks, Task 0 to Task 7, and they are not grouped; that is, they exist only in the whole group. Within the whole group, when Task 0, Task 1, Task 3 and Task 4 have the above partial dependency association with Task 6: if Task 6 has not started executing, or has started but not yet finished, then Task 0, Task 1, Task 3 and Task 4 are dependency-blocked by Task 6. When Task 0, Task 1, Task 3 and Task 4 instead have the above serial dependency association with Task 6: if Task 6 has not started executing, they are likewise dependency-blocked by Task 6; but once Task 6 has started executing, regardless of whether it has finished, Task 0, Task 1, Task 3 and Task 4 are no longer dependency-blocked by Task 6.
Further exemplarily, based on the above, the task scheduler may determine, in the following manner, the schedulable tasks among the N tasks that are neither synchronization-blocked nor dependency-blocked by other tasks:
After obtaining the N tasks, the task scheduler traverses, in the order in which the N tasks were issued, each of the N tasks that has not yet been determined to be blocked. When traversing each such task: if the task is a forward synchronization task in the whole group and not all of the other tasks in the whole group issued earlier than it have finished executing, the task is determined to be synchronization-blocked by those earlier tasks; if the task is a forward synchronization task in a subgroup and not all of the other tasks in that subgroup issued earlier than it have finished executing, the task is determined to be synchronization-blocked by those earlier tasks; if the task is a backward synchronization task in the whole group, all other tasks in the whole group issued later than it are determined to be synchronization-blocked by the task; if the task is a backward synchronization task in a subgroup, all other tasks in that subgroup issued later than it are determined to be synchronization-blocked by the task; if the task has a partial dependency association with other tasks and not all of the depended-on tasks have finished executing, the task is determined to be dependency-blocked; if the task has a serial dependency association with other tasks and not all of the depended-on tasks have started executing, the task is determined to be dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, the task is determined as a schedulable task.
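The traversal just described can be sketched as follows. Each task is modeled as a dict with illustrative keys `fwd`, `bwd`, `grp`, `partial` and `serial` (assumed names); `started` and `done` are sets of issue indices, with `done` a subset of `started`, and tasks already dispatched are skipped. The backward-synchronization branch applies the shortcut of marking every later task of the scope as blocked without analyzing each one individually.

```python
def find_schedulable(tasks, started, done):
    blocked = set()            # tasks determined to be blocked in this pass
    schedulable = []
    for i in sorted(tasks):    # traverse in issue order
        t = tasks[i]
        # Backward synchronization: while this task is unfinished, every
        # later task of its scope is directly determined to be blocked.
        if t["bwd"] and i not in done:
            for j in tasks:
                if j > i and (t["grp"] is None or tasks[j]["grp"] == t["grp"]):
                    blocked.add(j)
        if i in blocked or i in started:
            continue           # already determined blocked, or already running
        # Forward synchronization: wait for every earlier task of the scope.
        scope = lambda j: t["grp"] is None or tasks[j]["grp"] == t["grp"]
        sync = t["fwd"] and any(j < i and scope(j) and j not in done
                                for j in tasks)
        # Dependency blocking: partial waits for completion, serial for start.
        dep = (any(d not in done for d in t["partial"])
               or any(d not in started for d in t["serial"]))
        if not sync and not dep:
            schedulable.append(i)
    return schedulable
```

Run over the six game tasks of FIG. 8 (with 0 to 5 standing for Binning1, Rendering1, Binning2, Rendering2, Binning3, Rendering3), the first pass yields Binning1 and Binning3, consistent with the walk-through of that figure.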
It should be noted that, in the above judgment procedure, as long as a task is a backward synchronization task of the whole group or of a subgroup, the other tasks of that whole group or subgroup issued later than it can be directly determined to be synchronization-blocked. It can be seen that, by configuring synchronization associations between tasks, once one backward synchronization task of a whole group or subgroup is found not to have finished executing, none of the other tasks of that whole group or subgroup issued later than it requires any further synchronization-blocking analysis. This greatly simplifies the synchronization-blocking judgment flow, effectively improves the efficiency of that judgment, and in turn helps improve the efficiency of task scheduling.
In addition, the two judgment operations above, judging whether a task is synchronization-blocked and judging whether it is dependency-blocked, may be performed serially or in parallel. Moreover, as soon as either judgment determines that the task is blocked, the other judgment can be terminated immediately instead of continuing to run, which avoids unnecessary computation, effectively saves computing resources, and further improves the efficiency of task scheduling.
Step 503: The task scheduler schedules the schedulable tasks to the processing unit.
In the embodiments of this application, the task scheduler may monitor in real time the number of tasks currently being executed by the GPU core, and, upon determining that this number is smaller than the number of tasks the GPU core can execute in parallel, schedule a schedulable task to the GPU core. When there are multiple schedulable tasks, the task scheduler may schedule them to the GPU core one by one in the order in which they were issued by the device driver package, so as to ensure that the tasks are executed in image-processing order, prevent a later frame from being displayed before an earlier frame, and effectively avoid the long-and-short-frame phenomenon.
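The dispatch rule of Step 503 can be sketched as a small loop: schedulable tasks, already held in the order the device driver package issued them, are handed to the core only while the number of running tasks stays below the core's parallel capacity. The `capacity` parameter is an assumption standing in for the GPU core's number of parallel tasks.

```python
from collections import deque

def dispatch(schedulable, running_count, capacity):
    """Return the tasks to hand to the GPU core now, preserving issue
    order so that frames are processed in image-processing order."""
    queue = deque(schedulable)
    dispatched = []
    while queue and running_count + len(dispatched) < capacity:
        dispatched.append(queue.popleft())
    return dispatched
```

Because the queue is consumed from the front, an earlier-issued frame task can never be overtaken by a later one at dispatch time.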
In Embodiment 1 above, the scheduling order of the N tasks is maintained on the hardware side by the task scheduler rather than specified on the software side by the device driver package. On one hand, tasks that have no dependencies, or whose dependencies have been released, can be issued to the processing unit for processing ahead of time according to the actual execution status of the tasks, effectively improving the utilization of the processing unit. On the other hand, the hardware side no longer needs to send a notification message to the software side after each task finishes executing to instruct the software side to issue a new task, which greatly reduces the workload on the software side, effectively improves the efficiency of task scheduling, and saves communication overhead.
Likewise for the game processing scenario illustrated in FIG. 4 above, FIG. 8 exemplarily shows a flow of processing the game tasks with the task scheduling method of Embodiment 1. FIG. 8(A) shows the issue order and dependency relationships of the six tasks Binning1 to Binning3 and Rendering1 to Rendering3: the six tasks are issued by the device driver package to the task scheduler in the order Binning1, Rendering1, Binning2, Rendering2, Binning3, Rendering3, and the task scheduler stores the six tasks in one whole group without subgrouping. Correspondingly, FIG. 8(B) shows one possible way of processing the tasks according to the task scheduling method of Embodiment 1. Referring to FIG. 8(B), the task scheduler may schedule the tasks through the following steps:
Step 1: the task scheduler judges, in issue order, whether Binning1, Rendering1, Binning2, Rendering2, Binning3 and Rendering3 are synchronization-blocked or dependency-blocked:
First, Binning1 is analyzed: Binning1 is not a forward synchronization task and the whole group contains no backward synchronization task, so Binning1 is not synchronization-blocked; and Binning1 depends on no other task, so it is not dependency-blocked either. Binning1 is therefore determined to be a dependency-free task, and the task scheduler schedules Binning1 to the GPU core.
Next, Rendering1 is analyzed: Rendering1 is not a forward synchronization task and the whole group contains no backward synchronization task, so Rendering1 is not synchronization-blocked; however, Rendering1 depends on Binning1, so until Binning1 finishes executing, Rendering1 is dependency-blocked by Binning1 and the task scheduler does not schedule it.
Then, Binning2 is analyzed: Binning2 is not a forward synchronization task and the whole group contains no backward synchronization task, so Binning2 is not synchronization-blocked; however, Binning2 depends on Rendering1, so until Rendering1 finishes executing, Binning2 is dependency-blocked by Rendering1 and the task scheduler does not schedule it either.
Then, Rendering2 is analyzed: Rendering2 is not a forward synchronization task and the whole group contains no backward synchronization task, so Rendering2 is not synchronization-blocked; however, Rendering2 depends on Binning2, so until Binning2 finishes executing, Rendering2 is dependency-blocked by Binning2 and the task scheduler does not schedule it.
After that, Binning3 is analyzed: Binning3 is not a forward synchronization task and the whole group contains no backward synchronization task, so Binning3 is not synchronization-blocked; and Binning3 depends on no other task, so it is not dependency-blocked either. Binning3 is therefore determined to be a dependency-free task, and the task scheduler schedules Binning3 to the GPU core.
Finally, Rendering3 is analyzed: Rendering3 is not a forward synchronization task and the whole group contains no backward synchronization task, so Rendering3 is not synchronization-blocked; however, Rendering3 depends on Binning3, so until Binning3 finishes executing, Rendering3 is dependency-blocked by Binning3 and the task scheduler does not schedule it.
Based on the above analyses, the task scheduler first schedules Binning1 and Binning3 to the GPU core for processing.
Step 2: assuming Binning1 finishes executing first, Rendering1, which depends on Binning1, is released from its dependency, so the task scheduler then schedules Rendering1 to the GPU core; at this point the GPU core processes Binning3 and Rendering1 in parallel.
Step 3: assuming Binning3 then finishes executing, Rendering3, which depends on Binning3, is released from its dependency, so the task scheduler then schedules Rendering3 to the GPU core; at this point the GPU core processes Rendering1 and Rendering3 in parallel.
Step 4: assuming Rendering1 then finishes executing, Binning2, which depends on Rendering1, is released from its dependency, so the task scheduler then schedules Binning2 to the GPU core; at this point the GPU core processes Rendering3 and Binning2 in parallel.
Step 5: assuming Binning2 then finishes executing, Rendering2, which depends on Binning2, is released from its dependency, so the task scheduler then schedules Rendering2 to the GPU core; at this point the GPU core processes Rendering3 and Rendering2 in parallel.
Step 6: once Rendering3 and Rendering2 have both been processed, all six tasks are complete.
It can be seen that throughout Steps 1 to 6 the GPU core processes two tasks in parallel without interruption and is essentially never idle, so the utilization of the GPU core is considerably increased and the performance of task scheduling is considerably improved.
It should be noted that FIG. 8(B) shows only one possible schedule; other schedules may occur in practice. For example, in Step 2 or Step 3 above, if Rendering1 finishes executing first instead, then Binning2, which depends on Rendering1, is released from its dependency and the task scheduler may schedule Binning2 to the GPU core; in that case the GPU core processes Binning3 and Binning2 in parallel (in Step 2) or Rendering3 and Binning2 in parallel (in Step 3). As another example, in Step 4 or Step 5 above, if Rendering3 finishes executing first, none of the remaining unprocessed tasks has yet been released from its dependencies, so the task scheduler schedules no new task and instead waits for Rendering1 or Binning2 to finish before scheduling the then-released Binning2 or Rendering2 to the GPU core. Although the GPU core is partly idle during that period, the utilization of the GPU core is still much higher than with the three gaps illustrated in FIG. 4. It should be understood that many other schedules are possible, and the embodiments of this application do not enumerate them all.
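The schedule traced in FIG. 8(B) can be reproduced with a small event-driven simulation. The task durations below are pure assumptions, chosen only so that completions occur in the order of Steps 2 to 6 (nothing in this application fixes real durations); tasks 0 to 5 stand for Binning1, Rendering1, Binning2, Rendering2, Binning3 and Rendering3, the `deps` sets are the partial dependencies of FIG. 8(A), and the core runs at most two tasks in parallel.

```python
import heapq

def simulate(durations, deps, capacity=2):
    """Run each task as soon as its dependencies are done and a core slot
    is free, dispatching in issue order; return (finish_times, makespan)."""
    done, running, time = set(), [], 0.0
    finish = {}
    while len(done) < len(durations):
        for t in sorted(durations):          # dispatch pass, in issue order
            if (t not in done and t not in (r[1] for r in running)
                    and all(d in done for d in deps[t])
                    and len(running) < capacity):
                heapq.heappush(running, (time + durations[t], t))
        end, t = heapq.heappop(running)      # advance to the next completion
        time = end
        finish[t] = time
        done.add(t)
    return finish, time

# 0=Binning1, 1=Rendering1, 2=Binning2, 3=Rendering2, 4=Binning3, 5=Rendering3
durations = {0: 1, 1: 2, 2: 1, 3: 1, 4: 2, 5: 2}    # assumed, illustrative
deps = {0: set(), 1: {0}, 2: {1}, 3: {2}, 4: set(), 5: {4}}
```

With these assumed durations the core holds two tasks from time 0 to time 4 and all six tasks finish at time 5, mirroring the near-continuous two-wide execution described for Steps 1 to 6.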
Based on Embodiment 1 above, Embodiment 2 below further describes a specific implementation of the task scheduling method.
Embodiment 2
FIG. 9 exemplarily shows a flowchart of the task scheduling method according to Embodiment 2 of this application. The method is applicable to a task scheduler, for example the task scheduler 310 illustrated in FIG. 2. As shown in FIG. 9, the method includes:
Step 901: the task scheduler obtains N tasks and the association relationships among the N tasks.
Step 902: the task scheduler selects, from the N tasks, the tasks that have not yet been determined to be synchronization-blocked, and, according to the order in which those tasks were acquired, determines the earliest-acquired of them as the target task.
Here, a task that has not yet been determined to be synchronization-blocked is a task that has not yet been analyzed for synchronization blocking, whereas the tasks already determined to be synchronization-blocked include both tasks that have been analyzed and found to be synchronization-blocked, and tasks that, although not yet analyzed themselves, were already determined to be synchronization-blocked during the analysis of other tasks.
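Step 902 amounts to a linear scan over the issue order that skips the tasks already determined to be synchronization-blocked. A minimal sketch, with names assumed for illustration:

```python
def pick_target(issue_order, determined_blocked):
    """Return the earliest-acquired task not yet determined to be
    synchronization-blocked, or None when no candidate remains."""
    for t in issue_order:
        if t not in determined_blocked:
            return t
    return None
```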
Step 903: the task scheduler determines, according to the association relationships among the N tasks, the other tasks that have a synchronization association with the target task, and determines, according to the execution status of those other tasks, whether the target task is synchronization-blocked by them. If not, Step 904 is performed; if so, the process returns to Step 902.
Specifically, the task scheduler may first perform the following Judgments 1 to 4:
Judgment 1: when it is determined from the configuration information of the target task that the target task is a forward synchronization task of the whole group, the other tasks having a synchronization association with the target task are the other tasks of the whole group issued earlier than it; accordingly, if not all of those tasks have finished executing, the target task is determined to be synchronization-blocked by the other tasks of the whole group issued earlier than it.
Judgment 2: when it is determined from the configuration information of the target task that the target task is a forward synchronization task of a subgroup, the other tasks having a synchronization association with the target task are the other tasks of that subgroup issued earlier than it; accordingly, if not all of those tasks have finished executing, the target task is determined to be synchronization-blocked by the other tasks of that subgroup issued earlier than it.
Judgment 3: when it is determined from the configuration information of the target task that the target task is a task of the whole group issued later than a backward synchronization task, the other task having a synchronization association with the target task is that backward synchronization task of the whole group; accordingly, if that backward synchronization task has not finished executing, the target task is determined to be synchronization-blocked by it.
判断四,根据目标任务的配置信息,判断目标任务为某一子分组中晚于某一后向同步任务下发的任务时,确定与目标任务具有同步关联关系的其它任务为该子分组中的该后向同步任务,该情况下,若该子分组中的该后向同步任务未执行完成,则确定目标任务被该子分组中的该后向同步任务同步阻塞。Judgment four, when it is judged that the target task is a task in a sub-group that is issued later than a backward synchronization task based on the configuration information of the target task, it is determined that other tasks that have a synchronization association relationship with the target task are the backward synchronization task in the sub-group. In this case, if the backward synchronization task in the sub-group has not been executed to completion, it is determined that the target task is synchronization blocked by the backward synchronization task in the sub-group.
当上述判断一至判断四中存在至少一个满足时,确定目标任务被同步阻塞,反之,当上述判断一至判断四都不满足时,确定目标任务未被同步阻塞。When at least one of the above judgments 1 to 4 is satisfied, it is determined that the target task is synchronously blocked. Conversely, when none of the above judgments 1 to 4 are satisfied, it is determined that the target task is not synchronously blocked.
进一步地,任务调度器还可以执行如下判断五和判断六:Furthermore, the task scheduler may also perform the following judgment five and judgment six:
判断五,根据目标任务的配置信息,判断目标任务为整组中的后向同步任务时,确定整组中晚于该后向同步任务下发的其它任务全部被目标任务同步阻塞;Judgment 5: when judging that the target task is a backward synchronization task in the entire group according to the configuration information of the target task, it is determined that all other tasks in the entire group that are issued later than the backward synchronization task are synchronously blocked by the target task;
判断六,根据目标任务的配置信息,判断目标任务为某一子分组中的后向同步任务时,确定该子分组中晚于该后向同步任务下发的其它任务全部被目标任务同步阻塞。Judgment six: when judging, based on the configuration information of the target task, that the target task is a backward synchronization task in a certain subgroup, it is determined that all other tasks in the subgroup that are issued later than the backward synchronization task are synchronously blocked by the target task.
通过上述判断一至判断四，任务调度器能确定目标任务是否被其它任务同步阻塞，而通过上述判断五和判断六，任务调度器能确定还未被分析过的其它任务是否被目标任务同步阻塞。可见，该方式能在分析一个目标任务是否被同步阻塞的同时，将被该目标任务同步阻塞的其它任务也确定出来，从而后续即可无需再对这些已被确定同步阻塞的任务进行无意义的分析，如此有助于节省计算资源，同时还能提高同步阻塞的判断效率。Through Judgments 1 to 4, the task scheduler can determine whether the target task is synchronously blocked by other tasks, and through Judgments 5 and 6, it can determine whether other, not-yet-analyzed tasks are synchronously blocked by the target task. In this way, while analyzing whether one target task is synchronously blocked, the tasks that the target task itself synchronously blocks are identified at the same time, so subsequent redundant analysis of those already-determined tasks can be skipped; this saves computing resources and improves the efficiency of the synchronous-blocking determination.
应理解,上述判断一至判断六可以是并行执行的,也可以是按照任意顺序串行执行的,本申请实施例对此不作具体限定。It should be understood that the above judgments one to six can be executed in parallel or in series in any order, and the embodiments of the present application do not specifically limit this.
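As an illustration only — the patent does not prescribe concrete data structures — the six judgments can be sketched in Python. The `Task` fields mirror the configuration information described above; all names and the list-based representation are hypothetical:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Task:
    tid: int                        # issue order: smaller tid = issued earlier
    group_fwd_sync: bool = False    # forward sync task in the whole group
    group_bwd_sync: bool = False    # backward sync task in the whole group
    subgroup: Optional[int] = None  # subgroup the task belongs to, if any
    sub_fwd_sync: bool = False      # forward sync task in its subgroup
    sub_bwd_sync: bool = False      # backward sync task in its subgroup
    done: bool = False              # has finished executing

def is_sync_blocked(target: Task, tasks: List[Task]) -> bool:
    """Judgments 1-4: is `target` synchronously blocked by other tasks?"""
    earlier = [t for t in tasks if t.tid < target.tid]
    # Judgment 1: a group forward sync task waits for every earlier task.
    if target.group_fwd_sync and any(not t.done for t in earlier):
        return True
    # Judgment 2: a subgroup forward sync task waits for every earlier
    # task in the same subgroup.
    if target.sub_fwd_sync and any(
            not t.done for t in earlier if t.subgroup == target.subgroup):
        return True
    for t in earlier:
        # Judgment 3: issued later than an unfinished group backward sync task.
        if t.group_bwd_sync and not t.done:
            return True
        # Judgment 4: issued later than an unfinished backward sync task
        # of the same subgroup.
        if t.sub_bwd_sync and not t.done and t.subgroup == target.subgroup:
            return True
    return False

def sync_blocks(target: Task, tasks: List[Task]) -> List[int]:
    """Judgments 5-6: ids of later tasks that `target` synchronously blocks."""
    if target.done:
        return []
    return [t.tid for t in tasks if t.tid > target.tid and (
        target.group_bwd_sync or
        (target.sub_bwd_sync and t.subgroup == target.subgroup))]
```

With the group of Example 1 below (tasks 0 to 5, task 3 being both the forward and backward sync task of the whole group), `is_sync_blocked` reports task 3 blocked until tasks 0 to 2 finish, and `sync_blocks` on task 3 returns tasks 4 and 5 in a single pass.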
步骤904，任务调度器根据N个任务的关联关系，确定出与目标任务具有依赖关联的其它任务，根据其它任务的执行情况，确定目标任务是否被其它任务依赖阻塞，若否，则执行步骤905，若是，则执行步骤902。Step 904: the task scheduler determines, according to the association relationships of the N tasks, the other tasks that have a dependency association with the target task, and determines, according to the execution status of those tasks, whether the target task is dependency blocked by them; if not, step 905 is executed; if yes, step 902 is executed.
具体的,任务调度器可以先执行如下判断一和判断二:Specifically, the task scheduler may first perform the following judgments 1 and 2:
判断一，根据目标任务的配置信息，判断目标任务与其它任务具有部分依赖关联时，若所依赖的其它任务未全部执行完成，则确定目标任务被其它任务部分依赖阻塞；Judgment 1: when the target task is determined, according to its configuration information, to have a partial dependency association with other tasks, if the tasks it depends on have not all finished executing, the target task is determined to be blocked by its partial dependency on those tasks;
判断二，根据目标任务的配置信息，判断目标任务与其它任务具有串行依赖关联时，若所依赖的其它任务未全部执行，则确定目标任务被其它任务串行依赖阻塞。Judgment 2: when the target task is determined, according to its configuration information, to have a serial dependency association with other tasks, if the tasks it depends on have not all been executed, the target task is determined to be blocked by its serial dependency on those tasks.
当上述判断一和判断二中存在至少一个满足时，确定目标任务被依赖阻塞，反之，当上述判断一和判断二都不满足时，确定目标任务未被依赖阻塞。When at least one of Judgment 1 and Judgment 2 is satisfied, the target task is determined to be dependency blocked; conversely, when neither is satisfied, the target task is determined not to be dependency blocked.
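A minimal sketch of these two judgments follows. The representation is hypothetical; the distinction the text draws between dependencies that must have *finished executing* (partial dependency) and dependencies that must have *been executed* (serial dependency) is modeled here as finished vs. started sets:

```python
def is_dependency_blocked(task_id, partial_deps, serial_deps, finished, started):
    """Dependency-blocking Judgments 1-2 for one target task.

    partial_deps / serial_deps map a task id to the ids it depends on;
    finished / started are sets of task ids (all names hypothetical).
    """
    # Judgment 1: a partial dependency blocks until every depended-on
    # task has finished executing.
    if any(d not in finished for d in partial_deps.get(task_id, ())):
        return True
    # Judgment 2: a serial dependency blocks until every depended-on
    # task has at least been executed (started).
    if any(d not in started for d in serial_deps.get(task_id, ())):
        return True
    return False
```

A task with no entry in either map is never dependency blocked, matching the case where the configuration information indicates no dependency association.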
步骤905,任务调度器将目标任务确定为一个可调度任务。Step 905: The task scheduler determines the target task as a schedulable task.
步骤906,任务调度器按照可调度任务的获取顺序,将可调度任务依次调度至处理单元。Step 906: The task scheduler schedules the schedulable tasks to the processing units in sequence according to the order in which the schedulable tasks are acquired.
在上述实施例二中，通过先判断同步阻塞再判断依赖阻塞，能先将被一个任务同步阻塞的全部任务都确定出来，再对剩下未确定的任务进行分析，而不需要挨个地对每个任务都进行分析，如此能极大地节省分析流程，提高阻塞判断的效率，进而提高任务调度的效率。In Embodiment 2 above, by determining synchronous blocking before dependency blocking, all tasks synchronously blocked by a given task can be identified first, and only the remaining undetermined tasks then need to be analyzed, instead of analyzing every task one by one. This greatly streamlines the analysis process, improves the efficiency of blocking determination, and in turn improves the efficiency of task scheduling.
上述实施例一和实施例二是从软件上介绍任务调度方法的可能实现,下面基于实施例三从硬件上进一步介绍任务调度方法的可能实现。The above-mentioned first and second embodiments introduce possible implementations of the task scheduling method from the perspective of software. The following further introduces possible implementations of the task scheduling method from the perspective of hardware based on the third embodiment.
实施例三Embodiment 3
示例性地,继续参照图2所示,任务调度器310中可以包括第一等待队列、第二等待队列和就绪队列。其中,第一等待队列用于存储未被分析过是否被同步阻塞的任务、以及已确定被同步阻塞的任务。其中,已确定被同步阻塞的任务包括:已被分析过是否被同步阻塞且已确定被同步阻塞的任务,以及虽然未被分析过是否被同步阻塞但在其它任务的分析中已确定被同步阻塞的任务。第二等待队列用于存储已确定未被同步阻塞且还未被分析过是否被依赖阻塞的任务、以及已确定被依赖阻塞的任务。其中,已确定被依赖阻塞的任务包括:已被分析过是否被依赖阻塞且已确定被依赖阻塞的任务,以及虽然未被分析过是否被依赖阻塞但在其它任务的分析中已确定被依赖阻塞的任务。就绪队列用于存储已确定未被同步阻塞且已确定未被依赖阻塞的任务,即可调度任务。Exemplarily, as shown in Figure 2, the task scheduler 310 may include a first waiting queue, a second waiting queue and a ready queue. The first waiting queue is used to store tasks that have not been analyzed for being blocked synchronously, and tasks that have been determined to be blocked synchronously. The tasks that have been determined to be blocked synchronously include: tasks that have been analyzed for being blocked synchronously and have been determined to be blocked synchronously, and tasks that have not been analyzed for being blocked synchronously but have been determined to be blocked synchronously in the analysis of other tasks. The second waiting queue is used to store tasks that have been determined not to be blocked synchronously and have not been analyzed for being blocked by dependencies, and tasks that have been determined to be blocked by dependencies. The tasks that have been determined to be blocked by dependencies include: tasks that have been analyzed for being blocked by dependencies and have been determined to be blocked by dependencies, and tasks that have not been analyzed for being blocked by dependencies but have been determined to be blocked by dependencies in the analysis of other tasks. The ready queue is used to store tasks that have been determined not to be blocked synchronously and have been determined not to be blocked by dependencies, that is, schedulable tasks.
进一步示例性地,第一等待队列、第二等待队列和就绪队列中的任务可由不同的线程进行并行处理。具体来说,在一个线程中,任务调度器310在接收到设备驱动程序包下发的任务后,可以按照任务的下发顺序将任务依次存放至第一等待队列中。在另一个线程中,任务调度器310按照任务存储至第一等待队列的顺序遍历第一等待队列中还未被确定同步阻塞的每个任务,在遍历每个任务时,按照上述实施例一或二中的方法判断该任务是否被其它任务同步阻塞,若是,则遍历下一个还未被确定同步阻塞的任务,若否,则将该任务从第一等待队列移至第二等待队列。在又一个线程中,任务调度器310按照任务存储至第二等待队列的顺序遍历第二等待队列中还未被确定依赖阻塞的每个任务,在遍历每个任务时,按照上述方法判断该任务是否被其它任务依赖阻塞,若是,则遍历下一个还未被确定依赖阻塞的任务,若否,则将该任务从第二等待队列移至就绪队列。在再一个线程中,任务调度器310根据GPU核的任务处理情况,按照任务下发的顺序将就绪队列中的任务依次调度至可用的GPU核。Further exemplary, the tasks in the first waiting queue, the second waiting queue and the ready queue can be processed in parallel by different threads. Specifically, in one thread, after receiving the task issued by the device driver package, the task scheduler 310 can store the task in the first waiting queue in sequence according to the order in which the task is issued. In another thread, the task scheduler 310 traverses each task in the first waiting queue that has not been determined to be synchronously blocked according to the order in which the task is stored in the first waiting queue. When traversing each task, the method in the above-mentioned embodiment one or two is used to determine whether the task is synchronously blocked by other tasks. If so, the next task that has not been determined to be synchronously blocked is traversed, and if not, the task is moved from the first waiting queue to the second waiting queue. In another thread, the task scheduler 310 traverses each task in the second waiting queue that has not been determined to be dependently blocked according to the order in which the task is stored in the second waiting queue. When traversing each task, it is determined whether the task is dependently blocked by other tasks according to the above-mentioned method. If so, the next task that has not been determined to be dependently blocked is traversed, and if not, the task is moved from the second waiting queue to the ready queue. 
In yet another thread, the task scheduler 310 schedules the tasks in the ready queue to the available GPU cores in sequence according to the task processing status of the GPU core and the order in which the tasks are issued.
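The movement of tasks between the three queues can be sketched as follows — a single-threaded simplification of the multi-threaded design described above, where `sync_blocked` and `dep_blocked` are caller-supplied predicates standing in for the blocking manager's checks (all names hypothetical):

```python
from collections import deque

class SchedulerQueues:
    """First waiting queue -> second waiting queue -> ready queue."""

    def __init__(self):
        self.wait_sync = deque()  # first waiting queue (sync blocking)
        self.wait_dep = deque()   # second waiting queue (dependency blocking)
        self.ready = deque()      # ready queue (schedulable tasks)

    def submit(self, task):
        # Tasks are stored in the first waiting queue in issue order.
        self.wait_sync.append(task)

    def step(self, sync_blocked, dep_blocked):
        # Move tasks not synchronously blocked to the second waiting queue,
        # in the order they were stored; blocked tasks stay put.
        for t in list(self.wait_sync):
            if not sync_blocked(t):
                self.wait_sync.remove(t)
                self.wait_dep.append(t)
        # Move tasks not dependency blocked to the ready queue.
        for t in list(self.wait_dep):
            if not dep_blocked(t):
                self.wait_dep.remove(t)
                self.ready.append(t)
```

In the actual design each stage runs on its own thread and polls its queue; the sequential `step` here only illustrates which queue a task occupies at each point of its life cycle.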
需要说明的是，上述任务调度器可以将全部功能集成在一个独立的物理器件上，也可以将各个功能分散部署在不同的物理器件上。比如，一种具体的部署方式中，继续参照图2所示，任务调度器310中还可以包括任务获取器311、阻塞管理器312和任务派发器313，任务获取器311可访问第一等待队列，阻塞管理器312可访问第一等待队列、第二等待队列和就绪队列，任务派发器313可访问就绪队列。上述任务调度器310对第一等待队列、第二等待队列和就绪队列中的任务的处理操作，具体可由任务获取器311、阻塞管理器312和任务派发器313访问第一等待队列、第二等待队列和就绪队列中的任务来实现。It should be noted that the task scheduler above may integrate all functions on one independent physical device, or distribute the individual functions across different physical devices. For example, in one specific deployment, still referring to FIG. 2, the task scheduler 310 may further include a task acquirer 311, a blocking manager 312 and a task dispatcher 313; the task acquirer 311 can access the first waiting queue, the blocking manager 312 can access the first waiting queue, the second waiting queue and the ready queue, and the task dispatcher 313 can access the ready queue. The processing operations of the task scheduler 310 on the tasks in the first waiting queue, the second waiting queue and the ready queue are specifically implemented by the task acquirer 311, the blocking manager 312 and the task dispatcher 313 accessing the tasks in these queues.
下面对这三个器件的具体功能进行详细介绍:The specific functions of these three devices are described in detail below:
任务获取器311用于接收设备驱动程序包下发给任务调度器310的任务,并按照任务的下发顺序依次将任务的相关数据存储至第一等待队列。其中,每个任务的相关数据中包含该任务的配置信息,配置信息用于指示如下内容中的一项或多项:该任务是否为整组中的前向同步任务、该任务是否为整组中的后向同步任务、该任务所属的子分组、该任务是否为所属的子分组中的前向同步任务、该任务是否为所属的子分组中的后向同步任务、该任务是否与其它任务具有部分依赖关联以及所依赖的其它任务、该任务是否与其它任务具有串行依赖关联以及所依赖的其它任务。The task acquirer 311 is used to receive the tasks sent by the device driver package to the task scheduler 310, and store the relevant data of the tasks in the first waiting queue in the order of sending the tasks. The relevant data of each task includes the configuration information of the task, and the configuration information is used to indicate one or more of the following contents: whether the task is a forward synchronization task in the whole group, whether the task is a backward synchronization task in the whole group, the subgroup to which the task belongs, whether the task is a forward synchronization task in the subgroup to which it belongs, whether the task is a backward synchronization task in the subgroup to which it belongs, whether the task has a partial dependency association with other tasks and the other tasks on which it depends, and whether the task has a serial dependency association with other tasks and the other tasks on which it depends.
阻塞管理器312用于监控第一等待队列的状态,当感知到第一等待队列中存储有任务后,按照任务存入第一等待队列的顺序遍历第一等待队列中未被确定同步阻塞的每个任务,在遍历每个任务时,执行如下操作:The blocking manager 312 is used to monitor the state of the first waiting queue. When it is sensed that there are tasks stored in the first waiting queue, each task in the first waiting queue that is not determined to be synchronously blocked is traversed in the order in which the tasks are stored in the first waiting queue. When traversing each task, the following operations are performed:
操作一，根据该任务的配置信息，若确定该任务为整组中的前向同步任务，则通过查询整组中早于该任务下发的其它任务的任务状态，判断整组中早于该任务下发的其它任务是否已全部执行完成；以及，若确定该任务为至少一个子分组中的前向同步任务，则通过查询每个子分组中早于该任务的其它任务的任务状态，判断每个子分组中早于该任务下发的其它任务是否已全部执行完成。当上述判断结果都为是时，确定该任务未被同步阻塞，将该任务从第一等待队列移动至第二等待队列。反之，当上述判断结果中存在至少一个为否时，确定该任务被同步阻塞，将该任务继续留在第一等待队列中，并开始遍历下一个未被确定同步阻塞的任务；Operation 1: according to the configuration information of the task, if the task is determined to be a forward synchronization task in the entire group, the task states of the other tasks in the entire group issued earlier than the task are queried to determine whether those tasks have all finished executing; and, if the task is determined to be a forward synchronization task in at least one subgroup, the task states of the earlier-issued tasks in each such subgroup are queried to determine whether they have all finished executing. When all of the above results are yes, the task is determined not to be synchronously blocked and is moved from the first waiting queue to the second waiting queue. Conversely, when at least one result is no, the task is determined to be synchronously blocked, remains in the first waiting queue, and the traversal moves on to the next task not yet determined to be synchronously blocked;
操作二，根据该任务的配置信息，若确定该任务为整组中的后向同步任务，则确定整组中晚于该任务下发的其它任务全部被同步阻塞，将整组中晚于该任务下发的其它任务全部留在第一等待队列中；以及，若确定该任务为至少一个子分组中的后向同步任务，则确定其中每个子分组中晚于该任务下发的其它任务全部被同步阻塞，将每个子分组中晚于该任务下发的其它任务全部留在第一等待队列中。Operation 2: according to the configuration information of the task, if the task is determined to be a backward synchronization task in the entire group, all other tasks in the entire group issued later than the task are determined to be synchronously blocked and are kept in the first waiting queue; and, if the task is determined to be a backward synchronization task in at least one subgroup, all other tasks in each such subgroup issued later than the task are determined to be synchronously blocked and are kept in the first waiting queue.
需要说明的是，本申请实施例对上述操作一和操作二的执行顺序不作限定，比如可以先执行操作一再执行操作二，也可以先执行操作二再执行操作一，还可以同时执行操作一和操作二。此外，上述对第一等待队列的分析操作采用轮询方式，比如分析过一轮之后，将第一等待队列中剩余的任务的状态全部更新为还未被确定同步阻塞，之后，重新按照任务的存入顺序，执行相同的操作一和操作二。It should be noted that the embodiments of the present application do not limit the execution order of Operation 1 and Operation 2 above: Operation 1 may be executed before Operation 2, Operation 2 before Operation 1, or the two may be executed simultaneously. In addition, the analysis of the first waiting queue is performed in a polling manner: for example, after one round of analysis, the states of all tasks remaining in the first waiting queue are reset to not yet determined to be synchronously blocked, after which Operation 1 and Operation 2 are executed again in the order in which the tasks were stored.
进一步地，阻塞管理器312还用于监控第二等待队列的状态，当感知到第二等待队列中存储有任务后，按照任务存入第二等待队列的顺序遍历第二等待队列中未被确定依赖阻塞的每个任务，在遍历每个任务时：根据该任务的配置信息，若确定该任务与其它任务具有部分依赖关联，则通过查询所依赖的其它任务的任务状态，判断所依赖的其它任务是否已全部执行完成，以及，若确定该任务与其它任务具有串行依赖关联，则通过查询所依赖的其它任务的任务状态，判断所依赖的其它任务是否已全部执行。当上述判断结果都为是时，确定该任务未被依赖阻塞，将该任务从第二等待队列移动至就绪队列。反之，当上述判断结果中存在至少一个为否时，确定该任务被依赖阻塞，将该任务继续留在第二等待队列中，并开始遍历下一个未被确定依赖阻塞的任务。Further, the blocking manager 312 is also used to monitor the state of the second waiting queue. When it senses that tasks are stored in the second waiting queue, it traverses, in the order in which the tasks were stored, each task in the second waiting queue not yet determined to be dependency blocked. When traversing each task: according to the configuration information of the task, if the task is determined to have a partial dependency association with other tasks, the task states of the tasks it depends on are queried to determine whether they have all finished executing; and, if the task is determined to have a serial dependency association with other tasks, the task states of the tasks it depends on are queried to determine whether they have all been executed. When all of the above results are yes, the task is determined not to be dependency blocked and is moved from the second waiting queue to the ready queue. Conversely, when at least one result is no, the task is determined to be dependency blocked, remains in the second waiting queue, and the traversal moves on to the next task not yet determined to be dependency blocked.
任务派发器313用于监控就绪队列的状态和GPU核的状态,当感知到就绪队列中存储有任务,且GPU核当前的任务处理量小于可并行任务量时,按照设备驱动程序包下发任务的顺序,将就绪队列中的任务依次调度给GPU核。举例来说,假设就绪队列中依次存储有任务3、任务1和任务2,下发顺序依次是任务1、任务2和任务3,GPU核的可并行任务数量为2,则:当GPU核当前执行的任务数量为1时,确定GPU核当前可再执行一个任务,此时,任务派发器313可将就绪队列中最早下发的任务1派发给GPU核,后续确定GPU核可再执行一个任务时,将就绪队列中的任务2派发给GPU核,后续确定GPU核可再执行一个任务时,将就绪队列中的任务3派发给GPU核;或者,当GPU核当前执行的任务数量为0时,确定GPU核当前可再执行两个任务,此时,任务派发器313可将就绪队列中最早下发的任务1和任务2派发给GPU核,GPU核可再执行一个任务时,将就绪队列中的任务3派发给GPU核。如此,通过上述同步阻塞判断和依赖阻塞判断,即使后下发的任务相比于前下发的任务先被存储在了就绪队列中,通过在真正调度之前找到最先下发的任务进行调度,而不是按照就绪队列中的存储顺序进行调度,能在提前找到所有可处理的任务的条件下,最大限度地确保当前可处理的任务中最早下发的任务先被处理。The task dispatcher 313 is used to monitor the status of the ready queue and the status of the GPU core. When it is sensed that there are tasks stored in the ready queue and the current task processing volume of the GPU core is less than the parallel task volume, the tasks in the ready queue are dispatched to the GPU core in sequence according to the order in which the device driver package sends the tasks. For example, assuming that task 3, task 1 and task 2 are stored in the ready queue in sequence, and the order of issuance is task 1, task 2 and task 3, and the number of parallel tasks of the GPU core is 2, then: when the number of tasks currently executed by the GPU core is 1, it is determined that the GPU core can currently execute another task. At this time, the task dispatcher 313 can dispatch task 1, which is the earliest issued in the ready queue, to the GPU core. When it is subsequently determined that the GPU core can execute another task, task 2 in the ready queue is dispatched to the GPU core. When it is subsequently determined that the GPU core can execute another task, task 3 in the ready queue is dispatched to the GPU core; or, when the number of tasks currently executed by the GPU core is 0, it is determined that the GPU core can currently execute two more tasks. At this time, the task dispatcher 313 can dispatch task 1 and task 2, which are the earliest issued in the ready queue, to the GPU core. 
When the GPU core can execute another task, task 3 in the ready queue is dispatched to the GPU core. In this way, through the above-mentioned synchronous blocking judgment and dependent blocking judgment, even if the task issued later is stored in the ready queue before the task issued earlier, by finding the first issued task for scheduling before the actual scheduling, rather than scheduling according to the storage order in the ready queue, it is possible to find all processable tasks in advance and maximize the guarantee that the earliest issued task among the currently processable tasks will be processed first.
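The dispatch rule described above — pick the earliest-issued ready task rather than the head of the ready queue, up to the core's spare parallel capacity — can be sketched as follows (a hypothetical helper; task ids encode issue order):

```python
def dispatch(ready, running_count, capacity):
    """Dispatch as many ready tasks as the GPU core's spare capacity allows,
    always choosing the earliest-issued task, not the storage order."""
    dispatched = []
    while ready and running_count + len(dispatched) < capacity:
        t = min(ready)      # earliest-issued task in the ready queue
        ready.remove(t)
        dispatched.append(t)
    return dispatched
```

Replaying the example above: with the ready queue storing tasks 3, 1 and 2 and a parallel capacity of 2, a core already running one task receives task 1 first; a fully idle core receives tasks 2 and 3 together.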
在上述实施例三中,一个任务是否被同步阻塞、是否被依赖阻塞和是否被调度至GPU核这三个操作是串行执行的,但不同任务是否被同步阻塞、是否被依赖阻塞和是否被调度至GPU核的操作是并行执行的。比如,同一个任务只有在确定未被同步阻塞的情况下,才会被判断是否被依赖阻塞,只有在确定未被依赖阻塞的情况下,才会被进行后续的调度。而一个任务在被判断是否被同步阻塞时,早于该任务下发的另一个任务可能正在被判断是否被依赖阻塞,早于该任务和另一个任务下发的其它任务可能正在被调度至GPU核。可见,通过设置多个线程并行执行同步阻塞判断、依赖阻塞判断和调度操作,不仅能确保一个任务在未被同步阻塞且未被依赖阻塞的情况下才会进行调度,同时还能对晚下发的任务的前流程和早下发的任务的后流程进行并行处理,这有助于进一步提高任务调度的效率。In the above-mentioned third embodiment, the three operations of whether a task is synchronously blocked, whether it is dependently blocked, and whether it is scheduled to the GPU core are executed in series, but the operations of whether different tasks are synchronously blocked, whether they are dependently blocked, and whether they are scheduled to the GPU core are executed in parallel. For example, the same task will only be judged whether it is dependently blocked if it is determined that it is not synchronously blocked, and will only be subsequently scheduled if it is determined that it is not dependently blocked. When a task is judged whether it is synchronously blocked, another task issued earlier than the task may be judged whether it is dependently blocked, and other tasks issued earlier than the task and another task may be scheduled to the GPU core. It can be seen that by setting multiple threads to perform synchronous blocking judgment, dependent blocking judgment and scheduling operations in parallel, it can not only ensure that a task will be scheduled only when it is not synchronously blocked and not dependently blocked, but also the front process of the task issued later and the back process of the task issued earlier can be processed in parallel, which helps to further improve the efficiency of task scheduling.
下面通过几个具体的例子对上述实施例中的任务调度方法进行详细介绍。需要说明的是,在下文的示例中,假设处理器的最大并行任务量为2,即处理器同一时刻能同时处理两个任务。The task scheduling method in the above embodiment is described in detail below through several specific examples. It should be noted that in the following examples, it is assumed that the maximum parallel task amount of the processor is 2, that is, the processor can process two tasks at the same time.
示例一Example 1
图10示例性示出本申请实施例提供的一种任务调度流程示意图,其中,图10中(A)示出的是设备驱动程序包向任务调度器下发的任务0~任务5及其关联关系,该任务0~任务5首先被任务获取器获取并存储在第一等待队列中。在任务0~任务5的关联关系中,任务0~任务5只位于整组中,且,任务3属于整组中的前向同步任务和后向同步任务,意味着,任务3需要在早于任务3所下发的任务0、任务1和任务2都执行完成后才可以执行,且晚于任务3所下发的任务4和任务5需要在任务3执行完成后才可以执行。相对应的,图10中(B)示出的是按照上述实施例中的任务调度方法处理任务的一种可能情况,参照图10中(B)所示,任务调度器可按照如下步骤调度各个任务:Figure 10 exemplarily shows a task scheduling process diagram provided by an embodiment of the present application, wherein (A) in Figure 10 shows tasks 0 to 5 and their associations issued by the device driver package to the task scheduler, and the tasks 0 to 5 are first acquired by the task acquirer and stored in the first waiting queue. In the association relationship between tasks 0 to 5, tasks 0 to 5 are only located in the entire group, and task 3 belongs to the forward synchronization task and the backward synchronization task in the entire group, which means that task 3 can only be executed after tasks 0, 1 and 2 issued earlier than task 3 are all completed, and tasks 4 and 5 issued later than task 3 can only be executed after task 3 is completed. Correspondingly, (B) in Figure 10 shows a possible situation of processing tasks according to the task scheduling method in the above embodiment. Referring to (B) in Figure 10, the task scheduler can schedule each task according to the following steps:
步骤一,先对任务0进行分析,任务0通过所有流程无阻塞地被调度至GPU核,具体来说:Step 1: First, analyze Task 0. Task 0 is scheduled to the GPU core without blocking through all processes. Specifically:
阻塞管理器遍历第一等待队列中最先下发的任务0,由于任务0不属于前向同步任务,且也不是晚于整组中的后向同步任务3所下发的任务,因此任务0未被同步阻塞,阻塞管理器将任务0从第一等待队列移动至第二等待队列;The blocking manager traverses the first task 0 issued in the first waiting queue. Since task 0 does not belong to the forward synchronization task and is not a task issued later than the backward synchronization task 3 in the entire group, task 0 is not blocked by synchronization. The blocking manager moves task 0 from the first waiting queue to the second waiting queue.
阻塞管理器遍历第二等待队列中最先存储的任务0,由于任务0不依赖其它任务,因此,任务0也未被依赖阻塞,阻塞管理器将任务0从第二等待队列移动至就绪队列;
The blocking manager traverses the task 0 stored first in the second waiting queue. Since task 0 does not depend on other tasks, task 0 is not blocked by dependencies. The blocking manager moves task 0 from the second waiting queue to the ready queue.
任务派发器监控到GPU核当前并未处理任务,也即是,GPU核当前还可处理两个任务,因此,任务派发器将就绪队列中最先下发的任务0调度至GPU核。The task dispatcher monitors that the GPU core is not currently processing any tasks, that is, the GPU core can currently process two tasks. Therefore, the task dispatcher schedules the first task 0 issued in the ready queue to the GPU core.
步骤二,再对任务1进行分析,任务1通过所有流程无阻塞地被调度至GPU核,具体来说:Step 2: Analyze Task 1 again. Task 1 is scheduled to the GPU core without blocking through all processes. Specifically:
经过上述步骤一后,第一等待队列中只包含任务1~任务5,阻塞管理器遍历第一等待队列中最先下发的任务1,由于任务1不属于前向同步任务,且也不是晚于整组中的后向同步任务3下发的任务,因此任务1未被同步阻塞,阻塞管理器将任务1从第一等待队列移动至第二等待队列;After the above step 1, the first waiting queue only contains tasks 1 to 5. The blocking manager traverses the first task 1 issued in the first waiting queue. Since task 1 does not belong to the forward synchronization task and is not a task issued later than the backward synchronization task 3 in the whole group, task 1 is not blocked by synchronization. The blocking manager moves task 1 from the first waiting queue to the second waiting queue.
阻塞管理器遍历第二等待队列中的任务1,由于任务1不依赖其它任务,因此,任务1也未被依赖阻塞,阻塞管理器将任务1从第二等待队列移动至就绪队列;The blocking manager traverses Task 1 in the second waiting queue. Since Task 1 does not depend on other tasks, Task 1 is not blocked by dependencies. The blocking manager moves Task 1 from the second waiting queue to the ready queue.
任务派发器监控到GPU核当前只处理任务0,也即是,GPU核当前还可处理一个任务,因此,任务派发器将就绪队列中的任务1调度至GPU核。The task dispatcher monitors that the GPU core is currently only processing task 0, that is, the GPU core can currently process one task. Therefore, the task dispatcher schedules task 1 in the ready queue to the GPU core.
经过上述步骤一和步骤二后,GPU核并行处理任务0和任务1。After the above steps 1 and 2, the GPU core processes Task 0 and Task 1 in parallel.
步骤三,再对任务2进行分析,任务2通过同步阻塞判断流程和依赖阻塞判断流程,但需在派发流程中等待任务0或任务1执行完成后才可以被调度至GPU核,具体来说:Step 3: Analyze Task 2 again. Task 2 passes the synchronous blocking judgment process and the dependent blocking judgment process, but needs to wait for Task 0 or Task 1 to be completed in the dispatch process before it can be scheduled to the GPU core. Specifically:
经过上述步骤二后,第一等待队列中只包含任务2~任务5,阻塞管理器遍历第一等待队列中最先下发的任务2,由于任务2不属于前向同步任务,且也不是晚于整组中的后向同步任务3下发的任务,因此任务2未被同步阻塞,阻塞管理器将任务2从第一等待队列移动至第二等待队列;After the above step 2, the first waiting queue only contains tasks 2 to 5. The blocking manager traverses the first task 2 issued in the first waiting queue. Since task 2 does not belong to the forward synchronization task and is not a task issued later than the backward synchronization task 3 in the entire group, task 2 is not synchronously blocked. The blocking manager moves task 2 from the first waiting queue to the second waiting queue.
阻塞管理器遍历第二等待队列中的任务2,由于任务2不依赖其它任务,因此,任务2也未被依赖阻塞,阻塞管理器将任务2从第二等待队列移动至就绪队列;The blocking manager traverses Task 2 in the second waiting queue. Since Task 2 does not depend on other tasks, Task 2 is not blocked by dependencies. The blocking manager moves Task 2 from the second waiting queue to the ready queue.
任务派发器监控GPU核当前正在处理任务0和任务1,也即是,GPU核当前无法再处理新的任务,因此,任务派发器等待GPU核处理完其中一个任务后,将就绪队列中的任务2调度至GPU核。如图10中(B)所示,假设GPU核先处理完任务0,则任务派发器将任务2调度至GPU核后,GPU核并行处理任务1和任务2。The task dispatcher monitors that the GPU core is currently processing task 0 and task 1, that is, the GPU core cannot currently process new tasks. Therefore, the task dispatcher waits for the GPU core to complete processing one of the tasks and then schedules task 2 in the ready queue to the GPU core. As shown in (B) in Figure 10, assuming that the GPU core completes processing task 0 first, after the task dispatcher schedules task 2 to the GPU core, the GPU core processes task 1 and task 2 in parallel.
步骤四,再对任务3进行分析,任务3被阻塞在同步阻塞判断流程中,需在任务1和任务2均执行完成后才能够执行:Step 4: Analyze Task 3 again. Task 3 is blocked in the synchronous blocking judgment process and can only be executed after Task 1 and Task 2 are completed:
经过上述步骤三后,第一等待队列中只包含任务3~任务5,阻塞管理器遍历第一等待队列中最先下发的任务3,由于任务3属于前向同步任务,且早于任务3所下发的任务1和任务2未执行完成,因此任务3被同步阻塞。同时,由于任务3还属于后向同步任务,因此晚于任务3所下发的任务4和任务5也被确定为被同步阻塞。因此,在任务1和任务2全部执行完成之前,阻塞管理器不再对第一等待队列中的任务3~任务5进行分析。After the above step 3, the first waiting queue only contains tasks 3 to 5. The blocking manager traverses the first task 3 issued in the first waiting queue. Since task 3 belongs to the forward synchronization task and tasks 1 and 2 issued earlier than task 3 have not been completed, task 3 is synchronously blocked. At the same time, since task 3 also belongs to the backward synchronization task, tasks 4 and 5 issued later than task 3 are also determined to be synchronously blocked. Therefore, before tasks 1 and 2 are all executed, the blocking manager will no longer analyze tasks 3 to 5 in the first waiting queue.
如图10中(B)所示,假设任务1先执行完成,则GPU核中不会再被调度新的任务,GPU核只处理任务2,当任务2也执行完成后,阻塞管理器确定同步阻塞任务3的任务全部执行完成,任务3解除同步阻塞,因此,阻塞管理器将任务3从第一等待队列移动至第二等待队列;As shown in (B) of FIG10 , assuming that task 1 is completed first, no new tasks will be scheduled in the GPU core, and the GPU core only processes task 2. When task 2 is also completed, the blocking manager determines that all the tasks that synchronously block task 3 are completed, and task 3 is released from synchronous blocking. Therefore, the blocking manager moves task 3 from the first waiting queue to the second waiting queue;
The blocking manager traverses Task 3 in the second waiting queue. Because Task 3 does not depend on any other task, it is not dependency-blocked, so the blocking manager moves Task 3 from the second waiting queue to the ready queue.
The task dispatcher monitors that the GPU core is currently processing no tasks, that is, the GPU core can currently accept two tasks. The task dispatcher therefore schedules Task 3 from the ready queue directly to the GPU core, and the GPU core processes only Task 3.
Step 5: Task 4 and Task 5 are blocked in the synchronization-blocking judgment process and can execute only after Task 3 completes:
After Step 4 above, the first waiting queue contains only Task 4 and Task 5, and both were determined in Step 4 to be synchronization-blocked by Task 3. Therefore, until Task 3 completes, the blocking manager does not analyze Task 4 or Task 5 in the first waiting queue.
When Task 3 completes, the blocking manager determines that Task 3, which synchronization-blocks Task 4 and Task 5, has finished, so Task 4 and Task 5 are released from synchronization blocking. By traversing Task 4 and Task 5 in order, the blocking manager moves them in turn from the first waiting queue to the second waiting queue.
The blocking manager then traverses Task 4 and Task 5 in the second waiting queue. Because neither task depends on any other task, neither is dependency-blocked, and the blocking manager moves Task 4 and Task 5 from the second waiting queue to the ready queue.
The task dispatcher monitors that the GPU core is currently processing no tasks, that is, the GPU core can currently accept two tasks. The task dispatcher therefore schedules Task 4 and Task 5 from the ready queue to the GPU core, so that the GPU core processes Task 4 and Task 5 in parallel.
Example 1 above describes a scenario in which forward synchronization tasks and backward synchronization tasks are defined for the entire group. By defining the synchronization and dependency associations among the tasks in the group, and by scheduling the tasks according to the task scheduling method above, tasks in the group that are neither synchronization-blocked nor dependency-blocked can be scheduled as early as possible, minimizing idle time on the GPU core and improving GPU core utilization.
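The scheduling behavior of Example 1 can be illustrated with a minimal Python sketch. This is not the patent's implementation: all names (`Task`, `schedule`) are illustrative, and, as a simplifying assumption, a dispatched task is treated as completing immediately, so the sketch returns only the dispatch order rather than simulating the parallel GPU core.

```python
from collections import deque

class Task:
    def __init__(self, tid, deps=(), forward_sync=False, backward_sync=False):
        self.tid = tid
        self.deps = set(deps)               # tasks this task depends on
        self.forward_sync = forward_sync    # waits for all earlier-issued tasks
        self.backward_sync = backward_sync  # all later-issued tasks wait for it
        self.done = False

def schedule(tasks):
    """Drain tasks toward dispatch, skipping over blocked ones.

    Simplification: a dispatched task completes immediately, so the
    function returns the dispatch order instead of simulating the
    parallel GPU core.
    """
    first_wait = deque(tasks)  # issue order is preserved
    order = []
    while first_wait:
        progressed = False
        for task in list(first_wait):
            earlier = tasks[:tasks.index(task)]
            # synchronization-blocking checks
            if task.forward_sync and not all(t.done for t in earlier):
                continue
            if any(t.backward_sync and not t.done for t in earlier):
                continue
            # dependency-blocking check
            if not all(t.done for t in task.deps):
                continue
            # unblocked: move through the ready queue and dispatch
            first_wait.remove(task)
            task.done = True  # instant completion in this sketch
            order.append(task.tid)
            progressed = True
        if not progressed:
            raise RuntimeError("all remaining tasks are blocked")
    return order
```

Under these assumptions, a later-issued task that is neither synchronization-blocked nor dependency-blocked is dispatched ahead of an earlier blocked one, which is the early-scheduling behavior the example describes.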
Example 2
FIG. 11 exemplarily shows another task scheduling process provided by an embodiment of this application. FIG. 11(A) shows Task 0 to Task 4 and their association relationships as issued by the device driver package to the task scheduler; Task 0 to Task 4 are first acquired by the task acquirer and stored in the first waiting queue. In these association relationships, Task 0 to Task 4 all belong to the entire group; in addition, Task 0, Task 2, and Task 4 belong to subgroup 1, while Task 1 and Task 3 belong to subgroup 2. Task 2 is both the forward synchronization task and the backward synchronization task of subgroup 1, which means that Task 2 can execute only after Task 0 (issued earlier than Task 2 in subgroup 1) has completed, and Task 4 (issued later than Task 2 in subgroup 1) can execute only after Task 2 has completed. Correspondingly, FIG. 11(B) shows one possible way the tasks are processed according to the task scheduling method of the embodiments above. Referring to FIG. 11(B), the task scheduler may schedule the tasks as follows:
Step 1: Task 0 is analyzed first. Task 0 passes all processes without blocking and is scheduled to the GPU core. For the specific implementation, refer to Step 1 of Example 1 above, which is not repeated here.
Step 2: Task 1 is analyzed next. Task 1 passes all processes without blocking and is scheduled to the GPU core. For the specific implementation, refer to Step 2 of Example 1 above, which is not repeated here.
Step 3: Task 2 is analyzed next. Task 2 is blocked in the synchronization-blocking judgment process and can execute only after Task 0 completes:
After Step 2 above, the first waiting queue contains only Task 2 to Task 4. The blocking manager traverses Task 2, the earliest-issued task in the first waiting queue. Because Task 2 is the forward synchronization task of subgroup 1, and Task 0, issued earlier than Task 2 in subgroup 1, has not completed, Task 2 is synchronization-blocked. Meanwhile, because Task 2 is also the backward synchronization task of subgroup 1, Task 4, issued later than Task 2 in subgroup 1, is also determined to be synchronization-blocked. Therefore, until Task 2 completes, the blocking manager does not analyze Task 4 in subgroup 1.
As shown in FIG. 11(B), assuming Task 0 completes first, Task 2 is released from synchronization blocking, so the blocking manager moves Task 2 from the first waiting queue to the second waiting queue.
The blocking manager traverses Task 2 in the second waiting queue. Because Task 2 does not depend on any other task, it is not dependency-blocked, and the blocking manager moves Task 2 from the second waiting queue to the ready queue.
The task dispatcher monitors that the GPU core is currently processing Task 1, that is, the GPU core can accept one more task. The task dispatcher therefore schedules Task 2 from the ready queue to the GPU core, so that the GPU core processes Task 1 and Task 2 in parallel.
Step 4: Task 3 is analyzed next. Task 3 passes both the synchronization-blocking and dependency-blocking judgment processes, but must wait in the dispatch process until Task 1 or Task 2 completes before it can be scheduled to the GPU core. As shown in FIG. 11(B), assuming Task 1 completes first, the task dispatcher schedules Task 3 from the ready queue to the GPU core, so that the GPU core processes Task 2 and Task 3 in parallel. For the specific implementation of this step, refer to Step 3 of Example 1 above, which is not repeated here.
Step 5: Task 4 is blocked in the synchronization-blocking judgment process and can execute only after Task 2 completes:
After Step 4 above, the first waiting queue contains only Task 4, which was already determined in Step 3 to be synchronization-blocked by Task 2. Therefore, until Task 2 completes, the blocking manager does not analyze Task 4 in the first waiting queue.
When Task 2 completes, the blocking manager determines that Task 2, which synchronization-blocks Task 4, has finished, so Task 4 is released from synchronization blocking, and the blocking manager moves Task 4 from the first waiting queue to the second waiting queue.
The blocking manager traverses Task 4 in the second waiting queue. Because Task 4 does not depend on any other task, it is not dependency-blocked, and the blocking manager moves Task 4 from the second waiting queue to the ready queue.
The task dispatcher monitors that the GPU core is currently processing Task 3, that is, the GPU core can accept one more task. The task dispatcher therefore schedules Task 4 from the ready queue to the GPU core, so that the GPU core processes Task 3 and Task 4 in parallel.
Example 2 above describes a scenario in which forward and backward synchronization tasks are defined within subgroups. By defining synchronization associations among the tasks of a subgroup, a synchronization-blocked task in one subgroup does not affect the execution of tasks in other subgroups. This helps decouple the tasks of different subgroups and reduces their mutual influence.
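The subgroup-scoped check of Example 2 can be sketched as a small predicate. This is an illustrative reading of the example, not the patent's implementation: the function and dictionary names are assumptions, and tasks are represented as plain dictionaries for brevity.

```python
def sync_blocked(task, issued_before, subgroup_of, done):
    """Return True if `task` is synchronization-blocked within its subgroup.

    issued_before: tasks issued earlier than `task`, in issue order
    subgroup_of:   mapping from task id to subgroup id
    done:          set of ids of completed tasks
    """
    same_group = [t for t in issued_before
                  if subgroup_of[t["id"]] == subgroup_of[task["id"]]]
    # forward sync: the task waits for every earlier task in its own subgroup
    if task["forward_sync"] and any(t["id"] not in done for t in same_group):
        return True
    # backward sync: an unfinished earlier backward-sync task in the same
    # subgroup blocks this task; tasks in other subgroups are unaffected
    if any(t["backward_sync"] and t["id"] not in done for t in same_group):
        return True
    return False
```

Because only tasks of the same subgroup enter the check, a blocked Task 2 in subgroup 1 never blocks Task 3 in subgroup 2, which is the decoupling the example highlights.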
Example 3
FIG. 12 exemplarily shows yet another task scheduling process provided by an embodiment of this application. FIG. 12(A) shows Task 0 to Task 4 and their association relationships as issued by the device driver package to the task scheduler; Task 0 to Task 4 are first acquired by the task acquirer and stored in the first waiting queue. In these association relationships, Task 0 to Task 4 all belong to the entire group, and Task 3 has a partial dependency association with Task 0 and Task 2, which means that Task 3 can execute only after both Task 0 and Task 2 have completed. Correspondingly, FIG. 12(B) shows one possible way the tasks are processed according to the task scheduling method of the embodiments above. Referring to FIG. 12(B), the task scheduler may schedule the tasks as follows:
Step 1: Task 0 is analyzed first. Task 0 passes all processes without blocking and is scheduled to the GPU core. For the specific implementation, refer to Step 1 of Example 1 above, which is not repeated here.
Step 2: Task 1 is analyzed next. Task 1 passes all processes without blocking and is scheduled to the GPU core. For the specific implementation, refer to Step 2 of Example 1 above, which is not repeated here.
Step 3: Task 2 is analyzed next. Task 2 passes both the synchronization-blocking and dependency-blocking judgment processes, but must wait in the dispatch process until Task 0 or Task 1 completes before it can be scheduled to the GPU core. As shown in FIG. 12(B), assuming the GPU core finishes Task 0 first, the task dispatcher schedules Task 2 to the GPU core, and the GPU core then processes Task 1 and Task 2 in parallel. For the specific implementation of this step, refer to Step 3 of Example 1 above, which is not repeated here.
Step 4: Task 3 is analyzed next. Task 3 is blocked in the dependency-blocking judgment process and can execute only after Task 2 completes:
After Step 3 above, the first waiting queue contains only Task 3 and Task 4. The blocking manager traverses Task 3, the earliest-issued task in the first waiting queue. Because Task 3 is not a forward synchronization task, and the entire group contains no backward synchronization task, Task 3 is not synchronization-blocked, and the blocking manager moves Task 3 from the first waiting queue to the second waiting queue.
The blocking manager traverses Task 3 in the second waiting queue. Task 3 depends on Task 0 and Task 2; Task 0 has completed but Task 2 has not, so Task 3 is dependency-blocked. Until Task 2 completes, the blocking manager does not analyze Task 3 in the second waiting queue.
Step 5: As shown in FIG. 12(B), assuming Task 1 completes first, Task 3 remains dependency-blocked because Task 2 has not completed. Task 4, which follows Task 3 in the first waiting queue, can therefore be analyzed directly. Task 4 passes all processes without blocking and is scheduled to the GPU core, and the GPU core processes Task 2 and Task 4 at the same time.
Step 6: As shown in FIG. 12(B), once Task 2 completes, Task 3 in the second waiting queue is released from dependency blocking. Task 3 then passes all processes without blocking and is scheduled to the GPU core, so that the GPU core processes Task 4 and Task 3 in parallel.
It should be noted that, in Example 3 above, if Task 3 instead had a serial dependency association with Task 0 and Task 2, Task 3 could execute as soon as both Task 0 and Task 2 had started executing, regardless of whether they had completed. In that case, in Step 4 above, since Task 0 has completed and Task 2 has started (though not yet completed), Task 3 is released from its dependency. Task 3 then passes both the synchronization-blocking and dependency-blocking judgment processes, but must wait in the dispatch process until Task 1 or Task 2 completes before it can be scheduled to the GPU core. For example, as shown in FIG. 12(B), assuming Task 1 completes first, the task dispatcher schedules Task 3 from the ready queue to the GPU core, so that the GPU core processes Task 2 and Task 3 in parallel.
Example 3 above describes a scenario in which dependencies are defined. When an earlier task is dependency-blocked, a later task that is not dependency-blocked can be scheduled to the GPU core first, keeping the GPU core busy as much as possible and effectively improving its utilization.
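The distinction Example 3 draws between partial and serial dependency can be sketched as a release check. This is a hedged reading of the example, not the patent's implementation; the field names (`deps`, `dep_kind`) and the dictionary representation are assumptions made for illustration.

```python
def dep_released(task, started, done):
    """Check whether `task`'s dependency block is released.

    Partial dependency: released only when ALL prerequisite tasks have
    completed execution.
    Serial dependency:  released as soon as all prerequisites have
    STARTED executing, whether or not they have completed.
    """
    deps = task["deps"]
    if task["dep_kind"] == "partial":
        return all(d in done for d in deps)
    if task["dep_kind"] == "serial":
        return all(d in started for d in deps)
    return True  # no dependency association defined
```

With Example 3's data (Task 3 depending on Tasks 0 and 2), the partial form keeps Task 3 blocked until Task 2 completes, while the serial form releases it once Task 2 merely begins executing.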
It should be understood that the embodiments of this application may also be combined with one another to obtain new embodiments.
It should be noted that the names of the information items above are merely examples. As communication technology evolves, any of the above information may be renamed; regardless of how the name changes, information whose meaning is the same as that of the information described above falls within the protection scope of this application.
The solution provided by this application has been described above mainly from the perspective of interaction between network elements. It can be understood that, to implement the functions above, each network element includes corresponding hardware structures and/or software modules for performing each function. A person skilled in the art should readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, the present invention can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.
Based on the foregoing method, FIG. 13 is a schematic structural diagram of a task scheduler provided by an embodiment of this application. The task scheduler 1300 may be a chip or a circuit, for example, a chip or circuit disposed in a processor. The task scheduler 1300 corresponds to the task scheduler in the foregoing method, such as the task scheduler 310 in FIG. 2, and can implement the steps of any one or more of the corresponding methods shown in FIG. 5 or FIG. 9. As shown in FIG. 13, the task scheduler 1300 may include an acquisition unit 1301, a determination unit 1302, and a scheduling unit 1303.
In this embodiment of the application, the acquisition unit 1301 may, when receiving information, be a receiving unit or a receiver, and the receiving unit or receiver may be a radio frequency circuit. In a specific implementation, the acquisition unit 1301 is configured to acquire N tasks and the association relationships of the N tasks; the determination unit 1302 is configured to determine, according to the association relationships of the N tasks, those of the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks; and the scheduling unit 1303 is configured to schedule the schedulable tasks to the processing unit, where N is a positive integer.
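The division of labor among the three units can be sketched as follows. This is purely an illustrative decomposition under stated assumptions: the class and method names are invented for this sketch, tasks are plain dictionaries, and the set of completed tasks is supplied externally rather than tracked by a real processing unit.

```python
class TaskScheduler:
    """Toy decomposition mirroring the three units of FIG. 13."""

    def __init__(self):
        self.pending = []  # tasks acquired but not yet dispatched

    def acquire(self, tasks):
        # acquisition unit 1301: take in N tasks and their associations
        self.pending.extend(tasks)

    def determine_schedulable(self, done):
        # determination unit 1302: a task is schedulable when it has no
        # dependencies or all of its dependencies have completed
        return [t for t in self.pending if all(d in done for d in t["deps"])]

    def dispatch(self, processing_unit, done):
        # scheduling unit 1303: hand each schedulable task to the
        # processing unit (here, any callable)
        for t in self.determine_schedulable(done):
            self.pending.remove(t)
            processing_unit(t)
```

A caller would invoke `acquire` once per batch and `dispatch` whenever the set of completed tasks changes.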
For the concepts, explanations, detailed descriptions, and other steps of the task scheduler 1300 that relate to the technical solution provided in the embodiments of this application, refer to the descriptions of this content in the foregoing method or other embodiments; they are not repeated here.
It can be understood that, for the functions of the units in the task scheduler 1300, reference may be made to the implementations of the corresponding method embodiments; they are not repeated here.
It should be understood that the division of the task scheduler 1300 into the above modules is merely a division of logical functions. In actual implementation, the modules may be fully or partially integrated into one physical entity, or may be physically separate.
According to the method provided in the embodiments of this application, this application further provides a task scheduler that includes the aforementioned task acquirer, blocking manager, and task dispatcher, as well as the first waiting queue, the second waiting queue, and the ready queue.
According to the method provided in the embodiments of this application, this application further provides a processor that includes the aforementioned task scheduler and processing unit. The processing unit may specifically be a processor core, such as the aforementioned GPU core.
According to the method provided in the embodiments of this application, this application further provides an electronic device. The electronic device includes a processor coupled to a memory, and the processor is configured to execute a computer program stored in the memory, so that the electronic device performs the method of any one of the embodiments shown in FIG. 5 or FIG. 9.
According to the method provided in the embodiments of this application, this application further provides a task processing system including a processor and a device driver package. The device driver package is configured to send N tasks to the processor, and the processor is configured to process the N tasks by performing the method of any one of the embodiments shown in FIG. 5 or FIG. 9.
According to the method provided in the embodiments of this application, this application further provides a computer program product. The computer program product includes computer program code that, when run on a computer, causes the computer to perform the method of any one of the embodiments shown in FIG. 5 or FIG. 9.
According to the method provided in the embodiments of this application, this application further provides a computer-readable storage medium. The computer-readable storage medium stores program code that, when run on a computer, causes the computer to perform the method of any one of the embodiments shown in FIG. 5 or FIG. 9.
As used in this specification, the terms "component", "module", and "system" denote computer-related entities: hardware, firmware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable file, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device itself may be components. One or more components may reside within a process and/or thread of execution, and a component may be located on one computer and/or distributed between two or more computers. Furthermore, these components may execute from various computer-readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes, for example, according to a signal having one or more data packets (such as data from one component interacting with another component in a local system, in a distributed system, and/or across a network such as the Internet, which interacts with other systems by way of the signal).
A person of ordinary skill in the art may be aware that the various illustrative logical blocks and steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; they are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is merely a division of logical functions, and in actual implementation there may be other divisions; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
When the functions are implemented in the form of a software functional unit and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.
Claims (31)
- A task scheduling method, comprising: acquiring N tasks and association relationships of the N tasks, where N is a positive integer; determining, according to the association relationships of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks; and scheduling the schedulable tasks to a processing unit.
- The method according to claim 1, wherein the determining, according to the association relationships of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks comprises: determining, according to the association relationships of the N tasks, a task that has a synchronization association with other tasks, and when the other tasks with which the task has the synchronization association have completed execution, determining the task as a schedulable task; wherein having a synchronization association with other tasks means that the task has an association relationship with all other tasks acquired earlier or later than the task.
- The method according to claim 2, wherein the completion of execution of the other tasks with which the task has the synchronization association comprises at least one of the following: the task is a forward synchronization task, and all other tasks acquired earlier than the task have completed execution; or the task is a task acquired later than a backward synchronization task, and the backward synchronization task has completed execution; wherein the forward synchronization task indicates that the forward synchronization task depends on all other tasks acquired earlier than the forward synchronization task, and the backward synchronization task indicates that all other tasks acquired later than the backward synchronization task depend on the backward synchronization task.
- The method according to any one of claims 1 to 3, wherein the determining, according to the association relationships of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks comprises: determining, according to the association relationships of the N tasks, a task that has a dependency association with other tasks, and when the other tasks with which the task has the dependency association have executed or have completed execution, determining the task as a schedulable task.
- The method according to any one of claims 1 to 4, wherein the determining, according to the association relationships of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks comprises: traversing, in the order in which the N tasks were acquired, each task among the N tasks that has not yet been determined to be blocked, and when traversing each such task: if the task is a forward synchronization task and not all other tasks acquired earlier than the task have completed execution, determining that the task is synchronization-blocked; if the task is a backward synchronization task, determining that all other tasks acquired later than the task are synchronization-blocked; if the task depends on other tasks and the other tasks have not completed execution, determining that the task is dependency-blocked; if the task serially depends on other tasks and not all of the other tasks have executed, determining that the task is dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, determining the task as one of the schedulable tasks.
- The method according to claim 5, characterized in that: the task being synchronization-blocked comprises the task being blocked by whole-group synchronization and/or the task being blocked by sub-group synchronization; the other tasks used to determine whether the task is blocked by whole-group synchronization are the tasks among the N tasks acquired earlier or later than the task; and the other tasks used to determine whether the task is blocked by sub-group synchronization are the tasks, within the sub-group to which the task belongs, acquired earlier or later than the task.
- The method according to claim 6, characterized in that the sub-groups are obtained by division according to service characteristics.
- The method according to any one of claims 1 to 7, characterized in that: after the acquiring N tasks and the association relationship of the N tasks, the method further comprises: storing the N tasks in a first waiting queue; the determining, according to the association relationship of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks comprises: for any task in the first waiting queue, determining whether the task is synchronization-blocked by other tasks, and if not, moving the task from the first waiting queue to a second waiting queue; and, for any task in the second waiting queue, determining whether the task is dependency-blocked by other tasks, and if not, moving the task from the second waiting queue to a ready queue; and the scheduling the schedulable tasks to the processing unit comprises: scheduling the tasks in the ready queue to the processing unit.
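The two-stage queue pipeline in the claim above can be sketched as a refresh pass. This is an assumed illustration, not the claimed implementation: the function and queue names are hypothetical, and the synchronization and dependency checks are abstracted into caller-supplied predicates.

```python
# Sketch of the first-waiting -> second-waiting -> ready pipeline:
# a task leaves the first waiting queue once it is no longer
# synchronization-blocked, and leaves the second waiting queue once
# it is no longer dependency-blocked. Names are illustrative.
from collections import deque

def refresh_queues(wait1, wait2, ready, sync_blocked, dep_blocked):
    for _ in range(len(wait1)):
        t = wait1.popleft()
        # still sync-blocked -> stays in the first waiting queue
        (wait1 if sync_blocked(t) else wait2).append(t)
    for _ in range(len(wait2)):
        t = wait2.popleft()
        # still dependency-blocked -> stays in the second waiting queue
        (wait2 if dep_blocked(t) else ready).append(t)
```

Using `popleft`/`append` on `deque`s preserves the acquisition order of the tasks as they move between queues, which the dispatching step relies on.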
- The method according to claim 8, characterized in that the scheduling the tasks in the ready queue to the processing unit comprises: monitoring the number of tasks currently being executed by the processing unit, and when that number is less than the number of tasks the processing unit can execute in parallel, scheduling the tasks in the ready queue to the processing unit one by one in the order in which the tasks in the ready queue were acquired.
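The capacity-gated dispatch above admits a very small sketch. Again this is an assumed illustration: the function name, the `running` list standing in for the processing unit's in-flight tasks, and the `max_parallel` parameter are all hypothetical.

```python
# Sketch of ready-queue dispatch: pop tasks in acquisition order while
# the processing unit is below its parallel-task limit. Illustrative only.
from collections import deque

def dispatch(ready, running, max_parallel):
    dispatched = []
    while ready and len(running) < max_parallel:
        t = ready.popleft()   # oldest (earliest-acquired) ready task first
        running.append(t)     # hand the task to the processing unit
        dispatched.append(t)
    return dispatched
```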
- A task scheduler, characterized by comprising: a task acquirer, configured to acquire N tasks and an association relationship of the N tasks, where N is a positive integer; a blocking manager, configured to determine, according to the association relationship of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks; and a task dispatcher, configured to schedule the schedulable tasks to a processing unit.
- The task scheduler according to claim 10, characterized in that the task acquirer is specifically configured to: receive the N tasks and the association relationship of the N tasks delivered by a device driver package.
- The task scheduler according to claim 10 or 11, characterized in that the blocking manager is specifically configured to: determine, according to the association relationship of the N tasks, a task that has a synchronization association with other tasks, and when the other tasks with which the task has the synchronization association have completed execution, determine the task as a schedulable task; wherein having a synchronization association with other tasks means that the task has an association relationship with all other tasks acquired earlier or later than the task.
- The task scheduler according to claim 12, characterized in that the completion of execution of the other tasks with which the task has the synchronization association comprises at least one of the following: the task is a forward synchronization task, and the other tasks acquired earlier than the task have all completed execution; or the task is a task acquired later than a backward synchronization task, and the backward synchronization task has completed execution; wherein the forward synchronization task indicates that the forward synchronization task depends on all other tasks acquired earlier than the forward synchronization task, and the backward synchronization task indicates that all other tasks acquired later than the backward synchronization task depend on the backward synchronization task.
- The task scheduler according to any one of claims 10 to 13, characterized in that the blocking manager is specifically configured to: determine, according to the association relationship of the N tasks, a task that has a dependency association with other tasks, and when the other tasks with which the task has the dependency association have been executed or have completed execution, determine the task as a schedulable task.
- The task scheduler according to any one of claims 10 to 14, characterized in that the blocking manager is specifically configured to: traverse, in the acquisition order of the N tasks, each task among the N tasks that has not yet been determined to be blocked, and when traversing each such task: if the task is a forward synchronization task and the other tasks acquired earlier than the task have not all completed execution, determine that the task is synchronization-blocked; if the task is a backward synchronization task, determine that all other tasks acquired later than the task are synchronization-blocked; if the task depends on other tasks and those other tasks have not completed execution, determine that the task is dependency-blocked; if the task serially depends on other tasks and those other tasks have not all been executed, determine that the task is dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, determine the task as one of the schedulable tasks.
- The task scheduler according to claim 15, characterized in that: the task being synchronization-blocked comprises the task being blocked by whole-group synchronization and/or the task being blocked by sub-group synchronization; the other tasks used to determine whether the task is blocked by whole-group synchronization are the tasks among the N tasks acquired earlier or later than the task; and the other tasks used to determine whether the task is blocked by sub-group synchronization are the tasks, within the sub-group to which the task belongs, acquired earlier or later than the task.
- The task scheduler according to claim 16, characterized in that the sub-groups are obtained by division according to service characteristics.
- The task scheduler according to any one of claims 10 to 17, characterized by further comprising: a first waiting queue, a second waiting queue, and a ready queue; wherein the task acquirer is further configured to: store the N tasks in the first waiting queue; the blocking manager is specifically configured to: for any task in the first waiting queue, determine whether the task is synchronization-blocked by other tasks, and if not, move the task from the first waiting queue to the second waiting queue; and, for any task in the second waiting queue, determine whether the task is dependency-blocked by other tasks, and if not, move the task from the second waiting queue to the ready queue; and the task dispatcher is specifically configured to: schedule the tasks in the ready queue to a processing unit.
- The task scheduler according to claim 18, characterized in that the task dispatcher is specifically configured to: monitor the number of tasks currently being executed by the processing unit, and when that number is less than the number of tasks the processing unit can execute in parallel, schedule the tasks in the ready queue to the processing unit one by one in the order in which the tasks in the ready queue were acquired.
- A task scheduler, characterized by comprising: an acquisition unit, configured to acquire N tasks and an association relationship of the N tasks, where N is a positive integer; a determination unit, configured to determine, according to the association relationship of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks; and a scheduling unit, configured to schedule the schedulable tasks to a processing unit.
- The task scheduler according to claim 20, characterized in that the determination unit is specifically configured to: determine, according to the association relationship of the N tasks, a task that has a synchronization association with other tasks, and when the other tasks with which the task has the synchronization association have completed execution, determine the task as a schedulable task; wherein having a synchronization association with other tasks means that the task has an association relationship with all other tasks acquired earlier or later than the task.
- The task scheduler according to claim 21, characterized in that the completion of execution of the other tasks with which the task has the synchronization association comprises at least one of the following: the task is a forward synchronization task, and the other tasks acquired earlier than the task have all completed execution; or the task is a task acquired later than a backward synchronization task, and the backward synchronization task has completed execution; wherein the forward synchronization task indicates that the forward synchronization task depends on all other tasks acquired earlier than the forward synchronization task, and the backward synchronization task indicates that all other tasks acquired later than the backward synchronization task depend on the backward synchronization task.
- The task scheduler according to any one of claims 20 to 22, characterized in that the determination unit is specifically configured to: determine, according to the association relationship of the N tasks, a task that has a dependency association with other tasks, and when the other tasks with which the task has the dependency association have been executed or have completed execution, determine the task as a schedulable task.
- The task scheduler according to any one of claims 20 to 23, characterized in that the determination unit is specifically configured to: traverse, in the acquisition order of the N tasks, each task among the N tasks that has not been determined to be blocked, and when traversing each such task: if the task is a forward synchronization task and the other tasks acquired earlier than the task have not all completed execution, determine that the task is synchronization-blocked; if the task is a backward synchronization task, determine that all other tasks acquired later than the task are synchronization-blocked; if the task depends on other tasks and those other tasks have not completed execution, determine that the task is dependency-blocked; if the task serially depends on other tasks and those other tasks have not all been executed, determine that the task is dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, determine the task as one of the schedulable tasks.
- The task scheduler according to claim 24, characterized in that: the task being synchronization-blocked comprises the task being blocked by whole-group synchronization and/or the task being blocked by sub-group synchronization; the other tasks used to determine whether the task is blocked by whole-group synchronization are the tasks among the N tasks acquired earlier or later than the task; and the other tasks used to determine whether the task is blocked by sub-group synchronization are the tasks, within the sub-group to which the task belongs, acquired earlier or later than the task.
- The task scheduler according to claim 25, characterized in that the sub-groups are obtained by division according to service characteristics.
- The task scheduler according to any one of claims 20 to 26, characterized in that: after acquiring the N tasks, the acquisition unit is further configured to: store the N tasks in a first waiting queue; the determination unit is specifically configured to: for any task in the first waiting queue, determine whether the task is synchronization-blocked by other tasks, and if not, move the task from the first waiting queue to a second waiting queue; and, for any task in the second waiting queue, determine whether the task is dependency-blocked by other tasks, and if not, move the task from the second waiting queue to a ready queue; and the scheduling unit is specifically configured to: schedule the tasks in the ready queue to the processing unit.
- The task scheduler according to claim 27, characterized in that the scheduling unit is specifically configured to: monitor the number of tasks currently being executed by the processing unit, and when that number is less than the number of tasks the processing unit can execute in parallel, schedule the tasks in the ready queue to the processing unit one by one in the order in which the tasks in the ready queue were acquired.
- A processor, characterized by comprising a task scheduler and a processing unit; wherein the task scheduler is configured to perform the method according to any one of claims 1 to 9; and the processing unit is configured to process the tasks scheduled to it by the task scheduler.
- An electronic device, characterized by comprising a processor, the processor being coupled to a memory and configured to execute a computer program stored in the memory, so that the electronic device performs the method according to any one of claims 1 to 9.
- A task scheduling system, characterized by comprising a device driver package and the processor according to claim 29; wherein the device driver package is configured to send N tasks to the processor, where N is a positive integer; and the processor is configured to process the N tasks.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211614262.8 | 2022-12-15 | ||
CN202211614262.8A CN118210597A (en) | 2022-12-15 | 2022-12-15 | Task scheduling method, apparatus and system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024125341A1 true WO2024125341A1 (en) | 2024-06-20 |
Family
ID=91445127
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/136252 WO2024125341A1 (en) | 2022-12-15 | 2023-12-04 | Task scheduling method, apparatus and system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN118210597A (en) |
WO (1) | WO2024125341A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120180068A1 (en) * | 2009-07-24 | 2012-07-12 | Enno Wein | Scheduling and communication in computing systems |
US20150172412A1 (en) * | 2012-07-06 | 2015-06-18 | Cornell University | Managing dependencies between operations in a distributed system |
CN110554909A (en) * | 2019-09-06 | 2019-12-10 | 腾讯科技(深圳)有限公司 | Task scheduling processing method and device, and computer equipment |
CN112099958A (en) * | 2020-11-17 | 2020-12-18 | 深圳壹账通智能科技有限公司 | Distributed multi-task management method and device, computer equipment and storage medium |
- 2022-12-15: CN application CN202211614262.8A filed (published as CN118210597A, status: active, pending)
- 2023-12-04: PCT application PCT/CN2023/136252 filed (published as WO2024125341A1)
Also Published As
Publication number | Publication date |
---|---|
CN118210597A (en) | 2024-06-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10733019B2 (en) | Apparatus and method for data processing | |
CN104615488B (en) | The method and apparatus of task scheduling in heterogeneous multi-core reconfigurable calculating platform | |
WO2017166777A1 (en) | Task scheduling method and device | |
JP2014525619A (en) | Data processing system | |
US20240106754A1 (en) | Load Balancing Method for Multi-Thread Forwarding and Related Apparatus | |
WO2023020177A1 (en) | Task scheduling method, game engine, device and storage medium | |
JP2021518955A (en) | Processor core scheduling method, equipment, terminals and storage media | |
CN112214299A (en) | Multi-core processor and task scheduling method and device thereof | |
CN111694675A (en) | Task scheduling method and device and storage medium | |
CN112380001A (en) | Log output method, load balancing device and computer readable storage medium | |
TW202107408A (en) | Methods and apparatus for wave slot management | |
CN110018782B (en) | Data reading/writing method and related device | |
CN110955461A (en) | Processing method, device and system of computing task, server and storage medium | |
WO2024125341A1 (en) | Task scheduling method, apparatus and system | |
CN116414534A (en) | Task scheduling method, device, integrated circuit, network equipment and storage medium | |
CN114371920A (en) | A Network Function Virtualization System Based on Graphics Processor Acceleration Optimization | |
CN115269131A (en) | A task scheduling method and device | |
CN118689633A (en) | GPU resource allocation method and server | |
CN115981893A (en) | Message queue task processing method and device, server and storage medium | |
CN113923212B (en) | Network data packet processing method and device | |
CN116982030A (en) | In-server delay control device, in-server delay control method, and program | |
CN116458143A (en) | Method and device for editing message | |
JP7662062B2 (en) | Intra-server delay control device, intra-server delay control method and program | |
US20230393889A1 (en) | Multi-core processor, multi-core processor processing method, and related device | |
US20230195546A1 (en) | Message Management Method and Apparatus, and Serverless System |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23902538 Country of ref document: EP Kind code of ref document: A1 |