WO2024125341A1 - Task scheduling method, apparatus and system - Google Patents
- Publication number
- WO2024125341A1 (PCT/CN2023/136252; CN2023136252W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- task
- tasks
- blocked
- synchronization
- determined
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources to service a request, considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
- G06F9/52—Program synchronisation; Mutual exclusion, e.g. by means of semaphores
Definitions
- the present application relates to the field of processor technology, and in particular to a task scheduling method, device and system.
- the task scheduler is one of the core components of a processor; it is used to schedule tasks onto processing units so that the processor's hardware resources are fully utilized and tasks are processed efficiently.
- how reasonably the task scheduler schedules tasks is crucial to shortening task waiting time, increasing task parallelism, optimizing resource utilization, and reducing cost while increasing efficiency. How to achieve reasonable scheduling of tasks has therefore become a mainstream research direction in task scheduler design.
- a plurality of task queues are provided in the task scheduler, and the plurality of task queues are respectively used to store tasks belonging to different businesses, and the tasks in the plurality of task queues are executed in parallel, and the tasks in one task queue are executed in series.
- although this solution can process tasks belonging to different businesses in parallel, tasks belonging to the same business must still be processed serially.
- if the previous task in a task queue has not finished, the tasks queued after it cannot use idle hardware resources ahead of time. This solution therefore still leaves the processing unit unloaded at times and cannot make full use of the processing unit's hardware resources to process tasks with high concurrency, which is not conducive to improving the utilization of the processing unit.
- the present application provides a task scheduling method, device and system for improving the utilization rate of a processing unit.
- the present application provides a task scheduling method, which is applicable to a task scheduler, and the task scheduler can be any device, apparatus or equipment with task scheduling capability, or a chip or circuit, without limitation.
- the method comprises: the task scheduler obtains N tasks and the association relationships between the N tasks, determines, according to those association relationships, the tasks that have no dependencies or whose dependencies have been released as schedulable tasks, and schedules the schedulable tasks to the processing unit.
- N is a positive integer.
- N tasks can be exemplarily sent to the task scheduler by the device driver package.
- the scheduling order of N tasks is maintained by the task scheduler from the hardware side, instead of the device driver package specifying the task execution order from the software side.
- the tasks without dependency or whose dependency has been released can be sent to the processing unit in advance for processing, so as to make full use of all the hardware resources of the processing unit to process the tasks with high concurrency, and to prevent the processing unit from being idle as much as possible, thereby effectively improving the utilization rate of the processing unit.
- the task scheduling method does not require the hardware side to send a notification message to the software side after each task is executed to instruct the software side to send a new task, which can effectively reduce the work pressure on the software side, improve the efficiency of task scheduling, and save communication overhead.
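To make the scheduling idea above concrete, the following sketch models it in Python. This is purely illustrative, not the patented hardware implementation; the names `Task` and `schedule`, and the wave-based dispatch model, are invented here. Instead of forcing serial execution within a queue, every task whose dependencies have all completed is dispatched in the current wave:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    tid: int
    deps: set = field(default_factory=set)  # ids of tasks this task depends on

def schedule(tasks):
    """Dispatch tasks in waves: each wave holds every task whose
    dependencies have already completed, maximizing concurrency."""
    done = set()
    pending = {t.tid: t for t in tasks}
    waves = []
    while pending:
        # all tasks with no unfinished dependency are schedulable now
        ready = [t for t in pending.values() if t.deps <= done]
        if not ready:
            raise RuntimeError("dependency cycle among pending tasks")
        waves.append(sorted(t.tid for t in ready))
        for t in ready:
            done.add(t.tid)
            del pending[t.tid]
    return waves
```

With tasks 1 and 2 both depending only on task 0, they run in the same wave rather than serially, which is the utilization gain the method claims.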
- association relationship of the N tasks may include synchronization association and/or dependency association, where:
- Synchronous association is used to indicate that a task is associated with all other tasks that are obtained earlier or later than the task.
- the synchronous association may include forward synchronous association and backward synchronous association.
- Forward synchronous association means that a task depends on all other tasks that are issued earlier than the task.
- the task that has a forward synchronous association with all other tasks is also called a forward synchronous task.
- Backward synchronous association means that all other tasks that are issued later than a task depend on the task.
- the task that has a backward synchronous association with all other tasks is also called a backward synchronous task;
- Dependency associations include partial dependency and serial dependency.
- Partial dependency means that a task depends on the execution results of some other tasks; that is, the task can be executed only after the specific tasks it depends on have all completed.
- Serial dependency means that a task depends on the execution of some or all other tasks in sequence; that is, the task can be executed only after all other tasks it serially depends on have completed.
- the device driver package only needs to configure a task once to indicate the dependency relationship between that task and all other tasks acquired earlier or later than it, without indicating each depended-on task one by one in the task's configuration information.
- Setting the dependency association can also single out tasks that are only loosely related, apart from the synchronization association.
- the complexity of the task data structure can be greatly reduced while indicating the association relationship between all tasks, which helps to alleviate the communication consumption between the device driver package and the task scheduler.
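The data-structure saving described above can be sketched as follows. This is a hypothetical descriptor layout (the names `Sync` and `TaskDesc` are invented here, not from the patent): a forward- or backward-synchronization task carries only a flag, while per-task dependency lists are reserved for the few explicit partial dependencies:

```python
from dataclasses import dataclass, field
from enum import Flag, auto

class Sync(Flag):
    NONE = 0
    FORWARD = auto()   # this task depends on every task issued earlier
    BACKWARD = auto()  # every task issued later depends on this task

@dataclass
class TaskDesc:
    tid: int
    sync: Sync = Sync.NONE
    deps: set = field(default_factory=set)  # explicit partial-dependency ids

# A forward-synchronization task needs no dependency list at all: the
# scheduler derives its predecessors from the issue order alone.
barrier = TaskDesc(7, Sync.FORWARD)
plain = TaskDesc(8, deps={3, 5})
```

One flag thus replaces an arbitrarily long list of predecessor ids, which is why the patent argues the scheme eases communication between the device driver package and the task scheduler.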
- the task scheduler determines the tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks according to the association relationships among the N tasks, including: the task scheduler determines, according to the association relationships, the tasks that have a synchronization association with other tasks, and when the other tasks synchronously associated with such a task have finished executing, determines that task as a schedulable task. In this way, by monitoring the execution status of synchronously associated tasks, a task can be promptly scheduled to the processing unit once the other tasks synchronously associated with it no longer block it.
- the other tasks that have a synchronization association with the task having finished executing includes: the task is a forward synchronization task and all other tasks acquired earlier than it have completed; and/or the task is acquired later than a backward synchronization task and that backward synchronization task has completed.
- once the task is no longer blocked by all other tasks acquired earlier than it, or by a backward synchronization task, it is determined that the task's dependency has been released, so that the task can be scheduled to the processing unit as soon as possible.
- the task scheduler determines the tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks according to the association relationships among the N tasks, including: the task scheduler determines, according to the association relationships, the tasks that have a dependency association with other tasks, and when the other tasks that have a dependency association with such a task have finished executing, determines that task as a schedulable task. In this way, by monitoring the execution status of dependency-associated tasks, a task can be scheduled to the processing unit in a timely manner once the other tasks it depends on no longer block it.
- the task scheduler determines, based on the association relationships among the N tasks, the tasks that are neither synchronously blocked nor dependency-blocked by other tasks as schedulable tasks.
- a task being synchronously blocked by other tasks means the task meets at least one of the following conditions: the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing; or the task is acquired later than a backward synchronization task and that backward synchronization task has not finished executing.
- a task being dependency-blocked by other tasks means the task meets at least one of the following conditions: the task has a partial dependency association with other tasks and the tasks it depends on have not all finished executing; or the task has a serial dependency association with other tasks and the tasks it depends on have not all finished executing.
- the task scheduler determines the tasks among the N tasks that are neither synchronously blocked nor dependency-blocked by other tasks as schedulable tasks according to the association relationships among the N tasks, including: the task scheduler traverses, in the order in which the N tasks were acquired, each task not yet determined to be blocked, and for each such task: if the task is a forward synchronization task and the tasks acquired earlier than it have not all finished executing, the task is determined to be synchronously blocked; if the task is a backward synchronization task, all tasks acquired later than it are determined to be synchronously blocked; if the task partially depends on other tasks that have not all finished executing, the task is determined to be dependency-blocked; if the task serially depends on other tasks that have not all finished executing, the task is determined to be dependency-blocked; and when the task is neither synchronously blocked nor dependency-blocked, the task is determined to be a schedulable task.
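The traversal just described can be sketched in Python as follows. This is an illustrative model, not the patented logic: `T`, `find_schedulable`, and the `fwd`/`bwd` flags are names invented here. A forward flag makes a task wait for all earlier unfinished tasks, and an unfinished backward-sync task raises a barrier that blocks every later task:

```python
from dataclasses import dataclass, field

@dataclass
class T:
    tid: int
    fwd: bool = False          # forward synchronization task
    bwd: bool = False          # backward synchronization task
    deps: set = field(default_factory=set)  # partial/serial dependency ids

def find_schedulable(tasks, done):
    """One traversal pass in acquisition order; returns ids of tasks that
    are neither synchronously blocked nor dependency-blocked."""
    sched = []
    barrier = False  # set once an unfinished backward-sync task is seen
    for i, t in enumerate(tasks):
        if t.tid in done:
            continue  # a completed backward-sync task no longer blocks anyone
        blocked = barrier
        # forward-sync: blocked while any earlier task is unfinished
        if t.fwd and any(p.tid not in done for p in tasks[:i]):
            blocked = True
        # backward-sync: everything after this point is sync-blocked
        if t.bwd:
            barrier = True
        # dependency blocking: any named dependency still unfinished
        if any(d not in done for d in t.deps):
            blocked = True
        if not blocked:
            sched.append(t.tid)
    return sched
```

Note that an unfinished backward-sync task can itself be schedulable; only the tasks acquired after it are held back until it completes.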
- the task being synchronously blocked may include the task being synchronously blocked by the entire group and/or the task being synchronously blocked by a sub-group, in which case:
- the other tasks used to determine whether a task is synchronously blocked by the entire group are all tasks in the entire group acquired earlier or later than the task. For example, when a task is marked as a forward synchronization task of the entire group, as long as the other tasks in the entire group issued earlier than it have not all finished executing, the task is determined to be synchronously blocked by those tasks. When a task is a backward synchronization task of the entire group, all other tasks in the entire group acquired later than it are determined to be synchronously blocked by the task;
- the other tasks used to determine whether a task is synchronously blocked by a subgroup are all tasks in the subgroup to which the task belongs acquired earlier or later than the task. For example, when a task is marked as a forward synchronization task of a subgroup, as long as the other tasks in the subgroup issued earlier than it have not all finished executing, the task is determined to be synchronously blocked by those tasks. When a task is a backward synchronization task of a subgroup, all other tasks in the subgroup acquired later than it are determined to be synchronously blocked by the task.
- sub-groups can be obtained by dividing according to business characteristics. In this way, by grouping tasks according to business characteristics, even if tasks in a group are synchronously blocked, tasks in other groups will not be affected, that is, synchronous blocking of a business will not affect the execution of other businesses. It can be seen that by decoupling the task execution association of the business, mutual interference between businesses can be reduced.
- the sub-groups may be obtained by dividing the tasks with dense association relationships. Tasks with relatively concentrated relationships are divided into a sub-group. By only marking the forward synchronization task or backward synchronization task in the sub-group, the relationship between the tasks in the sub-group can be known without marking the dependency relationship of each task. This can streamline the data structure of each task and reduce the communication overhead between the device driver package and the task scheduler.
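The subgroup decoupling above can be sketched by scoping the synchronization barrier per group. This is again an invented illustration (the names `T`, `find_schedulable_grouped`, and the `group` field are not from the patent): a backward-sync task blocks only later tasks of its own subgroup, so one business's barrier cannot stall another business:

```python
from dataclasses import dataclass, field

@dataclass
class T:
    tid: int
    group: int = 0             # subgroup (e.g. business) this task belongs to
    fwd: bool = False          # forward synchronization task within its group
    bwd: bool = False          # backward synchronization task within its group
    deps: set = field(default_factory=set)

def find_schedulable_grouped(tasks, done):
    """Like a global sync barrier, but scoped: blocking state is tracked
    per subgroup, so groups cannot interfere with one another."""
    sched = []
    barriers = set()   # groups currently behind an unfinished backward-sync task
    unfinished = {}    # group -> ids of earlier, still-unfinished tasks
    for t in tasks:
        if t.tid in done:
            continue   # finished tasks neither block nor get rescheduled
        blocked = t.group in barriers
        if t.fwd and unfinished.get(t.group):
            blocked = True           # earlier tasks of the same group pending
        if t.bwd:
            barriers.add(t.group)    # blocks only later tasks of this group
        if any(d not in done for d in t.deps):
            blocked = True
        if not blocked:
            sched.append(t.tid)
        unfinished.setdefault(t.group, set()).add(t.tid)
    return sched
```

In the test below, group 0's backward-sync task blocks its own successor (task 1) but leaves group 1's task 2 schedulable, which is the cross-business isolation the design aims for.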
- the task scheduler can also store the N tasks in the first waiting queue.
- the task scheduler determines the tasks among the N tasks that are not synchronously blocked by other tasks and are not blocked by other task dependencies as schedulable tasks based on the relationship between the N tasks, including: for any task in the first waiting queue, determine whether the task is synchronously blocked by other tasks, if not, move the task from the first waiting queue to the second waiting queue; and for any task in the second waiting queue, determine whether the task is blocked by other task dependencies, if not, move the task from the second waiting queue to the ready queue. Further, the task scheduler schedules the schedulable tasks to the processing unit, including: scheduling the tasks in the ready queue to the processing unit.
- the task scheduler schedules the tasks in the ready queue to the processing unit, including: the task scheduler monitors the number of tasks currently being executed by the processing unit, and when that number is less than the number of tasks the processing unit can execute in parallel, schedules the tasks in the ready queue to the processing unit in sequence, in the order in which they were acquired.
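The two-stage queue movement and capacity-limited dispatch described above can be sketched as follows (illustrative only; `advance_queues`, `dispatch`, and the predicate callbacks are names invented here). Tasks move from the first waiting queue to the second once sync-unblocked, then to the ready queue once dependency-unblocked, and are dispatched while the processing unit has spare parallel slots:

```python
from collections import deque

def advance_queues(wait1, wait2, ready, is_sync_blocked, is_dep_blocked):
    """One pass of the first-wait -> second-wait -> ready pipeline; blocked
    tasks stay in their queue, preserving relative acquisition order."""
    for _ in range(len(wait1)):
        t = wait1.popleft()
        (wait1 if is_sync_blocked(t) else wait2).append(t)
    for _ in range(len(wait2)):
        t = wait2.popleft()
        (wait2 if is_dep_blocked(t) else ready).append(t)

def dispatch(ready, running, max_parallel):
    """Dispatch ready tasks in acquisition order while the processing unit
    still has spare parallel slots."""
    dispatched = []
    while ready and len(running) < max_parallel:
        t = ready.popleft()
        running.add(t)
        dispatched.append(t)
    return dispatched
```

Splitting the two blocking checks across two queues means the cheap synchronization check filters tasks before the dependency check runs, and the ready queue always holds tasks that can be dispatched immediately.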
- the present application provides a task scheduler, including: a task acquirer, used to acquire N tasks and the association relationships between the N tasks, where N is a positive integer; a blocking manager, used to determine the non-dependent or released dependent tasks among the N tasks as schedulable tasks based on the association relationships among the N tasks; and a task dispatcher, which schedules the schedulable tasks to the processing unit.
- the task acquirer is specifically used to receive N tasks issued by the device driver package and the association relationship between the N tasks.
- the blocking manager is specifically used to: determine, according to the association relationships of the N tasks, the tasks that have a synchronization association with other tasks, and when the other tasks that have a synchronization association with such a task have finished executing, determine the task as a schedulable task.
- the synchronization association with other tasks means that the task has an association relationship with all other tasks that are acquired earlier or later than the task.
- the completion of execution of other tasks corresponding to the task and having synchronization association with the task includes at least one of the following contents: the task is a forward synchronization task, and all other tasks acquired earlier than the task are completed; the task is a task acquired later than the backward synchronization task, and the backward synchronization task is completed.
- the forward synchronization task is used to indicate that the forward synchronization task depends on all other tasks acquired earlier than the forward synchronization task
- the backward synchronization task is used to indicate that all other tasks acquired later than the backward synchronization task depend on the backward synchronization task.
- the blocking manager is specifically used to: determine, based on the association relationships among the N tasks, the tasks that have a dependency association with other tasks, and determine such a task as a schedulable task when the other tasks it depends on have finished executing.
- the blocking manager is specifically used to: determine the tasks among the N tasks that are not synchronously blocked by other tasks and are not blocked by other task dependencies as schedulable tasks based on the association relationship among the N tasks.
- the task being synchronously blocked by other tasks includes: the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing, or the task is acquired later than a backward synchronization task and that backward synchronization task has not finished executing; a forward synchronization task is defined as depending on all other tasks acquired earlier than it, and a backward synchronization task is defined such that all other tasks acquired later than it depend on it.
- the task being dependency-blocked by other tasks includes: the task partially depends on other tasks that have not all finished executing, or the task serially depends on other tasks that have not all finished executing.
- the blocking manager is specifically used to: traverse each task among the N tasks that has not been determined to be blocked in the order in which the N tasks are acquired, and when traversing each task that has not been determined to be blocked: if the task is a forward synchronization task, and other tasks acquired earlier than the task have not been fully executed, then the task is determined to be synchronously blocked; if the task is a backward synchronization task, then all other tasks acquired later than the task are determined to be synchronously blocked; if the task depends on other tasks and the other tasks have not been fully executed, then the task is determined to be dependently blocked; if the task depends on other tasks serially and the other tasks have not been fully executed, then the task is determined to be dependently blocked; when the task is not synchronously blocked and is not dependently blocked, the task is determined to be a schedulable task.
- the task being synchronously blocked includes the task being synchronously blocked by the entire group and/or the task being synchronously blocked by a subgroup, for
- the other tasks used to judge whether a task is synchronously blocked by the entire group are the tasks among the N tasks acquired earlier or later than the task, and the other tasks used to judge whether a task is synchronously blocked by a subgroup are the tasks in the subgroup to which the task belongs acquired earlier or later than the task.
- sub-groups are divided according to business characteristics.
- the task scheduler also includes: a first waiting queue, a second waiting queue and a ready queue.
- the task acquirer is also used to: store the N tasks in the first waiting queue.
- the blocking manager is specifically used to: for any task in the first waiting queue, determine whether the task is synchronously blocked by other tasks, if not, move the task from the first waiting queue to the second waiting queue, and, for any task in the second waiting queue, determine whether the task is blocked by other task dependencies, if not, move the task from the second waiting queue to the ready queue.
- the task dispatcher is specifically used to: schedule tasks in the ready queue to the processing unit.
- the task dispatcher is specifically used to: monitor the number of tasks currently executed by the processing unit, and when the number of tasks is less than the number of parallel tasks of the processing unit, schedule the tasks in the ready queue to the processing unit in sequence according to the order in which the tasks in the ready queue are acquired.
- the present application provides a task scheduler, comprising: an acquisition unit, used to acquire N tasks and the association relationships between the N tasks, where N is a positive integer; a determination unit, used to determine the non-dependent or released dependent tasks among the N tasks as schedulable tasks based on the association relationships among the N tasks; and a scheduling unit, used to schedule the schedulable tasks to the processing unit.
- the determination unit is specifically used to: determine, according to the association relationships of the N tasks, the tasks that have a synchronization association with other tasks, and when the other tasks that have a synchronization association with such a task have completed, determine the task as a schedulable task.
- having synchronization association with other tasks means that the task has an association relationship with all other tasks that are acquired earlier or later than the task.
- the completion of execution of other tasks corresponding to the task and having synchronization association with the task includes at least one of the following contents: the task is a forward synchronization task, and all other tasks acquired earlier than the task are completed; the task is a task acquired later than the backward synchronization task, and the backward synchronization task is completed.
- the forward synchronization task is used to indicate that the forward synchronization task depends on all other tasks acquired earlier than the forward synchronization task
- the backward synchronization task is used to indicate that all other tasks acquired later than the backward synchronization task depend on the backward synchronization task.
- the determination unit is specifically used to: determine, based on the association relationships among the N tasks, a task that has a dependency association with other tasks, and when the other tasks that the task depends on have finished executing, determine the task as a schedulable task.
- the determination unit is specifically used to: determine, according to the association relationships among the N tasks, the tasks that are neither synchronously blocked nor dependency-blocked by other tasks as schedulable tasks.
- the task being synchronously blocked by other tasks includes: the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing, or the task is acquired later than a backward synchronization task and that backward synchronization task has not finished executing; a forward synchronization task is defined as depending on all other tasks acquired earlier than it, and a backward synchronization task is defined such that all other tasks acquired later than it depend on it.
- the task being dependency-blocked by other tasks includes: the task partially depends on other tasks that have not all finished executing, or the task serially depends on other tasks that have not all finished executing.
- the determination unit is specifically used to: traverse each task among the N tasks that has not been determined to be blocked in the order in which the N tasks are acquired, and when traversing each task that has not been determined to be blocked: if the task is a forward synchronization task, and other tasks acquired earlier than the task have not been fully executed, then the task is determined to be synchronously blocked; if the task is a backward synchronization task, then all other tasks acquired later than the task are determined to be synchronously blocked; if the task depends on other tasks and the other tasks have not been fully executed, then the task is determined to be dependently blocked; if the task depends on other tasks serially and the other tasks have not been fully executed, then the task is determined to be dependently blocked; when the task is not synchronously blocked and is not dependently blocked, the task is determined to be a schedulable task.
- the task being synchronously blocked includes the task being synchronously blocked by the entire group and/or by a sub-group; the other tasks used to determine whether the task is synchronously blocked by the entire group are the tasks among the N tasks acquired earlier or later than the task, and the other tasks used to determine whether the task is synchronously blocked by a sub-group are the tasks in the sub-group to which the task belongs acquired earlier or later than the task.
- sub-groups are divided according to business characteristics.
- the acquisition unit is further used to: store the N tasks in the first waiting queue.
- the determination unit is specifically used to: for any task in the first waiting queue, determine whether the task is synchronously blocked by other tasks, if not, move the task from the first waiting queue to the second waiting queue; and, for any task in the second waiting queue, determine whether the task is blocked by other task dependencies, if not, move the task from the second waiting queue to the ready queue.
- the scheduling unit is specifically used to: schedule the tasks in the ready queue to the processing unit.
- the scheduling unit is specifically used to: monitor the number of tasks currently being executed by the processing unit, and when that number is less than the number of tasks the processing unit can execute in parallel, dispatch the tasks in the ready queue to the processing unit in sequence, in the order in which the tasks in the ready queue were acquired.
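The capacity-aware dispatch described above might look like the following sketch (the names `ready_q`, `running`, and `capacity` are illustrative assumptions):

```python
def dispatch(ready_q, running, capacity):
    """Dispatch ready tasks in acquisition order while the processing unit
    has spare parallel slots.

    ready_q  : list of tasks in the order they were acquired
    running  : list of tasks currently executing on the processing unit
    capacity : number of tasks the processing unit can execute in parallel
    Returns the tasks dispatched in this pass.
    """
    dispatched = []
    while ready_q and len(running) < capacity:
        task = ready_q.pop(0)   # oldest ready task first
        running.append(task)
        dispatched.append(task)
    return dispatched
```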
- the present application provides a chip, including a task scheduler, and the task scheduler is used to implement the method described in any one of the designs in the first aspect above.
- the present application provides a processor, including a task scheduler and a processing unit, the task scheduler is used to execute the method described in any one of the designs of the first aspect above, and the processing unit is used to execute the tasks scheduled by the task scheduler.
- the present application provides an electronic device, comprising a processor, wherein the processor is coupled to a memory, and the processor is used to execute a computer program stored in the memory so that the electronic device executes a method as described in any one of the designs in the first aspect above.
- the present application provides a task scheduling system, comprising a device driver package and a processor as described in the fourth aspect above, wherein the device driver package is used to send N tasks to the processor, where N is a positive integer; and the processor is used to process the N tasks.
- the present application provides a computer-readable storage medium storing a computer program.
- when the computer program is executed, the method described in any one of the designs in the first aspect above is implemented.
- the present application provides a computer program product, which, when executed on a processor, implements a method as described in any one of the designs of the first aspect above.
- FIG1 exemplarily shows a system architecture diagram of a task processing system provided by an embodiment of the present application
- FIG2 exemplarily shows a schematic diagram of the structure of a processor provided in an embodiment of the present application
- FIG3 exemplarily shows a flowchart of a task processing method provided by the industry
- FIG4 is a schematic diagram showing a flow chart of a game task processing provided by the industry
- FIG5 exemplarily shows a flowchart corresponding to the task scheduling method provided in Embodiment 1 of the present application
- FIG6 exemplarily shows a schematic diagram of a task layout with synchronization association provided in an embodiment of the present application
- FIG. 7 exemplarily shows a schematic diagram of a task layout with dependency associations provided in an embodiment of the present application
- FIG8 exemplarily shows a flowchart of processing a game task using the task scheduling method in the first embodiment of the present application
- FIG9 exemplarily shows a flowchart corresponding to the task scheduling method provided in Embodiment 2 of the present application.
- FIG10 exemplarily shows a task scheduling process diagram provided in an embodiment of the present application.
- FIG11 exemplarily shows another task scheduling process diagram provided in an embodiment of the present application.
- FIG12 exemplarily shows another task scheduling process diagram provided in an embodiment of the present application.
- FIG. 13 exemplarily shows a structural diagram of a task scheduler provided in an embodiment of the present application.
- the task scheduling scheme disclosed in the present application can be applied to electronic devices with task processing capabilities.
- the task scheduler can be an independent unit embedded in the electronic device, which can assign tasks to a processor core in the electronic device when that core is idle, so as to make maximum use of the spare capacity of the processor core within its processing capability, thereby improving the task processing capability of the processor core.
- the task scheduler can also be a unit encapsulated inside the electronic device, which is used to implement the task scheduling function of the electronic device.
- the electronic device can be a computer device with a processor, such as a desktop computer, a personal computer or a server.
- the electronic device can also be a portable electronic device with a processor, such as a mobile phone, a tablet computer, a wearable device with wireless communication function (such as a smart watch), a vehicle-mounted device, etc.
- portable electronic devices include, but are not limited to, devices equipped with certain operating systems, or portable electronic devices with other operating systems.
- the portable electronic device may also be a laptop computer (Laptop) with a touch-sensitive surface (eg, a touch panel).
- connection can be understood as electrical connection, and the connection between two electrical components can be a direct or indirect connection between the two electrical components.
- A and B being connected can mean either that A and B are directly connected, or that A and B are indirectly connected through one or more other electrical components; for example, if A and C are directly connected and C and B are directly connected, then A and B are connected through C.
- connection can also be understood as coupling, such as electromagnetic coupling between two inductors. In short, the connection between A and B enables the transmission of electrical energy between A and B.
- FIG1 exemplarily shows a system architecture diagram of a task processing system provided by an embodiment of the present application.
- the illustrated task processing system 10 includes an application (APP) 100, a device driver package (also called a device development kit, DDK) 200 and a processor 300, wherein the application 100 is connected to the device driver package 200, and the device driver package 200 is also connected to the processor 300.
- the device driver package 200 is also called driver software, and includes a kernel mode driver (KMD) 210 and a user mode driver (UMD) 220; the user mode driver is also called a user-mode graphics driver. Kernel mode and user mode are two different driver modes, and the device driver package 200 switches between the two driver modes according to the type of code being run.
- Generally, most drivers belong to the kernel mode driver 210 and run in kernel mode, while some drivers belong to the user mode driver 220 and run in user mode. Since the kernel mode driver 210 and the user mode driver 220 are not closely related to the solution of the present application, the embodiments of the present application do not introduce them in detail.
- FIG. 2 exemplarily shows a schematic diagram of the structure of a processor provided in an embodiment of the present application.
- the processor 300 shown in the figure may include one or more chips, for example, may include a system-on-a-chip (SoC) or a chipset formed by multiple chips.
- the processor 300 may include at least one processing unit, such as a neural-network processing unit (NPU), a graphics processing unit (GPU) and a central processing unit (CPU) as shown in FIG. 2, and may also include an application processor (AP), a modem processor, an image signal processor (ISP), a video codec, a digital signal processor (DSP), and/or a baseband processor.
- At least one processing unit is also called at least one processing subsystem; it is a core component of the processor 300 and is used to implement the processing functions of the processor 300.
- Different processing units in at least one processing unit may be dispersed and deployed on different chips, or may be integrated on one chip, without specific limitation.
- the processor 300 may also include non-core components, such as general units (including counters, decoders, and signal generators, etc.), accelerator units, input/output control units, interface units, internal memories, and external buffers, etc.
- the internal memory and the external buffer are collectively referred to as the storage unit of the processor 300, which is used to store instructions and data.
- the instructions and data can be invoked so that, when the processor is processing tasks, a task that is neither synchronously blocked nor dependency-blocked is selected and scheduled to the currently idle processor core.
- the storage unit can be a cache memory.
- the cache memory can save instructions or data that have just been used or are used in a loop. When the instruction or data needs to be used again, it can be directly called from the cache memory, thereby avoiding repeated access, reducing waiting time, and improving the processing efficiency of the task.
- each processing unit in the processor 300 may include one or more processor cores.
- the NPU includes 3 NPU cores, namely NPU core 1 to NPU core 3, the GPU includes 5 GPU cores, namely GPU core 1 to GPU core 5, and the CPU includes 5 CPU cores, namely CPU core 1 to CPU core 5.
- the processor core is also called an execution unit, which is used to execute the entire task or part of the task fragment.
- the multiple processor cores can be divided into one or more voltage domains, and the processor cores located in the same voltage domain have the same operating voltage and the same operating frequency.
- the multiple processor cores located in the same voltage domain can be multi-core heterogeneous, that is, have different structures, and are used to process different tasks or different task fragments respectively, or can be multi-core isomorphic, that is, have the same structure, and are used to jointly process the same task or the same task fragment.
- the embodiment of the present application does not specifically limit this.
- the processor 300 may also include a task scheduler 310, and the task scheduler 310 is connected to each processor core.
- the connection between the task scheduler 310 and each processor core can be realized through a bus system.
- each processor core can publish its idle message to the bus system after processing a task or task fragment.
- the task scheduler 310 can learn the processor core that is currently in an idle state by monitoring the bus system, and then, when there is a task that needs to be scheduled, the task is scheduled to the idle processor core.
- processor 300 shown in the figure is only an example, and the processor 300 may have more or fewer components than those shown in the figure.
- a component may be a combination of two or more components, or may have different component configurations, which is not specifically limited in the embodiments of the present application.
- the above-mentioned application 100 may specifically refer to a program for generating images, such as a mobile phone camera, a camera or a screen recording program.
- the above-mentioned processor 300 may specifically refer to an image processor, which may only include a GPU core, but not other cores, such as an NPU core and a CPU core.
- the tasks scheduled by the task scheduler 310 specifically refer to tasks related to image processing.
- a video is obtained by shooting one frame of image after another, and after shooting, each frame usually needs to undergo Gaussian filtering, white balance, image denoising, image enhancement, image segmentation, and image rendering; the processing operations on different frames are usually performed in shooting order to avoid long-frame and short-frame phenomena.
- the entire processing process of each frame of the image can be used as a task, or the processing operations such as Gaussian filtering, white balance, image denoising, image enhancement, image segmentation or image rendering of each frame of the image during the processing process can also be used as a task.
- Gaussian filtering, white balance, image denoising, image enhancement, and image segmentation are all based on the entire image and can be performed in parallel, that is, these operations are not dependent on each other.
- Image rendering is to render each small image after image segmentation separately, which can only be performed after image segmentation. Therefore, image rendering depends on image segmentation.
- after the application 100 obtains the images frame by frame, it creates an image queue and records the commands, and then sends the image queue to the device driver package 200.
- the device driver package 200 parses the command in the image queue, translates it into a task recognizable by the processor 300, and sends it to the task scheduler 310 in the processor 300 in the form of a task, a task chain, or a command stream.
- the task scheduler 310 monitors the working status of each GPU core in the processor 300, and when it is determined that there is an idle GPU core, it sends the task to be processed to the idle GPU core for processing.
- the task to be processed may be sent to a GPU core for separate processing, or it may be sent to multiple GPU cores for processing together.
- the multiple GPU cores may belong to the same voltage domain or to different voltage domains, which is not specifically limited.
- FIG3 exemplarily shows a flow chart of a task processing method provided by the industry. As shown in FIG3 , the method pre-sets multiple task queues in the task scheduler 310, such as task queue L 1 , task queue L 2 , ..., task queue L m , and m is a positive integer.
- each task queue corresponds to a business, such as task queue L 1 corresponds to Gaussian filtering business, task queue L 2 corresponds to white balance business, ..., task queue L m corresponds to image segmentation and image rendering business.
- the device driver package 200 distributes each task to the corresponding task queue according to the business to which each task belongs.
- the task scheduler 310 calls the idle GPU core to process the tasks in each task queue according to the principle of executing the tasks in multiple task queues in parallel and executing the tasks in one task queue in series.
- FIG4 shows a flow chart of processing a game task using the task scheduling method provided by the industry.
- Binning1 to Binning3 are stored in the same task queue, and Rendering1 to Rendering3 are stored in another task queue.
- the task scheduler calls the GPU core to execute the Binning tasks and Rendering tasks in the two task queues in parallel.
- Binning2 depends on Rendering1, and Rendering1 depends on Binning1; therefore, Binning2 cannot be executed until Rendering1 has finished, and Rendering1 cannot be executed until Binning1 has finished.
- an embodiment of the present application provides a task scheduling method, which maintains the scheduling order of each task from the hardware side through a task scheduler. After determining that a task has no dependency, or that its dependency has been released, the task can be sent in advance to an idle processing unit for processing. This makes full use of the hardware resources of the processing unit to process tasks with high concurrency, keeps the GPU cores from idling as much as possible, and effectively improves the utilization of the processing unit.
- the task scheduler can be the task scheduler 310 in FIG. 2 , or a communication device that can support the processor to implement the functions required by the method, and of course, it can also be other communication devices or communication systems, such as chips, chip systems, circuits or circuit systems, without specific limitation.
- FIG5 exemplarily shows a flow chart of a task scheduling method provided in the first embodiment of the present application, which is applicable to a task scheduler, such as the task scheduler 310 shown in FIG2 .
- the method includes:
- Step 501 The task scheduler obtains N tasks and associations among the N tasks, where N is a positive integer.
- the N tasks can be sent to the task scheduler by the device driver package, and the association relationship of the N tasks can be sent to the task scheduler by the device driver package, or obtained by the task scheduler from other channels, such as accessing the business system, without specific limitation.
- N tasks and the relationship between N tasks can be sent to the task scheduler by the device driver package through tasks, task chains or command streams, and specifically, the relationship between each task and other tasks can be explicitly written into the data structure of the task as configuration information. Therefore, after obtaining N tasks, the task scheduler can parse the data structure of each task to know what kind of relationship each task has with other tasks.
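As an illustration only, the configuration information carried in a task's data structure might be modeled as follows (all field names here are assumptions made for the sketch, not the actual format used by the device driver package):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TaskDescriptor:
    """Hypothetical per-task configuration record parsed by the scheduler."""
    task_id: int
    group_id: Optional[str] = None      # sub-group identifier, if any
    forward_sync: bool = False          # depends on all earlier-issued tasks
    backward_sync: bool = False         # all later-issued tasks depend on it
    partial_deps: List[int] = field(default_factory=list)  # must be finished
    serial_deps: List[int] = field(default_factory=list)   # must have started

def parse_task_stream(raw_tasks):
    """Turn raw dicts (as a driver package might send them) into descriptors,
    numbering tasks in the order they are acquired."""
    return [TaskDescriptor(task_id=i, **raw) for i, raw in enumerate(raw_tasks)]
```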
- association relationship of the N tasks may include a synchronization association and/or a dependency association.
- Synchronous association is used to indicate the association relationship between a task and all other tasks issued earlier or later than the task.
- the synchronous association may include forward synchronous association and backward synchronous association.
- Forward synchronous association means that a task depends on all other tasks issued earlier than the task, that is, the task can only be executed after all other tasks issued earlier than the task are completed.
- a task with forward synchronous association with other tasks is also called a forward synchronous task. As long as there is another task issued earlier than the forward synchronous task that has not been completed, the forward synchronous task will be synchronously blocked by the other task.
- backward synchronous association means that all other tasks issued later than a task depend on the task, that is, only after the task is completed can each other task issued later than the task be executed.
- a task with backward synchronous association with other tasks is also called a backward synchronous task.
- as long as the backward synchronous task has not been completed, all other tasks issued later than the backward synchronous task will be synchronously blocked by it.
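The forward and backward synchronization semantics above can be expressed as two small predicates (a sketch under the assumption that each task records `forward_sync`, `backward_sync`, and `finished` flags):

```python
def forward_sync_blocked(index, tasks):
    """A forward synchronization task is blocked while any task issued
    earlier than it has not finished."""
    return bool(tasks[index].get("forward_sync")) and \
        any(not t["finished"] for t in tasks[:index])

def backward_sync_blocked(index, tasks):
    """A task is blocked while any backward synchronization task issued
    earlier than it has not finished."""
    return any(t.get("backward_sync") and not t["finished"]
               for t in tasks[:index])
```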
- N tasks are located in an entire group, and at least two of the N tasks may also be located in at least one sub-group. Specifically, when there is only an entire group but no sub-group in the task scheduler, all N tasks are located only in the entire group; when there are both an entire group and sub-groups in the task scheduler, each task may be located only in the entire group, or may be located in the entire group and one or more sub-groups at the same time.
- other tasks issued earlier than the forward synchronization task may specifically refer to: when a task is marked as a forward synchronization task in the entire group, that is, other tasks in the entire group that are issued earlier than the forward synchronization task; when a task is marked as a forward synchronization task in a sub-group, that is, other tasks in the sub-group that are issued earlier than the forward synchronization task.
- other tasks issued later than the backward synchronization task may specifically refer to: when a task is marked as a backward synchronization task in the entire group, that is, other tasks in the entire group that are issued later than the backward synchronization task; when a task is marked as a backward synchronization task in a sub-group, that is, other tasks in the sub-group that are issued later than the backward synchronization task.
- the grouping of the N tasks can be performed by the device driver package and carried in the task data structure to notify the task scheduler, or can be performed by the task scheduler itself.
- the grouping basis can be, for example, business characteristics or synchronization associations. Take the grouping of tasks by the device driver package as an example:
- the device driver package determines the tasks belonging to the same service according to the service characteristics of N tasks, then marks the same group identifier in the data structure of these tasks, and sends it to the task scheduler.
- the task scheduler parses the data structure of the N tasks, obtains the tasks with the same group identifier, creates corresponding virtual sub-groups, and stores these tasks in the virtual sub-groups.
- the group identifier can be a service name, service code, group number or other mark that can represent the same service, and is not specifically limited.
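Creating virtual sub-groups from group identifiers might be sketched as follows (the `group_id` key is an assumed field name standing in for whatever mark the data structure actually carries):

```python
from collections import defaultdict

def build_subgroups(tasks):
    """Create virtual sub-groups from the group identifiers carried in the
    task data structures.

    Tasks without a group identifier remain only in the entire group.
    """
    subgroups = defaultdict(list)
    for task in tasks:
        gid = task.get("group_id")
        if gid is not None:
            subgroups[gid].append(task)
    return dict(subgroups)
```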
- the device driver package determines the tasks with relatively concentrated associations based on the associations among the N tasks, marks the same group identifier in the data structures of these tasks, and sends them to the task scheduler, which creates a virtual sub-group according to the method in the previous example. For example, if it is determined that a certain task depends on at least three tasks, the device driver package can determine the task and the at least three tasks it depends on as a sub-group, mark the same group identifier in the data structures of the at least four tasks, and also mark the task as a forward synchronization task in its data structure.
- similarly, if it is determined that at least three tasks depend on a certain task, the device driver package can determine the task and the at least three tasks that depend on it as a sub-group, mark the same group identifier in the data structures of the at least four tasks, and also mark the task as a backward synchronization task in its data structure.
- grouping by the task scheduler please refer to the relevant content of grouping by the above-mentioned device driver package, which will not be repeated here.
- the device driver package or task scheduler can also group tasks according to other characteristics, such as characteristics indicated by the user, and the embodiments of the present application do not make specific limitations on this.
- Dependency associations include partial dependency associations and serial dependency associations.
- a partial dependency association refers to a task that depends on the execution results of one or more other tasks; the task can only be executed after all of the one or more other tasks it depends on have been completed. It should be noted that, since the synchronous association already defines a task's dependency relationship with all other tasks issued earlier or later than it, the partial dependency association, built on top of the synchronous association, only defines a task's dependence on some of the other tasks rather than all of them; this dependency relationship is therefore called a partial dependency association.
- the serial dependency association refers to a task that depends on the execution of one or more other tasks, and the task can only be executed after all the other tasks it depends on have started executing. Moreover, since the serial dependency association does not require the other tasks it depends on to be completed, the tasks that a serial dependency association depends on can be all other tasks or only some other tasks.
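The difference between the two dependency forms can be captured in a single predicate (a sketch; the keys `partial_deps` and `serial_deps` and the status strings are assumptions):

```python
def dependency_blocked(task, state):
    """Check both dependency forms described above.

    state maps a task id to its status: "pending", "running", or "done".
    A partial dependency blocks until the depended-on task is done;
    a serial dependency blocks only until it has started running.
    """
    for dep in task.get("partial_deps", []):
        if state[dep] != "done":
            return True
    for dep in task.get("serial_deps", []):
        if state[dep] == "pending":   # not yet started
            return True
    return False
```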
- when the synchronous association method is adopted, only the forward synchronization task or the backward synchronization task needs to be marked: marking a forward synchronization task implicitly defines its dependence on all other tasks issued earlier than it, and marking a backward synchronization task implicitly defines the dependence of all other tasks issued later than it on that task. To express the same relationships through dependency associations alone, every task that the task depends on would have to be listed explicitly. In other words, compared with the dependency association, the synchronous association can define the dependency relationships among multiple tasks with more streamlined content.
- the above-mentioned dependency association can be identified and marked after the synchronous association has been determined, that is, the synchronous association of the task is first determined, and after the synchronous association determination is completed, the dependent association is marked for the remaining tasks with dependencies.
- all the association relationships can be included in the data structure of the N tasks, and there is no need to describe the association for each task, which helps to simplify the data structure.
- Step 502 The task scheduler determines, according to the association relationship among the N tasks, the tasks that have no dependencies or have been released from dependencies among the N tasks as schedulable tasks.
- the task scheduler can determine, according to the association relationship of the N tasks, a task that has a synchronization association with other tasks, and when the other tasks that have a synchronization association with that task are completed, the task is determined to be a schedulable task.
- the completion of the other tasks that have a synchronization association with the task may exemplarily include at least one of the following: the task is a forward synchronization task and all other tasks acquired earlier than the task have been completed; or the task was acquired later than a backward synchronization task and that backward synchronization task has been completed. And/or,
- the task scheduler can determine, based on the association relationship among the N tasks, a task that has a dependency association with other tasks, and when the other tasks that the task depends on have all started executing (for a serial dependency association) or have all been completed (for a partial dependency association), the task is determined to be a schedulable task.
- the task can be scheduled to the processing unit in a timely manner when other tasks with synchronization association and/or dependency association no longer block the task.
- the task scheduler can determine the tasks among the N tasks that are not blocked by other tasks synchronously and not blocked by other tasks dependencies as schedulable tasks based on the association relationship among the N tasks.
- a task not being synchronously blocked by other tasks means that the task is no longer blocked by tasks that are synchronously associated with it, that is, the other tasks that have a synchronization association with the task have been completed.
- a task not being blocked by dependencies on other tasks means that the task is no longer blocked by tasks that are dependently associated with it, that is, the other tasks that have a dependency association with the task have been completed (or, for a serial dependency, have started executing).
- a task is synchronously blocked by other tasks means that the task meets at least one of the following conditions:
- Condition 1: the task is a forward synchronization task in the entire group, and other tasks in the entire group that were issued earlier than this task have not all been executed.
- Condition 2: the task is a forward synchronization task in a subgroup, and other tasks in the subgroup that were issued earlier than this task have not all been executed.
- Condition 3: the task is issued later than a backward synchronization task in the entire group, and the backward synchronization task has not been completed.
- Condition 4: the task is issued later than a backward synchronization task in its subgroup, and the backward synchronization task has not been completed.
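The four conditions can be modeled together as one check (an illustrative sketch in which each synchronization mark explicitly records whether its scope is the entire group or a sub-group; all names are assumptions):

```python
def sync_blocked(task_id, order, finished, marks, subgroup):
    """Evaluate the four synchronization-blocking conditions.

    order    : task ids in issue order (the entire group)
    finished : set of finished task ids
    marks    : task id -> (kind, scope); kind is "forward" or "backward",
               scope is "whole" (entire group) or "sub" (the task's sub-group)
    subgroup : task id -> sub-group id, or None if the task is in no sub-group
    """
    idx = order.index(task_id)

    def in_scope(scope, other):
        # Whether `other` counts as a peer of task_id under a mark's scope.
        if scope == "whole":
            return True
        return subgroup.get(other) is not None and \
               subgroup.get(other) == subgroup.get(task_id)

    # Conditions 1 and 2: this task is a forward sync task and some earlier
    # peer (entire group, or its sub-group) has not finished.
    if task_id in marks and marks[task_id][0] == "forward":
        scope = marks[task_id][1]
        if any(in_scope(scope, t) and t not in finished for t in order[:idx]):
            return True
    # Conditions 3 and 4: an earlier peer is an unfinished backward sync task
    # whose scope covers this task.
    for t in order[:idx]:
        if t in marks and marks[t][0] == "backward" and t not in finished:
            if in_scope(marks[t][1], t):
                return True
    return False
```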
- FIG6 exemplarily shows a schematic diagram of a task layout with synchronization association provided by an embodiment of the present application.
- there are seven tasks namely, task 0 to task 6, and these seven tasks are sent to the task scheduler in the order of task 0, task 1, task 2, task 3, task 4, task 5 and task 6.
- tasks 0 to 6 are not grouped, that is, tasks 0 to 6 only exist in the whole group.
- in the task queue shown in Figure 6 (A), task 3 is marked as a backward synchronization task, that is, tasks 4, 5, and 6, which are issued later than task 3, can only be executed after task 3 is completed. Therefore, as long as task 3 is not completed, tasks 4, 5, and 6 meet the above condition 3, that is, tasks 4, 5, and 6 are synchronously blocked.
- Figure 6 (B) also does not group tasks 0 to 6, that is, tasks 0 to 6 only exist in the entire group. Moreover, in the task queue shown in Figure 6 (B), task 3 is marked as a forward synchronization task, that is, task 3 can only be executed after tasks 0, 1, and 2 that were issued earlier than task 3 are all completed. Therefore, as long as at least one of task 0, task 1, or task 2 is not completed, task 3 meets the above condition 1, so task 3 is synchronously blocked.
- Figure 6 (C) also does not group Tasks 0 to 6, that is, Tasks 0 to 6 only exist in the whole group. Moreover, in the task queue shown in Figure 6 (C), Task 3 is marked as both a forward synchronization task and a backward synchronization task: Task 3 can only be executed after Tasks 0, 1, and 2, which were issued earlier than Task 3, are all executed, and Tasks 4, 5, and 6, which were issued later than Task 3, can only be executed after Task 3 is executed. Therefore, as long as at least one of Tasks 0, 1, or 2 is not executed, Task 3 meets the above condition 1, and Tasks 4, 5, and 6 meet the above condition 3, so Tasks 3, 4, 5, and 6 will be synchronously blocked.
- Figure 6 (D) divides Task 3, Task 5 and Task 6 into the same subgroup, so Task 3, Task 5 and Task 6 exist in both the whole group and the subgroup, while Task 0, Task 1, Task 2 and Task 4 exist in the whole group but not in the subgroup.
- Task 3 is marked as a backward synchronization task in the subgroup, that is, Task 5 and Task 6 in the subgroup that are issued later than Task 3 need to be executed after Task 3 is completed. Therefore, as long as Task 3 is not completed, Task 5 and Task 6 meet the above condition 4, that is, Task 5 and Task 6 are synchronously blocked.
- in Figure 6 (E), Task 0, Task 1 and Task 3 are divided into the same subgroup. Therefore, Task 0, Task 1 and Task 3 exist in both the whole group and the subgroup, while Task 2, Task 4, Task 5 and Task 6 exist in the whole group but not in the subgroup.
- Task 3 is marked as a forward synchronization task in the subgroup, which means that Task 3 can only be executed after Task 0 and Task 1, which are issued earlier than Task 3, are completed in the subgroup. Therefore, as long as there is at least one task in Task 0 or Task 1 that has not been completed, Task 3 meets the above condition 2, so Task 3 is synchronously blocked.
- in Figure 6 (F), Task 0, Task 1, Task 3, Task 5 and Task 6 are divided into the same subgroup. Therefore, Task 0, Task 1, Task 3, Task 5 and Task 6 exist in both the whole group and the subgroup, while Task 2 and Task 4 exist in the whole group but not in the subgroup.
- Task 3 is marked as both the forward synchronization task and the backward synchronization task in the subgroup, that is, Task 3 can only be executed after Task 0 and Task 1, which are issued earlier than Task 3 in the subgroup, are executed, and Task 5 and Task 6, which are issued later than Task 3 in the subgroup, can only be executed after Task 3 is executed. Therefore, as long as at least one of Task 0 or Task 1 has not been executed, Task 3 meets the above condition 2, and Task 5 and Task 6 meet the above condition 4, so Task 3, Task 5 and Task 6 will be synchronously blocked.
- the "task exists in the whole group but not in the subgroup" described above may mean that the task exists only in the whole group but not in any subgroup, or that the task exists in the whole group and other subgroups, without specific limitation.
- a task is usually placed in at most one subgroup, and will not be placed in two or more subgroups at the same time, that is, when there are multiple subgroups, the tasks in the multiple subgroups are different.
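The four synchronous-blocking conditions illustrated above can be sketched as a single check applied per group (whole group or subgroup). This is an illustrative sketch only; the data model (task ids in issue order, sets of forward/backward synchronization tasks, a set of completed tasks) is assumed here and is not prescribed by the present application.

```python
def is_sync_blocked(task, group, fwd, bwd, done):
    """Return True if `task` is synchronously blocked within `group`.

    group: task ids in the order they were issued (whole group or a subgroup)
    fwd:   tasks marked as forward synchronization tasks in this group
    bwd:   tasks marked as backward synchronization tasks in this group
    done:  tasks that have finished executing
    """
    earlier = group[:group.index(task)]
    # Conditions 1/2: a forward sync task waits for all earlier tasks.
    if task in fwd and any(t not in done for t in earlier):
        return True
    # Conditions 3/4: a task issued later than an unfinished backward
    # sync task must wait for that task to complete.
    if any(t in bwd and t not in done for t in earlier):
        return True
    return False

# Figure 6 (C): whole group 0..6, Task 3 is both forward and backward sync.
group, fwd, bwd = list(range(7)), {3}, {3}
assert is_sync_blocked(3, group, fwd, bwd, done={0, 1})        # condition 1
assert is_sync_blocked(5, group, fwd, bwd, done={0, 1, 2})     # condition 3
assert not is_sync_blocked(3, group, fwd, bwd, done={0, 1, 2})
```

Applying the same check with a subgroup's member list and its own forward/backward markers yields conditions 2 and 4.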
- A task being blocked by the dependencies of other tasks means that the task satisfies at least one of the following conditions:
- Condition 1 The task has a partial dependency association with other tasks, and the other tasks on which it depends have not all been fully executed;
- Condition 2 The task has a serial dependency association with other tasks, and the other tasks on which it depends have not all started to execute.
- FIG7 exemplarily shows a task layout diagram with dependency association provided by an embodiment of the present application.
- there are eight tasks, namely Task 0, Task 1, Task 2, Task 3, Task 4, Task 5, Task 6 and Task 7, and Task 0 to Task 7 are not grouped, that is, Task 0 to Task 7 only exist in the whole group.
- Assuming that Task 0, Task 1, Task 3 and Task 4 have the above-mentioned partial dependency association with Task 6: if Task 6 has not started to execute, or Task 6 has started but has not yet been completed, then Task 0, Task 1, Task 3 and Task 4 will be blocked by the Task 6 dependency.
- Assuming instead that Task 0, Task 1, Task 3 and Task 4 have the above-mentioned serial dependency association with Task 6: if Task 6 has not started executing, then Task 0, Task 1, Task 3 and Task 4 will also be blocked by the Task 6 dependency; as long as Task 6 has started executing, regardless of whether it has been completed, Task 0, Task 1, Task 3 and Task 4 will not be blocked by the Task 6 dependency.
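The distinction between the two dependency conditions can be sketched as follows. This is an illustrative sketch: the dependency maps and the `started`/`done` sets are assumed data structures, not the application's.

```python
def is_dep_blocked(task, partial_deps, serial_deps, started, done):
    """Return True if `task` is blocked by the dependencies of other tasks.

    partial_deps / serial_deps: map each task to the tasks it depends on
    started / done: tasks that have started / finished executing
    """
    # Partial dependency association: blocked until every dependee has
    # finished executing (condition 1).
    if any(d not in done for d in partial_deps.get(task, ())):
        return True
    # Serial dependency association: blocked only until every dependee
    # has *started* executing (condition 2).
    if any(d not in started for d in serial_deps.get(task, ())):
        return True
    return False

# FIG. 7 example: Task 0 depends on Task 6, which has started but not finished.
assert is_dep_blocked(0, {0: [6]}, {}, started={6}, done=set())      # partial: still blocked
assert not is_dep_blocked(0, {}, {0: [6]}, started={6}, done=set())  # serial: released
```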
- the task scheduler can determine the schedulable tasks among the N tasks, that is, the tasks that are neither synchronously blocked by other tasks nor blocked by the dependencies of other tasks, in the following manner:
- After the task scheduler obtains the N tasks, it traverses each of the N tasks that has not been determined to be blocked, in the order in which the N tasks are issued, and when traversing each such task: if the task is a forward synchronization task in the whole group, and the other tasks in the whole group that were issued earlier than the task have not all been executed, it is determined that the task is synchronously blocked by those earlier tasks; if the task is a forward synchronization task in a subgroup, and the other tasks in the subgroup that were issued earlier than the task have not all been executed, it is determined that the task is synchronously blocked by those earlier tasks; if the task is a backward synchronization task in the whole group, it is determined that all tasks in the whole group issued later than the task are synchronously blocked by it; and if the task is a backward synchronization task in a subgroup, it is determined that all tasks in that subgroup issued later than the task are synchronously blocked by it.
- the above two judgment operations of judging whether a task is blocked synchronously and judging whether a task is blocked by a dependency can be executed serially or in parallel. Moreover, as long as one of the judgments is determined to be blocked, the other judgment can be terminated immediately without continuing to execute, thus avoiding unnecessary calculation processes, effectively saving calculation resources, and further improving the efficiency of task scheduling.
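The early-termination behavior described above can be illustrated, in a serial implementation, by short-circuit evaluation: once the first judgment determines the task is blocked, the second judgment is never evaluated. The two judgment functions below are stand-ins, not the application's actual logic.

```python
calls = []

def sync_blocked(task):
    calls.append("sync")
    return True   # pretend the task is synchronously blocked

def dep_blocked(task):
    calls.append("dep")
    return True   # would also report blocked, but is never reached

# `or` short-circuits: the dependency judgment is skipped entirely,
# which is the saved computation the passage describes.
blocked = sync_blocked("task3") or dep_blocked("task3")
assert blocked and calls == ["sync"]
```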
- Step 503 The task scheduler schedules the schedulable tasks to the processing units.
- the task scheduler can monitor the number of tasks currently executed by the GPU core in real time, and when it is determined that the number of tasks is less than the number of parallel tasks of the GPU core, the schedulable tasks are scheduled to the GPU core.
- the task scheduler can schedule multiple schedulable tasks to the GPU core in sequence, according to the order in which the tasks were received from the device driver package, so as to ensure that each task is executed in the order of image processing, prevent a later image frame from being played before an earlier frame, and effectively avoid the long-and-short-frame phenomenon.
- the scheduling order of N tasks is maintained from the hardware side by the task scheduler, rather than being specified by the device driver package from the software side.
- tasks that have no dependencies or have been freed from dependencies can be sent to the processing unit for processing in advance according to the actual execution status of the tasks, thereby effectively improving the utilization rate of the processing unit.
- the hardware side there is no need for the hardware side to send a notification message to the software side after each task is executed to instruct the software side to send a new task. This can greatly reduce the work pressure on the software side, effectively improve the efficiency of task scheduling, and save communication overhead.
- FIG. 8 exemplarily shows a flow chart of processing the game task using the task scheduling method in the first embodiment above, wherein FIG. 8 (A) shows the order and dependency of the six tasks Binning1 to Binning3 and Rendering1 to Rendering3. It can be seen that the six tasks are sent to the task scheduler by the device driver package in the order of Binning1, Rendering1, Binning2, Rendering2, Binning3, and Rendering3.
- the task scheduler stores the six tasks in a whole group without grouping them.
- FIG. 8 (B) shows a possible situation of processing tasks according to the task scheduling method in the first embodiment above. Referring to FIG. 8 (B), the task scheduler can schedule each task according to the following steps:
- Step 1 The task scheduler determines whether Binning1, Rendering1, Binning2, Rendering2, Binning3, and Rendering3 are blocked synchronously or dependently, in the order in which the tasks are issued:
- Binning1 is analyzed to determine that Binning1 does not belong to the forward synchronization task and there is no backward synchronization task in the entire group. Therefore, Binning1 is not blocked by synchronization, and Binning1 does not depend on other tasks. Therefore, Binning1 is not blocked by dependencies. Therefore, Binning1 is determined to be a task without dependencies, and the task scheduler schedules Binning1 to the GPU core.
- Rendering1 is analyzed to determine that Rendering1 does not belong to the forward synchronization task and there is no backward synchronization task in the entire group, so Rendering1 is not blocked by synchronization, but Rendering1 depends on Binning1. Therefore, before Binning1 is executed, Rendering1 is blocked by the Binning1 dependency, and the task scheduler does not schedule Rendering1.
- Binning2 is analyzed to determine that Binning2 does not belong to the forward synchronization task and there is no backward synchronization task in the entire group, so Binning2 is not blocked by synchronization, but Binning2 depends on Rendering1. Therefore, before Rendering1 is executed, Binning2 is blocked by the Rendering1 dependency, and the task scheduler does not schedule Binning2.
- Rendering2 is analyzed to determine that Rendering2 does not belong to the forward synchronization task and there is no backward synchronization task in the entire group, so Rendering2 is not blocked by synchronization, and Rendering2 depends on Binning2. Therefore, before Binning2 is executed, Rendering2 is blocked by Binning2 dependency, so the task scheduler does not schedule Rendering2.
- Binning3 is analyzed and it is determined that Binning3 does not belong to the forward synchronization task and there is no backward synchronization task in the whole group. Therefore, Binning3 is not blocked by synchronization, and Binning3 does not depend on other tasks. Therefore, Binning3 is not blocked by dependencies. Therefore, Binning3 is determined to be a task without dependencies, and the task scheduler schedules Binning3 to the GPU core.
- Rendering3 is analyzed to determine that Rendering3 does not belong to the forward synchronization task and there is no backward synchronization task in the entire group, so Rendering3 is not blocked by synchronization, and Rendering3 depends on Binning3. Therefore, before Binning3 is executed, Rendering3 is blocked by Binning3 dependency, so the task scheduler does not schedule Rendering3.
- the task scheduler first schedules Binning1 and Binning3 to the GPU core for processing.
- Step 2 Assuming Binning1 is completed first, Rendering1, which depends on Binning1, is no longer dependent on it. Therefore, the task scheduler can schedule Rendering1 to the GPU core for processing. At this time, Binning3 and Rendering1 will be processed in parallel in the GPU core.
- Step 3 Assuming Binning3 is completed first, Rendering3, which depends on Binning3, releases the dependency. Therefore, the task scheduler can schedule Rendering3 to the GPU core for processing. At this time, Rendering1 and Rendering3 will be processed in parallel in the GPU core.
- Step 4 Assuming that Rendering1 is completed first, Binning2, which depends on Rendering1, is no longer dependent on it. Therefore, the task scheduler can schedule Binning2 to the GPU core for processing. At this time, Rendering3 and Binning2 will be processed in parallel in the GPU core.
- Step 5 Assuming Binning2 is completed first, Rendering2, which depends on Binning2, releases the dependency. Therefore, the task scheduler can schedule Rendering2 to the GPU core for processing. At this time, Rendering3 and Rendering2 will be processed in parallel in the GPU core.
- Step 6 After Rendering3 and Rendering2 are processed, all six tasks are completed.
- the GPU core can process two tasks in parallel without interruption, and the GPU core is basically not idle, so that the utilization rate of the GPU core is greatly improved, and the performance of task scheduling is also greatly improved.
- (B) in FIG. 8 above only shows a possible scheduling method, and there may be other scheduling methods in actual situations.
- For example, in step 2 or step 3 above, if Rendering1 is instead completed first, Binning2, which depends on Rendering1, is released from its dependency, so the task scheduler can schedule Binning2 to the GPU core for processing.
- In the case of step 2 above, Binning3 and Binning2 will then be processed in parallel in the GPU core; in the case of step 3, Rendering3 and Binning2 will be processed in parallel in the GPU core.
- For another example, in step 4 or step 5 above, assuming that Rendering3 is completed first, none of the unprocessed tasks at this time has been released from its dependencies, so the task scheduler does not schedule a new task, but waits for Rendering1 or Binning2 to be completed and then schedules the released Binning2 or Rendering2 to the GPU core.
- Although the GPU core is idle during this period, compared with the three gaps shown in FIG. 4, this method can still greatly improve the utilization rate of the GPU core. It should be understood that there are many possible scheduling methods, which are not listed one by one in the embodiments of the present application.
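The walkthrough above can be reproduced with a toy discrete-time simulation, assuming unit-length tasks and a GPU core that processes two tasks in parallel. Both assumptions are for illustration only; real task durations determine which of the many valid schedules actually occurs.

```python
def simulate(issue_order, deps, width=2):
    """Greedy schedule: at each step, fill free slots with tasks whose
    dependencies are complete, in issue order; every task runs one step."""
    pending, done, timeline = list(issue_order), set(), []
    while pending:
        running = []
        for task in list(pending):
            if len(running) == width:
                break
            if deps.get(task, set()) <= done:   # all dependencies finished
                pending.remove(task)
                running.append(task)
        timeline.append(tuple(running))
        done.update(running)
    return timeline

order = ["Binning1", "Rendering1", "Binning2", "Rendering2", "Binning3", "Rendering3"]
deps = {"Rendering1": {"Binning1"}, "Binning2": {"Rendering1"},
        "Rendering2": {"Binning2"}, "Rendering3": {"Binning3"}}

timeline = simulate(order, deps)
assert timeline[0] == ("Binning1", "Binning3")   # independent tasks run first
assert len(timeline) == 4                        # four steps instead of six serial ones
```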
- FIG9 exemplarily shows a flow chart of a task scheduling method provided in the second embodiment of the present application, which is applicable to a task scheduler, such as the task scheduler 310 shown in FIG2 .
- the method includes:
- Step 901 The task scheduler obtains N tasks and associations among the N tasks.
- Step 902 the task scheduler selects the tasks that have not been determined to be synchronously blocked from the N tasks, and determines the earliest acquired task among the tasks that have not been determined to be synchronously blocked as the target task according to the acquisition order of the tasks that have not been determined to be synchronously blocked.
- tasks that have not yet been determined to be synchronously blocked refer to tasks that have not yet been analyzed whether they are synchronously blocked
- tasks that have been determined to be synchronously blocked include tasks that have been analyzed and determined to be synchronously blocked, as well as tasks that have not yet been analyzed whether they are synchronously blocked but have been determined to be synchronously blocked in the analysis of other tasks.
- Step 903 the task scheduler determines other tasks that are synchronously associated with the target task based on the association relationship of the N tasks, and determines whether the target task is synchronously blocked by the other tasks based on the execution status of the other tasks. If not, step 904 is executed; if yes, return to step 902.
- the task scheduler may first perform the following judgments 1 to 4:
- Judgment 1 when judging that the target task is a forward synchronization task in the entire group according to the configuration information of the target task, it is determined that other tasks having a synchronization association relationship with the target task are other tasks in the entire group that are issued earlier than the target task, and further, if the other tasks in the entire group that are issued earlier than the target task are not all executed, it is determined that the target task is synchronously blocked by other tasks in the entire group that are issued earlier than the target task;
- Judgment 2 when judging that the target task is a forward synchronization task in a certain subgroup according to the configuration information of the target task, it is determined that other tasks having a synchronization association relationship with the target task are other tasks in the subgroup that are issued earlier than the target task, and further, if the other tasks in the subgroup that are issued earlier than the target task are not all executed, it is determined that the target task is synchronously blocked by other tasks in the subgroup that are issued earlier than the target task;
- Judgment 3 when it is determined that the target task is a task in the entire group that is issued later than a certain backward synchronization task according to the configuration information of the target task, it is determined that the other tasks having a synchronization association relationship with the target task are the backward synchronization task in the entire group, and further, if the backward synchronization task in the entire group has not been executed to completion, it is determined that the target task is synchronously blocked by the backward synchronization task in the entire group;
- Judgment 4 when it is determined that the target task is a task in a subgroup that is issued later than a backward synchronization task based on the configuration information of the target task, it is determined that the other tasks that have a synchronization association relationship with the target task are the backward synchronization task in the subgroup. In this case, if the backward synchronization task in the subgroup has not been executed to completion, it is determined that the target task is synchronously blocked by the backward synchronization task in the subgroup.
- the task scheduler may also perform the following Judgment 5 and Judgment 6:
- Judgment 5 when judging that the target task is a backward synchronization task in the entire group according to the configuration information of the target task, it is determined that all other tasks in the entire group that are issued later than the backward synchronization task are synchronously blocked by the target task;
- Judgment 6 when judging, based on the configuration information of the target task, that the target task is a backward synchronization task in a certain subgroup, it is determined that all other tasks in the subgroup that are issued later than the backward synchronization task are synchronously blocked by the target task.
- the task scheduler can determine whether the target task is synchronously blocked by other tasks, and through the above judgments 5 and 6, the task scheduler can determine whether other tasks that have not been analyzed are synchronously blocked by the target task. It can be seen that this method can determine other tasks that are synchronously blocked by the target task while analyzing whether a target task is synchronously blocked, so that there is no need to perform meaningless analysis on these tasks that have been determined to be synchronously blocked in the future, which helps save computing resources and improves the efficiency of synchronous blocking judgment.
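Judgments 1 to 6 can be combined into one analysis step per target task: Judgments 1 to 4 decide whether the target itself is blocked, while Judgments 5 and 6 mark later tasks as blocked as a side effect. Below is a hedged sketch, with an assumed group model (member list in issue order plus forward/backward sync sets per group); the application does not prescribe these structures.

```python
def analyze_sync(target, groups, done, known_blocked):
    """Judgments 1-4: is `target` synchronously blocked?
    Judgments 5-6: if `target` is a backward sync task, record the later
    tasks of its group(s) in `known_blocked` so they need no analysis."""
    blocked = False
    for members, fwd, bwd in groups:          # whole group and subgroups
        if target not in members:
            continue
        i = members.index(target)
        earlier, later = members[:i], members[i + 1:]
        if target in fwd and any(t not in done for t in earlier):
            blocked = True                    # Judgments 1 and 2
        if any(t in bwd and t not in done for t in earlier):
            blocked = True                    # Judgments 3 and 4
        if target in bwd and target not in done:
            known_blocked.update(later)       # Judgments 5 and 6
    return blocked

# FIG. 10 setup: whole group 0..5, task 3 both forward and backward sync.
groups = [(list(range(6)), {3}, {3})]
known = set()
assert analyze_sync(3, groups, done={0}, known_blocked=known)  # tasks 1, 2 unfinished
assert known == {4, 5}    # tasks 4 and 5 need no separate analysis
```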
- Step 904 the task scheduler determines other tasks that have dependency associations with the target task based on the association relationship among the N tasks, and determines whether the target task is blocked by other task dependencies based on the execution status of other tasks. If not, execute step 905; if yes, execute step 902.
- the task scheduler may first perform the following judgments 1 and 2:
- Judgment 1 when judging that the target task has a partial dependency association with other tasks according to the configuration information of the target task, if the other dependent tasks have not all been fully executed, it is determined that the target task is blocked by the partial dependency of the other tasks;
- Judgment 2 when judging that the target task has a serial dependency association with other tasks according to the configuration information of the target task, if the other dependent tasks have not all started to execute, it is determined that the target task is blocked by the serial dependency of the other tasks.
- Step 905 The task scheduler determines the target task as a schedulable task.
- Step 906 The task scheduler schedules the schedulable tasks to the processing units in sequence according to the order in which the schedulable tasks are acquired.
- first and second embodiments introduce possible implementations of the task scheduling method from the perspective of software.
- the following further introduces possible implementations of the task scheduling method from the perspective of hardware based on the third embodiment.
- the task scheduler 310 may include a first waiting queue, a second waiting queue and a ready queue.
- the first waiting queue is used to store tasks that have not been analyzed for being blocked synchronously, and tasks that have been determined to be blocked synchronously.
- the tasks that have been determined to be blocked synchronously include: tasks that have been analyzed for being blocked synchronously and have been determined to be blocked synchronously, and tasks that have not been analyzed for being blocked synchronously but have been determined to be blocked synchronously in the analysis of other tasks.
- the second waiting queue is used to store tasks that have been determined not to be blocked synchronously and have not been analyzed for being blocked by dependencies, and tasks that have been determined to be blocked by dependencies.
- the tasks that have been determined to be blocked by dependencies include: tasks that have been analyzed for being blocked by dependencies and have been determined to be blocked by dependencies, and tasks that have not been analyzed for being blocked by dependencies but have been determined to be blocked by dependencies in the analysis of other tasks.
- the ready queue is used to store tasks that have been determined not to be blocked synchronously and have been determined not to be blocked by dependencies, that is, schedulable tasks.
- the tasks in the first waiting queue, the second waiting queue and the ready queue can be processed in parallel by different threads.
- the task scheduler 310 can store the task in the first waiting queue in sequence according to the order in which the task is issued.
- the task scheduler 310 traverses each task in the first waiting queue that has not been determined to be synchronously blocked according to the order in which the task is stored in the first waiting queue.
- When traversing each task, the method in embodiment one or embodiment two above is used to determine whether the task is synchronously blocked by other tasks. If so, the next task that has not been determined to be synchronously blocked is traversed; if not, the task is moved from the first waiting queue to the second waiting queue.
- the task scheduler 310 traverses each task in the second waiting queue that has not been determined to be dependently blocked according to the order in which the task is stored in the second waiting queue. When traversing each task, it is determined whether the task is dependently blocked by other tasks according to the above-mentioned method. If so, the next task that has not been determined to be dependently blocked is traversed, and if not, the task is moved from the second waiting queue to the ready queue. In yet another thread, the task scheduler 310 schedules the tasks in the ready queue to the available GPU cores in sequence according to the task processing status of the GPU core and the order in which the tasks are issued.
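One polling round over the two waiting queues can be sketched as follows. This is a single-threaded illustration of the queue movements only; the application runs these stages on separate threads, and the two blocking predicates here are caller-supplied stand-ins.

```python
from collections import deque

def polling_round(wait1, wait2, ready, sync_blocked, dep_blocked):
    """Move tasks that are not synchronously blocked from the first waiting
    queue to the second, then tasks that are not dependency-blocked from the
    second to the ready queue. Blocked tasks stay put, preserving issue order."""
    for queue, nxt, blocked in ((wait1, wait2, sync_blocked),
                                (wait2, ready, dep_blocked)):
        for task in list(queue):
            if not blocked(task):
                queue.remove(task)
                nxt.append(task)

# FIG. 10-like state: task 3 (forward sync) and the later tasks 4, 5 blocked.
wait1, wait2, ready = deque(range(6)), deque(), deque()
polling_round(wait1, wait2, ready,
              sync_blocked=lambda t: t >= 3,   # stand-in predicate
              dep_blocked=lambda t: False)
assert list(wait1) == [3, 4, 5] and list(ready) == [0, 1, 2]
```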
- the task scheduler 310 can integrate all functions on an independent physical device, or can disperse and deploy each function on different physical devices.
- the task scheduler 310 can also include a task acquirer 311, a blocking manager 312 and a task dispatcher 313, the task acquirer 311 can access the first waiting queue, the blocking manager 312 can access the first waiting queue, the second waiting queue and the ready queue, and the task dispatcher 313 can access the ready queue.
- the processing operations of the above-mentioned task scheduler 310 on the tasks in the first waiting queue, the second waiting queue and the ready queue can be realized by the task acquirer 311, the blocking manager 312 and the task dispatcher 313 accessing these three queues.
- the task acquirer 311 is used to receive the tasks sent by the device driver package to the task scheduler 310, and store the relevant data of the tasks in the first waiting queue in the order of sending the tasks.
- the relevant data of each task includes the configuration information of the task, and the configuration information is used to indicate one or more of the following contents: whether the task is a forward synchronization task in the whole group, whether the task is a backward synchronization task in the whole group, the subgroup to which the task belongs, whether the task is a forward synchronization task in the subgroup to which it belongs, whether the task is a backward synchronization task in the subgroup to which it belongs, whether the task has a partial dependency association with other tasks and the other tasks on which it depends, and whether the task has a serial dependency association with other tasks and the other tasks on which it depends.
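The configuration information listed above maps naturally onto a per-task record. The sketch below uses illustrative field names; the application does not define a concrete layout.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

@dataclass(frozen=True)
class TaskConfig:
    """Per-task configuration carried with each task's relevant data.
    Field names are illustrative only."""
    task_id: int
    whole_group_fwd_sync: bool = False       # forward sync task in the whole group?
    whole_group_bwd_sync: bool = False       # backward sync task in the whole group?
    subgroup: Optional[int] = None           # subgroup it belongs to, if any
    subgroup_fwd_sync: bool = False
    subgroup_bwd_sync: bool = False
    partial_deps: FrozenSet[int] = frozenset()   # partial dependency targets
    serial_deps: FrozenSet[int] = frozenset()    # serial dependency targets

# Task 3 of FIG. 10: forward and backward sync in the whole group, no dependencies.
cfg = TaskConfig(3, whole_group_fwd_sync=True, whole_group_bwd_sync=True)
assert cfg.subgroup is None and not cfg.partial_deps
```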
- the blocking manager 312 is used to monitor the state of the first waiting queue. When it is sensed that there are tasks stored in the first waiting queue, each task in the first waiting queue that is not determined to be synchronously blocked is traversed in the order in which the tasks are stored in the first waiting queue. When traversing each task, the following operations are performed:
- Operation 1 based on the configuration information of the task, if it is determined that the task is a forward synchronization task in the entire group, then by querying the task status of the other tasks in the entire group that were issued earlier than the task, it is determined whether those tasks have all been fully executed; and, if it is determined that the task is a forward synchronization task in at least one subgroup, then by querying the task status of the other tasks in each such subgroup that were issued earlier than the task, it is determined whether those tasks have all been fully executed.
- When the above judgment results are all yes, it is determined that the task is not synchronously blocked, and the task is moved from the first waiting queue to the second waiting queue. Conversely, when at least one of the above judgment results is no, it is determined that the task is synchronously blocked, the task remains in the first waiting queue, and the traversal proceeds to the next task that has not been determined to be synchronously blocked;
- Operation 2 based on the configuration information of the task, if it is determined that the task is a backward synchronization task in the entire group, then it is determined that all other tasks in the entire group that are issued later than the task are synchronously blocked, and these tasks remain in the first waiting queue; and, if it is determined that the task is a backward synchronization task in at least one subgroup, then it is determined that all other tasks in each such subgroup that are issued later than the task are synchronously blocked, and these tasks remain in the first waiting queue.
- Operation 1 may be executed first and then Operation 2, or Operation 2 may be executed first and then Operation 1, or Operation 1 and Operation 2 may be executed simultaneously.
- the analysis operation on the first waiting queue adopts a polling method. For example, after one round of analysis, the status of all tasks remaining in the first waiting queue is reset to not yet determined to be synchronously blocked, after which Operation 1 and Operation 2 are executed again in the order in which the tasks are stored.
- the blocking manager 312 is also used to monitor the state of the second waiting queue. When it perceives that there are tasks stored in the second waiting queue, it traverses each task in the second waiting queue that has not been determined to be dependency-blocked, in the order in which the tasks are stored in the second waiting queue. When traversing each task: according to the configuration information of the task, if it is determined that the task has a partial dependency association with other tasks, then by querying the task status of the other tasks on which it depends, it is determined whether those tasks have all been fully executed; and, if it is determined that the task has a serial dependency association with other tasks, then by querying the task status of the other tasks on which it depends, it is determined whether those tasks have all started to execute. If the task is determined not to be blocked by dependencies, it is moved from the second waiting queue to the ready queue; otherwise, the task remains in the second waiting queue.
- the task dispatcher 313 is used to monitor the status of the ready queue and the status of the GPU core. When it is sensed that there are tasks stored in the ready queue and the current task processing volume of the GPU core is less than the parallel task volume, the tasks in the ready queue are dispatched to the GPU core in sequence according to the order in which the device driver package sends the tasks. For example, assuming that task 3, task 1 and task 2 are stored in the ready queue in sequence, and the order of issuance is task 1, task 2 and task 3, and the number of parallel tasks of the GPU core is 2, then: when the number of tasks currently executed by the GPU core is 1, it is determined that the GPU core can currently execute another task.
- the task dispatcher 313 can dispatch task 1, which is the earliest issued in the ready queue, to the GPU core.
- task 2 in the ready queue is dispatched to the GPU core.
- task 3 in the ready queue is dispatched to the GPU core; or, when the number of tasks currently executed by the GPU core is 0, it is determined that the GPU core can currently execute two more tasks.
- the task dispatcher 313 can dispatch task 1 and task 2, which are the earliest issued in the ready queue, to the GPU core.
- task 3 in the ready queue is dispatched to the GPU core.
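The dispatcher example above (the ready queue holds tasks 3, 1 and 2 in storage order, the issue order is 1, 2, 3, and the parallel capacity is 2) can be sketched as follows; the function shape is an assumption for illustration.

```python
def dispatch(ready, issue_order, running, parallel=2):
    """Dispatch up to the core's free capacity from the ready queue,
    always picking the earliest-issued tasks first regardless of the
    order in which they entered the ready queue."""
    free = max(parallel - len(running), 0)
    batch = sorted(ready, key=issue_order.index)[:free]
    for task in batch:
        ready.remove(task)
        running.append(task)
    return batch

ready, running = [3, 1, 2], ["some-task"]          # core already runs one task
assert dispatch(ready, [1, 2, 3], running) == [1]  # earliest-issued goes first
assert dispatch(ready, [1, 2, 3], running) == []   # core is now full
running.remove("some-task")                        # a running task finishes
assert dispatch(ready, [1, 2, 3], running) == [2]  # next in issue order
```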
- For the same task, the three operations of judging whether it is synchronously blocked, judging whether it is blocked by dependencies, and scheduling it to the GPU core are executed in series, while these operations for different tasks can be executed in parallel.
- the same task will only be judged whether it is dependently blocked if it is determined that it is not synchronously blocked, and will only be subsequently scheduled if it is determined that it is not dependently blocked.
- For example, while one task is being judged for synchronous blocking, another task issued earlier than it may be judged for dependency blocking, and tasks issued earlier than both may be scheduled to the GPU core.
- Figure 10 exemplarily shows a task scheduling process diagram provided by an embodiment of the present application, wherein (A) in Figure 10 shows tasks 0 to 5 and their associations issued by the device driver package to the task scheduler, and the tasks 0 to 5 are first acquired by the task acquirer and stored in the first waiting queue. In the association relationship between tasks 0 to 5, tasks 0 to 5 are only located in the entire group, and task 3 belongs to the forward synchronization task and the backward synchronization task in the entire group, which means that task 3 can only be executed after tasks 0, 1 and 2 issued earlier than task 3 are all completed, and tasks 4 and 5 issued later than task 3 can only be executed after task 3 is completed.
- (B) in Figure 10 shows a possible situation of processing tasks according to the task scheduling method in the above embodiment. Referring to (B) in Figure 10, the task scheduler can schedule each task according to the following steps:
- Step 1 First, analyze Task 0.
- Task 0 passes through all processes without being blocked and is scheduled to the GPU core. Specifically:
- the blocking manager traverses the first task 0 issued in the first waiting queue. Since task 0 does not belong to the forward synchronization task and is not a task issued later than the backward synchronization task 3 in the entire group, task 0 is not blocked by synchronization. The blocking manager moves task 0 from the first waiting queue to the second waiting queue.
- the blocking manager traverses the task 0 stored first in the second waiting queue. Since task 0 does not depend on other tasks, task 0 is not blocked by dependencies. The blocking manager moves task 0 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is not currently processing any tasks, that is, the GPU core can currently process two tasks. Therefore, the task dispatcher schedules the first task 0 issued in the ready queue to the GPU core.
- Step 2 Analyze Task 1 again.
- Task 1 passes through all processes without being blocked and is scheduled to the GPU core. Specifically:
- the first waiting queue only contains tasks 1 to 5.
- the blocking manager traverses the first task 1 issued in the first waiting queue. Since task 1 does not belong to the forward synchronization task and is not a task issued later than the backward synchronization task 3 in the whole group, task 1 is not blocked by synchronization. The blocking manager moves task 1 from the first waiting queue to the second waiting queue.
- the blocking manager traverses Task 1 in the second waiting queue. Since Task 1 does not depend on other tasks, Task 1 is not blocked by dependencies. The blocking manager moves Task 1 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is currently only processing task 0, that is, the GPU core can currently process one task. Therefore, the task dispatcher schedules task 1 in the ready queue to the GPU core.
- the GPU core processes Task 0 and Task 1 in parallel.
- Step 3 Analyze Task 2 again.
- Task 2 passes the synchronous blocking judgment process and the dependent blocking judgment process, but needs to wait for Task 0 or Task 1 to be completed in the dispatch process before it can be scheduled to the GPU core.
- the first waiting queue only contains tasks 2 to 5.
- the blocking manager traverses task 2, the first task issued in the first waiting queue. Since task 2 is not a forward synchronization task and was not issued later than backward synchronization task 3 in the entire group, task 2 is not synchronously blocked. The blocking manager moves task 2 from the first waiting queue to the second waiting queue.
- the blocking manager traverses task 2 in the second waiting queue. Since task 2 does not depend on any other task, task 2 is not blocked by dependencies. The blocking manager moves task 2 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is currently processing tasks 0 and 1, that is, the GPU core cannot currently accept a new task. Therefore, the task dispatcher waits for the GPU core to finish one of those tasks before scheduling task 2 in the ready queue to the GPU core. As shown in (B) in Figure 10, assuming that the GPU core completes task 0 first, after the task dispatcher schedules task 2 to the GPU core, the GPU core processes tasks 1 and 2 in parallel.
- Step 4: Task 3 is analyzed next.
- Task 3 is blocked in the synchronous blocking judgment process and can only be executed after Task 1 and Task 2 are completed:
- the first waiting queue only contains tasks 3 to 5.
- the blocking manager traverses task 3, the first task issued in the first waiting queue. Since task 3 is a forward synchronization task and tasks 1 and 2, issued earlier than task 3, have not yet completed, task 3 is synchronously blocked. At the same time, since task 3 is also a backward synchronization task, tasks 4 and 5, issued later than task 3, are also determined to be synchronously blocked. Therefore, until tasks 1 and 2 are both completed, the blocking manager no longer analyzes tasks 3 to 5 in the first waiting queue.
- after tasks 1 and 2 are both completed, task 3 is released from synchronous blocking and the blocking manager moves it from the first waiting queue to the second waiting queue. The blocking manager then traverses task 3 in the second waiting queue. Since task 3 does not depend on any other task, task 3 is not blocked by dependencies. The blocking manager moves task 3 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is not currently processing any tasks, that is, the GPU core can currently process two tasks. Therefore, the task dispatcher directly schedules task 3 in the ready queue to the GPU core, and the GPU core processes only task 3.
- Step 5: Tasks 4 and 5 are blocked in the synchronous blocking judgment process and can only be executed after task 3 completes:
- the first waiting queue only contains tasks 4 and 5, which were determined to be synchronously blocked by task 3 in step 4 above. Therefore, until task 3 completes, the blocking manager no longer analyzes tasks 4 and 5 in the first waiting queue.
- the blocking manager determines that task 3, which synchronously blocks tasks 4 and 5, has completed, so tasks 4 and 5 are released from synchronous blocking. The blocking manager therefore traverses tasks 4 and 5 in sequence and moves them from the first waiting queue to the second waiting queue in sequence.
- the blocking manager traverses tasks 4 and 5 in the second waiting queue in turn. Since tasks 4 and 5 do not depend on other tasks, neither of them is blocked by dependencies. The blocking manager moves tasks 4 and 5 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is not currently processing any tasks, that is, the GPU core can currently process two tasks. Therefore, the task dispatcher schedules tasks 4 and 5 in the ready queue to the GPU core, so that the GPU core processes tasks 4 and 5 in parallel.
- the above Example 1 introduces the scenario where forward synchronization tasks and backward synchronization tasks are defined over the entire group.
- in this scenario, tasks in the entire group that are neither synchronization-blocked nor dependency-blocked can be scheduled as early as possible, reducing GPU core idling and improving GPU core utilization.
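The queue flow of Example 1 can be sketched in Python. This is a minimal illustrative model, not the patented implementation: the instant-completion dispatch, the single `barrier` variable, and all names are assumptions made for the sketch.

```python
from collections import deque

# Tasks 0-5 of Example 1: task 3 is both a forward and a backward
# synchronization task for the entire group; no dependency associations.
tasks = {i: {"fwd_sync": i == 3, "bwd_sync": i == 3} for i in range(6)}

first_wait = deque(range(6))        # tasks in issue order
second_wait = deque()
ready = deque()
completed = set()
schedule_order = []
barrier = None                      # incomplete backward-sync task, if any

def sync_pass():
    """Synchronous-blocking judgment: move unblocked tasks onward."""
    global barrier
    if barrier is not None and barrier not in completed:
        return                      # later tasks stay synchronously blocked
    barrier = None
    while first_wait:
        t = first_wait[0]
        # A forward sync task waits for every earlier task to complete.
        if tasks[t]["fwd_sync"] and any(e not in completed for e in range(t)):
            return
        second_wait.append(first_wait.popleft())
        if tasks[t]["bwd_sync"]:
            barrier = t             # blocks everything issued after it
            return

def dep_pass():
    """Dependency-blocking judgment: no dependencies here, so pass through."""
    while second_wait:
        ready.append(second_wait.popleft())

def dispatch():
    """Dispatch; core parallelism and completion timing are abstracted away."""
    while ready:
        t = ready.popleft()
        schedule_order.append(t)
        completed.add(t)

while first_wait or second_wait or ready:
    sync_pass()
    dep_pass()
    dispatch()

print(schedule_order)   # [0, 1, 2, 3, 4, 5]
```

Tasks 0 to 2 flow straight through, task 3 waits behind them, and tasks 4 and 5 are held until the barrier clears, matching the step-by-step walk-through above.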
- FIG11 exemplarily shows another task scheduling process diagram provided by an embodiment of the present application, wherein FIG11 (A) shows tasks 0 to 4 and their associations issued by the device driver package to the task scheduler, and the tasks 0 to 4 are first acquired by the task acquirer and stored in the first waiting queue.
- tasks 0 to 4 are located in the entire group, while tasks 0, 2 and 4 are located in subgroup 1, and tasks 1 and 3 are located in subgroup 2.
- task 2 is both a forward synchronization task and a backward synchronization task in subgroup 1, which means that task 2 can only be executed after task 0, issued earlier than task 2 in subgroup 1, has completed, and task 4, issued later than task 2 in subgroup 1, can only be executed after task 2 has completed.
- FIG11 (B) shows a possible situation of processing tasks according to the task scheduling method in the above embodiment. As shown in FIG11 (B), the task scheduler can schedule each task according to the following steps:
- Step 1 Task 0 is analyzed first. Task 0 is scheduled to the GPU core without blocking through all processes.
- for the specific implementation process, refer to step 1 of Example 1 above; details are not repeated here.
- Step 2 Task 1 is analyzed again. Task 1 is scheduled to the GPU core without blocking through all processes.
- for the specific implementation process, refer to step 2 of Example 1 above; details are not repeated here.
- Step 3: Task 2 is analyzed next.
- Task 2 is blocked in the synchronous blocking judgment process and can only be executed after task 0 completes:
- the first waiting queue only contains tasks 2 to 4.
- the blocking manager traverses task 2, the first task issued in the first waiting queue. Since task 2 is a forward synchronization task in subgroup 1, and task 0, issued earlier than task 2 in subgroup 1, has not yet completed, task 2 is synchronously blocked. At the same time, since task 2 is also a backward synchronization task in subgroup 1, task 4, issued later than task 2 in subgroup 1, is also determined to be synchronously blocked. Therefore, until task 2 completes, the blocking manager no longer analyzes task 4 in the first waiting queue.
- after task 0 completes, task 2 is released from synchronous blocking and the blocking manager moves it from the first waiting queue to the second waiting queue. The blocking manager then traverses task 2 in the second waiting queue. Since task 2 does not depend on any other task, task 2 is not blocked by dependencies. The blocking manager moves task 2 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is currently processing task 1, that is, the GPU core can currently process a new task. Therefore, the task dispatcher schedules task 2 in the ready queue to the GPU core, so that the GPU core processes task 1 and task 2 in parallel.
- Step 4: Task 3 is analyzed next. Task 3 passes the synchronous blocking judgment process and the dependency blocking judgment process, but must wait for task 1 or task 2 to complete in the dispatch process before it can be scheduled to the GPU core. As shown in (B) in Figure 11, assuming that task 1 completes first, the task dispatcher dispatches task 3 in the ready queue to the GPU core, so that the GPU core processes tasks 2 and 3 in parallel. For the specific implementation process of this step, refer to step 3 of Example 1 above; details are not repeated here.
- Step 5: Task 4 is blocked in the synchronous blocking judgment process and can only be executed after task 2 completes:
- the first waiting queue only contains task 4, which was determined to be synchronously blocked by task 2 in step 3 above. Therefore, until task 2 completes, the blocking manager no longer analyzes task 4 in the first waiting queue.
- the blocking manager determines that task 2, which synchronously blocks task 4, has completed, so task 4 is released from synchronous blocking. The blocking manager therefore moves task 4 from the first waiting queue to the second waiting queue.
- the blocking manager traverses task 4 in the second waiting queue. Since task 4 does not depend on other tasks, task 4 is not blocked by dependencies. The blocking manager moves task 4 from the second waiting queue to the ready queue.
- the task dispatcher monitors that the GPU core is currently processing task 3, that is, the GPU core can currently process a new task. Therefore, the task dispatcher schedules task 4 in the ready queue to the GPU core, so that the GPU core processes task 3 and task 4 in parallel.
- the above Example 2 introduces the scenario where forward synchronization tasks and backward synchronization tasks are defined within subgroups.
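The subgroup-scoped synchronization check of Example 2 might look as follows. This is an illustrative sketch only; the dictionaries, the function name, and the scoping convention are assumptions, not the claimed implementation.

```python
# Example 2 layout: subgroup 1 holds tasks 0, 2, 4; subgroup 2 holds 1 and 3.
# Task 2 is a forward and backward synchronization task only within subgroup 1.
subgroup = {0: 1, 1: 2, 2: 1, 3: 2, 4: 1}
sync = {2: {"fwd": True, "bwd": True}}

def sync_blocked(task, completed):
    """True if `task` is synchronously blocked; sync is scoped to its subgroup."""
    peers = [t for t in subgroup if subgroup[t] == subgroup[task] and t != task]
    # Forward sync: wait for every earlier task in the same subgroup.
    if sync.get(task, {}).get("fwd") and any(
            p < task and p not in completed for p in peers):
        return True
    # Backward sync: an earlier incomplete backward-sync peer blocks this task.
    return any(p < task and sync.get(p, {}).get("bwd") and p not in completed
               for p in peers)

print(sync_blocked(2, completed=set()))   # True: task 0 not yet complete
print(sync_blocked(3, completed=set()))   # False: subgroup 2 has no sync task
print(sync_blocked(4, completed={0}))     # True: blocked behind task 2
print(sync_blocked(4, completed={0, 2}))  # False: task 2 has completed
```

Because the check only looks at peers in the same subgroup, tasks 1 and 3 in subgroup 2 are never blocked by task 2, which is what allows them to run in parallel with subgroup 1 in the walk-through above.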
- FIG. 12 exemplarily shows another task scheduling process diagram provided in an embodiment of the present application. FIG. 12 (A) shows tasks 0 to 4 and their associations issued by the device driver package to the task scheduler; tasks 0 to 4 are first acquired by the task acquirer and stored in the first waiting queue. In the association relationship, tasks 0 to 4 are in the entire group, and task 3 has a partial dependency association with tasks 0 and 2, which means that task 3 can only be executed after tasks 0 and 2 have both completed.
- FIG12 (B) shows a possible situation of processing tasks according to the task scheduling method in the above embodiment. As shown in FIG12 (B), the task scheduler can schedule each task according to the following steps:
- Step 1 Task 0 is analyzed first. Task 0 is scheduled to the GPU core without blocking through all processes.
- for the specific implementation process, refer to step 1 of Example 1 above; details are not repeated here.
- Step 2 Task 1 is analyzed again. Task 1 is scheduled to the GPU core without blocking through all processes.
- for the specific implementation process, refer to step 2 of Example 1 above; details are not repeated here.
- Step 3: Task 2 is analyzed next. Task 2 passes the synchronous blocking judgment process and the dependency blocking judgment process, but must wait for task 0 or task 1 to complete in the dispatch process before it can be scheduled to the GPU core. As shown in (B) in Figure 12, assuming that the GPU core completes task 0 first, after the task dispatcher schedules task 2 to the GPU core, the GPU core processes tasks 1 and 2 in parallel. For the specific implementation process of this step, refer to step 3 of Example 1 above; details are not repeated here.
- Step 4: Task 3 is analyzed next.
- Task 3 is blocked in the dependency blocking judgment process and can only be executed after Task 2 is completed:
- the first waiting queue only contains Task 3 and Task 4.
- the blocking manager traverses task 3, the first task issued in the first waiting queue. Since task 3 is not a forward synchronization task and there is no backward synchronization task in the entire group, task 3 is not synchronously blocked. The blocking manager moves task 3 from the first waiting queue to the second waiting queue.
- the blocking manager traverses task 3 in the second waiting queue. Task 3 depends on tasks 0 and 2; task 0 has completed, but task 2 has not. Therefore, task 3 is blocked by the dependency, and until task 2 completes, the blocking manager no longer analyzes task 3 in the second waiting queue.
- Step 5: As shown in (B) in Figure 12, assuming that task 1 completes first, since task 2 has not completed, task 3 is not released from dependency blocking; task 4, which is located after task 3 in the first waiting queue, can therefore be analyzed directly. Task 4 passes all processes without being blocked and is scheduled to the GPU core, which then processes tasks 2 and 4 in parallel.
- Step 6: As shown in (B) in Figure 12, when task 2 completes, task 3 in the second waiting queue is released from dependency blocking, so task 3 passes the remaining processes without being blocked and is scheduled to the GPU core, allowing the GPU core to process tasks 4 and 3 in parallel.
- in another possible case, if task 3 has a serial dependency association with tasks 0 and 2, task 3 can be executed once tasks 0 and 2 have started to execute, regardless of whether they have completed.
- task 3 is then released from dependency blocking earlier: it passes the synchronous blocking judgment process and the dependency blocking judgment process, but must wait for task 1 or task 2 to complete in the dispatch process before it can be scheduled to the GPU core.
- the task dispatcher will dispatch Task 3 in the ready queue to the GPU core, so that the GPU core processes Task 2 and Task 3 in parallel.
- the above Example 3 introduces a scenario in which dependency associations are defined.
- in this scenario, when a task is blocked by a dependency, a subsequent task that is not dependency-blocked can be scheduled to the GPU core first, keeping the GPU core busy as much as possible and effectively improving GPU core utilization.
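The overtaking behavior of Example 3, where a later task passes a dependency-blocked one in the second waiting queue, can be sketched as follows. The queue names and the dependency table are illustrative assumptions.

```python
from collections import deque

# Illustrative dependency table for Example 3: task 3 partially depends on
# tasks 0 and 2 (it may run only after both have completed).
deps = {3: {0, 2}}

def dep_pass(second_wait, ready, completed):
    """Move every task whose dependencies are satisfied to the ready queue.
    A dependency-blocked task stays put, but later tasks are still examined,
    so an unblocked later task can overtake it."""
    still_blocked = deque()
    while second_wait:
        t = second_wait.popleft()
        if deps.get(t, set()) <= completed:
            ready.append(t)
        else:
            still_blocked.append(t)
    second_wait.extend(still_blocked)

ready = deque()
q = deque([3, 4])
dep_pass(q, ready, completed={0, 1})   # task 2 not finished yet
print(list(ready), list(q))            # [4] [3]: task 4 overtakes blocked task 3

dep_pass(q, ready, completed={0, 1, 2})
print(list(ready))                     # [4, 3]: task 3 released after task 2
```

The key design point is that the pass examines every queued task rather than stopping at the first blocked one, which is exactly what lets task 4 reach the GPU core while task 3 is still waiting.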
- the above mainly introduces the solution provided by the present application from the perspective of the interaction between various network elements.
- the above-mentioned network elements include hardware structures and/or software modules corresponding to the execution of various functions.
- those skilled in the art should readily appreciate that the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functions in different ways for each particular application, but such implementations should not be considered beyond the scope of the present application.
- FIG13 is a schematic diagram of the structure of a task scheduler provided in an embodiment of the present application.
- the task scheduler 1300 may be a chip or a circuit, such as a chip or a circuit that can be set in a processor.
- the task scheduler 1300 corresponds to the task scheduler in the aforementioned method, such as the task scheduler 310 in FIG2.
- the task scheduler 1300 may implement the steps of any one or more of the corresponding methods shown in FIG5 or FIG9 above.
- the task scheduler 1300 may include an acquisition unit 1301, a determination unit 1302, and a scheduling unit 1303.
- the acquisition unit 1301 may be a receiving unit or a receiver when receiving information, and the receiving unit or the receiver may be a radio frequency circuit.
- the acquisition unit 1301 is used to acquire N tasks and the association relationship of the N tasks.
- the determination unit 1302 is used to determine the tasks among the N tasks that have no dependency or have been released from dependency as schedulable tasks according to the association relationship of the N tasks.
- the scheduling unit 1303 is used to schedule the schedulable tasks to the processing unit.
- N is a positive integer.
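A minimal sketch of how the three units might cooperate is shown below. The class and method names, the callback-style processing unit, and the dependency table are assumptions made for illustration, not the claimed implementation.

```python
# A toy model of task scheduler 1300: acquisition unit 1301 (acquire),
# determination unit 1302 (schedulable), and scheduling unit 1303 (run).
class TaskScheduler:
    def __init__(self, processing_unit):
        self.processing_unit = processing_unit   # e.g. a GPU-core stand-in
        self.tasks = []          # the N acquired tasks
        self.deps = {}           # association relationship: task -> dependencies
        self.completed = set()

    def acquire(self, tasks, deps):              # acquisition unit 1301
        self.tasks = list(tasks)
        self.deps = dict(deps)

    def schedulable(self):                       # determination unit 1302
        """Tasks with no dependency, or whose dependencies are released."""
        return [t for t in self.tasks
                if t not in self.completed
                and self.deps.get(t, set()) <= self.completed]

    def run(self):                               # scheduling unit 1303
        while len(self.completed) < len(self.tasks):
            for t in self.schedulable():
                self.processing_unit(t)          # dispatch to the processing unit
                self.completed.add(t)

order = []
sched = TaskScheduler(processing_unit=order.append)
sched.acquire(tasks=[0, 1, 2, 3], deps={3: {0, 2}})
sched.run()
print(order)   # [0, 1, 2, 3]: task 3 waits until tasks 0 and 2 complete
```

Completion timing and core capacity are abstracted away here; the point is only the division of labor among the three units.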
- the division of the modules of the task scheduler 1300 is merely a division of logical functions; in actual implementation, all or some of them may be integrated into one physical entity, or they may be physically separate.
- the present application also provides a task scheduler, including the aforementioned task acquirer, blocking manager and task dispatcher, and also including a first waiting queue, a second waiting queue and a ready queue.
- the present application also provides a processor, including the aforementioned task scheduler and a processing unit.
- the processing unit may specifically be a processor core, such as the aforementioned GPU core.
- the present application also provides an electronic device, which includes a processor, the processor is coupled to a memory, and the processor is used to execute a computer program stored in the memory, so that the electronic device executes the method of any one of the embodiments shown in Figure 5 or Figure 9.
- the present application also provides a task processing system, including a processor and a device driver package, the device driver package is used to send N tasks to the processor, and the processor is used to process N tasks by executing the method of any one of the embodiments shown in Figure 5 or Figure 9.
- the present application also provides a computer program product, which includes: a computer program code, when the computer program code is run on a computer, the computer executes the method of any one of the embodiments shown in Figure 5 or Figure 9.
- the present application also provides a computer-readable storage medium, which stores a program code.
- the program code runs on a computer, the computer executes the method of any one of the embodiments shown in Figure 5 or Figure 9.
- a component can be, but is not limited to, a process running on a processor, a processor, an object, an executable file, an execution thread, a program and/or a computer.
- applications running on a computing device and a computing device can be components.
- One or more components may reside in a process and/or an execution thread, and a component may be located on a computer and/or distributed between two or more computers.
- these components may be executed from various computer-readable media having various data structures stored thereon.
- Components may, for example, communicate by way of local and/or remote processes according to a signal having one or more data packets (e.g., data from one component interacting with another component in a local system or a distributed system, and/or interacting with other systems by way of the signal across a network such as the Internet).
- the disclosed systems, devices and methods can be implemented in other ways.
- the device embodiments described above are merely illustrative.
- the division of the units is only a logical function division. There may be other division methods in actual implementation, such as multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed.
- In addition, the couplings or direct couplings or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between devices or units may be electrical, mechanical or in other forms.
- the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected to achieve the purpose of the solution of this embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- if the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
- based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product.
- the computer software product is stored in a storage medium and includes several instructions for a computer device (which can be a personal computer, server, or network device, etc.) to perform all or part of the steps of the methods described in each embodiment of the present application.
- the aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc, and other media that can store program code.
Abstract
A task scheduling method, apparatus and system, which are applicable to the technical field of processors and are used for improving the utilization rate of a processing unit. The method comprises: a task scheduler acquiring N tasks and an association relationship between the N tasks (501); according to the association relationship between the N tasks, determining tasks which have no dependency or have been released from dependency among the N tasks to be schedulable tasks (502); and scheduling the schedulable tasks to a processing unit (503), where N is a positive integer. The scheduling sequence of the N tasks is maintained by the task scheduler, and according to the actual execution of the tasks, tasks which have no dependency or have been released from dependency can be issued in advance to a processing unit for processing, so as to make full use of all the hardware resources of the processing unit to process tasks with high concurrency and keep the processing unit from idling to the greatest extent, thereby effectively improving the utilization rate of the processing unit.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to the Chinese patent application filed with the China Patent Office on December 15, 2022, with application number 202211614262.8 and entitled "A task scheduling method, device and system", the entire contents of which are incorporated herein by reference.
The present application relates to the field of processor technology, and in particular to a task scheduling method, device and system.
The task scheduler is one of the core components of a processor and is used to reasonably schedule tasks to processing units, so as to make full use of the processor's hardware resources to process tasks efficiently. The rationality of the task scheduler's scheduling is crucial to shortening task waiting times, increasing task parallelism, optimizing resource utilization, and reducing costs while increasing efficiency. Therefore, how to achieve reasonable task scheduling has become a mainstream research direction in task scheduler design.
In one industry solution for reasonable task scheduling, the task scheduler is provided with multiple task queues, which respectively store tasks belonging to different services; tasks in different task queues are executed in parallel, while tasks within one task queue are executed serially. Although this solution can process tasks belonging to different services in parallel, tasks belonging to the same service must be processed serially. In other words, as long as an earlier task in a task queue has not finished, the tasks behind it cannot use idle hardware resources to be processed in advance. This solution therefore still leaves the processing unit idle at times and cannot make full use of the processing unit's hardware resources to process tasks with high concurrency, which is not conducive to improving the utilization rate of the processing unit.
In summary, a task scheduling method is currently needed to improve the utilization rate of processing units.
Summary of the invention
The present application provides a task scheduling method, device and system for improving the utilization rate of a processing unit.
In a first aspect, the present application provides a task scheduling method applicable to a task scheduler. The task scheduler may be any device, apparatus or equipment with task scheduling capability, or may be a chip or a circuit, without limitation. The method comprises: the task scheduler acquires N tasks and the association relationship of the N tasks; according to the association relationship of the N tasks, determines the tasks among the N tasks that have no dependency or have been released from dependency as schedulable tasks; and schedules the schedulable tasks to a processing unit, where N is a positive integer.
In the above task scheduling method, the N tasks may, for example, be issued to the task scheduler by a device driver package. In this way, the scheduling order of the N tasks is maintained by the task scheduler on the hardware side, rather than the execution order being specified by the device driver package on the software side. According to the actual execution of the tasks, tasks that have no dependency or have been released from dependency can be issued to the processing unit in advance for processing, so as to make full use of all the hardware resources of the processing unit to process tasks with high concurrency and keep the processing unit from idling as much as possible, thereby effectively improving the utilization rate of the processing unit. Moreover, this method does not require the hardware side to send a notification message to the software side after each task completes to instruct the software side to issue a new task, which effectively reduces the workload on the software side, improves the efficiency of task scheduling, and saves communication overhead.
In a possible design, the association relationship of the N tasks may include synchronization association and/or dependency association, where:
Synchronization association is used to indicate that a task is associated with all other tasks acquired earlier or later than the task. Specifically, synchronization association may include forward synchronization association and backward synchronization association. Forward synchronization association means that a task depends on all other tasks issued earlier than it; a task having a forward synchronization association with all other tasks is also called a forward synchronization task. Backward synchronization association means that all other tasks issued later than a task depend on it; a task having a backward synchronization association with all other tasks is also called a backward synchronization task.
Dependency association includes partial dependency and serial dependency. Partial dependency means that a task depends on the execution results of some other tasks, i.e., the task can be executed only after all the tasks it depends on have completed. Serial dependency means that a task depends on the execution of some or all other tasks, i.e., the task can be executed once all the tasks it depends on have been executed (started), regardless of whether they have completed.
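One possible in-memory representation of these association relationships is sketched below; the field names are illustrative only and do not reflect the patent's actual data structure.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    task_id: int
    forward_sync: bool = False      # depends on ALL earlier tasks in its scope
    backward_sync: bool = False     # ALL later tasks in its scope depend on it
    partial_deps: set = field(default_factory=set)   # these must have COMPLETED
    serial_deps: set = field(default_factory=set)    # these must have STARTED

# Task 3 of Example 3: a partial dependency on tasks 0 and 2 is expressed with
# two task ids, instead of flags spread over every other task's configuration.
t3 = Task(task_id=3, partial_deps={0, 2})
print(t3.partial_deps)   # {0, 2}
```

This mirrors the design benefit noted below: a synchronization flag on one task stands in for an explicit edge to every earlier or later task, keeping the per-task configuration small.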
With the above design, by setting a synchronization association, the device driver package only needs to configure one task to indicate its dependency relationship with all other tasks acquired earlier or later than it, without listing in that task's configuration information every other task it depends on. Setting dependency associations can additionally express looser associations between tasks beyond synchronization associations. Clearly, this way of expressing association relationships can indicate the associations among all tasks while greatly reducing the complexity of the task data structure, helping to reduce the communication overhead between the device driver package and the task scheduler.
In a further design, the task scheduler determines, according to the association relationship of the N tasks, the tasks among the N tasks that have no dependency or have been released from dependency as schedulable tasks, including: the task scheduler determines, according to the association relationship of the N tasks, a task having a synchronization association with other tasks; when the other tasks with which the task has a synchronization association have completed, the task is determined as a schedulable task. In this way, by monitoring the execution of tasks with synchronization associations, a task can be promptly scheduled to the processing unit once the other tasks synchronously associated with it no longer block it.
In a further design, the completion of the other tasks with which a task has a synchronization association includes: the task is a forward synchronization task and all other tasks acquired earlier than it have completed; and/or the task is acquired later than a backward synchronization task and that backward synchronization task has completed. In this design, when the task is no longer blocked by all other earlier tasks or by a backward synchronization task, the task is determined to be released from dependency, so that it can be scheduled to the processing unit as early as possible.
In a further design, the task scheduler determines, according to the association relationship of the N tasks, the tasks among the N tasks that have no dependency or have been released from dependency as schedulable tasks, including: the task scheduler determines, according to the association relationship of the N tasks, a task having a dependency association with other tasks; when the other tasks with which the task has a dependency association have been executed or have completed, the task is determined as a schedulable task. In this way, by monitoring the execution of tasks with dependency associations, a task can be promptly scheduled to the processing unit once the other tasks it depends on no longer block it.
In a further design, the task scheduler determines, according to the association relationships among the N tasks, the tasks among the N tasks that are neither synchronization-blocked nor dependency-blocked by other tasks as schedulable tasks. Here, a task being synchronization-blocked by other tasks means that the task satisfies at least one of the following conditions: the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing; or the task is acquired later than a backward synchronization task and that backward synchronization task has not finished executing. A task being dependency-blocked by other tasks means that the task satisfies at least one of the following conditions: the task has a partial dependency association with other tasks and the tasks it depends on have not all finished executing; or the task has a serial dependency association with other tasks and the tasks it depends on have not all been executed.
With the above design, whether a task is synchronization-blocked and whether it is dependency-blocked are determined in separate flows, from the perspectives of synchronization associations and dependency associations respectively, without having to analyze all tasks associated with the task in one sweeping pass based on its dependency relationships. This enables fine-grained management of the task-blocking determination and improves its flexibility.
In a further design, the task scheduler determining, according to the association relationships among the N tasks, the tasks among the N tasks that are neither synchronization-blocked nor dependency-blocked by other tasks as schedulable tasks includes: the task scheduler traverses, in the acquisition order of the N tasks, each task among the N tasks that has not been determined to be blocked, and for each such task: if the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing, the task is determined to be synchronization-blocked; if the task is a backward synchronization task, all other tasks acquired later than it are determined to be synchronization-blocked; if the task depends on other tasks and those tasks have not finished executing, the task is determined to be dependency-blocked; if the task serially depends on other tasks and those tasks have not all been executed, the task is determined to be dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, the task is determined as a schedulable task.
With the above design, as soon as a task is identified as a backward synchronization task, all other tasks issued later than it can be directly determined to be synchronization-blocked. Thus, once one unfinished backward synchronization task is found, none of the tasks issued after it needs any further synchronization-blocking analysis. This greatly simplifies the synchronization-blocking determination, effectively improves its efficiency, and in turn helps improve the efficiency of task scheduling.
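The traversal described above might be sketched as follows. This is a simplified, hypothetical rendering: the plain-dependency and serial-dependency cases are folded into a single `deps` check, and all names are illustrative rather than taken from the application.

```python
def find_schedulable(tasks):
    """tasks: dicts in acquisition order with optional keys
    'done', 'forward_sync', 'backward_sync', 'deps' (earlier indices)."""
    schedulable = []
    for i, t in enumerate(tasks):
        if t.get("done"):
            continue
        earlier_unfinished = any(not u.get("done") for u in tasks[:i])
        # A forward-sync task is blocked while any earlier task is unfinished.
        sync_blocked = t.get("forward_sync") and earlier_unfinished
        # Simplification: blocked while any explicitly listed dependency
        # is unfinished (the serial-dependency variant is not modeled).
        dep_blocked = any(not tasks[d].get("done") for d in t.get("deps", ()))
        if not sync_blocked and not dep_blocked:
            schedulable.append(i)
        if t.get("backward_sync"):
            # All later tasks are synchronization-blocked by this unfinished
            # backward-sync task, so no further analysis is needed.
            break
    return schedulable
```

The early `break` mirrors the efficiency argument above: everything after an unfinished backward synchronization task is skipped without per-task analysis.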
In a further design, a task being synchronization-blocked may include the task being synchronization-blocked at the whole-group level and/or at the subgroup level. In that case:
The other tasks used to determine whether a task is synchronization-blocked at the whole-group level are all tasks in the whole group acquired earlier or later than the task. For example, when a task is marked as a forward synchronization task of the whole group, as long as the other tasks of the whole group issued earlier than it have not all finished executing, the task is determined to be synchronization-blocked by those other tasks. When a task is a backward synchronization task of the whole group, all other tasks of the whole group later than it are determined to be synchronization-blocked by it;
Similarly, the other tasks used to determine whether a task is synchronization-blocked at the subgroup level are all tasks, in the subgroup to which the task belongs, acquired earlier or later than the task. For example, when a task is marked as a forward synchronization task of a subgroup, as long as the other tasks of that subgroup issued earlier than it have not all finished executing, the task is determined to be synchronization-blocked by those other tasks. When a task is a backward synchronization task of a subgroup, all other tasks of that subgroup later than it are determined to be synchronization-blocked by it.
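One possible way to express the two-level test is to run the same scope-local predicate once over the whole group and once over the task's subgroup, with separate whole-group and subgroup synchronization markers. All field names here are invented for illustration, and the exact marker semantics are an assumption.

```python
def sync_blocked_in(task, peers, fwd_key, bwd_key):
    """Blocking test inside one scope (the whole group or one subgroup)."""
    earlier = [t for t in peers if t["seq"] < task["seq"]]
    # Forward-sync marker in this scope: blocked while earlier peers run.
    if task.get(fwd_key) and any(not t["done"] for t in earlier):
        return True
    # An unfinished backward-sync peer issued earlier blocks this task.
    if any(t.get(bwd_key) and not t["done"] for t in earlier):
        return True
    return False

def sync_blocked(task, group):
    subgroup = [t for t in group if t.get("sub") == task.get("sub")]
    return (sync_blocked_in(task, group, "fwd_group", "bwd_group")
            or sync_blocked_in(task, subgroup, "fwd_sub", "bwd_sub"))
```

This also shows the isolation property discussed below: a subgroup-level marker in one subgroup never blocks tasks of another subgroup.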
In a further design, the subgroups may be obtained by partitioning tasks according to service characteristics. By grouping tasks in this way, even if the tasks in one group are synchronization-blocked, the tasks in other groups are unaffected; that is, the synchronization blocking of one service does not affect the execution of other services. Decoupling the task-execution associations between services thus reduces their mutual interference.
Alternatively, in a further design, the subgroups may be obtained by partitioning tasks whose association relationships are relatively dense. By placing tasks with concentrated association relationships into one subgroup, the associations among the tasks in that subgroup can be known by marking only its forward or backward synchronization tasks, without annotating the dependency relationships of every individual task. This streamlines the data structure of each task and reduces the communication overhead between the device driver package and the task scheduler.
In a further design, after acquiring the N tasks and their association relationships, the task scheduler may further store the N tasks in a first waiting queue. In this case, the task scheduler determining, according to the association relationships among the N tasks, the tasks that are neither synchronization-blocked nor dependency-blocked by other tasks as schedulable tasks includes: for any task in the first waiting queue, determining whether the task is synchronization-blocked by other tasks, and if not, moving it from the first waiting queue to a second waiting queue; and for any task in the second waiting queue, determining whether the task is dependency-blocked by other tasks, and if not, moving it from the second waiting queue to a ready queue. Further, the task scheduler scheduling the schedulable tasks to the processing unit includes: scheduling the tasks in the ready queue to the processing unit.
With the above design, the synchronization-blocking determination, the dependency-blocking determination, and the dispatch operation are executed in parallel. This not only ensures that a task is scheduled only when it is neither synchronization-blocked nor dependency-blocked, but also allows the earlier stages of later-issued tasks to be processed in parallel with the later stages of earlier-issued tasks, which helps further improve the efficiency of task scheduling.
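A minimal single-threaded sketch of the three-queue flow might look as follows (in the described design the stages can run concurrently; the queue names and the two blocking predicates are assumptions made for this sketch):

```python
from collections import deque

def pump(wait1, wait2, ready, is_sync_blocked, is_dep_blocked):
    """Run one pass of each stage; a task that clears the synchronization
    check may clear the dependency check within the same call."""
    # Stage 1: synchronization-blocking check on the first waiting queue.
    for _ in range(len(wait1)):
        task = wait1.popleft()
        (wait1 if is_sync_blocked(task) else wait2).append(task)
    # Stage 2: dependency-blocking check on the second waiting queue.
    for _ in range(len(wait2)):
        task = wait2.popleft()
        (wait2 if is_dep_blocked(task) else ready).append(task)
```

Blocked tasks are re-appended to their own queue and re-examined on the next pass, while unblocked tasks advance toward the ready queue for dispatch.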
In a further design, the task scheduler scheduling the tasks in the ready queue to the processing unit includes: the task scheduler monitors the number of tasks currently being executed by the processing unit, and when that number is smaller than the number of tasks the processing unit can execute in parallel, schedules the tasks in the ready queue to the processing unit one by one in the order in which they were acquired.
With the above design, even if a later-issued task is stored in the ready queue before an earlier-issued one, the earliest-issued task is located before actual dispatch instead of dispatching in the queue's storage order. Under the premise of finding all processable tasks in advance, this ensures to the greatest extent that the earliest-issued of the currently processable tasks is processed first.
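The dispatch rule above could be sketched as follows, with hypothetical names: selection uses issue order rather than the ready queue's storage order, and dispatching stops once the processing unit's parallel capacity is reached.

```python
def dispatch(ready, running, max_parallel):
    """ready: list of (issue_seq, task); running: set of in-flight seqs."""
    dispatched = []
    while len(running) < max_parallel and ready:
        # Pick the earliest-issued ready task, not the storage-order head.
        entry = min(ready, key=lambda e: e[0])
        ready.remove(entry)
        running.add(entry[0])
        dispatched.append(entry[1])
    return dispatched
```

In this sketch the caller is assumed to remove entries from `running` when the processing unit reports task completion, freeing slots for the next call.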
In a second aspect, the present application provides a task scheduler, including: a task acquirer, configured to acquire N tasks and the association relationships among the N tasks, where N is a positive integer; a blocking manager, configured to determine, according to the association relationships among the N tasks, the tasks among the N tasks that have no dependency or whose dependency has been released as schedulable tasks; and a task dispatcher, configured to schedule the schedulable tasks to a processing unit.
In a possible design, the task acquirer is specifically configured to receive the N tasks, and the association relationships among the N tasks, issued by a device driver package.
In a possible design, the blocking manager is specifically configured to: determine, according to the association relationships among the N tasks, a task that has a synchronization association with other tasks, and when the other tasks that have a synchronization association with that task have finished executing, determine that task as a schedulable task. Having a synchronization association with other tasks means that the task has an association relationship with all other tasks acquired earlier or later than it.
In a possible design, the completion of the other tasks having a synchronization association with a task includes at least one of the following: the task is a forward synchronization task and all other tasks acquired earlier than it have finished executing; or the task is acquired later than a backward synchronization task and that backward synchronization task has finished executing. A forward synchronization task is used to indicate that the forward synchronization task depends on all other tasks acquired earlier than it; a backward synchronization task is used to indicate that all other tasks acquired later than it depend on the backward synchronization task.
In a possible design, the blocking manager is specifically configured to: determine, according to the association relationships among the N tasks, a task that has a dependency association with other tasks, and when the other tasks that have a dependency association with that task have been executed or have finished executing, determine that task as a schedulable task.
In a possible design, the blocking manager is specifically configured to: determine, according to the association relationships among the N tasks, the tasks among the N tasks that are neither synchronization-blocked nor dependency-blocked by other tasks as schedulable tasks. A task being synchronization-blocked by other tasks includes: the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing; or the task is acquired later than a backward synchronization task and that backward synchronization task has not finished executing. A forward synchronization task is defined to depend on all other tasks acquired earlier than it; a backward synchronization task is defined such that all other tasks acquired later than it depend on it. A task being dependency-blocked by other tasks includes: the task depends on other tasks, and those tasks have not all been executed or have not all finished executing.
In a possible design, the blocking manager is specifically configured to: traverse, in the acquisition order of the N tasks, each task among the N tasks that has not yet been determined to be blocked, and for each such task: if the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing, determine that the task is synchronization-blocked; if the task is a backward synchronization task, determine that all other tasks acquired later than it are synchronization-blocked; if the task depends on other tasks and those tasks have not finished executing, determine that the task is dependency-blocked; if the task serially depends on other tasks and those tasks have not all been executed, determine that the task is dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, determine the task as a schedulable task.
In a possible design, a task being synchronization-blocked includes the task being synchronization-blocked at the whole-group level and/or at the subgroup level. The other tasks used to determine whether a task is synchronization-blocked at the whole-group level are the tasks, among the N tasks, acquired earlier or later than the task; the other tasks used to determine whether a task is synchronization-blocked at the subgroup level are the tasks, in the subgroup to which the task belongs, acquired earlier or later than the task.
In a possible design, the subgroups are obtained by partitioning according to service characteristics.
In a possible design, the task scheduler further includes: a first waiting queue, a second waiting queue, and a ready queue. In this case, after acquiring the N tasks, the task acquirer is further configured to store the N tasks in the first waiting queue. The blocking manager is specifically configured to: for any task in the first waiting queue, determine whether the task is synchronization-blocked by other tasks, and if not, move it from the first waiting queue to the second waiting queue; and for any task in the second waiting queue, determine whether the task is dependency-blocked by other tasks, and if not, move it from the second waiting queue to the ready queue. The task dispatcher is specifically configured to schedule the tasks in the ready queue to the processing unit.
In a possible design, the task dispatcher is specifically configured to: monitor the number of tasks currently being executed by the processing unit, and when that number is smaller than the number of tasks the processing unit can execute in parallel, schedule the tasks in the ready queue to the processing unit one by one in the order in which they were acquired.
In a third aspect, the present application provides a task scheduler, including: an acquisition unit, configured to acquire N tasks and the association relationships among the N tasks, where N is a positive integer; a determination unit, configured to determine, according to the association relationships among the N tasks, the tasks among the N tasks that have no dependency or whose dependency has been released as schedulable tasks; and a scheduling unit, configured to schedule the schedulable tasks to a processing unit.
In a possible design, the determination unit is specifically configured to: determine, according to the association relationships among the N tasks, a task that has a synchronization association with other tasks, and when the other tasks that have a synchronization association with that task have finished executing, determine that task as a schedulable task. Having a synchronization association with other tasks means that the task has an association relationship with all other tasks acquired earlier or later than it.
In a possible design, the completion of the other tasks having a synchronization association with a task includes at least one of the following: the task is a forward synchronization task and all other tasks acquired earlier than it have finished executing; or the task is acquired later than a backward synchronization task and that backward synchronization task has finished executing. A forward synchronization task is used to indicate that the forward synchronization task depends on all other tasks acquired earlier than it; a backward synchronization task is used to indicate that all other tasks acquired later than it depend on the backward synchronization task.
In a possible design, the determination unit is specifically configured to: determine, according to the association relationships among the N tasks, a task that has a dependency association with other tasks, and when the other tasks that have a dependency association with that task have been executed or have finished executing, determine that task as a schedulable task.
In a possible design, the determination unit is specifically configured to: determine, according to the association relationships among the N tasks, the tasks among the N tasks that are neither synchronization-blocked nor dependency-blocked by other tasks as schedulable tasks. A task being synchronization-blocked by other tasks includes: the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing; or the task is acquired later than a backward synchronization task and that backward synchronization task has not finished executing. A forward synchronization task is defined to depend on all other tasks acquired earlier than it; a backward synchronization task is defined such that all other tasks acquired later than it depend on it. A task being dependency-blocked by other tasks includes: the task depends on other tasks, and those tasks have not all been executed or have not all finished executing.
In a possible design, the determination unit is specifically configured to: traverse, in the acquisition order of the N tasks, each task among the N tasks that has not yet been determined to be blocked, and for each such task: if the task is a forward synchronization task and the other tasks acquired earlier than it have not all finished executing, determine that the task is synchronization-blocked; if the task is a backward synchronization task, determine that all other tasks acquired later than it are synchronization-blocked; if the task depends on other tasks and those tasks have not finished executing, determine that the task is dependency-blocked; if the task serially depends on other tasks and those tasks have not all been executed, determine that the task is dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, determine the task as a schedulable task.
In a possible design, a task being synchronization-blocked includes the task being synchronization-blocked at the whole-group level and/or at the subgroup level. The other tasks used to determine whether a task is synchronization-blocked at the whole-group level are the tasks, among the N tasks, acquired earlier or later than the task; the other tasks used to determine whether a task is synchronization-blocked at the subgroup level are the tasks, in the subgroup to which the task belongs, acquired earlier or later than the task.
In a possible design, the subgroups are obtained by partitioning according to service characteristics.
In a possible design, after acquiring the N tasks and their association relationships, the acquisition unit is further configured to store the N tasks in a first waiting queue. The determination unit is specifically configured to: for any task in the first waiting queue, determine whether the task is synchronization-blocked by other tasks, and if not, move it from the first waiting queue to a second waiting queue; and for any task in the second waiting queue, determine whether the task is dependency-blocked by other tasks, and if not, move it from the second waiting queue to a ready queue. The scheduling unit is specifically configured to schedule the tasks in the ready queue to the processing unit.
In a possible design, the scheduling unit is specifically configured to: monitor the number of tasks currently being executed by the processing unit, and when that number is smaller than the number of tasks the processing unit can execute in parallel, schedule the tasks in the ready queue to the processing unit one by one in the order in which they were acquired.
In a fourth aspect, the present application provides a chip, including a task scheduler, where the task scheduler is configured to implement the method described in any one of the designs of the first aspect above.
In a fifth aspect, the present application provides a processor, including a task scheduler and a processing unit, where the task scheduler is configured to execute the method described in any one of the designs of the first aspect above, and the processing unit is configured to execute the tasks scheduled to it by the task scheduler.
In a sixth aspect, the present application provides an electronic device, including a processor coupled to a memory, where the processor is configured to execute a computer program stored in the memory, so that the electronic device performs the method described in any one of the designs of the first aspect above.
In a seventh aspect, the present application provides a task scheduling system, including a device driver package and the processor described in the fourth aspect above, where the device driver package is configured to send N tasks to the processor, N being a positive integer, and the processor is configured to process the N tasks.
In an eighth aspect, the present application provides a computer-readable storage medium storing a computer program, where the computer program, when run, implements the method described in any one of the designs of the first aspect above.
In a ninth aspect, the present application provides a computer program product, which, when run on a processor, implements the method described in any one of the designs of the first aspect above.
For the beneficial effects of the second to ninth aspects above, refer to the technical effects achievable by the corresponding designs of the first aspect; details are not repeated here.
FIG. 1 is a schematic diagram of the system architecture of a task processing system according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a processor according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a task processing method provided in the industry;
FIG. 4 is a schematic flowchart of game task processing provided in the industry;
FIG. 5 is a schematic flowchart corresponding to the task scheduling method according to Embodiment 1 of the present application;
FIG. 6 is a schematic layout diagram of tasks having synchronization associations according to an embodiment of the present application;
FIG. 7 is a schematic layout diagram of tasks having dependency associations according to an embodiment of the present application;
FIG. 8 is a schematic flowchart of processing game tasks by using the task scheduling method of Embodiment 1 of the present application;
FIG. 9 is a schematic flowchart corresponding to the task scheduling method according to Embodiment 2 of the present application;
FIG. 10 is a schematic diagram of a task scheduling procedure according to an embodiment of the present application;
FIG. 11 is a schematic diagram of another task scheduling procedure according to an embodiment of the present application;
FIG. 12 is a schematic diagram of yet another task scheduling procedure according to an embodiment of the present application;
FIG. 13 is a schematic structural diagram of a task scheduler according to an embodiment of the present application.
The task scheduling solution disclosed in the present application can be applied to electronic devices with task processing capability. In some embodiments of the present application, the task scheduler may be an independent unit embedded in an electronic device, capable of assigning tasks to a processor core of the electronic device when that core is idle, so as to make maximum use of the processor core's power margin within its processing capability and improve its task processing capability. In other embodiments of the present application, the task scheduler may also be a unit packaged inside the electronic device and used to implement the electronic device's task scheduling function. The electronic device may be a computer device with a processor, such as a desktop computer, a personal computer, or a server. It should also be understood that the electronic device may be a portable electronic device with a processor, such as a mobile phone, a tablet computer, a wearable device with a wireless communication function (such as a smart watch), or an in-vehicle device. Exemplary embodiments of portable electronic devices include, but are not limited to, portable electronic devices running … or another operating system. The portable electronic device may also be, for example, a laptop computer with a touch-sensitive surface (such as a touch panel).
In order to make the purpose, technical solutions, and advantages of the present application clearer, the technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the accompanying drawings of the embodiments of the present invention.
It should be understood that the described embodiments are only some of the embodiments of the present invention, rather than all of them. The specific operating methods in the method embodiments can also be applied to the device embodiments or the system embodiments. Moreover, in the description of this application, "at least one" means one or more, and "multiple" means two or more; in view of this, "multiple" can also be understood as "at least two" in the embodiments of the present invention. The term "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate three cases: A exists alone, both A and B exist, and B exists alone. In addition, unless otherwise specified, the character "/" generally indicates an "or" relationship between the associated objects before and after it. It should also be understood that in the description of this application, terms such as "first" and "second" are used only to distinguish between the objects being described, and cannot be understood as indicating or implying relative importance or order.
In addition, in the embodiments of the present application, "connection" can be understood as an electrical connection, and the connection between two electrical components can be direct or indirect. For example, "A is connected to B" can mean either that A is directly connected to B, or that A is indirectly connected to B through one or more other electrical components; for instance, A is directly connected to C, C is directly connected to B, and A is thereby connected to B through C. In some scenarios, "connection" can also be understood as coupling, such as the electromagnetic coupling between two inductors. In short, the connection between A and B enables electrical energy to be transmitted between A and B.
FIG. 1 exemplarily shows a system architecture diagram of a task processing system provided by an embodiment of the present application. The illustrated task processing system 10 includes an application (APP) 100, a device development kit (DDK) 200, and a processor 300; the application 100 is connected to the device driver package 200, and the device driver package 200 is further connected to the processor 300. The device driver package 200, also called driver software, includes a kernel mode driver (KMD) 210 and a user mode driver (UMD) 220; the user mode driver is also called a user-mode graphics driver. Kernel mode and user mode are two different driver modes, and the device driver package 200 switches between them according to the type of code being run. Generally, most drivers are kernel mode drivers 210 and run in kernel mode, while some drivers are user mode drivers 220 and run in user mode. Since the kernel mode driver 210 and the user mode driver 220 are not closely related to the solution of the present application, the embodiments of the present application do not describe them in detail.
Further, FIG. 2 exemplarily shows a schematic structural diagram of a processor provided in an embodiment of the present application. The illustrated processor 300 may include one or more chips, for example a system-on-a-chip (SoC) or a chipset formed by multiple chips. The processor 300 may include at least one processing unit, such as the neural-network processing unit (NPU), graphics processing unit (GPU), and central processing unit (CPU) shown in FIG. 2, and may also include an application processor (AP), a modem processor, an image signal processor (ISP), a video codec, a digital signal processor (DSP), and/or a baseband processor. The at least one processing unit, also called at least one processing subsystem, is a core component of the processor 300 and is used to implement the processing functions of the processor 300. The different processing units of the at least one processing unit may be deployed on different chips or integrated on one chip, which is not specifically limited.
Continuing to refer to FIG. 2, the processor 300 may also include non-core components, such as general-purpose units (including counters, decoders, signal generators, etc.), accelerator units, input/output control units, interface units, internal memories, and external buffers. The internal memory and the external buffer are collectively referred to as the storage unit of the processor 300 and are used to store instructions and data. These instructions and data can be called so that, when the processor is processing tasks, a task that is neither synchronization-blocked nor dependency-blocked is selected and scheduled to a currently idle processor core. In some embodiments, the storage unit can be a cache memory, which can hold instructions or data that have just been used or are used cyclically. When such an instruction or datum is needed again, it can be called directly from the cache memory, thereby avoiding repeated access, reducing waiting time, and improving task processing efficiency.
Further exemplarily, each processing unit in the processor 300 may include one or more processor cores. For example, in the processor 300 shown in FIG. 2, the NPU includes three NPU cores, namely NPU core 1 to NPU core 3; the GPU includes five GPU cores, namely GPU core 1 to GPU core 5; and the CPU includes five CPU cores, namely CPU core 1 to CPU core 5. A processor core, also called an execution unit, is used to execute an entire task or part of a task fragment. When multiple processor cores are included, they can be divided into one or more voltage domains, and the processor cores in the same voltage domain share the same operating voltage and the same operating frequency. Moreover, the multiple processor cores in the same voltage domain can be multi-core heterogeneous, that is, have different structures and be used to process different tasks or different task fragments respectively, or multi-core homogeneous, that is, have the same structure and be used to jointly process the same task or the same task fragment; the embodiments of the present application do not specifically limit this.
Further exemplarily, the processor 300 may also include a task scheduler 310 connected to each processor core. In one example, the connection between the task scheduler 310 and each processor core can be realized through a bus system. In this way, during operation of the processor 300, after finishing a task or task fragment, each processor core can publish an idle message on the bus system; by monitoring the bus system, the task scheduler 310 learns which processor cores are currently idle and, when there is a task to be scheduled, schedules that task to an idle processor core.
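The idle-message mechanism above can be sketched in software as follows. This is a minimal illustrative model, not the patented hardware: bus messages are modeled as method calls, and all names are assumptions.

```python
from collections import deque

class TaskScheduler:
    """Dispatches pending tasks to processor cores that have announced
    themselves idle on the bus (modeled here as simple method calls)."""

    def __init__(self, cores):
        self.idle_cores = deque(cores)   # cores that published an "idle" message
        self.pending = deque()           # tasks waiting to be scheduled
        self.dispatched = []             # (task, core) pairs, for inspection

    def on_core_idle(self, core):
        # A core finished its task or task fragment and published "idle" on the bus.
        self.idle_cores.append(core)
        self._dispatch()

    def submit(self, task):
        self.pending.append(task)
        self._dispatch()

    def _dispatch(self):
        # While there is both work and an idle core, pair them up.
        while self.pending and self.idle_cores:
            task = self.pending.popleft()
            core = self.idle_cores.popleft()
            self.dispatched.append((task, core))

sched = TaskScheduler(["GPU core 1", "GPU core 2"])
sched.submit("task A")
sched.submit("task B")
sched.submit("task C")            # no idle core left: stays pending
sched.on_core_idle("GPU core 1")  # core 1 finishes and picks up task C
print(sched.dispatched)
```

The key property of the scheme is visible here: a task is never assigned eagerly to a busy core; it waits until some core announces itself idle.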
It should be understood that the illustrated processor 300 is only an example; the processor 300 may have more or fewer components than shown in the figure, may combine two or more components, or may have a different component configuration, which is not specifically limited in the embodiments of the present application.
In a specific application scenario, the above-mentioned application 100 may specifically be a program for generating images, such as a mobile phone camera, a camera, or a screen recording program. Correspondingly, the above-mentioned processor 300 may specifically be an image processor, which may contain only GPU cores and no other cores such as NPU cores or CPU cores. Moreover, when the processor 300 is an image processor, the tasks scheduled by the task scheduler 310 are tasks related to image processing. For example, in a video shooting scenario, a video is obtained by shooting and combining images frame by frame; after being captured, each frame usually needs to undergo operations such as Gaussian filtering, white balance, image denoising, image enhancement, image segmentation, and image rendering, and the processing of different frames usually needs to follow the shooting order to avoid long and short frames. In this process, the entire processing of a frame can be treated as one task, or each of the processing operations (Gaussian filtering, white balance, image denoising, image enhancement, image segmentation, or image rendering) can be treated as a separate task. In addition, there are certain association relationships between the tasks. For example, Gaussian filtering, white balance, image denoising, image enhancement, and image segmentation are all performed on the entire image and can be executed in parallel; that is, there are no dependencies among these operations. Image rendering, on the other hand, renders each small tile produced by image segmentation separately and can only be performed after image segmentation; therefore, image rendering depends on image segmentation.
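The per-frame association relationships described above can be written down as a small dependency graph. This is an illustrative sketch only; the task names are not the patent's, and the patent does not prescribe this representation.

```python
# Per-frame dependency relation described above: the five whole-image
# operations have no dependencies among themselves, while rendering
# depends on segmentation.
deps = {
    "gaussian_filter": set(),
    "white_balance":   set(),
    "denoise":         set(),
    "enhance":         set(),
    "segmentation":    set(),
    "rendering":       {"segmentation"},
}

def runnable(done):
    """Tasks not yet done whose dependencies are all satisfied; these can run in parallel."""
    return {t for t, d in deps.items() if t not in done and d <= done}

print(runnable(set()))              # everything except rendering
print(runnable({"segmentation"}))   # rendering is now released
```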
Further, taking the video shooting scenario as an example, after capturing images frame by frame, the application 100 creates an image queue and records commands, and then sends the image queue to the device driver package 200. The device driver package 200 parses the commands in the image queue, translates them into tasks recognizable by the processor 300, and delivers them to the task scheduler 310 in the processor 300 in the form of tasks, task chains, or command streams. The task scheduler 310 monitors the working status of each GPU core in the processor 300 and, when it determines that an idle GPU core exists, sends a pending task to the idle GPU core for processing. A pending task may be sent to a single GPU core for separate processing, or to multiple GPU cores for joint processing; when sent to multiple GPU cores, those cores may belong to the same voltage domain or to different voltage domains, which is not specifically limited.
Furthermore, there are usually dependencies between tasks; for example, a task may be executable only after other tasks have started or finished, which imposes requirements on the execution order of the tasks. However, if the task scheduler 310 executed all received tasks strictly in order, tasks that do not depend on other tasks could be blocked, which is clearly not conducive to improving GPU core utilization. To address this problem, FIG. 3 exemplarily shows a flow chart of a task processing method provided by the industry. As shown in FIG. 3, this method pre-configures multiple task queues in the task scheduler 310, such as task queue L1, task queue L2, ..., task queue Lm, where m is a positive integer. Each task queue corresponds to one service; for example, task queue L1 corresponds to the Gaussian filtering service, task queue L2 corresponds to the white balance service, ..., and task queue Lm corresponds to the image segmentation and image rendering services. Under this configuration, after parsing out multiple tasks, the device driver package 200 distributes each task to its corresponding task queue according to the service to which it belongs. When invoking GPU cores, the task scheduler 310 calls idle GPU cores to process the tasks in the task queues according to the principle of executing tasks from different task queues in parallel and executing tasks within a single task queue serially.
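The per-queue scheduling rule of FIG. 3 can be sketched as follows, assuming one queue per service; queue contents and names are illustrative, not from the patent.

```python
# Industry scheme of FIG. 3: one queue per service. The head task of each
# queue may run in parallel with the heads of other queues, but within a
# queue execution is strictly serial.
queues = {
    "L1": ["gauss_f1", "gauss_f2"],   # Gaussian filtering service
    "L2": ["wb_f1", "wb_f2"],         # white balance service
    "Lm": ["seg_f1", "render_f1"],    # segmentation + rendering service
}

def schedulable_heads(queues):
    """Only the first unfinished task of each queue is eligible for dispatch."""
    return [q[0] for q in queues.values() if q]

print(schedulable_heads(queues))  # exactly one candidate per non-empty queue
```

Note that `gauss_f2` is not a candidate even if it depends on nothing: serial order within a queue hides it behind `gauss_f1`, which is precisely the limitation the next paragraphs analyze.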
With the task processing method illustrated in FIG. 3, although tasks in different task queues can be executed in parallel, tasks in the same task queue must be executed serially. That is, within one task queue, a later task must wait until all tasks ahead of it have finished before it can execute; even if the later task does not depend on the earlier tasks and idle GPU cores exist in the processor, the idle GPU cores cannot be called to process the later task in advance. For example, suppose a game contains a Binning service and a Rendering service, where the Binning service includes the three tasks Binning1, Binning2, and Binning3, the Rendering service includes the three tasks Rendering1, Rendering2, and Rendering3, and the dependencies between the tasks are: Binning2 depends on Rendering1, and Renderingi depends on Binningi, where i takes the value 1, 2, or 3. Then:
FIG. 4 shows a flow chart of processing these game tasks using the task scheduling method provided by the industry. As shown in FIG. 4, Binning1 to Binning3 are stored in one task queue and Rendering1 to Rendering3 in another, and the task scheduler calls GPU cores to execute the Binning tasks and Rendering tasks in these two queues in parallel. Since Binning2 depends on Rendering1, and Rendering1 in turn depends on Binning1, Binning2 cannot execute until Rendering1 has finished, and Rendering1 cannot execute until Binning1 has finished. Therefore, there must be a gap 1 between Binning1 and Binning2, a gap 2 between Rendering1 and Rendering2, and a gap 3 before Rendering1. It can be seen that, in this processing scheme, even though Rendering3 does not depend on Rendering2, Rendering2 does not depend on Rendering1, and Binning3 does not depend on Binning1 or Binning2, tasks within one queue are executed sequentially, so Rendering3, Rendering2, and Binning3 cannot be executed in advance to fill gaps 1, 2, or 3. Obviously, this method cannot effectively improve the utilization of GPU cores.
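The cost of the gaps can be made concrete with a toy unit-time simulation on two cores, comparing the industry scheme (true dependencies plus intra-queue serial order) with scheduling by dependencies alone. This is a minimal sketch under assumed unit task durations, not the patent's method.

```python
def finish_times(preds):
    """Greedy unit-time schedule on two cores honoring predecessor sets."""
    done, t, finish = set(), 0, {}
    while len(done) < len(preds):
        ready = [x for x in preds if x not in done and preds[x] <= done]
        step = ready[:2]          # at most two cores per time step
        for x in step:
            finish[x] = t + 1
        done |= set(step)
        t += 1
    return finish

# True dependencies of the game example: Binning2 depends on Rendering1,
# and Rendering_i depends on Binning_i (i = 1, 2, 3).
deps = {
    "Binning1": set(), "Binning2": {"Rendering1"}, "Binning3": set(),
    "Rendering1": {"Binning1"}, "Rendering2": {"Binning2"},
    "Rendering3": {"Binning3"},
}

# Industry scheme: add intra-queue serial order on top of the dependencies.
serial = {k: set(v) for k, v in deps.items()}
serial["Binning2"] |= {"Binning1"}
serial["Binning3"] |= {"Binning2"}
serial["Rendering2"] |= {"Rendering1"}
serial["Rendering3"] |= {"Rendering2"}

print(max(finish_times(serial).values()))  # 5 steps: gaps cannot be filled
print(max(finish_times(deps).values()))    # 4 steps: Binning3/Rendering3 fill gaps
```

Even in this six-task example the artificial serial order costs one full time step; with longer queues the wasted core time grows accordingly.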
In view of this, an embodiment of the present application provides a task scheduling method in which a task scheduler maintains the scheduling order of the tasks on the hardware side. Once it is determined that a task has no dependencies or that its dependencies have been resolved, the task can be delivered in advance to an idle processing unit for processing, so that all hardware resources of the processing unit are fully used to process tasks with high concurrency, GPU cores are kept from idling as much as possible, and the utilization of the processing unit is effectively improved.
The following describes how the above technical problems are solved through specific embodiments of the present application. It should be noted that, in the following description, the task scheduler can be the task scheduler 310 in FIG. 2, or a communication apparatus capable of supporting the processor in implementing the functions required by the method; it can of course also be another communication apparatus or communication system, such as a chip, a chip system, a circuit, or a circuit system, which is not specifically limited.
Embodiment 1
FIG. 5 exemplarily shows a flow chart corresponding to the task scheduling method provided in Embodiment 1 of the present application; the method is applicable to a task scheduler, such as the task scheduler 310 shown in FIG. 2. As shown in FIG. 5, the method includes:
Step 501: The task scheduler obtains N tasks and the association relationships of the N tasks, where N is a positive integer.
The N tasks may be delivered to the task scheduler by the device driver package, and the association relationships of the N tasks may be delivered to the task scheduler by the device driver package, or may be obtained by the task scheduler from other channels, for example by accessing a service system, which is not specifically limited.
In one example, the N tasks and their association relationships may be delivered to the task scheduler by the device driver package in the form of tasks, task chains, or command streams. Specifically, the association relationship between each task and the other tasks may be explicitly written into the data structure of that task as configuration information. Therefore, after obtaining the N tasks, the task scheduler can parse the data structure of each task to learn what kind of association each task has with the other tasks.
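One way to picture such a per-task data structure is sketched below. All field names are assumptions for illustration; the patent does not disclose a concrete layout.

```python
from dataclasses import dataclass, field

# Hypothetical per-task descriptor: the association configuration is written
# explicitly into each task's data structure, as described above.
@dataclass
class TaskDescriptor:
    task_id: int
    group_id: int = 0                   # whole group by default
    forward_sync: bool = False          # depends on all earlier tasks in its group
    backward_sync: bool = False         # all later tasks in its group depend on it
    depends_on: list = field(default_factory=list)  # explicit partial dependencies

def parse_associations(tasks):
    """What the scheduler learns by parsing each task's data structure."""
    return {t.task_id: {"group": t.group_id,
                        "fwd": t.forward_sync,
                        "bwd": t.backward_sync,
                        "deps": list(t.depends_on)} for t in tasks}

tasks = [TaskDescriptor(1),
         TaskDescriptor(2, forward_sync=True),
         TaskDescriptor(3, depends_on=[1])]
info = parse_associations(tasks)
print(info[2]["fwd"], info[3]["deps"])
```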
In the embodiments of the present application, the association relationships of the N tasks may include synchronization associations and/or dependency associations, which are described in detail below.
Synchronization association
A synchronization association indicates the relationship between a task and all other tasks delivered earlier or later than that task. Specifically, synchronization associations include forward synchronization associations and backward synchronization associations. A forward synchronization association means that a task depends on all other tasks delivered earlier than it; that is, the task can be executed only after every other task delivered earlier has finished. A task with a forward synchronization association to other tasks is also called a forward synchronization task; as long as any task delivered earlier than the forward synchronization task has not finished, the forward synchronization task is synchronization-blocked by that task. Conversely, a backward synchronization association means that all other tasks delivered later than a task depend on that task; that is, each other task delivered later can be executed only after that task has finished. A task with a backward synchronization association to other tasks is also called a backward synchronization task; as long as the backward synchronization task has not finished, all other tasks delivered later than it are synchronization-blocked by the backward synchronization task.
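The two blocking rules above can be sketched as simple predicates. Tasks are identified by integer ids reflecting delivery order within one group; function and variable names are illustrative assumptions.

```python
# Sketch of the synchronization-blocking rules described above.

def forward_sync_blocked(task_id, all_ids, finished):
    """A forward synchronization task is blocked while ANY task delivered
    earlier than it (smaller id in the same group) is unfinished."""
    return any(i not in finished for i in all_ids if i < task_id)

def blocked_by_backward(task_id, backward_sync_ids, finished):
    """A plain task is blocked while some backward synchronization task
    delivered earlier than it has not finished."""
    return any(b < task_id and b not in finished for b in backward_sync_ids)

ids = [1, 2, 3, 4]
print(forward_sync_blocked(3, ids, finished={1}))        # True: task 2 unfinished
print(forward_sync_blocked(3, ids, finished={1, 2}))     # False: all earlier done
print(blocked_by_backward(4, [2], finished=set()))       # True: sync task 2 pending
print(blocked_by_backward(4, [2], finished={2}))         # False: released
```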
It should be noted that, in the embodiments of the present application, the N tasks belong to one whole group, and at least two of the N tasks may additionally belong to at least one subgroup. Specifically, when only the whole group exists in the task scheduler and no subgroup exists, all N tasks belong only to the whole group; when both the whole group and subgroups exist in the task scheduler, a task may belong only to the whole group, or to the whole group and one or more subgroups at the same time. On this basis, in the above synchronization associations, the other tasks delivered earlier than a forward synchronization task specifically refer to: when a task is marked as a forward synchronization task of the whole group, the other tasks in the whole group delivered earlier than that forward synchronization task; when a task is marked as a forward synchronization task of a subgroup, the other tasks in that subgroup delivered earlier than that forward synchronization task. Similarly, the other tasks delivered later than a backward synchronization task specifically refer to: when a task is marked as a backward synchronization task of the whole group, the other tasks in the whole group delivered later than that backward synchronization task; when a task is marked as a backward synchronization task of a subgroup, the other tasks in that subgroup delivered later than that backward synchronization task.
Further, the grouping of the N tasks may be performed by the device driver package and notified to the task scheduler by being carried in the tasks' data structures, or may be performed by the task scheduler itself; the grouping basis may be, for example, service characteristics or synchronization associations. Taking grouping of tasks by the device driver package as an example:
In one example, the device driver package determines, according to the service characteristics of the N tasks, the tasks belonging to the same service, marks the same group identifier in the data structures of those tasks, and sends them to the task scheduler. After receiving the N tasks, the task scheduler parses their data structures, obtains the tasks carrying the same group identifier, creates a corresponding virtual subgroup, and stores those tasks in that virtual subgroup. The group identifier may be a service name, a service code, a group number, or another mark that can identify the same service, which is not specifically limited. In this example, by grouping tasks according to service characteristics, even if the tasks in one group are synchronization-blocked, the tasks in other groups are not affected; that is, the synchronization blocking of one service does not affect the execution of other services. This decouples the task execution associations of the services and reduces the mutual interference between them.
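The virtual-subgroup step above amounts to bucketing tasks by their group identifier. A minimal sketch, with illustrative (task_id, group_id) pairs:

```python
from collections import defaultdict

# (task_id, group_id) pairs as carried in the tasks' data structures;
# group identifiers here are illustrative service names.
tasks = [(1, "white_balance"), (2, "gauss"), (3, "white_balance"),
         (4, "gauss"), (5, "render")]

subgroups = defaultdict(list)
for task_id, group_id in tasks:
    subgroups[group_id].append(task_id)   # one virtual subgroup per identifier

# Synchronization blocking inside one subgroup leaves the others untouched.
print(dict(subgroups))
```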
In another example, the device driver package determines, according to the association relationships of the N tasks, the tasks whose association relationships are relatively concentrated, marks the same group identifier in the data structures of those tasks, and sends them to the task scheduler, which creates a virtual subgroup according to the method in the previous example. For example, if it is determined that a task depends on at least three tasks, the device driver package may determine that task and the at least three tasks it depends on as one subgroup, mark the same group identifier in the data structures of these at least four tasks, and additionally mark the task as a forward synchronization task in its data structure. Alternatively, if it is determined that a task is depended on by at least three tasks, the device driver package may determine that task and the at least three tasks that depend on it as one subgroup, mark the same group identifier in the data structures of these at least four tasks, and additionally mark the task as a backward synchronization task in its data structure. In this example, by dividing the tasks with relatively concentrated association relationships into one subgroup, only the forward synchronization task or the backward synchronization task in that subgroup needs to be marked to clearly indicate the associations among the tasks in the subgroup, without marking the dependency relationship of every task. This streamlines the data structure of the entire task chain and reduces the communication overhead between the device driver package and the task scheduler.
It should be noted that, for the specific implementation of grouping performed by the task scheduler, refer to the above description of grouping by the device driver package, which is not repeated here. In addition, the device driver package or the task scheduler may also group tasks according to other characteristics, such as characteristics indicated by a user, which is not specifically limited in the embodiments of the present application.
Dependency association
Dependency associations include partial dependency associations and serial dependency associations. A partial dependency association means that a task depends on the execution results of one or more other tasks; the task can be executed only after all of the one or more other tasks it depends on have finished. It should be noted that, since a synchronization association already defines a task's dependence on, or being depended on by, all other tasks earlier or later than it, a partial dependency association can, on top of the synchronization associations, only further define that a task depends on some, rather than all, of the other tasks; for this reason, this dependency relationship is called a partial dependency association. Correspondingly, a serial dependency association means that a task depends on the execution of one or more other tasks; the task can be executed only after all of the other tasks it depends on have started executing. Moreover, since a serial dependency association does not require the other tasks it depends on to have finished, the tasks that a serial dependency association depends on may be all of the other tasks or only some of them.
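The distinction between the two dependency kinds reduces to which task state releases the waiter: "finished" for a partial dependency, "started" for a serial dependency. A minimal sketch, with illustrative names:

```python
# Partial dependency: released only when the depended-on tasks have FINISHED
# (their execution results are needed).
def partial_dep_released(dep_ids, finished):
    return all(d in finished for d in dep_ids)

# Serial dependency: released as soon as the depended-on tasks have STARTED.
def serial_dep_released(dep_ids, started):
    return all(d in started for d in dep_ids)

started, finished = {1, 2}, {1}
print(partial_dep_released({1, 2}, finished))  # False: task 2 has only started
print(serial_dep_released({1, 2}, started))    # True: both have begun
```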
In the embodiments of the present application, definition by synchronization association only requires defining the forward synchronization task or the backward synchronization task: this simultaneously defines that the forward synchronization task depends on all other tasks delivered earlier than it, or that all other tasks delivered later than the backward synchronization task depend on it. If the same definition were made by dependency association, all tasks that the task depends on would have to be defined. In other words, compared with dependency associations, synchronization associations can define the dependency relationships among multiple tasks with more compact content. On this basis, the above dependency associations may be identified and marked after the synchronization associations have been determined; that is, the synchronization associations of the tasks are determined first, and after that determination is completed, dependency associations are marked for the remaining tasks that have dependencies. In this way, the data structures of the N tasks contain all the association relationships without an association description being required for each task, which helps simplify the data structures.
Step 502: The task scheduler determines, according to the association relationships among the N tasks, those of the N tasks that have no dependencies, or whose dependencies have been released, as schedulable tasks.
Exemplarily, when the association relationships of the N tasks include the above synchronization association, the task scheduler may determine, according to the association relationships of the N tasks, a task that has a synchronization association with other tasks; when the other tasks having a synchronization association with that task have finished executing, the task is determined as a schedulable task. Here, "the other tasks having a synchronization association with the task have finished executing" may exemplarily include at least one of the following: the task is a forward synchronization task and all other tasks acquired earlier than the task have finished executing; or the task is a task acquired later than a backward synchronization task and the backward synchronization task has finished executing. And/or,
when the association relationships of the N tasks include the above dependency association, the task scheduler may determine, according to the association relationships of the N tasks, a task that has a dependency association with other tasks; when the other tasks having a dependency association with that task have all been executed or have all finished executing, the task is determined as a schedulable task.
In the above examples, by monitoring the execution status of tasks that have synchronization associations and/or dependency associations, a task can be scheduled to the processing unit in a timely manner once the other tasks associated with it no longer block it.
In a specific implementation, the task scheduler may determine, according to the association relationships of the N tasks, those of the N tasks that are neither synchronization-blocked nor dependency-blocked by other tasks as schedulable tasks. A task not being synchronization-blocked by other tasks means that the task is no longer blocked by the tasks that have a synchronization association with it, that is, the other tasks having a synchronization association with the task have finished executing. A task not being dependency-blocked by other tasks means that the task is not blocked by the tasks that have a dependency association with it, that is, the other tasks having a dependency association with the task have finished executing (or, in the case of a serial dependency association, have all started executing).
In the above content, a task being synchronization-blocked by other tasks means that the task satisfies at least one of the following conditions:
Condition 1: the task is a forward synchronization task in the whole group, and not all of the other tasks in the whole group issued earlier than the task have finished executing.
Condition 2: the task is a forward synchronization task in a subgroup, and not all of the other tasks in that subgroup issued earlier than the task have finished executing.
Condition 3: the task is a task in the whole group issued later than a backward synchronization task, and the backward synchronization task has not finished executing.
Condition 4: the task is a task in a subgroup issued later than a backward synchronization task, and the backward synchronization task has not finished executing.
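Conditions 1 to 4 can be condensed into a single predicate. The sketch below is a non-authoritative illustration: each task is modeled as a tuple `(forward_sync, backward_sync, subgroup)` keyed by its issue index, `done` holds the indices of finished tasks, and `subgroup=None` means the task exists only in the whole group.

```python
def in_scope(tasks, j, subgroup):
    """Task j falls in the 'whole group' scope (subgroup None) always,
    and in a subgroup's scope only if it carries the same subgroup."""
    return subgroup is None or tasks[j][2] == subgroup

def sync_blocked(tasks, i, done):
    """True if task i satisfies at least one of Conditions 1 to 4."""
    fwd, _, grp = tasks[i]
    # Conditions 1 and 2: a forward synchronization task is blocked while
    # any earlier task of its scope has not finished executing.
    if fwd and any(j < i and in_scope(tasks, j, grp) and j not in done
                   for j in tasks):
        return True
    # Conditions 3 and 4: an unfinished backward synchronization task
    # blocks every later task of its scope.
    for j, (_, bwd, g) in tasks.items():
        if bwd and j < i and j not in done and (g is None or g == grp):
            return True
    return False
```

Against the layout of FIG. 6(A), where Task 3 is a whole-group backward synchronization task, this predicate reports Tasks 4 to 6 as blocked until Task 3 finishes.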
For example, FIG. 6 exemplarily shows a schematic diagram of a task layout with synchronization associations according to an embodiment of this application. In this example there are seven tasks, Task 0 to Task 6, issued to the task scheduler in the order Task 0, Task 1, Task 2, Task 3, Task 4, Task 5, Task 6. Refer to FIG. 6(A) through FIG. 6(F):
In FIG. 6(A), Task 0 to Task 6 are not grouped; that is, they exist only in the whole group. In the task queue illustrated in FIG. 6(A), Task 3 is marked as a backward synchronization task, meaning that Task 4, Task 5 and Task 6, issued later than Task 3, can be executed only after Task 3 finishes. Therefore, as long as Task 3 has not finished executing, Task 4, Task 5 and Task 6 satisfy Condition 3 above; that is, they are synchronization-blocked.
FIG. 6(B) likewise leaves Task 0 to Task 6 ungrouped; that is, they exist only in the whole group. In the task queue illustrated in FIG. 6(B), Task 3 is marked as a forward synchronization task, meaning that Task 3 can be executed only after Task 0, Task 1 and Task 2, issued earlier than Task 3, have all finished executing. Therefore, as long as at least one of Task 0, Task 1 and Task 2 has not finished executing, Task 3 satisfies Condition 1 above and is synchronization-blocked.
FIG. 6(C) likewise leaves Task 0 to Task 6 ungrouped. In the task queue illustrated in FIG. 6(C), Task 3 is marked as both a forward synchronization task and a backward synchronization task: Task 3 can be executed only after Task 0, Task 1 and Task 2, issued earlier than Task 3, have all finished executing, and Task 4, Task 5 and Task 6, issued later than Task 3, can be executed only after Task 3 finishes. Therefore, as long as at least one of Task 0, Task 1 and Task 2 has not finished executing, Task 3 satisfies Condition 1 above while Task 4, Task 5 and Task 6 satisfy Condition 3 above, so Task 3, Task 4, Task 5 and Task 6 are all synchronization-blocked.
FIG. 6(D) places Task 3, Task 5 and Task 6 in the same subgroup, so these three tasks exist in both the whole group and that subgroup, while Task 0, Task 1, Task 2 and Task 4 exist in the whole group but not in that subgroup. In the task queue illustrated in FIG. 6(D), Task 3 is marked as the backward synchronization task of the subgroup, meaning that Task 5 and Task 6, issued later than Task 3 within the subgroup, can be executed only after Task 3 finishes. Therefore, as long as Task 3 has not finished executing, Task 5 and Task 6 satisfy Condition 4 above; that is, they are synchronization-blocked.
FIG. 6(E) places Task 0, Task 1 and Task 3 in the same subgroup, so these three tasks exist in both the whole group and that subgroup, while Task 2, Task 4, Task 5 and Task 6 exist in the whole group but not in that subgroup. In the task queue illustrated in FIG. 6(E), Task 3 is marked as the forward synchronization task of the subgroup, meaning that Task 3 can be executed only after Task 0 and Task 1, issued earlier than Task 3 within the subgroup, have both finished executing. Therefore, as long as at least one of Task 0 and Task 1 has not finished executing, Task 3 satisfies Condition 2 above and is synchronization-blocked.
FIG. 6(F) places Task 0, Task 1, Task 3, Task 5 and Task 6 in the same subgroup, so these five tasks exist in both the whole group and that subgroup, while Task 2 and Task 4 exist in the whole group but not in that subgroup. In the task queue illustrated in FIG. 6(F), Task 3 is marked as both the forward synchronization task and the backward synchronization task of the subgroup: Task 3 can be executed only after Task 0 and Task 1, issued earlier than Task 3 within the subgroup, have both finished executing, and Task 5 and Task 6, issued later than Task 3 within the subgroup, can be executed only after Task 3 finishes. Therefore, as long as at least one of Task 0 and Task 1 has not finished executing, Task 3 satisfies Condition 2 above while Task 5 and Task 6 satisfy Condition 4 above, so Task 3, Task 5 and Task 6 are all synchronization-blocked.
It should be noted that "a task exists in the whole group but not in the subgroup", as described above, may mean that the task exists only in the whole group and in no subgroup at all, or that the task exists in the whole group and in some other subgroup; this is not specifically limited. In addition, to reduce mutual interference between the execution of tasks in different subgroups, a task is usually placed in at most one subgroup rather than in two or more subgroups at the same time; that is, when multiple subgroups exist, the tasks in the multiple subgroups are all different.
Correspondingly, in the above content, a task being dependency-blocked by other tasks means that the task satisfies at least one of the following conditions:
Condition 1: the task has a partial dependency association with other tasks, and not all of the depended-on tasks have finished executing;
Condition 2: the task has a serial dependency association with other tasks, and not all of the depended-on tasks have started executing.
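The two dependency-blocking conditions reduce to set-membership tests. In this illustrative sketch, `started` and `done` are the sets of issue indices of tasks that have started and finished executing, with `done` a subset of `started`; the names are assumptions, not taken from this application.

```python
def dependency_blocked(partial_deps, serial_deps, started, done):
    """Condition 1: a partial dependency blocks until every depended-on
    task has finished executing. Condition 2: a serial dependency blocks
    only until every depended-on task has started executing."""
    if any(d not in done for d in partial_deps):
        return True
    if any(d not in started for d in serial_deps):
        return True
    return False
```

Note the asymmetry the document describes: under a partial dependency a started-but-unfinished task still blocks, while under a serial dependency it no longer does.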
For example, FIG. 7 exemplarily shows a schematic diagram of a task layout with dependency associations according to an embodiment of this application. In this example there are eight tasks, Task 0 to Task 7, and they are not grouped; that is, they exist only in the whole group. Within the whole group, when Task 0, Task 1, Task 3 and Task 4 have the above partial dependency association with Task 6: if Task 6 has not started executing, or has started but not yet finished, then Task 0, Task 1, Task 3 and Task 4 are dependency-blocked by Task 6. When Task 0, Task 1, Task 3 and Task 4 instead have the above serial dependency association with Task 6: if Task 6 has not started executing, they are likewise dependency-blocked by Task 6; but once Task 6 has started executing, regardless of whether it has finished, Task 0, Task 1, Task 3 and Task 4 are no longer dependency-blocked by Task 6.
Further exemplarily, based on the above, the task scheduler may determine, in the following manner, the schedulable tasks among the N tasks that are neither synchronization-blocked nor dependency-blocked by other tasks:
After obtaining the N tasks, the task scheduler traverses, in the order in which the N tasks were issued, each of the N tasks that has not yet been determined to be blocked. When traversing each such task: if the task is a forward synchronization task in the whole group and not all of the other tasks in the whole group issued earlier than it have finished executing, the task is determined to be synchronization-blocked by those earlier tasks; if the task is a forward synchronization task in a subgroup and not all of the other tasks in that subgroup issued earlier than it have finished executing, the task is determined to be synchronization-blocked by those earlier tasks; if the task is a backward synchronization task in the whole group, all other tasks in the whole group issued later than it are determined to be synchronization-blocked by the task; if the task is a backward synchronization task in a subgroup, all other tasks in that subgroup issued later than it are determined to be synchronization-blocked by the task; if the task has a partial dependency association with other tasks and not all of the depended-on tasks have finished executing, the task is determined to be dependency-blocked; if the task has a serial dependency association with other tasks and not all of the depended-on tasks have started executing, the task is determined to be dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, the task is determined as a schedulable task.
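The traversal just described can be sketched as follows. Each task is modeled as a dict with illustrative keys `fwd`, `bwd`, `grp`, `partial` and `serial` (assumed names); `started` and `done` are sets of issue indices, with `done` a subset of `started`, and tasks already dispatched are skipped. The backward-synchronization branch applies the shortcut of marking every later task of the scope as blocked without analyzing each one individually.

```python
def find_schedulable(tasks, started, done):
    blocked = set()            # tasks determined to be blocked in this pass
    schedulable = []
    for i in sorted(tasks):    # traverse in issue order
        t = tasks[i]
        # Backward synchronization: while this task is unfinished, every
        # later task of its scope is directly determined to be blocked.
        if t["bwd"] and i not in done:
            for j in tasks:
                if j > i and (t["grp"] is None or tasks[j]["grp"] == t["grp"]):
                    blocked.add(j)
        if i in blocked or i in started:
            continue           # already determined blocked, or already running
        # Forward synchronization: wait for every earlier task of the scope.
        scope = lambda j: t["grp"] is None or tasks[j]["grp"] == t["grp"]
        sync = t["fwd"] and any(j < i and scope(j) and j not in done
                                for j in tasks)
        # Dependency blocking: partial waits for completion, serial for start.
        dep = (any(d not in done for d in t["partial"])
               or any(d not in started for d in t["serial"]))
        if not sync and not dep:
            schedulable.append(i)
    return schedulable
```

Run over the six game tasks of FIG. 8 (with 0 to 5 standing for Binning1, Rendering1, Binning2, Rendering2, Binning3, Rendering3), the first pass yields Binning1 and Binning3, consistent with the walk-through of that figure.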
It should be noted that, in the above judgment procedure, as long as a task is a backward synchronization task of the whole group or of a subgroup, the other tasks of that whole group or subgroup issued later than it can be directly determined to be synchronization-blocked. It can be seen that, by configuring synchronization associations between tasks, once one backward synchronization task of a whole group or subgroup is found not to have finished executing, none of the other tasks of that whole group or subgroup issued later than it requires any further synchronization-blocking analysis. This greatly simplifies the synchronization-blocking judgment flow, effectively improves the efficiency of that judgment, and in turn helps improve the efficiency of task scheduling.
In addition, the two judgment operations above, judging whether a task is synchronization-blocked and judging whether it is dependency-blocked, may be performed serially or in parallel. Moreover, as soon as either judgment determines that the task is blocked, the other judgment can be terminated immediately instead of continuing to run, which avoids unnecessary computation, effectively saves computing resources, and further improves the efficiency of task scheduling.
Step 503: The task scheduler schedules the schedulable tasks to the processing unit.
In the embodiments of this application, the task scheduler may monitor in real time the number of tasks currently being executed by the GPU core, and, upon determining that this number is smaller than the number of tasks the GPU core can execute in parallel, schedule a schedulable task to the GPU core. When there are multiple schedulable tasks, the task scheduler may schedule them to the GPU core one by one in the order in which they were issued by the device driver package, so as to ensure that the tasks are executed in image-processing order, prevent a later frame from being displayed before an earlier frame, and effectively avoid the long-and-short-frame phenomenon.
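The dispatch rule of Step 503 can be sketched as a small loop: schedulable tasks, already held in the order the device driver package issued them, are handed to the core only while the number of running tasks stays below the core's parallel capacity. The `capacity` parameter is an assumption standing in for the GPU core's number of parallel tasks.

```python
from collections import deque

def dispatch(schedulable, running_count, capacity):
    """Return the tasks to hand to the GPU core now, preserving issue
    order so that frames are processed in image-processing order."""
    queue = deque(schedulable)
    dispatched = []
    while queue and running_count + len(dispatched) < capacity:
        dispatched.append(queue.popleft())
    return dispatched
```

Because the queue is consumed from the front, an earlier-issued frame task can never be overtaken by a later one at dispatch time.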
In Embodiment 1 above, the scheduling order of the N tasks is maintained on the hardware side by the task scheduler rather than specified on the software side by the device driver package. On one hand, tasks that have no dependencies, or whose dependencies have been released, can be issued to the processing unit for processing ahead of time according to the actual execution status of the tasks, effectively improving the utilization of the processing unit. On the other hand, the hardware side no longer needs to send a notification message to the software side after each task finishes executing to instruct the software side to issue a new task, which greatly reduces the workload on the software side, effectively improves the efficiency of task scheduling, and saves communication overhead.
Likewise for the game processing scenario illustrated in FIG. 4 above, FIG. 8 exemplarily shows a flow of processing the game tasks with the task scheduling method of Embodiment 1. FIG. 8(A) shows the issue order and dependency relationships of the six tasks Binning1 to Binning3 and Rendering1 to Rendering3: the six tasks are issued by the device driver package to the task scheduler in the order Binning1, Rendering1, Binning2, Rendering2, Binning3, Rendering3, and the task scheduler stores the six tasks in one whole group without subgrouping. Correspondingly, FIG. 8(B) shows one possible way of processing the tasks according to the task scheduling method of Embodiment 1. Referring to FIG. 8(B), the task scheduler may schedule the tasks through the following steps:
Step 1: the task scheduler judges, in issue order, whether Binning1, Rendering1, Binning2, Rendering2, Binning3 and Rendering3 are synchronization-blocked or dependency-blocked:
First, Binning1 is analyzed: Binning1 is not a forward synchronization task and the whole group contains no backward synchronization task, so Binning1 is not synchronization-blocked; and Binning1 depends on no other task, so it is not dependency-blocked either. Binning1 is therefore determined to be a dependency-free task, and the task scheduler schedules Binning1 to the GPU core.
Next, Rendering1 is analyzed: Rendering1 is not a forward synchronization task and the whole group contains no backward synchronization task, so Rendering1 is not synchronization-blocked; however, Rendering1 depends on Binning1, so until Binning1 finishes executing, Rendering1 is dependency-blocked by Binning1 and the task scheduler does not schedule it.
Then, Binning2 is analyzed: Binning2 is not a forward synchronization task and the whole group contains no backward synchronization task, so Binning2 is not synchronization-blocked; however, Binning2 depends on Rendering1, so until Rendering1 finishes executing, Binning2 is dependency-blocked by Rendering1 and the task scheduler does not schedule it either.
Then, Rendering2 is analyzed: Rendering2 is not a forward synchronization task and the whole group contains no backward synchronization task, so Rendering2 is not synchronization-blocked; however, Rendering2 depends on Binning2, so until Binning2 finishes executing, Rendering2 is dependency-blocked by Binning2 and the task scheduler does not schedule it.
After that, Binning3 is analyzed: Binning3 is not a forward synchronization task and the whole group contains no backward synchronization task, so Binning3 is not synchronization-blocked; and Binning3 depends on no other task, so it is not dependency-blocked either. Binning3 is therefore determined to be a dependency-free task, and the task scheduler schedules Binning3 to the GPU core.
Finally, Rendering3 is analyzed: Rendering3 is not a forward synchronization task and the whole group contains no backward synchronization task, so Rendering3 is not synchronization-blocked; however, Rendering3 depends on Binning3, so until Binning3 finishes executing, Rendering3 is dependency-blocked by Binning3 and the task scheduler does not schedule it.
Based on the above analyses, the task scheduler first schedules Binning1 and Binning3 to the GPU core for processing.
Step 2: assuming Binning1 finishes executing first, Rendering1, which depends on Binning1, is released from its dependency, so the task scheduler then schedules Rendering1 to the GPU core; at this point the GPU core processes Binning3 and Rendering1 in parallel.
Step 3: assuming Binning3 then finishes executing, Rendering3, which depends on Binning3, is released from its dependency, so the task scheduler then schedules Rendering3 to the GPU core; at this point the GPU core processes Rendering1 and Rendering3 in parallel.
Step 4: assuming Rendering1 then finishes executing, Binning2, which depends on Rendering1, is released from its dependency, so the task scheduler then schedules Binning2 to the GPU core; at this point the GPU core processes Rendering3 and Binning2 in parallel.
Step 5: assuming Binning2 then finishes executing, Rendering2, which depends on Binning2, is released from its dependency, so the task scheduler then schedules Rendering2 to the GPU core; at this point the GPU core processes Rendering3 and Rendering2 in parallel.
Step 6: once Rendering3 and Rendering2 have both been processed, all six tasks are complete.
It can be seen that throughout Steps 1 to 6 the GPU core processes two tasks in parallel without interruption and is essentially never idle, so the utilization of the GPU core is considerably increased and the performance of task scheduling is considerably improved.
It should be noted that FIG. 8(B) shows only one possible schedule; other schedules may occur in practice. For example, in Step 2 or Step 3 above, if Rendering1 finishes executing first instead, then Binning2, which depends on Rendering1, is released from its dependency and the task scheduler may schedule Binning2 to the GPU core; in that case the GPU core processes Binning3 and Binning2 in parallel (in Step 2) or Rendering3 and Binning2 in parallel (in Step 3). As another example, in Step 4 or Step 5 above, if Rendering3 finishes executing first, none of the remaining unprocessed tasks has yet been released from its dependencies, so the task scheduler schedules no new task and instead waits for Rendering1 or Binning2 to finish before scheduling the then-released Binning2 or Rendering2 to the GPU core. Although the GPU core is partly idle during that period, the utilization of the GPU core is still much higher than with the three gaps illustrated in FIG. 4. It should be understood that many other schedules are possible, and the embodiments of this application do not enumerate them all.
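The schedule traced in FIG. 8(B) can be reproduced with a small event-driven simulation. The task durations below are pure assumptions, chosen only so that completions occur in the order of Steps 2 to 6 (nothing in this application fixes real durations); tasks 0 to 5 stand for Binning1, Rendering1, Binning2, Rendering2, Binning3 and Rendering3, the `deps` sets are the partial dependencies of FIG. 8(A), and the core runs at most two tasks in parallel.

```python
import heapq

def simulate(durations, deps, capacity=2):
    """Run each task as soon as its dependencies are done and a core slot
    is free, dispatching in issue order; return (finish_times, makespan)."""
    done, running, time = set(), [], 0.0
    finish = {}
    while len(done) < len(durations):
        for t in sorted(durations):          # dispatch pass, in issue order
            if (t not in done and t not in (r[1] for r in running)
                    and all(d in done for d in deps[t])
                    and len(running) < capacity):
                heapq.heappush(running, (time + durations[t], t))
        end, t = heapq.heappop(running)      # advance to the next completion
        time = end
        finish[t] = time
        done.add(t)
    return finish, time

# 0=Binning1, 1=Rendering1, 2=Binning2, 3=Rendering2, 4=Binning3, 5=Rendering3
durations = {0: 1, 1: 2, 2: 1, 3: 1, 4: 2, 5: 2}    # assumed, illustrative
deps = {0: set(), 1: {0}, 2: {1}, 3: {2}, 4: set(), 5: {4}}
```

With these assumed durations the core holds two tasks from time 0 to time 4 and all six tasks finish at time 5, mirroring the near-continuous two-wide execution described for Steps 1 to 6.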
Based on Embodiment 1 above, Embodiment 2 below further describes a specific implementation of the task scheduling method.
Embodiment 2
FIG. 9 exemplarily shows a flowchart of the task scheduling method according to Embodiment 2 of this application. The method is applicable to a task scheduler, for example the task scheduler 310 illustrated in FIG. 2. As shown in FIG. 9, the method includes:
Step 901: the task scheduler obtains N tasks and the association relationships among the N tasks.
Step 902: the task scheduler selects, from the N tasks, the tasks that have not yet been determined to be synchronization-blocked, and, according to the order in which those tasks were acquired, determines the earliest-acquired of them as the target task.
Here, a task that has not yet been determined to be synchronization-blocked is a task that has not yet been analyzed for synchronization blocking, whereas the tasks already determined to be synchronization-blocked include both tasks that have been analyzed and found to be synchronization-blocked, and tasks that, although not yet analyzed themselves, were already determined to be synchronization-blocked during the analysis of other tasks.
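Step 902 amounts to a linear scan over the issue order that skips the tasks already determined to be synchronization-blocked. A minimal sketch, with names assumed for illustration:

```python
def pick_target(issue_order, determined_blocked):
    """Return the earliest-acquired task not yet determined to be
    synchronization-blocked, or None when no candidate remains."""
    for t in issue_order:
        if t not in determined_blocked:
            return t
    return None
```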
Step 903: the task scheduler determines, according to the association relationships among the N tasks, the other tasks that have a synchronization association with the target task, and determines, according to the execution status of those other tasks, whether the target task is synchronization-blocked by them. If not, Step 904 is performed; if so, the process returns to Step 902.
Specifically, the task scheduler may first perform the following Judgments 1 to 4:
Judgment 1: when it is determined from the configuration information of the target task that the target task is a forward synchronization task of the whole group, the other tasks having a synchronization association with the target task are the other tasks of the whole group issued earlier than it; accordingly, if not all of those tasks have finished executing, the target task is determined to be synchronization-blocked by the other tasks of the whole group issued earlier than it.
Judgment 2: when it is determined from the configuration information of the target task that the target task is a forward synchronization task of a subgroup, the other tasks having a synchronization association with the target task are the other tasks of that subgroup issued earlier than it; accordingly, if not all of those tasks have finished executing, the target task is determined to be synchronization-blocked by the other tasks of that subgroup issued earlier than it.
Judgment 3: when it is determined from the configuration information of the target task that the target task is a task of the whole group issued later than a backward synchronization task, the other task having a synchronization association with the target task is that backward synchronization task of the whole group; accordingly, if that backward synchronization task has not finished executing, the target task is determined to be synchronization-blocked by it.
判断四,根据目标任务的配置信息,判断目标任务为某一子分组中晚于某一后向同步任务下发的任务时,确定与目标任务具有同步关联关系的其它任务为该子分组中的该后向同步任务,该情况下,若该子分组中的该后向同步任务未执行完成,则确定目标任务被该子分组中的该后向同步任务同步阻塞。Judgment four, when it is judged that the target task is a task in a sub-group that is issued later than a backward synchronization task based on the configuration information of the target task, it is determined that other tasks that have a synchronization association relationship with the target task are the backward synchronization task in the sub-group. In this case, if the backward synchronization task in the sub-group has not been executed to completion, it is determined that the target task is synchronization blocked by the backward synchronization task in the sub-group.
当上述判断一至判断四中存在至少一个满足时,确定目标任务被同步阻塞,反之,当上述判断一至判断四都不满足时,确定目标任务未被同步阻塞。When at least one of the above judgments 1 to 4 is satisfied, it is determined that the target task is synchronously blocked. Conversely, when none of the above judgments 1 to 4 are satisfied, it is determined that the target task is not synchronously blocked.
进一步地,任务调度器还可以执行如下判断五和判断六:Furthermore, the task scheduler may also perform the following judgment five and judgment six:
判断五,根据目标任务的配置信息,判断目标任务为整组中的后向同步任务时,确定整组中晚于该后向同步任务下发的其它任务全部被目标任务同步阻塞;Judgment 5: when judging that the target task is a backward synchronization task in the entire group according to the configuration information of the target task, it is determined that all other tasks in the entire group that are issued later than the backward synchronization task are synchronously blocked by the target task;
判断六,根据目标任务的配置信息,判断目标任务为某一子分组中的后向同步任务时,确定该子分组中晚于该后向同步任务下发的其它任务全部被目标任务同步阻塞。Judgment six: when judging, based on the configuration information of the target task, that the target task is a backward synchronization task in a certain subgroup, it is determined that all other tasks in the subgroup that are issued later than the backward synchronization task are synchronously blocked by the target task.
通过上述判断一至判断四，任务调度器能确定目标任务是否被其它任务同步阻塞，而通过上述判断五和判断六，任务调度器能确定还未被分析过的其它任务是否被目标任务同步阻塞。可见，该方式能在分析一个目标任务是否被同步阻塞的同时，将被该目标任务同步阻塞的其它任务也确定出来，从而后续即可无需再对这些已被确定同步阻塞的任务进行无意义的分析，如此有助于节省计算资源，同时还能提高同步阻塞的判断效率。Through Judgments 1 to 4, the task scheduler can determine whether the target task is synchronously blocked by other tasks, and through Judgments 5 and 6, it can determine whether other, not-yet-analyzed tasks are synchronously blocked by the target task. In this way, while analyzing whether one target task is synchronously blocked, the tasks that the target task itself synchronously blocks are identified at the same time, so subsequent redundant analysis of those already-determined tasks can be skipped; this saves computing resources and improves the efficiency of the synchronous-blocking determination.
应理解,上述判断一至判断六可以是并行执行的,也可以是按照任意顺序串行执行的,本申请实施例对此不作具体限定。It should be understood that the above judgments one to six can be executed in parallel or in series in any order, and the embodiments of the present application do not specifically limit this.
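As an illustration only — the patent does not prescribe concrete data structures — the six judgments can be sketched in Python. The `Task` fields mirror the configuration information described above; all names and the list-based representation are hypothetical:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Task:
    tid: int                        # issue order: smaller tid = issued earlier
    group_fwd_sync: bool = False    # forward sync task in the whole group
    group_bwd_sync: bool = False    # backward sync task in the whole group
    subgroup: Optional[int] = None  # subgroup the task belongs to, if any
    sub_fwd_sync: bool = False      # forward sync task in its subgroup
    sub_bwd_sync: bool = False      # backward sync task in its subgroup
    done: bool = False              # has finished executing

def is_sync_blocked(target: Task, tasks: List[Task]) -> bool:
    """Judgments 1-4: is `target` synchronously blocked by other tasks?"""
    earlier = [t for t in tasks if t.tid < target.tid]
    # Judgment 1: a group forward sync task waits for every earlier task.
    if target.group_fwd_sync and any(not t.done for t in earlier):
        return True
    # Judgment 2: a subgroup forward sync task waits for every earlier
    # task in the same subgroup.
    if target.sub_fwd_sync and any(
            not t.done for t in earlier if t.subgroup == target.subgroup):
        return True
    for t in earlier:
        # Judgment 3: issued later than an unfinished group backward sync task.
        if t.group_bwd_sync and not t.done:
            return True
        # Judgment 4: issued later than an unfinished backward sync task
        # of the same subgroup.
        if t.sub_bwd_sync and not t.done and t.subgroup == target.subgroup:
            return True
    return False

def sync_blocks(target: Task, tasks: List[Task]) -> List[int]:
    """Judgments 5-6: ids of later tasks that `target` synchronously blocks."""
    if target.done:
        return []
    return [t.tid for t in tasks if t.tid > target.tid and (
        target.group_bwd_sync or
        (target.sub_bwd_sync and t.subgroup == target.subgroup))]
```

With the group of Example 1 below (tasks 0 to 5, task 3 being both the forward and backward sync task of the whole group), `is_sync_blocked` reports task 3 blocked until tasks 0 to 2 finish, and `sync_blocks` on task 3 returns tasks 4 and 5 in a single pass.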
步骤904，任务调度器根据N个任务的关联关系，确定出与目标任务具有依赖关联的其它任务，根据其它任务的执行情况，确定目标任务是否被其它任务依赖阻塞，若否，则执行步骤905，若是，则执行步骤902。Step 904: the task scheduler determines, according to the association relationships of the N tasks, the other tasks that have a dependency association with the target task, and determines, according to the execution status of those tasks, whether the target task is dependency blocked by them; if not, step 905 is executed; if yes, step 902 is executed.
具体的,任务调度器可以先执行如下判断一和判断二:Specifically, the task scheduler may first perform the following judgments 1 and 2:
判断一，根据目标任务的配置信息，判断目标任务与其它任务具有部分依赖关联时，若所依赖的其它任务未全部执行完成，则确定目标任务被其它任务部分依赖阻塞；Judgment 1: when the target task is determined, according to its configuration information, to have a partial dependency association with other tasks, if the tasks it depends on have not all finished executing, the target task is determined to be blocked by its partial dependency on those tasks;
判断二，根据目标任务的配置信息，判断目标任务与其它任务具有串行依赖关联时，若所依赖的其它任务未全部执行，则确定目标任务被其它任务串行依赖阻塞。Judgment 2: when the target task is determined, according to its configuration information, to have a serial dependency association with other tasks, if the tasks it depends on have not all been executed, the target task is determined to be blocked by its serial dependency on those tasks.
当上述判断一和判断二中存在至少一个满足时，确定目标任务被依赖阻塞，反之，当上述判断一和判断二都不满足时，确定目标任务未被依赖阻塞。When at least one of Judgment 1 and Judgment 2 is satisfied, the target task is determined to be dependency blocked; conversely, when neither is satisfied, the target task is determined not to be dependency blocked.
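A minimal sketch of these two judgments follows. The representation is hypothetical; the distinction the text draws between dependencies that must have *finished executing* (partial dependency) and dependencies that must have *been executed* (serial dependency) is modeled here as finished vs. started sets:

```python
def is_dependency_blocked(task_id, partial_deps, serial_deps, finished, started):
    """Dependency-blocking Judgments 1-2 for one target task.

    partial_deps / serial_deps map a task id to the ids it depends on;
    finished / started are sets of task ids (all names hypothetical).
    """
    # Judgment 1: a partial dependency blocks until every depended-on
    # task has finished executing.
    if any(d not in finished for d in partial_deps.get(task_id, ())):
        return True
    # Judgment 2: a serial dependency blocks until every depended-on
    # task has at least been executed (started).
    if any(d not in started for d in serial_deps.get(task_id, ())):
        return True
    return False
```

A task with no entry in either map is never dependency blocked, matching the case where the configuration information indicates no dependency association.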
步骤905,任务调度器将目标任务确定为一个可调度任务。Step 905: The task scheduler determines the target task as a schedulable task.
步骤906,任务调度器按照可调度任务的获取顺序,将可调度任务依次调度至处理单元。Step 906: The task scheduler schedules the schedulable tasks to the processing units in sequence according to the order in which the schedulable tasks are acquired.
在上述实施例二中，通过先判断同步阻塞再判断依赖阻塞，能先将被一个任务同步阻塞的全部任务都确定出来，再对剩下未确定的任务进行分析，而不需要挨个地对每个任务都进行分析，如此能极大地节省分析流程，提高阻塞判断的效率，进而提高任务调度的效率。In Embodiment 2 above, by determining synchronous blocking before dependency blocking, all tasks synchronously blocked by a given task can be identified first, and only the remaining undetermined tasks then need to be analyzed, instead of analyzing every task one by one. This greatly streamlines the analysis process, improves the efficiency of blocking determination, and in turn improves the efficiency of task scheduling.
上述实施例一和实施例二是从软件上介绍任务调度方法的可能实现,下面基于实施例三从硬件上进一步介绍任务调度方法的可能实现。The above-mentioned first and second embodiments introduce possible implementations of the task scheduling method from the perspective of software. The following further introduces possible implementations of the task scheduling method from the perspective of hardware based on the third embodiment.
实施例三Embodiment 3
示例性地,继续参照图2所示,任务调度器310中可以包括第一等待队列、第二等待队列和就绪队列。其中,第一等待队列用于存储未被分析过是否被同步阻塞的任务、以及已确定被同步阻塞的任务。其中,已确定被同步阻塞的任务包括:已被分析过是否被同步阻塞且已确定被同步阻塞的任务,以及虽然未被分析过是否被同步阻塞但在其它任务的分析中已确定被同步阻塞的任务。第二等待队列用于存储已确定未被同步阻塞且还未被分析过是否被依赖阻塞的任务、以及已确定被依赖阻塞的任务。其中,已确定被依赖阻塞的任务包括:已被分析过是否被依赖阻塞且已确定被依赖阻塞的任务,以及虽然未被分析过是否被依赖阻塞但在其它任务的分析中已确定被依赖阻塞的任务。就绪队列用于存储已确定未被同步阻塞且已确定未被依赖阻塞的任务,即可调度任务。Exemplarily, as shown in Figure 2, the task scheduler 310 may include a first waiting queue, a second waiting queue and a ready queue. The first waiting queue is used to store tasks that have not been analyzed for being blocked synchronously, and tasks that have been determined to be blocked synchronously. The tasks that have been determined to be blocked synchronously include: tasks that have been analyzed for being blocked synchronously and have been determined to be blocked synchronously, and tasks that have not been analyzed for being blocked synchronously but have been determined to be blocked synchronously in the analysis of other tasks. The second waiting queue is used to store tasks that have been determined not to be blocked synchronously and have not been analyzed for being blocked by dependencies, and tasks that have been determined to be blocked by dependencies. The tasks that have been determined to be blocked by dependencies include: tasks that have been analyzed for being blocked by dependencies and have been determined to be blocked by dependencies, and tasks that have not been analyzed for being blocked by dependencies but have been determined to be blocked by dependencies in the analysis of other tasks. The ready queue is used to store tasks that have been determined not to be blocked synchronously and have been determined not to be blocked by dependencies, that is, schedulable tasks.
进一步示例性地,第一等待队列、第二等待队列和就绪队列中的任务可由不同的线程进行并行处理。具体来说,在一个线程中,任务调度器310在接收到设备驱动程序包下发的任务后,可以按照任务的下发顺序将任务依次存放至第一等待队列中。在另一个线程中,任务调度器310按照任务存储至第一等待队列的顺序遍历第一等待队列中还未被确定同步阻塞的每个任务,在遍历每个任务时,按照上述实施例一或二中的方法判断该任务是否被其它任务同步阻塞,若是,则遍历下一个还未被确定同步阻塞的任务,若否,则将该任务从第一等待队列移至第二等待队列。在又一个线程中,任务调度器310按照任务存储至第二等待队列的顺序遍历第二等待队列中还未被确定依赖阻塞的每个任务,在遍历每个任务时,按照上述方法判断该任务是否被其它任务依赖阻塞,若是,则遍历下一个还未被确定依赖阻塞的任务,若否,则将该任务从第二等待队列移至就绪队列。在再一个线程中,任务调度器310根据GPU核的任务处理情况,按照任务下发的顺序将就绪队列中的任务依次调度至可用的GPU核。Further exemplary, the tasks in the first waiting queue, the second waiting queue and the ready queue can be processed in parallel by different threads. Specifically, in one thread, after receiving the task issued by the device driver package, the task scheduler 310 can store the task in the first waiting queue in sequence according to the order in which the task is issued. In another thread, the task scheduler 310 traverses each task in the first waiting queue that has not been determined to be synchronously blocked according to the order in which the task is stored in the first waiting queue. When traversing each task, the method in the above-mentioned embodiment one or two is used to determine whether the task is synchronously blocked by other tasks. If so, the next task that has not been determined to be synchronously blocked is traversed, and if not, the task is moved from the first waiting queue to the second waiting queue. In another thread, the task scheduler 310 traverses each task in the second waiting queue that has not been determined to be dependently blocked according to the order in which the task is stored in the second waiting queue. When traversing each task, it is determined whether the task is dependently blocked by other tasks according to the above-mentioned method. If so, the next task that has not been determined to be dependently blocked is traversed, and if not, the task is moved from the second waiting queue to the ready queue. 
In yet another thread, the task scheduler 310 schedules the tasks in the ready queue to the available GPU cores in sequence according to the task processing status of the GPU core and the order in which the tasks are issued.
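The movement of tasks between the three queues can be sketched as follows — a single-threaded simplification of the multi-threaded design described above, where `sync_blocked` and `dep_blocked` are caller-supplied predicates standing in for the blocking manager's checks (all names hypothetical):

```python
from collections import deque

class SchedulerQueues:
    """First waiting queue -> second waiting queue -> ready queue."""

    def __init__(self):
        self.wait_sync = deque()  # first waiting queue (sync blocking)
        self.wait_dep = deque()   # second waiting queue (dependency blocking)
        self.ready = deque()      # ready queue (schedulable tasks)

    def submit(self, task):
        # Tasks are stored in the first waiting queue in issue order.
        self.wait_sync.append(task)

    def step(self, sync_blocked, dep_blocked):
        # Move tasks not synchronously blocked to the second waiting queue,
        # in the order they were stored; blocked tasks stay put.
        for t in list(self.wait_sync):
            if not sync_blocked(t):
                self.wait_sync.remove(t)
                self.wait_dep.append(t)
        # Move tasks not dependency blocked to the ready queue.
        for t in list(self.wait_dep):
            if not dep_blocked(t):
                self.wait_dep.remove(t)
                self.ready.append(t)
```

In the actual design each stage runs on its own thread and polls its queue; the sequential `step` here only illustrates which queue a task occupies at each point of its life cycle.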
需要说明的是，上述任务调度器可以将全部功能集成在一个独立的物理器件上，也可以将各个功能分散部署在不同的物理器件上。比如，一种具体的部署方式中，继续参照图2所示，任务调度器310中还可以包括任务获取器311、阻塞管理器312和任务派发器313，任务获取器311可访问第一等待队列，阻塞管理器312可访问第一等待队列、第二等待队列和就绪队列，任务派发器313可访问就绪队列。上述任务调度器310对第一等待队列、第二等待队列和就绪队列中的任务的处理操作，具体可由任务获取器311、阻塞管理器312和任务派发器313访问第一等待队列、第二等待队列和就绪队列中的任务来实现。It should be noted that the task scheduler above may integrate all functions on one independent physical device, or distribute the individual functions across different physical devices. For example, in one specific deployment, still referring to FIG. 2, the task scheduler 310 may further include a task acquirer 311, a blocking manager 312 and a task dispatcher 313; the task acquirer 311 can access the first waiting queue, the blocking manager 312 can access the first waiting queue, the second waiting queue and the ready queue, and the task dispatcher 313 can access the ready queue. The processing operations of the task scheduler 310 on the tasks in the first waiting queue, the second waiting queue and the ready queue are specifically implemented by the task acquirer 311, the blocking manager 312 and the task dispatcher 313 accessing the tasks in these queues.
下面对这三个器件的具体功能进行详细介绍:The specific functions of these three devices are described in detail below:
任务获取器311用于接收设备驱动程序包下发给任务调度器310的任务,并按照任务的下发顺序依次将任务的相关数据存储至第一等待队列。其中,每个任务的相关数据中包含该任务的配置信息,配置信息用于指示如下内容中的一项或多项:该任务是否为整组中的前向同步任务、该任务是否为整组中的后向同步任务、该任务所属的子分组、该任务是否为所属的子分组中的前向同步任务、该任务是否为所属的子分组中的后向同步任务、该任务是否与其它任务具有部分依赖关联以及所依赖的其它任务、该任务是否与其它任务具有串行依赖关联以及所依赖的其它任务。The task acquirer 311 is used to receive the tasks sent by the device driver package to the task scheduler 310, and store the relevant data of the tasks in the first waiting queue in the order of sending the tasks. The relevant data of each task includes the configuration information of the task, and the configuration information is used to indicate one or more of the following contents: whether the task is a forward synchronization task in the whole group, whether the task is a backward synchronization task in the whole group, the subgroup to which the task belongs, whether the task is a forward synchronization task in the subgroup to which it belongs, whether the task is a backward synchronization task in the subgroup to which it belongs, whether the task has a partial dependency association with other tasks and the other tasks on which it depends, and whether the task has a serial dependency association with other tasks and the other tasks on which it depends.
阻塞管理器312用于监控第一等待队列的状态,当感知到第一等待队列中存储有任务后,按照任务存入第一等待队列的顺序遍历第一等待队列中未被确定同步阻塞的每个任务,在遍历每个任务时,执行如下操作:The blocking manager 312 is used to monitor the state of the first waiting queue. When it is sensed that there are tasks stored in the first waiting queue, each task in the first waiting queue that is not determined to be synchronously blocked is traversed in the order in which the tasks are stored in the first waiting queue. When traversing each task, the following operations are performed:
操作一，根据该任务的配置信息，若确定该任务为整组中的前向同步任务，则通过查询整组中早于该任务下发的其它任务的任务状态，判断整组中早于该任务下发的其它任务是否已全部执行完成；以及，若确定该任务为至少一个子分组中的前向同步任务，则通过查询每个子分组中早于该任务的其它任务的任务状态，判断每个子分组中早于该任务下发的其它任务是否已全部执行完成。当上述判断结果都为是时，确定该任务未被同步阻塞，将该任务从第一等待队列移动至第二等待队列。反之，当上述判断结果中存在至少一个为否时，确定该任务被同步阻塞，将该任务继续留在第一等待队列中，并开始遍历下一个未被确定同步阻塞的任务；Operation 1: according to the configuration information of the task, if the task is determined to be a forward synchronization task in the entire group, the task states of the other tasks in the entire group issued earlier than the task are queried to determine whether those tasks have all finished executing; and, if the task is determined to be a forward synchronization task in at least one subgroup, the task states of the earlier-issued tasks in each such subgroup are queried to determine whether they have all finished executing. When all of the above results are yes, the task is determined not to be synchronously blocked and is moved from the first waiting queue to the second waiting queue. Conversely, when at least one result is no, the task is determined to be synchronously blocked, remains in the first waiting queue, and the traversal moves on to the next task not yet determined to be synchronously blocked;
操作二，根据该任务的配置信息，若确定该任务为整组中的后向同步任务，则确定整组中晚于该任务下发的其它任务全部被同步阻塞，将整组中晚于该任务下发的其它任务全部留在第一等待队列中；以及，若确定该任务为至少一个子分组中的后向同步任务，则确定其中每个子分组中晚于该任务下发的其它任务全部被同步阻塞，将每个子分组中晚于该任务下发的其它任务全部留在第一等待队列中。Operation 2: according to the configuration information of the task, if the task is determined to be a backward synchronization task in the entire group, all other tasks in the entire group issued later than the task are determined to be synchronously blocked and are kept in the first waiting queue; and, if the task is determined to be a backward synchronization task in at least one subgroup, all other tasks in each such subgroup issued later than the task are determined to be synchronously blocked and are kept in the first waiting queue.
需要说明的是，本申请实施例对上述操作一和操作二的执行顺序不作限定，比如可以先执行操作一再执行操作二，也可以先执行操作二再执行操作一，还可以同时执行操作一和操作二。此外，上述对第一等待队列的分析操作采用轮询方式，比如分析过一轮之后，将第一等待队列中剩余的任务的状态全部更新为还未被确定同步阻塞，之后，重新按照任务的存入顺序，执行相同的操作一和操作二。It should be noted that the embodiments of the present application do not limit the execution order of Operation 1 and Operation 2 above: Operation 1 may be executed before Operation 2, Operation 2 before Operation 1, or the two may be executed simultaneously. In addition, the analysis of the first waiting queue is performed in a polling manner: for example, after one round of analysis, the states of all tasks remaining in the first waiting queue are reset to not yet determined to be synchronously blocked, after which Operation 1 and Operation 2 are executed again in the order in which the tasks were stored.
进一步地，阻塞管理器312还用于监控第二等待队列的状态，当感知到第二等待队列中存储有任务后，按照任务存入第二等待队列的顺序遍历第二等待队列中未被确定依赖阻塞的每个任务，在遍历每个任务时：根据该任务的配置信息，若确定该任务与其它任务具有部分依赖关联，则通过查询所依赖的其它任务的任务状态，判断所依赖的其它任务是否已全部执行完成，以及，若确定该任务与其它任务具有串行依赖关联，则通过查询所依赖的其它任务的任务状态，判断所依赖的其它任务是否已全部执行。当上述判断结果都为是时，确定该任务未被依赖阻塞，将该任务从第二等待队列移动至就绪队列。反之，当上述判断结果中存在至少一个为否时，确定该任务被依赖阻塞，将该任务继续留在第二等待队列中，并开始遍历下一个未被确定依赖阻塞的任务。Further, the blocking manager 312 is also used to monitor the state of the second waiting queue. When it senses that tasks are stored in the second waiting queue, it traverses, in the order in which the tasks were stored, each task in the second waiting queue not yet determined to be dependency blocked. When traversing each task: according to the configuration information of the task, if the task is determined to have a partial dependency association with other tasks, the task states of the tasks it depends on are queried to determine whether they have all finished executing; and, if the task is determined to have a serial dependency association with other tasks, the task states of the tasks it depends on are queried to determine whether they have all been executed. When all of the above results are yes, the task is determined not to be dependency blocked and is moved from the second waiting queue to the ready queue. Conversely, when at least one result is no, the task is determined to be dependency blocked, remains in the second waiting queue, and the traversal moves on to the next task not yet determined to be dependency blocked.
任务派发器313用于监控就绪队列的状态和GPU核的状态,当感知到就绪队列中存储有任务,且GPU核当前的任务处理量小于可并行任务量时,按照设备驱动程序包下发任务的顺序,将就绪队列中的任务依次调度给GPU核。举例来说,假设就绪队列中依次存储有任务3、任务1和任务2,下发顺序依次是任务1、任务2和任务3,GPU核的可并行任务数量为2,则:当GPU核当前执行的任务数量为1时,确定GPU核当前可再执行一个任务,此时,任务派发器313可将就绪队列中最早下发的任务1派发给GPU核,后续确定GPU核可再执行一个任务时,将就绪队列中的任务2派发给GPU核,后续确定GPU核可再执行一个任务时,将就绪队列中的任务3派发给GPU核;或者,当GPU核当前执行的任务数量为0时,确定GPU核当前可再执行两个任务,此时,任务派发器313可将就绪队列中最早下发的任务1和任务2派发给GPU核,GPU核可再执行一个任务时,将就绪队列中的任务3派发给GPU核。如此,通过上述同步阻塞判断和依赖阻塞判断,即使后下发的任务相比于前下发的任务先被存储在了就绪队列中,通过在真正调度之前找到最先下发的任务进行调度,而不是按照就绪队列中的存储顺序进行调度,能在提前找到所有可处理的任务的条件下,最大限度地确保当前可处理的任务中最早下发的任务先被处理。The task dispatcher 313 is used to monitor the status of the ready queue and the status of the GPU core. When it is sensed that there are tasks stored in the ready queue and the current task processing volume of the GPU core is less than the parallel task volume, the tasks in the ready queue are dispatched to the GPU core in sequence according to the order in which the device driver package sends the tasks. For example, assuming that task 3, task 1 and task 2 are stored in the ready queue in sequence, and the order of issuance is task 1, task 2 and task 3, and the number of parallel tasks of the GPU core is 2, then: when the number of tasks currently executed by the GPU core is 1, it is determined that the GPU core can currently execute another task. At this time, the task dispatcher 313 can dispatch task 1, which is the earliest issued in the ready queue, to the GPU core. When it is subsequently determined that the GPU core can execute another task, task 2 in the ready queue is dispatched to the GPU core. When it is subsequently determined that the GPU core can execute another task, task 3 in the ready queue is dispatched to the GPU core; or, when the number of tasks currently executed by the GPU core is 0, it is determined that the GPU core can currently execute two more tasks. At this time, the task dispatcher 313 can dispatch task 1 and task 2, which are the earliest issued in the ready queue, to the GPU core. 
When the GPU core can execute another task, task 3 in the ready queue is dispatched to the GPU core. In this way, through the above-mentioned synchronous blocking judgment and dependent blocking judgment, even if the task issued later is stored in the ready queue before the task issued earlier, by finding the first issued task for scheduling before the actual scheduling, rather than scheduling according to the storage order in the ready queue, it is possible to find all processable tasks in advance and maximize the guarantee that the earliest issued task among the currently processable tasks will be processed first.
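The dispatch rule described above — pick the earliest-issued ready task rather than the head of the ready queue, up to the core's spare parallel capacity — can be sketched as follows (a hypothetical helper; task ids encode issue order):

```python
def dispatch(ready, running_count, capacity):
    """Dispatch as many ready tasks as the GPU core's spare capacity allows,
    always choosing the earliest-issued task, not the storage order."""
    dispatched = []
    while ready and running_count + len(dispatched) < capacity:
        t = min(ready)      # earliest-issued task in the ready queue
        ready.remove(t)
        dispatched.append(t)
    return dispatched
```

Replaying the example above: with the ready queue storing tasks 3, 1 and 2 and a parallel capacity of 2, a core already running one task receives task 1 first; a fully idle core receives tasks 2 and 3 together.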
在上述实施例三中,一个任务是否被同步阻塞、是否被依赖阻塞和是否被调度至GPU核这三个操作是串行执行的,但不同任务是否被同步阻塞、是否被依赖阻塞和是否被调度至GPU核的操作是并行执行的。比如,同一个任务只有在确定未被同步阻塞的情况下,才会被判断是否被依赖阻塞,只有在确定未被依赖阻塞的情况下,才会被进行后续的调度。而一个任务在被判断是否被同步阻塞时,早于该任务下发的另一个任务可能正在被判断是否被依赖阻塞,早于该任务和另一个任务下发的其它任务可能正在被调度至GPU核。可见,通过设置多个线程并行执行同步阻塞判断、依赖阻塞判断和调度操作,不仅能确保一个任务在未被同步阻塞且未被依赖阻塞的情况下才会进行调度,同时还能对晚下发的任务的前流程和早下发的任务的后流程进行并行处理,这有助于进一步提高任务调度的效率。In the above-mentioned third embodiment, the three operations of whether a task is synchronously blocked, whether it is dependently blocked, and whether it is scheduled to the GPU core are executed in series, but the operations of whether different tasks are synchronously blocked, whether they are dependently blocked, and whether they are scheduled to the GPU core are executed in parallel. For example, the same task will only be judged whether it is dependently blocked if it is determined that it is not synchronously blocked, and will only be subsequently scheduled if it is determined that it is not dependently blocked. When a task is judged whether it is synchronously blocked, another task issued earlier than the task may be judged whether it is dependently blocked, and other tasks issued earlier than the task and another task may be scheduled to the GPU core. It can be seen that by setting multiple threads to perform synchronous blocking judgment, dependent blocking judgment and scheduling operations in parallel, it can not only ensure that a task will be scheduled only when it is not synchronously blocked and not dependently blocked, but also the front process of the task issued later and the back process of the task issued earlier can be processed in parallel, which helps to further improve the efficiency of task scheduling.
下面通过几个具体的例子对上述实施例中的任务调度方法进行详细介绍。需要说明的是,在下文的示例中,假设处理器的最大并行任务量为2,即处理器同一时刻能同时处理两个任务。The task scheduling method in the above embodiment is described in detail below through several specific examples. It should be noted that in the following examples, it is assumed that the maximum parallel task amount of the processor is 2, that is, the processor can process two tasks at the same time.
示例一Example 1
图10示例性示出本申请实施例提供的一种任务调度流程示意图,其中,图10中(A)示出的是设备驱动程序包向任务调度器下发的任务0~任务5及其关联关系,该任务0~任务5首先被任务获取器获取并存储在第一等待队列中。在任务0~任务5的关联关系中,任务0~任务5只位于整组中,且,任务3属于整组中的前向同步任务和后向同步任务,意味着,任务3需要在早于任务3所下发的任务0、任务1和任务2都执行完成后才可以执行,且晚于任务3所下发的任务4和任务5需要在任务3执行完成后才可以执行。相对应的,图10中(B)示出的是按照上述实施例中的任务调度方法处理任务的一种可能情况,参照图10中(B)所示,任务调度器可按照如下步骤调度各个任务:Figure 10 exemplarily shows a task scheduling process diagram provided by an embodiment of the present application, wherein (A) in Figure 10 shows tasks 0 to 5 and their associations issued by the device driver package to the task scheduler, and the tasks 0 to 5 are first acquired by the task acquirer and stored in the first waiting queue. In the association relationship between tasks 0 to 5, tasks 0 to 5 are only located in the entire group, and task 3 belongs to the forward synchronization task and the backward synchronization task in the entire group, which means that task 3 can only be executed after tasks 0, 1 and 2 issued earlier than task 3 are all completed, and tasks 4 and 5 issued later than task 3 can only be executed after task 3 is completed. Correspondingly, (B) in Figure 10 shows a possible situation of processing tasks according to the task scheduling method in the above embodiment. Referring to (B) in Figure 10, the task scheduler can schedule each task according to the following steps:
步骤一,先对任务0进行分析,任务0通过所有流程无阻塞地被调度至GPU核,具体来说:Step 1: First, analyze Task 0. Task 0 is scheduled to the GPU core without blocking through all processes. Specifically:
阻塞管理器遍历第一等待队列中最先下发的任务0,由于任务0不属于前向同步任务,且也不是晚于整组中的后向同步任务3所下发的任务,因此任务0未被同步阻塞,阻塞管理器将任务0从第一等待队列移动至第二等待队列;The blocking manager traverses the first task 0 issued in the first waiting queue. Since task 0 does not belong to the forward synchronization task and is not a task issued later than the backward synchronization task 3 in the entire group, task 0 is not blocked by synchronization. The blocking manager moves task 0 from the first waiting queue to the second waiting queue.
阻塞管理器遍历第二等待队列中最先存储的任务0,由于任务0不依赖其它任务,因此,任务0也未被依赖阻塞,阻塞管理器将任务0从第二等待队列移动至就绪队列;
The blocking manager traverses the task 0 stored first in the second waiting queue. Since task 0 does not depend on other tasks, task 0 is not blocked by dependencies. The blocking manager moves task 0 from the second waiting queue to the ready queue.
任务派发器监控到GPU核当前并未处理任务,也即是,GPU核当前还可处理两个任务,因此,任务派发器将就绪队列中最先下发的任务0调度至GPU核。The task dispatcher monitors that the GPU core is not currently processing any tasks, that is, the GPU core can currently process two tasks. Therefore, the task dispatcher schedules the first task 0 issued in the ready queue to the GPU core.
步骤二,再对任务1进行分析,任务1通过所有流程无阻塞地被调度至GPU核,具体来说:Step 2: Analyze Task 1 again. Task 1 is scheduled to the GPU core without blocking through all processes. Specifically:
经过上述步骤一后,第一等待队列中只包含任务1~任务5,阻塞管理器遍历第一等待队列中最先下发的任务1,由于任务1不属于前向同步任务,且也不是晚于整组中的后向同步任务3下发的任务,因此任务1未被同步阻塞,阻塞管理器将任务1从第一等待队列移动至第二等待队列;After the above step 1, the first waiting queue only contains tasks 1 to 5. The blocking manager traverses the first task 1 issued in the first waiting queue. Since task 1 does not belong to the forward synchronization task and is not a task issued later than the backward synchronization task 3 in the whole group, task 1 is not blocked by synchronization. The blocking manager moves task 1 from the first waiting queue to the second waiting queue.
阻塞管理器遍历第二等待队列中的任务1,由于任务1不依赖其它任务,因此,任务1也未被依赖阻塞,阻塞管理器将任务1从第二等待队列移动至就绪队列;The blocking manager traverses Task 1 in the second waiting queue. Since Task 1 does not depend on other tasks, Task 1 is not blocked by dependencies. The blocking manager moves Task 1 from the second waiting queue to the ready queue.
任务派发器监控到GPU核当前只处理任务0,也即是,GPU核当前还可处理一个任务,因此,任务派发器将就绪队列中的任务1调度至GPU核。The task dispatcher monitors that the GPU core is currently only processing task 0, that is, the GPU core can currently process one task. Therefore, the task dispatcher schedules task 1 in the ready queue to the GPU core.
经过上述步骤一和步骤二后,GPU核并行处理任务0和任务1。After the above steps 1 and 2, the GPU core processes Task 0 and Task 1 in parallel.
步骤三,再对任务2进行分析,任务2通过同步阻塞判断流程和依赖阻塞判断流程,但需在派发流程中等待任务0或任务1执行完成后才可以被调度至GPU核,具体来说:Step 3: Analyze Task 2 again. Task 2 passes the synchronous blocking judgment process and the dependent blocking judgment process, but needs to wait for Task 0 or Task 1 to be completed in the dispatch process before it can be scheduled to the GPU core. Specifically:
经过上述步骤二后,第一等待队列中只包含任务2~任务5,阻塞管理器遍历第一等待队列中最先下发的任务2,由于任务2不属于前向同步任务,且也不是晚于整组中的后向同步任务3下发的任务,因此任务2未被同步阻塞,阻塞管理器将任务2从第一等待队列移动至第二等待队列;After the above step 2, the first waiting queue only contains tasks 2 to 5. The blocking manager traverses the first task 2 issued in the first waiting queue. Since task 2 does not belong to the forward synchronization task and is not a task issued later than the backward synchronization task 3 in the entire group, task 2 is not synchronously blocked. The blocking manager moves task 2 from the first waiting queue to the second waiting queue.
阻塞管理器遍历第二等待队列中的任务2,由于任务2不依赖其它任务,因此,任务2也未被依赖阻塞,阻塞管理器将任务2从第二等待队列移动至就绪队列;The blocking manager traverses Task 2 in the second waiting queue. Since Task 2 does not depend on other tasks, Task 2 is not blocked by dependencies. The blocking manager moves Task 2 from the second waiting queue to the ready queue.
任务派发器监控GPU核当前正在处理任务0和任务1,也即是,GPU核当前无法再处理新的任务,因此,任务派发器等待GPU核处理完其中一个任务后,将就绪队列中的任务2调度至GPU核。如图10中(B)所示,假设GPU核先处理完任务0,则任务派发器将任务2调度至GPU核后,GPU核并行处理任务1和任务2。The task dispatcher monitors that the GPU core is currently processing task 0 and task 1, that is, the GPU core cannot currently process new tasks. Therefore, the task dispatcher waits for the GPU core to complete processing one of the tasks and then schedules task 2 in the ready queue to the GPU core. As shown in (B) in Figure 10, assuming that the GPU core completes processing task 0 first, after the task dispatcher schedules task 2 to the GPU core, the GPU core processes task 1 and task 2 in parallel.
步骤四,再对任务3进行分析,任务3被阻塞在同步阻塞判断流程中,需在任务1和任务2均执行完成后才能够执行:Step 4: Analyze Task 3 again. Task 3 is blocked in the synchronous blocking judgment process and can only be executed after Task 1 and Task 2 are completed:
经过上述步骤三后,第一等待队列中只包含任务3~任务5,阻塞管理器遍历第一等待队列中最先下发的任务3,由于任务3属于前向同步任务,且早于任务3所下发的任务1和任务2未执行完成,因此任务3被同步阻塞。同时,由于任务3还属于后向同步任务,因此晚于任务3所下发的任务4和任务5也被确定为被同步阻塞。因此,在任务1和任务2全部执行完成之前,阻塞管理器不再对第一等待队列中的任务3~任务5进行分析。After the above step 3, the first waiting queue only contains tasks 3 to 5. The blocking manager traverses the first task 3 issued in the first waiting queue. Since task 3 belongs to the forward synchronization task and tasks 1 and 2 issued earlier than task 3 have not been completed, task 3 is synchronously blocked. At the same time, since task 3 also belongs to the backward synchronization task, tasks 4 and 5 issued later than task 3 are also determined to be synchronously blocked. Therefore, before tasks 1 and 2 are all executed, the blocking manager will no longer analyze tasks 3 to 5 in the first waiting queue.
如图10中(B)所示,假设任务1先执行完成,则GPU核中不会再被调度新的任务,GPU核只处理任务2,当任务2也执行完成后,阻塞管理器确定同步阻塞任务3的任务全部执行完成,任务3解除同步阻塞,因此,阻塞管理器将任务3从第一等待队列移动至第二等待队列;As shown in (B) of FIG10 , assuming that task 1 is completed first, no new tasks will be scheduled in the GPU core, and the GPU core only processes task 2. When task 2 is also completed, the blocking manager determines that all the tasks that synchronously block task 3 are completed, and task 3 is released from synchronous blocking. Therefore, the blocking manager moves task 3 from the first waiting queue to the second waiting queue;
The blocking manager traverses Task 3 in the second waiting queue. Because Task 3 does not depend on any other task, it is not dependency-blocked, so the blocking manager moves Task 3 from the second waiting queue to the ready queue.
The task dispatcher monitors that the GPU core is currently processing no tasks, that is, the GPU core can currently accept two tasks. The task dispatcher therefore schedules Task 3 from the ready queue directly to the GPU core, and the GPU core processes only Task 3.
Step 5: Task 4 and Task 5 are blocked in the synchronization-blocking judgment process and can execute only after Task 3 completes:
After Step 4 above, the first waiting queue contains only Task 4 and Task 5, and both were determined in Step 4 to be synchronization-blocked by Task 3. Therefore, until Task 3 completes, the blocking manager does not analyze Task 4 or Task 5 in the first waiting queue.
When Task 3 completes, the blocking manager determines that Task 3, which synchronization-blocks Task 4 and Task 5, has finished, so Task 4 and Task 5 are released from synchronization blocking. By traversing Task 4 and Task 5 in order, the blocking manager moves them in turn from the first waiting queue to the second waiting queue.
The blocking manager then traverses Task 4 and Task 5 in the second waiting queue. Because neither task depends on any other task, neither is dependency-blocked, and the blocking manager moves Task 4 and Task 5 from the second waiting queue to the ready queue.
The task dispatcher monitors that the GPU core is currently processing no tasks, that is, the GPU core can currently accept two tasks. The task dispatcher therefore schedules Task 4 and Task 5 from the ready queue to the GPU core, so that the GPU core processes Task 4 and Task 5 in parallel.
Example 1 above describes a scenario in which forward synchronization tasks and backward synchronization tasks are defined for the entire group. By defining the synchronization and dependency associations among the tasks in the group, and by scheduling the tasks according to the task scheduling method above, tasks in the group that are neither synchronization-blocked nor dependency-blocked can be scheduled as early as possible, minimizing idle time on the GPU core and improving GPU core utilization.
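The scheduling behavior of Example 1 can be illustrated with a minimal Python sketch. This is not the patent's implementation: all names (`Task`, `schedule`) are illustrative, and, as a simplifying assumption, a dispatched task is treated as completing immediately, so the sketch returns only the dispatch order rather than simulating the parallel GPU core.

```python
from collections import deque

class Task:
    def __init__(self, tid, deps=(), forward_sync=False, backward_sync=False):
        self.tid = tid
        self.deps = set(deps)               # tasks this task depends on
        self.forward_sync = forward_sync    # waits for all earlier-issued tasks
        self.backward_sync = backward_sync  # all later-issued tasks wait for it
        self.done = False

def schedule(tasks):
    """Drain tasks toward dispatch, skipping over blocked ones.

    Simplification: a dispatched task completes immediately, so the
    function returns the dispatch order instead of simulating the
    parallel GPU core.
    """
    first_wait = deque(tasks)  # issue order is preserved
    order = []
    while first_wait:
        progressed = False
        for task in list(first_wait):
            earlier = tasks[:tasks.index(task)]
            # synchronization-blocking checks
            if task.forward_sync and not all(t.done for t in earlier):
                continue
            if any(t.backward_sync and not t.done for t in earlier):
                continue
            # dependency-blocking check
            if not all(t.done for t in task.deps):
                continue
            # unblocked: move through the ready queue and dispatch
            first_wait.remove(task)
            task.done = True  # instant completion in this sketch
            order.append(task.tid)
            progressed = True
        if not progressed:
            raise RuntimeError("all remaining tasks are blocked")
    return order
```

Under these assumptions, a later-issued task that is neither synchronization-blocked nor dependency-blocked is dispatched ahead of an earlier blocked one, which is the early-scheduling behavior the example describes.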
Example 2
FIG. 11 exemplarily shows another task scheduling process provided by an embodiment of this application. FIG. 11(A) shows Task 0 to Task 4 and their association relationships as issued by the device driver package to the task scheduler; Task 0 to Task 4 are first acquired by the task acquirer and stored in the first waiting queue. In these association relationships, Task 0 to Task 4 all belong to the entire group; in addition, Task 0, Task 2, and Task 4 belong to subgroup 1, while Task 1 and Task 3 belong to subgroup 2. Task 2 is both the forward synchronization task and the backward synchronization task of subgroup 1, which means that Task 2 can execute only after Task 0 (issued earlier than Task 2 in subgroup 1) has completed, and Task 4 (issued later than Task 2 in subgroup 1) can execute only after Task 2 has completed. Correspondingly, FIG. 11(B) shows one possible way the tasks are processed according to the task scheduling method of the embodiments above. Referring to FIG. 11(B), the task scheduler may schedule the tasks as follows:
Step 1: Task 0 is analyzed first. Task 0 passes all processes without blocking and is scheduled to the GPU core. For the specific implementation, refer to Step 1 of Example 1 above, which is not repeated here.
Step 2: Task 1 is analyzed next. Task 1 passes all processes without blocking and is scheduled to the GPU core. For the specific implementation, refer to Step 2 of Example 1 above, which is not repeated here.
Step 3: Task 2 is analyzed next. Task 2 is blocked in the synchronization-blocking judgment process and can execute only after Task 0 completes:
After Step 2 above, the first waiting queue contains only Task 2 to Task 4. The blocking manager traverses Task 2, the earliest-issued task in the first waiting queue. Because Task 2 is the forward synchronization task of subgroup 1, and Task 0, issued earlier than Task 2 in subgroup 1, has not completed, Task 2 is synchronization-blocked. Meanwhile, because Task 2 is also the backward synchronization task of subgroup 1, Task 4, issued later than Task 2 in subgroup 1, is also determined to be synchronization-blocked. Therefore, until Task 2 completes, the blocking manager does not analyze Task 4 in subgroup 1.
As shown in FIG. 11(B), assuming Task 0 completes first, Task 2 is released from synchronization blocking, so the blocking manager moves Task 2 from the first waiting queue to the second waiting queue.
The blocking manager traverses Task 2 in the second waiting queue. Because Task 2 does not depend on any other task, it is not dependency-blocked, and the blocking manager moves Task 2 from the second waiting queue to the ready queue.
The task dispatcher monitors that the GPU core is currently processing Task 1, that is, the GPU core can accept one more task. The task dispatcher therefore schedules Task 2 from the ready queue to the GPU core, so that the GPU core processes Task 1 and Task 2 in parallel.
Step 4: Task 3 is analyzed next. Task 3 passes both the synchronization-blocking and dependency-blocking judgment processes, but must wait in the dispatch process until Task 1 or Task 2 completes before it can be scheduled to the GPU core. As shown in FIG. 11(B), assuming Task 1 completes first, the task dispatcher schedules Task 3 from the ready queue to the GPU core, so that the GPU core processes Task 2 and Task 3 in parallel. For the specific implementation of this step, refer to Step 3 of Example 1 above, which is not repeated here.
Step 5: Task 4 is blocked in the synchronization-blocking judgment process and can execute only after Task 2 completes:
After Step 4 above, the first waiting queue contains only Task 4, which was already determined in Step 3 to be synchronization-blocked by Task 2. Therefore, until Task 2 completes, the blocking manager does not analyze Task 4 in the first waiting queue.
When Task 2 completes, the blocking manager determines that Task 2, which synchronization-blocks Task 4, has finished, so Task 4 is released from synchronization blocking, and the blocking manager moves Task 4 from the first waiting queue to the second waiting queue.
The blocking manager traverses Task 4 in the second waiting queue. Because Task 4 does not depend on any other task, it is not dependency-blocked, and the blocking manager moves Task 4 from the second waiting queue to the ready queue.
The task dispatcher monitors that the GPU core is currently processing Task 3, that is, the GPU core can accept one more task. The task dispatcher therefore schedules Task 4 from the ready queue to the GPU core, so that the GPU core processes Task 3 and Task 4 in parallel.
Example 2 above describes a scenario in which forward and backward synchronization tasks are defined within subgroups. By defining synchronization associations among the tasks of a subgroup, a synchronization-blocked task in one subgroup does not affect the execution of tasks in other subgroups. This helps decouple the tasks of different subgroups and reduces their mutual influence.
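The subgroup-scoped check of Example 2 can be sketched as a small predicate. This is an illustrative reading of the example, not the patent's implementation: the function and dictionary names are assumptions, and tasks are represented as plain dictionaries for brevity.

```python
def sync_blocked(task, issued_before, subgroup_of, done):
    """Return True if `task` is synchronization-blocked within its subgroup.

    issued_before: tasks issued earlier than `task`, in issue order
    subgroup_of:   mapping from task id to subgroup id
    done:          set of ids of completed tasks
    """
    same_group = [t for t in issued_before
                  if subgroup_of[t["id"]] == subgroup_of[task["id"]]]
    # forward sync: the task waits for every earlier task in its own subgroup
    if task["forward_sync"] and any(t["id"] not in done for t in same_group):
        return True
    # backward sync: an unfinished earlier backward-sync task in the same
    # subgroup blocks this task; tasks in other subgroups are unaffected
    if any(t["backward_sync"] and t["id"] not in done for t in same_group):
        return True
    return False
```

Because only tasks of the same subgroup enter the check, a blocked Task 2 in subgroup 1 never blocks Task 3 in subgroup 2, which is the decoupling the example highlights.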
Example 3
FIG. 12 exemplarily shows yet another task scheduling process provided by an embodiment of this application. FIG. 12(A) shows Task 0 to Task 4 and their association relationships as issued by the device driver package to the task scheduler; Task 0 to Task 4 are first acquired by the task acquirer and stored in the first waiting queue. In these association relationships, Task 0 to Task 4 all belong to the entire group, and Task 3 has a partial dependency association with Task 0 and Task 2, which means that Task 3 can execute only after both Task 0 and Task 2 have completed. Correspondingly, FIG. 12(B) shows one possible way the tasks are processed according to the task scheduling method of the embodiments above. Referring to FIG. 12(B), the task scheduler may schedule the tasks as follows:
Step 1: Task 0 is analyzed first. Task 0 passes all processes without blocking and is scheduled to the GPU core. For the specific implementation, refer to Step 1 of Example 1 above, which is not repeated here.
Step 2: Task 1 is analyzed next. Task 1 passes all processes without blocking and is scheduled to the GPU core. For the specific implementation, refer to Step 2 of Example 1 above, which is not repeated here.
Step 3: Task 2 is analyzed next. Task 2 passes both the synchronization-blocking and dependency-blocking judgment processes, but must wait in the dispatch process until Task 0 or Task 1 completes before it can be scheduled to the GPU core. As shown in FIG. 12(B), assuming the GPU core finishes Task 0 first, the task dispatcher schedules Task 2 to the GPU core, and the GPU core then processes Task 1 and Task 2 in parallel. For the specific implementation of this step, refer to Step 3 of Example 1 above, which is not repeated here.
Step 4: Task 3 is analyzed next. Task 3 is blocked in the dependency-blocking judgment process and can execute only after Task 2 completes:
After Step 3 above, the first waiting queue contains only Task 3 and Task 4. The blocking manager traverses Task 3, the earliest-issued task in the first waiting queue. Because Task 3 is not a forward synchronization task, and the entire group contains no backward synchronization task, Task 3 is not synchronization-blocked, and the blocking manager moves Task 3 from the first waiting queue to the second waiting queue.
The blocking manager traverses Task 3 in the second waiting queue. Task 3 depends on Task 0 and Task 2; Task 0 has completed but Task 2 has not, so Task 3 is dependency-blocked. Until Task 2 completes, the blocking manager does not analyze Task 3 in the second waiting queue.
Step 5: As shown in FIG. 12(B), assuming Task 1 completes first, Task 3 remains dependency-blocked because Task 2 has not completed. Task 4, which follows Task 3 in the first waiting queue, can therefore be analyzed directly. Task 4 passes all processes without blocking and is scheduled to the GPU core, and the GPU core processes Task 2 and Task 4 at the same time.
Step 6: As shown in FIG. 12(B), once Task 2 completes, Task 3 in the second waiting queue is released from dependency blocking. Task 3 then passes all processes without blocking and is scheduled to the GPU core, so that the GPU core processes Task 4 and Task 3 in parallel.
It should be noted that, in Example 3 above, if Task 3 instead had a serial dependency association with Task 0 and Task 2, Task 3 could execute as soon as both Task 0 and Task 2 had started executing, regardless of whether they had completed. In that case, in Step 4 above, since Task 0 has completed and Task 2 has started (though not yet completed), Task 3 is released from its dependency. Task 3 then passes both the synchronization-blocking and dependency-blocking judgment processes, but must wait in the dispatch process until Task 1 or Task 2 completes before it can be scheduled to the GPU core. For example, as shown in FIG. 12(B), assuming Task 1 completes first, the task dispatcher schedules Task 3 from the ready queue to the GPU core, so that the GPU core processes Task 2 and Task 3 in parallel.
Example 3 above describes a scenario in which dependencies are defined. When an earlier task is dependency-blocked, a later task that is not dependency-blocked can be scheduled to the GPU core first, keeping the GPU core busy as much as possible and effectively improving its utilization.
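The distinction Example 3 draws between partial and serial dependency can be sketched as a release check. This is a hedged reading of the example, not the patent's implementation; the field names (`deps`, `dep_kind`) and the dictionary representation are assumptions made for illustration.

```python
def dep_released(task, started, done):
    """Check whether `task`'s dependency block is released.

    Partial dependency: released only when ALL prerequisite tasks have
    completed execution.
    Serial dependency:  released as soon as all prerequisites have
    STARTED executing, whether or not they have completed.
    """
    deps = task["deps"]
    if task["dep_kind"] == "partial":
        return all(d in done for d in deps)
    if task["dep_kind"] == "serial":
        return all(d in started for d in deps)
    return True  # no dependency association defined
```

With Example 3's data (Task 3 depending on Tasks 0 and 2), the partial form keeps Task 3 blocked until Task 2 completes, while the serial form releases it once Task 2 merely begins executing.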
It should be understood that the embodiments of this application may also be combined with one another to obtain new embodiments.
It should be noted that the names of the information items above are merely examples. As communication technology evolves, any of the above information may be renamed; regardless of how the name changes, information whose meaning is the same as that of the information described above falls within the protection scope of this application.
The solution provided by this application has been described above mainly from the perspective of interaction between network elements. It can be understood that, to implement the functions above, each network element includes corresponding hardware structures and/or software modules for performing each function. A person skilled in the art should readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, the present invention can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.
Based on the foregoing method, FIG. 13 is a schematic structural diagram of a task scheduler provided by an embodiment of this application. The task scheduler 1300 may be a chip or a circuit, for example, a chip or circuit disposed in a processor. The task scheduler 1300 corresponds to the task scheduler in the foregoing method, such as the task scheduler 310 in FIG. 2, and can implement the steps of any one or more of the corresponding methods shown in FIG. 5 or FIG. 9. As shown in FIG. 13, the task scheduler 1300 may include an acquisition unit 1301, a determination unit 1302, and a scheduling unit 1303.
In this embodiment of the application, the acquisition unit 1301 may, when receiving information, be a receiving unit or a receiver, and the receiving unit or receiver may be a radio frequency circuit. In a specific implementation, the acquisition unit 1301 is configured to acquire N tasks and the association relationships of the N tasks; the determination unit 1302 is configured to determine, according to the association relationships of the N tasks, those of the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks; and the scheduling unit 1303 is configured to schedule the schedulable tasks to the processing unit, where N is a positive integer.
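The division of labor among the three units can be sketched as follows. This is purely an illustrative decomposition under stated assumptions: the class and method names are invented for this sketch, tasks are plain dictionaries, and the set of completed tasks is supplied externally rather than tracked by a real processing unit.

```python
class TaskScheduler:
    """Toy decomposition mirroring the three units of FIG. 13."""

    def __init__(self):
        self.pending = []  # tasks acquired but not yet dispatched

    def acquire(self, tasks):
        # acquisition unit 1301: take in N tasks and their associations
        self.pending.extend(tasks)

    def determine_schedulable(self, done):
        # determination unit 1302: a task is schedulable when it has no
        # dependencies or all of its dependencies have completed
        return [t for t in self.pending if all(d in done for d in t["deps"])]

    def dispatch(self, processing_unit, done):
        # scheduling unit 1303: hand each schedulable task to the
        # processing unit (here, any callable)
        for t in self.determine_schedulable(done):
            self.pending.remove(t)
            processing_unit(t)
```

A caller would invoke `acquire` once per batch and `dispatch` whenever the set of completed tasks changes.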
For the concepts, explanations, detailed descriptions, and other steps of the task scheduler 1300 that relate to the technical solution provided in the embodiments of this application, refer to the descriptions of this content in the foregoing method or other embodiments; they are not repeated here.
It can be understood that, for the functions of the units in the task scheduler 1300, reference may be made to the implementations of the corresponding method embodiments; they are not repeated here.
It should be understood that the division of the task scheduler 1300 into the above modules is merely a division of logical functions. In actual implementation, the modules may be fully or partially integrated into one physical entity, or may be physically separate.
According to the method provided in the embodiments of this application, this application further provides a task scheduler that includes the aforementioned task acquirer, blocking manager, and task dispatcher, as well as the first waiting queue, the second waiting queue, and the ready queue.
According to the method provided in the embodiments of this application, this application further provides a processor that includes the aforementioned task scheduler and processing unit. The processing unit may specifically be a processor core, such as the aforementioned GPU core.
According to the method provided in the embodiments of this application, this application further provides an electronic device. The electronic device includes a processor coupled to a memory, and the processor is configured to execute a computer program stored in the memory, so that the electronic device performs the method of any one of the embodiments shown in FIG. 5 or FIG. 9.
According to the method provided in the embodiments of this application, this application further provides a task processing system including a processor and a device driver package. The device driver package is configured to send N tasks to the processor, and the processor is configured to process the N tasks by performing the method of any one of the embodiments shown in FIG. 5 or FIG. 9.
According to the method provided in the embodiments of this application, this application further provides a computer program product. The computer program product includes computer program code that, when run on a computer, causes the computer to perform the method of any one of the embodiments shown in FIG. 5 or FIG. 9.
According to the method provided in the embodiments of this application, this application further provides a computer-readable storage medium. The computer-readable storage medium stores program code that, when run on a computer, causes the computer to perform the method of any one of the embodiments shown in FIG. 5 or FIG. 9.
As used in this specification, the terms "component", "module", and "system" denote computer-related entities: hardware, firmware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable file, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device itself may be components. One or more components may reside within a process and/or thread of execution, and a component may be located on one computer and/or distributed between two or more computers. Furthermore, these components may execute from various computer-readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes, for example, according to a signal having one or more data packets (such as data from one component interacting with another component in a local system, in a distributed system, and/or across a network such as the Internet, which interacts with other systems by way of the signal).
A person of ordinary skill in the art may be aware that the various illustrative logical blocks and steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; they are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is merely a division of logical functions, and in actual implementation there may be other divisions; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
When the functions are implemented in the form of a software functional unit and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.
Claims (31)
- A task scheduling method, comprising: acquiring N tasks and association relationships of the N tasks, where N is a positive integer; determining, according to the association relationships of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks; and scheduling the schedulable tasks to a processing unit.
- The method according to claim 1, wherein the determining, according to the association relationships of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks comprises: determining, according to the association relationships of the N tasks, a task that has a synchronization association with other tasks, and when the other tasks with which the task has the synchronization association have completed execution, determining the task as a schedulable task; wherein having a synchronization association with other tasks means that the task has an association relationship with all other tasks acquired earlier or later than the task.
- The method according to claim 2, wherein the completion of execution of the other tasks with which the task has the synchronization association comprises at least one of the following: the task is a forward synchronization task, and all other tasks acquired earlier than the task have completed execution; or the task is a task acquired later than a backward synchronization task, and the backward synchronization task has completed execution; wherein the forward synchronization task indicates that the forward synchronization task depends on all other tasks acquired earlier than the forward synchronization task, and the backward synchronization task indicates that all other tasks acquired later than the backward synchronization task depend on the backward synchronization task.
- The method according to any one of claims 1 to 3, wherein the determining, according to the association relationships of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks comprises: determining, according to the association relationships of the N tasks, a task that has a dependency association with other tasks, and when the other tasks with which the task has the dependency association have executed or have completed execution, determining the task as a schedulable task.
- The method according to any one of claims 1 to 4, wherein the determining, according to the association relationships of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks comprises: traversing, in the order in which the N tasks were acquired, each task among the N tasks that has not yet been determined to be blocked, and when traversing each such task: if the task is a forward synchronization task and not all other tasks acquired earlier than the task have completed execution, determining that the task is synchronization-blocked; if the task is a backward synchronization task, determining that all other tasks acquired later than the task are synchronization-blocked; if the task depends on other tasks and the other tasks have not completed execution, determining that the task is dependency-blocked; if the task serially depends on other tasks and not all of the other tasks have executed, determining that the task is dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, determining the task as one of the schedulable tasks.
- The method according to claim 5, characterized in that: the task being synchronization-blocked comprises the task being blocked by whole-group synchronization and/or the task being blocked by sub-group synchronization; the other tasks used to determine whether the task is blocked by whole-group synchronization are the tasks among the N tasks acquired earlier or later than the task; and the other tasks used to determine whether the task is blocked by sub-group synchronization are the tasks, within the sub-group to which the task belongs, acquired earlier or later than the task.
- The method according to claim 6, characterized in that the sub-groups are obtained by division according to service characteristics.
- The method according to any one of claims 1 to 7, characterized in that: after the acquiring N tasks and the association relationship of the N tasks, the method further comprises: storing the N tasks in a first waiting queue; the determining, according to the association relationship of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks comprises: for any task in the first waiting queue, determining whether the task is synchronization-blocked by other tasks, and if not, moving the task from the first waiting queue to a second waiting queue; and, for any task in the second waiting queue, determining whether the task is dependency-blocked by other tasks, and if not, moving the task from the second waiting queue to a ready queue; and the scheduling the schedulable tasks to the processing unit comprises: scheduling the tasks in the ready queue to the processing unit.
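The two-stage queue pipeline in the claim above can be sketched as a refresh pass. This is an assumed illustration, not the claimed implementation: the function and queue names are hypothetical, and the synchronization and dependency checks are abstracted into caller-supplied predicates.

```python
# Sketch of the first-waiting -> second-waiting -> ready pipeline:
# a task leaves the first waiting queue once it is no longer
# synchronization-blocked, and leaves the second waiting queue once
# it is no longer dependency-blocked. Names are illustrative.
from collections import deque

def refresh_queues(wait1, wait2, ready, sync_blocked, dep_blocked):
    for _ in range(len(wait1)):
        t = wait1.popleft()
        # still sync-blocked -> stays in the first waiting queue
        (wait1 if sync_blocked(t) else wait2).append(t)
    for _ in range(len(wait2)):
        t = wait2.popleft()
        # still dependency-blocked -> stays in the second waiting queue
        (wait2 if dep_blocked(t) else ready).append(t)
```

Using `popleft`/`append` on `deque`s preserves the acquisition order of the tasks as they move between queues, which the dispatching step relies on.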
- The method according to claim 8, characterized in that the scheduling the tasks in the ready queue to the processing unit comprises: monitoring the number of tasks currently being executed by the processing unit, and when that number is less than the number of tasks the processing unit can execute in parallel, scheduling the tasks in the ready queue to the processing unit one by one in the order in which the tasks in the ready queue were acquired.
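The capacity-gated dispatch above admits a very small sketch. Again this is an assumed illustration: the function name, the `running` list standing in for the processing unit's in-flight tasks, and the `max_parallel` parameter are all hypothetical.

```python
# Sketch of ready-queue dispatch: pop tasks in acquisition order while
# the processing unit is below its parallel-task limit. Illustrative only.
from collections import deque

def dispatch(ready, running, max_parallel):
    dispatched = []
    while ready and len(running) < max_parallel:
        t = ready.popleft()   # oldest (earliest-acquired) ready task first
        running.append(t)     # hand the task to the processing unit
        dispatched.append(t)
    return dispatched
```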
- A task scheduler, characterized by comprising: a task acquirer, configured to acquire N tasks and an association relationship of the N tasks, where N is a positive integer; a blocking manager, configured to determine, according to the association relationship of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks; and a task dispatcher, configured to schedule the schedulable tasks to a processing unit.
- The task scheduler according to claim 10, characterized in that the task acquirer is specifically configured to: receive the N tasks and the association relationship of the N tasks delivered by a device driver package.
- The task scheduler according to claim 10 or 11, characterized in that the blocking manager is specifically configured to: determine, according to the association relationship of the N tasks, a task that has a synchronization association with other tasks, and when the other tasks with which the task has the synchronization association have completed execution, determine the task as a schedulable task; wherein having a synchronization association with other tasks means that the task has an association relationship with all other tasks acquired earlier or later than the task.
- The task scheduler according to claim 12, characterized in that the completion of execution of the other tasks with which the task has the synchronization association comprises at least one of the following: the task is a forward synchronization task, and the other tasks acquired earlier than the task have all completed execution; or the task is a task acquired later than a backward synchronization task, and the backward synchronization task has completed execution; wherein the forward synchronization task indicates that the forward synchronization task depends on all other tasks acquired earlier than the forward synchronization task, and the backward synchronization task indicates that all other tasks acquired later than the backward synchronization task depend on the backward synchronization task.
- The task scheduler according to any one of claims 10 to 13, characterized in that the blocking manager is specifically configured to: determine, according to the association relationship of the N tasks, a task that has a dependency association with other tasks, and when the other tasks with which the task has the dependency association have been executed or have completed execution, determine the task as a schedulable task.
- The task scheduler according to any one of claims 10 to 14, characterized in that the blocking manager is specifically configured to: traverse, in the acquisition order of the N tasks, each task among the N tasks that has not yet been determined to be blocked, and when traversing each such task: if the task is a forward synchronization task and the other tasks acquired earlier than the task have not all completed execution, determine that the task is synchronization-blocked; if the task is a backward synchronization task, determine that all other tasks acquired later than the task are synchronization-blocked; if the task depends on other tasks and those other tasks have not completed execution, determine that the task is dependency-blocked; if the task serially depends on other tasks and those other tasks have not all been executed, determine that the task is dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, determine the task as one of the schedulable tasks.
- The task scheduler according to claim 15, characterized in that: the task being synchronization-blocked comprises the task being blocked by whole-group synchronization and/or the task being blocked by sub-group synchronization; the other tasks used to determine whether the task is blocked by whole-group synchronization are the tasks among the N tasks acquired earlier or later than the task; and the other tasks used to determine whether the task is blocked by sub-group synchronization are the tasks, within the sub-group to which the task belongs, acquired earlier or later than the task.
- The task scheduler according to claim 16, characterized in that the sub-groups are obtained by division according to service characteristics.
- The task scheduler according to any one of claims 10 to 17, characterized by further comprising: a first waiting queue, a second waiting queue, and a ready queue; wherein the task acquirer is further configured to: store the N tasks in the first waiting queue; the blocking manager is specifically configured to: for any task in the first waiting queue, determine whether the task is synchronization-blocked by other tasks, and if not, move the task from the first waiting queue to the second waiting queue; and, for any task in the second waiting queue, determine whether the task is dependency-blocked by other tasks, and if not, move the task from the second waiting queue to the ready queue; and the task dispatcher is specifically configured to: schedule the tasks in the ready queue to a processing unit.
- The task scheduler according to claim 18, characterized in that the task dispatcher is specifically configured to: monitor the number of tasks currently being executed by the processing unit, and when that number is less than the number of tasks the processing unit can execute in parallel, schedule the tasks in the ready queue to the processing unit one by one in the order in which the tasks in the ready queue were acquired.
- A task scheduler, characterized by comprising: an acquisition unit, configured to acquire N tasks and an association relationship of the N tasks, where N is a positive integer; a determination unit, configured to determine, according to the association relationship of the N tasks, tasks among the N tasks that have no dependencies or whose dependencies have been released as schedulable tasks; and a scheduling unit, configured to schedule the schedulable tasks to a processing unit.
- The task scheduler according to claim 20, characterized in that the determination unit is specifically configured to: determine, according to the association relationship of the N tasks, a task that has a synchronization association with other tasks, and when the other tasks with which the task has the synchronization association have completed execution, determine the task as a schedulable task; wherein having a synchronization association with other tasks means that the task has an association relationship with all other tasks acquired earlier or later than the task.
- The task scheduler according to claim 21, characterized in that the completion of execution of the other tasks with which the task has the synchronization association comprises at least one of the following: the task is a forward synchronization task, and the other tasks acquired earlier than the task have all completed execution; or the task is a task acquired later than a backward synchronization task, and the backward synchronization task has completed execution; wherein the forward synchronization task indicates that the forward synchronization task depends on all other tasks acquired earlier than the forward synchronization task, and the backward synchronization task indicates that all other tasks acquired later than the backward synchronization task depend on the backward synchronization task.
- The task scheduler according to any one of claims 20 to 22, characterized in that the determination unit is specifically configured to: determine, according to the association relationship of the N tasks, a task that has a dependency association with other tasks, and when the other tasks with which the task has the dependency association have been executed or have completed execution, determine the task as a schedulable task.
- The task scheduler according to any one of claims 20 to 23, characterized in that the determination unit is specifically configured to: traverse, in the acquisition order of the N tasks, each task among the N tasks that has not been determined to be blocked, and when traversing each such task: if the task is a forward synchronization task and the other tasks acquired earlier than the task have not all completed execution, determine that the task is synchronization-blocked; if the task is a backward synchronization task, determine that all other tasks acquired later than the task are synchronization-blocked; if the task depends on other tasks and those other tasks have not completed execution, determine that the task is dependency-blocked; if the task serially depends on other tasks and those other tasks have not all been executed, determine that the task is dependency-blocked; and when the task is neither synchronization-blocked nor dependency-blocked, determine the task as one of the schedulable tasks.
- The task scheduler according to claim 24, characterized in that: the task being synchronization-blocked comprises the task being blocked by whole-group synchronization and/or the task being blocked by sub-group synchronization; the other tasks used to determine whether the task is blocked by whole-group synchronization are the tasks among the N tasks acquired earlier or later than the task; and the other tasks used to determine whether the task is blocked by sub-group synchronization are the tasks, within the sub-group to which the task belongs, acquired earlier or later than the task.
- The task scheduler according to claim 25, characterized in that the sub-groups are obtained by division according to service characteristics.
- The task scheduler according to any one of claims 20 to 26, characterized in that: after acquiring the N tasks, the acquisition unit is further configured to: store the N tasks in a first waiting queue; the determination unit is specifically configured to: for any task in the first waiting queue, determine whether the task is synchronization-blocked by other tasks, and if not, move the task from the first waiting queue to a second waiting queue; and, for any task in the second waiting queue, determine whether the task is dependency-blocked by other tasks, and if not, move the task from the second waiting queue to a ready queue; and the scheduling unit is specifically configured to: schedule the tasks in the ready queue to the processing unit.
- The task scheduler according to claim 27, characterized in that the scheduling unit is specifically configured to: monitor the number of tasks currently being executed by the processing unit, and when that number is less than the number of tasks the processing unit can execute in parallel, schedule the tasks in the ready queue to the processing unit one by one in the order in which the tasks in the ready queue were acquired.
- A processor, characterized by comprising a task scheduler and a processing unit; wherein the task scheduler is configured to perform the method according to any one of claims 1 to 9; and the processing unit is configured to process the tasks scheduled to it by the task scheduler.
- An electronic device, characterized by comprising a processor, the processor being coupled to a memory and configured to execute a computer program stored in the memory, so that the electronic device performs the method according to any one of claims 1 to 9.
- A task scheduling system, characterized by comprising a device driver package and the processor according to claim 29; wherein the device driver package is configured to send N tasks to the processor, where N is a positive integer; and the processor is configured to process the N tasks.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211614262.8 | 2022-12-15 | ||
CN202211614262.8A CN118210597A (en) | 2022-12-15 | 2022-12-15 | Task scheduling method, apparatus and system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024125341A1 true WO2024125341A1 (en) | 2024-06-20 |
Family
ID=91445127
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/136252 WO2024125341A1 (en) | 2022-12-15 | 2023-12-04 | Task scheduling method, apparatus and system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN118210597A (en) |
WO (1) | WO2024125341A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120180068A1 (en) * | 2009-07-24 | 2012-07-12 | Enno Wein | Scheduling and communication in computing systems |
US20150172412A1 (en) * | 2012-07-06 | 2015-06-18 | Cornell University | Managing dependencies between operations in a distributed system |
CN110554909A (en) * | 2019-09-06 | 2019-12-10 | 腾讯科技(深圳)有限公司 | Task scheduling processing method and device, and computer equipment |
CN112099958A (en) * | 2020-11-17 | 2020-12-18 | 深圳壹账通智能科技有限公司 | Distributed multi-task management method and device, computer equipment and storage medium |
- 2022-12-15: CN application CN202211614262.8A filed (published as CN118210597A, status: active, pending)
- 2023-12-04: PCT application PCT/CN2023/136252 filed (published as WO2024125341A1)
Also Published As
Publication number | Publication date |
---|---|
CN118210597A (en) | 2024-06-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10733019B2 (en) | Apparatus and method for data processing | |
CN104615488B (en) | The method and apparatus of task scheduling in heterogeneous multi-core reconfigurable calculating platform | |
WO2017166777A1 (en) | Task scheduling method and device | |
JP2014525619A (en) | Data processing system | |
US20240106754A1 (en) | Load Balancing Method for Multi-Thread Forwarding and Related Apparatus | |
WO2023020177A1 (en) | Task scheduling method, game engine, device and storage medium | |
JP2021518955A (en) | Processor core scheduling method, equipment, terminals and storage media | |
CN112214299A (en) | Multi-core processor and task scheduling method and device thereof | |
CN111694675A (en) | Task scheduling method and device and storage medium | |
CN112380001A (en) | Log output method, load balancing device and computer readable storage medium | |
TW202107408A (en) | Methods and apparatus for wave slot management | |
CN110018782B (en) | Data reading/writing method and related device | |
CN110955461A (en) | Processing method, device and system of computing task, server and storage medium | |
WO2024125341A1 (en) | Task scheduling method, apparatus and system | |
CN116414534A (en) | Task scheduling method, device, integrated circuit, network equipment and storage medium | |
CN114371920A (en) | A Network Function Virtualization System Based on Graphics Processor Acceleration Optimization | |
CN115269131A (en) | A task scheduling method and device | |
CN118689633A (en) | GPU resource allocation method and server | |
CN115981893A (en) | Message queue task processing method and device, server and storage medium | |
CN113923212B (en) | Network data packet processing method and device | |
CN116982030A (en) | In-server delay control device, in-server delay control method, and program | |
CN116458143A (en) | Method and device for editing message | |
JP7662062B2 (en) | Intra-server delay control device, intra-server delay control method and program | |
US20230393889A1 (en) | Multi-core processor, multi-core processor processing method, and related device | |
US20230195546A1 (en) | Message Management Method and Apparatus, and Serverless System |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23902538 Country of ref document: EP Kind code of ref document: A1 |