[go: up one dir, main page]

CN111104169B - Instruction list scheduling method, device, computer equipment and storage medium - Google Patents

Instruction list scheduling method, device, computer equipment and storage medium Download PDF

Info

Publication number
CN111104169B
CN111104169B CN201911214046.2A CN201911214046A CN111104169B CN 111104169 B CN111104169 B CN 111104169B CN 201911214046 A CN201911214046 A CN 201911214046A CN 111104169 B CN111104169 B CN 111104169B
Authority
CN
China
Prior art keywords
instruction
execution time
scheduled
selection
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911214046.2A
Other languages
Chinese (zh)
Other versions
CN111104169A (en
Inventor
不公告发明人
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Cambricon Information Technology Co Ltd filed Critical Shanghai Cambricon Information Technology Co Ltd
Priority to CN201911214046.2A priority Critical patent/CN111104169B/en
Publication of CN111104169A publication Critical patent/CN111104169A/en
Application granted granted Critical
Publication of CN111104169B publication Critical patent/CN111104169B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

本发明涉及一种指令列表调度方法、装置、计算机设备及存储介质,该方法通过分析待调度指令的数据依赖关系,得到指令调度过程中每次进行指令选择的所有选择节点,再根据对各次序对应的选择节点的评估结果确定调度后的指令列表中各次序的指令。该方法可以保证每次选择指令时,选择的指令为当前状态的最优结果,使用这些最优结果得到的调度后的指令列表,各个指令之间的排列更加紧凑,便于缩短原指令列表中指令序列的执行时间。

Figure 201911214046

The invention relates to an instruction list scheduling method, device, computer equipment and storage medium. The method obtains all selected nodes for each instruction selection in the instruction scheduling process by analyzing the data dependency relationship of the instructions to be scheduled, and then according to the order of each order The evaluation result of the corresponding selection node determines the instructions in each order in the scheduled instruction list. This method can ensure that each time an instruction is selected, the selected instruction is the optimal result of the current state, and in the scheduled instruction list obtained by using these optimal results, the arrangement of each instruction is more compact, which is convenient for shortening the instructions in the original instruction list. The execution time of the sequence.

Figure 201911214046

Description

Instruction list scheduling method and device, computer equipment and storage medium
The application is a division of a chinese application 2017114844108 entitled "instruction list scheduling method, apparatus, computer device, and storage medium" filed on 29/12/2017.
Technical Field
The present invention relates to the field of computer technology in information technology, and in particular, to a method and an apparatus for scheduling an instruction list, a computer device, and a storage medium.
Background
With the rapid development of computer technology, a Multi-processor computer System (Multi-processor Computing System) including a plurality of first processors, such as a Multi-core computer System (Multi-processor Computing System) and a Heterogeneous computer System (Heterogeneous Computing System), has appeared. The plurality of first processors of the computer system can process different instructions in parallel according to the instruction lists corresponding to the plurality of first processors, so that the processing efficiency of the computer system is improved.
However, the order of the instructions in the instruction list corresponding to the plurality of first processors of the computer system may not be reasonable, for example, the instructions in the instruction list are not made to be parallel as much as possible, which may not improve the processing efficiency of the computer system or improve the efficiency.
Therefore, it is an urgent technical problem to provide a method, an apparatus, a computer device and a storage medium for scheduling an instruction list, so as to adjust the order of instructions in the instruction list, to make the arrangement between the instructions in the instruction list more compact, and to shorten the execution time of the instruction list.
Disclosure of Invention
Based on this, it is necessary to provide an instruction list scheduling method, apparatus, computer device, and storage medium for solving the problem of unreasonable instruction sequence ordering in an instruction list used by a processor.
An instruction list scheduling method, comprising: acquiring a to-be-scheduled instruction set in a to-be-scheduled instruction list, and performing data dependency analysis on the to-be-scheduled instruction set to obtain a data dependency relationship among instructions in the to-be-scheduled instruction set;
obtaining all selection nodes for instruction selection each time in the instruction scheduling process according to the data dependency relationship among the instructions;
and according to a preset rule, determining the instructions in each order in the scheduled instruction list according to the selected nodes in the corresponding order.
In one embodiment, the step of determining, according to a preset rule, instructions in each order in a scheduled instruction list according to the selection node in the corresponding order includes:
accessing the selection node and acquiring the longest execution time corresponding to the currently accessed selection node;
if the longest execution time corresponding to the currently accessed selection node is smaller than the initial execution time, determining the sequenced instructions of the currently accessed selection node as instructions in the corresponding sequence in a scheduled instruction list;
the initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled.
In one embodiment, the method comprises:
and if the longest execution time corresponding to the currently accessed selection node is less than the initial execution time, updating the initial execution time to the longest execution time corresponding to the currently accessed selection node.
In one embodiment, the step of accessing the selected node and obtaining the longest execution time corresponding to the currently accessed selected node includes:
accessing the selection node in a preset access time period, and acquiring the longest execution time corresponding to the currently accessed selection node;
if the longest execution time corresponding to the currently accessed selection node is smaller than the initial execution time, determining the ordered instructions corresponding to the currently accessed selection node as the instructions in the corresponding order in the scheduled instruction list;
the initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled.
In one embodiment, if the longest execution time corresponding to the currently accessed selection node is not less than the initial execution time, the instruction sequence in the instruction table to be scheduled is used as the instruction sequence in the instruction table after scheduling.
In one embodiment, the step of accessing the selected node and obtaining the longest execution time corresponding to the currently accessed selected node includes:
and selecting the selected node to access according to a random priority rule, and acquiring the longest execution time corresponding to the selected node which is selected to access currently.
In one embodiment, the step of accessing the selected node and obtaining the longest execution time corresponding to the currently accessed selected node includes:
and selecting the selected node for access according to a breadth-first rule, and acquiring the longest execution time corresponding to the selected node which is selected to be accessed currently.
In one embodiment, the step of accessing the selected node and obtaining the longest execution time corresponding to the currently accessed selected node includes:
and selecting the selected node for access according to a depth-first rule, and acquiring the longest execution time corresponding to the selected node which is selected to be accessed currently.
In one embodiment, the step of accessing the selected node and obtaining the longest execution time corresponding to the currently accessed selected node includes:
selecting the selected nodes with the sequence less than the preset sequence for access according to a breadth or random priority rule to obtain the longest execution time corresponding to the selected node which is selected for access currently;
and selecting the selected nodes not less than the preset sequence according to a depth-first rule for access to obtain the longest execution time corresponding to the selected nodes selected to be accessed currently.
In one embodiment, the step of accessing the selected node and obtaining the longest execution time corresponding to the currently accessed selected node includes:
obtaining the shortest execution time corresponding to the currently accessed selection node;
if the shortest execution time corresponding to the currently accessed selection node is greater than the initial execution time, terminating accessing the selection node associated with the currently accessed selection node;
the initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled.
In one embodiment, according to a preset rule, the step of determining the instructions in each order in the scheduled instruction list according to the selection node in the corresponding order includes:
and evaluating all the selected nodes corresponding to the current order according to the preset priority of the instruction to obtain the evaluation result of each selected joint in the current order, and determining the instruction corresponding to the current order according to the evaluation result.
In one embodiment, the method comprises: the priority of each instruction is set according to the specific content and/or type of the currently selected node.
In one embodiment, according to a preset rule, the step of determining the instructions in each order in the scheduled instruction list according to the selection node in the corresponding order includes:
and determining the instruction corresponding to the current sequence according to the length of the shortest execution time corresponding to all the selected nodes in the current sequence.
An instruction scheduling apparatus comprising: an acquisition module, a data dependence analysis module and an evaluation module,
the obtaining module is used for obtaining a to-be-scheduled instruction set in the to-be-scheduled instruction list and obtaining all selection nodes corresponding to each instruction selection in the instruction scheduling process according to the data dependency relationship among the instructions;
the data dependency analysis module is used for carrying out data dependency analysis on the instruction set to be scheduled to obtain a data dependency relationship among the instructions;
and the evaluation module is used for determining the instructions in each order in the scheduled instruction list according to a preset rule and the selected nodes in the corresponding order.
A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor performing the steps of the above mentioned method.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of carrying out the above-mentioned method.
Compared with the prior art, the instruction list scheduling method, the instruction list scheduling device, the computer equipment and the storage medium have the following beneficial effects:
and obtaining all selection nodes corresponding to each instruction selection in the scheduling process by analyzing the data dependency of the instructions to be scheduled, and determining the instructions of each order in the scheduled instruction list according to the evaluation result of the selection nodes corresponding to each order. The method can ensure that the selected instruction is the optimal result of the current state when the instruction is selected every time, the arrangement among the instructions is more compact by using the scheduled instruction list obtained by the optimal results, and the execution time of the instruction sequence in the original instruction list is convenient to shorten.
Drawings
FIG. 1 is a diagram illustrating a computer system according to one embodiment;
FIG. 2 is a flowchart illustrating the steps of a method for scheduling instruction lists according to one embodiment;
FIG. 3 is a diagram of data dependencies for instructions to be scheduled, obtained in one embodiment;
FIG. 4 is a correlation diagram of select nodes obtained in one embodiment;
FIG. 5 is a block diagram of an instruction list scheduler according to an embodiment of the present invention;
fig. 6 is an internal structural diagram of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and technical effects of the present invention more apparent, specific embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. It should be clear that "first", "second", etc. in this embodiment are only used to distinguish the described objects, and do not have any order or technical meaning.
As shown in fig. 1, the computer System 100 according to an embodiment of the present invention may be a Multi-processor computer System (Multi-core processor Computing System) including a plurality of processors, such as a Multi-core processor computer System (Multi-core processor Computing System), a Heterogeneous computer System (Heterogeneous Computing System), and the like. Optionally, the computer system may specifically include an instruction list scheduling apparatus 110, a plurality of first processors 120, and a memory 130, the plurality of first processors 120 may be connected to the instruction list scheduling apparatus 110 at the same time, and the instruction list scheduling apparatus 110 may be used for instruction list rescheduling of the plurality of first processors 120. Optionally, the instruction list scheduling device 110 may also include a second processor. Optionally, the second processor may include an obtaining module, a data dependency analysis module, an evaluation module, an operation module, a control module, and the like, where the obtaining module may be a hardware module such as an IO (Input/Output) interface, and the operation module and the control module are both hardware modules.
The plurality of first processors 120 may process different instructions in parallel according to the instruction list to improve the processing efficiency of the computer system. Optionally, the instruction list may include one or more instructions, each instruction includes a set of reference operations for a resource, and the resource referenced by the instruction may be known by reading or executing the instruction. I.e., when the first processor or the like executes the instruction, the resource referenced by the instruction may be called to implement the particular operation. For example, the instruction may be a Load instruction (Load), a computation instruction (computing), a store instruction (store), or the like, but the instruction may also be an N-level computation of a neural network, N >0, and N may be an integer or a non-integer.
Furthermore, the instructions in the instruction list are arranged according to an execution sequence, and the resource referred by each instruction in the instruction list may be a virtual memory object or a physical memory object. The virtual memory object may be a memory block, a register, or other virtual memory space of a storage device capable of storing data in software logic. The instruction scheduling process in this embodiment is a process of reordering the instructions in the instruction list on the premise of ensuring that the semantics of the original instruction list are not changed, which may make the arrangement between the instructions in the instruction list more compact, so as to shorten the execution time of the instruction list and improve the processing efficiency of the system.
For example, the instruction list includes N instructions, where N ≧ 1, N is a positive integer, and the N instructions are labeled as a first instruction, a second instruction, … …, and an Nth instruction according to execution timing. The scheduling process of the instruction list is a process of reordering the N instructions.
Specifically, when scheduling the instruction list, the instruction list scheduling apparatus 110 may first obtain the data dependency relationship of each instruction in the instruction list to be scheduled. Alternatively, the form of the data dependency may include RAW (Read After Write)/WAR (Write After Read)/ww (Write After Write). Alternatively, the Data dependency relationship may be described by a Data dependency Graph DDG (Data dependency Graph). Further, the second processor of the instruction list scheduling apparatus 110 may obtain the instruction list to be scheduled through the obtaining module thereof, and perform data dependency analysis on the instructions in the instruction list to be scheduled through the data dependency analysis module thereof to obtain the data dependency relationship between the instructions. Specifically, the data dependency analysis module may perform resource scanning and tracking on each instruction in the instruction list to be scheduled, so as to analyze the data dependency relationship between the instructions. The data dependency between the instructions in this embodiment refers to whether the execution of the current instruction needs to depend on the execution results of other instructions. For example, if there is instruction A "read data written by instruction B" then instruction A depends on the result of instruction B. Then, the obtaining module may obtain all the selection nodes that perform instruction selection each time in the instruction scheduling process according to the obtained data dependency relationship between the instructions.
Then, the instruction list scheduling device may determine, through the evaluation module, instructions in each order in the scheduled instruction list from all the selected nodes in the corresponding order according to a preset rule. Optionally, the second processor may evaluate the selection node corresponding to the current order through the evaluation module of the second processor, obtain an evaluation result of each selection node of the current order, and determine the instruction corresponding to the current order according to the evaluation result. Each selection node records the ordered instruction and the instruction set to be scheduled corresponding to the selection node. Optionally, the evaluation module evaluates the selected nodes corresponding to the current order according to the priority of each instruction. Optionally, the second processor may also set the priority of the instruction according to the specific content and/or type of the currently selected node.
Optionally, when the instruction list scheduling apparatus 110 performs instruction scheduling, the first processor corresponding to an instruction in the instruction list to be scheduled may be adjusted. For example, the first processor corresponding to the instruction to be scheduled may be determined according to the type of the instruction, or the specific content of the instruction to be scheduled.
Fig. 2 is a flowchart illustrating steps of an instruction list scheduling method according to an embodiment of the present invention, which can be applied to the computer system described above. The computer system may include a memory 130 and a plurality of first processors 120. The instruction list scheduling method is used for realizing the rescheduling of the instructions in the instruction list corresponding to the plurality of first processors in the computer system so as to improve the processing efficiency of the computer. Specifically, the method may include the steps of:
step S100: the method comprises the steps of obtaining an instruction set to be scheduled in an instruction list to be scheduled, and carrying out data dependency analysis on the instruction set to be scheduled to obtain a data dependency relationship among instructions in the instruction set to be scheduled.
Specifically, the second processor may obtain, through the obtaining module, an instruction set to be scheduled of the instruction list to be scheduled, and obtain, through the data dependency analysis module, a data dependency relationship of the instruction. The instruction set to be scheduled in this embodiment is composed of a plurality of instructions to be scheduled in an instruction list to be scheduled. Optionally, the instruction set to be scheduled does not include a semantic-free instruction (e.g., a synchronization instruction) in the instruction list to be scheduled. Further, the step of acquiring the instruction set to be scheduled of the instruction list to be scheduled by the acquiring module includes: and acquiring a list of instructions to be scheduled, and deleting the semantic-free instructions in the list of instructions to be scheduled to obtain an instruction set to be scheduled.
For example, the instruction set to be scheduled acquired by the acquisition module includes six instructions { L1, L2, C1, C2, S1, S2 }. Wherein, L1, C1 and S1 need to be executed in sequence, L2, C2 and S2 need to be executed in sequence, and the rest instructions have no data dependency relationship. L1, L2, S1, S2 are I/O instructions, and C1, C2 are compute instructions. The Data dependency analysis module performs Data dependency analysis on the instruction to be scheduled to obtain a Data dependency relationship among instructions in the instruction set to be scheduled, and the Data dependency relationship is described by using a DDG (Data dependency Graph) as shown in fig. 3.
The resource referred by each instruction to be scheduled in the instruction list to be scheduled may be a virtual memory object or a physical memory object. The virtual memory object may be a memory block, a register, or other virtual memory space of a storage device capable of storing data in software logic.
Step S200: and obtaining all selection nodes for instruction selection each time in the instruction scheduling process according to the data dependency relationship among the instructions.
Each selection node records the ordered instruction and the instruction set to be scheduled corresponding to the selection node. Optionally, all the selected processes may be obtained as: the second processor preferentially obtains all first selection nodes during the first instruction selection through the obtaining module, specifically, obtains the ordered instructions and the instruction set to be scheduled corresponding to the first selection nodes. It should be clear that there are data dependencies for each instruction in these sets of instructions to be scheduled. And then the second processor acquires all second selection nodes associated with each first selection node through the acquisition module according to the data dependency relationship of each first selection node, wherein the second selection nodes correspond to the second instruction selection. And (4) circulating the steps to obtain a third selection node … … and an Nth selection node, wherein N is more than or equal to 3, and N is a positive integer. The sum of the first selected node, … …, and the nth selected node obtained in the above steps constitutes all selected nodes for instruction selection at a time.
For example, the instruction set to be scheduled in the acquired instruction list to be scheduled contains six instructions: { L1, L2, C1, C2, S1, S2}, the data dependency relationship among these six instructions is shown in FIG. 3. It can be clearly seen from fig. 3 that the six instructions L1 and L2 in the instruction set to be scheduled may not depend on the execution of other instructions, and therefore, when the instruction is selected for the first time, the instruction needs to be selected from L1 and L2, that is, the obtained first selection node corresponds to two cases of selecting the instruction L1 or L2. When L1 is selected at the time of the first instruction selection, L1 is an ordered instruction, at which time the first selection node records the ordered instruction L1, and deletes the instruction set to be scheduled { L2, C1, C2, S1, S2} of instruction L1. Similarly, another first selection node is obtained when the first instruction is selected at the time of selecting the L2, and the first selection node records the sorted instruction L2 and deletes the instruction set to be scheduled { L1, C1, C2, S1 and S2} of the instruction L2. The above process is looped to obtain the second selection node … … in the second instruction selection and the sixth selection node in the sixth instruction selection.
In this implementation step, each time instruction selection is performed, an instruction set to be scheduled, which is obtained according to a previous instruction selection, for example, an instruction set to be scheduled corresponding to fig. 3, is required to be selected, when an instruction selected in the first instruction selection is L1 (corresponding to one of the first selection nodes), an instruction set to be scheduled { L2, C1, C2, S1, S2} is obtained, instructions L2 and C1 in the instruction set of the first selection node may not depend on execution of other instructions, and at this time, when a second instruction selection is performed, selection from L2 and C1 is required (two second selection nodes are correspondingly present); when the selected instruction is L2 (corresponding to another first selection node) at the time of first instruction selection, the obtained instruction set to be scheduled { L1, C1, C2, S1, S2}, the instructions L1 and C2 in the scheduling instruction set of the first selection node may not depend on the execution of other instructions, and at this time, the instruction needs to be selected from L1 and C2 (corresponding to the existence of two second selection nodes) at the time of second instruction selection. Therefore, there is an association between all the selected nodes obtained in this embodiment, and such an association between the selected nodes can be represented by fig. 4.
Step S300: and according to a preset rule, determining the instructions in each order in the scheduled instruction list according to the selected nodes in the corresponding order. Optionally, the second processor may evaluate the selection node corresponding to the current order through the evaluation module, and determine the instruction corresponding to the current order according to the evaluation result of each selection node of the current order. For example, if the current order is the second instruction, then, corresponding to the second selection node in fig. 4, four second selection nodes in fig. 4 are evaluated according to a preset rule, and the second instruction in the scheduled instruction list is obtained according to the evaluation result. Optionally, the evaluation module evaluates the selected node (e.g., … … after C1 with the highest priority of L2) corresponding to the current order according to the preset priority of each instruction, so as to obtain an evaluation result. Optionally, the second processor sets the priority of each instruction according to the specific content and/or type of the currently selected node.
Optionally, the evaluation module may determine the instruction corresponding to the current order according to the length of the shortest execution time corresponding to all the selected nodes of the current order. For example, the shortest execution time of the instruction sequence corresponding to the first selected node corresponding to the instruction L1 in fig. 4 is t1The shortest execution time of the corresponding instruction sequence is t for the first selected node corresponding to the instruction L22,t1>t2Then L2 is determined to be the first instruction in the scheduled instruction list. Similarly, the second instruction, … …, sixth instruction of the scheduled instruction list is determined.
In the instruction list scheduling method provided in this embodiment, all the selection nodes that perform instruction selection each time in the instruction scheduling process are obtained by analyzing the data dependency relationship of the instruction to be scheduled, and then the instruction in each order in the scheduled instruction list is determined according to the evaluation result of the selection node corresponding to each order. The method can ensure that the selected instruction is the optimal result of the current state when the instruction is selected every time, the arrangement among the instructions is more compact by using the scheduled instruction list obtained by the optimal results, and the execution time of the instruction sequence in the original instruction list is convenient to shorten.
As an optional implementation manner, the step of determining, by the evaluation module, the instructions in each order in the post-scheduling instruction list according to the preset rule and in the selection node in the corresponding order includes:
step 210: and the evaluation module accesses the selection node and acquires the longest execution time corresponding to the currently accessed selection node. The selection nodes accessed by the evaluation module may be a first selection node, a second selection node, … …, an nth selection node.
Step S220: if the longest execution time corresponding to the currently accessed selection node is less than the initial execution time T0Then, the ordered instruction of the current access node is determined as the corresponding instruction in the scheduled instruction list. The initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled.
The longest execution time corresponding to the currently accessed selection node in this implementation step is the execution time when the arrangement of the instruction sequence corresponding to the currently accessed node is least reasonable. For example, the longest execution time corresponding to the first and second selection nodes on the left side in FIG. 4 is T1=t1+t2+t3+t4+t5Wherein, t1To be arranged alreadyExecution time, t, of the sequential instructions L1-L22Is the execution time of instruction C1; t is t3Is the execution time, t, of instruction S14Is the execution time, t, of instruction C25The execution time of the instruction S2 is the case when the unordered instructions C1, C2, S1, and S2 corresponding to the selected node are not parallel at all and are sorted least reasonably. If T1<T0Then L1, L2 are taken as the first instruction and the second instruction in the scheduled list of instructions, respectively.
Because the longest execution time corresponding to the currently accessed selection node is less than the initial execution time, the execution time of the instruction sequence obtained by the instruction list scheduling method provided in this embodiment is not greater than the instruction sequence in the instruction list to be scheduled.
Since the evaluation module of this embodiment accesses the selected node accessed according to the preset rule, instructions in the instruction list are not scheduled only according to the selected node in the current order, and the influence of the determined instruction in the current order on the selection of subsequent instructions can be avoided. The method is particularly suitable for scheduling an instruction list containing instructions with large calculation amount, and optionally an instruction list containing neural network operation instructions. For example, the instruction list includes N instructions, where the N instructions include a weight load instruction a and a neural network convolutional layer calculation instruction B, and if the conventional method is used, the instruction a and the instruction B may not be parallel to achieve the highest processing efficiency of the system, and the instruction list scheduling scheme of this embodiment may implement the parallel of the instruction a and the instruction B in the scheduled instruction list.
In one embodiment, the method may further include: and if the longest execution time corresponding to the currently accessed selection node is less than the initial execution time, updating the initial execution time to the longest execution time corresponding to the currently accessed selection node. For example, in the above embodiment, when T is1<T0Then, L1 and L2 are used as the first instruction and the second instruction in the scheduled instruction list, and T is used as the first instruction and the second instruction in the scheduled instruction list1Updated to the initial execution time.
It should be clear that, when the longest execution time corresponding to the currently accessed selection node is less than the initial execution time, the ordered instruction corresponding to the currently accessed selection node is determined as the instruction in the corresponding order in the scheduled instruction list, and it can be ensured that the execution time of the instruction sequence in the obtained scheduled instruction list is shorter. The scheme for updating the initial execution time is to further optimize the ordering of the instructions and improve the processing efficiency of the system.
As an optional implementation manner, the step of accessing, by the evaluation module, the selected node and obtaining the longest execution time corresponding to the currently accessed selected node includes:
and accessing the selection nodes in a preset access time period to obtain the longest execution time corresponding to each selection node in the preset access time period. The present embodiment needs to determine the instructions in each order of the scheduled instruction list by combining the methods proposed in the above embodiments.
Because a plurality of instructions to be scheduled generally exist in the instruction list, the number of the selection nodes obtained according to the instructions to be scheduled is huge, and sufficient time is difficult to traverse all the selection nodes in actual operation. Based on the above, the object of the present invention can be achieved as long as the execution time is shortened by the new instruction list obtained by the instruction list scheduling method provided by the present invention. Therefore, when the instruction list scheduling method provided by the invention is actually used for instruction reordering, the access time period is generally set according to actual requirements, and the scheduling time of the instruction is controlled.
As an optional implementation manner, if the longest execution time corresponding to the currently accessed selection node is not less than the initial execution time, the instruction sequence in the instruction table to be scheduled is taken as the instruction sequence in the instruction table after scheduling.
In this embodiment, the longest execution time corresponding to the currently accessed selection node is not less than the initial execution time, and taking the instruction sequence in the instruction table to be scheduled as the instruction sequence in the instruction table after scheduling is the optimization of the instruction list scheduling method proposed in the above embodiment. The method can ensure that the obtained instruction sequence in the scheduled instruction list is the optimal result obtained in the preset time period.
As an optional implementation manner, the step of accessing the selection node and obtaining the longest execution time corresponding to the currently accessed selection node includes:
step 230: and the evaluation module acquires the shortest execution time corresponding to the currently accessed selected node.
Step 240: if the shortest execution time corresponding to the currently accessed selection node is larger than the initial execution time T0Then the access to the selected node associated with the currently accessed selected node is terminated. For example, the shortest execution time of the second selected node corresponding to the instruction L2 is T2,T2The unordered instructions C1, C2, S1 and S2 corresponding to the selected node are perfectly parallel, and the sorting is the most reasonable. If T2>T0Then access to the third selection nodes associated with the second selection node and the fourth selection nodes associated with these third selection nodes, … …, sixth selection node, is terminated.
Because the evaluation module consumes time when accessing one selection node, the technical scheme of the embodiment can eliminate invalid access to the selection node, and improve the scheduling efficiency of the instruction list.
As an optional implementation manner, the step of accessing, by the evaluation module, the selected node, and acquiring the longest execution time corresponding to the selected node that is currently selected and accessed includes: the evaluation module selects the selected node for access according to random priority (such as Monte Carlo Tree Search, MCTS, Monte Carlo Tree Search), and obtains the longest execution time corresponding to the selected node selected to be accessed currently.
As an optional implementation manner, the step of accessing, by the evaluation module, the selected node and obtaining the longest execution time corresponding to the currently accessed selected node includes: and the evaluation module selects the selected node for access according to a rule of Breadth First (BFS) and acquires the longest execution time corresponding to the selected node which is selected to be accessed currently. Specifically, breadth-first in this embodiment refers to preferentially selecting a selected node in the same order as the currently accessed selected node for access. For example, if the second selection node is currently accessed, the next accessed selection node preferentially selects the other second selection nodes.
As an optional implementation manner, the step of accessing, by the evaluation module, the selected node and obtaining the longest execution time corresponding to the currently accessed selected node includes: and the evaluation module selects the selected node for access according to a Depth First Search (BFS) rule and acquires the longest execution time corresponding to the selected node which is selected to be accessed currently. Specifically, depth-first in this embodiment refers to preferentially selecting a selection node in the next order associated with a currently accessed selection node for access. For example, if the currently accessed selection node is the second selection node, the next accessed selection node preferentially selects the third selection node associated with the second selection node.
Optionally, the evaluation module may also select the selected node for access by using a rule combining random preference with depth preference, or select the selected node for access by using a rule combining breadth preference with depth preference. Specifically, selecting the selection nodes smaller than a preset sequence for access according to a breadth or random priority rule to obtain the longest execution time corresponding to the selection node selected for access currently; and selecting the selected nodes not less than the preset sequence according to a depth-first rule for access to obtain the longest execution time corresponding to the selected nodes selected to be accessed currently. Optionally, the preset values of the corresponding sequences are determined according to empirical values, or according to pre-experimental results.
When an access time period is set for instruction list scheduling, an evaluation module of the instruction list scheduling apparatus does not have enough time to traverse all the selection nodes, and at this time, if a single rule of depth-first or breadth-first is adopted to select the selection nodes for access, the range of the selection nodes to be accessed finally may be relatively unilateral (for example, only the selection nodes associated with a certain selection node are accessed, or only the selection nodes in the first few orders are accessed), but the randomness of the selection nodes to be accessed finally when the selection nodes are selected for access only by adopting the rule of random preference is too strong, so the scheme that the selection nodes are selected for access by adopting the rule of random preference in combination with depth-first, or the selection nodes are selected for access by adopting the rule of breadth-first in combination with depth-first.
It should be understood that although the various steps in the flowchart of fig. 2 are shown as indicated by the arrows, the steps are not necessarily performed in the order indicated by the arrows. The steps are not performed in the exact order shown and described, and may be performed in other orders, unless explicitly stated otherwise. Moreover, at least some of the steps in fig. 1 and 3 may include multiple sub-steps or multiple stages that are not necessarily performed at the same time, but may be performed at different times, and the order of performing the sub-steps or stages is not necessarily performed, but may be performed alternately or alternately with other steps or at least some of the sub-steps or stages of other steps.
Fig. 5 is a schematic structural diagram of an instruction list scheduling apparatus proposed in one embodiment, where the apparatus includes: an acquisition module 510, a data dependency analysis module 520, an evaluation module 530,
the obtaining module 510 is configured to obtain a set of instructions to be scheduled in the instruction list to be scheduled, and obtain all selection nodes for performing instruction selection each time in the instruction scheduling process according to a data dependency relationship between the instructions.
The data dependency analysis module 520 is configured to perform data dependency analysis on the instruction set to be scheduled, so as to obtain a data dependency relationship between instructions in the instruction set to be scheduled.
The evaluation module 530 is configured to determine, according to a preset rule, instructions in each order in the scheduled instruction list according to the selected node in the corresponding order.
In one embodiment, the evaluating module 530 accesses the selected node and obtains the longest execution time corresponding to the currently accessed selected node; if the longest execution time corresponding to the currently accessed selection node is smaller than the initial execution time, determining the sequenced instructions of the currently accessed selection node as instructions in the corresponding sequence in a scheduled instruction list; the initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled.
In one embodiment, the instruction scheduling apparatus further includes an updating module 540, where the updating module is configured to update the initial execution time to the longest execution time corresponding to the currently accessed selected node if the longest execution time corresponding to the currently accessed selected node is less than the initial execution time.
In one embodiment, the evaluating module 530 is configured to access the selection node within a preset access time period, and obtain a longest execution time corresponding to the currently accessed selection node; if the longest execution time corresponding to the currently accessed selection node is smaller than the initial execution time, determining the ordered instructions corresponding to the currently accessed selection node as the instructions in the corresponding order in the scheduled instruction list; the initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled.
In one embodiment, the evaluating module 530 is configured to use the instruction sequence in the instruction table to be scheduled as the instruction sequence in the instruction table after scheduling when the longest execution time corresponding to the currently accessed selection node is not less than the initial execution time.
In one embodiment, the evaluating module 530 is configured to select the selected node for access according to a random priority rule, and obtain the longest execution time corresponding to the selected node that is currently selected for access.
In one embodiment, the evaluation module 530 is configured to select the selected node for access according to a breadth-first rule, and obtain the longest execution time corresponding to the selected node that is currently selected for access.
In one embodiment, the evaluating module 530 is configured to select the selected node for access according to a depth-first rule, and obtain the longest execution time corresponding to the selected node that is currently selected for access.
In one embodiment, the evaluation module 530 is configured to select the selection nodes smaller than the preset order for access according to a breadth or random priority rule, so as to obtain the longest execution time corresponding to the selection node currently selected for access; and selecting the selected nodes not less than the preset sequence according to a depth-first rule for access to obtain the longest execution time corresponding to the selected nodes selected to be accessed currently.
In one embodiment, the evaluating module 530 is configured to obtain a shortest execution time corresponding to a currently accessed selection node; if the shortest execution time corresponding to the currently accessed selection node is larger than the initial execution time, terminating the access of the selection node associated with the currently accessed selection node; the initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled.
In one embodiment, the evaluating module 530 is configured to evaluate all the selected nodes corresponding to the current order according to the preset priority of the instruction, obtain an evaluation result of each selected node in the current order, and determine the instruction corresponding to the current order according to the evaluation result.
In one embodiment, the evaluation module 530 is configured to set the priority of each instruction according to the specific content and/or type of the currently selected node.
In one embodiment, the evaluating module 530 is configured to determine the instruction corresponding to the current order according to the shortest execution time corresponding to all the selection nodes in the current order.
For specific limitations of the instruction list scheduling apparatus, reference may be made to the above limitations of the instruction list scheduling method, which is not described herein again. The modules in the instruction list scheduling device can be wholly or partially implemented by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement the verification stimulus generation method and/or the chip verification method mentioned in the above embodiments. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring a to-be-scheduled instruction set in a to-be-scheduled instruction list, and performing data dependency analysis on the to-be-scheduled instruction set to obtain a data dependency relationship among instructions; obtaining all selection nodes for instruction selection each time in the instruction scheduling process according to the data dependency relationship among the instructions; and according to a preset rule, determining the instructions in each order in the scheduled instruction list according to the selected nodes in the corresponding order.
In one embodiment, the processor, when executing the computer program, further performs the steps of: accessing the selection node and acquiring the longest execution time corresponding to the currently accessed selection node; if the longest execution time corresponding to the currently accessed selection node is smaller than the initial execution time, determining the sequenced instructions of the currently accessed selection node as the instructions in the corresponding sequence in the scheduled instruction list; the initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled.
In one embodiment, the processor, when executing the computer program, further performs the steps of: and if the longest execution time corresponding to the currently accessed selection node is less than the initial execution time, updating the initial execution time to the longest execution time corresponding to the currently accessed selection node.
In one embodiment, the processor, when executing the computer program, further performs the steps of: and if the longest execution time corresponding to the currently accessed selection node is less than the initial execution time, randomly generating an instruction sequence based on the ordered instructions corresponding to the currently accessed selection node, and updating the instruction sequence of the instruction list to be scheduled by using the randomly generated instruction sequence.
In one embodiment, the processor, when executing the computer program, further performs the steps of: accessing the selection node in a preset access time period, and acquiring the longest execution time corresponding to the currently accessed selection node; if the longest execution time corresponding to the currently accessed selection node is smaller than the initial execution time, determining the ordered instructions corresponding to the currently accessed selection node as the instructions in the corresponding order in the scheduled instruction list; the initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled.
In one embodiment, the processor, when executing the computer program, further performs the steps of: and selecting the selected node for access according to a breadth-first rule, and acquiring the longest execution time corresponding to the selected node which is selected to be accessed currently.
In one embodiment, the processor, when executing the computer program, further performs the steps of: and selecting the selected node to access according to a random priority rule, and acquiring the longest execution time corresponding to the selected node which is selected to access currently.
In one embodiment, the processor, when executing the computer program, further performs the steps of: and selecting the selected node for access according to a breadth-first rule, and acquiring the longest execution time corresponding to the selected node which is selected to be accessed currently.
In one embodiment, the processor, when executing the computer program, further performs the steps of: selecting the selected nodes with the sequence less than the preset sequence for access according to a breadth or random priority rule to obtain the longest execution time corresponding to the selected node which is selected for access currently; and selecting the selected nodes not less than the preset sequence according to a depth-first rule for access to obtain the longest execution time corresponding to the selected nodes selected to be accessed currently.
In one embodiment, the processor, when executing the computer program, further performs the steps of: obtaining the shortest execution time corresponding to the currently accessed selection node; if the shortest execution time corresponding to the currently accessed selection node is greater than the initial execution time, terminating accessing the selection node associated with the currently accessed selection node; the initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled.
In one embodiment, the processor, when executing the computer program, further performs the steps of: and evaluating all the selected nodes corresponding to the current order according to the preset priority of the instruction to obtain the evaluation result of each selected joint in the current order, and determining the instruction corresponding to the current order according to the evaluation result.
In one embodiment, the processor, when executing the computer program, further performs the steps of: the priority of each instruction is set according to the specific content and/or type of the currently selected node.
In one embodiment, the processor, when executing the computer program, further performs the steps of: and determining the instruction corresponding to the current sequence according to the length of the shortest execution time corresponding to all the selected nodes in the current sequence.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of: acquiring a to-be-scheduled instruction set in a to-be-scheduled instruction list, and performing data dependency analysis on the to-be-scheduled instruction set to obtain a data dependency relationship among instructions; obtaining all selection nodes for instruction selection each time in the instruction scheduling process according to the data dependency relationship among the instructions; and according to a preset rule, determining the instructions in each order in the scheduled instruction list according to the selected nodes in the corresponding order.
In one embodiment, the computer program when executed by the processor further performs the steps of: accessing the selection node and acquiring the longest execution time corresponding to the currently accessed selection node; if the longest execution time corresponding to the currently accessed selection node is smaller than the initial execution time, determining the sequenced instructions of the currently accessed selection node as the instructions in the corresponding sequence in the scheduled instruction list; the initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled.
In one embodiment, the computer program when executed by the processor further performs the steps of: and if the longest execution time corresponding to the currently accessed selection node is less than the initial execution time, updating the initial execution time to the longest execution time corresponding to the currently accessed selection node.
In one embodiment, the computer program when executed by the processor further performs the steps of: accessing the selection node in a preset access time period, and acquiring the longest execution time corresponding to the currently accessed selection node; if the longest execution time corresponding to the currently accessed selection node is smaller than the initial execution time, determining the ordered instructions corresponding to the currently accessed selection node as the instructions in the corresponding order in the scheduled instruction list; the initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled.
In one embodiment, the computer program when executed by the processor further performs the steps of: and if the longest execution time corresponding to the currently accessed selection node is not less than the initial execution time, taking the instruction sequence in the instruction list to be scheduled as the instruction sequence in the instruction list after scheduling.
In one embodiment, the computer program when executed by the processor further performs the steps of: and selecting the selected node to access according to a random priority rule, and acquiring the longest execution time corresponding to the selected node which is selected to access currently.
In one embodiment, the computer program when executed by the processor further performs the steps of: and selecting the selected node for access according to a depth-first rule, and acquiring the longest execution time corresponding to the selected node which is selected to be accessed currently.
In one embodiment, the computer program when executed by the processor further performs the steps of: and selecting the selected node for access according to a breadth-first rule, and acquiring the longest execution time corresponding to the selected node which is selected to be accessed currently.
In one embodiment, the computer program when executed by the processor further performs the steps of: selecting the selected nodes with the sequence less than the preset sequence for access according to a breadth or random priority rule to obtain the longest execution time corresponding to the selected node which is selected for access currently; and selecting the selected nodes not less than the preset sequence according to a depth-first rule for access to obtain the longest execution time corresponding to the selected nodes selected to be accessed currently.
In one embodiment, the computer program when executed by the processor further performs the steps of: obtaining the shortest execution time corresponding to the currently accessed selection node; if the shortest execution time corresponding to the currently accessed selection node is greater than the initial execution time, terminating accessing the selection node associated with the currently accessed selection node; the initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled.
In one embodiment, the computer program when executed by the processor further performs the steps of: and evaluating all the selected nodes corresponding to the current order according to the preset priority of the instruction to obtain the evaluation result of each selected joint in the current order, and determining the instruction corresponding to the current order according to the evaluation result.
In one embodiment, the computer program when executed by the processor further performs the steps of: the priority of each instruction is set according to the specific content and/or type of the currently selected node.
In one embodiment, the computer program when executed by the processor further performs the steps of: and determining the instruction corresponding to the current sequence according to the length of the shortest execution time corresponding to all the selected nodes in the current sequence.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (12)

1.一种指令列表调度方法,其特征在于,包括:1. An instruction list scheduling method, comprising: 获取待调度指令列表中的待调度指令集,并对所述待调度指令集进行数据依赖分析,得到所述待调度指令集中各指令之间的数据依赖关系;Obtaining the instruction set to be scheduled in the instruction set to be scheduled, and performing data dependency analysis on the instruction set to be scheduled, to obtain the data dependency relationship between the instructions in the instruction set to be scheduled; 根据各指令之间的所述数据依赖关系,得到指令调度过程中每次进行指令选择的所有选择节点;According to the data dependencies between the instructions, all selected nodes for each instruction selection in the instruction scheduling process are obtained; 按照预设规则,根据对应次序的所述选择节点确定调度后指令列表中各次序的指令;According to the preset rule, according to the selection node of the corresponding order, determine the instructions of each order in the scheduled instruction list; 其中,所述按照预设规则,根据对应次序的所述选择节点确定调度后指令列表中各次序的指令,包括:Wherein, according to the preset rules, according to the selection node in the corresponding order, the instructions in each order in the scheduled instruction list are determined, including: 获取当前访问的选择节点对应的最长执行时间;Get the longest execution time corresponding to the currently accessed selected node; 若当前访问的所述选择节点对应的最长执行时间小于初始执行时间,则将当前访问的选择节点的已排序指令确定为调度后的指令列表中对应次序的指令;If the longest execution time corresponding to the currently accessed selection node is less than the initial execution time, then the sorted instructions of the currently accessed selection node are determined as the instructions in the corresponding order in the scheduled instruction list; 其中,初始执行时间为待调度指令列表中指令序列的执行时间。The initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled. 2.根据权利要求1所述的方法,其特征在于,所述按照预设规则,根据对应次序的所述选择节点确定调度后指令列表中各次序的指令的步骤包括:2. The method according to claim 1, wherein the step of determining the instructions in each order in the scheduled instruction list according to the selection node in the corresponding order according to a preset rule comprises: 访问所述选择节点;accessing the selection node; 对访问的当前次序对应的选择节点进行评估,得当前次序的各选择节点的评估结果,根据评估结果确定当前次序对应的指令。Evaluate the selection nodes corresponding to the current order of access, obtain evaluation results of each selection node in the current order, and determine the instructions corresponding to the current order according to the evaluation results. 3.根据权利要求1所述的方法,其特征在于,按照预设规则,根据对应次序的选择节点中确定调度后指令列表中各次序的指令的步骤包括:3. The method according to claim 1, wherein, according to a preset rule, the step of determining the instructions of each order in the scheduled instruction list according to the selection node of the corresponding order comprises: 按照指令的预设优先级评估当前次序对应的所有选择节点,得到当前次序的各选择接点的评估结果,并根据所述评估结果确定当前次序对应的指令。All selection nodes corresponding to the current order are evaluated according to the preset priorities of the instructions, an evaluation result of each selection node in the current order is obtained, and an instruction corresponding to the current order is determined according to the evaluation results. 4.根据权利要求3所述的方法,其特征在于,所述方法还包括:4. The method according to claim 3, wherein the method further comprises: 根据当前选择节点的具体内容和/或类型设定各指令的优先级。The priority of each command is set according to the specific content and/or type of the currently selected node. 5.根据权利要求2所述的方法,其特征在于,对访问的当前次序对应的选择节点进行评估,得当前次序的各选择节点的评估结果,根据评估结果确定当前次序对应的指令包括:5. method according to claim 2, is characterized in that, the corresponding selection node of the current order of visit is evaluated, obtains the evaluation result of each selection node of current order, according to the evaluation result, it is determined that the corresponding instruction of current order comprises: 根据当前次序所有的选择节点对应的最短执行时间的长短,确定当前次序对应的指令。According to the length of the shortest execution time corresponding to all selected nodes in the current order, the instruction corresponding to the current order is determined. 6.根据权利要求5所述的方法,其特征在于,所述方法还包括:6. The method according to claim 5, wherein the method further comprises: 若当前访问的选择节点对应的最短执行时间大于初始执行时间,则终止访问与当前访问的选择节点关联的选择节点;If the shortest execution time corresponding to the currently accessed selection node is greater than the initial execution time, the access to the selection node associated with the currently accessed selection node is terminated; 其中,初始执行时间为待调度指令列表中指令序列的执行时间。The initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled. 7.根据权利要求1所述的方法,其特征在于,所述方法包括:7. The method of claim 1, wherein the method comprises: 访问所述选择节点,若当前访问的选择节点对应的最长执行时间小于初始执行时间,则初始执行时间更新为当前访问的选择节点对应的最长执行时间。Visit the selection node, if the longest execution time corresponding to the currently accessed selection node is less than the initial execution time, the initial execution time is updated to the longest execution time corresponding to the currently accessed selection node. 8.根据权利要求2所述的方法,其特征在于,所述访问所述选择节点的步骤包括:8. The method of claim 2, wherein the step of accessing the selected node comprises: 在预设访问时间段内访问选择节点。Visit selected nodes within a preset visit time period. 9.根据权利要求2-8任一项所述的方法,其特征在于,访问所述选择节点,包括:9. The method according to any one of claims 2-8, wherein accessing the selection node comprises: 按照随机优先的规则选择所述选择节点进行访问;或者,The selected node is selected for access according to a random priority rule; or, 按照广度优先的规则选择所述选择节点进行访问;或者,The selected node is selected for access according to a breadth-first rule; or, 按照深度优先的规则选择所述选择节点进行访问;或者,The selected node is selected for access according to a depth-first rule; or, 按照广度或随机优先的规则选择小于预设次序的所述选择节点进行访问,并按照深度优先的规则选择不小于预设次序的所述选择节点进行访问。According to the rule of breadth or random priority, the selection nodes that are smaller than the preset order are selected for access, and the selection nodes that are not smaller than the preset order are selected to be accessed according to the rule of depth first. 10.一种指令调度装置,其特征在于,包括:获取模块、数据依赖分析模块、评估模块,10. An instruction scheduling device, comprising: an acquisition module, a data dependency analysis module, and an evaluation module, 所述获取模块,用于获取待调度指令列表中的待调度指令集,以及根据各指令之间的数据依赖关系,得到指令调度过程中每次指令选择对应的所有选择节点;The obtaining module is used to obtain the to-be-scheduled instruction set in the to-be-scheduled instruction list, and to obtain all selection nodes corresponding to each instruction selection in the instruction scheduling process according to the data dependencies between the instructions; 所述数据依赖分析模块,用于对待调度指令集进行数据依赖分析,得到各指令之间的数据依赖关系;The data dependency analysis module is used for performing data dependency analysis on the instruction set to be scheduled to obtain the data dependency relationship between the instructions; 所述评估模块,用于按照预设规则,根据对应次序的选择节点中确定调度后指令列表中各次序的指令;The evaluation module is used to determine, according to preset rules, the instructions of each order in the scheduled instruction list according to the selection nodes of the corresponding order; 其中,所述评估模块具体用于获取当前访问的选择节点对应的最长执行时间;若当前访问的所述选择节点对应的最长执行时间小于初始执行时间,则将当前访问的选择节点的已排序指令确定为调度后的指令列表中对应次序的指令;其中,初始执行时间为待调度指令列表中指令序列的执行时间。Wherein, the evaluation module is specifically used to obtain the longest execution time corresponding to the currently accessed selection node; if the longest execution time corresponding to the currently accessed selection node is less than the initial execution time, the currently accessed selection node The ordering instruction is determined as an instruction in the corresponding order in the scheduled instruction list; wherein, the initial execution time is the execution time of the instruction sequence in the instruction list to be scheduled. 11.一种计算机设备,包括存储器、处理器,及存储在存储器上并可在处理器上运行的计算机程序,其特征在于,所述处理器执行如权利要求1-9任一项所述的方法的步骤。11. A computer device, comprising a memory, a processor, and a computer program stored in the memory and running on the processor, wherein the processor executes the method according to any one of claims 1-9. steps of the method. 12.一种计算机可读存储介质,其特征在于,所述计算机可读存储介质上存储有计算机程序,当所述计算机程序被处理器执行时,实现如权利要求1-9任一项所述的方法的步骤。12. A computer-readable storage medium, characterized in that, a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, any one of claims 1-9 is implemented. steps of the method.
CN201911214046.2A 2017-12-29 2017-12-29 Instruction list scheduling method, device, computer equipment and storage medium Active CN111104169B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911214046.2A CN111104169B (en) 2017-12-29 2017-12-29 Instruction list scheduling method, device, computer equipment and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711484410.8A CN109992307B (en) 2017-12-29 2017-12-29 Instruction list scheduling method and device, computer equipment and storage medium
CN201911214046.2A CN111104169B (en) 2017-12-29 2017-12-29 Instruction list scheduling method, device, computer equipment and storage medium

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201711484410.8A Division CN109992307B (en) 2017-11-20 2017-12-29 Instruction list scheduling method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111104169A CN111104169A (en) 2020-05-05
CN111104169B true CN111104169B (en) 2021-01-12

Family

ID=67111244

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201911214046.2A Active CN111104169B (en) 2017-12-29 2017-12-29 Instruction list scheduling method, device, computer equipment and storage medium
CN201711484410.8A Active CN109992307B (en) 2017-11-20 2017-12-29 Instruction list scheduling method and device, computer equipment and storage medium

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201711484410.8A Active CN109992307B (en) 2017-11-20 2017-12-29 Instruction list scheduling method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (2) CN111104169B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113204373A (en) * 2019-07-24 2021-08-03 中科寒武纪科技股份有限公司 Operation method, device and related product
CN112947997B (en) * 2019-12-11 2024-08-30 阿里巴巴集团控股有限公司 Data processing method and device, instruction fusion method and code generation method
CN111124500B (en) * 2019-12-12 2022-03-08 浪潮(北京)电子信息产业有限公司 Instruction execution method, apparatus, device and storage medium
CN111190401A (en) * 2019-12-30 2020-05-22 讯飞智元信息科技有限公司 Instruction scheduling method, hydraulic engineering control method, equipment and medium
CN114490012B (en) * 2020-11-13 2025-03-21 中国移动通信集团设计院有限公司 Memory scheduling method, device and readable storage medium
CN113312085B (en) * 2021-04-29 2023-02-03 华人运通(上海)云计算科技有限公司 Instruction block processing method, device, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0451562A2 (en) * 1990-04-04 1991-10-16 International Business Machines Corporation Data dependency collapsing hardware apparatus
CN1706111A (en) * 2002-10-15 2005-12-07 摩托罗拉公司 Method and apparatus for limiting a transmission in a dispatch system
CN101946233A (en) * 2008-02-22 2011-01-12 高通股份有限公司 System and method for instruction latency reduction in graphics processing
CN106814993A (en) * 2015-12-01 2017-06-09 广州神马移动信息科技有限公司 The method for determining the task scheduling time is, the method and apparatus for determining task execution time

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000165375A (en) * 1998-11-30 2000-06-16 Hitachi Ltd Information processing device, IC card
US6643769B1 (en) * 2000-08-23 2003-11-04 Hewlett-Packard Development Company, L.P. System and method for enabling selective execution of computer code
JP4196614B2 (en) * 2002-08-22 2008-12-17 パナソニック株式会社 Instruction scheduling method, instruction scheduling apparatus, and program
US7100157B2 (en) * 2002-09-24 2006-08-29 Intel Corporation Methods and apparatus to avoid dynamic micro-architectural penalties in an in-order processor
US7240082B2 (en) * 2003-07-07 2007-07-03 Faraday Technology Corp. Method for processing efficiency in a pipeline architecture
KR101335001B1 (en) * 2007-11-07 2013-12-02 삼성전자주식회사 Processor and instruction scheduling method
CN102063288A (en) * 2011-01-07 2011-05-18 四川九洲电器集团有限责任公司 DSP (Digital Signal Processing) chip-oriented instruction scheduling method
US9529596B2 (en) * 2011-07-01 2016-12-27 Intel Corporation Method and apparatus for scheduling instructions in a multi-strand out of order processor with instruction synchronization bits and scoreboard bits
US9304749B2 (en) * 2013-09-12 2016-04-05 Marvell World Trade Ltd. Method and system for instruction scheduling
CN103577242B (en) * 2013-11-14 2016-11-02 中国科学院声学研究所 Control Flow Graph Refactoring Method for Scheduled Assembly Code
CN104461471B (en) * 2014-12-19 2018-06-15 中国人民解放军国防科学技术大学 Unified instruction scheduling and register allocation method on sub-clustering vliw processor
CN104699464B (en) * 2015-03-26 2017-12-26 中国人民解放军国防科学技术大学 A kind of instruction level parallelism dispatching method based on dependence grid
US10372458B2 (en) * 2015-04-01 2019-08-06 Huawei Technologies Co., Ltd Method and apparatus for a self-clocked, event triggered superscalar processor

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0451562A2 (en) * 1990-04-04 1991-10-16 International Business Machines Corporation Data dependency collapsing hardware apparatus
CN1706111A (en) * 2002-10-15 2005-12-07 摩托罗拉公司 Method and apparatus for limiting a transmission in a dispatch system
CN101946233A (en) * 2008-02-22 2011-01-12 高通股份有限公司 System and method for instruction latency reduction in graphics processing
CN106814993A (en) * 2015-12-01 2017-06-09 广州神马移动信息科技有限公司 The method for determining the task scheduling time is, the method and apparatus for determining task execution time

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于表调度的 MatrixDSP指令调度算法的实现;罗杰等;《计算机工程与科学》;20130831;第35卷(第8期);第25-30页 *

Also Published As

Publication number Publication date
CN109992307B (en) 2020-05-05
CN111104169A (en) 2020-05-05
CN109992307A (en) 2019-07-09

Similar Documents

Publication Publication Date Title
CN111104169B (en) Instruction list scheduling method, device, computer equipment and storage medium
US11113103B2 (en) Task parallel processing method, apparatus and system, storage medium and computer device
Becker et al. Contention-free execution of automotive applications on a clustered many-core platform
Schranzhofer et al. Timing analysis for TDMA arbitration in resource sharing systems
US9292359B2 (en) System and method for memory management
US9009711B2 (en) Grouping and parallel execution of tasks based on functional dependencies and immediate transmission of data results upon availability
CN102834807B (en) The method and apparatus of multicomputer system load balancing
US8996811B2 (en) Scheduler, multi-core processor system, and scheduling method
Breß et al. Efficient co-processor utilization in database query processing
Liu et al. GPU-based parallelization for fast circuit optimization
Holst et al. High-throughput logic timing simulation on GPGPUs
Iturbe et al. Runtime Scheduling, Allocation, and Execution of Real‐Time Hardware Tasks onto Xilinx FPGAs Subject to Fault Occurrence
Oehlert et al. Bus-aware static instruction SPM allocation for multicore hard real-time systems
Beaumont et al. Comparison of static and runtime resource allocation strategies for matrix multiplication
Inggs et al. CTL* model checking on a shared-memory architecture
CN103713944A (en) Kernel-level thread processing method, device and system
Lázaro-Muñoz et al. A tasks reordering model to reduce transfers overhead on GPUs
Walter et al. Real-time Scheduling of I/O Transfers for Massively Parallel Processor Arrays
Cole et al. Efficient resource oblivious algorithms for multicores
Jungklass et al. Memopt: Automated memory distribution for multicore microcontrollers with hard real-time requirements
Alhammad Memory Efficient Scheduling for Multicore Real-time Systems
Zhenyang et al. Scheduling parallel real-time tasks for multicore systems
Ma Modeling algorithm performance on highly-threaded many-core architectures
Ezekiel et al. To Parallelize or to Optimize?
Kang et al. RT-MDM: Real-Time Scheduling Framework for Multi-DNN on MCU Using External Memory

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant