
CN117331655A - Multi-thread scheduling method and device - Google Patents

Multi-thread scheduling method and device

Info

Publication number
CN117331655A
CN117331655A (application CN202210738293.8A)
Authority
CN
China
Prior art keywords
thread
state
message
linked list
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210738293.8A
Other languages
Chinese (zh)
Inventor
沈洋
徐金林
牛新伟
韩建辉
李铮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sanechips Technology Co Ltd
Original Assignee
Sanechips Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sanechips Technology Co Ltd filed Critical Sanechips Technology Co Ltd
Priority to CN202210738293.8A (published as CN117331655A)
Priority to PCT/CN2023/087477 (published as WO2024001411A1)
Publication of CN117331655A
Legal status: Pending


Classifications

    • G06F9/48 — Program initiating; program switching, e.g. by interrupt
    • G06F9/4881 — Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/50 — Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5027 — Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the present invention provide a multi-thread scheduling method and device. The method includes: after each packet enters the processor core, storing the thread number carried by each packet, in order, into the thread management linked list corresponding to the thread group to which the packet belongs, and establishing a mapping relationship between thread numbers and nodes of the thread management linked list; and, according to the mapping relationship and the state of the thread state machine corresponding to each thread, scheduling a target thread in the executable state out of the thread group in the order in which packets entered the processor core, and feeding the target thread into the pipeline corresponding to the target thread. In the present invention, the thread scheduling scheme is optimized by introducing a thread management linked list, ensuring that packets that enter the processor core first are scheduled and executed first. This solves the problem in the related art that packets cannot be guaranteed to be forwarded in the order in which they entered the processor core, thereby reducing packet execution latency.

Description

Multi-thread scheduling method and device

Technical Field

Embodiments of the present invention relate to the technical field of network processors, and in particular to a multi-thread scheduling method and device.

Background

With the rapid development of communication technology, the network processor, as the core component for data forwarding in digital communication, is applied to tasks such as packet processing, protocol analysis, route lookup, voice/data convergence, and firewalls.

To keep pace with evolving network technology, ever higher demands are placed on the processing capability of network processors. Traditional network processors adopt a fine-grained multi-threading architecture: while parallel processing raises the data-processing parallelism of the micro-engine core, thread switching is used to hide pipeline and memory latency and thus increase throughput. Fine-grained multi-threading switches between threads every clock cycle, interleaving the instruction streams of multiple threads. This interleaving typically uses a Round-Robin scheduling algorithm that schedules ready threads by thread number, so it cannot guarantee that packets that enter the micro-engine first are forwarded first, and a ready thread with no stall may be delayed by the execution of other threads, slowing down the execution of individual threads.
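As a minimal illustration of that limitation (names and sizes below are assumptions introduced for illustration, not taken from the patent), a round-robin pick scans thread numbers starting after the last granted thread and ignores the order in which the corresponding packets arrived:

```c
#include <stdbool.h>

#define NUM_THREADS 10

/* Round-robin pick: scan thread numbers starting after the last grant.
 * Packet arrival order plays no role, which is the ordering problem
 * the invention addresses. */
static int rr_pick(const bool ready[NUM_THREADS], int last_grant)
{
    for (int step = 1; step <= NUM_THREADS; step++) {
        int tid = (last_grant + step) % NUM_THREADS;
        if (ready[tid])
            return tid;            /* next ready thread by number */
    }
    return -1;                     /* no thread is ready this cycle */
}
```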

Summary

Embodiments of the present invention provide a multi-thread scheduling method and device, so as to at least solve the problem in the related art that packets entering the micro-engine first cannot be guaranteed to be forwarded first in the order in which they entered.

According to an embodiment of the present invention, a multi-thread scheduling method is provided, including: after each packet enters the processor core, storing the thread number carried by each packet, in order, into the thread management linked list corresponding to the thread group to which the packet belongs, and establishing a mapping relationship between the thread number and nodes of the thread management linked list;

and, according to the mapping relationship and the state of the thread state machine corresponding to each thread, scheduling a target thread in the executable state out of the thread group in the order in which packets entered the processor core, and feeding the target thread into the pipeline corresponding to the target thread.

In an exemplary embodiment, before storing the thread number carried by each packet into the thread management linked list corresponding to the thread group to which the packet belongs, the method further includes: assigning a thread number to each packet entering the processor core, and dividing all threads into thread groups corresponding to the number of pipelines.

In an exemplary embodiment, each thread group corresponds to one thread management linked list, and the number of nodes in each thread management linked list equals the number of threads contained in the thread group.

In an exemplary embodiment, the mapping relationship between the nodes of the thread management linked list and the thread numbers is represented as a bitmap.

In an exemplary embodiment, scheduling a target thread in the executable state out of the thread group in the order in which packets entered the processor core according to the mapping relationship and the state of each thread, and feeding the target thread into the pipeline corresponding to the target thread, includes: computing an issue request for each node from the value of the bitmap and the ready status of each thread in the thread group; performing priority scheduling among the threads with issue requests, so that the thread that entered the processor core first and is in the ready state is granted, transitions to the executable state, and is scheduled as the target thread; and fetching the instruction corresponding to the target thread, and feeding the target thread into the pipeline corresponding to the target thread to execute the instruction.
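A minimal sketch of this step, assuming a 10-node group in which node i holds the thread number of the i-th oldest packet and a bitmap marks occupied nodes (all names and widths are illustrative, not from the patent): each node's issue request is the AND of its occupancy bit and the ready bit of the thread it maps to, and the strict-priority grant goes to the lowest-indexed (oldest) requesting node.

```c
#include <stdbool.h>
#include <stdint.h>

#define NODES_PER_GROUP 10

/* Compute per-node issue requests and grant the oldest one (node0 = oldest packet). */
static int sp_grant(uint16_t bitmap,                        /* bit i set: node i holds a thread number */
                    const int node_thread[NODES_PER_GROUP], /* node -> thread number mapping           */
                    const bool ready[])                     /* per-thread ready (rdy) status           */
{
    for (int i = 0; i < NODES_PER_GROUP; i++) {
        bool occupied = (bitmap >> i) & 1u;
        bool request  = occupied && ready[node_thread[i]];  /* issue request of node i */
        if (request)
            return node_thread[i];   /* strict priority: earliest-arrived ready thread wins */
    }
    return -1;                        /* no ready thread in this group */
}
```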

In an exemplary embodiment, after feeding the target thread into the pipeline corresponding to the target thread to execute the instruction, the method further includes: scheduling the packet whose instructions have finished executing out of the processor core and releasing the thread corresponding to that packet; and clearing the thread number of that packet from the node of the thread management linked list, and moving the other thread numbers stored in the nodes of the thread management linked list forward by one node in sequence.

In an exemplary embodiment, each pipeline corresponds to one master state machine and each thread corresponds to one thread state machine; each pipeline transitions between two states, idle and grant, and each thread transitions among four states: idle, ready, executable, and waiting.

In an exemplary embodiment, each pipeline transitioning between the idle and grant states and each thread transitioning among the idle, ready, executable, and waiting states includes: when the master state machine is in the idle state, new packets are allowed to enter the processor core, and after a new packet enters the processor core, the corresponding thread is in the idle state; after the thread number of the packet is stored in a node of the thread management linked list and the instruction corresponding to the thread is fetched from the instruction storage module, the thread transitions from the idle state to the ready state; in the grant state of the master state machine, a thread in the ready state is granted, and the granted thread transitions from the ready state to the executable state; a thread in the executable state transitions from the executable state to the idle state after executing its corresponding instructions; when a thread in the executable state must wait for data, wait for a table lookup, or re-fetch instructions during instruction execution, the thread transitions from the executable state to the waiting state; after the data wait of a thread in the waiting state ends, or its table-lookup result returns, or its instruction re-fetch returns, the thread transitions from the waiting state back to the ready state; and after the thread number of a thread that has finished executing its instructions is released, the master state machine enters the idle state.
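The thread-state set and the transitions described above can be sketched as follows in C; the event names are assumptions introduced only for illustration, since the patent defines the four thread states and the two pipeline states but not a software API.

```c
/* Thread states defined in the embodiment. */
typedef enum { T_IDLE, T_READY, T_EXEC, T_WAIT } thread_state_t;

/* Events driving the transitions, named here only for illustration. */
typedef enum {
    EV_INSTR_FETCHED,   /* thread number stored in a list node, instruction returned */
    EV_GRANTED,         /* granted by the master state machine in its grant state    */
    EV_DONE,            /* packet-send instruction finished, thread number released  */
    EV_STALL,           /* data wait, table-lookup wait, or instruction re-fetch     */
    EV_STALL_CLEARED    /* wait ended / lookup result returned / re-fetch returned   */
} thread_event_t;

static thread_state_t thread_step(thread_state_t s, thread_event_t ev)
{
    switch (s) {
    case T_IDLE:  return (ev == EV_INSTR_FETCHED) ? T_READY : s;
    case T_READY: return (ev == EV_GRANTED)       ? T_EXEC  : s;
    case T_EXEC:  if (ev == EV_DONE)  return T_IDLE;
                  if (ev == EV_STALL) return T_WAIT;
                  return s;
    case T_WAIT:  return (ev == EV_STALL_CLEARED) ? T_READY : s;
    }
    return s;
}
```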

According to another embodiment of the present invention, a multi-thread scheduling device is provided, including: an establishing module configured to, after each packet enters the processor core, store the thread number carried by each packet, in order, into the thread management linked list corresponding to the thread group to which the packet belongs, and establish a mapping relationship between thread numbers and nodes of the thread management linked list; and a scheduling module configured to, according to the mapping relationship and the state of the thread state machine corresponding to each thread, schedule a target thread in the executable state out of the thread group in the order in which packets entered the processor core, and feed the target thread into the pipeline corresponding to the target thread.

In an exemplary embodiment, the device further includes an allocation module configured to assign a thread number to each packet entering the processor core and divide all threads into thread groups corresponding to the number of pipelines.

In an exemplary embodiment, each thread group corresponds to one thread management linked list, and the number of nodes in each thread management linked list equals the number of threads contained in the thread group.

In an exemplary embodiment, the device further includes a release module configured to schedule the packet that has finished executing the instructions corresponding to the target thread out of the processor core, release the thread corresponding to that packet, clear the thread number of that packet from the node of the thread management linked list, and move the other thread numbers stored in the nodes of the thread management linked list forward by one node in sequence.

In an exemplary embodiment, each pipeline corresponds to one master state machine and each thread corresponds to one thread state machine; each pipeline transitions between two states, idle and grant, and each thread transitions among four states: idle, ready, executable, and waiting.

According to yet another embodiment of the present invention, a computer-readable storage medium is provided, in which a computer program is stored, wherein the computer program is configured to perform, when run, the steps of any one of the above method embodiments.

According to yet another embodiment of the present invention, an electronic device is provided, including a memory and a processor, wherein a computer program is stored in the memory and the processor is configured to run the computer program to perform the steps of any one of the above method embodiments.

Through the present invention, the thread scheduling scheme is optimized by introducing a thread management linked list, ensuring that packets that enter the processor core first are scheduled and executed first. This solves the problem in the related art that packets cannot be guaranteed to be forwarded in the order in which they entered the core, thereby reducing packet execution latency.

Brief Description of the Drawings

Figure 1 is a block diagram of the hardware structure of a computer terminal running the multi-thread scheduling method according to an embodiment of the present invention;

Figure 2 is a flowchart of a multi-thread scheduling method according to an embodiment of the present invention;

Figure 3 is a structural block diagram of a multi-thread scheduling device according to an embodiment of the present invention;

Figure 4 is a structural block diagram of a multi-thread scheduling device according to another embodiment of the present invention;

Figure 5 is a structural block diagram of a multi-thread scheduling device according to yet another embodiment of the present invention;

Figure 6 is a schematic structural diagram of a coarse-grained multi-thread scheduling device according to an embodiment of the present invention;

Figure 7 is a schematic diagram of the correspondence between threads and thread management linked lists according to an embodiment of the present invention;

Figure 8 is a schematic diagram of thread state transitions according to an embodiment of the present invention;

Figure 9 is a flowchart of performing coarse-grained multi-thread scheduling according to an embodiment of the present invention.

Detailed Description

Embodiments of the present invention will be described in detail below with reference to the accompanying drawings and in combination with the embodiments.

It should be noted that the terms "first", "second", and the like in the description and claims of the present invention and in the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.

The method embodiments provided in the embodiments of this application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a computer terminal as an example, Figure 1 is a block diagram of the hardware structure of a computer terminal running the multi-thread scheduling method according to an embodiment of the present invention. As shown in Figure 1, the computer terminal may include one or more processors 102 (only one is shown in Figure 1; the processor 102 may include, but is not limited to, a processing device such as a microcontroller unit (MCU) or a field-programmable gate array (FPGA)) and a memory 104 for storing data. The computer terminal may further include a transmission device 106 for communication functions and an input/output device 108. Those of ordinary skill in the art will understand that the structure shown in Figure 1 is merely illustrative and does not limit the structure of the computer terminal. For example, the computer terminal may include more or fewer components than shown in Figure 1, or have a configuration different from that shown in Figure 1.

The memory 104 may be used to store computer programs, for example, software programs and modules of application software, such as the computer program corresponding to the multi-thread scheduling method in the embodiments of the present invention. By running the computer program stored in the memory 104, the processor 102 executes various functional applications and data processing, that is, implements the above method. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the computer terminal through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 106 is used to receive or send data via a network. Specific examples of the above network may include a wireless network provided by a communication provider of the computer terminal. In one example, the transmission device 106 includes a network interface controller (NIC), which can be connected to other network devices through a base station so as to communicate with the Internet. In one example, the transmission device 106 may be a radio frequency (RF) module, which is used to communicate with the Internet wirelessly.

This embodiment provides a multi-thread scheduling method running on the above computer terminal. Figure 2 is a flowchart of the multi-thread scheduling method according to an embodiment of the present invention. As shown in Figure 2, the process includes the following steps:

Step S202: after each packet enters the processor core, store the thread number carried by each packet, in order, into the thread management linked list corresponding to the thread group to which the packet belongs, and establish a mapping relationship between the thread number and nodes of the thread management linked list;

Step S204: according to the mapping relationship and the state of the thread state machine corresponding to each thread, schedule a target thread in the executable state out of the thread group in the order in which packets entered the processor core, and feed the target thread into the pipeline corresponding to the target thread.

Before step S202 of this embodiment, the method further includes: assigning a thread number to each packet entering the processor core, and dividing all threads into thread groups corresponding to the number of pipelines.

In step S202 of this embodiment, each thread group corresponds to one thread management linked list, and the number of nodes in each thread management linked list equals the number of threads contained in the thread group.

In step S202 of this embodiment, the mapping relationship between the nodes of the thread management linked list and the thread numbers is represented as a bitmap.

Step S204 of this embodiment includes: computing an issue request for each node from the value of the bitmap and the ready status of each thread in the thread group; performing priority scheduling among the threads with issue requests, so that the thread that entered the processor core first and is in the ready state is granted, transitions to the executable state, and is scheduled as the target thread; and fetching the instruction corresponding to the target thread, and feeding the target thread into the pipeline corresponding to the target thread to execute the instruction.

In this embodiment, after the thread is issued into the corresponding pipeline to execute the instruction, the method further includes: scheduling the packet whose instructions have finished executing out of the processor core and releasing the thread corresponding to that packet; and clearing the thread number of that packet from the node of the thread management linked list, and moving the other thread numbers stored in the nodes of the thread management linked list forward by one node in sequence.

In this embodiment, each pipeline corresponds to one master state machine and each thread corresponds to one thread state machine; each pipeline transitions between two states, idle and grant, and each thread transitions among four states: idle, ready, executable, and waiting.

In this embodiment, each pipeline transitioning between the idle and grant states and each thread transitioning among the idle, ready, executable, and waiting states includes: when the master state machine is in the idle state, new packets are allowed to enter the processor core, and after a new packet enters the processor core, the corresponding thread is in the idle state; after the thread number of the packet is stored in a node of the thread management linked list and the instruction corresponding to the thread is fetched from the instruction storage module, the thread transitions from the idle state to the ready state; in the grant state of the master state machine, a thread in the ready state is granted, and the granted thread transitions from the ready state to the executable state; a thread in the executable state transitions from the executable state to the idle state after executing its corresponding instructions; when a thread in the executable state must wait for data, wait for a table lookup, or re-fetch instructions during instruction execution, the thread transitions from the executable state to the waiting state; after the data wait of a thread in the waiting state ends, or its table-lookup result returns, or its instruction re-fetch returns, the thread transitions from the waiting state back to the ready state; and after the thread number of a thread that has finished executing its instructions is released, the master state machine enters the idle state.

Through the above steps, the thread scheduling scheme is optimized by introducing a thread management linked list, ensuring that packets that enter the processor core first are scheduled and executed first. This solves the problem in the related art that packets entering the micro-engine first cannot be guaranteed to be forwarded first in the order in which they entered, thereby reducing packet execution latency.

From the description of the above embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as read-only memory/random access memory (ROM/RAM), a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the various embodiments of the present invention.

This embodiment also provides a multi-thread scheduling device, which is used to implement the above embodiments and preferred implementations; what has already been described will not be repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the devices described in the following embodiments are preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

Figure 3 is a structural block diagram of a multi-thread scheduling device according to an embodiment of the present invention. As shown in Figure 3, the device includes an establishing module 10 and a scheduling module 20.

The establishing module 10 is configured to, after each packet enters the processor core, store the thread number carried by each packet, in order, into the thread management linked list corresponding to the thread group to which the packet belongs, and establish a mapping relationship between thread numbers and nodes of the thread management linked list.

The scheduling module 20 is configured to, according to the mapping relationship and the state of the thread state machine corresponding to each thread, schedule a target thread in the executable state out of the thread group in the order in which packets entered the processor core, and feed the target thread into the pipeline corresponding to the target thread.

Figure 4 is a structural block diagram of a multi-thread scheduling device according to another embodiment of the present invention. As shown in Figure 4, in addition to all the modules shown in Figure 3, the device further includes:

an allocation module 30, configured to assign a thread number to each packet entering the processor core and divide all threads into thread groups corresponding to the number of pipelines.

In an exemplary embodiment, each thread group corresponds to one thread management linked list, and the number of nodes in each thread management linked list equals the number of threads contained in the thread group.

Figure 5 is a structural block diagram of a multi-thread scheduling device according to yet another embodiment of the present invention. As shown in Figure 5, in addition to all the modules shown in Figure 4, the device further includes:

a release module 40, configured to schedule the packet that has finished executing the instructions corresponding to the target thread out of the processor core, release the thread corresponding to that packet, clear the thread number of that packet from the node of the thread management linked list, and move the other thread numbers stored in the nodes of the thread management linked list forward by one node in sequence.

In an exemplary embodiment, each pipeline corresponds to one master state machine and each thread corresponds to one thread state machine; each pipeline transitions between two states, idle and grant, and each thread transitions among four states: idle, ready, executable, and waiting.

It should be noted that the above modules can be implemented by software or hardware. For the latter, this can be achieved in, but is not limited to, the following ways: the above modules are all located in the same processor; or the above modules are located in different processors in any combination.

Embodiments of the present invention also provide a computer-readable storage medium storing a computer program, wherein the computer program is configured to perform, when run, the steps of any one of the above method embodiments.

In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to, various media capable of storing a computer program, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.

Embodiments of the present invention also provide an electronic device, including a memory and a processor, wherein a computer program is stored in the memory and the processor is configured to run the computer program to perform the steps of any one of the above method embodiments.

In an exemplary embodiment, the electronic device may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor and the input/output device is connected to the processor.

For specific examples in this embodiment, reference may be made to the examples described in the above embodiments and exemplary implementations, and details are not repeated here.

To facilitate understanding of the technical solution provided by the present invention, a detailed description is given below with reference to embodiments in specific scenarios.

In the related art, when the micro-engine receives a new packet, it first assigns a thread number to the new packet, without distinguishing packet priority when assigning the thread number. The traditional Least Recently Used (LRU) scheduling algorithm can give the most frequently used thread the highest priority, but when the number of threads is large, it cannot guarantee that packets entering the core first are executed and forwarded first.

To address the above problem that packets entering the core first cannot be completed first, an embodiment of the present invention provides a coarse-grained multi-thread scheduling device. Figure 6 is a schematic structural diagram of the coarse-grained multi-thread scheduling device according to an embodiment of the present invention. As shown in Figure 6, the multi-thread scheduling system includes a thread scheduling module 11, an instruction storage module 12, and a completion scheduling module 13.

The thread scheduling module 11 is used to assign thread numbers to new packets and divide all threads into thread groups corresponding to the number of pipelines. Each thread group schedules a ready executable thread in the order in which packets entered, fetches the instruction corresponding to the thread from the instruction storage module, and issues it into the pipeline corresponding to the thread group for execution; after execution is complete, the completion scheduling module 13 is notified. In this implementation, the thread scheduling module 11 functionally includes the functions of the establishing module 10, the scheduling module 20, and the allocation module 30 in the above embodiments.

The instruction storage module 12 is used to store the instructions used for thread execution, and includes an instruction level-2 cache and an instruction level-1 cache.

The completion scheduling module 13 is used to receive the packet execution completion signal sent by the thread scheduling module, schedule the corresponding packet out of the core, and release the thread number information. In this implementation, the completion scheduling module 13 is functionally equivalent to the release module 40 in the above embodiments.

Specifically, in the thread scheduling module 11, a thread management linked list is introduced to manage the thread number information corresponding to the order in which packets entered, and the ready executable thread of the earliest packet is scheduled out of each thread group and issued into the corresponding pipeline for execution. Each thread group corresponds to one thread management linked list, and the number of linked-list nodes equals the number of threads contained in the thread group. Figure 7 is a schematic diagram of the correspondence between threads and thread management linked lists according to an embodiment of the present invention. As shown in Figure 7, 20 threads are divided into 2 thread groups, each containing 10 threads, and the corresponding thread management linked list has 10 nodes node0-node9, which are used to store the thread numbers assigned to incoming packets. After a packet enters the core, the thread number it carries is stored into nodes node0-node9 from left to right in order. There is a layer of mapping between the thread management linked-list nodes and the thread numbers, which can be maintained as a bitmap; each thread management linked list has 10 bitmap values corresponding to its nodes. The issue request corresponding to each node is computed from the bitmap values and the ready status (rdy) of each thread in the thread group and participates in strict-priority (SP) scheduling; the ready executable thread of the earliest packet is granted and issued into the pipeline corresponding to the thread group.
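A minimal data-structure sketch of the arrangement in Figure 7, under the stated 2-group / 10-thread configuration (the type and field names below are illustrative assumptions, not the patent's implementation): the thread number of a newly arrived packet is appended to the first free node, and the corresponding bitmap bit records that the node is occupied.

```c
#include <stdbool.h>
#include <stdint.h>

#define GROUPS          2
#define NODES_PER_GROUP 10

/* One thread management linked list per thread group (Figure 7).
 * The list is kept compact, so node0 always maps to the oldest packet. */
typedef struct {
    int      node_thread[NODES_PER_GROUP]; /* node -> thread number          */
    uint16_t bitmap;                        /* bit i set: node i is occupied  */
    int      count;                         /* number of occupied nodes       */
} thread_group_t;

/* Store the thread number carried by a newly arrived packet. */
static bool enqueue_thread(thread_group_t *g, int thread_no)
{
    if (g->count == NODES_PER_GROUP)
        return false;                      /* no idle thread/node available  */
    g->node_thread[g->count] = thread_no;  /* fill nodes left to right       */
    g->bitmap |= (uint16_t)(1u << g->count);
    g->count++;
    return true;
}
```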

After a packet has finished executing its instructions, it is scheduled out of the core and the corresponding thread is released: the corresponding thread number information is looked up in the thread management linked list, the thread number information of the matching node is cleared, and the thread number information stored in all nodes to the right of the matching node is shifted one node to the left.
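A sketch of this release step, continuing the same illustrative layout (a compact node array, an occupancy bitmap, and a count; all names are assumptions): the matching node is found by thread number, cleared, and every node to its right is shifted one position to the left so the arrival order of the remaining packets is preserved.

```c
#include <stdbool.h>
#include <stdint.h>

#define NODES_PER_GROUP 10

/* Release a thread: clear its node and shift the younger entries left,
 * so the remaining nodes still reflect packet arrival order. */
static bool release_thread(int node_thread[NODES_PER_GROUP],
                           uint16_t *bitmap, int *count, int thread_no)
{
    for (int i = 0; i < *count; i++) {
        if (node_thread[i] != thread_no)
            continue;
        for (int j = i; j < *count - 1; j++)         /* shift left by one node  */
            node_thread[j] = node_thread[j + 1];
        (*count)--;
        *bitmap &= (uint16_t)~(1u << *count);        /* last node becomes free  */
        return true;
    }
    return false;                                    /* thread number not found */
}
```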

In this embodiment, the above 2 thread groups correspond to 2 pipelines respectively; the number of threads in each thread group may be 10 or any other number; and the pipeline may be a five-stage pipeline or a pipeline with another number of stages (for example, seven stages).

In this embodiment, the thread scheduling module 11 can also control the state transitions of each thread; each pipeline corresponds to one master state machine and each thread corresponds to one thread state machine. As shown in Figure 8, the transitions include the following steps:

First, when the master state machine is in the idle (IDLE) state, new packets are allowed to enter; a new packet entering the core is first in the IDLE state.

Second, an instruction fetch request is sent to the instruction storage module; according to the assigned thread number, the thread management linked list of the corresponding thread in Figure 7 is maintained and the thread number information is stored into the corresponding node; after the instruction returns from the instruction storage module, the corresponding thread transitions from the IDLE state to the rdy state.

Third, the threads in the rdy state within the same thread group undergo SP scheduling (GRANT), combined with the order of incoming-packet threads obtained from the mapping of each node of the thread management linked list, so that the thread of the earliest packet is granted.

The granted thread transitions from the rdy state to the running (exe) state. In each thread group, only one thread can be granted at a time; the 2 thread groups can each schedule the executable thread of the earliest packet out of their own group and issue it into the corresponding pipeline (pipeline 0 or pipeline 1) for execution.

Fourth, when a thread in the exe state finishes executing the packet-send instruction, the thread transitions from the exe state to the idle state, the packet is scheduled out of the core, the thread number information of the matching node in the thread management linked list is retrieved and deleted, and the corresponding thread number is released.

Fifth, when a thread in the exe state encounters, during instruction execution, an instruction with a data dependency, an instruction whose table-lookup result carries a data dependency, or a need to re-fetch instructions or otherwise wait for a long time, the thread transitions from the exe state to the wait state.

Sixth, GRANT authorizes the remaining rdy threads of the earliest packets to enter the exe state. When the data waiting period of a thread that previously entered the wait state completes, or its table-lookup data has returned, or its instruction re-fetch has returned, the thread transitions from the wait state back to the rdy state. Because the thread management linked list preserves the packet-entry order, when the thread currently in the exe state transitions to another state, the thread that moved from wait back to rdy can still be scheduled with priority until it finishes executing the packet-send instruction and transitions from the exe state to the idle state.

Seventh, after the corresponding thread number is released, the master state machine enters the IDLE state.

Figure 9 is a flowchart of performing coarse-grained multi-thread scheduling according to an embodiment of the present invention. As shown in Figure 9, when a new packet arrives, it is first determined whether the thread group is in the IDLE state; if not, the new packet must wait until an idle thread is available for allocation; if so, a thread i is selected from the idle threads and assigned to the incoming packet. Then an instruction fetch request is sent to the instruction storage module; after the fetch returns, thread i transitions from the IDLE state to the rdy state. The threads in the rdy state within the same thread group undergo SP scheduling (GRANT), so that the thread of the earliest packet obtains the grant GRANTi; after being granted, thread i transitions to the exe state. When thread i, in the exe state, encounters during instruction execution an instruction with a data dependency, an instruction whose table-lookup result carries a data dependency, or a need to re-fetch instructions or otherwise wait for a long time, thread i transitions to the wait state. When the data waiting period completes, or the table-lookup data has returned, or the instruction re-fetch has returned, thread i transitions from the wait state back to the rdy state. Because SP scheduling is based on packet-entry order, once the thread currently in the exe state transitions to another state, thread i can again be scheduled with priority until it finishes executing the packet-send instruction, transitions to the idle state, and the thread is released.

In this embodiment, when the master state machine is in the IDLE state, new packets are allowed to enter. A new packet entering the core is first in the idle state; an instruction fetch request is sent to the instruction storage module, and according to the assigned thread number, the corresponding thread management linked list in Figure 7 is maintained and the thread number information is stored into the corresponding node. After the instruction returns from the instruction storage module, the corresponding thread transitions from the idle state to the rdy state. The threads in the rdy state within the same thread group undergo SP scheduling (GRANT), combined with the order of incoming-packet threads obtained from the mapping of each node of the thread management linked list, so that the thread of the earliest packet is granted; the granted thread transitions from the rdy state to the exe state. In each thread group only one thread can be granted at a time, and the 2 thread groups can each schedule the executable thread of the earliest packet out of their own group and issue it into the corresponding pipeline (pipeline 0 or pipeline 1) for execution. When a thread in the exe state finishes executing the packet-send instruction, it transitions from the exe state to the idle state, the packet is scheduled out of the core, the thread number information of the matching node in the thread management linked list is retrieved and deleted, and the corresponding thread number is released. When a thread in the exe state encounters, during instruction execution, an instruction with a data dependency, an instruction whose table-lookup result carries a data dependency, or a need to re-fetch instructions or otherwise wait for a long time, the thread transitions from the exe state to the wait state, and GRANT authorizes the remaining rdy threads of the earliest packets to enter the exe state. After the data waiting period of the thread that previously entered the wait state completes, or its table-lookup data has returned, or its instruction re-fetch has returned, the thread transitions from the wait state back to the rdy state. Because the thread management linked list preserves the packet-entry order, when the thread currently in the exe state transitions to another state, the thread that moved from wait back to rdy can still be scheduled with priority until it finishes executing the packet-send instruction; the thread then transitions from the exe state to the idle state, and the master state machine enters the IDLE state.

Through the above embodiments of the present invention, the coarse-grained multi-thread scheduling method ensures that packets that enter the core first are scheduled first, and switches to another thread only when a costly stall occurs (such as an instruction re-fetch or a table-lookup wait), which greatly reduces the likelihood of slowing down the execution of any packet and reduces the execution latency of any packet.

Obviously, those skilled in the art should understand that the above modules or steps of the present invention can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network composed of multiple computing devices; they can be implemented by program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases, the steps shown or described can be executed in an order different from that described here, or they can be made into individual integrated circuit modules, or multiple modules or steps among them can be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

The above are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, improvement, and the like made within the principles of the present invention shall be included in the protection scope of the present invention.

Claims (15)

1. A multi-thread scheduling method, characterized by comprising: after each message enters a processor core, storing the thread number carried by each message, in order, into a thread management linked list corresponding to the thread group to which the message belongs, and establishing a mapping relationship between the thread number and the nodes of the thread management linked list; and according to the mapping relationship and the state of the thread state machine corresponding to each thread, scheduling a target thread in an executable state from the thread group in the order in which messages entered the processor core, and inputting the target thread into the pipeline corresponding to the target thread.

2. The method according to claim 1, wherein before storing the thread number carried by each message, in order, into the thread management linked list corresponding to the thread group to which the message belongs, the method further comprises: assigning a thread number to each message entering the processor core, and dividing all threads into thread groups corresponding to the number of pipelines.

3. The method according to claim 1, wherein each thread group corresponds to one thread management linked list, and the number of nodes in each thread management linked list is equal to the number of threads contained in each thread group.

4. The method according to claim 1, wherein the mapping relationship between the nodes of the thread management linked list and the thread numbers is represented by a bitmap.

5. The method according to claim 4, wherein scheduling the target thread in the executable state from the thread group in the order in which messages entered the processor core according to the mapping relationship and the state of each thread, and inputting the target thread into the pipeline corresponding to the target thread, comprises: calculating the issue request corresponding to each node according to the value of the bitmap and the ready state of each thread in the thread group; performing priority scheduling on the threads having issue requests, so that the thread that entered the processor core first and is in the ready state is authorized, transitions to the executable state, and is scheduled as the target thread; and obtaining the instruction corresponding to the target thread, and inputting the target thread into the pipeline corresponding to the target thread to execute the instruction.

6. The method according to claim 5, wherein after inputting the target thread into the pipeline corresponding to the target thread to execute the instruction, the method further comprises: scheduling the message whose instruction has been executed out of the processor core, and releasing the thread corresponding to the message; and clearing the thread number of the message from the node of the thread management linked list, and moving the other thread numbers stored in the nodes of the thread management linked list forward by one node in sequence.

7. The method according to claim 1, wherein each pipeline corresponds to one main control state machine and each thread corresponds to one thread state machine, each pipeline transitions between two states, idle and authorized, and each thread transitions between four states: idle, ready, executable, and waiting.

8. The method according to claim 7, wherein each pipeline transitioning between the two states of idle and authorized and each thread transitioning between the four states of idle, ready, executable, and waiting comprises: when the main control state machine is in the idle state, allowing a new message to enter the processor core, the corresponding thread being in the idle state after the new message enters the processor core; after the thread number of the message is stored in a node of the thread management linked list and the instruction corresponding to the thread is fetched from an instruction storage module, the thread transitions from the idle state to the ready state; in the authorized state of the main control state machine, the thread in the ready state is authorized, and the authorized thread transitions from the ready state to the executable state; the thread in the executable state transitions from the executable state to the idle state after the corresponding instruction has been executed; when the thread in the executable state encounters a data wait, a table-lookup wait, or an instruction re-fetch during instruction execution, the thread transitions from the executable state to the waiting state; the thread in the waiting state transitions back to the ready state after the data wait ends, the table-lookup result returns, or the re-fetched instruction returns; and after the thread number of a thread that has finished executing its instruction is released, the main control state machine enters the idle state.

9. A multi-thread scheduling device, characterized by comprising: an establishing module, configured to, after each message enters a processor core, store the thread number carried by each message, in order, into the thread management linked list corresponding to the thread group to which the message belongs, and establish a mapping relationship between the thread number and the nodes of the thread management linked list; and a scheduling module, configured to, according to the mapping relationship and the state of the thread state machine corresponding to each thread, schedule a target thread in the executable state from the thread group in the order in which messages entered the processor core, and input the target thread into the pipeline corresponding to the target thread.

10. The device according to claim 9, further comprising: an allocation module, configured to assign a thread number to each message entering the processor core, and to divide all threads into thread groups corresponding to the number of pipelines.

11. The device according to claim 10, wherein each thread group corresponds to one thread management linked list, and the number of nodes in each thread management linked list is equal to the number of threads contained in each thread group.

12. The device according to claim 10, further comprising: a release module, configured to schedule the message that has finished executing the instruction corresponding to the target thread out of the processor core, release the thread corresponding to the message, clear the thread number of the message from the node of the thread management linked list, and move the other thread numbers stored in the nodes of the thread management linked list forward by one node in sequence.

13. The device according to claim 10, wherein each pipeline corresponds to one main control state machine, each thread corresponds to one thread state machine, each pipeline transitions between the two states of idle and authorized, and each thread transitions between the four states of idle, ready, executable, and waiting.

14. A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 8 are implemented.

15. An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 8 when executing the computer program.
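A minimal, illustrative C sketch of the per-group bookkeeping described in claims 1 and 4 to 6 follows, under the assumption of a fixed group size; all names (thread_group_t, enqueue_thread, select_target_thread, release_thread) are hypothetical and chosen only for readability, not taken from the patent text.

#include <stdint.h>

#define THREADS_PER_GROUP 16u   /* assumed group size: one linked-list node per thread */

/* One thread group.  Node i of the management linked list corresponds to the
 * i-th oldest message still inside the processor core; the node-to-thread
 * mapping of claim 4 is kept as a one-hot bitmap per node. */
typedef struct {
    uint16_t node_map[THREADS_PER_GROUP]; /* node i -> one-hot bitmap of the thread number stored there */
    uint16_t ready;                       /* bit t set -> thread t of this group is in the ready state */
    unsigned count;                       /* number of occupied nodes (messages in flight) */
} thread_group_t;

/* Claim 1: a message entered the core; append its thread number to the tail
 * of the group's linked list and record the node-to-thread mapping. */
static void enqueue_thread(thread_group_t *g, unsigned tid)
{
    if (g->count < THREADS_PER_GROUP)
        g->node_map[g->count++] = (uint16_t)(1u << tid);
}

/* Claim 5: each node raises an issue request when its bitmap intersects the
 * ready bits; the oldest requesting node is authorized and its thread becomes
 * the executable target thread.  Returns the thread number, or -1 if none. */
static int select_target_thread(const thread_group_t *g)
{
    for (unsigned node = 0; node < g->count; node++) {      /* oldest message first */
        uint16_t hit = (uint16_t)(g->node_map[node] & g->ready);
        if (hit) {
            int tid = 0;
            while (!(hit & 1u)) { hit >>= 1; tid++; }        /* one-hot bitmap -> thread number */
            return tid;
        }
    }
    return -1;
}

/* Claim 6: the message of the thread at 'node' has finished and left the core;
 * clear its entry and shift the remaining thread numbers forward by one node. */
static void release_thread(thread_group_t *g, unsigned node)
{
    for (unsigned i = node; i + 1 < g->count; i++)
        g->node_map[i] = g->node_map[i + 1];
    g->node_map[--g->count] = 0;
}

In the hardware the claims describe, the forward shift and the oldest-first pick would presumably be combinational logic derived from the bitmaps rather than software loops; the loops above only mirror the ordering behaviour, and the ready bits would be driven by the thread state machines of claims 7 and 8.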
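Claims 7 and 8 describe the per-thread state machine in prose. As a second purely illustrative sketch, again with hypothetical state and event names, the four-state thread machine of claim 8 can be written as a single transition function:

/* Thread states of claims 7 and 8 (names are illustrative). */
typedef enum { T_IDLE, T_READY, T_EXEC, T_WAIT } thread_state_t;

/* Events that move a thread between states, paraphrasing claim 8. */
typedef enum {
    EV_INSTR_FETCHED,   /* thread number stored in a list node, instruction fetched           */
    EV_AUTHORIZED,      /* main control state machine authorizes this ready thread            */
    EV_DONE,            /* last instruction executed, message leaves the core                 */
    EV_STALL,           /* data wait, table-lookup wait, or instruction re-fetch              */
    EV_RESUME           /* wait ended, lookup result returned, or re-fetched instruction back */
} thread_event_t;

/* One transition of the per-thread state machine. */
static thread_state_t thread_step(thread_state_t s, thread_event_t ev)
{
    switch (s) {
    case T_IDLE:  return ev == EV_INSTR_FETCHED ? T_READY : s;
    case T_READY: return ev == EV_AUTHORIZED    ? T_EXEC  : s;
    case T_EXEC:
        if (ev == EV_DONE)  return T_IDLE;   /* thread number is released, as in claim 6 */
        if (ev == EV_STALL) return T_WAIT;
        return s;
    case T_WAIT:  return ev == EV_RESUME        ? T_READY : s;
    }
    return s;
}

The per-pipeline main control state machine of claim 7 would sit beside this, toggling between idle (new messages may enter the core) and authorized (a ready thread is granted), and the ready bitmap used by the scheduling sketch above is simply the set of threads currently in T_READY.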
CN202210738293.8A 2022-06-27 2022-06-27 Multi-thread scheduling method and device Pending CN117331655A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210738293.8A CN117331655A (en) 2022-06-27 2022-06-27 Multi-thread scheduling method and device
PCT/CN2023/087477 WO2024001411A1 (en) 2022-06-27 2023-04-11 Multi-thread scheduling method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210738293.8A CN117331655A (en) 2022-06-27 2022-06-27 Multi-thread scheduling method and device

Publications (1)

Publication Number Publication Date
CN117331655A true CN117331655A (en) 2024-01-02

Family

ID=89294062

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210738293.8A Pending CN117331655A (en) 2022-06-27 2022-06-27 Multi-thread scheduling method and device

Country Status (2)

Country Link
CN (1) CN117331655A (en)
WO (1) WO2024001411A1 (en)


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7159216B2 (en) * 2001-11-07 2007-01-02 International Business Machines Corporation Method and apparatus for dispatching tasks in a non-uniform memory access (NUMA) computer system
CN104901901B (en) * 2014-03-07 2019-03-12 深圳市中兴微电子技术有限公司 A microengine and method for processing messages
US10140157B2 (en) * 2014-05-29 2018-11-27 Apple Inc. Multiple process scheduling of threads using process queues
CN109257280B (en) * 2017-07-14 2022-05-27 深圳市中兴微电子技术有限公司 Micro-engine and message processing method thereof

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118069071A (en) * 2024-04-19 2024-05-24 苏州元脑智能科技有限公司 Resource access control method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
WO2024001411A1 (en) 2024-01-04

Similar Documents

Publication Publication Date Title
US11620255B2 (en) Time sensitive networking device
CN105511954B (en) Message processing method and device
CN109669768B (en) Resource allocation and task scheduling method for edge cloud combined architecture
CN110262901B (en) Data processing method and data processing system
CN109697122B (en) Task processing method, device and computer storage medium
CN109564528B (en) System and method for computing resource allocation in distributed computing
JP2014525619A (en) Data processing system
CN113485822A (en) Memory management method, system, client, server and storage medium
US9141436B2 (en) Apparatus and method for partition scheduling for a processor with cores
US8848532B2 (en) Method and system for processing data
US20170047069A1 (en) Voice processing method and device
WO2017185285A1 (en) Method and device for assigning graphics processing unit task
CN106385377B (en) Information processing method and system
CN105159774A (en) API request order-preserving processing method and system
CN106325996A (en) GPU resource distribution method and system
JP2024541019A (en) Robot scheduling method, device, electronic device, and storage medium
WO2024001411A1 (en) Multi-thread scheduling method and device
CN118113471A (en) A GPU sharing method and device for server-unaware inference load
CN110445580A (en) Data transmission method for uplink and device, storage medium, electronic device
WO2024156239A1 (en) Video streaming transmission method and apparatus, electronic device, and storage medium
CN109257280B (en) Micro-engine and message processing method thereof
CN116128704A (en) Data processing method, data processing apparatus, and computer readable storage medium
CN115766612A (en) A Scheduling Method and Corresponding Device Based on Weight Conversion Probability
CN114398410A (en) A continuous number generation method, device, server cluster and storage medium
CN114157619A (en) Message cache management method and device and network processor

Legal Events

Date Code Title Description
PB01 Publication
CB02 Change of applicant information

Country or region after: China

Address after: 518055, 2nd Floor, ZTE Industrial Park, No. 2 Chuangyan Road, Xili Community, Xili Street, Nanshan District, Shenzhen City, Guangdong Province, China

Applicant after: SANECHIPS TECHNOLOGY Co.,Ltd.

Address before: 518055 Zhongxing Industrial Park, Liuxian Avenue, Xili street, Nanshan District, Shenzhen City, Guangdong Province

Applicant before: SANECHIPS TECHNOLOGY Co.,Ltd.

Country or region before: China

SE01 Entry into force of request for substantive examination