
CN113407239B - Pipeline processor based on asynchronous monorail - Google Patents


Info

Publication number
CN113407239B
Authority
CN
China
Prior art keywords
module
asynchronous
click
signal
pipeline
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110644854.3A
Other languages
Chinese (zh)
Other versions
CN113407239A (en
Inventor
田龙锋
虞志益
王凯
肖山林
李智宇
黄宇皓
朱瑞敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University
Priority to CN202110644854.3A
Publication of CN113407239A
Application granted
Publication of CN113407239B
Current legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38: Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3867: Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
    • G06F9/3871: Asynchronous instruction pipeline, e.g. using handshake signals between stages
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

The application relates to an asynchronous single-rail pipeline processor comprising: an asynchronous control module, an instruction fetch module, a decoding module, an execution module, an adaptive selection module, a memory access module, a write-back module, a storage module, control and status registers, and general-purpose registers. The modules exchange data through asynchronous single-rail handshakes. The asynchronous control module comprises a plurality of control units, each of which is a phase-decoupled Click unit; the phase-decoupled Click units are cascaded with one another through handshakes and are each connected to a corresponding pipeline stage. The application addresses the high power consumption, severe global clock skew, and clock-frequency-limited speed of pipeline processors in the related art, and allows the pipeline to run at a higher speed, without a clock, at low power consumption.

Description

A Pipeline Processor Based on Asynchronous Single Rail

Technical Field

The application relates to the fields of electronic information, processors, and asynchronous circuits, and in particular to an asynchronous single-rail pipeline processor.

Background

With the rapid development of the Internet of Things and artificial intelligence, SoC technology continues to mature, and most chips now integrate their own processors. Processors therefore play an important role in electronics, and processor design has received wide attention. A processor consists roughly of arithmetic logic components, register components, and control components; these components are built from large numbers of registers, and data processing instructions operate only on registers. With a global clock, operation speed and execution efficiency are high, but the registers toggle continuously with the clock, which consumes more energy and adds power consumption. In addition, because most processors are synchronous designs, global clock skew is a serious problem and a complex clock tree network is required; this makes design difficult, and the clock tree takes up substantial chip area and power. Moreover, in a synchronous circuit all paths operate under the same clock. To guarantee that every logic operation completes within one clock cycle, the clock frequency is limited by the delay of the critical path, which also constrains every other path, and the critical path is hard to optimize. The clock frequency is therefore difficult to raise, which limits the performance of the whole processor. Existing technology thus suffers from high power consumption in pipeline processors, severe global clock skew, and speed limited by the clock frequency.

Summary of the Invention

This embodiment provides an asynchronous single-rail pipeline processor to solve the problems of high power consumption, severe global clock skew, and speed limited by the clock frequency in pipeline processors of the related art.

The asynchronous single-rail pipeline processor of the application comprises: an asynchronous control module, an instruction fetch module, a decoding module, an execution module, an adaptive selection module, a memory access module, a write-back module, a storage module, control and status registers, and general-purpose registers. The modules exchange data through asynchronous single-rail handshakes. The asynchronous control module comprises a plurality of control units, each of which is a phase-decoupled Click unit; the phase-decoupled Click units are cascaded with one another through handshakes and are each connected to a corresponding pipeline stage.

The click signal that controls a pipeline stage is generated after the handshake between the phase-decoupled Click units succeeds.

The first pipeline stage includes the instruction fetch module, the second pipeline stage includes the decoding module, the third pipeline stage includes the execution module, the fourth pipeline stage includes the memory access module, and the fifth pipeline stage includes the write-back module.

A phase-decoupled Click unit generates a click signal that causes the program counter to calculate an instruction address according to the control signals and transmit that address to the instruction fetch module; at the same time, the unit sends a request signal to the next phase-decoupled Click unit. The control signals are electrical signals including jump, exception, and interrupt. The request signal is the electrical signal by which an upstream phase-decoupled Click unit requests the downstream phase-decoupled Click unit to work.
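The address calculation triggered by one click can be sketched as follows. This is only an illustration: the `ControlSignals` bundle, the trap vector `0x100`, the 4-byte instruction width, and the priority of exceptions and interrupts over jumps are all assumptions introduced here, since the text names the control signals but does not specify how they are encoded or prioritized.

```python
from dataclasses import dataclass


@dataclass
class ControlSignals:
    """Hypothetical bundle of the control signals named in the text."""
    jump: bool = False
    jump_target: int = 0
    exception: bool = False
    interrupt: bool = False


TRAP_VECTOR = 0x100   # assumed handler address, not taken from the source
INSTR_BYTES = 4       # assumed fixed instruction width


def next_pc(pc: int, ctrl: ControlSignals) -> int:
    """Return the address handed to the fetch module on one click."""
    if ctrl.exception or ctrl.interrupt:   # assumed priority: traps first
        return TRAP_VECTOR
    if ctrl.jump:
        return ctrl.jump_target
    return pc + INSTR_BYTES                # sequential execution
```

For example, `next_pc(8, ControlSignals(jump=True, jump_target=0x40))` selects the jump target, while a call with default control signals simply advances to the next sequential address.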

The instruction fetch module reads an instruction from the storage module according to the instruction address and transmits the instruction to the decoding module.

The decoding module decodes the instruction, reads the pending data related to the instruction from the control and status registers and the general-purpose registers, and transmits the pending data to the execution module.

The execution module performs the corresponding operation on the pending data and, once the operation result is obtained, transmits it to the memory access module.

The memory access module reads from or writes to the storage module according to the operation result.

The execution module includes a prediction flush module, a jump module, and a bypass module. The prediction flush module flushes incorrectly predicted instructions; the jump module generates a jump address from the jump signal and the operation result and returns it to the program counter; the bypass module uses register addresses to obtain required data ahead of time from later-stage modules.

The adaptive selection module obtains the operation result from the execution module and, according to the control signals, transmits it to the memory access module and the write-back module. The write-back module receives the operation results transmitted by the execution module and the memory access module and writes them back to the registers. The storage module includes an instruction storage module, which stores received instructions, and a data storage module, which stores received data.
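One way to picture the adaptive selection step is as a router that forwards the execution result toward either the memory access module or the write-back module. The sketch below is an assumption-laden illustration: the `needs_memory` flag stands in for the unspecified control signals, and the idea that non-memory results go straight to write-back is this example's reading, not something the text states explicitly.

```python
from enum import Enum
from typing import Tuple


class Dest(Enum):
    MEMORY_ACCESS = "memory_access"
    WRITE_BACK = "write_back"


def route(result: int, needs_memory: bool) -> Tuple[Dest, int]:
    """Forward an execution result to one destination.

    `needs_memory` is a hypothetical control signal: True for load/store
    style instructions, False for results that can be written back directly.
    """
    dest = Dest.MEMORY_ACCESS if needs_memory else Dest.WRITE_BACK
    return dest, result
```

Under this reading, an arithmetic result would be routed as `route(42, needs_memory=False)`, and an address computed for a load as `route(0x2000, needs_memory=True)`.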

Compared with the related art, the asynchronous single-rail pipeline processor provided in this embodiment solves the problems of high power consumption, severe global clock skew, and speed limited by the clock frequency in pipeline processors of the related art, and allows the pipeline to run at a higher speed, without a clock, at low power consumption.

The details of one or more embodiments of the application are set forth in the drawings and description below, so that other features, objects, and advantages of the application are easier to understand.

Description of the Drawings

The drawings described here provide a further understanding of the application and form part of it. The illustrative embodiments of the application and their descriptions explain the application and do not unduly limit it. In the drawings:

Fig. 1 is a schematic structural diagram of an asynchronous single-rail pipeline processor of the application;

Fig. 2 is a logic diagram of the phase-decoupled Click unit of an embodiment of the application;

Fig. 3 is a logic diagram of the control unit C_EX2MEM of an embodiment of the application;

Fig. 4 is a logic diagram of the control unit C_MEM2WB of an embodiment of the application;

Fig. 5 is a logic diagram of the first-stage control unit of the asynchronous control module in an embodiment of the application.

Detailed Description

To make the purpose, technical solutions, and advantages of the application clearer, the application is described and illustrated below with reference to the drawings and embodiments. The specific embodiments described here only explain the application and do not limit it. All other embodiments that a person of ordinary skill in the art obtains from the embodiments provided in the application without creative effort fall within the scope of protection of the application.

The drawings in the following description are only some examples or embodiments of the application; a person of ordinary skill in the art can also apply the application to other similar scenarios on the basis of these drawings without creative effort. Although the effort made in such development may be complex and lengthy, for a person of ordinary skill in the art related to the disclosed content, changes in design, manufacture, or production made on the basis of the technical content disclosed in the application are merely conventional technical means and should not be taken to mean that the disclosure of the application is insufficient.

A reference to an "embodiment" in the application means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment each time, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. A person of ordinary skill in the art understands, explicitly and implicitly, that the embodiments described in the application can be combined with other embodiments where they do not conflict.

Unless otherwise defined, the technical or scientific terms used in the application have the ordinary meaning understood by a person of ordinary skill in the technical field to which the application belongs. Words such as "a", "an", and "the" do not indicate a limitation on quantity and may denote the singular or the plural. The terms "include", "comprise", "have", and any variations of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or modules (units) is not limited to the listed steps or units, but may also include steps or units that are not listed, or other steps or units inherent to the process, method, product, or device. Words such as "connected" and "coupled" are not limited to physical or mechanical connections and may include electrical connections, whether direct or indirect. "A plurality" means two or more. "And/or" describes an association between associated objects and covers three cases; for example, "A and/or B" can mean A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the objects before and after it. The terms "first", "second", "third", and the like only distinguish similar objects and do not imply a particular ordering of the objects.

This embodiment provides an asynchronous single-rail pipeline processor. Fig. 1 is a schematic structural diagram of the processor. As shown in Fig. 1, the processor comprises: an asynchronous control module, an instruction fetch module, a decoding module, an execution module, an adaptive selection module, a memory access module, a write-back module, a storage module, control and status registers, and general-purpose registers. The modules exchange data through asynchronous single-rail handshakes. The asynchronous control module comprises a plurality of control units, each of which is a phase-decoupled Click unit; the phase-decoupled Click units are cascaded with one another through handshakes and are each connected to a corresponding pipeline stage. In this embodiment, the instruction fetch module corresponds to the first pipeline stage 20 in the figure, the decoding module to the second pipeline stage 30, the execution module to the third pipeline stage 40, the memory access module to the fourth pipeline stage 50, and the write-back module to the fifth pipeline stage 60.

In this embodiment, the click signal that controls a pipeline stage is generated after the handshake between phase-decoupled Click units succeeds. The phase-decoupled Click units handshake with one another through request and acknowledge signals to generate click signals, and each click signal takes the place of the global clock of a synchronous circuit in controlling its pipeline stage. The circuit is event-driven and has no global clock.

Take the connection between the first control unit 101 and the first pipeline stage 20 in Fig. 1 as an example. The first control unit 101 generates the click signal for the program counter according to the processor's enable signal, so that the program counter operates once: it calculates the next instruction address according to control signals such as jump, exception, and interrupt, and transmits the address to the first pipeline stage 20. While generating the click signal for its own stage, a control unit also generates the request signal for the control unit of the next stage. The control signals are electrical signals including jump, exception, and interrupt. The request signal is the electrical signal by which an upstream phase-decoupled Click unit requests the downstream phase-decoupled Click unit to work.

The request signal passes between adjacent control units as follows. An upstream control unit (for example, the first control unit) sends a request signal to the adjacent downstream control unit (for example, the second control unit); the request signal enables the downstream control unit to drive its corresponding pipeline stage. When the downstream control unit receives the request signal, it starts the pipeline stage connected to it and at the same time sends an acknowledge signal to the upstream control unit, notifying it that this stage's control unit has completed its control task.

This embodiment demonstrates only a simple five-stage pipeline; the contents of the pipeline can be changed if a practical application requires it. Compared with the existing technology, the circuit replaces the global clock of a synchronous circuit with the click signals that the Click modules (control units) of the asynchronous single-rail circuit generate from the states of the request and acknowledge signals, so no external clock input is needed. There is therefore no complex clock network, which avoids a clock tree occupying a large chip area and adding power consumption. The design is also simple, and it can increase the processor's operating speed and reduce power consumption.
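The request/acknowledge cascade described above can be sketched as a small behavioral simulation. The sketch makes several assumptions that are not stated in this passage: signals use two-phase (transition) encoding, so a toggle on a wire is one event; the environment that drives the first unit stands in for the enable signal; the last unit's successor always acknowledges immediately; and the units fire in one particular sequential order, which is only one of the interleavings real hardware could exhibit. The `Stage` class and `run` function are illustrative names.

```python
class Stage:
    """One control unit, reduced to its two phase bits."""

    def __init__(self, name):
        self.name = name
        self.in_ack = 0    # acknowledge returned to the upstream unit
        self.out_req = 0   # request sent to the downstream unit
        self.clicks = 0    # number of click pulses generated so far


def run(stages, tokens):
    """Push `tokens` requests through the chain; return the firing order."""
    order = []
    src_req = 0            # request wire driven by the environment
    sent = 0
    progress = True
    while progress:
        progress = False
        # The environment issues a new request once the previous one is acknowledged.
        if sent < tokens and src_req == stages[0].in_ack:
            src_req ^= 1
            sent += 1
            progress = True
        for i, s in enumerate(stages):
            in_req = src_req if i == 0 else stages[i - 1].out_req
            # Assumed sink behaviour: the last unit is acknowledged immediately.
            out_ack = stages[i + 1].in_ack if i + 1 < len(stages) else s.out_req
            # Handshake succeeds: a request is pending and the successor is free.
            if in_req != s.in_ack and s.out_req == out_ack:
                s.in_ack ^= 1      # acknowledge the upstream unit
                s.out_req ^= 1     # request the downstream unit
                s.clicks += 1      # click pulse for this pipeline stage
                order.append(s.name)
                progress = True
    return order


names = ["C_PC", "C_IF2ID", "C_ID2EX", "C_EX2MEM", "C_MEM2WB"]
chain = [Stage(n) for n in names]
firing_order = run(chain, tokens=2)
```

In this model each unit generates one click per request that passes through it, and no unit fires until its predecessor has raised a new request, which is the event-driven behaviour that replaces the global clock.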

Note that before the processor starts working, an initialization signal must initialize it. After initialization, the data inside the processor is in its initial state and the asynchronous pipeline is stalled. When the processor's enable signal goes high, the asynchronous single-rail pipeline processor starts working. The processor therefore has to be initialized with the initialization signal each time it is used to process data, so that the data in every internal module, memory, and register is in its initial state.

In this embodiment, the first to fifth control units are different phase-decoupled Click units, named as follows: the first control unit is the phase-decoupled Click unit C_PC, the second is C_IF2ID, the third is C_ID2EX, the fourth is C_EX2ME, and the fifth is C_MEM2WB. The five control units control five groups of registers: the first control unit C_PC controls the PC register, the second control unit C_IF2ID controls the IF2ID register, the third control unit C_ID2EX controls the ID2EX register, the fourth control unit C_EX2ME controls the EX2ME register, and the fifth control unit C_MEM2WB controls the CSR status registers and the general-purpose registers. This correspondence is only one specific embodiment; other correspondences are possible.

After the instruction fetch module (first pipeline stage 20) receives the instruction address, the modules of the processor work as follows. The instruction fetch module reads an instruction from the storage module according to the instruction address and transmits it to the decoding module. Meanwhile, the control unit C_IF2ID receives the request signal generated by the C_PC unit and the acknowledge signal returned by the C_ID2EX unit; after the handshake succeeds, it generates the click signal that controls its pipeline stage, generates a request signal for the next pipeline stage, and returns an acknowledge signal to the previous pipeline stage.

The decoding module (second pipeline stage 30) decodes the instruction, reads the pending data related to the instruction from the control and status registers and the general-purpose registers, and transmits the pending data to the execution module. Meanwhile, the control unit C_ID2EX receives the request signal generated by the C_IF2ID unit and the acknowledge signal returned by the C_EX2ME unit; after the handshake succeeds, it generates the click signal that controls its pipeline stage, generates a request signal for the next pipeline stage, and returns an acknowledge signal to the previous pipeline stage.

The execution module (third pipeline stage 40) performs the corresponding operation on the pending data and, once the operation result is obtained, transmits it to the memory access module. Meanwhile, the control unit C_EX2MEM receives the request signal generated by the C_ID2EX unit and the acknowledge signal returned by the C_MEM2WB unit; after the handshake succeeds, it generates the click signal that controls its pipeline stage, generates a request signal for the next pipeline stage, and returns an acknowledge signal to the previous pipeline stage.

The memory access module (fourth pipeline stage 50) reads from or writes to the storage module according to the operation result. The adaptive selection module obtains the operation result from the execution module and, according to the control signals, transmits it to the memory access module and the write-back module. The write-back module (fifth pipeline stage 60) receives the operation results transmitted by the execution module and the memory access module and writes them back to the registers. The control units C_Wreg and C_Wcsr generate the click signals for writing back to the CSR status registers and the general-purpose registers; they are the write-back control units. Meanwhile, the control unit C_MEM2WB receives the two request signals generated by the C_EX2MEM unit and the acknowledge signals returned by the C_Wreg and C_Wcsr units; after the handshake succeeds, it generates the click signal that controls its pipeline stage, generates a request signal for the next pipeline stage, and returns an acknowledge signal to the previous pipeline stage. The control units C_Wreg and C_Wcsr, which control the CSR and REGISTER registers, are themselves phase-decoupled Click units. The storage module includes an instruction storage module, which stores received instructions, and a data storage module, which stores received data.

With this operation and these connections, each pipeline stage is driven by pulse signals generated on event completion. The pulse frequency differs from stage to stage and is limited only by the longest path of that stage, so the processor processes data faster than a synchronous circuit.
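A small worked comparison can make the timing argument concrete. The stage delays below are invented for illustration (they are not measurements from the application), and the sketch compares only the time for a single instruction to traverse an otherwise empty five-stage pipeline, ignoring handshake overhead.

```python
def sync_latency(delays):
    """Clocked pipeline: every stage takes one period, set by the slowest stage."""
    return len(delays) * max(delays)


def async_latency(delays):
    """Event-driven pipeline: each stage hands over as soon as its own logic is done."""
    return sum(delays)


# Hypothetical per-stage delays in nanoseconds: IF, ID, EX, MEM, WB.
stage_delays_ns = [2, 3, 5, 4, 1]

sync_ns = sync_latency(stage_delays_ns)
async_ns = async_latency(stage_delays_ns)
```

With these assumed delays the clocked design spends five periods of 5 ns each, while the event-driven design spends only the sum of the individual stage delays; the gap grows as the stage delays become more uneven.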

In addition, the execution module includes a prediction flush module, a jump module, and a bypass module. The prediction flush module flushes incorrectly predicted instructions. The jump module generates a jump address from the jump signal and the operation result and returns it to the program counter. The bypass module uses register addresses to obtain required data ahead of time from later-stage modules, which avoids conflicts between data (data hazards).
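The bypass behaviour can be illustrated with a short operand-lookup sketch. The data layout is an assumption made for this example: `in_flight` is a hypothetical list of `(destination register, value)` pairs held by later stages, ordered from the most recent instruction to the oldest, and `regfile` is a plain dictionary standing in for the general-purpose registers.

```python
def read_operand(reg_addr, regfile, in_flight):
    """Return the freshest value for `reg_addr`.

    in_flight: list of (dest_addr, value) pairs from later pipeline stages,
    newest first. A match means the register file is stale for this address.
    """
    for dest, value in in_flight:
        if dest == reg_addr:
            return value          # forward the result before it is written back
    return regfile[reg_addr]      # no pending write: use the register file


regfile = {1: 10, 2: 20}
pending = [(1, 99)]               # an earlier instruction is about to write r1
```

Here `read_operand(1, regfile, pending)` yields the forwarded value rather than the stale register contents, while a read of register 2 falls through to the register file.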

The register IF2ID used in the invention is controlled by the second control unit C_IF2ID and acts as a switch for data transfer: when the Click pulse generated by C_IF2ID arrives, the switch opens and the data from the previous pipeline stage passes through; at all other times the switch is closed. This relatively closed design effectively prevents the data in the register from being disturbed or corrupted while the controlling control unit has not issued a Click pulse. The other registers used in the invention behave in the same way as register IF2ID: the switch opens when a click pulse arrives and is closed the rest of the time.
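The switch-like behaviour of such a register can be modelled in a few lines. This is a behavioral sketch only: the `ClickRegister` class and its `update` method are names invented here, and the model treats the click as a simple boolean rather than a timed pulse.

```python
class ClickRegister:
    """Pipeline register that accepts new data only while a click pulse is present."""

    def __init__(self, reset_value=0):
        self.q = reset_value      # value currently visible to the next stage

    def update(self, d, click):
        """Present input `d`; capture it only if `click` is asserted."""
        if click:
            self.q = d            # switch open: upstream data passes through
        return self.q             # switch closed: held value is unchanged


if2id = ClickRegister()
```

Calling `if2id.update(5, click=False)` leaves the stored value untouched, which is the protection against disturbance described above; only `if2id.update(5, click=True)` lets the new value in.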

FIG. 2 is a logic diagram of the phase-decoupling Click unit of an embodiment of the present application. The naming convention in the figure is as follows: D is the module controlled by the phase-decoupling Click unit; In_Data and Out_Data are the data input to the module and the data output by the module; In_Req and Out_Req are the incoming request signal and the outgoing request signal; In_Ack and Out_Ack are the incoming acknowledge signal and the outgoing acknowledge signal. The first control unit C_PC therefore has only the Out_Req and In_Ack signals: no control unit precedes it, so it neither outputs an acknowledge signal nor receives a request signal. As shown in FIG. 2, the phase-decoupling Click unit works as follows. Assume In_Req=1, In_Ack=0, Out_Ack=1, and Out_Req=0. In_Req is XORed with In_Ack, and Out_Req is XNORed with Out_Ack; both results are 1 and pass through an AND gate to produce the Click pulse, which triggers Pi and Po so that In_Ack and Out_Req toggle and both become 1. The returned acknowledge signal and the generated request signal are thus both 1, which completes one handshake.
With the phase-decoupling Click unit described above, a Click pulse can be generated exactly when it is needed to drive the connected pipeline stage. The generated Click signal replaces the global clock of a synchronous circuit, which avoids a clock tree network that occupies a large chip area and increases power consumption. The design is also simple, and it can increase the operating speed of the processor and reduce its power consumption.
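The handshake rule can be modelled behaviourally as below. This is a sketch under assumptions, not the circuit of FIG. 2: it follows the gate types named in the text (XOR on the input channel, XNOR on the output channel, then AND), treats `in_ack` and `out_req` as the states of the flip-flops Pi and Po, and assumes the output channel is free when Out_Req equals Out_Ack. The signal polarity drawn in the figure may differ, and the class name `ClickElement` is hypothetical.

```python
class ClickElement:
    """Behavioural model of a two-phase phase-decoupling Click unit."""

    def __init__(self):
        self.in_ack = 0   # state of flip-flop Pi (acknowledge on the input channel)
        self.out_req = 0  # state of flip-flop Po (request on the output channel)

    def step(self, in_req, out_ack):
        # XOR: a new request is pending when In_Req differs from In_Ack.
        input_pending = (in_req ^ self.in_ack) == 1
        # XNOR: assumed convention, the output channel is free when
        # Out_Req equals Out_Ack.
        output_free = (self.out_req ^ out_ack) == 0
        click = input_pending and output_free  # the AND gate
        if click:
            self.in_ack ^= 1   # Pi toggles: acknowledge the previous stage
            self.out_req ^= 1  # Po toggles: request the next stage
        return click
```

Because the signalling is two-phase, each toggle is an event: a second Click is only possible after the previous stage toggles In_Req again and the next stage has toggled Out_Ack.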

In more detail, the control units C_EX2MEM and C_MEM2WB are used as a pair and together implement the selection of handshake signals. Their working principles are described next.

FIG. 3 is a logic diagram of the control unit C_EX2MEM of an embodiment of the present application. As shown in FIG. 3, C_EX2MEM is a front handshake selector that selects which handshake request signal to output according to a control signal. The Sel signal is XORed and XNORed, respectively, with the previously generated request signals, and the results pass through a register triggered by the Click signal to produce the request signals. When Sel is 1, the handshake produces req2 = 1 while req1 is 0; when Sel is 0, it produces req1 = 1 while req2 is 0. With this design, C_EX2MEM outputs different request signals depending on the value of Sel and thereby controls the corresponding pipeline stage to perform the corresponding operation.
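One possible reading of this selector, sketched in Python: XNORing Sel with the previous req1 toggles req1 only when Sel is 0, and XORing Sel with the previous req2 toggles req2 only when Sel is 1. The class name `RequestSelector`, the reset state of 0 for both requests, and the assignment of XNOR to req1 and XOR to req2 are assumptions chosen to match the values given in the text; FIG. 3 may wire the gates differently.

```python
class RequestSelector:
    """Front handshake selector: each Click sends a request on one of two channels."""

    def __init__(self):
        self.req1 = 0  # assumed reset state
        self.req2 = 0

    def on_click(self, sel):
        # XNOR(sel, req1): toggles req1 only when sel == 0.
        self.req1 = 1 - (sel ^ self.req1)
        # XOR(sel, req2): toggles req2 only when sel == 1.
        self.req2 = sel ^ self.req2
        return self.req1, self.req2
```

Starting from reset, `on_click(1)` yields req1 = 0 and req2 = 1, and `on_click(0)` on a fresh selector yields req1 = 1 and req2 = 0. In two-phase signalling the toggle itself is the request, so the channel that was not selected simply keeps its old level.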

FIG. 4 is a logic diagram of the control unit C_MEM2WB of an embodiment of the present application. As shown in FIG. 4, C_MEM2WB is a rear handshake selector that selects, according to the control signal, which request/acknowledge pair to handshake with. The Sel signal is XORed and XNORed, respectively, with the acknowledge signals generated by the preceding control unit, and the results pass through a register triggered by the Click signal to produce the acknowledge signals. Each acknowledge signal is XORed with the request signal from the preceding stage, and combinational logic derives the Click signal, which triggers a register that toggles the request signal to the next pipeline stage. As shown in FIG. 4, C_MEM2WB can thus handshake on two request/acknowledge pairs. Its Sel signal is the same signal as the Sel of the C_EX2MEM unit, and the two control units together implement the selection of handshake signals.
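A matching sketch of the rear selector is given below. It is a simplified model under assumptions: the class name `AckSelector` is hypothetical, Sel = 0 is assumed to select channel 1 and Sel = 1 channel 2, and back-pressure from the next stage (its acknowledge) is left out for brevity.

```python
class AckSelector:
    """Rear handshake selector: completes the handshake on the channel chosen by Sel."""

    def __init__(self):
        self.ack1 = 0     # acknowledge returned on channel 1
        self.ack2 = 0     # acknowledge returned on channel 2
        self.out_req = 0  # request to the next pipeline stage

    def step(self, sel, req1, req2):
        # A channel has a pending request when its request and acknowledge differ.
        pending = (req2 ^ self.ack2) if sel else (req1 ^ self.ack1)
        click = pending == 1
        if click:
            self.ack1 = 1 - (sel ^ self.ack1)  # XNOR: toggles only when sel == 0
            self.ack2 = sel ^ self.ack2        # XOR: toggles only when sel == 1
            self.out_req ^= 1                  # toggle the request to the next stage
        return click
```

Because both selectors see the same Sel, the channel on which the front selector issued a request is also the one acknowledged here.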

FIG. 5 is a logic diagram of the first-stage control unit of the asynchronous control module in an embodiment of the present application. As shown in FIG. 5, the control unit C_PC generates the Click signal of the program counter from the processor's enable signal, so that the program counter performs one computation: it calculates the next instruction address from control signals such as jump, exception, and interrupt and passes it to the instruction fetch module. While generating the Click signal of its own stage, the first-stage control unit also generates the request signal for the next-stage control unit C_IF2ID. Because C_PC is the first control unit of the pipeline, it needs neither a request input nor an acknowledge output. D in the figure is the first pipeline stage, which corresponds to the instruction fetch module. The first pipeline stage D starts working after receiving the Click signal generated by the first control unit and passes its output data Out_Data to the next pipeline stage.
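The program counter update triggered by each Click can be sketched as follows. The function name `next_pc`, the priority order (exception, then interrupt, then jump), and the 4-byte sequential step are assumptions made for illustration; the text does not specify them.

```python
def next_pc(pc, jump=None, exception_vector=None, interrupt_vector=None, step=4):
    """Pick the next instruction address for one Click from C_PC.

    Each optional argument is a target address, or None when the
    corresponding control signal is inactive.
    """
    if exception_vector is not None:   # assumed highest priority
        return exception_vector
    if interrupt_vector is not None:
        return interrupt_vector
    if jump is not None:
        return jump
    return pc + step                   # assumed fixed instruction width
```

With no control signal active, `next_pc(0x100)` returns `0x104`; with a jump target it returns that target instead.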

From the description of the above embodiments, the present invention has at least the following beneficial effects. It uses an asynchronous single-rail circuit in which the Click signal replaces the global clock of a synchronous circuit, so no clock tree network needs to be designed. Because a clock tree network consumes considerable power and occupies a large area, the invention can greatly reduce the power consumption of the processor circuit and free up designable area, leaving more room for circuit modification and optimization. In addition, an asynchronous circuit does not have a global clock frequency limited by the critical path delay of the circuit; the critical paths of the modules are independent of one another and can therefore be optimized more effectively.

The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of patent protection. It should be noted that those of ordinary skill in the art can make several variations and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. The protection scope of the present application shall therefore be determined by the appended claims.

Claims (7)

1. An asynchronous monorail-based pipeline processor, comprising: an asynchronous control module, an instruction fetch module, a decoding module, an execution module, an adaptive selection module, a memory access module, a write-back module, a storage module, a control and status register, and a general-purpose register, wherein data communication among the modules is completed through asynchronous single-rail handshakes, the asynchronous control module comprises a plurality of control units, the plurality of control units are a plurality of phase-decoupling Click units, and the phase-decoupling Click units are cascaded with one another through handshakes and are each connected to a corresponding pipeline stage;
after a handshake between the phase-decoupling Click units succeeds, a Click signal that controls the pipeline stage concerned is generated;
the phase-decoupling Click unit generates a Click signal so that a program counter calculates an instruction address according to a control signal and transmits the instruction address to the instruction fetch module, and at the same time sends a request signal to the next-stage phase-decoupling Click unit, wherein the control signal is an electrical signal comprising jump, exception, and interrupt, and the request signal is an electrical signal sent by a preceding-stage phase-decoupling Click unit to the next-stage phase-decoupling Click unit to request the next-stage unit to work;
the adaptive selection module acquires an operation result from the execution module and transmits it to the memory access module and the write-back module according to a control signal, and the write-back module receives the operation results transmitted by the execution module and the memory access module and writes them back to a register; the data storage module is used for storing the received data;
wherein an initialization signal is required to initialize the processor before the processor begins operation.
2. The asynchronous monorail-based pipeline processor of claim 1, wherein a first pipeline stage comprises the instruction fetch module, a second pipeline stage comprises the decoding module, a third pipeline stage comprises the execution module, a fourth pipeline stage comprises the memory access module, and a fifth pipeline stage comprises the write-back module.
3. The asynchronous monorail-based pipeline processor of claim 1, wherein the instruction fetch module reads an instruction from the storage module based on the instruction address and transfers the instruction to the decoding module.
4. The asynchronous monorail-based pipeline processor of claim 3, wherein the decoding module decodes the instruction, reads the data to be processed associated with the instruction from the control and status register and the general-purpose register, and transfers the data to be processed to the execution module.
5. The asynchronous monorail-based pipeline processor of claim 4, wherein the execution module performs the corresponding operation on the data to be processed and, after obtaining the operation result, transmits the operation result to the memory access module.
6. The asynchronous monorail-based pipeline processor of claim 5, wherein the memory access module performs read and write operations on the storage module according to the operation result.
7. The asynchronous monorail-based pipeline processor of claim 1, wherein the execution module comprises a prediction flush module, a jump module, and a bypass module, the prediction flush module flushing incorrectly predicted instructions, the jump module generating a jump address based on a jump signal and an operation result and returning the jump address to the program counter, and the bypass module obtaining the required data in advance from the downstream modules based on the register address.
CN202110644854.3A 2021-06-09 2021-06-09 Pipeline processor based on asynchronous monorail Active CN113407239B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110644854.3A CN113407239B (en) 2021-06-09 2021-06-09 Pipeline processor based on asynchronous monorail


Publications (2)

Publication Number Publication Date
CN113407239A CN113407239A (en) 2021-09-17
CN113407239B true CN113407239B (en) 2023-06-13

Family

ID=77683307

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110644854.3A Active CN113407239B (en) 2021-06-09 2021-06-09 Pipeline processor based on asynchronous monorail

Country Status (1)

Country Link
CN (1) CN113407239B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4982402A (en) * 1989-02-03 1991-01-01 Digital Equipment Corporation Method and apparatus for detecting and correcting errors in a pipelined computer system
US5752070A (en) * 1990-03-19 1998-05-12 California Institute Of Technology Asynchronous processors
CN107092462A (en) * 2017-04-01 2017-08-25 何安平 A kind of 64 Asynchronous Multipliers based on FPGA
CN107404380A (en) * 2017-06-30 2017-11-28 吴尽昭 A kind of RSA Algorithm based on asynchronous data-path
CN207473606U (en) * 2017-07-27 2018-06-08 兰州大学 The communicating circuit of disparate step artificial neural network chip based on click controllers
CN108537331A (en) * 2018-04-04 2018-09-14 清华大学 A kind of restructural convolutional neural networks accelerating circuit based on asynchronous logic
CN109815619A (en) * 2019-02-18 2019-05-28 清华大学 A Method of Converting Synchronous Circuits to Asynchronous Circuits
CN109918130A (en) * 2019-01-24 2019-06-21 中山大学 A Four-Stage Pipeline RISC-V Processor with Fast Data Bypass Architecture
CN110928832A (en) * 2019-10-09 2020-03-27 中山大学 Asynchronous pipeline processor circuit, device and data processing method
CN111078294A (en) * 2019-11-22 2020-04-28 苏州浪潮智能科技有限公司 Instruction processing method, device and storage medium of a processor
CN112486312A (en) * 2020-11-19 2021-03-12 杭州电子科技大学 a low power processor
CN112667292A (en) * 2021-01-26 2021-04-16 北京中科芯蕊科技有限公司 Asynchronous miniflow line controller

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9384003B2 (en) * 2007-10-23 2016-07-05 Texas Instruments Incorporated Determining whether a branch instruction is predicted based on a capture range of a second instruction
US10892968B2 (en) * 2015-12-18 2021-01-12 Google Llc Systems and methods for latency reduction in content item interactions using client-generated click identifiers


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A Design Flow for Click-Based Asynchronous Circuits Design With Conventional EDA Tools; Hui Wu et al.; IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems; Vol. 40, No. 11; pp. 2421-2425 *
Design and FPGA-implementation of Asynchronous Circuits Using Two-phase Handshaking; Adrian Mardari et al.; 2019 25th IEEE International Symposium on Asynchronous Circuits and Systems; pp. 9-18 *
Design of an 8-bit asynchronous Booth multiplier based on a constrained bundled-data two-phase handshake protocol; 何安平; 刘晓庆; 陈虹; Acta Electronica Sinica (No. 04); full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant