CN118426735B - Variable pipeline error correction and detection addition operation system and method - Google Patents

Info

Publication number
CN118426735B
Authority
CN
China
Prior art keywords
addition
error correction
unit
adder
error
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410885683.7A
Other languages
Chinese (zh)
Other versions
CN118426735A (en
Inventor
张洵颖
李哲
张海金
赵晓冬
崔媛媛
李万通
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Application filed by Northwestern Polytechnical University
Priority to CN202410885683.7A
Publication of CN118426735A
Application granted
Publication of CN118426735B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50: Adding; Subtracting
    • G06F7/501: Half or full adders, i.e. basic adder cells for one denomination
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/499: Denomination or exception handling, e.g. rounding or overflow
    • G06F7/49905: Exception handling
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50: Adding; Subtracting
    • G06F7/505: Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination
    • G06F7/5052: Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination using carry completion detection, either over all stages or at sample stages only
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/57: Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
    • G06F7/575: Basic arithmetic logic units, i.e. devices selectable to perform either addition, subtraction or one of several logical operations, using, at least partially, the same circuitry

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Advance Control (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a variable pipeline error correction and detection addition operation system and method, belonging to the technical field of addition operation systems. Each adder unit performs addition on the addition operands from a bus and outputs the addition result and the carry, and each adder unit is connected to an operation detection unit. The operation detection unit checks the addition result of the adder unit connected to it, so that an adder unit with an addition error can be identified accurately. A multiplexing error correction unit corrects the erroneous addition results detected in the area it is responsible for and outputs the corrected result to the adder unit whose addition result was wrong, achieving fast error correction.

Description

一种可变流水线纠检错加法运算系统及方法A variable pipeline error correction and detection addition operation system and method

技术领域Technical Field

本发明属于加法运算系统技术领域,具体涉及一种可变流水线纠检错加法运算系统及方法。The present invention belongs to the technical field of addition operation systems, and in particular relates to a variable pipeline error correction and detection addition operation system and method.

背景技术Background Art

随着人工智能技术的进步,其算法模型复杂度和数据量也急剧增大,因此对硬件性能也提出了更高的要求。当前为这些海量运算提供支持的硬件类型主要有图形处理器(GPU)、现场可编程门阵列(FPGA)、应用型专用集成电路(ASIC)等。其中作为高性能应用的GPU、ASIC内部集成了大量的加法器,而巨量的加法逻辑会带来错误率的显著提升,并且由于缺少相应的处理方法,将导致出现静默错误或出错后可追溯性差,难以快速恢复。With the advancement of artificial intelligence technology, the complexity of its algorithm models and the amount of data have increased dramatically, which has put forward higher requirements for hardware performance. The types of hardware currently supporting these massive operations mainly include graphics processing units (GPUs), field programmable gate arrays (FPGAs), and application-specific integrated circuits (ASICs). Among them, GPUs and ASICs, which are high-performance applications, integrate a large number of adders. The huge amount of addition logic will significantly increase the error rate, and due to the lack of corresponding processing methods, silent errors will occur or the traceability after the error is poor, making it difficult to recover quickly.

发明内容Summary of the invention

本发明的目的在于提供一种可变流水线纠检错加法运算系统及方法,以克服现有技术易导致出现静默错误或出错后可追溯性差、海量加法运算逻辑可靠性差的问题。The purpose of the present invention is to provide a variable pipeline error correction and detection addition operation system and method that overcome the problems of the prior art, which is prone to silent errors or poor traceability after an error and offers poor reliability for large-scale addition logic.

为实现上述目的,本发明采用的技术方案如下:To achieve the above purpose, the technical solution adopted by the present invention is as follows:

一种可变流水线纠检错加法运算系统,包括加法器单元、运算检测单元和复用纠错单元;A variable pipeline error correction and detection addition operation system, comprising an adder unit, an operation detection unit and a multiplexing error correction unit;

所述加法器单元用于对来自总线的加法操作数进行加法运算,并输出加法结果和进位,每个加法器单元连接一个运算检测单元;The adder unit is used to perform addition operation on the addition operands from the bus and output the addition result and the carry, and each adder unit is connected to an operation detection unit;

运算检测单元用于对与其连接的加法器单元的加法结果和进位进行检测,如果加法器单元的加法结果有误,则运算检测单元输出错误标志信号至复用纠错单元;The operation detection unit is used to detect the addition result and carry of the adder unit connected thereto. If the addition result of the adder unit is wrong, the operation detection unit outputs an error flag signal to the multiplexing error correction unit;

复用纠错单元,根据错误标志信号对其负责区域内检测到错误的加法结果进行纠错,并将纠错后的结果输出至加法结果有误的加法器单元。The multiplexing error correction unit corrects the addition result detected as an error in its responsible area according to the error flag signal, and outputs the error-corrected result to the adder unit with the erroneous addition result.

优选的,所述加法器单元的加法操作数至少包括两位操作数。Preferably, the addition operands of the adder unit include at least two operands.

优选的,运算检测单元和与其连接的加法器单元并行工作,同时从至少两位操作数中获取两位相同的操作数。Preferably, the operation detection unit and the adder unit connected to it work in parallel and simultaneously receive the same two operands from among the at least two operands.

优选的,所述运算检测单元采用伯格码进行检测校验,所述复用纠错单元采用错误纠正码对加法器单元中的加法运算的错误进行纠正。Preferably, the operation detection unit uses a Berger code for detection and checking, and the multiplexing error correction unit uses an error-correcting code to correct errors in the addition operations of the adder units.

优选的,多个加法器单元和多个运算检测单元共用一个复用纠错单元。Preferably, a plurality of adder units and a plurality of operation detection units share a multiplexing error correction unit.

优选的,所述复用纠错单元包括纠错仲裁器、数据寄存器与错误处理电路;Preferably, the multiplexing error correction unit includes an error correction arbiter, a data register and an error processing circuit;

数据寄存器用于获取并存储其负责区域内所有出现错误时的加法操作数及加法结果;The data register is used to obtain and store all addition operands and addition results when errors occur in the area it is responsible for;

纠错仲裁器用于对所有出现错误时的加法操作数及加法结果进行优先级确认,并进行优先级排序;The error correction arbiter is used to confirm the priority of all addition operands and addition results when errors occur, and to sort the priorities;

错误处理电路按优先级排序依次对加法结果有误的加法器单元的加法运算进行纠错,其他未纠错的加法器单元处于等待状态。The error processing circuit corrects the addition operations of the adder units with incorrect addition results in order of priority, and other adder units that have not been corrected are in a waiting state.

一种可变流水线纠检错加法运算方法,包括以下步骤:A variable pipeline error correction and detection addition operation method comprises the following steps:

对加法操作数进行加法运算的加法结果和进位进行检测,如果加法运算的加法结果有误,则对检测到有误的加法结果进行纠错,并将纠错后的加法结果输出至加法器单元。The addition result and carry produced by adding the addition operands are checked. If the addition result is wrong, the erroneous result is corrected and the corrected addition result is output to the adder unit.

优选的,采用加法器单元对来自总线的加法操作数进行加法运算,并输出加法结果和进位,每个加法器单元对应连接一个运算检测单元;Preferably, an adder unit is used to perform addition operation on the addition operands from the bus, and output the addition result and the carry, and each adder unit is correspondingly connected to an operation detection unit;

采用运算检测单元对与其连接的加法器单元的加法结果和进位进行检测,如果加法器单元的加法结果有误,则运算检测单元输出错误标志信号 至复用纠错单元;The operation detection unit is used to detect the addition result and carry of the adder unit connected thereto. If the addition result of the adder unit is wrong, the operation detection unit outputs an error flag signal to the multiplexing error correction unit;

利用复用纠错单元对其负责区域内检测到错误的加法结果进行纠错,并将纠错后的结果输出至加法结果有误的加法器单元。The multiplexed error correction unit is used to correct the addition result detected as erroneous in the area it is responsible for, and the error-corrected result is output to the adder unit with the erroneous addition result.

优选的,所述运算检测单元采用伯格码进行检测校验,所述复用纠错单元采用错误纠正码对加法器单元中的加法运算的错误进行纠正。Preferably, the operation detection unit uses a Berger code for detection and checking, and the multiplexing error correction unit uses an error-correcting code to correct errors in the addition operations of the adder units.

优选的,利用复用纠错单元对其负责区域内检测到错误的加法结果进行纠错具体过程为:对所有出现错误时的加法操作数及出现错误时的加法操作数得到的加法结果进行的优先级排序信号;Preferably, the multiplexing error correction unit corrects the erroneous addition results detected in the area it is responsible for as follows: priorities are determined and ordered for all addition operands captured when errors occurred and for the addition results obtained from those operands;

按优先级排序依次对加法结果有误的加法器单元的加法运算进行纠错,其他未纠错的加法器单元处于等待状态。The addition operations of the adder units with incorrect addition results are corrected in sequence according to the priority, and other adder units that have not been corrected are in a waiting state.

与现有技术相比,本发明具有以下有益的技术效果:Compared with the prior art, the present invention has the following beneficial technical effects:

本发明提供一种可变流水线纠检错加法运算系统,通过对来自总线的加法操作数进行加法运算,并输出加法结果和进位,每个加法器单元连接一个运算检测单元;利用运算检测单元对与其连接的加法器单元的加法结果和进位进行检测,能够准确获取出现加法错误的加法器单元,并利用复用纠错单元对其负责区域内检测到错误的加法结果进行纠错,并将纠错后的结果输出至加法结果有误的加法器单元,实现快速的纠错,本申请利用运算检测单元能够实现加法运算的纠错,不需要引入额外的开销,提高了逻辑运算单元的可靠性,为实际应用提供了一种优选方案。The present invention provides a variable pipeline error correction and detection addition operation system, which performs addition operation on addition operands from a bus and outputs addition results and carries, and each adder unit is connected to an operation detection unit; the operation detection unit is used to detect the addition results and carries of the adder units connected to it, so that the adder units with addition errors can be accurately obtained, and the multiplexing error correction unit is used to correct the addition results with errors detected in the area responsible for it, and the corrected results are output to the adder units with incorrect addition results, so as to realize fast error correction. The present application can realize error correction of addition operation by using the operation detection unit, does not need to introduce additional overhead, improves the reliability of the logic operation unit, and provides a preferred solution for practical application.

优选的,运算检测单元和与其连接的加法器单元并行工作,同时获取两位操作数,直接对出现错误的环节进行纠错,可追溯性强,错误检测逻辑相比于加法运算逻辑更简单,不影响关键路径时序。Preferably, the operation detection unit and the adder unit connected thereto work in parallel, obtain two operands at the same time, and directly correct the link where the error occurs. The traceability is strong, and the error detection logic is simpler than the addition operation logic and does not affect the critical path timing.

附图说明BRIEF DESCRIPTION OF THE DRAWINGS

图1为本发明实施例中可变流水线纠检错加法运算系统结构示意图。FIG1 is a schematic diagram of the structure of a variable pipeline error correction and detection addition operation system according to an embodiment of the present invention.

图2为本发明实施例中PE阵列在可变流水线纠检错加法运算系统中的应用示意图。FIG. 2 is a schematic diagram of an application of a PE array in a variable pipeline error correction and detection addition operation system according to an embodiment of the present invention.

图3为本发明实施例中运算检测单元电路原理图。FIG. 3 is a circuit diagram of an operation detection unit in an embodiment of the present invention.

图4为本发明实施例中复用纠错单元结构原理框图。FIG4 is a block diagram of the structure principle of a multiplexing error correction unit in an embodiment of the present invention.

图5为本发明实施例中错误处理电路图。FIG5 is a circuit diagram of an error handling process in an embodiment of the present invention.

图6为本发明实施例中复用纠错单元负责区域内仅有一个PE单元被检测出加法结果有误纠错流程示意图。FIG6 is a schematic diagram of an error correction process in which only one PE unit is detected to have an incorrect addition result in the area that the multiplexing error correction unit is responsible for according to an embodiment of the present invention.

图7为本发明实施例中复用纠错单元负责区域内多个PE单元被检测出加法结果有误纠错流程示意图。FIG. 7 is a schematic diagram of an error correction process for detecting that the addition results of multiple PE units in the area in which the multiplexing error correction unit is responsible are incorrect in an embodiment of the present invention.

图8为本发明实施例中可变流水线纠检错加法运算方法流程图。FIG8 is a flow chart of a variable pipeline error correction and detection addition operation method according to an embodiment of the present invention.

具体实施方式DETAILED DESCRIPTION

为了使本技术领域的人员更好地理解本发明方案,下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分的实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都应当属于本发明保护的范围。In order to enable those skilled in the art to better understand the scheme of the present invention, the technical scheme in the embodiments of the present invention will be clearly and completely described below in conjunction with the drawings in the embodiments of the present invention. Obviously, the described embodiments are only part of the embodiments of the present invention, not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by ordinary technicians in this field without creative work should fall within the scope of protection of the present invention.

需要说明的是,本发明的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的本发明的实施例能够以除了在这里图示或描述的那些以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。It should be noted that the terms "first", "second", etc. in the specification and claims of the present invention and the above-mentioned drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It should be understood that the data used in this way can be interchanged where appropriate, so that the embodiments of the present invention described herein can be implemented in an order other than those illustrated or described herein. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusions, for example, a process, method, system, product or device that includes a series of steps or units is not necessarily limited to those steps or units that are clearly listed, but may include other steps or units that are not clearly listed or inherent to these processes, methods, products or devices.

如图1所示,本发明提供一种可变流水线纠检错加法运算系统,包括加法器单元、运算检测单元和复用纠错单元;As shown in FIG1 , the present invention provides a variable pipeline error correction and detection addition operation system, including an adder unit, an operation detection unit and a multiplexing error correction unit;

所述加法器单元设置于处理器单元中,加法器单元用于对来自总线的加法操作数进行加法运算,并输出加法结果和进位,每个加法器单元连接一个运算检测单元;The adder unit is arranged in the processor unit, and is used for performing addition operation on the addition operands from the bus, and outputting the addition result and the carry, and each adder unit is connected to an operation detection unit;

运算检测单元用于对与其连接的加法器单元的加法结果和进位进行检测,如果加法器单元的加法结果有误,则运算检测单元输出错误标志信号至复用纠错单元;The operation detection unit is used to detect the addition result and carry of the adder unit connected thereto. If the addition result of the adder unit is wrong, the operation detection unit outputs an error flag signal to the multiplexing error correction unit;

复用纠错单元,根据错误标志信号对其负责区域内检测到错误的加法结果进行纠错,并将纠错后的结果输出至加法结果有误的加法器单元。本申请采用运算检测单元对加法器单元的加法运算进行检测,同时对加法结果有误的加法运算利用复用纠错单元进行纠错,本申请利用运算检测单元能够实现加法运算的纠错,能够实现加法运算出错后的追溯性,不需要引入额外的开销,提高了逻辑运算单元的可靠性,为实际应用提供了一种优选方案。The multiplexing error correction unit corrects the addition result detected as an error in its responsible area according to the error flag signal, and outputs the error-corrected result to the adder unit with the wrong addition result. The present application uses an operation detection unit to detect the addition operation of the adder unit, and at the same time uses the multiplexing error correction unit to correct the addition operation with the wrong addition result. The present application uses the operation detection unit to realize the error correction of the addition operation, and can realize the traceability after the addition operation is wrong, without introducing additional overhead, thereby improving the reliability of the logic operation unit, and providing a preferred solution for practical application.
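The data flow among the three units can be sketched behaviourally. The Python below is a minimal illustration under stated assumptions, not the patented circuit: the 4-bit width, the `fault_mask` fault-injection parameter and all function names are hypothetical, and both the detection and the correction steps are stand-ins that simply recompute a reference sum, where the embodiment described later uses a Berger code and a Hamming code.

```python
WIDTH = 4                      # assumed operand width for the sketch
MASK = (1 << WIDTH) - 1

def adder_unit(a, b, fault_mask=0):
    """Return (sum, carry). fault_mask flips sum bits to emulate a hardware fault."""
    total = a + b
    return (total & MASK) ^ fault_mask, total >> WIDTH

def detection_unit(a, b, s, carry):
    """Stand-in checker: return True (error flag) if (s, carry) is not a + b."""
    total = a + b
    return (s, carry) != (total & MASK, total >> WIDTH)

def shared_correction_unit(requests):
    """Stand-in corrector shared by the whole column: {pe_index: corrected sum}."""
    return {pe: (a + b) & MASK for pe, a, b in requests}

def run_column(operands, faults):
    """Run one column of adders; only flagged adders visit the shared corrector."""
    results, requests = [], []
    for pe, ((a, b), fault) in enumerate(zip(operands, faults)):
        s, carry = adder_unit(a, b, fault)
        if detection_unit(a, b, s, carry):      # error flag raised
            requests.append((pe, a, b))
        results.append(s)
    for pe, fixed in shared_correction_unit(requests).items():
        results[pe] = fixed                      # corrected result returned
    return results, [pe for pe, _, _ in requests]
```

With `run_column([(3, 4), (7, 8), (5, 5)], [0, 0b0100, 0])`, only the second adder is flagged and corrected; the other two never touch the shared correction unit.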

在本申请的具体实施例中,加法器单元具有普适性,包括但不局限于行波进位加法器、平方根进位选择加法器、超前进位加法器。In a specific embodiment of the present application, the adder unit is universal, including but not limited to a ripple carry adder, a square root carry select adder, and a carry lookahead adder.

在本申请具体实施方式中,所述加法器单元的加法操作数至少包括第一操作数A和第二操作数B。In a specific implementation of the present application, the addition operands of the adder unit include at least a first operand A and a second operand B.

运算检测单元和与其连接的加法器单元能够并行工作,同时获取第一操作数A和第二操作数B。The operation detection unit and the adder unit connected thereto can work in parallel to obtain the first operand A and the second operand B at the same time.

在本申请的具体实施例中,运算检测单元采用运算检测结构电路,检测方式采用伯格码进行检测。In a specific embodiment of the present application, the operation detection unit is implemented as an operation detection circuit, and detection uses a Berger code.

本申请提出的运算检测单元与加法器单元的实现微架构无关,具有广泛的应用范围。The operation detection unit proposed in the present application is independent of the implementation micro-architecture of the adder unit and has a wide range of applications.

所述复用纠错单元的纠错操作在加法运算流水线中属于半隐藏流水级,仅在所述运算检测单元的错误标志信号有效时,激活纠错流水级,复用纠错单元进行纠错。The error correction operation of the multiplexed error correction unit belongs to a semi-hidden pipeline stage in the addition operation pipeline. Only when the error flag signal of the operation detection unit is valid, the error correction pipeline stage is activated and the multiplexed error correction unit performs error correction.

所述运算检测单元采用Berger码(伯格码)进行检测,复用纠错单元采用错误纠正码对加法器单元中的加法运算的错误进行纠正。The operation detection unit uses Berger code for detection, and the multiplexing error correction unit uses error correction code to correct the error of the addition operation in the adder unit.

所述复用纠错单元的错误纠正码具体采用海明码。The error correction code of the multiplexing error correction unit specifically adopts Hamming code.

运算检测单元与加法器单元的运算能够并行执行,且错误检测相比于加法运算逻辑更简单,不影响关键路径时序。The operations of the operation detection unit and the adder unit can be executed in parallel, and the error detection is simpler than the addition operation logic and does not affect the critical path timing.

在本申请中,多个加法器单元和多个运算检测单元共用一个复用纠错单元,最小化错误纠正的硬件资源开销。当加法器单元没有错误发生时,错误纠正使能不开启,不插入纠错流水级,这种半隐藏的纠错流水级设计可显著减小纠检错设计引入的额外开销。In the present application, multiple adder units and multiple operation detection units share a multiplexed error correction unit, minimizing the hardware resource overhead of error correction. When no error occurs in the adder unit, the error correction enable is not turned on and the error correction pipeline stage is not inserted. This semi-hidden error correction pipeline stage design can significantly reduce the additional overhead introduced by the error correction design.

在本申请具体实施方式中,以ALU(算术逻辑单元)运算单元中的加法器单元为例进行说明,如图2所示,本申请结合PE(处理器单元)单元阵列,阐述本申请中一个实例,对一种可变流水线纠检错加法运算系统进行示范性说明。In the specific implementation manner of the present application, the adder unit in the ALU (arithmetic logic unit) operation unit is taken as an example for explanation, as shown in Figure 2, the present application is combined with the PE (processor unit) unit array to explain an example in the present application and exemplarily illustrate a variable pipeline error correction and detection addition operation system.

对于每一个PE,其ALU中的加法器单元并行连接一个运算检测单元;基于PE的加法运算出错概率比较小,每一列的PE连接的加法器单元共同用一个复用纠错单元。当其中一个PE中的运算检测单元检测到该运算检测单元连接的加法器单元的加法运算错误后,产生错误标志信号发送至其对应区域内的复用纠错单元,即运算检测单元输出错误标志信号至复用纠错单元,由复用纠错单元完成纠错处理操作并返回正确结果至加法运算错误的加法器单元。正常状态下,加法器单元的加法通过取指、译码、执行、访存及回写操作完成,当检测到加法结果错误时,插入一级纠错级流水线;此时,加法操作通过取指、译码、执行、纠错、访存、及回写操作完成;本申请直接对出现错误的环节进行纠错,可追溯性强,错误检测逻辑相比于加法运算逻辑更简单,不影响关键路径时序。For each PE, the adder unit in its ALU is connected in parallel to an operation detection unit. Because the probability of an addition error in a PE is relatively small, the adder units of the PEs in each column share one multiplexing error correction unit. When the operation detection unit in one of the PEs detects an addition error in the adder unit it is connected to, it generates an error flag signal and sends it to the multiplexing error correction unit of its area; the multiplexing error correction unit then performs the correction and returns the correct result to the adder unit whose addition was wrong. Under normal conditions, an addition completes through the instruction fetch, decode, execute, memory access and write-back stages. When an erroneous addition result is detected, one error correction pipeline stage is inserted, and the addition then completes through instruction fetch, decode, execute, error correction, memory access and write-back. The present application corrects the error directly at the stage where it occurs, which gives strong traceability; the error detection logic is simpler than the addition logic and does not affect critical-path timing.
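The variable pipeline depth described above can be illustrated with a few lines of Python. This is only a sketch of the stage sequence, assuming the correction stage sits directly after the execute stage as the text states; the stage labels and the function name are hypothetical.

```python
# Stage names follow the text: fetch, decode, execute, memory access, write-back.
BASE_STAGES = ["fetch", "decode", "execute", "memory_access", "write_back"]

def pipeline_stages(error_flag):
    """Return the stage sequence for one addition.

    The error-correction stage is inserted after 'execute' only when the
    operation detection unit has raised its error flag.
    """
    if not error_flag:
        return list(BASE_STAGES)
    cut = BASE_STAGES.index("execute") + 1
    return BASE_STAGES[:cut] + ["error_correction"] + BASE_STAGES[cut:]
```

An error-free addition therefore takes five stages, and a flagged addition takes six. This is why the correction stage is described as semi-hidden.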

在本申请具体的实施方式中,运算检测单元具体采用Berger码(伯格码)进行校验,具体检测电路如图3示,运算检测单元的输入分别为来自总线的第一操作数A和第二操作数B,来自加法器单元的加法结果S、来自复用纠错单元的输出$S_E$,加法运算的中间进位C;利用多操作数加法器进行运算,加法运算结果的Berger码预测公式:$S'_b = A_b + B_b + C_n - C_b - C_0$,其中,$C_0$对应于加法运算的进位输入,$C_n$对应于加法运算的进位输出,用$A_b$表示A的B0编码,$B_b$表示B的B0编码,$C_b$表示C的B0编码。In a specific implementation of the present application, the operation detection unit uses a Berger code for checking; the detection circuit is shown in FIG. 3. The inputs of the operation detection unit are the first operand A and the second operand B from the bus, the addition result S from the adder unit, the output $S_E$ from the multiplexing error correction unit, and the intermediate carries C of the addition. A multi-operand adder evaluates the Berger check prediction formula for the addition result: $S'_b = A_b + B_b + C_n - C_b - C_0$, where $C_0$ is the carry input of the addition, $C_n$ is the carry output of the addition, and $A_b$, $B_b$ and $C_b$ denote the B0 encodings of A, B and C, respectively.

运算检测单元的错误检测原理是比较预测码$S'_b$和实际编码$S_b$是否一致,该实例中提供一种组合逻辑的比较方式,预测码$S'_b$和实际编码$S_b$通过按位异或非运算得到错误标志信号EN:若预测码$S'_b$和实际编码$S_b$一致,经二选一选择器选择加法结果正常输出$S_{out}$;反之,若预测码$S'_b$和实际编码$S_b$不一致,则说明加法器的加法运算有误,生成错误标志信号,激活复用纠错单元;所述错误标志信号EN包括PE的n位错误指示信号,n表示复用纠错单元负责区域内PE的数量。The operation detection unit detects errors by comparing the predicted code $S'_b$ with the actual code $S_b$. This example uses a combinational comparison: the predicted code $S'_b$ and the actual code $S_b$ are combined by a bitwise XNOR to obtain the error flag signal EN. If $S'_b$ and $S_b$ agree, the 2-to-1 selector passes the addition result to the normal output $S_{out}$. If they disagree, the addition performed by the adder is wrong, so the error flag signal is generated and the multiplexing error correction unit is activated. The error flag signal EN comprises n-bit error indication signals of the PEs, where n is the number of PEs in the area the multiplexing error correction unit is responsible for.
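The Berger check prediction can be verified numerically. The following Python is a minimal sketch, not the circuit of FIG. 3. It assumes that the B0 encoding is the count of zero bits, that C is the vector of carries $c_1 \dots c_n$ generated at each bit position (so that $C_n$ is its top bit), and that a software ripple-carry model stands in for the adder unit. All function names are hypothetical.

```python
def zero_count(x, width):
    """B0 Berger check symbol: number of 0 bits in a width-bit word."""
    return width - bin(x & ((1 << width) - 1)).count("1")

def ripple_add(a, b, cin, width):
    """Bit-serial model of an adder.

    Returns (sum, carries, cout), where bit i of `carries` holds the carry
    generated at bit position i (c_{i+1}) and cout is the final carry c_n.
    """
    s, carries, c = 0, 0, cin
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        s |= (ai ^ bi ^ c) << i
        c = (ai & bi) | (ai & c) | (bi & c)
        carries |= c << i
    return s, carries, c

def predicted_check(a, b, cin, carries, cout, width):
    """S'_b = A_b + B_b + C_n - C_b - C_0."""
    return (zero_count(a, width) + zero_count(b, width)
            + cout - zero_count(carries, width) - cin)

def error_flag(a, b, cin, s_observed, carries, cout, width):
    """True when the predicted check symbol disagrees with the observed sum."""
    return predicted_check(a, b, cin, carries, cout, width) != zero_count(s_observed, width)
```

The formula follows from the full-adder identity $a_i + b_i + c_i = s_i + 2c_{i+1}$ summed over all bit positions. Any single-bit flip in the sum changes its zero count by one, so the comparison flags it.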

如图4所示,所述复用纠错单元包括纠错仲裁器、数据寄存器与错误处理电路;数据寄存器用于获取并存储其负责区域内所有出现错误时的加法操作数及加法结果S;纠错仲裁器用于对所有出现错误时的加法操作数及加法结果S进行优先级确认,并进行优先级排序;错误处理电路按优先级排序依次对加法结果有误的加法器单元的加法运算进行纠错,其他未纠错的加法器单元处于等待状态。As shown in FIG4 , the multiplexed error correction unit includes an error correction arbiter, a data register and an error handling circuit; the data register is used to obtain and store all addition operands and addition results S when errors occur in its responsible area; the error correction arbiter is used to confirm the priority of all addition operands and addition results S when errors occur, and perform priority sorting; the error handling circuit corrects the addition operations of the adder units with incorrect addition results in turn according to the priority sorting, and other uncorrected adder units are in a waiting state.

在本申请具体实施例中,如图6所示,若复用纠错单元负责区域内仅有一个PE被检测出加法结果有误,纠错仲裁器直接激活错误处理电路,插入一个纠错流水级,错误处理电路获取数据寄存器中该PE正在处理的操作数进行纠错;如图7所示,当复用纠错单元负责区域内同时出现多个加法结果有误时,错误处理电路按优先级排序依次对加法结果有误的加法器单元的加法运算进行纠错。In a specific embodiment of the present application, as shown in FIG6 , if only one PE in the area responsible for the multiplexed error correction unit is detected to have an incorrect addition result, the error correction arbiter directly activates the error handling circuit, inserts an error correction pipeline stage, and the error handling circuit obtains the operand being processed by the PE in the data register for error correction; as shown in FIG7 , when multiple addition results are incorrect at the same time in the area responsible for the multiplexed error correction unit, the error handling circuit corrects the addition operations of the adder units with incorrect addition results in order of priority.
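The arbitration behaviour can be sketched as follows. The text does not state the priority policy, so this Python sketch assumes a fixed priority in which the lowest PE index is served first and one correction is handled per cycle. The function names and the tuple layout are hypothetical.

```python
def arbitration_order(en_flags):
    """Order in which flagged PEs are served (assumed: lowest index first)."""
    return [pe for pe, flag in enumerate(en_flags) if flag]

def correction_schedule(en_flags):
    """Return a list of (cycle, pe_being_corrected, pes_still_waiting).

    One erroneous adder is corrected per cycle; the remaining flagged adders
    stay in the waiting state until their turn.
    """
    order = arbitration_order(en_flags)
    return [(cycle, pe, order[cycle + 1:]) for cycle, pe in enumerate(order)]
```

For the n-bit flag vector `[0, 1, 0, 1, 1]`, PE 1 is corrected first while PEs 3 and 4 wait. A single flagged PE yields a one-entry schedule with nothing waiting, which matches the single-error case of FIG. 6.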

如图5所示,本申请复用纠错单元中的错误处理电路图,错误处理电路包括编码器、纠错解码器、纠错器和使能控制结构。使能控制结构用于接收并存储来自纠错仲裁器的优先级排序结果,该优先级排序结果即纠错仲裁器对所有出现错误时的加法操作数及加法结果进行优先级确认,并进行优先级排序所得;编码器根据出现错误时的加法操作数得到加法结果中海明码的校验位$P_S$,纠错解码器通过校验位$P_S$与出现错误时的加法操作数得到的加法结果S进行对比得到校验结果$E_S$,由纠错器对校验结果$E_S$的出错位置进行纠错,并输出纠错后的结果。FIG. 5 shows the error handling circuit in the multiplexing error correction unit of the present application. The error handling circuit includes an encoder, an error correction decoder, an error corrector and an enable control structure. The enable control structure receives and stores the priority ordering result from the error correction arbiter, that is, the result of the arbiter determining and ordering the priorities of all addition operands and addition results captured when errors occurred. The encoder derives the Hamming check bits $P_S$ of the addition result from the addition operands captured when the error occurred. The error correction decoder compares the check bits $P_S$ with the addition result S obtained from those operands to produce the check result $E_S$. The error corrector then corrects the bit position indicated by the check result $E_S$ and outputs the corrected result.

本申请采用海明码对4位数据纠错为例,(7,4)海明码共有4位信息位和3位校验位,4位信息位分别记为$S_0, S_1, S_2, S_3$,3位校验位分别记为$P_{S0}, P_{S1}, P_{S2}$。This application takes the correction of 4-bit data with a Hamming code as an example. The (7,4) Hamming code has 4 information bits and 3 check bits. The 4 information bits are denoted $S_0, S_1, S_2, S_3$, and the 3 check bits are denoted $P_{S0}, P_{S1}, P_{S2}$.

令向量$X = [A_0, A_1, A_2, A_3, B_0, B_1, B_2, B_3, C_0, C_1, C_2, C_3]^T$,$P_S = [P_{S0}, P_{S1}, P_{S2}]^T$,其中,向量X中$A_0, A_1, A_2, A_3$分别表示第一操作数A的信息位,向量X中$B_0, B_1, B_2, B_3$分别表示第二操作数B的信息位,$C_0, C_1, C_2, C_3$分别表示加法器单元的加法运算中的进位。其编码器编码公式如下:Let the vector $X = [A_0, A_1, A_2, A_3, B_0, B_1, B_2, B_3, C_0, C_1, C_2, C_3]^T$ and $P_S = [P_{S0}, P_{S1}, P_{S2}]^T$, where $A_0, A_1, A_2, A_3$ in X are the information bits of the first operand A, $B_0, B_1, B_2, B_3$ in X are the information bits of the second operand B, and $C_0, C_1, C_2, C_3$ are the carries in the addition performed by the adder unit. The encoding formula of the encoder is as follows:

The inputs of the error correction decoder are the addition result S obtained from the operands for which the error occurred and the Hamming-code check bits P_S of the addition result. Let the vector Y = [S_0, S_1, S_2, S_3, P_S0, P_S1, P_S2]^T; the error correction decoder then outputs the check result E_S as:
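The formula for E_S is missing from this text. A minimal sketch of a syndrome computation over Y is given below, assuming the conventional (7,4) Hamming parity groups (an assumption, not the patent's stated matrix). Each bit of E_S is the XOR of one check bit with the parity recomputed from the received S. The function name is hypothetical.

```python
# Assumed parity groups: P_S0 covers S_0, S_1, S_3; P_S1 covers S_0, S_2, S_3;
# P_S2 covers S_1, S_2, S_3. This layout is illustrative only.
PARITY_GROUPS = ((0, 1, 3), (0, 2, 3), (1, 2, 3))


def syndrome(y):
    """Return E_S for Y = [S_0, S_1, S_2, S_3, P_S0, P_S1, P_S2]."""
    s_bits, ps_bits = y[:4], y[4:]
    e_s = []
    for group, p in zip(PARITY_GROUPS, ps_bits):
        parity = 0
        for i in group:
            parity ^= s_bits[i]
        # A 1 means the recomputed parity disagrees with the check bit.
        e_s.append(parity ^ p)
    return e_s
```

With this layout an all-zero E_S indicates no detected error, and each single-bit error in Y produces a distinct nonzero E_S.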

The error corrector passes the check result E_S produced by the error correction decoder through a decoder, which decodes it into a 7-bit error correction signal S_E (only the lower 7 bits of the decoder output are valid). In S_E, the bit line corresponding to an error-free bit outputs 0, and the bit line corresponding to an erroneous bit outputs 1. Bitwise XORing the information bits S_0, S_1, S_2, S_3 of the addition result S with the lower 4 bits of S_E yields the corrected result data S_R, that is:
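The expression for S_R is missing from this text; from the sentence above it amounts to S_Ri = S_i XOR S_Ei for i = 0 to 3. The sketch below illustrates that step. The syndrome-to-position table is hypothetical: it assumes the conventional (7,4) parity groups and the bit order of Y = [S_0, S_1, S_2, S_3, P_S0, P_S1, P_S2], so the patent's decoder wiring may differ.

```python
# Hypothetical table mapping E_S to the erroneous position in Y, assuming
# P_S0 covers S_0, S_1, S_3; P_S1 covers S_0, S_2, S_3; P_S2 covers S_1, S_2, S_3.
SYNDROME_TO_POSITION = {
    (1, 1, 0): 0,  # S_0
    (1, 0, 1): 1,  # S_1
    (0, 1, 1): 2,  # S_2
    (1, 1, 1): 3,  # S_3
    (1, 0, 0): 4,  # P_S0
    (0, 1, 0): 5,  # P_S1
    (0, 0, 1): 6,  # P_S2
}


def decode_syndrome(e_s):
    """Decode E_S into the 7-bit one-hot correction signal S_E."""
    s_e = [0] * 7
    position = SYNDROME_TO_POSITION.get(tuple(e_s))
    if position is not None:  # all-zero E_S leaves S_E all zero
        s_e[position] = 1
    return s_e


def correct(s_bits, e_s):
    """Return S_R: XOR the information bits with the lower 4 bits of S_E."""
    s_e = decode_syndrome(e_s)
    return [s ^ e for s, e in zip(s_bits, s_e[:4])]
```

An error located in a check bit sets one of the upper three lines of S_E, which the XOR over the lower 4 bits ignores, so the information bits pass through unchanged.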

Another embodiment of the present invention provides a variable pipeline error correction and detection addition operation method, which specifically includes the following steps:

The addition result obtained by adding the addition operands is checked. If the addition result is wrong, the erroneous addition result is corrected and the corrected addition result is output.

As shown in FIG. 8, in the above method, adder units perform the addition operation on the addition operands from the bus and output the addition results, and each adder unit is connected to a corresponding operation detection unit;

Each operation detection unit checks the addition result of the adder unit connected to it. If the addition result of the adder unit is wrong, the operation detection unit outputs an error flag signal to the multiplexing error correction unit;

The multiplexing error correction unit corrects the erroneous addition results detected within the area it is responsible for and outputs each corrected result to the adder unit whose addition result was wrong.
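The steps above can be sketched as a small behavioral model. This is an illustration under stated assumptions, not the patented circuit: faults are injected with a hypothetical `fault_mask`, detection is modeled by comparison with a fault-free reference sum (a stand-in for the operation detection unit), priority is assumed to follow the unit index, and correction uses a (7,4) Hamming code with conventional parity groups that handles a single flipped result bit.

```python
from dataclasses import dataclass

# Assumed parity groups; illustrative only.
PARITY_GROUPS = ((0, 1, 3), (0, 2, 3), (1, 2, 3))


@dataclass
class AdderUnit:
    index: int
    a: int
    b: int
    fault_mask: int = 0  # hypothetical injected fault on the 4-bit result
    result: int = 0
    error_flag: bool = False

    def add(self):
        self.result = ((self.a + self.b) & 0xF) ^ self.fault_mask


def detect(unit):
    # Stand-in for the operation detection unit: flags a mismatch
    # against a fault-free reference sum.
    unit.error_flag = unit.result != ((unit.a + unit.b) & 0xF)


def predicted_check_bits(a, b):
    # Check bits derived from the operands through their own carry chain.
    carry, s = 0, []
    for i in range(4):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        s.append(ai ^ bi ^ carry)
        carry = (ai & bi) | (carry & (ai ^ bi))
    return [s[i] ^ s[j] ^ s[k] for i, j, k in PARITY_GROUPS]


def correct(a, b, result):
    # Syndrome of the received result against the predicted check bits.
    s = [(result >> i) & 1 for i in range(4)]
    ps = predicted_check_bits(a, b)
    syn = tuple(p ^ s[i] ^ s[j] ^ s[k] for p, (i, j, k) in zip(ps, PARITY_GROUPS))
    for pos in range(4):
        column = tuple(int(pos in group) for group in PARITY_GROUPS)
        if syn == column:
            result ^= 1 << pos  # flip the single erroneous result bit
    return result


def run(units):
    """Add, detect, then correct flagged units one at a time; return the order."""
    for unit in units:
        unit.add()
        detect(unit)
    pending = sorted((u for u in units if u.error_flag), key=lambda u: u.index)
    order = []
    for unit in pending:  # the shared corrector serves one unit at a time
        unit.result = correct(unit.a, unit.b, unit.result)
        unit.error_flag = False
        order.append(unit.index)
    return order
```

For example, with three units where units 1 and 2 each carry a single-bit fault, `run` corrects unit 1 first and then unit 2, while unit 0 is left untouched.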

The present application uses the operation detection unit to enable error correction of addition operations without introducing additional overhead, thereby improving the reliability of the logic operation unit.

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (8)

1. A variable pipeline error correction and detection addition operation system, characterized in that it comprises an adder unit, an operation detection unit, and a multiplexing error correction unit;
the adder unit is used to perform an addition operation on the addition operands from the bus and output the addition result, and each adder unit is connected to one operation detection unit;
the operation detection unit is used to check the addition result of the adder unit connected to it; if the addition result of the adder unit is wrong, the operation detection unit outputs an error flag signal to the multiplexing error correction unit; the operation detection unit and the adder unit connected to it work in parallel and simultaneously obtain two identical operands from at least two operands;
the multiplexing error correction unit corrects, according to the error flag signal, the erroneous addition results detected within the area it is responsible for, and outputs each corrected result to the adder unit whose addition result was wrong.

2. The variable pipeline error correction and detection addition operation system according to claim 1, characterized in that the addition operands of the adder unit include at least two-bit operands.

3. The variable pipeline error correction and detection addition operation system according to claim 1, characterized in that the operation detection unit uses a Berger code for detection, and the multiplexing error correction unit uses an error correction code to correct errors of the addition operation in the adder unit.

4. The variable pipeline error correction and detection addition operation system according to claim 1, characterized in that a plurality of adder units and a plurality of operation detection units share one multiplexing error correction unit.

5. The variable pipeline error correction and detection addition operation system according to claim 1, characterized in that the multiplexing error correction unit comprises an error correction arbiter, a data register, and an error handling circuit;
the data register is used to obtain and store all addition operands and addition results for which errors occurred within the area it is responsible for;
the error correction arbiter is used to confirm the priority of all addition operands and addition results for which errors occurred and to sort them by priority;
the error handling circuit corrects, in priority order, the addition operations of the adder units whose addition results are wrong, while the other adder units not yet corrected remain in a waiting state.

6. A variable pipeline error correction and detection addition operation method, characterized in that it comprises the following steps:
the addition result obtained by adding the addition operands is checked; if the addition result is wrong, the erroneous addition result is corrected and the corrected addition result is output; specifically:
an adder unit is used to perform an addition operation on the addition operands from the bus and output the addition result, and each adder unit is connected to a corresponding operation detection unit;
the operation detection unit is used to check the addition result of the adder unit connected to it; if the addition result of the adder unit is wrong, the operation detection unit outputs an error flag signal to the multiplexing error correction unit; the operation detection unit and the adder unit connected to it work in parallel and simultaneously obtain two identical operands from at least two operands;
the multiplexing error correction unit is used to correct the erroneous addition results detected within the area it is responsible for and to output each corrected result to the adder unit whose addition result was wrong.

7. The variable pipeline error correction and detection addition operation method according to claim 6, characterized in that the operation detection unit uses a Berger code for detection, and the multiplexing error correction unit uses an error correction code to correct errors of the addition operation in the adder unit.

8. The variable pipeline error correction and detection addition operation method according to claim 6, characterized in that the specific process by which the multiplexing error correction unit corrects the erroneous addition results detected within the area it is responsible for is:
all addition operands for which errors occurred, and the addition results obtained from those operands, are sorted by priority;
the addition operations of the adder units whose addition results are wrong are corrected in priority order, while the other adder units not yet corrected remain in a waiting state.

CN202410885683.7A 2024-07-03 2024-07-03 Variable pipeline error correction and detection addition operation system and method Active CN118426735B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410885683.7A CN118426735B (en) 2024-07-03 2024-07-03 Variable pipeline error correction and detection addition operation system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410885683.7A CN118426735B (en) 2024-07-03 2024-07-03 Variable pipeline error correction and detection addition operation system and method

Publications (2)

Publication Number Publication Date
CN118426735A CN118426735A (en) 2024-08-02
CN118426735B true CN118426735B (en) 2024-09-27

Family

ID=92326329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410885683.7A Active CN118426735B (en) 2024-07-03 2024-07-03 Variable pipeline error correction and detection addition operation system and method

Country Status (1)

Country Link
CN (1) CN118426735B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115220790A (en) * 2022-07-27 2022-10-21 安谋科技(中国)有限公司 Data processing method, processor and electronic equipment

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08125640A (en) * 1994-10-28 1996-05-17 Murata Mach Ltd Re-synchronization device for error correction coder decoder
JP3544846B2 (en) * 1997-02-13 2004-07-21 株式会社東芝 Logic circuit and floating point arithmetic unit
EP1146515A1 (en) * 1998-02-25 2001-10-17 Matsushita Electric Industrial Co., Ltd. High-speed error correcting apparatus with efficient data transfer
CN100414510C (en) * 2003-12-30 2008-08-27 中国科学院空间科学与应用研究中心 Real-time error detection and error correction chip
WO2008152728A1 (en) * 2007-06-15 2008-12-18 Fujitsu Limited Error correcting method and computing element
KR101618227B1 (en) * 2015-08-25 2016-05-04 조선대학교산학협력단 Fault localization and error correction method for self-checking binary signed-digit adder and schematic circuit for the method
CN105260272B (en) * 2015-09-24 2017-08-08 中国航天科技集团公司第九研究院第七七一研究所 A kind of synchronous error correction Pipeline control structure and its method
KR20210092986A (en) * 2020-01-17 2021-07-27 삼성전자주식회사 Storage controller, storage system including the same, and operation method of storage controller
CN116661871A (en) * 2023-05-23 2023-08-29 中国人民解放军国防科技大学 Device with dynamic pipeline error correction function and control method

Also Published As

Publication number Publication date
CN118426735A (en) 2024-08-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant