
CN107168678A - Improved floating-point multiply-adder and floating-point multiply-add calculation method - Google Patents

Improved floating-point multiply-adder and floating-point multiply-add calculation method

Info

Publication number
CN107168678A
Authority
CN
China
Prior art keywords
floating
point
normalized
adder
point number
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710322694.4A
Other languages
Chinese (zh)
Other versions
CN107168678B (en
Inventor
汪东升
高原
刘振宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201710322694.4A
Publication of CN107168678A
Application granted
Publication of CN107168678B
Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443 Sum of products
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/57 Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

Embodiments of the present invention provide an improved floating-point multiply-adder and a floating-point multiply-add calculation method. The floating-point multiply-adder comprises at least two floating-point partial multipliers and one multi-input adder. Each floating-point partial multiplier consists of a sign-bit XOR circuit, a mantissa multiplier and an exponent adder; it receives normalized floating-point numbers, multiplies them, and outputs a non-normalized floating-point number. The adder receives the non-normalized floating-point numbers, accumulates them, and outputs a normalized floating-point number. Because the partial multiplier contains only the sign-bit XOR circuit, the mantissa multiplier and the exponent adder, and no normalization module, the products stay non-normalized until the adder sums them and normalizes the result. This optimizes the multiply-adder at the hardware-circuit level, improves its operating efficiency, and reduces the area and power consumption of the circuit.

Description

An improved floating-point multiply-adder and floating-point multiply-add calculation method

Technical field

Embodiments of the present invention relate to the technical field of computer hardware architecture and circuit design, and in particular to an improved floating-point multiply-adder and a floating-point multiply-add calculation method.

Background

In recent years, machine learning algorithms such as deep convolutional neural networks have been widely applied in many fields. As the technology has advanced, these algorithms have become more compute-intensive and storage-intensive, and the computing and storage resources they require keep growing. Developing dedicated hardware is one of the solutions recognized by both academia and industry, and many hardware acceleration platforms with different architectures have been proposed. However, there is as yet no method that improves operating efficiency by optimizing the design of the hardware circuit itself. Providing a hardware-circuit optimization that improves the operating efficiency of the floating-point multiply-adder is therefore a technical problem the industry urgently needs to solve.

Summary of the invention

To solve the problems existing in the prior art, embodiments of the present invention provide an improved floating-point multiply-adder and a floating-point multiply-add calculation method.

In one aspect, an embodiment of the present invention provides an improved floating-point multiply-adder comprising at least two floating-point partial multipliers and one multi-input adder. Each floating-point partial multiplier consists of a sign-bit XOR circuit, a mantissa multiplier and an exponent adder. The floating-point partial multiplier receives normalized floating-point numbers, multiplies them, and outputs a non-normalized floating-point number. The adder receives the non-normalized floating-point numbers, accumulates them, and outputs a normalized floating-point number. A non-normalized floating-point number consists of a sign bit, a non-normalized mantissa and an exponent part; a normalized floating-point number consists of a sign bit, a normalized mantissa and an exponent part.

In another aspect, an embodiment of the present invention provides a floating-point multiply-add calculation method, comprising:

receiving at least four normalized floating-point inputs;

multiplying the normalized floating-point numbers to obtain non-normalized floating-point numbers;

adding the non-normalized floating-point numbers to obtain a normalized floating-point number.

In the improved floating-point multiply-adder and floating-point multiply-add calculation method provided by the embodiments of the present invention, at least two floating-point partial multipliers and one multi-input adder are provided. Each partial multiplier contains only a sign-bit XOR circuit, a mantissa multiplier and an exponent adder, and no normalization module. It receives normalized floating-point numbers and, after multiplication, outputs a non-normalized floating-point number; the adder then performs the addition and outputs a normalized floating-point number. This optimizes the multiply-adder at the hardware-circuit level, improves its operating efficiency, and reduces the area and power consumption of the circuit.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic structural diagram of the floating-point multiply-adder provided by an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of the floating-point partial multiplier in the floating-point multiply-adder provided by an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a sixteen-input floating-point multiply-adder provided by an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of the eight-input adder in the sixteen-input floating-point multiply-adder provided by an embodiment of the present invention;

FIG. 5 is a schematic flowchart of the floating-point multiply-add calculation method provided by an embodiment of the present invention.

Detailed description

To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

FIG. 1 is a schematic structural diagram of the floating-point multiply-adder provided by an embodiment of the present invention, and FIG. 2 is a schematic structural diagram of the floating-point partial multiplier in that multiply-adder. As shown in FIG. 1 and FIG. 2, the floating-point multiply-adder comprises at least two floating-point partial multipliers 1 and one multi-input adder 2. Each floating-point partial multiplier consists of a sign-bit XOR circuit 11, a mantissa multiplier 12 and an exponent adder 13. The floating-point partial multiplier 1 receives normalized floating-point numbers and multiplies them to obtain a non-normalized floating-point number; the adder 2 accumulates the non-normalized floating-point numbers and outputs a normalized floating-point number. A non-normalized floating-point number consists of a sign bit, a non-normalized mantissa and an exponent part; a normalized floating-point number consists of a sign bit, a normalized mantissa and an exponent part.

The computation of a convolutional neural network consists of a large number of consecutive multiplications and additions. The prior-art approach uses several independent floating-point multipliers and one multi-input adder: every multiplication outputs a normalized floating-point number, and these normalized products are the inputs of the multi-input adder. The multi-input adder is itself an adder tree built from several adders, and every addition outputs a normalized floating-point number that feeds the next level of addition. A normalized floating-point number here means a floating-point number that conforms to the IEEE 754 standard. The floating-point multiply-adder provided by the embodiment of the present invention consists of several floating-point partial multipliers 1 and one multi-input adder 2. A floating-point partial multiplier is a multiplier that consists only of the sign-bit XOR circuit 11, the mantissa multiplier 12 and the exponent adder 13, and contains no normalization module.

During a multiply-add operation, one floating-point partial multiplier receives two normalized floating-point inputs. The sign-bit XOR circuit 11 XORs the two sign bits to obtain the new sign bit, the mantissa multiplier 12 multiplies the two mantissas to obtain the new mantissa, and the exponent adder 13 adds the two exponents to obtain the new exponent. Because the mantissa multiplier 12 multiplies two mantissas, the resulting mantissa is not guaranteed to satisfy the normalization requirement, so the result is a non-normalized mantissa. The new sign bit, the non-normalized mantissa and the new exponent together form a non-normalized floating-point number. The non-normalized floating-point numbers produced by the floating-point partial multipliers are fed directly into the multi-input adder, which after the addition produces a normalized floating-point number that meets the normalization requirement.
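The partial multiplier described above can be sketched in software. The following Python model is illustrative only and is not taken from the patent: it assumes IEEE 754 single-precision inputs that are normal numbers (zeros, subnormals, infinities and NaNs are ignored), and the names `Unnormalized`, `decode`, `partial_multiply` and `value_of` are hypothetical.

```python
import struct
from dataclasses import dataclass

MANT_BITS = 23  # stored fraction bits (assumes IEEE 754 single precision)
BIAS = 127      # exponent bias for single precision


@dataclass
class Unnormalized:
    sign: int      # 0 for positive, 1 for negative
    mantissa: int  # raw product, fixed point with 2 * MANT_BITS fraction bits
    exponent: int  # unbiased sum of the two input exponents


def to_bits(x: float) -> int:
    """Return the 32-bit single-precision pattern of x."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def decode(bits: int):
    """Split a normal single-precision pattern into sign, significand, exponent."""
    sign = bits >> 31
    exponent = ((bits >> MANT_BITS) & 0xFF) - BIAS
    significand = (bits & ((1 << MANT_BITS) - 1)) | (1 << MANT_BITS)  # hidden 1
    return sign, significand, exponent


def partial_multiply(a: float, b: float) -> Unnormalized:
    sa, ma, ea = decode(to_bits(a))
    sb, mb, eb = decode(to_bits(b))
    return Unnormalized(
        sign=sa ^ sb,      # sign-bit XOR circuit
        mantissa=ma * mb,  # mantissa multiplier, no normalization afterwards
        exponent=ea + eb,  # exponent adder
    )


def value_of(p: Unnormalized) -> float:
    """Interpret the unnormalized triple as a real number (for checking only)."""
    magnitude = p.mantissa * 2.0 ** (p.exponent - 2 * MANT_BITS)
    return -magnitude if p.sign else magnitude


p = partial_multiply(1.5, 1.5)
# The integer part of the product mantissa is 2, so it lies outside [1, 2).
print(p.mantissa >> (2 * MANT_BITS), value_of(p))
```

The product of two significands in [1, 2) lies in [1, 4), which is why the output mantissa may need one more integer bit than a normalized mantissa; the sketch simply keeps the full-width product.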

The floating-point multiply-adder provided by this embodiment is optimized at the level of circuit structure. At least two floating-point partial multipliers 1 and one multi-input adder 2 are provided, and each partial multiplier 1 contains only the sign-bit XOR circuit 11, the mantissa multiplier 12 and the exponent adder 13, with no normalization module. The partial multiplier receives normalized floating-point numbers and, after multiplication, outputs a non-normalized floating-point number; the adder then performs the addition and outputs a normalized floating-point number. This optimizes the multiply-adder at the hardware-circuit level, improves its operating efficiency, and reduces the area and power consumption of the circuit.

On the basis of the above embodiment, the adder further comprises an exponent comparator, a mantissa shifter, a rounding module and a normalization module, which accumulate the input non-normalized floating-point numbers and output a normalized floating-point number.

Specifically, the non-normalized floating-point numbers output by the floating-point partial multipliers are the inputs of the multi-input adder. The exponent parts of the inputs first pass through the exponent comparator. According to the comparison result, the mantissa shifter aligns the non-normalized mantissas, and a fixed-point adder then accumulates them. The accumulated result passes through the rounding module and then through the normalization module, which yields the normalized floating-point output.

The floating-point multiply-adder provided by the embodiment of the present invention uses the floating-point partial multipliers to multiply the normalized inputs and feeds the resulting non-normalized floating-point numbers into the multi-input adder. The exponent comparator, mantissa shifter, rounding module and normalization module in the adder accumulate these inputs and output a normalized floating-point number. Optimizing the floating-point multiplication at the circuit level reduces hardware cost, which improves operating efficiency, removes the extra normalization overhead, and lowers power consumption.

On the basis of the above embodiment, the rounding mechanism of the rounding module in the adder is one of: truncation, rounding up, rounding down, or rounding to nearest.

Specifically, truncation, rounding up, rounding down and rounding to nearest are four different ways of rounding data. For ease of explanation, the decimal numbers 0.4, 0.5, 0.6, 1.5, 2.4 and 2.5 are used to illustrate the result of each of the four rounding methods.

Table 1. Example results of the different rounding methods

Table 1 gives example results of the different rounding methods. As Table 1 shows, truncation discards everything after the decimal point and keeps only the digits before it. This method is the simplest to implement in a circuit: it reduces design difficulty, improves computing efficiency and lowers power consumption. Its accuracy is relatively low, however, so it suits computing scenarios with low precision requirements.

Rounding up, rounding down and rounding to nearest all round a fraction of four or less down and a fraction of six or more up; they differ only in how they treat five. Rounding up carries a five to the larger number: 0.5 becomes 1, 1.5 becomes 2, and so on. Rounding down drops a five to the smaller number: 0.5 becomes 0, 1.5 becomes 1, and so on. Rounding to nearest, also called round-to-even, rounds a five to the nearest even number. For example, 0.5 lies between 0 and 1, and 0 is even, so 0.5 becomes 0; 1.5 lies between 1 and 2, and 2 is even, so 1.5 becomes 2. Likewise 2.5 becomes 2 and 3.5 becomes 4.

Because they treat five differently, the three methods also differ in their error characteristics. Rounding up always carries a five upward, so the data as a whole is biased upward. Rounding down always drops a five, so the data as a whole is biased downward. Rounding to nearest rounds a five up when the integer part is odd and down when the integer part is even, which reduces the overall bias introduced by rounding and improves accuracy. Its circuit, however, is the most complex of the four to design. The rounding method can therefore be chosen to suit the precision requirement.
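The four behaviors can be reproduced on the example values with Python's standard `decimal` module. The mapping used here is an assumption made for illustration: truncation is modeled as `ROUND_DOWN` (toward zero), the patent's "rounding up" as `ROUND_HALF_UP`, "rounding down" as `ROUND_HALF_DOWN`, and "rounding to nearest" as `ROUND_HALF_EVEN`. The helper name `round_decimal` is hypothetical.

```python
from decimal import (
    Decimal,
    ROUND_DOWN,
    ROUND_HALF_DOWN,
    ROUND_HALF_EVEN,
    ROUND_HALF_UP,
)

# Assumed correspondence between the patent's terms and decimal rounding modes.
MODES = {
    "truncation": ROUND_DOWN,
    "round up": ROUND_HALF_UP,
    "round down": ROUND_HALF_DOWN,
    "round to nearest": ROUND_HALF_EVEN,
}


def round_decimal(text: str, mode: str) -> int:
    """Round a decimal string to an integer with the named method."""
    return int(Decimal(text).quantize(Decimal("1"), rounding=MODES[mode]))


for value in ["0.4", "0.5", "0.6", "1.5", "2.4", "2.5"]:
    print(value, {name: round_decimal(value, name) for name in MODES})
```

In hardware the same decisions are made on binary guard bits rather than decimal digits; the decimal form is used only because the text explains the modes with decimal examples.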

The floating-point multiply-adder provided by this embodiment supports several rounding mechanisms, so it can meet the needs of different computations while reducing circuit design difficulty and improving operating efficiency and precision.

On the basis of the above embodiment, the exponent comparator finds the largest exponent value among the inputs, and the mantissa shifter then shifts the mantissas according to that largest exponent value so that they are aligned.

Specifically, the normalized floating-point inputs are multiplied by the floating-point partial multipliers, and the resulting non-normalized floating-point numbers are the inputs of the multi-input adder. The exponent parts first pass through the exponent comparator, which produces the largest exponent value. Each input exponent is then subtracted from this maximum to obtain an offset. Conceptually, the binary point of each mantissa moves left by its offset; because the position of the point is fixed in the circuit, each mantissa is instead shifted right by its offset in the mantissa shifter, which completes the alignment. A fixed-point adder then accumulates the aligned mantissas. The accumulated result passes through the rounding module and then through the normalization module, which yields the normalized floating-point output.

The exponent comparator finds the largest exponent among the inputs, and the mantissas are aligned according to the difference between that maximum and each exponent. The exponent part of the result is thereby fixed at the largest exponent value, and the mantissa of the result is obtained by adding the aligned mantissas. Because the mantissas are aligned, a fixed-point adder can add them directly, which further improves operating efficiency.
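The alignment and accumulation steps can be sketched as follows. This is a minimal model under stated assumptions, not the patent's circuit: each input is a hypothetical `(sign, mantissa, exponent)` tuple whose value is `(-1)**sign * mantissa * 2**(exponent - frac_bits)`, bits shifted out during alignment are simply truncated, and the final rounding is also truncation. The function names and the `frac_bits` and `out_bits` parameters are illustrative.

```python
def multi_input_add(terms, frac_bits=46, out_bits=23):
    """Add unnormalized terms and return a normalized (sign, mantissa, exponent).

    The result value is (-1)**sign * mantissa * 2**(exponent - out_bits),
    with the leading 1 of mantissa at bit position out_bits.
    """
    terms = list(terms)
    max_exp = max(exp for _, _, exp in terms)      # exponent comparator
    total = 0
    for sign, mant, exp in terms:
        aligned = mant >> (max_exp - exp)          # mantissa shifter (right shift)
        total += -aligned if sign else aligned     # fixed-point accumulation
    if total == 0:
        return 0, 0, 0
    sign = 1 if total < 0 else 0
    magnitude = abs(total)
    top = magnitude.bit_length() - 1               # position of the leading 1
    shift = top - out_bits
    # Normalization with truncation rounding: move the leading 1 to out_bits.
    mantissa = magnitude >> shift if shift >= 0 else magnitude << -shift
    exponent = max_exp + (top - frac_bits)
    return sign, mantissa, exponent


def value_of(result, out_bits=23):
    """Interpret a normalized result as a real number (for checking only)."""
    sign, mantissa, exponent = result
    magnitude = mantissa * 2.0 ** (exponent - out_bits)
    return -magnitude if sign else magnitude


# 1.5 * 2**0 plus 2.25 * 2**1, both with 46 fraction bits.
terms = [(0, 3 << 45, 0), (0, 9 << 44, 1)]
print(value_of(multi_input_add(terms)))
```

Only one leading-one search and one shift are performed for the whole sum, which mirrors the single normalization module at the end of the adder.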

On the basis of the above embodiment, the adder further comprises a pulse (systolic) register that stores the results of the mantissa shifter and the exponent comparator, which adds a pipeline stage to the adder.

The computation of a convolutional neural network consists of a large number of consecutive multiplications and additions, so the operating efficiency of the floating-point multiply-adder also depends on its operating frequency. In the floating-point multiply-adder provided by the embodiment of the present invention, a pulse register is added to the multi-input adder to hold the results of the mantissa shifter and the exponent comparator. Each batch of normalized floating-point inputs is multiplied by the floating-point partial multipliers and enters the multi-input adder as non-normalized floating-point numbers. In the adder, the result of the exponent comparator is stored in the pulse register, and the mantissas shifted by the mantissa shifter are stored in the corresponding pulse registers. Once these data are held in the pulse registers, the multiply-adder can accept the next batch of inputs and begin a new round of computation, while the data in the pulse registers continue through the remaining steps: the rounding module rounds them, and the normalization module normalizes them and outputs a normalized floating-point number.
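The effect of the register can be illustrated with a small cycle-by-cycle simulation. This is a sketch under assumptions that the text does not fix: stage 1 is taken to be the exponent comparator plus the mantissa shifter, stage 2 is taken to be the summation (rounding and normalization are omitted), and each input is a hypothetical `(mantissa, exponent)` pair with a non-negative integer mantissa.

```python
def run_pipeline(batches):
    """Simulate a two-stage adder; each loop iteration is one clock cycle."""
    register = None  # pulse register between stage 1 and stage 2
    outputs = []
    for batch in list(batches) + [None]:  # one extra cycle drains the pipeline
        # Stage 2: consume what stage 1 stored during the previous cycle.
        if register is not None:
            max_exp, aligned = register
            outputs.append((sum(aligned), max_exp))
        # Stage 1: compare exponents and align mantissas for the new batch.
        if batch is not None:
            max_exp = max(exp for _, exp in batch)
            register = (max_exp, [mant >> (max_exp - exp) for mant, exp in batch])
        else:
            register = None
    return outputs


# Two batches finish in three cycles; each output is (sum, exponent),
# meaning the value sum * 2**exponent.
print(run_pipeline([[(4, 0), (1, 2)], [(8, 1), (8, 1)]]))
```

While stage 2 sums the first batch, stage 1 is already aligning the second, so after the initial latency one result is produced per cycle.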

Adding a pulse register to the adder to store the results of the mantissa shifter and the exponent comparator effectively raises the operating frequency and thereby improves overall operating efficiency.

On the basis of the above embodiments, the normalization module is located at the end of the adder's operation and normalizes the calculation result.

As mentioned above, the prior-art floating-point multiply-adder performs floating-point operations with successive multiplications and additions, and the result of every operation is a normalized floating-point number. The floating-point multiply-adder provided by the embodiment of the present invention removes the normalization modules from all intermediate steps and keeps only the one at the end of the adder. The multiplications and additions throughout the operation are therefore fixed-point operations, which greatly reduces computational complexity and improves operating efficiency, while the normalization module at the end of the adder guarantees that the output is a standard-conforming normalized floating-point number.

FIG. 3 is a schematic structural diagram of a sixteen-input floating-point multiply-adder provided by an embodiment of the present invention, and FIG. 4 is a schematic structural diagram of the eight-input adder in that multiply-adder. As shown in FIG. 3 and FIG. 4, the multiply-adder comprises eight floating-point partial multipliers 301 and one eight-input adder 302.

Each floating-point partial multiplier consists of a mantissa multiplier and an exponent adder, and includes an XOR circuit for the sign bit. Its output consists of a non-normalized mantissa, an exponent and a sign bit, and serves as an input of the multi-input adder. The eight-input adder comprises an exponent comparator 401, a mantissa shifter 402, a pulse register 403, a rounding module 404, a normalization module 405, and so on. In an implementation, the rounding method used inside the eight-input adder can be chosen according to the requirements of the application.

In operation, the outputs of the preceding floating-point partial multipliers are fed into the eight-input adder. The eight exponents first pass through the exponent comparator, which produces the largest exponent value; each input exponent is then subtracted from this maximum to obtain its offset. Each mantissa is shifted right by its offset, which completes the mantissa alignment. The aligned mantissas then pass through the mantissa summation circuit, which computes their sum. Finally, the normalization module produces the normalized final result.
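The whole sixteen-input data path (eight products, one sum, one normalization) can be approximated in a few lines. This sketch departs from the patent in several labeled ways: it uses Python double-precision floats rather than a specific hardware format, decomposes them with `math.frexp`, truncates bits shifted out during alignment, and lets the final int-to-float conversion stand in for the rounding and normalization modules. The function name `multiply_add` is hypothetical.

```python
import math


def multiply_add(a, b):
    """Sum of products a[i] * b[i] with a single final normalization (sketch)."""
    products = []
    for x, y in zip(a, b):
        mx, ex = math.frexp(x)  # x == mx * 2**ex with 0.5 <= |mx| < 1, or mx == 0
        my, ey = math.frexp(y)
        sign = (mx < 0) ^ (my < 0)                 # sign-bit XOR
        sig_x = int(abs(math.ldexp(mx, 53)))       # 53-bit integer significand
        sig_y = int(abs(math.ldexp(my, 53)))
        products.append((sign, sig_x * sig_y, ex + ey))  # no normalization here
    max_exp = max(exp for _, _, exp in products)   # exponent comparator
    total = 0
    for sign, mant, exp in products:
        aligned = mant >> (max_exp - exp)          # mantissa shifter
        total += -aligned if sign else aligned     # mantissa summation
    # Single rounding and normalization step at the very end.
    return math.ldexp(float(total), max_exp - 106)


a = [1.5, -2.0, 0.25, 3.0, 1.0, 0.5, -4.0, 2.0]
b = [2.0, 1.0, 4.0, 0.5, 1.0, 2.0, 0.25, 8.0]
print(multiply_add(a, b))
```

Each product carries 106 fraction bits (53 from each operand), which is why the final scaling uses `max_exp - 106`.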

The floating-point multiply-adder provided by this embodiment has eight floating-point partial multipliers and one eight-input adder. It performs the multiply-add operation on sixteen normalized floating-point inputs and outputs a standard-conforming normalized floating-point number. Adding the pulse register 403 creates a two-stage pipeline, which raises the operating frequency; the design thereby reduces overall circuit area, lowers power consumption, and improves operating efficiency.

Fig. 5 is a schematic flowchart of the floating-point multiply-add calculation method provided by an embodiment of the present invention. As shown in Fig. 5, the method includes:

Step 10: receiving at least four normalized floating-point inputs;

Step 20: multiplying the normalized floating-point numbers to obtain unnormalized floating-point numbers;

Step 30: adding the unnormalized floating-point numbers to obtain a normalized floating-point number.

Specifically, the normalized floating-point inputs are first received and then multiplied. No normalization is performed during the multiplication: the resulting unnormalized floating-point numbers are fed directly into the addition, and normalization is performed only after the addition is complete, yielding a standard-conforming normalized floating-point number. Because normalization happens only once, at the end of the computation, the step of normalizing each multiplication result is saved, which improves computational efficiency.
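As a rough end-to-end illustration of Steps 10 to 30, the following Python sketch computes a sum of products while normalizing only once, at the very end. Everything specific here is an assumption made for the example: the 23-bit fraction width, the helper names `decompose` and `dot`, the use of `math.frexp` to split a Python float into sign, exponent, and mantissa, and the use of `math.ldexp` as a stand-in for the final normalization. It is not the patented circuit.

```python
import math

FRAC = 23  # assumed fraction width of a normalized input mantissa


def decompose(x: float):
    """Step 10: split x into (sign, exp, mant), x = (-1)**sign * mant * 2**(exp - FRAC)."""
    sign = 1 if x < 0 else 0
    m, e = math.frexp(abs(x))            # abs(x) == m * 2**e with 0.5 <= m < 1
    mant = int(m * 2 ** (FRAC + 1))      # integer form of 1.f (extra bits truncated)
    return sign, e - 1, mant


def dot(xs, ys):
    # Step 20: multiply each pair without normalizing the product.
    prods = []
    for x, y in zip(xs, ys):
        (sa, ea, ma), (sb, eb, mb) = decompose(x), decompose(y)
        prods.append((sa ^ sb, ea + eb, ma * mb))
    # Step 30: align to the largest exponent and take the signed sum.
    max_exp = max(e for _, e, _ in prods)
    total = sum((-1) ** s * (m >> (max_exp - e)) for s, e, m in prods)
    # Converting the integer sum to a float stands in for the single
    # normalization at the end of the addition.
    return math.ldexp(total, max_exp - 2 * FRAC)


result = dot([1.5, -2.0, 0.25, 3.0], [2.0, 0.5, 4.0, 1.0])
```

Here the four products are 3, -1, 1, and 3, and none of them is normalized individually before the sum is formed.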

On the basis of the above embodiment, further, the step of obtaining the unnormalized floating-point number is specifically:

passing the sign bits of the normalized floating-point numbers through a sign XOR circuit to obtain the sign bit of the unnormalized floating-point number;

passing the mantissa bits of the normalized floating-point numbers through a mantissa multiplier to obtain the unnormalized mantissa bits of the unnormalized floating-point number;

passing the exponent bits of the normalized floating-point numbers through an exponent adder to obtain the exponent bits of the unnormalized floating-point number;

outputting the unnormalized floating-point number composed of the sign bit, the unnormalized mantissa bits, and the exponent bits.

A floating-point number consists of three parts: a sign bit, mantissa bits, and exponent bits. Floating-point multiplication therefore XORs the sign bits (like signs give a positive result, unlike signs a negative one), multiplies the mantissas, and adds the exponents. Accordingly, after the normalized floating-point inputs are received, the sign XOR circuit XORs their sign bits to obtain the new sign; the mantissa multiplier multiplies their mantissas to obtain the new mantissa; and the exponent adder adds their exponents to obtain the new exponent. After the mantissas are multiplied, the resulting mantissa may no longer satisfy the normalization requirement, so it is called an unnormalized mantissa. The floating-point number formed from the new sign, mantissa, and exponent is called an unnormalized floating-point number. The unnormalized floating-point numbers are used as inputs to the multi-input adder, which outputs a standard-conforming normalized floating-point number after the addition.
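A minimal Python sketch of the partial multiplication just described (sign XOR, mantissa multiply, exponent add) is shown below. The type names `Norm` and `Unnorm`, the function names, the 23-bit fraction width, and the unbiased exponents are assumptions chosen for illustration; the text does not fix a particular number format.

```python
from typing import NamedTuple

FRAC_BITS = 23  # assumed fraction width of a normalized mantissa


class Norm(NamedTuple):
    sign: int   # 0 or 1
    exp: int    # unbiased exponent
    mant: int   # integer in [2**FRAC_BITS, 2**(FRAC_BITS + 1)), i.e. 1.f


class Unnorm(NamedTuple):
    sign: int
    exp: int
    mant: int   # product mantissa with 2*FRAC_BITS fraction bits, value in [1, 4)


def partial_multiply(a: Norm, b: Norm) -> Unnorm:
    """Multiply two normalized numbers without normalizing the result."""
    return Unnorm(a.sign ^ b.sign,   # sign XOR circuit
                  a.exp + b.exp,     # exponent adder
                  a.mant * b.mant)   # mantissa multiplier


def value(u: Unnorm) -> float:
    """Numeric value of an unnormalized product, for checking only."""
    return (-1) ** u.sign * u.mant * 2.0 ** (u.exp - 2 * FRAC_BITS)


a = Norm(0, 1, 3 << 22)   # +1.5 * 2**1 = 3.0
b = Norm(1, 0, 3 << 22)   # -1.5 * 2**0 = -1.5
p = partial_multiply(a, b)
```

In this example the product mantissa is 1.5 × 1.5 = 2.25, which lies outside [1, 2); that is exactly the unnormalized case the paragraph describes, and the sketch deliberately leaves it uncorrected.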

The method provided by this embodiment of the invention improves the efficiency of floating-point multiply-add calculation by omitting the normalization step from the multiplication of the normalized floating-point numbers and outputting a normalized floating-point number only after the addition.

On the basis of the above embodiment, further, the step of adding the unnormalized floating-point numbers to obtain a normalized floating-point number is specifically:

receiving all of the unnormalized floating-point numbers;

comparing the exponents of the unnormalized floating-point numbers to obtain the largest exponent;

taking the difference between the largest exponent and the exponent of each unnormalized floating-point number to obtain the offset of each unnormalized floating-point number;

aligning the mantissas of the unnormalized floating-point numbers according to the offsets;

performing a signed summation of the aligned unnormalized mantissas to obtain a sum;

passing the sum and the exponent bits through a rounding module and a normalization module to obtain a normalized floating-point number for output.

After multiplication, the received normalized floating-point numbers have become unnormalized floating-point numbers, which then go through the addition. The addition proceeds as follows: first, all of the unnormalized floating-point numbers are received; their exponents are compared to find the largest exponent; each exponent is subtracted from the largest exponent, and the resulting difference is the offset for the mantissa of that unnormalized floating-point number; the mantissa of each unnormalized floating-point number is shifted by its offset so that the mantissas are aligned; the aligned unnormalized mantissas are then added with their signs to obtain a sum. The sum may not satisfy the normalization requirement, so it still has to pass through the rounding module and the normalization module, which bring the result into normalized form and produce the normalized floating-point number that is output.
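The final normalization can be sketched in Python as follows, together with the alignment and signed summation that feed it. This is a simplified illustration, not the described circuit: the 8-bit fraction width `FRAC`, the function names `multi_add` and `to_float`, the tuple layout, and the choice of truncation as the rounding mode are all assumptions.

```python
FRAC = 8  # assumed number of fraction bits in every mantissa


def multi_add(terms):
    """Add (sign, exp, mant) terms, each worth (-1)**sign * mant * 2**(exp - FRAC).

    Input mantissas may be unnormalized; the result mantissa lies in
    [2**FRAC, 2**(FRAC + 1)) unless the sum is zero.
    """
    max_exp = max(exp for _, exp, _ in terms)
    total = 0
    for sign, exp, mant in terms:
        aligned = mant >> (max_exp - exp)        # align to the largest exponent
        total += -aligned if sign else aligned   # signed summation
    if total == 0:
        return (0, 0, 0)                         # zero has no leading 1 to normalize
    sign, mag = (1, -total) if total < 0 else (0, total)
    shift = mag.bit_length() - 1 - FRAC          # how far the leading 1 is from bit FRAC
    mant = mag >> shift if shift >= 0 else mag << -shift  # right shift truncates
    return (sign, max_exp + shift, mant)


def to_float(term):
    """Numeric value of a (sign, exp, mant) term, for checking only."""
    sign, exp, mant = term
    return (-1) ** sign * mant * 2.0 ** (exp - FRAC)


# 6.0 - 2.25 + 2.5; the middle mantissa (576/256 = 2.25) is unnormalized.
r1 = multi_add([(0, 2, 384), (1, 0, 576), (0, 1, 320)])
# 1.0 - 0.75: cancellation forces a left shift during normalization.
r2 = multi_add([(0, 0, 256), (1, 0, 192)])
```

The second call shows why normalization is needed after the sum: cancellation leaves the leading 1 two positions too low, so the mantissa is shifted left and the exponent is reduced by two.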

The method provided by this embodiment of the invention omits the normalization step from the multiplication of the normalized floating-point numbers. In the addition, it first computes the offsets by comparing the exponent bits and shifts the mantissas into alignment, then performs a signed addition of the aligned mantissas, and finally passes the resulting sum and the exponent bits through the rounding module and the normalization module to obtain a conforming normalized floating-point number. This improves the efficiency of floating-point multiply-add calculation.

The device embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software together with the necessary general-purpose hardware platform, or, of course, by hardware. Based on this understanding, the above technical solution, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements for some of their technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An improved floating-point multiply-adder, comprising at least two floating-point partial multipliers and one multi-input adder, characterized in that each floating-point partial multiplier consists of a sign-bit XOR circuit, a mantissa multiplier, and an exponent adder; the floating-point partial multiplier receives normalized floating-point numbers and multiplies them to obtain an unnormalized floating-point number; the adder accumulates the unnormalized floating-point numbers and outputs a normalized floating-point number; the unnormalized floating-point number consists of a sign bit, an unnormalized mantissa, and an exponent part; and the normalized floating-point number consists of a sign bit, a normalized mantissa, and an exponent part.

2. The multiply-adder according to claim 1, characterized in that the adder comprises an exponent comparator, a mantissa shifter, a rounding module, and a normalization module, so as to accumulate the input unnormalized floating-point numbers and output a normalized floating-point number.

3. The multiply-adder according to claim 2, characterized in that the rounding mechanism of the rounding module in the adder comprises: truncation, rounding up, rounding down, or rounding to nearest.

4. The multiply-adder according to claim 2, characterized in that the exponent comparator finds the largest exponent value among the input data, and the mantissa shifter then performs a shift operation according to the largest exponent value to align the mantissa bits.

5. The multiply-adder according to claim 2, characterized in that the adder further comprises a systolic register that stores the results of the mantissa shifter and the exponent comparator, so as to add a pipeline stage to the adder.

6. The multiply-adder according to any one of claims 2 to 5, characterized in that the normalization module normalizes the calculation result at the end of the adder's operation.

7. The multiply-adder according to claim 6, characterized in that the multiply-adder comprises eight floating-point partial multiply-adders and one eight-input adder.

8. A floating-point multiply-add calculation method, characterized by comprising:
receiving at least four normalized floating-point inputs;
multiplying the normalized floating-point numbers to obtain unnormalized floating-point numbers;
adding the unnormalized floating-point numbers to obtain a normalized floating-point number.

9. The method according to claim 8, characterized in that the step of multiplying the normalized floating-point numbers to obtain unnormalized floating-point numbers is specifically:
passing the sign bits of the normalized floating-point numbers through a sign XOR circuit to obtain the sign bit of the unnormalized floating-point number;
passing the mantissa bits of the normalized floating-point numbers through a mantissa multiplier to obtain the unnormalized mantissa bits of the unnormalized floating-point number;
passing the exponent bits of the normalized floating-point numbers through an exponent adder to obtain the exponent bits of the unnormalized floating-point number;
outputting the unnormalized floating-point number composed of the sign bit, the unnormalized mantissa bits, and the exponent bits.

10. The method according to claim 8, characterized in that the step of adding the unnormalized floating-point numbers to obtain a normalized floating-point number is specifically:
receiving all of the unnormalized floating-point numbers;
comparing the exponents of the unnormalized floating-point numbers to obtain the largest exponent;
taking the difference between the largest exponent and the exponent of each unnormalized floating-point number to obtain the offset of each unnormalized floating-point number;
aligning the mantissas of the unnormalized floating-point numbers according to the offsets;
performing a signed summation of the aligned unnormalized mantissas to obtain a sum;
passing the sum and the exponent bits through a normalization module to obtain a normalized floating-point number for output.
CN201710322694.4A 2017-05-09 2017-05-09 A multiply-add computing device and floating-point multiply-add computing method Active CN107168678B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710322694.4A CN107168678B (en) 2017-05-09 2017-05-09 A multiply-add computing device and floating-point multiply-add computing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710322694.4A CN107168678B (en) 2017-05-09 2017-05-09 A multiply-add computing device and floating-point multiply-add computing method

Publications (2)

Publication Number Publication Date
CN107168678A true CN107168678A (en) 2017-09-15
CN107168678B CN107168678B (en) 2020-10-27

Family

ID=59814099

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710322694.4A Active CN107168678B (en) 2017-05-09 2017-05-09 A multiply-add computing device and floating-point multiply-add computing method

Country Status (1)

Country Link
CN (1) CN107168678B (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108958705A (en) * 2018-06-26 2018-12-07 天津飞腾信息技术有限公司 A kind of floating-point fusion adder and multiplier and its application method for supporting mixed data type
CN109582911A (en) * 2017-09-28 2019-04-05 三星电子株式会社 For carrying out the computing device of convolution and carrying out the calculation method of convolution
CN109685198A (en) * 2017-10-19 2019-04-26 三星电子株式会社 Method and apparatus for quantifying the parameter of neural network
CN109710211A (en) * 2018-11-15 2019-05-03 珠海市杰理科技股份有限公司 Floating type conversion method, device, storage medium and computer equipment
CN110007959A (en) * 2017-11-03 2019-07-12 畅想科技有限公司 Hard-wired stratification mantissa bit length for deep neural network selects
CN110489077A (en) * 2019-07-23 2019-11-22 福州瑞芯微电子股份有限公司 A kind of the floating-point multiplication circuit and method of neural network accelerator
US10534578B1 (en) 2018-08-27 2020-01-14 Google Llc Multi-input floating-point adder
CN111694544A (en) * 2020-06-02 2020-09-22 杭州知存智能科技有限公司 Multi-bit multiplexing multiply-add operation device, neural network operation system, and electronic apparatus
CN111752532A (en) * 2020-06-24 2020-10-09 上海擎昆信息科技有限公司 Method, system and device for realizing 32-bit integer division with high precision
CN111930674A (en) * 2020-08-10 2020-11-13 中国科学院计算技术研究所 Multiply-accumulate operation device and method, heterogeneous intelligent processor and electronic equipment
CN112596697A (en) * 2019-10-02 2021-04-02 脸谱公司 Floating-point multiplication hardware using decomposed component numbers
CN113420788A (en) * 2020-10-12 2021-09-21 黑芝麻智能科技(上海)有限公司 Integer-based fusion convolution layer in convolutional neural network and fusion convolution method
CN114402289A (en) * 2019-08-08 2022-04-26 阿和罗尼克斯半导体公司 Multi-mode arithmetic circuit
CN115034163A (en) * 2022-07-15 2022-09-09 厦门大学 Floating point number multiply-add computing device supporting two data format switching
TWI787357B (en) * 2018-01-09 2022-12-21 南韓商三星電子股份有限公司 Method and system for operating product and methods for operating dot product and operating convolution
CN115812194A (en) * 2020-10-31 2023-03-17 华为技术有限公司 Floating point number calculation circuit and floating point number calculation method
CN116661734A (en) * 2023-07-26 2023-08-29 深存科技(无锡)有限公司 Low-precision multiply-add operator supporting multiple inputs and multiple formats
WO2023231363A1 (en) * 2022-06-01 2023-12-07 寒武纪(西安)集成电路有限公司 Method for multiplying and accumulating operands, and device therefor
US12034446B2 (en) 2019-05-20 2024-07-09 Achronix Semiconductor Corporation Fused memory and arithmetic circuit

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040225703A1 (en) * 2001-06-04 2004-11-11 Intel Corporation Floating point overflow and sign detection
US20100121898A1 (en) * 2008-11-10 2010-05-13 Crossfield Technology LLC Floating-point fused dot-product unit
CN102027696A (en) * 2009-05-27 2011-04-20 富士通株式会社 Filter coefficient control apparatus and method
US20120215823A1 (en) * 2011-02-17 2012-08-23 Arm Limited Apparatus and method for performing floating point addition
CN104778028A (en) * 2014-01-15 2015-07-15 Arm有限公司 Multiply adder
CN106445471A (en) * 2016-10-13 2017-02-22 北京百度网讯科技有限公司 Processor and method for executing matrix multiplication on processor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
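Xie Qihua: "Design and Implementation of a Floating-Point Fused Multiply-Add Unit in a High-Performance Microprocessor", China Master's Theses Full-text Database (Electronic Journal) *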

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109582911B (en) * 2017-09-28 2023-11-21 三星电子株式会社 Computing device for performing convolution and computing method for performing convolution
CN109582911A (en) * 2017-09-28 2019-04-05 三星电子株式会社 For carrying out the computing device of convolution and carrying out the calculation method of convolution
CN109685198A (en) * 2017-10-19 2019-04-26 三星电子株式会社 Method and apparatus for quantifying the parameter of neural network
CN109685198B (en) * 2017-10-19 2024-04-05 三星电子株式会社 Method and device for quantifying parameters of a neural network
CN110007959A (en) * 2017-11-03 2019-07-12 畅想科技有限公司 Hard-wired stratification mantissa bit length for deep neural network selects
US12175349B2 (en) 2017-11-03 2024-12-24 Imagination Technologies Limited Hierarchical mantissa bit length selection for hardware implementation of deep neural network
TWI787357B (en) * 2018-01-09 2022-12-21 南韓商三星電子股份有限公司 Method and system for operating product and methods for operating dot product and operating convolution
CN108958705B (en) * 2018-06-26 2021-11-12 飞腾信息技术有限公司 Floating point fusion multiply-add device supporting mixed data types and application method thereof
CN108958705A (en) * 2018-06-26 2018-12-07 天津飞腾信息技术有限公司 A kind of floating-point fusion adder and multiplier and its application method for supporting mixed data type
US10534578B1 (en) 2018-08-27 2020-01-14 Google Llc Multi-input floating-point adder
CN112204517A (en) * 2018-08-27 2021-01-08 谷歌有限责任公司 Multiple Input Floating Point Adder
CN109710211B (en) * 2018-11-15 2021-03-19 珠海市杰理科技股份有限公司 Floating point data type conversion method and device, storage medium and computer equipment
CN109710211A (en) * 2018-11-15 2019-05-03 珠海市杰理科技股份有限公司 Floating type conversion method, device, storage medium and computer equipment
US12034446B2 (en) 2019-05-20 2024-07-09 Achronix Semiconductor Corporation Fused memory and arithmetic circuit
CN110489077A (en) * 2019-07-23 2019-11-22 福州瑞芯微电子股份有限公司 A kind of the floating-point multiplication circuit and method of neural network accelerator
CN110489077B (en) * 2019-07-23 2021-12-31 瑞芯微电子股份有限公司 Floating point multiplication circuit and method of neural network accelerator
CN114402289A (en) * 2019-08-08 2022-04-26 阿和罗尼克斯半导体公司 Multi-mode arithmetic circuit
US12014150B2 (en) 2019-08-08 2024-06-18 Achronix Semiconductor Corporation Multiple mode arithmetic circuit
US11650792B2 (en) 2019-08-08 2023-05-16 Achronix Semiconductor Corporation Multiple mode arithmetic circuit
CN112596697A (en) * 2019-10-02 2021-04-02 脸谱公司 Floating-point multiplication hardware using decomposed component numbers
CN111694544B (en) * 2020-06-02 2022-03-15 杭州知存智能科技有限公司 Multi-bit multiplexing multiply-add operation device, neural network operation system, and electronic apparatus
CN111694544A (en) * 2020-06-02 2020-09-22 杭州知存智能科技有限公司 Multi-bit multiplexing multiply-add operation device, neural network operation system, and electronic apparatus
CN111752532A (en) * 2020-06-24 2020-10-09 上海擎昆信息科技有限公司 Method, system and device for realizing 32-bit integer division with high precision
CN111930674B (en) * 2020-08-10 2024-03-05 中国科学院计算技术研究所 Multiply-accumulate operation device and method, heterogeneous intelligent processor and electronic equipment
CN111930674A (en) * 2020-08-10 2020-11-13 中国科学院计算技术研究所 Multiply-accumulate operation device and method, heterogeneous intelligent processor and electronic equipment
CN113420788A (en) * 2020-10-12 2021-09-21 黑芝麻智能科技(上海)有限公司 Integer-based fusion convolution layer in convolutional neural network and fusion convolution method
US12124936B2 (en) 2020-10-12 2024-10-22 Black Sesame Technologies Inc. Integer-based fused convolutional layer in a convolutional neural network
CN115812194A (en) * 2020-10-31 2023-03-17 华为技术有限公司 Floating point number calculation circuit and floating point number calculation method
CN115812194B (en) * 2020-10-31 2024-11-22 华为技术有限公司 A floating point calculation circuit and a floating point calculation method
WO2023231363A1 (en) * 2022-06-01 2023-12-07 寒武纪(西安)集成电路有限公司 Method for multiplying and accumulating operands, and device therefor
CN115034163A (en) * 2022-07-15 2022-09-09 厦门大学 Floating point number multiply-add computing device supporting two data format switching
CN116661734A (en) * 2023-07-26 2023-08-29 深存科技(无锡)有限公司 Low-precision multiply-add operator supporting multiple inputs and multiple formats
CN116661734B (en) * 2023-07-26 2023-10-10 深存科技(无锡)有限公司 Low-precision multiply-add operator supporting multiple inputs and multiple formats

Also Published As

Publication number Publication date
CN107168678B (en) 2020-10-27

Similar Documents

Publication Publication Date Title
CN107168678B (en) A multiply-add computing device and floating-point multiply-add computing method
CN107291419B (en) Floating-Point Multipliers and Floating-Point Multiplication for Neural Network Processors
CN101847087B (en) A Reconfigurable Horizontal Sum Network Structure Supporting Fixed-Floating Point
US8606840B2 (en) Apparatus and method for floating-point fused multiply add
US9519460B1 (en) Universal single instruction multiple data multiplier and wide accumulator unit
KR101603471B1 (en) System and method for signal processing in digital signal processors
US11294627B2 (en) Floating point dot-product operator with correct rounding
US11550544B2 (en) Fused Multiply-Add operator for mixed precision floating-point numbers with correct rounding
CN112860220B (en) Reconfigurable floating-point multiply-add operation unit and method suitable for multi-precision calculation
US20200133633A1 (en) Arithmetic processing apparatus and controlling method therefor
US8930433B2 (en) Systems and methods for a floating-point multiplication and accumulation unit using a partial-product multiplier in digital signal processors
CN116400883A (en) A switchable-precision floating-point multiply-accumulator
CN113010148B (en) Fixed-point multiply-add operation unit and method suitable for mixed precision neural network
US11455142B2 (en) Ultra-low precision floating-point fused multiply-accumulate unit
CN115544447A (en) Dot product arithmetic device
US20230144030A1 (en) Multi-input multi-output adder and operating method thereof
CN114077419A (en) Method and system for processing floating point numbers
CN114637488A (en) artificial intelligence computing circuit
US7330867B2 (en) Method and device for floating-point multiplication, and corresponding computer-program product
CN117787297A (en) Floating point multiplication and addition unit and operation method thereof
Balasaraswathi et al. IMPLEMENTATION OF FLOATING POINT FFT PROCESSOR WITH SINGLE PRECISION FOR REDUCTION IN POWER
Hakim et al. Improved Decimal Rounding Module based on Compound Adder
Wang et al. Mantissa-Aware Floating-Point Eight-Term Fused Dot Product Unit
Shaikh et al. Multiplier Analysis
CN116610284A (en) Method and system for calculating dot product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant